This PEP proposes the following additions to generic classes:

- Adding an ``__args__`` attribute that returns the specialised parameters to the class
- Making any type parameters available directly on the class as overwrite-able instance variables
- Substitution of default type parameters at runtime
- Automatically adding ``__orig_class__`` to a class's slots if it's subscriptable (even if it's defined in C)

The following change to ``TypeVarLike``\ s:

- Adding ``__value__`` as a way to compute the specialised value of a type parameter after subscription (if it has non-default parameters)

The following change to ``GenericAlias``\ es:

- Hooking ``__getattr__`` to handle accessing ``__args__`` by name on the instance.
Currently, getting the specialised types for :py:term:`Generic` types is unintuitive and unreliable:
class Foo[T]: ...
Foo[int]() # How do I get `int` inside Foo?
>>> Foo[int]().__orig_class__.__args__
(int,)
This, however, doesn't work inside ``__new__``/``__init__`` or any methods called from them, as ``GenericAlias.__call__(*args, **kwargs)`` only sets ``__orig_class__`` after ``self.__origin__(*args, **kwargs)`` returns.
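For reference, a minimal pure-Python sketch of the current behaviour (modelled on ``typing._GenericAlias.__call__``; the function name here is illustrative):

def current_generic_alias_call(alias, /, *args, **kwargs):
    obj = alias.__origin__(*args, **kwargs)  # runs __new__ and __init__ first
    try:
        obj.__orig_class__ = alias  # attached only after construction finishes
    except Exception:
        pass  # e.g. slotted classes silently miss out
    return obj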
class Bar[T]:
    def __init__(self):
        self.__orig_class__

>>> Bar[int]()  # AttributeError: 'Bar' object has no attribute '__orig_class__'
Now, what if I subclass a generic?
class Bar(Foo[str]): ... # how do I now get `str`?
>>> types.get_original_bases(Bar)[0].__args__
(str,)
And what about a type parameter inside a generic function?
def foo[T](): ...
>>> foo[int]()
This isn't even possible without using implementation details/frame hacks.
With the new roots of runtime type checking beginning to sprout, I think it's unacceptable to have this kind of hard-to-use interface which is full of edge cases.
e.g.

class Slotted[T]:
    __slots__ = ()

Slotted[int]().__orig_class__  # AttributeError: 'Slotted' object has no attribute '__orig_class__'
I propose a new interface design which solves all of the above problems by being easy to use and much more reliable:
>>> Foo[int]().__args__
(int,)
>>> Foo[int]().T.__value__
int
>>> Bar.__args__
(str,)
>>> Bar.T.__value__
str
def foo[T]():
    return T.__value__
>>> foo[bool]()
bool
Anecdotally, I've seen many requests for such a feature, and I've needed it multiple times when writing typed code to get type parameters without duplicating values throughout the code.
Prior discussion:
- python/typing#629
- https://mail.python.org/archives/list/typing-sig@python.org/thread/T7VEN5HYHIT5ABNJHYOW434JHELTTKT3/
- python/typing#1544
TODO Maybe try getting some stats on how popular this could be? Reach out to pydantic and other such introspection libraries
Adding this property allows for easy checking of the current instance's type parameters.
I would also like to enforce at runtime that the arguments to a C-defined generic type are the correct length. This would allow us to handle type parameter defaults and correctly substitute them at runtime.
This is a step towards deprecating all the typing aliases of ``collections.abc`` and ``contextlib`` classes: currently the number of parameters passed to them is not checked, and to ensure a smooth transition to removing the typing aliases they should become re-exports of the original classes. TODO: Something about the typing classes supporting defaults maybe?
This saves developers from having to remember to set this slot themselves when they create a slotted class.
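As a sketch, under this PEP a slotted generic class would behave as if the author had written the extra slot themselves (the ``x`` slot is illustrative):

class Slotted[T]:
    __slots__ = ("x", "__orig_class__")  # "__orig_class__" appended automatically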
Adding ``__value__`` as a way to compute the specialised value of a type parameter after subscription (if it has non-default parameters).
These properties should allow for access to a type parameter, which binds a value after the substitution has occurred. Setting and deleting should raise deprecation warnings, however, as in a future version these should be read-only. Type checkers should warn about setting or deleting these attributes.
Another note: these now need ``__eq__`` (and ``__hash__``) so that ``Class.T == class_.T`` (???)
Code generation for accessing the specialised type parameters
class Foo[T]:
    def __init__(self, value: T):
        self.T = value  # Oops, what a strange variable name
Would generate:
import warnings

class Foo[T]:
    # compiler-generated code for each type parameter
    @property
    def T(self):
        try:
            # intentionally bypass any attribute hooks as this should be entirely transparent to developers
            return object.__getattribute__(self, "__T__")
        except AttributeError:
            T = self.__orig_class__.T
            object.__setattr__(self, "__T__", T)
            return T

    @T.setter
    def T(self, value):
        warnings.warn(
            "Setting type parameters is not supported",
            DeprecationWarning,
        )
        object.__setattr__(self, "__T__", value)

    # user code
    def __init__(self, value: T):
        self.T = value
Instantiation of ``Foo`` should raise a ``DeprecationWarning``:
>>> Foo[int](1)
<stdin>:1: DeprecationWarning: Setting type parameters is not supported
With ``__slots__``-ed classes or class variables/methods there should be a ``DeprecationWarning`` if any names overlap with the ``__type_params__``, and then the code should look something like:
class Slotted[T]:
    __slots__ = ("T",)

would generate:

class Slotted[T]:
    __slots__ = ("T",)  # already should be in slots but is just ignored
    warnings.warn(
        "Setting type parameters is not supported",
        DeprecationWarning,
    )
Type checkers should warn about overriding instance variables with the same name as type parameters.
TODO: Something about methods called ``T`` (cough cough, numpy).
This attribute gives access to a ``tuple`` (or ``None``) of the type variables after any substitution has occurred, and acts very similarly to ``GenericAlias.__args__``.
This descriptor can be accessed as both a class and an instance property depending on whether the class is used as a specialised base class.
class Foo[T]:
    def __init__(self):
        print("__args__ in Foo", self.__args__)
        super().__init__()

class Baz(Foo[str]):
    def __init__(self):
        print("__args__ in Baz", self.__args__)
        super().__init__()

class Bar[T, U](Foo[T]):
    def __init__(self):
        print("__args__ in Bar", self.__args__)
        super().__init__()
>>> Foo[bool]()
__args__ in Foo (bool,)
>>> Baz()
__args__ in Baz None
__args__ in Foo (str,)
>>> Bar[int, str]()
__args__ in Bar (int, str)
__args__ in Foo (int,)
>>> Foo() # nothing passed to __orig_class__
__args__ in Foo None
This works with multiple inheritance as you might expect.
class Spam[U, V](Baz, Bar[int, U]):
    def __init__(self):
        print("__args__ in Spam", self.__args__)
        super().__init__()
>>> Spam[complex, bool]()
__args__ in Spam (complex, bool)
__args__ in Baz None
__args__ in Bar (int, complex)
__args__ in Foo (int,)
``__args__`` can be accessed on the class.
class Foo[T]:
    @classmethod
    def bar(cls):
        return cls.__args__

class Baz(Foo[str]): ...
>>> Foo[int].bar()
(int,)
>>> Baz.bar()
(str,)
Type checkers should be aware of the types passed at instantiation and their associated variable types, so that ``__args__`` is statically determinable. ``__args__`` is erased to ``tuple[object, ...]`` outside of the instance of the class to preserve safety. We are explicitly choosing to violate the Liskov substitution principle because practicality beats purity here. The ``__args__`` property would be almost useless without this restriction, as any subclass could have different parameters, making introspection significantly more difficult to use without any measurable benefit.
If you wanted to access the parameters whilst pretending to emulate a particular call site: TODO how?
| Accessed on | Example | Type checker type | Type checker inferred type in example |
|---|---|---|---|
| ``self`` | ``def method(self: Bar[int, str])`` | ``tuple[T, U, V, ...] \| None`` | ``tuple[int, str] \| None`` |
| specific class | ``Bar[int, str]()`` | ``tuple[T, U, V, ...] \| None`` | ``tuple[int, str]`` |
| function parameter | ``def foo(x: Foo[int])`` | ``tuple[object, ...] \| None`` | ``tuple[object, ...] \| None`` |
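For illustration, a hypothetical checker session applying the table above (the ``reveal_type`` output is an assumption of how checkers would implement this):

class Foo[T]: ...

def method(self: Foo[int]) -> None:
    reveal_type(self.__args__)  # tuple[int] | None -- precise on self

def takes_foo(x: Foo[int]) -> None:
    reveal_type(x.__args__)  # tuple[object, ...] | None -- erased on other instances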
Currently, no checking of length or substitution of parameters occurs with ``types.GenericAlias``; this PEP requires checking the length of any parameters passed.
The number of parameters required can be worked out from the class definition, both in Python and in an extension module. This proposal doesn't require any runtime changes to generics defined in Python in this regard, as they already check the number of parameters passed; however, for extensions we propose adding a new interface for defining type parameters in C.
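For example (the exact error message is illustrative):

from collections.abc import Sequence

Sequence[int, str]  # today: silently accepted, types.GenericAlias performs no arity check

class MySeq[T]: ...
MySeq[int, str]  # today: TypeError -- pure-Python generics already check arity

Under this PEP the first subscription would raise a ``TypeError`` too.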
These changes mean that ``types.GenericAlias`` is now compatible with ``typing._GenericAlias`` for most cases. This PEP means that ``collections.abc`` and ``contextlib`` classes can use :pep:`695` syntax and ``typing`` can simply re-export the classes without wrapping them like it currently does; the same applies to all builtin classes currently defined using :pep:`585` in C that are wrapped by ``typing``.
If a class uses type parameter syntax, it should have ``__orig_class__`` added to the class's ``__slots__`` if required. This is not required if it's already included in a superclass's or the unmodified ``__slots__``.
``__orig_class__`` should have type ``GenericAlias | None`` and is ``None`` unless the instance was created through ``GenericAlias.__call__``, in which case it will be the ``self`` argument. ``GenericAlias.__call__`` should reimplement ``object.__new__`` for classes to set the attribute as early in instance creation as possible.
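A rough pure-Python sketch of the proposed ordering (not the final implementation; the function name is illustrative):

def proposed_generic_alias_call(alias, /, *args, **kwargs):
    cls = alias.__origin__
    self = cls.__new__(cls, *args, **kwargs)  # the reimplemented object.__new__ step
    if isinstance(self, cls):
        self.__orig_class__ = alias  # attached before __init__ runs
        self.__init__(*args, **kwargs)
    return self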
Implementations need to be provided for ``__args__`` as an attribute (don't call this attribute on ``self`` from C?) and for a function to get a type parameter as an attribute.
Allows for accessing the specialised value of a type parameter after the substitution has occurred.
Currently, unused type variables in the signature which aren't bound to a parameter are a type-checking error; however, now the following snippet should type-check without any errors.
def foo[T]():
    return T.__value__

foo[int]()  # int
If someone calls a function like this without a type parameter default, it should raise an ``AttributeError`` if they try to access ``__value__``.
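For example (the exact error message is an assumption):

def foo[T]():
    return T.__value__

foo[int]()  # int
foo()       # AttributeError: T has no value; foo was not subscripted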
To make this work we set up a cell object to catch the type parameter specialisation. Or consider just throwing in locals?
Note: type parameters are now part of stability guarantees, i.e. they need to be right the first time round. It's now advised not to include variance information in the name because it can change under your feet with ``infer_variance`` and implementing a new method.
Type checkers should give this ``type[__bound__]``, or, if constrained, ``type[Union[*__constraints__]]``; failing both of those, an implicit ``Any``.
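Illustratively (the ``reveal_type`` results are assumptions of checker behaviour):

def f[T: int]():
    reveal_type(T.__value__)  # type[int], from __bound__

def g[S: (str, bytes)]():
    reveal_type(S.__value__)  # type[str | bytes], from __constraints__

def h[U]():
    reveal_type(U.__value__)  # implicit Any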
``types.GenericAlias`` needs special-casing in ``__getattr__`` for type parameters so the below works:
class Foo[T]: ...
class Bar(Foo[str]): ...
Foo[int].T.__value__ # int
Foo[int]().T.__value__ # int
Bar.T.__value__ # str
Bar().T.__value__ # str
It should return the first found value for the type parameter so if there are duplicates in the MRO it should work like regular attribute access.
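A sketch of the shadowing behaviour (example mine):

class A[T]: ...
class B[T](A[int]): ...

B[str].T.__value__  # str -- B's own T is found first, like regular attribute lookup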
There are no backwards-incompatible changes introduced by this, as the magic methods/attributes are reserved for internals without warning. The type parameter changes would be unlikely to cause issues for anyone even if they did override attributes inside of classes; however, special care has been taken to ensure that there are no behaviour changes for developers if they do override a type parameter for at least 3 years after the PEP is accepted.
Since this PEP makes type parameters part of the public API, the normal deprecation policy would need to be applied to them, matching the process for deprecating arguments to functions.
Unfortunately, to the best of my knowledge, this PEP can't be backported in ``typing_extensions`` due to the large amount of coupling to the language, without using frame hacks.
Access to the ``__args__`` property needs to be recompiled to a call with the signature ``(instance: object, callee: type) -> tuple``, e.g. ``PyObject_GetArgs(self, callee)`` (check ``isinstance(self, GenericAlias)`` first though). TODO: what about from C, can you still access a field called ``__args__`` if this is purely in compilation?
def PyObject_GetArgs(object, callee): ...
class Spam[U, V](Baz, Bar[int, U]):
    def __init__(self):
        print("__args__ in Spam", self.__args__)
        super().__init__()
>>> Spam[complex, bool]()
__args__ in Spam (complex, bool)
__args__ in Baz (str,)
__args__ in Bar (int, complex)
__args__ in Foo (int,)
class Foo[T]:
    def __init__(self):
        print("__args__ in Foo", self.__args__)
        super().__init__()

    @magic_descriptor
    def __args__(self, class_called_from=None):
        return (self.__orig_class__.T.__value__,)

class Bar[T, U](Foo[T]):
    def __init__(self):
        print("__args__ in Bar", self.__args__)
        super().__init__()

    @magic_descriptor
    def __args__(self, class_called_from=None):
        if class_called_from is Bar:
            return (self.__orig_class__.T.__value__, self.__orig_class__.U.__value__)
        # should only expose the parameters that Foo knows about
        return super().__args__

class Baz(Foo[str]):
    def __init__(self):
        print("__args__ in Baz", self.__args__)
        super().__init__()

    @magic_descriptor
    def __args__(self, class_called_from=None):  # this works as both a class and instance property
        return (self.__orig_bases__[0].T.__value__,)
>>> Foo[bool]()
__args__ in Foo (bool,)
>>> Bar[int, str]()
__args__ in Bar (int, str)
__args__ in Foo (int,)
>>> Baz()
__args__ in Foo (str,)
A more complicated example with multiple inheritance
class Spam[U, V](Baz, Bar[int, U]):
    def __init__(self):
        print("__args__ in Spam", self.__args__)
        super().__init__()

    @magic_descriptor
    def __args__(self, class_called_from=None):
        if class_called_from is Spam:
            return (self.__orig_class__.U.__value__, self.__orig_class__.V.__value__)
        if class_called_from is Baz:  # needs to know that this is a class property
            return (self.__orig_bases__[0].T.__value__,)
        if class_called_from is Bar:
            return (self.__orig_bases__[1].T.__value__, self.__orig_class__.U.__value__)
        if class_called_from is Foo:
            return (self.__orig_bases__[1].T.__value__,)
Implementing this requires changes to the symtable to provide ``__class__`` if the attribute is accessed inside its class. Inside a non-class-scoped function, or on a specific class, no special action is required.
To enforce type parameters in a C extension module, a new way to store a class's type parameters is needed. This PEP introduces a new "tp slot", ``tp_type_params``, to ``PyTypeObject`` which stores all the necessary information about a class's type parameters to bring them in line with a pure-Python equivalent.
Setting, again, is handled the same way as currently. Needs mentioning that this also goes on function objects; should the current field be moved into ``__dict__`` similarly to ``type``?
This handles the functionality of ``PyTypeParam`` but can be used from both C and Python.
It caches by placing the result in ``type.__dict__`` after the first call, which is then used as the fast path.
Extension module:

static PyObject *custom_class_type_params(PyObject *self) {
    Py_TypeVar(T);
    return PyTuple_Pack(1, T);
}

static PyTypeObject CustomClass = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    .tp_name = "CustomClass",
    // Etc.
    .tp_type_params = custom_class_type_params,
};
is equivalent to
class CustomClass[T]:
    pass
and in the more complicated case
static PyObject *fancy_custom_class_type_params(PyObject *self) {
    Py_TypeVar(T, .bound = "int");
    Py_TypeVar(AnyStr, .constraints = {"str", "bytes", NULL});
    Py_TypeVar(SeqT, .bound = "Sequence[bool]", .evaluation_context = "from collections.abc import Sequence");
    Py_ParamSpec(P, .default_ = "[str]");
    return PyTuple_Pack(4, T, AnyStr, SeqT, P);
}

// snip
// snip
is roughly equivalent to
from collections.abc import Sequence
class FancyCustomClass[T: int, AnyStr: (str, bytes), SeqT: Sequence[bool], **P = [str]]:
    pass
(``Sequence`` wouldn't be put in globals or be accessible to any type parameters other than ``SeqT``.)
Example with functions
Return subclasses of ``list``/``dict`` that have a slot for ``__orig_class__``: ``PyList_WithOrigClass``.
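A hedged Python rendering of that idea (``PyList_WithOrigClass`` is the draft's name; this class is illustrative):

class _list_with_orig_class(list):
    # list instances normally have nowhere to store __orig_class__;
    # a hidden subclass with a slot fixes that
    __slots__ = ("__orig_class__",)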
``GenericAlias.__call__`` should reimplement ``object.__new__`` to set ``__orig_class__``, so after calling ``self = super().__new__(cls)`` you can access the attribute for types.
For functions, setting locals is a big performance loss, so ``T.__value__`` should look up the calling frame to get the ``GenericAlias`` object and get the attribute from there.
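A heavily hedged pure-Python sketch of that lookup (``_pending_aliases`` is a hypothetical registry that ``GenericAlias.__call__`` would populate; the frame depth is illustrative):

import sys

_pending_aliases: dict[object, object] = {}  # frame -> GenericAlias, hypothetical

def _typevar_value(tv):
    frame = sys._getframe(2)  # frame of the specialised function; depth illustrative
    alias = _pending_aliases.get(frame)
    if alias is None:
        raise AttributeError(f"{tv.__name__} has no value")
    index = alias.__origin__.__type_params__.index(tv)
    return alias.__args__[index]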
``X['Foo']()``: ``ForwardRef``\ s cannot be safely handled (TODO why?), so ``.__value__`` should return the literal string ``'Foo'`` in this case. If a user chooses to handle this case, they can.
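For example:

class X[T]: ...

X['Foo']().T.__value__  # 'Foo' -- returned as the literal string, not resolved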
import enum
import typing

class TypeParamKind(enum.Enum):  # make _type_param_kind from pycore_ast.h public
    TypeVar_kind = 1
    ParamSpec_kind = 2
    TypeVarTuple_kind = 3

def Py_CreateTypeParam(
    kind: TypeParamKind,
    name: str,
    bound: str | None = None,
    default: str | None = None,
    constraints: str | None = None,
    evaluation_context: str | None = None,
) -> object:
    locals_ = {}
    if evaluation_context is not None:
        evaluation_context_code = compile(evaluation_context, f"<evaluation-context for {name}>", "exec")
        exec(evaluation_context_code, locals=locals_)
    # now that locals_ may be populated, we can compile our args
    evaluate_default = None
    if default is not None:
        compiled_default = compile(default, f"<default for {name}>", "eval")
        evaluate_default = lambda: eval(compiled_default, locals=locals_)
    match kind:
        case TypeParamKind.TypeVar_kind:
            evaluate_bound = evaluate_constraints = None
            if bound is not None:
                compiled_bound = compile(bound, f"<bound for {name}>", "eval")
                evaluate_bound = lambda: eval(compiled_bound, locals=locals_)
            if constraints is not None:
                compiled_constraints = compile(constraints, f"<constraints for {name}>", "eval")
                evaluate_constraints = lambda: eval(compiled_constraints, locals=locals_)
            return typing.TypeVar(
                name,
                evaluate_bound=evaluate_bound,
                evaluate_default=evaluate_default,
                evaluate_constraints=evaluate_constraints,
            )
        case TypeParamKind.ParamSpec_kind:
            return typing.ParamSpec(
                name,
                evaluate_default=evaluate_default,
            )
        case TypeParamKind.TypeVarTuple_kind:
            return typing.TypeVarTuple(
                name,
                evaluate_default=evaluate_default,
            )

# "macros"
from functools import partial

Py_CreateTypeVar = partial(Py_CreateTypeParam, TypeParamKind.TypeVar_kind)
Py_CreateParamSpec = partial(Py_CreateTypeParam, TypeParamKind.ParamSpec_kind)
Py_CreateTypeVarTuple = partial(Py_CreateTypeParam, TypeParamKind.TypeVarTuple_kind)
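Hypothetical usage, mirroring the C examples elsewhere in this section:

T = Py_CreateTypeVar("T", bound="int")
SeqT = Py_CreateTypeVar("SeqT", bound="Sequence[bool]",
                        evaluation_context="from collections.abc import Sequence")
P = Py_CreateParamSpec("P", default="[str]")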
Default behaviour is described by ``PyObject_TypeParams``.
PyObject *PyType_TypeParams(PyObject *self) {
    return type_get_type_params(_PyType_CAST(self), NULL);
}

PyObject *Py_GetTypeParams(PyObject *self) {
    PyTypeObject *type = (PyTypeObject *)self;
    PyObject *cls_dict = PyType_GetDict(type);
    int contains = PyDict_Contains(cls_dict, &_Py_ID(__type_params__));
    if (contains < 0) {
        return NULL;
    } else if (contains) {
        return PyDict_GetItemWithError(cls_dict, &_Py_ID(__type_params__));
    }
    typeparamsfunc tp_type_params = type->tp_type_params;
    if (tp_type_params == NULL) {
        return PyTuple_New(0);
    }
    PyObject *type_params = tp_type_params(self);
    if (type_params == NULL) {
        return NULL;
    }
    int res = type_set_type_params(type, type_params);  // we can bypass the immutable check here?
    if (res < 0) {
        return NULL;
    }
    return type_params;
}
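A rough Python rendering of ``Py_GetTypeParams`` above (the ``tp_type_params`` slot is stood in for by a plain attribute; this is a sketch, not the implementation):

def get_type_params(cls: type) -> tuple:
    if "__type_params__" in cls.__dict__:  # cached fast path
        return cls.__dict__["__type_params__"]
    slot = getattr(cls, "tp_type_params", None)  # stand-in for the C slot
    if slot is None:
        return ()
    params = slot(cls)
    cls.__type_params__ = params  # cache in type.__dict__ for next time
    return params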
Evaluation context was chosen over the alternative of having users import, construct, and call all the correct methods to initialise a type. If ``NULL``, then ``PyEval_EvalCode`` can be skipped and the ``bound``/``default``/``constraints`` can be evaluated in the normal, "empty" globals/locals.
The evaluation context is also only given one value for all three type-receiving kwargs (``bound``, ``constraints`` and ``default``) as a safe optimisation, since ``bound`` and ``constraints`` are mutually exclusive and ``default`` is a subtype of either.
``GenericAlias.__call__``: for types, reimplement ``object.__new__`` to set ``__orig_class__`` as soon as possible. For functions, store the frame in a dict with the value of ``self``; ``TypeVar.__value__`` should look this up.
static PyObject *
ga_call(PyObject *self, PyObject *args, PyObject *kwds)
{
    gaobject *alias = (gaobject *)self;
    PyObject *obj;
    if (PyType_Check(alias->origin)) {
        obj = type_call;   // pseudocode: call the type
        set_orig_class;    // pseudocode: attach __orig_class__ early
    // } else if (PyCallable_Check(alias->origin)) {
    //     obj = set_type_param_values_and_call(alias->origin, PyObject_GetAttr(alias->origin, _Py_ID(__type_params__)), alias->args);
    } else if (PyFunction_Check(alias->origin)) {
        PyFunctionObject *func = (PyFunctionObject *)alias->origin;
        obj = set_frame_(alias->origin, func->func_typeparams, alias->args);
    } else if (PyMethod_Check(alias->origin)) {
        PyMethodObject *meth = (PyMethodObject *)alias->origin;
        obj = set_type_param_values_and_call(alias->origin, ((PyFunctionObject *)meth->im_func)->func_typeparams, alias->args);
    } else {
        obj = PyObject_Call(alias->origin, args, kwds);
    }
    return set_orig_class(obj, self);
}
Currently, one of the recommended ways around this looks something like:

class Class[T]:
    def __init__(self, x: T, x_ty: type[T]):
        self.x = x
        # do something with `x_ty`

Class[int](1234, int)  # or Class(1234, int)
This is not desirable: it not only duplicates information, leaving room for things to become out of sync, but it also requires each class to follow a certain specification if it wants to be interoperable, and it doesn't use the existing machinery for dealing with these cases outlined in the motivation section.
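Under this PEP the same information would be recovered from the subscription instead (sketch):

class Class[T]:
    def __init__(self, x: T):
        self.x = x
        x_ty = self.T.__value__  # int for Class[int](1234); no duplicated argument

Class[int](1234)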
Whilst having

class Foo[T]: ...

x: Foo[int] = Foo()
x.T.__value__

be recompiled to

class Foo[T]: ...

x: Foo[int] = Foo[int]()
x.T.__value__

would be nice, it is entirely infeasible and would require typing to be used at compile time. Not doing this also makes this PEP more opt-in, for when better performance is desired and the runtime access is not required.
Original drafts of this PEP used a new struct, ``PyTypeParam``, and some high-level methods to operate on it. This was placed in the ``tp_type_param`` tp slot, but that was rejected as it increased memory usage for no gain on the Python side.
typedef struct { // New public C-API struct
    enum Kind {
        TypeVar,
        TypeVarTuple,
        ParamSpec,
    } kind;
    const char *name;
    const char *bound;
    const char *default_;
    const char *evaluation_context; // assumed missing from the draft; the examples below use it
    const char *constraints[];
} PyTypeParam;
Extension module:

static PyTypeParam custom_class_type_params[] = {
    {.name = "T"},
    {NULL},
};

static PyTypeObject CustomClass = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    .tp_name = "CustomClass",
    // Etc.
    .tp_type_params = custom_class_type_params,
};

is equivalent to

class CustomClass[T]:
    pass

and in the more complicated case

static PyTypeParam fancy_custom_class_type_params[] = {
    {.name = "T", .bound = "int"},
    {.name = "AnyStr", .constraints = {"str", "bytes", NULL}},
    {
        .name = "SeqT",
        .bound = "Sequence[bool]",
        .evaluation_context = "from collections.abc import Sequence",
    },
    {.name = "P", .kind = ParamSpec, .default_ = "[str]"},
    {NULL},
};

// snip

is roughly equivalent to

from collections.abc import Sequence

class FancyCustomClass[
    T: int,
    AnyStr: (str, bytes),
    SeqT: Sequence[bool],
    **P = [str],
]:
    pass
This requires 3 new unstable C-API functions: ``PyTypeParam_EvalBound``, ``PyTypeParam_EvalDefault`` and ``PyTypeParam_EvalConstraints``. These functions should execute ``evaluation_context`` and capture the locals created before evaluating the ``bound``, ``default`` or ``constraints``.
An example implementation for ``PyTypeParam_EvalBound`` is provided:
def PyTypeParam_EvalBound(self: PyTypeParam) -> PyObject:
    locals_ = {}
    if self.evaluation_context is not None:
        evaluation_context_code = compile(self.evaluation_context, f"<evaluation-context for {self.name}>", "exec")
        exec(evaluation_context_code, locals=locals_)
    # now that locals_ may be populated, we can evaluate the bound
    compiled_bound = compile(self.bound, f"<bound for {self.name}>", "eval")
    return eval(compiled_bound, locals=locals_)
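An illustrative call against the ``SeqT`` example above (``PyTypeParam`` is a C struct; a ``SimpleNamespace`` stands in for it here):

from types import SimpleNamespace

seq_t = SimpleNamespace(
    name="SeqT",
    bound="Sequence[bool]",
    evaluation_context="from collections.abc import Sequence",
)
PyTypeParam_EvalBound(seq_t)  # -> collections.abc.Sequence[bool]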
Whilst this would be a nice feature to have for people who really care about enforcing all of this at compile time, it is not practical for a number of reasons. This change would be backwards incompatible, as it raises errors where none were raised before, which would mean churn. This could have been turned into an optional flag like ``--strict``, which would allow for better gradual typing, but there is no easy option for configuring which modules care about this. This feature would also perhaps give too much confidence that Python is performing runtime type checking, which isn't possible. The final nail in this idea's coffin is that the checks aren't always required, due to bi-directional inference.
Should there be a decision made on ``TypeAliasType`` being made callable, it might be useful to have ``T(*args, **kwargs)`` forward to ``T.__value__(*args, **kwargs)``.
Why no ``__parameters__``? We could do ``typing.get_parameters(self)`` if we really care.
- Note about PEP 696 addendum with functions being allowed means they should be bound asap.
- Section about reification in other langs
- How do class accesses work at runtime without copies? Because this doesn't work lol
- Reified soft keyword so code knows that stuff should be subscripted, like in Kotlin
- Investigate how the bases list can be deferred for generic aliases
- https://discuss.python.org/t/class-scoped-type-statement-that-references-outer-scoped-typevar/40026
- Usage of things with nested TVs, must require the variable to be passed directly to the function, types cannot be inferred. https://discord.com/channels/267624335836053506/891788761371906108/1188640440707190836