simonwhitaker/python-type-alias-vs-newtype.md

Python: Type Aliases vs New Types

If you're a fan of Python type hinting, you may have noticed that there are two different ways to create your own types: type aliases and the NewType helper.
Type aliases look like this:
Vector = list[float]
Using the NewType helper looks like this:
from typing import NewType

Vector = NewType("Vector", list[float])
In either case, you now have a type called Vector that you can use in your code.
def minimum(v: Vector) -> float:  # named minimum to avoid shadowing the built-in min
    # TODO: write implementation
    return 0.0
So, which should you use, and when?
Use type aliases to add clarity

Type aliases are exactly that: aliases for existing types. Anywhere you refer to the original type (e.g. list[float]), you can refer to the alias (e.g. Vector) instead; the two are interchangeable synonyms.
Type aliases are useful for simplifying and clarifying code that deals with complex types.
For example, in 2D geometry software it's common to deal with points in two-dimensional space, represented as a pair of floats:
origin = (0.0, 0.0)
In that example, the type of origin is tuple[float, float]. So to write a function that takes a point as input, you'd write:
def move_to(point: tuple[float, float]) -> None:
    # TODO: write implementation
    pass
Similarly, it's common to define sizes as a width and a height, also represented as a pair of floats:
size = (3.0, 4.0)
And finally, it's common to define a rectangle as a point, defining the origin of the rectangle, and a size:
origin = (0.0, 0.0)
size = (3.0, 4.0)
rect = (origin, size)
Here origin is of type tuple[float, float], and size is also of type tuple[float, float], so rect, which is a tuple containing origin and size, is of type tuple[tuple[float, float], tuple[float, float]]. This starts to become cumbersome. For example, let's look at a function that takes a rectangle as input:
def get_area(rect: tuple[tuple[float, float], tuple[float, float]]) -> float:
    # TODO: write implementation
    return 0.0
Ugh, that's messy! And it gets worse. Let's say we have a function that takes a point and a rectangle, and determines whether the point lies within the rectangle:
def point_is_in_rect(
    point: tuple[float, float], 
    rect: tuple[tuple[float, float], tuple[float, float]]
) -> bool:
    # TODO: write implementation
    return False
Your function definitions start to look less like function definitions and more like type soup. Ugh.
This is exactly the problem that type aliases solve.
Let's see how type aliases can make this code much more readable:
# Declare some type aliases to make sense of the chaos
Point2D = tuple[float, float]
Size2D = tuple[float, float]
Rectangle = tuple[Point2D, Size2D]

def get_area(rect: Rectangle) -> float:
    # TODO: write implementation
    return 0.0

def point_is_in_rect(point: Point2D, rect: Rectangle) -> bool:
    # TODO: write implementation
    return False

rect = ((0.0, 0.0), (3.0, 4.0))
print(get_area(rect))

x = (2.0, 5.0)
print(point_is_in_rect(x, rect))
Much better!
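For completeness, here is one way the TODO bodies could be filled in. This is only a sketch: it assumes the rectangle's origin is the corner with the smallest x and y, and that points on the edge count as inside. Neither assumption comes from the code above.

```python
Point2D = tuple[float, float]
Size2D = tuple[float, float]
Rectangle = tuple[Point2D, Size2D]

def get_area(rect: Rectangle) -> float:
    # Ignore the origin; the area depends only on the size
    _, (width, height) = rect
    return width * height

def point_is_in_rect(point: Point2D, rect: Rectangle) -> bool:
    # Assumption: origin is the corner with the smallest x and y,
    # and the edges of the rectangle are inclusive
    x, y = point
    (left, bottom), (width, height) = rect
    return left <= x <= left + width and bottom <= y <= bottom + height

rect = ((0.0, 0.0), (3.0, 4.0))
area = get_area(rect)
outside = point_is_in_rect((2.0, 5.0), rect)
inside = point_is_in_rect((2.0, 3.0), rect)

print(area)     # 12.0
print(outside)  # False
print(inside)   # True
```

The signatures stay just as readable as before; the aliases only change how the types are spelled, not what they are.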
Note that these type aliases are just that: aliases. There's nothing special about Point2D; you can use it anywhere you would use tuple[float, float]. You can even pass a Point2D to a function expecting a Size2D and vice versa, since both are synonyms for the same type, and the type checker will not complain. This may or may not be what you want.
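A minimal sketch of that interchangeability. The describe_size function is a hypothetical helper made up for this illustration. A type checker should accept the call, because both aliases resolve to the same underlying type.

```python
Point2D = tuple[float, float]
Size2D = tuple[float, float]

def describe_size(size: Size2D) -> str:
    # Hypothetical helper, used only to show what the checker allows
    width, height = size
    return f"{width} x {height}"

origin: Point2D = (0.0, 0.0)

# Passing a Point2D where a Size2D is expected is fine for the type checker:
# both names mean tuple[float, float]
label = describe_size(origin)
print(label)              # 0.0 x 0.0

# At runtime the two aliases compare equal as well
print(Point2D == Size2D)  # True
```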
Use the NewType helper to enforce type correctness

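Unlike type aliases, the NewType helper creates a type that the type checker treats as distinct. It is not a synonym: the checker treats it as a subtype of its underlying type, so a value of the new type can be used wherever the underlying type is expected, but a plain value of the underlying type is not accepted where the new type is required. Two new types built on the same underlying type are not interchangeable with each other either.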
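The direction of the relationship matters, so here is a small sketch. Type checkers treat a NewType as a subtype of its underlying type: the new type can stand in for the underlying type, but not the reverse. The functions norm_squared and mirror are hypothetical names invented for this example.

```python
from typing import NewType

Point2D = NewType("Point2D", tuple[float, float])

def norm_squared(pair: tuple[float, float]) -> float:
    # Accepts any pair of floats, including a Point2D
    x, y = pair
    return x * x + y * y

def mirror(point: Point2D) -> Point2D:
    # Accepts only values explicitly wrapped as Point2D
    x, y = point
    return Point2D((-x, -y))

p = Point2D((3.0, 4.0))
print(norm_squared(p))  # 25.0
print(mirror(p))        # (-3.0, -4.0)

# A type checker is expected to reject the call below, because a plain
# tuple is not a Point2D. It is commented out so the file stays clean.
# mirror((3.0, 4.0))
```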
A good example of where this might be useful is in our code from the previous section. Consider this snippet of code:
Point2D = tuple[float, float]
Size2D = tuple[float, float]
Rectangle = tuple[Point2D, Size2D]

def get_area(rect: Rectangle) -> float:
    _, size = rect
    width, height = size
    return width * height

origin = (0.0, 0.0)
size = (3.0, 4.0)
rect = (size, origin)
print(get_area(rect)) # prints 0.0
There is a semantic error in this code. In the line where I create the rect variable, I've passed the size and origin in the wrong order, so get_area unpacks the origin (0.0, 0.0) as if it were the size. The syntax is correct, and the code runs just fine, but it prints 0.0 instead of the expected 12.0.
In its current form, the type checker cannot help me here, because Point2D and Size2D are exactly the same type. They are both just aliases for tuple[float, float].
Is there a way the type checker could have caught this bug at type-checking time, rather than allowing it to misbehave at runtime?
Yes. This is exactly the problem that NewType solves.
Let's rewrite the code, but this time declare Point2D and Size2D as new types.
from typing import NewType

Point2D = NewType("Point2D", tuple[float, float])
Size2D = NewType("Size2D", tuple[float, float])
Rectangle = tuple[Point2D, Size2D]

def get_area(rect: Rectangle) -> float:
    _, size = rect
    width, height = size
    return width * height

# To instantiate types declared with NewType with 
# literal values, wrap the values in TypeName(...)
origin = Point2D((0.0, 0.0))
size = Size2D((3.0, 4.0))

rect = (size, origin)
print(get_area(rect))
Now let's type-check this code. There are a number of type checkers available for Python; I'm using Microsoft's pyright, which also powers Pylance, the default Python language server in VS Code.
$ pyright demo.py 

  /Users/simon/demo.py:18:16 - error: Argument of type "tuple[Size2D, Point2D]" cannot be assigned to parameter "rect" of type "Rectangle" in function "get_area"
    Tuple entry 1 is incorrect type
      "Size2D" is incompatible with "Point2D" (reportGeneralTypeIssues)
1 error, 0 warnings, 0 informations 
Great! This error is also highlighted for me in VS Code, so I can see the error immediately as I type my code.
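For reference, a sketch of the corrected version. With the tuple built in the order Rectangle expects, the argument matches the declared type, so a type checker should have nothing to complain about, and the program computes the intended area.

```python
from typing import NewType

Point2D = NewType("Point2D", tuple[float, float])
Size2D = NewType("Size2D", tuple[float, float])
Rectangle = tuple[Point2D, Size2D]

def get_area(rect: Rectangle) -> float:
    _, size = rect
    width, height = size
    return width * height

origin = Point2D((0.0, 0.0))
size = Size2D((3.0, 4.0))

# Origin first, then size, matching the Rectangle definition
rect = (origin, size)
area = get_area(rect)
print(area)  # 12.0
```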

Should I just use NewType everywhere?

Probably not. NewType adds a small overhead for developers, namely having to wrap literal values in TypeName(...) every time you instantiate a value of a type created with NewType. (There's also a minuscule runtime overhead to using NewType, but you almost certainly don't need to worry about that.)
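To see why the runtime cost is so small, it helps to look at what the wrapper does when the program runs. In this sketch, calling Size2D(...) hands back the very same object it was given; the distinction between Size2D and a plain tuple exists only for the type checker.

```python
from typing import NewType

Size2D = NewType("Size2D", tuple[float, float])

raw = (3.0, 4.0)
size = Size2D(raw)

# The "wrapped" value is the original tuple, not a copy or a new class
print(size is raw)          # True
print(type(size) is tuple)  # True
print(size == (3.0, 4.0))   # True
```

One consequence is that runtime checks such as isinstance cannot tell a Size2D from any other tuple, so NewType is a tool for static checking only.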
I like to use these two approaches together, as described above. I use NewType when I need to enforce type correctness, and type aliases where I want to avoid repeating complex types.