Skip to content

Instantly share code, notes, and snippets.

@jwhitlock
Last active December 12, 2023 13:44
Show Gist options
  • Star 3 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save jwhitlock/bb149f3d1525b21c563b25b695d7ad4d to your computer and use it in GitHub Desktop.
Save jwhitlock/bb149f3d1525b21c563b25b695d7ad4d to your computer and use it in GitHub Desktop.
Converting fx-private-relay to use mypy

Converting fx-private-relay to use mypy

Notes for an interactive class or video

Overview

  • Intro to private relay if video
  • Starting with tag v3.5.14
    • Python 3.9 - I'm using 3.9.11
    • Django 3.2.13 - LTS version, extended support until April 2024.
    • Django REST Framework 3.12.4
    • Email masking implemented, code transitioning to background task
    • Phone masking in progress, some code in repo.
  • mypy history
    • Started in 2012 by Jukka Lehtosalo as a PhD student at University of Cambridge Computer Laboratory. He previously developed Alore, a Python-like language with optional static typing.
    • Presented at PyCon 2013, convinced to use Python 3 syntax. PEP 484 (Guido van Rossum, Lehtosalo, and Łukasz Langa) formalized type hints. Continues to be updated, tracked on python/typing.
    • Jukka joined Dropbox, which adopted mypy in 2015 to check 4 million lines of Python.
  • mypy is reference implementation but there are other type checkers:

Benefits of mypy:

  • Documentation of argument types and function return values
  • Better auto-completion in rich code editors
  • Warnings for failing to handle all return variants (for example, .first() can return a model instance or None)
  • Type-based thinking will transfer to TypeScript and Rust.
  • Possible future optimizations, with tools like mypyc, which doubled the performance of Black.

There are several things mypy can't do, or has trouble with:

  • Type annotations are not checked at run time. It requires a developer to do good research to add appropriate types
  • Annotations like Any can quick make mypy happy, but will lose the benefits of type checking.
  • Few 3rd party packages have adopted types, requiring ignoring them or creating stub files.
  • Python that uses runtime code generation and metaclasses can be hard to type completely. For example, mypy does not currently support metaclasses and inheritance. A lot of Django uses these features.
  • mypy has trouble annotating complex dictionaries. dict[str, Any] is often as good as you can get for multi-level dictionaries.
  • There is no annotation for "expected" exceptions

mypy and Python are co-evolving, and a lot of documentation and Stack Overflow answers are written against earlier versions of both. For example, you used to need aliases for common built-in types:

from typing import List, Text

block_list = []: List[Text]

Python 3.9 implemented PEP 585, so list can be used:

block_list = []: list[str]

We should use the Python 3.9 syntax and follow best practices. More changes are coming in Python 3.10, such as supporting "str | None" instead of "Optional[str]".

Convert Relay to use mypy

There's a good guide for Using mypy with an existing codebase. These notes are specific to our codebase.

Tag v3.5.14 is a recent release (June 16, 2022), and easier than a commit hash to type.

git checkout -b mypy-demo v3.5.14

Install mypy

Installing mypy 0.950, released April 27, 2022. There is a more recent 0.960 released 25 May 2022, but I've tested this one with other packages.

Add to requirements.txt:

# linting
black==22.3.0
mypy==0.950

And install it:

$ pip install -r requirements.txt
Requirement already satisfied: boto3==1.23.6 in /Users/john/.virtualenvs/fx-private-relay/lib/python3.9/site-packages (from -r requirements.txt (line 1)) (1.23.6)
...
Requirement already satisfied: MarkupSafe>=2.0 in /Users/john/.virtualenvs/fx-private-relay/lib/python3.9/site-packages (from jinja2->coreschema>=0.0.4->drf-yasg==1.20.0->-r requirements.txt (line 18)) (2.1.1)
Installing collected packages: mypy
Successfully installed mypy-0.950
WARNING: You are using pip version 21.3.1; however, version 22.1.2 is available.
You should consider upgrading via the '/Users/john/.virtualenvs/fx-private-relay/bin/python -m pip install --upgrade pip' command.

To get on the latest pip:

python -m pip install --upgrade pip

And now run mypy:

$ mypy .
manage.py:10: error: Skipping analyzing "django.core.management": module is installed, but missing library stubs or py.typed marker
privaterelay/wsgi.py:12: error: Skipping analyzing "django.core.wsgi": module is installed, but missing library stubs or py.typed marker
privaterelay/utils.py:1: error: Skipping analyzing "django.conf": module is installed, but missing library stubs or py.typed marker
privaterelay/ftl_bundles.py:1: error: Skipping analyzing "django_ftl.bundles": module is installed, but missing library stubs or py.typed marker
...
emails/tests/mgmt_process_emails_from_sqs_tests.py:8: error: Skipping analyzing "markus.testing": module is installed, but missing library stubs or py.typed marker
emails/tests/mgmt_process_emails_from_sqs_tests.py:9: error: Skipping analyzing "boto3": module is installed, but missing library stubs or py.typed marker
emails/tests/mgmt_process_emails_from_sqs_tests.py:11: error: Library stubs not installed for "OpenSSL" (or incompatible with Python 3.9)
emails/tests/mgmt_process_emails_from_sqs_tests.py:13: error: Skipping analyzing "django.core.management": module is installed, but missing library stubs or py.typed marker
emails/tests/mgmt_process_emails_from_sqs_tests.py:14: error: Skipping analyzing "django.core.management.base": module is installed, but missing library stubs or py.typed marker
Found 282 errors in 107 files (checked 123 source files)

If you get different results, it may be because mypy caches some things (.mypy_cache/). You can delete .mypy_cache to clear the cache, or tell mypy to skip the cache:

$ mypy --no-incremental .

Install stub packages

First, mypy suggests installing stub packages:

...
api/authentication.py:6: error: Library stubs not installed for "requests" (or incompatible with Python 3.9)
api/authentication.py:6: note: Hint: "python3 -m pip install types-requests"
api/authentication.py:6: note: (or run "mypy --install-types" to install all missing stub packages)
...

Stub packages contain typing hints for other packages. Some packagers prefer this method of shipping types, rather than integrating them into the main package. The stub files (.pyi extension) end up looking a lot like C/C++ header files (.h, .hpp extensions), with function and variable declarations, and the .py files like C/C++ implementation files (.c, .cpp extensions). See Stub files for more information, or wait until later when we create our own.

Let's see the hinted packages:

$ mypy . | grep "Hint"
api/authentication.py:6: note: Hint: "python3 -m pip install types-requests"
emails/management/commands/process_emails_from_sqs.py:24: note: Hint: "python3 -m pip install types-pyOpenSSL"

Rather than use the shortcut mypy --install-types, let's install manually and pay attention to the versions:

$ pip install types-requests
Collecting types-requests
  Downloading types_requests-2.27.31-py3-none-any.whl (12 kB)
Collecting types-urllib3<1.27
  Using cached types_urllib3-1.26.15-py3-none-any.whl (13 kB)
Installing collected packages: types-urllib3, types-requests
Successfully installed types-requests-2.27.31 types-urllib3-1.26.15
$ pip install types-pyOpenSSL
Collecting types-pyOpenSSL
  Using cached types_pyOpenSSL-22.0.3-py3-none-any.whl (5.7 kB)
Collecting types-cryptography
  Using cached types_cryptography-3.3.21-py3-none-any.whl (30 kB)
Installing collected packages: types-cryptography, types-pyOpenSSL
Successfully installed types-cryptography-3.3.21 types-pyOpenSSL-22.0.3

And then add these to requirements.txt as well:

# linting
black==22.3.0
mypy==0.950
types-requests==2.27.31
types-pyOpenSSL==22.0.3

Running mypy ., we're down from 282 to 277 errors. However, some look like this:

privaterelay/migrations/0001_initial.py:3: error: Skipping analyzing "django.db": module is installed, but missing library stubs or py.typed marker
privaterelay/migrations/0001_initial.py:10: note: (Skipping most remaining errors due to unresolved imports or missing stubs; fix these first)

So, as we resolve third-party packages, we'll see more errors in the code itself.

So far, we've just made a couple of changes to requirements.txt. This is a good time to commit.

Install mypy plugins

A lot of the "Skipping analyzing" errors refer to Django. Django generates a lot of class definitions at runtime, such as model definitions using a metaclass. mypy has trouble with this dynamic code.

There are a few projects that have attempted to address this:

I went with django-stubs, because of the companion project typeddjango/djangorestframework-stubs. Add two more modules to requirements.txt:

# linting
black==22.3.0
mypy==0.950
types-requests==2.27.31
types-pyOpenSSL==22.0.3
django-stubs==1.12.0
djangorestframework-stubs==1.6.0

and install with:

pip install -r requirements.txt

There are additional installation steps. Create pyproject.toml with this content:

[tool.mypy]
plugins = ["mypy_django_plugin.main"]

[tool.django-stubs]
django_settings_module = "privaterelay.settings"

In privaterelay/settings.py, add an import:

import django_stubs_ext

and some code:

# Install mypy plugin patches
django_stubs_ext.monkeypatch()

This will allow using annotations like QuerySet[RelayAddress].

Now when we run mypy, it also imports Django settings and initializes the app, which can slow things down. We can speed them up again later.

mypy .
 [sentry] DEBUG: Setting up integrations (with default = True)
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.flask.FlaskIntegration: Flask is not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.bottle.BottleIntegration: Bottle not installed
...
 [sentry] DEBUG: [Tracing] Adding `sentry-trace` header fbf066f62eab4a1388d394470d650ecf-abe8ccf29e9e5fbb- to outgoing request to https://oauth.stage.mozaws.net/v1/jwks.
privaterelay/ftl_bundles.py:1: error: Skipping analyzing "django_ftl.bundles": module is installed, but missing library stubs or py.typed marker
privaterelay/settings.py:20: error: Skipping analyzing "decouple": module is installed, but missing library stubs or py.typed marker
...
emails/tests/mgmt_process_emails_from_sqs_tests.py:7: error: Skipping analyzing "botocore.exceptions": module is installed, but missing library stubs or py.typed marker
emails/tests/mgmt_process_emails_from_sqs_tests.py:8: error: Skipping analyzing "markus.testing": module is installed, but missing library stubs or py.typed marker
emails/tests/mgmt_process_emails_from_sqs_tests.py:9: error: Skipping analyzing "boto3": module is installed, but missing library stubs or py.typed marker
Found 67 errors in 24 files (checked 123 source files)

This plugin has further reduced errors from 277 to 67, and some are now in the code itself.

This is a good point to commit the changes:

  • privaterelay/settings.py - Add Monkey-patching of Django classes
  • requirements.txt - two new requirements
  • pyproject.toml - a new file

Handle installed 3rd party modules

The bulk of the remaining initial errors are for 3rd party libraries. These do no provide their own typing stubs, either in the shipped package or as a separate package. We can't do much with them at this stage other than to ignore them. In mypy terms, we should treat them as type Any, with unknown classes, variables, and methods.

Let's collect the errors:

$ mypy . 2> /dev/null | grep "Skipping analyzing" | cut -d: -f3- | sort | uniq -c | sort
   1  error: Skipping analyzing "allauth.account.models": module is installed, but missing library stubs or py.typed marker
   1  error: Skipping analyzing "allauth.account.signals": module is installed, but missing library stubs or py.typed marker
   1  error: Skipping analyzing "botocore.config": module is installed, but missing library stubs or py.typed marker
   1  error: Skipping analyzing "codetiming": module is installed, but missing library stubs or py.typed marker
...
   4  error: Skipping analyzing "boto3": module is installed, but missing library stubs or py.typed marker
   4  error: Skipping analyzing "waffle.models": module is installed, but missing library stubs or py.typed marker
   6  error: Skipping analyzing "botocore.exceptions": module is installed, but missing library stubs or py.typed marker
   7  error: Skipping analyzing "allauth.socialaccount.models": module is installed, but missing library stubs or py.typed marker

Here's what the Unix pipe wizardry is doing:

  • Runs mypy
  • Ignores the Sentry debug output by sending stderr (2>) to /dev/null
  • Looks for just the missing libraries with grep
  • Removes the filename and line number with cut
  • Sorts the results with sort
  • Counts unique instances with uniq -c
  • Sorts by count with sort again

Some imports appear once. These could be handle with an inline comment, like in api/tests/authentication_tests.py:

from datetime import datetime

from model_bakery import baker
import responses

from django.core.cache import cache
from django.test import RequestFactory, TestCase

from allauth.socialaccount.models import SocialAccount  # type: ignore

from ..authentication import FxaTokenAuthentication, get_cache_key

...

This makes it clear which import is untyped, in the file it is used. However, it becomes repetitive to repeat this in multiple files, and harder to rebase. I decided to use an alternate method of listing these in the configuration file.

Here's another command to just get the module names as a quoted list:

mypy . 2> /dev/null | grep "Skipping analyzing" | cut -d: -f3- | sort | uniq | cut -d" " -f5 | cut -d: -f1

Use this, with some added commas (,) to create a new section in pyproject.toml:

[[tool.mypy.overrides]]
module = [
    "allauth.account.models",
    "allauth.account.signals",
    "allauth.socialaccount.models",
    "allauth.socialaccount.providers.fxa.views",
    "boto3",
    "botocore.config",
    "botocore.exceptions",
    "codetiming",
    "debug_toolbar",
    "decouple",
    "dj_database_url",
    "django_filters",
    "django_ftl.bundles",
    "drf_yasg",
    "drf_yasg.views",
    "google_measurement_protocol",
    "jwcrypto",
    "jwcrypto.jwe",
    "jwcrypto.jwk",
    "markus",
    "markus.main",
    "markus.testing",
    "markus.utils",
    "oauthlib.oauth2.rfc6749.errors",
    "requests_oauthlib",
    "waffle",
    "waffle.models",
    "waffle.testutils",
]
ignore_missing_imports = true

With this change, mypy complains less:

$ mypy . 2> /dev/null
privaterelay/settings.py:36: error: Cannot find implementation or library stub for module named "silk"
privaterelay/settings.py:108: error: Incompatible types in assignment (expression has type "Tuple[str, str, str]", variable has type "Tuple[str, str]")
privaterelay/settings.py:636: error: Incompatible types in assignment (expression has type "Tuple[str]", variable has type "Tuple[str, str]")
privaterelay/migrations/0001_initial.py:10: error: Need type annotation for "dependencies" (hint: "dependencies: List[<type>] = ...")
phones/migrations/0001_initial.py:10: error: Need type annotation for "dependencies" (hint: "dependencies: List[<type>] = ...")
emails/migrations/0019_merge_20210825_1737.py:13: error: Need type annotation for "operations" (hint: "operations: List[<type>] = ...")
phones/views.py:4: error: Cannot find implementation or library stub for module named "phonenumbers"
phones/views.py:4: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
phones/views.py:6: error: Cannot find implementation or library stub for module named "twilio.rest"
phones/views.py:7: error: Cannot find implementation or library stub for module named "twilio.twiml.messaging_response"
emails/management/command_from_django_settings.py:22: error: First argument to namedtuple() should be "SettingToLocal", not "Setting"
privaterelay/urls.py:62: error: Cannot find implementation or library stub for module named "silk"
emails/tests/views_tests.py:635: error: Need type annotation for "sa"
Found 12 errors in 8 files (checked 123 source files)

If there is a syntax error in pyproject.toml, this will be listed first, and should be fixed.

Down from 67 to 12! This is another good time to commit.

Handle uninstalled 3rd party modules

The mypy error is different if a module is not installed:

privaterelay/settings.py:36: error: Cannot find implementation or library stub for module named "silk"

These are optional requirements, or are part of the in-progress phone number masking feature. Add these to a new section in pyproject.toml:

[[tool.mypy.overrides]]
# Optional modules or in-progress features
module = [
    "phonenumbers",
    "silk",
    "twilio.rest",
    "twilio.twiml",
    "twilio.twiml.messaging_response",
]
ignore_missing_imports = true

Once more:

$ mypy . 2> /dev/null
privaterelay/settings.py:108: error: Incompatible types in assignment (expression has type "Tuple[str, str, str]", variable has type "Tuple[str, str]")
privaterelay/settings.py:636: error: Incompatible types in assignment (expression has type "Tuple[str]", variable has type "Tuple[str, str]")
privaterelay/migrations/0001_initial.py:10: error: Need type annotation for "dependencies" (hint: "dependencies: List[<type>] = ...")
phones/migrations/0001_initial.py:10: error: Need type annotation for "dependencies" (hint: "dependencies: List[<type>] = ...")
emails/migrations/0019_merge_20210825_1737.py:13: error: Need type annotation for "operations" (hint: "operations: List[<type>] = ...")
emails/management/command_from_django_settings.py:22: error: First argument to namedtuple() should be "SettingToLocal", not "Setting"
emails/tests/views_tests.py:635: error: Need type annotation for "sa"
Found 7 errors in 6 files (checked 123 source files)

We started with 282 errors, and now there are 7. The remaining issues are in the code rather than the imports. This is another good time to commit.

Fix a bug in namedtuple()

This mypy error identifies a bug:

emails/management/command_from_django_settings.py:22: error: First argument to namedtuple() should be "SettingToLocal", not "Setting"

The code:

SettingToLocal = namedtuple(
    "Setting", ["setting_key", "local_name", "help_str", "validator"]
)

Because of the mismatch between the first argument ("Setting") and the variable (SettingToLocal), the generated class name does not match the import name:

$ ./manage.py shell
Python 3.9.11 (main, Apr 26 2022, 09:22:38) 
[Clang 13.1.6 (clang-1316.0.21.2.3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> from emails.management.command_from_django_settings import SettingToLocal
>>> SettingToLocal
<class 'emails.management.command_from_django_settings.Setting'>
>>> x = SettingToLocal("TEST", "test", "Are we in test mode?", bool)
>>> x
Setting(setting_key='TEST', local_name='test', help_str='Are we in test mode?', validator=<class 'bool'>)

The fix it to make the two agree:

SettingToLocal = namedtuple(
    "SettingToLocal", ["setting_key", "local_name", "help_str", "validator"]
)

and now the repr (and mypy) are happy:

$ ./manage.py shell
Python 3.9.11 (main, Apr 26 2022, 09:22:38) 
[Clang 13.1.6 (clang-1316.0.21.2.3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> from emails.management.command_from_django_settings import SettingToLocal
>>> SettingToLocal
<class 'emails.management.command_from_django_settings.SettingToLocal'>
>>> x = SettingToLocal("TEST", "test", "Are we in test mode?", bool)
>>> x
SettingToLocal(setting_key='TEST', local_name='test', help_str='Are we in test mode?', validator=<class 'bool'>)

Down to 6 errors. One line may seem thin for a commit, but you'll have a better commit message.

Fix settings

There are two errors in privaterelay/settings.py:

privaterelay/settings.py:108: error: Incompatible types in assignment (expression has type "Tuple[str, str, str]", variable has type "Tuple[str, str]")
privaterelay/settings.py:636: error: Incompatible types in assignment (expression has type "Tuple[str]", variable has type "Tuple[str, str]")

The code triggering the first error:

CSP_SCRIPT_SRC = (
    "'self'",
    "https://www.google-analytics.com/",
)
if USE_SILK:
    CSP_SCRIPT_SRC += ("'unsafe-inline'",)

mypy will infer the type of a variable based on the assignment, and will raise an error if a later assignment changes the inferred type. It detects that CSP_SCRIPT_SRC is a two-element tuple, but then gets assigned a three-element tuple.

There are a couple ways to fix this. One is to change CSP_SCRIPT_SRC to a variable-sized tuple:

CSP_SCRIPT_SRC: tuple[str, ...] = (
    "'self'",
    "https://www.google-analytics.com/",
)
if USE_SILK:
    CSP_SCRIPT_SRC += ("'unsafe-inline'",)

The section ": tuple[str, ...]" is a type annotation. It declares the type as a tuple of str (Unicode strings) of variable length. See type hints cheat sheet for more examples, and use the Python 3.9+ variants when following examples.

Also, type declarations use square brackets ([]). You'll see a lot of (sometimes nested) square brackets.

Python evaluates but ignores types at runtime. This sometimes causes issues, which can be solved by adding a from __future__ import annotations at the top, which will enable the Python 3.11 default of skipping evaluation at runtime.

Another way to fix it is to change to a list, which is variable length by default:

CSP_SCRIPT_SRC = [
    "'self'",
    "https://www.google-analytics.com/",
]
if USE_SILK:
    CSP_SCRIPT_SRC += ["'unsafe-inline'"]

Even better:

CSP_SCRIPT_SRC = [
    "'self'",
    "https://www.google-analytics.com/",
]
if USE_SILK:
    CSP_SCRIPT_SRC.append("'unsafe-inline'")

Fix this issue and the next one, which is very similar:

if DEBUG:
    DRF_RENDERERS = (
        "rest_framework.renderers.BrowsableAPIRenderer",
        "rest_framework.renderers.JSONRenderer",
    )
else:
    DRF_RENDERERS = ("rest_framework.renderers.JSONRenderer",)

We're down to 4 mypy errors. Maybe commit privaterelay/settings.py before moving on.

Annotate migrations

There are a few errors in migration files:

privaterelay/migrations/0001_initial.py:10: error: Need type annotation for "dependencies" (hint: "dependencies: List[<type>] = ...")
phones/migrations/0001_initial.py:10: error: Need type annotation for "dependencies" (hint: "dependencies: List[<type>] = ...")
emails/migrations/0019_merge_20210825_1737.py:13: error: Need type annotation for "operations" (hint: "operations: List[<type>] = ...")

In Python, a list or other collection can store different types of data, like [1, true, ("a", "tuple")]. mypy expects most collections to have a single type, like a list of strings ["one", "two", three"]. It infers the types of collections from the first assignment, and cries if the assignment is an empty collection, like in privaterelay/migrations/0001_initial.py:

    dependencies = []

What should this be? One way is to look at the next migration, privaterelay/migrations/0002_add_fxa_uid_to_invitation.py:

    dependencies = [
        ("privaterelay", "0001_initial"),
    ]

It looks like a declaration of the migrations that need to run before this one. With experience, you'll know how to annotate this. But instead, we can get mypy to help. Add this to privaterelay/migrations/0002_add_fxa_uid_to_invitation.py:

    dependencies = [
        ("privaterelay", "0001_initial"),
    ]
    reveal_type(dependencies)

Running mypy, we get this extra line in the output:

privaterelay/migrations/0002_add_fxa_uid_to_invitation.py:12: note: Revealed type is "builtins.list[Tuple[builtins.str, builtins.str]]"

reveal_type(expr) and the similar reveal_locals() are understood by mypy, and can be used to debug the inferred types. They are not Python built-ins, and need to be removed before running the code with Python.

After removing reveal_type(dependencies) from 0002_add_fxa_uid_to_invitation.py, we can annotate 0001_initial.py:

    dependencies: list[tuple[str, str]] = []

The same annotation can be used in phones/migrations/0001_initial.py.

What about emails/migrations/0019_merge_20210825_1737.py?

    dependencies = [
        ("emails", "0018_relayaddress_domain"),
        ("emails", "0018_reply_squashed_0022_auto_20210817_0327"),
    ]

    operations = []

It appears to be a placeholder, to define that the two migrations need to run before proceeding. Using the same reveal_type trick gives:

privaterelay/migrations/0002_add_fxa_uid_to_invitation.py:41: note: Revealed type is "builtins.list[django.db.migrations.operations.base.Operation]"

Leading to the annotation:

    operations: list[migrations.operations.base.Operation] = []

Down to 1 error! Revert any reveal_type changes to privaterelay/migrations/0002_add_fxa_uid_to_invitation.py, and commit the three changed migrations.

Annotate the test

The remaining error:

emails/tests/views_tests.py:635: error: Need type annotation for "sa"

Here's the code in emails/tests/views_tests.py:

class SnsMessageTest(TestCase):
    def setUp(self) -> None:
        self.user = baker.make(User)
        self.profile = self.user.profile_set.first()
        self.sa = baker.make(SocialAccount, user=self.user, provider="fxa")
        # test.com is the second domain listed and has the numerical value 2
        self.address = baker.make(
            RelayAddress, user=self.user, address="sender", domain=2
        )

        self.bucket = "test-bucket"
        self.key = "/emails/objectkey123"

        patcher = patch(
            "emails.views._get_text_html_attachments",
            return_value=("text", "html", "attachments"),
        )
        patcher.start()
        self.addCleanup(patcher.stop)

Why is this code causing mypy to complain, while the rest of the code raises no errors? Because the function is annotated with a return type:

    def setUp(self) -> None:

The bit -> None tells mypy that this is a function that does not return a value (in Python, these return None), but also that it should go ahead and check types in the function. None of the other code has annotations, so mypy skips them. The first solution is to remove that return annotation:

    def setUp(self):

mypy will skip this as well, and the error is gone. However, it can also be useful to have type checks of test code, to find missing assertions and tests that don't check what was intended.

The first solution is to annotate the variable, as requested:

    def setUp(self) -> None:
        self.user = baker.make(User)
        self.profile = self.user.profile_set.first()
        self.sa: SocialAccount = baker.make(SocialAccount, user=self.user, provider="fxa")
        # test.com is the second domain listed and has the numerical value 2
        self.address = baker.make(
            RelayAddress, user=self.user, address="sender", domain=2
        )

        self.bucket = "test-bucket"
        self.key = "/emails/objectkey123"

        patcher = patch(
            "emails.views._get_text_html_attachments",
            return_value=("text", "html", "attachments"),
        )
        patcher.start()
        self.addCleanup(patcher.stop)
        # TODO: remove Debugging
        reveal_type(self.user)
        reveal_type(self.profile)
        reveal_type(self.sa)
        reveal_type(self.address)

This quiets mypy, but some of the revealed types are not satisfying:

emails/tests/views_tests.py:650: note: Revealed type is "django.contrib.auth.models.User"
emails/tests/views_tests.py:651: note: Revealed type is "Union[emails.models.Profile, None]"
emails/tests/views_tests.py:652: note: Revealed type is "Any"
emails/tests/views_tests.py:653: note: Revealed type is "emails.models.RelayAddress"

The code self.user.profile_set.first() could return None, so the type is Union[emails.models.Profile, None], which is more commonly annotated as Optional[Profile], or maybe Profile | None in Python 3.10. And, while it accepted our annotation, it still thinks self.sa is Any, which will cause issues when we use it in a test.

Instead, we can use type narrowing to tell mypy that a variable is a different type than it inferred. We can assert that self.profile is not None, and that self.sa is a SocialAccount:

    def setUp(self) -> None:
        self.user = baker.make(User)
        self.profile = self.user.profile_set.first()
        assert self.profile is not None
        self.sa: SocialAccount = baker.make(SocialAccount, user=self.user, provider="fxa")
        assert isinstance(self.sa, SocialAccount)
        # test.com is the second domain listed and has the numerical value 2
        self.address = baker.make(
            RelayAddress, user=self.user, address="sender", domain=2
        )

        self.bucket = "test-bucket"
        self.key = "/emails/objectkey123"

        patcher = patch(
            "emails.views._get_text_html_attachments",
            return_value=("text", "html", "attachments"),
        )
        patcher.start()
        self.addCleanup(patcher.stop)
        # TODO: remove Debugging
        reveal_type(self.user)
        reveal_type(self.profile)
        reveal_type(self.sa)
        reveal_type(self.address)

The revealed types:

emails/tests/views_tests.py:652: note: Revealed type is "django.contrib.auth.models.User"
emails/tests/views_tests.py:653: note: Revealed type is "emails.models.Profile"
emails/tests/views_tests.py:654: note: Revealed type is "Any"
emails/tests/views_tests.py:655: note: Revealed type is "emails.models.RelayAddress"
Success: no issues found in 123 source files

Success for self.profile, but failure for self.sa. What's going on?

The root issue is that SocialAccount comes from django-allauth:

from allauth.socialaccount.models import SocialAccount

This is one of the 3rd party modules that we added to pyproject.toml and told mypy to explicitly ignore types. That means that any members get type Any, and further details are lost.

There are ways around this, but for now, let's be happy with the annotated version, without reveal_type:

    def setUp(self) -> None:
        self.user = baker.make(User)
        self.profile = self.user.profile_set.first()
        assert self.profile is not None
        self.sa: SocialAccount = baker.make(SocialAccount, user=self.user, provider="fxa")
        # test.com is the second domain listed and has the numerical value 2
        self.address = baker.make(
            RelayAddress, user=self.user, address="sender", domain=2
        )

        self.bucket = "test-bucket"
        self.key = "/emails/objectkey123"

        patcher = patch(
            "emails.views._get_text_html_attachments",
            return_value=("text", "html", "attachments"),
        )
        patcher.start()
        self.addCleanup(patcher.stop)

We now have a clean run of mypy:

$ mypy . 2> /dev/null
Success: no issues found in 123 source files

We can run pytest . to ensure that the changes didn't break anything. This is important since mypy checks static types but not runtime behaviour, and pytest checks runtime behaviour.

This is a great time to commit.

Peek at strict output

Is that it? That wasn't a lot of code changes. For a peek at what is left, run mypy in strict mode:

$ mypy --strict . 2> /dev/null
emails/models.py:35: error: Function is missing a return type annotation
emails/models.py:49: error: Function is missing a type annotation
emails/models.py:57: error: Call to untyped function "has_bad_words" in typed context
emails/models.py:59: error: Call to untyped function "is_blocklisted" in typed context
emails/models.py:63: error: Call to untyped function "hash_subdomain" in typed context
...
emails/tests/mgmt_process_emails_from_sqs_tests.py:405: error: Function is missing a type annotation
emails/tests/mgmt_process_emails_from_sqs_tests.py:407: error: Call to untyped function "make_client_error" in typed context
api/urls.py:19: error: Function is missing a type annotation
api/urls.py:22: error: Function is missing a type annotation
Found 1136 errors in 48 files (checked 123 source files)

That's a lot of errors, and a lot of work if mypy --strict were needed across the codebase. We may get there someday, but an incremental approach is probably better:

  • Use type annotations for new code and tests
  • Add annotations to "leaf" code like management commands, that other code doesn't use
  • Add annotations to utility code that lots of code uses
  • Convert entire files to enforced strict mode, with a file-based override in pyproject.toml.

Since mypy checks static but not runtime behaviour, it is also important to write tests that cover the runtime behaviour of code.

The strict checks are interesting data however, and may help find code where types are a little lose and could use some annotation. This error is interesting, and suggests we may need to ensure we're not getting a None:

emails/models.py:216: error: Unsupported operand types for + ("None" and "timedelta")
emails/models.py:216: note: Left operand is of type "Optional[datetime]"

This looks like a bug, but is probably a missing @staticmethod decorator:

emails/views.py:789: error: Argument 1 to "make_domain_address" has incompatible type "Profile"; expected "DomainAddress"

I think it is useful to periodically run in strict mode, to see what's there and how far we have to go.

Tighten up configuration

Let's add a few more lines to pyproject.toml:

[tool.mypy]
plugins = ["mypy_django_plugin.main"]
python_version = "3.9"
show_error_codes = true
warn_return_any = true
warn_unused_configs = true
warn_unused_ignores = true

This does not change the number of detected errors (0 for mypy, 1136 for mypy --strict). The settings:

  • python_version = "3.9" - Tell mypy to expect Python 3.9 syntax
  • show_error_codes = true - Show error code such as error: Function is missing a type annotation [no-untyped-def]. This is useful for looking up documentation. There's also an --ignore-without-code command line parameter.
  • warn_return_any = true - Warn if a function is defined to return Any, which doesn't provide useful information.
  • warn_unused_configs = true - Warn if there are unexpected configuration items. This can help detect typos in pyproject.toml.
  • warn_unused_ignores = true - Warn if a type: ignore is unneeded. This will not tell us if a 3rd-party module now has type information, however. We should still follow the release notes.

Commit again, if you like small commits.

Create a stub for 3rd Party Module decouple

One of the untyped 3rd party modules is decouple, used to load settings from the environment, mostly in privaterelay/settings.py. It is also a fairly simple module, and a good candidate for demonstrating stub creation.

You can see the result of using an untyped module by revealing types in privaterelay/settings.py. Add this to the end:

reveal_locals()

and then run:

$ mypy privaterelay/settings.py

Looking at the results, many of the settings variables have a type of Any:

privaterelay/settings.py:758: note:     SECURE_SSL_HOST: Any
privaterelay/settings.py:758: note:     SECURE_SSL_REDIRECT: Any
privaterelay/settings.py:758: note:     SENTRY_RELEASE: Any
privaterelay/settings.py:758: note:     SERVE_ADDON: Any
privaterelay/settings.py:758: note:     SERVE_REACT: Any

These could have more precise types. For example, SECURE_SSL_HOST should have a type Optional[str], and SECURE_SSL_REDIRECT should have type bool:

SECURE_SSL_HOST = config("DJANGO_SECURE_SSL_HOST", None)
SECURE_SSL_REDIRECT = config("DJANGO_SECURE_SSL_REDIRECT", False)

We could annotate all these variables, but if instead we add an annotation stub for decouple, then mypy will be able to better determine types and potentially warn us about incorrect usage.

This is a long example, and introduces a lot. However, the concepts are important for writing annotations for our own code, and I want you to not be scared to write a stub for a third-party library, especially if our interface to the library is fairly simple.

Creating a draft stub

First, create a folder for our stubs:

$ mkdir mypy_stubs

And update pyproject.toml:

[tool.mypy]
plugins = ["mypy_django_plugin.main"]
mypy_path = "$MYPY_CONFIG_FILE_DIR/mypy_stubs"
...

You may also need to get this in an environment variable for your code editor:

export MYPYPATH=/path/to/fx-private-relay/mypy_stubs

Now, let's generate a stub file:

$ stubgen -o mypy_stubs -m decouple
Processed 1 modules
Generated mypy_stubs/decouple.pyi

stubgen is included with mypy and is used for automatic stub generation. It generates draft stubs, which then need some hand tuning. In fact, this one doesn't even pass:

$ mypy . 2> /dev/null
mypy_stubs/decouple.pyi:5: error: Cannot assign multiple types to name "text_type" without an explicit "Type[...]" annotation  [misc]
mypy_stubs/decouple.pyi:5: error: Name "unicode" is not defined  [name-defined]
Found 2 errors in 1 file (checked 124 source files)

You may want to commit the draft stub, even though it fails, so you can see the future changes in git history.

Remove unused and error-causing code from draft stub

The next part is to examine the generated file mypy_stubs/decouple.pyi, versus the source code.

Here's the top of the stub file:

from _typeshed import Incomplete

PYVERSION: Incomplete
text_type = str
text_type = unicode
read_config: Incomplete
DEFAULT_ENCODING: str
TRUE_VALUES: Incomplete
FALSE_VALUES: Incomplete

def strtobool(value): ...

Versus the code:

# coding: utf-8
import os
import sys
import string
from shlex import shlex
from io import open
from collections import OrderedDict

# Useful for very coarse version differentiation.
PYVERSION = sys.version_info


if PYVERSION >= (3, 0, 0):
    from configparser import ConfigParser
    text_type = str
else:
    from ConfigParser import SafeConfigParser as ConfigParser
    text_type = unicode

if PYVERSION >= (3, 2, 0):
    read_config = lambda parser, file: parser.read_file(file)
else:
    read_config = lambda parser, file: parser.readfp(file)


DEFAULT_ENCODING = 'UTF-8'


# Python 3.10 don't have strtobool anymore. So we move it here.
TRUE_VALUES = {"y", "yes", "t", "true", "on", "1"}
FALSE_VALUES = {"n", "no", "f", "false", "off", "0"}

The code is being flexible, trying to work with both Python 2.7 and 3.0 and 3.2. mypy supports python version and system platform checks, but we don't have to - we're solidly in Python 3.9.

We need to remove or comment out some stuff to get the stub to "compile" without errors. We also don't need variables that our code doesn't import. Our use of decouple is:

  • strtobool
  • config
  • Choices
  • CSV

The entire opening section can be deleted or commented out. I'm using deletion for the demo, but commenting out may be easier for responding to package upgrades. Maybe.

The new top of mypy_stubs/decouple.pyi:

from _typeshed import Incomplete

def strtobool(value): ...
...

This takes care of the initial mypy errors with this draft stub. Now to make those stubs useful!

Convert strtobool

Looking at the code for strtobool, it expects a string and returns True or False, or raises a ValueError. There's no way to annotate expected exceptions in mypy at this time, so we'll focus on the return values with this improved signature:

def strtobool(value: str) -> bool: ...

That's it! Simple utility functions are easy.

Convert config

Next is config, which appears in the stub as:

class AutoConfig:
    SUPPORTED: Incomplete
    encoding: Incomplete
    search_path: Incomplete
    config: Incomplete
    def __init__(self, search_path: Incomplete | None = ...) -> None: ...
    def __call__(self, *args, **kwargs): ...

config: Incomplete

and in the code as:

class AutoConfig(object):
    """
    Autodetects the config file and type.
    Parameters
    ----------
    search_path : str, optional
        Initial search path. If empty, the default search path is the
        caller's path.
    """
    ...

# A pré-instantiated AutoConfig to improve decouple's usability
# now just import config and start using with no configuration.
config = AutoConfig()

decouple does some work to load the configuration from settings.ini or .env into a "Repository", and then returns a Config instance which looks for overrides in the environment. It is all super clever! However, our code just uses config as a function, such as:

SECURE_SSL_HOST = config("DJANGO_SECURE_SSL_HOST", None)
SECURE_SSL_REDIRECT = config("DJANGO_SECURE_SSL_REDIRECT", False)

The core of decouple is the Config.get function:

    def get(self, option, default=undefined, cast=undefined):
        """
        Return the value for option or default if defined.
        """

        # We can't avoid __contains__ because value may be empty.
        if option in os.environ:
            value = os.environ[option]
        elif option in self.repository:
            value = self.repository[option]
        else:
            if isinstance(default, Undefined):
                raise UndefinedValueError('{} not found. Declare it as envvar or define a default value.'.format(option))

            value = default

        if isinstance(cast, Undefined):
            cast = self._cast_do_nothing
        elif cast is bool:
            cast = self._cast_boolean

        return cast(value)

When we call config("VAR_NAME", default="true", cast=bool), we're calling Config.get() through a few layers of indirection.

Let's demo some usage of decouple:

$ export BOOL_VAR=true
$ export STR_VAR=foobar
$ export INT_VAR=42
$ python
Python 3.9.11 (main, Apr 26 2022, 09:22:38) 
[Clang 13.1.6 (clang-1316.0.21.2.3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from decouple import config
>>> config("BOOL_VAR")
'true'
>>> config("NO_VAR", "baz")
'baz'
>>> config("NO_VAR", 123)
123
>>> config("BOOL_VAR", 123, bool)
True
>>> config("INT_VAR", cast=int)
42
>>> config("STR_VAR", cast=str)
'foobar'
>>> config("STR_VAR")
'foobar'
>> config("NO_VAR", "False", cast=bool)
False
>>> exit
Use exit() or Ctrl-D (i.e. EOF) to exit
>>> exit()
$

There's also some combos that raise exceptions:

config("NO_VAR")  # decouple.UndefinedValueError: NO_VAR not found. Declare it as envvar or define a default value.
config("INT_VAR", cast=bool)  # ValueError: Invalid truth value: 42
config("BOOL_VAR", cast=int)  # ValueError: invalid literal for int() with base 10: 'true'
config("NO_VAR", None, cast=bool)  # ValueError: Invalid truth value: none

mypy and Python annotations do not cover exceptions, but there are some tricks. There is a NoReturn for annotating a function that only raises an exception. You can sometime use annotations to cause a mypy error based on argument types. For now, just know that we're not worrying about runtime exceptions.

We can remove / comment all the stuff about UndefinedValueError and RepositorySecret, and treat config as a function, using the something like the signature of Config.get:

from typing import Any, Callable, Optional

...

def config(
    option: str,
    default: Optional[Any] = None,
    cast: Optional[Callable[[str], Any]] = None,
) -> Any: ...

This is a good time to mention that black can formant stub .pyi files as well.

There may be some new concepts introduced here:

  • The standard library module typing has useful members for type annotation
  • Optional is used to say a type can be None or the hinted type. It is equivalent to "Union[None,<Type>]", and (in Python 3.10) "<Type>|None".
  • Any means anything, and is often useless for type checking.
  • Callable is a callable object, such as a function, a lambda, or a class implementing __call__. It takes two arguments - a list of argument types, and the return type of the function.

This stub passes mypy, but throws away some useful information. A fuller description of config() would be:

  1. If only option is passed, the (non-exception) return is type str
  2. If option and a str default are passed, the return is type str
  3. If option and a non-str default are passed, the return is either str or the type of default
  4. If option, default, and cast are passed, cast should accept str or the type of default and the return is the return type of cast

For complicated functions with optional parameters, overload is useful to tease out the different patterns. It is easiest to demonstrate.

First, the most basic call - only the option parameter:

from typing import Any, Callable, Optional, overload

...

@overload
def config(option: str) -> str

This results in many errors:

$ mypy . 2> /dev/null
mypy_stubs/decouple.pyi:7: error: Single overload definition, multiple required  [misc]
privaterelay/settings.py:53: error: Too many arguments for "config"  [call-arg]
privaterelay/settings.py:53: error: Unexpected keyword argument "cast" for "config"  [call-arg]
mypy_stubs/decouple.pyi:8: note: "config" defined here
...
phones/views.py:17: error: Too many arguments for "config"  [call-arg]
phones/views.py:18: error: Too many arguments for "config"  [call-arg]
Found 112 errors in 3 files (checked 124 source files)

Next, add an overload with a default that is a string:

@overload
def config(option: str, default: str) -> str: ...
$ mypy . 2> /dev/null
privaterelay/settings.py:53: error: No overload variant of "config" matches argument types "str", "None", "Type[str]"  [call-overload]
privaterelay/settings.py:53: note: Possible overload variants:
privaterelay/settings.py:53: note:     def config(option: str) -> str
privaterelay/settings.py:53: note:     def config(option: str, default: str) -> str
...
Found 58 errors in 2 files (checked 124 source files)

Next, the non-string default. This requires a new part of the typing toolkit, TypeVar, used for generics, and Union, used to specify multiple types.

from typing import Any, Callable, Optional, TypeVar, Union, overload

_DefaultType = TypeVar('_DefaultType')
...

@overload
def config(option: str, default: _DefaultType) -> Union[str, _DefaultType]: ...
$ mypy . 2> /dev/null
privaterelay/settings.py:53: error: No overload variant of "config" matches argument types "str", "None", "Type[str]"  [call-overload]
privaterelay/settings.py:53: note: Possible overload variants:
privaterelay/settings.py:53: note:     def config(option: str) -> str
privaterelay/settings.py:53: note:     def config(option: str, default: str) -> str
privaterelay/settings.py:53: note:     def [_DefaultType] config(option: str, default: _DefaultType) -> Union[str, _DefaultType]
...
Found 33 errors in 1 file (checked 124 source files)

And finally, the cast parameter:

_CastReturnType = TypeVar("_CastReturnType")

@overload
def config(
    option: str, default: _DefaultType, cast: Callable[[_DefaultType], _CastReturnType]
) -> _CastReturnType: ...
$ mypy . 2> /dev/null
privaterelay/settings.py:62: error: Argument 1 to "get" of "dict" has incompatible type "Optional[str]"; expected "str"  [arg-type]
privaterelay/settings.py:65: error: Need type annotation for "INTERNAL_IPS"  [var-annotated]
privaterelay/settings.py:431: error: Incompatible types in assignment (expression has type "datetime", variable has type "str")  [assignment]
privaterelay/settings.py:697: error: Incompatible types in assignment (expression has type "None", variable has type "str")  [assignment]
Found 4 errors in 1 file (checked 124 source files)

These errors were not detected by previous stubs of config, so we've made some progress toward better code.

Here's mypy_stubs/decouple.pyi so far:

from _typeshed import Incomplete
from typing import Any, Callable, Optional, TypeVar, Union, overload

_DefaultType = TypeVar("_DefaultType")
_CastReturnType = TypeVar("_CastReturnType")

def strtobool(value: str) -> bool: ...
@overload
def config(option: str) -> str: ...
@overload
def config(option: str, default: str) -> str: ...
@overload
def config(option: str, default: _DefaultType) -> Union[str, _DefaultType]: ...
@overload
def config(
    option: str, default: Any, cast: Callable[[Any], _CastReturnType]
) -> _CastReturnType: ...

class Csv:
    cast: Incomplete
    delimiter: Incomplete
    strip: Incomplete
    post_process: Incomplete
    def __init__(
        self, cast=..., delimiter: str = ..., strip=..., post_process=...
    ) -> None: ...
    def __call__(self, value): ...

class Choices:
    flat: Incomplete
    cast: Incomplete
    choices: Incomplete
    def __init__(
        self, flat: Incomplete | None = ..., cast=..., choices: Incomplete | None = ...
    ) -> None: ...
    def __call__(self, value): ...

We can also remove "decouple" from the override list in pyproject.toml. Our ignore is already ignored, so this is mostly for documentation.

This looks like a commit point to me!

Fix type issues in settings - RELAY_CHANNEL

Let's get mypy back to no error by fixing the 4 detected errors.

The first error:

privaterelay/settings.py:62: error: Argument 1 to "get" of "dict" has incompatible type "Optional[str]"; expected "str"  [arg-type]

And the code:

ORIGIN_CHANNEL_MAP = {
    "http://127.0.0.1:8000": "local",
    "https://dev.fxprivaterelay.nonprod.cloudops.mozgcp.net": "dev",
    "https://stage.fxprivaterelay.nonprod.cloudops.mozgcp.net": "stage",
    "https://relay.firefox.com": "prod",
}
RELAY_CHANNEL = ORIGIN_CHANNEL_MAP.get(SITE_ORIGIN, "prod")

OK, what's going on? We can add some debug lines:

reveal_type(ORIGIN_CHANNEL_MAP)
reveal_type(SITE_ORIGIN)

Revealing:

privaterelay/settings.py:62: note: Revealed type is "builtins.dict[builtins.str, builtins.str]"
privaterelay/settings.py:63: note: Revealed type is "Union[builtins.str, None]"

mypy determined the type of ORIGIN_CHANNEL_MAP as dict[str, str], and SITE_ORIGIN as Optional[str]. If SITE_ORIGIN is None, that's OK, RELAY_CHANNEL gets the default of prod. But mypy still points out that a None could be passed where a str is expected.

There are several ways to fix this. Here's one:

from typing import Optional
...
ORIGIN_CHANNEL_MAP: dict[Optional[str], str] = {
    "http://127.0.0.1:8000": "local",
    "https://dev.fxprivaterelay.nonprod.cloudops.mozgcp.net": "dev",
    "https://stage.fxprivaterelay.nonprod.cloudops.mozgcp.net": "stage",
    "https://relay.firefox.com": "prod",
}
RELAY_CHANNEL = ORIGIN_CHANNEL_MAP.get(SITE_ORIGIN, "prod")

or:

ORIGIN_CHANNEL_MAP: {
    "http://127.0.0.1:8000": "local",
    "https://dev.fxprivaterelay.nonprod.cloudops.mozgcp.net": "dev",
    "https://stage.fxprivaterelay.nonprod.cloudops.mozgcp.net": "stage",
    "https://relay.firefox.com": "prod",
}
RELAY_CHANNEL = ORIGIN_CHANNEL_MAP.get(SITE_ORIGIN, "prod") if SITE_ORIGIN else "prod"

or:

ORIGIN_CHANNEL_MAP: {
    "http://127.0.0.1:8000": "local",
    "https://dev.fxprivaterelay.nonprod.cloudops.mozgcp.net": "dev",
    "https://stage.fxprivaterelay.nonprod.cloudops.mozgcp.net": "stage",
    "https://relay.firefox.com": "prod",
}
if SITE_ORIGIN is None:
    RELAY_CHANNEL = "prod"
else:
    RELAY_CHANNEL = ORIGIN_CHANNEL_MAP.get(SITE_ORIGIN, "prod")

or fail if SITE_ORIGIN is unset, but ensure str

SITE_ORIGIN = config("SITE_ORIGIN")

or make it an empty string:

SITE_ORIGIN = config("SITE_ORIGIN", "")

or make it default to the production URL:

SITE_ORIGIN = config("SITE_ORIGIN", "https://relay.firefox.com")

There are several ways to fix it. Some change the way the code works, which could either be an improvement or a bug waiting to happen. Some are stylistic choices. Use discretion, code review, and establish team norms.

My favorite is the first. The extra ": dict[Optional[str], str]" signals that this dictionary will be used in interesting ways, namely as a lookup for a default value.

You might have also found another issue - if you add a syntax error to settings.py, then mypy will fail as well, since the Django plugin includes processing it. Another reason to commit often, like now!

Fix type issues in settings - INTERNAL_IPS and Csv()

Next is INTERNAL_IPS

privaterelay/settings.py:65: error: Need type annotation for "INTERNAL_IPS"  [var-annotated]

The code:

if DEBUG:
    INTERNAL_IPS = config("DJANGO_INTERNAL_IPS", default=[])

There's a few issues here:

  • If we're not in debug, then INTERNAL_IPS is not defined. This is OK, Django will use the default of an empty list. mypy is OK with that, but might get confused.
  • mypy doesn't like empty lists (default=[]), because it can't tell the type of the contained elements
  • INTERNAL_IPS will be either a str or an empty list, but it should be a list[str]. This seems like a bug!

The fix requires decouple.Csv, which outputs a list of str, combined with changing the default to an empty string, suitable as an input for Csv:

if DEBUG:
    INTERNAL_IPS = config("DJANGO_INTERNAL_IPS", default="", cast=Csv())

This currently gives INTERNAL_IPS a type of "ANY", since we haven't updated the draft stub for decouple.CSV. Let's look at that code:

class Csv(object):
    """
    Produces a csv parser that return a list of transformed elements.
    """

    def __init__(self, cast=text_type, delimiter=',', strip=string.whitespace, post_process=list):
        """
        Parameters:
        cast -- callable that transforms the item just before it's added to the list.
        delimiter -- string of delimiters chars passed to shlex.
        strip -- string of non-relevant characters to be passed to str.strip after the split.
        post_process -- callable to post process all casted values. Default is `list`.
        """
        self.cast = cast
        self.delimiter = delimiter
        self.strip = strip
        self.post_process = post_process

    def __call__(self, value):
        """The actual transformation"""
        transform = lambda s: self.cast(s.strip(self.strip))

        splitter = shlex(value, posix=True)
        splitter.whitespace = self.delimiter
        splitter.whitespace_split = True

        return self.post_process(transform(s) for s in splitter)

Csv has a lot of options and flexibility - and we're using none of it. We use the defaults for INTERNAL_IPS and AWS_SNS_TOPIC:

AWS_SNS_TOPIC = set(config("AWS_SNS_TOPIC", "", cast=Csv()))

Let's confirm what Csv does with defaults:

$ python
Python 3.9.11 (main, Apr 26 2022, 09:22:38) 
[Clang 13.1.6 (clang-1316.0.21.2.3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from decouple import Csv
>>> Csv()('foo,bar')
['foo', 'bar']
>>> Csv()('foo')
['foo']
>>> Csv()(123)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/john/.virtualenvs/fx-private-relay/lib/python3.9/site-packages/decouple.py", line 278, in __call__
    return self.post_process(transform(s) for s in splitter)
  File "/Users/john/.virtualenvs/fx-private-relay/lib/python3.9/site-packages/decouple.py", line 278, in <genexpr>
    return self.post_process(transform(s) for s in splitter)
  File "/Users/john/.pyenv/versions/3.9.11/lib/python3.9/shlex.py", line 300, in __next__
    token = self.get_token()
  File "/Users/john/.pyenv/versions/3.9.11/lib/python3.9/shlex.py", line 109, in get_token
    raw = self.read_token()
  File "/Users/john/.pyenv/versions/3.9.11/lib/python3.9/shlex.py", line 140, in read_token
    nextchar = self.instream.read(1)
AttributeError: 'int' object has no attribute 'read'
>>> Csv()('1,2,3')
['1', '2', '3']

Here's a stub for Csv that works for no arguments:

class Csv:
    # Note: there are additional parameters that Relay (currently) doesn't use:
    # cast, delimiter, strip, post_process
    def __init__(self) -> None: ...
    def __call__(self, value: str) -> list[str]: ...

If we do need one or more parameters for Csv in the future, we'll need to revisit this stub. But, that day may not come, or may be after python-decouple ships its own types.

This now sets the type of INTERNAL_IPS to list[str] rather than Any. Sometimes, you need to go a little farther than just eliminating the mypy error. This has also fixed a subtle bug. Time for a small commit!

Fix type issues in settings - PREMIUM_RELEASE_DATE

The next error:

privaterelay/settings.py:432: error: Incompatible types in assignment (expression has type "datetime", variable has type "str")  [assignment]

The code:

PREMIUM_RELEASE_DATE = config(
    "PREMIUM_RELEASE_DATE", "2021-10-27 17:00:00+00:00", cast=str
)
PREMIUM_RELEASE_DATE = datetime.fromisoformat(PREMIUM_RELEASE_DATE)

mypy determines the PREMIUM_RELEASE_DATE's type as str from the first assignment, then we re-assign as a datetime.

One fix is to use two variables:

_raw_premium_release_date = config(
    "PREMIUM_RELEASE_DATE", "2021-10-27 17:00:00+00:00", cast=str
)
PREMIUM_RELEASE_DATE = datetime.fromisoformat(_raw_premium_release_date)

Another way is to eliminate the first assignment:

PREMIUM_RELEASE_DATE = datetime.fromisoformat(config(
    "PREMIUM_RELEASE_DATE", "2021-10-27 17:00:00+00:00", cast=str
    )

and another way is to use the cast parameter (my favorite):

PREMIUM_RELEASE_DATE = config(
    "PREMIUM_RELEASE_DATE", "2021-10-27 17:00:00+00:00", cast=datetime.fromisoformat
)

a very ugly way to fix it:

PREMIUM_RELEASE_DATE: Union[datetime, str] = config(
    "PREMIUM_RELEASE_DATE", "2021-10-27 17:00:00+00:00", cast=str
)
PREMIUM_RELEASE_DATE = datetime.fromisoformat(PREMIUM_RELEASE_DATE)

and still another way (exercise for the reader) is remove PREMIUM_RELEASE_DATE from the code, since we've already shipped it.

One more fix, one more tiny commit.

Fix type issues in settings - sentry_release

The final error is:

privaterelay/settings.py:697: error: Incompatible types in assignment (expression has type "None", variable has type "str")  [assignment]

The code:

SENTRY_RELEASE = config("SENTRY_RELEASE", "")
CIRCLE_SHA1 = config("CIRCLE_SHA1", "")
CIRCLE_TAG = config("CIRCLE_TAG", "")
CIRCLE_BRANCH = config("CIRCLE_BRANCH", "")

if SENTRY_RELEASE:
    sentry_release = SENTRY_RELEASE
elif CIRCLE_TAG and CIRCLE_TAG != "unknown":
    sentry_release = CIRCLE_TAG
elif (
    CIRCLE_SHA1
    and CIRCLE_SHA1 != "unknown"
    and CIRCLE_BRANCH
    and CIRCLE_BRANCH != "unknown"
):
    sentry_release = f"{CIRCLE_BRANCH}:{CIRCLE_SHA1}"
else:
    sentry_release = None

sentry_sdk.init(
    dsn=config("SENTRY_DSN", None),
    integrations=[DjangoIntegration()],
    debug=DEBUG,
    with_locals=DEBUG,
    release=sentry_release,
)

sentry_release is detected as a str, but the last branch gives it a (valid) value of None. It can be fixed by annotating the first assignment:

if SENTRY_RELEASE:
    sentry_release: Optional[str] = SENTRY_RELEASE
...

Another way is to move the default assignment up and eliminate the final else: clause:

sentry_release: Optional[str] = None
if SENTRY_RELEASE:
    sentry_release = SENTRY_RELEASE
...

I slightly prefer this one, because the type declaration is not indented.

That's all the errors! Commit time!

Complete stub for Choices

We've got one last task before we're done with decouple. We use Choices in a few places, such as:

PROCESS_EMAIL_BATCH_SIZE = config(
    "PROCESS_EMAIL_BATCH_SIZE", 10, cast=Choices(range(1, 11), cast=int)
)

reveal_type(PROCESS_EMAIL_BATCH_SIZE) shows the type "Any", but it should be "int".

Let's look at the source for Choices:

class Choices(object):
    """
    Allows for cast and validation based on a list of choices.
    """

    def __init__(self, flat=None, cast=text_type, choices=None):
        """
        Parameters:
        flat -- a flat list of valid choices.
        cast -- callable that transforms value before validation.
        choices -- tuple of Django-like choices.
        """
        self.flat = flat or []
        self.cast = cast
        self.choices = choices or []

        self._valid_values = []
        self._valid_values.extend(self.flat)
        self._valid_values.extend([value for value, _ in self.choices])

    def __call__(self, value):
        transform = self.cast(value)
        if transform not in self._valid_values:
            raise ValueError((
                    'Value not in list: {!r}; valid values are {!r}'
                ).format(value, self._valid_values))
        else:
            return transform

In our code, we always set flat and cast, and don't use choices. Here's a custom stub for our usage:

from collections.abc import Sequence

...

class Choices:
    # Note: there are additional parameters that Relay (currently) doesn't use:
    # choices
    def __init__(
        self, flat: Sequence[_CastReturnType], cast: Callable[[Any], _CastReturnType]
    ) -> None: ...
    def __call__(self, value: Any) -> _CastReturnType: ...

We're using a new type Sequence, an Abstract Base Class (ABC) for read-only and mutable sequences. Before Python 3.9, this would have been imported from typing, but typing.Sequence is now deprecated. We're back to reading Python release notes to see how our code should change...

Sequence supports built-in types like list, str, and tuple, but also iterators like range(1, 10), which we're using. If your function takes any sequence, you may want to use it rather than a specific type like list.

reveal_type(PROCESS_EMAIL_BATCH_SIZE) now returns "_CastReturnType`-1", not the desired int. We'll have to go a little further with the generics:

from collections.abc import Sequence
from typing import Any, Callable, Generic, Optional, TypeVar, Union, overload

...

class Choices(Generic[_CastReturnType]):
    # Note: there are additional parameters that Relay (currently) doesn't use:
    # choices
    def __init__(
        self, flat: Sequence[_CastReturnType], cast: Callable[[Any], _CastReturnType]
    ) -> None: ...
    def __call__(self, value: Any) -> _CastReturnType: ...

This helps specify that the class will vary based on _CastReturnType, and mypy is clever enough to determine that type from what cast we pass to __init__. Finally reveal_type(PROCESS_EMAIL_BATCH_SIZE) return builtins.int, the typical wordy way to say int.

If we wanted to support some of the __init__ parameters on Csv, like cast or post_process, we'd use similar techniques.

Stubbing decouple: summary

Here's the final mypy_stubs/decouple.pyi (commit!):

from collections.abc import Sequence
from typing import Any, Callable, Generic, Optional, TypeVar, Union, overload

_DefaultType = TypeVar("_DefaultType")
_CastReturnType = TypeVar("_CastReturnType")

def strtobool(value: str) -> bool: ...
@overload
def config(option: str) -> str: ...
@overload
def config(option: str, default: str) -> str: ...
@overload
def config(option: str, default: _DefaultType) -> Union[str, _DefaultType]: ...
@overload
def config(
    option: str, default: Any, cast: Callable[[Any], _CastReturnType]
) -> _CastReturnType: ...

class Csv:
    # Note: there are additional parameters that Relay (currently) doesn't use:
    # cast, delimiter, strip, post_process
    def __init__(self) -> None: ...
    def __call__(self, value: str) -> list[str]: ...

class Choices(Generic[_CastReturnType]):
    # Note: there are additional parameters that Relay (currently) doesn't use:
    # choices
    def __init__(
        self, flat: Sequence[_CastReturnType], cast: Callable[[Any], _CastReturnType]
    ) -> None: ...
    def __call__(self, value: Any) -> _CastReturnType: ...

Along the way, we annotated a lot of privaterelay/settings.py (there's a few untyped functions) and fixed a few potential bugs. The Django typing plugin will preserve these types when used as from django.conf import settings, which will help when we're typing code that uses settings.

Next Steps

The document Using mypy with an existing codebase is good, and we're done with step 1.

To complete the initial work, we should run mypy in CircleCI, first as optional, then failing the build. We may also want to run mypy --strict to measure progress toward full typing.

The best way to get comfortable with types is to use them. Guess at types, use reveal_type to confirm your guesses, and annotate as needed.

The mypy documentation is helpful, to see what is available.

PR 2062 demonstrates some techniques from going from a loosely-typed object, like a multi-leveled dictionary, to a strictly-typed object using dataclasses.

@piyh
Copy link

piyh commented Nov 1, 2023

This puts into words so much implicit knowledge, thank you. Found this off of google off some arcane mypy django error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment