@csirmazbendeguz
Last active March 16, 2024 14:53
"Django ORM support for composite primary keys" proposal for Google Summer of Code 2024

1. Motivation

In database design, composite primary keys are often necessary for the partitioning and sharding of database tables.

  • Citus is a PostgreSQL extension that transforms PostgreSQL into a distributed database.
  • django-multitenant is a library by Citus which enables developers to build multi-tenant applications in Django.

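Citus requires the primary key of a distributed table to include the distribution column, so tables that also carry their own identifier end up with composite primary keys. To use these tools with Django, developers currently have to work around the Django ORM.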

To make building multi-tenant apps easier, I propose adding composite primary key support to the Django ORM.

1.1. Links

2. Proposal

2.1. Composite PK

A composite primary key can be defined by setting Meta.primary_key (similar to Peewee).

class User(models.Model):
    tenant = models.ForeignKey(Tenant, on_delete=models.CASCADE)
    id = models.BigAutoField(primary_key=False)

    class Meta:
        primary_key = ("tenant_id", "id")

If Meta.primary_key is set, primary_key=True can't be set on any field. Any attempt to do so will result in a check error.
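The rule above could be enforced by a model-level check. The following is a minimal sketch in plain Python, not Django's check framework. FieldSpec and check_primary_key_options are hypothetical names used only for illustration, and the error wording is an assumption.

```python
from dataclasses import dataclass


@dataclass
class FieldSpec:
    """Hypothetical stand-in for a model field: a name and its primary_key flag."""
    name: str
    primary_key: bool = False


def check_primary_key_options(fields, meta_primary_key=None):
    """Return one error message per field that sets primary_key=True
    while Meta.primary_key is also defined."""
    if meta_primary_key is None:
        # No composite primary key: a single primary_key=True field is fine.
        return []
    return [
        f"'{f.name}' can't set primary_key=True because Meta.primary_key is defined."
        for f in fields
        if f.primary_key
    ]
```

In a real implementation, the messages would presumably be returned as check errors from the model's check() method.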

2.1.1. DB support

All officially supported databases (PostgreSQL, MariaDB, MySQL, Oracle, SQLite) support composite primary keys.

If a database backend doesn't support composite primary keys, the feature can be disabled with a supports_composite_primary_keys feature flag defined on the django.db.backends.base.features.BaseDatabaseFeatures class.

2.1.2. API changes

The implementation of this feature requires backwards-incompatible changes only to internal APIs, not to public ones.

e.g. def _create_primary_key_sql(self, model, field): -> def _create_primary_key_sql(self, model, fields):
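To illustrate why the signature changes from a single field to several, here is a rough sketch of generating a primary key constraint over multiple columns. It is a standalone function, not Django's schema editor: the constraint naming scheme and the double-quote identifier quoting are assumptions, and the real method would use the backend's own quoting and name generation.

```python
def create_primary_key_sql(table, columns):
    """Build an ADD CONSTRAINT ... PRIMARY KEY statement for one or more columns."""
    if not columns:
        raise ValueError("A primary key needs at least one column.")
    quoted_columns = ", ".join(f'"{column}"' for column in columns)
    # Hypothetical naming scheme: <table>_<col1>_<col2>_pk
    name = f"{table}_{'_'.join(columns)}_pk"
    return (
        f'ALTER TABLE "{table}" ADD CONSTRAINT "{name}" '
        f"PRIMARY KEY ({quoted_columns})"
    )
```

With a single column, the output has the same shape as a regular primary key constraint, so one code path can serve both cases.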

A notable change is that, if a composite primary key is defined, _meta.pk is assigned a tuple of fields instead of a single field. Models without a composite primary key are unaffected, so the change is backwards-compatible.

_meta.pk is used 100+ times in Django; all occurrences have to be reviewed and adjusted.
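One way such call sites could be adjusted is to normalize the primary key to a tuple before using it, so single and composite keys share one code path. This is only a sketch: pk_fields and pk_values are hypothetical helpers, and plain attribute names stand in for Django field objects.

```python
from types import SimpleNamespace


def pk_fields(pk):
    """Return the primary key as a tuple, whether it is a single field or a composite."""
    return tuple(pk) if isinstance(pk, (tuple, list)) else (pk,)


def pk_values(obj, pk):
    """Read the primary key value(s) from obj; attribute names stand in for field objects."""
    return tuple(getattr(obj, name) for name in pk_fields(pk))


# Hypothetical model instance with a composite primary key (tenant_id, id).
user = SimpleNamespace(tenant_id=1, id=2)
composite = pk_values(user, ("tenant_id", "id"))
single = pk_values(user, "id")
```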

2.2. Non-PK AutoField

In Django, AutoFields must set primary_key=True.

To make it possible for composite primary keys to include surrogate keys (e.g. SmallAutoField, AutoField, BigAutoField), AutoFieldMixin needs to allow setting primary_key=False for fields that are part of the composite primary key.

    id = models.BigAutoField(primary_key=False)
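The relaxed rule could look roughly like the check below. It is a plain-Python sketch, not the actual AutoFieldMixin code: check_auto_field is a hypothetical function, and the error wording is an assumption.

```python
def check_auto_field(name, primary_key, composite_pk=()):
    """Return errors for an auto field that is neither the primary key
    nor a member of the composite primary key."""
    if primary_key or name in composite_pk:
        return []
    return [
        f"AutoField '{name}' must set primary_key=True "
        "or be part of Meta.primary_key."
    ]
```

An auto field outside any primary key is still rejected here, which matches the narrow scope of this proposal.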

This proposal is only concerned with making auto fields work with composite primary keys. However, there have been requests (1, 2, 3) in the past to support other use cases and remove this limitation altogether.

2.3. Composite FK

2.3.1. ForeignKey vs ForeignObject

ForeignKey doesn't support composite foreign keys, but its parent class ForeignObject does.

class Comment(models.Model):
    tenant = models.ForeignKey(Tenant, on_delete=models.CASCADE)
    id = models.BigAutoField()
    user_id = models.BigIntegerField()
    user = models.ForeignObject(
        User,
        on_delete=models.CASCADE,
        from_fields=("tenant_id", "user_id"),
        to_fields=("tenant_id", "id"),
    )

ForeignObject works well with composite primary keys and supports multi-column JOINs, but it doesn't create a composite foreign key in the database.

Also, while ForeignKey creates an index automatically, developers need to define an index explicitly when using ForeignObject.

While database-level composite foreign keys and automatic indexes are nice to have, they are not integral to implementing composite primary keys.

So, no changes are needed (for now).

2.3.2. Generic Relations

Using GenericForeignKey is generally considered bad design 1.

That said, if support for composite primary keys is required, it could be achieved with the following:

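from django.contrib.contenttypes.fields import GenericForeignKey
from django.contrib.contenttypes.models import ContentType
from django.db import models

class TaggedItem(models.Model):
    content_type = models.ForeignKey(ContentType, on_delete=models.CASCADE)
    # Text column, so it can hold a JSON-encoded composite primary key.
    object_id = models.TextField()
    content_object = GenericForeignKey("content_type", "object_id")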

The composite primary keys are JSON-encoded and stored in a text field, e.g. [1, 2], ["c141ef6c-4816-4377-8fab-cf8f3ac3152a", "c378e84c-9e85-4aeb-bc95-a85a2e403c98"].

GenericForeignKey can JSON-decode the text field, and if it's an array of integers or strings, filter for composite primary keys.
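The encoding and decoding described above can be sketched with the standard json module. encode_object_id and decode_object_id are hypothetical helper names, not part of Django. Using default=str to serialize values such as UUIDs is an assumption of this sketch.

```python
import json


def encode_object_id(pk):
    """Serialize a primary key for storage in a text column.

    Composite keys (tuples or lists) become JSON arrays; anything else is
    stored as plain text.
    """
    if isinstance(pk, (tuple, list)):
        # default=str converts values json can't serialize natively, e.g. UUIDs.
        return json.dumps(list(pk), default=str)
    return str(pk)


def decode_object_id(text):
    """Return a tuple for a JSON array of integers or strings, else the original text."""
    try:
        value = json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return text
    if isinstance(value, list) and value and all(
        isinstance(item, (int, str)) and not isinstance(item, bool) for item in value
    ):
        return tuple(value)
    return text
```

Values that don't decode to such an array are returned unchanged, so existing single-column object IDs keep working.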

2.4. Django Admin

A composite primary key can be displayed in URLs in the format quote(pk1) + ',' + quote(pk2). Since the Django admin's quote function already escapes the comma character, a comma in the URL can only be a separator, so this change is backwards-compatible.
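The round trip between a composite primary key and its URL segment could look like the sketch below. The quote and unquote functions here are simplified stand-ins loosely modeled on the admin's helpers, not the real implementations. The escaped character set and the names composite_pk_to_url and url_to_composite_pk are assumptions.

```python
import re

# Characters escaped by this stand-in. The comma and the underscore are the
# ones that matter for composite keys.
_SPECIAL = '":/_#?;@&=+$,[]<>%\n\\'


def quote(value):
    """Escape special characters as _XX, where XX is the uppercase hex code."""
    return "".join(f"_{ord(ch):02X}" if ch in _SPECIAL else ch for ch in str(value))


def unquote(text):
    """Reverse quote(). Every underscore in quoted text starts an escape sequence."""
    return re.sub(r"_([0-9A-F]{2})", lambda match: chr(int(match.group(1), 16)), text)


def composite_pk_to_url(pk):
    """Join the quoted parts with a comma, which never appears inside a quoted part."""
    return ",".join(quote(part) for part in pk)


def url_to_composite_pk(segment):
    """Split on the separator, then unquote each part. Parts come back as strings."""
    return tuple(unquote(part) for part in segment.split(","))
```

Because parts come back as strings, converting them to the fields' Python types would be a separate step.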

3. Scope

_meta.pk is used all over Django's source code. All occurrences need to be reviewed and adjusted individually. I believe this can't be done in 175 hours, so I propose a scope of 350 hours.

Fortunately, this proposal is backwards-compatible, so support for composite primary keys can be introduced incrementally.

3.1. Django ORM

The primary goal of this proposal is to add composite primary keys to the Django ORM.

So, among other things:

  • A model can define a composite primary key.
  • The migration system can create a database-level composite primary key.
  • The composite primary key works with the ORM's public APIs (e.g. .get(), .create(), .delete(), .bulk_update(), etc.).
  • The composite primary key works with other fields (e.g. auto fields).
  • It's tested and documented.
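As an illustration of what working with the ORM's public APIs could mean for lookups, the sketch below expands a primary key lookup into one equality condition per column. The proposal doesn't specify how a composite value is passed to methods such as .get(), so the tuple form is an assumption. pk_lookup_to_where is a hypothetical function, not part of the ORM.

```python
def pk_lookup_to_where(pk_columns, pk_value):
    """Expand a primary key lookup into a parameterized WHERE clause.

    pk_columns is a tuple of column names; pk_value is a scalar for a
    single-column key or a tuple with one value per column.
    """
    values = pk_value if isinstance(pk_value, tuple) else (pk_value,)
    if len(values) != len(pk_columns):
        raise ValueError(f"Expected {len(pk_columns)} values, got {len(values)}.")
    clause = " AND ".join(f'"{column}" = %s' for column in pk_columns)
    return clause, list(values)
```

Passing the values as parameters rather than interpolating them keeps the sketch in line with how queries are normally parameterized.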

To deliver this, I'll need at least 5 weeks = 200hr.

3.2. Other

The secondary goal of this proposal is to add composite primary key support to other parts of Django, if time permits.

So, I would spend the remaining 150hr working on composite primary key support for other Django code (e.g. Django Admin).

4. About Me

My name is Bendegúz Csirmaz, and I've been a professional software engineer since 2017. I have 4 years of experience developing Django applications (LinkedIn). I have some free time now to work on open source projects.

Google Summer of Code 2024 is a great opportunity to contribute to my favorite web framework and deliver a long-awaited, important feature - one that I would also like to use in my own projects.
