Skip to content

Instantly share code, notes, and snippets.

@CMCDragonkai
Last active January 8, 2024 16:42
Show Gist options
  • Star 5 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save CMCDragonkai/224bcc345115f621c4c3 to your computer and use it in GitHub Desktop.
Save CMCDragonkai/224bcc345115f621c4c3 to your computer and use it in GitHub Desktop.
Databases: Dependent Types in Databases

The database can make use of a more expressive type system. Specifically to enforce invariants on data integrity at the schema level. But also understanding databases from a type theoretic point of view can lead to better, safer and more expressive type systems.

You do not want to keep writing database validation code in your application boundary. That is tiring. Instead let your database do that work. After all, it is where the data is stored. It is where the data is migrated. So shouldn't it also maintain the integrity and the constraints of the data?

I just looked into Postgresql's type system features, even though it doesn't always use the same terminology, there indeed seems lots of cross overs:

In a type theoretic point of view, what exactly is a foreign key?

Also what's the difference between CREATE TYPE vs CREATE DOMAIN? It appears that CREATE TYPE creates a basic primitive type, while CREATE DOMAIN creates a refinement type, which is a type with predicates.

The usage of enum can simulate some usecases of ADTs in Haskell.

The more constrained your database is, the more safer it is, and the less likely for data corruption. Also the more possibilities for optimisation, for example if values of a data type can fit in a smaller memory space.

Another advantage is that fake data generators can take advantage of the type information to generate more correct data. See: http://www.coelho.net/datafiller.html

Would one say that these types are dynamic types? That is they are checked at runtime?

Now how does one avoid writing validation code twice? If the database already establishes many invariants on the data, do you write it all again in your application code? And some invariants may not be supported by the database, then you'll need to put some invariant checking at the application level. That's just quite messy. Furthermore the way databases are accessed is mostly through string concatenated SQL, this still opens yourself up to SQL injection. What we need is a type safe SQL reified in the application langauge, so you can't ever possibly accidentally perform SQL injection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment