CMCDragonkai/database_typesystems.md

## database_typesystems.md

      
    Raw
  

              database_typesystems.md
            
          
    The database can make use of a more expressive type system. Specifically to
enforce invariants on data integrity at the schema level. But also understanding
databases from a type theoretic point of view can lead to better, safer and more
expressive type systems.
You do not want to keep writing database validation code in your application
boundary. That is tiring. Instead let your database do that work. After all,
it is where the data is stored. It is where the data is migrated. So shouldn't it
also maintain the integrity and the constraints of the data?
I just looked into Postgresql's type system features, even though it doesn't
always use the same terminology, there indeed seems lots of cross overs:

HoTT discussion on dependent types in databases HoTT/book#743
PostgreSQL Domains (are like refinement types): http://www.postgresonline.com/journal/archives/205-Using-Domains-to-Enforce-Business-Rules.html
PostgreSQL Constraints (Foreign Keys is a type constraint too!): http://www.postgresql.org/docs/9.2/static/ddl-constraints.html
PostreSQL Types: http://www.postgresql.org/docs/9.0/static/sql-createtype.html

In a type theoretic point of view, what exactly is a foreign key?
Also what's the difference between CREATE TYPE vs CREATE DOMAIN? It appears
that CREATE TYPE creates a basic primitive type, while CREATE DOMAIN creates
a refinement type, which is a type with predicates.
The usage of enum can simulate some usecases of ADTs in Haskell.
The more constrained your database is, the more safer it is, and the less likely
for data corruption. Also the more possibilities for optimisation, for example
if values of a data type can fit in a smaller memory space.
Another advantage is that fake data generators can take advantage of the type
information to generate more correct data. See: http://www.coelho.net/datafiller.html
Would one say that these types are dynamic types? That is they are checked at runtime?
Now how does one avoid writing validation code twice? If the database already
establishes many invariants on the data, do you write it all again in your
application code? And some invariants may not be supported by the database, then
you'll need to put some invariant checking at the application level. That's just
quite messy. Furthermore the way databases are accessed is mostly through string
concatenated SQL, this still opens yourself up to SQL injection. What we need is
a type safe SQL reified in the application langauge, so you can't ever possibly
accidentally perform SQL injection.