jameshfisher/ORM_constraints.md

## ORM_constraints.md

      
    Raw
  

              ORM_constraints.md
            
          
    ORM correctness constraints

At an old workplace, we used the Hibernate ORM to map a class hierarchy to our relational database.
Hibernate represents a class hierarchy in the following way.
Let's say we have the tables:
abstract class A
class B extends A
class C extends A

is mapped to your relational database like so:
create table A (id int primary key);
create table B (id int primary key references A (id));
create table C (id int primary key references A (id));

This strategy guarantees one important consistency property:
given a row in any one of these tables,
there will be rows for each of its super-classes.
However, there are two important things that it does not guarantee:

That the expected subclass rows for an object exist.
The class A is abstract.
There should therefore be no rows in A without corresponding a corresponding row in B or C.
But this strategy does not impose any database constraints that guarantee
that no rows exist in tables B or C with the same id.
That there are not multiple subclass rows for an object.
The ids in tables B and C must be disjoint,
but standard constraints cannot express this.

However,
if we wanted these guarantees from our database (and we should),
we could get them by encoding our subclasses like this:
create table A (
    id int primary key,
    class text not null,
    B_id int references B (id),
    C_id int references C (id),
    check (
        (class = 'B' and B_id is not null and C_id is null and B_id = id) or
        (class = 'C' and C_id is not null and B_id is null and C_id = id)
        )
    );
create table B (
    id int primary key,
    class text not null,
    check (text = 'B'),
    foreign key (id, class) references A (id, class)
    );
create table C (
    id int primary key,
    class text not null,
    check (text = 'C'),
    foreign key (id, class) references A (id, class)
    );

Notice that:

The expected subclass row must exist.
The check constraint enforces that the class value is a valid class, say, 'B'.
The constraint then enforces that B_id is not null, and is equal to id.
The references constraint on B_id then enforces that the expected row exists in table B.
The ids in tables B and C are disjoint.
Assume that there exists a row in B with the same id as a row in C.
Let i be that id.
Then their foreign key constraints guarantee
one row in A with id = i and class = 'B',
and another row in A with id = i and class = 'C'.
They must be separate rows since they differ in their class column.
But they must be the same row since they have the same id, which is a unique primary key.

Together, these two results guarantee the intuitive result that
select id from A

is equal to
(select id from B) union (select id from C)