Skip to content

Instantly share code, notes, and snippets.

@dwilliamson
Created June 5, 2012 12:02
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save dwilliamson/2874586 to your computer and use it in GitHub Desktop.
Save dwilliamson/2874586 to your computer and use it in GitHub Desktop.
DB Notes
Create tables of simple types; don't try to replicate C++ entity structures in a DB as they don't map well. For example, a C++ structure like this:
struct Entity
{
PositionComponent* position;
AIComponent* ai;
};
Can be considered a "object-oriented database model" and is hard to extend. Generalising this in C++ a little, we get:
struct Entity
{
vector<Component*> components;
};
While this is easy to extend in C++ land, a direct DB mapping is pretty awful.
Instead of parent objects referencing child objects, having child objects reference parents works well for extensible database designs. In C++ this would like like:
struct Entity { };
struct PositionComponent { Entity* parent; }
struct AIComponent { Entity* parent; }
While not ideal for C++ (you lose low-cost O(1) pointer dereferencing), this allows for extensible database designs for entities.
CREATE TABLE Entities (id INT PRIMARY_KEY);
CREATE TABLE PositionComponents (parent_entity INT PRIMARY_KEY, x DECIMAL, y DECIMAL, z DECIMAL);
CREATE TABLE AIComponents (parent_entity INT PRIMARY_KEY, ...);
You can add as many different components as you like by adding new tables, you can easily change the format of individual components and transform the necessary tables and lookups are quick due to the entity ID being the primary key in all tables.
How this scales to very large Zynga-style databases, I have no idea. I would guess it doesn't.
@dwilliamson
Copy link
Author

Need a bunch of tables, so:

CLASSES
Name    Unique ID   Size    TypeOfVA    HasConstructor  HasDestructor   HasCopyConstructor  HasAssignmentOperator

BASE TYPES
Name    Unique ID   Size    TypeOfVA

ENUMS
Name    Unique ID   Size    TypeOfVA

ENUM ENTRIES
Name    Value

FUNCTIONS
Name    CallAddress ReturnParameter (IsConst?)

FIELDS
Name    Type    IsConst Modifier    ArrayRank   ArrayLength0    ArrayLength1    Offset

Remove:
Unique ID - what's the point? To help remove redundant data, see below...
Constructor/destructor info in base types and enums

Need a way of querying: give me all fields belonging to type X.

Could store a Parent field that gives the full name, but then we'd have:

Parent                          Name
std::basic_string<blah, blah>   std::basic_string<blah, blah>::whatever0
std::basic_string<blah, blah>   std::basic_string<blah, blah>::whatever1
std::basic_string<blah, blah>   std::basic_string<blah, blah>::whatever2
std::basic_string<blah, blah>   std::basic_string<blah, blah>::whatever3

which is an awful lot of duplicate data. Of course, Parent could be an ID but you'd then need to do two queries:

SELECT UniqueID FROM Classes WHERE Name='std::basic_string<blah, blah>'

...record the ID...

SELECT * FROM Fields WHERE ParentID=3

SQL subqueries should help here:

SELECT * FROM Fields WHERE ParentID IN
    (SELECT UniqueID FROM Classes WHERE Name='std::basic_string<blah, blah>')

In order to achieve this, the full names of each table entry need to be stored. This is, again, redundant data, but it's necessary
to prevent queries from getting more complex.

Joins can also be used to achieve this:

SELECT * From Fields JOIN Classes ON Fields.ParentID = Classes.UniqueID AND Classes.Name = 'std::basic_string<blah, blah>'
SELECT * From Fields, Classes WHERE Fields.ParentID = Classes.UniqueID AND Classes.Name = 'std::basic_string<blah, blah>'

In the case of inheritance, how would you achieve this? We would need a recursive search, which is a bit of a pain in SQL.
SQL doesn't represent tree hierarchies very well. For each type you could have a table that stores a list of all types it
inherits from. That way your query would be:

SELECT * FROM Fields Where ParentID IN (SELECT * FROM ClassNameInherits)

Note that this table would also have to mark that a type inherits from itself, in order to be included in the results.
Alternatively we could use a table union to merge with the types unique ID.
The ClassNameInherits table only exists if a type does inherit from something and its existence is purely to aid construction
of simpler SQL queries. This will work but it might put too much noise in the database.

Is there a way to dynamically build up the ClassNameInherits table? Probably. But SQLite is quite primitive in this scenario.

Rather than using the adjacency model, we could use the nested set model:

http://www.codeproject.com/KB/database/nestedsets.aspx
http://www.intelligententerprise.com/001020/celko.jhtml

This would actually work wonders and give so many benefits, but it limits the representation to a tree. Even
though I don't like the idea of multiple inheritance, we have to support it in some minimal form; diamond inheritance,
and all.

Of course, when queries get this complicated, it might be worth simply switching to Python.

There is another approach: Ancestor (closure) tables:

http://evolt.org/working_with_hierarchical_data_in_sql_using_ancestor_tables
http://kylecordes.com/2008/01/13/transitive-closure/

This is a generalisation of the ClassNameInherits approach, requiring only one table. This is exactly what's needed!
Only in this case we're not worried about having to rebuild the table from scratch so it's much simpler.
To find a class and all its bases, do:

SELECT AncestorID FROM Ancestors WHERE Ancestors.UniqueID IN
    (SELECT UniqueID FROM Classes WHERE Classes.Name='std::basic_string<blah, blah>')

Note that DISTINCT is not used here because we can do some post-processing on the ancestor table before commit that
removes duplicate references due to diamond inheritance.

Finally, to find all fields that belong to a given type or its bases, do:

SELECT * FROM Fields WHERE ParentID IN
    (SELECT AncestorID FROM Ancestors WHERE UniqueID IN
        (SELECT UniqueID FROM Classes WHERE Name='std::basic_string<blah, blah>'))

LISP, anyone?

Note also the same requirement as above: there must be an entry in the ancestor table where a class references itself
as an ancestor. If at a later date the database design needs to be adjusted for proper ancestor behaviour, the query
would become:

SELECT * FROM Fields WHERE ParentID IN
    (SELECT DISTINCT AncestorID FROM Ancestors WHERE UniqueID IN
        (SELECT UniqueID FROM Classes WHERE Name='std::basic_string<blah, blah>')
    UNION
    SELECT UniqueID FROM Classes WHERE Name='std::basic_string<blah, blah>')

Not great but I'm sure further research will turn up a simpler solution. Let's not worry about that now, though.

Call this the ClassInheritance table and there's no need to store inheritance information in the class table itself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment