Skip to content

Instantly share code, notes, and snippets.

@borndangerous
Created December 5, 2010 22:06
Show Gist options
  • Save borndangerous/729527 to your computer and use it in GitHub Desktop.
Save borndangerous/729527 to your computer and use it in GitHub Desktop.
Quick database question...

Hey, this should be an ultra basic database question, but I've never known the answer and would love an explanation or point in the right direction. So, here it is:

When setting up a database schema - for storing related data (such as descriptive tags) for a piece of content (such as an article or photo), which of the below is the smartest way?

tags Table Schema (Option A)

item_id   tags
1         alpha,beta,gamma
2         beta
3         alpha,gamma

Then, for instance, I could access the tags for item 1 by exploding the delimited tags string & looping through the array.

tags Table Schema (Option B)

item_id   tag
1         alpha
1         beta
1         gamma
2         beta
3         alpha
3         gamma

Then, for instance, I could access the tags for item 1 by querying SELECT * FROM tags WHERE item_id = 1


Any answers/explanatory articles are appreciated. Always been curious.

David (@dvdwlsh)

@isaacdudek
Copy link

David,

Given the two options you've presented, Option B is the preferred implementation because it allows the database to do the kind of work it's best suited to do. The database can pull records using foreign keys much faster than your code can split strings on commas, especially if you have an index on the item_id column (which you should).

However, if you're trying to achieve the typical tagging behavior found on most sites/apps (with tags shared across items such that a click on the tag shows all of the tagged items), both options are suboptimal. What you need for this is a tags table as well as a standalone join table (often called taggings). For example:

items table

id   ...
1    ...
2    ...

taggings table

id   tag_id   item_id
1    1        1
2    2        1
3    1        2

tags table

id   name
1    'alpha'
2    'beta'

This schema allows an item to be associated with an arbitrary number of tags (and vice versa), while also allowing you to easily find all of the relevant objects from either direction. For example:

To find all of the tags on an item: SELECT * FROM tags INNER JOIN taggings ON tags.id = taggings.tag_id INNER JOIN items ON items.id = taggings.item_id WHERE items.id = 1;

To find all of the items that have a particular tag, do the reverse: SELECT * FROM items INNER JOIN taggings ON items.id = taggings.item_id INNER JOIN tags ON tags.id = taggings.tag_id WHERE tags.name = "alpha";

If this is for a Rails application, simply grab Acts as Taggable On (which uses this approach) and go on your merry way!

Best,
Isaac (@isaacdudek)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment