Examples with MongoDB

mongify.txt

This is a case study of how we would model parts of the Spotify app in MongoDB.
 
MODEL
=====
 
users
-----
username
name
plain_txt_password
plan
list_ids: [id, id, id]
friend_ids: [id, id, id]
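
For illustration, a user document could be created like this in the shell (all values are hypothetical):

db.users.insert({
  username: "jdoe",
  name: "John Doe",
  plain_txt_password: "secret",
  plan: "premium",
  list_ids: [new ObjectId(), new ObjectId()],
  friend_ids: [new ObjectId()]
})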
 
tracks
------
id
title
album_id
artist_ids: [1, 2, 8]
 
albums
------
id
title
artist_id
published_at
produced_by
track_count
awards:[{id: new ObjectId(), title: "best pitbull featuring version", year: 2010, awarded_by: "MTV"}]
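
Because awards are embedded, dot notation reaches into the array; e.g. a hypothetical query for albums that won an award in 2010:

db.albums.find({ "awards.year": 2010 })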
 
artists
-------
id
name
bio
picture: [{path: ..., filename: ..., version: "thumb", ...}]
active: true
overdose: true
 
lists
-----
id
user_id
title
public: true/false
track_ids: [id, id, id, id]
subscriber_count
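
Subscribing a user to a list would then be two updates (a sketch; list_id and user_id are placeholders, and it assumes list_ids also tracks subscribed lists; note that the two writes are not atomic together):

db.lists.update({ _id: list_id }, { $inc: { subscriber_count: 1 } })
db.users.update({ _id: user_id }, { $addToSet: { list_ids: list_id } })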
 
plays (this collection should be hosted on a different host & database, because it will receive a high volume of frequent writes)
-----
user_id
track_id
played_at
duration
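
Both sample queries below filter plays by user_id, so an index would help; a sketch using the classic shell helper:

db.plays.ensureIndex({ user_id: 1, track_id: 1 })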
 
SAMPLE QUERIES
==============
 
Your n favorite songs (the ones with the most plays by you)
-----------------------------------------------------------
 
One possibility is to use group, but it has some limitations: we cannot sort by the number of plays or limit the number of items we want to retrieve.
 
Group query:
 
db.plays.group({
  key: { track_id: true },
  cond: { user_id: xxx },
  reduce: function (doc, prev) {
    prev.n++;
    if (doc.played_at > prev.last_played_at) { prev.last_played_at = doc.played_at; }
  },
  initial: { n: 0, last_played_at: 0 }
})
 
result:
 
[ { track_id: ..., n: ..., last_played_at: ... }, ... ]
 
map reduce query
----------------
 
db.plays.mapReduce(
  function () { emit(this.track_id, { n: 1, t: this.played_at }); },
  function (key, values) {
    // Prettier implementation:
    //
    //   var n_res = 0;
    //   var max_t = 0;
    //   values.forEach(function (v) {
    //     n_res += v.n;
    //     if (v.t > max_t) max_t = v.t;
    //   });
    //   return { n: n_res, t: max_t };

    // Implementation with better performance: reuse the first value
    // as the accumulator instead of allocating a new object.
    var ac = values[0];
    for (var i = 1; i < values.length; i++) {
      ac.n += values[i].n;
      var v_t = values[i].t;
      if (v_t > ac.t) ac.t = v_t;
    }
    return ac;
  },
  { out: "foo", query: { user_id: xxx } }
)
 
db.foo.find().sort({ "value.n": -1 }).limit(n)   // your n most played tracks
 
How this map reduce works:
 
Sample data:
track_id   played_at
--------   ---------
1          100
2          101
1          102
2          100
1          200
 
emit(1, { n: 1, t: 100 })
emit(2, { n: 1, t: 101 })
emit(1, { n: 1, t: 102 })
emit(2, { n: 1, t: 100 })
emit(1, { n: 1, t: 200 })
 
 
reduce(1, [ { n: 1, t: 100 }, { n: 1, t: 102 }, { n: 1, t: 200 } ])
=> { n: 3, t: 200 }
 
reduce(2, [{ n: 1, t: 100 }, { n: 1, t: 101 }])
=> { n: 2, t: 101 }
 
...
 
Ideally this query should run in the background, with the results (e.g. your 10 most played songs) stored in the user's document for fast retrieval.
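
An alternative worth noting: the aggregation framework can do the grouping, sorting and limiting in one command, which works around the limitations of group mentioned above. A sketch (xxx is the same user id placeholder as above, 10 the number of songs):

db.plays.aggregate([
  { $match: { user_id: xxx } },
  { $group: { _id: "$track_id",
              n: { $sum: 1 },
              last_played_at: { $max: "$played_at" } } },
  { $sort: { n: -1 } },
  { $limit: 10 }
])
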
posts_and_comments.txt

Option A: comments reference the blog post, as typically done in ActiveRecord
=============================================================================
 
posts
-----
id
title
body
author_id -> user
 
comments
--------
id
text
author_id -> user
post_id -> post
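
With referencing, loading a post together with its comments takes two queries, e.g. in the shell (xxx being a placeholder post id):

var post = db.posts.findOne({ _id: xxx });
var comments = db.comments.find({ post_id: post._id });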
 
Option B: comments are embedded in the blogpost document
========================================================
 
posts
-----
id
title
body
author_id -> user
comments: [{_id: ...., text: ..., author: ...., approved: true}, {...}]
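
In the embedded model, adding a comment is a single atomic update with $push (values hypothetical):

db.posts.update(
  { _id: xxx },
  { $push: { comments: { _id: new ObjectId(), text: "Nice post!", author: "jdoe", approved: false } } }
)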
 
 
An _id attribute is provided for each comment so that comments can be handled individually, as if they were separate entities, e.g. for an approval process: find the blog post by comment id and set the comment's approved attribute.
 
db.posts.find({ "comments._id": xxx })
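
Setting the approved attribute on the matched comment can then be done with the positional $ operator:

db.posts.update(
  { "comments._id": xxx },
  { $set: { "comments.$.approved": true } }
)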
 
It's also possible to pass a projection in the query, specifying which parts of the document we do or do not want to retrieve, which reduces overhead when we are only interested in certain parts, e.g. the comment itself.
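
For example, returning only the matching comment via the positional projection, or excluding the comments array altogether:

db.posts.find({ "comments._id": xxx }, { "comments.$": 1 })   // only the matching comment
db.posts.find({ _id: xxx }, { comments: 0 })                  // everything except comments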
