@markus2610
Forked from jnewman12/Mongo.md
Created December 31, 2020 10:30

Data Modeling with MongoDB

Objectives

  • Understand model relationships in MongoDB
  • Understand One-to-Many relationships
  • Understand Many-to-Many relationships
  • Reinforce the difference between embedding and referencing

Recap

  • Data in MongoDB has a flexible schema. Unlike SQL databases, where you must determine and declare a table’s schema before inserting data, MongoDB’s collections do not enforce document structure.
  • This flexibility facilitates the mapping of documents to an entity or an object. Each document can match the data fields of the represented entity, even if the data has substantial variation.
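As a rough illustration of that flexibility, the sketch below uses plain JavaScript objects (not a real MongoDB connection) to show two documents of different shapes that could sit in the same collection. The data and variable names are made up for the example.

```javascript
// Hypothetical documents that could live side by side in one collection.
// Nothing forces them to share the same set of fields.
const people = [
  { name: 'Kate Monster', email: 'kate@example.com' },
  { name: 'Rod', addresses: [{ city: 'New York' }] } // no email, extra array
];

// List each document's field names to make the difference visible.
const shapes = people.map(function (doc) {
  return Object.keys(doc).sort().join(',');
});

console.log(shapes);
```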

Recap: References vs Embeds

References

  • References store the relationships between data by including links or references from one document to another. Applications can resolve these references to access the related data. Broadly, these are normalized data models.
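A minimal sketch of the idea, using in-memory arrays as stand-ins for two collections. The data and the findAuthorOf helper are hypothetical and only meant to show that a reference is a stored id that the application resolves with a second lookup.

```javascript
// Hypothetical in-memory stand-ins for two collections.
const authors = [{ _id: 'a1', name: 'Kate Monster' }];
const posts = [{ _id: 'p1', title: 'Hello', author: 'a1' }]; // author holds an id

// Resolving the reference costs a second lookup, keyed on the stored id.
function findAuthorOf(post) {
  return authors.find(function (a) { return a._id === post.author; }) || null;
}

const resolvedAuthor = findAuthorOf(posts[0]);
console.log(resolvedAuthor.name);
```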

Embedded Data

  • Embedded documents capture relationships between data by storing related data in a single document structure. MongoDB documents make it possible to embed document structures in a field or array within a document. These denormalized data models allow applications to retrieve and manipulate related data in a single database operation.
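By way of contrast with references, the sketch below shows an embedded layout as a plain JavaScript object. The document is hypothetical; the point is only that reading the parent already yields the children.

```javascript
// Hypothetical blog post with its comments embedded in the same document.
const post = {
  _id: 'p1',
  title: 'Hello',
  comments: [
    { _id: 'c1', title: 'My comment' },
    { _id: 'c2', title: 'Another comment' }
  ]
};

// One read of the post already carries every comment; no second lookup.
const commentTitles = post.comments.map(function (c) { return c.title; });
console.log(commentTitles);
```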

Examples

var mongoose = require('mongoose');
var Schema = mongoose.Schema;
var ObjectId = Schema.ObjectId;

var Comments = new Schema({
    title     : String
  , body      : String
  , date      : Date
});

var BlogPost = new Schema({
    author    : ObjectId
  , title     : String
  , body      : String
  , date      : Date
  , comments  : [Comments]
  , meta      : {
        votes : Number
      , favs  : Number
    }
});

mongoose.model('BlogPost', BlogPost);
  • add an embedded document to the array
  // retrieve my model
var BlogPost = mongoose.model('BlogPost');

// create a blog post
var post = new BlogPost();

// create a comment
post.comments.push({ title: 'My comment' });

post.save(function (err) {
  if (!err) console.log('Success!');
});
  • removing an embedded document
BlogPost.findById(myId, function (err, post) {
  if (!err) {
    post.comments[0].remove();
    post.save(function (err) {
      // do something
    });
  }
});
  • to find an embedded document by its id, use the id method
  • DocumentArrays have a special method, id, that looks up an embedded document by its _id property (each embedded document gets one):
post.comments.id(my_id).remove();
post.save(function (err) {
  // embedded comment with id `my_id` removed!
});
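To make the lookup concrete, here is a plain-JavaScript sketch of what finding and removing an embedded document by _id amounts to. It is an analogy over an ordinary array, not Mongoose's implementation, and removeById is a hypothetical helper.

```javascript
// Remove the first element whose _id matches and return it (or null).
function removeById(subdocs, id) {
  const index = subdocs.findIndex(function (d) {
    return String(d._id) === String(id);
  });
  if (index === -1) return null;
  return subdocs.splice(index, 1)[0];
}

const comments = [{ _id: 'c1', title: 'First' }, { _id: 'c2', title: 'Second' }];
const removed = removeById(comments, 'c1');
console.log(removed.title, comments.length);
```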

When to embed? When to reference?

  • Both embedding and referencing have their strengths and weaknesses.
  • Unlike the strict structure of relational databases, data modeling in MongoDB is more art than science because of MongoDB's flexibility.
  • Without going into too much detail, here is a quick recap of when to use referencing or embedding in your apps:
  1. Referencing
  • good when you need more flexibility
  • good when you have a many-to-many relationship
  2. Embedding
  • good when the sub-object always appears with its parent, like a comment on a post
  • good when you have a one-to-many relationship

A basic One-to-Many Example

  • Recap: In relational databases, a one-to-many relationship occurs when a parent record in one table can potentially reference several child records in another table. In a one-to-many relationship, the parent is not required to have child records; therefore, the one-to-many relationship allows zero child records, a single child record or multiple child records. The important thing is that the child cannot have more than one parent record.

Modeling One-to-Many Relationships with Embedded Documents

Consider the following example, which maps the relationship between a user and multiple addresses. It illustrates the advantage of embedding over referencing when you need to view many data entities in the context of another. In this one-to-many relationship between user and address data, the user has multiple address entities.

In a normalized data model, each address document would instead contain a reference to the user document.

Embedded

var mongoose = require('mongoose');
var addressSchema = new mongoose.Schema({
    street: String,
    city: String,
    cc: String
});

var userSchema = new mongoose.Schema({
    name: String, 
    ssn: String,
    addresses: [addressSchema]
});

module.exports = mongoose.model('User', userSchema);

Usage

var user = new User({
    name: 'Kate Monster', 
    ssn: '123-456-7890',
    addresses : [
        { street: '123 Sesame St', city: 'Anytown', cc: 'USA' },
        { street: '123 Avenue Q', city: 'New York', cc: 'USA' }
    ]
});

user.save();

// or add another address later (call user.save() again to persist it)
user.addresses.push({street: 'bancroft pkwy', city: 'wilmington', cc: 'usa'});

Output

{
  name: 'Kate Monster',
  ssn: '123-456-7890',
  addresses : [
     { street: '123 Sesame St', city: 'Anytown', cc: 'USA' },
     { street: '123 Avenue Q', city: 'New York', cc: 'USA' }
  ]
}

If your application frequently retrieves the address data together with the name, a normalized model forces it to issue multiple queries to resolve the references. Embedding the address data in the user document, as in the example above, is the better schema for that access pattern.
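For contrast, here is a sketch of the normalized alternative using plain in-memory arrays rather than real collections. The lookup counter and the helper names are assumptions added for illustration; the point is that the referenced layout needs two lookups where the embedded one needs one.

```javascript
let lookups = 0; // counts simulated queries

// Normalized: addresses live apart and point back at their user.
const users = [{ _id: 'u1', name: 'Kate Monster' }];
const addresses = [
  { user: 'u1', street: '123 Sesame St', city: 'Anytown', cc: 'USA' },
  { user: 'u1', street: '123 Avenue Q', city: 'New York', cc: 'USA' }
];

function findUser(id) {
  lookups += 1;
  return users.find(function (u) { return u._id === id; });
}

function findAddressesOf(userId) {
  lookups += 1;
  return addresses.filter(function (a) { return a.user === userId; });
}

const kate = findUser('u1');
const kateAddresses = findAddressesOf(kate._id);
console.log(kate.name, kateAddresses.length, 'lookups:', lookups);
```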


Modeling One-to-Many Relationships with Document References

Consider the following example that maps products and order relationships. The example illustrates the advantage of referencing over embedding to avoid repetition of the products information.

var mongoose = require('mongoose');

var orderSchema = new mongoose.Schema({
    products: [{type: mongoose.Schema.ObjectId, ref: 'Product'}]
});

var Order = mongoose.model('Order', orderSchema);

var productSchema = new mongoose.Schema({
    name: String,
    price: Number
});

var Product = mongoose.model('Product', productSchema);

Usage

var product = new Product({name: 'Wrench', price: 5});
product.save();

var order = new Order();
order.products.push(product); // only the product's ObjectId is persisted
order.save();
order.products; // ["57ec7d5cf292421828791b8b"] (just the ObjectId)

Output

{ 
    _id: '57ec7d63f292421828791b8c',
    products: [ '57ec7d5cf292421828791b8b' ] 
}

To load the referenced documents, call populate on the query.

Order.findById(id).populate('products').exec(function(err, order){
    console.log(order);
});

Output

{ _id: '57ec800a3130441eb4b52e39',
  __v: 0,
  products: 
   [ { _id: '57ec800a3130441eb4b52e38',
       name: 'Wrench',
       price: 5,
       __v: 0 } ] }
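Conceptually, populate behaves like an application-level join: it swaps each stored id for the document it points to. The sketch below imitates that with a Map standing in for the products collection. It is not how Mongoose implements populate, and populateProducts is a hypothetical helper; dropping ids with no match is a choice made for this sketch.

```javascript
// Hypothetical stand-in for the products collection, keyed by _id.
const productsById = new Map([
  ['p1', { _id: 'p1', name: 'Wrench', price: 5 }]
]);

// The order as stored: it holds product ids, not product documents.
const storedOrder = { _id: 'o1', products: ['p1'] };

// Return a copy of the order with each id replaced by its document.
function populateProducts(order, byId) {
  return Object.assign({}, order, {
    products: order.products
      .map(function (id) { return byId.get(id); })
      .filter(Boolean) // skip ids that have no matching product
  });
}

const populatedOrder = populateProducts(storedOrder, productsById);
console.log(populatedOrder.products[0].name);
```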

Question Checkpoint

  • what are mongo references?
  • what are mongo embeds?
  • when would you use a reference?
  • when would you use an embed?

Modeling Many to Many Relationships

  • Just like one-to-many relationships, many-to-many relationships are important for any app.
  • Normally we implement these relationships with MongoDB by linking documents via referencing.
  • We're going to discuss a few possible ways to model many-to-many relationships and show you how to pick one over another.

A Basic Many to Many Example

  • Recap: A many-to-many relationship refers to a relationship between tables in a database when a parent row in one table contains several child rows in the second table, and vice versa.

  • Another example
    • a product can be in many categories
    • a category can have many products
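The product and category example can be sketched with plain objects that each hold an array of the other side's ids. This is one possible layout, not the only one; the data and the link helper are hypothetical.

```javascript
const wrench = { _id: 'p1', name: 'Wrench', categories: [] };
const tools = { _id: 'c1', name: 'Tools', products: [] };
const hardware = { _id: 'c2', name: 'Hardware', products: [] };

// Record the relationship on both sides, skipping ids already present.
function link(product, category) {
  if (!product.categories.includes(category._id)) {
    product.categories.push(category._id);
  }
  if (!category.products.includes(product._id)) {
    category.products.push(product._id);
  }
}

link(wrench, tools);
link(wrench, hardware);
link(wrench, tools); // repeated call is a no-op

console.log(wrench.categories, tools.products);
```

Keeping both arrays in sync is the application's job in this layout, which is the bookkeeping a relationship plugin automates.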

Many to Many

var mongoose = require("mongoose"),
    Schema = mongoose.Schema,
    relationship = require("mongoose-relationship"); // third-party plugin

var ParentSchema = new Schema({
    children:[{ type:Schema.ObjectId, ref:"Child" }]
});
var Parent = mongoose.model("Parent", ParentSchema);

var OtherParentSchema = new Schema({
    children:[{ type:Schema.ObjectId, ref:"Child" }]
});
var OtherParent = mongoose.model("OtherParent", OtherParentSchema);

// childPath names the array on the parent that should track this child
var ChildSchema = new Schema({
    parents: [{ type:Schema.ObjectId, ref:"Parent", childPath:"children" }],
    otherParents: [{ type:Schema.ObjectId, ref:"OtherParent", childPath:"children" }]
});
ChildSchema.plugin(relationship, { relationshipPathName:['parents', 'otherParents'] });
var Child = mongoose.model("Child", ChildSchema);

var parent = new Parent({});
parent.save();
var otherParent = new OtherParent({});
otherParent.save();

var child = new Child({});
child.parents.push(parent);
child.otherParents.push(otherParent);
child.save(); // both parent and otherParent children property will now contain the child's id
child.remove(); // the plugin then takes the child's id back out of both children arrays

Embedding and Referencing Documents - TL;DR

  • One: favor embedding unless there is a compelling reason not to

  • Two: needing to access an object on its own is a compelling reason not to embed it

  • Three: If there are more than a couple of hundred documents on the “many” side, don’t embed them; if there are more than a few thousand documents on the “many” side, don’t use an array of ObjectID references. High-cardinality arrays are a compelling reason not to embed.

  • Four: Don’t be afraid of application-level joins: if you index correctly and use the projection specifier (as shown in part 2 of the post linked below) then application-level joins are barely more expensive than server-side joins in a relational database.

  • Five: Consider the write/read ratio when denormalizing. A field that will mostly be read and only seldom updated is a good candidate for denormalization: if you denormalize a field that is updated frequently then the extra work of finding and updating all the instances is likely to overwhelm the savings that you get from denormalizing.

  • Six: As always with MongoDB, how you model your data depends – entirely – on your particular application’s data access patterns. You want to structure your data to match the ways that your application queries and updates it.

http://blog.mongodb.org/post/87200945828/6-rules-of-thumb-for-mongodb-schema-design-part-1
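Rules one to three can be read as a small decision procedure, sketched below. The numeric cutoffs are assumed readings of "a couple of hundred" and "a few thousand", the suggestModel name is hypothetical, and the fallback of storing the parent's id on each child is an assumption about what to do once an array of ids grows too large. Treat it as a starting point for discussion, not a rule engine.

```javascript
// Assumed cutoffs; the rules above only give rough magnitudes.
const EMBED_LIMIT = 200;     // "a couple of hundred"
const ID_ARRAY_LIMIT = 2000; // "a few thousand"

function suggestModel(options) {
  const count = options.manySideCount;
  if (count > ID_ARRAY_LIMIT) {
    // Too many children even for an array of ids on the parent.
    return 'parent reference on the child';
  }
  if (options.accessedOnItsOwn || count > EMBED_LIMIT) {
    // Rule two (standalone access) or rule three (high cardinality).
    return 'array of references';
  }
  // Rule one: embed unless there is a compelling reason not to.
  return 'embed';
}

console.log(suggestModel({ manySideCount: 10, accessedOnItsOwn: false }));
```

Rule six still applies: the access patterns of the application should override whatever a generic heuristic like this suggests.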


Conclusion

  • referencing lets you store the IDs of other documents, much like foreign keys in a SQL table
  • embedding lets you store a whole child object inside a parent object
  • one-to-many in Mongo works much like how we handled it in Rails
  • many-to-many in Mongo works differently, and there are a couple of ways to handle that association
