Multikey indexing lets you index the values inside an array field, which makes queries on array contents fast, e.g. colors: [ "red", "blue", "green" ]
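A rough way to picture a multikey index is one index entry per array element, each pointing back at the documents that contain it. The sketch below is plain in-memory JavaScript, not MongoDB's implementation; `docs` and `buildMultikeyIndex` are hypothetical names made up for illustration. In the shell the real index would be created with something like `db.things.createIndex({ colors: 1 })`, where `things` is an assumed collection name.

```javascript
// In-memory sketch of the idea behind a multikey index:
// every element of the array field gets its own index entry.
// `docs` and `buildMultikeyIndex` are hypothetical, for illustration only.
const docs = [
  { _id: 1, colors: ["red", "blue", "green"] },
  { _id: 2, colors: ["blue"] },
  { _id: 3, colors: ["green", "red"] },
];

function buildMultikeyIndex(documents, field) {
  const index = new Map(); // value -> list of _id values
  for (const doc of documents) {
    for (const value of doc[field] || []) {
      if (!index.has(value)) index.set(value, []);
      index.get(value).push(doc._id);
    }
  }
  return index;
}

const colorIndex = buildMultikeyIndex(docs, "colors");

// "Is red in the colors array?" becomes a single lookup instead of a scan.
const redIds = colorIndex.get("red") || [];
console.log(redIds); // [ 1, 3 ]
```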
On the way out (queries):
Is red in the colors array? Is red not in the colors array?
Scalar: $ne, $mod, $exists, $type, $lt, $lte, $gt, $gte
Vector: $in, $nin, $all, $size
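To make the vector operators concrete, here is a minimal sketch that mimics their matching rules against a single array field. These are simplified stand-ins written for these notes, not the server's code; `matchers` is a hypothetical name. In the shell, "is red in the colors array?" would look like `db.things.find({ colors: { $in: ["red"] } })`, again assuming a collection called `things`.

```javascript
// Simplified stand-ins for the vector query operators on one array field.
// Illustration only: real MongoDB matching has more cases (missing fields, nested arrays, ...).
const matchers = {
  // true if any element of the array is in the wanted list
  $in: (arr, wanted) => arr.some((v) => wanted.includes(v)),
  // true if no element of the array is in the unwanted list
  $nin: (arr, unwanted) => !arr.some((v) => unwanted.includes(v)),
  // true if the array contains every required value
  $all: (arr, required) => required.every((v) => arr.includes(v)),
  // true if the array has exactly n elements
  $size: (arr, n) => arr.length === n,
};

const colors = ["red", "blue", "green"];

const redIsIn = matchers.$in(colors, ["red"]);           // true
const redIsNotIn = matchers.$nin(colors, ["red"]);       // false
const hasRedAndBlue = matchers.$all(colors, ["red", "blue"]); // true
const hasThree = matchers.$size(colors, 3);              // true

console.log({ redIsIn, redIsNotIn, hasRedAndBlue, hasThree });
```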
On the way in (atomic updates):
Scalar: $inc, $set, $unset
Vector: $push, $pop, $pull, $pullAll, $addToSet
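The difference between a few of these update operators is easiest to see side by side. The sketch below imitates `$push`, `$addToSet`, `$pull`, and `$inc` on a plain object. It only models the resulting document, not the atomicity the server provides; `update` and `doc` are hypothetical names.

```javascript
// In-memory imitation of a few update operators (document state only, no atomicity).
const update = {
  // always appends, duplicates allowed
  $push: (arr, value) => { arr.push(value); },
  // appends only if the value is not already present
  $addToSet: (arr, value) => { if (!arr.includes(value)) arr.push(value); },
  // removes every element equal to the value
  $pull: (arr, value) => {
    for (let i = arr.length - 1; i >= 0; i--) {
      if (arr[i] === value) arr.splice(i, 1);
    }
  },
  // adds the amount to a numeric field
  $inc: (target, field, amount) => { target[field] = (target[field] || 0) + amount; },
};

const doc = { _id: 1, colors: ["red"], views: 0 };

update.$push(doc.colors, "blue");      // colors: red, blue
update.$addToSet(doc.colors, "red");   // unchanged, red is already there
update.$addToSet(doc.colors, "green"); // colors: red, blue, green
update.$pull(doc.colors, "blue");      // colors: red, green
update.$inc(doc, "views", 1);          // views: 1

console.log(doc);
```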
//Case #1: As a librarian, when I swipe a patron's card, I want to verify their address
//• One-to-one relationships = "belongs to". They are often embedded.
patron = {
  _id: "joe",
  name: "joe trojan",
  address: {
    street: "123 face st.",
    city: "Los Angeles",
    state: "MA",
    Zip: 10000
  }
}
//Case #2: As a librarian, I want to store multiple addresses so I have a better chance of hunting you down
patron = {
  _id: "joe",
  name: "joe trojan",
  join_date: ISODate("2011-10-14"),
  address: [
    { street: "123 face st.", city: "Los Angeles", state: "MA", Zip: 10000 },
    { street: "456 face st.", city: "Los Angeles", state: "MA", Zip: 10000 }
  ]
}
- As a librarian, I want to see the publisher of a book //• A publisher puts out a lot of books
book = {
  _id: "123",
  title: "Mongo DB Book",
  authors: [],
  published_date: ISODate("2010-09-24"),
  pages: 215,
  language: "English",
  publisher: { name: "O'Reilly Media", founded: 1980, location: "CA" }
}
//This causes trouble because it is hard to find all the publishers in one query. Plus, this data is immutable (meaning that it will never change).
- A librarian wants to query for all the publishers in the system //If you don't care about history, you don't need to worry about changes to the publisher, but data is history, so you must respect it.
publisher = { _id: "oreilly", name: "O'Reilly Media", founded: 1980, location: "CA" }
book = { _id: "123", publisher_id: "oreilly" }
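With the publisher linked by id, listing every publisher is a single query on its own collection, while showing a book's publisher costs a second lookup. A minimal in-memory sketch of both reads, assuming arrays stand in for the two collections (`publishers`, `books`, and `publisherOf` are hypothetical names):

```javascript
// Arrays standing in for the two collections (hypothetical, for illustration).
const publishers = [
  { _id: "oreilly", name: "O'Reilly Media", founded: 1980, location: "CA" },
];
const books = [
  { _id: "123", title: "Mongo DB Book", publisher_id: "oreilly" },
];

// Read 1: fetch the book. Read 2: follow publisher_id to the publisher.
function publisherOf(bookId) {
  const book = books.find((b) => b._id === bookId);
  if (!book) return null;
  return publishers.find((p) => p._id === book.publisher_id) || null;
}

// "All the publishers in the system" is now one pass over one collection.
const allPublisherNames = publishers.map((p) => p.name);

const found = publisherOf("123");
console.log(found.name);        // O'Reilly Media
console.log(allPublisherNames); // [ "O'Reilly Media" ]
```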
//Attempt #3: Bad idea because the books array keeps growing
publisher = { _id: "oreilly", name: "O'Reilly Media", founded: 1980, books: [ "123", ...] }
//Case #6: Find authors of book 'foo'
book = {
  _id: "123",
  title: "Mongo DB Book",
  authors: [
    { id: "kchodoworow", name: "kristina cho" },
    { id: "mdirol", name: "Mike dirioli" }
  ],
  published_date: ISODate("2010-09-24"),
  pages: 215,
  language: "English",
}
author = { _id: "kchodoworow", name: "kristina cho", hometown: "New York" }
author = { _id: "mdirol", name: "Mike dirioli", hometown: "CA" }
//Attempt #2: Let's put the book in the author
book = {
  _id: "123",
  title: "Mongo DB Book",
  published_date: ISODate("2010-09-24"),
  pages: 215,
  authors: [
    { id: "kchodoworow", name: "kristina cho" },
    { id: "mdirol", name: "Mike dirioli" }
  ],
  language: "English",
}
author = {
  _id: "kchodoworow",
  name: "kristina cho",
  hometown: "New York",
  books: [ { id: "123", title: "Mongo DB Definitive guide" } ]
}
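Keeping a small { id, name } summary of each author inside the book means the common question, "who wrote this?", is answered from the book alone, and the full author record is fetched only when needed. A rough in-memory sketch of both reads, with arrays standing in for collections (`authorNamesOf` and `authorDetailsOf` are hypothetical helpers):

```javascript
// Arrays standing in for the collections (hypothetical, for illustration).
const books = [
  {
    _id: "123",
    title: "Mongo DB Book",
    authors: [
      { id: "kchodoworow", name: "kristina cho" },
      { id: "mdirol", name: "Mike dirioli" },
    ],
  },
];
const authors = [
  { _id: "kchodoworow", name: "kristina cho", hometown: "New York" },
  { _id: "mdirol", name: "Mike dirioli", hometown: "CA" },
];

// One read: the names are already embedded in the book.
function authorNamesOf(bookId) {
  const book = books.find((b) => b._id === bookId);
  return book ? book.authors.map((a) => a.name) : [];
}

// Extra reads: follow the embedded ids to the full author documents.
function authorDetailsOf(bookId) {
  const book = books.find((b) => b._id === bookId);
  if (!book) return [];
  const ids = book.authors.map((a) => a.id);
  return authors.filter((a) => ids.includes(a._id));
}

const names = authorNamesOf("123");
const hometowns = authorDetailsOf("123").map((a) => a.hometown);
console.log(names);     // [ 'kristina cho', 'Mike dirioli' ]
console.log(hometowns); // [ 'New York', 'CA' ]
```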
Embedding
- Great for read performance
- One seek to load the entire object
- One round trip to the database
- Writes can be slow
- Maintaining data integrity is harder
- MongoDB has a 16 MB document size cap. The Gutenberg Bible is 4 MB
Linking
- More flexibility
- Data integrity is maintained
- Work is done during reads
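The trade-off above shows up as soon as a duplicated value changes. In this sketch a book title is renamed: with embedded copies every author document holding the title must be rewritten, while with linking there is one write and the "join" happens at read time. Everything here is in-memory and hypothetical (the `a1`/`a2` ids, the collections, and the helper names are made up for illustration).

```javascript
// Embedded: the title is copied into every author document.
const authorsEmbedded = [
  { _id: "a1", books: [{ id: "123", title: "Old Title" }] },
  { _id: "a2", books: [{ id: "123", title: "Old Title" }] },
];

// Linked: authors store only the book id; the title lives in one place.
const booksLinked = [{ _id: "123", title: "Old Title" }];
const authorsLinked = [
  { _id: "a1", book_ids: ["123"] },
  { _id: "a2", book_ids: ["123"] },
];

// Returns how many embedded copies had to be rewritten.
function renameEmbedded(bookId, title) {
  let writes = 0;
  for (const author of authorsEmbedded) {
    for (const book of author.books) {
      if (book.id === bookId) {
        book.title = title;
        writes++;
      }
    }
  }
  return writes;
}

// Returns how many documents had to be rewritten (0 or 1).
function renameLinked(bookId, title) {
  const book = booksLinked.find((b) => b._id === bookId);
  if (!book) return 0;
  book.title = title;
  return 1;
}

// With linking, the work is done during the read: resolve each id to its book.
function titlesForLinkedAuthor(authorId) {
  const author = authorsLinked.find((a) => a._id === authorId);
  if (!author) return [];
  return author.book_ids
    .map((id) => booksLinked.find((b) => b._id === id))
    .filter(Boolean)
    .map((b) => b.title);
}

const embeddedWrites = renameEmbedded("123", "New Title");
const linkedWrites = renameLinked("123", "New Title");
console.log(embeddedWrites, linkedWrites); // 2 1
console.log(titlesForLinkedAuthor("a1"));  // [ 'New Title' ]
```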
If a book has categories, then we can put books //Problem
book = { category: "mongodb" }
category = { _id: "mongodb", parent: "databases" }
category = { _id: "databases", parent: "programming" }
//Option #1: Store the category hierarchy
//If you index the categories array, you'll be efficient
book = { categories: [ "MongoDB", "Databases", "Programming" ] }
book = { categories: [ "MySQL", "Databases", "Programming" ] }
book = { categories: [ "MySQL", "Databases", "Programming" ] }
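With the whole ancestor chain stored on each book, "everything under Databases" is a plain membership test on the array, which is exactly what a multikey index speeds up. A minimal in-memory sketch, assuming an array stands in for the collection (`inCategory` is a hypothetical helper, and the third book is made-up sample data). The shell equivalent would be along the lines of `db.books.find({ categories: "Databases" })`.

```javascript
// Array standing in for the books collection (hypothetical sample data).
const books = [
  { _id: 1, categories: ["MongoDB", "Databases", "Programming"] },
  { _id: 2, categories: ["MySQL", "Databases", "Programming"] },
  { _id: 3, categories: ["Haskell", "Programming"] },
];

// A book is "in" a category if that category appears anywhere in its chain.
function inCategory(name) {
  return books.filter((b) => b.categories.includes(name)).map((b) => b._id);
}

const databaseBooks = inCategory("Databases");
const programmingBooks = inCategory("Programming");
console.log(databaseBooks);    // [ 1, 2 ]
console.log(programmingBooks); // [ 1, 2, 3 ]
```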
//Option #2: Putting things into a string
book = { category: "Programming/Databases/MySQL" }
book = { category: "Programming/Databases/MongoDB" }
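If the path is stored root first, finding everything under a category becomes an anchored prefix match. The sketch below shows that idea in memory, assuming "/" as the separator and root-first ordering; `escapeRegExp` and `booksUnder` are hypothetical helpers, and the third book is made-up sample data.

```javascript
// Array standing in for the books collection (hypothetical sample data).
const books = [
  { _id: 1, category: "Programming/Databases/MongoDB" },
  { _id: 2, category: "Programming/Databases/MySQL" },
  { _id: 3, category: "Programming/Languages/Haskell" },
];

// Escape regex metacharacters so a category name is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Match the path itself or anything below it: "^<path>" followed by "/" or end of string.
function booksUnder(path) {
  const pattern = new RegExp("^" + escapeRegExp(path) + "(/|$)");
  return books.filter((b) => pattern.test(b.category)).map((b) => b._id);
}

const underDatabases = booksUnder("Programming/Databases");
const underProgramming = booksUnder("Programming");
console.log(underDatabases);   // [ 1, 2 ]
console.log(underProgramming); // [ 1, 2, 3 ]
```

The trailing "(/|$)" keeps a search for "Programming/Data" from accidentally matching "Programming/Databases".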
//If you have a problem and you solve it with regular expressions, you now have two problems
//Option #3: Interval trees