@vvardhanz
Last active September 24, 2019 12:54
MongoDbNotes
MongoDB University notes.
MongoDB is a document database.
MongoDB stores data as JSON-style documents.
MongoDB scales out (adding more servers) through sharding, rather than scaling up (moving to a bigger server).
BSON:
MongoDB stores data in BSON format. On the application side, the MongoDB drivers map BSON data to the native data types of the programming language.
BSON --> Binary JSON
--> Lightweight
--> Traversable --> supports writing, reading, etc.
--> Efficient --> encoding and decoding are fast, and are usually taken care of by the drivers.
JSON: Drawbacks
--> JSON has only a single number type. You cannot distinguish between integers and floats.
--> JSON doesn't support a date type. You have to include a date as a string or as otherwise encoded data.
For these reasons MongoDB uses BSON, which extends JSON and provides the date type and the additional number types that JSON is missing.
ex:
//JSON
{ "hello": "world" }
//BSON
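The BSON side of the example above is a binary string rather than text. As a rough sketch (a hand-rolled encoder that only handles string values, not what a driver actually runs; the function name encodeStringDoc is made up for this note), the bytes for { "hello": "world" } can be built like this:

```javascript
// Encode a flat document whose values are all strings into BSON bytes.
// Illustration only: real drivers support every BSON type.
function encodeStringDoc(doc) {
  const parts = [];
  for (const [key, value] of Object.entries(doc)) {
    const valueBytes = Buffer.from(value, "utf8");
    const length = Buffer.alloc(4);
    length.writeInt32LE(valueBytes.length + 1); // string length counts the trailing NUL
    parts.push(
      Buffer.from([0x02]),             // element type 0x02 = UTF-8 string
      Buffer.from(key + "\0", "utf8"), // field name as a NUL-terminated string
      length,
      valueBytes,
      Buffer.from([0x00])              // string terminator
    );
  }
  const body = Buffer.concat(parts);
  const size = Buffer.alloc(4);
  size.writeInt32LE(4 + body.length + 1); // size prefix + elements + end-of-object byte
  return Buffer.concat([size, body, Buffer.from([0x00])]);
}

const bson = encodeStringDoc({ hello: "world" });
console.log(bson.length);          // 22
console.log(bson.toString("hex")); // 160000000268656c6c6f0006000000776f726c640000
```

The length prefixes are what make BSON traversable: a reader can skip a whole value or document without parsing it.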
Importing JSON files using mongoimport
mongoimport --db m101 --collection zips --file zips.json
MongoDB 11/21/17
MongoDB is a NoSQL database in the document store category (records are stored in a JSON/XML-like format).
MongoDB follows the JSON format.
CouchDB and MongoDB are popular document store databases.
Advantages:
Different attributes can be stored from one user (document) to the next: schema-less behavior.
JSON supports complex structures and collections (nested documents and arrays), so MongoDB supports complex data.
Mongo Db Commands 11/21/17
--> A db (database) is a logical group of collections, indexes, etc.
Advantage of a db: maintenance activities are made simple.
show dbs
--> Lists all the existing databases.
use mydb
--> If the given db exists, it is selected. If it does not exist, it is created on the first write. For a db to show up in show dbs, it must have at least one collection with one document.
db.myinfo.insert({"name": "vivek", "age": 29});
--> The myinfo collection is created and a document is inserted.
db.myinfo.find().pretty();
--> Returns the documents in a readable format.
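db
--> Shows which db you are currently using.
There is no db.createDatabase() command. A database is created implicitly, either by use followed by a first write or by creating a collection in it with db.createCollection("name").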
show collections
--> Lists the collections in the db.
db.createCollection("myname");
--> Another way to create a collection (the equivalent of a table).
db.myname.drop();
--> Drops a collection.
Dropping a database:
Select the database, confirm it is the current one, drop it, then list the remaining databases.
--> use mydb
--> db
--> db.dropDatabase()
--> show dbs
Mongo Db CRUD Operations.
Read
db.myname.find();
--> Returns all the documents from the myname collection.
ex: select * from myname;
db.myname.find().pretty();
--> Shows the results in a more readable format.
db.myname.find({}, {"Name": 1});
--> Selects only Name from the myname collection. The field list is a projection, passed as the second argument; _id is also returned unless it is excluded.
ex: select Name from myname;
db.myname.find({}, {"Name": 1, "Age": 1});
--> Selects Name and Age from the myname collection.
ex: select Name, Age from myname;
db.myname.find({"Name": "Vivek"});
--> Returns all the documents whose Name is Vivek.
ex: select * from myname where Name = "Vivek";
db.myname.find({"age": {$gt: 25}});
ex: select * from myname where age > 25;
--> $lt <
--> $gt >
--> $lte <=
--> $gte >=
--> $ne !=
Logical Operators( and/or):
Remove the first or last element of an array:
db.students.update({_id: 1}, {$pop: {scores: -1}}); // removes the first element
db.students.update({_id: 1}, {$pop: {scores: 1}}); // removes the last element
Here -1 stands for the first element and 1 for the last element.
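To make the -1 / 1 convention concrete, here is a small model of $pop in plain JavaScript. It is a sketch of the semantics only, not MongoDB code; the helper name applyPop and the sample document are made up.

```javascript
// Model of {$pop: {field: direction}} applied to a single document.
function applyPop(doc, spec) {
  const result = { ...doc };
  for (const [field, direction] of Object.entries(spec)) {
    if (!Array.isArray(result[field])) continue; // this sketch ignores missing or non-array fields
    result[field] = direction === -1
      ? result[field].slice(1)      // -1: drop the first element
      : result[field].slice(0, -1); //  1: drop the last element
  }
  return result;
}

const student = { _id: 1, scores: [8, 9, 10] };
console.log(applyPop(student, { scores: -1 }).scores); // [ 9, 10 ]
console.log(applyPop(student, { scores: 1 }).scores);  // [ 8, 9 ]
```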
--> $pull: removes from an array all the elements that match a value or condition. One $pull can target several array fields.
ex:
db.students.update({},
{$pull: {fruits: {$in: ["apples", "oranges"]},
vegetables: "carrots"}},
{multi: true}
);
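$max or $min
These update a field only when the specified value is beyond the current one, which is useful for tracking a running low and a running high.
db.scores.update({_id: 1}, {$min: {lowscores: 150}}) sets lowscores to 150 only if 150 is less than the current value.
db.scores.update({_id: 1}, {$max: {highscores: 950}}) sets highscores to 950 only if 950 is greater than the current value.
--> The field is updated only if the given value is < the current value (for $min) or > the current value (for $max).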
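The compare-then-update rule can be modelled in a few lines of plain JavaScript. This is only a sketch for numeric fields; applyMinMax and the starting values 200 and 800 are assumptions made for the illustration.

```javascript
// Model of {$min: spec} / {$max: spec} on one document (numbers only).
function applyMinMax(doc, operator, spec) {
  const result = { ...doc };
  for (const [field, candidate] of Object.entries(spec)) {
    const current = result[field];
    const wins = operator === "$min" ? candidate < current : candidate > current;
    // A missing field is simply set to the candidate value.
    if (current === undefined || wins) result[field] = candidate;
  }
  return result;
}

const scores = { _id: 1, lowscores: 200, highscores: 800 };
console.log(applyMinMax(scores, "$min", { lowscores: 150 }).lowscores);   // 150 (150 < 200, updated)
console.log(applyMinMax(scores, "$min", { lowscores: 250 }).lowscores);   // 200 (250 > 200, unchanged)
console.log(applyMinMax(scores, "$max", { highscores: 950 }).highscores); // 950 (950 > 800, updated)
```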
$each
db.students.update(
{ name: "Joe" },
{ $push: { scores: { $each: [90, 92, 65] } } }
)
--> $push with $each allows duplicates, whereas $addToSet with $each does not add values that are already in the array.
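The difference between the two operators when combined with $each can be sketched in plain JavaScript. The helper names and the starting array [90, 70] are made up, and the model only handles scalar values.

```javascript
// $push + $each: append every value, duplicates included.
function pushEach(array, values) {
  return [...array, ...values];
}

// $addToSet + $each: append only the values that are not already present.
function addToSetEach(array, values) {
  const result = [...array];
  for (const value of values) {
    if (!result.includes(value)) result.push(value);
  }
  return result;
}

console.log(pushEach([90, 70], [90, 92, 65]));     // [ 90, 70, 90, 92, 65 ]
console.log(addToSetEach([90, 70], [90, 92, 65])); // [ 90, 70, 92, 65 ]
```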
Mongo Commands Description 11/23/17
explain(): explain helps us optimize a query. It shows what the query does in the background, such as which index it uses and how many documents it examines. It has three verbosity modes:
- queryPlanner
- executionStats
- allPlansExecution
Query Planner (the default mode):
syntax: db.foo.explain() --> returns an explainable object.
EX: db.foo.explain().find()
.update()
.remove()
.aggregate()
.count()
.group()
.help()
Execution Stats:
syntax: db.collection.explain("executionStats")
Mongo INDEXES 11/21/17
Creating indexes, including unique indexes.
Unique Index:
A unique index stops duplicate values from entering the collection.
syntax: db.collection.createIndex({field_name: 1}, {unique: true});
Ex: db.students.createIndex({student_id: 1, class_id: 1}, {unique: true});
db.stuff.createIndex({thing: 1}, {unique: true});
Sparse Indexes:
A sparse index only indexes the documents that contain the indexed field. This helps when creating a unique index on a field that is missing from some documents: those documents are left out of the index instead of all being indexed as null.
Ex: Creating a unique index on c. It throws an error because c is missing (null) in the 3rd and 4th documents. You can handle this type of situation with a sparse index.
{ a:1, b:2 , c:3 }
{ a:2, b:3 , c:4 }
{ a:3, b:4 }
{ a:4, b:5 }
Note: Without sparse, the unique index throws a duplicate key error in the above situation.
A sparse index uses a lot less space than a normal index.
A sparse index is not used for a sort. MongoDB does a regular sort rather than the quicker index sort, because the documents that lack the field are not in the index and MongoDB does not want to skip those documents in the sort results.
syntax: db.collection.createIndex({field_name: 1}, {unique: true, sparse: true});
Ex: db.students.createIndex({a_id: 1, b_id: 1, c_id: 1}, {unique: true, sparse: true});
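The four sample documents above can be run through a small model of the uniqueness check. This is a sketch in plain JavaScript, not how MongoDB builds an index; firstDuplicate is a made-up helper that reports the position of the first document that would trigger a duplicate key error.

```javascript
// Returns the position of the first document that violates a unique
// index on `field`, or -1 if the index could be built.
function firstDuplicate(docs, field, { sparse = false } = {}) {
  const seen = new Set();
  for (let i = 0; i < docs.length; i++) {
    const hasField = field in docs[i];
    if (sparse && !hasField) continue;            // sparse: document is left out of the index
    const key = hasField ? docs[i][field] : null; // non-sparse: missing field is indexed as null
    const token = JSON.stringify(key);
    if (seen.has(token)) return i;
    seen.add(token);
  }
  return -1;
}

const docs = [
  { a: 1, b: 2, c: 3 },
  { a: 2, b: 3, c: 4 },
  { a: 3, b: 4 },
  { a: 4, b: 5 },
];
console.log(firstDuplicate(docs, "c"));                   // 3 (second null key)
console.log(firstDuplicate(docs, "c", { sparse: true })); // -1 (no conflict)
```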
11/22/17
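Multikey Index: A multikey index is an index on an array field. An index becomes multikey as soon as at least one document has an array as the value of an indexed field.
You cannot index parallel arrays: in a compound index, at most one of the indexed fields can be an array in any given document. One array and one scalar (in either order) is fine.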
11/23/17
Creating Indexes in the Background
You can create indexes in the background.
Note:
Creating an index in the background is slower, but it doesn't block reads and writes.
Creating an index in the foreground is faster, but it blocks reads and writes to the database.
syntax: db.collection.createIndex({field_name: 1}, {background: true});
11/29/13
Index efficiency is important. Choosing the proper index for a query returns the results quicker.
Logging and Profiling:
Profiling levels:
0 --> Off (no profiling).
1 --> Queries which are slow.
2 --> All queries.
Ex: level 1 with a threshold of 2 milliseconds, which logs the queries that take longer than 2 ms:
mongod --dbpath /usr/local/var/mongodb --profile 1 --slowms 2
db.system.profile.find().pretty();
db.system.profile.find({ns: "<database>.<collection>"}).pretty();
db.getProfilingLevel();
db.getProfilingStatus();
You can set the profiling level with db.setProfilingLevel(level, slowms). Here you give the type of profiling and a number of milliseconds: the queries that take longer than that number of milliseconds are the ones that get recorded.
To turn profiling off, use the syntax below:
db.setProfilingLevel(0);
mongostat:
mongostat is a performance monitoring command.
It is similar to the iostat command from Unix. It samples the database at 1 second intervals.
Ex: it reports inserts, updates, queries, and deletes per second.
It also tells you which storage engine you are running, such as WiredTiger or MMAPv1.
Ex: mongostat --port 27018
It gives these details for the server on that port.
Aggregation 11/30/17
Aggregation :
EX:
db.products.aggregate([
{$group:
{ _id: "$manufacturer",
num_products:{$sum:1}
}
}
]);
Aggregation pipeline:
ORDER OF THE PIPELINE (a typical ordering; stages can appear in any order and more than once)
Collection --> $project --> $match --> $group --> $sort --> results
$project: a reshaping stage. For every document that comes into the project stage, one document goes out (1:1).
$match: a filtering stage. It is n:1 by nature.
$group: an aggregating stage. It is n:1 by nature.
$sort: a sorting stage. It sorts the documents. It is 1:1 by nature.
$skip: skips documents. It is n:1 by nature.
$limit: limits the number of documents. It is n:1 by nature.
$unwind: normalizes (flattens) array data. It creates an explosion of data, one output document per array element, but that makes it possible to aggregate on the elements. It is 1:n by nature.
Ex: tags: ["red", "green", "yellow"]
becomes
tags: "red"
tags: "green"
tags: "yellow"
$out: redirects your output to a collection rather than a cursor.
Others: $redact, $geoNear
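The 1:n, n:1 and 1:1 behaviour of the stages can be seen by imitating a short pipeline ($unwind --> $group --> $sort) on an in-memory array. This is a plain JavaScript sketch with made-up sample posts, not the aggregation engine itself.

```javascript
const posts = [
  { _id: 1, tags: ["red", "green"] },
  { _id: 2, tags: ["red", "yellow"] },
  { _id: 3, tags: ["red"] },
];

// $unwind: one output document per array element (1:n).
const unwound = posts.flatMap(post => post.tags.map(tag => ({ ...post, tags: tag })));

// $group with {$sum: 1}: count the documents per tag (n:1).
const counts = new Map();
for (const doc of unwound) {
  counts.set(doc.tags, (counts.get(doc.tags) || 0) + 1);
}

// $sort by count, descending (1:1).
const result = [...counts]
  .map(([tag, count]) => ({ _id: tag, count }))
  .sort((x, y) => y.count - x.count);

console.log(unwound.length); // 5
console.log(result[0]);      // { _id: 'red', count: 3 }
```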
12/8/17
$sum --> sums up the values.
$avg --> calculates the average across the documents.
$min --> finds the minimum value.
$max --> finds the maximum value.
$push --> builds an array.
$addToSet --> builds an array, adding only unique values.
$first --> takes the value from the first document in each group. Only meaningful after a sort.
$last --> takes the value from the last document in each group. Only meaningful after a sort.
$SUM
EX:
db.zips.aggregate([ {$group: { _id: {"state":"$state"}, population: {$sum: "$pop"}}} ])
or
db.zips.aggregate([ {$group: { _id:"$state", population: {$sum: "$pop"}}} ])
$AVG
> db.zips.aggregate([ {$group: {"_id": "state", "average":{ "$avg" : "$pop"}}} ])
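{ "_id" : "state", "average" : 8462.794262937348 }
Here _id is the literal string "state" (no $ prefix), so every document lands in one group and the result is the average over the whole collection. To average per state: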
> db.zips.aggregate([ {$group: {"_id": "$state", "average":{ "$avg" : "$pop"}}} ])
$addToSet
It is very helpful if you want to collect the unique values of a field within each group, i.e. to find all the different values that occur.
EX:
db.products.aggregate([
{ $group: { _id: {"maker": "$manufacturer"}, "categories": {$addToSet: "$category"} } } ])
In the above example we create a new categories array for each maker and add only the unique category values to it.
or
db.zips.aggregate([ {$group: {"_id": "$city", "postal_code":{ "$addToSet" : "$_id"}}} ])
$PUSH
$push is similar to $addToSet except it doesn't check for unique values, so it keeps duplicates as well.
EX:
db.products.aggregate([ { $group: { _id: {"maker": "$manufacturer"}, "categories": {$push: "$category"} } } ])
OR
db.zips.aggregate([ {$group: {"_id": "$city", "postal_code":{ "$push" : "$_id"}}} ])
$MAX and $MIN
Give the max or min value within each group.
Ex:
db.products.aggregate([ { $group: { _id: {"maker": "$manufacturer"}, "maxprice": {"$max": "$price"} } } ])
or
db.products.aggregate([ { $group: { _id: {"maker": "$manufacturer"}, "minprice": {"$min": "$price"} } } ])
> db.zips.aggregate([ {$group: {"_id":"$state", "pop":{"$max": "$pop"}} }])
> db.zips.aggregate([ {$group: {"_id":"$state", "pop":{"$min": "$pop"}} }])
Double Grouping
You can group twice: the second $group works on the output of the first.
EX:
db.grades.aggregate([ {"$group": {"_id": {class_id: "$class_id", student_id: "$student_id"}, "average": {"$avg": "$score"}}}, {"$group": {_id: "$_id.class_id", "average": {"$avg": "$average"}}} ])
or
Example:
> db.fun.find()
{ "_id" : 0, "a" : 0, "b" : 0, "c" : 21 }
{ "_id" : 1, "a" : 0, "b" : 0, "c" : 54 }
{ "_id" : 2, "a" : 0, "b" : 1, "c" : 52 }
{ "_id" : 3, "a" : 0, "b" : 1, "c" : 17 }
{ "_id" : 4, "a" : 1, "b" : 0, "c" : 22 }
{ "_id" : 5, "a" : 1, "b" : 0, "c" : 5 }
{ "_id" : 6, "a" : 1, "b" : 1, "c" : 87 }
{ "_id" : 7, "a" : 1, "b" : 1, "c" : 97 }
db.fun.aggregate([{$group:{_id:{a:"$a", b:"$b"}, c:{$max:"$c"}}}, {$group:{_id:"$_id.a", c:{$min:"$c"}}}])
ANS: 52 (for a = 0) and 22 (for a = 1)
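The answer can be checked by replaying the two $group stages on the same eight documents in plain JavaScript. The group helper below is a made-up, single-accumulator stand-in for $group, written only for this check.

```javascript
const fun = [
  { _id: 0, a: 0, b: 0, c: 21 }, { _id: 1, a: 0, b: 0, c: 54 },
  { _id: 2, a: 0, b: 1, c: 52 }, { _id: 3, a: 0, b: 1, c: 17 },
  { _id: 4, a: 1, b: 0, c: 22 }, { _id: 5, a: 1, b: 0, c: 5 },
  { _id: 6, a: 1, b: 1, c: 87 }, { _id: 7, a: 1, b: 1, c: 97 },
];

// keyOf picks the group key, valueOf the input value, pick combines two values.
function group(docs, keyOf, valueOf, pick) {
  const groups = new Map();
  for (const doc of docs) {
    const key = JSON.stringify(keyOf(doc));
    const entry = groups.get(key);
    if (entry) entry.value = pick(entry.value, valueOf(doc));
    else groups.set(key, { _id: keyOf(doc), value: valueOf(doc) });
  }
  return [...groups.values()];
}

// Stage 1: max of c per (a, b). Stage 2: min of those maxima per a.
const first = group(fun, d => ({ a: d.a, b: d.b }), d => d.c, Math.max);
const second = group(first, d => d._id.a, d => d.value, Math.min);
console.log(second); // [ { _id: 0, value: 52 }, { _id: 1, value: 22 } ]
```

Stage 1 yields 54, 52, 22 and 97 for the four (a, b) pairs, and stage 2 keeps the smaller maximum for each value of a.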
$PROJECT
- Removes keys
- Add new keys
- Reshape keys
- Use some simple function on keys.
- $toUpper
- $toLower
- $add
- $multiply
db.products.aggregate([
{$project:
{
_id:0,
'maker': {$toLower: "$manufacturer"},
'details': {'category': "$category", 'price': {"$multiply": ["$price", 10]} },
'item': "$name"
}
}
])
OR
db.zips.aggregate([ {$project:{'_id':0, 'city': "$city", 'pop':"$pop", 'state':"$state", 'zip':"$_id"} }])
OR
db.zips.aggregate([{$project:{_id:0, city:{$toLower:"$city"}, pop:1, state:1, zip:"$_id"}}])
$Match
--> Pre-aggregation filter.
You may want to aggregate only a certain kind of document, or simply filter.
db.zips.aggregate([ {$match: {'state': "CA"}},
{$group: {'_id': "$city",
"population": {"$sum": "$pop"},
"zip_codes": {"$addToSet": "$_id"}
}}
])
OR
db.zips.aggregate([{$match:{pop:{$gt:100000}}}]) returns all the zip codes whose population is greater than 100000.
$SORT
- Supports disk-based sorting.
- Supports in-memory sorting, which cannot use more than 100 MB.
- It can be used before or after the grouping stage.
db.zips.aggregate([ {$match: {state: 'NY'}}, {$group: {_id: "$city", population: {$sum: "$pop"} }},
{$project: {_id: 0, city: "$_id", population: 1}}, {$sort: {population: -1}} ])
or
db.zips.aggregate([ {$sort: {state:1, city:1}} ])
$SKIP and $LIMIT
Ex:
db.zips.aggregate([ {$match: {state: 'NY'}}, {$group: {_id: "$city", population: {$sum: "$pop"} }},
{$project: {_id: 0, city: "$_id", population: 1}}, {$sort: {population: -1}}, {$skip: 10}, {$limit: 5} ])
Skips the first 10 documents and limits the results to the next 5.
$First and $Last
EX:
db.zips.aggregate([ {$group: { _id: {state:"$state", city:"$city"}, population: {"$sum":"$pop"}} }, {$sort: {"_id.state":1, "_id.city":-1}}, {$group:{ _id:"$_id.state", city:{$first: "$_id.city"}}} ])
OR
db.zips.aggregate([ {$group: { _id: {state:"$state", city:"$city"}, population: {"$sum":"$pop"}} }, {$sort: {"_id.state":1, "_id.city":-1}}, {$group:{ _id:"$_id.state", city:{$last: "$_id.city"}}} ])
OR
db.zips.aggregate([ {$group: { _id: {state:"$state", city:"$city"}, population: {"$sum":"$pop"}} }, {$sort: {"_id.state":1, population:-1}}, {$group:{ _id:"$_id.state", population:{$first: "$population"}}} ])
$UNWIND
db.posts.aggregate([ {"$unwind": "$tags"}, {"$group": {"_id": "$tags", "count": {$sum: 1} }}, {"$sort": {"count": -1}}, {$limit: 10}, {"$project": {_id: 0, 'tag': '$_id', count: 1 }} ])
Double $UNWIND
db.inventory.aggregate([ {$unwind: "$sizes"}, {$unwind: "$colors"}, {$group: { '_id': {'size': '$sizes', 'color': '$colors'}, 'count': {'$sum': 1} }} ])
Note: can you reverse $unwind? --> Yes, by grouping with $push:
db.inventory.aggregate([ {$unwind: "$sizes"}, {$unwind: "$colors"},
/* Create the color array */
{$group: { '_id': {name: "$name", size: "$sizes"}, 'colors': {$push: "$colors"} }},
/* create the size array */
{$group: { '_id': {'name': "$_id.name",
'colors' : "$colors"},
'sizes': {$push: "$_id.size"}
}
}
])
OR if the items are unique.
db.inventory.aggregate([ {$unwind: "$sizes"}, {$unwind: "$colors"},
{$group: { '_id': "$name", sizes: {$addToSet: "$sizes"}, 'colors': {$addToSet: "$colors"} }}
])
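Limitations in Aggregation:
- 100 MB memory limit for each pipeline stage. --> To go beyond it, you need to specify the allowDiskUse option.
- 16 MB limit when the result is returned as a single document, which was the default in Python. --> Passing cursor = {} returns a cursor instead.
- Sharding --> stages such as group by and sort need the data from all the shards brought together, so after those stages the results are collected on one shard.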
Mongo Db CRUD Operations 12/2/17
--> The return value of a find is always a cursor object.
--> The shell is a fully functional JavaScript interpreter.
EX: var c = db.movies.find();
c.hasNext();
c.next();
*********************************************************************************************************
Reading Documents
Equality Matches:
- Scalars
- Embedded Documents
- Arrays
Scalar: The first argument to find is a query document. The selectors we give in the query document restrict the result set to only the documents that match the query.
Ex:
db.movieDetails.find( {rated: "PG-13"} ).count();
Ans: 152
db.movieDetails.find( {rated: "PG-13", year: 2009 } ).count();
Ans: 8
Embedded Documents: We use dot (.) notation to identify a field nested within another field. You have to put the key in quotes. This is used only for fields that contain nested documents.
ex:
"tomato" : {
"meter": 100,
"image": "certified",
"rating": "8.9"
},
"imdb": {
"id": "tt0435761",
"rating": 8.4,
"votes": 50084
}
Ex:
db.movieDetails.find( { "tomato.meter": 100 } ).pretty()
************************************************************************************************************
Arrays
Equality matches on Arrays
- On the entire array.
- Based on any element.
- Based on a specific element.
- More complex matches using operators.
On the entire array: the order of the elements matters here.
EX:
db.movieDetails.find({"writers": ["Ethan Coen", "Joel Coen"] }).count();
Ans: 1
The above query returns only the documents whose writers array is exactly Ethan followed by Joel.
db.movieDetails.find({"writers": ["Joel Coen", "Ethan Coen"] }).count();
Ans: 0
Based on any element: here the query matches if any one element of the array equals the value.
Ex:
db.movieDetails.find({ "actors": "Jeff Bridges" }).count()
Ans: 4
In the above query we look for an occurrence of "Jeff Bridges" anywhere in the actors array. Order doesn't matter here.
Based on a specific element: here we look for an element at a specific position in the array.
EX:
db.movieDetails.find({ "actors.0": "Jeff Bridges" }).pretty()
Ans: 2
Here "Jeff Bridges" is the first actor in the array in 2 documents, unlike the previous example, where "Jeff Bridges" occurred anywhere in the array in 4 documents.
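The three kinds of array equality match can be imitated with ordinary array methods. The sketch below is plain JavaScript over made-up sample movies, and only mirrors the matching rules; it is not the query engine.

```javascript
const movies = [
  { title: "A", actors: ["Jeff Bridges", "John Goodman"] },
  { title: "B", actors: ["John Goodman", "Jeff Bridges"] },
  { title: "C", actors: ["Tim Robbins"] },
];

const sameArray = (x, y) => x.length === y.length && x.every((v, i) => v === y[i]);

// {actors: ["Jeff Bridges", "John Goodman"]}: entire array, order matters.
const wholeArray = movies.filter(m => sameArray(m.actors, ["Jeff Bridges", "John Goodman"]));

// {actors: "Jeff Bridges"}: any element, order doesn't matter.
const anyElement = movies.filter(m => m.actors.includes("Jeff Bridges"));

// {"actors.0": "Jeff Bridges"}: element at a specific position.
const firstElement = movies.filter(m => m.actors[0] === "Jeff Bridges");

console.log(wholeArray.length, anyElement.length, firstElement.length); // 1 2 1
```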
CURSORS: find returns a cursor. We have to iterate it in order to access the documents. If you don't assign the cursor to a variable with var, the shell iterates it for you and prints the first documents. MongoDB returns the documents in batches, and a batch doesn't exceed the maximum BSON document size. The initial batch is 101 documents or 1 megabyte, and subsequent batches are 4 megabytes.
EX: var c = db.movies.find();
c.hasNext();
c.next();
Ex: var c = db.movieDetails.find();
var doc = function() { return c.hasNext() ? c.next() : null; }
PROJECTION: Projection is a handy way of reducing the size of the data returned from a query.
By default MongoDB returns all the fields of every matching document. Projection limits the fields that are returned. You can explicitly include or exclude fields. The projection is the second argument you give to the find function.
Ex:
db.movieDetails.find({ "rated": "PG" }, {title: 1}).pretty();
ex:
db.movieDetails.find({ "rated": "PG" }, {title: 1, _id: 0 }).pretty();
In the above example we exclude _id explicitly.
_id is always returned by default. If you don't want it, you have to explicitly exclude it.
Comparison operators:
$gt ,$lt, $gte, $lte
Ex:
db.movieDetails.find({ runtime: { $gt: 90} }).pretty();
db.movieDetails.find({ runtime: { $gt: 90, $lt: 120} }).pretty();
db.movieDetails.find({ runtime: { $gte: 90, $lte: 120} }).pretty();
db.movieDetails.find({ runtime: { $gte: 90, $lte: 120} }, {title: 1, runtime: 1, _id: 0} );
$ne
Ex:
db.movieDetails.find({ rated: { $ne: "UNRATED" } }).count();
It counts all the movies whose rated value is anything other than "UNRATED".
Important: MongoDB usually doesn't store null values for absent data. You may have documents that don't have the rated field at all, and the above query returns all those documents as well.
$in
Ex:
db.movieDetails.find({ rated: { $in: ["G", "PG", "PG-13"] } }).pretty();
db.movieDetails.find({ actors: { $in: ["Jeff Bridges"] } });
$nin
Ex:
db.movieDetails.find({ rated: { $nin: ["G"] } }).pretty();
ELEMENT Operators:
$exists: Matches documents that have the specified field.
EX:
db.movieDetails.find( { "tomato.meter" : {$exists: true} });
$type: Selects documents if a field is of the specified type.
Ex:
db.moviesScratch.find({ "_id": { $type: "string" } }).count();
The above query checks for _id values of type string and counts all the documents that have a string _id.
In this example the moviesScratch collection has a mix of _id values: some are strings (the IMDb id) and some are ObjectIds generated by MongoDB. In this kind of situation, the above query finds the documents that are keyed by IMDb id.
EVALUATION Operators:
$mod: Performs a modulo operation on the value of a field and selects documents with a specified result.
$regex: Selects documents where values match a specified regular expression.
Ex:
db.movieDetails.find({ "awards.text": { $regex: /^Won\s.*/ } })
Logical Operators
$or: Joins query clauses with a logical OR and returns all the documents that match the condition of either clause.
Ex:
db.movieDetails.find({ $or: [{"tomato.meter": {$gt: 95}}, {"metacritic": {$gt: 88}}] }).pretty();
$and: Joins query clauses with a logical AND and returns all the documents that match the conditions of both clauses.
Ex:
db.movieDetails.find({ $and: [{"tomato.meter": {$gt: 95}}, {"metacritic": {$gt: 88}}] }).pretty();
equivalent to:
db.movieDetails.find({ "tomato.meter": {$gt: 95}, "metacritic": {$gt: 88} }).pretty();
In the above situation both queries return the same results. So what's the point of using $and? See the next example:
Ex:
db.movieDetails.find({ $and: [{ "metacritic": { $ne: null } },
{ "metacritic": { $exists: true } }] });
In the above example we are using multiple criteria on the same field.
Important:
If you want to specify multiple criteria on the same field, use $and. Keys in a single query document must be unique, so the same field cannot appear twice in it.
$not: Inverts the effect of a query expression and returns documents that do not match the query expression.
$nor: Joins query clauses with a logical NOR and returns all the documents that fail to match all of the clauses.
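The reason $and is needed for several criteria on the same field shows up in plain JavaScript, since the shell builds query documents as ordinary objects. A short sketch (the variable names are made up):

```javascript
// The same key twice in one object literal: the last value silently wins.
const implicitAnd = { metacritic: { $ne: null }, metacritic: { $exists: true } };
console.log(Object.keys(implicitAnd).length); // 1
console.log(implicitAnd.metacritic);          // { '$exists': true }

// $and keeps both criteria because each one lives in its own array element.
const explicitAnd = {
  $and: [{ metacritic: { $ne: null } }, { metacritic: { $exists: true } }],
};
console.log(explicitAnd.$and.length);         // 2
```

So in the single-document form the $ne condition never reaches the server, while the $and form sends both.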
Array Operators:
$all: Matches arrays that contain all the elements specified in the query.
Ex:
db.movieDetails.find({ genres: {$all: ["Comedy", "Crime", "Drama"] } }).pretty();
$elemMatch: Selects documents if at least one element in the array field matches all the specified $elemMatch conditions.
Ex:
db.movieDetails.find({ boxOffice: { country: "UK", revenue: { $gt: 15 } } });
As written, the above query is an equality match against a whole embedded document rather than a per-field condition, so it will not find an element with country UK and revenue greater than 15. Use $elemMatch for that kind of condition:
db.movieDetails.find({ boxOffice: { $elemMatch: {"country": "USA", "revenue": {$gt: 125} }
} });
NOTE: $elemMatch requires that all the conditions be satisfied by a single element of the array.
$size: Selects documents if the array field is a specified size.
Ex:
db.movieDetails.find({ countries: { $size: 1 } }).pretty();
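The "single element" rule in the $elemMatch note is easiest to see next to the dot-notation alternative, where each condition may be satisfied by a different element. A plain JavaScript sketch over one made-up document (the revenue figures are invented):

```javascript
const movie = {
  title: "Example", // hypothetical document
  boxOffice: [
    { country: "USA", revenue: 41.3 },
    { country: "UK", revenue: 10.1 },
    { country: "Australia", revenue: 16.0 },
  ],
};

// Dot notation: {"boxOffice.country": "UK", "boxOffice.revenue": {$gt: 15}}
// Each condition can be met by a different element of the array.
const dotNotation =
  movie.boxOffice.some(e => e.country === "UK") &&
  movie.boxOffice.some(e => e.revenue > 15);

// $elemMatch: one and the same element must meet every condition.
const elemMatch = movie.boxOffice.some(e => e.country === "UK" && e.revenue > 15);

console.log(dotNotation, elemMatch); // true false
```

The document matches the dot-notation query even though its UK revenue is only 10.1, which is usually not what was intended.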
***********************************************************************************************************
UPDATE DOCUMENT
updateOne()
Update operator
$set
Ex:
db.movieDetails.updateOne({ title: "The Martian" },
{ $set: {poster: "http://ia.media-imdb.com/images/M//ass.jpg"} });
In the above example we use updateOne along with the $set operator.
Field Update Operators
$inc
Ex:
db.movieDetails.updateOne({ title: "The Martian" },
{$inc: { "tomato.reviews": 3, "tomato.userReviews": 25 } });
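$mul: Multiplies the value of the field by the specified amount.
$rename: Renames a field.
$setOnInsert: Sets the value of a field only if the update results in an insert of a new document (an upsert). It has no effect on updates that modify existing documents.
$set: Sets the value of a field in a document.
$min: Only updates the field if the specified value is less than the existing value.
$max: Only updates the field if the specified value is greater than the existing value.
$currentDate: Sets the value of a field to the current date, either as a Date or a Timestamp.
$addToSet: Adds elements to an array only if they do not already exist in it.
$pop: Removes the first or last element of an array.
$pullAll: Removes all the matching values from an array.
$pushAll: Deprecated. Use $push with $each instead.
$push: Adds an element to an array.
If the array field already exists, $push appends to it; if not, it creates the field.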
EX:
db.movieDetails.updateOne({ title: "The Martian" },
{$push: { reviews: { rating: 4.5,
userReviews: 25,
Reviewer: "Spencer H." } }
}
);
Ex:
db.movieDetails.updateOne({ title: "The Martian" },
{$push: { reviews:
{ $each: [
{ rating: 4.5,
userReviews: 25,
Reviewer: "Spencer H.",
text: "Enjoyed watching with my kids!" } ],
$position: 0,
$slice: 5 } } } )
Here $position: 0 inserts the new reviews at the front of the array, and $slice: 5 keeps only the first 5 elements.
updateMany()
$unset: Removes the specified field from a document.
Ex:
db.movieDetails.updateMany( { rated: null },
{ $unset: { rated: "" } } );
It removes the rated field from all the documents whose rated value is null.
UPSERTS:
An upsert updates the matching document, or inserts a new one if no document matches the filter.
db.movieDetails.updateOne( {"imdb.id": detail.imdb.id},
{$set: detail},
{upsert: true});
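The update-or-insert decision can be modelled on an in-memory array. This is a plain JavaScript sketch: upsertOne, the flat imdbId field and the sample title are made up, and the real insert path also copies the equality fields from the filter, which this model leaves out.

```javascript
// Model of updateOne(filter, {$set: fields}, {upsert: true}).
function upsertOne(collection, matches, fields) {
  const existing = collection.find(matches);
  if (existing) {
    Object.assign(existing, fields); // a document matched: update it in place
    return "updated";
  }
  collection.push({ ...fields });    // nothing matched: insert a new document
  return "inserted";
}

const movies = [];
const detail = { imdbId: "tt0435761", title: "Example Movie" }; // hypothetical document
const sameId = m => m.imdbId === detail.imdbId;

const firstRun = upsertOne(movies, sameId, detail);
const secondRun = upsertOne(movies, sameId, { ...detail, rated: "G" });
console.log(firstRun, secondRun, movies.length); // inserted updated 1
```

Running the same upsert twice leaves a single document, which is why upserts are convenient for loading data that may already be partly present.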
REPLACE ONE:
replaceOne takes a filter and does a wholesale document replacement.
db.movies.replaceOne(
{"imdb": detail.imdb.id }, detail);
This replaces the summary version of the document with a detailed version.
Here detail is a document (a JavaScript object) created beforehand.
**************************************************************************************************
REPLICATION and SHARDING
Replication:
Replication is asynchronous.
Types of Replica Set Members
- Regular --> It can be primary or secondary.
- Arbiter (voting) --> This is only for voting purposes (when you have an even number of nodes).
- Delayed/Regular --> Priority == 0. It can be 1 hr or 2 hrs behind. (p=0)
- Hidden --> It cannot be primary. Its priority is set to 0 as well. (p=0) It can participate in elections.
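The reason an arbiter helps with an even number of nodes comes down to majority arithmetic: electing a primary needs the votes of a majority of the voting members. A small JavaScript sketch (the function name is made up):

```javascript
// Votes needed for a majority among `votingMembers` voters.
const majority = votingMembers => Math.floor(votingMembers / 2) + 1;

// With 2 voters, losing one leaves 1 vote, which is below the majority of 2.
// With 3 voters (2 data nodes + 1 arbiter), losing one still leaves 2 votes.
console.log(majority(2), majority(3), majority(4), majority(5)); // 2 2 3 3
```

Going from 2 to 3 voters does not raise the number of votes required, so the set can now survive the loss of one member.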
Write Consistency:
Writes always go to the primary, and you can always read from the primary as well.
If you want, you can read from a secondary, but the data can be stale because replication is asynchronous.
Read Preference:
Reading from a secondary prevents it from being promoted to primary.
False.
Reading from a secondary does not directly affect a secondary's ability to become primary, though if the reads caused it to lag on writes and fall behind on the oplog, that might make it ineligible until it is able to catch up.
If the secondary hardware has insufficient memory to keep the read working set in memory,
directing reads to it will likely slow it down.
True.
This could really go either way. If the secondary has excess capacity, beyond what it needs to take writes, then directing reads to it would cause it to work more, but perhaps it would still be able to keep up with the oplog. On the other hand, if the primary is taking writes faster than the secondary can keep up, then this scenario would definitely slow it down.
Generally, your secondary should be on the same hardware as your primary, so if that's the case, and your primary would be able to keep up with the reads, then this shouldn't be a problem. Of course, if your primary can handle both the read and write loads, then there's really no compelling reason to send the reads to the secondary.
If your write traffic is great enough, and your secondary is less powerful than the primary, you may overwhelm the secondary, which must process all the writes as well as the reads. Replication lag can result.
True.
This is a design anti-pattern that we sometimes see.
A similar anti-pattern occurs when reads are routed to the primary, but the secondary is underpowered and unable to handle the full read + write load. In this case, if the secondary becomes primary, it will be unable to fulfill its job.
You may not read what you previously wrote to MongoDB on a secondary because it will lag behind by some amount.
True.
This is pretty straightforward. Unless you are reading from the primary, the secondary will not necessarily have the most current version of the documents you need to read.
Whether this is a problem or not depends on your application's requirements and business concerns, so it goes a bit outside the scope of development.
Review: Implications of Replica Sets
- Read Preferences
- Errors can happen
- Seed Lists (at least one reachable node must be listed so that the driver can handle a failover)
- Write Concerns
- w parameter --> the number of nodes that must acknowledge a write
- j parameter --> lets you wait or not wait for the primary node to commit the write to disk
- wtimeout --> how long to wait for the w parameter to be satisfied across the replica set
SHARDING 11/30/17
Sharding is a technique for splitting up a large collection across different servers.
In sharding you have mongos, a router. The application communicates with mongos, and mongos communicates with the mongod servers (replica sets) that make up the shards.
*******************************************************************************
MONGO DB Sample questions 12/1/17
DBA questions:
Section 1: Philosophy & Features:
1. Which of the following does MongoDB use to provide High Availability and fault tolerance?
a. Write Concern
b. Replication
c. Sharding
d. Indexing
2. Which of the following does MongoDB use to provide High Scalability?
a. Write Concern
b. Replication
c. Sharding
d. Indexing
Section 2: CRUD Operations:
1. Which of the following is a valid insert statement in mongodb? Select all valid.
a. db.test.insert({x:2,y:"apple"})
b. db.test.push({x:2,y:"apple"})
c. db.test.insert({"x":2, "y":"apple"})
d. db.test.insert({x:2},{y:"apple"})
Section 3: Aggregation Framework:
1. Which of the following is true about aggregation framework?
a. A single aggregation framework operator can be used more than once in a query
b. Each aggregation operator needs to return at least one or more documents as a result
c. Pipeline expressions are stateless except accumulator expressions used with the $group operator
d. The aggregate command operates on multiple collections
Section 4: Indexing:
Below is a sample document in a given collection test.
{ a : 5, b : 3, c: 2, d : 1 }
1. Given a compound index { a: 1, b:1, c:1, d:1}, which of the queries below will not use in-memory sorting? Select all valid.
a. db.test.find( { a: 5, b: 3 } ).sort( { a: 1, b: 1, c: 1 } )
b. db.test.find( { a: 5, b: 3 } ).sort( { a: 1} )
c. db.test.find( { a: 5, b: 3 } ).sort( {c: 1 } )
d. db.test.find( { a: 5, b: 3 } ).sort( { c: 1, d : 1 } )
Section 5: Replication:
1. In a replicated cluster, which of the following nodes would only be used during an election?
a. primary
b. secondary
c. arbiter
d. hidden
2. What is the first task that a secondary would perform on being prompted by another secondary for an election?
a. Start the election process for primary
b. Connect to primary to confirm its availability
c. Vote for the first secondary so that it would become the next primary
d. Vote for itself and then call for election
Section 6: Sharding:
1. In which of the following scenarios is sharding not the correct option. Select all that apply.
a. The working set in the collection is expected to grow very large in size
b. The write operations on the collection are low
c. The collection is a read intensive collection with less working set
d. The write operations on the collection are very high
Section 7: Server & Application Administration:
1. Which of the following collections stores authentication credentials in MongoDB?
a. system.users
b. local.users
c. test.users
d. users.users
Section 8: Backup & Restore:
1. In a sharded cluster, from which node does one stop the balancer process before initiating backup?
a. Any node
b. mongos node
c. config server node
d. replicaset primary node
Answers
Section 1: Philosophy & Features:
1. b
2. c
Section 2: CRUD Operations
1. a,c
Section 3: Aggregation Framework
1. a,c
Section 4: Indexing
1. c,d
Section 5: Replication
1. c
2. b
Section 6: Sharding
1. b,c
Section 7: Server & Application Administration
1. a
Section 8: Backup & Restore
1. b
Section 1: Philosophy & Features:
1. Which of the following are valid json documents? Select all that apply.
a. {"name":"Fred Flintstone";"occupation":"Miner";"wife":"Wilma"}
b. {}
c. {"city":"New York", "population", 7999034, boros:{"queens", "manhattan", "staten island", "the bronx", "brooklyn"}}
d. {"a":1, "b":{"b":1, "c":"foo", "d":"bar", "e":[1,2,4]}}
Section 2: CRUD Operations:
1. Which of the following operators is used to update a document partially?
a. $update
b. $set
c. $project
d. $modify
Section 3: Aggregation Framework:
Questions 1 to 3
Below is a sample document of the "orders" collection
{
cust_id: "abc123",
ord_date: ISODate("2012-11-02T17:04:11.102Z"),
status: 'A',
price: 50,
items: [ { sku: "xxx", qty: 25, price: 1 },
{ sku: "yyy", qty: 25, price: 1 } ]
}
Select operators for the below query to determine the sum of "qty" fields associated with the orders for each "cust_id".
db.orders.aggregate( [
{ $OPR1: "$items" },
{
$OPR2: {
_id: "$cust_id",
qty: { $OPR3: "$items.qty" }
}
}
] )
1. OPR1 is
a. $group
b. $project
c. $unwind
d. $sum
2. OPR2 is
a. $group
b. $sort
c. $limit
d. $sum
3. OPR3 is
a. $match
b. $project
c. $skip
d. $sum
Section 4: Indexing:
1. Which of the following indexes would be optimal for the query? Select all valid.
db.test.find( { a : 5, c : 2 })
a. db.test.ensureIndex( { a: 1, b :1, c:1, d:1})
b. db.test.ensureIndex( { a : 1, c: 1, d: 1, b : 1})
c. db.test.ensureIndex( { a :1, c:1})
d. db.test.ensureIndex( { c:1, a: 1})
Section 5: Replication:
1. What is the replication factor for a replicated cluster with 1 primary and 3 secondaries, one of them hidden, where the set also has an arbiter?
a. 3
b. 4
c. 5
d. None of the above
Section 6: Sharding:
1. Write the correct command(s) to enable sharding on a database "testdb" and shard a collection "testCollection" with _id as the shard key.
Section 7: Server & Application Administration:
1. To add a new user and enable authentication in MongoDB, which of the following steps need to be executed?
a. update users collection and restart mongodb
b. update users collection and restart mongodb with --auth option
c. update users collection and run db.enableAuthentication()
d. All of the above
Section 8: Backup & Restore:
1. Which of the following needs to be performed prior to initiating a backup on a sharded cluster?
a. db.stopServer()
b. db.stopBalancer()
c. sh.stopServer()
d. sh.stopBalancer()
Answers
Section 1: Philosophy & Features:
1. b,d
Section 2: CRUD Operations
1. b
Section 3: Aggregation Framework
1. c
2. a
3. d
Section 4: Indexing
1. b,c
Section 5: Replication
1. b
Section 6: Sharding
1. sh.enableSharding("testdb") & sh.shardCollection("testdb.testCollection", {_id : 1 }, true )
Section 7: Server & Application Administration
1. b
Section 8: Backup & Restore
1. d