@ManasShettigar
Last active April 16, 2024 05:23
MongoDB Overview

MongoDB

What is MongoDB?

MongoDB is an open-source, cross-platform, document-oriented NoSQL database designed for scalability, flexibility, and performance. Unlike traditional relational databases, MongoDB stores data as JSON-like documents (encoded internally as BSON), which makes it easy to work with dynamic and unstructured data.

JSON vs BSON

[Image: JSON vs BSON comparison]
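
One practical difference can be seen without a database at all: JSON has no native date type, while BSON does. The sketch below uses only plain JavaScript (no MongoDB API) and a made-up `event` object to show a `Date` degrading to a string after a JSON round trip.

```javascript
// A hypothetical document with a Date field (names are illustrative).
const event = { title: "launch", at: new Date("2024-01-01T00:00:00Z") };

// Round-trip through JSON text: the Date comes back as a plain string,
// because JSON only knows strings, numbers, booleans, null, arrays, and objects.
const roundTripped = JSON.parse(JSON.stringify(event));

console.log(event.at instanceof Date); // true
console.log(typeof roundTripped.at);   // "string"
```

BSON keeps the date as a distinct type, which is why mongosh displays such fields as ISODate(...) rather than as quoted strings.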

Database

A database is a container for collections. Each database gets its own set of files on the file system, and a single MongoDB server can host multiple databases.

Collection

  • A collection is a group of documents.
  • A collection is the equivalent of a table in an RDBMS.
  • Collections do not enforce a schema by default.
  • Documents within the same collection can have different fields.
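
As a small illustration of the last two points, here are two documents that could sit side by side in one collection. The collection name `users` and all field names are made up for this sketch; the mongosh call is shown only as a comment so the snippet also runs as plain JavaScript.

```javascript
// Two documents for a hypothetical "users" collection; they share only "name".
const users = [
  { name: "Asha", age: 30 },
  { name: "Ravi", email: "ravi@example.com", roles: ["admin"] },
];

// In mongosh, both could be stored in the same collection with:
//   db.users.insertMany(users)

// Fields that appear anywhere versus fields present in every document.
const allFields = new Set(users.flatMap((doc) => Object.keys(doc)));
const sharedFields = [...allFields].filter((f) => users.every((doc) => f in doc));

console.log([...allFields]); // name, age, email, roles
console.log(sharedFields);   // only "name"
```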

Why Use MongoDB?

  1. Document-Oriented Storage: Data is stored in the form of JSON-like documents.
  2. Index on any attribute: Indexing in MongoDB allows for faster data retrieval by creating a searchable structure on selected attributes, optimizing query performance.
  3. Replication and high availability: MongoDB’s replica sets ensure data redundancy by maintaining multiple copies of the data, providing fault tolerance and continuous availability even in case of server failures.
  4. Auto-Sharding: Auto-sharding in MongoDB automatically distributes data across multiple servers, enabling horizontal scaling and efficient handling of large datasets.
  5. Rich queries: MongoDB supports complex queries with a variety of operators, allowing you to retrieve, filter, and manipulate data in a flexible and powerful manner.
  6. Fast in-place updates: MongoDB efficiently updates documents directly in their place, minimizing data movement and reducing write overhead.
  7. Professional support by MongoDB: MongoDB offers expert technical support and resources to help users with any issues or challenges they may encounter during their database operations.

MongoDB vs SQL

Data Model:

MongoDB: MongoDB is a document-oriented database, which means it stores data in flexible, JSON-like documents. Documents can have varying structures, and related data can be nested within a single document or across multiple documents.

SQL: SQL databases are relational databases that store data in tables with rows and columns. Data is typically normalized into separate tables, and relationships between tables are defined using foreign keys.
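
To make the contrast concrete, here is one order modelled both ways in plain JavaScript. All names (`orders`, `customers`, `customer_id`, and so on) are hypothetical; the arrays merely stand in for tables and are not a real SQL API.

```javascript
// Document model: related data is nested inside a single document.
const orderDocument = {
  _id: 1,
  customer: { name: "Asha", city: "Pune" },
  items: [
    { sku: "A1", qty: 2 },
    { sku: "B7", qty: 1 },
  ],
};

// Relational model: the same data split across tables linked by foreign keys.
const customers = [{ id: 10, name: "Asha", city: "Pune" }];
const orders = [{ id: 1, customer_id: 10 }];
const orderItems = [
  { order_id: 1, sku: "A1", qty: 2 },
  { order_id: 1, sku: "B7", qty: 1 },
];

// Reading the whole order back from the "tables" needs a join-like lookup...
const order = orders[0];
const joined = {
  customer: customers.find((c) => c.id === order.customer_id).name,
  items: orderItems.filter((i) => i.order_id === order.id).length,
};

// ...whereas the document already carries everything in one place.
const fromDocument = {
  customer: orderDocument.customer.name,
  items: orderDocument.items.length,
};
```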

Schema:

MongoDB: MongoDB does not enforce a schema by default, so you can change the structure of documents without modifying a database-wide schema. This flexibility is well-suited for agile development and evolving data models.

SQL: SQL databases require a predefined schema, where tables must be created with specified columns and data types. Any changes to the schema require altering the table structure, which can be complex for large databases.

Query Language:

MongoDB: MongoDB uses the MongoDB Query Language (MQL), in which queries are written as JSON-like documents. MQL supports rich queries, including field queries, range queries, and geospatial queries.

SQL: SQL databases use SQL for querying data, which is a standardized language for interacting with relational databases. SQL supports complex queries, joins, and aggregation functions.
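
As a rough side-by-side, the function below runs a mongosh query that corresponds to the SQL statement in its comment. The `users` collection and its `name` and `age` fields are assumptions for the example; `db` is passed in as a parameter, so in mongosh you would call `findOlderUsers(db)`.

```javascript
// SQL equivalent (for comparison only):
//   SELECT name, age FROM users WHERE age > 30 ORDER BY age DESC LIMIT 5;
function findOlderUsers(db) {
  return db
    .getCollection("users")
    .find(
      { age: { $gt: 30 } },       // WHERE age > 30
      { name: 1, age: 1, _id: 0 } // SELECT name, age
    )
    .sort({ age: -1 })            // ORDER BY age DESC
    .limit(5)                     // LIMIT 5
    .toArray();
}
```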

Scalability:

MongoDB: MongoDB is designed for horizontal scalability, allowing you to scale out by adding more servers to a cluster. This makes MongoDB well-suited for handling large volumes of data and high traffic loads.

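SQL: SQL databases traditionally scale vertically, by increasing the resources (CPU, memory, storage) of a single server. Scaling them horizontally is possible but usually takes more effort.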

Use Cases:

MongoDB: MongoDB is suitable for use cases requiring flexibility, scalability, and real-time analytics, such as content management systems, IoT applications, and mobile apps.

SQL: SQL databases are well-suited for applications that require complex transactions, strict data integrity, and well-defined schemas, such as financial systems, ERP systems, and e-commerce platforms.

More NoSQL Databases

  1. Amazon DynamoDB: A fully managed NoSQL database service provided by Amazon Web Services (AWS). It is designed for applications that require single-digit millisecond latency and can scale to handle millions of requests per second.
  2. Apache Cassandra: A distributed wide-column store NoSQL database that is highly scalable and fault-tolerant. It is commonly used for time-series data, IoT applications, and real-time analytics.
  3. Google Cloud Firestore: A flexible, scalable database for mobile, web, and server development from Google Cloud Platform. It is optimized for document storage and real-time updates.
  4. Riak: A distributed NoSQL database that is highly available and fault-tolerant. It is commonly used for session storage, user data, and distributed systems.

Basic MongoDB Commands

  1. Show Databases:
    show dbs
    
  2. Switch Database:
    use <database_name>
    
  3. Show Collections:
    show collections
    
  4. Insert Document:
    db.<collection_name>.insertOne({<document_data>})
    
  5. Find Documents:
    db.<collection_name>.find()
    
  6. Find Documents with Query:
    db.<collection_name>.find({<query_criteria>})
    
  7. Update Document:
    db.<collection_name>.updateOne({<filter_criteria>}, {$set: {<update_data>}})
    
  8. Delete Document:
    db.<collection_name>.deleteOne({<filter_criteria>})
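
Put together, the commands above might be used in a session like the following sketch. The `users` collection and its fields are invented for illustration, and `db` is taken as a parameter so the function can be called as `crudDemo(db)` from mongosh.

```javascript
function crudDemo(db) {
  const users = db.getCollection("users"); // same collection as db.users in mongosh

  // Create: insert a single document.
  users.insertOne({ name: "Asha", age: 30 });

  // Read: find documents matching a filter.
  const before = users.find({ name: "Asha" }).toArray();

  // Update: $set changes only the listed fields of the first match.
  users.updateOne({ name: "Asha" }, { $set: { age: 31 } });
  const after = users.find({ name: "Asha" }).toArray();

  // Delete: remove the first document matching the filter.
  users.deleteOne({ name: "Asha" });
  const remaining = users.find({ name: "Asha" }).toArray();

  return { before, after, remaining };
}
```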
    

Comparison Operators:

  • $eq: Matches values that are equal to a specified value.

    { "age": { "$eq": 30 } }
    
  • $gt: Matches values that are greater than a specified value.

    { "age": { "$gt": 30 } }
    
  • $gte: Matches values that are greater than or equal to a specified value.

    { "age": { "$gte": 30 } }
    
  • $lt: Matches values that are less than a specified value.

    { "age": { "$lt": 30 } }
    
  • $lte: Matches values that are less than or equal to a specified value.

    { "age": { "$lte": 30 } }
    
  • $ne: Matches all values that are not equal to a specified value.

    { "age": { "$ne": 30 } }
    
  • $in: Matches any of the values specified in an array.

    { "age": { "$in": [30, 40] } }
    
  • $nin: Matches none of the values specified in an array.

    { "age": { "$nin": [30, 40] } }
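
For readers more at home in JavaScript, the filters above behave roughly like the array predicates below. This is only an analogy over a made-up `people` array in which every document has a numeric `age`; MongoDB's real matching also covers missing fields, arrays, and cross-type comparisons, which the sketch ignores.

```javascript
const people = [
  { name: "Asha", age: 25 },
  { name: "Ravi", age: 30 },
  { name: "Meera", age: 40 },
];
const names = (docs) => docs.map((d) => d.name);

// { age: { $eq: 30 } }
const eq = names(people.filter((p) => p.age === 30));
// { age: { $gt: 30 } }
const gt = names(people.filter((p) => p.age > 30));
// { age: { $lte: 30 } }
const lte = names(people.filter((p) => p.age <= 30));
// { age: { $ne: 30 } }
const ne = names(people.filter((p) => p.age !== 30));
// { age: { $in: [30, 40] } }
const inSet = names(people.filter((p) => [30, 40].includes(p.age)));
// { age: { $nin: [30, 40] } }
const notInSet = names(people.filter((p) => ![30, 40].includes(p.age)));

console.log({ eq, gt, lte, ne, inSet, notInSet });
```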
    

Logical Operators:

  • $and: Joins query clauses with a logical AND and returns all documents that match the conditions of both clauses.

    { "$and": [{ "age": { "$gt": 20 } }, { "age": { "$lt": 40 } }] }
    
  • $or: Joins query clauses with a logical OR and returns all documents that match the conditions of either clause.

    { "$or": [{ "age": { "$lt": 20 } }, { "age": { "$gt": 40 } }] }
    
  • $not: Inverts the effect of a query expression and returns documents that do not match the query expression.

    { "age": { "$not": { "$gt": 30 } } }
    
  • $nor: Joins query clauses with a logical NOR and returns all documents that fail to match both clauses.

    { "$nor": [{ "age": { "$lt": 20 } }, { "age": { "$gt": 40 } }] }
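
The same kind of plain-JavaScript analogy can illustrate the logical operators, including how `$not` and `$nor` treat a document that lacks the field. The `sample` array is hypothetical, and the predicates approximate MongoDB's behaviour for simple numeric fields only.

```javascript
const sample = [
  { name: "Asha", age: 15 },
  { name: "Ravi", age: 30 },
  { name: "Meera", age: 45 },
  { name: "Kiran" }, // no age field
];
const pick = (pred) => sample.filter(pred).map((d) => d.name);

// { $and: [{ age: { $gt: 20 } }, { age: { $lt: 40 } }] }
const andMatch = pick((p) => p.age > 20 && p.age < 40);

// { $or: [{ age: { $lt: 20 } }, { age: { $gt: 40 } }] }
const orMatch = pick((p) => p.age < 20 || p.age > 40);

// { age: { $not: { $gt: 30 } } }
// Also matches documents that have no "age" field at all.
const notMatch = pick((p) => !(p.age > 30));

// { $nor: [{ age: { $lt: 20 } }, { age: { $gt: 40 } }] }
// Matches documents that fail every clause, again including those without "age".
const norMatch = pick((p) => !(p.age < 20) && !(p.age > 40));

console.log({ andMatch, orMatch, notMatch, norMatch });
```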
    

Indexing

Indexing in MongoDB is a way to optimize the performance of queries by creating a data structure that allows the database to quickly locate documents. Here's a brief overview:

What is indexing?

Indexing is the process of creating an index on a field or multiple fields within a collection. An index is a data structure that stores the value of a specific field or a combination of fields, along with a reference to the location of the corresponding documents in the collection.

Why use indexing?

  • Improves query performance: Indexes allow MongoDB to quickly locate documents matching a query's criteria without scanning the entire collection.
  • Supports sorting and aggregation: Indexes can also improve the performance of sorting and aggregation operations.
  • Reduces resource consumption: By minimizing the number of documents that need to be examined, indexing reduces the CPU and memory resources required for query execution.

How to create indexes?

You can create indexes using the createIndex() method in MongoDB:

    db.<collection_name>.createIndex({<field_name>: 1})

The value 1 indicates ascending order.

Considerations for indexing:

  • Indexes consume disk space and memory, so it's important to consider the trade-offs between query performance and resource consumption.
  • Over-indexing can lead to increased storage requirements and slower write operations.
  • Indexes must be updated whenever a write touches an indexed field, which adds some overhead to write operations.
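
A minimal sketch of creating and listing indexes in mongosh, assuming a hypothetical `users` collection with `age` and `city` fields. In an index specification, `1` means ascending and `-1` descending; `db` is passed in so the function can be invoked as `indexDemo(db)`.

```javascript
function indexDemo(db) {
  const users = db.getCollection("users");

  // Single-field index, ascending on "age".
  users.createIndex({ age: 1 });

  // Compound index: ascending on "city", then descending on "age".
  users.createIndex({ city: 1, age: -1 });

  // List every index on the collection (the default _id index is included).
  return users.getIndexes();
}
```

To check whether a query actually uses an index, mongosh offers `db.users.find({ age: { $gt: 30 } }).explain("executionStats")`, whose output reports the winning plan and how many documents were examined.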

Aggregation

Aggregation in MongoDB is a powerful framework for performing data processing operations on documents within a collection. It allows you to filter, group, transform, and analyze data, similar to SQL's GROUP BY and aggregate functions.

  1. Pipeline-based Framework: Aggregation in MongoDB operates on the concept of pipelines, where documents pass through multiple stages of transformation. Each stage performs a specific operation on the input documents and passes the results to the next stage.

  2. Stages: Aggregation pipelines consist of various stages, each performing a specific operation. Some common stages include:

    • $match: Filters documents based on specified criteria.
    • $group: Groups documents together based on a specified key and performs aggregate functions on grouped data.
    • $project: Reshapes documents by including, excluding, or adding fields.
    • $sort: Sorts documents based on specified fields.
    • $limit: Limits the number of documents passed to the next stage.
    • $skip: Skips a specified number of documents.
  3. Aggregation Expressions: MongoDB provides a rich set of aggregation expressions to perform various operations within aggregation pipelines. Some common aggregation expressions include:

    • $sum: Calculates the sum of numeric values.
    • $avg: Calculates the average of numeric values.
    • $max: Finds the maximum value within a group.
    • $min: Finds the minimum value within a group.
    • $push: Appends values to an array.
    • $addToSet: Adds unique values to an array.
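
To show how stages and accumulators fit together, the sketch below defines a four-stage pipeline and then walks through the same steps in plain JavaScript over an in-memory array. The `sales` documents and their `region`, `amount`, and `status` fields are invented for illustration, and the hand-written walk-through only mimics what the server would do.

```javascript
// Hypothetical sample documents; field names are assumptions for this sketch.
const sales = [
  { region: "north", amount: 120, status: "paid" },
  { region: "south", amount: 80, status: "paid" },
  { region: "north", amount: 50, status: "refunded" },
  { region: "south", amount: 30, status: "paid" },
  { region: "east", amount: 200, status: "paid" },
];

// The pipeline as it would be passed to db.sales.aggregate(pipeline) in mongosh.
const pipeline = [
  { $match: { status: "paid" } },
  { $group: { _id: "$region", total: { $sum: "$amount" }, average: { $avg: "$amount" } } },
  { $sort: { total: -1 } },
  { $limit: 2 },
];

// Plain-JavaScript walk-through of the same four stages (illustration only).
const matched = sales.filter((d) => d.status === "paid"); // $match

const groups = new Map(); // $group, keyed by region
for (const d of matched) {
  const g = groups.get(d.region) ?? { _id: d.region, total: 0, count: 0 };
  g.total += d.amount;
  g.count += 1;
  groups.set(d.region, g);
}
const grouped = [...groups.values()].map((g) => ({
  _id: g._id,
  total: g.total,             // $sum
  average: g.total / g.count, // $avg
}));

const sorted = [...grouped].sort((a, b) => b.total - a.total); // $sort: { total: -1 }
const topTwo = sorted.slice(0, 2);                              // $limit: 2

console.log(topTwo);
```

Each intermediate variable (`matched`, `grouped`, `sorted`, `topTwo`) corresponds to the output of one stage, which is the sense in which documents "pass through" a pipeline.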

Example

Suppose we want to retrieve sales data for a specific date range and sort the results by date. We can use the $match stage to filter documents based on the date field and the $sort stage to sort the results.

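// Date range for January 2024. The end bound is exclusive, because
// ISODate("2024-01-31") is midnight and would miss sales later that day.
var startDate = ISODate("2024-01-01");
var endDate = ISODate("2024-02-01");

db.sales.aggregate([
  // Keep only sales whose date falls within the range.
  { $match: { date: { $gte: startDate, $lt: endDate } } },
  // Sort the remaining documents by date, oldest first.
  { $sort: { date: 1 } }
])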