@theperfectfuel
Created June 18, 2019 21:07
Unit 1 of the everLive MongoDB course curriculum

MongoDB Course Curriculum

Unit 1: Hello, Mongo!

Goals for this Unit

  • Be welcomed
  • Learn some database history
  • Understand "Why Mongo?"
  • Get Mongo Atlas set up

Theory and Knowledge

By the end of this Unit, you should understand:

  • Types of databases
  • Documents
  • Collections
  • BSON
    • BSON Data Types

Skills

By the end of this unit, you should be able to:

  • Create a MongoDB Atlas cluster
  • Create collections and documents via the command-line interface (CLI)
  • Execute basic queries

Intro and Expectations

Welcome to everLive's introductory course on MongoDB. MongoDB is a document-based database technology intended to work seamlessly with web applications.

To succeed in this course, you'll need to have some familiarity with:

  • JavaScript
  • The JSON data format
  • Command-line (Terminal) tools

If you haven't had a chance to get up to speed with JavaScript, that's okay; we won't be using that until the end of the Unit, so you have time to catch up.

By the end of this course, you'll be able to use Mongo to store document- and collection-based data for use in real-world applications. You'll understand the benefits (and drawbacks) of document-based databases, and when they're the best choice for your application's needs. From there, you'll be ready to tackle building a data-driven application powered by MongoDB.

Tools and Accounts

Much of our work in this class will take place in a terminal emulator, so you'll want to make sure you have one you like. Here are some recommendations:

macOS

Windows

If you're on Linux, you probably already have a Terminal you like 🙂.

We won't be using a text editor that much, but it's always handy to have one around for reading sample data, etc. We recommend Visual Studio Code, but any text editor you've used for other courses will work just fine here.

Because we're going to interact with Mongo through the command line, we'll need to install the Mongo Shell. These pages provide instructions on installation on multiple platforms. We'll also walk through installation together in class.

Lastly, you're going to need a Mongo Atlas account. Atlas is MongoDB's cloud-hosted version of Mongo. We will go over setup in class, but you can follow the Getting Started documentation to set up a basic cluster on your own.

To SQL or NoSQL?

If you've gone through our SQL databases course, you're familiar with the term relational database. If not, don't worry; we're going to go over it now. It's worth understanding what relational databases are to understand how Mongo isn't one.

Relational Databases

Say I have a list of superheroes, including the team they're on:

Hero            | Team
Wonder Woman    | Justice League
Captain America | Avengers
Batman          | Justice League
Spider-Man      | Avengers
Black Widow     | Avengers
Green Lantern   | Justice League

Hopefully that satisfies both Marvel and DC fans. Now let's say I wanted to rename the team Justice League to JLA. I could run a query against all my hero records, updating each one's Team column if necessary. But here's the thing: I shouldn't have to. Instead, I can create a second table, called Teams, give each team an ID, and have each hero refer to its team by that ID. The ID never changes, no matter what I rename the team to, so the rename happens in exactly one place. No laborious updates needed.

Heroes

Hero            | TeamID
Wonder Woman    | 2
Captain America | 1
Batman          | 2
Spider-Man      | 1
Black Widow     | 1
Green Lantern   | 2

Teams

ID | Name
1  | Avengers
2  | JLA

That's a relation: connecting one set of unique data to another set of unique data. Sounds great, right? It is, but it's not the only way to consider data. One criticism of relational databases comes when we have many sets of data that are deeply interconnected. The queries to join them together can get difficult to write and computationally expensive to process. And when most of that data is static (e.g. we won't be renaming teams or anything else), other approaches might make sense. This idea led to the rise of NoSQL: a tech movement to design database management systems that did not use the old relational approach. MongoDB is one such technology.
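To make the relation idea concrete, here is a minimal sketch in plain JavaScript rather than SQL. The arrays stand in for the Heroes and Teams tables, and the function name heroesWithTeamNames is made up for this example. It shows the payoff described above: the rename touches one row, and the lookup (the "join") does the rest.

```javascript
// Hypothetical in-memory stand-ins for the Heroes and Teams tables above.
const teams = [
  { id: 1, name: "Avengers" },
  { id: 2, name: "Justice League" },
];

const heroes = [
  { hero: "Wonder Woman", teamId: 2 },
  { hero: "Captain America", teamId: 1 },
  { hero: "Batman", teamId: 2 },
];

// A "join": look up each hero's team by ID instead of storing the name twice.
function heroesWithTeamNames(heroRows, teamRows) {
  const teamNameById = new Map(teamRows.map((team) => [team.id, team.name]));
  return heroRows.map((row) => ({
    hero: row.hero,
    team: teamNameById.get(row.teamId),
  }));
}

// Renaming the team touches exactly one row in teams...
teams.find((team) => team.id === 2).name = "JLA";

// ...and every hero that references ID 2 picks up the new name.
const joined = heroesWithTeamNames(heroes, teams);
console.log(joined);
```

Notice that the lookup has to run every time we want a hero with a team name attached. With two small tables that's trivial; with many interlinked tables, that lookup work is where the cost mentioned above comes from.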

Do it Document-Style

Here's another way to consider our superhero data.

{"heroes": {
  "8a45dc": {
    "name": "Wonder Woman",
    "team": "Justice League"
  },
  "75bb5": {
    "name": "Captain America",
    "team": "Avengers"
  }
}}

What are we looking at here? This is JSON, or JavaScript Object Notation. Our parent object, heroes, is referred to as a collection in MongoDB. Each hero is given a unique id, and a corresponding object called a document. Now, these documents are fairly simplistic, but let's add some more attributes to them to understand how this is a fairly intuitive way of considering data. What if we included our heroes' powers?

{"heroes": {
  "8a45dc": {
    "name": "Wonder Woman",
    "team": "Justice League",
    "powers": [
      "strength",
      "flight",
      "Lasso of Truth",
      "gauntlet blast"
    ]
  },
  "75bb5": {
    "name": "Captain America",
    "team": "Avengers",
    "powers": [
      "strength",
      "that shield thing",
      "incorruptible moral center"
    ]
  }
}}

Imagine linking all those tables together. Yeesh. Here, when we ask for a hero, we get the necessary information with one query. Pretty elegant. From here, we can imagine a separate collection of, say, villains. In a document-based database, one way of relating heroes to villains would be to embed the villain document in the hero document, once the connection is made. Again, this allows us to ask for a single document with all the information we need.
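Here's a rough sketch of what embedding could look like, using plain JavaScript objects. The villain, the villains field name, and the variable names are all assumptions for illustration; nothing here is a required MongoDB structure.

```javascript
// Hypothetical documents; the field names and the villain are illustrative only.
const villain = { name: "Cheetah", powers: ["speed", "claws"] };

const hero = {
  name: "Wonder Woman",
  team: "Justice League",
  powers: ["strength", "flight", "Lasso of Truth", "gauntlet blast"],
};

// Embed a copy of the villain inside the hero document.
const heroWithVillain = { ...hero, villains: [{ ...villain }] };

// One object now carries everything we'd want to show about this hero.
console.log(JSON.stringify(heroWithVillain, null, 2));
```

The trade-off is that the embedded villain is a copy. If the villain's details change later, every hero document that embeds them has to be updated, which is why embedding tends to suit data that rarely changes.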

Data Types

Not all the information in our document is of the same kind. We'll have text, numbers, dates, oh my! Here's an important point: MongoDB does not actually store data in JSON. Why does that matter? In JSON, we have a very limited number of data types that we can use to store information. Date is not one of them; nor are specifically-defined number types like double. Instead, Mongo stores its data in a binary format that is represented like JSON, called BSON. BSON allows us to use many more data types in our documents. Many of these data types have direct analogs in JavaScript, making the integration between these two technologies quite tight.
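You can see JSON's limitation for yourself without touching a database. The sketch below is plain JavaScript (runnable in Node or a browser console) with made-up field names: it sends an object holding a real Date through JSON and back, and the Date comes out the other side as an ordinary string.

```javascript
// A plain JavaScript object with a real Date value (field names are made up).
const sale = { item: "notepad", soldAt: new Date("2019-06-18T21:07:00.000Z") };

// Round-trip through JSON text, as if we had stored and reloaded it.
const roundTripped = JSON.parse(JSON.stringify(sale));

console.log(sale.soldAt instanceof Date); // true
console.log(typeof roundTripped.soldAt);  // "string"
console.log(roundTripped.soldAt);         // "2019-06-18T21:07:00.000Z"
```

That lost type information is the gap BSON is meant to fill: a date stored as a BSON date comes back as a date, not as text you have to re-parse.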

MongoDB Atlas

If you've followed the directions above or joined us live, you now have MongoDB and the Mongo Shell installed on your computer. Great! That means that you can create collections and documents on your own computer. It also means you have the tools necessary to connect to instances of MongoDB running elsewhere. That's great news, because we'll be running our Mongo instance in the cloud.

Atlas instances are known as clusters. They live on one of three cloud hosting providers: AWS, Google Cloud, or Microsoft Azure.

We'll be following the getting started documentation from Atlas closely, so have it open in another tab. Make sure you've followed the steps there, including:

  1. Deploy a Free Tier Cluster
  2. Whitelist Your Connection IP Address
  3. Create a MongoDB User
  4. Insert Data into Your Cluster

From there, we are ready to actually work with data. In the last step, we've populated our cluster with several sample databases we can use to explore how Mongo works. Since we've already installed Mongo Shell (right?), we can move on to connecting to our cluster.

From the Clusters section of the Atlas dashboard, click the Connect button to access the full command-line string to connect to your cluster from the terminal. Copy/paste into a Terminal window on your computer. It'll look something like:

mongo "mongodb+srv://cluster0-chphg.mongodb.net/test" --username <your-user-here> 

Once you've authenticated, you'll see a long prompt that looks something like:

MongoDB Enterprise Cluster0-shard-0:PRIMARY>

We're finally ready to look at some data!

Basic Queries

The organization of a MongoDB cluster goes:

  • Cluster
    • Database
      • Collection
        • Document

We're already in our cluster, but we have multiple databases to choose from. Let's list them out by typing show dbs and pressing Enter.

You'll see those sample datasets we've loaded. Let's start simple with the sample_supplies database. Select it by typing use sample_supplies.

We're now set to use the sample_supplies database, which will contain a unique set of collections. As the help command will reveal (at any time), show collections will list a database's collections, once you've selected a database. Doing so reveals a sales collection. To see what we have in there, we are going to query the collection for all its records. This is where our commands start to look like JavaScript:

db.sales.find()

As you'll see, we get TOO MUCH DATA back. That's not at all helpful. So instead, let's just look at a single record to understand what documents in this collection look like.

db.sales.findOne()

This helper method returns only a single document for a query. I feel compelled to mention that this is a shortcut for the following command:

db.sales.find().limit(1).pretty()

The limit() function modifies the query result from find() and truncates it after the given number of documents. If you've done some JavaScript, this concept of function chaining should be rather familiar.
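If chaining is new to you, this toy example in plain JavaScript shows the pattern. To be clear, ToyCursor and toyFind are invented for this sketch and are not how MongoDB implements its cursor; the point is only that limit() returns the same object it was called on, so another call can follow the dot.

```javascript
// ToyCursor is a made-up class for illustration; it is not MongoDB's cursor.
class ToyCursor {
  constructor(documents) {
    this.documents = documents;
    this.maxDocuments = Infinity;
  }

  // Record the limit and return the cursor itself so calls can be chained.
  limit(count) {
    this.maxDocuments = count;
    return this;
  }

  // Only here are the documents actually produced.
  toArray() {
    return this.documents.slice(0, this.maxDocuments);
  }
}

// Stand-in for find(): hands back a cursor over three tiny documents.
const toyFind = () => new ToyCursor([{ n: 1 }, { n: 2 }, { n: 3 }]);

const firstOnly = toyFind().limit(1).toArray();
console.log(firstOnly); // [ { n: 1 } ]
```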

pretty() makes the BSON result of our query human-readable. Scroll up to the top of this document to begin exploring its contents.

To start, we see a "_id" key. The value is of type ObjectId, and it's a crazy series of letters and numbers (24 hexadecimal characters, to be exact). This id is unique to the document, and every document has one. Why not just a simple integer? For one thing, to prevent collisions (accidental identical ids). For another, non-sequential ids make it harder for nefarious types to guess at ids and attempt to gain access to data they shouldn't have.
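That series of characters isn't entirely random, by the way. An ObjectId is 12 bytes, written out as 24 hexadecimal characters, and the first 4 bytes hold the creation time in seconds since the Unix epoch. Here's a small sketch in plain JavaScript that reads it back out; the id string is just an illustrative value, and objectIdCreationDate is a hypothetical helper name.

```javascript
// Illustrative 24-character hex string in ObjectId format (not tied to any real record).
const exampleId = "5bd761dcae323e45a93ccfe8";

// Hypothetical helper: read the first 4 bytes (8 hex characters) as seconds since the epoch.
function objectIdCreationDate(hexId) {
  const seconds = parseInt(hexId.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const createdAt = objectIdCreationDate(exampleId);
console.log(createdAt.toISOString());
```

In the Mongo Shell you shouldn't need a helper like this, since ObjectId values offer a getTimestamp() method that does the same job.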

After the "_id" is a "saleDate" of type ISODate. While this is represented as a string here, this is in fact a Date object with associated methods, like getMonth().
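Those are the same Date methods you may know from JavaScript. A quick sketch, using an illustrative timestamp rather than a value copied from the dataset. It uses the UTC variants because getMonth() answers in your computer's local time zone, which can shift the result near midnight.

```javascript
// An illustrative timestamp standing in for the date field described above.
const dateOfSale = new Date("2015-03-23T21:06:49.506Z");

console.log(dateOfSale.getUTCFullYear()); // 2015
console.log(dateOfSale.getUTCMonth());    // 2, because months count from 0 (January)
console.log(dateOfSale.toISOString());    // "2015-03-23T21:06:49.506Z"
```

The zero-based month is a classic JavaScript gotcha: March comes back as 2, not 3.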

The next top-level key of our document is items. The [] surrounding the value indicate this is an Array. Each element of the array is a sub-document, representing a component of the sale.

Beneath this large array, we see other keys for storeLocation, customer, couponUsed, and purchaseMethod.

When we make a new document in the sales collection, all these keys should be present. Now, we likely wouldn't manually produce such a document; instead, our application would connect multiple data components from a user interface or point-of-sale terminal to create this document.
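As a rough idea of what that assembly might look like in application code, here's a sketch in plain JavaScript. Everything in it is assumed for illustration: the values are made up, requiredKeys is only a partial list of the keys named above, and missingKeys is a hypothetical helper, not a MongoDB feature.

```javascript
// A partial, illustrative list of the top-level keys named above.
const requiredKeys = ["items", "storeLocation", "customer", "purchaseMethod"];

// Hypothetical helper: list which required keys a candidate document lacks.
function missingKeys(candidate, keys) {
  return keys.filter((key) => !(key in candidate));
}

// Pieces an application might collect from a point-of-sale screen (all values made up).
const cart = [
  { name: "notepad", quantity: 2 },
  { name: "pens", quantity: 5 },
];
const shopper = { age: 42 };

// Combine the pieces into one document-shaped object.
const newSale = {
  items: cart,
  storeLocation: "Denver",
  customer: shopper,
  purchaseMethod: "In store",
};

console.log(missingKeys(newSale, requiredKeys));         // []
console.log(missingKeys({ items: cart }, requiredKeys)); // the three keys left out
```

A check like this lives in the application because, by default, a MongoDB collection will accept documents of any shape; keeping them consistent is up to us.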

In our next Unit, we'll get busy making our own collections and performing more advanced queries.
