Marcin Jahn | Dev Notebook
  • Home
  • Programming
  • Technologies
  • Projects
  • About
  • Home
  • Programming
  • Technologies
  • Projects
  • About
  • An icon of the Networking section Networking
    • HTTP Protocol
    • OSI Model
    • TCP Procol
    • UDP Protocol
    • WebSocket
    • HSTS
    • DNS
    • Server Name Indication
    • gRPC
  • An icon of the Security section Security
    • OAuth2
      • Sender Constraint
    • Cryptography
      • Cryptography Basics
    • TPM
      • Overiew
      • TPM Entities
      • TPM Operations
  • An icon of the Linux section Linux
    • Gist of Linux Tooling
    • Unknown
    • SELinux
    • Containers
    • Bash Scripting
    • Linux From Scratch
    • Networking
  • An icon of the Kubernetes section Kubernetes
    • Meaning and Purpose
    • Cluster
    • Dev Environment
    • Kubernetes API
    • Objects
    • Pods
    • Scaling
    • Events
    • Storage
    • Configuration
    • Organizing Objects
    • Services
    • Ingress
    • Helm
  • An icon of the Observability section Observability
    • Tracing
  • An icon of the Databases section Databases
    • ACID
    • Glossary
    • Index
    • B-Tree and B+Tree
    • Partitioning and Sharding
    • Concurrency
    • Database Tips
  • An icon of the SQL Server section SQL Server
    • Overview
    • T-SQL
  • An icon of the MongoDB section MongoDB
    • NoSQL Overview
    • MongoDB Overview
    • CRUD
    • Free Text Search
  • An icon of the Elasticsearch section Elasticsearch
    • Overview
  • An icon of the Git section Git
    • Git
  • An icon of the Ansible section Ansible
    • Ansible
  • An icon of the Azure section Azure
    • Table Storage
    • Microsoft Identity
  • An icon of the Google Cloud section Google Cloud
    • Overview
  • An icon of the Blockchain section Blockchain
    • Overview
    • Smart Contracts
    • Solidity
    • Dapps
  • Home Assistant
    • Home Assistant Tips
  • An icon of the Networking section Networking
    • HTTP Protocol
    • OSI Model
    • TCP Procol
    • UDP Protocol
    • WebSocket
    • HSTS
    • DNS
    • Server Name Indication
    • gRPC
  • An icon of the Security section Security
    • OAuth2
      • Sender Constraint
    • Cryptography
      • Cryptography Basics
    • TPM
      • Overiew
      • TPM Entities
      • TPM Operations
  • An icon of the Linux section Linux
    • Gist of Linux Tooling
    • Unknown
    • SELinux
    • Containers
    • Bash Scripting
    • Linux From Scratch
    • Networking
  • An icon of the Kubernetes section Kubernetes
    • Meaning and Purpose
    • Cluster
    • Dev Environment
    • Kubernetes API
    • Objects
    • Pods
    • Scaling
    • Events
    • Storage
    • Configuration
    • Organizing Objects
    • Services
    • Ingress
    • Helm
  • An icon of the Observability section Observability
    • Tracing
  • An icon of the Databases section Databases
    • ACID
    • Glossary
    • Index
    • B-Tree and B+Tree
    • Partitioning and Sharding
    • Concurrency
    • Database Tips
  • An icon of the SQL Server section SQL Server
    • Overview
    • T-SQL
  • An icon of the MongoDB section MongoDB
    • NoSQL Overview
    • MongoDB Overview
    • CRUD
    • Free Text Search
  • An icon of the Elasticsearch section Elasticsearch
    • Overview
  • An icon of the Git section Git
    • Git
  • An icon of the Ansible section Ansible
    • Ansible
  • An icon of the Azure section Azure
    • Table Storage
    • Microsoft Identity
  • An icon of the Google Cloud section Google Cloud
    • Overview
  • An icon of the Blockchain section Blockchain
    • Overview
    • Smart Contracts
    • Solidity
    • Dapps
  • Home Assistant
    • Home Assistant Tips

MongoDB Overview

  • Document DB
  • supports ACID
  • uses JSON/BSON
  • distributed scalability (sharding)
  • high availability (replica set)
  • free open source Community Edition
  • pluggable storage engine support
  • queries use MQL (MongoDB Query Language)
  • has free text search capabilities

Terminology

Compared with Relational databases, MongoDB has the following terms:

SQLMongoDB
DatabaseDatabase
TableCollection
RowDocument
ColumnField
IndexIndex

BSON

BSON stands for “Binary JSON”, it’s binary-encoded. It sacrifies human-readability for performance. BSON supports more data types than traditional JSON (different numeric types, raw binary, dates, ObjectId, etc.).

Example:

# JSON
{"hello": "world"}
# BSON
\x16\x00\x00\x00 // total document size
\x02 // 0x02 = type String
hello\x00 // field name
\x06\x00\x00\x00world\x00 // field value
\x00 // 0x00 = type EOO ('end of object')

On the “frontend”, we talk with MongoDB in JSON. In the backend, MongoDB stores the documents as BSONs.

All the BSON types can be found in the MongoDB Docs.

Connection

When talking with Mongo DB, it uses TCP/IP with TLS, and the Mongo Wire Protocol. It does not use HTTP. Therefore, when using some tracing or observability in general, you will not see Mongo traffic being recorded by default. You will most likely need some Mongo-specific instrumentation. The Mongo Wire Protocol might be sniffed with tools such as WireShark.

When initiating a connection with Mongo, before any user-initiated query gets sent, the client negotiates with the server on various parameters, which translates to about 4 request-responses (in Mongo Wire Protocol) being sent.

Dates

A date can be represented in MongoDB with ISODate.

{
"departureTime": ISODate("2019-05-07T09:13:00Z")
"arrivalDate": ISODate("2019-05-08")
}

We can use filtering based on date, including comparison operators such as $gt and $lt.

Index

To create an index:

db.{collection}.createIndex(
{
name: 1
}
)

The index will be created on the name field in ascending order.

We can get all indexes on a collection with:

db.{collection}.getIndexes()

Mongo supports clustered indexes, so that the primary _id key in the index is directly associated with the BSON. The default _id field in a document is an ObjectId, which takes 12 bytes. It’s quite a lot for a key, but it’s large for uniqueness among shards. Secondary indexes will point to the primary index, which means that secondary indexes become quite big, since each entry of the index needs to contain the 12-byte ObjectId. The situation might be improved (or worsened) with a custom choice of _id.

Tools

Shell

Mongo Shell (mongosh) is a CLI interface for accessing MongoDB. It uses JS-like syntax.

mongo

There is also an old CLI that is called just “mongo”.

Terminal window
mongosh "mongodb+srv://localhost/MyDatabase" --apiVersion 1 --username "myuser" --password "mypass"

Commands

db // Shows current DB
show dbs // Shows all DBs with storage size
use somedb // Switches context to some DB
show collections // Shows all collections in the current DB
db.createCollection("newcollection") // Creates a new collection
db.{collection}.drop() // Removes a collection
db.{collection}.insertOne({ "name": "Marcin" }) // Creates a document
db.{collection}.insertOne([{ "name": "Marcin" }, { "name": "Tom" }]) // Creates documents
db.{collection}.find() // Gets all documents in a collection

Formatting

When using the older CLI - mongo - the output is in one line. To format it for readability, we can use the pretty() function. E.g., db.mycollection.find().pretty().

Casing

Casing matters in the commands

GUI

MongoDB Compass is the official GUI for exploring MongoDB databases. It’s available for Linux, macOS, Windows.

mongoimport

A tool that is added to MongoDB installations that allows for bulk imports of data.

Testing

MongoDB has MongoDB Atlas offering, which is a cloud deploymen of MongoDB. There’s a free tier that is shared, but great for testing. It can be deployed to Azure/GCP/AWS. For testing, we can also load a Sample Dataset.

Sources

  • MongoDB Docs
  • Pluralsight
←  NoSQL Overview
CRUD  →
© 2023 Marcin Jahn | Dev Notebook | All Rights Reserved. | Built with Astro.