Quick Overview on MongoDB
Eman Abdel Ghaffar

Agenda
1. Introduction
2. CRUD
3. Cursors
4. Indexing
5. Schema Design principles
6. Aggregation
7. Map-Reduce

Introduction - ACID
● Relational databases usually guarantee ACID properties, which describe how reliably
transactions (both reads and writes) are processed.
● The NoSQL movement trades off strict ACID compliance for other properties, such as
high availability and horizontal scalability. MongoDB is one of the most widely used NoSQL databases.
● https://dzone.com/articles/how-acid-mongodb

Introduction - ACID
● Atomicity requires that each transaction is executed in its entirety, or fails
without any change being applied.
● Consistency requires that the database only passes from a valid state to the
next one, without intermediate points. Any data written to the database must
be valid according to all defined rules, including constraints, cascades, and triggers.
● Isolation requires that if transactions are executed concurrently, the result is
equivalent to their serial execution.
● Durability means that the result of a committed transaction is permanent,
even if the database crashes immediately or in the event of a power loss.

Introduction - CAP
“It is impossible for a distributed data store to simultaneously
provide more than two out of the following three guarantees”
● Consistency: Every read receives the most recent write or an error.
● Availability: Every request receives a (non-error) response, without a
guarantee that it contains the most recent write.
● Partition tolerance: The system continues to operate despite an arbitrary
number of messages being dropped (or delayed) by the network between
nodes.

Introduction - MongoDB
● MongoDB is written in C++, is open source, and was licensed under the GNU
AGPL (server releases since October 2018 use the SSPL instead).
● The core database server runs via an executable called mongod
(mongod.exe on Windows).
● The MongoDB command shell is a JavaScript-based tool for
administering the database and manipulating data.
manual/reference/mongo-shell/

CRUD - Create
● Databases and collections are created only when documents are first inserted.
● Every MongoDB document requires an _id. If one is not supplied on insert, an ObjectId is generated automatically.
db.collection.insertOne()
db.collection.insertMany()
db.collection.insert()

CRUD - Read
db.collection.find(query, projection)

db.inventory.find( {} )
    SELECT * FROM inventory

db.inventory.find( { status: "D" } )
    SELECT * FROM inventory WHERE status = "D"

db.inventory.find( { status: { $in: [ "A", "D" ] } } )
    SELECT * FROM inventory WHERE status in ("A", "D")

db.inventory.find( { status: "A", qty: { $lt: 30 } } )
    SELECT * FROM inventory WHERE status = "A" AND qty < 30

db.inventory.find( { status: "A", $or: [ { qty: { $lt: 30 } }, { item: /^p/ } ] } )
    SELECT * FROM inventory WHERE status = "A" AND ( qty < 30 OR item LIKE "p%")
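To make the filter-to-SQL mapping concrete, here is a minimal in-memory sketch in plain JavaScript. It is not how MongoDB evaluates queries. The matches helper and the sample inventory data are assumptions made for illustration, and only equality, regular expressions, $in, $lt and $or are handled.

```javascript
// Illustrative in-memory matcher; NOT MongoDB's implementation.
// Supports only: equality, regular expressions, $in, $lt, and $or.
function matches(doc, filter) {
  return Object.entries(filter).every(([key, cond]) => {
    if (key === "$or") {
      // $or holds an array of sub-filters; at least one must match.
      return cond.some((sub) => matches(doc, sub));
    }
    const value = doc[key];
    if (cond instanceof RegExp) {
      return typeof value === "string" && cond.test(value);
    }
    if (cond !== null && typeof cond === "object") {
      // Operator document such as { $lt: 30 } or { $in: ["A", "D"] }.
      return Object.entries(cond).every(([op, arg]) => {
        if (op === "$in") return arg.includes(value);
        if (op === "$lt") return value < arg;
        throw new Error(`unsupported operator: ${op}`);
      });
    }
    return value === cond; // plain equality
  });
}

// Hypothetical sample data.
const inventory = [
  { item: "journal", qty: 25, status: "A" },
  { item: "paper", qty: 100, status: "D" },
  { item: "postcard", qty: 45, status: "A" },
  { item: "planner", qty: 75, status: "D" },
];

// status = "A" AND ( qty < 30 OR item LIKE "p%" )
const filter = { status: "A", $or: [{ qty: { $lt: 30 } }, { item: /^p/ }] };
const found = inventory
  .filter((doc) => matches(doc, filter))
  .map((doc) => doc.item);

console.log(found); // [ 'journal', 'postcard' ]
```

Fields listed side by side in a filter are combined with AND, which is why the status condition and the $or clause must both hold.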

CRUD - Update
● Some Update Operators
○ $currentDate
○ $inc
○ $min
○ $max
○ $mul
○ $rename
○ $set
db.collection.update()
db.collection.findAndModify()
db.collection.updateOne()
db.collection.updateMany()
db.collection.replaceOne()
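The effect of several of these operators can be sketched on a plain JavaScript object. The applyUpdate helper below is hypothetical and only mimics the documented behaviour of $set, $inc, $mul, $min, $max and $rename on top-level fields. It is not part of any MongoDB driver.

```javascript
// Illustrative only: applies a small subset of update operators to a copy
// of a plain object. Top-level fields only; not MongoDB's implementation.
function applyUpdate(doc, update) {
  const result = { ...doc }; // shallow copy, the input is left untouched
  for (const [op, fields] of Object.entries(update)) {
    for (const [field, arg] of Object.entries(fields)) {
      switch (op) {
        case "$set":
          result[field] = arg;
          break;
        case "$inc": // a missing field is created with the increment value
          result[field] = (result[field] ?? 0) + arg;
          break;
        case "$mul": // a missing field is created as zero
          result[field] = (result[field] ?? 0) * arg;
          break;
        case "$min": // keep the smaller of the current and given value
          if (result[field] === undefined || arg < result[field]) result[field] = arg;
          break;
        case "$max": // keep the larger of the current and given value
          if (result[field] === undefined || arg > result[field]) result[field] = arg;
          break;
        case "$rename": // arg is the new field name
          if (field in result) {
            result[arg] = result[field];
            delete result[field];
          }
          break;
        default:
          throw new Error(`unsupported operator: ${op}`);
      }
    }
  }
  return result;
}

// Hypothetical document and update.
const before = { _id: 1, item: "paper", qty: 100, price: 2, low: 10, status: "D" };
const after = applyUpdate(before, {
  $set: { status: "A" },
  $inc: { qty: -10 },
  $mul: { price: 3 },
  $min: { low: 4 },
  $rename: { item: "name" },
});

console.log(after);
```

Each operator targets a different field here because MongoDB rejects an update that modifies the same field through two operators.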

CRUD - Delete
● Indexes
○ Delete operations do not drop indexes, even if deleting all documents from
a collection.
● Atomicity
○ All write operations in MongoDB are atomic on the level of a single
document.
db.collection.remove()
db.collection.deleteOne()
db.collection.deleteMany()

Cursors
● Cursors, found in many database systems, return query result sets iteratively in batches
for efficiency.
● A query instantiates a cursor, which is then used to retrieve the result set in
manageable chunks. Successive calls to MongoDB occur as needed to fill the
driver’s cursor buffer.
● Returning a huge result right away would mean:
○ Copying all that data into memory.
○ Transferring it over the wire.
○ Deserializing it on the client side.
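The batching idea can be sketched with a JavaScript generator. Everything here is a simplified stand-in: makeServer, getMore and the batch size of 4 are assumptions for illustration, not the real wire protocol or driver API.

```javascript
// Hypothetical stand-in for the server: hands out one batch per call
// and counts how many round trips were made.
function makeServer(documents, batchSize) {
  let roundTrips = 0;
  return {
    getMore(offset) {
      roundTrips += 1;
      return documents.slice(offset, offset + batchSize);
    },
    get roundTrips() {
      return roundTrips;
    },
  };
}

// The cursor fetches the next batch only when the previous one is used up.
function* cursor(server) {
  let offset = 0;
  while (true) {
    const batch = server.getMore(offset);
    if (batch.length === 0) return; // result set exhausted
    offset += batch.length;
    yield* batch;
  }
}

const docs = Array.from({ length: 10 }, (_, i) => ({ _id: i }));
const server = makeServer(docs, 4);

// Read only the first five documents, then stop.
const firstFive = [];
for (const doc of cursor(server)) {
  firstFive.push(doc._id);
  if (firstFive.length === 5) break;
}

console.log(firstFive, server.roundTrips); // [ 0, 1, 2, 3, 4 ] 2
```

Stopping after five documents costs two round trips, and the remaining documents are never transferred or deserialized.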

Indexing
● Introduction
● Indexing Types
● Indexing Properties

Indexing - Introduction
● Index keys are typically smaller than the documents they catalog, and indexes
are typically available in RAM or located sequentially on disk.
● Covered Queries
○ A query is covered when its criteria and its projection include only the indexed fields.
○ Results are returned directly from the index without scanning any documents or bringing
documents into memory.
● Ensure Indexes Fit in RAM
○ Use the db.collection.totalIndexSize() helper, which returns the index size in bytes.
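The covered-query condition can be written as a small check. The isCovered helper below is hypothetical and deliberately simplified: it only looks at top-level field names and ignores operators such as $or, multikey indexes and embedded fields. One detail it does capture is that _id is returned by default, so the projection has to exclude it unless _id is part of the index.

```javascript
// Simplified, illustrative check; the real query planner considers more cases.
function isCovered(indexFields, filter, projection) {
  const indexed = new Set(indexFields);
  // Every field used in the query criteria must be in the index.
  const filterOk = Object.keys(filter).every((f) => indexed.has(f));
  // Every field explicitly returned must be in the index.
  const returned = Object.keys(projection).filter((f) => projection[f] === 1);
  const fieldsOk = returned.length > 0 && returned.every((f) => indexed.has(f));
  // _id is returned unless the projection sets it to 0.
  const idOk = projection._id === 0 || indexed.has("_id");
  return filterOk && fieldsOk && idOk;
}

// Hypothetical compound index on { type: 1, item: 1 }.
const indexFields = ["type", "item"];

const covered = isCovered(indexFields, { type: "food" }, { item: 1, _id: 0 });
const notCovered = isCovered(indexFields, { type: "food" }, { item: 1 });

console.log(covered, notCovered); // true false
```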

Indexing - Index Types
● Single Field
● Compound Index
● Multikey Index
● Geospatial Index
● Text Indexes
● Hashed Indexes

Indexing - Index Properties
● TTL Indexes
○ The TTL index is used for TTL collections, which expire data after a period of time.
● Unique Indexes
○ A unique index causes MongoDB to reject all documents that contain a duplicate value for the
indexed field.
● Partial Indexes
○ A partial index indexes only documents that meet specified filter criteria.
● Case Insensitive Indexes
○ A case insensitive index disregards the case of the index key values.
● Sparse Indexes
○ A sparse index does not index documents that do not have the indexed field.
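The unique and sparse properties can be sketched with a Map. The UniqueIndex class is hypothetical and single-field only. It assumes, as in MongoDB, that a non-sparse unique index stores a missing field as null, so a second document without the field is rejected unless the index is also sparse.

```javascript
// Illustrative single-field index; not MongoDB's storage engine.
class UniqueIndex {
  constructor(field, { sparse = false } = {}) {
    this.field = field;
    this.sparse = sparse;
    this.entries = new Map(); // key value -> _id of the owning document
  }

  add(doc) {
    const hasField = Object.prototype.hasOwnProperty.call(doc, this.field);
    if (!hasField && this.sparse) return; // sparse: document is not indexed
    const key = hasField ? doc[this.field] : null; // missing field indexed as null
    if (this.entries.has(key)) {
      throw new Error(`duplicate key for ${this.field}: ${key}`);
    }
    this.entries.set(key, doc._id);
  }
}

// Unique: the second document with the same email is rejected.
const unique = new UniqueIndex("email");
unique.add({ _id: 1, email: "a@example.com" });
let rejected = false;
try {
  unique.add({ _id: 2, email: "a@example.com" });
} catch (err) {
  rejected = true;
}

// Unique and sparse: documents without the field are skipped, so both are accepted.
const sparse = new UniqueIndex("email", { sparse: true });
sparse.add({ _id: 3 });
sparse.add({ _id: 4 });

console.log(rejected, sparse.entries.size); // true 0
```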

Schema Design principles
● Introduction
● Embedding Vs. Referencing
● Model One-to-One Relationships
● Model One-to-Many Relationships

Schema Design principles - Introduction
● The application’s data access patterns should govern schema design,
with specific understanding of
○ The read/write ratio of database operations.
○ The types of queries and updates performed by the database.
○ The life-cycle of the data and growth rate of documents.
● When designing a data model, consider how applications will use your database.
○ If your application only uses recently inserted documents, consider using Capped Collections.
data-modeling

Embedding Vs. Referencing
● Embedding provides better performance for read operations, as well as the
ability to request and retrieve related data in a single database operation.
● Not all 1:1 or 1:Many relationships should be embedded in a single document.

Embedding Vs. Referencing
● References store the relationships between data by including links or
references from one document to another.
● Use references:
○ When embedding would not provide sufficient read performance advantages.
○ Where the object is referenced from many different sources.
○ To represent complex many-to-many relationships.
○ To model large, hierarchical data sets.
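The two shapes can be compared side by side with plain JavaScript objects. All names and values below (patron, publishers, books, publisher_id, publisherOf) are hypothetical sample data, and the find call is an in-memory stand-in for a second database query.

```javascript
// Embedding: the address lives inside the patron document,
// so one read returns everything.
const patron = {
  _id: "joe",
  name: "Joe Bookreader",
  address: { street: "123 Fake Street", city: "Faketon" },
};

// Referencing: each book stores only the publisher's _id,
// so publisher data is kept once and shared by many books.
const publishers = [{ _id: "oreilly", name: "O'Reilly Media" }];
const books = [
  { _id: 1, title: "MongoDB: The Definitive Guide", publisher_id: "oreilly" },
  { _id: 2, title: "50 Tips and Tricks for MongoDB Developers", publisher_id: "oreilly" },
];

// Resolving a reference needs a second lookup (a second query in a real database).
function publisherOf(book) {
  return publishers.find((p) => p._id === book.publisher_id);
}

console.log(patron.address.city);        // Faketon
console.log(publisherOf(books[1]).name); // O'Reilly Media
```

The trade-off is visible in the access pattern: the embedded address costs one read, while the referenced publisher costs an extra lookup but avoids duplicating the publisher in every book.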

One-to-One Relationships - Embedding

One-to-Many Relationships
One-to-Many
One-to-Few

One-to-Many Relationships
One-to-Squillions

Aggregation
● Aggregation operations group values from multiple documents together, and
can perform a variety of operations on the grouped data to return a single
result.
● The aggregate command operates on a single collection, logically passing the
entire collection into the aggregation pipeline.
● The $match and $sort pipeline operators can take advantage of an index when
they occur at the beginning of the pipeline.
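The pipeline idea, where each stage receives the output of the previous one, can be sketched with plain array functions. The helpers match, groupSum and sortBy are hypothetical stand-ins that loosely mirror $match, $group with a sum accumulator, and $sort. The orders data is made up for illustration.

```javascript
// Hypothetical sample collection.
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" },
];

// Each stage is a function from an array of documents to an array of documents.
const match = (predicate) => (docs) => docs.filter(predicate);

const groupSum = (keyField, sumField) => (docs) => {
  const totals = new Map();
  for (const doc of docs) {
    totals.set(doc[keyField], (totals.get(doc[keyField]) ?? 0) + doc[sumField]);
  }
  return [...totals].map(([_id, total]) => ({ _id, total }));
};

// direction: 1 for ascending, -1 for descending.
const sortBy = (field, direction) => (docs) =>
  [...docs].sort((a, b) => direction * (a[field] - b[field]));

const pipeline = [
  match((doc) => doc.status === "A"),
  groupSum("cust_id", "amount"),
  sortBy("total", -1),
];

// Feed the whole collection through the stages in order.
const result = pipeline.reduce((docs, stage) => stage(docs), orders);

console.log(result);
// [ { _id: 'A123', total: 750 }, { _id: 'B212', total: 200 } ]
```

Placing the match stage first shrinks the input for every later stage, which is also the position where a real $match can use an index.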

Aggregation
https://docs.mongodb.com/manual/core/aggregation-pipeline-optimization/

Aggregation - Limitations
● If any single document in the result set exceeds the BSON Document Size limit (16 MB), the
command will produce an error.
● The $group stage has a limit of 100 megabytes of RAM. By default, if the stage
exceeds this limit, $group will produce an error. Setting the allowDiskUse option lets
the stage write to temporary files instead.

Map-Reduce
● Map-reduce is a data processing paradigm for condensing large volumes of data
into useful aggregated results.
● Map-Reduce is less efficient and more complex than the aggregation pipeline.
● All map-reduce functions in MongoDB are JavaScript and run within the
mongod process.
● Map-reduce operations take the documents of a single collection as their input.
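The paradigm can be sketched in a few lines of plain JavaScript. The mapReduce function below is a hypothetical in-memory stand-in, not the mongod implementation. It passes emit as an argument instead of providing it as a global, and it skips reduce for keys with a single value, which mirrors MongoDB's documented behaviour.

```javascript
// Illustrative in-memory map-reduce; runs in one process over one array.
function mapReduce(docs, map, reduce) {
  const emitted = new Map(); // key -> list of emitted values
  const emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  // Map phase: the document is bound to `this`, as in MongoDB's map function.
  for (const doc of docs) map.call(doc, emit);
  // Reduce phase: condense each key's values into a single value.
  return [...emitted].map(([key, values]) => ({
    _id: key,
    value: values.length === 1 ? values[0] : reduce(key, values),
  }));
}

// Hypothetical sample collection.
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" },
];

const totals = mapReduce(
  orders,
  function (emit) {
    if (this.status === "A") emit(this.cust_id, this.amount);
  },
  (key, values) => values.reduce((sum, v) => sum + v, 0)
);

console.log(totals);
// [ { _id: 'A123', value: 750 }, { _id: 'B212', value: 200 } ]
```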
