Document-Oriented Database
(MongoDB)
Index
● Introduction
○ Evolution of Database Systems
○ Introduction to NoSQL database
○ CAP theorem and BASE Property
● Document-oriented database (MongoDB)
○ Features of MongoDB
○ Use Cases of MongoDB
● Querying in MongoDB
○ CRUD operations
○ Aggregate Pipeline
○ Indexing
● Sharding in MongoDB
● References
Evolution of database systems
Generation | Era | Database model | Motivation | Examples
1st | Mid 1960s to early 1970s (mainframe era) | Hierarchical model; network model | Representing relationships between data items and storing them on magnetic tapes and, later, on addressable magnetic disks | IMS (hierarchical); IDS, IDMS (network)
2nd | Early 1970s (minicomputer era) | Relational model | Increasing data independence and providing ad hoc query processing | System R, Oracle, DB2, Sybase, Postgres
3rd | Early 1980s (client-server and early web application era) | Object-oriented model | Representing complex data items and tackling the impedance mismatch problem | Versant, Matisse, O2, VelocityDB
4th | Early 2000s (global-scope application era, driven by big data) | NoSQL | Satisfying the high scalability and high availability requirements of massive web, cloud, and mobile applications | Bigtable, MongoDB, DynamoDB, Cassandra, Neo4j
5th | Late 2000s (global-scope application era, driven by big data) | NewSQL | Satisfying the high availability and scalability requirements of modern global-scope OLTP applications | NuoDB, CockroachDB, H-Store
History of NoSQL
• In 1998, Carlo Strozzi used the term NoSQL for his lightweight, open-source relational database, which did not expose the standard Structured Query Language (SQL) interface.
• The Big Data explosion led organizations (large, medium, and small) to seek better ways to store, manage, and analyze large unstructured data sets that conventional relational databases could not handle.
• Johan Oskarsson reintroduced the term NoSQL in early 2009, when he organized an event to discuss "open source distributed, non relational databases."
• NoSQL is commonly read as 'Not Only SQL'.
Why NoSQL was developed
● To handle new requirements: horizontal scalability, high availability, fault tolerance, transaction reliability, and database schema maintainability.
● To handle structured, semi-structured, and unstructured data.
● To offer flexible data models that can be schema-less.
● Relaxing the ACID properties enables scaling out while achieving high availability and low latency.
● Data can easily be replicated and horizontally partitioned across local and remote servers.
● Examples: MongoDB, Amazon Dynamo
Type | Examples | Typical use
Key-value based | Riak, Redis, etc. | Storing session information
Document based | MongoDB, CouchDB, etc. | E-commerce applications
Column based | Cassandra, HBase, etc. | Blogging platforms
Graph based | Neo4j, OrientDB, etc. | Connections on social networks
Terminology
Relational database (MySQL) | Document based (MongoDB) | Key-value based (Riak) | Column based (Cassandra) | Graph based (Neo4j)
Table | Collection | Namespace | Table | Node
Row | Document | Key-value pair | Row | Node/Label
Column | Field | - | Column | Properties
Primary key | _id (ObjectId) | Key | Primary key | -
Index | Index | Index | Index | Index
View | View | - | Materialized view | -
Nested table or object | Embedded document | - | Map | Relationships
Features of NoSQL Database
● 4th generation database
● Non-relational
● Distributed architecture
● Open source
● Horizontally scalable
● Schema free
● Easy replication
● Simple API
● Can manage huge amounts of data
● Can be implemented on commodity hardware
● More than 150 NoSQL databases
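CAP Theorem
• Consistency: All clients see the same data at the same time, whichever node they read from.
• Availability: Every request receives a response, even when some nodes are down.
• Partition Tolerance: The system keeps operating even when network failures split it into segments that cannot communicate.
A distributed system can guarantee at most two of these three properties at the same time.
Source: https://www.researchgate.net/figure/CAP-theorem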
BASE: Basically Available, Soft State, Eventually Consistent
Basically Available: Parts of the distributed system can fail while the rest of the system continues to function.
Soft State: The state of the system and its data may change over time, even without new input, because of eventual consistency.
Eventually Consistent: Multiple copies of the data may be inconsistent for a short period of time, but they converge to the same value once updates stop.
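As a rough illustration of "soft state" and "eventually consistent", the sketch below models a primary and one replica as plain JavaScript Maps. The names primary, replica, pendingWrites, write and replicate are hypothetical and exist only for this example; replication in a real NoSQL store is far more involved.

```javascript
// Toy model of eventual consistency: writes land on the primary at once
// and reach the replica only when replicate() runs.
const primary = new Map();
const replica = new Map();
const pendingWrites = []; // writes not yet copied to the replica

function write(key, value) {
  primary.set(key, value);
  pendingWrites.push([key, value]);
}

function replicate() {
  // Apply the queued writes in order, then clear the queue.
  for (const [key, value] of pendingWrites) {
    replica.set(key, value);
  }
  pendingWrites.length = 0;
}

write("marks", 70);
replicate();
write("marks", 90);

// Soft state: for a short period the two copies disagree.
const staleRead = replica.get("marks");
replicate();
// Eventually consistent: after replication the copies agree again.
const freshRead = replica.get("marks");

console.log(staleRead, freshRead, primary.get("marks"));
```

A client that reads from the replica between the second write and the second replicate() call sees the older value, which is the trade-off BASE accepts in exchange for availability.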
MongoDB
● It is an open-source, cross-platform, document-oriented database written in C++.
● MongoDB stores data as JSON-like documents (internally in a binary form called BSON).
● The structure of JSON is {key: value}
● JSON example: {"name": "Jory", "age": 15}
● MongoDB is preferred over an RDBMS in the following scenarios:
○ Big Data: If you have a huge amount of data to store, consider MongoDB before an RDBMS. MongoDB has a built-in solution for partitioning and sharding the database.
○ Undefined Schema: Adding a new column in an RDBMS is hard, whereas MongoDB is schema-less. Adding a new field is easy and does not affect old documents.
○ Read-heavy workloads: MongoDB is preferred over an RDBMS when reads outnumber writes and fast read access is needed.
Key Features of MongoDB
1. Dynamic Document Schema: Documents are schema-free and can be customized as needed.
2. Native Language Drivers: MongoDB provides official driver support for popular programming languages such as C, C++, C#, Java, Node.js, Perl, PHP, Python, Ruby, Scala, Go, and Erlang.
3. High Availability: MongoDB can run on multiple servers at the same time, which reduces the risk of data loss during a hardware failure.
4. High Performance: Ad hoc queries, indexing, and real-time aggregation provide powerful ways to access data.
5. Horizontal Scalability: With horizontal scaling, each shard in a cluster houses a portion of the dataset.
Source: https://www.mongodb.com/mongodb-architecture
Representation of MongoDB
RDBMS | MongoDB
Database | Database
Table | Collection
Row | Document
Column | Field
Join (normalized) | Embedding/Referencing (denormalized)
Primary Key | _id Field
Supported Data types
● JSON
● Integers
● Boolean
● Double
● Array
● Date
● Timestamp
● ObjectId
● Null
● Objects
Use Cases of MongoDB
Case study: MongoDB with Aadhaar
● Aadhaar is India's Unique Identification project, which has the biggest biometrics database in the world.
● MongoDB is used to store images in the Aadhaar project.
● Aadhaar chose to partner with MongoDB (in addition to other technologies such as Hadoop, MySQL, HBase, and Solr) for several reasons:
1) MongoDB's NoSQL approach increases database efficiency, which enables Aadhaar to capture, process, search, and analyze large unstructured datasets faster than most other database management software.
2) MongoDB can efficiently store large volumes of biometric data and images, whereas many other management systems, such as MySQL, are less suited to image storage.
Sample Database (“Studentinfo”)
{
"_id" : ObjectId("61f6f4e0dd0c8af093eb9255"),
"studentName" : "Gaurav",
"section" : "A",
"Marks" : 90,
"subject" : [
"English"
]
}
{
"_id" : ObjectId("61f6f4e0dd0c8af093eb9254"),
"studentName" : "Vijay",
"section" : "A",
"Marks" : 70,
"subject" : [
"Hindi",
"English",
"Math"
]
}
[Figure labels: Collection ("Student") containing Document 1, Document 2, ..., Document n; each document is annotated with its primary key (_id), its fields, and its embedded/nested documents.]
Querying in MongoDB
1. CRUD (Create, Read, Update, Delete) operations: These operations create, read, modify, and delete documents. Each operation works on a single collection.
2. Aggregation:
● Processes data records and returns computed results. It can be applied to multiple documents and includes operators such as $sum, $max, and $min.
● The aggregation pipeline method is used to perform aggregations. It includes stages such as $match, $project, and many more.
3. Indexing: Used to make queries more efficient. It includes:
a) Create index b) Find index c) Drop index
1. MongoDB CRUD Operations
Create (CRUD) Operations:
It includes
1) createCollection(): creates a new collection
2) insert(): inserts one or more documents into a collection
It has two more specific variants:
● insertOne(): inserts one document into a collection
● insertMany(): inserts many documents into a collection
1. createCollection()
Basic Syntax: db.createCollection(name, options)
Query: Create a new collection "student"
Syntax: db.createCollection("student");
2. insert(): The insert() method inserts one or more documents into a collection. Current versions of the MongoDB shell mark it as deprecated in favor of insertOne() and insertMany().
Basic Syntax: db.collection_Name.insert(JSON document or array of JSON documents)
Query: Insert the marks of the students named 'Vijay' and 'Gaurav' in section 'A', who have the subjects 'Hindi, English, Math' and 'English' respectively.
Syntax: db.student.insert({studentName: "Vijay", section: "A", Marks: 70, subject: ["Hindi", "English", "Math"]})
db.student.insert([{studentName: "Gaurav", section: "A", Marks: 90, subject: ["English"]}])
3. insertOne(): Another way to insert documents is the insertOne() method, which inserts a single document into a collection.
Basic Syntax: db.collection_Name.insertOne(JSON document)
Query: Insert the marks of the student named 'Vijay' in section 'A', who has the subjects 'Hindi, English, Math'.
Syntax: db.student.insertOne({studentName: "Vijay", section: "A", Marks: 70, subject: ["Hindi", "English", "Math"]})
4. insertMany(): Used for inserting multiple documents.
Basic Syntax: db.collection_Name.insertMany([array of JSON documents])
Query: Insert the marks of the students named 'Vijay' and 'Gaurav' in section 'A', who have the subjects 'Hindi, English, Math' and 'English' respectively.
Syntax: db.student.insertMany([{studentName: "Vijay", section: "A", Marks: 70, subject: ["Hindi", "English", "Math"]}, {studentName: "Gaurav", section: "A", Marks: 90, subject: ["English"]}]);
Read (CRUD) Operations
Read operations retrieve documents from a collection, i.e. they query a collection for documents.
Basic Syntax: db.collection.find()
Query: Display the details of the students.
Syntax: db.student.find({});
pretty(): This method formats the output of a query so that it is easier to read.
Basic Syntax: db.collection.find().pretty();
Query: Display the details of the students in a readable format.
Syntax: db.student.find().pretty();
Update (CRUD) Operations
Update operations modify existing documents in a collection. Current versions of the MongoDB shell prefer updateOne() and updateMany() over update().
Basic Syntax: db.collection_Name.update(selection_criteria, updated_data)
Query: Update the name of "Gaurav" to "Gorav".
Syntax: db.student.update({studentName: "Gaurav"}, {$set: {studentName: "Gorav"}})
Remove (CRUD) Operations
1. drop(): deletes a collection
2. remove(): deletes documents from a collection
1. drop()
Basic Syntax: db.collection_name.drop()
Query: Delete the collection "student".
Syntax: db.student.drop();
2. remove()
Basic Syntax: db.collection_name.remove(deletion_criteria)
Query: Delete the details of the student "Gorav".
Syntax: db.student.remove({"studentName": "Gorav"});
2. Aggregation
● Aggregation operations process multiple documents and return computed results.
● The key element in aggregation is the pipeline.
● A pipeline is a sequence of stages; each stage transforms the documents and passes its output to the next stage.
● Stages use aggregation operators such as $max, $min, and $avg.
● MongoDB offers roughly 32 pipeline stages (the exact number depends on the version), e.g. $project, $match, and $group.
The basic syntax of the aggregate() method is as follows:
db.Collection_Name.aggregate(pipeline)
The following is a list of some aggregation operators:
Operator | Description
$add | Adds numbers to return the sum, or adds numbers and a date to return a new date.
$in | Returns a boolean indicating whether the specified value is in the array.
$min | Returns the minimum value across the documents.
$max | Returns the maximum value across the documents.
$count | Counts the total number of documents.
$avg | Calculates the average of the given values across the documents.
$first | Returns the value from the first document in each group.
$last | Returns the value from the last document in each group.
The following is a list of some aggregation stages:
Stage | Description
$match | Filters the document stream to allow only matching documents to pass into the next pipeline stage.
$project | Reshapes each document in the stream and shows only the selected fields.
$group | Groups input documents by a specified identifier expression and applies the accumulator expression(s), if specified, to each group.
$limit | Passes the first n documents unmodified to the pipeline, where n is the specified limit.
$lookup | Performs a left outer join to another collection in the same database to filter in documents from the "joined" collection for processing.
$count | Returns a count of the number of documents at this stage of the aggregation pipeline.
$merge | Writes the resulting documents of the aggregation pipeline to a collection.
$unwind | Deconstructs an array field from the input documents to output a document for each element.
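To make the $unwind row concrete, here is a minimal plain-JavaScript sketch that runs without a MongoDB server. The unwind helper is hypothetical (it is not a MongoDB API) and only mimics the basic behaviour on the two sample student documents, assuming every document carries the array field.

```javascript
// Sample documents, mirroring the "student" collection used in these slides.
const students = [
  { studentName: "Gaurav", section: "A", Marks: 90, subject: ["English"] },
  { studentName: "Vijay", section: "A", Marks: 70, subject: ["Hindi", "English", "Math"] },
];

// Hypothetical helper: emit one copy of each document per array element,
// with the array field replaced by that single element.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    doc[field].map((element) => ({ ...doc, [field]: element }))
  );
}

const unwound = unwind(students, "subject");
console.log(unwound.length); // one output document per subject entry
console.log(unwound.map((doc) => `${doc.studentName}: ${doc.subject}`));
```

In the shell, the corresponding stage would be written as { $unwind: "$subject" } inside the pipeline array.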
Examples
The following are three popular stages in the aggregation framework:
1) $match: This is a filtering operation, so it can reduce the number of documents passed as input to the next stage.
Basic Syntax: { $match: { <query> } }
Query 1: Display the details of the students who belong to section "A".
Syntax: db.student.aggregate([{"$match": {"section": "A"}}])
Query 2: Display the details of the students who belong to section "A" and have marks greater than 80.
Syntax: db.student.aggregate([{"$match": {"$and": [{"section": "A"}, {"Marks": {"$gt": 80}}]}}])
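The filtering in Query 2 can be approximated in plain JavaScript, which may help when reasoning about which documents reach the next stage. This is only a sketch of the semantics on an in-memory array, not how MongoDB evaluates $match; the students array copies the sample documents from these slides.

```javascript
// In-memory stand-in for the "student" collection.
const students = [
  { studentName: "Gaurav", section: "A", Marks: 90, subject: ["English"] },
  { studentName: "Vijay", section: "A", Marks: 70, subject: ["Hindi", "English", "Math"] },
];

// Rough equivalent of:
//   { $match: { $and: [ { section: "A" }, { Marks: { $gt: 80 } } ] } }
const matched = students.filter(
  (doc) => doc.section === "A" && doc.Marks > 80
);

console.log(matched.map((doc) => doc.studentName));
```

Both conditions must hold for a document to pass, so only the documents that satisfy the section test and the marks test are handed to the following stage.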
2) $project: Used to select specific fields from the documents of a collection.
Basic Syntax: { $project: { <specification(s)> } }
Query 1: Display the name, section, and marks of all the students.
Syntax: db.student.aggregate([{"$project": {studentName: 1, section: 1, Marks: 1}}])
Query 2: Display the names and marks of the students from section A.
Syntax: db.student.aggregate([{"$match": {"section": "A"}}, {"$project": {studentName: 1, Marks: 1}}])
3) $group: Groups documents by a key and computes aggregated values, such as totals and averages, for each group.
Basic Syntax:
{ $group: { _id: <expression>, <field1>: { <accumulator1>: <expression1> }, ... } }
Here _id is the group-by expression.
Query 1: Find the total marks of each section.
Syntax: db.student.aggregate([{"$group": {"_id": {"section": "$section"}, "Total Marks": {"$sum": "$Marks"}}}])
Query 2: Find the total marks of section A.
Syntax: db.student.aggregate([{"$match": {"section": "A"}}, {"$group": {"_id": {"section": "$section"}, "Total Marks": {"$sum": "$Marks"}}}])
Query 3: Find the total and average marks of each section.
Syntax: db.student.aggregate([{"$group": {"_id": {"section": "$section"}, "Total Marks": {"$sum": "$Marks"}, "Count": {"$sum": 1}, "Average": {"$avg": "$Marks"}}}])
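What Query 3 computes can be sketched in plain JavaScript. The groupBySection helper below is hypothetical and only imitates the $sum and $avg accumulators on an in-memory array; the third document ("Asha", section "B") is an invented sample row, added so that there is more than one group.

```javascript
const students = [
  { studentName: "Gaurav", section: "A", Marks: 90 },
  { studentName: "Vijay", section: "A", Marks: 70 },
  { studentName: "Asha", section: "B", Marks: 60 }, // hypothetical extra document
];

// Imitates grouping on "$section" with a running total and a document count.
function groupBySection(docs) {
  const groups = new Map();
  for (const doc of docs) {
    const group = groups.get(doc.section) ?? { total: 0, count: 0 };
    group.total += doc.Marks;
    group.count += 1;
    groups.set(doc.section, group);
  }
  // Shape the output like the documents the $group stage would return.
  return [...groups].map(([section, { total, count }]) => ({
    _id: { section },
    "Total Marks": total,
    Count: count,
    Average: total / count,
  }));
}

const result = groupBySection(students);
console.log(result);
```

The "Count" accumulator in the query uses { $sum: 1 }, which adds one per document; the count field in the sketch plays the same role.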
3. Indexing
● An index in MongoDB is a special data structure that holds the values of a few fields of the documents, namely the fields on which the index is created.
● MongoDB uses a B-tree data structure to store indexes.
● Indexes speed up search operations because, instead of scanning every document in the collection, the search runs on the index, which holds only a few fields.
● On the other hand, having too many indexes can hamper the performance of insert, update, and delete operations, because every index needs additional writes and additional storage space.
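The trade-off described above can be sketched with a deliberately simplified index. A sorted array searched by binary search stands in for the B-tree here; buildIndex and findByIndex are hypothetical helpers, and "Meena" is an invented sample row.

```javascript
// Documents stored in insertion order, as in a collection without an index.
const students = [
  { studentName: "Vijay", Marks: 70 },
  { studentName: "Gaurav", Marks: 90 },
  { studentName: "Meena", Marks: 85 }, // hypothetical extra document
];

// Build a simplified "index" on one field: sorted [key, position] pairs.
// A real MongoDB index is a B-tree; a sorted array is only a stand-in.
function buildIndex(docs, field) {
  return docs
    .map((doc, position) => [doc[field], position])
    .sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));
}

// Binary search over the index: O(log n) comparisons instead of
// scanning every document.
function findByIndex(docs, index, key) {
  let low = 0;
  let high = index.length - 1;
  while (low <= high) {
    const mid = (low + high) >> 1;
    const [indexedKey, position] = index[mid];
    if (indexedKey === key) return docs[position];
    if (indexedKey < key) low = mid + 1;
    else high = mid - 1;
  }
  return null; // no document with that key
}

const nameIndex = buildIndex(students, "studentName");
const found = findByIndex(students, nameIndex, "Gaurav");
console.log(found);
```

Every insert, update, or delete would also have to keep nameIndex sorted, which is the extra write cost that makes too many indexes expensive.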
1. Creating an index in MongoDB
Basic Syntax: db.collection_name.createIndex({field_name: 1 or -1})
The value 1 is for ascending order and -1 is for descending order.
Query: Create an index on the student name.
Syntax: db.student.createIndex({"studentName": 1})
2. Finding the indexes in a collection
The getIndexes() method lists all the indexes created on a collection.
Basic Syntax: db.collection_name.getIndexes()
Query: Get all indexes on the student collection.
Syntax: db.student.getIndexes()
3. Dropping indexes in a collection
The dropIndex() method is used for this purpose.
Basic Syntax: db.collection_name.dropIndex({field_name: 1})
Query: Drop the index on the student name.
Syntax: db.student.dropIndex({studentName: 1})
Sharding
● Sharding is the process of distributing data across multiple servers for storage.
● It is MongoDB's approach to meeting the demands of data growth.
● Sharding is used to achieve horizontal scaling. With sharding, more machines are added to meet the demands of read and write operations.
Characteristics of sharding:
● It allows more servers to be added to a database and automatically balances data and load across them.
● It splits the data set and distributes it across multiple databases, or shards. Each shard serves as an independent database, and together the shards make up a single logical database.
Example of sharding:
If a database has a 1-terabyte data set distributed among 4 shards, then each shard may hold only 256 GB of data. If the database contains 40 shards, then each shard holds only about 25 GB of data.
Source: https://i.stack.imgur.com/zCOvb.png
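The arithmetic behind the example can be checked with a few lines of JavaScript. The gbPerShard function is a hypothetical helper; it assumes an even spread of data and 1 TB = 1024 GB, which is what the 256 GB figure implies.

```javascript
// Average data held per shard if the data set is spread evenly.
// Assumes 1 TB = 1024 GB.
function gbPerShard(totalTb, shardCount) {
  const GB_PER_TB = 1024;
  return (totalTb * GB_PER_TB) / shardCount;
}

const fourShards = gbPerShard(1, 4);
const fortyShards = gbPerShard(1, 40);
console.log(fourShards, fortyShards);
```

In practice the split is rarely perfectly even, so these numbers are averages rather than guarantees.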
Components are:
1. shard: Each shard contains a subset of the
sharded data. Each shard can be deployed as
a replica set.
2. mongos: The mongos acts as a query
router, providing an interface between client
applications and the sharded cluster.
3. config servers: Config servers store
metadata and configuration settings for the
cluster.
Source: https://docs.mongodb.com/manual/sharding/
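How the three components cooperate can be sketched roughly as follows. This is a heavily simplified model, not MongoDB's implementation: chunkMap stands in for the metadata kept by the config servers, routeToShard stands in for the mongos query router, and the shard names, the ranges, and the choice of Marks as shard key are all assumptions made for illustration.

```javascript
// Hypothetical chunk metadata, standing in for what config servers store:
// each range of shard-key values is assigned to one shard.
const chunkMap = [
  { min: 0, max: 50, shard: "shardA" },   // Marks in [0, 50)
  { min: 50, max: 80, shard: "shardB" },  // Marks in [50, 80)
  { min: 80, max: 101, shard: "shardC" }, // Marks in [80, 101)
];

// Stand-in for the query router: find the shard whose range holds the key.
function routeToShard(shardKeyValue) {
  const chunk = chunkMap.find(
    (c) => shardKeyValue >= c.min && shardKeyValue < c.max
  );
  if (!chunk) throw new RangeError(`no chunk covers ${shardKeyValue}`);
  return chunk.shard;
}

const vijayShard = routeToShard(70);  // Vijay's marks
const gauravShard = routeToShard(90); // Gaurav's marks
console.log(vijayShard, gauravShard);
```

The client only talks to the router; it never needs to know which shard holds a given document.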
References
1. MongoDB website, https://www.mongodb.com
2. Dan Sullivan. "NoSQL for Mere Mortals", 1st Edition, United States of America: Pearson Education, 2015.
3. "Top 5 Considerations When Evaluating NoSQL Databases", MongoDB white paper, 2015. (https://www.mongodb.com/collateral/top-5-considerations-when-evaluating-nosql-databases)
4. Davoudian, Ali, Liu Chen, and Mengchi Liu. "A survey on NoSQL stores." ACM Computing Surveys (CSUR) 51, no. 2 (2018): 1-43.
5. Cattell, R. "Scalable SQL and NoSQL data stores." ACM SIGMOD Record 39, no. 4 (2011): 12-27.
6. Floratou, Avrilia, Nikhil Teletia, David J. DeWitt, Jignesh M. Patel, and Donghui Zhang. "Can the Elephants Handle the NoSQL Onslaught?" Proceedings of the VLDB Endowment 5, no. 12 (2012).
7. Patel, Jignesh M. "Operational NoSQL Systems: What's New and What's Next?" Computer 49, no. 4 (2016): 23-30.
8. Abadi, Daniel, Rakesh Agrawal, Anastasia Ailamaki, Magdalena Balazinska, Philip A. Bernstein, Michael J. Carey, Surajit Chaudhuri et al. "The Beckman report on database research." Communications of the ACM 59, no. 2 (2016): 92-99.
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
 
Diamond Harbour \ Russian Call Girls Kolkata | Book 8005736733 Extreme Naught...
Diamond Harbour \ Russian Call Girls Kolkata | Book 8005736733 Extreme Naught...Diamond Harbour \ Russian Call Girls Kolkata | Book 8005736733 Extreme Naught...
Diamond Harbour \ Russian Call Girls Kolkata | Book 8005736733 Extreme Naught...
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
Introduction to Statistics Presentation.pptx
Introduction to Statistics Presentation.pptxIntroduction to Statistics Presentation.pptx
Introduction to Statistics Presentation.pptx
 

Copy of MongoDB .pptx

  • 2. Index ● Introduction ○ Evolution of Database Systems ○ Introduction to NoSQL database ○ CAP theorem and BASE Property ● Document-oriented database (MongoDB) ○ Features of MongoDB ○ Use Cases of MongoDB ● Querying in MongoDB ○ CRUD operations ○ Aggregate Pipeline ○ Indexing ● Sharding in MongoDB ● References
  • 4. Generation | Era | Database model | Motivation | Examples
      1st | Mid-1960s to early 1970s (mainframe era) | Hierarchical model; network model | Representing relationships between data items by storing them on magnetic tapes and, later, on addressable magnetic disks. | IMS (hierarchical); IDS, IDMS (network)
      2nd | Early 1970s (minicomputer era) | Relational model | Increasing data independence and providing ad hoc query processing. | System R, Oracle, DB2, Sybase, PostgreSQL
      3rd | Early 1980s (client-server and early web application era) | Object-oriented model | Representing complex data items and tackling the impedance mismatch problem. | Versant, Matisse, O2, VelocityDB
      4th | Early 2000s (era of global-scope applications that need big data) | NoSQL | Satisfying the high scalability and high availability requirements of massive web, cloud and mobile applications. | Bigtable, MongoDB, DynamoDB, Cassandra, Neo4j
      5th | Late 2000s (era of global-scope applications that need big data) | NewSQL | Satisfying the high availability and scalability requirements of modern global-scope OLTP applications. | NuoDB, CockroachDB, H-Store
  • 5. History of NoSQL • In 1998, Carlo Strozzi used the term NoSQL for his lightweight, open-source relational database that did not expose the standard Structured Query Language (SQL) interface. • The Big Data explosion caused organizations (large, medium and small) to seek a better way to store, manage and analyze large unstructured data sets that conventional relational databases could not handle. • Johan Oskarsson reintroduced the term NoSQL in early 2009 when he organized an event to discuss "open source distributed, non relational databases." • It stands for 'Not Only SQL'. Why NoSQL was developed • To handle new requirements: horizontal scalability, high availability, fault tolerance, transaction reliability, database schema maintainability. • Can handle structured, semi-structured and unstructured data. • Flexible data models that can be schema-less. • Relaxing the ACID properties enables scaling out while achieving high availability and low latency. • Data can easily be replicated and horizontally partitioned across local and remote servers. • Examples: MongoDB, Amazon Dynamo
  • 6. Key-value pair based: e.g. Riak, Redis. Use: storing session information.
      Document based: e.g. MongoDB, CouchDB. Use: e-commerce applications.
      Column based: e.g. Cassandra, HBase. Use: blogging platforms.
      Graph based: e.g. Neo4j, OrientDB. Use: connections on social networks.
  • 7. Terminology
      Relational database (MySQL) | Document based (MongoDB) | Key-Value based (Riak) | Column based (Cassandra) | Graph based (Neo4j)
      Table | Collection | Namespace | Table | Node
      Row | Document | Key-Value Pair | Row | Node/Label
      Column | Field | - | Column | Properties
      Primary key | Object_id | Key | Primary key | -
      Index | Index | Index | Index | Index
      View | View | - | Materialized view | -
      Nested table or object | Embedded document | - | Map | Relationships
  • 8. Features of NoSQL Database: 4th generation database; non-relational; distributed architecture; open source; horizontally scalable.
  • 9. Schema free; easy replication; simple API; can manage huge amounts of data; can be implemented on commodity hardware; more than 150 NoSQL databases.
  • 10. CAP Theorem • Consistency: All clients read the same data, whichever node they query. • Availability: Data is available at all times; every request receives a response. • Partition Tolerance: The system keeps operating even when network failures split it into segments that cannot communicate with each other. Source:https://www.researchgate.net/figure/CAP- theorem
  • 11. BASE: Basically Available, Soft State, Eventually Consistent Basically Available: Some parts of the distributed system can fail while the rest of the system continues to function. Soft State: The state of the system and its data can change over time, even without new input, because of eventual consistency. Eventually Consistent: Multiple copies of the data may be inconsistent for a short period of time, but they converge to the same value once updates have propagated.
  • 12. MongoDB • It is an open-source, cross-platform, document-oriented database written in C++. • MongoDB stores data as JSON-like documents (internally in a binary form called BSON). • The structure of JSON is {key: value}. • JSON example: {Name: "Jory", age: 15} • MongoDB is preferred over an RDBMS in the following scenarios: • Big Data: If you have a huge amount of data to store, consider MongoDB before an RDBMS. MongoDB has a built-in solution for partitioning and sharding the database. • Undefined Schema: Adding a new column in an RDBMS is hard, whereas MongoDB is schema-less. Adding a new field is easy and does not affect old documents. • More Read operations: MongoDB is preferred over an RDBMS when the workload has more read operations than write operations and reads need fast access.
  • 13. Key Features of MongoDB 1. Dynamic Document Schema: Documents are schema-free and can be customized according to need. 2. Native Language drivers: MongoDB currently provides official driver support for all popular programming languages like C, C++, C#, Java, Node.js, Perl, PHP, Python, Ruby, Scala, Go, and Erlang. 3. High availability: A MongoDB database can run on multiple servers at a time, which reduces the risk of data loss during hardware failure. 4. High performance: Ad hoc queries, indexing, and real-time aggregation provide powerful ways to access data. 5. Horizontal Scalability: Horizontal scaling means that each shard in a cluster houses a portion of the dataset. Source: https://www.mongodb.com/mongodb-architecture
  • 14. Representation of MongoDB
      RDBMS | MongoDB
      Database | Database
      Table | Collection
      Row | Document
      Column | Field
      Join (normalized) | Embedding/Referencing (denormalized)
      Primary Key | _id Field
  • 15. Supported Data types ● JSON ● Integers ● Boolean ● Double ● Array ● Date ● Timestamp ● ObjectId ● Null ● Objects
  • 16. Use Cases of MongoDB
  • 17. Case study: MongoDB with Aadhaar ● Aadhaar is India's Unique Identification project, which has the biggest biometrics database in the world. ● MongoDB is used for the storage of images in the Aadhaar project. ● Aadhaar chose to partner with MongoDB (in addition to other technologies such as Hadoop, MySQL, HBase, and Solr) for several reasons. 1) MongoDB increases database efficiency with its NoSQL approach, which enables Aadhaar to capture, process, search, and analyze large unstructured datasets faster than most other data management software. 2) MongoDB can efficiently store large volumes of biometric data and images, whereas many other management systems, such as MySQL, are less suited for image storage.
  • 18. Sample Database ("Studentinfo"), Collection ("Student")
      Document 1: { "_id" : ObjectId("61f6f4e0dd0c8af093eb9255"), "studentName" : "Gaurav", "section" : "A", "Marks" : 90, "subject" : [ "English" ] }
      Document 2: { "_id" : ObjectId("61f6f4e0dd0c8af093eb9254"), "studentName" : "Vijay", "section" : "A", "Marks" : 70, "subject" : [ "Hindi", "English", "Math" ] }
      ... Document n
      [Figure labels: Primary key (_id), Fields, Embedded/Nested documents]
  • 19. Querying in MongoDB 1. CRUD (Create, Read, Update, Delete) operations: These functions can be broadly classified as the data manipulation functions of MongoDB. Each operation targets a single collection. 2. Aggregation: ● Processes data records and returns computed results. It can be applied to multiple documents. Includes operators like $sum, $max, $min etc. ● The aggregation pipeline method is used to perform the aggregations. Includes stages like $match, $project and many more. 3. Indexing: Used to make queries perform more efficiently. It includes: a) Create index b) Find index c) Drop index
  • 20. 1. MongoDB CRUD Operations Create (CRUD) Operations: It includes 1) createCollection(): creation of a new collection 2) insert(): insertion of one or more documents inside a collection. It includes two other methods: ● insertOne(): to insert one document inside a collection ● insertMany(): to insert many documents inside a collection
  • 21. 1. createCollection()
      Basic Syntax: db.createCollection(name, options)
      Query: Create a new collection "student".
      Syntax: db.createCollection("student");
      2. insert(): The insert() method is used to insert one or multiple documents in a collection.
      Basic Syntax: db.collection_Name.insert(JSON document)
      Query: Insert the marks of the students 'Vijay' and 'Gaurav' in section 'A', having subjects 'Hindi, English, Math' and 'English' respectively.
      Syntax: db.student.insert({studentName: "Vijay", section: "A", Marks: 70, subject: ["Hindi", "English", "Math"]})
      db.student.insert([{studentName: "Gaurav", section: "A", Marks: 90, subject: ["English"]}])
  • 22. 3. insertOne(): Another way to insert a single document in a collection is the insertOne() method.
      Basic Syntax: db.collection_Name.insertOne(JSON document)
      Query: Insert the marks of a student named 'Vijay' in section 'A' having subjects 'Hindi, English, Math'.
      Syntax: db.student.insertOne({studentName: "Vijay", section: "A", Marks: 70, subject: ["Hindi", "English", "Math"]})
      4. insertMany(): Used for inserting multiple documents.
      Basic Syntax: db.collection_Name.insertMany([array of JSON documents])
      Query: Insert the marks of the students 'Vijay' and 'Gaurav' in section 'A', having subjects 'Hindi, English, Math' and 'English' respectively.
      Syntax: db.student.insertMany([{studentName: "Vijay", section: "A", Marks: 70, subject: ["Hindi", "English", "Math"]}, {studentName: "Gaurav", section: "A", Marks: 90, subject: ["English"]}]);
  • 23. Read (CRUD) Operations Read operations retrieve documents from a collection, i.e. they query a collection for documents.
      Basic Syntax: db.collection.find()
      Query: Display the details of students.
      Syntax: db.student.find({});
      pretty(): This method formats the output returned by the query so that it is easier to read.
      Basic Syntax: db.collection.find().pretty();
      Query: Display the details of students.
      Syntax: db.student.find().pretty();
  • 24. Update (CRUD) Operations Update operations modify existing documents in a collection.
      Basic Syntax: db.collection_Name.update(selection_criteria, updated_data)
      Query: Update the name of "Gaurav" to "Gorav".
      Syntax: db.student.update({studentName: "Gaurav"}, {$set: {studentName: "Gorav"}})
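The effect of the update above can be modelled in plain JavaScript. This is only a sketch of the semantics, not MongoDB's implementation: `updateFirst` is a hypothetical helper, the array stands in for the "student" collection, and it assumes (as update() does by default) that only the first matching document is changed.

```javascript
// In-memory stand-in for the "student" collection (sample data from the deck).
const students = [
  { studentName: "Gaurav", section: "A", Marks: 90, subject: ["English"] },
  { studentName: "Vijay", section: "A", Marks: 70, subject: ["Hindi", "English", "Math"] },
];

// Hypothetical helper: apply a $set-style change to the first document
// whose fields equal every key/value pair in `filter`.
function updateFirst(docs, filter, set) {
  const doc = docs.find((d) =>
    Object.entries(filter).every(([key, value]) => d[key] === value)
  );
  if (doc) Object.assign(doc, set); // overwrite only the fields named in `set`
  return doc ? 1 : 0; // number of documents modified
}

const modified = updateFirst(students, { studentName: "Gaurav" }, { studentName: "Gorav" });
console.log(modified, students[0].studentName);
```

Fields that are not named in the `set` argument (section, Marks, subject) are left untouched, which is the point of using $set rather than replacing the whole document.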
  • 25. Remove (CRUD) Operations 1. drop(): To delete the collection 2. remove(): To delete documents from a collection
      1. drop()
      Basic Syntax: db.collection_name.drop()
      Query: Delete the collection "student".
      Syntax: db.student.drop();
      2. remove()
      Basic Syntax: db.collection_name.remove(Deletion_Criteria)
      Query: Delete the details of student "Gorav".
      Syntax: db.student.remove({studentName: "Gorav"});
  • 26. 2. Aggregation ● Aggregation operations process multiple documents and return computed results. ● The key element in aggregation is the pipeline. ● A pipeline is a sequence of aggregation stages, each of which can use aggregate operators. ● There are several aggregate pipeline operators like $max, $min, $avg etc. ● The deck counts 32 different pipeline stages (the exact number depends on the MongoDB version), e.g. $project, $match, $group etc. The basic syntax of the aggregate() method is as follows: db.Collection_Name.aggregate(pipeline)
  • 27. Following is a list of some aggregate operators (Operator: Description)
      $add: Adds numbers to return the sum, or adds numbers and a date to return a new date.
      $in: Returns a boolean indicating whether the specified value is in the array.
      $min: Gets the minimum value from all the documents.
      $max: Gets the maximum value from all the documents.
      $count: Counts the total number of documents.
      $avg: Calculates the average of all given values from all documents.
      $first: Gets the first document from the grouping.
      $last: Gets the last document from the grouping.
  • 28. Following is a list of some aggregate stages (Stage: Description)
      $match: Filters the document stream to allow only matching documents to pass into the next pipeline stage.
      $project: Reshapes each document in the stream; shows only selected fields of documents.
      $group: Groups input documents by a specified identifier expression and applies the accumulator expression(s), if specified, to each group.
      $limit: Passes the first n documents unmodified to the pipeline, where n is the specified limit.
      $lookup: Performs a left outer join to another collection in the same database to filter in documents from the "joined" collection for processing.
      $count: Returns a count of the number of documents at this stage of the aggregation pipeline.
      $merge: Writes the resulting documents of the aggregation pipeline to a collection.
      $unwind: Deconstructs an array field from the input documents to output a document for each element.
  • 29. Examples Following are three popular stages in the aggregation framework: 1) $match: This is a filtering operation, so it can reduce the number of documents that are given as input to the next stage.
      Basic Syntax: { $match: { <query> } }
      Query 1: Display the details of students who belong to section "A".
      Syntax: db.student.aggregate([{"$match": {"section": "A"}}])
      Query 2: Display the details of students who belong to section "A" and have marks greater than 80.
      Syntax: db.student.aggregate([{"$match": {$and: [{"section": "A"}, {Marks: {"$gt": 80}}]}}])
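To make the filtering behaviour of $match concrete, the two queries can be mirrored with Array.prototype.filter on an in-memory array. This is a plain-JavaScript sketch that needs no MongoDB server; the array is an assumed stand-in for the "student" collection from the sample database.

```javascript
// In-memory stand-in for the "student" collection (sample data from the deck).
const students = [
  { studentName: "Gaurav", section: "A", Marks: 90, subject: ["English"] },
  { studentName: "Vijay", section: "A", Marks: 70, subject: ["Hindi", "English", "Math"] },
];

// Query 1 equivalent: { $match: { section: "A" } }
const sectionA = students.filter((doc) => doc.section === "A");

// Query 2 equivalent: { $match: { $and: [{ section: "A" }, { Marks: { $gt: 80 } }] } }
const sectionAOver80 = students.filter(
  (doc) => doc.section === "A" && doc.Marks > 80
);

console.log(sectionA.map((doc) => doc.studentName));       // [ 'Gaurav', 'Vijay' ]
console.log(sectionAOver80.map((doc) => doc.studentName)); // [ 'Gaurav' ]
```

Only the documents that pass the filter would reach the next pipeline stage, which is why placing $match early keeps later stages cheap.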
  • 30. 2) $project: Used to select specific fields from the documents of a collection.
      Basic Syntax: { $project: { <specification(s)> } }
      Query 1: Display the name, section and marks of all the students.
      Syntax: db.student.aggregate([{"$project": {studentName: 1, section: 1, Marks: 1}}])
      Query 2: Display the names and marks of students from section A.
      Syntax: db.student.aggregate([{"$match": {"section": "A"}}, {"$project": {studentName: 1, Marks: 1}}])
  • 31. 3) $group: This stage performs the actual aggregation: it groups documents and applies accumulators to each group.
      Basic Syntax: { $group: { _id: <expression>, // Group By Expression <field1>: { <accumulator1> : <expression1> }, ... } }
      Query 1: Find the total marks of each section.
      Syntax: db.student.aggregate([{"$group": {"_id": {"section": "$section"}, "Total Marks": {"$sum": "$Marks"}}}])
      Query 2: Find the total marks of section A.
      Syntax: db.student.aggregate([{"$match": {section: "A"}}, {"$group": {"_id": {"section": "$section"}, "Total Marks": {"$sum": "$Marks"}}}])
  • 32. Query 3: Find the total and average marks of each section.
      Syntax: db.student.aggregate([{"$group": {"_id": {"section": "$section"}, "Total Marks": {"$sum": "$Marks"}, "Count": {"$sum": 1}, "Average": {"$avg": "$Marks"}}}])
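What $group computes in Query 3 can be sketched in plain JavaScript as a single pass that keeps a running total and count per section. The helper `groupBySection` and the section "B" student are hypothetical, added only so that more than one group appears; MongoDB's own implementation is not shown here.

```javascript
// In-memory stand-in for the "student" collection.
const students = [
  { studentName: "Gaurav", section: "A", Marks: 90 },
  { studentName: "Vijay", section: "A", Marks: 70 },
  { studentName: "Asha", section: "B", Marks: 60 }, // hypothetical extra student
];

// Hypothetical helper mirroring:
// { $group: { _id: { section: "$section" }, "Total Marks": { $sum: "$Marks" },
//             Count: { $sum: 1 }, Average: { $avg: "$Marks" } } }
function groupBySection(docs) {
  const groups = new Map(); // section -> running accumulators
  for (const doc of docs) {
    const acc = groups.get(doc.section) ?? { total: 0, count: 0 };
    acc.total += doc.Marks; // $sum: "$Marks"
    acc.count += 1;         // $sum: 1
    groups.set(doc.section, acc);
  }
  return [...groups].map(([section, acc]) => ({
    _id: { section },
    "Total Marks": acc.total,
    Count: acc.count,
    Average: acc.total / acc.count, // $avg: "$Marks"
  }));
}

const result = groupBySection(students);
console.log(result);
```

Note that `{ $sum: 1 }` is simply a counter: it adds 1 for every document that falls into the group.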
  • 33. 3. Indexing ● An index in MongoDB is a special data structure that holds the values of the few document fields on which the index is created. ● MongoDB uses a B-Tree data structure to store indexes. ● Indexes improve the speed of search operations because, instead of scanning every document in the collection, the search is performed on the index, which holds only a few fields. ● On the other hand, having too many indexes can hamper the performance of insert, update and delete operations because of the additional writes and the additional storage space used by indexes.
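The idea behind an index can be illustrated with a small plain-JavaScript sketch. It uses a sorted array with binary search as a simplified stand-in for a B-Tree, so it shows the principle (search a small ordered structure, then jump to the document) rather than MongoDB's actual storage format. The helper names and the student "Asha" are hypothetical.

```javascript
// In-memory stand-in for the "student" collection.
const students = [
  { studentName: "Vijay", Marks: 70 },
  { studentName: "Gaurav", Marks: 90 },
  { studentName: "Asha", Marks: 60 }, // hypothetical extra student
];

// Build an ascending index on one field: sorted { key, pos } entries,
// where pos is the document's position in the collection.
function buildIndex(docs, field) {
  return docs
    .map((doc, pos) => ({ key: doc[field], pos }))
    .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// Binary search on the index instead of scanning every document.
function findByIndex(docs, index, key) {
  let lo = 0;
  let hi = index.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].key === key) return docs[index[mid].pos];
    if (index[mid].key < key) lo = mid + 1;
    else hi = mid - 1;
  }
  return null; // no document has this key
}

const nameIndex = buildIndex(students, "studentName");
const found = findByIndex(students, nameIndex, "Gaurav");
console.log(found);
```

The write-side cost mentioned above shows up here too: every insert, update or delete on the collection would also have to keep `nameIndex` sorted and up to date.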
  • 34. 1. Creating an index in MongoDB
      Basic Syntax: db.collection_name.createIndex({field_name: 1 or -1})
      The value 1 is for ascending order and -1 is for descending order.
      Query: Create an index on the student name.
      Syntax: db.student.createIndex({"studentName": 1})
      2. Finding the indexes in a collection: the getIndexes() method lists all the indexes created on a collection.
      Basic Syntax: db.collection_name.getIndexes()
      Query: Get all indexes on the student collection.
      Syntax: db.student.getIndexes()
  • 35. 3. Dropping indexes in a collection: the dropIndex() method is used for this purpose.
      Basic Syntax: db.collection_name.dropIndex({field_name: 1})
      Query: Drop the index on the student name.
      Syntax: db.student.dropIndex({studentName: 1})
  • 36. Sharding ● Sharding is the process of distributing data across multiple servers for storage. ● It is MongoDB's approach to meeting the demands of data growth. ● Sharding is used to achieve horizontal scaling. With sharding, more machines are added to meet the demands of read and write operations. Characteristics of sharding: ● More servers can be added to a database, and data and load are automatically balanced across the servers. ● It splits the data set and distributes it across multiple databases, or shards. Each shard serves as an independent database, and together the shards make up a single logical database.
  • 37. Example of sharding: If a 1-terabyte data set is distributed among 4 shards, then each shard may hold only 256 GB of data. If the cluster contains 40 shards, then each shard will hold only about 25 GB of data. Source: https://i.stack.imgur.com/zCOvb.png
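The arithmetic in this example, together with a toy illustration of how a document key might be routed to a shard, can be sketched in plain JavaScript. The routing function is an assumption made for illustration: it uses a simple string hash taken modulo the shard count, which is not MongoDB's actual hashed-sharding algorithm, and it assumes 1 TB = 1024 GB as the slide does.

```javascript
const TOTAL_GB = 1024; // 1 terabyte, counted as 1024 GB

// Data per shard when the data set is spread evenly.
const perShard = (shardCount) => TOTAL_GB / shardCount;

// Toy router (hypothetical): hash the shard key, then take it modulo the shard count.
function shardFor(key, shardCount) {
  let hash = 0;
  for (const ch of String(key)) {
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash % shardCount;
}

console.log(perShard(4));  // 256
console.log(perShard(40)); // 25.6
console.log(shardFor("Gaurav", 4)); // some shard number between 0 and 3
```

The same key always maps to the same shard, so a router can send a query for that key to a single shard instead of asking all of them.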
  • 38. Components of a sharded cluster: 1. shard: Each shard contains a subset of the sharded data. Each shard can be deployed as a replica set. 2. mongos: The mongos acts as a query router, providing an interface between client applications and the sharded cluster. 3. config servers: Config servers store metadata and configuration settings for the cluster. Source: https://docs.mongodb.com/manual/sharding/
  • 39. References
      1. MongoDB website, https://www.mongodb.com
      2. Dan Sullivan. "NoSQL for Mere Mortals", 1st Edition, United States of America: Pearson Education, 2015.
      3. "Top 5 considerations when evaluating NoSQL databases", MongoDB white paper, 2015. (https://www.mongodb.com/collateral/top-5-considerations-when-evaluating-nosql-databases)
      4. Davoudian, Ali, Liu Chen, and Mengchi Liu. "A survey on NoSQL stores." ACM Computing Surveys (CSUR) 51, no. 2 (2018): 1-43.
      5. Cattell, R. "Scalable SQL and NoSQL data stores." ACM SIGMOD Record, 39(4), pp. 12-27, 2015.
      6. Floratou, Avrilia, Nikhil Teletia, David J. DeWitt, Jignesh M. Patel, and Donghui Zhang. "Can the Elephants Handle the NoSQL Onslaught?" Proceedings of the VLDB Endowment 5, no. 12 (2012).
      7. Patel, Jignesh M. "Operational NoSQL Systems: What's New and What's Next?" Computer 49, no. 4 (2016): 23-30.
      8. Abadi, Daniel, Rakesh Agrawal, Anastasia Ailamaki, Magdalena Balazinska, Philip A. Bernstein, Michael J. Carey, Surajit Chaudhuri et al. "The Beckman report on database research." Communications of the ACM 59, no. 2 (2016): 92-99.