© 2011 Xpanxion all rights reserved
GLOBAL SOFTWARE ENGINEERING EXCELLENCE
MongoDB
<Version 5.1>
17 April 2013
Internal
<Internal Restricted/Confidential(when filled) >
- Sachin Bhosale
The Evolution of Databases
• 1990: RDBMS
• 2000: RDBMS, OLAP/BI
• 2010: RDBMS, NoSQL, OLAP/BI, Hadoop
(Diagram categories: Operational Data, Data Warehouse)
Big Data
• "Big Data" describes data sets so large and complex that they are impractical
to manage with traditional software tools. Big Data relates to data
creation, storage, retrieval and analysis that is remarkable in terms
of volume, velocity, and variety.
• Volume - A typical PC might have had 10 gigabytes of storage in 2000.
Today, Facebook ingests 500 terabytes of new data every day.
• Velocity - Clickstreams and ad impressions capture user behavior at
millions of events per second; high-frequency stock trading algorithms
reflect market changes within microseconds.
• Variety - Big Data isn't just numbers, dates, and strings. It
is also geospatial data, 3D data, audio and video, and unstructured
text, including log files and social media.
Big Data Technologies
                 Operational          Analytical
Latency          10 ms - 100 ms       1 min - 100 min
Concurrency      1000 - 100,000       1 - 10
Access Pattern   Writes and Reads     Reads
Queries          Selective            Unselective
Data Scope       Operational          Retrospective
End User         Customer             Data Scientist
Technology       NoSQL                MapReduce, MPP Database
Relational Database Challenges
Data Types
• Unstructured data
• Semi-structured data
• Polymorphic data
Volume of Data
• Petabytes of data
• Trillions of records
• Tens of millions of queries per second
Agile Development
• Iterative
• Short development cycles
• New workloads
New Architectures
• Horizontal scaling
• Commodity servers
• Cloud computing
NoSQL Categories
• Key-value: Redis
• Wide-column: Cassandra
• Document: MongoDB
• Graph: Neo4j
Which one is the best?
What is MongoDB?
• MongoDB is a ___________ database
  • Document
  • Open source
  • High performance
  • Horizontally scalable
  • Full featured
Document Database
• Not for .PDF & .DOC files
• A document is essentially an associative array
  • Document == JSON object
  • Document == PHP Array
  • Document == Python Dictionary
  • Document == Ruby Hash
• etc.
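As a rough illustration of the "document == associative array" idea, the sketch below builds a document as a Python dictionary and round-trips it through JSON using only the standard library. The field names and values (tags, address, the city) are made up for illustration, and MongoDB itself stores BSON rather than JSON text.

```python
import json

# A hypothetical student document: nested fields and arrays live in one record.
student = {
    "_id": 1,
    "name": "Sachin",
    "score": 110,
    "tags": ["nosql", "mongodb"],
    "address": {"city": "Pune", "zip": "411001"},
}

# Serialise to JSON text and parse it back; the structure survives unchanged.
as_json = json.dumps(student, sort_keys=True)
round_tripped = json.loads(as_json)

print(as_json)
print(round_tripped == student)
```

The nested address and the tags array travel with the record, which is the sense in which the document model can mean less work than spreading the same data over several tables.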
Open Source
• MongoDB is an open source project
  • On GitHub
  • Licensed under the AGPL
  • Commercial licenses available
• Started & sponsored by 10gen
High Performance
• Written in C++
• Extensive use of memory-mapped files,
i.e. read-through, write-through memory caching
• Runs nearly everywhere
• Data serialized as BSON (fast parsing)
• Full support for primary & secondary indexes
• Document model = less work
Horizontally Scalable
Full Featured
• Ad hoc queries
• Real-time aggregation
• Rich query capabilities
• Traditionally consistent
• Geospatial features
• Support for most programming languages
  • JavaScript, Python, Ruby, PHP, Perl, Java, Scala, C#, C, C++
• Flexible schema
MongoDB Installation
• Get the MongoDB distribution for your platform and version from
http://www.mongodb.org/downloads
• MongoDB requires a data folder to store its files. The default location for
the MongoDB data directory is C:\data\db (Windows) or /data/db (Linux)
• Running MongoDB
  Windows
  C:\mongodb\bin\mongod.exe --dbpath d:\test\data
  Linux
  ./bin/mongod --dbpath /data/mongodb
MongoDB Package Components - 1
• Core Processes
  • mongod
  • mongos
  • mongo
• Binary Import and Export Tools
  • mongodump
  • mongorestore
  • bsondump
  • mongooplog
MongoDB Package Components - 2
• Data Import and Export Tools
  • mongoimport
  • mongoexport
• Diagnostic Tools
  • mongostat
  • mongotop
  • mongosniff
  • mongoperf
• GridFS
  • mongofiles
Mongo Shell
• Embedded JavaScript interpreter
  • vars / functions / data structs + types
  • SpiderMonkey / V8
• Global functions and objects
  • ObjectId("...")
  • new Date()
  • Object.bsonsize()
• MongoDB driver exposed
  • db["collection"].find/count/update
  • short-hand for collections
• JSON-like stuff
  • Doesn't require quoted keys
  • Don't copy and paste too much
Terminology
Core MongoDB Operations (CRUD) - 1
• CREATE
  • insert() - the primary method for inserting one or more documents
  into a MongoDB collection
  db.studs.insert({_id : 1, name : "Sachin", score : 110})
  • save() - performs an insert if the document to save does not contain
  an _id field
  db.studs.save({name : "Sachin", score : 110})
• READ
  • find() - returns a cursor over the documents that match the query
  db.collection.find( <query>, <projection> )
  • findOne() - selects a single document from a collection and returns
  that document
Core MongoDB Operations (CRUD) - 2
• UPDATE
  • update() - updates a single document by default; with the multi
  option, update() updates all documents in the collection that match
  the query criteria
• Update Operators
  • Fields - $inc, $rename, $set, $unset
  • Array - $addToSet, $pop, $pullAll, $pull, $push
  • save() - performs a special type of update(), depending on the _id field
  of the specified document
• Examples
db.bios.update( { _id: 3 }, { $unset: { birth: 1 } }, { multi: true } )
db.bios.update( { _id: 1 }, { $set: { 'contribs.1': 'ALGOL 58' } } )
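The two examples above can be traced with a small pure-Python sketch of what `$set`, `$unset` and `$inc` do to a document, including the dotted path `contribs.1` that addresses an array element. The helper names (`_resolve`, `apply_update`, `update`) and the sample data are hypothetical; the sketch covers only a subset of the operators and does not create missing parent fields or pad arrays as the server would.

```python
def _resolve(doc, dotted):
    """Walk a dotted path such as 'contribs.1'; return (container, final key)."""
    *parents, last = dotted.split(".")
    node = doc
    for part in parents:
        node = node[int(part)] if isinstance(node, list) else node[part]
    return node, (int(last) if isinstance(node, list) else last)


def apply_update(doc, spec):
    """Apply a small subset of update operators to one document, in place."""
    for path, value in spec.get("$set", {}).items():
        node, key = _resolve(doc, path)
        node[key] = value
    for path in spec.get("$unset", {}):
        node, key = _resolve(doc, path)
        if isinstance(node, dict):
            node.pop(key, None)           # removing a missing field is a no-op
    for path, amount in spec.get("$inc", {}).items():
        node, key = _resolve(doc, path)
        current = node.get(key, 0) if isinstance(node, dict) else node[key]
        node[key] = current + amount
    return doc


def update(docs, query, spec, multi=False):
    """Update the first match, or every match when multi=True; return the count."""
    changed = 0
    for doc in docs:
        if all(doc.get(k) == v for k, v in query.items()):
            apply_update(doc, spec)
            changed += 1
            if not multi:
                break
    return changed


# Made-up sample documents for illustration only.
bios = [
    {"_id": 1, "contribs": ["Fortran", "ALGOL"], "birth": 1924},
    {"_id": 3, "contribs": ["Simula"], "birth": 1926, "views": 4},
]
update(bios, {"_id": 3}, {"$unset": {"birth": 1}}, multi=True)
update(bios, {"_id": 1}, {"$set": {"contribs.1": "ALGOL 58"}})
update(bios, {}, {"$inc": {"views": 1}}, multi=True)
print(bios)
```

The last call shows the effect of the multi option: with an empty query and `multi=True` every document is touched, whereas the default stops after the first match.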
Core MongoDB Operations (CRUD) - 3
• DELETE
  • remove() - deletes documents from a collection.
  db.collection.remove( <query>, <justOne> )
• Remove all documents
db.bios.remove()
• Remove a single document that matches a condition
db.bios.remove( { turing: true }, 1 )
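The difference between removing every match and removing just one can be sketched in plain Python as follows. The `remove` function and its `just_one` parameter are hypothetical names that imitate the `<query>, <justOne>` arguments above on an ordinary list; they are not a MongoDB or PyMongo call.

```python
def remove(docs, query=None, just_one=False):
    """Delete matching documents from the list in place; return how many went."""
    query = query or {}
    kept, removed = [], 0
    for doc in docs:
        matches = all(doc.get(k) == v for k, v in query.items())
        if matches and not (just_one and removed >= 1):
            removed += 1
        else:
            kept.append(doc)
    docs[:] = kept
    return removed


# Made-up sample documents for illustration only.
bios = [
    {"_id": 1, "turing": True},
    {"_id": 2, "turing": True},
    {"_id": 3, "turing": False},
]
print(remove(bios, {"turing": True}, just_one=True))   # removes only the first match
print(bios)
print(remove(bios))                                    # empty query matches everything
print(bios)
```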
Data Modeling
• Data in MongoDB has a flexible schema.
• Collections do not enforce document structure.
  • Documents in the same collection do not need to have the same set of
  fields or structure, and
  • common fields in a collection's documents may hold different types of
  data.
• As of this presentation (2013), MongoDB does not support
  • Joins - across multiple collections
  • Transactions - across multiple documents
Data Modeling Considerations
• The inherent properties and requirements of the application objects and the
relationships between them
• MongoDB data models must also reflect
  • how data will grow and change over time, and
  • the kinds of queries your application will perform
• These considerations and requirements force a number of multi-factored
decisions:
  • normalization and de-normalization
  • indexing strategy
  • representation of data in arrays in BSON
Data Modeling Decisions
Data modeling decisions involve determining how to structure the
documents to model the data effectively.
• Embedding
  • To de-normalize data, store two related pieces of data in a single
  document.
• Referencing
  • To normalize data, store references between two documents to
  indicate a relationship between the data represented in each
  document.
• Atomicity
  • MongoDB only provides atomic operations on the level of a single
  document
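The embedding and referencing choices above can be compared side by side with plain Python dictionaries. All names and values here (`user_embedded`, `users`, `posts`, `author_id`, the address) are invented for the illustration; the point is only the shape of the data, not any particular driver call.

```python
# Embedded (de-normalized): the address lives inside the user document,
# so one read returns everything and a single-document write covers it all.
user_embedded = {
    "_id": "u1",
    "name": "Sachin",
    "address": {"city": "Pune", "country": "India"},
}

# Referenced (normalized): each post stores only the author's _id, and the
# application resolves it with a second lookup.
users = {"u1": {"_id": "u1", "name": "Sachin"}}
posts = [
    {"_id": "p1", "title": "Hello MongoDB", "author_id": "u1"},
    {"_id": "p2", "title": "Data modeling", "author_id": "u1"},
]


def author_name(post):
    """Follow the reference by hand; this is the application-side 'join'."""
    return users[post["author_id"]]["name"]


print(user_embedded["address"]["city"])
print([(p["title"], author_name(p)) for p in posts])
```

Because atomicity is per document, the embedded form keeps related fields in one atomic unit, while the referenced form avoids duplicating the author in every post at the cost of the extra lookup.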
Aggregation
• MongoDB introduced the aggregation framework, which provides a
powerful and flexible set of tools for many data aggregation tasks
without having to use map-reduce
• While map-reduce is powerful, it is often more difficult than necessary for
many simple aggregation tasks, such as totaling or averaging field values.
db.collection.mapReduce()
• Pipeline Operators and Indexes
$match, $sort, $limit, $skip, $project, $unwind, $group
db.articles.aggregate(
    { $project : {
        author : 1,
        tags : 1
    } },
    { $unwind : "$tags" },
    { $group : {
        _id : { tags : "$tags" },
        authors : { $addToSet : "$author" }
    } }
)
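To show what each stage of the pipeline above contributes, the following sketch re-implements `$project`, `$unwind` and `$group` with `$addToSet` as three small Python functions over a list of dictionaries. The function names and the sample articles are assumptions for illustration; this imitates the data flow only and is not how the server executes a pipeline. The grouped values are sorted here purely to make the printed output stable, whereas `$addToSet` itself does not guarantee an order.

```python
def project(docs, fields):
    """$project: keep only the listed fields (plus _id, if present)."""
    keep = set(fields) | {"_id"}
    return [{k: v for k, v in d.items() if k in keep} for d in docs]


def unwind(docs, field):
    """$unwind: emit one copy of the document per element of an array field."""
    return [{**d, field: item} for d in docs for item in d.get(field, [])]


def group_add_to_set(docs, key, value):
    """$group with $addToSet: collect the distinct values seen for each key."""
    groups = {}
    for d in docs:
        groups.setdefault(d[key], set()).add(d[value])
    return [{"_id": {key: k}, value + "s": sorted(v)} for k, v in groups.items()]


# Made-up sample documents for illustration only.
articles = [
    {"_id": 1, "author": "ann", "tags": ["mongodb", "nosql"], "body": "..."},
    {"_id": 2, "author": "bob", "tags": ["nosql"], "body": "..."},
    {"_id": 3, "author": "ann", "tags": ["nosql"], "body": "..."},
]

stage1 = project(articles, ["author", "tags"])
stage2 = unwind(stage1, "tags")
result = group_add_to_set(stage2, "tags", "author")
for row in result:
    print(row)
```

Each stage consumes the previous stage's output, which is the pipeline idea: four unwound rows go into the group step and one row per distinct tag comes out.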
Blog Project with MongoDB
• A blogging application with the following functionality
  • Signup
  • New Post
  • Login
  • Logout
• It uses Python, the PyMongo driver, and MongoDB
Questions?
Thank You

MongoDB Introduction and Data Modelling

  • 1. © 2011 Xpanxion all rights reserved GLOBAL SOFTWARE ENGINEERING EXCELLENCE MongoDB <Version 5.1> 17 April 2013 Internal <Internal Restricted/Confidential(when filled) > - Sachin Bhosale
  • 2. © 2011 Xpanxion all rights reserved The Evolution of Databases 2010 RDBMS NoSQL OLAP/BI Hadoop 2000 RDBMS OLAP/BI 1990 RDBMS Operational Data Datawarehouse
  • 3. © 2011 Xpanxion all rights reserved Big Data  "Big Data" describes data sets so large and complex they are impractical to manage with traditional software tools. Big Data relates to data creation, storage, retrieval and analysis that is remarkable in terms of volume, velocity, and variety.  Volume - A typical PC might have had 10 gigabytes of storage in 2000. Today, Facebook ingests 500 terabytes of new data every day  Velocity - Clickstreams and ad impressions capture user behavior at millions of events per second; high-frequency stock trading algorithms reflect market changes within microseconds  Variety - Big Data data isn't just numbers, dates, and strings. Big Data is also geospatial data, 3D data, audio and video, and unstructured text, including log files and social media
  • 4. © 2011 Xpanxion all rights reserved Big Data Technologies Operational Analytical Latency 10 ms - 100 ms 1 min - 100 min Concurrency 1000 - 100,000 1 - 10 Access Pattern Writes and Reads Reads Queries Selective Unselective Data Scope Operational Retrospective End User Customer Data Scientist Technology NoSQL MapReduce, MPP Database
  • 5. © 2011 Xpanxion all rights reserved Relational Database Challenges Data Types • Unstructured data • Semi-structured data • Polymorphic data Volume of Data • Petabytes of data • Trillions of records • Tens of millions of queries per second Agile Development • Iterative • Short development cycles • New workloads New Architectures • Horizontal scaling • Commodity servers • Cloud computing
  • 6. © 2011 Xpanxion all rights reserved NOSQL Categories Redis Cassandra MongoDB Neo4j
  • 7. © 2011 Xpanxion all rights reserved Which one is the best?
  • 8. © 2011 Xpanxion all rights reserved What is MongoDB?  MongoDB is a ___________ database  Document  Open source  High performance  Horizontally scalable  Full featured
  • 9. © 2011 Xpanxion all rights reserved Document Database  Not for .PDF & .DOC files  A document is essentially an associative array  Document == JSON object  Document == PHP Array  Document == Python Dictionary  Document == Ruby Hash  etc
  • 10. © 2011 Xpanxion all rights reserved Open Source  MongoDB is an open source project  On GitHub  Licensed under the AGPL  Commercial licenses available  Started & sponsored by 10gen
  • 11. © 2011 Xpanxion all rights reserved High Performance  Written in C++  Extensive use of memory-mapped files i.e. read-through write-through memory caching.  Runs nearly everywhere  Data serialized as BSON (fast parsing)  Full support for primary & secondary indexes  Document model = less work
  • 12. © 2011 Xpanxion all rights reserved Horizontally Scalable
  • 13. © 2011 Xpanxion all rights reserved Full Featured  Ad Hoc queries  Real time aggregation  Rich query capabilities  Traditionally consistent  Geospatial features  Support for most programming languages  JavaScript, Python, Ruby, PHP, Perl, Java, Scala, C#, C, C++  Flexible schema
  • 14. © 2011 Xpanxion all rights reserved MongoDB Installation  Get the MongoDB distributions by platform and version from http://www.mongodb.org/downloads  MongoDB requires a data folder to store its files. The default location for the MongoDB data directory is C:datadb (Windows) or /data/db (Linux)  Running MongoDB Windows C:mongodbbinmongod.exe --dbpath d:testdata Linux ./bin/mongod --dbpath /data/mongodb
  • 15. © 2011 Xpanxion all rights reserved MongoDB Package Components - 1  Core Processes  mongod  mongos  mongo  Binary Import and Export Tools  mongodump  mongorestore  bsondump  Mongooplog
  • 16. © 2011 Xpanxion all rights reserved MongoDB Package Components - 2  Data Import and Export Tools  mongoimport  Mongoexport  Diagnostic Tools  mongostat  mongotop  mongosniff  Mongoperf  GridFS  mongofiles
  • 17. © 2011 Xpanxion all rights reserved Mongo Shell vars / functions / data structs + types Spidermonkey / V8 ObjectId("...") new Date() Object.bsonsize() db["collection"].find/count/update short-hand for collections Doesn't require quoted keys Don’t copy and paste too much Embedded Javascript Interpreter Global Functions and Objects MongoDB driver Exposed JSON-like stuff
  • 18. Terminology
  • 19. Core MongoDB Operations (CRUD) - 1
      • CREATE
          • insert() - the primary method for inserting one or more documents into a MongoDB collection
              db.studs.insert({_id : 1, name : "Sachin", score : 110})
          • save() - performs an insert if the document to save does not contain the _id field
              db.studs.save({name : "Sachin", score : 110})
      • READ
          • find() - returns a cursor over the documents that match the query
              db.collection.find( <query>, <projection> )
          • findOne() - selects a single document from a collection and returns that document
  • 20. Core MongoDB Operations (CRUD) - 2
      • UPDATE
          • update() - modifies a single document by default; with the multi option, update() modifies all documents in the collection that match the query criteria
          • Update Operators
              • Fields - $inc, $rename, $set, $unset
              • Array - $addToSet, $pop, $pullAll, $pull, $push
          • save() - performs a special type of update(), depending on the _id field of the specified document
      • Examples
          db.bios.update( { _id: 3 }, { $unset: { birth: 1 } }, { multi: true } )
          db.bios.update( { _id: 1 }, { $set: { 'contribs.1': 'ALGOL 58' } } )
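To make the field operators and the multi option concrete, here is a minimal plain-Python sketch of how `$set`, `$inc`, and `$unset` behave on a document held as a dict, including dotted paths such as `'contribs.1'`. The helper names `_walk`, `apply_update`, and `update` are hypothetical, the sample documents are made up, and edge cases the server handles (padding arrays, type errors, the array operators) are left out.

```python
def _walk(doc, path, create):
    """Return (container, key) for a dotted path such as 'contribs.1'."""
    parts = path.split(".")
    node = doc
    for part in parts[:-1]:
        if isinstance(node, list):
            node = node[int(part)]
        elif create:
            node = node.setdefault(part, {})
        else:
            node = node.get(part)
            if node is None:
                return None, None
    last = parts[-1]
    return node, (int(last) if isinstance(node, list) else last)


def apply_update(doc, change):
    """Apply $set, $inc and $unset to one document in place."""
    for path, value in change.get("$set", {}).items():
        node, key = _walk(doc, path, create=True)
        node[key] = value
    for path, amount in change.get("$inc", {}).items():
        node, key = _walk(doc, path, create=True)
        if isinstance(node, list):
            node[key] += amount
        else:
            node[key] = node.get(key, 0) + amount   # a missing field starts at 0
    for path in change.get("$unset", {}):
        node, key = _walk(doc, path, create=False)
        if isinstance(node, dict):
            node.pop(key, None)                     # a missing field is a no-op
    return doc


def update(docs, query, change, multi=False):
    """Update the first match, or every match when multi is true."""
    modified = 0
    for doc in docs:
        if all(doc.get(k) == v for k, v in query.items()):
            apply_update(doc, change)
            modified += 1
            if not multi:
                break
    return modified


# Made-up sample data mirroring the slide's two examples.
bios = [
    {"_id": 1, "contribs": ["Fortran", "ALGOL", "FP"]},
    {"_id": 3, "birth": "unknown"},
]
update(bios, {"_id": 3}, {"$unset": {"birth": 1}}, multi=True)
update(bios, {"_id": 1}, {"$set": {"contribs.1": "ALGOL 58"}})
```

In the second call the path `contribs.1` addresses the array element at index 1, so only that element is replaced and the rest of the array is untouched.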
  • 21. Core MongoDB Operations (CRUD) - 3
      • DELETE
          • remove() - deletes documents from a collection
              db.collection.remove( <query>, <justOne> )
          • Remove all documents
              db.bios.remove()
          • Remove a single document that matches a condition
              db.bios.remove( { turing: true }, 1 )
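The effect of the justOne flag can be sketched in a few lines of plain Python. The function `remove` below is a hypothetical stand-in operating on a list of dicts, not the shell method itself, and it matches on top-level equality only.

```python
def remove(docs, query=None, just_one=False):
    """Delete matching documents from the list in place; return the count removed."""
    query = query or {}          # an empty query matches every document
    kept, removed = [], 0
    for doc in docs:
        matches = all(doc.get(k) == v for k, v in query.items())
        if matches and not (just_one and removed >= 1):
            removed += 1         # drop this document
        else:
            kept.append(doc)
    docs[:] = kept
    return removed


# Made-up sample data for illustration.
bios = [{"_id": 1, "turing": True}, {"_id": 2, "turing": True}, {"_id": 3}]
remove(bios, {"turing": True}, just_one=True)   # drops only the first match
remove(bios)                                    # drops everything that is left
```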
  • 22. Data Modeling
      • Data in MongoDB has a flexible schema.
      • Collections do not enforce document structure:
          • documents in the same collection do not need to have the same set of fields or structure, and
          • common fields in a collection's documents may hold different types of data.
      • MongoDB does not support
          • Joins across multiple collections
          • Transactions across multiple documents
  • 23. Data Modeling Considerations
      • Inherent properties and requirements of the application objects and the relationships between them
      • MongoDB data models must also reflect
          • how data will grow and change over time, and
          • the kinds of queries your application will perform
      • These considerations and requirements force a number of multi-factored decisions:
          • normalization and de-normalization
          • indexing strategy
          • representation of data in arrays in BSON
  • 24. Data Modeling Decisions
      Data modeling decisions involve determining how to structure the documents to model the data effectively.
      • Embedding
          • To de-normalize data, store two related pieces of data in a single document.
      • Referencing
          • To normalize data, store references between two documents to indicate a relationship between the data represented in each document.
      • Atomicity
          • MongoDB only provides atomic operations on the level of a single document
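As an illustration of the embedding versus referencing choice, the sketch below models a blog post and its comments both ways using plain Python dicts. The field names (`post_id`, `comments`, `author`, `body`), the helper functions, and the data are all made up for this example; nothing here calls MongoDB.

```python
# Embedding (de-normalized): the post carries its comments, so one read
# returns everything and a change to the post plus comments touches one document.
post_embedded = {
    "_id": 1,
    "title": "Hello",
    "comments": [
        {"author": "a", "body": "First"},
        {"author": "b", "body": "Second"},
    ],
}

# Referencing (normalized): comments are separate documents that point
# back at their post through a post_id field.
posts = [{"_id": 1, "title": "Hello"}]
comments = [
    {"_id": 10, "post_id": 1, "author": "a", "body": "First"},
    {"_id": 11, "post_id": 1, "author": "b", "body": "Second"},
    {"_id": 12, "post_id": 2, "author": "c", "body": "Elsewhere"},
]


def comments_for(post_id, all_comments):
    """Application-side lookup that stands in for a join."""
    return [c for c in all_comments if c["post_id"] == post_id]


def embed(post, all_comments):
    """Build the embedded shape from the referenced shape."""
    nested = [
        {k: v for k, v in c.items() if k not in ("_id", "post_id")}
        for c in comments_for(post["_id"], all_comments)
    ]
    return dict(post, comments=nested)
```

With referencing, the application performs the second lookup itself, and because atomicity stops at a single document, a change spanning a post and its separate comment documents is not atomic. Embedding avoids both at the cost of larger documents.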
  • 25. Aggregation
      • MongoDB introduced the aggregation framework, which provides a powerful and flexible set of tools for many data aggregation tasks without having to use map-reduce
      • While map-reduce is powerful, it is often more difficult than necessary for simple aggregation tasks such as totaling or averaging field values.
          db.collection.mapReduce()
      • Pipeline Operators and Indexes
          • $match, $sort, $limit, $skip, $project, $unwind, $group
          db.articles.aggregate(
            { $project : { author : 1, tags : 1 } },
            { $unwind : "$tags" },
            { $group : { _id : { tags : "$tags" }, authors : { $addToSet : "$author" } } }
          )
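The three-stage pipeline above can be traced by hand with a plain-Python approximation. The functions `project`, `unwind`, and `group_add_to_set` are hypothetical helpers that imitate what each stage does to a list of dicts; they are not the aggregation framework, and the sample articles are made up. One difference to keep in mind: the server does not promise any ordering for `$addToSet` results, while this sketch keeps first-seen order.

```python
def project(docs, fields):
    """Rough analogue of $project: keep _id plus the listed fields."""
    keep = ["_id"] + list(fields)
    return [{k: d[k] for k in keep if k in d} for d in docs]


def unwind(docs, field):
    """Rough analogue of $unwind: emit one document per array element."""
    return [dict(d, **{field: item}) for d in docs for item in d.get(field, [])]


def group_add_to_set(docs, key_field, value_field, out_field):
    """Rough analogue of $group with $addToSet: collect distinct values per key."""
    groups = {}
    for d in docs:
        bucket = groups.setdefault(d[key_field], [])
        if d[value_field] not in bucket:
            bucket.append(d[value_field])
    return [{"_id": {key_field: key}, out_field: values}
            for key, values in groups.items()]


# Made-up sample data for illustration.
articles = [
    {"_id": 1, "title": "t1", "author": "bob", "tags": ["fun", "good"]},
    {"_id": 2, "title": "t2", "author": "dave", "tags": ["fun"]},
    {"_id": 3, "title": "t3", "author": "bob", "tags": ["fun"]},
]

projected = project(articles, ["author", "tags"])   # drops title
unwound = unwind(projected, "tags")                 # 4 documents, one per tag
result = group_add_to_set(unwound, "tags", "author", "authors")
```

Each stage consumes the previous stage's output, which is the core idea of the pipeline: the result holds one document per tag with the distinct authors who used it.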
  • 26. Blog Project with MongoDB
      • Blogger with the following functionality
          • Signup
          • New Post
          • Login
          • Logout
      • It uses Python, the PyMongo driver, and MongoDB
  • 27. Questions?
  • 28. Thank You