NoSQL Database
Akshay Mathur
Sarang Shravagi
@akshaymathu, @_sarangs
{name: 'mongo', type: 'db'}
Who uses MongoDB
Let’s Know Each Other
• Do you code?
• OS?
• Programming Language?
• Why are you attending?
Akshay Mathur
• Managed development, testing and
release teams over the last 14+ years
– Currently Principal Architect at ShopSocially
• Founding Team Member of
– ShopSocially (Enabling “social” for retailers)
– AirTight Networks (Global leader of WIPS)
Sarang Shravagi
• 10gen Certified Developer and DBA
• CS graduate from PICT Pune
• 3+ years in Software Product industry
• Currently Senior Full-stack Developer at
ShopSocially
How we use MongoDB
Python MongoDB
MongoEngine
Where MongoDB Fits
Program Outline: Understanding NoSQL
• Data Landscape
• Different Storage Needs
• Design Paradigm Shift from SQL to
NoSQL
• Different Datastores
• Closer look at Document Storage
• Drawing parallels with RDBMS
Program Outline: Hands on Lab
• Installation and basic configuration
• Mongo Shell
• Creating and Changing Schema
• Create, Read, Update and Delete of Data
• Analyzing Performance
• Improving performance by creating Indices
• Assignment
• Problem solving for the assignment
Program Outline: Advanced Topics
• Handling Big Data
– Introduction to Map/Reduce
– Introduction to Data Partitioning (Sharding)
• Disaster Recovery
– Introduction to Replica set and High
Availability
Ground Rules
• Disturb Everyone
– Not with ringing phones
– Not with side conversations
– But with more information
and questions
Data Patterns & Storage Needs
Data at an Online Store
• Product Information
• User Information
• Purchase Information
• Product Reviews
• Site Interactions
• Social Graph
• Search Index
SQL to NoSQL
Design Paradigm Shift
SQL Storage
• Was designed when
– Storage and data transfer were costly
– Processing was slow
– Applications were oriented more towards data
collection
• Initial adopters were financial institutions
SQL Storage
• Structured
– schema
• Relational
– foreign keys, constraints
• Transactional
– Atomicity, Consistency, Isolation, Durability
• High Availability through robustness
– Minimize failures
• Optimized for Writes
• Typically Scale Up
NoSQL Storage
• Is designed when
– Storage is cheap
– Data transfer is fast
– Much more processing power is available
• Clustering of machines is also possible
– Applications are oriented towards
consumption of User Generated Content
– Better on-screen user experience is in
demand
NoSQL Storage
• Semi-structured
– Schemaless
• Consistency, Availability, Partition
Tolerance
• High Availability through clustering
– expect failures
• Optimized for Reads
• Typically Scale Out
Different Datastores
Half Level Deep
SQL: RDBMS
• MySQL, PostgreSQL, Oracle etc.
• Stores data in tables having columns
– Basic (number, text) data types
• Strong query language
• Transparent values
– Query language can read and filter on them
– Relationship between tables based on values
• Suited for user info and transactions
NoSQL: Key/Value
• Redis, DynamoDB etc.
• Stores a value against a key
– Strings
• Values are opaque
– Cannot be part of a query
• Suited for site interactions
NoSQL: Key/Value
NoSQL: Document
• MongoDB, CouchDB etc.
• Object Oriented data models
– Stores data in document objects having fields
– Basic and compound (list, dict) data types
• SQL-like queries
• Transparent values
– Can be part of query
• Suited for product info and its reviews
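As a rough illustration of the document model described above, a product and its reviews can live in one document with both basic and compound fields. The sketch below uses a plain Python dict rather than MongoDB itself; the field names (`name`, `price`, `tags`, `reviews`) and the helper functions are invented for illustration.

```python
# Hypothetical product document: field names and values are invented.
product = {
    "name": "Trail Shoe",
    "price": 59.0,                      # basic type
    "tags": ["outdoor", "running"],     # compound type: list
    "reviews": [                        # compound type: list of sub-documents
        {"user": "asha", "rating": 5},
        {"user": "ravi", "rating": 3},
    ],
}


def matches(doc, field, value):
    """Return True when doc[field] equals value, or contains it if the field is a list."""
    current = doc.get(field)
    if isinstance(current, list):
        return value in current
    return current == value


def average_rating(doc):
    """Average the ratings stored in the embedded reviews, or None if there are none."""
    ratings = [review["rating"] for review in doc.get("reviews", [])]
    return sum(ratings) / len(ratings) if ratings else None
```

Because the values are transparent, a filter such as `matches(product, "tags", "running")` can look inside the document, which an opaque key/value store would not allow.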
NoSQL: Document
NoSQL: Column Family
• Cassandra, Bigtable etc.
• Stores data in columns
• Transparent values
– Can be part of query
• SQL-like queries
• Suited for search
NoSQL: Column Family
NoSQL: Graph
• Neo4j
• Stores data in form of nodes and
relationships
• Query is in form of traversal
• In-memory
• Suited for social graph
NoSQL: Graph
Document Storage: Closer Look
MongoDB
• Document database
• Powerful query language
• Docs, sub-docs, indexes
• Map/reduce
• Replicas, shards, replicated shards
• SDKs/drivers for many languages
– C, C++, C#, Python, Erlang, PHP, Java, JavaScript, Node.js, Perl,
Ruby, Scala
RDBMS: DB Design
RDBMS: Query
RDBMS  MongoDB
RDBMS MongoDB
Database Database
Table Collection
Row Document
Column Field
Select c1, c2 from Table where c1 = ‘v1’
order by c2 limit n
Collection.objects(F1 =
‘v1’).order_by(‘c2’).limit(n)
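To make the mapping concrete, here is a minimal plain-Python sketch of what the query above asks for: filter on `c1`, order by `c2`, keep the first `n` results and return only `c1` and `c2`. It is not MongoEngine or MongoDB code and needs no server; the sample rows and the `select` helper are invented for illustration.

```python
# Plain-Python sketch of: SELECT c1, c2 FROM Table WHERE c1 = 'v1' ORDER BY c2 LIMIT n
# The sample rows are invented for illustration.
rows = [
    {"c1": "v1", "c2": 30, "c3": "x"},
    {"c1": "v2", "c2": 10, "c3": "y"},
    {"c1": "v1", "c2": 20, "c3": "z"},
    {"c1": "v1", "c2": 40, "c3": "w"},
]


def select(documents, where_value, limit):
    """Filter on c1, sort by c2, keep the first `limit` rows, project c1 and c2."""
    matching = [doc for doc in documents if doc["c1"] == where_value]
    ordered = sorted(matching, key=lambda doc: doc["c2"])
    return [{"c1": doc["c1"], "c2": doc["c2"]} for doc in ordered[:limit]]


result = select(rows, "v1", 2)
```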
MongoDB: Design
MongoDB: Query
• Movies.objects()
Have you Installed?
http://www.mongodb.org/downloads
Hands-on
Dive-in with Sarang
MongoDB: Core Binaries
• mongod
– Database server
• mongo
– Database client shell
• mongos
– Router for Sharding
Getting Help
• For mongo shell
– mongo --help
• Shows options available for running the shell
• Inside mongo shell
– <object>.help(), for example db.help()
• Shows commands available on the object
Import Export Tools
• For objects
– mongodump
– mongorestore
– bsondump
– mongooplog
• For data items
– mongoimport
– mongoexport
Database Operations
• Database creation
• Creating/changing collection
• Data insertion
• Data read
• Data update
• Creating indices
• Data deletion
• Dropping collection
Diagnostic Tools
• mongostat
• mongoperf
• mongosniff
• mongotop
Assignment
• Go to http://www.velocitainc.com/mongo/
– Tasks
• assignments.txt
– Data
• students.json
Disaster Recovery
Introduction to Replica Sets and
High Availability
Disasters
• Physical Failure
– Hardware
– Network
• Solution
– Replica Sets
• Provide redundant storage for High Availability
– Real time data synchronization
• Automatic failover for near-zero downtime
Replication
Multi Replication
• Data can be replicated to multiple places
simultaneously
• An odd number of voting members is needed in a replica set,
so that elections can produce a clear majority
Single Replication
• If you want only one secondary (or any odd number of
secondaries), you need to set up an arbiter so that the
number of voting members stays odd
Failover
• When the primary fails, the remaining members
vote to elect a new primary
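The voting rule can be sketched in a few lines of plain Python. This is a simplified model of majority voting, not MongoDB's election protocol; the function names are invented for illustration.

```python
def majority(voting_members):
    """Smallest number of votes that is more than half of the voting members."""
    return voting_members // 2 + 1


def can_elect_primary(voting_members, reachable_members):
    """A new primary can be elected only if a majority of voters can reach each other."""
    return reachable_members >= majority(voting_members)


def tolerated_failures(voting_members):
    """How many voting members can fail while an election is still possible."""
    return voting_members - majority(voting_members)
```

Under this model a set of 3 voters and a set of 4 voters both survive exactly one failure, so the fourth machine adds cost without adding fault tolerance. A primary with a single secondary (2 voters) cannot elect a new primary after one failure, which is where an arbiter as a third voter helps.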
Handling Big Data
Introduction to Map/Reduce
and Sharding
Large Data Sets
• Problem 1
– Performance
• Queries run slowly
• Solution
– Map/Reduce
Map Reduce
• A way to divide large query computation
into smaller chunks
• May run in multiple processes across
multiple machines
• Think of it as GROUP BY of SQL
Map/Reduce Example
• The map function scans the data and emits
the required values
Map/Reduce Example
• The reduce function takes the output of the map
function and generates an aggregated value
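The two steps can be sketched in plain Python. This is a single-process illustration of the map/reduce idea, not MongoDB's own map/reduce API; the purchase documents and function names are invented for illustration.

```python
from collections import defaultdict

# Invented sample data: one document per purchase.
purchases = [
    {"user": "asha", "amount": 20},
    {"user": "ravi", "amount": 5},
    {"user": "asha", "amount": 15},
]


def map_purchase(doc):
    """Map step: emit (key, value) pairs for one document."""
    yield doc["user"], doc["amount"]


def reduce_amounts(key, values):
    """Reduce step: aggregate all values emitted for one key."""
    return sum(values)


def map_reduce(documents, map_fn, reduce_fn):
    """Group mapped values by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


totals = map_reduce(purchases, map_purchase, reduce_amounts)
```

The result is the total amount per user, the same shape of answer a SQL `GROUP BY user` with `SUM(amount)` would give. Since each document is mapped independently, the map step is the part that can be spread across processes or machines.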
Large Data Sets
• Problem 2
– Vertical Scaling of Hardware
• Can’t increase machine size beyond a limit
• Solution
– Sharding
Sharding
• A method for storing data across multiple
machines
• Data is partitioned using Shard Keys
Data Partitioning: Range Based
• Each chunk holds a contiguous range of shard key values
Data Partitioning: Hash Based
• A hash function applied to the shard key decides the chunk
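The difference between the two partitioning schemes can be sketched in plain Python. The split points, the chunk count and the use of MD5 are assumptions chosen for illustration; this is not MongoDB's actual hash function or chunk mapping.

```python
import bisect
import hashlib


def range_chunk(shard_key, boundaries):
    """Range based: chunk i holds keys from boundaries[i-1] up to (not including) boundaries[i]."""
    return bisect.bisect_right(boundaries, shard_key)


def hash_chunk(shard_key, chunk_count):
    """Hash based: a stable hash of the key picks the chunk."""
    digest = hashlib.md5(str(shard_key).encode("utf-8")).hexdigest()
    return int(digest, 16) % chunk_count


# Assumed split points: keys < 100 go to chunk 0, 100..199 to chunk 1, >= 200 to chunk 2.
boundaries = [100, 200]
```

With range-based partitioning, neighbouring keys such as 100 and 101 land in the same chunk, which suits range queries but can concentrate inserts of ever-increasing keys in one chunk. Hashing tends to scatter neighbouring keys across chunks, which spreads writes more evenly at the cost of range locality.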
Sharded Cluster
Optimizing Shards: Splitting
• In a shard, when a chunk grows beyond the
configured size, the chunk is divided into two
Optimizing Shards: Balancing
• When the number of chunks in a shard
increases, a few chunks are migrated to
another shard
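Splitting and balancing can be sketched together in plain Python. This is a toy model, assuming chunks are lists of keys and shards are lists of chunks; the size limit, the "differ by at most one" rule and the function names are invented for illustration and do not reflect MongoDB's real thresholds.

```python
def split_chunk(chunk, max_size):
    """Split a sorted list of keys into two halves once it exceeds max_size."""
    if len(chunk) <= max_size:
        return [chunk]
    middle = len(chunk) // 2
    return [chunk[:middle], chunk[middle:]]


def balance(shards):
    """Move chunks from the fullest shard to the emptiest until counts differ by at most one."""
    shards = {name: list(chunks) for name, chunks in shards.items()}  # work on a copy
    while True:
        fullest = max(shards, key=lambda name: len(shards[name]))
        emptiest = min(shards, key=lambda name: len(shards[name]))
        if len(shards[fullest]) - len(shards[emptiest]) <= 1:
            return shards
        shards[emptiest].append(shards[fullest].pop())


balanced = balance({"shard1": ["a", "b", "c", "d"], "shard2": []})
```

In this model, splitting keeps individual chunks small enough to move cheaply, and balancing evens out how many chunks each shard holds.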
Summary
• MongoDB is good
– Stores objects as we use in programming
language
– Flexible semi-structured design
– Scales out to store big data
– Embedded documents eliminate the need for joins
• MongoDB is bad
– No multi-document query
– De-normalized storage
– No support for multi-document transactions (added later, in MongoDB 4.0)
Thanks
@akshaymathu, @_sarangs 66
@akshaymathu @_sarangs

More Related Content

What's hot

Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBMongoDB
 
Migrating from RDBMS to MongoDB
Migrating from RDBMS to MongoDBMigrating from RDBMS to MongoDB
Migrating from RDBMS to MongoDBMongoDB
 
9. Document Oriented Databases
9. Document Oriented Databases9. Document Oriented Databases
9. Document Oriented DatabasesFabio Fumarola
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with HadoopPhilippe Julio
 
Ozone and HDFS’s evolution
Ozone and HDFS’s evolutionOzone and HDFS’s evolution
Ozone and HDFS’s evolutionDataWorks Summit
 
NoSQL Databases
NoSQL DatabasesNoSQL Databases
NoSQL DatabasesBADR
 
Spark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataSpark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataVictor Coustenoble
 
MongoDB Internals
MongoDB InternalsMongoDB Internals
MongoDB InternalsSiraj Memon
 
The Basics of MongoDB
The Basics of MongoDBThe Basics of MongoDB
The Basics of MongoDBvaluebound
 
Apache Spark - Basics of RDD | Big Data Hadoop Spark Tutorial | CloudxLab
Apache Spark - Basics of RDD | Big Data Hadoop Spark Tutorial | CloudxLabApache Spark - Basics of RDD | Big Data Hadoop Spark Tutorial | CloudxLab
Apache Spark - Basics of RDD | Big Data Hadoop Spark Tutorial | CloudxLabCloudxLab
 
Introduction to Big Data & Hadoop Architecture - Module 1
Introduction to Big Data & Hadoop Architecture - Module 1Introduction to Big Data & Hadoop Architecture - Module 1
Introduction to Big Data & Hadoop Architecture - Module 1Rohit Agrawal
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBMike Dirolf
 
Hadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Hadoop World 2011: Advanced HBase Schema Design - Lars George, ClouderaHadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Hadoop World 2011: Advanced HBase Schema Design - Lars George, ClouderaCloudera, Inc.
 
Seminar Presentation Hadoop
Seminar Presentation HadoopSeminar Presentation Hadoop
Seminar Presentation HadoopVarun Narang
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez Hortonworks
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture EMC
 

What's hot (20)

Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Migrating from RDBMS to MongoDB
Migrating from RDBMS to MongoDBMigrating from RDBMS to MongoDB
Migrating from RDBMS to MongoDB
 
9. Document Oriented Databases
9. Document Oriented Databases9. Document Oriented Databases
9. Document Oriented Databases
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
 
Intro to HBase
Intro to HBaseIntro to HBase
Intro to HBase
 
Ozone and HDFS’s evolution
Ozone and HDFS’s evolutionOzone and HDFS’s evolution
Ozone and HDFS’s evolution
 
Key-Value NoSQL Database
Key-Value NoSQL DatabaseKey-Value NoSQL Database
Key-Value NoSQL Database
 
NoSQL Databases
NoSQL DatabasesNoSQL Databases
NoSQL Databases
 
Spark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataSpark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational Data
 
MongoDB Internals
MongoDB InternalsMongoDB Internals
MongoDB Internals
 
The Basics of MongoDB
The Basics of MongoDBThe Basics of MongoDB
The Basics of MongoDB
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Apache Spark - Basics of RDD | Big Data Hadoop Spark Tutorial | CloudxLab
Apache Spark - Basics of RDD | Big Data Hadoop Spark Tutorial | CloudxLabApache Spark - Basics of RDD | Big Data Hadoop Spark Tutorial | CloudxLab
Apache Spark - Basics of RDD | Big Data Hadoop Spark Tutorial | CloudxLab
 
Introduction to Big Data & Hadoop Architecture - Module 1
Introduction to Big Data & Hadoop Architecture - Module 1Introduction to Big Data & Hadoop Architecture - Module 1
Introduction to Big Data & Hadoop Architecture - Module 1
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Hadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Hadoop World 2011: Advanced HBase Schema Design - Lars George, ClouderaHadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Hadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
 
Hadoop Tutorial For Beginners
Hadoop Tutorial For BeginnersHadoop Tutorial For Beginners
Hadoop Tutorial For Beginners
 
Seminar Presentation Hadoop
Seminar Presentation HadoopSeminar Presentation Hadoop
Seminar Presentation Hadoop
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
 

Viewers also liked

MongoDB for Beginners
MongoDB for BeginnersMongoDB for Beginners
MongoDB for BeginnersEnoch Joshua
 
Intro To MongoDB
Intro To MongoDBIntro To MongoDB
Intro To MongoDBAlex Sharp
 
Connecting NodeJS & MongoDB
Connecting NodeJS & MongoDBConnecting NodeJS & MongoDB
Connecting NodeJS & MongoDBEnoch Joshua
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBRavi Teja
 
Mongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorials
Mongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorialsMongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorials
Mongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorialsSpringPeople
 
Intro to NoSQL and MongoDB
Intro to NoSQL and MongoDBIntro to NoSQL and MongoDB
Intro to NoSQL and MongoDBDATAVERSITY
 
MongoDB NoSQL database a deep dive -MyWhitePaper
MongoDB  NoSQL database a deep dive -MyWhitePaperMongoDB  NoSQL database a deep dive -MyWhitePaper
MongoDB NoSQL database a deep dive -MyWhitePaperRajesh Kumar
 
An Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDBAn Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDBLee Theobald
 
Mongo Presentation by Metatagg Solutions
Mongo Presentation by Metatagg SolutionsMongo Presentation by Metatagg Solutions
Mongo Presentation by Metatagg SolutionsMetatagg Solutions
 
Konsep oop pada php dan mvc pada php framework, 1200631047 1200631018 1200631028
Konsep oop pada php dan mvc pada php framework, 1200631047 1200631018 1200631028Konsep oop pada php dan mvc pada php framework, 1200631047 1200631018 1200631028
Konsep oop pada php dan mvc pada php framework, 1200631047 1200631018 1200631028iis dahlia
 
2013 Tips and Tricks Mashup, From ModelBuilder to Formal Python Code, Step-by...
2013 Tips and Tricks Mashup, From ModelBuilder to Formal Python Code, Step-by...2013 Tips and Tricks Mashup, From ModelBuilder to Formal Python Code, Step-by...
2013 Tips and Tricks Mashup, From ModelBuilder to Formal Python Code, Step-by...GIS in the Rockies
 

Viewers also liked (20)

MongoDB for Beginners
MongoDB for BeginnersMongoDB for Beginners
MongoDB for Beginners
 
Mongo DB
Mongo DBMongo DB
Mongo DB
 
Intro To MongoDB
Intro To MongoDBIntro To MongoDB
Intro To MongoDB
 
Connecting NodeJS & MongoDB
Connecting NodeJS & MongoDBConnecting NodeJS & MongoDB
Connecting NodeJS & MongoDB
 
Mongo DB
Mongo DBMongo DB
Mongo DB
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Mongo DB
Mongo DB Mongo DB
Mongo DB
 
Mongo db basics
Mongo db basicsMongo db basics
Mongo db basics
 
Pdf almas
Pdf almasPdf almas
Pdf almas
 
Mongo db basics
Mongo db basicsMongo db basics
Mongo db basics
 
Mongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorials
Mongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorialsMongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorials
Mongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorials
 
Mongo db
Mongo dbMongo db
Mongo db
 
Intro to NoSQL and MongoDB
Intro to NoSQL and MongoDBIntro to NoSQL and MongoDB
Intro to NoSQL and MongoDB
 
MongoDB NoSQL database a deep dive -MyWhitePaper
MongoDB  NoSQL database a deep dive -MyWhitePaperMongoDB  NoSQL database a deep dive -MyWhitePaper
MongoDB NoSQL database a deep dive -MyWhitePaper
 
An Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDBAn Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDB
 
Mongo Presentation by Metatagg Solutions
Mongo Presentation by Metatagg SolutionsMongo Presentation by Metatagg Solutions
Mongo Presentation by Metatagg Solutions
 
Administrasi MongoDB
Administrasi MongoDBAdministrasi MongoDB
Administrasi MongoDB
 
Konsep oop pada php dan mvc pada php framework, 1200631047 1200631018 1200631028
Konsep oop pada php dan mvc pada php framework, 1200631047 1200631018 1200631028Konsep oop pada php dan mvc pada php framework, 1200631047 1200631018 1200631028
Konsep oop pada php dan mvc pada php framework, 1200631047 1200631018 1200631028
 
2013 Tips and Tricks Mashup, From ModelBuilder to Formal Python Code, Step-by...
2013 Tips and Tricks Mashup, From ModelBuilder to Formal Python Code, Step-by...2013 Tips and Tricks Mashup, From ModelBuilder to Formal Python Code, Step-by...
2013 Tips and Tricks Mashup, From ModelBuilder to Formal Python Code, Step-by...
 
Nosql
NosqlNosql
Nosql
 

Similar to Mongo db

NoSQLDatabases
NoSQLDatabasesNoSQLDatabases
NoSQLDatabasesAdi Challa
 
Hadoop: The Default Machine Learning Platform ?
Hadoop: The Default Machine Learning Platform ?Hadoop: The Default Machine Learning Platform ?
Hadoop: The Default Machine Learning Platform ?Milind Bhandarkar
 
Challenges of Implementing an Advanced SQL Engine on Hadoop
Challenges of Implementing an Advanced SQL Engine on HadoopChallenges of Implementing an Advanced SQL Engine on Hadoop
Challenges of Implementing an Advanced SQL Engine on HadoopDataWorks Summit
 
NoSQL Simplified: Schema vs. Schema-less
NoSQL Simplified: Schema vs. Schema-lessNoSQL Simplified: Schema vs. Schema-less
NoSQL Simplified: Schema vs. Schema-lessInfiniteGraph
 
FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...
FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...
FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...Ashnikbiz
 
Microservices - Is it time to breakup?
Microservices - Is it time to breakup? Microservices - Is it time to breakup?
Microservices - Is it time to breakup? Dave Nielsen
 
Hadoop Data Modeling
Hadoop Data ModelingHadoop Data Modeling
Hadoop Data ModelingAdam Doyle
 
No SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageNo SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageBethmi Gunasekara
 
Scalability designprinciples-v2-130718023602-phpapp02 (1)
Scalability designprinciples-v2-130718023602-phpapp02 (1)Scalability designprinciples-v2-130718023602-phpapp02 (1)
Scalability designprinciples-v2-130718023602-phpapp02 (1)Minal Patil
 
Introduction to NoSQL and MongoDB
Introduction to NoSQL and MongoDBIntroduction to NoSQL and MongoDB
Introduction to NoSQL and MongoDBAhmed Farag
 
Architecting Database by Jony Sugianto (Detik.com)
Architecting Database by Jony Sugianto (Detik.com)Architecting Database by Jony Sugianto (Detik.com)
Architecting Database by Jony Sugianto (Detik.com)Tech in Asia ID
 
Python Ireland Conference 2016 - Python and MongoDB Workshop
Python Ireland Conference 2016 - Python and MongoDB WorkshopPython Ireland Conference 2016 - Python and MongoDB Workshop
Python Ireland Conference 2016 - Python and MongoDB WorkshopJoe Drumgoole
 
Continuous Optimization for Distributed BigData Analysis
Continuous Optimization for Distributed BigData AnalysisContinuous Optimization for Distributed BigData Analysis
Continuous Optimization for Distributed BigData AnalysisKai Sasaki
 
Sharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsSharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsGeorge Stathis
 

Similar to Mongo db (20)

Scalable web architecture
Scalable web architectureScalable web architecture
Scalable web architecture
 
NoSQLDatabases
NoSQLDatabasesNoSQLDatabases
NoSQLDatabases
 
Hadoop: The Default Machine Learning Platform ?
Hadoop: The Default Machine Learning Platform ?Hadoop: The Default Machine Learning Platform ?
Hadoop: The Default Machine Learning Platform ?
 
Challenges of Implementing an Advanced SQL Engine on Hadoop
Challenges of Implementing an Advanced SQL Engine on HadoopChallenges of Implementing an Advanced SQL Engine on Hadoop
Challenges of Implementing an Advanced SQL Engine on Hadoop
 
NoSQL Simplified: Schema vs. Schema-less
NoSQL Simplified: Schema vs. Schema-lessNoSQL Simplified: Schema vs. Schema-less
NoSQL Simplified: Schema vs. Schema-less
 
FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...
FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...
FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...
 
NoSQL-Overview
NoSQL-OverviewNoSQL-Overview
NoSQL-Overview
 
NoSql Brownbag
NoSql BrownbagNoSql Brownbag
NoSql Brownbag
 
Microservices - Is it time to breakup?
Microservices - Is it time to breakup? Microservices - Is it time to breakup?
Microservices - Is it time to breakup?
 
Hadoop Data Modeling
Hadoop Data ModelingHadoop Data Modeling
Hadoop Data Modeling
 
Couchbase 3.0.2 d1
Couchbase 3.0.2  d1Couchbase 3.0.2  d1
Couchbase 3.0.2 d1
 
No SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageNo SQL- The Future Of Data Storage
No SQL- The Future Of Data Storage
 
Scalability Design Principles - Internal Session
Scalability Design Principles - Internal SessionScalability Design Principles - Internal Session
Scalability Design Principles - Internal Session
 
Scalability designprinciples-v2-130718023602-phpapp02 (1)
Scalability designprinciples-v2-130718023602-phpapp02 (1)Scalability designprinciples-v2-130718023602-phpapp02 (1)
Scalability designprinciples-v2-130718023602-phpapp02 (1)
 
Introduction to NoSQL and MongoDB
Introduction to NoSQL and MongoDBIntroduction to NoSQL and MongoDB
Introduction to NoSQL and MongoDB
 
Architecting Database by Jony Sugianto (Detik.com)
Architecting Database by Jony Sugianto (Detik.com)Architecting Database by Jony Sugianto (Detik.com)
Architecting Database by Jony Sugianto (Detik.com)
 
Datastore PPT.pptx
Datastore PPT.pptxDatastore PPT.pptx
Datastore PPT.pptx
 
Python Ireland Conference 2016 - Python and MongoDB Workshop
Python Ireland Conference 2016 - Python and MongoDB WorkshopPython Ireland Conference 2016 - Python and MongoDB Workshop
Python Ireland Conference 2016 - Python and MongoDB Workshop
 
Continuous Optimization for Distributed BigData Analysis
Continuous Optimization for Distributed BigData AnalysisContinuous Optimization for Distributed BigData Analysis
Continuous Optimization for Distributed BigData Analysis
 
Sharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsSharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data Lessons
 

More from Akshay Mathur

Documentation with Sphinx
Documentation with SphinxDocumentation with Sphinx
Documentation with SphinxAkshay Mathur
 
Kubernetes Journey of a Large FinTech
Kubernetes Journey of a Large FinTechKubernetes Journey of a Large FinTech
Kubernetes Journey of a Large FinTechAkshay Mathur
 
Security and Observability of Application Traffic in Kubernetes
Security and Observability of Application Traffic in KubernetesSecurity and Observability of Application Traffic in Kubernetes
Security and Observability of Application Traffic in KubernetesAkshay Mathur
 
Enhanced Security and Visibility for Microservices Applications
Enhanced Security and Visibility for Microservices ApplicationsEnhanced Security and Visibility for Microservices Applications
Enhanced Security and Visibility for Microservices ApplicationsAkshay Mathur
 
Considerations for East-West Traffic Security and Analytics for Kubernetes En...
Considerations for East-West Traffic Security and Analytics for Kubernetes En...Considerations for East-West Traffic Security and Analytics for Kubernetes En...
Considerations for East-West Traffic Security and Analytics for Kubernetes En...Akshay Mathur
 
Kubernetes as Orchestrator for A10 Lightning Controller
Kubernetes as Orchestrator for A10 Lightning ControllerKubernetes as Orchestrator for A10 Lightning Controller
Kubernetes as Orchestrator for A10 Lightning ControllerAkshay Mathur
 
Cloud Bursting with A10 Lightning ADS
Cloud Bursting with A10 Lightning ADSCloud Bursting with A10 Lightning ADS
Cloud Bursting with A10 Lightning ADSAkshay Mathur
 
Shared Security Responsibility Model of AWS
Shared Security Responsibility Model of AWSShared Security Responsibility Model of AWS
Shared Security Responsibility Model of AWSAkshay Mathur
 
Techniques for scaling application with security and visibility in cloud
Techniques for scaling application with security and visibility in cloudTechniques for scaling application with security and visibility in cloud
Techniques for scaling application with security and visibility in cloudAkshay Mathur
 
Introduction to Node js
Introduction to Node jsIntroduction to Node js
Introduction to Node jsAkshay Mathur
 
Object Oriented Programing in JavaScript
Object Oriented Programing in JavaScriptObject Oriented Programing in JavaScript
Object Oriented Programing in JavaScriptAkshay Mathur
 
Getting Started with Angular JS
Getting Started with Angular JSGetting Started with Angular JS
Getting Started with Angular JSAkshay Mathur
 
Releasing Software Without Testing Team
Releasing Software Without Testing TeamReleasing Software Without Testing Team
Releasing Software Without Testing TeamAkshay Mathur
 
Getting Started with jQuery
Getting Started with jQueryGetting Started with jQuery
Getting Started with jQueryAkshay Mathur
 
Creating Single Page Web App using Backbone JS
Creating Single Page Web App using Backbone JSCreating Single Page Web App using Backbone JS
Creating Single Page Web App using Backbone JSAkshay Mathur
 
Getting Started with Web
Getting Started with WebGetting Started with Web
Getting Started with WebAkshay Mathur
 
Getting Started with Javascript
Getting Started with JavascriptGetting Started with Javascript
Getting Started with JavascriptAkshay Mathur
 
Using Google App Engine Python
Using Google App Engine PythonUsing Google App Engine Python
Using Google App Engine PythonAkshay Mathur
 

More from Akshay Mathur (20)

Documentation with Sphinx
Documentation with SphinxDocumentation with Sphinx
Documentation with Sphinx
 
Kubernetes Journey of a Large FinTech
Kubernetes Journey of a Large FinTechKubernetes Journey of a Large FinTech
Kubernetes Journey of a Large FinTech
 
Security and Observability of Application Traffic in Kubernetes
Security and Observability of Application Traffic in KubernetesSecurity and Observability of Application Traffic in Kubernetes
Security and Observability of Application Traffic in Kubernetes
 
Enhanced Security and Visibility for Microservices Applications
Enhanced Security and Visibility for Microservices ApplicationsEnhanced Security and Visibility for Microservices Applications
Enhanced Security and Visibility for Microservices Applications
 
Considerations for East-West Traffic Security and Analytics for Kubernetes En...
Considerations for East-West Traffic Security and Analytics for Kubernetes En...Considerations for East-West Traffic Security and Analytics for Kubernetes En...
Considerations for East-West Traffic Security and Analytics for Kubernetes En...
 
Kubernetes as Orchestrator for A10 Lightning Controller
Kubernetes as Orchestrator for A10 Lightning ControllerKubernetes as Orchestrator for A10 Lightning Controller
Kubernetes as Orchestrator for A10 Lightning Controller
 
Cloud Bursting with A10 Lightning ADS
Cloud Bursting with A10 Lightning ADSCloud Bursting with A10 Lightning ADS
Cloud Bursting with A10 Lightning ADS
 
Shared Security Responsibility Model of AWS
Shared Security Responsibility Model of AWSShared Security Responsibility Model of AWS
Shared Security Responsibility Model of AWS
 
Techniques for scaling application with security and visibility in cloud
Techniques for scaling application with security and visibility in cloudTechniques for scaling application with security and visibility in cloud
Techniques for scaling application with security and visibility in cloud
 
Introduction to Node js
Introduction to Node jsIntroduction to Node js
Introduction to Node js
 
Object Oriented Programing in JavaScript
Object Oriented Programing in JavaScriptObject Oriented Programing in JavaScript
Object Oriented Programing in JavaScript
 
Getting Started with Angular JS
Getting Started with Angular JSGetting Started with Angular JS
Getting Started with Angular JS
 
Releasing Software Without Testing Team
Releasing Software Without Testing TeamReleasing Software Without Testing Team
Releasing Software Without Testing Team
 
Getting Started with jQuery
Getting Started with jQueryGetting Started with jQuery
Getting Started with jQuery
 
CoffeeScript
CoffeeScriptCoffeeScript
CoffeeScript
 
Creating Single Page Web App using Backbone JS
Creating Single Page Web App using Backbone JSCreating Single Page Web App using Backbone JS
Creating Single Page Web App using Backbone JS
 
Getting Started with Web
Getting Started with WebGetting Started with Web
Getting Started with Web
 
Getting Started with Javascript
Getting Started with JavascriptGetting Started with Javascript
Getting Started with Javascript
 
Using Google App Engine Python
Using Google App Engine PythonUsing Google App Engine Python
Using Google App Engine Python
 
Working with GIT
Working with GITWorking with GIT
Working with GIT
 

Mongo db

  • 1. NoSQL Database Akshay Mathur Sarang Shravagi @akshaymathu, @_sarangs {name: ‘mongo’, type: ‘db’}
  • 3. Let’s Know Each Other • Do you code? • OS? • Programming Language? • Why are you attending? @akshaymathu, @_sarangs 3
  • 4. Akshay Mathur • Managed development, testing and release teams over the last 14+ years – Currently Principal Architect at ShopSocially • Founding Team Member of – ShopSocially (Enabling “social” for retailers) – AirTight Networks (Global leader of WIPS) @akshaymathu, @_sarangs 4
  • 5. Sarang Shravagi • 10gen Certified Developer and DBA • CS graduate from PICT Pune • 3+ years in Software Product industry • Currently Senior Full-stack Developer at ShopSocially @akshaymathu, @_sarangs 5
  • 6. How we use MongoDB @akshaymathu, @_sarangs 6 Python MongoDB MongoEngine
  • 8. Program Outline: Understanding NoSQL • Data Landscape • Different Storage Needs • Design Paradigm Shift from SQL to NoSQL • Different Datastores • Closer look at Document Storage • Drawing parallels with RDBMS @akshaymathu, @_sarangs 8
  • 9. Program Outline: Hands on Lab • Installation and basic configuration • Mongo Shell • Creating and Changing Schema • Create, Read, Update and Delete of Data • Analyzing Performance • Improving performance by creating Indices • Assignment • Problem solving for the assignment @akshaymathu, @_sarangs 9
  • 10. Program Outline: Advanced Topics • Handling Big Data – Introduction to Map/Reduce – Introduction to Data Partitioning (Sharding) • Disaster Recovery – Introduction to Replica Sets and High Availability @akshaymathu, @_sarangs 10
  • 11. Ground Rules • Disturb Everyone – Not by phone rings – Not by local talks – By more information and questions @akshaymathu, @_sarangs 11
  • 12. Data Patterns & Storage Needs @akshaymathu, @_sarangs 12
  • 13. Data at an Online Store • Product Information • User Information • Purchase Information • Product Reviews • Site Interactions • Social Graph • Search Index @akshaymathu, @_sarangs 13
  • 14. SQL to NoSQL Design Paradigm Shift @akshaymathu, @_sarangs 14
  • 15. SQL Storage • Was designed when – Storage and data transfer were costly – Processing was slow – Applications were oriented more towards data collection • Initial adopters were financial institutions @akshaymathu, @_sarangs 15
  • 16. SQL Storage • Structured – schema • Relational – foreign keys, constraints • Transactional – Atomicity, Consistency, Isolation, Durability • High Availability through robustness – Minimize failures • Optimized for Writes • Typically Scale Up @akshaymathu, @_sarangs 16
  • 17. NoSQL Storage • Is designed when – Storage is cheap – Data transfer is fast – Much more processing power is available • Clustering of machines is also possible – Applications are oriented towards consumption of User Generated Content – Better on-screen user experience is in demand @akshaymathu, @_sarangs 17
  • 18. NoSQL Storage • Semi-structured – Schemaless • Consistency, Availability, Partition Tolerance • High Availability through clustering – expect failures • Optimized for Reads • Typically Scale Out @akshaymathu, @_sarangs 18
  • 19. Different Datastores Half Level Deep @akshaymathu, @_sarangs 19
  • 20. SQL: RDBMS • MySQL, PostgreSQL, Oracle etc. • Stores data in tables having columns – Basic (number, text) data types • Strong query language • Transparent values – Query language can read and filter on them – Relationship between tables based on values • Suited for user info and transactions @akshaymathu, @_sarangs 20
  • 21. NoSQL: Key/Value • Redis, DynamoDB etc. • Stores a value against a key – Strings • Values are opaque – Cannot be part of a query • Suited for site interactions @akshaymathu, @_sarangs 21
  • 23. NoSQL: Document • MongoDB, CouchDB etc. • Object Oriented data models – Stores data in document objects having fields – Basic and compound (list, dict) data types • SQL like queries • Transparent values – Can be part of query • Suited for product info and its reviews @akshaymathu, @_sarangs 23
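As a rough illustration of the document model described on this slide, the sketch below represents one product as a plain Python dict with a list field and embedded review sub-documents. The field names (`name`, `tags`, `reviews`, `rating`) and the helper functions are hypothetical and chosen only for this example; no database is involved.

```python
# Hypothetical product document; all field names are illustrative assumptions.
product = {
    "name": "Espresso Maker",
    "price": 129.0,
    "tags": ["kitchen", "coffee"],  # compound type: list
    "reviews": [                    # compound type: list of embedded sub-documents
        {"user": "asha", "rating": 5, "text": "Great crema"},
        {"user": "ravi", "rating": 3, "text": "Noisy pump"},
    ],
}


def has_review_with_rating(doc, minimum):
    """Return True if any embedded review has a rating of at least `minimum`."""
    return any(review["rating"] >= minimum for review in doc.get("reviews", []))


def average_rating(doc):
    """Return the mean rating of the embedded reviews, or None if there are none."""
    reviews = doc.get("reviews", [])
    if not reviews:
        return None
    return sum(review["rating"] for review in reviews) / len(reviews)
```

Because the reviews live inside the product document, both helpers read a single object. This is what "transparent values" means in practice: nested fields can take part in a query.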
  • 25. NoSQL: Column Family • Cassandra, Bigtable etc. • Stores data in columns • Transparent values – Can be part of query • SQL like queries • Suited for search @akshaymathu, @_sarangs 25
  • 27. NoSQL: Graph • Neo4j • Stores data in form of nodes and relationships • Query is in form of traversal • In-memory • Suited for social graph @akshaymathu, @_sarangs 27
  • 30. Document Storage: Closer Look @akshaymathu, @_sarangs 30
  • 31. MongoDB • Document database • Powerful query language • Docs, sub-docs, indexes • Map/reduce • Replicas, shards, replicated shards • SDKs/drivers for so many languages – C, C++, C#, Python, Erlang, PHP, Java, Javascript, NodeJS, Perl, Ruby, Scala @akshaymathu, @_sarangs 31
  • 34. RDBMS to MongoDB • Database → Database • Table → Collection • Row → Document • Column → Field • RDBMS query: Select c1, c2 from Table where c1 = ‘v1’ order by c2 limit n • MongoDB query: Collection.objects(F1 = ‘v1’).order_by(‘c2’).limit(n) @akshaymathu, @_sarangs 34
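To make the filter, sort and limit steps of the query on this slide concrete, here is a minimal pure-Python sketch that applies the same three steps to a list of dicts standing in for a collection. The `query` helper and the sample rows are assumptions made for illustration; this is not a driver API.

```python
def query(documents, where, order_by, limit):
    """Filter on field equality, sort ascending by one field, keep the first `limit`."""
    matches = [
        doc for doc in documents
        if all(doc.get(field) == value for field, value in where.items())
    ]
    matches.sort(key=lambda doc: doc[order_by])
    return matches[:limit]


# Hypothetical rows standing in for a table or collection.
rows = [
    {"c1": "v1", "c2": 30},
    {"c1": "v2", "c2": 10},
    {"c1": "v1", "c2": 20},
    {"c1": "v1", "c2": 40},
]

result = query(rows, where={"c1": "v1"}, order_by="c2", limit=2)
```

In the mongo shell the same shape of query would typically be written as `db.collection.find({c1: 'v1'}).sort({c2: 1}).limit(n)`.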
  • 40. MongoDB: Core Binaries • mongod – Database server • mongo – Database client shell • mongos – Router for Sharding @akshaymathu, @_sarangs 40
  • 41. Getting Help • For mongo shell – mongo --help • Shows options available for running the shell • Inside mongo shell – Object.help() • Shows commands available on the object @akshaymathu, @_sarangs 41
  • 42. Import Export Tools • For objects – mongodump – mongorestore – bsondump – mongooplog • For data items – mongoimport – mongoexport @akshaymathu, @_sarangs 42
  • 43. Database Operations • Database creation • Creating/changing collection • Data insertion • Data read • Data update • Creating indices • Data deletion • Dropping collection @akshaymathu, @_sarangs 43
  • 44. Diagnostic Tools • mongostat • mongoperf • mongosniff • mongotop @akshaymathu, @_sarangs 44
  • 46. Assignment • Go to http://www.velocitainc.com/mongo/ – Tasks • assignments.txt – Data • students.json @akshaymathu, @_sarangs 46
  • 47. Disaster Recovery Introduction to Replica Sets and High Availability @akshaymathu, @_sarangs 47
  • 48. Disasters • Physical Failure – Hardware – Network • Solution – Replica Sets • Provide redundant storage for High Availability – Real time data synchronization • Automatic failover for zero downtime @akshaymathu, @_sarangs 48
  • 50. Multi Replication • Data can be replicated to multiple places simultaneously • An odd number of machines is always needed in a replica set @akshaymathu, @_sarangs 50
  • 51. Single Replication • If you want only one, or any odd number of, secondaries, you need to set up an arbiter @akshaymathu, @_sarangs 51
  • 52. Failover • When primary fails, remaining machines vote for electing new primary @akshaymathu, @_sarangs 52
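The odd-member rule and the election on failover both come down to majority voting. The sketch below is only the arithmetic, under the assumption that a new primary needs votes from a strict majority of the voting members; the function names are hypothetical.

```python
def majority(voting_members):
    """Smallest number of votes that is more than half of the members."""
    return voting_members // 2 + 1


def can_elect_primary(voting_members, reachable_members):
    """A primary can be elected only if a majority of members can still vote."""
    return reachable_members >= majority(voting_members)


def tolerated_failures(voting_members):
    """How many members may fail while an election is still possible."""
    return voting_members - majority(voting_members)
```

Under this assumption a 4-member set tolerates one failure, the same as a 3-member set, so an even member count adds cost without adding fault tolerance. An arbiter (a voting member that holds no data) is one way to bring an even set up to an odd count.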
  • 53. Handling Big Data Introduction to Map/Reduce and Sharding @akshaymathu, @_sarangs 53
  • 54. Large Data Sets • Problem 1 – Performance • Queries become slow • Solution – Map/Reduce @akshaymathu, @_sarangs 54
  • 55. Map Reduce • A way to divide large query computation into smaller chunks • May run in multiple processes across multiple machines • Think of it as GROUP BY of SQL @akshaymathu, @_sarangs 55
  • 56. Map/Reduce Example • Map function digs the data and returns required values @akshaymathu, @_sarangs 56
  • 57. Map/Reduce Example • Reduce function uses the output of Map function and generates aggregated value @akshaymathu, @_sarangs 57
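The code from the two example slides is not present in this transcript, so the following is a stand-in sketch in plain Python rather than the presenters' example. It assumes a hypothetical `orders` data set and totals the amount per customer: the map step emits a (key, value) pair per document and the reduce step aggregates all values that share a key.

```python
from collections import defaultdict

# Hypothetical input documents.
orders = [
    {"customer": "A", "amount": 250},
    {"customer": "B", "amount": 100},
    {"customer": "A", "amount": 50},
]


def map_order(doc):
    """Map step: dig into one document and emit a (key, value) pair."""
    return doc["customer"], doc["amount"]


def reduce_amounts(values):
    """Reduce step: collapse all values emitted for one key into one aggregate."""
    return sum(values)


def map_reduce(documents, mapper, reducer):
    """Group mapped values by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        key, value = mapper(doc)
        grouped[key].append(value)
    return {key: reducer(values) for key, values in grouped.items()}


totals = map_reduce(orders, map_order, reduce_amounts)
```

This is the GROUP BY analogy in code: the key plays the role of the grouping column and the reducer plays the role of the aggregate function. Since each group is reduced independently, the work could in principle be spread across processes or machines.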
  • 58. Large Data Sets • Problem 2 – Vertical Scaling of Hardware • Can’t increase machine size beyond a limit • Solution – Sharding @akshaymathu, @_sarangs 58
  • 59. Sharding • A method for storing data across multiple machines • Data is partitioned using Shard Keys @akshaymathu, @_sarangs 59
  • 60. Data Partitioning: Range Based • A range of Shard Keys stay in a chunk @akshaymathu, @_sarangs 60
  • 61. Data Partitioning: Hash Based • A hash function on Shard Keys decides the chunk @akshaymathu, @_sarangs 61
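The two partitioning schemes can be contrasted with a small sketch. The boundary values, the chunk count and the use of SHA-256 are all assumptions made for illustration; they are not MongoDB's actual chunk ranges or hash function.

```python
import bisect
import hashlib

# Hypothetical upper boundaries: chunk 0 holds keys < 100, chunk 1 holds
# 100 <= key < 200, chunk 2 holds 200 <= key < 300, chunk 3 holds keys >= 300.
RANGE_BOUNDARIES = [100, 200, 300]


def range_chunk(shard_key):
    """Range based: neighbouring keys land in the same chunk."""
    return bisect.bisect_right(RANGE_BOUNDARIES, shard_key)


def hash_chunk(shard_key, chunk_count=4):
    """Hash based: a hash of the key picks the chunk, scattering neighbouring keys."""
    digest = hashlib.sha256(str(shard_key).encode("utf-8")).hexdigest()
    return int(digest, 16) % chunk_count
```

Range-based placement keeps range queries local to a few chunks but can concentrate writes when keys grow monotonically. Hash-based placement spreads writes more evenly at the cost of scattering range queries.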
  • 63. Optimizing Shards: Splitting • In a shard, when the size of a chunk grows, the chunk is divided into two @akshaymathu, @_sarangs 63
  • 64. Optimizing Shards: Balancing • When the number of chunks in a shard increases, a few chunks are migrated to another shard @akshaymathu, @_sarangs 64
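Splitting and balancing can be sketched as two small functions. This is a simplified model under stated assumptions: a chunk is a sorted list of keys, chunk size is measured as a key count, and the balancer moves one chunk at a time from the fullest shard to the emptiest until their chunk counts differ by less than a hypothetical `threshold`. Real thresholds and migration rules differ.

```python
def split_chunk(chunk, max_size):
    """Return the chunk unchanged, or two halves if it exceeds max_size keys."""
    if len(chunk) <= max_size:
        return [chunk]
    middle = len(chunk) // 2
    return [chunk[:middle], chunk[middle:]]


def balance(shards, threshold=2):
    """Migrate chunks from the fullest shard to the emptiest until they are even."""
    # Work on a copy so the caller's mapping is left untouched.
    shards = {name: list(chunks) for name, chunks in shards.items()}
    while True:
        fullest = max(shards, key=lambda name: len(shards[name]))
        emptiest = min(shards, key=lambda name: len(shards[name]))
        if len(shards[fullest]) - len(shards[emptiest]) < threshold:
            return shards
        shards[emptiest].append(shards[fullest].pop())


balanced = balance({"shard1": ["c1", "c2", "c3", "c4"], "shard2": []})
```

Splitting changes only metadata inside one shard, while balancing moves data between shards, which is why the two are treated as separate steps.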
  • 65. Summary • MongoDB is good – Stores objects as we use them in programming languages – Flexible semi-structured design – Scales out to store big data – Embedded documents eliminate the need for joins • MongoDB is bad – No multi-document query – De-normalized storage – No support for transactions @akshaymathu, @_sarangs 65