MongoDB
Introduction and Internals
by
Shridhar Joshi
What is MongoDB?
Open-source, scalable, high-performance, document-oriented, key-value-based NoSQL database.
Features
• JSON-style, document-oriented, schema-less storage
• B-tree indexes supported on any attribute
• Log-based replication for Master/Slave and Replica Sets
• Auto-sharding architecture (via horizontal partitioning) that scales to thousands of nodes
• NoSQL-style queries
• Surprising update behaviors
• Map/Reduce support
• GridFS specification for storing large files
• Developed by 10gen, with commercial support
Well/Less Well Suited
Source: http://www.mongodb.org/display/DOCS/Use+Cases
Basic concepts in MongoDB
NoSQL (MongoDB)    Relational DBMS
Database           Database
Collection         Relation
Document           Tuple
Field              Column
Index              Index
Cursor             Cursor
MongoDB:          Databases* -> Collections* -> Documents* (-> Fields*), Indexes*
Relational DBMS:  Databases* -> Relations* -> Columns*, Indexes*
* means 0 or more objects
Each document has its own set of fields, which is what makes MongoDB schema-less.
CRUD Demo time
> show dbs                                              // view existing databases
> use test                                              // switch to database "test"
> db.t.insert({name:'bob', age:30})                     // insert bob, aged 30
> db.t.insert({name:'alice', gender:'female'})          // insert alice, with a different set of fields
> db.t.find()                                           // list all documents in collection t
> db.t.find({name:'bob'}, {age:1})                      // find bob, returning only the age field (plus _id)
> db.t.find().limit(1).skip(1)                          // find the second document
> db.t.find().sort({name:1})                            // sort the results by name, ascending
> db.t.find({$or:[{name:'bob'}, {name:'tom'}]})         // find bob's or tom's documents
> db.t.update({name:'bob'}, {$set:{age:31}}, false, true)   // set age to 31 for every bob (upsert=false, multi=true)
> db.stats()                                            // database statistics
> db.getCollectionNames()                               // collections in this database
> db.t.ensureIndex({name:1})                            // create an index on name (createIndex in newer versions)
> db.people.find({name:"bob"}).explain()                // show the query plan
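The find calls in the demo can be hard to read without a server at hand, so here is a minimal sketch of what a filter, a projection, skip/limit and sort do. It is plain JavaScript over an in-memory array, not the MongoDB API: the helpers matches, project and find are hypothetical names made up for this illustration, and only equality filters and inclusion projections are modeled.

```javascript
// In-memory stand-in for collection t. The two documents have different
// fields, which is what "schema-less" means in practice.
const t = [
  { _id: 1, name: 'bob', age: 30 },
  { _id: 2, name: 'alice', gender: 'female' },
];

// Hypothetical helper: equality-only filter, every key in the query must match.
function matches(doc, query) {
  return Object.keys(query).every((k) => doc[k] === query[k]);
}

// Hypothetical helper: inclusion projection, keeps _id plus the listed fields.
function project(doc, fields) {
  const out = { _id: doc._id };
  for (const k of Object.keys(fields)) {
    if (fields[k] && k in doc) out[k] = doc[k];
  }
  return out;
}

// Hypothetical helper: filter first, then project if a field list was given.
function find(coll, query = {}, fields = null) {
  const hits = coll.filter((d) => matches(d, query));
  return fields ? hits.map((d) => project(d, fields)) : hits;
}

// Like find({name:'bob'}, {age:1}): bob's document, reduced to _id and age.
const bobAge = find(t, { name: 'bob' }, { age: 1 });

// Like find().limit(1).skip(1): skip one document, then return one.
const second = find(t).slice(1, 1 + 1);

// Like find().sort({name:1}): ascending by name.
const sortedByName = [...find(t)].sort((a, b) => a.name.localeCompare(b.name));

console.log(bobAge, second, sortedByName.map((d) => d.name));
```

The point of the sketch is that the second argument to find selects fields, not documents: {age:1} does not mean "age equals 1".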
Query Optimization
db.people.find({x:10, y:"foo"})
Possible scans for this query on collection people:
• Index on x: Index Scan
• Index on y: Index Scan
• Collection people: DiskLocation Scan
MongoDB Architecture
Source: MongoDB Replication and Replica Set by Dwight Merriman, 10gen
MongoDB Sharding
• MongoDB uses two key operations to facilitate sharding: split and migrate.
• Split divides a chunk into two ranges; it is done to ensure that no one chunk is unusually large.
• Migrate moves a chunk (the data associated with a key range) to another shard. This is done as needed to rebalance.
• Split is an inexpensive metadata operation, while migrate is expensive, as large amounts of data may be moving from server to server.
• Both splits and migrates are performed automatically.
• MongoDB has a sub-system called the Balancer, which monitors shard loads and moves chunks around if it finds an imbalance.
• If you add a new shard to the system, some chunks will eventually be moved to that shard to spread out the load.
• A recently split chunk may be moved immediately to a new shard if the system predicts that future insertions will benefit from that move.
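The interplay of split, migrate and the Balancer can be sketched with a toy in-memory model. This is not MongoDB code: makeCluster, split, migrate and balance are hypothetical names, the split threshold counts documents (MAX_CHUNK_DOCS is an assumed value, whereas MongoDB decides by chunk size), and the balancing rule "move one chunk from the most loaded shard to the least loaded until they differ by at most one" is a simplification chosen for illustration.

```javascript
// Assumed split threshold for the toy model.
const MAX_CHUNK_DOCS = 4;

// Start with one chunk covering the whole key range on the first shard.
function makeCluster(shardNames) {
  return {
    shards: shardNames,
    chunks: [{ min: -Infinity, max: Infinity, shard: shardNames[0], keys: [] }],
  };
}

function findChunk(cluster, key) {
  return cluster.chunks.find((c) => key >= c.min && key < c.max);
}

// Split: cut one chunk's key range in two at the median key. Both halves stay
// on the same shard, so no data crosses servers (the keys array only stands in
// for the documents that live on that shard).
function split(cluster, chunk) {
  const sorted = [...chunk.keys].sort((a, b) => a - b);
  const mid = sorted[Math.floor(sorted.length / 2)];
  cluster.chunks.push({
    min: mid, max: chunk.max, shard: chunk.shard,
    keys: sorted.filter((k) => k >= mid),
  });
  chunk.keys = sorted.filter((k) => k < mid);
  chunk.max = mid;
}

function insert(cluster, key) {
  const chunk = findChunk(cluster, key);
  chunk.keys.push(key);
  if (chunk.keys.length > MAX_CHUNK_DOCS) split(cluster, chunk);
}

// Migrate: hand a chunk to another shard. In a real cluster this is the
// expensive step, because the chunk's data is copied between servers.
function migrate(chunk, toShard) {
  chunk.shard = toShard;
}

function chunkCounts(cluster) {
  const counts = Object.fromEntries(cluster.shards.map((s) => [s, 0]));
  for (const c of cluster.chunks) counts[c.shard] += 1;
  return counts;
}

// Balancer: repeat single-chunk moves until the shards are within one chunk.
function balance(cluster) {
  let moves = 0;
  for (;;) {
    const counts = chunkCounts(cluster);
    const byLoad = [...cluster.shards].sort((a, b) => counts[a] - counts[b]);
    const least = byLoad[0];
    const most = byLoad[byLoad.length - 1];
    if (counts[most] - counts[least] <= 1) return moves;
    migrate(cluster.chunks.find((c) => c.shard === most), least);
    moves += 1;
  }
}

const cluster = makeCluster(['shardA', 'shardB']);
for (let k = 1; k <= 20; k++) insert(cluster, k);
const moved = balance(cluster);
console.log(chunkCounts(cluster), 'moves:', moved);
```

Inserting increasing keys makes every split happen in the last chunk, which is why all chunks pile up on one shard until the balancer runs.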
MongoDB Sharding: Pull mode
MongoDB Sharding: Briefly
Sequence between FROM (C) and TO (N):
FROM: C
# Make sure my view is complete and lock
# Get the documents' DiskLoc for sharding
# Trigger N to start sharding in pull mode
# Ask N to commit
TO: N
# Copy index definitions from C
# Remove existing data in [min~max]
# Clone the data in [min~max] from C
# Ask C to replicate the changes
# N commit
MongoDB Sharding: In Details
FROM TO
Note: documents on FROM can still be updated or deleted during sharding; TO catches up on those changes in step 4.
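The catch-up idea can be sketched as a toy model: TO clones a snapshot of the key range, FROM keeps accepting writes and logs the ones that fall in the range, and TO then replays that log before the commit. This is only an illustration under stated assumptions, not MongoDB's migration code: migrateChunk, the shard objects and the writesDuringClone list are hypothetical, and the range is treated as half-open [min, max).

```javascript
// Toy pull-mode migration. Each shard is { indexes: string[], docs: Map }.
function migrateChunk(from, to, min, max, writesDuringClone) {
  const inRange = (k) => k >= min && k < max;

  // TO, steps 1-2: copy index definitions, remove stale data in the range.
  to.indexes = [...from.indexes];
  for (const k of [...to.docs.keys()]) {
    if (inRange(k)) to.docs.delete(k);
  }

  // TO, step 3: clone a snapshot of the range from FROM.
  const snapshot = [...from.docs].filter(([k]) => inRange(k));
  for (const [k, v] of snapshot) to.docs.set(k, v);

  // Meanwhile FROM keeps serving writes and logs those that touch the range.
  const changeLog = [];
  for (const w of writesDuringClone) {
    if (w.op === 'delete') from.docs.delete(w.key);
    else from.docs.set(w.key, w.value);
    if (inRange(w.key)) changeLog.push(w);
  }

  // TO, step 4: pull the logged changes from FROM and replay them.
  for (const w of changeLog) {
    if (w.op === 'delete') to.docs.delete(w.key);
    else to.docs.set(w.key, w.value);
  }

  // Commit: TO now owns the range, so FROM drops its copy.
  for (const k of [...from.docs.keys()]) {
    if (inRange(k)) from.docs.delete(k);
  }
}

const shardC = {
  indexes: ['_id', 'name'],
  docs: new Map([[1, 'a'], [5, 'b'], [7, 'c'], [12, 'd']]),
};
const shardN = { indexes: ['_id'], docs: new Map([[6, 'stale']]) };

migrateChunk(shardC, shardN, 5, 10, [
  { op: 'set', key: 7, value: 'c2' },   // update inside the range
  { op: 'delete', key: 5 },             // delete inside the range
  { op: 'set', key: 8, value: 'e' },    // insert inside the range
  { op: 'set', key: 1, value: 'a2' },   // outside the range, stays on C
]);

console.log([...shardN.docs], [...shardC.docs]);
```

Without the replay in step 4, N would end up with the stale value for key 7, a deleted document for key 5, and no document for key 8.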
Replication and Sharding
Source:
MongoDB Replication: Pull mode
The slave continuously pulls the oplog from the master.
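Pull-mode replication can be sketched as follows: the master appends every write to an operation log, and the slave remembers the last entry it applied and asks for anything newer. This is a toy model and not MongoDB's replication code. The Master and Slave classes, the integer timestamps and the oplogSince method are assumptions made for the illustration.

```javascript
class Master {
  constructor() {
    this.data = new Map();
    this.oplog = [];
  }

  // Every write is applied locally and appended to the oplog with an
  // increasing timestamp (a simple counter in this model).
  write(key, value) {
    this.data.set(key, value);
    this.oplog.push({ ts: this.oplog.length + 1, op: 'set', key, value });
  }

  remove(key) {
    this.data.delete(key);
    this.oplog.push({ ts: this.oplog.length + 1, op: 'delete', key });
  }

  // Entries newer than the given timestamp, in log order.
  oplogSince(ts) {
    return this.oplog.filter((e) => e.ts > ts);
  }
}

class Slave {
  constructor(master) {
    this.master = master;
    this.data = new Map();
    this.lastTs = 0; // timestamp of the last oplog entry applied
  }

  // One pull cycle; a real slave would repeat this continuously.
  pull() {
    const entries = this.master.oplogSince(this.lastTs);
    for (const e of entries) {
      if (e.op === 'delete') this.data.delete(e.key);
      else this.data.set(e.key, e.value);
      this.lastTs = e.ts;
    }
    return entries.length;
  }
}

const master = new Master();
const slave = new Slave(master);

master.write('a', 1);
master.write('b', 2);
const firstPull = slave.pull();

master.remove('a');
master.write('c', 3);
const secondPull = slave.pull();
const thirdPull = slave.pull(); // nothing new to apply

console.log(firstPull, secondPull, thirdPull, [...slave.data]);
```

Because the slave drives the transfer, the master does not need to track its slaves; it only has to keep the log available.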
Questions
References:
1: Source code digest: http://www.cnblogs.com/daizhj/category/260889.html
2: Books: http://www.mongodb.org/display/DOCS/Books
3: MongoDB official website: http://www.mongodb.com/
