MongoDB

 Architecture
 Data Model
 Query Language
 Data Management
 References

 A storage engine is the part of a database that is responsible for managing how data is stored on disk.
 Many databases support multiple storage engines, where different engines perform better for specific
workloads.
 For example, one storage engine might offer better performance for read-heavy workloads, and another
might support higher throughput for write operations.
 Members of a replica set can use different storage engines.

Example relational data model for a blogging application
Data as documents: simpler for developers, faster for users.

 Dynamic/Flexible
 Collections (the counterpart of tables) can be created without defining the structure of their documents
 Documents in a collection need not have an identical set of fields.
 In practice, it is common for the documents in a collection to have a largely
homogeneous structure; however, this is not a requirement
 The structure of documents can be changed simply by adding new fields or deleting
existing ones (which simplifies and facilitates iterative software development)
 Schema Design is still important!
 Types of queries the application will perform
 How objects are managed in application code
 How documents will change over time
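
As a rough illustration of the flexible schema described above, the sketch below models a collection as a plain Python list of dictionaries. It does not use a MongoDB driver, and the `posts` collection and its field names are hypothetical.

```python
# In-memory stand-in for a collection: a list of dict "documents".
# Collection and field names are hypothetical, chosen for illustration only.
posts = []

# Documents in the same collection need not share an identical set of fields.
posts.append({"_id": 1, "title": "Hello", "author": "ann"})
posts.append({"_id": 2, "title": "Schema notes", "author": "bob",
              "tags": ["mongodb", "design"]})

# Changing the structure is just adding or deleting fields on a document.
posts[0]["views"] = 10      # add a new field
del posts[1]["author"]      # delete an existing one

# Collect the distinct field sets present in the collection.
field_sets = {frozenset(doc) for doc in posts}
print(len(field_sets))
```

The two documents end up with different field sets, which a fixed table schema would not allow without a migration.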

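Embedding the publisher in each book document leads to repetition of publisher data.
If the number of books per publisher is small with limited growth,
storing the book references inside the publisher document may be useful.
To avoid mutable, growing arrays,
store the publisher reference inside
the book document.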
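
The last option above, a publisher reference stored in each book document, might look like the following sketch. Documents are modeled as plain Python dicts. The field name `publisher_id`, the helper `books_for`, and the sample values are assumptions made for illustration.

```python
# Hypothetical publisher and book documents, modeled as plain dicts.
publisher = {"_id": "oreilly", "name": "O'Reilly Media", "location": "CA"}

# Each book stores a reference to its publisher instead of a copy of it,
# so the publisher document never accumulates a growing array of books.
books = [
    {"_id": 1, "title": "Book A", "publisher_id": "oreilly"},
    {"_id": 2, "title": "Book B", "publisher_id": "oreilly"},
]

def books_for(publisher_id, all_books):
    """Return the titles of all books that reference the given publisher."""
    return [b["title"] for b in all_books if b["publisher_id"] == publisher_id]

titles = books_for(publisher["_id"], books)
print(titles)
```

Publisher details live in one place, so changing them does not require touching every book.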

With the referenced data model, if your application frequently retrieves
the address data with the name information,
it needs to issue multiple queries to resolve the references.
With the embedded data model,
your application can retrieve the
complete patron information with one query.
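
The difference in lookups can be sketched with plain Python dicts standing in for collections. The patron and address values, and the helper names `load_referenced` and `load_embedded`, are hypothetical. Counting dictionary lookups is only a stand-in for counting queries.

```python
# Referenced model: the address lives in a separate collection and points
# back to the patron, so reading both takes two lookups.
patrons = {"joe": {"_id": "joe", "name": "Joe Bookreader"}}
addresses = {"joe": {"patron_id": "joe", "street": "123 Fake Street",
                     "city": "Faketon"}}

def load_referenced(patron_id):
    patron = dict(patrons[patron_id])          # lookup 1
    patron["address"] = addresses[patron_id]   # lookup 2
    return patron, 2

# Embedded model: the address is stored inside the patron document,
# so a single lookup returns the complete patron information.
patrons_embedded = {
    "joe": {"_id": "joe", "name": "Joe Bookreader",
            "address": {"street": "123 Fake Street", "city": "Faketon"}}
}

def load_embedded(patron_id):
    return patrons_embedded[patron_id], 1

_, referenced_lookups = load_referenced("joe")
doc, embedded_lookups = load_embedded("joe")
print(referenced_lookups, embedded_lookups)
```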

 Core processes
 mongod – database process
 mongos – controller/query router of sharded clusters
 mongo – interactive MongoDB shell
 Import / Export Tools
 Binary
 mongodump – create BSON dump files
 mongorestore – restore BSON dump files
 bsondump – convert BSON dump files to JSON
 mongooplog – stream oplog entries outside of normal replication
 JSON/CSV/TSV
 mongoimport – import data
 mongoexport – export data
 Diagnostic Tools
 mongostat – status of a currently running mongod or mongos instance
 mongotop – per-collection statistics on the time a MongoDB instance spends reading and writing data
 mongosniff – low-level, real-time tracing/sniffing view of database activity (Unix-like systems only)
 mongoperf – utility to check disk I/O performance independently of MongoDB
 GridFS
 mongofiles – command-line utility for manipulating files stored as GridFS objects in a MongoDB instance

 Linux
 mongod --dbpath <path to data directory>
 Windows
 mongod.exe --dbpath <path to data directory>

 Rich, interactive JavaScript shell
 Included in all MongoDB distributions
 Comparable to sqlcmd in MS SQL Server or SQL*Plus in Oracle
 Supports all commands/queries, including administrative operations

 A collection is created implicitly when the first document(s) are inserted into it

 find
 Query criteria
 Projection
 Cursor modifier
 Pretty
 findOne
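
To make the roles of query criteria and projection concrete, here is a minimal in-memory sketch in Python. It is not MongoDB's implementation and does not use a driver. The functions `find` and `find_one` and the `users` data are hypothetical, and only exact-equality criteria and inclusion projections are modeled.

```python
def find(collection, criteria=None, projection=None):
    """Yield documents whose fields equal every key/value in criteria.

    Only exact-equality criteria and inclusion-style projections are
    modeled here; real query operators are out of scope for this sketch.
    """
    criteria = criteria or {}
    for doc in collection:
        if all(doc.get(k) == v for k, v in criteria.items()):
            if projection is None:
                yield dict(doc)
            else:
                # Keep _id plus the fields flagged for inclusion.
                keep = {"_id"} | {k for k, v in projection.items() if v}
                yield {k: v for k, v in doc.items() if k in keep}

def find_one(collection, criteria=None, projection=None):
    """Return the first matching document, or None if nothing matches."""
    return next(find(collection, criteria, projection), None)

users = [
    {"_id": 1, "name": "ann", "city": "Oslo", "age": 31},
    {"_id": 2, "name": "bob", "city": "Oslo", "age": 25},
    {"_id": 3, "name": "cy", "city": "Rome", "age": 40},
]

oslo = list(find(users, {"city": "Oslo"}, {"name": 1}))
first_rome = find_one(users, {"city": "Rome"})
print(oslo, first_rome)
```

The criteria decide which documents come back; the projection decides which fields of each document come back.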

 update
 $set/$unset
 Replace whole document
 Array operators ($addToSet, $push, $pop)
 upsert
 findAndModify
 upsert
 remove
 new
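
The effect of `$set`, `$unset` and `upsert` can be sketched in memory as follows. This is a simplified model in Python, not the shell's `update` itself. The function name, its return values, and the sample data are assumptions. It updates only the first matching document and handles only equality criteria.

```python
def update(collection, criteria, change, upsert=False):
    """Apply $set/$unset to the first document matching criteria.

    Returns "updated", "inserted" or "no-op". Only equality criteria and
    the two operators named above are modeled.
    """
    for doc in collection:
        if all(doc.get(k) == v for k, v in criteria.items()):
            doc.update(change.get("$set", {}))
            for field in change.get("$unset", {}):
                doc.pop(field, None)
            return "updated"
    if upsert:
        # No match: build a new document from the criteria plus $set fields.
        new_doc = dict(criteria)
        new_doc.update(change.get("$set", {}))
        collection.append(new_doc)
        return "inserted"
    return "no-op"

users = [{"_id": 1, "name": "ann", "temp": True}]
r1 = update(users, {"_id": 1}, {"$set": {"city": "Oslo"}, "$unset": {"temp": ""}})
r2 = update(users, {"_id": 2}, {"$set": {"name": "bob"}}, upsert=True)
r3 = update(users, {"_id": 3}, {"$set": {"name": "cy"}})
print(r1, r2, r3)
```

With `upsert` enabled, a miss turns into an insert; without it, a miss changes nothing.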

 drop the collection
 remove
 All documents
 Based on criteria (removes multiple documents by default)
 justOne parameter
 findAndModify with remove option

 aggregate - Aggregation Pipeline (recommended/preferred)
 mapReduce - Map Reduce
 group
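
The idea behind an aggregation pipeline, where each stage transforms the output of the previous one, can be sketched in plain Python. The stage functions `match` and `group_sum` and the `orders` data are hypothetical. They loosely mirror a filter stage followed by a grouping stage and are not the `aggregate` API.

```python
from collections import defaultdict

def match(docs, criteria):
    """Stage 1: keep documents whose fields equal the criteria."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]

def group_sum(docs, key, field):
    """Stage 2: group by `key` and sum `field` within each group."""
    totals = defaultdict(int)
    for d in docs:
        totals[d[key]] += d[field]
    return [{"_id": k, "total": v} for k, v in sorted(totals.items())]

orders = [
    {"cust": "a", "status": "paid", "amount": 50},
    {"cust": "b", "status": "paid", "amount": 20},
    {"cust": "a", "status": "open", "amount": 99},
    {"cust": "a", "status": "paid", "amount": 25},
]

# The output of each stage feeds the next, as in a pipeline.
result = group_sum(match(orders, {"status": "paid"}), "cust", "amount")
print(result)
```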

 Supported drivers: Java, .NET, Ruby, PHP, JavaScript, Node.js, Python,
Perl, Scala and others
 Implemented as methods or functions within the API of a specific
programming language, as opposed to a completely separate language like
SQL
 [Example here]

 Types:
 Unique Indexes
 Compound Indexes
 Array Indexes - For fields that contain an array, each array value is stored as a separate index
entry
 TTL (Time to Live) Indexes - allow the user to specify a period of time after which the data
will automatically be deleted from the database
 Geospatial Indexes - optimize queries related to location within a two dimensional space
 Sparse Indexes - allow for smaller, more efficient indexes when fields are not present in all
documents.
 Text Search Indexes - use advanced, language-specific linguistic rules for stemming,
tokenization and stop words
 Covered Queries - Queries that return results containing only indexed fields can be
returned without reading from the source documents
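
The array index entry above ("each array value is stored as a separate index entry") can be illustrated with a small in-memory sketch. The helper `build_array_index` and the `articles` data are hypothetical. A Python dict of sets stands in for the index structure, which in a real database would be a tree.

```python
from collections import defaultdict

def build_array_index(collection, field):
    """Map each array element to the _ids of documents containing it."""
    index = defaultdict(set)
    for doc in collection:
        # Documents without the field contribute no index entries.
        for value in doc.get(field, []):
            index[value].add(doc["_id"])
    return index

articles = [
    {"_id": 1, "tags": ["db", "nosql"]},
    {"_id": 2, "tags": ["db"]},
    {"_id": 3},
]

tag_index = build_array_index(articles, "tags")
db_ids = sorted(tag_index["db"])
nosql_ids = sorted(tag_index["nosql"])
print(db_ids, nosql_ids)
```

A lookup by tag consults the index alone and never scans the documents. Document 3, which lacks the field, takes up no space in it.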

Sharding and replica sets:
- automatic sharding provides horizontal scalability
- replica sets help prevent database downtime

 Sharding, or horizontal scaling, divides the data set and distributes the
data over multiple servers, or shards. Each shard is an independent
database, and collectively, the shards make up a single logical database.
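
One way to picture how a data set is divided across shards is hash-based routing on a shard key, sketched below. This is a deliberately simplified stand-in and not MongoDB's actual chunk-based distribution. The function `shard_for`, the choice of MD5, and the shard count are assumptions for illustration.

```python
import hashlib

def shard_for(shard_key_value, num_shards):
    """Pick a shard by hashing the shard key value (hash-based routing)."""
    digest = hashlib.md5(str(shard_key_value).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Three independent "shards", each holding part of the data set.
shards = [[] for _ in range(3)]
for user_id in range(12):
    shards[shard_for(user_id, 3)].append({"_id": user_id})

# Every document lands on exactly one shard; together they form the full set.
total = sum(len(s) for s in shards)
print(total)
```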

 Replication provides redundancy and increases data availability.
 With multiple copies of data on different database servers, replication
protects a database from the loss of a single server
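
The protection against losing a single server can be shown with a toy model. The `ReplicaSet` class below is hypothetical and copies every write to all members immediately. Real replication is more involved, so treat this only as a sketch of the redundancy idea.

```python
class ReplicaSet:
    """Toy replica set: every write is copied to all live members."""

    def __init__(self, member_names):
        self.members = {name: {} for name in member_names}

    def write(self, key, doc):
        for store in self.members.values():
            store[key] = doc

    def fail(self, name):
        # Simulate losing one server.
        del self.members[name]

    def read(self, key):
        # Any surviving member can serve the data.
        for store in self.members.values():
            if key in store:
                return store[key]
        return None

rs = ReplicaSet(["node1", "node2", "node3"])
rs.write("u1", {"name": "ann"})
rs.fail("node1")
survivor_copy = rs.read("u1")
print(survivor_copy)
```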

 Identify and target the most frequent (>80%) data access patterns
 A flexible schema supports agile development; be prepared for changes to the data
model as it improves
 For storage efficiency
 Set the _id field explicitly where a natural key exists (otherwise it defaults to a 12-byte ObjectId)
 Use shorter field names
 Embed documents (data model consideration)
 Use indexes and profiling for performance
 docs.mongodb.org is a rich resource (offline copies are available
at http://docs.mongodb.org/manual/about)
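
The advice on shorter field names follows from field names being stored in every document. The sketch below estimates the saving per document, using the JSON-encoded length as a rough proxy for stored size (an assumption, since the real storage format is BSON). The field names and values are hypothetical.

```python
import json

long_doc = {"first_name": "Ann", "last_name": "Lee",
            "email_address": "ann@example.com"}
short_doc = {"fn": "Ann", "ln": "Lee", "em": "ann@example.com"}

# JSON length is used here only as a rough proxy for the stored size.
long_size = len(json.dumps(long_doc).encode("utf-8"))
short_size = len(json.dumps(short_doc).encode("utf-8"))
saved_per_doc = long_size - short_size
print(saved_per_doc)
```

The saving is small per document but is repeated for every document in the collection, at the cost of less readable field names.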

 Documentation (http://docs.mongodb.org)
 Free Online Training (http://university.mongodb.com)
 Presentations (http://mongodb.com/presentations)
 Case Studies (http://mongodb.com/customers)
 http://www.newtonsoft.com/json
