4. A storage engine is the part of a database that is responsible for managing how data is stored on disk.
Many databases support multiple storage engines, where different engines perform better for specific
workloads.
For example, one storage engine might offer better performance for read-heavy workloads, and another
might support higher throughput for write operations.
A replica set can have members that use different storage engines.
6. Example relational data model for a blogging
application
Data as documents: simpler for developers,
faster for users.
9. Dynamic/Flexible
Collections (the analogue of tables) can be created without defining the structure of their documents
Documents in a collection need not have an identical set of fields.
In practice, it is common for the documents in a collection to have a largely
homogeneous structure; however, this is not a requirement
The structure of documents can be changed simply by adding new fields or deleting
existing ones (which simplifies and facilitates iterative software development)
Schema design is still important! Consider:
The types of queries the application will perform
How objects are managed in application code
How documents will change over time
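A minimal sketch of the flexible-schema idea above. Plain Python dictionaries stand in for documents and a list stands in for a collection; no MongoDB driver is involved, and the field names are made up for illustration.

```python
# Plain dicts stand in for documents; a list stands in for a collection.
posts = [
    {"_id": 1, "title": "Hello", "author": "ann"},
    {"_id": 2, "title": "Schema notes", "author": "bob", "tags": ["nosql", "design"]},
]

# Documents in the same collection need not share the same set of fields.
field_sets = [set(doc) for doc in posts]

# Changing the structure is just adding or deleting a field on a document.
posts[0]["tags"] = ["intro"]   # add a new field
del posts[1]["author"]         # delete an existing field

# Application code has to tolerate missing fields, which is one reason
# schema design still matters.
authors = [doc.get("author", "unknown") for doc in posts]
```

The `doc.get(..., "unknown")` fallback shows the price of flexibility: the application, not the database, decides what a missing field means.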
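10. Embedding the publisher in every book document leads to repetition of publisher data.
If the number of books per publisher is small with limited growth,
storing the book references inside the publisher document can be acceptable.
To avoid mutable, growing arrays,
store the publisher reference inside
the book document.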
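The referencing pattern can be sketched with plain Python dictionaries. This is an illustration under assumptions, not driver code: the field name `publisher_id` and the helper `books_by_publisher` are hypothetical, and the sample data is only for demonstration.

```python
# Dicts stand in for documents; the data is illustrative.
publisher = {"_id": "oreilly", "name": "O'Reilly Media", "founded": 1980}

# Each book stores only the publisher's _id, so publisher data is not
# repeated and the publisher document never grows as books are added.
books = [
    {"_id": 1, "title": "MongoDB: The Definitive Guide", "publisher_id": "oreilly"},
    {"_id": 2, "title": "50 Tips and Tricks for MongoDB Developers", "publisher_id": "oreilly"},
]


def books_by_publisher(all_books, publisher_id):
    """Return the titles of all books that reference the given publisher."""
    return [b["title"] for b in all_books if b["publisher_id"] == publisher_id]


titles = books_by_publisher(books, publisher["_id"])
```

Adding a third book only appends one small book document; the publisher document is left untouched.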
11. With the referenced (normalized) data model,
if your application frequently retrieves
the address data with the name information,
it needs to issue multiple queries to resolve the reference.
With the embedded data model,
your application can retrieve the
complete patron information with one query.
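A rough sketch of the difference, counting lookups against in-memory dictionaries that stand in for collections. The data and the helper names (`load_patron_referenced`, `load_patron_embedded`) are hypothetical; a real application would issue queries through a driver.

```python
# Referenced model: patron and address live in separate "collections".
patrons = {"joe": {"_id": "joe", "name": "Joe Bookreader"}}
addresses = {"joe": {"patron_id": "joe", "street": "123 Fake Street", "city": "Faketon"}}

# Embedded model: the address is stored inside the patron document.
patrons_embedded = {
    "joe": {
        "_id": "joe",
        "name": "Joe Bookreader",
        "address": {"street": "123 Fake Street", "city": "Faketon"},
    }
}


def load_patron_referenced(patron_id):
    """Two lookups: one for the patron, one to resolve the address."""
    lookups = 0
    patron = dict(patrons[patron_id])
    lookups += 1
    patron["address"] = addresses[patron_id]
    lookups += 1
    return patron, lookups


def load_patron_embedded(patron_id):
    """One lookup returns the complete patron information."""
    return patrons_embedded[patron_id], 1


referenced, referenced_lookups = load_patron_referenced("joe")
embedded, embedded_lookups = load_patron_embedded("joe")
```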
13. Core processes
mongod – database process
mongos – controller/query router of sharded clusters
mongo – interactive MongoDB shell
Import / Export Tools
Binary
mongodump – create BSON dump files
mongorestore – restore BSON dump files
bsondump – convert BSON dump files to JSON
mongooplog – stream oplog entries outside of normal replication
JSON/CSV/TSV
mongoimport – import data
mongoexport – export data
Diagnostic Tools
mongostat – status of currently running mongod or mongos instance
mongotop – per-collection statistics on the amount of time a MongoDB instance spends reading and writing data
mongosniff – a low-level, real-time operation tracing/sniffing view into database activity (Unix-like systems only)
mongoperf – utility to check disk I/O performance independently of MongoDB
GridFS
mongofiles – command-line utility for manipulating files stored as GridFS objects in your MongoDB instance
14. Linux
mongod --dbpath <path to data directory>
Windows
mongod.exe --dbpath <path to data directory>
15. Rich, interactive JavaScript shell
Included in all MongoDB distributions
Comparable to sqlcmd in MS SQL Server or SQL*Plus in Oracle
Supports all commands/queries, including administrative operations
22. Supported drivers: Java, .NET, Ruby, PHP, JavaScript, Node.js, Python,
Perl, Scala and others
Queries are implemented as methods or functions within the API of a specific
programming language, as opposed to a completely separate language like
SQL
[Example here]
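As an illustration of queries expressed through ordinary language constructs rather than a separate query language, here is a small in-memory sketch. The helpers `matches` and `find` are hypothetical and are not driver calls. They only mimic the general shape of a filter document (a dict, with `$gt` and `$lt` as comparison operators) and evaluate it against Python dictionaries.

```python
def matches(doc, query):
    """Return True if doc satisfies a query given as a plain dict.

    Supports equality and the "$gt" / "$lt" comparison operators only.
    """
    for field, condition in query.items():
        value = doc.get(field)
        if isinstance(condition, dict):
            for op, operand in condition.items():
                if op == "$gt":
                    if value is None or not value > operand:
                        return False
                elif op == "$lt":
                    if value is None or not value < operand:
                        return False
                else:
                    raise ValueError(f"unsupported operator: {op}")
        elif value != condition:
            return False
    return True


def find(collection, query):
    """Return every document in the list that matches the query dict."""
    return [doc for doc in collection if matches(doc, query)]


posts = [
    {"title": "A", "author": "ann", "likes": 12},
    {"title": "B", "author": "bob", "likes": 3},
    {"title": "C", "author": "ann", "likes": 5},
]

# The query is just a data structure built in the host language.
popular_by_ann = find(posts, {"author": "ann", "likes": {"$gt": 10}})
```

Because the query is a normal dictionary, it can be built, composed, and validated with the same tools as the rest of the application code.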
23. Index types:
Unique Indexes
Compound Indexes
Array Indexes - For fields that contain an array, each array value is stored as a separate index
entry
TTL (Time to Live) Indexes - allow the user to specify a period of time after which the data
will automatically be deleted from the database
Geospatial Indexes - optimize queries related to location within a two-dimensional space
Sparse Indexes - allow for smaller, more efficient indexes when fields are not present in all
documents.
Text Search Indexes - use advanced, language-specific linguistic rules for stemming,
tokenization and stop words
Covered Queries - Queries that return results containing only indexed fields can be
returned without reading from the source documents
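To make the array index idea concrete, the sketch below builds a toy index in memory in which each array value becomes its own entry. It is a conceptual model only and says nothing about how MongoDB stores indexes internally. The function `build_array_index` and the sample documents are hypothetical.

```python
from collections import defaultdict

# Document id -> document; dicts stand in for stored documents.
docs = {
    1: {"title": "A", "tags": ["nosql", "mongodb"]},
    2: {"title": "B", "tags": ["mongodb"]},
    3: {"title": "C"},  # no "tags" field, so it contributes no entries
}


def build_array_index(documents, field):
    """Map each array element to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        for value in doc.get(field, []):  # one index entry per array value
            index[value].add(doc_id)
    return index


tag_index = build_array_index(docs, "tags")

# A lookup by tag touches only the index, not every document.
mongodb_ids = sorted(tag_index["mongodb"])
```

Document 3 has no `tags` field and therefore produces no index entries, which is also the intuition behind sparse indexes being smaller.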
25. Sharding and replica sets:
- automatic sharding provides horizontal scalability
- replica sets help prevent database downtime
26. Sharding, or horizontal scaling, divides the data set and distributes the
data over multiple servers, or shards. Each shard is an independent
database, and collectively, the shards make up a single logical database.
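The idea of distributing one logical data set over several shards can be sketched as follows. This is a deliberately simplified model that hashes a key and takes it modulo the number of shards. It is not how MongoDB assigns data to shards, and the shard names and the functions `shard_for` and `distribute` are hypothetical.

```python
import hashlib

SHARDS = ["shard0", "shard1", "shard2"]  # hypothetical shard names


def shard_for(key, shards=SHARDS):
    """Pick a shard deterministically by hashing the shard key value."""
    digest = hashlib.sha256(str(key).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


def distribute(documents, key_field):
    """Place every document on the shard chosen by its key field."""
    placement = {name: [] for name in SHARDS}
    for doc in documents:
        placement[shard_for(doc[key_field])].append(doc)
    return placement


users = [{"user_id": i, "name": f"user{i}"} for i in range(100)]
placement = distribute(users, "user_id")

# Each shard holds only part of the data; together they hold all of it.
total = sum(len(chunk) for chunk in placement.values())
```

A query that includes the key can be routed to a single shard with `shard_for`, while a query without it would have to ask every shard.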
27. Replication provides redundancy and increases data availability.
With multiple copies of data on different database servers, replication
protects a database from the loss of a single server
28. Identify and target the most frequent (>80%) data access pattern
A flexible schema supports agile development; be prepared to change the data
model as improvements are identified
For storage efficiency:
Set the _id field explicitly (otherwise it defaults to a 12-byte ObjectId)
Use shorter field names
Embed documents (data model consideration)
Use indexes and profiling for performance
docs.mongodb.org has a wealth of resources (offline copies are available
at http://docs.mongodb.org/manual/about)
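Field names are stored in every document, so shorter names reduce the size of each one. The sketch below estimates the effect, using compact JSON length as a rough proxy for the stored size. That proxy is an assumption for illustration (the actual on-disk encoding is BSON), and the field names are made up.

```python
import json

long_names = {"first_name": "Ann", "last_name": "Lee", "date_of_birth": "1990-01-01"}
short_names = {"fn": "Ann", "ln": "Lee", "dob": "1990-01-01"}


def encoded_size(doc):
    """Size in bytes of the document as compact JSON (a proxy, not BSON)."""
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


# The values are identical, so the difference comes only from the field names.
saved_per_doc = encoded_size(long_names) - encoded_size(short_names)
saved_per_million_docs = saved_per_doc * 1_000_000
```

The saving per document is small, but it is repeated for every document in the collection, so it is worth weighing against the loss in readability.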
29. Documentation (http://docs.mongodb.org)
Free Online Training (http://university.mongodb.com)
Presentations (http://mongodb.com/presentations)
Case Studies (http://mongodb.com/customers)
Json.NET (http://www.newtonsoft.com/json)