MongoDb - Details on the POC
Presentation Transcript

  • Goodbye rows and tables, hello documents and collections
  • Lots of pretty pictures to fool you.
  • Noise
  • Introduction
    • MongoDB bridges the gap between key-value stores (which are fast and highly scalable) and traditional RDBMS systems (which provide rich queries and deep functionality).
    • MongoDB is document-oriented, schema-free, scalable, high-performance, and open source.
    • Written in C++.
    • Mongo is not a relational database like MySQL.
    • Goodbye rows and tables, hello documents and collections.
    • Features
    • Document-oriented
      • Documents (objects) map nicely to programming language data types
      • Embedded documents and arrays reduce need for joins
      • No joins and no multi-document transactions for high performance and easy scalability
    • High performance
      • The absence of joins, combined with embedding, makes reads and writes fast
      • Indexes including indexing of keys from embedded documents and arrays
    • High availability
      • Replicated servers with automatic master failover
    • Easy scalability
      • Automatic sharding (auto-partitioning of data across servers)
        • Reads and writes are distributed over shards
        • No joins or multi-document transactions make distributed queries easy and fast
      • Eventually-consistent reads can be distributed over replicated servers
    • Cost - MongoDB is free
    • MongoDB is easy to install.
    • MongoDB supports various programming languages such as C, C++, Java, JavaScript, and PHP.
    • MongoDB is blazingly fast
    • MongoDB is schemaless
    • Ease of scale-out
    • If load increases it can be distributed to other nodes across computer networks.
    • It's trivially easy to add more fields -- even complex fields -- to your objects.
    • So as requirements change, you can adapt code quickly.
    • Background Indexing
    • MongoDB is a stand-alone server
    • Development time is faster, too, since there are no schemas to manage.
    • It supports server-side JavaScript execution, which allows a developer to use a single programming language for both client-side and server-side code.
    Why?
    • Mongo is limited to a total data size of 2GB for all databases in 32-bit mode.
    • No referential integrity
    • Data size in MongoDB is typically higher.
    • At the moment Map/Reduce (e.g. for aggregations and data analysis) is OK, but not blisteringly fast.
    • Group By is limited to fewer than 10,000 keys; for larger grouping operations without limits, use Map/Reduce.
    • Lack of predefined schema is a double-edged sword
    • No support for Joins & transactions
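The Map/Reduce grouping mentioned above can be illustrated with a small in-memory sketch. This is plain Python rather than MongoDB's own mapReduce command, and the `orders` documents and their field names are hypothetical, chosen only to show the emit-then-reduce pattern.

```python
from collections import defaultdict


def map_reduce(documents, map_fn, reduce_fn):
    """Collect the (key, value) pairs emitted by map_fn, then reduce each group."""
    groups = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


# Hypothetical sample documents (field names are assumptions for illustration).
orders = [
    {"customer": "alice", "amount": 30},
    {"customer": "bob", "amount": 15},
    {"customer": "alice", "amount": 20},
]


def map_order(doc):
    # Emit one (key, value) pair per document: group by customer.
    yield doc["customer"], doc["amount"]


def reduce_total(key, values):
    # Combine all values emitted for one key into a single result.
    return sum(values)


totals = map_reduce(orders, map_order, reduce_total)
print(totals)
```

The same shape (a map step that emits keyed values and a reduce step that folds them) is what a grouping with an unbounded number of keys would rely on.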
  • Benchmarking (MongoDB vs. MySQL)
    • Test machine configuration
      • CPU: Intel Xeon 1.6 GHz, quad core, 64-bit
      • Memory: 8 GB RAM
      • OS: CentOS 5.2, kernel 2.6.18, 64-bit
    • Record structure
      • Field1: String, indexed
      • Field2: String, indexed
      • Field3: Date, not indexed
      • Field4: Integer, indexed
  • Mongo data model
    • A Mongo system (see deployment above) holds a set of databases
    • A database holds a set of collections
    • A collection holds a set of documents
    • A document is a set of fields
    • A field is a key-value pair
    • A key is a name (string)
    • A value is a
      • basic type like string, integer, float, timestamp, binary, etc.,
      • a document, or
      • an array of values
    MySQL Term     Mongo Term
    database       database
    table          collection
    index          index
    row            BSON document
    column         BSON field
    primary key    _id field
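To make the data model concrete, here is a minimal sketch that represents a document as a nested Python dictionary and a collection as a list. The blog-post fields are hypothetical; only `_id` comes from the terminology above. It also shows the schema-free idea: a second document in the same collection can carry a different set of fields.

```python
from datetime import datetime, timezone

# A document is a set of fields; each field is a key-value pair.
# All field names except "_id" are assumptions made for illustration.
post = {
    "_id": 1,                                              # plays the role of a primary key
    "title": "Hello documents",                            # basic type: string
    "created": datetime(2010, 1, 1, tzinfo=timezone.utc),  # basic type: timestamp
    "author": {"name": "alice", "city": "Pune"},           # embedded document
    "tags": ["mongodb", "nosql"],                          # array of values
    "comments": [                                          # array of embedded documents
        {"by": "bob", "text": "Nice post"},
    ],
}

# A collection holds a set of documents (modelled here as a plain list).
collection = [post]

# No predefined schema: this document adds "rating" and omits "tags".
collection.append({"_id": 2, "title": "Second post", "rating": 5})

# Embedded data is read directly, with no join against another table.
author_name = post["author"]["name"]
first_comment = post["comments"][0]["text"]
print(author_name, first_comment)
```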
  • SQL to Mongo Mapping Chart (columns: SQL Statement, Mongo Statement)
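As one illustration of the kind of mapping such a chart covers, a SQL `WHERE` clause corresponds to a filter document in Mongo, for example `WHERE age > 33` and `{"age": {"$gt": 33}}`. The sketch below evaluates a small subset of such filters against in-memory dictionaries. It is a stand-in written for illustration, not a MongoDB driver, and the `users` data and the `matches` and `find` helpers are hypothetical.

```python
import operator

# A small subset of the comparison operators used in Mongo filter documents.
OPERATORS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(doc, query):
    """Return True if doc satisfies every condition in the filter document."""
    for field, condition in query.items():
        value = doc.get(field)
        if isinstance(condition, dict):
            # Operator form, e.g. {"age": {"$gt": 33}}
            for op, operand in condition.items():
                if value is None or not OPERATORS[op](value, operand):
                    return False
        elif value != condition:
            # Equality form, e.g. {"name": "bob"}
            return False
    return True


def find(collection, query):
    """In-memory analogue of a find call with a filter document."""
    return [doc for doc in collection if matches(doc, query)]


# Hypothetical sample data.
users = [
    {"name": "alice", "age": 40},
    {"name": "bob", "age": 25},
    {"name": "carol", "age": 33},
]

# SELECT * FROM users WHERE age > 33
older = find(users, {"age": {"$gt": 33}})

# SELECT * FROM users WHERE name = 'bob'
bobs = find(users, {"name": "bob"})

print(older, bobs)
```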
  • Replication / Sharding
    • Data Redundancy
    • Automated Failover
    • Distribute read load
    • Simplify maintenance (compared to "normal" master-slave)
    • Disaster recovery from user error
    • Automatic balancing for changes in load and data distribution
    • Easy addition of new machines
    • Scaling out to one thousand nodes
    • No single points of failure
    • Automatic failover
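The idea of partitioning data across servers can be sketched as follows. This is a deliberately simple hash-based partitioner, assumed here for illustration; it does not reproduce MongoDB's actual chunk splitting or automatic balancing. The `user_id` shard key and the helper names are hypothetical.

```python
import hashlib


def shard_for(shard_key_value, shard_count):
    """Map a shard key value to a shard index using a stable hash."""
    digest = hashlib.md5(str(shard_key_value).encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def distribute(documents, shard_key, shard_count):
    """Place each document on the shard chosen by its shard key."""
    shards = [[] for _ in range(shard_count)]
    for doc in documents:
        shards[shard_for(doc[shard_key], shard_count)].append(doc)
    return shards


# Hypothetical documents, sharded on an assumed "user_id" field.
docs = [{"user_id": i, "name": f"user{i}"} for i in range(100)]
shards = distribute(docs, "user_id", 3)

# Each shard holds part of the data, so reads and writes for a given key
# only need to reach the one shard that owns it.
for index, shard in enumerate(shards):
    print(index, len(shard))
```

A fixed modulus like this would force most documents to move whenever a machine is added, which is one reason a real system tracks ranges of keys and rebalances them instead.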