MongoDB


Published in: Education, Technology

  2. MongoDB Course Topics
     - Module 1: Introduction and Overview (NoSQL)
     - Module 2: CRUD Operations (CRUD Concerns)
     - Module 3: Schema Design and Data Modeling (Comparison with Relational Systems)
     - Module 4: Administration (Backup and Recovery)
     - Module 5: Scalability and Availability (Replication and Sharding)
     - Module 6: Indexing and Aggregation Framework (Performance Tuning)
     - Module 7: Application Engineering and MongoDB Tools (Interface with Other Languages)
     - Module 8: Project, Additional Concepts and Case Studies
  3. MongoDB Course Module 01: Design Goals, Architecture and Installation
  4. How It Works
     - Online instructor-led live classes
     - Class recordings in the Learning Management System (LMS)
     - Module-wise quizzes and practical assignments
     - 24x7 on-demand technical support
     - Project-based verifiable graded certificate
     - Lifetime access to the Learning Management System
  5. Topics of the Day
     - Database Categories
     - MongoDB Overview
     - Design Goals for MongoDB Server and Database
     - MongoDB Tools
     - Introduction to JSON and BSON
     - Installation of MongoDB on Windows, Linux, Mac OS, etc.
     - Environment Setup for MongoDB
  6. Database Categories
     - RDBMS/OLTP/Real Time: Oracle, MySQL, MS SQL, DB2, etc.
     - OLAP/DSS/DW: Netezza, SAP HANA, Oracle Express, etc.
     - NoSQL/NewSQL/Big Data: MongoDB, HBase, Cassandra, CouchDB, etc.
  7. OLTP Database
     - RDBMS/OLTP/Real Time: Oracle, MySQL, MS SQL, DB2, etc.
     - Applications: Online Sales App, ATM Transaction, Retail Transaction, Others
  8. OLTP Database
     - RDBMS: Oracle, SQL Server, MySQL, DB2
     - OLAP/DSS/DW: Netezza, SAP HANA, Oracle Express, etc.
  9. NoSQL Databases
     - RDBMS: Oracle, SQL Server, MySQL, DB2
     - OLAP/DSS/DW: Netezza, SAP HANA, Oracle Express, etc.
     - NoSQL/NewSQL/Big Data: MongoDB, HBase, Cassandra, CouchDB, etc.
     - Applications: Online Sales App, Retail Transaction, Others
  10. What is NoSQL?
      - Next-generation databases
      - Not Only SQL
      - Non-relational
      - Distributed architecture
      - Open source
      - Horizontally scalable
  11. What is NoSQL?
      - Schema-free
      - Easy replication
      - Simple API
      - Can manage huge amounts of data
      - Can be implemented on commodity hardware
      - About 150 NoSQL databases are on the market
  12. Why NoSQL?
      - Nature of data
      - Application development (high coding velocity and agility)
  13. Why NoSQL?
      - Operational issues (scale, performance, high availability)
  14. Why NoSQL?
      - Data warehousing and analytics
  15. Benefit of NoSQL
      - Large volumes of structured, semi-structured, and unstructured data
  16. Benefit of NoSQL
      - Agile development, quick changes, and frequent code pushes
      - Agile methodologies: Scrum, Design, Feature Driven Development, Extreme Programming, and many more
  17. Benefit of NoSQL
      - Object-oriented programming that is easy to use and flexible
      - Horizontal scaling instead of expensive hardware
  18. Categories of NoSQL Database
      - Document databases pair each key with a complex data structure known as a document. Documents can contain many different key-value pairs, key-array pairs, or even nested documents.
      - Graph stores are used to store information about networks, such as social connections. Graph stores include Neo4J and HyperGraphDB.
      - Key-value stores are the simplest NoSQL databases. Every single item in the database is stored as an attribute name (or "key"), together with its value.
      - Wide-column stores such as Cassandra and HBase are optimized for queries over large datasets, and store columns of data together, instead of rows.
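As a rough illustration of the four categories above, the same small piece of data can be modelled in each style with plain JavaScript structures. No database is involved, and all names and values (the users, the city names, the `FRIEND_OF` edge type) are made up for this sketch.

```javascript
// Hypothetical data: the same users modelled in four NoSQL styles.

// Key-value store: an opaque value looked up by a single key.
const keyValueStore = new Map();
keyValueStore.set("user:1:name", "Annie");
keyValueStore.set("user:1:city", "Pune");

// Document store: each key is paired with a (possibly nested) document.
const documentStore = new Map();
documentStore.set("user:1", {
  name: "Annie",
  address: { city: "Pune" },      // nested document
  tags: ["quizzes", "puzzles"],   // key-array pair
});

// Wide-column store: the values of one column are kept together.
const wideColumnStore = {
  name: { "user:1": "Annie", "user:2": "Bob" },
  city: { "user:1": "Pune", "user:2": "Delhi" },
};

// Graph store: nodes plus edges describing connections.
const graphStore = {
  nodes: ["Annie", "Bob"],
  edges: [{ from: "Annie", to: "Bob", type: "FRIEND_OF" }],
};

// Reading one column touches only that column's data.
const allCities = Object.values(wideColumnStore.city);

// Following edges answers "who is connected to whom".
const friendsOfAnnie = graphStore.edges
  .filter((e) => e.from === "Annie" && e.type === "FRIEND_OF")
  .map((e) => e.to);

console.log(keyValueStore.get("user:1:name"));
console.log(documentStore.get("user:1").address.city);
console.log(allCities, friendsOfAnnie);
```

The point of the sketch is the shape of the data, not performance: each category makes a different kind of lookup the natural one.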
  19. Types of NoSQL Databases
      - Key-Value Stores: Memcached, Coherence, Redis
      - Tabular: BigTable, HBase, Accumulo
      - Document Oriented: MongoDB, CouchDB, Cloudant
      - Graph Stores: Neo4J, Oracle NoSQL, HyperGraphDB
  20. NoSQL vs. SQL Comparison
      Entity      | SQL Databases                         | NoSQL Databases
      Type        | One type (SQL) with minor variations  | Many types (document, key-value, tabular, graph)
      Development | 1970                                  | 2000
      Examples    | Oracle, MSSQL, DB2, etc.              | MongoDB, Cassandra, HBase, Neo4J
      Schemas     | Fixed                                 | Dynamic
      Scaling     | Vertical                              | Horizontal
      Dev Model   | Mix                                   | Open Source
      Consistency | Follow ACID                           | Follow BASE
  21. SQL Databases (ACID Property)
      - Atomic: A transaction is a logical unit of work which must be either completed with all of its data modifications, or none of them is performed.
      - Consistent: At the end of the transaction, all data must be left in a consistent state.
      - Isolated: Modifications of data performed by a transaction must be independent of another transaction. Unless this happens, the outcome of a transaction may be erroneous.
      - Durable: When the transaction is completed, effects of the modifications performed by the transaction must be permanent in the system.
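The "all or nothing" idea behind atomicity can be sketched with an in-memory money transfer. This is only an illustration, not how a real database implements transactions; the `transfer` function, the account names and the "no negative balance" rule are assumptions made for the example.

```javascript
// Minimal in-memory sketch of atomicity (all-or-nothing), not a real database.
function transfer(accounts, from, to, amount) {
  // Work on a copy so the original state survives a failure.
  const draft = { ...accounts };
  draft[from] -= amount;
  draft[to] += amount;
  // Consistency rule assumed for this example: no negative balances.
  if (draft[from] < 0) {
    throw new Error("insufficient funds, transaction rolled back");
  }
  // Commit: only now do both changes become visible together.
  Object.assign(accounts, draft);
}

const accounts = { alice: 100, bob: 50 };
transfer(accounts, "alice", "bob", 30); // succeeds: both balances change

let failed = false;
try {
  transfer(accounts, "alice", "bob", 500); // violates the rule
} catch (err) {
  failed = true; // neither balance was touched
}
console.log(accounts, failed);
```

After the failed call the balances are exactly what the first, successful transfer left behind, which is the behaviour the Atomic and Consistent properties describe.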
  22. CAP Theorem
      We must understand the CAP theorem when we talk about NoSQL databases, or in fact when designing any distributed system. The CAP theorem states that three basic requirements exist in a special relation when designing applications for a distributed architecture:
      - Consistency: the data in the database remains consistent after the execution of an operation. For example, after an update operation all clients see the same data.
      - Availability: the system is always on (guaranteed service availability), with no downtime.
      - Partition Tolerance: the system continues to function even if the communication among the servers is unreliable, i.e. the servers may be partitioned into multiple groups that cannot communicate with one another.
  23. CAP Theorem
      - In theory, it is impossible to fulfill all three requirements at the same time.
      - A distributed system can therefore satisfy only two of the three requirements.
      - Current NoSQL databases accordingly follow different combinations of C, A and P from the CAP theorem.
  24. CAP Theorem
      Here is a brief description of the three combinations CA, CP and AP:
      - CA: Single-site cluster, therefore all nodes are always in contact. When a partition occurs, the system blocks.
      - CP: Some data may not be accessible, but the rest is still consistent/accurate.
      - AP: System is still available under partitioning, but some of the data returned may be inaccurate.
  25. BASE
      - Basically Available: the system does guarantee availability, in terms of the CAP theorem.
      - Soft State: the state of the system may change over time, even without input. This is because of the eventual consistency model.
      - Eventual Consistency: the system will become consistent over time, given that the system doesn't receive input during that time.
      A BASE system gives up on immediate (strong) consistency.
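A toy model can make the BASE properties above more concrete. The sketch below is an assumption-laden illustration: the `Replica` and `Cluster` classes are hypothetical, the "background" synchronisation is triggered by hand, and no real database replicates exactly this way.

```javascript
// Toy model of eventual consistency: a write reaches one replica first
// and is copied to the others later.
class Replica {
  constructor() { this.data = {}; }
  read(key) { return this.data[key]; }
}

class Cluster {
  constructor(size) {
    this.replicas = Array.from({ length: size }, () => new Replica());
    this.pending = []; // updates not yet applied everywhere
  }
  // Accept the write immediately on replica 0 (basically available).
  write(key, value) {
    this.replicas[0].data[key] = value;
    this.pending.push({ key, value });
  }
  // Background step: propagate queued updates to every replica.
  sync() {
    for (const { key, value } of this.pending) {
      for (const replica of this.replicas) replica.data[key] = value;
    }
    this.pending = [];
  }
}

const cluster = new Cluster(3);
cluster.write("likes", 100);
const staleRead = cluster.replicas[2].read("likes"); // not synced yet
cluster.sync();
const freshRead = cluster.replicas[2].read("likes"); // consistent after sync
console.log(staleRead, freshRead);
```

Between `write` and `sync` the replicas disagree (soft state); once no new input arrives and `sync` has run, every replica returns the same value (eventual consistency).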
  26. Annie’s Introduction
      Hello there! My name is Annie. I love quizzes and puzzles, and I am here to make you think and answer my questions.
  27. Annie’s Question
      Map each of the following to its database type: MongoDB, Neo4J, Cassandra, HBase.
  28. Annie’s Answer
      - MongoDB: document-oriented database
      - Neo4J: graph database
      - Cassandra: columnar database
      - HBase: tabular database
  29. NoSQL vs. SQL Comparison
      - Structured data
      - Unstructured and semi-structured data: text, log files, click streams, blogs, tweets, audio, video, etc.
  30. Before Selection and Implementation of NoSQL
      - Step 1: Right data model
      - Step 2: Pros and cons of consistency
      - Step 3: Compromising features of RDBMS
  31. NoSQL Landscape
  32. Annie’s Question
      Which concept does NoSQL follow? Choose from the list below:
      1. ACID
      2. CAP
      3. BASE
  33. Annie’s Answer
      BASE
  34. MongoDB Overview
  35. MongoDB Overview
      - MongoDB is an open-source database developed by 10gen for a wide variety of applications.
      - It is an agile database that allows schemas to change quickly as applications evolve.
      - It provides scalability, high performance and availability by leveraging in-memory computing.
      - MongoDB’s native replication and automated failover enable enterprise-grade reliability and operational flexibility.
  36. New World
      - New Apps, New Development Methods, New Data Volumes, New Architectures, New Data Types
      - Create, Release, Evolve, Involve
  37. What is MongoDB?
      - Open source
      - Document-oriented storage
      - Object oriented
      - Written in C++
      - Easy to use
      - Full index support
  38. What is MongoDB?
      - Replication and high availability
      - Auto-sharding
      - Easy query
      - MapReduce
      - GridFS
      - Support from experts
      - Diagram labels: Application, MongoDB Replica Set
  39. Key MongoDB Features
      - Flexibility
      - Power
      - Easy to use
      - Speed/Scaling
  40. MongoDB Features
      - Built-in replication for high availability
      - Drivers: 13 MongoDB-supported drivers; 37 community-supported drivers
      - Hadoop integration
      - Aggregation framework and native MapReduce
      - Rich secondary indexes, including geospatial and TTL indexes
      - Auto-sharding for horizontal scalability
      - JSON data model with dynamic schemas
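The idea behind sharding for horizontal scalability is that each document is routed to one of several machines based on a shard key. The sketch below shows one simple way to do that with a hash. It is not MongoDB's actual sharding algorithm; the `simpleHash` function, the shard names and the user ids are assumptions made for illustration.

```javascript
// Illustrative shard routing: pick a shard from a hash of the shard key.
function simpleHash(text) {
  let hash = 0;
  for (const ch of text) {
    // Multiply-and-add hash, kept as an unsigned 32-bit integer.
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0;
  }
  return hash;
}

// The same key always maps to the same shard.
function shardFor(shardKey, shards) {
  return shards[simpleHash(String(shardKey)) % shards.length];
}

const shards = ["shard-a", "shard-b", "shard-c"];
const placement = {};
for (const userId of ["user1", "user2", "user3", "user4"]) {
  placement[userId] = shardFor(userId, shards);
}
console.log(placement);
```

Adding capacity then means adding entries to `shards` rather than buying a bigger single server, although a real system also has to move existing data when the shard list changes, which this sketch ignores.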
  41. MongoDB High Level Architecture
      - Applications: analytical tools, BI apps, mobile apps, CRM, ERP, etc.
      - Data Management: Online Data, Offline Data
      - Data stores shown: mongod, RDBMS, DW, EDW
      - Infrastructure: OS and virtualization, storage, network, etc.
      - Management and Monitoring
      - Security and Auditing
  42. Annie’s Question
      Which kind of data can be processed with MongoDB? Choose from the options below:
      1. Online Data
      2. Offline Data
      3. Both
  43. Annie’s Answer
      Both
  44. MongoDB Licensing
      - MongoDB: Free Software Foundation’s license
      - MongoDB Enterprise: commercial licenses
  45. MongoDB Enterprise
      - MongoDB Enterprise is the commercial edition of MongoDB that provides enterprise-grade capabilities.
      - MongoDB Enterprise includes advanced security features, management tools, software integrations and certifications.
      - These value-added capabilities are not included in the open-source edition of MongoDB.
  46. MongoDB Enterprise Includes
      - Certified OS support
      - Advanced security
      - On-demand training
      - Enterprise software integration
      - Management
  47. MongoDB Enterprise Includes
      Features                                   | MongoDB | MongoDB Enterprise
      JSON Data Model with Dynamic Schemas       | Yes     | Yes
      Auto-Sharding for Horizontal Scalability   | Yes     | Yes
      Built-In Replication and High Availability | Yes     | Yes
      Full, Flexible Index Support               | Yes     | Yes
      Rich Document Queries                      | Yes     | Yes
      Fast In-Place Updates                      | Yes     | Yes
      Aggregation Framework and MapReduce        | Yes     | Yes
      Large Media Storage with GridFS            | Yes     | Yes
      Text Search                                | Yes     | Yes
      Cloud, On-Premise and Hybrid Deployments   | Yes     | Yes
      Role-Based Privileges                      | Yes     | Yes
      Advanced Security with Kerberos            | No      | Yes
      On-Prem Management                         | No      | Yes
      SNMP Support                               | No      | Yes
      OS Certifications                          | No      | Yes
      Private On-Demand Training                 | No      | Yes
  48. Use Cases
      - Mobile and Social Infrastructure
      - Content Management
      - Big Data
      - Customer Data Management
      - Data Hub
  49. Annie’s Question
      Can you give an example of Big Data?
  50. Annie’s Answer
      1. Facebook ingests 500 terabytes of new data every day.
      2. A Boeing 737 will generate 240 terabytes of flight data during a single flight across the US.
  51. A Few MongoDB Clients
  52. A Few MongoDB Clients
      - MetLife uses MongoDB for “The Wall”, an innovative customer service application that provides a 360-degree, consolidated view of MetLife customers, including policy details and transactions across lines of business.
      - eBay has a number of projects running on MongoDB for search suggestions, metadata storage, cloud management and merchandizing categorization.
      - MongoDB is the repository that powers MTV Networks’ next-generation CMS, which is used to manage and distribute content for all of MTV Networks’ major websites.
      - MongoDB is used for back-end storage on the SourceForge front pages, project pages, and download pages for all projects.
      - Craigslist uses MongoDB to archive billions of records.
      - ADP uses MongoDB for its high performance, scalability, reliability and its ability to preserve the data manipulation capabilities of traditional relational databases.
  53. A Few MongoDB Clients
      - CNN Turk uses MongoDB for its infrastructure and content management system, including the [...]
      - Foursquare uses MongoDB to store venues and user ‘check-ins’ into venues, sharding the data over more than 25 machines on Amazon EC2.
      - [...] is the easy, fun, and fast way to share live video online. MongoDB powers [...]’s internal analytics tools for virality, user retention, and general usage stats that out-of-the-box solutions can’t provide.
      - ibibo (‘I build, I bond’) is a social network using MongoDB for its dashboard feeds. Each feed is represented as a single document containing an average of 1000 entries; the site currently stores over two million of these documents in MongoDB.
  54. Industry/Domains Where MongoDB is Used
      - Government
      - Financial Services
      - Media and Entertainment
      - Retail
      - Telecommunications
      - Healthcare
  55. Financial Services
      - Risk Analytics and Reporting
      - Reference Data Management
      - Market Data Management
      - Portfolio Management
      - Order Capture
      - Time Series Data
  56. Government
      - Surveillance Data Aggregation
      - Crime Data Management and Analytics
      - Citizen Engagement Platform
      - Program Data Management
      - Healthcare Record Management
  57. Health Care
      - 360-Degree Patient View
      - Population Management for At-Risk Demographics
      - Lab Data Management and Analytics
      - Mobile Apps for Doctors and Nurses
      - Electronic Healthcare Records (EHR)
  58. Media and Entertainment
      - Content Management and Delivery
      - User Data Management
      - Digital Asset Management
      - Mobile and Social Apps
      - Content Archiving
  59. Retail
      - Rich Product Catalogs
      - Customer Data Management
      - New Services
      - Digital Coupons
      - Real-Time Price Optimization
  60. Telecommunication
      - Consumer Cloud
      - Product Catalog
      - Customer Service Improvement
      - Machine-to-Machine (M2M) Platform
      - Real-Time Network Analysis and Optimization
  61. MongoDB Tools
  62. MongoDB Package Components
      mongod, mongos, mongo, mongod.exe, mongos.exe, mongodump, mongorestore, bsondump, mongooplog, mongoimport, mongoexport, mongostat, mongotop, mongosniff, mongoperf, mongofiles
  63. MongoDB Package Components (Tools)
      - Core Processes: mongod, mongos, mongo
      - Windows Services: mongod.exe, mongos.exe
      - Binary Import and Export Tools: mongodump, mongorestore, bsondump, mongooplog
      - Data Import and Export Tools: mongoimport, mongoexport
      - Diagnostic Tools: mongostat, mongotop, mongosniff, mongoperf
      - GridFS: mongofiles
  64. Annie’s Question
      Which binaries do we use for data import and export?
  65. Annie’s Answer
      mongoimport and mongoexport
  66. Annie’s Question
      Which binaries do we use for backup and recovery?
  67. Annie’s Answer
      mongodump and mongorestore
  68. Annie’s Question
      Which binary do we use to process large files?
  69. Annie’s Answer
      mongofiles
  70. mongod and MongoDB Database
      - mongod is the primary daemon process for the MongoDB system. It handles data requests, manages data format, and performs background management operations.
      - A database is a physical container for collections.
      - Each database gets its own set of files on the file system.
      - A single MongoDB server typically has multiple databases.
  71. MongoDB Collection
      - A collection is a group of MongoDB documents.
      - It is the equivalent of an RDBMS table.
      - A collection exists within a single database.
      - Collections do not enforce a schema.
      - Documents within a collection can have different fields.
      - Typically, all documents in a collection are of similar or related purpose.
      - Diagram: mongod holding Collection 1, Collection 2, Collection 3, ... Collection n
  72. MongoDB Document
      - A document is a set of key-value pairs.
      - Documents have a dynamic schema.
      - Diagram: Collection 1 to Collection 4 holding DOC 1 to DOC 12
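To make "collections do not enforce a schema" concrete, here is a small in-memory stand-in for a collection. The `Collection` class and its `insert` and `find` methods are hypothetical and are not the MongoDB driver API; the sketch only mimics the behaviour described in the slides above.

```javascript
// In-memory stand-in for a collection; this is not the MongoDB driver API.
class Collection {
  constructor() {
    this.docs = [];
    this.nextId = 1;
  }
  insert(doc) {
    // Add an _id when the caller did not supply one.
    const stored = { _id: doc._id ?? this.nextId++, ...doc };
    this.docs.push(stored);
    return stored._id;
  }
  // Return documents whose fields equal every field in the query.
  find(query) {
    return this.docs.filter((d) =>
      Object.entries(query).every(([key, value]) => d[key] === value));
  }
}

const courses = new Collection();
// Two documents with different fields live in the same collection.
courses.insert({ title: "MongoDB", likes: 100 });
courses.insert({ title: "Hadoop", tags: ["big data"], duration: "6 weeks" });

const found = courses.find({ title: "Hadoop" });
console.log(found[0].duration);
console.log(courses.find({}).length);
```

Nothing rejects the second document for having `tags` and `duration` instead of `likes`; keeping documents in one collection similar is a convention, not a rule the collection enforces.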
  73. MongoDB Sample Document
      {
        _id: ObjectId("7df78ad8902c"),  // id as shown on the slide; a real ObjectId has 24 hex characters
        title: 'edureka',
        description: 'Leading Training Provider Across Globe',
        by: 'edureka',
        url: '',
        tags: ['mongodb', 'database', 'NoSQL'],
        likes: 100,
        comments: [
          {
            user: 'user1',
            message: 'My first comment',
            dateCreated: new Date(2011, 1, 20, 2, 15),  // JavaScript months are zero-based: 1 = February
            like: 0
          },
          {
            user: 'user2',
            message: 'My second comments',
            dateCreated: new Date(2011, 1, 25, 7, 45),
            like: 5
          }
        ]
      }
  74. RDBMS Terminology with MongoDB
      RDBMS                     | MongoDB
      Database                  | Database
      Table                     | Collection
      Tuple/Row                 | Document
      Column/Attribute/Variable | Field
      Table Join                | Embedded Documents
      Primary Key               | Primary Key (default key _id provided by MongoDB itself)
      Database Server and Client
      mysqld/Oracle             | mongod
      mysql/sqlplus             | mongo
  75. Annie’s Question
      Can we join two or more documents to get the desired outcome in MongoDB?
  76. Annie’s Answer
      Yes, we can do it through linking.
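The two ways of relating data mentioned in the slides, embedded documents and linking, can be contrasted in a short sketch. The documents, the ids and the `resolveLinks` helper are made up for illustration, and the "join" is done in application code rather than by any database feature.

```javascript
// 1) Embedding: the comments live inside the post document itself.
const embeddedPost = {
  _id: 1,
  title: "edureka",
  comments: [{ user: "user1", message: "My first comment" }],
};

// 2) Linking: the post stores ids that point at documents kept elsewhere.
const users = [
  { _id: "u1", name: "user1" },
  { _id: "u2", name: "user2" },
];
const linkedPost = { _id: 2, title: "edureka", likedBy: ["u1", "u2"] };

// Resolve the links with a second lookup, replacing each id by its document.
function resolveLinks(post, userCollection) {
  const byId = new Map(userCollection.map((u) => [u._id, u]));
  return { ...post, likedBy: post.likedBy.map((id) => byId.get(id)) };
}

const joined = resolveLinks(linkedPost, users);
console.log(embeddedPost.comments[0].message);
console.log(joined.likedBy.map((u) => u.name));
```

Embedding gives everything in one read; linking avoids duplicating the user data but needs the extra lookup step shown in `resolveLinks`.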
  77. Introduction to JSON and BSON
  78. JSON
      - JSON stands for JavaScript Object Notation
      - Lightweight data-interchange format
      - Easy for humans to read and write
      - Easy for machines to parse and generate
      - Text format that is completely language independent
  79. JSON Structure
      - Object: a collection of name/value pairs
      - Array: an ordered list of values
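The two JSON structures above, objects and arrays, can be seen in a short round trip using JavaScript's built-in `JSON.stringify` and `JSON.parse`. The `course` object and its field values are made up for the example.

```javascript
// An object (name/value pairs) that contains an array (ordered values).
const course = {
  title: "MongoDB",
  modules: 8,
  online: true,
  tags: ["mongodb", "database", "NoSQL"],
  instructor: null,
};

const text = JSON.stringify(course); // object -> JSON text
const parsed = JSON.parse(text);     // JSON text -> object

console.log(text);
console.log(parsed.tags[2]);

// JSON has no date type: a Date is written out as a string.
const withDate = JSON.parse(JSON.stringify({ created: new Date(0) }));
console.log(typeof withDate.created);
```

The last lines hint at why a richer format is useful: after the round trip the date comes back as plain text, so the type information is lost.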
  80. JSON
  81. BSON
      - BSON stands for Binary JSON (a binary encoding of JavaScript Object Notation)
      - Supports the embedding of documents and arrays within other documents and arrays
      - Easy for machines to parse and generate
      - Binary format that is language independent
      - Contains extensions that allow representation of data types that are not part of the JSON spec
  82. Annie’s Question
      What is the difference between JSON and BSON?
  83. Annie’s Answer
      - JSON: JavaScript Object Notation
      - BSON: Binary JSON (represents JSON data in binary format)
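To show what "JSON data in binary format" looks like, the sketch below hand-encodes a flat document into BSON bytes using Node.js `Buffer`. It is a minimal illustration that handles only string and 32-bit integer values; the `encodeBson` function is written for this example and is not a library API, and real applications should rely on an existing BSON library.

```javascript
// Minimal BSON encoder for flat documents with string and int32 values only.
function encodeBson(doc) {
  const parts = [];
  for (const [key, value] of Object.entries(doc)) {
    const name = Buffer.from(key + "\0", "utf8"); // field name as a NUL-terminated string
    if (typeof value === "string") {
      const str = Buffer.from(value + "\0", "utf8");
      const len = Buffer.alloc(4);
      len.writeInt32LE(str.length);                    // length includes the trailing NUL
      parts.push(Buffer.from([0x02]), name, len, str); // 0x02 marks a UTF-8 string
    } else if (Number.isInteger(value)) {
      const num = Buffer.alloc(4);
      num.writeInt32LE(value);
      parts.push(Buffer.from([0x10]), name, num);      // 0x10 marks a 32-bit integer
    } else {
      throw new TypeError("unsupported type in this sketch");
    }
  }
  const body = Buffer.concat(parts);
  const size = Buffer.alloc(4);
  size.writeInt32LE(4 + body.length + 1); // size prefix + elements + terminator
  return Buffer.concat([size, body, Buffer.from([0x00])]);
}

const bytes = encodeBson({ hello: "world" });
console.log(bytes.toString("hex"));
// 160000000268656c6c6f0006000000776f726c640000
console.log(bytes.length, JSON.stringify({ hello: "world" }).length);
```

Each value carries a type byte and, for strings, an explicit length, which is what makes the format easy for machines to scan and lets it represent types that JSON text does not have.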
  84. MongoDB Installation: Live Demo
  85. MongoDB Installation: Live Demo
      - Running MongoDB on Windows
      - Installation of MongoDB on Windows as a service
      - Running MongoDB on Linux (CentOS)
      - Installation of MongoDB on CentOS
  86. Your Questions
  87. Further Reading
      - More on MongoDB
      - Connection with PHP to MongoDB
      - NoSQL Databases processing/
      - Big Data
  88. Assignments
      Tasks for you: attempt the following assignments using the documents present in the LMS.
      - Write a JSON document that uses all the data types supported by JSON.
      - What are the core differences between MongoDB, Hadoop, HBase and Cassandra?
      - How can you define horizontal and vertical scalability?
      - Can we design a social media app with MongoDB? If yes, how?
      - Which databases can be used to design a content management system, and why?
      - I want to create a solution for a data hub and I have a choice of MySQL, Hadoop, Cassandra, MongoDB and HBase. Which one is more suitable, and why?
      - What is online and offline Big Data?
      - What is agility? What is tailored and elastic?
  89. Pre-work: Module 2
      - MongoDB installation on Windows or MongoDB installation on CentOS
      - Generate test data on a MongoDB database
      - Execute all Module 1 scripts present in the LMS
      - Read FAQ Module 1 in the LMS
      - Take the quiz in the LMS
      - Complete the assignment
  90. Agenda of Next Class: Module 2, CRUD Operations
      - MongoDB Development Overview
      - MongoDB Production Overview
      - MongoDB CRUD Introduction
      - MongoDB CRUD Concepts
      - MongoDB CRUD Syntax and Queries
  91. What’s within the LMS
      - This section will give you an insight into the MongoDB course
      - Old batch recordings, handy for you
  92. What’s within the LMS
      - Click here to expand and view all the elements of this module
  93. What’s within the LMS
      - Recording of the class
      - Presentation
      - Hands-on guides
  94. What’s within the LMS
      - Assignment, Further Reading, Pre-work, Quiz, Scripts, FAQ
  95. Thank You. See You in Class Next Week.