Introduction to MongoDB

  • 1,848 views
Uploaded on

 

More in: Education
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Be the first to comment
No Downloads

Views

Total Views
1,848
On Slideshare
0
From Embeds
0
Number of Embeds
6

Actions

Shares
Downloads
2
Comments
0
Likes
1

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide
  • http://blog.mongolab.com/2012/08/why-is-mongodb-wildly-popular/
  • http://redmonk.com/jgovernor/2012/05/11/nosql-jobs-market-gets-real-mongo-exploding/

Transcript

  • 1. Slide 1 www.edureka.in/mongodb Introduction to MongoDB® View Mongo DB ® Course at http://www.edureka.in/mongodb
  • 2. Slide 2Slide 2 www.edureka.in/mongodbTwitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions Why is a NoSQL database needed? Benefits of NoSQL over RDBMS dbs Comparing NoSQL vs SQL dbs How MongoDB® solves the problem Use cases of MongoDB® Job opportunity and trends on MongoDB® For Queries during the session and class recording: Post on Twitter @edurekaIN: #askEdureka Post on Facebook /edurekaIN Objectives of this Session
  • 3. Slide 3 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 3 www.edureka.in/mongodb RDBMS/OLTP/Real Time NoSQL/New SQL/BigData DSS/OLAP/DW Oracle MySQL MS SQL DB2 Netezza SAP Hana Oracle Express Mongo DB® HBase Cassandra Couch DB Database Categories
  • 4. Slide 4Slide 4 www.edureka.in/mongodbTwitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions Why NoSQL Volume - 15 - 20 petabyte data in Govt. of India “AADHAR” project, NYSE, Facebook Velocity - Fastest growing app, Facebook, Jet, Video stream Variety - Data types e.g. Images, Videos, Text etc
  • 5. Slide 5Slide 5 www.edureka.in/mongodbTwitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions Horizontal Scaling (scale out) versus Vertical Scaling (scale up)  Scale-out - Cache dependent ‘Read’ and ‘Write’ operations  Scale-up - Limited by Memory and Processing (CPU) capabilities  Complex RDBMS model – Parsing, Locking, Logging, Buffer pool, Threads etc.  Monitoring and managing the distributed architecture  Agility with fastest changing world Why NoSQL
  • 6. Slide 6 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 6 www.edureka.in/mongodb Structured Data Text, Log Files, Click Streams, Blogs, Tweets, Audio, Video, etc Unstructured and Semi–structured Data Data Growth Pattern
  • 7. Slide 7 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 7 www.edureka.in/mongodb NoSQL to the Rescue Distributed Architecture: easy to scale-out, easy to manage large number of nodes First Reads: Satisfying ‘write once, read many times’ behaviour Easy Replication: The failover solution Higher Performance: An architecture providing much higher per-node performance than the traditional SQL-based databases Schema Free: Data model with dynamic and flexible schema Fast Development: No need of the additional ORM layer To achieve the same, we compromise on:  No joins  No ACID transaction
  • 8. Slide 8 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 8 www.edureka.in/mongodb NoSQL to the Rescue
  • 9. Slide 9 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 9 www.edureka.in/mongodb CAP We must understand the CAP theorem when we talk about NoSQL databases or in fact when designing any distributed system. CAP theorem states that there are 3 basic requirements which exist in a special relation when designing applications for a distributed architecture. Consistency Availability Partition Tolerance CAP Theorem This means that the system is always on (guaranteed service availability), no downtime. This means that the system continues to function even if the communication among the servers is unreliable, i.e. the servers may be partitioned into multiple groups that cannot communicate with one another. This means that the data in the database remains consistent after the execution of an operation. For example, after an update operation all clients see the same data.
  • 10. Slide 10 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 10 www.edureka.in/mongodb  CAP provides the basic requirements for a distributed system to follow 2 of the 3 requirements.  Theoretically it is impossible to fulfill all 3 requirements.  Therefore, all the current NoSQL database follows the different combinations of the C, A, P from the CAP theorem. CAP Theorem and NoSQL Databases  CA - Single site cluster, therefore all nodes are always in contact. When a partition occurs, the system blocks.  CP - Some data may not be accessible, but the rest is still consistent/accurate.  AP - System is still available under partitioning, but some of the data returned may be inaccurate.
  • 11. Slide 11 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 11 www.edureka.in/mongodb  Basically Available indicates that the system does guarantee availability, in terms of the CAP theorem. Basically Available  Soft State indicates that the state of the system may change over time, even without input. This is because of the eventual consistency model. Soft State  Eventual Consistency indicates that the system will become consistent over time, given that the system doesn't receive input during that time. Eventual Consistency A BASE system gives up on consistency. NoSQL Database - A BASE not ACID System
  • 12. Slide 12 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 12 www.edureka.in/mongodb ~ 150 No SQL Database are there in Market ~150 NoSQL Database – Not a Panacea
  • 13. Slide 13 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 13 www.edureka.in/mongodb Graph Store Key – value Stores Wide Column Stores% Document Base  Document databases pair each key with a complex data structure known as a document.  Documents can contain many different key-value pairs, or key-array pairs, or even nested documents.  Graph stores are used to store information about networks, such as social connections.  Graph stores include Neo4J and HyperGraphDB.  Key-value stores are the simplest NoSQL databases.  Every single item in the database is stored as an attribute name (or "key"), together with its value.  Wide-column stores such as Cassandra and HBase are optimized for queries over large datasets, and store columns of data together, instead of rows. Categories of NoSQL Database
  • 14. Slide 14 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 14 www.edureka.in/mongodb Key Value Store Memcached Coherence Redis Tabular Big Table Hbase Accumulo Document Oriented MongoDB® Couch DB Cloudant Graph Stores Neo4J Oracle NoSQL HyperGraphDB Types of NoSQL Databases
  • 15. Slide 15 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 15 www.edureka.in/mongodb Compromising Features of RDBMS Step 2 Step 3 Selecting a NoSQL Database Step 1 Right Data Model Pros and Cons of Consistency
  • 16. Slide 16 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 16 www.edureka.in/mongodb 1000 TPS Caching Layer 300 ~ 500 SQL Transaction 100 ~ 200 SQL Transaction 1000 TPS WEB APPLICATION RDBMS1 Applications Changing Data RDBMS1 A Traditional Database Solution
  • 17. Slide 17 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 17 www.edureka.in/mongodb WEB APPLICATION 5000 TPS A NoSQL Database Solution Applications Changing Data Application grows with user base and data volume 5000 TPS
  • 18. Slide 18 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 18 www.edureka.in/mongodb Suppose a client needs database design for his blog or website with below features, then what could be the difference between RDBMS and MongoDB® schema design approach? Website has the following requirements. Every post has the unique title, description and url. Every post can have one or more tags. Every post has the name of its publisher and total number of likes. Every post has comments given by users along with their name, message, data-time and likes. On each post there can be zero or more comments. Data Modeling Example in RDBMS and MongoDB ®
  • 19. Slide 19 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 19 www.edureka.in/mongodb In RDBMS schema design for above requirements will have minimum 3 tables. posts id title description url likes post_by tags id post_id tag comments Comment_id post_id by_user message date_time likes ∞ 1 1 ∞ Data Modeling Example in RDBMS and MongoDB®
  • 20. Slide 20 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 20 www.edureka.in/mongodb While in MongoDB® schema design will have one collection post and has the following structure. So while showing the data, in RDBMS we need to join three tables and in Mongodb® data will be shown from one collection only. { _id: POST_ID title: TITLE_OF_POST, description: POST_DESCRIPTION, by: POST_BY, url: URL_OF_POST, tags: [ TAG1, TAG2, TAG3], likes: TOTAL_LIKES, comments: [ { user:'COMMENT_BY', message: TEXT, dateCreated: DATE_TIME, like: LIKES }, { user:'COMMENT_BY', message: TEXT, dateCreated: DATE_TIME, like: LIKES } ] } Data Modeling Example in RDBMS and MongoDB
  • 21. Slide 21 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 21 www.edureka.in/mongodb Where to Use MongoDB®? RDBMS replacement for Web Applications Semi-structured Content Management Real-time Analytics & High-Speed Logging Caching and High Scalability Web 2.0, Media, SAAS, Gaming http://www.mongodb.org/about/production-deployments/
  • 22. Slide 22 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 22 www.edureka.in/mongodb  Metlife uses MongoDB® for “The Wall”, an innovative customer service application which provides a 360-degree, consolidated view of MetLife customers, including policy details and transactions across lines of business.  ebay has a number of projects running on MongoDB® for search suggestions, metadata storage, cloud management and merchandizing categorization.  MongoDB® is the repository that powers MTV Networks’ next-generation CMS, which is used to manage and distribute content for all of MTV Networks’ major websites.  MongoDB® is used for back-end storage on the SourceForge front pages, project pages, and download pages for all projects.  Craigslist uses MongoDB® to archive billions of records.  ADP uses MongoDB® for its high performance, scalability, reliability and its ability to preserve the data manipulation capabilities of traditional relational databases. Real World Use Cases of MongoDB
  • 23. Slide 23 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 23 www.edureka.in/mongodb  CNN Turk uses MongoDB ® for its infrastructure and content management system, including the tv.cnnturk.com.  Foursquare uses MongoDB ® to store venues and user ‘check-ins’ into venues, sharding the data over more than 25 machines on Amazon EC2.  Justin.tv is the easy, fun, and fast way to share live video online. MongoDB ® powers Justin.tv’s internal analytics tools for virality, user retention, and general usage stats that out-of-the-box solutions can’t provide.  ibibo (‘I build, I bond’) is a social network using MongoDB ® for its dashboard feeds. Each feed is represented as a single document containing an average of 1000 entries; the site currently stores over two million of these documents in MongoDB ® . Real World Use Cases of MongoDB ®
  • 24. Slide 24 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 24 www.edureka.in/mongodb Financial Services  Risk Analytics and Reporting  Reference Data Management  Market Data Management  Portfolio Management  Order Capture  Time Series Data Government  Surveillance Data Aggregation  Crime Data Management and Analytics  Citizen Engagement Platform  Program Data Management  Healthcare Record Management Industry/Domains where MongoDB® is Used
  • 25. Slide 25 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 25 www.edureka.in/mongodb Health Care  360-degree Patient View  Population Management for At-risk Demographics  Lab Data Management and Analytics  Mobile Apps for Doctors and Nurses  Electronic Healthcare Records (EHR) Media and Entertainment  Content Management and Delivery  User Data Management  Digital Asset Management  Mobile and Social Apps  Content Archiving Industry/Domains where MongoDB® is Used
  • 26. Slide 26 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 26 www.edureka.in/mongodb Retail  Rich Product Catalogs  Customer Data Management  New Services  Digital Coupons  Real-time Price Optimization Telecommunication  Consumer Cloud  Product Catalog  Customer Service Improvement  Machine-to-Machine (M2M) Platform  Real-time Network Analysis and Optimization Industry/Domains where MongoDB® is Used
  • 27. Slide 27 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 27 www.edureka.in/mongodb How Popular is MongoDB® in the Industry?  Google search provides a wide indicator of MongoDB® adoption in the industry.
  • 28. Slide 28 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 28 www.edureka.in/mongodb Job Opportunity and Trends in MongoDB®
  • 29. Slide 29 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for QuestionsSlide 29 www.edureka.in/mongodb Where Not to Use MongoDB®? Highly transactional applications Applications with traditional database system where requirements such as foreign-key constraints etc are needed.
  • 30. Slide 30 www.edureka.in/mongodb Questions? Buy MongoDB Courses at : www.edureka.in Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.in/mongodb