MongoDB: Operational Big Data Database

MongoDB is the leading NoSQL database for a number of reasons: it is an open source, general purpose, document-oriented database supported by a large community and an educational platform. Its horizontal scalability features make it a fit for operational big data scenarios, where business needs point to real-time analytics and ever-increasing data sets. This talk focuses on using MongoDB for operational big data and on why it suits such scenarios. It also covers integration with other notable big data technologies, such as Hadoop and BI tools.

Norberto Leite - Senior Solutions Architect, @MongoDB.

MongoDB presentation given at the Pentaho & Big Data Ecosystem Live Seminar 2013

Published in: Technology

  1. 1. @nleite MongoDB: Operational Big Data Norberto Leite Senior Solutions Architect, MongoDB norberto@mongodb.com
  2. 2. Agenda •  MongoDB Intro •  Big Data •  MongoDB Operational Big Data(base) •  Use Cases •  QA
  3. 3. Ola! •  Norberto Leite •  Solutions Architect –  wingman •  Barcelona/Brussels
  4. 4. MongoDB
  5. 5. MongoDB The leading NoSQL database General Purpose Document Database OpenSource
  6. 6. Global Community 5,000,000+ MongoDB Downloads 100,000+ Online Education Registrants 20,000+ MongoDB User Group Members 20,000+ MongoDB Days Attendees 20,000+ MongoDB Management Service (MMS) Users
  7. 7. MongoDB Overview 300+ employees Offices in New York, Palo Alto, Washington DC, London, Dublin, Barcelona and Sydney 600+ customers Over $231 million in funding
  8. 8. MongoDB Overview Agile Scalable
  9. 9. MongoDB Vision To provide the best database for how we build and run apps today Build Run –  New and complex data –  Flexible –  New languages –  Faster development –  Big Data scalability –  Real-time –  Commodity hardware –  Cloud
  10. 10. Operational Database Landscape
  11. 11. Document Data Model Relational MongoDB { first_name: 'Paul', surname: 'Miller', city: 'London', location: [45.123, 47.232], cars: [ { model: 'Bentley', year: 1973, value: 100000, … }, { model: 'Rolls Royce', year: 1965, value: 330000, … } ] }
  12. 12. MongoDB is full featured Rich Queries •  Find Paul’s cars •  Find everybody in London with a car built between 1970 and 1980 Geospatial •  Find all of the car owners within 5km of Trafalgar Sq. Text Search •  Find all the cars described as having leather seats Aggregation •  Calculate the average value of Paul’s car collection Map Reduce •  What is the ownership pattern of colors by geography over time? (is purple trending up in China?) MongoDB { first_name: 'Paul', surname: 'Miller', city: 'London', location: [45.123, 47.232], cars: [ { model: 'Bentley', year: 1973, value: 100000, … }, { model: 'Rolls Royce', year: 1965, value: 330000, … } ] }
  13. 13. Developers are more productive
  14. 14. Big Data
  15. 15. Best definition so far!
  16. 16. RDBMS Scale = Bigger Computers “Clients can also opt to run zEC12 without a raised datacenter floor -- a first for high-end IBM mainframes.” IBM Press Release 28 Aug, 2012
  17. 17. Vertical Scalability
  18. 18. This Was a Problem for Google •  2010 search index size: 100,000,000 GB •  New data added per day: 100,000+ GB •  Databases they could use: 0 •  250,000+ MBPs == 4.1 miles Source: http://googleblog.blogspot.com/2010/06/our-new-search-index-caffeine.html
  19. 19. And for Facebook 2010: 13,000,000 queries per second
  20. 20. And for Facebook 2010: 13,000,000 queries per second TPC #1 DB: 504,161 tps TPC Top Results
  21. 21. And for Facebook 2010: 13,000,000 queries per second TPC #1 DB: 504,161 tps TPC Top Results Top 10 combined: 1,370,368 tps
  22. 22. Living in the Post-transactional Future Order-processing systems largely “done” (RDBMS); primary focus on better search and recommendations or adapting prices on the fly (NoSQL) Vast majority of its engineering is focused on recommending better movies (NoSQL), not processing monthly bills (RDBMS) Easy part is processing the credit card (RDBMS). Hard part is making it location aware, so it knows where you are and what you’re buying (NoSQL)
  23. 23. Shift in What We’re Computing
  24. 24. How IT/Data Scientists Define Big Data Source: Silicon Angle, 2012
  25. 25. MongoDB Operational Big Data(base)
  26. 26. Consideration – Online vs. Offline Online •  Real-time •  Low-latency •  High availability vs. Offline •  Long-running •  High-Latency •  Availability is lower priority
  27. 27. Consideration – Online vs. Offline Online vs. Offline
  28. 28. MongoDB/NoSQL Is Good for… 360° View of the Customer Mobile & Social Apps Fraud Detection User Data Management Content Management & Delivery Reference Data Product Catalogs Machine to Machine Apps Data Hub
  29. 29. MongoDB and Enterprise IT Stack Applications: CRM, ERP, Collaboration, Mobile, BI Data Management: Online Data, Offline Data, RDBMS, RDBMS, Hadoop, EDW Infrastructure: OS & Virtualization, Compute, Storage, Network Security & Auditing Management & Monitoring
  30. 30. Horizontal Scalability
  31. 31. MongoDB Architecture
  32. 32. Use Cases
  33. 33. Leading Organizations Rely on MongoDB
  34. 34. Fortune 500 & Global 500 •  10 of the Top Financial Services Institutions •  10 of the Top Electronics Companies •  10 of the Top Media and Entertainment Companies •  8 of the Top Retailers •  6 of the Top Telcos •  5 of the Top Technology Companies •  4 of the Top Healthcare Companies
  35. 35. MongoDB Solutions Big Data Content Mgmt & Delivery User Data Management Mobile & Social Data Hub
  36. 36. Customer example: Online Travel Travel Algorithms MongoDB Connector for Hadoop •  Flights, hotels and cars •  Real-time offers •  User profiles, reviews •  User metadata (previous purchases, clicks, views) •  User segmentation •  Offer recommendation engine •  Ad serving engine •  Bundling engine
  37. 37. Machine Learning Ad-Serving Algorithms MongoDB Connector for Hadoop •  Catalogs and products •  User profiles •  Clicks •  Views •  Transactions •  User segmentation •  Recommendation engine •  Prediction engine
  38. 38. Data Hub Insurance Churn Analysis MongoDB Connector for Hadoop •  Insurance policies •  Demographic data •  Customer web data •  Call center data •  Real-time churn detection •  Customer action analysis •  Churn prediction algorithms
  39. 39. @nleite Obrigado! Norberto Leite Senior Solutions Architect, MongoDB norberto@mongodb.com
  40. 40. QA ?
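
Slides 11 and 12 describe a nested document and a handful of questions asked of it. The sketch below is not from the talk: it rebuilds the Paul Miller document as a plain Python dict (fields elided with "…" on the slide are left out) and answers three of the "Rich Queries" and "Aggregation" questions in memory, without a MongoDB server. The second person, the helper names, and the filter variable are hypothetical; the filter only illustrates what an equivalent MongoDB query document could look like, using the standard `$elemMatch`, `$gte` and `$lte` operators.

```python
from statistics import mean

# Documents modeled on slide 11. "Ana Costa" is a hypothetical second
# record added only so the filters have something to exclude.
people = [
    {
        "first_name": "Paul", "surname": "Miller", "city": "London",
        "location": [45.123, 47.232],
        "cars": [
            {"model": "Bentley", "year": 1973, "value": 100000},
            {"model": "Rolls Royce", "year": 1965, "value": 330000},
        ],
    },
    {
        "first_name": "Ana", "surname": "Costa", "city": "Lisbon",
        "location": [38.72, -9.14],
        "cars": [{"model": "Mini", "year": 1975, "value": 20000}],
    },
]

# Assumption: roughly how the second "Rich Queries" question could be
# written as a MongoDB query document. It is not executed here.
london_1970s_filter = {
    "city": "London",
    "cars": {"$elemMatch": {"year": {"$gte": 1970, "$lte": 1980}}},
}


def cars_of(first_name):
    """Return the car models owned by everyone with this first name."""
    return [car["model"]
            for person in people if person["first_name"] == first_name
            for car in person["cars"]]


def owners_in_city_with_car_between(city, first_year, last_year):
    """Return first names of people in `city` owning a car built in the range."""
    return [person["first_name"]
            for person in people
            if person["city"] == city
            and any(first_year <= car["year"] <= last_year
                    for car in person["cars"])]


def average_car_value(first_name):
    """Return the mean value of the cars owned by this person."""
    return mean(car["value"]
                for person in people if person["first_name"] == first_name
                for car in person["cars"])


print(cars_of("Paul"))
print(owners_in_city_with_car_between("London", 1970, 1980))
print(average_car_value("Paul"))
```

Because the cars are embedded in the owner's document, each question is answered from a single record rather than a join, which is the point the "Relational" versus "MongoDB" comparison on slide 11 appears to make.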

×
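
Slides 19 to 21 set Facebook's 2010 query rate against the TPC top results. The arithmetic behind that comparison can be sketched as follows. The three figures are taken from the slides; the function names are hypothetical, and treating a query as equivalent to a TPC transaction, along with the perfectly linear scaling assumed in `nodes_needed`, is a simplification for illustration rather than a claim from the talk.

```python
import math

# Figures quoted on slides 19-21.
FACEBOOK_QPS_2010 = 13_000_000
TPC_TOP_TPS = 504_161
TPC_TOP10_COMBINED_TPS = 1_370_368


def load_ratio(workload_per_second, capacity_per_second):
    """How many times larger the workload is than the given capacity."""
    return workload_per_second / capacity_per_second


def nodes_needed(workload_per_second, per_node_capacity):
    """Machines required if capacity scaled linearly (an idealization)."""
    return math.ceil(workload_per_second / per_node_capacity)


vs_top_result = load_ratio(FACEBOOK_QPS_2010, TPC_TOP_TPS)
vs_top_ten = load_ratio(FACEBOOK_QPS_2010, TPC_TOP10_COMBINED_TPS)

print(f"Workload vs. TPC #1 result: {vs_top_result:.1f}x")
print(f"Workload vs. top 10 combined: {vs_top_ten:.1f}x")
print("Idealized count of #1-class systems needed:",
      nodes_needed(FACEBOOK_QPS_2010, TPC_TOP_TPS))
```

Under these assumptions the workload is about 26 times the single best result and about 9.5 times the top ten combined, which is the gap the slides use to motivate scaling out across many machines instead of buying a bigger one.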