Why NoSQL and MongoDB for Big Data



This Ignite-length deck talks about why we have seen so much database innovation, and the genesis of the NoSQL movement, over the last 5 years. While there are many great NoSQL products, it speaks to why MongoDB is dominating the space and is the heir apparent to the RDBMS for modern operational data.

Published in: Technology
  • What has driven the need to have the term big data? "Big data" was first used in 1997, but referred to just size. Volume, Variety, Velocity were grouped together in 2001. What do they have in common? They are all difficult to handle with the traditional data stack. By 2011, “Big Data” = 3Vs. Is variety or velocity big? Not really, but big data was a convenient umbrella term. Should have called it Awkward Data. I’ll never get a job at Gartner.
  • Big Data is born online. Latency for these applications must be very low and availability must be high in order to meet SLAs and user expectations for modern application performance. Offline Big Data encompasses applications that ingest, transform, manage and/or analyze data in a batch context. They typically do not create new data. For these applications, response time can be slow (up to hours or days), which is often acceptable for this type of use case. Since they usually produce a static (vs. operational) output, such as a report or dashboard, they can even go offline temporarily without impacting the overall goal or end product.
  • Indeed: #2, just after HTML and ahead of iOS, Android, Hadoop. Jasper: Demand for MongoDB, the document-oriented NoSQL database, saw the biggest spike with over 200% growth in 2011. 451 Group: Bigger than next 3 or 4 COMBINED; biggest quarter-over-quarter and year-over-year growth (again).
  • Why NoSQL and MongoDB for Big Data

    1. Why NoSQL
    2. Dawn of Databases to Present (timeline figure, axis 1965 to 2010 in five-year steps). Labels on the timeline: Brewer’s CAP born; WWW born; 10gen founded; SQL invented; Oracle founded; PCs gain traction; Client Server; Dynamic Web Content; 3 tier architecture; Web applications; SOA; Cloud Computing released; NoSQL Movement; BigTable; IDS (network); IMS (hierarchical); MUMPS; Codd’s paper; IDMS (network).
    3. Big Data / Modern Data: Sensor Data (volume, velocity); Situational Awareness (variety, volume); SIGINT (V); Asset Management (variety, velocity); OSINT (3V); Social Media (3V).
    4. Relational Database Challenges. Data Types: unstructured data, semi-structured data, polymorphic data. Volume of Data: petabytes of data, trillions of records, millions of queries per second. Agile Development: iterative, short development cycles, changing data model. New Architectures: horizontal scaling, commodity servers, cloud computing.
    5. The Evolution of Databases. 1990: RDBMS. 2000: RDBMS, OLAP/BI. 2010: RDBMS, NoSQL, OLAP/BI, Hadoop. Axis labels: Operational Data (online), Data Warehouse (offline).
    6. Fully Featured NoSQL. Data Model: { first_name: 'Paul', surname: 'Miller', city: 'London', location: [45.123, 47.232], cars: [ { model: 'Bentley', year: 1973, value: 100000, … }, { model: 'Rolls Royce', year: 1965, value: 330000, … } ] }. Rich Queries: find Paul’s cars; find everybody in London with a car built between 1970 and 1980. Geospatial: find all of the car owners within 5km of Trafalgar Sq. Text Search: find all the cars described as having leather seats. Aggregation: calculate the average value of Paul’s car collection. Native Indexes: Secondary, Compound, Geospatial, Full Text, Hash, Covering. Security: Kerberos, FIPS 140-2, Field Level Security, LDAP, Auditing, RBAC.
    7. Indeed.com Trends, Top Job Trends: 1. HTML 5; 2. MongoDB; 3. iOS; 4. Android; 5. Mobile Apps; 6. Puppet; 7. Hadoop; 8. jQuery; 9. PaaS; 10. Social Media. NoSQL Space (chart legends). LinkedIn Job Skills: MongoDB, Competitors 1 to 5, All Others. Google Search: MongoDB, Competitors 1 to 4. Jaspersoft Big Data Index, Direct Real-Time Downloads: MongoDB, Competitors 1 to 3.
    8. Open Source Software: Technology must scale. Cost must scale!
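
The data model and queries on slide 6 may be easier to follow with a small worked example. What follows is only a sketch in plain Python over in-memory dictionaries, not MongoDB itself and not code from the deck. The helper names (`cars_of`, `owners_in_city_with_car_between`, `average_car_value`) and the second sample person are hypothetical, added purely for illustration. The `london_1970s_filter` dictionary shows roughly how the second "rich query" could be written as a MongoDB filter document; it is included for comparison and is not executed here.

```python
from statistics import mean

# Sample documents. The first mirrors the document on slide 6 (the slide's
# elided fields are left out); the second is a hypothetical addition so the
# filters have something to exclude.
people = [
    {
        "first_name": "Paul",
        "surname": "Miller",
        "city": "London",
        "location": [45.123, 47.232],
        "cars": [
            {"model": "Bentley", "year": 1973, "value": 100000},
            {"model": "Rolls Royce", "year": 1965, "value": 330000},
        ],
    },
    {
        # Hypothetical person, not from the slides.
        "first_name": "Anna",
        "surname": "Example",
        "city": "London",
        "cars": [{"model": "Mini", "year": 1995, "value": 8000}],
    },
]


def cars_of(docs, first_name):
    """Rich query 1: return every embedded car of people with this first name."""
    return [
        car
        for doc in docs
        if doc.get("first_name") == first_name
        for car in doc.get("cars", [])
    ]


def owners_in_city_with_car_between(docs, city, start_year, end_year):
    """Rich query 2: people in `city` owning at least one car built in the range."""
    return [
        doc
        for doc in docs
        if doc.get("city") == city
        and any(start_year <= car["year"] <= end_year for car in doc.get("cars", []))
    ]


def average_car_value(docs, first_name):
    """Aggregation: average value of one person's car collection."""
    return mean([car["value"] for car in cars_of(docs, first_name)])


# For comparison only: the same condition as rich query 2, expressed as a
# MongoDB filter document. $elemMatch requires a single array element to
# satisfy both bounds. This dictionary is not executed in this sketch.
london_1970s_filter = {
    "city": "London",
    "cars": {"$elemMatch": {"year": {"$gte": 1970, "$lte": 1980}}},
}

if __name__ == "__main__":
    print([car["model"] for car in cars_of(people, "Paul")])
    print([p["first_name"] for p in owners_in_city_with_car_between(people, "London", 1970, 1980)])
    print(average_car_value(people, "Paul"))
```

Because the cars are embedded in the owner's document, each of these questions is answered from a single record without a join, which is the point the slide makes about the document model.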