NoSQL databases and managing big data

An unprecedented amount of data is being created and is accessible. This presentation shows how to use the new NoSQL technologies to make sense of all this data.

  • memcache, redis, membase
  • mongodb, couch
  • cassandra, riak
  • neo4j, flockdb
  • By reducing the transactional semantics the db provides, one can still solve an interesting set of problems where performance is very important, and horizontal scaling then becomes easier.
  • One site is generating nearly as many URLs as the entire internet 6 years ago.

NoSQL databases and managing big data Presentation Transcript

  • 1. NoSQL Databases & Managing Big Data
  • 2. Talking about: What is BIG Data; NoSQL; MongoDB; Future of BIG Data
  • 3. @spf13, AKA Steve Francia. 15+ years building the internet. Father, husband, skateboarder. Chief Solutions Architect @ responsible for drivers, integrations, web & docs
  • 4. Company behind MongoDB. Offices in NYC, Palo Alto, London & Dublin. 100+ employees. Support, consulting, training. Mgt: Google/DoubleClick, Oracle, Apple, NetApp, Mark Logic. Well funded: Sequoia, Union Square, Flybridge
  • 5. What is BIG data?
  • 6. 2000: Google Inc today announced it has released the largest search engine on the Internet. Google’s new index, comprising more than 1 billion URLs
  • 7. 2008: Our indexing system for processing links indicates that we now count 1 trillion unique URLs (and the number of individual web pages out there is growing by several billion pages per day).
  • 8. Data Growth (chart, axis labeled "Millions of URLs"): 2000: 1; 2001: 4; 2002: 10; 2003: 24; 2004: 55; 2005: 120; 2006: 250; 2007: 500; 2008: 1,000
  • 9. An unprecedented amount of data is being created and is accessible
  • 10. What good is it if we can’t utilize this data?
  • 11. What is NoSQL?
  • 12. What is NoSQL? Key/Value, Column, Graph, Document
  • 13. Key-Value Stores. A mapping from a key to a value. The store doesn't know anything about the key or value. The store doesn't know anything about the insides of the value. Operations: set, get, or delete a key-value pair
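The three operations on slide 13 can be sketched in a few lines of JavaScript. This is only an illustration: the class name `KeyValueStore` and the key `"user:42"` are made up here and do not correspond to any product's API. The point it shows is that the store treats the value as opaque, and only the caller interprets it.

```javascript
// Minimal in-memory key-value store (hypothetical, for illustration only).
class KeyValueStore {
  constructor() {
    this.data = new Map(); // key -> opaque value
  }
  set(key, value) {
    this.data.set(key, value);
  }
  get(key) {
    return this.data.get(key); // undefined when the key is absent
  }
  delete(key) {
    return this.data.delete(key); // true if the key existed
  }
}

const store = new KeyValueStore();
// The store never looks inside the value; here it is just a string.
store.set("user:42", JSON.stringify({ name: "Steve" }));
const rawValue = store.get("user:42");
// Only the caller knows the string happens to be JSON.
const parsedUser = JSON.parse(rawValue);
```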
  • 14. Column-Oriented Stores. Like a relational store, but flipped around: all data for a column is kept together. An index provides a means to get a column value for a record. Operations: get, insert, delete records; updating fields. Streaming column data in and out of Hadoop
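The "flipped around" layout on slide 14 can be pictured with a small JavaScript sketch. The sample records and the helper names `toColumns` and `getValue` are invented for illustration, and the sketch assumes every record has the same fields; real column stores are far more involved.

```javascript
// Row-oriented layout: one object per record (sample data is made up).
const rows = [
  { id: 1, city: "New York", visits: 10 },
  { id: 2, city: "London", visits: 7 },
  { id: 3, city: "New York", visits: 5 },
];

// Flip the layout: one array per column, all values for a column together.
function toColumns(records) {
  const columns = {};
  for (const record of records) {
    for (const [field, value] of Object.entries(record)) {
      if (!columns[field]) columns[field] = [];
      columns[field].push(value);
    }
  }
  return columns;
}

// The record's position acts as the index into each column.
function getValue(columns, field, recordIndex) {
  return columns[field][recordIndex];
}

const columns = toColumns(rows);
// Scanning a single column touches only that column's data.
const totalVisits = columns.visits.reduce((sum, v) => sum + v, 0);
```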
  • 15. Graph Databases. Stores vertex-to-vertex edges. Operations: getting and setting edges; sometimes possible to annotate vertices or edges. Query languages support finding paths between vertices, subject to various constraints
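As a rough sketch of slide 15, the following JavaScript stores directed edges and finds a path with a breadth-first search, with an optional hop limit standing in for "various constraints". The `Graph` class and its method names are hypothetical and not taken from any graph database.

```javascript
// Tiny directed graph: set edges, get edges, find a path (illustrative only).
class Graph {
  constructor() {
    this.edges = new Map(); // vertex -> Set of neighbouring vertices
  }
  setEdge(from, to) {
    if (!this.edges.has(from)) this.edges.set(from, new Set());
    this.edges.get(from).add(to);
  }
  getEdges(from) {
    return [...(this.edges.get(from) || [])];
  }
  // Breadth-first search; returns the shortest path as an array of
  // vertices, or null if none exists within maxHops edges.
  findPath(start, goal, maxHops = Infinity) {
    const queue = [[start]];
    const seen = new Set([start]);
    while (queue.length > 0) {
      const path = queue.shift();
      const last = path[path.length - 1];
      if (last === goal) return path;
      if (path.length - 1 >= maxHops) continue; // hop budget used up
      for (const next of this.getEdges(last)) {
        if (!seen.has(next)) {
          seen.add(next);
          queue.push([...path, next]);
        }
      }
    }
    return null;
  }
}

const graph = new Graph();
graph.setEdge("a", "b");
graph.setEdge("b", "c");
graph.setEdge("a", "d");
```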
  • 16. Document Stores. The store is a container for documents. Documents are made up of named fields (think object/array/dict/hash...). Can query on any document field(s). Operations: insert and delete documents; update fields within documents
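The operations on slide 16 can be sketched with an in-memory container of plain objects. This is a toy under stated assumptions, not MongoDB: the `DocumentStore` class is hypothetical, ids are simple counters, and queries only test top-level fields for strict equality.

```javascript
// Toy document store: insert, query on any field, update, delete.
class DocumentStore {
  constructor() {
    this.docs = [];
    this.nextId = 1;
  }
  insert(doc) {
    const stored = { _id: this.nextId++, ...doc };
    this.docs.push(stored);
    return stored._id;
  }
  // Every field in `criteria` must equal the document's field.
  find(criteria) {
    return this.docs.filter((doc) =>
      Object.entries(criteria).every(([field, value]) => doc[field] === value)
    );
  }
  // Overwrite or add fields on every matching document.
  update(criteria, changes) {
    const matched = this.find(criteria);
    for (const doc of matched) Object.assign(doc, changes);
    return matched.length;
  }
  delete(criteria) {
    const doomed = new Set(this.find(criteria));
    this.docs = this.docs.filter((doc) => !doomed.has(doc));
    return doomed.size;
  }
}

const places = new DocumentStore();
places.insert({ name: "10gen HQ", city: "New York" });
// Documents in the same container need not share a schema.
places.insert({ name: "Cafe", city: "London", rating: 4 });
```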
  • 17. Comparison table (MySQL is the only column header captured in the transcript). Data Model: Columns, Key:Value, Columns, Documents, Relational. Consistency: Eventual / Quorum, Eventual / Quorum, Strong, Strong, Strong. Availability: Multi-Master, Multi-Master, Single Master, Single Master, Single Master. Partitioning: Range or Hash, Hash, Range, N/A, Hash. Query: Thrift, Native, Rest, Native, SQL, CQL, Drivers (6), Thrift, Drivers (12)
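Slide 17's partitioning row names two schemes, hash and range. A minimal sketch of how each might route a key to a shard follows; the function names, the string hash, and the boundary values are all assumptions for illustration, and no particular database works exactly this way.

```javascript
// Hash partitioning: a hash of the key picks the shard, which spreads
// keys evenly but scatters neighbouring keys across shards.
function hashPartition(key, shardCount) {
  let hash = 0;
  for (const ch of String(key)) {
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0; // keep as unsigned 32-bit
  }
  return hash % shardCount;
}

// Range partitioning: sorted, exclusive upper bounds split the key space,
// so neighbouring keys land on the same shard. Keys at or beyond the
// last bound go to a final catch-all shard.
function rangePartition(key, upperBounds) {
  for (let i = 0; i < upperBounds.length; i++) {
    if (key < upperBounds[i]) return i;
  }
  return upperBounds.length;
}

// Three shards: keys before "h", keys from "h" up to "q", and the rest.
const bounds = ["h", "q"];
const appleShard = rangePartition("apple", bounds);
const melonShard = rangePartition("melon", bounds);
const zebraShard = rangePartition("zebra", bounds);
```

Range partitioning keeps range scans on few shards; hash partitioning trades that away for a more even spread of writes.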
  • 18. Introduction to MongoDB
  • 19. What do we want in an ideal world?
  • 20. What do we want in an ideal world? Horizontal scaling (cloud compatible, works with standard servers); fast; development is easy (features, the right data model, schema agility)
  • 21. MongoDB philosophy. Keep functionality when we can (key/value stores are great, but we need more). Non-relational (no joins) makes scaling horizontally practical. Document data models are good. Database technology should run anywhere: virtualized, cloud, metal, etc.
  • 22. Under the hood. Written in C++. Runs nearly everywhere. Data serialized to BSON. Extensive use of memory-mapped files, i.e. read-through write-through memory caching.
  • 23. Database Landscape (chart: Scalability & Performance vs. Depth of Functionality; plotted: Memcached, MongoDB, RDBMS)
  • 24. “MongoDB has the best features of key/value stores, document databases and relational databases in one.” John Nunemaker
  • 25. Relational made normalized data look like this: Category (Name, Url); Article (Name, Slug, Publish date, Text); User (Name, Email Address); Tag (Name, Url); Comment (Comment, Date, Author)
  • 26. Document databases make normalized data look like this: Article (Name, Slug, Publish date, Text, Author; Comment[]: Comment, Date, Author; Tag[]: Value; Category[]: Value); User (Name, Email Address)
  • 27. MongoDB
  • 28. Start with an object (or array, hash, dict, etc.)
    place1 = {
      name : "10gen HQ",
      address : "578 Broadway 7th Floor",
      city : "New York",
      zip : "10011",
      tags : [ "business", "awesome" ]
    }
  • 29. Inserting the record (initial data load)
    > db.places.insert(place1)
  • 30. Querying
    {
      name : "10gen HQ",
      address : "134 5th Avenue 3rd Floor",
      city : "New York",
      zip : "10011",
      tags : [ "business", "awesome" ]
    }
    > db.places.findOne({ zip: "10011", tags: "awesome" })
    > db.places.find({ tags: "business" })
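Both queries on slide 30 match the record even though `tags` is an array and the query gives a single string: an equality condition on an array field matches when any element equals the value. The plain JavaScript sketch below models only that one rule. The `matches` helper is hypothetical, it is not how MongoDB is implemented, and it ignores operators, nested fields, and indexes.

```javascript
// Simplified model of equality matching, including the array-contains rule.
function matches(doc, query) {
  return Object.entries(query).every(([field, wanted]) => {
    const actual = doc[field];
    // An array field matches if any of its elements equals the value.
    return Array.isArray(actual) ? actual.includes(wanted) : actual === wanted;
  });
}

const place = {
  name: "10gen HQ",
  city: "New York",
  zip: "10011",
  tags: ["business", "awesome"],
};

const byZipAndTag = matches(place, { zip: "10011", tags: "awesome" });
const byTag = matches(place, { tags: "business" });
const byMissingTag = matches(place, { tags: "boring" });
```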
  • 31. Nested Documents
    {
      _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
      name : "10gen HQ",
      address : "578 Broadway 7th Floor",
      city : "New York",
      zip : "10011",
      tags : [ "business", "awesome" ],
      tips : [
        {
          author : "Fred",
          date : "Sat Apr 25 2010 20:51:03",
          text : "Best Place Ever!"
        }
      ]
    }
  • 32. Updating
    > db.places.update(
        { name : "10gen HQ" },
        { $push : { tips : {
            author : "nosh",
            date : "6/26/2011",
            text : "Office hours are great!"
        } } }
      )
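The effect of `$push` on slide 32 is to append a value to an array field inside the matched document, creating the array if the field does not exist yet. A plain JavaScript model of that effect is below; `applyPush` is a made-up helper that runs on an in-memory object, not a driver or server call.

```javascript
// Model of what $push does to one document (illustrative only).
function applyPush(doc, pushSpec) {
  for (const [field, value] of Object.entries(pushSpec)) {
    // A missing field becomes a new array. If the field exists but is
    // not an array, .push throws here, loosely mirroring the real error.
    if (doc[field] === undefined) doc[field] = [];
    doc[field].push(value);
  }
  return doc;
}

const office = { name: "10gen HQ", tags: ["business", "awesome"] };

// First push creates the tips array; the second appends to it.
applyPush(office, { tips: { author: "Fred", text: "Best Place Ever!" } });
applyPush(office, { tips: { author: "nosh", text: "Office hours are great!" } });
```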
  • 33. MongoDB Use Cases
  • 34. CMS / Blog. Needs: Business needed modern data store for rapid development and scale. Solution: Use PHP & MongoDB. Results: Real time statistics; all data, images, etc. stored together for easy access, easy deployment, easy high availability; no need for complex migrations; enabled very rapid development and growth
  • 35. Photo Meta-Data. Problem: Business needed more flexibility than Oracle could deliver. Solution: Use MongoDB instead of Oracle. Results: Developed application in one sprint cycle; 500% cost reduction compared to Oracle; 900% performance improvement compared to Oracle
  • 36. Customer Analytics. Problem: Deal with massive data volume across all customer sites. Solution: Use MongoDB to replace Google Analytics / Omniture options. Results: Less than one week to build prototype and prove business case; rapid deployment of new features
  • 37. Archiving. Why MongoDB: Existing application built on MySQL; lots of friction with RDBMS based archive storage; needed more scalable archive storage backend. Solution: Keep MySQL for active data (100mil); MongoDB for archive (2+ billion). Results: No more alter table statements taking over 2 months to run; sharding enabled horizontal scale; very happily looking at other places to use MongoDB
  • 38. Online Dictionary. Problem: MySQL could not scale to handle their 5B+ documents. Solution: Switched from MySQL to MongoDB. Results: Massive simplification of code base; eliminated need for external caching system; 20x performance improvement over MySQL
  • 39. E-commerce. Problem: Multi-vertical E-commerce impossible to model (efficiently) in RDBMS. Solution: Switched from MySQL to MongoDB. Results: Massive simplification of code base; rapidly build, halving time to market (and cost); eliminated need for external caching system; 50x+ performance improvement over MySQL
  • 40. Tons more. MongoDB casts a wide net; people keep coming up with new and brilliant ways to use it
  • 41. In Good Company and 1000s more
  • 42. The Future of BIG data
  • 43. What is BIG? BIG today is normal tomorrow
  • 44. Data Growth (chart, axis labeled "Millions of URLs"): 2000: 1; 2001: 4; 2002: 10; 2003: 24; 2004: 55; 2005: 120; 2006: 250; 2007: 500; 2008: 1,000; 2009: 2,150; 2010: 4,400; 2011: 9,000
  • 45. Data Growth (chart repeated from slide 44)
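The growth rate implied by the Data Growth chart can be estimated with a little arithmetic. The sketch below assumes the chart's endpoints are 1 in 2000 and 9,000 in 2011, in whatever unit the axis uses; the unit cancels out of the ratio. Under that assumption the series grows by a bit under 2.3x per year, which means it doubles roughly every ten months.

```javascript
// Endpoints read off the chart (assumed: 1 in 2000, 9,000 in 2011).
const startYear = 2000;
const startValue = 1;
const endYear = 2011;
const endValue = 9000;

const years = endYear - startYear; // 11 yearly steps

// Compound growth: endValue = startValue * factor^years
const yearlyFactor = Math.pow(endValue / startValue, 1 / years);

// Doubling time: factor^t = 2, so t = ln(2) / ln(factor)
const doublingTimeYears = Math.log(2) / Math.log(yearlyFactor);
```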
  • 46. 2012: Generating over 250 million tweets per day
  • 47. MongoDB enables us to scale with the redefinition of BIG.
  • 48. MongoDB: High Performance, Easy Development, Horizontally Scalable. { author : "steve", date : new Date(), text : "About MongoDB...", tags : ["tech", "database"] }
  • 49. http://spf13.com http://github.com/s @spf13 Questions? Download at mongodb.org. We’re hiring!! Contact us at jobs@10gen.com