NoSQL in the context of Social Web

NoSQL presentation of basic principles.

Published in: Technology

  • NoSQL in the context of Social Web

    1. 1. NoSQL In the Context of Social Web Summer of Web 2010 Bogdan Gaza
    2. 2. About me • Student at Faculty of Computer Science - first year • Ruby & Rails fan • Building RailsAdmin for RubySOC 2010 • Interested in Scalability and High-Availability • http://twitter.com/hurrycane
    3. 3. Data, data Everywhere
    4. 4. Data growth on the web • Facebook Photos +25TB/week • Twitter +7TB/day • Flickr +21GB/hour • +150GB of tweets while I give this talk • All these amounts multiply every year
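As a rough check on the Twitter figures on this slide, the sketch below relates the 150GB claim to the 7TB/day rate. It assumes decimal units (1 TB = 1000 GB) and a constant rate over the day; the function name is illustrative and not from the talk.

```javascript
// Assumption: decimal units (1 TB = 1000 GB) and a constant data rate.
const GB_PER_TB = 1000;
const MINUTES_PER_DAY = 24 * 60;

// Minutes needed to accumulate `targetGb` at a rate of `tbPerDay`.
function minutesToAccumulate(targetGb, tbPerDay) {
  const gbPerMinute = (tbPerDay * GB_PER_TB) / MINUTES_PER_DAY;
  return targetGb / gbPerMinute;
}

// How long would a talk have to be for 150 GB of tweets to arrive?
const talkMinutes = minutesToAccumulate(150, 7);
console.log(talkMinutes.toFixed(1)); // roughly half an hour
```

Under these assumptions the two numbers are consistent with a talk of about thirty minutes.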
    5. 5. Data size • Exabytes of data stored on the web, 2006 to 2010 (source: IDC) • Bar values in ascending order: 161, 253, 397, 623, 988 • 1 exabyte = 10^18 bytes
    6. 6. How to store this huge amount of data?
    7. 7. Databases • More T: RDBMS (MySQL, Oracle), online transaction processing (OLTP) • Less T: NoSQL (MongoDB, CouchDB)
    8. 8. RDBMS • Relational database management system • Based on E. F. Codd's relational model (1969) • Dominant for transactional & analytical applications • MySQL is among the most popular and easy-to-use RDBMSs
    9. 9. RDBMS performance • Chart: performance plotted against data complexity • Workloads shown: salary list, majority of web apps, social networks, trend analysis
    10. 10. NoSQL • Doesn’t mean No to SQL • It actually means Not Only SQL • A class of data stores that may not require fixed table schemas (Wikipedia) • Many of them are influenced by Google’s BigTable
    11. 11. NoSQL Categories • Key-Value stores: Voldemort • BigTable clones: HBase • Document Databases: MongoDB, CouchDB • Graph Databases: Neo4j
    12. 12. Common grounds • They all support huge amounts of data • The majority of them support replication and sharding • Have some sort of failure detection mechanism • Can scale to the complexity of the data they store
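To make sharding and replication more concrete, here is a minimal sketch of hash-based placement. It is not taken from any of the stores named in the talk; the hash function, the modulo placement, and the "next shards hold the replicas" rule are all simplifying assumptions for illustration.

```javascript
// Simple deterministic string hash (a djb2 variant); an assumption for
// illustration, not the hash used by any particular NoSQL store.
function hashKey(key) {
  let h = 5381;
  for (let i = 0; i < key.length; i++) {
    h = ((h * 33) ^ key.charCodeAt(i)) >>> 0; // keep it an unsigned 32-bit value
  }
  return h;
}

// Sharding: every key maps to exactly one primary shard.
function shardFor(key, shardCount) {
  return hashKey(key) % shardCount;
}

// Replication: the primary shard plus the following shards hold copies.
function replicasFor(key, shardCount, replicas) {
  const first = shardFor(key, shardCount);
  const result = [];
  for (let i = 0; i < Math.min(replicas, shardCount); i++) {
    result.push((first + i) % shardCount);
  }
  return result;
}

console.log(replicasFor("user:42", 4, 2)); // two distinct shard numbers in 0..3
```

Plain modulo placement reshuffles most keys when the shard count changes, which is why production systems tend to use schemes such as consistent hashing instead.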
    13. 13. Key-Value stores • Scale to huge amounts of data • Can handle massive load • Based on Amazon’s Dynamo • A big persistent associative array • Examples: Voldemort, Tokyo Tyrant/Cabinet
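The "big associative array" idea can be sketched in a few lines. This is an in-memory toy, assuming nothing about Voldemort or Tokyo Cabinet internals: persistence, distribution, and load handling are left out, and the class name is made up.

```javascript
// Toy key-value store: the store treats every value as an opaque blob.
// Persistence and distribution are deliberately omitted.
class KeyValueStore {
  constructor() {
    this.data = new Map();
  }
  put(key, value) {
    this.data.set(key, value);
  }
  get(key) {
    return this.data.get(key); // undefined when the key is absent
  }
  delete(key) {
    return this.data.delete(key);
  }
}

const store = new KeyValueStore();
// The application serializes the value; the store never looks inside it.
store.put("session:abc", JSON.stringify({ user: "joe", cart: [1, 2] }));
console.log(JSON.parse(store.get("session:abc")).user); // joe
```

Because values are opaque, the only lookup available is by key; anything like "find all sessions for joe" has to be built by the application.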
    14. 14. BigTable clones • Tables similar to RDBMS but semi-structured • Based on Google’s BigTable paper • Column oriented • Examples: HBase, Cassandra
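A rough way to picture the semi-structured, column-oriented layout is a nested map: row key, then column family, then column name. The sketch below assumes that simplified model; it ignores timestamps, versions, and on-disk layout, and the class and data are hypothetical.

```javascript
// Simplified wide-column table: column families are fixed up front,
// but each row may carry a different set of columns inside a family.
class WideColumnTable {
  constructor(families) {
    this.families = new Set(families);
    this.rows = new Map(); // rowKey -> family -> column -> value
  }
  put(rowKey, family, column, value) {
    if (!this.families.has(family)) {
      throw new Error(`unknown column family: ${family}`);
    }
    if (!this.rows.has(rowKey)) this.rows.set(rowKey, new Map());
    const row = this.rows.get(rowKey);
    if (!row.has(family)) row.set(family, new Map());
    row.get(family).set(column, value);
  }
  get(rowKey, family, column) {
    return this.rows.get(rowKey)?.get(family)?.get(column);
  }
}

const usersTable = new WideColumnTable(["profile", "stats"]);
usersTable.put("joe", "profile", "name", "Joe");
usersTable.put("joe", "stats", "followers", 120);
usersTable.put("ann", "profile", "city", "Iasi"); // a column "joe" does not have
console.log(usersTable.get("joe", "stats", "followers")); // 120
```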
    15. 15. Document databases • Similar to Key-Value Stores but the DB knows what the Value is • Collection of key-value collections • Documents are often versioned • Examples: CouchDB, MongoDB, Redis
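Because the database "knows what the value is", it can match on fields inside a document rather than only on the key. The following is a minimal in-memory sketch of that idea, assuming exact-match queries only; `find` here is a stand-in written for illustration, not a real driver call, and the sample data is invented.

```javascript
// Return every document whose fields equal all fields in `query`.
function find(collection, query) {
  return collection.filter((doc) =>
    Object.entries(query).every(([field, expected]) => doc[field] === expected)
  );
}

// Documents in one collection need not share the same fields.
const users = [
  { _id: 1, first_name: "Ann", last_name: "Smith" },
  { _id: 2, first_name: "Bob", last_name: "Jones", twitter: "@bob" },
  { _id: 3, last_name: "Smith", tags: ["admin"] },
];

const smiths = find(users, { last_name: "Smith" });
console.log(smiths.map((u) => u._id)); // [ 1, 3 ]
```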
    16. 16. Graph databases • Focus on the structure of data • Scale to the complexity of data • Data stored in nodes • Lots of cool graph algorithms can be implemented • Examples: Neo4J, FlockDB
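One example of a graph algorithm that fits the social web is "who is within two hops of this user". The sketch below uses a breadth-first search over an in-memory adjacency list; the graph data and function name are invented for illustration, and a real graph database would expose this through its own traversal API.

```javascript
// Hypothetical "follows" graph stored as an adjacency list.
const follows = new Map([
  ["ana", ["bob", "carl"]],
  ["bob", ["dan"]],
  ["carl", ["dan", "eve"]],
  ["dan", ["fay"]],
  ["eve", []],
  ["fay", []],
]);

// Breadth-first search: map of reachable node -> hop count, up to maxHops.
function withinHops(graph, start, maxHops) {
  const distance = new Map([[start, 0]]);
  const queue = [start];
  while (queue.length > 0) {
    const node = queue.shift();
    const d = distance.get(node);
    if (d === maxHops) continue; // do not expand past the hop limit
    for (const next of graph.get(node) || []) {
      if (!distance.has(next)) {
        distance.set(next, d + 1);
        queue.push(next);
      }
    }
  }
  distance.delete(start); // the start node is not its own neighbour
  return distance;
}

const reach = withinHops(follows, "ana", 2);
console.log([...reach.keys()]); // [ 'bob', 'carl', 'dan', 'eve' ]
```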
    17. 17. But how do I query it? • RESTful interface • Query APIs • SPARQL • Gremlin - graph traversal language • GQL - SQL-like query language for Google BigTable
    18. 18. Who uses NoSQL?
    19. 19. Who uses NoSQL? • BigTable • Cassandra • FlockDB • Dynamo • Voldemort • MongoDB
    20. 20. MongoDB • Scalable, high-performance, open-source, document-oriented database • Somewhere between key-value stores and document databases • Big community • Tons of features
    21. 21. • JSON-style documents { author: 'joe', created : new Date('03-28-2009'), title : 'Yet another blog post' } • Schema free • Flexibility • Data is stored in collections
    22. 22. • Dynamic queries db.people.update( { name:"Joe" }, { $inc: { n : 1 } } ); • Replication - very easy to set up • Indexing • Sharding • GeoSpatial Index - location based queries
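To show what the `$inc` update on this slide does to a document, here is an in-memory sketch. It is an emulation under stated assumptions, not the MongoDB implementation: only `$inc` and exact-match queries are handled, only the first matching document is changed, and the helper names are made up.

```javascript
// Return a copy of `doc` with each listed field raised by the given amount.
// A missing field is treated as 0, so it ends up set to the increment.
function applyInc(doc, increments) {
  const updated = { ...doc };
  for (const [field, amount] of Object.entries(increments)) {
    updated[field] = (updated[field] ?? 0) + amount;
  }
  return updated;
}

// Apply a { $inc: ... } modifier to the first document matching `query`.
function updateFirst(collection, query, modifier) {
  const index = collection.findIndex((doc) =>
    Object.entries(query).every(([field, value]) => doc[field] === value)
  );
  if (index === -1) return false;
  collection[index] = applyInc(collection[index], modifier.$inc || {});
  return true;
}

const people = [{ name: "Joe", n: 4 }, { name: "Ann" }];
updateFirst(people, { name: "Joe" }, { $inc: { n: 1 } });
console.log(people[0].n); // 5
```

The point of a modifier like this is that the counter changes in one step on the server side, instead of the client reading the document, adding one, and writing it back.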
    23. 23. • Map Reduce search res = db.events.mapReduce(m, r, { query : {type:'sale'} }); • Simple querying db.users.find({'last_name': 'Smith'}) • GridFS - for storing large files • Support for many programming languages
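The slide does not show the `m` and `r` functions passed to `mapReduce`. The sketch below supplies hypothetical ones (total sale amount per product) and a small in-memory stand-in for the call, so the data flow can be followed end to end. The `product` and `amount` fields are assumptions, and unlike MongoDB, where `emit` is available inside the map function, this emulation passes `emit` in as an argument.

```javascript
// In-memory stand-in for db.events.mapReduce(m, r, { query }).
function mapReduce(docs, m, r, options = {}) {
  const query = options.query || {};
  const groups = new Map(); // key -> list of emitted values
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) {
    const matches = Object.entries(query).every(([f, v]) => doc[f] === v);
    if (matches) m.call(doc, emit); // the document is `this`, as in MongoDB
  }
  // Reduce is only needed when a key received more than one value.
  return [...groups].map(([key, values]) => ({
    _id: key,
    value: values.length === 1 ? values[0] : r(key, values),
  }));
}

// Hypothetical map and reduce functions: sum sale amounts per product.
const m = function (emit) { emit(this.product, this.amount); };
const r = (key, values) => values.reduce((sum, v) => sum + v, 0);

const events = [
  { type: "sale", product: "book", amount: 10 },
  { type: "view", product: "book", amount: 0 },
  { type: "sale", product: "book", amount: 5 },
  { type: "sale", product: "pen", amount: 2 },
];

const res = mapReduce(events, m, r, { query: { type: "sale" } });
console.log(res); // [ { _id: 'book', value: 15 }, { _id: 'pen', value: 2 } ]
```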
    24. 24. Don’t search for: One database to rule them all!
    25. 25. Use the best suited storage for each kind of data
    26. 26. Thanks!
    27. 27. One more thing!
    28. 28. Hummingbird • Real time Web Traffic Visualiser • Lets you see visitors interacting with your site in real time • Did I say real time?
    29. 29. Hummingbird technology • Node.js: evented TCP server written in JavaScript, powered by the V8 JS engine • WebSockets • SVG Graphs • Real time = 20 times per second
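"20 times per second" works out to one update every 50 ms. The sketch below shows one way such a loop could be structured: hits are buffered as they arrive and a snapshot is pushed on each tick. This is not Hummingbird's actual code; the class, the `send` callback, and the buffering scheme are assumptions made for illustration.

```javascript
const UPDATES_PER_SECOND = 20;
const TICK_MS = 1000 / UPDATES_PER_SECOND; // 50 ms between updates

// Collects page hits between ticks.
class HitBuffer {
  constructor() {
    this.counts = new Map(); // path -> hits since the last flush
  }
  record(path) {
    this.counts.set(path, (this.counts.get(path) || 0) + 1);
  }
  // Return the hits seen since the last flush and start a new window.
  flush() {
    const snapshot = Object.fromEntries(this.counts);
    this.counts.clear();
    return snapshot;
  }
}

// `send` is a hypothetical callback, e.g. one that writes to open WebSockets.
// Returns the timer handle so the caller can stop it with clearInterval.
function startTicker(buffer, send) {
  return setInterval(() => send(buffer.flush()), TICK_MS);
}

const buffer = new HitBuffer();
buffer.record("/home");
buffer.record("/home");
buffer.record("/pricing");
console.log(buffer.flush()); // { '/home': 2, '/pricing': 1 }
```

Batching per tick keeps the number of messages constant regardless of traffic, at the cost of up to one tick of delay.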
    30. 30. Hummingbird technology • Tracking pixel to gather data • Build over
