NoSQL in the context of Social Web
NoSQL presentation of basic principles.

Usage Rights

© All Rights Reserved

Presentation Transcript

  • NoSQL in the Context of Social Web • Summer of Web 2010 • Bogdan Gaza
  • About me • Student at Faculty of Computer Science - first year • Ruby & Rails fan • Building RailsAdmin for RubySOC 2010 • Interested in Scalability and High-Availability • http://twitter.com/hurrycane
  • Data, data Everywhere
  • Data growth on the web • Facebook Photos +25TB/week • Twitter +7TB/day • Flickr +21GB/hour • +150GB of tweets while I give this talk • All these amounts multiply every year
  • Data size • Exabytes of data stored on the web (source: IDC) • 2006: 161 • 2007: 253 • 2008: 397 • 2009: 623 • 2010: 988 • 1 exabyte = 10^18 bytes
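The growth rate implied by the chart can be checked with a little arithmetic. The following is a minimal sketch: it assumes the five chart values pair with the five years in ascending order, and the names `exabytesByYear` and `growthFactors` are illustrative only.

```javascript
// Exabytes stored per year, as read from the chart
// (assumption: values pair with years in ascending order).
const exabytesByYear = [
  { year: 2006, eb: 161 },
  { year: 2007, eb: 253 },
  { year: 2008, eb: 397 },
  { year: 2009, eb: 623 },
  { year: 2010, eb: 988 },
];

// Ratio between each year's total and the previous year's total.
function growthFactors(series) {
  const factors = [];
  for (let i = 1; i < series.length; i++) {
    factors.push(series[i].eb / series[i - 1].eb);
  }
  return factors;
}

const factors = growthFactors(exabytesByYear);

// Compound annual growth factor over the four year-to-year steps.
const first = exabytesByYear[0].eb;
const last = exabytesByYear[exabytesByYear.length - 1].eb;
const compoundFactor = Math.pow(last / first, 1 / (exabytesByYear.length - 1));

console.log(factors.map((f) => f.toFixed(2)), compoundFactor.toFixed(2));
```

Under that pairing, every year holds a bit more than one and a half times the data of the year before.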
  • How to store this huge amount of data?
  • Databases • More T: RDBMS (MySQL, Oracle), online transaction processing (OLTP) • Less T: NoSQL (MongoDB, CouchDB)
  • RDBMS • Relational database management system • Based on E. F. Codd's relational model (1969) • Dominant for transactional & analytical applications • MySQL is among the most popular and easiest to use RDBMSs
  • RDBMS performance • Figure: performance plotted against data complexity, with points labeled Salary list, Majority of web apps, Social networks, Trend analysis
  • NoSQL • Doesn’t mean No to SQL • It actually means Not Only SQL • A class of data stores that may not require fixed table schemas (Wikipedia) • Many of them are inspired by Google’s BigTable
  • NoSQL Categories • Key-Value stores: Voldemort • BigTable clones: HBase • Document Databases: MongoDB, CouchDB • Graph Databases: Neo4j
  • Common ground • They all support huge amounts of data • The majority of them support replication and sharding • Have some sort of failure detection mechanism • Can scale to the complexity of the data they store
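Sharding and replication can be illustrated with a small sketch. This is not how any particular store listed here does it: the hash function, the node names, and the "next nodes in the list hold the copies" rule are all assumptions made for illustration.

```javascript
// Simple deterministic string hash (illustrative, not cryptographic).
function hashKey(key) {
  let h = 0;
  for (let i = 0; i < key.length; i++) {
    h = (h * 31 + key.charCodeAt(i)) >>> 0; // keep as unsigned 32-bit
  }
  return h;
}

// Sharding: the hash picks the primary node for a key.
// Replication: the following (replicas - 1) nodes hold copies.
function nodesFor(key, nodes, replicas) {
  const start = hashKey(key) % nodes.length;
  const chosen = [];
  const count = Math.min(replicas, nodes.length);
  for (let i = 0; i < count; i++) {
    chosen.push(nodes[(start + i) % nodes.length]);
  }
  return chosen;
}

// Hypothetical cluster of four nodes, two copies of every key.
const clusterNodes = ["node-a", "node-b", "node-c", "node-d"];
console.log(nodesFor("user:42", clusterNodes, 2));
```

Because the placement depends only on the key, any client can work out where a key lives without asking a central coordinator. Real systems usually use consistent hashing so that adding a node moves only a fraction of the keys.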
  • Key-Value stores • Scale to huge amounts of data • Can handle massive load • Based on Amazon’s Dynamo • A big persistent associative array • Examples: Voldemort, Tokyo Tyrant/Cabinet
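The "big persistent associative array" idea can be sketched in a few lines. This is a toy model under stated assumptions: the class name `KeyValueStore` is hypothetical, and persistence is stood in for by serializing to a JSON string rather than writing to disk.

```javascript
// Toy key-value store: the store never looks inside the value.
class KeyValueStore {
  constructor(entries = []) {
    this.data = new Map(entries);
  }

  put(key, value) {
    this.data.set(key, value);
  }

  get(key) {
    return this.data.get(key); // undefined when the key is absent
  }

  delete(key) {
    return this.data.delete(key);
  }

  // Stand-in for persistence: a real store would write this to disk.
  snapshot() {
    return JSON.stringify([...this.data]);
  }

  static restore(serialized) {
    return new KeyValueStore(JSON.parse(serialized));
  }
}

const store = new KeyValueStore();
store.put("session:abc", { user: "joe" });
const reloaded = KeyValueStore.restore(store.snapshot());
console.log(reloaded.get("session:abc"));
```

The only operations are put, get, and delete by key, which is what makes this model easy to spread over many machines.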
  • BigTable clones • Tables similar to RDBMS but semi-structured • Based on Google’s BigTable paper • Column oriented • Examples: HBase, Cassandra
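A rough way to picture the semi-structured, column-oriented table is a map from row key to column family to column name. The sketch below is a simplification made for illustration: the class `ColumnTable` and the sample families are hypothetical, and the per-cell timestamps described in the BigTable paper are left out.

```javascript
// Toy model of a column-family table: families are fixed up front,
// but each row may carry a different set of columns inside a family.
class ColumnTable {
  constructor(families) {
    this.families = new Set(families);
    this.rows = new Map();
  }

  put(rowKey, family, column, value) {
    if (!this.families.has(family)) {
      throw new Error(`unknown column family: ${family}`);
    }
    if (!this.rows.has(rowKey)) this.rows.set(rowKey, {});
    const row = this.rows.get(rowKey);
    row[family] = row[family] || {};
    row[family][column] = value;
  }

  get(rowKey, family, column) {
    const row = this.rows.get(rowKey);
    return row && row[family] ? row[family][column] : undefined;
  }
}

// Hypothetical table with two column families.
const users = new ColumnTable(["profile", "stats"]);
users.put("joe", "profile", "email", "joe@example.com");
users.put("joe", "stats", "logins", 12);
users.put("ann", "profile", "city", "Iasi"); // different columns than "joe"
console.log(users.get("joe", "stats", "logins"));
```

Rows "joe" and "ann" share the table but not the same columns, which is the sense in which the table is semi-structured.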
  • Document databases • Similar to Key-Value stores but the DB knows what the Value is • Collection of key-value collections • Documents are often versioned • Examples: CouchDB, MongoDB, Redis (Redis is more commonly classed as a key-value store)
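Because the database can see inside the value, it can query by field and track document versions. The following is a minimal sketch of both ideas: the `Collection` class is hypothetical, and the `_id` and `_rev` field names are borrowed loosely from common conventions, with `_rev` simplified to an integer counter.

```javascript
// Toy document collection: values are plain objects the store can inspect.
class Collection {
  constructor() {
    this.docs = new Map();
    this.nextId = 1;
  }

  insert(doc) {
    const id = this.nextId++;
    this.docs.set(id, { ...doc, _id: id, _rev: 1 });
    return id;
  }

  // Every update produces a new revision of the document.
  update(id, changes) {
    const old = this.docs.get(id);
    if (!old) throw new Error(`no document with id ${id}`);
    const next = { ...old, ...changes, _id: id, _rev: old._rev + 1 };
    this.docs.set(id, next);
    return next;
  }

  // Query by field values, which a pure key-value store cannot do.
  find(criteria) {
    return [...this.docs.values()].filter((doc) =>
      Object.entries(criteria).every(([field, value]) => doc[field] === value)
    );
  }
}

const posts = new Collection();
const postId = posts.insert({ author: "joe", title: "Yet another blog post" });
posts.insert({ author: "ann", title: "Hello" });
posts.update(postId, { title: "Yet another blog post (edited)" });
console.log(posts.find({ author: "joe" }));
```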
  • Graph databases • Focus on the structure of data • Scale to the complexity of data • Data stored in nodes • Lots of cool graph algorithms can be implemented • Examples: Neo4j, FlockDB
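A typical social-web graph query is "friends of friends". The sketch below shows the idea on an in-memory adjacency structure; the `Graph` class, the `friendsOfFriends` function, and the sample names are all illustrative assumptions and not the API of Neo4j or FlockDB.

```javascript
// Toy undirected graph: each node maps to the set of its neighbors.
class Graph {
  constructor() {
    this.adjacency = new Map();
  }

  addNode(name) {
    if (!this.adjacency.has(name)) this.adjacency.set(name, new Set());
  }

  addEdge(a, b) {
    this.addNode(a);
    this.addNode(b);
    this.adjacency.get(a).add(b);
    this.adjacency.get(b).add(a);
  }

  neighbors(name) {
    return this.adjacency.get(name) || new Set();
  }
}

// People exactly two hops away: friends of my friends who are
// neither me nor already my direct friends.
function friendsOfFriends(graph, start) {
  const direct = graph.neighbors(start);
  const result = new Set();
  for (const friend of direct) {
    for (const candidate of graph.neighbors(friend)) {
      if (candidate !== start && !direct.has(candidate)) result.add(candidate);
    }
  }
  return result;
}

const social = new Graph();
social.addEdge("alice", "bob");
social.addEdge("alice", "dave");
social.addEdge("bob", "carol");
social.addEdge("dave", "carol");
social.addEdge("carol", "erin");
console.log([...friendsOfFriends(social, "alice")]);
```

In a relational schema the same question needs a self-join per hop, which is why highly connected data is a natural fit for a graph store.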
  • But how do I query it? • RESTful interface • Query APIs • SPARQL • Gremlin - graph traversal language • GQL - SQL-like query language for Google BigTable
  • Who uses NoSQL?
  • Who uses NoSQL? • BigTable • Cassandra • FlockDB • Dynamo • Voldemort • MongoDB
  • MongoDB • Scalable, high-performance, open-source, document-oriented database • Somewhere between key-value stores and traditional RDBMSs • Big community • Tons of features
  • MongoDB • JSON-style documents: { author: 'joe', created: new Date('03-28-2009'), title: 'Yet another blog post' } • Schema free • Flexibility • Data is stored in collections
  • MongoDB • Dynamic queries: db.people.update( { name: "Joe" }, { $inc: { n: 1 } } ); • Replication - very easy to set up • Indexing • Sharding • GeoSpatial index - location-based queries
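What the update on the slide does can be mimicked on a plain array, without a database. This is a sketch of the semantics only: `updateOne` is a hypothetical helper written for this illustration, it handles just the `$inc` modifier and exact-match queries, and it changes only the first matching document.

```javascript
// In-memory stand-in for db.people.update(query, { $inc: ... }).
// Returns the number of documents modified (0 or 1).
function updateOne(collection, query, modifier) {
  const doc = collection.find((candidate) =>
    Object.entries(query).every(([field, value]) => candidate[field] === value)
  );
  if (!doc) return 0;
  for (const [field, amount] of Object.entries(modifier.$inc || {})) {
    // A missing field is treated as 0 before incrementing.
    doc[field] = (doc[field] || 0) + amount;
  }
  return 1;
}

const people = [{ name: "Joe", n: 4 }, { name: "Ann" }];
updateOne(people, { name: "Joe" }, { $inc: { n: 1 } });
updateOne(people, { name: "Ann" }, { $inc: { n: 1 } });
console.log(people);
```

The point of `$inc` is that the increment is expressed as an operation on the stored document, so the client does not have to read the value, add one, and write it back.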
  • MongoDB • Map Reduce search: res = db.events.mapReduce(m, r, { query: { type: 'sale' } }); • Simple querying: db.users.find({ 'last_name': 'Smith' }) • GridFS - for storing large files • Support for many programming languages
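The slide does not show what `m` and `r` contain. The sketch below fills them in with hypothetical functions that total the sale amount per item, and runs them through a small in-memory `mapReduce` helper written for this illustration. The `item` and `amount` fields are assumptions, and unlike the MongoDB shell, where `emit` is available globally, here it is passed to the map function as an argument.

```javascript
// In-memory imitation of the map-reduce flow: filter, map, group, reduce.
function mapReduce(docs, map, reduce, options = {}) {
  const query = options.query || {};
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) {
    const matches = Object.entries(query).every(
      ([field, value]) => doc[field] === value
    );
    if (matches) map.call(doc, emit); // the document is bound to `this`
  }
  const results = {};
  for (const [key, values] of groups) {
    // With a single value there is nothing to combine.
    results[key] = values.length === 1 ? values[0] : reduce(key, values);
  }
  return results;
}

// Hypothetical map and reduce functions: total amount per item.
const m = function (emit) {
  emit(this.item, this.amount);
};
const r = (key, values) => values.reduce((sum, value) => sum + value, 0);

const events = [
  { type: "sale", item: "book", amount: 10 },
  { type: "sale", item: "book", amount: 5 },
  { type: "view", item: "book", amount: 0 },
  { type: "sale", item: "pen", amount: 2 },
];

const res = mapReduce(events, m, r, { query: { type: "sale" } });
console.log(res);
```

The map step runs once per matching document and the reduce step once per key, which is what lets a real cluster spread the work over many shards.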
  • Don’t search for: One database to rule them all!
  • Use the best suited storage for each kind of data
  • Thanks!
  • One more thing!
  • Hummingbird • Real-time web traffic visualiser • Lets you see visitors interacting with your site in real time • Did I say real time?
  • Hummingbird technology • Node.js - evented TCP server written in JavaScript, powered by the V8 JS engine • WebSockets • SVG graphs • Real time = 20 times per second
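Twenty updates per second means one update every 1000 / 20 = 50 milliseconds. The sketch below shows one way such a loop could be arranged in Node.js; it is not Hummingbird's actual code. The names `createHitCounter` and `startBroadcast`, and the `send` callback standing in for a WebSocket push, are assumptions for illustration.

```javascript
const UPDATES_PER_SECOND = 20;
const INTERVAL_MS = 1000 / UPDATES_PER_SECOND; // 50 ms between updates

// Accumulates page hits between two broadcasts.
function createHitCounter() {
  let hits = 0;
  return {
    record() {
      hits += 1;
    },
    // Returns the hits seen since the last flush and resets the count.
    flush() {
      const count = hits;
      hits = 0;
      return count;
    },
  };
}

// Pushes the latest count to clients on a fixed timer.
// `send` is a hypothetical callback, for example a WebSocket broadcast.
function startBroadcast(counter, send) {
  return setInterval(() => send({ hits: counter.flush() }), INTERVAL_MS);
}

const counter = createHitCounter();
counter.record();
counter.record();
const firstBatch = counter.flush();
console.log(firstBatch, INTERVAL_MS);
```

Calling `startBroadcast(counter, send)` would start the timer, and passing its return value to `clearInterval` stops it. Batching hits per tick keeps the message rate constant no matter how busy the site is.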
  • Hummingbird technology • Tracking pixel to gather data • Build over