DevNexus 2011

"An Introduction to NoSQL"
DevNexus talk 3/21/2011

DevNexus 2011 Presentation Transcript

  • 1. An Introduction to NoSQL
      • Brad Anderson - DevNexus
      • March 21, 2011
  • 2. Me
      • ‘boorad’ most places (twitter, github, etc.)
      • Erlang Programmer: Cloudant BigCouch, Ericsson Monaco, Verdeeco
      • Java, Python, D, Javascript, Common Lisp
      • NoSQL East - October 2009
      • Data Warehousing / Big Data
      • pre-lunch talks... always.
  • 3. Agenda
      • NoSQL is BULLSHIT
      • You Don’t Need It
      • You Can’t Query It
  • 4. The Name
      • Play on MySQL (Eric Evans, Rackspace)
      • Not Only SQL (Emil Eifrem)
      • Broad Umbrella
      • Shitty Marketing Term and we’re stuck with it
  • 5. Why do you need NoSQL?
  • 6. Why do you need NoSQL? YOU DON’T!
  • 7. Seriously, you don’t...
      • Vastly different performance characteristics
      • Immature APIs and tools / ecosystems
      • Bugs, most are actively being developed
      • Your situation doesn’t warrant it
  • 8. Why do they exist?
      • Every one of these new data storage systems came from a particular pain someone was having.
      • Each system was created to specifically solve the pain point the authors were experiencing.
      • This pain usually involves a metric shit-tonne of data and distributed processing is required.
      • Schema-free
  • 9. Prediction: Pain
  • 10. Examples
      • Google - index Internet (mapreduce/bigtable)
      • Yahoo - keep up with Google (Hadoop)
      • Amazon - shopping cart (Dynamo)
      • Facebook - inbox search (Cassandra)
      • Lotus - Notes legacy restrictions (CouchDB)
      • Cloudant - physics research (BigCouch)
      • Basho - CRM product (Riak)
      • Neo - graph traversal (Neo4J)
  • 11. Pain of Scaling
      • Scale Reads with master-slave replication
      • Scale Writes with master-master replication
      • Partitioning Vertically (by functional groups)
      • Partitioning Horizontally (by key, e.g. ‘date’)
      • Caching works, kinda
  • 12. What to do?
      • Distribute both data and processing (horizontal scaling)
      • Organize data differently
      • Use appropriate on-disk storage
  • 13. Sorting Hat Says...
      • Distribution Model
      • Data Model
      • Disk Data Structure
  • 14. Distribution Model
      • Embedded (no distribution)
      • Replication / Sharding
      • Chord - peer to peer
      • Dynamo: consistent hashing, vnodes, vector clocks
  • 15. No Distribution
      • BDB
      • Neo4J
  • 16. Replication / Sharding Distribution
      • MongoDB
      • CouchDB
      • Redis
  • 17. Dynamo Distribution
      • BigCouch
      • Riak
      • Voldemort
      • Cassandra (no vnodes, no vector clocks)
      • Hibari ?
  • 18. Dynamo - how does it work?
      • N=3, W=2
      • [Ring diagram: Nodes 1-4 arranged on a hash ring, each holding several lettered partitions]
  • 19. Dynamo - how does it work?
      • PUT http://boorad.cloudant.com/dbname/blah?w=2
      • N=3, W=2
      • [Ring diagram, as on slide 18]
  • 20. Dynamo - how does it work?
      • PUT http://boorad.cloudant.com/dbname/blah?w=2
      • N=3, W=2
      • [Ring diagram, as on slide 18]
  • 21. Dynamo - how does it work?
      • PUT http://boorad.cloudant.com/dbname/blah?w=2
      • N=3, W=2
      • [Ring diagram, as on slide 18, with hash(blah) marked on the ring]
  • 22. Dynamo - how does it work?
      • PUT http://boorad.cloudant.com/dbname/blah?w=2
      • N=3, W=2
      • [Ring diagram, as on slide 18, with hash(blah) marked on the ring]
  • 23. CAP Theorem
      • Pick Two (at any given time): Consistency, Availability, Partition Tolerance
      • CP refuses requests, AP eventually consistent
      • Must Read: http://codahale.com/you-cant-sacrifice-partition-tolerance/
  • 24. Data Model
      • Key/Value
      • Document
      • Column
      • Graph
  • 25. Key / Value
      • BDB
      • Riak
      • Voldemort
      • Redis
      • Hibari
  • 26. Document
      • CouchDB
      • MongoDB
      • SimpleDB
  • 27. Column Stores
      • HBase
      • Cassandra
      • Hypertable
  • 28. Graph Databases
      • Neo4J
      • AllegroGraph
      • FlockDB
  • 29. Disk Data Structure
      • btree - many different kinds
      • mmap - compact bson
      • memtable/sstable or log structured merge tree
      • log-structured linear hashing
      • adjacency lists / adjacency matrices
  • 30. Querying NoSQL
      • Key Lookups: fast, easy, limiting
      • Secondary Indexes: immature part of most systems; roll your own
      • MapReduce
      • Mongo query language
  • 31. Polyglot Persistence
      • [Diagram labels: Raw Data, Hadoop, batch processes, RDBMS, NoSQL, Cache, Apps]
  • 32. Drivers
      • Spring: commons, hadoop, kv, document, graph; membase, hbase, cassandra coming
      • Serialization: Thrift, Protocol Buffers, Avro
      • Native: Cassandra, Hadoop, Voldemort
      • JInterface to Erlang?
  • 33. Good Luck! You’ll Need It.
  • 34. Questions?
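
The Dynamo walkthrough on slides 18-22 (hash the key onto a ring, store it on N=3 nodes, acknowledge the write once W=2 of them respond) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not BigCouch's or any other product's implementation: the `Ring` class, the `put` function, the node names, and the choice of MD5 are all hypothetical, each node owns a single point on the ring (no vnodes), and replicas are written one after another rather than in parallel.

```python
import bisect
import hashlib


def ring_position(value: str) -> int:
    """Map a string to an integer position on the ring (MD5 is an arbitrary choice here)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical consistent-hash ring: one point per node, no vnodes."""

    def __init__(self, nodes, n=3):
        if n > len(nodes):
            raise ValueError("n cannot exceed the number of nodes")
        self.n = n
        # Sort nodes by their position so we can walk the ring clockwise.
        self._points = sorted((ring_position(name), name) for name in nodes)
        self._positions = [pos for pos, _ in self._points]

    def preference_list(self, key: str):
        """Return the n nodes found by walking clockwise from hash(key)."""
        start = bisect.bisect_right(self._positions, ring_position(key))
        count = len(self._points)
        return [self._points[(start + i) % count][1] for i in range(self.n)]


def put(ring, stores, key, value, w=2, down=frozenset()):
    """Write to every reachable replica; succeed if at least w of them accept.

    `stores` maps node name -> dict and stands in for each node's local storage.
    `down` simulates unreachable nodes.
    """
    acks = 0
    for node in ring.preference_list(key):
        if node in down:
            continue  # an unreachable replica contributes no acknowledgement
        stores[node][key] = value
        acks += 1
    return acks >= w


if __name__ == "__main__":
    names = ["node1", "node2", "node3", "node4"]
    ring = Ring(names, n=3)
    stores = {name: {} for name in names}
    print(ring.preference_list("blah"))
    print(put(ring, stores, "blah", "some document", w=2))
```

With N=3 and W=2, the write still succeeds when one of the three replicas is unreachable and fails when two are, which is the availability trade-off the `?w=2` parameter on the slides expresses.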