DevNexus 2011

"An Introduction to NoSQL"
DevNexus talk 3/21/2011

Transcript

  • 1. An Introduction to NoSQL
      • Brad Anderson - DevNexus
      • March 21, 2011
  • 2. Me
      • ‘boorad’ most places (twitter, github, etc.)
      • Erlang Programmer: Cloudant BigCouch, Ericsson Monaco, Verdeeco
      • Java, Python, D, Javascript, Common Lisp
      • NoSQL East - October 2009
      • Data Warehousing / Big Data
      • pre-lunch talks... always.
  • 3. Agenda
      • NoSQL is BULLSHIT
      • You Don’t Need It
      • You Can’t Query It
  • 4. The Name
      • Play on MySQL (Eric Evans, Rackspace)
      • Not Only SQL (Emil Eifrem)
      • Broad Umbrella
      • Shitty Marketing Term and we’re stuck with it
  • 5. Why do you need NoSQL?
  • 6. Why do you need NoSQL? YOU DON’T!
  • 7. Seriously, you don’t...
      • Vastly different performance characteristics
      • Immature APIs and tools / ecosystems
      • Bugs, most are actively being developed
      • Your situation doesn’t warrant it
  • 8. Why do they exist?
      • Every one of these new data storage systems came from a particular pain someone was having.
      • Each system was created to specifically solve the pain point the authors were experiencing.
      • This pain usually involves a metric shit-tonne of data and distributed processing is required.
      • Schema-free
  • 9. Prediction: Pain
  • 10. Examples
      • Google - index Internet (mapreduce/bigtable)
      • Yahoo - keep up with Google (Hadoop)
      • Amazon - shopping cart (Dynamo)
      • Facebook - inbox search (Cassandra)
      • Lotus - Notes legacy restrictions (CouchDB)
      • Cloudant - physics research (BigCouch)
      • Basho - CRM product (Riak)
      • Neo - graph traversal (Neo4J)
  • 11. Pain of Scaling
      • Scale Reads with master-slave replication
      • Scale Writes with master-master replication
      • Partitioning Vertically (by functional groups)
      • Partitioning Horizontally (by key, e.g. ‘date’)
      • Caching works, kinda
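
The horizontal-partitioning pain on slide 11 can be made concrete with a small sketch. It assumes the simplest possible scheme, `hash(key) % shard_count`, which the slide does not specify; the function names and the `user:N` key format are invented for illustration.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard with naive modulo hashing (an assumed scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def moved_fraction(keys, old_count: int, new_count: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys if shard_for(k, old_count) != shard_for(k, new_count)
    )
    return moved / len(keys)


if __name__ == "__main__":
    keys = [f"user:{i}" for i in range(10_000)]  # hypothetical keys
    # Growing from 4 to 5 shards relocates most keys, not one shard's worth.
    print(f"moved: {moved_fraction(keys, 4, 5):.0%}")
```

Under this scheme, adding a single shard forces most rows to move, which is one reason the later slides turn to consistent hashing.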
  • 12. What to do?
      • Distribute both data and processing: horizontal scaling
      • Organize data differently
      • Use appropriate on-disk storage
  • 13. Sorting Hat Says...
      • Distribution Model
      • Data Model
      • Disk Data Structure
  • 14. Distribution Model
      • Embedded (no distribution)
      • Replication / Sharding
      • Chord - peer to peer
      • Dynamo: consistent hashing, vnodes, vector clocks
  • 15. No Distribution
      • BDB
      • Neo4J
  • 16. Replication / Sharding Distribution
      • MongoDB
      • CouchDB
      • Redis
  • 17. Dynamo Distribution
      • BigCouch
      • Riak
      • Voldemort
      • Cassandra: no vnodes, no vector clocks
      • Hibari ?
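
Slide 14 lists vector clocks as part of the Dynamo model without showing one. Below is a minimal textbook-style sketch, not the implementation of any system named on these slides; the node names and function names are assumptions made for the example.

```python
from typing import Dict

VectorClock = Dict[str, int]  # node name -> number of updates seen from it


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of the clock with this node's counter bumped."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a: VectorClock, b: VectorClock) -> bool:
    """True if a has seen every update that b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify two versions of a value by their clocks."""
    a_ge_b, b_ge_a = descends(a, b), descends(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "a is newer"
    if b_ge_a:
        return "b is newer"
    return "conflict"  # concurrent writes; something must reconcile them


if __name__ == "__main__":
    base = increment({}, "node1")
    left = increment(base, "node2")   # one replica accepts a write
    right = increment(base, "node3")  # another accepts a different write
    print(compare(left, base))   # a is newer
    print(compare(left, right))  # conflict
```

The "conflict" case is the interesting one: neither version supersedes the other, so the store (or the application) has to merge them.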
  • 18. Dynamo - how does it work?
      • N=3, W=2
      • [Ring diagram: numbered nodes arranged in a circle, each labeled with a group of letters such as A B C D]
  • 19. Dynamo - how does it work?
      • PUT http://boorad.cloudant.com/dbname/blah?w=2
      • N=3, W=2
      • [Ring diagram as on slide 18]
  • 20. Dynamo - how does it work?
      • PUT http://boorad.cloudant.com/dbname/blah?w=2
      • N=3, W=2
      • [Ring diagram as on slide 18]
  • 21. Dynamo - how does it work?
      • PUT http://boorad.cloudant.com/dbname/blah?w=2
      • N=3, W=2
      • hash(blah)
      • [Ring diagram as on slide 18]
  • 22. Dynamo - how does it work?
      • PUT http://boorad.cloudant.com/dbname/blah?w=2
      • N=3, W=2
      • hash(blah)
      • [Ring diagram as on slide 18]
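
The write path on slides 18-22 (hash the document id, find N replicas on the ring, succeed once W of them acknowledge) can be sketched roughly as follows. This is a simplification under stated assumptions: one ring position per node (no vnodes), MD5 as the hash, and a `reachable` set standing in for real network acknowledgements. The `Ring` class and `put` function are illustrative names, not BigCouch or Cloudant APIs.

```python
import bisect
import hashlib
from typing import List, Set


def ring_position(value: str) -> int:
    """Place a node name or key on the ring (MD5 is an assumed choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes: List[str], n: int = 3):
        if n > len(nodes):
            raise ValueError("need at least n distinct nodes")
        self.n = n
        self._points = sorted((ring_position(node), node) for node in nodes)
        self._positions = [pos for pos, _ in self._points]

    def preference_list(self, key: str) -> List[str]:
        """The n nodes found walking clockwise from hash(key)."""
        start = bisect.bisect_right(self._positions, ring_position(key))
        count = len(self._points)
        return [self._points[(start + i) % count][1] for i in range(self.n)]


def put(ring: Ring, key: str, reachable: Set[str], w: int = 2) -> bool:
    """A write succeeds once at least w of the n replicas acknowledge it."""
    acks = [node for node in ring.preference_list(key) if node in reachable]
    return len(acks) >= w


if __name__ == "__main__":
    ring = Ring(["node1", "node2", "node3", "node4"], n=3)
    replicas = ring.preference_list("blah")
    print(replicas)
    # With one of the three replicas down, W=2 can still be satisfied.
    print(put(ring, "blah", reachable=set(replicas[:2]), w=2))
```

With N=3 and W=2, a write tolerates one unavailable replica; the missing copy has to be repaired later, which is where the eventual consistency on the next slide comes from.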
  • 23. CAP Theorem
      • Pick Two (at any given time): Consistency, Availability, Partition Tolerance
      • CP refuses requests, AP eventually consistent
      • Must Read: http://codahale.com/you-cant-sacrifice-partition-tolerance/
  • 24. Data Model
      • Key/Value
      • Document
      • Column
      • Graph
  • 25. Key / Value
      • BDB
      • Riak
      • Voldemort
      • Redis
      • Hibari
  • 26. Document
      • CouchDB
      • MongoDB
      • SimpleDB
  • 27. Column Stores
      • HBase
      • Cassandra
      • Hypertable
  • 28. Graph Databases
      • Neo4J
      • AllegroGraph
      • FlockDB
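
To make the four data models on slides 24-28 less abstract, here is one made-up record ("user 42 follows user 7") laid out in each shape using plain Python structures. The layouts are deliberately simplified assumptions and do not mirror the storage format of any product listed above; every name and value is invented.

```python
import json

# Key/value: the store sees an opaque blob; the application must decode it.
kv_store = {"user:42": json.dumps({"name": "Ada", "follows": ["7"]})}

# Document: the store understands the structure and can address fields.
doc_store = {"42": {"name": "Ada", "follows": ["7"]}}

# Column: row key -> column family -> column name -> value.
column_store = {"42": {"profile": {"name": "Ada"}, "follows": {"7": ""}}}

# Graph: relationships are first-class edges between nodes.
graph_edges = {"42": [("FOLLOWS", "7")], "7": []}


def kv_follows(user_id: str) -> list:
    return json.loads(kv_store[f"user:{user_id}"])["follows"]


def doc_follows(user_id: str) -> list:
    return doc_store[user_id]["follows"]


def column_follows(user_id: str) -> list:
    # Column names double as data: each followed id is a column.
    return sorted(column_store[user_id]["follows"])


def graph_follows(user_id: str) -> list:
    return [dst for label, dst in graph_edges[user_id] if label == "FOLLOWS"]


if __name__ == "__main__":
    for fn in (kv_follows, doc_follows, column_follows, graph_follows):
        print(fn.__name__, fn("42"))
```

All four answer the same question; what changes is how much of the record the store itself can see and index.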
  • 29. Disk Data Structure
      • btree - many different kinds
      • mmap - compact bson
      • memtable/sstable or log structured merge tree
      • log-structured linear hashing
      • adjacency lists / adjacency matrices
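
Of the structures on slide 29, the memtable/sstable pair is the easiest to sketch. The toy below keeps everything in memory and skips compaction, write-ahead logging, and deletes, so it only illustrates the read and write paths; `TinyLSM` and `memtable_limit` are invented names, not part of any real system.

```python
import bisect
from typing import Dict, List, Optional, Tuple

# An "sstable" here is just (sorted keys, matching values), held in memory.
SSTable = Tuple[List[str], List[str]]


class TinyLSM:
    def __init__(self, memtable_limit: int = 2):
        self.memtable_limit = memtable_limit
        self.memtable: Dict[str, str] = {}
        self.sstables: List[SSTable] = []  # oldest first, newest last

    def put(self, key: str, value: str) -> None:
        """Writes only touch the in-memory table, so they stay cheap."""
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        """Freeze the memtable into an immutable, sorted table."""
        if not self.memtable:
            return
        items = sorted(self.memtable.items())
        self.sstables.append(([k for k, _ in items], [v for _, v in items]))
        self.memtable = {}

    def get(self, key: str) -> Optional[str]:
        """Check the memtable first, then sstables from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for keys, values in reversed(self.sstables):
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return values[i]
        return None


if __name__ == "__main__":
    db = TinyLSM(memtable_limit=2)
    db.put("a", "1")
    db.put("b", "2")  # reaches the limit and triggers a flush
    db.put("a", "3")  # newer value shadows the flushed one
    print(db.get("a"), db.get("b"), db.get("missing"))
```

Reads get slower as sstables pile up, which is why real log-structured merge trees periodically merge them.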
  • 30. Querying NoSQL
      • Key Lookups: fast, easy, limiting
      • Secondary Indexes: immature part of most systems; roll your own
      • MapReduce
      • Mongo query language
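
"Roll your own" secondary indexes and MapReduce from slide 30 are closely related: a map function that emits (secondary key, primary key) pairs is already an index builder. The sketch below shows the idea over an in-memory dict; the documents, field names, and function names are all made up, and a real store would maintain the index incrementally rather than rebuilding it.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterator, List, Tuple

# Hypothetical documents keyed by primary id.
docs = {
    "u1": {"name": "Ada", "city": "Atlanta"},
    "u2": {"name": "Grace", "city": "Boston"},
    "u3": {"name": "Linus", "city": "Atlanta"},
}

MapFn = Callable[[str, dict], Iterator[Tuple[str, str]]]


def map_by_city(doc_id: str, doc: dict) -> Iterator[Tuple[str, str]]:
    """Map step: emit (secondary key, primary key) for each document."""
    if "city" in doc:
        yield doc["city"], doc_id


def build_index(store: Dict[str, dict], map_fn: MapFn) -> Dict[str, List[str]]:
    """Group emitted pairs by key; the result is a secondary index."""
    index = defaultdict(list)
    for doc_id, doc in store.items():
        for key, value in map_fn(doc_id, doc):
            index[key].append(value)
    return {key: sorted(ids) for key, ids in index.items()}


def reduce_count(index: Dict[str, List[str]]) -> Dict[str, int]:
    """Reduce step: collapse each group to a count."""
    return {key: len(ids) for key, ids in index.items()}


if __name__ == "__main__":
    by_city = build_index(docs, map_by_city)
    print(by_city["Atlanta"])     # ['u1', 'u3']
    print(reduce_count(by_city))  # {'Atlanta': 2, 'Boston': 1}
```

The lookup by city is now a key lookup again, at the cost of keeping the index in step with every write.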
  • 31. Polyglot Persistence
      • [Diagram with labels: Raw Data, Hadoop, batch processes, RDBMS, Cache, NoSQL, Apps, NoSQL]
  • 32. Drivers
      • Spring: commons, hadoop, kv, document, graph; membase, hbase, cassandra coming
      • Serialization: Thrift, Protocol Buffers, Avro
      • Native: Cassandra, Hadoop, Voldemort
      • JInterface to Erlang?
  • 33. Good Luck! You’ll Need It.
  • 34. Questions?