DevNexus 2011

"An Introduction to NoSQL"
DevNexus talk 3/21/2011

Published in: Technology

  • Transcript

    • 1. An Introduction to NoSQL
      • Brad Anderson - DevNexus
      • March 21, 2011
    • 2. Me
      • ‘boorad’ most places (Twitter, GitHub, etc.)
      • Erlang Programmer
        • Cloudant BigCouch, Ericsson Monaco, Verdeeco
        • Java, Python, D, JavaScript, Common Lisp
      • NoSQL East - October 2009
      • Data Warehousing / Big Data
      • pre-lunch talks... always.
    • 3. Agenda
      • NoSQL is BULLSHIT
      • You Don’t Need It
      • You Can’t Query It
    • 4. The Name
      • Play on MySQL (Eric Evans, Rackspace)
      • Not Only SQL (Emil Eifrem)
      • Broad Umbrella
      • Shitty Marketing Term, and we’re stuck with it
    • 5. Why do you need NoSQL?
    • 6. Why do you need NoSQL? YOU DON’T!
    • 7. Seriously, you don’t...
      • Vastly different performance characteristics
      • Immature APIs and tools / ecosystems
      • Bugs, most are actively being developed
      • Your situation doesn’t warrant it
    • 8. Why do they exist?
      • Every one of these new data storage systems came from a particular pain someone was having.
      • Each system was created to specifically solve the pain point the authors were experiencing.
      • This pain usually involves a metric shit-tonne of data, and distributed processing is required.
      • Schema-free
    • 9. Prediction: Pain
    • 10. Examples
      • Google - index Internet (MapReduce/Bigtable)
      • Yahoo - keep up with Google (Hadoop)
      • Amazon - shopping cart (Dynamo)
      • Facebook - inbox search (Cassandra)
      • Lotus - Notes legacy restrictions (CouchDB)
      • Cloudant - physics research (BigCouch)
      • Basho - CRM product (Riak)
      • Neo - graph traversal (Neo4J)
    • 11. Pain of Scaling
      • Scale Reads with master-slave replication
      • Scale Writes with master-master replication
      • Partitioning Vertically (by functional groups)
      • Partitioning Horizontally (by key, e.g. ‘date’)
      • Caching works, kinda
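As a rough illustration of the "Partitioning Horizontally (by key)" bullet, here is a minimal Python sketch. The `shard_for` helper, the shard count of 4, and the sample rows are assumptions made up for this example; none of them come from the talk.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (not Python's salted hash())."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Hypothetical rows keyed by date, echoing the slide's 'date' example.
rows = [("2011-03-19", "a"), ("2011-03-20", "b"), ("2011-03-21", "c")]

NUM_SHARDS = 4  # assumed shard count
shards = {i: [] for i in range(NUM_SHARDS)}
for key, value in rows:
    shards[shard_for(key, NUM_SHARDS)].append((key, value))
```

With plain modulo placement, changing `NUM_SHARDS` moves most keys to a different shard. That is one source of the scaling pain the slide names, and part of why slide 14 lists consistent hashing.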
    • 12. What to do?
      • Distribute both data and processing
        • horizontal scaling
      • Organize data differently
      • Use appropriate on-disk storage
    • 13. Sorting Hat Says...
      • Distribution Model
      • Data Model
      • Disk Data Structure
    • 14. Distribution Model
      • Embedded (no distribution)
      • Replication / Sharding
      • Chord - peer to peer
      • Dynamo
        • consistent hashing, vnodes, vector clocks
    • 15. No Distribution
      • BDB
      • Neo4J
    • 16. Replication / Sharding Distribution
      • MongoDB
      • CouchDB
      • Redis
    • 17. Dynamo Distribution
      • BigCouch
      • Riak
      • Voldemort
      • Cassandra
        • no vnodes
        • no vector clocks
      • Hibari
        • ?
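Slides 14 and 17 mention vector clocks without showing one. The sketch below is a generic, minimal version in Python, assuming a clock is a plain dict of node name to counter. It is not the implementation used by any of the systems listed, and the node names are invented.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter bumped by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new

def descends(a, b):
    """True if clock a has seen every event that clock b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def compare(a, b):
    """Classify two versions as equal, ordered, or concurrent (a conflict)."""
    a_ge_b, b_ge_a = descends(a, b), descends(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "a is newer"
    if b_ge_a:
        return "b is newer"
    return "conflict"

# Hypothetical history: one write on node1, then two independent updates.
v1 = increment({}, "node1")
v2 = increment(v1, "node2")   # update accepted by node2
v3 = increment(v1, "node3")   # concurrent update accepted by node3
```

Here `compare(v2, v1)` reports that `v2` supersedes `v1`, while `compare(v2, v3)` reports a conflict, because neither update saw the other. In that case the store has to keep both versions or ask the application to merge them.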
    • 18. Dynamo - how does it work?
      • N=3, W=2
      • [Diagram: ring of nodes (Node 1 through Node 4), each holding several lettered partitions]
    • 19. Dynamo - how does it work?
      • PUT http://boorad.cloudant.com/dbname/blah?w=2
      • N=3, W=2
      • [Same ring diagram]
    • 20. Dynamo - how does it work?
      • PUT http://boorad.cloudant.com/dbname/blah?w=2
      • N=3, W=2
      • [Same ring diagram, next animation step]
    • 21. Dynamo - how does it work?
      • PUT http://boorad.cloudant.com/dbname/blah?w=2
      • N=3, W=2
      • hash(blah)
      • [Same ring diagram, with the hashed key placed on the ring]
    • 22. Dynamo - how does it work?
      • PUT http://boorad.cloudant.com/dbname/blah?w=2
      • N=3, W=2
      • hash(blah)
      • [Same ring diagram, next animation step]
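The ring diagrams in slides 18 to 22 did not survive extraction, so here is a minimal Python sketch of the idea they walk through: hash the document id onto a ring, take the next N nodes as replicas, and report success once W of them acknowledge. The `Ring` class, the `put` helper, the four node names and the use of MD5 are assumptions for illustration. This is not BigCouch's or Dynamo's actual code, and it leaves out vnodes entirely.

```python
import hashlib
from bisect import bisect_right

def ring_position(name: str) -> int:
    """Place a node name or a key on the ring using a stable hash."""
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)

class Ring:
    def __init__(self, nodes, n=3):
        self.n = n
        # One point per node (no vnodes), sorted by ring position.
        self.points = sorted((ring_position(node), node) for node in nodes)

    def preference_list(self, key):
        """Walk clockwise from hash(key) and return the first n nodes."""
        positions = [p for p, _ in self.points]
        start = bisect_right(positions, ring_position(key)) % len(self.points)
        count = min(self.n, len(self.points))
        return [self.points[(start + i) % len(self.points)][1] for i in range(count)]

def put(ring, key, value, stores, w=2, down=()):
    """Write to every reachable replica; succeed if at least w acknowledged."""
    acks = 0
    for node in ring.preference_list(key):
        if node in down:
            continue  # simulated unreachable node
        stores[node][key] = value
        acks += 1
    return acks >= w

# Hypothetical four-node cluster with N=3, W=2, as on the slides.
nodes = ["node1", "node2", "node3", "node4"]
ring = Ring(nodes, n=3)
stores = {node: {} for node in nodes}
replicas = ring.preference_list("blah")
ok = put(ring, "blah", "doc", stores, w=2, down={replicas[0]})
failed = put(ring, "blah", "doc", stores, w=2, down=set(replicas[:2]))
```

With one replica down the write still meets W=2, so `ok` is `True`. With two down it cannot, so `failed` is `False`, even though one replica did store the value.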
    • 23. CAP Theorem
      • Pick Two (at any given time)
        • Consistency
        • Availability
        • Partition Tolerance
      • CP refuses requests, AP eventually consistent
      • Must Read: http://codahale.com/you-cant-sacrifice-partition-tolerance/
    • 24. Data Model
      • Key/Value
      • Document
      • Column
      • Graph
    • 25. Key / Value
      • BDB
      • Riak
      • Voldemort
      • Redis
      • Hibari
    • 26. Document
      • CouchDB
      • MongoDB
      • SimpleDB
    • 27. Column Stores
      • HBase
      • Cassandra
      • Hypertable
    • 28. Graph Databases
      • Neo4J
      • AllegroGraph
      • FlockDB
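To make the four data models from slide 24 a little more concrete, the sketch below shows one made-up record shaped four ways, using plain Python structures. It is a loose analogy under assumed field names and ids, not the storage format or API of any product listed above.

```python
import json

# Key/value: the store sees an opaque blob; lookup by key is all you get.
kv_store = {
    "talk:1": json.dumps({"title": "An Introduction to NoSQL", "speaker": "boorad"}),
}

# Document: the store understands the nested structure and could index fields.
document_store = {
    "talk:1": {
        "title": "An Introduction to NoSQL",
        "speaker": {"handle": "boorad"},
        "tags": ["nosql"],
    },
}

# Column: row key -> column family -> column -> value; rows may differ in columns.
column_store = {
    "talk:1": {
        "info": {"title": "An Introduction to NoSQL"},
        "people": {"speaker": "boorad"},
    },
}

# Graph: nodes plus labeled edges, kept here as an adjacency list.
graph_nodes = {"person:boorad": {}, "talk:1": {"title": "An Introduction to NoSQL"}}
graph_edges = {"person:boorad": [("PRESENTED", "talk:1")]}
```

The data is the same in each case. What changes is how much of its structure the store can see and therefore query.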
    • 29. Disk Data Structure
      • btree - many different kinds
      • mmap - compact BSON
      • memtable/sstable or log structured merge tree
      • log-structured linear hashing
      • adjacency lists / adjacency matrices
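The memtable/sstable entry on slide 29 can be sketched in a few lines of Python. `TinyLSM` is a hypothetical toy. Sorted in-memory lists stand in for on-disk files, and the sketch has no compaction, deletes, or write-ahead log.

```python
from bisect import bisect_left

class TinyLSM:
    def __init__(self, memtable_limit=2):
        self.memtable = {}          # recent writes, mutable, unsorted
        self.sstables = []          # immutable sorted runs, newest last
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Freeze the memtable into a sorted run (a stand-in for an sstable file)."""
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        """Check the memtable first, then sorted runs from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            i = bisect_left(table, (key,))  # binary search on the key
            if i < len(table) and table[i][0] == key:
                return table[i][1]
        return None

db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)   # reaches the limit, so the memtable is flushed to a sorted run
db.put("a", 3)   # the newer value lives in the memtable and shadows the old one
```

Writes only touch memory and append whole sorted runs, which is why this layout favors write-heavy workloads. Reads may have to check several runs, and real systems reduce that cost with compaction.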
    • 30. Querying NoSQL
      • Key Lookups
        • fast, easy, limiting
      • Secondary Indexes
        • Immature part of most systems
        • Roll your own MapReduce
      • Mongo query language
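As a hedged illustration of what rolling your own secondary index with a map/reduce pass might look like, here is a small in-memory Python sketch. The documents, `map_topic`, `build_index` and `reduce_count` are all invented for this example and do not mirror any particular database's view or index API.

```python
from collections import defaultdict

# Hypothetical documents, addressable only by primary key.
docs = {
    "talk:1": {"speaker": "boorad", "topic": "nosql"},
    "talk:2": {"speaker": "boorad", "topic": "erlang"},
    "talk:3": {"speaker": "someone", "topic": "nosql"},
}

def map_topic(doc_id, doc):
    """Map step: emit (index key, value) pairs for one document."""
    yield doc["topic"], doc_id

def build_index(docs, map_fn):
    """Run the map step over every document and group emitted values by key."""
    index = defaultdict(list)
    for doc_id, doc in docs.items():
        for index_key, value in map_fn(doc_id, doc):
            index[index_key].append(value)
    return index

def reduce_count(values):
    """Reduce step: collapse the values for one key into a count."""
    return len(values)

by_topic = build_index(docs, map_topic)
counts = {topic: reduce_count(ids) for topic, ids in by_topic.items()}
```

After this runs, `by_topic["nosql"]` lists the matching document ids and `counts` holds the per-topic totals. The maintenance cost is the catch: the index has to be rebuilt or updated whenever documents change.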
    • 31. Polyglot Persistence
      • [Diagram; legible labels: Raw Data, Hadoop, batch processes, RDBMS, NoSQL (twice), Cache, Apps]
    • 32. Drivers
      • Spring
        • commons, hadoop, kv, document, graph
        • membase, hbase, cassandra coming
      • Serialization
        • Thrift, Protocol Buffers, Avro
      • Native
        • Cassandra, Hadoop, Voldemort
        • JInterface to Erlang?
    • 33. Good Luck! You’ll Need It.
    • 34. Questions?