
JavaOne 2011: Brisk and High-Order Bits from Cassandra & Hadoop


Compendium of my Brisk, Cassandra & Hadoop talks of Summer 2011, delivered at JavaOne 2011. I like the content in this one personally: it gives a use-case-driven intro to Cassandra and NoSQL, followed by an intro to Hadoop (MapReduce, HDFS internals, the NameNode and JobTracker), and shows how Brisk decomposes the single points of failure in HDFS while providing a single platform for real-time and batch storage and processing.
(It also seemed enjoyable to the audience in attendance.)



  1. Brisk: Truly peer-to-peer Hadoop. High-order bits from Cassandra & Hadoop. SriSatish Ambati, @srisatish
  2. How many in the audience…
  3. NoSQL: know your queries.
  4. Points: • Use cases • Why Cassandra? • Use case: Hadoop, Brisk • FUD: consistency (why is Facebook not using Cassandra?) • Anti-patterns • Community, code, tools • Q&A
  5. Users. Netflix. Key by customer: read-heavy. Key by customer:movie: write-heavy.
  6. Time series (several customers): periodic readings from dev0, dev1… deviceID:metric:timestamp -> value. Metrics are typically a far larger dataset than users.
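The deviceID:metric:timestamp -> value layout on the slide can be sketched with a plain dict, one wide "row" per device-and-metric, columns keyed by timestamp. This is a hypothetical, library-free illustration of the idea (the helper names `record` and `slice_readings` are mine, not from any Cassandra client):

```python
# Hypothetical sketch of the slide's time-series layout:
# row key = "deviceID:metric", columns = {timestamp: value}.
from collections import defaultdict

rows = defaultdict(dict)  # row key -> {timestamp: value}

def record(device_id, metric, timestamp, value):
    rows[f"{device_id}:{metric}"][timestamp] = value

def slice_readings(device_id, metric, start, end):
    # A column slice: timestamps are ordered within a single row,
    # so a time-range read touches one row key only.
    row = rows[f"{device_id}:{metric}"]
    return sorted((ts, v) for ts, v in row.items() if start <= ts <= end)

record("dev0", "temp", 1000, 21.5)
record("dev0", "temp", 1060, 21.7)
record("dev1", "temp", 1000, 19.2)
print(slice_readings("dev0", "temp", 900, 1100))  # [(1000, 21.5), (1060, 21.7)]
```

The point the slide makes holds here too: the metrics side (one column per reading) grows much faster than the users side (one row per device).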
  7. Why Cassandra?
  8. Operational simplicity: peer-to-peer.
  9. Operational simplicity: peer-to-peer; every node accepts reads and writes.
  10. Replication: multi-datacenter, multi-region (EC2), multi-availability-zone.
  11. Reads stay local (dc1, dc2). Replication: multi-datacenter, multi-region (EC2, AWS), multi-availability-zone.
  12. 4/21/2011, Amazon Web Services outage: “Movie marathons on Netflix awaiting AWS to come back up.” #ec2disabled
  13. 4/21/2011, Amazon Web Services outage: Netflix was running on AWS.
  14. Fast durable writes. Fast reads.
  15. Writes: sequential, append-only. ~1-5 ms.
  16. Writes: sequential, append-only. ~1-5 ms. On the cloud: ephemeral disks rock!
  17. Reads: local. Key & row caches (also JNA-based off-heap), indexes, materialized views.
  18. Reads: local. Key & row caches (also JNA-based off-heap), indexes, materialized views. SSDs: improved read performance!
  19. Amortize: replication over writes, repair over reads.
  20. Distribution between nodes: gossip, anti-entropy, failure detector. Lightweight.
  21. Clients: CQL, Thrift; pycassa, phpcassa, Hector, Pelops (Scala, Ruby, Clojure).
  22. Use case #3: Hadoop. HDFS → Cassandra → Hive. Logs → stats → analytics.
  23. Brisk: truly peer-to-peer Hadoop.
  24. mv computation, not data
  25. Word count in MapReduce:
      map(String key, String value):
        // key: document name
        // value: document contents
        for each word w in value:
          EmitIntermediate(w, "1");
      reduce(String key, Iterator values):
        // key: a word
        // values: a list of counts
        int result = 0;
        for each v in values:
          result += ParseInt(v);
        Emit(AsString(result));
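The pseudocode above can be approximated in plain Python; this is a minimal single-process sketch of the map/shuffle/reduce phases, not Hadoop API code:

```python
from collections import defaultdict

# Map phase: emit (word, 1) for every word in every document.
def map_phase(documents):
    for name, contents in documents.items():
        for word in contents.split():
            yield word, 1

# Shuffle: group intermediate values by key (done by the framework in Hadoop).
def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

# Reduce phase: sum the counts for each word.
def reduce_phase(groups):
    return {word: sum(counts) for word, counts in groups.items()}

docs = {"a.txt": "the quick brown fox", "b.txt": "the lazy dog"}
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts["the"])  # 2
```

In a real cluster the map and reduce calls run in parallel across nodes, which is exactly what the "Parallel Execution View" slide that follows illustrates.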
  26. Parallel execution view.
  27. Immutable data: write-once, read-many! Files, once created, written, and closed, do not change.
  28. JobTracker, TaskTracker; HDFS: NameNode, DataNode.
  29. Cloudera, Amazon Elastic MapReduce, Hortonworks, MapR, Brisk.
  30. Tools & analytics: Hive, Pig, R, Karmasphere, Datameer… dozens of stealth startups!
  31. “However, given that there is only a single master, its failure is unlikely.” The MapReduce paper, 2004. Sanjay et al., Google.
  32. NameNode decomposition, explained.
  33. NameNode: single master node, single-machine address space, single point of failure.
  34. Use column families (tables): inodes, blocks.
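The decomposition on the slide puts filesystem metadata into two column families instead of one NameNode's RAM. A hypothetical sketch of that split (plain dicts standing in for the inodes and blocks column families; this is not Brisk's actual schema):

```python
# Hypothetical sketch: HDFS-style metadata split across two "column families"
# so no single master machine has to hold it all in memory.
inodes = {}   # path -> file metadata, including the ordered list of block ids
blocks = {}   # block id -> block bytes

def create_file(path, data, block_size=4):
    ids = []
    for i in range(0, len(data), block_size):
        block_id = f"{path}#{i // block_size}"
        blocks[block_id] = data[i:i + block_size]
        ids.append(block_id)
    inodes[path] = {"size": len(data), "blocks": ids}

def read_file(path):
    # Look up the inode, then fetch each block in order.
    return b"".join(blocks[b] for b in inodes[path]["blocks"])

create_file("/logs/a", b"hello world")
print(read_file("/logs/a"))  # b'hello world'
```

Because both maps live in a replicated, peer-to-peer store rather than one machine's address space, losing any single node loses neither the namespace nor the block map, which is the point of the next slide.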
  35. One kind of node: no master node, no SPOF, peer-to-peer.
  36. Near-real-time Hadoop. Low latency: cassandra_dc nodes. Batch analytics: brisk_dc nodes.
  37. BriskSimpleSnitch.java:
      if (TrackerInitializer.isTrackerNode) {
          myDC = BRISK_DC;
          logger.info("Detected Hadoop trackers are enabled, setting my DC to " + myDC);
      } else {
          myDC = CASSANDRA_DC;
          logger.info("Looks like vanilla Cassandra nodes, setting my DC to " + myDC);
      }
  38. Hive: SQL-like access. CLI, HWI, JDBC, metastore. Pushdown predicates (v beta2).
  39. hive> CREATE TABLE invites (foo INT, bar STRING) PARTITIONED BY (ds STRING);
      hive> LOAD DATA LOCAL INPATH '$BRISK_HOME/resources/hive/examples/files/kv2.txt' OVERWRITE INTO TABLE invites PARTITION (ds='2008-08-15');
      hive> SELECT count(*), ds FROM invites GROUP BY ds;
      http://www.datastax.com/docs/0.8/brisk/about_hive
  40. ETL. Real-time. Cassandra CFs. Datacenters. Scale. @srisatish
  41. @srisatish
  42. No “me” in team! Ben Coverston, Michael Allen, Ben Werther, Mike Bulman, Brandon Williams, Nate McCall, Cathy Daw, Nick M Bailey, Jackson Chung, Patricio Echague, Jake Luciani, Tyler Hobbs, Joaquin Casares, SriSatish Ambati, Jonathan Ellis, Yewei Zhang
  43. 100-node Brisk cluster on OpsCenter. @srisatish
  44. FUD, acronym: fear, uncertainty, doubt.
  45. Consistency: R + W > N. Oracle, 2-node: R=1, W=2, N=2 (T=2). DNS. *N is the replication factor, not to be confused with T = total # of nodes.
  46. Tunable flexibility. For high consistency: read QUORUM, write QUORUM. For high availability: high W, low R.
  47. Consistency: R + W > N. Oracle, 2-node: R=1, W=2, N=2 (T=2). DNS.
      "brisk.consistencylevel.read", "QUORUM";
      "brisk.consistencylevel.write", "QUORUM";
      *N is the replication factor, not to be confused with T = total # of nodes.
  48. Inbox search: 600+ cores, 120+ TB (2008). Went from 100M to 500M users. Average NoSQL deployment size: ~6-12 nodes.
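The R + W > N rule on these slides can be checked mechanically: if every write set of W replicas must share at least one node with every read set of R replicas, a read always contacts a replica that saw the latest acknowledged write. A minimal brute-force sketch of that intersection property (an illustration, not Cassandra's implementation):

```python
# Brute-force check of quorum intersection: with N replicas, R + W > N
# guarantees every W-replica write set overlaps every R-replica read set.
from itertools import combinations

def quorums_always_overlap(n, r, w):
    replicas = range(n)
    return all(set(ws) & set(rs)
               for ws in combinations(replicas, w)
               for rs in combinations(replicas, r))

print(quorums_always_overlap(3, 2, 2))  # True:  2 + 2 > 3 (QUORUM/QUORUM)
print(quorums_always_overlap(3, 1, 2))  # False: 1 + 2 = 3, sets can miss
print(quorums_always_overlap(2, 1, 2))  # True:  the slide's R=1, W=2, N=2 case
```

This is why read QUORUM plus write QUORUM (each ⌊N/2⌋+1 replicas) gives strong consistency, while dropping either side below the threshold trades that overlap away for availability.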
  49. Use case #5: search. Apache Solr + Cassandra = Solandra. Other inbox/file searches: Xobni, c3. github.com/tjake/solandra
  50. “Eventual consistency is harder to program.” Mostly immutable data. Complex systems at scale.
  51. Miscellaneous. Myth: data loss, partial rows. Writes are durable.
  52. Anti-patterns: transactions, joins, read-before-write.
  53. Anti-patterns for the cloud: EBS; JVM, virtualized; single region.
  54. A few more good reasons for Cassandra...
  55. Tools: AMIs, OpsCenter, DataStax, AppDynamics. Getting started with the Brisk AMI. Netflix just builds AMIs for deployment!
  56. Beautiful c0de = new code(); // less is more. ~90k. java.concurrent. @annotate. Bloom filters, Merkle trees. Non-blocking, staged event-driven. Bigtable, Dynamo.
  57. Current & future focus: distributed counters, CQL, simple client, operational smoothing, compaction.
  58. Community: robust, rapid. Brisk # Professional support from DataStax. git clone git@github.com:riptano/brisk.git. Engineers: independent, startups, large companies; Rackspace, Twitter, Netflix… Come join the efforts!
  59. Use case #4: first NoSQL, then scale! SimpleDB → Cassandra; MongoDB → Cassandra.
  60. Copyright: xkcd
  61. Copyright: PlanToys. … more than one way to do it!
  62. Summary: high-scale peer-to-peer datastore; best friend for multi-region, multi-zone availability. Hadoop: HDFS engulfing the data world. Brisk: best of both worlds!
  63. @srisatish Q&A
  64. Dynamo (2007) + Bigtable (2006) + OSS (2008) → Incubator (2009) → TLP (2010) → Cassandra + Brisk
  65. NoSQL: know your queries.
