TriHUG - Beyond Batch
Upload Details

Uploaded via as Apple Keynote

Usage Rights

© All Rights Reserved

  • HBase - random reads/writes - 45% of all Hadoop clusters
  • Drill
    Remove schema requirement
    In-situ for real, since we’ll support multiple formats
    Note: MR needed for big joins, so to speak
  • Drill
    Will support nested data
    No schema required
  • Protocol Buffers are the conceptual data model
    Will support multiple data models
    Will have to define a way to explain the data format (filtering, fields, etc.)
    Schema-less will have a perf penalty
    HBase will be one format
  • Likely to support these
    Could add HiveQL and more as well. Could even be clever and support HiveQL to MR or Drill based upon the query
    Pig as well
    Pluggability
      Data format
      Query language
    Something 6-9 months out, alpha quality
    Community driven, I can’t speak for the project
    MapR
      FS gives better chunk size control
      NFS support may make small test drivers easier
      Unified namespace will allow multi-cluster access
      Might even have a Drill component that autoformats data
    Read-only model
  • Example query that Drill should support
    Need to talk more here about what Dremel does
  • Load data into Drill (optional)
    Could just use as is, in “row” format
    Multiple query languages
    Pluggability very important
  • Note: we have an already partially built execution engine
  • Be prepared for Apache questions
    Committer vs committee vs contributor
    If you can’t answer a question, ask them to answer it and contribute
    Lisa - need landing page
    References to paper and such at end
  • Scaling is painful
    Poor fault tolerance
    Coding is hard
  • Tweets, stock ticks, manufacturing machine data, sensor messages
  • DAG
    Runs continuously
  • Abstractions like Cascading, Hive, Pig make MR approachable
    Code size reduction
  • Kestrel - via Thrift
    Kafka - transactional topologies, idempotency, process only once
    ActiveMQ
  • Current architecture
    Data ingest tool for Hadoop (avoid Flume madness)

TriHUG - Beyond Batch Presentation Transcript

  • 1. Beyond Batch: HBase, Drill, & Storm
      Brad Anderson
      ©MapR Technologies
  • 2. whoami
      • Brad Anderson
      • Solutions Architect at MapR (Atlanta)
      • ATLHUG co-chair
      • ‘boorad’ most places (twitter, github)
  • 3.
      • The open enterprise-grade distribution for Hadoop
          • Easy, dependable and fast
          • Open source with standards-based extensions
      • MapR is deployed at 1000s of companies
          • From small Internet startups to the world’s largest enterprises
      • MapR customers analyze massive amounts of data:
          • Hundreds of billions of events daily
          • 90% of the world’s Internet population monthly
          • $1 trillion in retail purchases annually
      • MapR Cloud Partners
          • Google to provide Hadoop on Google Compute Engine
          • Amazon for Elastic Map Reduce + instances
  • 4. Beyond Batch
      • HBase & M7
      • Apache Drill
      • Storm
  • 5. Latency Matters
      Batch, Interactive, Streaming
  • 6. HBase Issues
      Reliability
          • Compactions disrupt operations
          • Very slow crash recovery
          • Unreliable splitting
      Business continuity
          • Common hardware/software issues cause downtime
          • Administration requires downtime
          • No point-in-time recovery
          • Complex backup process
      Performance
          • Many bottlenecks result in low throughput
          • Limited data locality
          • Limited # of tables
      Manageability
          • Compactions, splits and merges must be done manually (in reality)
          • Basic operations like backup or table rename are complex
  • 7. M7
      An integrated system for unstructured and structured data
          – Unified namespace for files and tables
          – Data management
          – Data protection
          – Disaster recovery
          – No additional administration
      An architecture that delivers reliability and performance
          – Fewer layers
          – No compactions
          – Seamless splits
          – Automatic merges
          – Single network hop
          – Instant recovery
          – Reduced read and write amplification
  • 8. Unified Namespace
      $ pwd
      /mapr/default/user/boorad
      $ ls
      file1 file2 table1 table2
      $ hbase shell
      hbase(main):003:0> create '/user/boorad/table3', 'cf1', 'cf2', 'cf3'
      0 row(s) in 0.1570 seconds
      $ ls
      file1 file2 table1 table2 table3
      $ hadoop fs -ls /user/boorad
      Found 5 items
      -rw-r--r-- 3 mapr mapr 16 2012-09-28 08:34 /user/boorad/file1
      -rw-r--r-- 3 mapr mapr 22 2012-09-28 08:34 /user/boorad/file2
      trwxr-xr-x 3 mapr mapr 2 2012-09-28 08:32 /user/boorad/table1
      trwxr-xr-x 3 mapr mapr 2 2012-09-28 08:33 /user/boorad/table2
      trwxr-xr-x 3 mapr mapr 2 2012-09-28 08:38 /user/boorad/table3
  • 9. Simplifying HBase Architecture
      [Diagram. Labels: HBase, JVM, DFS, ext3, MapR, Unified, Disks, Other Distributions]
  • 10. No RegionServers?
      • No daemons to manage
      • One network hop
      • One cache
  • 11. No RegionServers?
      • No daemons to manage
      • One network hop
      • One cache
  • 12. Region Assignment
  • 13. Region Assignment
  • 14. Instant Recovery
      Apache HBase experiences an outage when any node crashes
          – Each RegionServer replays WAL before any region can be recovered
          – All regions served by that RegionServer cannot be accessed
      M7 provides instant recovery
          – M7 uses small WALs
              • Multiple WALs per region vs. 1 per RegionServer (1000 regions)
          – Instant recovery on put
          – 1000-10000x faster recovery on get
      How?
          – M7 leverages unique MapR-FS capabilities, not impacted by HDFS limitations
              • Append support
              • No limit to # of files
  • 15. LSMT (FTW)
      Traditional disk-based index structures like B-Trees are expensive to maintain in real-time
      Log Structured Merge Trees reduce the cost by deferring and batching index changes
      Writes
          – Writes go to an in-memory index
              • And a commit log in case the node crashes and recovery is needed
          – The in-memory index is occasionally merged into the disk-based index
              • This may trigger a compaction
      Reads
          – Reads hit the in-memory index and the disk-based index
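The write and read paths on the LSMT slide can be sketched in a few lines of plain Java. This is a toy under stated assumptions, not how HBase or M7 is implemented: the class name `ToyLsmTree`, the `memtableLimit` knob, and the use of in-memory `TreeMap`s to stand in for on-disk sorted files are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy log-structured merge tree: an in-memory index plus immutable sorted runs. */
class ToyLsmTree {
    private final int memtableLimit;                            // flush threshold (hypothetical knob)
    private final List<String> commitLog = new ArrayList<>();   // stands in for the on-disk commit log
    private TreeMap<String, String> memtable = new TreeMap<>(); // the in-memory index
    private final List<TreeMap<String, String>> runs = new ArrayList<>(); // "disk" runs, newest last

    ToyLsmTree(int memtableLimit) { this.memtableLimit = memtableLimit; }

    void put(String key, String value) {
        commitLog.add(key + "=" + value);  // log first, so a crash could be replayed
        memtable.put(key, value);          // then update the in-memory index
        if (memtable.size() >= memtableLimit) flush();
    }

    /** Deferred, batched index change: the whole memtable becomes one sorted run. */
    private void flush() {
        runs.add(memtable);
        memtable = new TreeMap<>();
        commitLog.clear();                 // flushed entries no longer need replay
    }

    /** Reads hit the in-memory index first, then the runs from newest to oldest. */
    String get(String key) {
        String v = memtable.get(key);
        if (v != null) return v;
        for (int i = runs.size() - 1; i >= 0; i--) {
            v = runs.get(i).get(key);
            if (v != null) return v;
        }
        return null;
    }

    /** Compaction: merge all runs into one; iterating oldest first lets newer values win. */
    void compact() {
        TreeMap<String, String> merged = new TreeMap<>();
        for (TreeMap<String, String> run : runs) merged.putAll(run);
        runs.clear();
        if (!merged.isEmpty()) runs.add(merged);
    }

    int runCount() { return runs.size(); }

    public static void main(String[] args) {
        ToyLsmTree t = new ToyLsmTree(2);
        t.put("a", "1"); t.put("b", "2");   // second put triggers a flush
        t.put("a", "3"); t.put("c", "4");   // and again
        System.out.println(t.get("a") + " runs=" + t.runCount());  // prints "3 runs=2"
        t.compact();
        System.out.println(t.get("a") + " runs=" + t.runCount());  // prints "3 runs=1"
    }
}
```

Before compaction a read may have to look in every run, which is where read amplification comes from; compaction removes that cost by rewriting data, which is where write amplification comes from.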
  • 16. Storage Subsystem Performance
      What does it cost to merge the in-memory index into the disk-based index?

                                HBase-style                        LevelDB-style     M7
      Examples                  BigTable, HBase, Cassandra, Riak   Cassandra, Riak   M7
      WAF                       Low                                High              Low
      RAF                       High                               Low               Low
      I/O storms                Yes                                No                No
      Disk space overhead       High (2x)                          Low               Low
      Skewed data handling      Bad                                Good              Good
      Rewrite large values      Yes                                Yes               No

      Terminology:
      Write-amplification factor (WAF): The ratio between writes to disk and application writes. Note that data must be rewritten in every indexed structure.
      Read-amplification factor (RAF): The ratio between reads from disk and application reads.
      Skewed data handling: When inserting values with similar keys (eg, increasing
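The two ratios defined in the terminology above reduce to simple arithmetic. The sketch below only illustrates the definitions; the class name `Amplification` and all byte counts are made-up numbers, not measurements of HBase, LevelDB, or M7.

```java
/** Worked arithmetic for the WAF and RAF definitions; all inputs are hypothetical. */
class Amplification {
    /** WAF: bytes written to disk per byte the application wrote. */
    static double writeAmplification(long diskBytesWritten, long appBytesWritten) {
        return (double) diskBytesWritten / appBytesWritten;
    }

    /** RAF: bytes read from disk per byte the application asked for. */
    static double readAmplification(long diskBytesRead, long appBytesRead) {
        return (double) diskBytesRead / appBytesRead;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long appWrites = 100 * mb;
        // Assumed scenario: each byte is written once to the commit log, once when the
        // in-memory index is flushed, and twice more by later compactions.
        long diskWrites = 4 * appWrites;
        System.out.println("WAF = " + writeAmplification(diskWrites, appWrites)); // prints "WAF = 4.0"

        // Assumed scenario: a 1 KB point read that has to fetch one 4 KB block
        // from each of three on-disk files.
        System.out.println("RAF = " + readAmplification(3 * 4096L, 1024L));       // prints "RAF = 12.0"
    }
}
```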
  • 17. Other M7 Features
      Smaller disk footprint
          – HBase stores key & column name for every version of every cell
          – M7 never repeats the key or column name
      Columnar layout
          – HBase supports 2-3 column families in practice
          – M7 supports 64 column families
      Online schema changes
          – No need to disable table to add/remove/modify column families
  • 18.
  • 19. Big Data Picture

                            Batch processing    Interactive analysis       Stream processing
      Query runtime         Minutes to hours    Milliseconds to minutes    Never-ending
      Data volume           TBs to PBs          GBs to PBs                 Continuous stream
      Programming model     MapReduce           Queries                    DAG
      Users                 Developers          Analysts and Developers    Developers
      Google project        MapReduce           Dremel
      Open source project   Hadoop MapReduce                               Storm, S4

  • 20. Big Data Picture

                            Batch processing    Interactive analysis       Stream processing
      Query runtime         Minutes to hours    Milliseconds to minutes    Never-ending
      Data volume           TBs to PBs          GBs to PBs                 Continuous stream
      Programming model     MapReduce           Queries                    DAG
      Users                 Developers          Analysts and Developers    Developers
      Google project        MapReduce           Dremel
      Open source project   Hadoop MapReduce    Apache Drill               Storm, S4
  • 21. Google Dremel
      • Interactive analysis of large-scale datasets
          • Trillion records at interactive speeds
          • Complementary to MapReduce
          • Used by thousands of Google employees
          • Paper published at VLDB 2010
      • Model
          • Nested data model with schema
              • Most data at Google is stored/transferred in Protocol Buffers
              • Normalization (to relational) is prohibitive
          • SQL-like query language with nested data support
      • Implementation
          • Column-based storage and processing
          • In-situ data access (GFS and Bigtable)
          • Tree architecture as in Web search (and databases)
  • 22. Google BigQuery
      • Hosted Dremel (Dremel as a Service)
      • CLI (bq) and Web UI
      • Import data from Google Cloud Storage or local files
          • Files must be in CSV format
          • Nested data not supported [yet] except built-in datasets
          • Schema definition required
  • 23. DrQL Example (example from the Dremel paper)
      Query:
          SELECT DocId AS Id,
            COUNT(Name.Language.Code) WITHIN Name AS Cnt,
            Name.Url + ',' + Name.Language.Code AS Str
          FROM t
          WHERE REGEXP(Name.Url, '^http') AND DocId < 20;
      Input record:
          DocId: 10
          Links
            Forward: 20
            Forward: 40
            Forward: 60
          Name
            Language
              Code: en-us
              Country: us
            Language
              Code: en
            Url: http://A
          Name
            Url: http://B
          Name
            Language
              Code: en-gb
              Country: gb
      Output record:
          Id: 10
          Name
            Cnt: 2
            Language
              Str: http://A,en-us
            Language
              Str: http://A,en
          Name
            Cnt: 0
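To make the `WITHIN Name` aggregation concrete, here is a rough plain-Java rendering of just the `Cnt` column for the record on this slide. It is a sketch of the semantics only, not Drill or Dremel code: the `WithinCount` and `Name` classes are hypothetical, and `startsWith("http")` stands in for `REGEXP(Name.Url, '^http')`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch of COUNT(Name.Language.Code) WITHIN Name for one nested record. */
class WithinCount {
    /** One repeated Name group: an optional Url and zero or more Language codes. */
    static final class Name {
        final String url;            // null when the Name group has no Url
        final List<String> codes;    // the Language.Code values under this Name
        Name(String url, String... codes) {
            this.url = url;
            this.codes = Arrays.asList(codes);
        }
    }

    /** Emits one count per Name whose Url starts with "http"; other Names are dropped. */
    static List<Integer> countCodesWithinName(List<Name> names) {
        List<Integer> counts = new ArrayList<>();
        for (Name n : names) {
            if (n.url != null && n.url.startsWith("http")) {
                counts.add(n.codes.size());   // the aggregate is scoped to this Name only
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Name> names = Arrays.asList(
                new Name("http://A", "en-us", "en"),
                new Name("http://B"),
                new Name(null, "en-gb"));      // no Url, so the WHERE clause filters it out
        System.out.println(countCodesWithinName(names));   // prints "[2, 0]"
    }
}
```

The two counts match the `Cnt: 2` and `Cnt: 0` values in the output record; the third `Name` contributes nothing because it has no `Url` to match.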
  • 24. Data Flow
  • 25. Extensibility
      • Nested query languages
          • Pluggable model
          • DrQL
          • Mongo Query Language
          • Cascading
      • Distributed execution engine
          • Extensible model (eg, Dryad)
          • Low-latency
          • Fault tolerant
  • 26. Extensibility
      • Nested data formats
          • Pluggable model
          • Column-based (ColumnIO/Dremel, Trevni, RCFile)
          • Row-based (RecordIO, Avro, JSON, CSV)
          • Schema (Protocol Buffers, Avro, CSV)
          • Schema-less (JSON, BSON)
      • Scalable data sources
          • Pluggable model
          • Hadoop
          • HBase
  • 27. Architecture
      • Only the execution engine knows the physical attributes of the cluster
          • # nodes, hardware, file locations, …
      • Public interfaces enable extensibility
          • Developers can build parsers for new query languages
          • Developers can provide an execution plan directly
      • Each level of the plan has a human readable representation
          • Facilitates debugging and unit testing
  • 28. Architecture
  • 29. Query Components
      • Query components:
          • SELECT
          • FROM
          • WHERE
          • GROUP BY
          • HAVING
          • (JOIN)
      • Key logical operators:
          • Scan
          • Filter
          • Aggregate
          • (Join)
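As an illustration of how the query components map onto the logical operators, the following plain-Java sketch chains Scan, Filter, and Aggregate for a query shaped like `SELECT country, COUNT(*) FROM t WHERE docId < 20 GROUP BY country`. Everything here is an assumption made for the example: the `LogicalPlanSketch` class, the `Row` fields, and the sample data are hypothetical, and Java streams stand in for a real execution engine.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Scan -> Filter -> Aggregate, expressed as three composable functions. */
class LogicalPlanSketch {
    static final class Row {
        final int docId;
        final String country;
        Row(int docId, String country) { this.docId = docId; this.country = country; }
    }

    /** Scan (FROM): produce rows from a source, here an in-memory list. */
    static Stream<Row> scan(List<Row> table) { return table.stream(); }

    /** Filter (WHERE): keep only rows matching the predicate. */
    static Stream<Row> filter(Stream<Row> in, Predicate<Row> predicate) { return in.filter(predicate); }

    /** Aggregate (GROUP BY + COUNT): count rows per country, sorted by key. */
    static Map<String, Long> aggregate(Stream<Row> in) {
        return in.collect(Collectors.groupingBy(r -> r.country, TreeMap::new, Collectors.counting()));
    }

    /** Runs the whole plan over a small hypothetical table. */
    static Map<String, Long> demo() {
        List<Row> table = Arrays.asList(
                new Row(10, "us"), new Row(12, "gb"), new Row(15, "us"), new Row(30, "us"));
        return aggregate(filter(scan(table), r -> r.docId < 20));
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints "{gb=1, us=2}"
    }
}
```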
  • 30. Execution Engine Layers
      • Drill execution engine has two layers
          • Operator layer is serialization-aware
              • Processes individual records
          • Execution layer is not serialization-aware
              • Processes batches of records (blobs)
              • Responsible for communication, dependencies and fault tolerance
  • 31. Design Principles
      Flexible
          • Pluggable query languages
          • Extensible execution engine
          • Pluggable data formats
              • Column-based and row-based
              • Schema and schema-less
      Easy
          • Unzip and run
          • Zero configuration
          • Reverse DNS not needed
          • IP addresses can change
          • Clear and concise log messages
      Fast
          • C/C++ core with Java support
              • Google C++ style guide
          • Min latency and max throughput (limited only by hardware)
      Dependable
          • No SPOF
          • Instant recovery from crashes
  • 32. Hadoop Integration
      • Hadoop data sources
          • Hadoop FileSystem API (HDFS/MapR-FS)
          • HBase
      • Hadoop data formats
          • Apache Avro
          • RCFile
      • MapReduce-based tools to create column-based formats
  • 33. Fully Open
  • 34. Storm
  • 35. Before Storm
      [Diagram. Labels: Queues, Workers]
  • 36. Example (simplified)
  • 37. Storm
      • Guaranteed data processing
      • Horizontal scalability
      • Fault-tolerance
      • No intermediate message brokers!
      • Higher level abstraction than message passing
      • “Just works”
  • 38. Concepts
  • 39. Streams
      Unbounded sequence of tuples
  • 40. Spouts
      Source of streams
  • 41. Spouts
      public interface ISpout extends Serializable {
          void open(Map conf, TopologyContext context, SpoutOutputCollector collector);
          void close();
          void nextTuple();
          void ack(Object msgId);
          void fail(Object msgId);
      }
  • 42. Bolts
      Processes input streams and produces new streams
  • 43. Bolts
      public class DoubleAndTripleBolt extends BaseRichBolt {
          // OutputCollector is the collector type that IRichBolt.prepare() receives.
          private OutputCollector _collector;

          public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
              _collector = collector;
          }

          // Emit (2x, 3x) anchored to the input tuple, then ack the input.
          public void execute(Tuple input) {
              int val = input.getInteger(0);
              _collector.emit(input, new Values(val * 2, val * 3));
              _collector.ack(input);
          }

          public void declareOutputFields(OutputFieldsDeclarer declarer) {
              declarer.declare(new Fields("double", "triple"));
          }
      }
  • 44. Topologies
      Network of spouts and bolts
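The spout, bolt, and topology concepts can be mimicked without Storm to show how tuples flow. The sketch below is a single-threaded toy, not the Storm API: `ToyTopology`, its `Spout` and `Bolt` interfaces, and the two sample bolts are hypothetical. It wires a linear chain (the simplest DAG) and stops when the source runs dry, whereas a real topology runs continuously and in parallel.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Queue;

/** Single-threaded toy: one spout feeding a linear chain of bolts. */
class ToyTopology {
    /** Source of a stream; returns null when this toy stream is exhausted. */
    interface Spout { List<Object> nextTuple(); }

    /** Consumes one input tuple and emits zero or more output tuples. */
    interface Bolt { List<List<Object>> execute(List<Object> input); }

    static List<Object> tuple(Object... values) { return Arrays.asList(values); }

    /** Mirrors the deck's DoubleAndTripleBolt: (v) -> (2v, 3v). */
    static final Bolt DOUBLE_AND_TRIPLE = in -> {
        int v = (Integer) in.get(0);
        return Collections.singletonList(tuple(v * 2, v * 3));
    };

    /** A filter bolt: passes a tuple through only if its first field exceeds 2. */
    static final Bolt KEEP_ABOVE_TWO = in -> {
        if ((Integer) in.get(0) > 2) return Collections.singletonList(in);
        return Collections.<List<Object>>emptyList();
    };

    /** Pulls tuples from the spout and pushes each one through every bolt in order. */
    static List<List<Object>> run(Spout spout, List<Bolt> bolts) {
        List<List<Object>> out = new ArrayList<>();
        for (List<Object> t = spout.nextTuple(); t != null; t = spout.nextTuple()) {
            List<List<Object>> current = new ArrayList<>();
            current.add(t);
            for (Bolt bolt : bolts) {
                List<List<Object>> next = new ArrayList<>();
                for (List<Object> in : current) next.addAll(bolt.execute(in));
                current = next;
            }
            out.addAll(current);
        }
        return out;
    }

    /** Wires a three-number source through both bolts. */
    static List<List<Object>> demo() {
        Queue<Integer> source = new ArrayDeque<>(Arrays.asList(1, 2, 3));
        Spout numbers = () -> source.isEmpty() ? null : tuple(source.poll());
        return run(numbers, Arrays.asList(DOUBLE_AND_TRIPLE, KEEP_ABOVE_TWO));
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints "[[4, 6], [6, 9]]"
    }
}
```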
  • 45. Trident
      Cascading for Storm
  • 46. Trident
      TridentTopology topology = new TridentTopology();
      TridentState wordCounts =
          topology.newStream("spout1", spout)
              .each(new Fields("sentence"), new Split(), new Fields("word"))
              .groupBy(new Fields("word"))
              .persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"))
              .parallelismHint(6);
  • 47. Interoperability
  • 48. Spouts
      • Kafka (with transactions)
      • Kestrel
      • JMS
      • AMQP
      • Beanstalkd
  • 49. Bolts
      • Functions
      • Filters
      • Aggregation
      • Joins
      • Talk to databases, Hadoop write-behind
  • 50. [Diagram. Labels: Raw Data, Queue, Storm (realtime processes), Hadoop (batch processes), Apps, Business Value]
  • 51. [Diagram. Labels: Raw Data, Queue, Storm (realtime processes), Hadoop (batch processes), Parallel Cluster Ingest, Apps, Business Value]
  • 52. [Diagram. Labels: Raw Data, Queue, Storm (realtime processes), Hadoop (batch processes), Apps, Business Value]
  • 53. [Diagram. Labels: Raw Data, Storm (realtime processes), Hadoop (batch processes), Apps, Business Value]
  • 54. Get Involved!
      • Get more details on M7
      • Join the Apache Drill mailing list
      • Watch TailSpout development
          • {tdunning | boorad}/mapr-spout
      • Join MapR
      • @boorad