Beyond Batch  HBase, Drill, & Storm  Brad Anderson©MapR Technologies
TriHUG - Beyond Batch

  • HBase: random reads/writes; 45% of all Hadoop clusters
  • Drill
    Remove schema requirement.
    In-situ for real since we’ll support multiple formats.
    Note: MR needed for big joins, so to speak.
  • Drill
    Will support nested.
    No schema required.
  • Protocol buffers are conceptual data model.
    Will support multiple data models.
    Will have to define a way to explain data format (filtering, fields, etc).
    Schema-less will have perf penalty.
    HBase will be one format.
  • Likely to support these.
    Could add HiveQL and more as well. Could even be clever and support HiveQL to MR or Drill based upon query.
    Pig as well.
    Pluggability: data format, query language.
    Something 6-9 months alpha quality.
    Community driven, I can’t speak for project.
    MapR: FS gives better chunk size control; NFS support may make small test drivers easier; unified namespace will allow multi-cluster access; might even have Drill component that autoformats data.
    Read only model.
  • Example query that Drill should support.
    Need to talk more here about what Dremel does.
  • Load data into Drill (optional).
    Could just use as is in “row” format.
    Multiple query languages.
    Pluggability very important.
  • Note: we have an already partially built execution engine.
  • Be prepared for Apache questions: committer vs committee vs contributor.
    If can’t answer question, ask them to answer and contribute.
    Lisa - Need landing page.
    References to paper and such at end.
  • Scaling is painful; poor fault tolerance; coding is hard.
  • Tweets, stock ticks, manufacturing machine data, sensor messages.
  • DAG; runs continuously.
  • Abstractions like Cascading, Hive, Pig make MR approachable; code size reduction.
  • Kestrel: via Thrift.
    Kafka: transactional topologies, idempotency, process only once.
    ActiveMQ.
  • Current architecture; data ingest tool for Hadoop (avoid Flume madness).

    1. Beyond Batch: HBase, Drill, & Storm
        Brad Anderson
    2. whoami
        • Brad Anderson
        • Solutions Architect at MapR (Atlanta)
        • ATLHUG co-chair
        • ‘boorad’ most places (twitter, github)
        • banderson@maprtech.com
    3.  • The open enterprise-grade distribution for Hadoop
          • Easy, dependable and fast
          • Open source with standards-based extensions
        • MapR is deployed at 1000’s of companies
          • From small Internet startups to the world’s largest enterprises
        • MapR customers analyze massive amounts of data:
          • Hundreds of billions of events daily
          • 90% of the world’s Internet population monthly
          • $1 trillion in retail purchases annually
        • MapR Cloud Partners
          • Google to provide Hadoop on Google Compute Engine
          • Amazon for Elastic Map Reduce + instances
    4. Beyond Batch
        • HBase & M7
        • Apache Drill
        • Storm
    5. Latency Matters
        Batch | Interactive | Streaming
    6. HBase Issues
        Reliability
          • Compactions disrupt operations
          • Very slow crash recovery
          • Unreliable splitting
        Business continuity
          • Common hardware/software issues cause downtime
          • Administration requires downtime
          • No point-in-time recovery
          • Complex backup process
        Performance
          • Many bottlenecks result in low throughput
          • Limited data locality
          • Limited # of tables
        Manageability
          • Compactions, splits and merges must be done manually (in reality)
          • Basic operations like backup or table rename are complex
    7. M7
        An integrated system for unstructured and structured data
          – Unified namespace for files and tables
          – Data management
          – Data protection
          – Disaster recovery
          – No additional administration
        An architecture that delivers reliability and performance
          – Fewer layers
          – No compactions
          – Seamless splits
          – Automatic merges
          – Single network hop
          – Instant recovery
          – Reduced read and write amplification
    8. Unified Namespace
        $ pwd
        /mapr/default/user/boorad
        $ ls
        file1 file2 table1 table2
        $ hbase shell
        hbase(main):003:0> create '/user/boorad/table3', 'cf1', 'cf2', 'cf3'
        0 row(s) in 0.1570 seconds
        $ ls
        file1 file2 table1 table2 table3
        $ hadoop fs -ls /user/boorad
        Found 5 items
        -rw-r--r-- 3 mapr mapr 16 2012-09-28 08:34 /user/boorad/file1
        -rw-r--r-- 3 mapr mapr 22 2012-09-28 08:34 /user/boorad/file2
        trwxr-xr-x 3 mapr mapr 2 2012-09-28 08:32 /user/boorad/table1
        trwxr-xr-x 3 mapr mapr 2 2012-09-28 08:33 /user/boorad/table2
        trwxr-xr-x 3 mapr mapr 2 2012-09-28 08:38 /user/boorad/table3
    9. Simplifying HBase Architecture
        [Diagram: layer stacks for other distributions versus MapR. Labels: HBase, JVM, DFS, ext3, Unified, Disks]
    10. No RegionServers?
        • One network hop
        • No daemons to manage
        • One cache
    11. No RegionServers?
        • One network hop
        • No daemons to manage
        • One cache
    12. Region Assignment
    13. Region Assignment
    14. Instant Recovery
        Apache HBase experiences an outage when any node crashes
          – Each RegionServer replays WAL before any region can be recovered
          – All regions served by that RegionServer cannot be accessed
        M7 provides instant recovery
          – M7 uses small WALs
            • Multiple WALs per region vs. 1 per RegionServer (1000 regions)
          – Instant recovery on put
          – 1000-10000x faster recovery on get
        How?
          – M7 leverages unique MapR-FS capabilities, not impacted by HDFS limitations
            • Append support
            • No limit to # of files
    15. LSMT (FTW)
        Traditional disk-based index structures like B-Trees are expensive to maintain in real-time
        Log Structured Merge Trees reduce the cost by deferring and batching index changes
        Writes
          – Writes go to an in-memory index
            • And a commit log in case the node crashes and recovery is needed
          – The in-memory index is occasionally merged into the disk-based index
            • This may trigger a compaction
        Reads
          – Reads hit the in-memory index and the disk-based index
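The write and read paths on the LSMT slide can be sketched in a few lines of plain Java. This is a toy model under stated assumptions, not HBase or M7 code: the class name `LsmSketch`, the `flushThreshold` knob, and the use of in-memory `TreeMap` segments as stand-ins for on-disk files are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy log-structured merge tree; illustrative only, not HBase or M7 code. */
final class LsmSketch {
    private final int flushThreshold;                          // hypothetical tuning knob
    private final List<String> commitLog = new ArrayList<>();  // stands in for the on-disk commit log
    private TreeMap<String, String> memIndex = new TreeMap<>(); // in-memory index
    private final List<TreeMap<String, String>> diskSegments = new ArrayList<>(); // oldest first

    LsmSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    /** Write path: append to the commit log, then update the in-memory index. */
    void put(String key, String value) {
        commitLog.add(key + "=" + value);
        memIndex.put(key, value);
        if (memIndex.size() >= flushThreshold) {
            flush();
        }
    }

    /** Deferred, batched index change: the whole in-memory index becomes one segment. */
    private void flush() {
        diskSegments.add(memIndex);
        memIndex = new TreeMap<>();
        commitLog.clear(); // flushed entries no longer need to be replayed after a crash
    }

    /** Read path: check the in-memory index first, then segments from newest to oldest. */
    String get(String key) {
        String value = memIndex.get(key);
        if (value != null) {
            return value;
        }
        for (int i = diskSegments.size() - 1; i >= 0; i--) {
            value = diskSegments.get(i).get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    /** Compaction: merge all segments into one, keeping the newest value per key. */
    void compact() {
        TreeMap<String, String> merged = new TreeMap<>();
        for (TreeMap<String, String> segment : diskSegments) {
            merged.putAll(segment); // later (newer) segments overwrite earlier ones
        }
        diskSegments.clear();
        if (!merged.isEmpty()) {
            diskSegments.add(merged);
        }
    }

    int segmentCount() {
        return diskSegments.size();
    }
}
```

Every read may touch several segments until `compact()` runs, which is the read-versus-write trade-off the next slide compares across storage designs.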
    16. Storage Subsystem Performance
        What does it cost to merge the in-memory index into the disk-based index?

                                HBase-style                        LevelDB-style     M7
        Examples                BigTable, HBase, Cassandra, Riak   Cassandra, Riak   M7
        WAF                     Low                                High              Low
        RAF                     High                               Low               Low
        I/O storms              Yes                                No                No
        Disk space overhead     High (2x)                          Low               Low
        Skewed data handling    Bad                                Good              Good
        Rewrite large values    Yes                                Yes               No

        Terminology:
          Write-amplification factor (WAF): The ratio between writes to disk and application writes. Note that data must be rewritten in every indexed structure.
          Read-amplification factor (RAF): The ratio between reads from disk and application reads.
          Skewed data handling: When inserting values with similar keys (eg, increasing
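The amplification factors defined on this slide are plain ratios, so a small worked example may help. The sketch below is only an illustration: the class name `AmplificationSketch` and all byte counts in `main` are made-up numbers, not measurements of HBase, LevelDB, or M7.

```java
/** Worked example of the WAF/RAF ratio; all numbers are hypothetical. */
final class AmplificationSketch {

    /** Ratio between bytes moved to or from disk and bytes the application asked for. */
    static double factor(long diskBytes, long appBytes) {
        if (appBytes <= 0) {
            throw new IllegalArgumentException("appBytes must be positive");
        }
        return (double) diskBytes / appBytes;
    }

    public static void main(String[] args) {
        long appWritesMb = 100;          // what the application wrote (assumed)
        long commitLogMb = 100;          // every write is logged once
        long flushMb = 100;              // in-memory index written out once
        long compactionMb = 2 * 100;     // assume the data is rewritten by two compactions
        long diskWritesMb = commitLogMb + flushMb + compactionMb;
        System.out.println("WAF = " + factor(diskWritesMb, appWritesMb));
    }
}
```

Under these assumed numbers each application byte costs four bytes of disk writes; a design that compacts more often raises that figure, while one that compacts less tends to push the cost onto reads.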
    17. Other M7 Features
        Smaller disk footprint
          – HBase stores key & column name for every version of every cell
          – M7 never repeats the key or column name
        Columnar layout
          – HBase supports 2-3 column families in practice
          – M7 supports 64 column families
        Online schema changes
          – No need to disable table to add/remove/modify column families
    18. (no text)
    19. Big Data Picture

                               Batch processing    Interactive analysis       Stream processing
        Query runtime          Minutes to hours    Milliseconds to minutes    Never-ending
        Data volume            TBs to PBs          GBs to PBs                 Continuous stream
        Programming model      MapReduce           Queries                    DAG
        Users                  Developers          Analysts and Developers    Developers
        Google project         MapReduce           Dremel
        Open source project    Hadoop MapReduce                               Storm, S4

    20. Big Data Picture

                               Batch processing    Interactive analysis       Stream processing
        Query runtime          Minutes to hours    Milliseconds to minutes    Never-ending
        Data volume            TBs to PBs          GBs to PBs                 Continuous stream
        Programming model      MapReduce           Queries                    DAG
        Users                  Developers          Analysts and Developers    Developers
        Google project         MapReduce           Dremel
        Open source project    Hadoop MapReduce    Apache Drill               Storm, S4
    21. Google Dremel
        • Interactive analysis of large-scale datasets
          • Trillion records at interactive speeds
          • Complementary to MapReduce
          • Used by thousands of Google employees
          • Paper published at VLDB 2010
        • Model
          • Nested data model with schema
            • Most data at Google is stored/transferred in Protocol Buffers
            • Normalization (to relational) is prohibitive
          • SQL-like query language with nested data support
        • Implementation
          • Column-based storage and processing
          • In-situ data access (GFS and Bigtable)
          • Tree architecture as in Web search (and databases)
    22. Google BigQuery
        • Hosted Dremel (Dremel as a Service)
        • CLI (bq) and Web UI
        • Import data from Google Cloud Storage or local files
          • Files must be in CSV format
          • Nested data not supported [yet] except built-in datasets
          • Schema definition required
    23. DrQL Example (example from the Dremel paper)
        Query:
          SELECT DocId AS Id,
            COUNT(Name.Language.Code) WITHIN Name AS Cnt,
            Name.Url + ',' + Name.Language.Code AS Str
          FROM t
          WHERE REGEXP(Name.Url, '^http') AND DocId < 20;
        Input record:
          DocId: 10
          Links
            Forward: 20
            Forward: 40
            Forward: 60
          Name
            Language
              Code: en-us
              Country: us
            Language
              Code: en
            Url: http://A
          Name
            Url: http://B
          Name
            Language
              Code: en-gb
              Country: gb
        Output:
          Id: 10
          Name
            Cnt: 2
            Language
              Str: http://A,en-us
            Language
              Str: http://A,en
          Name
            Cnt: 0
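To make the `WITHIN Name` aggregation on this slide concrete, here is a rough plain-Java rendering of what the query computes for the sample record. It is a sketch under assumptions: the classes `Name` and `NameResult` and the method `query` are invented for illustration, and the code ignores Dremel's column striping and repetition/definition levels entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

/** Plain-Java reading of the DrQL example; not Dremel or Drill code. */
final class DrqlWithinSketch {

    /** One repeated Name group: an optional Url and the Language.Code values under it. */
    static final class Name {
        final String url; // null when the Name has no Url
        final List<String> languageCodes;

        Name(String url, List<String> languageCodes) {
            this.url = url;
            this.languageCodes = languageCodes;
        }
    }

    /** Output for one surviving Name: Cnt plus one Str per Language. */
    static final class NameResult {
        final int cnt;
        final List<String> strs;

        NameResult(int cnt, List<String> strs) {
            this.cnt = cnt;
            this.strs = strs;
        }
    }

    /** The Name groups of the sample record (DocId 10) from the slide. */
    static List<Name> sampleNames() {
        return Arrays.asList(
                new Name("http://A", Arrays.asList("en-us", "en")),
                new Name("http://B", Collections.<String>emptyList()),
                new Name(null, Arrays.asList("en-gb")));
    }

    static List<NameResult> query(int docId, List<Name> names) {
        List<NameResult> out = new ArrayList<>();
        if (docId >= 20) {
            return out; // WHERE DocId < 20
        }
        Pattern startsWithHttp = Pattern.compile("^http");
        for (Name name : names) {
            // WHERE REGEXP(Name.Url, '^http'): Names without a matching Url are pruned
            if (name.url == null || !startsWithHttp.matcher(name.url).find()) {
                continue;
            }
            List<String> strs = new ArrayList<>();
            for (String code : name.languageCodes) {
                strs.add(name.url + "," + code); // Name.Url + ',' + Name.Language.Code
            }
            // COUNT(Name.Language.Code) WITHIN Name: one count per Name group
            out.add(new NameResult(name.languageCodes.size(), strs));
        }
        return out;
    }

    public static void main(String[] args) {
        for (NameResult r : query(10, sampleNames())) {
            System.out.println("Cnt: " + r.cnt + " Str: " + r.strs);
        }
    }
}
```

The third Name has no Url, so it drops out of the result, which is why the slide's output shows only two Name groups with counts 2 and 0.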
    24. Data Flow
    25. Extensibility
        • Nested query languages
          • Pluggable model
          • DrQL
          • Mongo Query Language
          • Cascading
        • Distributed execution engine
          • Extensible model (eg, Dryad)
          • Low-latency
          • Fault tolerant
    26. Extensibility
        • Nested data formats
          • Pluggable model
            • Column-based (ColumnIO/Dremel, Trevni, RCFile)
            • Row-based (RecordIO, Avro, JSON, CSV)
            • Schema (Protocol Buffers, Avro, CSV)
            • Schema-less (JSON, BSON)
        • Scalable data sources
          • Pluggable model
            • Hadoop
            • HBase
    27. Architecture
        • Only the execution engine knows the physical attributes of the cluster
          • # nodes, hardware, file locations, …
        • Public interfaces enable extensibility
          • Developers can build parsers for new query languages
          • Developers can provide an execution plan directly
        • Each level of the plan has a human readable representation
          • Facilitates debugging and unit testing
    28. Architecture
    29. Query Components
        • Query components:
          • SELECT
          • FROM
          • WHERE
          • GROUP BY
          • HAVING
          • (JOIN)
        • Key logical operators:
          • Scan
          • Filter
          • Aggregate
          • (Join)
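One way to picture how the query clauses map onto the logical operators listed on this slide is a tiny pipeline over an in-memory table. The sketch below is an assumption-laden illustration using Java streams, not Drill's operator API: the `Row` class, the sample data, and the query `SELECT region, SUM(amount) FROM t WHERE amount >= 5 GROUP BY region` are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Scan -> Filter -> Aggregate over a toy table; not Drill code. */
final class LogicalPlanSketch {

    static final class Row {
        final String region;
        final int amount;

        Row(String region, int amount) {
            this.region = region;
            this.amount = amount;
        }
    }

    /** Scan: produce every row of the (hypothetical) table t named in FROM. */
    static List<Row> scan() {
        return Arrays.asList(
                new Row("east", 10),
                new Row("west", 5),
                new Row("east", 7),
                new Row("west", 1));
    }

    /** Filter implements WHERE; Aggregate implements GROUP BY with SUM. */
    static Map<String, Integer> run(List<Row> table, int minAmount) {
        return table.stream()
                .filter((Row r) -> r.amount >= minAmount)       // Filter
                .collect(Collectors.groupingBy(                 // Aggregate
                        (Row r) -> r.region,
                        TreeMap::new,
                        Collectors.summingInt((Row r) -> r.amount)));
    }

    public static void main(String[] args) {
        System.out.println(run(scan(), 5));
    }
}
```

A HAVING clause would be a second filter applied to the aggregated map, and a join would add another input to the pipeline; both are left out to keep the sketch short.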
    30. Execution Engine Layers
        • Drill execution engine has two layers
          • Operator layer is serialization-aware
            • Processes individual records
          • Execution layer is not serialization-aware
            • Processes batches of records (blobs)
            • Responsible for communication, dependencies and fault tolerance
    31. Design Principles
        Flexible
          • Pluggable query languages
          • Extensible execution engine
          • Pluggable data formats
            • Column-based and row-based
            • Schema and schema-less
        Easy
          • Unzip and run
          • Zero configuration
          • Reverse DNS not needed
          • IP addresses can change
          • Clear and concise log messages
        Fast
          • C/C++ core with Java support
            • Google C++ style guide
          • Min latency and max throughput (limited only by hardware)
        Dependable
          • No SPOF
          • Instant recovery from crashes
    32. Hadoop Integration
        • Hadoop data sources
          • Hadoop FileSystem API (HDFS/MapR-FS)
          • HBase
        • Hadoop data formats
          • Apache Avro
          • RCFile
        • MapReduce-based tools to create column-based formats
    33. Fully Open
    34. Storm
    35. Before Storm
        Queues, Workers
    36. Example (simplified)
    37. Storm
        • Guaranteed data processing
        • Horizontal scalability
        • Fault-tolerance
        • No intermediate message brokers!
        • Higher level abstraction than message passing
        • “Just works”
    38. Concepts
    39. Streams
        Unbounded sequence of tuples
    40. Spouts
        Source of streams
    41. Spouts
        public interface ISpout extends Serializable {
          void open(Map conf,
                    TopologyContext context,
                    SpoutOutputCollector collector);
          void close();
          void nextTuple();
          void ack(Object msgId);
          void fail(Object msgId);
        }
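The `ack` and `fail` callbacks on the spout interface are what a reliable spout uses to decide whether a message can be forgotten or must be replayed. The following plain-Java sketch models only that bookkeeping; it does not implement `ISpout` or use any Storm class, and the names `ReplaySketch`, `offer`, `pendingCount`, and `queuedCount` are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Bookkeeping a reliable spout might keep; plain Java, not the Storm API. */
final class ReplaySketch {
    private final Deque<String> queue = new ArrayDeque<>();   // messages waiting to be emitted
    private final Map<Long, String> pending = new HashMap<>(); // emitted but not yet acked
    private long nextId = 0;

    /** Stands in for messages arriving from an external source. */
    void offer(String message) {
        queue.addLast(message);
    }

    /** Analogue of nextTuple(): emit one message and remember it; returns its id, or -1 if idle. */
    long nextTuple() {
        String message = queue.pollFirst();
        if (message == null) {
            return -1;
        }
        long id = nextId++;
        pending.put(id, message);
        return id;
    }

    /** Analogue of ack(msgId): the message was fully processed, so forget it. */
    void ack(long id) {
        pending.remove(id);
    }

    /** Analogue of fail(msgId): put the message back at the head of the queue for replay. */
    void fail(long id) {
        String message = pending.remove(id);
        if (message != null) {
            queue.addFirst(message);
        }
    }

    int pendingCount() {
        return pending.size();
    }

    int queuedCount() {
        return queue.size();
    }
}
```

A replayed message gets a fresh id in this sketch, so downstream processing may see it more than once; that is at-least-once behaviour.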
    42. Bolts
        Processes input streams and produces new streams
    43. Bolts
        // Package names as of Storm 0.8.x
        import java.util.Map;
        import backtype.storm.task.OutputCollector;
        import backtype.storm.task.TopologyContext;
        import backtype.storm.topology.OutputFieldsDeclarer;
        import backtype.storm.topology.base.BaseRichBolt;
        import backtype.storm.tuple.Fields;
        import backtype.storm.tuple.Tuple;
        import backtype.storm.tuple.Values;

        public class DoubleAndTripleBolt extends BaseRichBolt {
          // BaseRichBolt.prepare receives an OutputCollector
          private OutputCollector _collector;

          public void prepare(Map conf,
                              TopologyContext context,
                              OutputCollector collector) {
            _collector = collector;
          }

          public void execute(Tuple input) {
            int val = input.getInteger(0);
            // Anchor the new tuple to the input, then ack the input
            _collector.emit(input, new Values(val*2, val*3));
            _collector.ack(input);
          }

          public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("double", "triple"));
          }
        }
    44. Topologies
        Network of spouts and bolts
    45. Trident
        Cascading for Storm
    46. Trident
        TridentTopology topology = new TridentTopology();
        TridentState wordCounts =
          topology.newStream("spout1", spout)
            .each(new Fields("sentence"), new Split(), new Fields("word"))
            .groupBy(new Fields("word"))
            .persistentAggregate(new MemoryMapState.Factory(),
                                 new Count(),
                                 new Fields("count"))
            .parallelismHint(6);
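For readers who have not used Trident, the pipeline on this slide computes a running word count: split each sentence into words, group by word, and keep a count per word. A plain-Java equivalent of that computation over a finite list is sketched below. It is not Trident code and does nothing distributed or stateful across batches; the class name `WordCountSketch` and the whitespace-splitting rule are assumptions.

```java
import java.util.Map;
import java.util.TreeMap;

/** What the Trident word-count pipeline computes, over a finite input; not Storm code. */
final class WordCountSketch {

    static Map<String, Long> count(Iterable<String> sentences) {
        Map<String, Long> counts = new TreeMap<>();
        for (String sentence : sentences) {
            // each(sentence -> word): split on whitespace (assumed splitting rule)
            for (String word : sentence.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue;
                }
                // groupBy(word) + Count(): one running total per distinct word
                counts.merge(word, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count(java.util.Arrays.asList("the cow jumped", "over the moon")));
    }
}
```

In the Trident version the map lives in a `MemoryMapState` and is updated continuously as batches arrive, with `parallelismHint(6)` spreading the work across tasks.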
    47. Interoperability
    48. Spouts
        • Kafka (with transactions)
        • Kestrel
        • JMS
        • AMQP
        • Beanstalkd
    49. Bolts
        • Functions
        • Filters
        • Aggregation
        • Joins
        • Talk to databases, Hadoop write-behind
    50. [Architecture diagram. Labels: Raw Data, Queue, Storm (realtime processes), Hadoop (batch processes), Apps, Business Value]
    51. [Architecture diagram. Labels: Raw Data, Queue, Storm (realtime processes), Hadoop (batch processes), Parallel Cluster Ingest, Apps, Business Value]
    52. [Architecture diagram. Labels: Raw Data, Queue, Storm (realtime processes), Hadoop (batch processes), Apps, Business Value]
    53. [Architecture diagram. Labels: Raw Data, Storm (realtime processes), Hadoop (batch processes), Apps, Business Value]
    54. Get Involved!
        • Get more details on M7
          • http://mapr.com/products/mapr-editions/m7-edition
        • Join the Apache Drill mailing list
          • drill-dev-subscribe@incubator.apache.org
        • Watch TailSpout development
          • https://github.com/{tdunning | boorad}/mapr-spout
        • Join MapR
          • jobs@mapr.com
          • banderson@maprtech.com
        • @boorad