Stinger Tech DeepDivePage 1Alan Gates, Arun Murthy, Owen O’Malley
Hadoop SummitPage 2Architecting the Future of Big Data•  June 26-27, 2013- San Jose Convention Cntr•  Co-hosted by Hortonw...
Stinger Deep Dive OverviewPage 3© 2013 Hortonworks, DO NOT SHARE, HORTONWORKSCONFIDENTIAL•  Hive Performance Near Term•  S...
Hive Performance Near TermPage 4© 2013 Hortonworks, DO NOT SHARE, HORTONWORKSCONFIDENTIAL•  Enable star joins–  Where poss...
© Hortonworks Inc. 2013TL;DR• TPC-DS Query 27, Scale=200, 10 EC2 nodes (40 disks)1501.8951176.479631.02739.985020040060080...
© Hortonworks Inc. 2013TL;DR - II• TPC-DS Query 82, Scale=200, 10 EC2 nodes (40 disks)3257.6922862.669255.64171.1140500100...
© Hortonworks Inc. 2013Before
© Hortonworks Inc. 2013After
Support for Analytics - OVERPage 9© 2013 Hortonworks, DO NOT SHARE, HORTONWORKSCONFIDENTIAL•  OVER clause–  PARTITION BY, ...
Support for Analytics ContinuedPage 10© 2013 Hortonworks, DO NOT SHARE, HORTONWORKSCONFIDENTIAL•  CUBE BY, ROLLUP in Hive ...
ORCFile – Optimized RCFilePage 11© 2013 Hortonworks, DO NOT SHARE, HORTONWORKSCONFIDENTIAL
© Hortonworks Inc. 2012Top LevelPage 12
© Hortonworks Inc. 2012File StructurePage 13
© Hortonworks Inc. 2012Stripe StructurePage 14
© Hortonworks Inc. 2012File LayoutPage 15
© Hortonworks Inc. 2012Integer Column SerializationPage 16
© Hortonworks Inc. 2012String Column SerializationPage 17
© Hortonworks Inc. 2012CompressionPage 18
© Hortonworks Inc. 2012Projection and Predicate FilteringPage 19
© Hortonworks Inc. 2012Example File SizesPage 20
© Hortonworks Inc. 2012Final notesPage 21
© Hortonworks Inc. 2012ComparisonPage 22RC File Trevni ORC FileHive Type Model N N YSeparate complex columns N Y YSplits f...
© Hortonworks Inc. 2012Page 23Tez
© Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONTez (fka MRX)• Low level...
© Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONTez- Core IdeaTask with ...
© Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONTez – Blocks for buildin...
© Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONTez – More tasksSpecial ...
© Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONHive/MR versus Hive/TezP...
© Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONHive/MR versus Hive/TezP...
© Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONFastQuery: Beyond Batch ...
© Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONTez Service• Hive Query ...
© Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONMore details• Tez initia...
Hive Performance Longer Term -VectorizationPage 33© 2013 Hortonworks, DO NOT SHARE, HORTONWORKSCONFIDENTIAL•  Rewrite oper...
© Hortonworks Inc. 2013CPU intensive code
Upcoming SlideShare
Loading in...5
×

April 2013 HUG: The Stinger Initiative - Making Apache Hive 100 Times Faster

2,048

Published on

Apache Hive and its HiveQL interface has become the de facto SQL interface for Hadoop. Apache Hive was originally built for large-scale operational batch processing and it is very effective with reporting, data mining, and data preparation use cases. These usage patterns remain very important but with widespread adoption of Hadoop, the enterprise requirement for Hadoop to become more real time or interactive has increased in importance as well. Enabling Hive to answer human-time use cases (i.e. queries in the 5-30 second range) such as big data exploration, visualization, and parameterized reports without needing to resort to yet another tool to install, maintain and learn can deliver a lot of value to the large community of users with existing Hive skills and investments. To this end, we have launched the Stinger Initiative, with input and participation from the broader community, to enhance Hive with more SQL and better performance for these human-time use cases. We believe the performance changes we are making today, along with the work being done in Tez will transform Hive into a single tool that Hadoop users can use to do report generation, ad hoc queries, and large batch jobs spanning 10s or 100s of terabytes.

Presenter(s):

Alan Gates, Co-founder, Hortonworks and Apache Pig PMC and Apache HCatalog PPMC Member, Author of "Programming Pig" from O'Reilly Media

Owen O’ Malley, Co-founder, Hortonworks, First committer added to Apache Hadoop and Founding Chair of the Apache Hadoop PMC

Published in: Technology

April 2013 HUG: The Stinger Initiative - Making Apache Hive 100 Times Faster

  1. 1. Stinger Tech DeepDivePage 1Alan Gates, Arun Murthy, Owen O’Malley
  2. 2. Hadoop SummitPage 2Architecting the Future of Big Data•  June 26-27, 2013- San Jose Convention Cntr•  Co-hosted by Hortonworks & Yahoo!•  Theme: Enabling the Next GenerationEnterprise Data Platform•  90+ Sessions and 7 Tracks:•  Community Focused Event–  Sessions selected by a Conference Committee–  Community Choice allowed public to vote forsessions they want to see•  Training classes offered pre event–  Apache Hadoop Essentials: A TechnicalUnderstanding for Business Users–  Understanding Microsoft HDInsight and ApacheHadoop–  Developing Solutions with Apache Hadoop –HDFS and MapReduce–  Applying Data Science using Apache Hadoop•  HUG member discount 10%Use code: 13DiscHUG10hadoopsummit.org
  3. 3. Stinger Deep Dive OverviewPage 3© 2013 Hortonworks, DO NOT SHARE, HORTONWORKSCONFIDENTIAL•  Hive Performance Near Term•  Support for Analytics•  ORCFile•  Tez•  Hive Performance Longer Term
  4. 4. Hive Performance Near TermPage 4© 2013 Hortonworks, DO NOT SHARE, HORTONWORKSCONFIDENTIAL•  Enable star joins–  Where possible do in single map only task–  When not possible push larger tables to separate tasks•  Collapse adjacent jobs where possible–  Hive has lots of M->MR type plans, collapse these to MR–  Collapse adjacent jobs on sufficiently similar keys when feasible–  join followed by group–  join followed by order–  group followed by order–  Much of this ongoing in the community, we are shepherding•  Remove client side hash table generation–  Need to prove this is worth the effort short term
  5. 5. © Hortonworks Inc. 2013TL;DR• TPC-DS Query 27, Scale=200, 10 EC2 nodes (40 disks)1501.8951176.479631.02739.98502004006008001000120014001600Query 27TextRCFilePartitioned RCFilePartitioned RCFile + Optimizations
  6. 6. © Hortonworks Inc. 2013TL;DR - II• TPC-DS Query 82, Scale=200, 10 EC2 nodes (40 disks)3257.6922862.669255.64171.1140500100015002000250030003500Query 82TextRCFilePartitioned RCFilePartitioned RCFile + Optimizations
  7. 7. © Hortonworks Inc. 2013Before
  8. 8. © Hortonworks Inc. 2013After
  9. 9. Support for Analytics - OVERPage 9© 2013 Hortonworks, DO NOT SHARE, HORTONWORKSCONFIDENTIAL•  OVER clause–  PARTITION BY, ORDER BY, ROWS BETWEEN/FOLLOWING/PRECEDING–  Will work with current aggregate functions–  Adding new aggregates/window functions we believe are useful–  Condor: RANK, LEAD, ROW_NUMBER, LAG, LEAD, FIRST_VALUE, LAST_VALUE–  Ducati: NTILE, DENSE_RANK, CUME_DIST, PERCENT_RANK, PERCENT_CONT,PERCENT_DISC–  Ongoing work in the community, will be shepherding itSELECT salesperson, AVG(salesprice) OVER(PARTITION BY region, ORDER BY dateRANGE 10 ROWS PRECEEDING)FROM sales;
  10. 10. Support for Analytics ContinuedPage 10© 2013 Hortonworks, DO NOT SHARE, HORTONWORKSCONFIDENTIAL•  CUBE BY, ROLLUP in Hive 0.10•  Subqueries in WHERE–  Non-correlated only–  [NOT] IN supported–  Plan to optimize to fit in memory as hash table when feasible, join when not•  Datatypes–  datetime–  char() and varchar()–  add precision and scale to decimal and float–  aliases for standard SQL types (BLOB = binary, CLOB = string, integer = int, real/number = decimal)
  11. 11. ORCFile – Optimized RCFilePage 11© 2013 Hortonworks, DO NOT SHARE, HORTONWORKSCONFIDENTIAL
  12. 12. © Hortonworks Inc. 2012Top LevelPage 12
  13. 13. © Hortonworks Inc. 2012File StructurePage 13
  14. 14. © Hortonworks Inc. 2012Stripe StructurePage 14
  15. 15. © Hortonworks Inc. 2012File LayoutPage 15
  16. 16. © Hortonworks Inc. 2012Integer Column SerializationPage 16
  17. 17. © Hortonworks Inc. 2012String Column SerializationPage 17
  18. 18. © Hortonworks Inc. 2012CompressionPage 18
  19. 19. © Hortonworks Inc. 2012Projection and Predicate FilteringPage 19
  20. 20. © Hortonworks Inc. 2012Example File SizesPage 20
  21. 21. © Hortonworks Inc. 2012Final notesPage 21
  22. 22. © Hortonworks Inc. 2012ComparisonPage 22RC File Trevni ORC FileHive Type Model N N YSeparate complex columns N Y YSplits found quickly N Y YDefault column group size 4MB 64MB* 250MBFiles per a bucket 1 > 1 1Store min, max, sum, count N N YVersioned metadata N Y YRun length data encoding N N YStore strings in dictionary N N YStore row count N Y YSkip compressed blocks N N YStore internal indexes N N Y
  23. 23. © Hortonworks Inc. 2012Page 23Tez
  24. 24. © Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONTez (fka MRX)• Low level data-processing execution engine• Use it for the base of MapReduce, Hive, and Pig• Enables pipelining of jobs• Removes task and job launch times• Hive and Pig jobs no longer need to move to the endof the queue between steps in the pipeline• Does not write intermediate output to HDFS– Much lighter disk and network usage• Built on YARNPage 24
  25. 25. © Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONTez- Core IdeaTask with pluggable Input, Processor & OutputPage 25YARN ApplicationMaster to run DAG of Tez TasksInput ProcessorTez TaskOutput
  26. 26. © Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONTez – Blocks for building tasksMapReduce ‘Map’Page 26MapReduce ‘Reduce’HDFSInputMapProcessorTez TaskSortedOutputShuffleInputReduceProcessorTez TaskHDFSOutputIntermediate ‘Reduce’forMap-Reduce-ReduceShuffleInputReduceProcessorTez TaskSortedOutput
  27. 27. © Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONTez – More tasksSpecial Pig/Hive ‘Map’Page 27In-memory MapHDFSInputMapProcessorTez TaskPipelineSorterOutputHDFSInputMapProcessorTez TaskIn-memorySortedOutputSpecial Pig/Hive‘Reduce’ShuffleSkip-mergeInputReduceProcessorTez TaskSortedOutput
  28. 28. © Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONHive/MR versus Hive/TezPage 28I/O SynchronizationBarrierI/O PipeliningHive - MR Hive - TezSELECT a.state, COUNT(*)FROM a JOIN b ON (a.id = b.id)GROUP BY a.state
  29. 29. © Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONHive/MR versus Hive/TezPage 29Hive - MR Hive - TezSELECT a.state, COUNT(*), AVERAGE(c.price)FROM aJOIN b ON (a.id = b.id)JOIN c ON (a.itemId = c.itemId)GROUP BY a.state
  30. 30. © Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONFastQuery: Beyond Batch with YARNPage 30Tez Generalizes Map-ReduceSimplified execution plans processdata more efficientlyAlways-On Tez ServiceLow latency processing forall Hadoop data processing
  31. 31. © Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONTez Service• Hive Query Startup Expensive– Job launch & task-launch latencies are fatal for short queries (inorder of 5s to 30s)• Solution– Tez Service– Removes task-launch overhead– Removes job-launch overhead– Hive– Submit query-plan to Tez Service– Native Hadoop service, not ad-hocPage 31
  32. 32. © Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONMore details• Tez initiation– MRX runtime for running existing MR applications• Tez Improvements– In-memory sort/shuffle– Specialized sorts for Hive (integer sorts etc.)– Specialized merges for Hive (shallow key-space merges)• Tez on MRX– Change Hive to use MR+ from Tez– Focus on map-side joins• Initiated Apache Incubator ProjectPage 32
  33. 33. Hive Performance Longer Term -VectorizationPage 33© 2013 Hortonworks, DO NOT SHARE, HORTONWORKSCONFIDENTIAL•  Rewrite operators to work on arrays of Java scalars•  aka vectorization – ala MonetDB•  Operates on blocks of 1K or more records•  Each block contains an array of Java scalars, one for each column•  Avoids many function calls•  Size to fit in L1 cache, avoid cache misses•  Generate code for operators on the fly to avoid branches in code, maximizedeep pipelines of modern processers•  Will require complete rewrite of operators and execution pipeline–  Need to figure out how to integrate this into Hive, separate branch, separate layerin another project, or integrate in existing pipeline•  Requires conversion of all column values to Java scalars – no objects allowed–  Integrates nicely with up coming ORC work, other input types will need conversionon reading•  Want to write this in a way it can be shared by Pig, Cascading, MRprogrammers
  34. 34. © Hortonworks Inc. 2013CPU intensive code

×