• Share
  • Email
  • Embed
  • Like
  • Save
  • Private Content
April 2013 HUG: The Stinger Initiative - Making Apache Hive 100 Times Faster
 

April 2013 HUG: The Stinger Initiative - Making Apache Hive 100 Times Faster

on

  • 2,078 views

Apache Hive and its HiveQL interface has become the de facto SQL interface for Hadoop. Apache Hive was originally built for large-scale operational batch processing and it is very effective with ...

Apache Hive and its HiveQL interface has become the de facto SQL interface for Hadoop. Apache Hive was originally built for large-scale operational batch processing and it is very effective with reporting, data mining, and data preparation use cases. These usage patterns remain very important but with widespread adoption of Hadoop, the enterprise requirement for Hadoop to become more real time or interactive has increased in importance as well. Enabling Hive to answer human-time use cases (i.e. queries in the 5-30 second range) such as big data exploration, visualization, and parameterized reports without needing to resort to yet another tool to install, maintain and learn can deliver a lot of value to the large community of users with existing Hive skills and investments. To this end, we have launched the Stinger Initiative, with input and participation from the broader community, to enhance Hive with more SQL and better performance for these human-time use cases. We believe the performance changes we are making today, along with the work being done in Tez will transform Hive into a single tool that Hadoop users can use to do report generation, ad hoc queries, and large batch jobs spanning 10s or 100s of terabytes.

Presenter(s):

Alan Gates, Co-founder, Hortonworks and Apache Pig PMC and Apache HCatalog PPMC Member, Author of "Programming Pig" from O'Reilly Media

Owen O’ Malley, Co-founder, Hortonworks, First committer added to Apache Hadoop and Founding Chair of the Apache Hadoop PMC

Statistics

Views

Total Views
2,078
Views on SlideShare
2,078
Embed Views
0

Actions

Likes
14
Downloads
0
Comments
0

0 Embeds 0

No embeds

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

    April 2013 HUG: The Stinger Initiative - Making Apache Hive 100 Times Faster April 2013 HUG: The Stinger Initiative - Making Apache Hive 100 Times Faster Presentation Transcript

    • Stinger Tech DeepDivePage 1Alan Gates, Arun Murthy, Owen O’Malley
    • Hadoop SummitPage 2Architecting the Future of Big Data•  June 26-27, 2013- San Jose Convention Cntr•  Co-hosted by Hortonworks & Yahoo!•  Theme: Enabling the Next GenerationEnterprise Data Platform•  90+ Sessions and 7 Tracks:•  Community Focused Event–  Sessions selected by a Conference Committee–  Community Choice allowed public to vote forsessions they want to see•  Training classes offered pre event–  Apache Hadoop Essentials: A TechnicalUnderstanding for Business Users–  Understanding Microsoft HDInsight and ApacheHadoop–  Developing Solutions with Apache Hadoop –HDFS and MapReduce–  Applying Data Science using Apache Hadoop•  HUG member discount 10%Use code: 13DiscHUG10hadoopsummit.org
    • Stinger Deep Dive OverviewPage 3© 2013 Hortonworks, DO NOT SHARE, HORTONWORKSCONFIDENTIAL•  Hive Performance Near Term•  Support for Analytics•  ORCFile•  Tez•  Hive Performance Longer Term
    • Hive Performance Near TermPage 4© 2013 Hortonworks, DO NOT SHARE, HORTONWORKSCONFIDENTIAL•  Enable star joins–  Where possible do in single map only task–  When not possible push larger tables to separate tasks•  Collapse adjacent jobs where possible–  Hive has lots of M->MR type plans, collapse these to MR–  Collapse adjacent jobs on sufficiently similar keys when feasible–  join followed by group–  join followed by order–  group followed by order–  Much of this ongoing in the community, we are shepherding•  Remove client side hash table generation–  Need to prove this is worth the effort short term
    • © Hortonworks Inc. 2013TL;DR• TPC-DS Query 27, Scale=200, 10 EC2 nodes (40 disks)1501.8951176.479631.02739.98502004006008001000120014001600Query 27TextRCFilePartitioned RCFilePartitioned RCFile + Optimizations
    • © Hortonworks Inc. 2013TL;DR - II• TPC-DS Query 82, Scale=200, 10 EC2 nodes (40 disks)3257.6922862.669255.64171.1140500100015002000250030003500Query 82TextRCFilePartitioned RCFilePartitioned RCFile + Optimizations
    • © Hortonworks Inc. 2013Before
    • © Hortonworks Inc. 2013After
    • Support for Analytics - OVERPage 9© 2013 Hortonworks, DO NOT SHARE, HORTONWORKSCONFIDENTIAL•  OVER clause–  PARTITION BY, ORDER BY, ROWS BETWEEN/FOLLOWING/PRECEDING–  Will work with current aggregate functions–  Adding new aggregates/window functions we believe are useful–  Condor: RANK, LEAD, ROW_NUMBER, LAG, LEAD, FIRST_VALUE, LAST_VALUE–  Ducati: NTILE, DENSE_RANK, CUME_DIST, PERCENT_RANK, PERCENT_CONT,PERCENT_DISC–  Ongoing work in the community, will be shepherding itSELECT salesperson, AVG(salesprice) OVER(PARTITION BY region, ORDER BY dateRANGE 10 ROWS PRECEEDING)FROM sales;
    • Support for Analytics ContinuedPage 10© 2013 Hortonworks, DO NOT SHARE, HORTONWORKSCONFIDENTIAL•  CUBE BY, ROLLUP in Hive 0.10•  Subqueries in WHERE–  Non-correlated only–  [NOT] IN supported–  Plan to optimize to fit in memory as hash table when feasible, join when not•  Datatypes–  datetime–  char() and varchar()–  add precision and scale to decimal and float–  aliases for standard SQL types (BLOB = binary, CLOB = string, integer = int, real/number = decimal)
    • ORCFile – Optimized RCFilePage 11© 2013 Hortonworks, DO NOT SHARE, HORTONWORKSCONFIDENTIAL
    • © Hortonworks Inc. 2012Top LevelPage 12
    • © Hortonworks Inc. 2012File StructurePage 13
    • © Hortonworks Inc. 2012Stripe StructurePage 14
    • © Hortonworks Inc. 2012File LayoutPage 15
    • © Hortonworks Inc. 2012Integer Column SerializationPage 16
    • © Hortonworks Inc. 2012String Column SerializationPage 17
    • © Hortonworks Inc. 2012CompressionPage 18
    • © Hortonworks Inc. 2012Projection and Predicate FilteringPage 19
    • © Hortonworks Inc. 2012Example File SizesPage 20
    • © Hortonworks Inc. 2012Final notesPage 21
    • © Hortonworks Inc. 2012ComparisonPage 22RC File Trevni ORC FileHive Type Model N N YSeparate complex columns N Y YSplits found quickly N Y YDefault column group size 4MB 64MB* 250MBFiles per a bucket 1 > 1 1Store min, max, sum, count N N YVersioned metadata N Y YRun length data encoding N N YStore strings in dictionary N N YStore row count N Y YSkip compressed blocks N N YStore internal indexes N N Y
    • © Hortonworks Inc. 2012Page 23Tez
    • © Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONTez (fka MRX)• Low level data-processing execution engine• Use it for the base of MapReduce, Hive, and Pig• Enables pipelining of jobs• Removes task and job launch times• Hive and Pig jobs no longer need to move to the endof the queue between steps in the pipeline• Does not write intermediate output to HDFS– Much lighter disk and network usage• Built on YARNPage 24
    • © Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONTez- Core IdeaTask with pluggable Input, Processor & OutputPage 25YARN ApplicationMaster to run DAG of Tez TasksInput ProcessorTez TaskOutput
    • © Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONTez – Blocks for building tasksMapReduce ‘Map’Page 26MapReduce ‘Reduce’HDFSInputMapProcessorTez TaskSortedOutputShuffleInputReduceProcessorTez TaskHDFSOutputIntermediate ‘Reduce’forMap-Reduce-ReduceShuffleInputReduceProcessorTez TaskSortedOutput
    • © Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONTez – More tasksSpecial Pig/Hive ‘Map’Page 27In-memory MapHDFSInputMapProcessorTez TaskPipelineSorterOutputHDFSInputMapProcessorTez TaskIn-memorySortedOutputSpecial Pig/Hive‘Reduce’ShuffleSkip-mergeInputReduceProcessorTez TaskSortedOutput
    • © Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONHive/MR versus Hive/TezPage 28I/O SynchronizationBarrierI/O PipeliningHive - MR Hive - TezSELECT a.state, COUNT(*)FROM a JOIN b ON (a.id = b.id)GROUP BY a.state
    • © Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONHive/MR versus Hive/TezPage 29Hive - MR Hive - TezSELECT a.state, COUNT(*), AVERAGE(c.price)FROM aJOIN b ON (a.id = b.id)JOIN c ON (a.itemId = c.itemId)GROUP BY a.state
    • © Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONFastQuery: Beyond Batch with YARNPage 30Tez Generalizes Map-ReduceSimplified execution plans processdata more efficientlyAlways-On Tez ServiceLow latency processing forall Hadoop data processing
    • © Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONTez Service• Hive Query Startup Expensive– Job launch & task-launch latencies are fatal for short queries (inorder of 5s to 30s)• Solution– Tez Service– Removes task-launch overhead– Removes job-launch overhead– Hive– Submit query-plan to Tez Service– Native Hadoop service, not ad-hocPage 31
    • © Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATIONMore details• Tez initiation– MRX runtime for running existing MR applications• Tez Improvements– In-memory sort/shuffle– Specialized sorts for Hive (integer sorts etc.)– Specialized merges for Hive (shallow key-space merges)• Tez on MRX– Change Hive to use MR+ from Tez– Focus on map-side joins• Initiated Apache Incubator ProjectPage 32
    • Hive Performance Longer Term -VectorizationPage 33© 2013 Hortonworks, DO NOT SHARE, HORTONWORKSCONFIDENTIAL•  Rewrite operators to work on arrays of Java scalars•  aka vectorization – ala MonetDB•  Operates on blocks of 1K or more records•  Each block contains an array of Java scalars, one for each column•  Avoids many function calls•  Size to fit in L1 cache, avoid cache misses•  Generate code for operators on the fly to avoid branches in code, maximizedeep pipelines of modern processers•  Will require complete rewrite of operators and execution pipeline–  Need to figure out how to integrate this into Hive, separate branch, separate layerin another project, or integrate in existing pipeline•  Requires conversion of all column values to Java scalars – no objects allowed–  Integrates nicely with up coming ORC work, other input types will need conversionon reading•  Want to write this in a way it can be shared by Pig, Cascading, MRprogrammers
    • © Hortonworks Inc. 2013CPU intensive code