Overview of stinger interactive query for hive


Published on

Presentation given to the OC Big Data Meetup Group. http://www.meetup.com/OCBigData

Published in: Data & Analytics, Technology
1 Comment
No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Overview of stinger interactive query for hive

  1. 1. Overview  of  S+nger:  Interac+ve   Query  for  Hive     @ddkaiser   linkedin.com/in/dkaiser   slideshare.net/ddkaiser   dkaiser@cdk.com   dkaiser@hortonworks.com     OC  Big  Data  Meetup  #1   May  21,  2014   David  Kaiser  
  2. 2. Who Am I? David  Kaiser   20+  years  experience  with  Linux     3  years  experience  with  Hadoop     Career  experiences:   •  Data  Warehousing   •  Geospa+al  Analy+cs   •  Open-­‐source  Solu+ons  and  Architecture     Employed  at  Hortonworks  as  a  Senior  Solu+ons  Engineer     @ddkaiser   linkedin.com/in/dkaiser   slideshare.net/ddkaiser   dkaiser@cdk.com   dkaiser@hortonworks.com    
  3. 3. Overview of Stinger: Interactive Query for Hive • Abstract: – Hadoop  is  about  so  much  more  than  batch  processing.    With  the   recent  release  of  Hadoop  2,  there  have  been  many  new   approaches  for  increased  applica+on  performance.   – Hive  is  the  most  used  SQL  implementa+on  on  Hadoop.     – Hive  provides  the  most  amount  of  SQL  compa+bility  on  Hadoop.   – But…        Hive  is  Slow.             – Hive  WAS  Slow.   – This  talk  will  discuss  the  S+nger  ini+a+ve,  which  improved  Hive   performance  over  100x.  
  4. 4. S"nger  Project   (announced  February  2013)   Batch AND Interactive SQL-in-Hadoop Stinger Initiative A broad, community-based effort to drive the next generation of HIVE     Hive  0.13,  April  2014   •  Hive  on  Apache  Tez   •  Query  Service   •  Buffer  Cache   •  Cost  Based  Op+mizer  (Op+q)   •  Vectorized  Processing     Hive  0.11,  May  2013:   •  Base  Op+miza+ons   •  SQL  Analy+c  Func+ons   •  ORCFile,  Modern  File  Format   Hive  0.12,  October  2013:   •  VARCHAR,  DATE  Types   •  ORCFile  predicate  pushdown   •  Advanced  Op+miza+ons   •  Performance  Boosts  via  YARN   Speed   Improve  Hive  query  performance  by  100X  to   allow  for  interac+ve  query  +mes  (seconds)   Scale   The  only  SQL  processing  in  Hadoop  designed  for   queries  that  scale  from  TB  to  PB   SQL   Support  broadest  range  of  SQL  seman+cs  for   analy+c  applica+ons  running  against  Hadoop   Goals:   An Open Community at its finest: Apache Hive Contribution 1,672Jira Tickets Closed 145Developers 44Companies ~400,000Lines Of Code Added… 13Months
  5. 5. Outcomes from the Stinger Project Page 5 Feature   Descrip"on   Benefit   Tez  Integra+on   Tez  is  significantly  beeer  engine  than  MapReduce   Latency   Vectorized  Query   Take  advantage  of  modern  hardware  by  processing   thousand-­‐row  blocks  rather  than  row-­‐at-­‐a-­‐+me.   Throughput   Query  Planner   Using  extensive  sta+s+cs  now  available  in  Metastore   to  beeer  plan  and  op+mize  query,  including   predicate  pushdown  during  compila+on  to  eliminate   por+ons  of  input  (beyond  par++on  pruning)   Latency   ORC  File   Columnar,  type  aware  format  with  indices   Latency   Cost  Based  Op+mizer   (Op+q)   Join  re-­‐ordering  and  other  op+miza+ons  based  on   column  sta+s+cs  including  histograms  etc.   Latency   Hive  as  a  Service   Leaves  engine  running  between  sessions   Latency   Buffer  Cache   Leaves  most  used  HDFS  file  blocks  in  memory   Latency  
  6. 6. Hadoop 2: Moving Past MapReduce Page  6   HADOOP  1.0   HDFS   (redundant,  reliable  storage)   MapReduce   (cluster  resource  management    &  data  processing)   HDFS2   (redundant,  highly-­‐available  &  reliable  storage)   YARN   (cluster  resource  management)   MapReduce   (data  processing)   Others   HADOOP  2.0   Single  Use  System   Batch  Apps   Mul/  Purpose  Pla5orm   Batch,  Interac/ve,  Online,  Streaming,  …  
  7. 7. Apache Tez as the new Primitive HDFS2   (redundant,  reliable  storage)   Tez   (execu+on  engine)   YARN   (cluster  resource  management)   HADOOP  2.0   MapReduce  as  Base   Apache  Tez  as  Base   HDFS   (redundant,  reliable  storage)   MapReduce   (cluster  resource  management    &  data  processing)   Pig   (data  flow)   Hive   (sql)     Others   (Cascading)     HADOOP  1.0   Data  Flow   Pig   SQL   Hive     Others   (Cascading)     Batch   MapReduce   Slider   (con+nuous  execu+on)   Online     Data     Processing   HBase,   Accumulo   Real  Time     Stream     Processing   Storm  
  8. 8. Complete Open Source Stack •  YARN is the logical extension of Apache Hadoop –  Complements  HDFS,  the  data  reservoir     •  Resource Management for the Enterprise Data Lake –  Shared,  secure,  mul+-­‐tenant  Hadoop   Allows for all processing in Open-Source Hadoop Page  8   HDFS2  (Redundant,  Reliable  Storage)   YARN  (Cluster  Resource  Management)       BATCH   (MapReduce)   INTERACTIVE   (Tez)   STREAMING   (Storm,  S4,…)   GRAPH   (Giraph)   IN-­‐MEMORY   (Spark)   HPC  MPI   (OpenMPI)   ONLINE   (HBase)   OTHER   (Search)   (Weave…)  
  9. 9. Feature   Descrip"on   Benefit   Tez  Session   Overcomes  Map-­‐Reduce  job-­‐launch  latency  by  pre-­‐ launching  Tez  AppMaster   Latency   Tez  Container  Pre-­‐ Launch   Overcomes  Map-­‐Reduce  latency  by  pre-­‐launching   hot  containers  ready  to  serve  queries.   Latency   Tez  Container  Re-­‐ Use   Finished  maps  and  reduces  pick  up  more  work   rather  than  exi+ng.  Reduces  latency  and  eliminates   difficult  split-­‐size  tuning.  Out  of  box  performance!   Latency   Run+me  re-­‐ configura+on  of  DAG   Run+me  query  tuning  by  picking  aggrega+on   parallelism  using  online  query  sta+s+cs   Throughput   Tez  In-­‐Memory   Cache   Hot  data  kept  in  RAM  for  fast  access.   Latency   Complex  DAGs   Tez  Broadcast  Edge  and  Map-­‐Reduce-­‐Reduce   paeern  improve  query  scale  and  throughput.   Throughput   Hive On Tez - Execution
  10. 10. ORC File Advantages Sustained Query Times Apache Hive 0.12 provides sustained acceptable query times even at petabyte scale 131  GB   (78%  Smaller)   File  Size  Comparison  Across  Encoding  Methods   Dataset:  TPC-­‐DS  Scale  500  Dataset   221  GB   (62%  Smaller)   Encoded  with   Text   Encoded  with   RCFile   Encoded  with   ORCFile   Encoded  with   Parquet   505  GB   (14%  Smaller)   585  GB   (Original  Size)   •  Larger  Block  Sizes     •  Columnar  format   arranges  columns   adjacent  within  the   file  for  compression   &  fast  access   Impala   Hive  12   Smaller Footprint Better encoding with ORC in Apache Hive 0.12 reduces resource requirements for your cluster.
  11. 11. ORCFile  File  Format   Page 11 Query-­‐Op"mized:  Split-­‐able,  columnar  storage   file     Efficient  Reads:  Break  into  large  “stripes”  of   data  for  efficient  read     Fast  Filtering:  Built  in  index,  min/max,   metadata  for  fast  filtering  blocks  -­‐  bloom  filters   if  desired     Efficient  Compression:  Decompose  complex   row  types  into  primi+ves:  massive   compression  and  efficient  comparisons  for   filtering     Precomputa"on:  Built  in  aggregates  per  block   (min,  max,  count,  sum,  etc.)    
  12. 12. A Journey to SQL Compliance Evolu"on  of  SQL  Compliance  in  Hive   SQL  Datatypes   SQL  Seman"cs   INT/TINYINT/SMALLINT/BIGINT   SELECT,  INSERT   FLOAT/DOUBLE   GROUP  BY,  ORDER  BY,  HAVING   BOOLEAN   JOIN  on  explicit  join  key   ARRAY,  MAP,  STRUCT,  UNION   Inner,  outer,  cross  and  semi  joins   STRING   Sub-­‐queries  in  the  FROM  clause   BINARY   ROLLUP  and  CUBE   TIMESTAMP   UNION   DECIMAL   Standard  aggrega+ons  (sum,  avg,  etc.)   DATE   Custom  Java  UDFs   VARCHAR   Windowing  func+ons  (OVER,  RANK,  etc.)   CHAR   Advanced  UDFs  (ngram,  XPath,  URL)   Interval  Types   Sub-­‐queries  for  IN/NOT  IN,  HAVING   JOINs  in  WHERE  Clause   INSERT/UPDATE/DELETE   Legend   Hive  10  or  earlier   Roadmap   Hive  11   Hive  12   Hive  13  
  13. 13. Tez – Execution Performance •  Performance gains over Map Reduce –  Eliminate  replicated  write  barrier  between  successive  computa+ons.   –  Eliminate  job  launch  overhead  of  workflow  jobs.   –  Eliminate  extra  stage  of  map  reads  in  every  workflow  job.   –  Eliminate  queue  and  resource  conten+on  suffered  by  workflow  jobs  that  are  started  aper   a  predecessor  job  completes.   Page  13   Pig/Hive  -­‐  MR   Pig/Hive  -­‐  Tez  
  14. 14. Hive  –  MR   Hive  –  Tez   Hive-on-MR vs. Hive-on-Tez SELECT  a.state,  COUNT(*),  AVERAGE(c.price)  FROM  a   JOIN  b  on  (a.id  =  b.id)   JOIN  c  on  (a.itemId  =  c.itemId)   GROUP  by  a.state SELECT  a.state   JOIN  (a,  c)   SELECT  c.price   SELECT  b.id   JOIN(a,  b)   GROUP  BY  a.state   COUNT(*)   AVERAGE(c.price)   M M M R R M M R M M R M M R HDFS   HDFS   HDFS   M M M R R R M M R R SELECT  a.state,   c.itemId   JOIN  (a,  c)   JOIN(a,  b)   GROUP  BY  a.state   COUNT(*)   AVERAGE(c.price)   SELECT  b.id   Tez  avoids  unneeded   writes  to  HDFS  
  15. 15. Vectorization • Rewrite all operations to operate on blocks of 1K+ records, rather than one record at a time • Block is array of Java scalars, not Objects (eliminate Objects – compounding GC gains over time) • Avoids many function calls, CPU pipeline stalls •  Size to fit in L1 cache, avoid cache misses Page  15  
  16. 16. Stinger Phase 3: Unlocking Interactive Query S"nger  Phase  3:  Features  and  Benefits   Container  Pre-­‐Launch   Overcomes  Java  VM  startup  latency  by  pre-­‐ launching  hot  containers  ready  to  serve  queries   Container  Re-­‐Use   Finished  Maps  and  Reduces  pick  up  more  work   rather  than  exi+ng.  Reduces  latency  and  eliminates   difficult  split  size  tuning   Tez  Integra+on   Tez  Broadcast  Edge  and  Map-­‐Reduce-­‐Reduce   paeern  improve  query  scale  and  throughput   In-­‐Memory  Cache   Hot  data  kept  in  RAM  for  fast  access  
  17. 17. Quantifying Stinger Page 17 Hive 10 Hive 0.13 (Phase 3)Hive 0.11 (Phase 1) 190x   Improvement   1400s 39s 7.2s TPC-­‐DS  Query  27   3200s 65s 14.9s TPC-­‐DS  Query  82   200x   Improvement   Query  27:  Pricing  Analy"cs  using  Star  Schema  Join     Query  82:  Inventory  Analy"cs  Joining  2  Large  Fact  Tables   All  Results  at  Scale  Factor  200  (Approximately  200GB  Data)  
  18. 18. 41.1s 4.2s 39.8s 4.1s TPC-­‐DS  Query  52   TPC-­‐DS  Query  55   Query  Time  in  Seconds   Speed: Delivering Interactive Query Test  Cluster:   •  200  GB  Data  (ORCFile)   •  20  Nodes,  24GB  RAM  each,  6x  disk  each     Hive 0.12 Hive 0.13 (Phase 3) Query  52:  Star  Schema  Join  with  group-­‐by,  order-­‐by  on  different  keys   Query  55:  Star  Schema  Join  with  group-­‐by,  order-­‐by  on  different  keys  
  19. 19. 22s 9.8s 31s 6.7s TPC-­‐DS  Query  28   TPC-­‐DS  Query  12   Query  Time  in  Seconds   Speed: Delivering Interactive Query Test  Cluster:   •  200  GB  Data  (ORCFile)   •  20  Nodes,  24GB  RAM  each,  6x  disk  each     Hive 0.12 Hive 0.13 (Phase 3) Query  28:  Four  sub-­‐query  join  (Vectoriza"on)   Query  12:  Star  Join  over  range  of  dates  (M-­‐R-­‐R  palern)  
  20. 20. Hortonworks Confidential © 2014 Speed@Scale: Large Scale Implementation Page 20 http://blogs.cisco.com/datacenter/hdp Cisco Engineering Blog Post Independent assessment by Cisco UCS Team Benchmark @ 30TB
  21. 21. Hortonworks Confidential © 2014 Speed@Scale: Large Scale Implementation Page 21 https://code.facebook.com/posts/229861827208629/scaling-the-facebook-data-warehouse-to-300-pb/ Facebook Engineering Blog Post Hortonworks engineering team worked on ORCFile Facebook provided improvements to ORCFile, working with Hortonworks Hive is used for efficient analytics on the largest Hadoop Data Warehouse site Ultimate Scale Data Analysis
  22. 22. Your Fastest On-ramp to Enterprise Hadoop™! Page  22   hep://hortonworks.com/products/hortonworks-­‐sandbox/   The  Sandbox  lets  you  experience  Apache  Hadoop  from  the  convenience  of  your  own   laptop  –  no  data  center,  no  cloud  and  no  internet  connec+on  needed!     The  Hortonworks  Sandbox  is:   •  A  free  download:    hep://hortonworks.com/products/hortonworks-­‐sandbox/   •  A  complete,  self  contained  virtual  machine  with  Apache  Hadoop  pre-­‐configured   •  A  personal,  portable  and  standalone  Hadoop  environment   •  A  set  of  hands-­‐on,  step-­‐by-­‐step  tutorials  that  allow  you  to  learn  and  explore  Hadoop  
  23. 23. Ques+ons?   @ddkaiser   linkedin.com/in/dkaiser   slideshare.net/ddkaiser   dkaiser@cdk.com   dkaiser@hortonworks.com     David  Kaiser