SlideShare a Scribd company logo
1 of 17
Join Optimization in Hive Liyin Tang
Outline Map Join Optimization Previous Common Join and Map Join Optimized Map Join JDBM Performance Evaluation Convert Join to Map Join Automatically How it works Performance Evaluation
Common Join Task A Table X Table Y Common Join Task Mapper Mapper Mapper … Mapper … Mapper … Mapper Shuffle Reducer
Previous Map Join Task A Small Table Data MapJoin Task Mapper … … Record Mapper … … … Big Table  Data Record Mapper … … … Record Record Record Task C … …
Optimized Map Join Small Table Data Small Table Data Small Table Data Task A Upload files to DC HashTable Files HashTable Files HashTable  Files MapReduce Local Task Distributed Cache MapJoin Task Mapper … Record Mapper … Record Mapper … Big Table  Data Record Record … … Task C
JDBM JDBM is too heavy weight for Map Join Take more than 70% CPU time Generate very large file No need to use persistent hashtable for map join
Performance Evaluation I
Converting Common Join into Map Join Task A Task A Conditional Task a CommonJoinTask MapJoinLocalTask MapJoinLocalTask MapJoinLocalTask b CommonJoinTask . . . . .  Task C c MapJoinTask MapJoinTask MapJoinTask Previous Execution Flow  Task C Optimized Execution Flow
Compile Time SELECT * FROM  SRC1 x JOIN SRC2 y  ON x.key = y.key; Task A a Conditional Task Assume TABLE x is the big table Assume TABLE y is the big table MapJoinLocalTask MapJoinLocalTask CommonJoinTask MapJoinTask MapJoinTask Task C
Execution Time SELECT * FROM  SRC1 x JOIN SRC2 y  ON x.key = y.key; Task A Both tables are too big for map join Table X is the big table a Conditional Task MapJoinLocalTask CommonJoinTask MapJoinTask Task C
Backup Task Task A Conditional Task Memory Bound MapJoinLocalTask Run as a Backup Task CommonJoinTask MapJoinTask Task C
Performance Bottleneck Distributed Cache is the potential performance bottleneck Large hashtable file will slow down the propagation of Distributed Cache Mappers are waiting for the hashtables file from Distributed Cache Compress and archive all the hashtable file into a tar file.
Compress and Archive  Small Table Data Small Table Data Small Table Data Task A Compressed & Archived a HashTable Files HashTable Files HashTable Files MapReduce Local Task Distributed Cache Mapper … Record Mapper … Record Mapper … Big Table  Data Record b Record MapJoin Task … … Task C
Performance Evaluation II
Performance Evaluation III
Future Work Audit how many join will be converted into map join in the cluster. Set hashtable file replica number based on the number of Mappers Tune the limit of small table data size by sampling Increase the in-memory hashtable capacity.
Thank you Liyin Tang

More Related Content

What's hot

Map reduce presentation
Map reduce presentationMap reduce presentation
Map reduce presentationateeq ateeq
 
Introduction To Map Reduce
Introduction To Map ReduceIntroduction To Map Reduce
Introduction To Map Reducerantav
 
[Harvard CS264] 08b - MapReduce and Hadoop (Zak Stone, Harvard)
[Harvard CS264] 08b - MapReduce and Hadoop (Zak Stone, Harvard)[Harvard CS264] 08b - MapReduce and Hadoop (Zak Stone, Harvard)
[Harvard CS264] 08b - MapReduce and Hadoop (Zak Stone, Harvard)npinto
 
Introduction to Map-Reduce
Introduction to Map-ReduceIntroduction to Map-Reduce
Introduction to Map-ReduceBrendan Tierney
 
Map reduce and Hadoop on windows
Map reduce and Hadoop on windowsMap reduce and Hadoop on windows
Map reduce and Hadoop on windowsMuhammad Shahid
 
Hive Percona 2009
Hive Percona 2009Hive Percona 2009
Hive Percona 2009prasadc
 
Hw09 Hadoop Development At Facebook Hive And Hdfs
Hw09   Hadoop Development At Facebook  Hive And HdfsHw09   Hadoop Development At Facebook  Hive And Hdfs
Hw09 Hadoop Development At Facebook Hive And HdfsCloudera, Inc.
 
Analysing of big data using map reduce
Analysing of big data using map reduceAnalysing of big data using map reduce
Analysing of big data using map reducePaladion Networks
 
Hadoop Summit 2009 Hive
Hadoop Summit 2009 HiveHadoop Summit 2009 Hive
Hadoop Summit 2009 HiveZheng Shao
 
Mastering Hadoop Map Reduce - Custom Types and Other Optimizations
Mastering Hadoop Map Reduce - Custom Types and Other OptimizationsMastering Hadoop Map Reduce - Custom Types and Other Optimizations
Mastering Hadoop Map Reduce - Custom Types and Other Optimizationsscottcrespo
 
Map Reduce
Map ReduceMap Reduce
Map Reduceschapht
 
Stratosphere with big_data_analytics
Stratosphere with big_data_analyticsStratosphere with big_data_analytics
Stratosphere with big_data_analyticsAvinash Pandu
 

What's hot (20)

Map reduce presentation
Map reduce presentationMap reduce presentation
Map reduce presentation
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Hive
HiveHive
Hive
 
Introduction To Map Reduce
Introduction To Map ReduceIntroduction To Map Reduce
Introduction To Map Reduce
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
An Introduction To Map-Reduce
An Introduction To Map-ReduceAn Introduction To Map-Reduce
An Introduction To Map-Reduce
 
Hadoop Map Reduce
Hadoop Map ReduceHadoop Map Reduce
Hadoop Map Reduce
 
[Harvard CS264] 08b - MapReduce and Hadoop (Zak Stone, Harvard)
[Harvard CS264] 08b - MapReduce and Hadoop (Zak Stone, Harvard)[Harvard CS264] 08b - MapReduce and Hadoop (Zak Stone, Harvard)
[Harvard CS264] 08b - MapReduce and Hadoop (Zak Stone, Harvard)
 
Introduction to Map-Reduce
Introduction to Map-ReduceIntroduction to Map-Reduce
Introduction to Map-Reduce
 
MapReduce basic
MapReduce basicMapReduce basic
MapReduce basic
 
Map reduce and Hadoop on windows
Map reduce and Hadoop on windowsMap reduce and Hadoop on windows
Map reduce and Hadoop on windows
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Hive Percona 2009
Hive Percona 2009Hive Percona 2009
Hive Percona 2009
 
Hw09 Hadoop Development At Facebook Hive And Hdfs
Hw09   Hadoop Development At Facebook  Hive And HdfsHw09   Hadoop Development At Facebook  Hive And Hdfs
Hw09 Hadoop Development At Facebook Hive And Hdfs
 
Analysing of big data using map reduce
Analysing of big data using map reduceAnalysing of big data using map reduce
Analysing of big data using map reduce
 
Hadoop Summit 2009 Hive
Hadoop Summit 2009 HiveHadoop Summit 2009 Hive
Hadoop Summit 2009 Hive
 
Hadoop Map Reduce Arch
Hadoop Map Reduce ArchHadoop Map Reduce Arch
Hadoop Map Reduce Arch
 
Mastering Hadoop Map Reduce - Custom Types and Other Optimizations
Mastering Hadoop Map Reduce - Custom Types and Other OptimizationsMastering Hadoop Map Reduce - Custom Types and Other Optimizations
Mastering Hadoop Map Reduce - Custom Types and Other Optimizations
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Stratosphere with big_data_analytics
Stratosphere with big_data_analyticsStratosphere with big_data_analytics
Stratosphere with big_data_analytics
 

Viewers also liked

Hive Correlation Optimizer
Hive Correlation OptimizerHive Correlation Optimizer
Hive Correlation OptimizerYin Huai
 
Hive contributors meetup apache sentry
Hive contributors meetup   apache sentryHive contributors meetup   apache sentry
Hive contributors meetup apache sentryBrock Noland
 
Ten tools for ten big data areas 04_Apache Hive
Ten tools for ten big data areas 04_Apache HiveTen tools for ten big data areas 04_Apache Hive
Ten tools for ten big data areas 04_Apache HiveWill Du
 
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...Cloudera, Inc.
 
Hive User Meeting August 2009 Facebook
Hive User Meeting August 2009 FacebookHive User Meeting August 2009 Facebook
Hive User Meeting August 2009 Facebookragho
 
Jump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on DatabricksJump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on DatabricksDatabricks
 
Spark Summit Europe 2016 Keynote - Databricks CEO
Spark Summit Europe 2016 Keynote  - Databricks CEO Spark Summit Europe 2016 Keynote  - Databricks CEO
Spark Summit Europe 2016 Keynote - Databricks CEO Databricks
 
Apache Spark and Online Analytics
Apache Spark and Online Analytics Apache Spark and Online Analytics
Apache Spark and Online Analytics Databricks
 
How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...DataWorks Summit/Hadoop Summit
 
Spark Summit EU 2016: The Next AMPLab: Real-time Intelligent Secure Execution
Spark Summit EU 2016: The Next AMPLab:  Real-time Intelligent Secure ExecutionSpark Summit EU 2016: The Next AMPLab:  Real-time Intelligent Secure Execution
Spark Summit EU 2016: The Next AMPLab: Real-time Intelligent Secure ExecutionDatabricks
 
A Deep Dive into Structured Streaming: Apache Spark Meetup at Bloomberg 2016
A Deep Dive into Structured Streaming:  Apache Spark Meetup at Bloomberg 2016 A Deep Dive into Structured Streaming:  Apache Spark Meetup at Bloomberg 2016
A Deep Dive into Structured Streaming: Apache Spark Meetup at Bloomberg 2016 Databricks
 
Spark Summit EU 2016 Keynote - Simplifying Big Data in Apache Spark 2.0
Spark Summit EU 2016 Keynote - Simplifying Big Data in Apache Spark 2.0Spark Summit EU 2016 Keynote - Simplifying Big Data in Apache Spark 2.0
Spark Summit EU 2016 Keynote - Simplifying Big Data in Apache Spark 2.0Databricks
 
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5Cloudera, Inc.
 
A look under the hood at Apache Spark's API and engine evolutions
A look under the hood at Apache Spark's API and engine evolutionsA look under the hood at Apache Spark's API and engine evolutions
A look under the hood at Apache Spark's API and engine evolutionsDatabricks
 
Insights Without Tradeoffs: Using Structured Streaming
Insights Without Tradeoffs: Using Structured StreamingInsights Without Tradeoffs: Using Structured Streaming
Insights Without Tradeoffs: Using Structured StreamingDatabricks
 
Spark Summit EU 2015: Spark DataFrames: Simple and Fast Analysis of Structure...
Spark Summit EU 2015: Spark DataFrames: Simple and Fast Analysis of Structure...Spark Summit EU 2015: Spark DataFrames: Simple and Fast Analysis of Structure...
Spark Summit EU 2015: Spark DataFrames: Simple and Fast Analysis of Structure...Databricks
 

Viewers also liked (20)

Hive Correlation Optimizer
Hive Correlation OptimizerHive Correlation Optimizer
Hive Correlation Optimizer
 
Hive contributors meetup apache sentry
Hive contributors meetup   apache sentryHive contributors meetup   apache sentry
Hive contributors meetup apache sentry
 
Ten tools for ten big data areas 04_Apache Hive
Ten tools for ten big data areas 04_Apache HiveTen tools for ten big data areas 04_Apache Hive
Ten tools for ten big data areas 04_Apache Hive
 
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...
 
Optimizing Hive Queries
Optimizing Hive QueriesOptimizing Hive Queries
Optimizing Hive Queries
 
Hive ppt (1)
Hive ppt (1)Hive ppt (1)
Hive ppt (1)
 
Hive User Meeting August 2009 Facebook
Hive User Meeting August 2009 FacebookHive User Meeting August 2009 Facebook
Hive User Meeting August 2009 Facebook
 
Jump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on DatabricksJump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on Databricks
 
Spark Summit Europe 2016 Keynote - Databricks CEO
Spark Summit Europe 2016 Keynote  - Databricks CEO Spark Summit Europe 2016 Keynote  - Databricks CEO
Spark Summit Europe 2016 Keynote - Databricks CEO
 
Internal Hive
Internal HiveInternal Hive
Internal Hive
 
Apache Spark and Online Analytics
Apache Spark and Online Analytics Apache Spark and Online Analytics
Apache Spark and Online Analytics
 
How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...
 
Spark Summit EU 2016: The Next AMPLab: Real-time Intelligent Secure Execution
Spark Summit EU 2016: The Next AMPLab:  Real-time Intelligent Secure ExecutionSpark Summit EU 2016: The Next AMPLab:  Real-time Intelligent Secure Execution
Spark Summit EU 2016: The Next AMPLab: Real-time Intelligent Secure Execution
 
A Deep Dive into Structured Streaming: Apache Spark Meetup at Bloomberg 2016
A Deep Dive into Structured Streaming:  Apache Spark Meetup at Bloomberg 2016 A Deep Dive into Structured Streaming:  Apache Spark Meetup at Bloomberg 2016
A Deep Dive into Structured Streaming: Apache Spark Meetup at Bloomberg 2016
 
Spark Summit EU 2016 Keynote - Simplifying Big Data in Apache Spark 2.0
Spark Summit EU 2016 Keynote - Simplifying Big Data in Apache Spark 2.0Spark Summit EU 2016 Keynote - Simplifying Big Data in Apache Spark 2.0
Spark Summit EU 2016 Keynote - Simplifying Big Data in Apache Spark 2.0
 
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
 
Hive tuning
Hive tuningHive tuning
Hive tuning
 
A look under the hood at Apache Spark's API and engine evolutions
A look under the hood at Apache Spark's API and engine evolutionsA look under the hood at Apache Spark's API and engine evolutions
A look under the hood at Apache Spark's API and engine evolutions
 
Insights Without Tradeoffs: Using Structured Streaming
Insights Without Tradeoffs: Using Structured StreamingInsights Without Tradeoffs: Using Structured Streaming
Insights Without Tradeoffs: Using Structured Streaming
 
Spark Summit EU 2015: Spark DataFrames: Simple and Fast Analysis of Structure...
Spark Summit EU 2015: Spark DataFrames: Simple and Fast Analysis of Structure...Spark Summit EU 2015: Spark DataFrames: Simple and Fast Analysis of Structure...
Spark Summit EU 2015: Spark DataFrames: Simple and Fast Analysis of Structure...
 

Similar to Join optimization in hive

White paper hadoop performancetuning
White paper hadoop performancetuningWhite paper hadoop performancetuning
White paper hadoop performancetuningAnil Reddy
 
Mapreduce advanced
Mapreduce advancedMapreduce advanced
Mapreduce advancedChirag Ahuja
 
Applying stratosphere for big data analytics
Applying stratosphere for big data analyticsApplying stratosphere for big data analytics
Applying stratosphere for big data analyticsAvinash Pandu
 
mapreduce.pptx
mapreduce.pptxmapreduce.pptx
mapreduce.pptxShimoFcis
 
Dache: A Data Aware Caching for Big-Data using Map Reduce framework
Dache: A Data Aware Caching for Big-Data using Map Reduce frameworkDache: A Data Aware Caching for Big-Data using Map Reduce framework
Dache: A Data Aware Caching for Big-Data using Map Reduce frameworkSafir Shah
 
MAP REDUCE IN DATA SCIENCE.pptx
MAP REDUCE IN DATA SCIENCE.pptxMAP REDUCE IN DATA SCIENCE.pptx
MAP REDUCE IN DATA SCIENCE.pptxHARIKRISHNANU13
 
Introduction to map reduce s. jency jayastina II MSC COMPUTER SCIENCE BON SEC...
Introduction to map reduce s. jency jayastina II MSC COMPUTER SCIENCE BON SEC...Introduction to map reduce s. jency jayastina II MSC COMPUTER SCIENCE BON SEC...
Introduction to map reduce s. jency jayastina II MSC COMPUTER SCIENCE BON SEC...jencyjayastina
 
Map Reduce Execution Architecture
Map Reduce Execution Architecture Map Reduce Execution Architecture
Map Reduce Execution Architecture Rupak Roy
 
Advanced Hadoop Tuning and Optimization - Hadoop Consulting
Advanced Hadoop Tuning and Optimization - Hadoop ConsultingAdvanced Hadoop Tuning and Optimization - Hadoop Consulting
Advanced Hadoop Tuning and Optimization - Hadoop ConsultingImpetus Technologies
 
Finalprojectpresentation
FinalprojectpresentationFinalprojectpresentation
FinalprojectpresentationSANTOSH WAYAL
 
Hadoop eco system with mapreduce hive and pig
Hadoop eco system with mapreduce hive and pigHadoop eco system with mapreduce hive and pig
Hadoop eco system with mapreduce hive and pigKhanKhaja1
 
MapReduce
MapReduceMapReduce
MapReduceKavyaGo
 

Similar to Join optimization in hive (20)

White paper hadoop performancetuning
White paper hadoop performancetuningWhite paper hadoop performancetuning
White paper hadoop performancetuning
 
Map and Reduce
Map and ReduceMap and Reduce
Map and Reduce
 
Mapreduce advanced
Mapreduce advancedMapreduce advanced
Mapreduce advanced
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Applying stratosphere for big data analytics
Applying stratosphere for big data analyticsApplying stratosphere for big data analytics
Applying stratosphere for big data analytics
 
mapreduce.pptx
mapreduce.pptxmapreduce.pptx
mapreduce.pptx
 
Dache: A Data Aware Caching for Big-Data using Map Reduce framework
Dache: A Data Aware Caching for Big-Data using Map Reduce frameworkDache: A Data Aware Caching for Big-Data using Map Reduce framework
Dache: A Data Aware Caching for Big-Data using Map Reduce framework
 
Hadoop
HadoopHadoop
Hadoop
 
MAP REDUCE IN DATA SCIENCE.pptx
MAP REDUCE IN DATA SCIENCE.pptxMAP REDUCE IN DATA SCIENCE.pptx
MAP REDUCE IN DATA SCIENCE.pptx
 
Introduction to map reduce s. jency jayastina II MSC COMPUTER SCIENCE BON SEC...
Introduction to map reduce s. jency jayastina II MSC COMPUTER SCIENCE BON SEC...Introduction to map reduce s. jency jayastina II MSC COMPUTER SCIENCE BON SEC...
Introduction to map reduce s. jency jayastina II MSC COMPUTER SCIENCE BON SEC...
 
Map Reduce Execution Architecture
Map Reduce Execution Architecture Map Reduce Execution Architecture
Map Reduce Execution Architecture
 
Advanced Hadoop Tuning and Optimization - Hadoop Consulting
Advanced Hadoop Tuning and Optimization - Hadoop ConsultingAdvanced Hadoop Tuning and Optimization - Hadoop Consulting
Advanced Hadoop Tuning and Optimization - Hadoop Consulting
 
Finalprojectpresentation
FinalprojectpresentationFinalprojectpresentation
Finalprojectpresentation
 
Map Reduce Online
Map Reduce OnlineMap Reduce Online
Map Reduce Online
 
ch02-mapreduce.pptx
ch02-mapreduce.pptxch02-mapreduce.pptx
ch02-mapreduce.pptx
 
Hadoop eco system with mapreduce hive and pig
Hadoop eco system with mapreduce hive and pigHadoop eco system with mapreduce hive and pig
Hadoop eco system with mapreduce hive and pig
 
MapReduce
MapReduceMapReduce
MapReduce
 
Map reducefunnyslide
Map reducefunnyslideMap reducefunnyslide
Map reducefunnyslide
 
Hadoop 3
Hadoop 3Hadoop 3
Hadoop 3
 
Hadoop 2
Hadoop 2Hadoop 2
Hadoop 2
 

Join optimization in hive

  • 1. Join Optimization in Hive Liyin Tang
  • 2. Outline Map Join Optimization Previous Common Join and Map Join Optimized Map Join JDBM Performance Evaluation Convert Join to Map Join Automatically How it works Performance Evaluation
  • 3. Common Join Task A Table X Table Y Common Join Task Mapper Mapper Mapper … Mapper … Mapper … Mapper Shuffle Reducer
  • 4. Previous Map Join Task A Small Table Data MapJoin Task Mapper … … Record Mapper … … … Big Table Data Record Mapper … … … Record Record Record Task C … …
  • 5. Optimized Map Join Small Table Data Small Table Data Small Table Data Task A Upload files to DC HashTable Files HashTable Files HashTable Files MapReduce Local Task Distributed Cache MapJoin Task Mapper … Record Mapper … Record Mapper … Big Table Data Record Record … … Task C
  • 6. JDBM JDBM is too heavy weight for Map Join Take more than 70% CPU time Generate very large file No need to use persistent hashtable for map join
  • 8. Converting Common Join into Map Join Task A Task A Conditional Task a CommonJoinTask MapJoinLocalTask MapJoinLocalTask MapJoinLocalTask b CommonJoinTask . . . . . Task C c MapJoinTask MapJoinTask MapJoinTask Previous Execution Flow Task C Optimized Execution Flow
  • 9. Compile Time SELECT * FROM SRC1 x JOIN SRC2 y ON x.key = y.key; Task A a Conditional Task Assume TABLE x is the big table Assume TABLE y is the big table MapJoinLocalTask MapJoinLocalTask CommonJoinTask MapJoinTask MapJoinTask Task C
  • 10. Execution Time SELECT * FROM SRC1 x JOIN SRC2 y ON x.key = y.key; Task A Both tables are too big for map join Table X is the big table a Conditional Task MapJoinLocalTask CommonJoinTask MapJoinTask Task C
  • 11. Backup Task Task A Conditional Task Memory Bound MapJoinLocalTask Run as a Backup Task CommonJoinTask MapJoinTask Task C
  • 12. Performance Bottleneck Distributed Cache is the potential performance bottleneck Large hashtable file will slow down the propagation of Distributed Cache Mappers are waiting for the hashtables file from Distributed Cache Compress and archive all the hashtable file into a tar file.
  • 13. Compress and Archive Small Table Data Small Table Data Small Table Data Task A Compressed & Archived a HashTable Files HashTable Files HashTable Files MapReduce Local Task Distributed Cache Mapper … Record Mapper … Record Mapper … Big Table Data Record b Record MapJoin Task … … Task C
  • 16. Future Work Audit how many join will be converted into map join in the cluster. Set hashtable file replica number based on the number of Mappers Tune the limit of small table data size by sampling Increase the in-memory hashtable capacity.

Editor's Notes

  1. A common join in hive will involve a Map stage and a Reduce stageAs we all know, shuffle stage before reducer is expensive, they need to sort and merge the intermediate file. So we tried to avoid this stage whenever is possible.
  2. That’s the motivation of the map join.When one of the table is small enough to fit into the memory, so all the Mapper can hold the data in memory and do the join work in memory.So in this way, there is no shuffle/reduce stage is needed.That is how the previous map join works However the previous map join does not scaleThousands of Mapper read the small table from HDFS into memory, it will easily cause the small table to be the performance bottleneckAlso they will get read time out when the small table data became to be the hot spot. So that’s the problem of the previous map join
  3. I tried to solve this problemWe create a map reduce local task, which will run locally and read the small table data into memory and serialize the hashtable into files.After that, it will upload the files into Distributed CacheWhen the Map Join Task is launched, the DC will propagate the hashtable files to each mapper’s local file system.And each mapper will load them back into the memory and do the join work as before.By doing this, we need to read the small table data only once. Also we use the DC to push the data to Mapper, instead of pulling data from HDFS by mapper itself. The difference is if multiple mappers runs on the same machine, DC only needs to push the data once.
  4. Another optimization is to remove JDBM component from hive.JDBM is the persistent hash table used in Hive.Whenever the in-memory hashtable cannot hold data any more, it will swap the key/value into the JDBM table. It’s like a backup storageBy profiling, we found out JDBM is too heavy weight for map join.Take more than 70% CPU time when call the get function from JDBM.The generated Hashtable file is too large to propagate in DCActually, there is no need to use persistent hash table for map join. If the table is too large to fit in the memory, they should not run the query as a map join.Right now, we have totally removed this component from hive.
  5. We run several benchmark to how much performance improvement after optimization.The result of benchmark shows the new optimized map join will be 12~26 times faster the previous one.I have to mention that the performance improvement is not only because of introducing the Distributed Cache to propagate the hashtable file,But also we have optimized the map join code and remove a very heavy weight persistent hashtable component from Hive.
  6. since the new map join has a very good performance, Hive should try to run map join instead of common join whenever is possible.Previously, if user wants to do the map join, he needs to give the hints in query to assign which table is the small table 2) So the mapper can hold that data in the memory.3)But basically, not all of the users will givethis hint or users may gave a wrong hint in the query.4) So getting the hint from user is not good for user experience and query performance.6) So my work is to automatically and dynamically convert the common join into map join during the run time.7) Automatic means users don’t need to give the hint in the query any more.8) Dynamic means in the compile time, Hive will generate a series of execution flow, each of the execution flow covers one possible situation. (The number of execution path is bounded by the number of join tables.)12) During the execution, Hive will choose the most efficient execution path to run based on input file size.
  7. Let’s take an example: There are 2 tables join together on a join key. Let’s say table x and table y.So during the compile time, it will generate 3 execution paths. First execution path will do the map join by assuming table x is the big table and loading the other tables into memory, which is the table y.The second will also the do the map join by assuming table y is the big table.And finally, 3rd execution path assume both of the table is too large to fit into memory, not feasible for map join, it will run the original common join.Why we are doing this? It’s because we don’t the data size of join table during compile time. Some of the join table may be the intermediate tables, which is generated from some sub query at run time.We only know the data size during execution time.
  8. Let’s see what will happen during the execution time.When task A is finished, we exactly know the data size of each join table.If table X is the big table and the other table is small enough to fit into memory, Hive will run this execution path.However, if none of the tables can be load into memory, Hive will run the original common join execution path.By doing this, Hive can dynamically choose the most efficient execution path to run at execution stage.
  9. Since the local task needs to load the data into memory, it is a very memory intensive task.So hivewill launch this local task in a child jvm, which has the same heap size as the Mapper's.Right now, We have already carefully bounded the input data size of the small tables, but there is still possible that the local task may run out of memory.Sothe query processor will measure the memory usage of the local task very carefully. Once the memory usage of the Local Task is higher than a threshold, this Local Task will abort itself, which means the table is too large fit into memory and map join fails.In this case, the query processor will switch back the original Common Join task as a Backup Task to run, All of them is totally transparent to user.
  10. Let’s discuss the performance bottleneck. Previously, the small table file is the performance bottleneck,However, in the new map join, the distributed cache is the potential performance bottleneckIf you upload large hashtable file into DC, say larger than 30 M, it will really slow down the propagation speed of DC.You will see all the Mappers are launched but they will be in the initialization stage for a long time, which means they are waiting for the push from DC.Right now, the solution is very straightforward and walk around , we compress and archive all the hashtable files into tar file.
  11. We have run several performance benchmark to compare between with the compression and without compression.From the result, we can see the compression can help to improve the performance by 21 % ~ 86 %.Also we can see the larger the input data size is, the more performance we can get by compression, which is very reasonable.The larger the small table is, the large hashtable file it will generated.The larger the big table is, the more mappers will be launched.Both of these 2 factors will contributes to making the DC to be bottleneck problem.And compression can help to solve this situation.
  12. 1) Finally,let’s see the performance comparison between previous common join with the new optimized common join, which is converted map join.2) Because all the tested benchmark is valid to convert into map join. 3) The result shows the join performance will be improved by 57% to 163%, if the join can be converted into map join.
  13. There are several future works to follow. First, when this new optimized Join is running in the cluster, it would be better to know how many join operation is converted in Map Join and how many converted Map Join fails because it runs out of memory.We have already developed the hooks to audit but still need time to deploy in the cluster.Another thing is to set up the number of replications for the compressed hashtable files based on the number of Mappers. Currently, the number of replications for thehashtable files is 3. If we can set the replication number based on number of mappers, it will improve the propagation speed.we manually bounded the table size of small table to be 25M, which may be a little conservative. If we can tune this parameters by sampling the data, we will get more accurate limit of map join and more queries can be convert into map join.Finally,the local task can hold 2M unique key/value in the memory by consuming 1.47G memory space.By optimization to be more memory efficient, the local task can hold more data in memory. So more join queries can be converted into Map Join.