Page1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Bay Area Hive Contributor Meetup
16-Nov-2015
LLAP: Live Long and Process!
Sub-Second Hive with LLAP
Sub-second:
• LLAP: Persistent server to instantly execute SQL queries.
• Caches hottest data in RAM.
• Overcomes latencies associated with Hive on Tez or Hive on Spark.
SQL Compatibility:
• 100% Compatible with SQL.
• Compatible with existing tools (BI, ETL, etc.)
Security:
• Security via HiveServer2.
• Integrates with Apache Ranger.
[Diagram: Hive SQL, HiveServer2, and LLAP Servers (1 per Hadoop node); each of the three Hadoop nodes runs an LLAP Server with a Vector Cache.]
LLAP
Failure Tolerance
Concurrency & Pre-emption
ACID Transactions
No MPP Hotspots
Data overflow to disk
Elastic scale up/down
YARN native application
[Diagram: Hive SQL, HiveServer2, and LLAP Servers (1 per Hadoop node); each of the three Hadoop nodes runs an LLAP Server with a Vector Cache.]
TPC-DS Query55
-- monthly sales by brand manager
select i_brand_id brand_id, i_brand brand,
       sum(ss_ext_sales_price) ext_price
from date_dim, store_sales, item
where date_dim.d_date_sk = store_sales.ss_sold_date_sk
  and store_sales.ss_item_sk = item.i_item_sk
  and i_manager_id = ${RANDOM_MANAGER}
  and d_moy = ${RANDOM_MONTH}
  and d_year = ${RANDOM_YEAR}
group by i_brand, i_brand_id
order by ext_price desc, i_brand_id
limit 100;
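The query above is a template: the three ${...} placeholders have to be filled in before each run. The Python sketch below shows one way a benchmark driver might do that with the standard library's string.Template, which uses the same ${NAME} syntax. The function name render_query55 and the value ranges are assumptions made for illustration; they are not taken from the talk.

```python
import random
from string import Template

# TPC-DS Query 55 as shown on the slide, with its three placeholders.
QUERY55 = Template("""\
select i_brand_id brand_id, i_brand brand,
       sum(ss_ext_sales_price) ext_price
from date_dim, store_sales, item
where date_dim.d_date_sk = store_sales.ss_sold_date_sk
  and store_sales.ss_item_sk = item.i_item_sk
  and i_manager_id = ${RANDOM_MANAGER}
  and d_moy = ${RANDOM_MONTH}
  and d_year = ${RANDOM_YEAR}
group by i_brand, i_brand_id
order by ext_price desc, i_brand_id
limit 100
""")


def render_query55(rng):
    """Return Query 55 with the placeholders replaced by random values.

    The ranges below are illustrative assumptions, not values from the talk.
    """
    params = {
        "RANDOM_MANAGER": rng.randint(1, 100),    # hypothetical manager id range
        "RANDOM_MONTH": rng.randint(1, 12),       # calendar month
        "RANDOM_YEAR": rng.randint(1998, 2002),   # hypothetical year range
    }
    # substitute() raises KeyError if any placeholder is left unfilled.
    return QUERY55.substitute(params)


if __name__ == "__main__":
    # A seeded generator makes a sequence of benchmark queries reproducible.
    print(render_query55(random.Random(55)))
```

Passing the random generator in as an argument keeps the run reproducible: the same seed always yields the same sequence of queries.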
LLAP demo!
[Chart: TPC-DS Q55 @ 10 TB scale (LLAP) x 256 runs. Query latency (ms, in thousands, axis 0 to 7) against concurrency (1, 2, 4, 8). Series: 25th, 50th, 75th, and 100th percentile.]
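The chart reports latency percentiles over repeated runs. As a minimal sketch of how such a summary could be computed from raw per-run timings, the Python below uses the nearest-rank definition of a percentile. The talk does not say which percentile method was used, so the method and the helper names percentile and summarize are assumptions, and the sample values in the usage example are made up purely to show the call.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100.

    Returns the smallest sample such that at least pct percent of all
    samples are less than or equal to it.
    """
    if not samples:
        raise ValueError("need at least one sample")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(latencies_ms):
    """Return the four percentiles named in the chart legend."""
    return {p: percentile(latencies_ms, p) for p in (25, 50, 75, 100)}


if __name__ == "__main__":
    # Made-up timings in milliseconds, for illustration only.
    print(summarize([400, 100, 300, 200]))
```

Under this definition the 100th percentile is simply the slowest run.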
[Chart: TPC-DS Q55 (@200 GB) - 1000 runs. Bars for Median and 95th Percentile; axes in milliseconds (0 to 3500) and rows scanned in millions (0 to 35). Series: LLAP Execution, Compile, DAG Build, Tez Client, Tez AM, Rows Scanned.]
[Data labels as extracted: LLAP Execution 610 and 900; Compile 762 and 1054; 25 million rows and 32 million rows.]
[Chart: TPC-DS Q55 (@200 GB) - Latency Fractions across 1000 runs. Share of latency (0% to 100%) per run (axis 1 to 1001) for LLAP Execution, DAG Build, Tez Client, Tez AM, and Compile.]