A Benchmark Test on Presto, Spark SQL and Hive on Tez for Medium Data
Next Generation Systems Laboratory (次世代システム研究室)
GMO Internet

Introduction
• Data analysis, aggregation, and ETL for GMO Taxel
  – Batch processing is the main workload; real-time requirements are moderate (results within a few minutes to an hour?)
  – Hardware costs need to be kept down (limited memory)
• Big Data SQL engines
  – Hive (on MapReduce)
  – Presto (Facebook)
  – Spark SQL
  – Hive on Tez
  – Impala (Cloudera)
  – Apache Drill (MapR), Tajo (Gruter)
• Benchmark test plan
  – TPC benchmark (heavy)
  – UC Berkeley AMPLab Big Data Benchmark + α
  – https://amplab.cs.berkeley.edu/benchmark/

Introduction (continued)
• Presto
  – In-memory computation
  – Fast response time
• Spark SQL
  – DAG, SchemaRDD
  – YARN
• Hive on Tez
  – MapReduce (sequential) -> DAG (network)
  – Speed + scale (TB to PB)
  – Cost Based Optimizer

Test environment
• Master nodes: 3
• Slave nodes: 5
• Hardware
  – CPU: 4 vCores per node
  – Memory: 16 GB per node
• Software
  – HDP: HDP-2.2.6.0-2800
  – Hadoop: HDFS 2.6.0.2.2
  – YARN: 2.6.0.2.2
  – Presto: Presto 0.86
  – Spark SQL: Spark 1.2.1.2.2
  – Tez: 0.5.2.2.2
  – Hive: 0.14.0.2.2
  – Java: 1.7.0

Test data schema
[Figure: test data schema]

Test data

No.  UserVisits (rows)  UserVisits (bytes)  Rankings (rows)  Rankings (bytes)
D1   15K                approx. 3 MB        1.2K             approx. 80 KB
D2   6.1M               approx. 1 GB        0.72M            approx. 51 MB
D3   61M                approx. 10 GB       7.2M             approx. 500 MB
D4   610M               approx. 100 GB      72M              approx. 6 GB
D5   1.22B              approx. 200 GB      144M             approx. 12 GB

File format: ORC (RCFile was also tested with Presto)
File compression: Snappy

Test SQL

Q1 (Scan query)
    SELECT pageURL, pageRank FROM rankings WHERE pageRank > X

Q2 (Aggregation query)
    SELECT SUBSTR(sourceIP, 1, X), SUM(adRevenue)
    FROM uservisits
    GROUP BY SUBSTR(sourceIP, 1, X)

Q3 (Join query)
    SELECT sourceIP, totalRevenue, avgPageRank
    FROM
      (SELECT sourceIP,
              AVG(pageRank) AS avgPageRank,
              SUM(adRevenue) AS totalRevenue
       FROM Rankings AS R, UserVisits AS UV
       WHERE R.pageURL = UV.destURL
         AND UV.visitDate BETWEEN Date('1980-01-01') AND Date('X')
       GROUP BY UV.sourceIP)
    ORDER BY totalRevenue DESC LIMIT 100

Q4 (Sort query)
    SELECT destURL, adRevenue, visitDate
    FROM UserVisits
    ORDER BY adRevenue DESC LIMIT 100

Q5 (Insert)
    INSERT OVERWRITE TABLE UserVisitsWithHighRevenue
    SELECT sourceIP, destURL, adRevenue, visitDate
    FROM UserVisits
    WHERE adRevenue > X
    ORDER BY visitDate

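Each test query contains a placeholder X whose value the slides do not state. The sketch below shows one way to substitute it before submitting a query to an engine. The parameter values are assumptions borrowed from the first variants of the AMPLab Big Data Benchmark, and the helper name `render` is hypothetical.

```python
from string import Template

# Query templates from the test SQL slide; $X marks the benchmark parameter.
QUERIES = {
    "Q1": Template("SELECT pageURL, pageRank FROM rankings WHERE pageRank > $X"),
    "Q2": Template(
        "SELECT SUBSTR(sourceIP, 1, $X), SUM(adRevenue) FROM uservisits "
        "GROUP BY SUBSTR(sourceIP, 1, $X)"
    ),
}

# Assumed parameter values; the slides do not say which ones were used.
PARAMS = {"Q1": 1000, "Q2": 8}


def render(name, x=None):
    """Return the SQL text of a query with X substituted."""
    value = PARAMS[name] if x is None else x
    return QUERIES[name].substitute(X=value)


if __name__ == "__main__":
    for name in QUERIES:
        print(name, "->", render(name))
```

Keeping the templates separate from the parameter values makes it easy to rerun the same query text with a different selectivity on every engine.
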
Performance Overview

Mean query time in seconds.

Data case     Mean        Presto  Spark SQL  Hive on Tez  Description
Small-Medium  Geometric   2.2     17.2       6.7          D1 to D3, Q1 to Q4
              Arithmetic  5.2     32.1       15.8
Medium-Large  Geometric   11.3    34.5       6.1          D3 to D5, excluding Q3 and Q5
              Arithmetic  22.8    64.1       69.0
Large         Geometric   36.5    79.1       11.8         D5, excluding Q3 and Q5
              Arithmetic  46.1    128.4      129.3
Total         Geometric   4.4     26.3       22.4         D1 to D5, Q1 to Q5*
              Arithmetic  14.5    49.5       437.7

* Number of successful queries: Presto = 17, Spark SQL = 21, Hive on Tez = 25

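The overview reports both an arithmetic and a geometric mean. As a minimal sketch of how the two differ, the code below recomputes the "Large" row from the per-query D5 times for Q1, Q2, and Q4 reported later in the deck. It reproduces the arithmetic means and comes within rounding of the geometric ones. The function names are illustrative, not the authors' tooling.

```python
import math

# Query times in seconds on data set D5 for Q1, Q2 and Q4 (the "Large" case).
LARGE = {
    "Presto": [12.4, 57.6, 68.4],
    "Spark SQL": [16.5, 246.8, 121.8],
}


def arithmetic_mean(xs):
    """Sum of the values divided by their count."""
    return math.fsum(xs) / len(xs)


def geometric_mean(xs):
    """n-th root of the product, computed via logarithms; needs positive values."""
    return math.exp(math.fsum(math.log(x) for x in xs) / len(xs))


if __name__ == "__main__":
    for engine, times in LARGE.items():
        print(f"{engine}: arithmetic {arithmetic_mean(times):.1f} s, "
              f"geometric {geometric_mean(times):.1f} s")
```

Hive on Tez is left out because its Q1 time is printed as 0.0 s, so its geometric mean cannot be recomputed from the rounded figures. A near-zero time pulls a geometric mean down sharply, which is why Hive on Tez shows 11.8 s (geometric) against 129.3 s (arithmetic) for the same three queries.
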
Performance Overview: speedup relative to Hive on Tez (arithmetic mean)

[Chart: x-axis data size, y-axis speedup factor]

             Small-Medium  Medium-Large  Large  Total
Presto       3.0x          3.0x          2.8x   30.2x
Spark SQL    0.5x          1.1x          1.0x   8.8x
Hive on Tez  1.0x          1.0x          1.0x   1.0x

Performance Overview: speedup relative to Hive on Tez (geometric mean)

[Chart: x-axis data size, y-axis speedup factor]

             Small-Medium  Medium-Large  Large  Total
Presto       3.0x          0.5x          0.3x   5.1x
Spark SQL    0.4x          0.2x          0.1x   0.9x
Hive on Tez  1.0x          1.0x          1.0x   1.0x

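The speedup factors in the two charts are the Hive on Tez mean divided by each engine's mean. A minimal sketch that recomputes them from the figures in the overview table (the function name is illustrative):

```python
# Mean query times in seconds from the overview table,
# ordered (Presto, Spark SQL, Hive on Tez).
ARITHMETIC = {
    "Small-Medium": (5.2, 32.1, 15.8),
    "Medium-Large": (22.8, 64.1, 69.0),
    "Large": (46.1, 128.4, 129.3),
    "Total": (14.5, 49.5, 437.7),
}
GEOMETRIC = {
    "Small-Medium": (2.2, 17.2, 6.7),
    "Medium-Large": (11.3, 34.5, 6.1),
    "Large": (36.5, 79.1, 11.8),
    "Total": (4.4, 26.3, 22.4),
}


def speedup_vs_hive(means):
    """Map each data case to (Presto, Spark SQL) speedups over Hive on Tez."""
    return {
        case: (hive / presto, hive / spark)
        for case, (presto, spark, hive) in means.items()
    }


if __name__ == "__main__":
    for label, table in (("arithmetic", ARITHMETIC), ("geometric", GEOMETRIC)):
        for case, (presto_x, spark_x) in speedup_vs_hive(table).items():
            print(f"{label:10s} {case:13s} Presto {presto_x:.1f}x  Spark SQL {spark_x:.1f}x")
```

A factor above 1.0 means the engine is faster than Hive on Tez, below 1.0 slower. The two means rank the engines differently on the larger data sets, so the choice of mean matters when reading the charts.
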
Query time by data size: D1 (15K records)

Execution time in seconds (-: no result).

             Q1   Q2    Q3    Q4   Q5
Spark SQL    5.9  15.3  15.9  6.5  16.6
Presto       3.2  2.4   1.0   0.2  -
Hive on Tez  2.0  8.7   7.4   6.0  9.8

Query time by data size: D2 (6.1M records)

Execution time in seconds (-: no result).

             Q1   Q2    Q3    Q4    Q5
Spark SQL    7.2  28.0  50.9  11.3  35.1
Presto       0.3  3.1   29.2  0.8   -
Hive on Tez  1.9  17.3  18.1  11.6  44.5

Query time by data size: D3 (61M records)

Execution time in seconds (-: no result).

             Q1   Q2    Q3     Q4    Q5
Spark SQL    7.3  33.5  189.9  13.5  80.6
Presto       3.5  7.2   -      6.5   -
Hive on Tez  0.0  29.0  72.4   15.6  372.2

Query time by data size: D4 (610M records)

Execution time in seconds (-: no result).

             Q1    Q2     Q3     Q4    Q5
Spark SQL    10.9  75.9   -      51.0  -
Presto       0.6   28.6   -      20.8  -
Hive on Tez  0.0   129.7  539.4  58.8  2828.4

Query time by data size: D5 (1.2B records)

Execution time in seconds (-: no result).

             Q1    Q2     Q3     Q4     Q5
Spark SQL    16.5  246.8  -      121.8  -
Presto       12.4  57.6   -      68.4   -
Hive on Tez  0.0   248.8  932.8  139.1  5449.6

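The per-data-set times can be cross-checked against the overview: counting the queries that produced a result gives the success counts (17, 21, 25), and averaging them gives the "Total" arithmetic means. The sketch below does this. Placing the missing results in the Q3 and Q5 columns is an assumption based on the overview note that Q3 and Q5 are excluded for the larger data sets; it does not affect the counts or the means.

```python
import math

# Query times in seconds per data set, ordered Q1..Q5; None = no result.
# Which column a missing result belongs to is an assumption (see lead-in).
RESULTS = {
    "Presto": {
        "D1": [3.2, 2.4, 1.0, 0.2, None],
        "D2": [0.3, 3.1, 29.2, 0.8, None],
        "D3": [3.5, 7.2, None, 6.5, None],
        "D4": [0.6, 28.6, None, 20.8, None],
        "D5": [12.4, 57.6, None, 68.4, None],
    },
    "Spark SQL": {
        "D1": [5.9, 15.3, 15.9, 6.5, 16.6],
        "D2": [7.2, 28.0, 50.9, 11.3, 35.1],
        "D3": [7.3, 33.5, 189.9, 13.5, 80.6],
        "D4": [10.9, 75.9, None, 51.0, None],
        "D5": [16.5, 246.8, None, 121.8, None],
    },
    "Hive on Tez": {
        "D1": [2.0, 8.7, 7.4, 6.0, 9.8],
        "D2": [1.9, 17.3, 18.1, 11.6, 44.5],
        "D3": [0.0, 29.0, 72.4, 15.6, 372.2],
        "D4": [0.0, 129.7, 539.4, 58.8, 2828.4],
        "D5": [0.0, 248.8, 932.8, 139.1, 5449.6],
    },
}


def succeeded(engine):
    """Flatten one engine's results into the list of times that exist."""
    return [t for row in RESULTS[engine].values() for t in row if t is not None]


def summary(engine):
    """Return (number of successful queries, arithmetic mean in seconds)."""
    times = succeeded(engine)
    return len(times), math.fsum(times) / len(times)


if __name__ == "__main__":
    for engine in RESULTS:
        count, mean = summary(engine)
        print(f"{engine}: {count} successful queries, arithmetic mean {mean:.1f} s")
```

Because each engine's mean is taken only over the queries it completed, the "Total" figures compare different query sets: Hive on Tez is charged for the slow Q3 and Q5 runs on D4 and D5 that the other two engines did not finish.
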
Performance over Memory & Nodes

[Chart: execution time (sec) with 2 GB and 4 GB of memory, and the resulting time reduction]

                 Presto  Spark SQL  Hive on Tez
2 GB (sec)       5.9     38.8       12.7
4 GB (sec)       4.8     19.3       12.5
Time reduction   18%     50%        2%

* For Spark SQL, both the number of nodes and the memory were doubled.

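The time reduction shown for each engine is the share of execution time saved when moving from the 2 GB to the 4 GB setting. A minimal sketch of that arithmetic, using the figures from the chart (the function name is illustrative):

```python
# Execution time in seconds: (with 2 GB, with 4 GB).
TIMES = {
    "Presto": (5.9, 4.8),
    "Spark SQL": (38.8, 19.3),
    "Hive on Tez": (12.7, 12.5),
}


def reduction(before, after):
    """Fraction of execution time saved: (before - after) / before."""
    return (before - after) / before


if __name__ == "__main__":
    for engine, (t_2gb, t_4gb) in TIMES.items():
        print(f"{engine}: {reduction(t_2gb, t_4gb):.1%} shorter with 4 GB")
```

A 50% reduction means the run took half as long, that is, processing speed doubled. The recomputed values agree with the chart to within about one percentage point, which suggests the chart was drawn from unrounded timings.
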
ORC vs RCFile over Presto

ORC tables are about 5 times faster than RCFile tables.

[Chart: execution time (sec) by data set]

         D2    D3    D4
RCFile   10.1  46.2  63.5
ORC      8.4   5.7   10.7

* Cloudera used RCFile when benchmarking Impala against Presto!
http://blog.cloudera.com/blog/2014/09/new-benchmarks-for-sql-on-hadoop-impala-1-4-widens-the-performance-gap/
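
The "about 5 times" figure can be related to the table by dividing the RCFile time by the ORC time for each data set. The sketch below does so and averages the ratios. Averaging the per-data-set ratios is one reading that is consistent with the headline, not necessarily how the authors derived it.

```python
# Presto execution time in seconds by file format and data set.
RCFILE = {"D2": 10.1, "D3": 46.2, "D4": 63.5}
ORC = {"D2": 8.4, "D3": 5.7, "D4": 10.7}


def orc_speedups():
    """Return how many times faster ORC is than RCFile, per data set."""
    return {dataset: RCFILE[dataset] / ORC[dataset] for dataset in RCFILE}


def mean_speedup():
    """Arithmetic mean of the per-data-set speedups."""
    ratios = orc_speedups()
    return sum(ratios.values()) / len(ratios)


if __name__ == "__main__":
    for dataset, ratio in orc_speedups().items():
        print(f"{dataset}: ORC is {ratio:.1f}x faster")
    print(f"mean: {mean_speedup():.1f}x")
```

The gain is small on D2 and much larger on D3 and D4, so the advantage of ORC grows with data size in this test.
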
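
Conclusion
• Presto
  – OOM errors were frequent and worker nodes crashed
  – Suited to cases where speed is the priority and memory is plentiful
  – INSERT OVERWRITE is not supported
• Spark SQL
  – OOM errors occurred (tasks were retried on other nodes)
  – Slowest of the three
  – Scales well (doubling the resources doubled the processing speed)
• Hive on Tez
  – Slower than Presto, but no OOM errors occurred (low memory requirements)
  – The only engine that ran every query successfully and stably, even on the Large data
  – Cost Based Optimizer
  – Scan queries are unusually fast, but Insert is unusually slow (tuning needed?)
• Runs on YARN: Presto no, Spark SQL yes, Hive on Tez yes
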
Conclusion: Tuning (Basic)
• Presto
  – task.max-memory
  – task.shard.max-threads (default: number of CPU cores * 4)
  – JVM option: -Xmx4G
• Spark SQL
  – executor-memory, executor-cores, num-executors
  – spark.sql.shuffle.partitions, spark.sql.codegen
• Hive on Tez
  – Compute stats
  – hive.cbo.enable, hive.compute.query.using.stats, etc.
  – tez.am.resource.memory.mb, tez.am.java.opts

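As a rough illustration of how the Hive on Tez items might be applied, the sketch below builds a session preamble of `SET` statements followed by `ANALYZE TABLE ... COMPUTE STATISTICS` for each table. Only the property names come from the slide. The values, the table list, and the helper name `hive_preamble` are assumptions for illustration, not the settings used in the benchmark.

```python
# Property names from the tuning slide; the values are hypothetical.
HIVE_SETTINGS = {
    "hive.cbo.enable": "true",
    "hive.compute.query.using.stats": "true",
    "tez.am.resource.memory.mb": "2048",
}

# Tables of the benchmark data set.
TABLES = ["rankings", "uservisits"]


def hive_preamble(settings, tables):
    """Build HiveQL that applies the settings and then gathers table statistics."""
    lines = [f"SET {key}={value};" for key, value in settings.items()]
    lines += [f"ANALYZE TABLE {table} COMPUTE STATISTICS;" for table in tables]
    return "\n".join(lines)


if __name__ == "__main__":
    print(hive_preamble(HIVE_SETTINGS, TABLES))
```

The cost based optimizer relies on table statistics, so gathering them after the data is loaded and before the queries run is the step the slide refers to as "Compute stats".
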
• Tuning references
– https://streever.atlassian.net/wiki/display/HADOOP/Hive+Performance+Tips
– https://spark.apache.org/docs/latest/sql-programming-guide.html
– https://prestodb.io/docs/current/installation/deployment.html
– http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.1/bk_installing_manually_book/content/rpm-chap1-11.html
• Similar benchmark tests
– http://blog.cloudera.com/blog/2014/09/new-benchmarks-for-sql-on-hadoop-impala-1-4-widens-the-performance-gap/
– https://amplab.cs.berkeley.edu/benchmark/
– http://www.slideshare.net/hadoopxnttdata/sql-on-hadoop-201411
