Alan Gates, Hortonworks_Hadoop&SQL
- 1. Stinger Overview
•An initiative, not a project or product
•Includes changes to Hive and a new project Tez
•Two main goals
–Improve Hive performance 100x over Hive 0.10
–Extend Hive SQL to include features needed for analytics
•Hive will support:
–BI tools connecting to Hadoop
–Analysts performing ad-hoc, interactive queries
–Still excellent at the large batch jobs it is used for today
© 2013 Hortonworks
- 2. Stinger Mileposts
1. Improve existing tools & preserve investments
2. Enable Hive to support interactive workloads
Stinger Phase 1 (released in Hive 0.11)
•Reduce # of MR jobs
•SQL analytics
•ORCFile format
Stinger Phase 2 (current work)
•YARN resource management
•Hive on Apache Tez
•Remove startup latency
•Vectorized operators
Stinger Phase 3 (roadmap)
•Buffer cache
•Cost-based optimizer
- 3. Performance Trajectory
Query 27 Speedup
•Hive 10 Text: 1X
•Hive 10 RC: 2X
•Hive 11 RC: 12X
•Hive 11 ORC: 11X
•Hive 11 CP ORC, Tez…: 21X
Query 82 Speedup
•Hive 10 Text: 1X
•Hive 10 RC: 14X
•Hive 11 RC: 44X
•Hive 11 ORC: 57X
•Hive 11 CP ORC, Tez: 78X
Editor's Notes
- Speedup (y-axis) as a ratio to Hive 10 Text. Bigger is better.
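
The speedup ratio described in the note above can be sketched in a few lines of Python. This is only an illustration of the arithmetic (baseline runtime divided by candidate runtime). The runtimes below are hypothetical placeholders, not measurements from the slides, and the function name `speedup` is an assumption made for this example.

```python
def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Return how many times faster the candidate ran than the baseline."""
    if baseline_seconds <= 0 or candidate_seconds <= 0:
        raise ValueError("runtimes must be positive")
    return baseline_seconds / candidate_seconds


# Hypothetical runtimes in seconds, chosen only to illustrate the ratio.
# They are NOT the measurements behind the charts.
runtimes = {
    "Hive 10 Text": 1200.0,  # baseline configuration
    "Hive 10 RC": 600.0,
    "Hive 11 ORC": 100.0,
}

baseline = runtimes["Hive 10 Text"]
speedups = {name: speedup(baseline, t) for name, t in runtimes.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.0f}X")
```

By construction the baseline always reports 1X, which is why each chart starts at 1X for Hive 10 Text.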