Tez, An Introduction

Alan F. Gates
Founder & Architect
@alanfgates

February 2014 HUG : Introduction to Tez

  1. Tez, An Introduction
     Alan F. Gates, Founder & Architect, @alanfgates
  2. In The Beginning of Hadoop… there was MapReduce
     –It could handle data sizes way beyond those of its competitors
     –It was resilient in the face of failure
     –It made it easy for users to bring their code and algorithms to the data (i.e. free to program in Java instead of just SQL)
     © 2014 Hortonworks
  3. But, It Was Too Low Level
  4. But, It Was Too Rigid
  5. But, It Was Batch
  6. YARN to the Rescue
  7. Why Tez? Enable Data Processing In Many Tools
     •An execution engine that can be used by Hive, Pig, Cascading, and others
     •Right now SQL on Hadoop is hot, and we want to enable that
     •But there is also a lot else to be done in Hadoop (machine learning, ETL, graph processing, etc.), and we want to open up our work to those groups as well
  8. Why Tez? Span Batch and Interactive
     •It’s hard for customers to use different tools depending on their data size
     •It’s hard for applications like Hive to use different back end engines depending on the inputs and outputs
  9. Why Tez? Preserve MapReduce Experience
     •MapReduce represents engineer-centuries of work
     •Much has been learned (mostly the hard way) about scale and resiliency
     •We are not excited to reinvent those wheels; we would rather rebuild the vehicle on top of them