Presto @ Uber
Uber’s Mission
Transportation as reliable as running water,
everywhere, for everyone
400+ Cities 69 Countries
And growing ...
Agenda
● Data Platform @ Uber
● SQL on Hadoop
● Presto
● Parquet
● Roadmap
Data@Uber in 2015
● Data producers: Kafka, Schemaless, MySQL, Postgres
● Data lands in the Hadoop Distributed File System (HDFS)
● ETL jobs load data from HDFS into several commercial database clusters
● Data consumers: ad hoc queries, reports, machine learning jobs
● Petabyte-scale Hadoop cluster spanning hundreds of nodes
● ~10 TB of data ingested into Hadoop daily
● ~500 raw datasets
Pain Points
● Data is not queryable until it is loaded into the commercial DB
● A single commercial DB cluster is limited to ~32 nodes
● Data in the commercial DB <<< data in HDFS
● Hive on the PB-scale Hadoop warehouse is **SLOW**
SQL On Hadoop
● Data from Kafka, Schemaless, and MySQL/Postgres lands in the Hadoop Distributed File System (HDFS)
● Batch jobs run through Hive
● Interactive queries run through Presto and Janus
● Applications sit on top of both paths
Solution
● Data is not queryable until it is loaded into the commercial DB
○ Run SQL directly on Hadoop
● A single commercial DB cluster is limited to ~32 nodes
○ SQL on Hadoop scales to thousands of machines
● Data in the commercial DB <<< data in HDFS
○ HDFS holds all the data
● Hive on the PB-scale Hadoop warehouse is **SLOW**
○ Try Presto
What is Presto
Distributed SQL engine for Hadoop
Fast
Scalable
ANSI SQL
Open Source
Extensible
Background
● Facebook internal users wanted to run SQL on Hadoop
● Hive in production since 2008
● Needed a faster SQL engine
● 2013: Presto in production at Facebook
● 2014: Presto in production at Netflix
● 2016: Presto in production at Uber
● Presto + Hive = SQL on Hadoop
(Image: F-22 and F-35 fighter aircraft)
How Presto Works
● Client submits a query to the Coordinator
● Coordinator: Parser → Optimizer → Fragmenter → Scheduler
● Workers: Table Scan over Parquet files on the file system → Partial Aggregation
● One Worker runs the Final Aggregation and streams results back to the Client
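The scan → partial aggregation → final aggregation flow above can be sketched in a few lines of Python. This is a toy illustration of the scatter/gather pattern, not Presto's actual operators; the grouped count and the `city` column are made up for the example.

```python
from collections import Counter

def partial_aggregation(rows):
    """Each worker pre-aggregates over its own table-scan split."""
    counts = Counter()
    for row in rows:
        counts[row["city"]] += 1
    return counts

def final_aggregation(partials):
    """One worker merges all partial results into the final answer."""
    total = Counter()
    for p in partials:
        total.update(p)
    return dict(total)

# Two workers, each scanning one split of the table:
split1 = [{"city": "SF"}, {"city": "NYC"}, {"city": "SF"}]
split2 = [{"city": "NYC"}, {"city": "SF"}]

partials = [partial_aggregation(s) for s in (split1, split2)]
assert final_aggregation(partials) == {"SF": 3, "NYC": 2}
```

Because each worker only ships its small pre-aggregated map rather than raw rows, the final aggregation step stays cheap even when the scanned data is large.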
Why Presto is Fast
● Data stays in memory during execution
● Pipelining and streaming
● Columnar storage & execution
● Bytecode generation
○ Inline virtual function calls
○ Inline constants
○ Rewrite inner loops
○ Rewrite type-specific branches
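To make "columnar storage & execution" concrete, here is a toy contrast between row-at-a-time and columnar evaluation. Presto's real operators work on column `Block` objects with generated bytecode; this sketch only shows the access-pattern difference, with made-up columns `a` and `b`.

```python
rows = [{"a": 1, "b": 10}, {"a": 2, "b": 20}, {"a": 3, "b": 30}]

# Row-oriented: every row object is touched, even for unused fields.
def sum_a_rows(rows):
    return sum(row["a"] for row in rows)

# Columnar: each column is stored contiguously; an operator reads
# only the column it needs and runs a tight loop over one array.
columns = {"a": [1, 2, 3], "b": [10, 20, 30]}

def sum_a_columnar(columns):
    total = 0
    for v in columns["a"]:   # tight inner loop over a single column
        total += v
    return total

assert sum_a_rows(rows) == sum_a_columnar(columns) == 6
```

The columnar loop touches only column `a`, which is also what makes the inner-loop rewriting and inlining listed above pay off.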
How Presto Manages Resources
● CPU Management
○ Priority queues
○ Short-running queries get higher priority
● Memory Management
○ Max memory per query per node
○ A query that hits its memory limit fails; the Presto process keeps running
● Concurrency Management
○ Queueing: per-user cap on concurrent running queries
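The per-user concurrency cap can be sketched as a small admission queue. This is a hypothetical illustration of the queueing rule above, not Presto's resource-group implementation; class and method names are invented.

```python
class QueryQueue:
    def __init__(self, max_concurrent_per_user):
        self.max_per_user = max_concurrent_per_user
        self.running = {}   # user -> number of running queries
        self.waiting = []   # queued (user, query) pairs, FIFO

    def submit(self, user, query):
        """Admit immediately if the user is under the cap, else queue."""
        if self.running.get(user, 0) < self.max_per_user:
            self.running[user] = self.running.get(user, 0) + 1
            return "running"
        self.waiting.append((user, query))
        return "queued"

    def finish(self, user):
        """Release a slot, then admit the first eligible waiting query."""
        self.running[user] -= 1
        for i, (u, q) in enumerate(self.waiting):
            if self.running.get(u, 0) < self.max_per_user:
                self.waiting.pop(i)
                self.running[u] = self.running.get(u, 0) + 1
                return q
        return None

q = QueryQueue(max_concurrent_per_user=2)
assert q.submit("alice", "q1") == "running"
assert q.submit("alice", "q2") == "running"
assert q.submit("alice", "q3") == "queued"   # over alice's cap
assert q.submit("bob", "q4") == "running"    # bob has his own budget
assert q.finish("alice") == "q3"             # freed slot admits q3
```

The key property is isolation: one user saturating their own budget cannot starve other users' queries.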
Limitations
● No fault tolerance
○ Applications have to retry if a query fails
● Joins that do not fit in memory
○ The join fails
○ The Presto worker process continues serving other queries
○ Fall back to Hive for such joins
● The Coordinator is a single point of failure
Deployment
● ~200-node Presto cluster
● ~30K queries per day
● Serving ad hoc SQL queries
● Serving real-time applications
In Summary

|  | Commercial Database | Presto | SparkSQL | Hive |
| --- | --- | --- | --- | --- |
| Performance | Fast | Fast | Not as fast as Presto | Not fast |
| Open Source | No | Yes | Yes | Yes |
| Warehouse Size | 100s of TB | PB scale | PB scale | PB scale |
| SQL Support | ANSI SQL | ANSI SQL | HiveQL | HiveQL |
| Nested Schema | No | Yes | Yes | Yes |
| User Defined Functions | Has its own UDFs; third-party GeoSpatial functions available | Has its own built-in functions; GeoSpatial functions implemented | Supports UDFs; third-party GeoSpatial functions available | Supports UDFs; third-party GeoSpatial functions available |
| Memory | Query rejected if it requests more than the memory cap | Cannot handle huge joins if the hash bucket hits the memory cap | Spills to disk for big joins | Spills to disk for big joins |
Parquet
Parquet Improvements
Query: SELECT A, B FROM T WHERE C = 10;
● Predicate Pushdown
○ Column C stats [ min: 5, max: 8 ] → 10 is outside the range → skip this row group
● Dictionary Pushdown
○ Column C stats [ min: 5, max: 20 ] → 10 is in range, but the dictionary page [ 5, 9, 12, 17, 20 ] does not contain 10 → skip this row group
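The two skipping rules can be combined into a single check per row group. This is a simplified sketch of the idea, not the Parquet reader's actual API; field names are illustrative.

```python
def can_skip_row_group(stats, dictionary, value):
    """Decide whether a row group can be skipped for `column = value`."""
    # Predicate pushdown: value outside the [min, max] stats -> skip.
    if value < stats["min"] or value > stats["max"]:
        return True
    # Dictionary pushdown: stats overlap, but the dictionary page lists
    # every distinct value in the row group -- if absent, skip anyway.
    if dictionary is not None and value not in dictionary:
        return True
    return False

# Row group 1: stats [min: 5, max: 8] -> 10 is out of range, skipped.
assert can_skip_row_group({"min": 5, "max": 8}, None, 10)

# Row group 2: stats [min: 5, max: 20] cover 10, but the dictionary
# page [5, 9, 12, 17, 20] has no 10 -> still skipped.
assert can_skip_row_group({"min": 5, "max": 20}, [5, 9, 12, 17, 20], 10)

# A row group whose dictionary does contain 10 must be read.
assert not can_skip_row_group({"min": 5, "max": 20}, [5, 9, 10, 12], 10)
```

Dictionary pushdown is strictly stronger than min/max pruning for equality predicates, since the dictionary enumerates the exact values present.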
Parquet Improvements
Query: SELECT A, B FROM T WHERE C = 10;
● Lazy Reads
○ Read column C first
○ No need to read A and B at all if no row matches C = 10
● Columnar Reads
○ Build Presto blocks for each column
○ Not reading row by row
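The lazy-read idea above can be sketched with in-memory columns: evaluate the filter column first, and only materialize the projected columns for matching positions. This is a toy model; Presto and Parquet apply it per row group, and the column contents here are made up.

```python
columns = {
    "A": [1, 2, 3, 4],
    "B": ["w", "x", "y", "z"],
    "C": [5, 9, 12, 17],
}

def lazy_select(columns, value):
    """SELECT A, B FROM T WHERE C = value, reading C first."""
    matches = [i for i, c in enumerate(columns["C"]) if c == value]
    if not matches:
        return []          # columns A and B are never read at all
    # Columnar read: materialize only the matching positions.
    return [(columns["A"][i], columns["B"][i]) for i in matches]

assert lazy_select(columns, 10) == []        # no match -> A, B untouched
assert lazy_select(columns, 9) == [(2, "x")]
```

When the predicate is selective, this avoids decoding most of the projected columns, which is where the speedup comes from.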
Roadmap
● Schema Evolution
● GeoSpatial SQL support
● Parquet Performance Improvements:
○ Nested Column Pruning
○ Predicate Pushdown & Dictionary Pushdown
○ Lazy Reads & Columnar Reads
Thanks
