Presto @ Facebook: Past, Present and Future

Presto
Past, Present and Future
Martin Traverso
June 5, 2014

“A good day is when I
can run 6 Hive queries”
— a Facebook data scientist

What is Presto?
Distributed SQL analytics engine
Optimized for low-latency, interactive analysis
ANSI SQL
Extensible

Architecture
Client
Coordinator: Parser/Analyzer, Planner, Scheduler
Coordinator APIs: Metadata API, Data Location API
Worker (x3): Data Stream API

Connectors
Coordinator: Parser/Analyzer, Planner, Scheduler
Worker
Metadata API: Hive, Cassandra, MySQL, JMX, Internal
Data Location API: Hive, Cassandra, MySQL, JMX, Internal
Data Stream API: Hive, Cassandra, MySQL, JMX, Internal
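
The three APIs on this slide can be pictured as three small interfaces that every connector implements. The sketch below is an illustration only: `MetadataApi`, `DataLocationApi`, `DataStreamApi`, `InMemoryConnector` and all method names are hypothetical and are not Presto's actual SPI. It assumes a toy connector that holds its tables in memory and exposes one split per table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interfaces named after the three APIs on the slide.
// They are NOT Presto's real SPI; they only illustrate the division of labor.

/** Coordinator side (analyzer/planner): which tables and columns exist? */
interface MetadataApi {
    List<String> listTables();
    List<String> columnsOf(String table);
}

/** Coordinator side (scheduler): into which pieces (splits) is a table divided? */
interface DataLocationApi {
    List<String> splitsOf(String table);
}

/** Worker side: stream the rows of one split. */
interface DataStreamApi {
    Iterator<Object[]> rows(String split);
}

/** A toy connector backed by in-memory data, with one split per table. */
class InMemoryConnector implements MetadataApi, DataLocationApi, DataStreamApi {
    private final Map<String, List<String>> columns = new LinkedHashMap<>();
    private final Map<String, List<Object[]>> data = new LinkedHashMap<>();

    void addTable(String name, List<String> cols, List<Object[]> rows) {
        columns.put(name, cols);
        data.put(name, rows);
    }

    public List<String> listTables() { return new ArrayList<>(columns.keySet()); }
    public List<String> columnsOf(String table) { return columns.get(table); }
    public List<String> splitsOf(String table) { return Arrays.asList(table + "#0"); }

    public Iterator<Object[]> rows(String split) {
        String table = split.substring(0, split.indexOf('#'));
        return data.get(table).iterator();
    }
}

class ConnectorSketch {
    /** Counts the rows of a table by asking for its splits, then reading each split. */
    static int countRows(InMemoryConnector c, String table) {
        int n = 0;
        for (String split : c.splitsOf(table)) {      // scheduler: locate the data
            Iterator<Object[]> it = c.rows(split);    // worker: stream one split
            while (it.hasNext()) {
                it.next();
                n++;
            }
        }
        return n;
    }

    static InMemoryConnector example() {
        InMemoryConnector c = new InMemoryConnector();
        c.addTable("t", Arrays.asList("k", "a"),
                Arrays.asList(new Object[] {"x", 1L}, new Object[] {"y", 2L}));
        return c;
    }

    public static void main(String[] args) {
        InMemoryConnector c = example();
        System.out.println(c.listTables() + " " + c.columnsOf("t") + " rows=" + countRows(c, "t"));
    }
}
```

In this sketch, adding a new data source means supplying one more class that implements the same three interfaces, which is the kind of extensibility the slide describes.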

Connectors
Hadoop 1.x
Hadoop 2.x
CDH 4
CDH 5
Custom S3 integration for Hadoop
Cassandra
TPC-H

Other extension points
Types
Functions
Operators

What makes Presto fast?
Data in memory during execution
Pipelining and streaming
Very careful coding of inner loops
Efficient flat-memory data structures
Bytecode generation
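
As a rough illustration of the "flat-memory data structures" and "careful inner loops" points, the sketch below keeps a column of values in a single primitive array and scans it in a tight loop. `LongColumn` is a hypothetical class made up for this example; it is not one of Presto's own data structures.

```java
import java.util.Arrays;

// Hypothetical illustration of a "flat" column: values live in one primitive
// array instead of one heap object per value. Not Presto's actual classes.
final class LongColumn {
    private long[] values = new long[16];
    private int size;

    void append(long v) {
        if (size == values.length) {
            values = Arrays.copyOf(values, size * 2);  // grow the single backing array
        }
        values[size++] = v;
    }

    int size() { return size; }

    long get(int position) { return values[position]; }

    /** Tight inner loop: sequential reads over contiguous memory, no boxing. */
    long sum() {
        long total = 0;
        for (int i = 0; i < size; i++) {
            total += values[i];
        }
        return total;
    }
}

class FlatMemorySketch {
    /** Appends 1..n to a column and returns the sum of the column. */
    static long sumOfFirst(int n) {
        LongColumn col = new LongColumn();
        for (long v = 1; v <= n; v++) {
            col.append(v);
        }
        return col.sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfFirst(100));  // prints 5050
    }
}
```

Compared with a `List<Long>`, this layout avoids one object per value and keeps the scan over contiguous memory.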

More SQL features
Structs, Maps and Lists
Views
Scalar subqueries
Features required to run all TPC-DS queries

Execution engine
Huge joins and aggregations
•Hash distributed
•Co-distributed and co-partitioned
•Spill to disk (flash)
Work stealing
Basic task recovery
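
A "hash distributed" join can be sketched as follows: both inputs are partitioned by a hash of the join key, so rows with equal keys land on the same worker, and each worker then runs an ordinary local hash join. This is a minimal single-process sketch under stated assumptions, not Presto's implementation: the class and method names are hypothetical, each "worker" is just a list, and rows are `long[]` pairs of (key, value).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a hash-distributed join; each "worker" is a list here.
class HashDistributedJoinSketch {
    /** Picks the worker for a join key; equal keys always map to the same worker. */
    static int workerFor(long key, int workers) {
        return Math.floorMod(Long.hashCode(key), workers);
    }

    /** Splits rows (join key in column 0) into one bucket per worker. */
    static List<List<long[]>> partition(List<long[]> rows, int workers) {
        List<List<long[]>> buckets = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            buckets.add(new ArrayList<>());
        }
        for (long[] row : rows) {
            buckets.get(workerFor(row[0], workers)).add(row);
        }
        return buckets;
    }

    /** Local hash join on one worker: build on the right side, probe with the left. */
    static List<long[]> localJoin(List<long[]> left, List<long[]> right) {
        Map<Long, List<long[]>> build = new HashMap<>();
        for (long[] r : right) {
            build.computeIfAbsent(r[0], k -> new ArrayList<>()).add(r);
        }
        List<long[]> out = new ArrayList<>();
        for (long[] l : left) {
            List<long[]> matches = build.get(l[0]);
            if (matches == null) {
                continue;  // no right-side row with this key
            }
            for (long[] r : matches) {
                out.add(new long[] {l[0], l[1], r[1]});  // key, left value, right value
            }
        }
        return out;
    }

    /** Partitions both inputs the same way, then joins bucket by bucket. */
    static List<long[]> join(List<long[]> left, List<long[]> right, int workers) {
        List<List<long[]>> l = partition(left, workers);
        List<List<long[]>> r = partition(right, workers);
        List<long[]> out = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            out.addAll(localJoin(l.get(w), r.get(w)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<long[]> left = new ArrayList<>();
        left.add(new long[] {1, 10});
        left.add(new long[] {2, 20});
        List<long[]> right = new ArrayList<>();
        right.add(new long[] {2, 200});
        right.add(new long[] {3, 300});
        System.out.println(join(left, right, 4).size());  // prints 1
    }
}
```

Because no worker needs to hold a whole input, the memory needed for the build side is spread across the cluster, which is what makes very large joins feasible in this scheme.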

ODBC driver
Targeting major BI tools
•Tableau, MicroStrategy and Excel
Support for Windows, Mac and Linux
Entirely open source (Apache License 2.0)

Native store
Stores data directly on worker nodes
Custom data format
Initial use cases
•‘Hot’ data
•‘Live’ data

Open source
Apache License 2.0
Open development
Releases every 1-2 weeks
External contributions welcome!

Presto
http://prestodb.io
github.com/facebook/presto
Martin Traverso
@mtraverso
github.com/martint

Bytecode generation
    while (in.advanceNextPosition()) {
        if (in.getLong(3) >= 100 &&
                in.getLong(3) <= 200 &&
                in.getLong(4) < in.getLong(5)) {

            out.advance();
            in.appendStringTo(0, out);
            out.appendLong(in.getLong(1) * in.getLong(2) / 10);
        }
    }

    SELECT
        k AS c1,
        (a * b) / 10 AS c2
    FROM T
    WHERE
        c BETWEEN 100 AND 200
        AND d < e

    T:
        k varchar,
        a bigint,
        b bigint,
        c bigint,
        d bigint,
        e bigint
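
The slide pairs the query with a loop specialized for it: field indexes 0 to 5 correspond to columns k, a, b, c, d, e of T, so the filter and the projection are inlined with no per-row expression interpretation. To make that loop runnable on its own, the sketch below supplies minimal stand-ins for `in` and `out`. `InputCursor` and `OutputBuilder` are hypothetical classes written for this example; only the method names used inside the loop come from the slide.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the `in` and `out` objects on the slide, so the
// loop can run on its own. Method names mirror the slide; the classes are invented.
final class InputCursor {
    private final Object[][] rows;   // each row: k, a, b, c, d, e
    private int position = -1;

    InputCursor(Object[][] rows) { this.rows = rows; }

    boolean advanceNextPosition() { return ++position < rows.length; }

    long getLong(int field) { return (Long) rows[position][field]; }

    void appendStringTo(int field, OutputBuilder out) {
        out.appendString((String) rows[position][field]);
    }
}

final class OutputBuilder {
    final List<List<Object>> rows = new ArrayList<>();

    void advance() { rows.add(new ArrayList<>()); }             // start a new output row
    void appendString(String v) { rows.get(rows.size() - 1).add(v); }
    void appendLong(long v) { rows.get(rows.size() - 1).add(v); }
}

class FilterAndProjectSketch {
    /** The specialized filter-and-project loop from the slide. */
    static void run(InputCursor in, OutputBuilder out) {
        while (in.advanceNextPosition()) {
            if (in.getLong(3) >= 100 &&                 // c BETWEEN 100 AND 200
                    in.getLong(3) <= 200 &&
                    in.getLong(4) < in.getLong(5)) {    // d < e
                out.advance();
                in.appendStringTo(0, out);                            // k AS c1
                out.appendLong(in.getLong(1) * in.getLong(2) / 10);   // (a * b) / 10 AS c2
            }
        }
    }

    static List<List<Object>> example() {
        Object[][] t = {
            {"x", 4L, 5L, 150L, 1L, 2L},   // passes both predicates
            {"y", 4L, 5L, 250L, 1L, 2L},   // c is outside [100, 200]
            {"z", 6L, 5L, 100L, 3L, 3L},   // d < e is false
        };
        OutputBuilder out = new OutputBuilder();
        run(new InputCursor(t), out);
        return out.rows;
    }

    public static void main(String[] args) {
        System.out.println(example());  // prints [[x, 2]]
    }
}
```

Only the first sample row survives the filter, and its projected value is 4 * 5 / 10 = 2 under integer arithmetic.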
