Presto@Netflix Presto Meetup 03-19-15

Presto @ Netflix: Interactive Queries
at Petabyte Scale
Nezih Yigitbasi and Zhenxiao Luo
Big Data Platform

Outline
» Big data platform @ Netflix
» Why we love Presto?
» Our contributions
» What are we working on?
» What else we need?

Cloud
Apps
S3
Suro Ursula
SSTable
s
Cassandra Aegisthus
Event Data
15m
Daily
Dimension Data
Our Data Pipeline

Data
Warehouse
Service
Tool
s
Gateways
Big Data Platform Architecture
Prod
Clients
Clusters
VPCQuery Prod TestBonusProd

» Batch jobs (Pig, Hive)
» ETL jobs
» reporting and other analysis
» Ad-hoc queries
» interactive data exploration
» Looked at Impala, Redshift, Spark, and Presto
Our Use Cases

Deployment
» v 0.86
» 1 coordinator (r3.4xlarge)
» 250 workers (m2.4xlarge)
Tooling
Numbers
» ~2.5K queries/day against our 10PB Hive DW on S3
» 230+ Presto users out of 300+ platform users
» presto-cli, Python, R,
BI tools (ODBC/JDBC), etc.
» Atlas/Suro for monitoring/logging
Presto @ Netflix

Why we love Presto?
» Open source
» Fast
» Scalable
» Works well on AWS
» Good integration with the Hadoop stack
» ANSI SQL

Our Contributions
24 open PRs, 60+ commits
» S3 file system
» multipart upload, IAM roles, retries, monitoring, etc.
» Functions for complex types
» Parquet
» name/index-based access, type coercion, etc.
» Query optimization
» Various other bug fixes

» Vectorized reader* Read based on column vectors
» Predicate pushdown Use statistics to skip data
» Lazy load Postpone loading the data until needed
» Lazy materialization Postpone decoding the data until needed
What are we Working on?
Parquet Optimizations
* PARQUET-

Netflix Integration
» BI tools integration
» ODBC driver, Tableau web connector, etc.
» Better monitoring
» Ganglia ⟶ Atlas
» Data lineage
» Presto ⟶ Suro ⟶ Charlotte

» Graceful cluster shrink
» Better resource management
» Dynamic type coercion for all file formats
» Support for more Hive types (e.g., decimal)
» Predictable metastore cache behavior
» Big table joins similar to Hive
What else we need?

Presto@Netflix Presto Meetup 03-19-15

More Related Content

What's hot

Viewers also liked

Similar to Presto@Netflix Presto Meetup 03-19-15

Recently uploaded

Presto@Netflix Presto Meetup 03-19-15

Editor's Notes