Presto @ Netflix: Interactive Queries
at Petabyte Scale
Nezih Yigitbasi and Zhenxiao Luo
Big Data Platform
Outline
» Big data platform @ Netflix
» Why we love Presto?
» Our contributions
» What are we working on?
» What else we need?
Our Data Pipeline
[Diagram: event data flows from cloud apps through Suro and Ursula into S3 with ~15m latency; dimension data is extracted daily from Cassandra SSTables by Aegisthus into S3.]
Big Data Platform Architecture
[Diagram: clients reach the data warehouse on S3 through gateways, services, and tools; multiple clusters (query, prod, test, bonus prod) run in a VPC against the single warehouse.]
Our Use Cases
» Batch jobs (Pig, Hive)
» ETL jobs
» reporting and other analysis
» Ad-hoc queries
» interactive data exploration
» Looked at Impala, Redshift, Spark, and Presto
Presto @ Netflix
Deployment
» Presto v0.86
» 1 coordinator (r3.4xlarge)
» 250 workers (m2.4xlarge)
Tooling
» presto-cli, Python, R, BI tools (ODBC/JDBC), etc.
» Atlas/Suro for monitoring/logging
Numbers
» ~2.5K queries/day against our 10PB Hive DW on S3
» 230+ Presto users out of 300+ platform users
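The "Python" entry under tooling usually means a thin client over Presto's HTTP protocol: a query starts with a `POST` to the coordinator's `/v1/statement` endpoint carrying `X-Presto-*` headers, and the JSON reply contains a `nextUri` to poll for results. A minimal sketch of building that initial request — the host, user, and query here are placeholders, not our actual deployment:

```python
import urllib.request

def build_statement_request(host, port, user, catalog, schema, sql):
    """Build the initial HTTP request for a Presto query.
    The coordinator replies with JSON containing a nextUri to poll."""
    url = f"http://{host}:{port}/v1/statement"
    headers = {
        "X-Presto-User": user,
        "X-Presto-Catalog": catalog,
        "X-Presto-Schema": schema,
    }
    return urllib.request.Request(url, data=sql.encode("utf-8"),
                                  headers=headers, method="POST")

req = build_statement_request("coordinator.example.com", 8080,
                              "nezih", "hive", "default",
                              "SELECT count(*) FROM events")
print(req.full_url)  # http://coordinator.example.com:8080/v1/statement
```

A real client would then loop on the returned `nextUri` until the query reaches a terminal state; drivers like the ODBC/JDBC ones above wrap the same protocol.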
Why we love Presto?
» Open source
» Fast
» Scalable
» Works well on AWS
» Good integration with the Hadoop stack
» ANSI SQL
Our Contributions
24 open PRs, 60+ commits
» S3 file system
» multipart upload, IAM roles, retries, monitoring, etc.
» Functions for complex types
» Parquet
» name/index-based access, type coercion, etc.
» Query optimization
» Various other bug fixes
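The retry part of the S3 file-system work (the notes mention exponential backoff around the AWS SDK) boils down to retrying transient failures with growing, jittered delays. A rough sketch — `flaky_get` and the delay parameters are invented for illustration, not Presto's actual code:

```python
import random
import time

def with_backoff(op, max_attempts=5, base_delay=0.1, max_delay=2.0):
    """Retry a transient-failure-prone operation (e.g., an S3 GET)
    with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return op()
        except IOError:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the error
            # delay doubles each attempt, capped, with random jitter
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(delay * random.random())

# Demo: an "S3 read" that fails twice, then succeeds.
calls = {"n": 0}
def flaky_get():
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("503 Slow Down")
    return b"object bytes"

print(with_backoff(flaky_get))  # b'object bytes', after 2 retries
```

Jitter matters in a 250-worker cluster: without it, workers that hit the same throttling event retry in lockstep and hammer S3 again simultaneously.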
What are we working on?
Parquet Optimizations
» Vectorized reader*: read based on column vectors
» Predicate pushdown: use statistics to skip data
» Lazy load: postpone loading the data until needed
» Lazy materialization: postpone decoding the data until needed
* PARQUET-
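The predicate-pushdown bullet — using statistics to skip data — can be illustrated with a toy reader: each "row group" carries min/max statistics, so a filter can discard whole groups without ever decoding their values. Everything below (the data, the group layout, the function name) is made up for the sketch:

```python
# Toy model of Parquet-style predicate pushdown: min/max stats per
# row group let a filter like col > 95 skip entire groups.

row_groups = [
    {"min": 0,  "max": 49,  "values": list(range(0, 50))},
    {"min": 50, "max": 89,  "values": list(range(50, 90))},
    {"min": 90, "max": 120, "values": list(range(90, 121))},
]

def scan_greater_than(groups, threshold):
    out, skipped = [], 0
    for g in groups:
        if g["max"] <= threshold:      # stats prove no row qualifies
            skipped += 1               # skip: never touch the values
            continue
        out.extend(v for v in g["values"] if v > threshold)
    return out, skipped

rows, skipped = scan_greater_than(row_groups, 95)
print(len(rows), skipped)  # 25 rows returned, 2 groups skipped
```

Lazy load and lazy materialization extend the same idea within a group: fetch and decode a column chunk only once some row in the group is known to survive the filter.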
Netflix Integration
» BI tools integration
» ODBC driver, Tableau web connector, etc.
» Better monitoring
» Ganglia ⟶ Atlas
» Data lineage
» Presto ⟶ Suro ⟶ Charlotte
What else we need?
» Graceful cluster shrink
» Better resource management
» Dynamic type coercion for all file formats
» Support for more Hive types (e.g., decimal)
» Predictable metastore cache behavior
» Big table joins similar to Hive
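"Predictable metastore cache behavior" is about knowing exactly when cached table/partition metadata expires and is re-fetched. A minimal TTL-cache sketch of that idea — the loader, TTL value, and injectable clock are all invented for illustration, not Presto's metastore cache:

```python
import time

class TTLCache:
    """Cache lookups with a fixed time-to-live, so stale table/partition
    metadata is re-fetched predictably after `ttl` seconds."""
    def __init__(self, loader, ttl=60.0, clock=time.monotonic):
        self.loader = loader        # e.g., a Hive metastore fetch
        self.ttl = ttl
        self.clock = clock          # injectable for testing
        self._entries = {}          # key -> (expires_at, value)

    def get(self, key):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]         # fresh hit
        value = self.loader(key)    # miss or expired: reload
        self._entries[key] = (now + self.ttl, value)
        return value

# Demo with a fake clock and a counting loader.
fake_now = [0.0]
loads = []
cache = TTLCache(lambda k: loads.append(k) or f"meta:{k}",
                 ttl=60.0, clock=lambda: fake_now[0])
cache.get("events")          # miss -> load
cache.get("events")          # fresh -> cached
fake_now[0] = 61.0
cache.get("events")          # expired -> reload
print(len(loads))  # 2
```

A fixed TTL trades some staleness for predictability; the unpredictable case is a cache whose eviction depends on memory pressure or access patterns, where two identical queries can see different metadata.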
THANK YOU

Presto Meetup Talk @ FB (03/19/15)


Editor's Notes

  • #4 Data comes from apps/services. Event data: ~200B events — app logs, user activity (search events, movie-detail clicks from the website, etc.), and system operational data. Ursula demultiplexes the events into event types (~150 event types right now); the latency of this Ursula pipeline is 15m. Dimension data: subscriber data. Aegisthus extracts data from Cassandra, the online backing store for Netflix, and writes it to S3.
  • #5 Mention that we have a single DW on S3 and spin up multiple clusters. Little perf difference on S3 vs. HDFS, as we are mostly CPU bound. http://netflix.github.io/ — Sting: reporting; Charlotte: lineage.
  • #6 Impala: no S3 support. Spark loads all data rather than streaming it, plus stability issues at the time — it couldn't even handle an hour's worth of data (~2013); Spark SQL recently graduated from alpha with the Spark 1.3 release (https://spark.apache.org/releases/spark-release-1-3-0.html). Redshift: need to copy data from S3 into Redshift.
  • #7 r3.4xlarge and m2.4xlarge are both memory-optimized instance types; m2 is a previous-generation type. 5PB of our 10PB Hive DW is in Parquet format.
  • #8 Single warehouse on S3; spin up multiple test/prod Presto clusters and query live data, etc.
  • #9 S3 FS: exponential backoff, exposed various configs for the AWS SDK, multipart upload, IAM roles, and monitoring (PrestoS3FileSystem and the AWS SDK). Better tooling/community support for Parquet; good integration with existing tools (Hive, Spark, etc.). Several bug fixes and new functions to manipulate complex types, closing the gap between Hive and Presto. DDL: alter/create table. Optimization: (2085) Rewrite Single Distinct Aggregation into GroupBy, and (1937) Optimize joins with similar subqueries. Complex types — array: contains, concat, sort; map: map_agg and map constructors, map_keys, map_values, etc. — bridge the gap between Hive and Presto.
  • #11 We log queries to our internal data pipeline (Suro), and another internal tool (Charlotte) analyzes data lineage.
  • #12 We are pushing reporting to Presto with our Tableau/MS work, but not ETL → needs monitoring and scheduling improvements. Presto's distributed join is still memory-limited, as there are no spills. Hive decimal type: https://github.com/facebook/presto/issues/2417 → at least be able to read it; still open.