Enabling Presto to Handle
Massive Scale at Lightning Speed
Fast and Scalable Data Processing
Raunaq Morarka
27/4/2019
About Presenter
● Presenter Name, Title: Raunaq Morarka, Staff Engineer at Qubole, Bangalore
● Bio: I currently work in the Presto team at Qubole. My areas of interest are distributed
database systems and software performance optimization. At Qubole I have worked on features
related to scheduling, autoscaling and the use of spot nodes for running Presto as a service in
the cloud. I have recently started contributing to the PrestoSQL open source project. Before
Qubole, I worked at Akamai on a distributed time-series columnar database that supports
real-time ingest and low-latency queries.
Agenda
● State of Presto today
○ Background – Introduction, Why Presto
○ Presto Architecture
○ Usage Overview – Recent Growth and Adoption Trends
● Presto in the Cloud
○ Optimizing for Scale
■ Autoscaling
■ Maximizing the Benefits of the Cloud
○ Optimizing for Speed
■ Dynamic Filtering, Join Reordering, Join Strategy Selection
■ RubiX – The next-generation column level optimized caching on Presto
● Future roadmap
State of Presto Today
State of Presto Today - Background
What is Presto?
• Distributed SQL query engine that originated at Facebook and was open-sourced in 2013
• ANSI SQL Compliant
• Supports Federated Queries
• Pluggable data sources
• Completely in-memory and pipelined execution model
Why Presto?
• Built for a variety of use cases: low-latency user-facing applications, exploratory analysis through BI tools, batch ETL
• Data source agnostic: HDFS, RDBMSs, NoSQL, stream processing, cloud object stores (S3, ADLS, GCS)
• Zero configuration ideology
• Proven in production at Petabyte Scale: Facebook, Netflix, Airbnb, Uber, LinkedIn, Qubole, and more
• Highly extensible
Presto Architecture
Query Lifecycle
● Client submits a SQL query to the Coordinator using the HTTP REST API
● ANTLR-based parser converts the query to a syntax tree
● Logical planner generates a tree of plan nodes
● Optimizer transforms logical plan into an efficient execution strategy
• Rule-based optimizations (RBO): predicate and limit pushdown, column pruning, partition pruning, etc.
• Cost-based optimizations (CBO): join reordering, join strategy selection
• Takes advantage of Data layout (partitioning, sorting, grouping and indices)
• Inter-node parallelism by breaking up plan into Stages that can be executed in parallel across
workers
• Intra-node parallelism by running a sequence of operators (pipeline) in multiple threads.
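As a rough illustration of the first step above, the sketch below follows the coordinator's statement protocol: the client POSTs the SQL text, then keeps fetching the nextUri link found in each JSON response until no link is returned. The transport is injected as two callables (post and get, both hypothetical names) so the loop can be exercised without a running cluster; a real client would also send headers such as X-Presto-User.

```python
from typing import Any, Callable, Dict, Iterator, List

Json = Dict[str, Any]


def fetch_rows(sql: str,
               post: Callable[[str, str], Json],
               get: Callable[[str], Json],
               statement_path: str = "/v1/statement") -> Iterator[List[Any]]:
    """Yield result rows by following nextUri links until none is left."""
    response = post(statement_path, sql)
    while True:
        if "error" in response:
            raise RuntimeError(response["error"].get("message", "query failed"))
        # A response may or may not carry a page of data.
        for row in response.get("data", []):
            yield row
        next_uri = response.get("nextUri")
        if next_uri is None:
            return
        response = get(next_uri)


# Fake in-memory transport standing in for the coordinator (assumption,
# used only to make the sketch runnable).
_pages: Dict[str, Json] = {
    "/q/1": {"nextUri": "/q/2", "data": [[1], [2]]},
    "/q/2": {"data": [[3]]},
}


def fake_post(path: str, sql: str) -> Json:
    return {"id": "q", "nextUri": "/q/1"}


def fake_get(uri: str) -> Json:
    return _pages[uri]


rows = list(fetch_rows("SELECT x FROM t", fake_post, fake_get))
```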
Scheduling
● Coordinator distributes plan to workers, starts execution of tasks and then
begins to enumerate splits, which are opaque handles to an addressable
chunk of data in an external storage system
● Splits are assigned to the tasks responsible for reading this data
Exchange (Shuffles)
● Presto uses in-memory buffered shuffles over HTTP to exchange intermediate results for
different stages of a query
● Tasks produce data into an in-memory output buffer
● Workers consume results from other workers through an exchange client which uses HTTP
long-polling
● Exchange client buffers data before it is processed (input buffer)
● Exchange server retains data until client acknowledges receipt
● Engine tunes parallelism to maintain target utilization rates for output and input buffers
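The retain-until-acknowledged behaviour described above can be sketched with a small bounded buffer. This is a simplification under stated assumptions, not Presto's implementation: the class and method names are hypothetical, pages are plain bytes, and asking for token N is treated as an acknowledgement of every page before N.

```python
from collections import deque
from itertools import islice
from typing import Deque, List, Tuple


class OutputBuffer:
    """Bounded in-memory buffer that keeps pages until they are acknowledged."""

    def __init__(self, max_pages: int) -> None:
        self.max_pages = max_pages
        self._pages: Deque[bytes] = deque()
        self._first_token = 0  # sequence number of the oldest retained page

    def add(self, page: bytes) -> bool:
        """Producer side: returns False (back-pressure) when the buffer is full."""
        if len(self._pages) >= self.max_pages:
            return False
        self._pages.append(page)
        return True

    def get(self, token: int, max_results: int) -> Tuple[List[bytes], int]:
        """Consumer side: return pages starting at `token` and the next token.

        Requesting `token` acknowledges all pages before it, so they are freed.
        Repeating a request with the same token returns the same pages, which
        makes a lost response safe to retry.
        """
        if token < self._first_token or token > self._first_token + len(self._pages):
            raise ValueError("token outside the retained range")
        while self._first_token < token:
            self._pages.popleft()
            self._first_token += 1
        batch = list(islice(self._pages, max_results))
        return batch, token + len(batch)
```

Because the producer is refused once the buffer is full, a slow consumer naturally slows the upstream task down instead of letting memory grow without bound.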
Split Assignment
Presto asks connectors to enumerate small batches of splits, and assigns them to tasks lazily
● Decouples query response time from time taken for listing files
● Avoids enumerating all splits when a query is cancelled or finishes early, for example once
a LIMIT clause is satisfied
● Workers maintain a queue of splits. The coordinator assigns new splits to tasks with the
shortest queue. Keeping these queues small allows the system to adapt to variance in CPU cost
of processing different splits and performance differences among workers
● Allows queries to execute without having to hold all their metadata in memory
● Lazy split enumeration can make it difficult to accurately estimate and report query progress
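A minimal sketch of the two ideas above, lazy enumeration in small batches and assignment to the shortest queue. The function names, the string split handles and the task names are assumptions made for illustration; real splits are opaque connector objects.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def split_batches(splits: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Pull splits from a connector-like source only batch_size at a time."""
    iterator = iter(splits)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def assign_batch(batch: List[str], queues: Dict[str, List[str]]) -> None:
    """Give each split to the task whose queue is currently the shortest."""
    for split in batch:
        task = min(queues, key=lambda name: len(queues[name]))
        queues[task].append(split)


listed: List[str] = []  # records how many splits were actually enumerated


def list_files() -> Iterator[str]:
    # Generator: nothing is listed until the scheduler asks for it.
    for i in range(1000):
        name = f"part-{i}"
        listed.append(name)
        yield name


queues: Dict[str, List[str]] = {"task-0": ["busy-1", "busy-2"], "task-1": []}
batches = split_batches(list_files(), batch_size=3)
assign_batch(next(batches), queues)
```

After the first batch only three of the thousand splits have been enumerated, and most of them went to the idle task. If the query were cancelled at this point, the remaining splits would never be listed.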
State of Presto Today – Usage Overview
Presto grew 420% in terms of compute hours on Qubole’s cloud platform from January 2017 to 2018.
Customers in aggregate are running 24x more commands per hour in Presto than Apache Spark and 6x
more commands than Apache Hadoop/Hive.
State of Presto Today – Usage Overview
Top Three Industries Using Presto
1. Entertainment
2. Travel Services
3. Gaming
Verticals everywhere are adopting Presto
Presto in the Cloud
Optimizing for Scale – Autoscaling
● Scale clusters in range [min size, max size]
● Scale up for the increased workload
● Scale down when load goes down
● Graceful scale down
● Usually implemented by defining rules on top of CPU/memory/IO metrics exported by the system
● Qubole’s implementation
○ Monitor progress of queries
○ Intelligent decision making to scale up only if it can help to meet a given SLA
○ Handle bursty workloads by avoiding fixed step sizes
○ Finer controls like grouped scale up/down, cool down period, etc.
○ Automatic termination of idle clusters
○ Self start of cluster in response to first query on a shutdown cluster
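One way to picture "scale up only if it helps meet the SLA" and "avoid fixed step sizes" is to compute a target size directly from the remaining work instead of adding a constant number of nodes per step. The sketch below is a hypothetical model, not Qubole's algorithm: it assumes the remaining work can be estimated in node-seconds and that work parallelizes perfectly across nodes.

```python
import math


def target_cluster_size(remaining_node_seconds: float,
                        sla_seconds_left: float,
                        min_size: int,
                        max_size: int) -> int:
    """Nodes needed to finish the remaining work in time, clamped to [min, max].

    Hypothetical model: work is measured in node-seconds and scales linearly
    with the number of nodes.
    """
    if sla_seconds_left <= 0:
        needed = max_size  # already late: use everything that is allowed
    else:
        needed = math.ceil(remaining_node_seconds / sla_seconds_left)
    return max(min_size, min(max_size, needed))
```

A burst that needs ten times the current capacity therefore results in one large request rather than many small steps, and a light load falls back to the minimum size.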
Optimizing for Scale – Required workers
● Non-source stages cannot be redistributed to take advantage of newly added nodes
● Min size of cluster must be large enough to avoid query failures
● Choice between high cost and degraded performance for initial queries
● Required workers is a mechanism to delay query execution until a minimum number of worker
nodes have joined the cluster
● Integration with Qubole’s autoscaling
○ Scale up cluster to satisfy min workers requirement
○ Avoid scaling up for DDL and monitoring related queries
○ Scale down to 1 node during periods of inactivity
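The admission rule described above can be sketched as a small decision function. The names (Admission, admit, needs_workers) are hypothetical, and classifying a query as DDL or monitoring is assumed to happen elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Admission:
    run_now: bool      # start executing immediately?
    nodes_to_add: int  # scale-up request to send to the autoscaler


def admit(active_workers: int, required_workers: int, needs_workers: bool) -> Admission:
    """Delay a query until enough workers have joined the cluster.

    needs_workers is False for DDL and monitoring queries, which should
    neither wait nor trigger a scale-up (assumption based on the slide).
    """
    if not needs_workers:
        return Admission(run_now=True, nodes_to_add=0)
    shortfall = max(0, required_workers - active_workers)
    return Admission(run_now=(shortfall == 0), nodes_to_add=shortfall)
```

With a cluster idling at one node and six required workers, the first analytical query waits and asks for five more nodes, while a DDL statement runs right away on the single node.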
                              Config A   Config B   Config C
Total time taken              5h 12m     4h 26m     4h 37m
Total node runtime (seconds)  143137     134664     124351
Min size                      2          6          1 (6 required nodes)
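To make the comparison in the table easier to read, the following sketch computes relative differences between the configurations. Only the figures from the table are used; the dictionary layout and function names are assumptions for illustration.

```python
def to_seconds(hours: int, minutes: int) -> int:
    return hours * 3600 + minutes * 60


# Values copied from the table above.
runs = {
    "Config A": {"wall_seconds": to_seconds(5, 12), "node_seconds": 143137},
    "Config B": {"wall_seconds": to_seconds(4, 26), "node_seconds": 134664},
    "Config C": {"wall_seconds": to_seconds(4, 37), "node_seconds": 124351},
}


def percent_change(metric: str, config: str, baseline: str) -> float:
    """Relative change of `config` against `baseline`, in percent."""
    new = runs[config][metric]
    old = runs[baseline][metric]
    return (new - old) / old * 100.0


for baseline in ("Config A", "Config B"):
    print(
        f"Config C vs {baseline}: "
        f"node-seconds {percent_change('node_seconds', 'Config C', baseline):+.1f}%, "
        f"wall time {percent_change('wall_seconds', 'Config C', baseline):+.1f}%"
    )
```

On these numbers, Config C (minimum size 1 with 6 required nodes) uses roughly 13% fewer node-seconds than Config A and about 8% fewer than Config B, while its total time is about 4% longer than Config B's.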
Optimizing for Scale – Maximizing the Benefits of the Cloud
● Spot nodes are generally available at highly discounted prices
● Presto cannot use them well out of the box because of its pipelined, in-memory execution
architecture
● Spot loss will lead to failure of all queries which had any part of their execution tree running on
that spot node
● Presto is usually run on newer generation, high memory instance types which experience spot
loss more often due to greater demand
● Qubole’s handling of spot termination notifications (STN)
○ Proactive addition of nodes to maintain cluster size
○ Avoid scheduling tasks on spot node after receiving STN
○ Acquire on-demand quickly
○ Lazily rebalance to achieve desired spot ratio
○ No query failures if all queries finish under 2 minutes
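The first two reactions to a termination notice, requesting a replacement and no longer scheduling on the doomed node, can be sketched as follows. The class is a hypothetical model of cluster state, not Qubole's code; how the notice is received and how nodes are actually provisioned are left out.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class SpotAwareCluster:
    nodes: Set[str] = field(default_factory=set)
    draining: Set[str] = field(default_factory=set)
    replacement_requests: int = 0

    def on_termination_notice(self, node: str) -> None:
        """Mark the node as draining and ask for one replacement, exactly once."""
        if node in self.nodes and node not in self.draining:
            self.draining.add(node)
            self.replacement_requests += 1

    def schedulable_nodes(self) -> List[str]:
        """Nodes that may receive new tasks: everything not about to disappear."""
        return sorted(self.nodes - self.draining)
```

Tasks already running on the draining node keep going, which is why queries that finish within the notice period are unaffected.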
Query retries
● Fallback for query failures that cannot be handled in STN Integration
● Query retries should be transparent to the clients and work with all Presto clients: Java CLI,
JDBC/ODBC drivers, Ruby client, etc.
● The retry should happen only if it is guaranteed that no partial results have yet been sent to
the client
● The retry should happen only if changes (if any) caused by the failed query have been rolled
back e.g. in case of Inserts, data written by the failed query has been deleted
● The retry should happen only if there is a chance of successful query execution
● Qubole’s implementation (Smart Query Retry)
○ Presto server responsible for retries, clients redirected to new query without any changes
required to client
○ Convert SELECT queries into IOD queries (INSERT OVERWRITE DIRECTORY), clients
get result only after query has finished
○ Track rollback status of query
○ Retry in bigger cluster if the failure is due to insufficient memory
○ Retry when cluster size stabilizes if the failure is due to node loss
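The retry conditions listed above combine into a short decision function. This is a sketch with hypothetical names (Failure, FailedQuery, retry_action); it only encodes the rules from the slide and does not reflect how Smart Query Retry is actually implemented.

```python
from dataclasses import dataclass
from enum import Enum


class Failure(Enum):
    NODE_LOSS = "node_loss"
    INSUFFICIENT_MEMORY = "insufficient_memory"
    USER_ERROR = "user_error"  # e.g. a syntax error: a retry cannot succeed


@dataclass(frozen=True)
class FailedQuery:
    failure: Failure
    results_sent_to_client: bool
    rollback_complete: bool


def retry_action(query: FailedQuery) -> str:
    """Decide what to do with a failed query."""
    # Never retry once the client has seen partial results, or while
    # side effects of the failed attempt are still in place.
    if query.results_sent_to_client or not query.rollback_complete:
        return "fail"
    if query.failure is Failure.INSUFFICIENT_MEMORY:
        return "retry_on_bigger_cluster"
    if query.failure is Failure.NODE_LOSS:
        return "retry_when_cluster_size_stabilizes"
    return "fail"  # no chance of success on retry
```

Buffering SELECT results until the query has finished is what keeps results_sent_to_client false for the whole execution, so the first check rarely blocks a retry.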
Optimizing for Speed – Dynamic Filtering
SELECT (...)
FROM store_sales JOIN date_dim ON ss_sold_date_sk = d_date_sk (...)
WHERE d_year = 2000 and d_moy = 12 (...)
(... GROUP BY ... ORDER BY ...)
Currently (assuming the tables are not partitioned), Presto performs a full table scan of both tables. Dynamic filtering can instead:
1. Skip accessing fact table partitions not needed by the query (runtime partition pruning)
2. Filter rows on probe side of join by sending only the subset of rows that match the join keys
across the network (runtime row filtering)
3. If storage format supports predicate pushdown, use runtime filters to avoid scanning data on
probe side (runtime predicate pushdown)
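The idea can be sketched on in-memory rows: evaluate the selective dimension side first, collect the join keys that survive, and use that set to filter the fact table before it reaches the join. The column names follow the query above; the helper names and the sample rows are made up for illustration, and a real engine ships the filter between workers rather than passing a Python set.

```python
from typing import Any, Dict, Iterable, List, Set

Row = Dict[str, Any]


def collect_dynamic_filter(build_rows: Iterable[Row], key: str) -> Set[Any]:
    """Join keys present on the build side after its own predicates ran."""
    return {row[key] for row in build_rows}


def scan_with_filter(rows: Iterable[Row], key: str, allowed: Set[Any]) -> List[Row]:
    """Probe-side scan that drops rows which cannot find a join partner."""
    return [row for row in rows if row[key] in allowed]


def hash_join(build_rows: Iterable[Row], probe_rows: Iterable[Row],
              build_key: str, probe_key: str) -> List[Row]:
    table: Dict[Any, List[Row]] = {}
    for row in build_rows:
        table.setdefault(row[build_key], []).append(row)
    joined: List[Row] = []
    for probe in probe_rows:
        for build in table.get(probe[probe_key], []):
            joined.append({**probe, **build})
    return joined


date_dim = [
    {"d_date_sk": 1, "d_year": 2000, "d_moy": 12},
    {"d_date_sk": 2, "d_year": 2000, "d_moy": 11},
    {"d_date_sk": 3, "d_year": 1999, "d_moy": 12},
]
store_sales = [
    {"ss_sold_date_sk": 1, "ss_net_paid": 10.0},
    {"ss_sold_date_sk": 2, "ss_net_paid": 20.0},
    {"ss_sold_date_sk": 1, "ss_net_paid": 5.0},
    {"ss_sold_date_sk": 3, "ss_net_paid": 7.5},
]

# WHERE d_year = 2000 AND d_moy = 12 is applied to the build side first.
build_side = [d for d in date_dim if d["d_year"] == 2000 and d["d_moy"] == 12]
dynamic_filter = collect_dynamic_filter(build_side, "d_date_sk")
probe_side = scan_with_filter(store_sales, "ss_sold_date_sk", dynamic_filter)
result = hash_join(build_side, probe_side, "d_date_sk", "ss_sold_date_sk")
```

The join result is the same with or without the filter; what changes is how many fact rows have to be read, shipped and probed.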
Dynamic Filtering concept
Dynamic Filtering results
• Runtime of 13 queries improved by at least 5X.
• Runtime of 13 queries improved between 3X - 5X.
• Runtime of 22 queries improved between 1.5X - 3X.
• 14 queries that did not run before succeeded.
Optimizing for Speed – Join Reordering
• Smaller table to the right for better performance
• Difficult to ensure it in a multi-join query
• Join Reordering Optimizer rule to the rescue
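For a single join, the rule of thumb above amounts to swapping the inputs when the left one is smaller. The sketch below assumes row-count estimates are available from table statistics; the function name and the example numbers are hypothetical, and a real optimizer also has to estimate the sizes of intermediate join results.

```python
from typing import Dict, Tuple


def place_build_side(left: str, right: str,
                     estimated_rows: Dict[str, int]) -> Tuple[str, str]:
    """Return (probe, build): the input with fewer estimated rows goes right."""
    if estimated_rows[left] < estimated_rows[right]:
        left, right = right, left
    return left, right


# Made-up statistics for illustration only.
stats = {"store_sales": 8_000_000_000, "date_dim": 73_000}
probe, build = place_build_side("date_dim", "store_sales", stats)
```

Here the query author wrote the small table on the left, and the rule moves it to the right so that the hash table is built from the smaller input.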
Optimizing for Speed – Join Reordering
[Figure: join trees (B, A) with A.a = B.b and (A, B) with B.b = A.a]
Join reordering is made for the case where the build side of the join is expensive
Speedup: 3~6x, geomean 3.1x (TPC-DS scale 3000*)
Optimizing for Speed – Join strategy selection
● Broadcast (Map-side join) vs Repartitioned (Shuffle join)
● Repartitioned
○ Default
○ Low memory usage
○ Both build and probe side need to be partitioned
○ More efficient for joins between large tables of similar size
● Broadcast
○ High memory usage, build side table must fit in memory
○ Probe side does not need to be partitioned
○ Build side table broadcast to all nodes
○ More efficient for joins where one table is of small size
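The trade-off above can be expressed as a toy cost rule: broadcast only if the build side fits in the memory budget, and only if copying it to every worker moves less data than repartitioning both sides. The cost model, parameter names and thresholds are assumptions for illustration, not the formulas Presto's cost-based optimizer uses.

```python
def choose_join_distribution(build_bytes: int,
                             probe_bytes: int,
                             workers: int,
                             max_broadcast_bytes: int) -> str:
    """Pick a join distribution from estimated input sizes (hypothetical model)."""
    if build_bytes > max_broadcast_bytes:
        return "PARTITIONED"  # build side would not fit in memory on each node
    broadcast_network = build_bytes * workers        # full copy to every worker
    partitioned_network = build_bytes + probe_bytes  # both sides are shuffled once
    if broadcast_network < partitioned_network:
        return "BROADCAST"
    return "PARTITIONED"


MB = 1024 ** 2
GB = 1024 ** 3
small_dim_join = choose_join_distribution(100 * MB, 1024 * GB, workers=10,
                                          max_broadcast_bytes=1 * GB)
large_large_join = choose_join_distribution(500 * GB, 600 * GB, workers=10,
                                            max_broadcast_bytes=1 * GB)
```

A small dimension table joined to a large fact table ends up broadcast, while two large tables of similar size stay partitioned.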
Optimizing for Speed – RubiX
● A caching framework for big data engines in the cloud
● Open source https://github.com/qubole/rubix
● Built for zero configuration, SQL first, zero bottlenecks, auto rebalancing
● Adopted within Qubole for Presto, Hive and Spark
● Adopted outside Qubole, e.g. Azure HDInsight IO Cache
● https://www.qubole.com/blog/rubix-fast-cache-access-for-big-data-analytics-on-cloud-storage
Optimizing for Speed – RubiX
• Cache file chunks
• Shared cache across JVMs
• Engine-independent scheduling logic
[Figure label: Avg. ~20%]
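Caching at chunk granularity, rather than whole files, can be sketched as below. This is not RubiX code: the class name, the LRU eviction policy, the in-memory storage and the read_remote callback are all assumptions chosen to keep the example small (a real deployment caches on local disk and shares the cache across processes).

```python
from collections import OrderedDict
from typing import Callable, Tuple


class ChunkCache:
    """Read-through cache keyed by (path, chunk index) with LRU eviction."""

    def __init__(self, chunk_size: int, capacity_chunks: int,
                 read_remote: Callable[[str, int, int], bytes]) -> None:
        self.chunk_size = chunk_size
        self.capacity_chunks = capacity_chunks
        self._read_remote = read_remote  # hypothetical (path, offset, length) reader
        self._chunks: "OrderedDict[Tuple[str, int], bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def _chunk(self, path: str, index: int) -> bytes:
        key = (path, index)
        if key in self._chunks:
            self.hits += 1
            self._chunks.move_to_end(key)  # mark as most recently used
            return self._chunks[key]
        self.misses += 1
        data = self._read_remote(path, index * self.chunk_size, self.chunk_size)
        self._chunks[key] = data
        if len(self._chunks) > self.capacity_chunks:
            self._chunks.popitem(last=False)  # evict the least recently used chunk
        return data

    def read(self, path: str, offset: int, length: int) -> bytes:
        """Serve a byte range, fetching only the chunks that are not cached."""
        if length <= 0:
            return b""
        first = offset // self.chunk_size
        last = (offset + length - 1) // self.chunk_size
        data = b"".join(self._chunk(path, i) for i in range(first, last + 1))
        start = offset - first * self.chunk_size
        return data[start:start + length]


blob = bytes(range(256))  # stands in for an object in cloud storage
remote_calls = []


def read_remote(path: str, offset: int, length: int) -> bytes:
    remote_calls.append((path, offset, length))
    return blob[offset:offset + length]


cache = ChunkCache(chunk_size=16, capacity_chunks=4, read_remote=read_remote)
first_read = cache.read("s3://bucket/file", 10, 10)   # spans chunks 0 and 1
second_read = cache.read("s3://bucket/file", 12, 4)   # served from chunk 0
```

Columnar formats typically read a few byte ranges per file, so caching chunks avoids downloading and storing the parts of a file that no query touches.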
Presto open source roadmap
● Coordinator High Availability
● Allow connectors to participate in query optimization
● Improvements to Spill to disk functionality
● Partial recovery support for failure of long running queries
● Ranger integration
● Qubole collaborations with community
○ Dynamic Filtering
○ Kinesis Connector
○ Supporting Insert Only Transactional Hive tables
○ Data locality based scheduling of source splits
○ Presto UDFs (https://github.com/qubole/presto-udfs)
Q&A
Thanks for attending!
Please feel free to reach out to me at raunaqm@qubole.com

software engineering Chapter 5 System modeling.pptxnada99848
 

Recently uploaded (20)

Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based project
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
software engineering Chapter 5 System modeling.pptx
software engineering Chapter 5 System modeling.pptxsoftware engineering Chapter 5 System modeling.pptx
software engineering Chapter 5 System modeling.pptx
 

Enabling Presto to Handle Massive Scale at Lightning Speed

  • 1. Enabling Presto to Handle Massive Scale at Lightning Speed: Fast and Scalable Data Processing. Raunaq Morarka, 27/4/2019
  • 2. About Presenter
      ● Presenter Name, Title: Raunaq Morarka, Staff Engineer at Qubole, Bangalore
      ● Bio: I currently work on the Presto team at Qubole. My areas of interest are distributed database systems and software performance optimization. At Qubole I have worked on features related to scheduling, autoscaling, and the use of spot nodes for running Presto as a service in the cloud. I have recently started contributing to the Presto SQL open source project. Before Qubole, I worked at Akamai on a distributed, columnar time-series database that supports real-time ingest and low-latency queries.
  • 3. Agenda
      ● State of Presto today
          ○ Background – Introduction, Why Presto
          ○ Presto Architecture
          ○ Usage Overview – Recent Growth and Adoption Trends
      ● Presto in the Cloud
          ○ Optimizing for Scale
              ■ Autoscaling
              ■ Maximizing the Benefits of the Cloud
          ○ Optimizing for Speed
              ■ Dynamic Filtering, Join Reordering, Join Strategy Selection
              ■ RubiX – The next-generation column level optimized caching on Presto
      ● Future roadmap
  • 5. State of Presto Today - Background
      What is Presto?
          • Distributed SQL query engine that originated at Facebook in 2013
          • ANSI SQL compliant
          • Supports federated queries
          • Pluggable data sources
          • Completely in-memory, pipelined execution model
      Why Presto?
          • Built for a variety of use cases: low-latency user-facing applications, exploratory analysis through BI tools, batch ETL
          • Data source agnostic: HDFS, RDBMSs, NoSQL, stream processing, cloud object stores (S3, ADLS, GCP)
          • Zero-configuration ideology
          • Proven in production at petabyte scale: Facebook, Netflix, Airbnb, Uber, LinkedIn, Qubole, and more
          • Highly extensible
  • 7. Query Lifecycle
      ● Client submits a SQL query to the Coordinator using the HTTP REST API
      ● ANTLR-based parser converts the query to a syntax tree
      ● Logical planner generates a tree of plan nodes
      ● Optimizer transforms the logical plan into an efficient execution strategy
          • RBO (predicate and limit pushdown, column pruning, partition pruning, etc.)
          • CBO (join reordering, join strategy selection)
          • Takes advantage of data layout (partitioning, sorting, grouping and indices)
          • Inter-node parallelism by breaking up the plan into stages that can be executed in parallel across workers
          • Intra-node parallelism by running a sequence of operators (pipeline) in multiple threads
  • 9. Scheduling
      ● Coordinator distributes the plan to workers, starts execution of tasks, and then begins to enumerate splits, which are opaque handles to an addressable chunk of data in an external storage system
      ● Splits are assigned to the tasks responsible for reading this data
  • 10. Exchange (Shuffles)
      ● Presto uses in-memory buffered shuffles over HTTP to exchange intermediate results between the stages of a query
      ● Tasks produce data into an in-memory output buffer
      ● Workers consume results from other workers through an exchange client which uses HTTP long-polling
      ● Exchange client buffers data before it is processed (input buffer)
      ● Exchange server retains data until the client acknowledges receipt
      ● Engine tunes parallelism to maintain target utilization rates for output and input buffers
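The retain-until-acknowledged behaviour of the output buffer can be illustrated with a small sketch. This is not Presto's implementation: the class name `OutputBuffer`, the page capacity, and the token-based acknowledgement scheme (requesting token N acknowledges every page before N) are assumptions made for illustration.

```python
from collections import deque


class OutputBuffer:
    """Toy output buffer: pages stay in memory until the consumer acknowledges them.

    Hypothetical sketch; names and the token scheme are assumptions.
    """

    def __init__(self, max_pages):
        self.max_pages = max_pages   # capacity; a full buffer pushes back on the producer
        self.pages = deque()         # pages not yet acknowledged
        self.first_token = 0         # sequence number of pages[0]

    def add(self, page):
        """Producer side: returns False (back-pressure) when the buffer is full."""
        if len(self.pages) >= self.max_pages:
            return False
        self.pages.append(page)
        return True

    def get(self, token, max_count):
        """Consumer side: asking for `token` acknowledges all pages before it."""
        while self.first_token < token and self.pages:
            self.pages.popleft()     # safe to drop: the client has received it
            self.first_token += 1
        batch = list(self.pages)[:max_count]
        return batch, self.first_token + len(batch)
```

Because a page is only dropped once a later token is requested, a consumer whose HTTP response was lost can repeat the same request and receive the same pages again.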
  • 11. Split Assignment
      Presto asks connectors to enumerate small batches of splits, and assigns them to tasks lazily
      ● Decouples query response time from the time taken to list files
      ● Avoids enumerating all splits when queries are cancelled, or finish early because a LIMIT clause is satisfied
      ● Workers maintain a queue of splits. The coordinator assigns new splits to the tasks with the shortest queue. Keeping these queues small allows the system to adapt to variance in the CPU cost of processing different splits and to performance differences among workers
      ● Allows queries to execute without having to hold all their metadata in memory
      ● Lazy split enumeration can make it difficult to accurately estimate and report query progress
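A minimal sketch of lazy, shortest-queue split assignment follows. The function name `assign_splits_lazily` and its parameters (`batch_size`, `max_queue_len`) are hypothetical, and the split source is modelled as a plain Python iterator standing in for a connector.

```python
def assign_splits_lazily(split_source, queues, batch_size, max_queue_len):
    """Assign up to `batch_size` splits, each to the task with the shortest queue.

    split_source: iterator yielding splits on demand (never fully listed up front)
    queues: dict mapping task name -> list of pending splits
    Returns the number of splits assigned in this round.
    """
    assigned = 0
    while assigned < batch_size:
        task = min(queues, key=lambda t: len(queues[t]))
        if len(queues[task]) >= max_queue_len:
            break                    # all queues are full: stop enumerating for now
        split = next(split_source, None)
        if split is None:
            break                    # the connector has no more splits
        queues[task].append(split)
        assigned += 1
    return assigned


def list_splits():
    """Stand-in for a connector that lists files lazily."""
    for i in range(100):
        yield f"split-{i}"
```

The capacity check happens before a split is pulled from the source, so no split is enumerated unless some task has room for it. If the query is cancelled or its LIMIT is satisfied, the remaining splits are simply never listed.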
  • 12. State of Presto Today – Usage Overview
      Presto grew 420% in terms of compute hours on Qubole’s cloud platform from January 2017 to 2018.
      Customers in aggregate are running 24x more commands per hour in Presto than Apache Spark and 6x more commands than Apache Hadoop/Hive.
  • 13. State of Presto Today – Usage Overview
      Top Three Industries Using Presto
      1. Entertainment
      2. Travel Services
      3. Gaming
      Verticals everywhere are adopting Presto
  • 15. Optimizing for Scale – Autoscaling
      ● Scale clusters in the range [min size, max size]
      ● Scale up for increased workload
      ● Scale down when load goes down
      ● Graceful scale down
      ● Usually implemented by defining rules on top of CPU/memory/IO metrics exported by the system
      ● Qubole’s implementation
          ○ Monitor progress of queries
          ○ Intelligent decision making to scale up only if it can help to meet a given SLA
          ○ Handle bursty workloads by avoiding fixed step sizes
          ○ Finer controls like grouped scale up/down, cool down period, etc.
          ○ Automatic termination of idle clusters
          ○ Self start of cluster in response to the first query on a shutdown cluster
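As an illustration of scaling without a fixed step size, the sketch below derives a target cluster size directly from an estimate of remaining work and an SLA, then clamps it to [min size, max size]. It is not Qubole's algorithm, which the slides do not spell out; the function name and the `remaining_work_node_secs` estimate are assumptions.

```python
import math


def target_cluster_size(remaining_work_node_secs, sla_secs, min_size, max_size):
    """Node count needed to finish the estimated remaining work within the SLA.

    remaining_work_node_secs: hypothetical estimate of outstanding work,
    in node-seconds, derived from query progress.
    """
    if remaining_work_node_secs <= 0:
        needed = min_size            # idle: shrink toward the floor
    else:
        needed = math.ceil(remaining_work_node_secs / sla_secs)
    # Never leave the configured [min_size, max_size] range.
    return max(min_size, min(max_size, needed))
```

Jumping straight to the computed target, rather than adding one node per evaluation period, is what lets a policy like this react to bursty workloads.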
  • 16. Optimizing for Scale – Required workers
      ● Non-source stages cannot be redistributed to take advantage of newly added nodes
      ● Min size of the cluster must be large enough to avoid query failures
      ● Choice between high cost and degraded performance for initial queries
      ● Required workers is a mechanism to delay query execution until a minimum number of worker nodes have joined the cluster
      ● Integration with Qubole’s autoscaling
          ○ Scale up the cluster to satisfy the min workers requirement
          ○ Avoid scaling up for DDL and monitoring-related queries
          ○ Scale down to 1 node during periods of inactivity
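The required-workers gate can be sketched as a simple admission check. The function name, the return values, and the assumption that DDL and monitoring queries run immediately on whatever nodes exist are illustrative choices, not details taken from the slides.

```python
def admit_query(active_workers, required_workers, is_ddl_or_monitoring):
    """Decide whether a query may start, and how many nodes to request.

    Returns ("RUN", 0) or ("WAIT", nodes_to_add). Hypothetical sketch.
    """
    if is_ddl_or_monitoring:
        # Assumption: lightweight queries do not justify a scale-up.
        return ("RUN", 0)
    missing = max(0, required_workers - active_workers)
    if missing == 0:
        return ("RUN", 0)
    # Hold the query and ask the autoscaler for the missing nodes.
    return ("WAIT", missing)
```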
  • 17.
                                       Config A   Config B   Config C
      Total time taken                 5h 12m     4h 26m     4h 37m
      Total node runtime (seconds)     143137     134664     124351
      Min size                         2          6          1 (6 required nodes)
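To make the comparison on this slide easier to read, the short sketch below converts the reported node runtime seconds into node-hours and computes the relative saving between configurations. Only the three runtime figures come from the slide; the helper names are hypothetical.

```python
# Total node runtime in seconds, as reported on the slide.
node_runtime_secs = {"A": 143137, "B": 134664, "C": 124351}


def node_hours(config):
    """Node runtime of a configuration expressed in node-hours."""
    return node_runtime_secs[config] / 3600


def saving_vs(baseline, candidate):
    """Fractional reduction in node runtime of `candidate` relative to `baseline`."""
    base = node_runtime_secs[baseline]
    return (base - node_runtime_secs[candidate]) / base
```

With these figures, Config C (min size 1 with 6 required nodes) uses roughly 13% less node runtime than Config A and roughly 8% less than Config B, while its total time stays close to Config B's.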
  • 18. Optimizing for Scale – Autoscaling
  • 19. Optimizing for Scale – Maximizing the Benefits of the Cloud
      ● Spot nodes are generally available at highly discounted prices
      ● Presto cannot utilize them well out of the box because of its pipelined, in-memory execution architecture
      ● Spot loss leads to failure of all queries that had any part of their execution tree running on that spot node
      ● Presto is usually run on newer-generation, high-memory instance types, which experience spot loss more often due to greater demand
      ● Qubole’s handling of the Spot termination notification (STN)
          ○ Proactive addition of nodes to maintain cluster size
          ○ Avoid scheduling tasks on a spot node after receiving an STN
          ○ Acquire on-demand nodes quickly
          ○ Lazily rebalance to achieve the desired spot ratio
          ○ No query failures if all queries finish in under 2 minutes
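The first two STN reactions listed above (stop scheduling on the doomed node, proactively request a replacement) can be sketched as follows. The `Cluster` class and its fields are hypothetical and stand in for whatever bookkeeping the real service uses.

```python
class Cluster:
    """Toy cluster state for reacting to spot termination notifications (STN)."""

    def __init__(self, nodes):
        self.active = set(nodes)     # nodes eligible for new tasks
        self.draining = set()        # spot nodes that received an STN
        self.pending_requests = 0    # replacement nodes requested but not yet joined

    def on_spot_termination_notice(self, node):
        """Stop using the node for new work and request a replacement."""
        if node in self.active:
            self.active.remove(node)
            self.draining.add(node)
            self.pending_requests += 1   # keeps the cluster size stable

    def schedulable_nodes(self):
        """Nodes the scheduler may place new tasks on."""
        return sorted(self.active)
```

Tasks already running on a draining node are left alone in this sketch; they complete if they finish before the node is reclaimed.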
  • 21. Query retries
      ● Fallback for query failures that cannot be handled by the STN integration
      ● Query retries should be transparent to the clients and work with all Presto clients: Java CLI, *DBC drivers, Ruby client, etc.
      ● The retry should happen only if it is guaranteed that no partial results have yet been sent to the client
      ● The retry should happen only if changes (if any) caused by the failed query have been rolled back, e.g. in the case of inserts, data written by the failed query has been deleted
      ● The retry should happen only if there is a chance of successful query execution
      ● Qubole’s implementation (Smart Query Retry)
          ○ Presto server is responsible for retries; clients are redirected to the new query without any changes required on the client
          ○ Convert SELECT queries into IOD queries (INSERT OVERWRITE DIRECTORY), so clients get results only after the query has finished
          ○ Track rollback status of the query
          ○ Retry in a bigger cluster if the failure is due to insufficient memory
          ○ Retry when the cluster size stabilizes if the failure is due to node loss
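The retry conditions on this slide amount to a small decision function. The sketch below is one possible encoding; the `FailedQuery` fields, the failure-reason strings, and the `max_attempts` limit are assumptions rather than names from Qubole's implementation.

```python
from dataclasses import dataclass


@dataclass
class FailedQuery:
    results_sent_to_client: bool   # was any partial output already returned?
    rollback_complete: bool        # were side effects (e.g. partial INSERT data) removed?
    failure_reason: str            # hypothetical codes, e.g. "NODE_LOSS"
    attempts: int = 1


# Failures for which a retry has a chance of succeeding (assumed set).
RETRYABLE = {"NODE_LOSS", "INSUFFICIENT_MEMORY"}


def retry_action(query, max_attempts=3):
    """Return the action to take for a failed query."""
    if query.results_sent_to_client or not query.rollback_complete:
        return "FAIL"                          # a retry could duplicate or corrupt output
    if query.failure_reason not in RETRYABLE or query.attempts >= max_attempts:
        return "FAIL"                          # no realistic chance of success
    if query.failure_reason == "INSUFFICIENT_MEMORY":
        return "RETRY_ON_LARGER_CLUSTER"
    return "RETRY_WHEN_CLUSTER_STABLE"         # node loss: wait for the cluster to settle
```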
  • 22. Optimizing for Speed – Dynamic Filtering
      SELECT (...) FROM store_sales JOIN date_dim ON ss_sold_date_sk = d_date_sk (...)
      WHERE d_year = 2000 AND d_moy = 12 (...)
      (... GROUP BY ... ORDER BY ...)
      Currently (assuming the tables are not partitioned), Presto performs a full table scan of both tables. With dynamic filtering, Presto can:
      1. Skip accessing fact table partitions not needed by the query (runtime partition pruning)
      2. Filter rows on the probe side of the join by sending only the subset of rows that match the join keys across the network (runtime row filtering)
      3. If the storage format supports predicate pushdown, use runtime filters to avoid scanning data on the probe side (runtime predicate pushdown)
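Runtime row filtering can be sketched in a few lines: the join keys that survive the build-side predicate become a filter that is applied while scanning the probe side. This is a single-process illustration over in-memory rows, not Presto's distributed implementation; the function name and row representation are assumptions, while the column names follow the query on the slide.

```python
def hash_join_with_dynamic_filter(probe_rows, build_rows, probe_key, build_key,
                                  build_predicate):
    """Inner hash join that filters the probe side with build-side join keys.

    Returns (joined_rows, number_of_probe_rows_that_passed_the_filter).
    """
    # 1. Build side: apply the static predicate and build the hash table.
    hash_table = {}
    for row in build_rows:
        if build_predicate(row):
            hash_table.setdefault(row[build_key], []).append(row)

    # 2. The dynamic filter is the set of join keys seen on the build side.
    dynamic_filter = set(hash_table)

    # 3. Probe-side scan drops non-matching rows before they reach the join.
    scanned = [row for row in probe_rows if row[probe_key] in dynamic_filter]

    # 4. Regular hash join on the rows that remain.
    joined = [{**p, **b} for p in scanned for b in hash_table[p[probe_key]]]
    return joined, len(scanned)


date_dim = [
    {"d_date_sk": 1, "d_year": 2000, "d_moy": 12},
    {"d_date_sk": 2, "d_year": 1999, "d_moy": 12},
]
store_sales = [
    {"ss_sold_date_sk": 1, "ss_net_paid": 10},
    {"ss_sold_date_sk": 2, "ss_net_paid": 20},
    {"ss_sold_date_sk": 1, "ss_net_paid": 5},
]
```

In this toy data, only the two `store_sales` rows with `ss_sold_date_sk = 1` pass the filter, because that is the only date key matching `d_year = 2000 AND d_moy = 12`.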
  • 24. Dynamic Filtering results
      • Runtime of 13 queries improved by at least 5X.
      • Runtime of 13 queries improved between 3X - 5X.
      • Runtime of 22 queries improved between 1.5X - 3X.
      • 14 queries that did not run before succeeded.
  • 25. Optimizing for Speed – Join Reordering
      • Placing the smaller table on the right gives better performance
      • Difficult to ensure this in a multi-join query
      • Join Reordering optimizer rule to the rescue
  • 26. Optimizing for Speed – Join Reordering
      [Figure: the two inputs of a join between A and B (A.a = B.b) are swapped (B.b = A.a)]
      Join reordering is made for the case where the build side of the join is expensive
      3~6x on TPC-DS scale 3000*, geomean 3.1x
  • 27. Optimizing for Speed – Join Reordering
  • 28. Optimizing for Speed – Join strategy selection
      ● Broadcast (map-side join) vs Repartitioned (shuffle join)
      ● Repartitioned
          ○ Default
          ○ Low memory usage
          ○ Both build and probe side need to be partitioned
          ○ More efficient for joins between large tables of similar size
      ● Broadcast
          ○ High memory usage; the build side table must fit in memory
          ○ Probe side does not need to be partitioned
          ○ Build side table is broadcast to all nodes
          ○ More efficient for joins where one table is small
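The two decisions discussed on the join slides (which table goes on the build side, and broadcast versus repartitioned) can be sketched with a simple size-based rule. Presto's cost-based optimizer compares estimated plan costs rather than a single threshold, so the `broadcast_limit_bytes` parameter and the function name below are simplifying assumptions.

```python
def plan_join(left_name, left_bytes, right_name, right_bytes, broadcast_limit_bytes):
    """Pick the build side and the distribution type from estimated table sizes."""
    # Join reordering: the smaller input becomes the build (right) side.
    if left_bytes < right_bytes:
        probe, build, build_bytes = right_name, left_name, left_bytes
    else:
        probe, build, build_bytes = left_name, right_name, right_bytes

    # Strategy selection: broadcast only if the build side is small enough
    # to be copied to every node and held in memory there.
    if build_bytes <= broadcast_limit_bytes:
        distribution = "BROADCAST"
    else:
        distribution = "PARTITIONED"
    return {"probe": probe, "build": build, "distribution": distribution}
```

For example, joining a large fact table with a small dimension table yields a broadcast join with the dimension table as the build side, while two large tables of similar size fall back to the repartitioned strategy.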
  • 29. Optimizing for Speed – RubiX
      ● A caching framework for big data engines in the cloud
      ● Open source: https://github.com/qubole/rubix
      ● Built for zero configuration, SQL first, zero bottlenecks, auto rebalancing
      ● Adopted within Qubole for Presto, Hive and Spark
      ● Adopted outside Qubole, e.g. HDInsight IO Cache
      ● https://www.qubole.com/blog/rubix-fast-cache-access-for-big-data-analytics-on-cloud-storage
  • 30. Optimizing for Speed – RubiX
      • Cache file chunks
      • Shared cache across JVMs
      • Engine-independent scheduling logic
      [Chart annotation: Avg. ~20%]
  • 32. Presto OS roadmap
      ● Coordinator High Availability
      ● Allow connectors to participate in query optimization
      ● Improvements to spill-to-disk functionality
      ● Partial recovery support for failure of long-running queries
      ● Ranger integration
      ● Qubole collaborations with the community
          ○ Dynamic Filtering
          ○ Kinesis Connector
          ○ Supporting Insert Only Transactional Hive tables
          ○ Data locality based scheduling of source splits
          ○ Presto UDFs (https://github.com/qubole/presto-udfs)
  • 34. Thanks for attending! Please feel free to reach out to me at raunaqm@qubole.com