Frank Hu @ Data Platform US, TikTok
Improving Presto performance with Alluxio Cache
Overview
● Presto Use Case
● “Presto-on-Alluxio” Integration
● Cache Strategy & Scheduling
Presto Use Case
● Workload:
○ 600K+ read-only, interactive SQLs daily
● Cluster Size
○ 40K+ vcores
○ 400TB+ memory
● Data Source
○ Hive tables on HDFS
○ Hive Metastore (HMS) shared with other engines and databases such as Spark and ClickHouse
Why Caching?
● IO is the #1 time-consuming part of SQL execution
● HDFS DataNodes slow down when highly concurrent reads repeatedly land on the same batch of blocks
● Caching saves network bandwidth for other operations such as shuffle
Problems with Cache
● Consistency
● Data Locality
● Pluggable Integration
● Resource Utilization
● Cold Start
● Caching Policy
● Multi-Tier Support
● ...
BEST Cache is NO Cache?
Open Source Integrations?
Solution 1: Hardcoded URL Swap
● Change the path in the Location property of the HMS table/partition from hdfs:// to alluxio://
Problem
● Prerequisite: every query engine that shares the HMS metadata must read from and write to Alluxio
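The URL swap described above can be illustrated with a minimal sketch. The helper name `swap_to_alluxio`, the host names, and the assumption that Alluxio exposes the same directory layout as HDFS are all hypothetical and not taken from the slides.

```python
from urllib.parse import urlsplit, urlunsplit


def swap_to_alluxio(location: str, alluxio_authority: str) -> str:
    """Rewrite an hdfs:// Location into an alluxio:// one (hypothetical helper).

    Assumes the Alluxio namespace mirrors the HDFS path one-to-one.
    Non-HDFS locations are returned unchanged.
    """
    parts = urlsplit(location)
    if parts.scheme != "hdfs":
        return location
    # Keep the path, replace only the scheme and the authority (host:port).
    return urlunsplit(("alluxio", alluxio_authority, parts.path, parts.query, parts.fragment))


# Example with made-up hosts.
print(swap_to_alluxio("hdfs://namenode:8020/warehouse/t/dt=1", "master:19998"))
```

Because the rewritten Location is the only path stored in HMS, every engine reading that table is redirected to Alluxio, which is the prerequisite the slide calls out as the problem.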
Open Source Integrations?
Alluxio Catalog Service
Problems
● High QPS on the Alluxio master: every HMS lookup goes through the catalog service, regardless of whether the table is cached
● Manual synchronization is needed to keep metadata in sync between the Hive Metastore and the Alluxio catalog service
In-house Presto-on-Alluxio Integration
● Store the Alluxio path in a separate table/partition parameter, cachePath, in HMS
● Presto loads the HDFS path and the optional Alluxio path, and prefers to read from Alluxio if the cachePath parameter is present
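The path selection described above can be sketched as follows. Only the parameter name `cachePath` comes from the slides; the function name, the dictionary representation of HMS parameters, and the example paths are assumptions made for illustration.

```python
from typing import Mapping

CACHE_PATH_KEY = "cachePath"  # parameter name used in the slides


def resolve_read_path(location: str, parameters: Mapping[str, str]) -> str:
    """Pick the path to read a table/partition from (hypothetical helper).

    `location` is the HDFS Location stored in HMS; `parameters` stands in for
    the table/partition parameter map. The Alluxio path wins when present.
    """
    cache_path = parameters.get(CACHE_PATH_KEY)
    return cache_path if cache_path else location


# Engines that ignore cachePath keep reading the untouched HDFS Location.
print(resolve_read_path("hdfs://nn/warehouse/t/dt=1", {}))
```

Since the Location itself is never rewritten, other engines sharing the same HMS are unaffected, which is what distinguishes this approach from the hardcoded URL swap.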
In-house Presto-on-Alluxio Integration
● Extend CachingFileSystem in Presto to construct two FileSystems (HDFS & Alluxio)
● Fall back to reading from HDFS whenever a read from Alluxio fails or times out
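The fallback behavior can be sketched in a few lines. This is not the Presto CachingFileSystem code (which is Java); it is a language-neutral illustration in Python where `read_cache` and `read_source` are hypothetical callables standing in for the Alluxio and HDFS reads, and the timeout value is an assumption.

```python
import concurrent.futures
from typing import Callable


def read_with_fallback(
    read_cache: Callable[[], bytes],
    read_source: Callable[[], bytes],
    timeout_s: float = 1.0,
) -> bytes:
    """Try the cache read first; fall back to the source on error or timeout."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(read_cache)
    try:
        return future.result(timeout=timeout_s)
    except Exception:
        # Covers both a failed cache read and a timed-out one.
        return read_source()
    finally:
        # Do not block on a slow cache read that is still in flight.
        pool.shutdown(wait=False)
```

A caller would pass closures over the two file systems, for example `read_with_fallback(lambda: alluxio_read(path), lambda: hdfs_read(path))`, where both read functions are placeholders.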
Caching is insufficient
Benchmark
● 30% latency reduction on sample SQLs in production
● On TPC-DS, the average latency reduction falls to 17%
Learning
● Need to identify the IO-intensive SQLs to maximize resource utilization
Customized Cache Strategy
● Collect the time spent in TableScanOperator & ScanFilterAndProjectOperator
● Aggregate the top N most time-consuming partitions over the past M days
● Knapsack problem: given a fixed amount of Alluxio space, find the best set of partitions (and TTL)
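The aggregation and selection steps above can be sketched as follows. This is a minimal illustration, not the production strategy: the record format, the integer-GB sizes, and the function names are assumptions, and the sketch solves a plain 0/1 knapsack that ignores the TTL dimension mentioned on the slide.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def top_partitions(scan_records: Iterable[Tuple[str, float]], n: int) -> List[Tuple[str, float]]:
    """Sum scan seconds per partition and return the n most expensive ones."""
    totals = defaultdict(float)
    for partition, seconds in scan_records:
        totals[partition] += seconds
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


def choose_partitions(
    candidates: List[Tuple[str, int, float]], capacity_gb: int
) -> Tuple[float, List[str]]:
    """0/1 knapsack over (partition, size_gb, scan_seconds) candidates.

    Returns the maximum total scan time that fits in capacity_gb and the
    partitions achieving it.
    """
    # best[c] = (total scan seconds, chosen partitions) using at most c GB
    best: List[Tuple[float, List[str]]] = [(0.0, [])] * (capacity_gb + 1)
    for name, size, value in candidates:
        # Iterate capacity downwards so each partition is used at most once.
        for c in range(capacity_gb, size - 1, -1):
            candidate_value = best[c - size][0] + value
            if candidate_value > best[c][0]:
                best[c] = (candidate_value, best[c - size][1] + [name])
    return best[capacity_gb]
```

The dynamic program costs O(partitions × capacity), so a real deployment with many partitions might prefer a coarser size unit or a greedy value-per-GB heuristic; that trade-off is not discussed in the slides.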
Cache Scheduler
Trigger
● Subscribe to the HMS changelog for AddPartition, AlterPartition, and DropPartition events
● Compare against the Cache Strategy to determine whether the changed partition is cacheable
Mount & Cleanup
● Cacheable partitions are mounted in Alluxio before cachePath is added to HMS
● A cron job removes cachePath from HMS and unmounts the partition from Alluxio based on the TTL defined in the cache strategy
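The trigger, mount, and cleanup flow can be sketched with in-memory stand-ins. Everything except the event names and the `cachePath` parameter is hypothetical: the class, the dictionaries replacing Alluxio and HMS, the table-level cacheability check, and the `alluxio://cache/...` path format are assumptions for illustration only.

```python
import time
from typing import Dict, List, Optional, Set


class CacheScheduler:
    """In-memory sketch of the scheduler; no real HMS or Alluxio calls."""

    def __init__(self, cacheable_tables: Set[str], ttl_seconds: float):
        self.cacheable_tables = cacheable_tables  # output of the cache strategy (assumed per table)
        self.ttl_seconds = ttl_seconds
        self.mounted: Dict[str, float] = {}       # stand-in for Alluxio: partition -> mount time
        self.hms_cache_path: Dict[str, str] = {}  # stand-in for the HMS cachePath parameter

    def on_event(self, event_type: str, table: str, partition: str,
                 now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        key = f"{table}/{partition}"
        if event_type == "DropPartition":
            self._evict(key)
        elif event_type in ("AddPartition", "AlterPartition") and table in self.cacheable_tables:
            self.mounted[key] = now                              # 1. mount in Alluxio first
            self.hms_cache_path[key] = f"alluxio://cache/{key}"  # 2. then publish cachePath

    def cleanup(self, now: Optional[float] = None) -> List[str]:
        """Cron-style pass: evict partitions whose TTL has expired."""
        now = time.time() if now is None else now
        expired = [k for k, t in self.mounted.items() if now - t >= self.ttl_seconds]
        for key in expired:
            self._evict(key)
        return expired

    def _evict(self, key: str) -> None:
        self.hms_cache_path.pop(key, None)  # remove cachePath so readers stop using it
        self.mounted.pop(key, None)         # then unmount from Alluxio
```

Mounting before publishing cachePath, and removing cachePath before unmounting, means a reader should never be handed an Alluxio path that is not mounted.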
Recap
Overall Results
● P95 query latency reduced by 41.2%
● With cache disk smaller than 1% of the daily HDFS increment, cache coverage reaches 32% on a weekly basis
● 91.1% of cache-hit SQLs see latency reduced by 20%+
Next Steps
● Experiment with “alluxio-as-lib”
a. Cache Consistency issue-13700
b. Optimized Presto scheduling hash algorithm
c. Adopt Alluxio Structured Data
● Enable write caches on Alluxio to chain ETL jobs
Thanks
TikTok is hiring!
https://careers.tiktok.com
Email: frank.hu@bytedance.com
