Presto updates to 0.178

Presto Meetup at Tokyo

  1. Presto Updates to 0.178 • Kai Sasaki • Treasure Data Inc
  2. Bio • Kai Sasaki (@Lewuathe) • Software Engineer at Treasure Data • Presto Team • Hadoop/Spark/Hivemall Contributor
  3. Presto in Treasure Data
  4. Presto in Treasure Data • Use Presto for query processing • 4.3+ million queries per month • 400 trillion records per month • 6+ PB per month
  5. Presto in Treasure Data • Architecture diagram: presto-client-ruby, Presto Coordinator, three Presto Workers, PostgreSQL, S3
  6. 0.152 -> 0.178
  7. New Features • Lambda Expression • Filtered Aggregation • VALIDATE mode in EXPLAIN • Compressed Exchange • Complex Grouping Operation
  8. Lambda Expression • Use -> to write lambda expressions: https://prestodb.io/docs/current/functions/lambda.html
  9. Filtered Aggregation • Filter rows inside an aggregate function: SELECT sum(a) FILTER (WHERE a > 0) FROM …
  10. VALIDATE mode in EXPLAIN • Syntax check by EXPLAIN
      presto> EXPLAIN (TYPE VALIDATE) SELECT …
       Valid
      -------
       true
      (1 row)
  11. Compressed Exchange • Blocks exchanged between workers are compressed with LZ4 • Enabled by exchange.compression-enabled=true
  12. Complex Grouping Operation • GROUPING SETS replaces UNION ALL + GROUP BY: SELECT host, path, code, AVG(size) FROM www_access GROUP BY GROUPING SETS ( (host), (path), (host, code) );
  13. Complex Grouping Operation • The same query written with UNION ALL + GROUP BY: SELECT host, NULL, NULL, AVG(size) FROM www_access GROUP BY host UNION ALL SELECT NULL, path, NULL, AVG(size) FROM www_access GROUP BY path UNION ALL SELECT host, NULL, code, AVG(size) FROM www_access GROUP BY host, code
  14. New Functions • xxhash64(binary), to_big_endian_64(bigint) • levenshtein_distance(string1, string2) • arrays_overlap(x, y), array_except(x, y) • to_ieee754_32(real), to_ieee754_64(double) • codepoint() • skewness(x), kurtosis(x)
  15. Misc • INT as an alias for INTEGER • Deprecated the sample column for approximate queries (it was experimental) • Allow specifying column comments in CREATE TABLE
  16. Future Work • Presto Meetup: May 10th, 2017 @ Facebook HQ • Members: Facebook, Teradata, Netflix, Uber, etc.
  17. Future Work • Disk Spill (ongoing): https://github.com/prestodb/presto/issues/5144 • Warning Framework: notify users with warnings and give them a grace period to migrate queries to the new style • Cost-based optimizer
  18. CAUTION! • deprecated.legacy-order-by: due to an incompatibility in ORDER BY column resolution • deprecated.legacy-map-subscript: due to an incompatibility in map subscript operator behavior when the key is not present
  19. CAUTION!!! • In 0.179 • “Fix planning failure when GROUPING() is used with the legacy_order_by session property set to true” • https://prestodb.io/docs/current/release/release-0.179.html
  20. Thank you!
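
The relationship between the two Complex Grouping Operation slides (12 and 13) can be sketched in Python. The helper below is hypothetical and not part of Presto: it only builds SQL text, expanding a list of grouping sets into the UNION ALL of plain GROUP BY queries, with NULL in place of every column that a given grouping set leaves out. The function name and its parameters are assumptions made for illustration; the table and column names come from the slides.

```python
def grouping_sets_to_union_all(columns, aggregate, table, grouping_sets):
    """Expand grouping sets into an equivalent UNION ALL of GROUP BY queries.

    Illustrative helper only (not a Presto API): it produces SQL text and
    does not parse or validate anything.
    """
    branches = []
    for group in grouping_sets:
        # Columns outside the current grouping set are projected as NULL.
        select_list = [col if col in group else "NULL" for col in columns]
        select_list.append(aggregate)
        sql = f"SELECT {', '.join(select_list)} FROM {table}"
        if group:  # an empty grouping set aggregates over the whole table
            sql += f" GROUP BY {', '.join(group)}"
        branches.append(sql)
    return " UNION ALL ".join(branches)


query = grouping_sets_to_union_all(
    columns=["host", "path", "code"],
    aggregate="AVG(size)",
    table="www_access",
    grouping_sets=[("host",), ("path",), ("host", "code")],
)
print(query)
```

With the three grouping sets from slide 12, the generated text matches the UNION ALL query on slide 13. The sketch ignores details a real rewrite would need, such as casting the NULL placeholders to the column types.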
