SlideShare a Scribd company logo
1 of 20
Download to read offline
Presto Updates to 0.178
Kai Sasaki
Treasure Data Inc
Bio
• Kai Sasaki (@Lewuathe)
• Software Engineer at Treasure Data
• Presto Team
• Hadoop/Spark/Hivemall Contributor
Presto In Treasure Data
Presto In Treasure Data
• Use Presto for query processing
• 4.3+ million queries per month
• 400 trillion records per month
• 6+ PB per month
Presto In Treasure Data
Presto
Coordinator
Presto
Worker
Presto
Worker
Presto
Worker
PostgreSQL
S3
presto-
client-ruby
0.152 -> 0.178
New Features
• Lambda Expression
• Filtered Aggregation
• VALIDATE mode in EXPLAIN
• Compressed Exchange
• Complex Grouping Operation
Lambda Expression
• Use -> in lambda function
https://prestodb.io/docs/current/functions/lambda.html
Filtered Aggregation
• Filtering inside aggregation function
SELECT
sum(a) FILTER (WHERE a > 0)
FROM
…
VALIDATE mode in EXPLAIN
• Syntax check by EXPLAIN
presto> EXPLAIN (type VALIDATE) SELECT …
Valid
———
true
(1 row)
Compressed Exchange
• Block exchanged between workers

are compressed in LZ4
• Enabled by

exchange.compression-enabled=true
Complex Grouping Operation
• UNION ALL + GROUP BY
SELECT host, path, code, AVG(size)
FROM www_access
GROUP BY GROUPING SETS (
(host),
(path),
(host,code)
);
Complex Grouping Operation
• UNION ALL + GROUP BY
SELECT host, NULL, NULL, AVG(size)
FROM www_access GROUP BY host
UNION ALL
SELECT NULL, path, NULL, AVG(size)
FROM www_access GROUP BY path
UNION ALL
SELECT host, NULL, code, AVG(size)
FROM www_access GROUP BY host, code
New Functions
• xxhash64(binary), to_big_endian_64(bigint)
• levenshtein_distance(string1,string2)
• array_overlap(x, y), array_except(x, y)
• to_ieee754_32(real), to_ieee754_64(double)
• codepoint()
• skewness(x), kurtosis(x)
Misc
• INT as alias for INTEGER
• Deprecated sample column for 

approximate query (experimental though)
• Allow specifying column comments

for CREATE TABLE
Future Works
• Presto Meetup - May 10th, 2017 

@ Facebook HQ
• Members
• Facebook, Teradata, Netflix, Uber etc
Future Works
• Disk Spill (on-going)

https://github.com/prestodb/presto/issues/5144
• Warning Framework

Notify warning and have a grace period so that users can
migrate queries to a new style
• Cost based optimizer

CAUTION!
• deprecated.legacy-order-by

Due to incompatibility of ORDER BY column
resolution
• deprecated.legacy-map-subscript

Due to incompatibility of map subscript
operator behavior if the key is not present
CAUTION!!!
• In 0.179
• “Fix planning failure when GROUPING() is
used with the legacy_order_by session
property set to true”
• https://prestodb.io/docs/current/release/
release-0.179.html
Thank you!

More Related Content

What's hot

20140120 presto meetup_en
20140120 presto meetup_en20140120 presto meetup_en
20140120 presto meetup_enOgibayashi
 
Presto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talkPresto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talkkbajda
 
Presto at Facebook - Presto Meetup @ Boston (10/6/2015)
Presto at Facebook - Presto Meetup @ Boston (10/6/2015)Presto at Facebook - Presto Meetup @ Boston (10/6/2015)
Presto at Facebook - Presto Meetup @ Boston (10/6/2015)Martin Traverso
 
Presto at Hadoop Summit 2016
Presto at Hadoop Summit 2016Presto at Hadoop Summit 2016
Presto at Hadoop Summit 2016kbajda
 
Presto @ Treasure Data - Presto Meetup Boston 2015
Presto @ Treasure Data - Presto Meetup Boston 2015Presto @ Treasure Data - Presto Meetup Boston 2015
Presto @ Treasure Data - Presto Meetup Boston 2015Taro L. Saito
 
Presto meetup 2015-03-19 @Facebook
Presto meetup 2015-03-19 @FacebookPresto meetup 2015-03-19 @Facebook
Presto meetup 2015-03-19 @FacebookTreasure Data, Inc.
 
Presto - Analytical Database. Overview and use cases.
Presto - Analytical Database. Overview and use cases.Presto - Analytical Database. Overview and use cases.
Presto - Analytical Database. Overview and use cases.Wojciech Biela
 
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)Matt Fuller
 
Presto anatomy
Presto anatomyPresto anatomy
Presto anatomyDongmin Yu
 
Expand data analysis tool at scale with Zeppelin
Expand data analysis tool at scale with ZeppelinExpand data analysis tool at scale with Zeppelin
Expand data analysis tool at scale with ZeppelinDataWorks Summit
 
Presto @ Facebook: Past, Present and Future
Presto @ Facebook: Past, Present and FuturePresto @ Facebook: Past, Present and Future
Presto @ Facebook: Past, Present and FutureDataWorks Summit
 
Hoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on SparkHoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on SparkVinoth Chandar
 
Tale of ISUCON and Its Bench Tools
Tale of ISUCON and Its Bench ToolsTale of ISUCON and Its Bench Tools
Tale of ISUCON and Its Bench ToolsSATOSHI TAGOMORI
 
Big Data Camp LA 2014 - Apache Tajo: A Big Data Warehouse System on Hadoop
Big Data Camp LA 2014 - Apache Tajo: A Big Data Warehouse System on HadoopBig Data Camp LA 2014 - Apache Tajo: A Big Data Warehouse System on Hadoop
Big Data Camp LA 2014 - Apache Tajo: A Big Data Warehouse System on HadoopGruter
 
Presto Meetup (2015-03-19)
Presto Meetup (2015-03-19)Presto Meetup (2015-03-19)
Presto Meetup (2015-03-19)Dain Sundstrom
 
Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Sadayuki Furuhashi
 
Presto as a Service - Tips for operation and monitoring
Presto as a Service - Tips for operation and monitoringPresto as a Service - Tips for operation and monitoring
Presto as a Service - Tips for operation and monitoringTaro L. Saito
 
Distributed Logging Architecture in Container Era
Distributed Logging Architecture in Container EraDistributed Logging Architecture in Container Era
Distributed Logging Architecture in Container EraSATOSHI TAGOMORI
 
Data Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby UsageData Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby UsageSATOSHI TAGOMORI
 
Very Large Data Files, Object Stores, and Deep Learning—Lessons Learned While...
Very Large Data Files, Object Stores, and Deep Learning—Lessons Learned While...Very Large Data Files, Object Stores, and Deep Learning—Lessons Learned While...
Very Large Data Files, Object Stores, and Deep Learning—Lessons Learned While...Databricks
 

What's hot (20)

20140120 presto meetup_en
20140120 presto meetup_en20140120 presto meetup_en
20140120 presto meetup_en
 
Presto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talkPresto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talk
 
Presto at Facebook - Presto Meetup @ Boston (10/6/2015)
Presto at Facebook - Presto Meetup @ Boston (10/6/2015)Presto at Facebook - Presto Meetup @ Boston (10/6/2015)
Presto at Facebook - Presto Meetup @ Boston (10/6/2015)
 
Presto at Hadoop Summit 2016
Presto at Hadoop Summit 2016Presto at Hadoop Summit 2016
Presto at Hadoop Summit 2016
 
Presto @ Treasure Data - Presto Meetup Boston 2015
Presto @ Treasure Data - Presto Meetup Boston 2015Presto @ Treasure Data - Presto Meetup Boston 2015
Presto @ Treasure Data - Presto Meetup Boston 2015
 
Presto meetup 2015-03-19 @Facebook
Presto meetup 2015-03-19 @FacebookPresto meetup 2015-03-19 @Facebook
Presto meetup 2015-03-19 @Facebook
 
Presto - Analytical Database. Overview and use cases.
Presto - Analytical Database. Overview and use cases.Presto - Analytical Database. Overview and use cases.
Presto - Analytical Database. Overview and use cases.
 
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
 
Presto anatomy
Presto anatomyPresto anatomy
Presto anatomy
 
Expand data analysis tool at scale with Zeppelin
Expand data analysis tool at scale with ZeppelinExpand data analysis tool at scale with Zeppelin
Expand data analysis tool at scale with Zeppelin
 
Presto @ Facebook: Past, Present and Future
Presto @ Facebook: Past, Present and FuturePresto @ Facebook: Past, Present and Future
Presto @ Facebook: Past, Present and Future
 
Hoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on SparkHoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on Spark
 
Tale of ISUCON and Its Bench Tools
Tale of ISUCON and Its Bench ToolsTale of ISUCON and Its Bench Tools
Tale of ISUCON and Its Bench Tools
 
Big Data Camp LA 2014 - Apache Tajo: A Big Data Warehouse System on Hadoop
Big Data Camp LA 2014 - Apache Tajo: A Big Data Warehouse System on HadoopBig Data Camp LA 2014 - Apache Tajo: A Big Data Warehouse System on Hadoop
Big Data Camp LA 2014 - Apache Tajo: A Big Data Warehouse System on Hadoop
 
Presto Meetup (2015-03-19)
Presto Meetup (2015-03-19)Presto Meetup (2015-03-19)
Presto Meetup (2015-03-19)
 
Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1
 
Presto as a Service - Tips for operation and monitoring
Presto as a Service - Tips for operation and monitoringPresto as a Service - Tips for operation and monitoring
Presto as a Service - Tips for operation and monitoring
 
Distributed Logging Architecture in Container Era
Distributed Logging Architecture in Container EraDistributed Logging Architecture in Container Era
Distributed Logging Architecture in Container Era
 
Data Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby UsageData Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby Usage
 
Very Large Data Files, Object Stores, and Deep Learning—Lessons Learned While...
Very Large Data Files, Object Stores, and Deep Learning—Lessons Learned While...Very Large Data Files, Object Stores, and Deep Learning—Lessons Learned While...
Very Large Data Files, Object Stores, and Deep Learning—Lessons Learned While...
 

Viewers also liked

A (too) Short Introduction to Scala
A (too) Short Introduction to ScalaA (too) Short Introduction to Scala
A (too) Short Introduction to ScalaRiccardo Cardin
 
Java - Concurrent programming - Thread's basics
Java - Concurrent programming - Thread's basicsJava - Concurrent programming - Thread's basics
Java - Concurrent programming - Thread's basicsRiccardo Cardin
 
Errori comuni nei documenti di Analisi dei Requisiti
Errori comuni nei documenti di Analisi dei RequisitiErrori comuni nei documenti di Analisi dei Requisiti
Errori comuni nei documenti di Analisi dei RequisitiRiccardo Cardin
 
Introduzione ai Design Pattern
Introduzione ai Design PatternIntroduzione ai Design Pattern
Introduzione ai Design PatternRiccardo Cardin
 
Design Pattern Strutturali
Design Pattern StrutturaliDesign Pattern Strutturali
Design Pattern StrutturaliRiccardo Cardin
 
Java Graphics Programming
Java Graphics ProgrammingJava Graphics Programming
Java Graphics ProgrammingRiccardo Cardin
 
Java- Concurrent programming - Synchronization (part 2)
Java- Concurrent programming - Synchronization (part 2)Java- Concurrent programming - Synchronization (part 2)
Java- Concurrent programming - Synchronization (part 2)Riccardo Cardin
 
Java- Concurrent programming - Synchronization (part 1)
Java- Concurrent programming - Synchronization (part 1)Java- Concurrent programming - Synchronization (part 1)
Java- Concurrent programming - Synchronization (part 1)Riccardo Cardin
 
Java - Processing input and output
Java - Processing input and outputJava - Processing input and output
Java - Processing input and outputRiccardo Cardin
 
Design pattern architetturali Model View Controller, MVP e MVVM
Design pattern architetturali   Model View Controller, MVP e MVVMDesign pattern architetturali   Model View Controller, MVP e MVVM
Design pattern architetturali Model View Controller, MVP e MVVMRiccardo Cardin
 
Design Pattern Architetturali - Dependency Injection
Design Pattern Architetturali - Dependency InjectionDesign Pattern Architetturali - Dependency Injection
Design Pattern Architetturali - Dependency InjectionRiccardo Cardin
 
Java - Concurrent programming - Thread's advanced concepts
Java - Concurrent programming - Thread's advanced conceptsJava - Concurrent programming - Thread's advanced concepts
Java - Concurrent programming - Thread's advanced conceptsRiccardo Cardin
 
Java Exception Handling, Assertions and Logging
Java Exception Handling, Assertions and LoggingJava Exception Handling, Assertions and Logging
Java Exception Handling, Assertions and LoggingRiccardo Cardin
 
Scala For Java Programmers
Scala For Java ProgrammersScala For Java Programmers
Scala For Java ProgrammersEnno Runne
 
Java - Remote method invocation
Java - Remote method invocationJava - Remote method invocation
Java - Remote method invocationRiccardo Cardin
 
Software architecture patterns
Software architecture patternsSoftware architecture patterns
Software architecture patternsRiccardo Cardin
 
SOLID - Principles of Object Oriented Design
SOLID - Principles of Object Oriented DesignSOLID - Principles of Object Oriented Design
SOLID - Principles of Object Oriented DesignRiccardo Cardin
 

Viewers also liked (20)

A (too) Short Introduction to Scala
A (too) Short Introduction to ScalaA (too) Short Introduction to Scala
A (too) Short Introduction to Scala
 
Java - Concurrent programming - Thread's basics
Java - Concurrent programming - Thread's basicsJava - Concurrent programming - Thread's basics
Java - Concurrent programming - Thread's basics
 
Diagrammi di Sequenza
Diagrammi di SequenzaDiagrammi di Sequenza
Diagrammi di Sequenza
 
Errori comuni nei documenti di Analisi dei Requisiti
Errori comuni nei documenti di Analisi dei RequisitiErrori comuni nei documenti di Analisi dei Requisiti
Errori comuni nei documenti di Analisi dei Requisiti
 
Introduzione ai Design Pattern
Introduzione ai Design PatternIntroduzione ai Design Pattern
Introduzione ai Design Pattern
 
Diagrammi delle Classi
Diagrammi delle ClassiDiagrammi delle Classi
Diagrammi delle Classi
 
Design Pattern Strutturali
Design Pattern StrutturaliDesign Pattern Strutturali
Design Pattern Strutturali
 
Java Graphics Programming
Java Graphics ProgrammingJava Graphics Programming
Java Graphics Programming
 
Java- Concurrent programming - Synchronization (part 2)
Java- Concurrent programming - Synchronization (part 2)Java- Concurrent programming - Synchronization (part 2)
Java- Concurrent programming - Synchronization (part 2)
 
Java- Concurrent programming - Synchronization (part 1)
Java- Concurrent programming - Synchronization (part 1)Java- Concurrent programming - Synchronization (part 1)
Java- Concurrent programming - Synchronization (part 1)
 
Java - Processing input and output
Java - Processing input and outputJava - Processing input and output
Java - Processing input and output
 
Design pattern architetturali Model View Controller, MVP e MVVM
Design pattern architetturali   Model View Controller, MVP e MVVMDesign pattern architetturali   Model View Controller, MVP e MVVM
Design pattern architetturali Model View Controller, MVP e MVVM
 
Design Pattern Architetturali - Dependency Injection
Design Pattern Architetturali - Dependency InjectionDesign Pattern Architetturali - Dependency Injection
Design Pattern Architetturali - Dependency Injection
 
Java - Concurrent programming - Thread's advanced concepts
Java - Concurrent programming - Thread's advanced conceptsJava - Concurrent programming - Thread's advanced concepts
Java - Concurrent programming - Thread's advanced concepts
 
Java Exception Handling, Assertions and Logging
Java Exception Handling, Assertions and LoggingJava Exception Handling, Assertions and Logging
Java Exception Handling, Assertions and Logging
 
Scala For Java Programmers
Scala For Java ProgrammersScala For Java Programmers
Scala For Java Programmers
 
Java - Remote method invocation
Java - Remote method invocationJava - Remote method invocation
Java - Remote method invocation
 
Software architecture patterns
Software architecture patternsSoftware architecture patterns
Software architecture patterns
 
SOLID - Principles of Object Oriented Design
SOLID - Principles of Object Oriented DesignSOLID - Principles of Object Oriented Design
SOLID - Principles of Object Oriented Design
 
Java - Sockets
Java - SocketsJava - Sockets
Java - Sockets
 

Similar to Presto Updates to 0.178 Summary

What is MariaDB Server 10.3?
What is MariaDB Server 10.3?What is MariaDB Server 10.3?
What is MariaDB Server 10.3?Colin Charles
 
Facebook Presto presentation
Facebook Presto presentationFacebook Presto presentation
Facebook Presto presentationCyanny LIANG
 
Tajolabigdatacamp2014 140618135810-phpapp01 hyunsik-choi
Tajolabigdatacamp2014 140618135810-phpapp01 hyunsik-choiTajolabigdatacamp2014 140618135810-phpapp01 hyunsik-choi
Tajolabigdatacamp2014 140618135810-phpapp01 hyunsik-choiData Con LA
 
Migrating from matlab to python
Migrating from matlab to pythonMigrating from matlab to python
Migrating from matlab to pythonActiveState
 
PostgreSQL 9.4: NoSQL on ACID
PostgreSQL 9.4: NoSQL on ACIDPostgreSQL 9.4: NoSQL on ACID
PostgreSQL 9.4: NoSQL on ACIDOleg Bartunov
 
Типы данных JSONb, соответствующие индексы и модуль jsquery – Олег Бартунов, ...
Типы данных JSONb, соответствующие индексы и модуль jsquery – Олег Бартунов, ...Типы данных JSONb, соответствующие индексы и модуль jsquery – Олег Бартунов, ...
Типы данных JSONb, соответствующие индексы и модуль jsquery – Олег Бартунов, ...Yandex
 
PostgreSQL Moscow Meetup - September 2014 - Oleg Bartunov and Alexander Korotkov
PostgreSQL Moscow Meetup - September 2014 - Oleg Bartunov and Alexander KorotkovPostgreSQL Moscow Meetup - September 2014 - Oleg Bartunov and Alexander Korotkov
PostgreSQL Moscow Meetup - September 2014 - Oleg Bartunov and Alexander KorotkovNikolay Samokhvalov
 
Presto changes
Presto changesPresto changes
Presto changesN Masahiro
 
Walkthrough Neo4j 1.9 & 2.0
Walkthrough Neo4j 1.9 & 2.0Walkthrough Neo4j 1.9 & 2.0
Walkthrough Neo4j 1.9 & 2.0Neo4j
 
Presto query optimizer: pursuit of performance
Presto query optimizer: pursuit of performancePresto query optimizer: pursuit of performance
Presto query optimizer: pursuit of performanceDataWorks Summit
 
Introduction to libre « fulltext » technology
Introduction to libre « fulltext » technologyIntroduction to libre « fulltext » technology
Introduction to libre « fulltext » technologyRobert Viseur
 
Building a Complex, Real-Time Data Management Application
Building a Complex, Real-Time Data Management ApplicationBuilding a Complex, Real-Time Data Management Application
Building a Complex, Real-Time Data Management ApplicationJonathan Katz
 
Faster Faster Faster! Datamarts with Hive at Yahoo
Faster Faster Faster! Datamarts with Hive at YahooFaster Faster Faster! Datamarts with Hive at Yahoo
Faster Faster Faster! Datamarts with Hive at YahooMithun Radhakrishnan
 
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on Hive
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on HiveFaster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on Hive
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on HiveDataWorks Summit/Hadoop Summit
 
The Why and How of Scala at Twitter
The Why and How of Scala at TwitterThe Why and How of Scala at Twitter
The Why and How of Scala at TwitterAlex Payne
 
FiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
FiloDB: Reactive, Real-Time, In-Memory Time Series at ScaleFiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
FiloDB: Reactive, Real-Time, In-Memory Time Series at ScaleEvan Chan
 
Cascading introduction
Cascading introductionCascading introduction
Cascading introductionAlex Su
 

Similar to Presto Updates to 0.178 Summary (20)

What is MariaDB Server 10.3?
What is MariaDB Server 10.3?What is MariaDB Server 10.3?
What is MariaDB Server 10.3?
 
Facebook Presto presentation
Facebook Presto presentationFacebook Presto presentation
Facebook Presto presentation
 
Tajolabigdatacamp2014 140618135810-phpapp01 hyunsik-choi
Tajolabigdatacamp2014 140618135810-phpapp01 hyunsik-choiTajolabigdatacamp2014 140618135810-phpapp01 hyunsik-choi
Tajolabigdatacamp2014 140618135810-phpapp01 hyunsik-choi
 
Migrating from matlab to python
Migrating from matlab to pythonMigrating from matlab to python
Migrating from matlab to python
 
PostgreSQL 9.4: NoSQL on ACID
PostgreSQL 9.4: NoSQL on ACIDPostgreSQL 9.4: NoSQL on ACID
PostgreSQL 9.4: NoSQL on ACID
 
Dapper
DapperDapper
Dapper
 
Типы данных JSONb, соответствующие индексы и модуль jsquery – Олег Бартунов, ...
Типы данных JSONb, соответствующие индексы и модуль jsquery – Олег Бартунов, ...Типы данных JSONb, соответствующие индексы и модуль jsquery – Олег Бартунов, ...
Типы данных JSONb, соответствующие индексы и модуль jsquery – Олег Бартунов, ...
 
PostgreSQL Moscow Meetup - September 2014 - Oleg Bartunov and Alexander Korotkov
PostgreSQL Moscow Meetup - September 2014 - Oleg Bartunov and Alexander KorotkovPostgreSQL Moscow Meetup - September 2014 - Oleg Bartunov and Alexander Korotkov
PostgreSQL Moscow Meetup - September 2014 - Oleg Bartunov and Alexander Korotkov
 
Presto changes
Presto changesPresto changes
Presto changes
 
Walkthrough Neo4j 1.9 & 2.0
Walkthrough Neo4j 1.9 & 2.0Walkthrough Neo4j 1.9 & 2.0
Walkthrough Neo4j 1.9 & 2.0
 
Presto query optimizer: pursuit of performance
Presto query optimizer: pursuit of performancePresto query optimizer: pursuit of performance
Presto query optimizer: pursuit of performance
 
Introduction to libre « fulltext » technology
Introduction to libre « fulltext » technologyIntroduction to libre « fulltext » technology
Introduction to libre « fulltext » technology
 
An intro to Azure Data Lake
An intro to Azure Data LakeAn intro to Azure Data Lake
An intro to Azure Data Lake
 
Building a Complex, Real-Time Data Management Application
Building a Complex, Real-Time Data Management ApplicationBuilding a Complex, Real-Time Data Management Application
Building a Complex, Real-Time Data Management Application
 
Python redis talk
Python redis talkPython redis talk
Python redis talk
 
Faster Faster Faster! Datamarts with Hive at Yahoo
Faster Faster Faster! Datamarts with Hive at YahooFaster Faster Faster! Datamarts with Hive at Yahoo
Faster Faster Faster! Datamarts with Hive at Yahoo
 
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on Hive
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on HiveFaster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on Hive
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on Hive
 
The Why and How of Scala at Twitter
The Why and How of Scala at TwitterThe Why and How of Scala at Twitter
The Why and How of Scala at Twitter
 
FiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
FiloDB: Reactive, Real-Time, In-Memory Time Series at ScaleFiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
FiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
 
Cascading introduction
Cascading introductionCascading introduction
Cascading introduction
 

More from Kai Sasaki

Graviton 2で実現する
コスト効率のよいCDP基盤
Graviton 2で実現する
コスト効率のよいCDP基盤Graviton 2で実現する
コスト効率のよいCDP基盤
Graviton 2で実現する
コスト効率のよいCDP基盤Kai Sasaki
 
Infrastructure for auto scaling distributed system
Infrastructure for auto scaling distributed systemInfrastructure for auto scaling distributed system
Infrastructure for auto scaling distributed systemKai Sasaki
 
Continuous Optimization for Distributed BigData Analysis
Continuous Optimization for Distributed BigData AnalysisContinuous Optimization for Distributed BigData Analysis
Continuous Optimization for Distributed BigData AnalysisKai Sasaki
 
Recent Changes and Challenges for Future Presto
Recent Changes and Challenges for Future PrestoRecent Changes and Challenges for Future Presto
Recent Changes and Challenges for Future PrestoKai Sasaki
 
Real World Storage in Treasure Data
Real World Storage in Treasure DataReal World Storage in Treasure Data
Real World Storage in Treasure DataKai Sasaki
 
20180522 infra autoscaling_system
20180522 infra autoscaling_system20180522 infra autoscaling_system
20180522 infra autoscaling_systemKai Sasaki
 
User Defined Partitioning on PlazmaDB
User Defined Partitioning on PlazmaDBUser Defined Partitioning on PlazmaDB
User Defined Partitioning on PlazmaDBKai Sasaki
 
Deep dive into deeplearn.js
Deep dive into deeplearn.jsDeep dive into deeplearn.js
Deep dive into deeplearn.jsKai Sasaki
 
Optimizing Presto Connector on Cloud Storage
Optimizing Presto Connector on Cloud StorageOptimizing Presto Connector on Cloud Storage
Optimizing Presto Connector on Cloud StorageKai Sasaki
 
Managing multi tenant resource toward Hive 2.0
Managing multi tenant resource toward Hive 2.0Managing multi tenant resource toward Hive 2.0
Managing multi tenant resource toward Hive 2.0Kai Sasaki
 
Embulk makes Japan visible
Embulk makes Japan visibleEmbulk makes Japan visible
Embulk makes Japan visibleKai Sasaki
 
Maintainable cloud architecture_of_hadoop
Maintainable cloud architecture_of_hadoopMaintainable cloud architecture_of_hadoop
Maintainable cloud architecture_of_hadoopKai Sasaki
 
図でわかるHDFS Erasure Coding
図でわかるHDFS Erasure Coding図でわかるHDFS Erasure Coding
図でわかるHDFS Erasure CodingKai Sasaki
 
Spark MLlib code reading ~optimization~
Spark MLlib code reading ~optimization~Spark MLlib code reading ~optimization~
Spark MLlib code reading ~optimization~Kai Sasaki
 
How I tried MADE
How I tried MADEHow I tried MADE
How I tried MADEKai Sasaki
 
Reading kernel org
Reading kernel orgReading kernel org
Reading kernel orgKai Sasaki
 
Kernel bootstrap
Kernel bootstrapKernel bootstrap
Kernel bootstrapKai Sasaki
 
HyperLogLogを用いた、異なり数に基づく
 省リソースなk-meansの
k決定アルゴリズムの提案
HyperLogLogを用いた、異なり数に基づく
 省リソースなk-meansの
k決定アルゴリズムの提案HyperLogLogを用いた、異なり数に基づく
 省リソースなk-meansの
k決定アルゴリズムの提案
HyperLogLogを用いた、異なり数に基づく
 省リソースなk-meansの
k決定アルゴリズムの提案Kai Sasaki
 

More from Kai Sasaki (20)

Graviton 2で実現する
コスト効率のよいCDP基盤
Graviton 2で実現する
コスト効率のよいCDP基盤Graviton 2で実現する
コスト効率のよいCDP基盤
Graviton 2で実現する
コスト効率のよいCDP基盤
 
Infrastructure for auto scaling distributed system
Infrastructure for auto scaling distributed systemInfrastructure for auto scaling distributed system
Infrastructure for auto scaling distributed system
 
Continuous Optimization for Distributed BigData Analysis
Continuous Optimization for Distributed BigData AnalysisContinuous Optimization for Distributed BigData Analysis
Continuous Optimization for Distributed BigData Analysis
 
Recent Changes and Challenges for Future Presto
Recent Changes and Challenges for Future PrestoRecent Changes and Challenges for Future Presto
Recent Changes and Challenges for Future Presto
 
Real World Storage in Treasure Data
Real World Storage in Treasure DataReal World Storage in Treasure Data
Real World Storage in Treasure Data
 
20180522 infra autoscaling_system
20180522 infra autoscaling_system20180522 infra autoscaling_system
20180522 infra autoscaling_system
 
User Defined Partitioning on PlazmaDB
User Defined Partitioning on PlazmaDBUser Defined Partitioning on PlazmaDB
User Defined Partitioning on PlazmaDB
 
Deep dive into deeplearn.js
Deep dive into deeplearn.jsDeep dive into deeplearn.js
Deep dive into deeplearn.js
 
Optimizing Presto Connector on Cloud Storage
Optimizing Presto Connector on Cloud StorageOptimizing Presto Connector on Cloud Storage
Optimizing Presto Connector on Cloud Storage
 
Managing multi tenant resource toward Hive 2.0
Managing multi tenant resource toward Hive 2.0Managing multi tenant resource toward Hive 2.0
Managing multi tenant resource toward Hive 2.0
 
Embulk makes Japan visible
Embulk makes Japan visibleEmbulk makes Japan visible
Embulk makes Japan visible
 
Maintainable cloud architecture_of_hadoop
Maintainable cloud architecture_of_hadoopMaintainable cloud architecture_of_hadoop
Maintainable cloud architecture_of_hadoop
 
図でわかるHDFS Erasure Coding
図でわかるHDFS Erasure Coding図でわかるHDFS Erasure Coding
図でわかるHDFS Erasure Coding
 
Spark MLlib code reading ~optimization~
Spark MLlib code reading ~optimization~Spark MLlib code reading ~optimization~
Spark MLlib code reading ~optimization~
 
How I tried MADE
How I tried MADEHow I tried MADE
How I tried MADE
 
Reading kernel org
Reading kernel orgReading kernel org
Reading kernel org
 
Reading drill
Reading drillReading drill
Reading drill
 
Kernel ext4
Kernel ext4Kernel ext4
Kernel ext4
 
Kernel bootstrap
Kernel bootstrapKernel bootstrap
Kernel bootstrap
 
HyperLogLogを用いた、異なり数に基づく
 省リソースなk-meansの
k決定アルゴリズムの提案
HyperLogLogを用いた、異なり数に基づく
 省リソースなk-meansの
k決定アルゴリズムの提案HyperLogLogを用いた、異なり数に基づく
 省リソースなk-meansの
k決定アルゴリズムの提案
HyperLogLogを用いた、異なり数に基づく
 省リソースなk-meansの
k決定アルゴリズムの提案
 

Recently uploaded

NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...
NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...
NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...Amil Baba Dawood bangali
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleAlluxio, Inc.
 
Industrial Safety Unit-IV workplace health and safety.ppt
Industrial Safety Unit-IV workplace health and safety.pptIndustrial Safety Unit-IV workplace health and safety.ppt
Industrial Safety Unit-IV workplace health and safety.pptNarmatha D
 
Industrial Safety Unit-I SAFETY TERMINOLOGIES
Industrial Safety Unit-I SAFETY TERMINOLOGIESIndustrial Safety Unit-I SAFETY TERMINOLOGIES
Industrial Safety Unit-I SAFETY TERMINOLOGIESNarmatha D
 
Vishratwadi & Ghorpadi Bridge Tender documents
Vishratwadi & Ghorpadi Bridge Tender documentsVishratwadi & Ghorpadi Bridge Tender documents
Vishratwadi & Ghorpadi Bridge Tender documentsSachinPawar510423
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncssuser2ae721
 
National Level Hackathon Participation Certificate.pdf
National Level Hackathon Participation Certificate.pdfNational Level Hackathon Participation Certificate.pdf
National Level Hackathon Participation Certificate.pdfRajuKanojiya4
 
Virtual memory management in Operating System
Virtual memory management in Operating SystemVirtual memory management in Operating System
Virtual memory management in Operating SystemRashmi Bhat
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
Research Methodology for Engineering pdf
Research Methodology for Engineering pdfResearch Methodology for Engineering pdf
Research Methodology for Engineering pdfCaalaaAbdulkerim
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
Solving The Right Triangles PowerPoint 2.ppt
Solving The Right Triangles PowerPoint 2.pptSolving The Right Triangles PowerPoint 2.ppt
Solving The Right Triangles PowerPoint 2.pptJasonTagapanGulla
 
Main Memory Management in Operating System
Main Memory Management in Operating SystemMain Memory Management in Operating System
Main Memory Management in Operating SystemRashmi Bhat
 
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONTHE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONjhunlian
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)Dr SOUNDIRARAJ N
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHC Sai Kiran
 
Steel Structures - Building technology.pptx
Steel Structures - Building technology.pptxSteel Structures - Building technology.pptx
Steel Structures - Building technology.pptxNikhil Raut
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 

Recently uploaded (20)

NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...
NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...
NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at Scale
 
Industrial Safety Unit-IV workplace health and safety.ppt
Industrial Safety Unit-IV workplace health and safety.pptIndustrial Safety Unit-IV workplace health and safety.ppt
Industrial Safety Unit-IV workplace health and safety.ppt
 
Industrial Safety Unit-I SAFETY TERMINOLOGIES
Industrial Safety Unit-I SAFETY TERMINOLOGIESIndustrial Safety Unit-I SAFETY TERMINOLOGIES
Industrial Safety Unit-I SAFETY TERMINOLOGIES
 
Vishratwadi & Ghorpadi Bridge Tender documents
Vishratwadi & Ghorpadi Bridge Tender documentsVishratwadi & Ghorpadi Bridge Tender documents
Vishratwadi & Ghorpadi Bridge Tender documents
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
 
National Level Hackathon Participation Certificate.pdf
National Level Hackathon Participation Certificate.pdfNational Level Hackathon Participation Certificate.pdf
National Level Hackathon Participation Certificate.pdf
 
Virtual memory management in Operating System
Virtual memory management in Operating SystemVirtual memory management in Operating System
Virtual memory management in Operating System
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
Research Methodology for Engineering pdf
Research Methodology for Engineering pdfResearch Methodology for Engineering pdf
Research Methodology for Engineering pdf
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Solving The Right Triangles PowerPoint 2.ppt
Solving The Right Triangles PowerPoint 2.pptSolving The Right Triangles PowerPoint 2.ppt
Solving The Right Triangles PowerPoint 2.ppt
 
Main Memory Management in Operating System
Main Memory Management in Operating SystemMain Memory Management in Operating System
Main Memory Management in Operating System
 
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONTHE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECH
 
young call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Serviceyoung call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Service
 
Steel Structures - Building technology.pptx
Steel Structures - Building technology.pptxSteel Structures - Building technology.pptx
Steel Structures - Building technology.pptx
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 

Presto Updates to 0.178 Summary

  • 1. Presto Updates to 0.178 Kai Sasaki Treasure Data Inc
  • 2. Bio • Kai Sasaki (@Lewuathe) • Software Engineer at Treasure Data • Presto Team • Hadoop/Spark/Hivemall Contributor
  • 4. Presto In Treasure Data • Use Presto for query processing • 4.3+ million queries per month • 400 trillion records per month • 6+ PB per month
  • 5. Presto In Treasure Data Presto Coordinator Presto Worker Presto Worker Presto Worker PostgreSQL S3 presto- client-ruby
  • 7. New Features • Lambda Expression • Filtered Aggregation • VALIDATE mode in EXPLAIN • Compressed Exchange • Complex Grouping Operation
  • 8. Lambda Expression • Use -> in lambda function https://prestodb.io/docs/current/functions/lambda.html
  • 9. Filtered Aggregation • Filtering inside aggregation function SELECT sum(a) FILTER (WHERE a > 0) FROM …
  • 10. VALIDATE mode in EXPLAIN • Syntax check by EXPLAIN presto> EXPLAIN (type VALIDATE) SELECT … Valid ——— true (1 row)
  • 11. Compressed Exchange • Block exchanged between workers
 are compressed in LZ4 • Enabled by
 exchange.compression-enabled=true
  • 12. Complex Grouping Operation • UNION ALL + GROUP BY SELECT host, path, code, AVG(size) FROM www_access GROUP BY GROUPING SETS ( (host), (path), (host,code) );
  • 13. Complex Grouping Operation • UNION ALL + GROUP BY SELECT host, NULL, NULL, AVG(size) FROM www_access GROUP BY host UNION ALL SELECT NULL, path, NULL, AVG(size) FROM www_access GROUP BY path UNION ALL SELECT host, NULL, code, AVG(size) FROM www_access GROUP BY host, code
  • 14. New Functions • xxhash64(binary), to_big_endian_64(bigint) • levenshtein_distance(string1,string2) • array_overlap(x, y), array_except(x, y) • to_ieee754_32(real), to_ieee754_64(double) • codepoint() • skewness(x), kurtosis(x)
  • 15. Misc • INT as alias for INTEGER • Deprecated sample column for 
 approximate query (experimental though) • Allow specifying column comments
 for CREATE TABLE
  • 16. Future Works • Presto Meetup - May 10th, 2017 
 @ Facebook HQ • Members • Facebook, Teradata, Netflix, Uber etc
  • 17. Future Works • Disk Spill (on-going)
 https://github.com/prestodb/presto/issues/5144 • Warning Framework
 Notify warning and have a grace period so that users can migrate queries to a new style • Cost based optimizer

  • 18. CAUTION! • deprecated.legacy-order-by
 Due to incompatibility of ORDER BY column resolution • deprecated.legacy-map-subscript
 Due to incompatibility of map subscript operator behavior if the key is not present
  • 19. CAUTION!!! • In 0.179 • “Fix planning failure when GROUPING() is used with the legacy_order_by session property set to true” • https://prestodb.io/docs/current/release/ release-0.179.html