Presto At LINE
Presto Conference Tokyo 2019
2019/07/11
Wataru Yukawa & Yuya Ebihara
Agenda
● USE
○ Our on-premises log analysis platform and tools
○ Yanagishima features
○ Yanagishima internals
○ Presto error query analysis
○ Our Presto/Spark use case with Yanagishima/OASIS
● DEBUG
○ More than 100,000 partitions error
○ Webhdfs partition location does not exist
○ Schema mismatch of parquet
○ Contributions
Data flow
Hadoop
RDBMS
Log
Hadoop
Ingest Data Report
Ranger
Tableau
OASIS
Yanagishima
LINE Analytics
Aquarium
OASIS
● Web-based data analysis platform,
like Apache Zeppelin or Jupyter
● Primarily uses Spark, but can
also use Presto
OASIS - Data Analysis Platform for Multi-tenant Hadoop Cluster
https://www.slideshare.net/linecorp/oasis-data-analysis-platform-for-multitenant-hadoop-cluster
904 UU
67k PV
LINE Analytics
● Analysis tool similar to Google Analytics
○ Dashboard
○ Basic Summary
○ Realtime
○ Page Contents
○ Event Tracking
○ User Environment
○ Tools
● Backend is Presto
Why LINE's Front-end Development Team Built the Web Tracking System
https://www.slideshare.net/linecorp/why-lines-frontend-development-team-built-the-web-tracking-system
433 UU
7k PV
Aquarium
● Metadata catalog tool
○ Contacts
○ Note
○ Columns
○ Location
○ HDFS
○ Relationship
○ Reference
○ DDL
Efficient And Invincible Big Data Platform In LINE
https://www.slideshare.net/linecorp/efficient-and-invincible-big-data-platform-in-line/25
481 UU
10k PV
Yanagishima
1.3k UU
11k PV
● Web UI for
○ Presto
○ Hive
○ Spark SQL
○ Elasticsearch
● Started in 2015
● There are many similar tools like Hue, Airpal, and Shib,
but I wanted to build my own
https://github.com/yanagishima/yanagishima
Agenda
● USE
○ Our on-premises log analysis platform and tools
○ Yanagishima features
○ Yanagishima internals
○ Presto error query analysis
○ Our Presto/Spark use case with Yanagishima/OASIS
● DEBUG
○ More than 100,000 partitions error
○ Webhdfs partition location does not exist
○ Schema mismatch of parquet
○ Contributions
Yanagishima features
● Share query with permanent link
● Handle multiple Presto clusters
● Input parameters
● Pretty print for json & map
● Chart
● Pivot table
● Show EXPLAIN result as Text and Graphviz
● Desktop notification
Input parameters
Pretty print
Chart
● Supported chart types
○ Line
○ Stacked Area
○ Full-Stacked Area
○ Column
○ Stacked Column
Pivot table
Explain
● Text
● Graphviz
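For reference, the two renderings correspond to Presto's EXPLAIN output formats. A minimal sketch (test_table is a hypothetical name):
presto> EXPLAIN SELECT count(*) FROM test_table;                    -- Text
presto> EXPLAIN (FORMAT GRAPHVIZ) SELECT count(*) FROM test_table;  -- Graphviz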
Desktop notification
Agenda
● USE
○ Our on-premises log analysis platform and tools
○ Yanagishima features
○ Yanagishima internals
○ Presto error query analysis
○ Our Presto/Spark use case with Yanagishima/OASIS
● DEBUG
○ More than 100,000 partitions error
○ Webhdfs partition location does not exist
○ Schema mismatch of parquet
○ Contributions
Yanagishima Components
API server
○ Written in Java
○ Stores queries in SQLite3 or MySQL
○ Stores query results on the filesystem
○ No built-in authentication; an in-house auth system in a proxy server handles it
○ No built-in authorization; Apache Ranger handles it
SPA
○ Initially written in jQuery
○ Frontend engineers in our department rewrote it in Vue.js in 2017
○ The code is clean and modern thanks to their major refactoring
How to process query
● Asynchronous processing flow
○ User submits query
○ Receives a query id
○ Tracks progress with the query id via client-side polling
○ Users can see progress and kill the query
● Easy to implement thanks to Presto REST API
● Not easy to implement for Hive and Spark due to the lack of such an API
e.g., it's hard to get the YARN application id
Dependency
● Depends on presto-cli rather than JDBC for performance and feature reasons
● Yanagishima needs not only the query result but also the column names in a single Presto request
● DatabaseMetaData#getColumns is slow (more than 10s) due to a system.jdbc.columns table scan
● Presto didn't support JDBC cancel in 2015, but does now
● We chose presto-cli, but it has drawbacks
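For reference, the JDBC driver answers DatabaseMetaData#getColumns with a query against the system.jdbc.columns table, so the slowness can be observed directly in SQL. A minimal sketch (the hive/default/test_part filter values are illustrative):
presto> SELECT table_cat, table_schem, table_name, column_name, type_name
        FROM system.jdbc.columns
        WHERE table_cat = 'hive' AND table_schem = 'default' AND table_name = 'test_part';
-- even a filtered lookup can take >10s because the scan enumerates all column metadata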
Compatibility issue
● Unfortunately, presto-cli >= 0.205 can't connect to old Presto servers because of the ROW type #224
→ Bundled the new & old presto-cli without shading, which works because the package names differ:
io.prestosql & com.facebook.presto
● We may switch to JDBC, since the PSF mentioned in the above issue that it's better not to use presto-cli
Cluster     Version   Workers   Auth
Analysis    315       76        -
Datachain   314       100       LDAP
Shonan      306       36        -
Dataopen    0.197     9         LDAP
Datalake2   0.188     200       LDAP
Agenda
● USE
○ Our on-premises log analysis platform and tools
○ Yanagishima features
○ Yanagishima internals
○ Presto error query analysis
○ Our Presto/Spark use case with Yanagishima/OASIS
● DEBUG
○ More than 100,000 partitions error
○ Webhdfs partition location does not exist
○ Schema mismatch of parquet
○ Contributions
Presto error query analysis
SemanticErrorName is available since 313 #790
[Flow diagram: query → Syntactic Analysis (Fail → Syntax Error) → Semantic Analysis (Fail → Semantic Error) → Pass]
Thank you for kind code review 🙏🏻
Classification of USER_ERROR
● Many syntax errors
● A typical semantic error is a user accessing a nonexistent column/schema
SYNTAX_ERROR
● mismatched input ... expecting
● Hive views are not supported
● ...
Semantic Error Name Count
null 743
MISSING_ATTRIBUTE 273
MISSING_SCHEMA 169
MUST_BE_AGGREGATE_OR_GROUP_BY 111
TYPE_MISMATCH 87
FUNCTION_NOT_FOUND 72
MISSING_TABLE 53
MISSING_CATALOG 23
INVALID_LITERAL 4
AMBIGUOUS_ATTRIBUTE 4
CANNOT_HAVE_AGGREGATIONS_WINDOWS_OR_GROUPING 3
INVALID_ORDINAL 2
NOT_SUPPORTED 2
ORDER_BY_MUST_BE_IN_SELECT 2
NESTED_AGGREGATION 1
WINDOW_REQUIRES_OVER 1
REFERENCE_TO_OUTPUT_ATTRIBUTE_WITHIN_ORDER_BY_AGGREGATION 1
Error Name Count
SYNTAX_ERROR 635
NOT_FOUND 47
HIVE_EXCEEDED_PARTITION_LIMIT 26
INVALID_FUNCTION_ARGUMENT 19
INVALID_CAST_ARGUMENT 6
PERMISSION_DENIED 3
SUBQUERY_MULTIPLE_ROWS 3
ADMINISTRATIVELY_KILLED 2
INVALID_SESSION_PROPERTY 1
DIVISION_BY_ZERO 1
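For illustration, minimal queries that fall into the most common buckets above (a sketch, assuming a hypothetical table t(a, b)):
presto> SELECT a FROM t WHEER b = 1;       -- SYNTAX_ERROR (mismatched input)
presto> SELECT no_such_column FROM t;      -- MISSING_ATTRIBUTE
presto> SELECT * FROM no_such_schema.t;    -- MISSING_SCHEMA
presto> SELECT a, count(*) FROM t;         -- MUST_BE_AGGREGATE_OR_GROUP_BY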
Agenda
● USE
○ Our on-premises log analysis platform and tools
○ Yanagishima features
○ Yanagishima internals
○ Presto error query analysis
○ Our Presto/Spark use case with Yanagishima/OASIS
● DEBUG
○ More than 100,000 partitions error
○ Webhdfs partition location does not exist
○ Schema mismatch of parquet
○ Contributions
Typical use case in Yanagishima & OASIS
● Execute queries with Presto because of its speed and rich UDFs
○ Users want to run a query quickly and roughly check the data
● Implement batches with Spark SQL in OASIS or with Hive in the console
○ Users want to build stable batches
● Yanagishima is like Gist; OASIS is like GitHub
Why we don’t use Presto in batch
● Lack of Hive metastore impersonation
○ Support Impersonation in Metastore communication #43
● Less stable than Hive or Spark
○ We want to prioritize stability over latency in batches
○ Batches need to handle huge data volumes
Impersonation
● Presto does not support impersonating the end user when accessing the Hive metastore
● SELECT queries are no problem, but CREATE/DROP/ALTER/INSERT/DELETE queries can be
● If the Presto process runs as the presto user and yukawa creates a table, Presto accesses the Hive metastore as the
presto user, not yukawa. This means another user can drop the table if the presto user has write permission
● We use Apache Ranger to deny the presto user write access to HDFS
● HMS impersonation is available in Starburst Distribution of Presto
● Support for impersonation will be a game changer
[Diagram: yukawa submits CREATE TABLE line.ad and ebihara submits DROP TABLE line.ad; both reach the Hive Metastore as the presto user, which holds WRITE permission; Ranger mediates access to Hadoop]
hadoop.proxyuser.presto.groups=*
hadoop.proxyuser.presto.hosts=*
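The diagram's scenario in plain SQL (a sketch; the column definition is hypothetical):
presto> CREATE TABLE line.ad (campaign_id bigint);  -- submitted by yukawa, recorded under the presto user
presto> DROP TABLE line.ad;                         -- submitted by ebihara, also arrives as the presto user,
                                                    -- so it succeeds if the presto user has write permission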
Less stable than Hive/Spark
● A single query/worker crash can be a bottleneck
● An automatic worker-restart mechanism may be necessary
● Presto workers, data nodes, and node managers are deployed on the same machines
● Enabling cgroups may be necessary because, e.g., PySpark Python processes have high CPU usage
● For example, yarn.nodemanager.resource.percentage-physical-cpu-limit: 70%
Hive/Spark batches are more stable, but converting from Presto to Hive/Spark isn't easy due to differences in
date functions, syntax, …
Hard to convert query
Presto                                       Hive / Spark SQL
json_extract_scalar                          get_json_object
date_format(now(), '%Y%m%d')                 date_format(current_timestamp(), 'yyyyMMdd')
cross join unnest () as t ()                 lateral view explode() t
url functions like url_extract_parameter     -
try                                          -
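As a concrete sketch of the mappings above (access_log, payload, tags, and dt are hypothetical names):
-- Presto
SELECT json_extract_scalar(payload, '$.uid') AS uid, tag
FROM access_log
CROSS JOIN UNNEST(tags) AS t (tag)
WHERE dt = date_format(now(), '%Y%m%d');
-- Hive / Spark SQL
SELECT get_json_object(payload, '$.uid') AS uid, tag
FROM access_log
LATERAL VIEW explode(tags) t AS tag
WHERE dt = date_format(current_timestamp(), 'yyyyMMdd');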
Confusing Spark SQL error message
Spark allows “DISTINCT” as a column name (SPARK-27170)
The error message is difficult to understand
It will be improved in Spark 3.0 (SPARK-27901)
cannot resolve '`distinct`' given input columns: [...]; line 1 pos 7;
'GlobalLimit 100
+- 'LocalLimit 100
   +- 'Project ['distinct, ...]
      +- Filter (...)
         +- SubqueryAlias ...
            +- HiveTableRelation ...
SELECT distinct
,a
,b
,c
FROM test_table LIMIT 100
Confusing...💨
DEBUG
Recent Issues
More than 100,000 partitions error occurred in 307 #619
● The fixed version was released within one day
Partition location does not exist in hive external table #620
● Caused by the Hadoop library upgrade from 2.7.7 to 3.2.0
● Ongoing: https://issues.apache.org/jira/browse/HDFS-14466
Create table failed when using viewfs #10099
● It's a known issue and not fatal for now because we use Presto in read-only mode
Handle repeated predicate pushdown into Hive connector #984
● Performance regression; already fixed in 315
Great!
Schema mismatch of parquet file #9156
● Our old cluster hit this issue recently; already fixed in 0.203
Scale
Table # Partitions
Table A 1,588,031
Table B 1,429,047
Table C 1,429,046
Table D 1,116,130
Table E 772,725
● Daily queries: ~20K
● Daily processed data: 330TB
● Daily processed rows: 4 trillion
● Partitions
hive.max-partitions-per-scan (default: 100,000)
Maximum number of partitions for a single table scan
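How the limit surfaces in practice, as a hedged sketch (table_a and its partition column dt are hypothetical):
presto> SELECT count(*) FROM table_a;
-- fails: Query over table 'table_a' can potentially read more than 100000 partitions
presto> SELECT count(*) FROM table_a WHERE dt >= '20190701';
-- a predicate on the partition column prunes the scan below the limit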
Agenda
● USE
○ Our on-premises log analysis platform and tools
○ Yanagishima features
○ Yanagishima internals
○ Presto error query analysis
○ Our Presto/Spark use case with Yanagishima/OASIS
● DEBUG
○ More than 100,000 partitions error
○ Webhdfs partition location does not exist
○ Schema mismatch of parquet
○ Contributions
Already fixed in 308
More than 100,000 partitions error
More than 100,000 partitions error occurred in 307 #619
→ Query over table 'default.test_left' can potentially read more than 100000 partitions
at io.prestosql.plugin.hive.HiveMetadata.getPartitionsAsList(HiveMetadata.java:601)
at io.prestosql.plugin.hive.HiveMetadata.getTableLayouts(HiveMetadata.java:1645)
....
Steps to Reproduce
● Start hadoop-master docker image
$ presto-product-tests/conf/docker/singlenode/compose.sh up -d hadoop-master
$ presto-product-tests/conf/docker/singlenode/compose.sh up -d
● Create a table and populate rows
presto> CREATE TABLE test_part (col int, part_col int) with (partitioned_by = ARRAY['part_col']);
presto> INSERT INTO test_part (col, part_col) SELECT 0, CAST(id AS int) FROM UNNEST (sequence(1, 100)) AS u(id);
presto> INSERT INTO test_part (col, part_col) SELECT 0, CAST(id AS int) FROM UNNEST (sequence(101, 150)) AS u(id);
hive.max-partitions-per-scan=100 in product test
hive.max-partitions-per-writers=100 (default)
● Execute the reproducing query (TestHivePartitionsTable.java)
presto> SELECT a.part_col FROM
(SELECT * FROM test_part WHERE part_col = 1) a, (SELECT * FROM test_part WHERE part_col = 1) b
WHERE a.col = b.col
Frames and Variables
● Migration to remove table layout was ongoing
● “TupleDomain” is one of the key classes involved in predicate pushdown
Fix
● Fixed the EffectivePredicateExtractor.visitTableScan method
Actually, it's a workaround until the migration is completed
● Timeline
○ Created Issue April 11, 4PM
○ Merged commit April 12, 7AM
○ Released 308 April 12, 3PM
Released within one day 🎉
Agenda
● USE
○ Our on-premises log analysis platform and tools
○ Yanagishima features
○ Yanagishima internals
○ Presto error query analysis
○ Our Presto/Spark use case with Yanagishima/OASIS
● DEBUG
○ More than 100,000 partitions error
○ Webhdfs partition location does not exist
○ Schema mismatch of parquet
○ Contributions
Webhdfs partition location does not exist
Partitioned webhdfs table throws “Partition location does not exist” error #620
● Webhdfs isn’t supported (at least not tested) due to missing classes #957
→ Add the missing jar to the plugin directory
● Create a table with a webhdfs location in Hive
hive> CREATE TABLE test_part_webhdfs (col1 int) PARTITIONED BY (dt int)
LOCATION 'webhdfs://hadoop-master:50070/user/hive/warehouse/test_part_webhdfs';
hive> INSERT INTO test_part_webhdfs PARTITION(dt=1) VALUES (1);
presto> SELECT * FROM test_part_webhdfs;
→ Partition location does not exist:
webhdfs://hadoop-master:50070/user/hive/warehouse/test_part_webhdfs/dt=1
Remote Debugger
● Edit jvm.config
-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005
● Remote debugger configuration in IntelliJ IDEA
Run→Edit Configurations...→+→Remote
Step into Hadoop library
● The arguments passed to the Hadoop library are the same
○ We can also step into dependent libraries as if they were local code
Different internal call
● Hadoop 2.7.7 (Presto 306)
http://hadoop-master:50070/webhdfs/v1/user/hive/warehouse/test_part/dt=1?op=LISTSTATUS&user.name=x
● Hadoop 3.2.0 (Presto 307)
http://hadoop-master:50070/webhdfs/v1/user/hive/warehouse/test_part/dt%253D1?op=LISTSTATUS&user.name=x
{
"RemoteException":{
"exception":"FileNotFoundException",
"javaClassName":"java.io.FileNotFoundException",
"message":"File /user/hive/warehouse/test_part/dt%3D1 does not exist."
}
}
HDFS-14466
FileSystem.listLocatedStatus for path including '=' encodes it and returns FileNotFoundException
The equals sign is double-encoded:
dt=1 → dt%3D1 → dt%253D1
HADOOP-16258.001.patch by Iwasaki-san
Agenda
● USE
○ Our on-premises log analysis platform and tools
○ Yanagishima features
○ Yanagishima internals
○ Presto error query analysis
○ Our Presto/Spark use case with Yanagishima/OASIS
● DEBUG
○ More than 100,000 partitions error
○ Webhdfs partition location does not exist
○ Schema mismatch of parquet
○ Contributions
Already fixed in 0.203
Schema mismatch of parquet
● Failed to access table created by Spark
presto> SELECT * FROM default.test_parquet WHERE dt='20190101'
Error opening Hive split hdfs://cluster/apps/hive/warehouse/test_parquet/dt=20190101/20190101.snappy.parquet
(offset=503316480, length=33554432):
Schema mismatch, metastore schema for row column col1.element has 13 fields but parquet schema has 12 fields
The problematic column type is ARRAY<STRUCT<...>>
● Hive metastore returns 13 fields
● Parquet schema returns 12 fields
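One plausible way to reach this state, sketched in Hive DDL (field names and the evolution path are assumptions, not taken from the incident): older parquet files are written while the struct has 12 fields, then the metastore schema gains a 13th field, so the old files no longer match the metastore.
hive> CREATE TABLE test_parquet (col1 ARRAY<STRUCT<f01:STRING /* ... f12 */>>)
      PARTITIONED BY (dt STRING) STORED AS PARQUET;
-- data for dt=20190101 is written with 12 struct fields
hive> ALTER TABLE test_parquet CHANGE col1 col1 ARRAY<STRUCT<f01:STRING /* ... f12 */, f13:STRING>>;
-- metastore now declares 13 fields; existing files still contain 12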
parquet-tools
● Supported options
○ cat
○ head
○ schema
○ meta
○ dump
○ merge
○ rowcount
○ size
https://github.com/apache/parquet-mr/tree/master/parquet-tools
● Inspect schema of parquet file
$ parquet-tools schema sample.parquet
message spark_schema {
optional group @metadata {
optional binary beat (UTF8);
optional binary topic (UTF8);
optional binary type (UTF8);
optional binary version (UTF8);
}
optional binary @timestamp (UTF8);
…
Even multi-GB files can be analyzed within a few seconds
Agenda
● USE
○ Our on-premises log analysis platform and tools
○ Yanagishima features
○ Yanagishima internals
○ Presto error query analysis
○ Our Presto/Spark use case with Yanagishima/OASIS
● DEBUG
○ More than 100,000 partitions error
○ Webhdfs partition location does not exist
○ Schema mismatch of parquet
○ Contributions
Contributions
● General
○ Retrieve semantic error name
○ Fix partition pruning regression on 307
○ COMMENT ON TABLE
○ DROP COLUMN
○ Column alias in CTAS
○ Non-ASCII date_format function argument
● Cassandra connector
○ INSERT statement
○ Auto-discover protocol version
○ Materialized View
○ Smallint, tinyint, date types
○ Nested collection type
● Hive connector
○ CREATE TABLE properties
■ textfile_skip_header_line_count
■ textfile_skip_footer_line_count
● MySQL connector
○ Map JSON to Presto JSON
● CLI
○ Output formats
■ JSON
■ CSV_UNQUOTED
■ CSV_HEADER_UNQUOTED
Thanks to PSF, Starburst, and ex-Teradata ✨
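A few of the listed features, sketched as SQL (table and column names are hypothetical; only the property names shown on this slide are assumed):
presto> COMMENT ON TABLE access_log IS 'raw web access log';
presto> CREATE TABLE log_copy (renamed_dt, renamed_path) AS SELECT dt, path FROM access_log;  -- column aliases in CTAS
presto> CREATE TABLE csv_with_header (line varchar)
        WITH (format = 'TEXTFILE', textfile_skip_header_line_count = 1);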
THANK YOU