SlideShare a Scribd company logo
1 of 26
Download to read offline
Dataflow Shuffle Service
酒とゲームとインフラとGCP 第6回
#gcpja
Jul 19th, 2017
Cloud Customer Engineer, Google
Yuta Hono | 寳野 雄太
yutah@google.com @yutah_3
Yuta Hono | 寳野 雄太
Cloud Customer Engineer, Google
yutah@google.com
@yutah_3
● Google Cloud のソリューションのご提案
● お客様プロジェクトのアーキテクチャ設計/
プロダクトの技術的なご支援
(主に Google Cloud Platform )
Disclaimer: this presentation include my personal performance test
results. This is not an official performance testing, just an example of the
performance.
ETL with Cloud Dataflow
p.apply(
TextIO.Read.from(“gs://…”)
)
p.apply(
BigQueryIO.Write.to(“tablespec…”)
)p.apply(
ParDo.of(new YourTransform())
) Stream/Batch
Batch
Batch
Stream
p.apply(
PubsubIO.Read.topic(“input_topic”)
)
p.apply(
TextIO.Write.to(“gs://…”)
)
Parallel / Distributed processing
with your class in ParDo,
Both Streaming, Batch runs.
Agenda
1. Shuffle ?
2. Dataflow Shuffle Service [Beta, As of Jul 19]
Shuffle
Map -> Shuffle (Sort) -> Reduce
https://research.google.com/archive/mapreduce-osdi04-slides/index-auto-0008.html
MapReduce - Execution
Cloud Dataflow Shuffle Service
<user1, record>
<user5, record>
<user3, record>
<user8, record>
<user4, record>
...
<user5, record>
<user5, record>
<user2, record>
<user2, record>
<user7, record>
...
<user3, record>
<user3, record>
<user8, record>
<user3, record>
<user6, record>
...
<user2, record>
<user1, record>
<user5, record>
<user8, record>
<user4, record>
...
Run Distributed Processing...
<user1, record>
<user5, record>
<user3, record>
<user8, record>
<user4, record>
...
<user5, record>
<user5, record>
<user2, record>
<user2, record>
<user7, record>
...
<user3, record>
<user3, record>
<user8, record>
<user3, record>
<user6, record>
...
<user2, record>
<user1, record>
<user5, record>
<user8, record>
<user4, record>
...
Shuffling
<user1, record>
<user1, record>
<user2, record>
<user2, record>
<user2, record>
...
<user3, record>
<user3, record>
<user3, record>
<user3, record>
<user4, record>
<user4, record>
...
<user5, record>
<user5, record>
<user5, record>
<user5, record>
<user6, record>
...
<user7, record>
<user8, record>
<user8, record>
<user8, record>
...
Shuffled
Batch Shuffling: Stages
1. Create shuffle dataset
2. Write to the shuffle dataset
(from possibly several stages)
3. Close shuffle dataset
4. Read from the shuffle dataset
(from possibly several stages)
Shuffle in Dataflow
Worker
Worker
Worker
Worker
Worker
PD
PD
PD
PD
PD
write
write
write
write
write
read
read
read
read
read
Dataflow Shuffle Service
/
Read/
Write
/
Read/
Write /
Read/
Write /
Read/
Write
New: Dataflow Shuffle Service
Ex. BigQuery Shuffle
https://cloud.google.com/blog/big-data/2016/08/in-memory-query-execution-in-google-bigquery
Example. Dataflow Shuffle Service
Joining 1TB of data
● The default configuration (worker-based
shuffle on n1-standard-1 machine type
using Persistent Disk storage)
● Tuned configuration (worker-based shuffle
on a larger n1-standard-4 machine type and
using SSD storage)
● The new Cloud Dataflow Shuffle
40% faster than the tuned version
of the worker-based shuffle.
Try Dataflow Shuffle Service
Condition:
● Export gdelt-bq:internetarchivebooks from BigQuery
○ approx. 180GB
○ sharded to 182 files
● Wordcount ! Previous shuffle vs Shuffle Service
○ No code change
○ --maxNumWorkers=30
○ us-central1
Try Dataflow Shuffle Service
Result:
Shuffle Service
Normal
Try Dataflow Shuffle Service
Shuffle ServiceNormal
What’s NEXT ?
Requirements:
● Dataflow SDK for Java version 1.9.x. Or 2.0 (from Today)
● us-central1 (Iowa) region
● Batch mode
● No --zone parameter
Try:
--experiments=shuffle_mode=service
Costs (in Beta, Could be changed):
● $0.0216 per GB-Hour :
○ The total bill for Dataflow pipelines is expected to be less than or
equal to the cost of Dataflow pipelines not using this option.
Try the shuffle service ! (and Give a feedback)
● dataflow-feedback-shuffle@google.com (en) or
● yutah@google.com (ja)
Issue:
● Not decreased the duration of the shuffle operation
● The cost of your pipeline job (using the optimization feature) increases
without any reduction in processing time
It is our intent to continue improving the service-based Shuffle optimization
feature until its performance is better than that of the worker-based Shuffle.
Feedback
Summary - Cloud Dataflow Shuffle Service
Scalable
Efficient
Fault-tolerant manner
Thank You!
yutah@google.com
https://cloud.google.com/

More Related Content

What's hot

Spark shuffle introduction
Spark shuffle introductionSpark shuffle introduction
Spark shuffle introductioncolorant
 
明日から始める! ソフトウェアのグリーン化(GSF MeetUp Tokyo 発表資料)
明日から始める! ソフトウェアのグリーン化(GSF MeetUp Tokyo 発表資料)明日から始める! ソフトウェアのグリーン化(GSF MeetUp Tokyo 発表資料)
明日から始める! ソフトウェアのグリーン化(GSF MeetUp Tokyo 発表資料)NTT DATA Technology & Innovation
 
Best Practices for Enabling Speculative Execution on Large Scale Platforms
Best Practices for Enabling Speculative Execution on Large Scale PlatformsBest Practices for Enabling Speculative Execution on Large Scale Platforms
Best Practices for Enabling Speculative Execution on Large Scale PlatformsDatabricks
 
大規模データ処理の定番OSS Hadoop / Spark 最新動向 - 2021秋 -(db tech showcase 2021 / ONLINE 発...
大規模データ処理の定番OSS Hadoop / Spark 最新動向 - 2021秋 -(db tech showcase 2021 / ONLINE 発...大規模データ処理の定番OSS Hadoop / Spark 最新動向 - 2021秋 -(db tech showcase 2021 / ONLINE 発...
大規模データ処理の定番OSS Hadoop / Spark 最新動向 - 2021秋 -(db tech showcase 2021 / ONLINE 発...NTT DATA Technology & Innovation
 
Cosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle ServiceCosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle ServiceDatabricks
 
Apache tinkerpopとグラフデータベースの世界
Apache tinkerpopとグラフデータベースの世界Apache tinkerpopとグラフデータベースの世界
Apache tinkerpopとグラフデータベースの世界Yuki Morishita
 
Oracle GoldenGate for Big Data 12.2 セットアップガイド
Oracle GoldenGate for Big Data 12.2 セットアップガイドOracle GoldenGate for Big Data 12.2 セットアップガイド
Oracle GoldenGate for Big Data 12.2 セットアップガイドオラクルエンジニア通信
 
Kafkaを活用するためのストリーム処理の基本
Kafkaを活用するためのストリーム処理の基本Kafkaを活用するためのストリーム処理の基本
Kafkaを活用するためのストリーム処理の基本Sotaro Kimura
 
OpenLineage による Airflow のデータ来歴の収集と可視化(Airflow Meetup Tokyo #3 発表資料)
OpenLineage による Airflow のデータ来歴の収集と可視化(Airflow Meetup Tokyo #3 発表資料)OpenLineage による Airflow のデータ来歴の収集と可視化(Airflow Meetup Tokyo #3 発表資料)
OpenLineage による Airflow のデータ来歴の収集と可視化(Airflow Meetup Tokyo #3 発表資料)NTT DATA Technology & Innovation
 
データウェアハウスモデリング入門(ダイジェスト版)(事前公開版)
データウェアハウスモデリング入門(ダイジェスト版)(事前公開版) データウェアハウスモデリング入門(ダイジェスト版)(事前公開版)
データウェアハウスモデリング入門(ダイジェスト版)(事前公開版) Satoshi Nagayasu
 
Parquetはカラムナなのか?
Parquetはカラムナなのか?Parquetはカラムナなのか?
Parquetはカラムナなのか?Yohei Azekatsu
 
自律型データベース Oracle Autonomous Database 最新情報
自律型データベース Oracle Autonomous Database 最新情報自律型データベース Oracle Autonomous Database 最新情報
自律型データベース Oracle Autonomous Database 最新情報オラクルエンジニア通信
 
Spark and S3 with Ryan Blue
Spark and S3 with Ryan BlueSpark and S3 with Ryan Blue
Spark and S3 with Ryan BlueDatabricks
 
[Cloud OnAir] BigQuery の仕組みからベストプラクティスまでのご紹介 2018年9月6日 放送
[Cloud OnAir] BigQuery の仕組みからベストプラクティスまでのご紹介 2018年9月6日 放送[Cloud OnAir] BigQuery の仕組みからベストプラクティスまでのご紹介 2018年9月6日 放送
[Cloud OnAir] BigQuery の仕組みからベストプラクティスまでのご紹介 2018年9月6日 放送Google Cloud Platform - Japan
 
MySQLからPostgreSQLへのマイグレーションのハマリ所
MySQLからPostgreSQLへのマイグレーションのハマリ所MySQLからPostgreSQLへのマイグレーションのハマリ所
MySQLからPostgreSQLへのマイグレーションのハマリ所Makoto Kaga
 
20221111_JPUG_CustomScan_API
20221111_JPUG_CustomScan_API20221111_JPUG_CustomScan_API
20221111_JPUG_CustomScan_APIKohei KaiGai
 
Oracle Cloud Infrastructure:2021年10月度サービス・アップデート
Oracle Cloud Infrastructure:2021年10月度サービス・アップデートOracle Cloud Infrastructure:2021年10月度サービス・アップデート
Oracle Cloud Infrastructure:2021年10月度サービス・アップデートオラクルエンジニア通信
 

What's hot (20)

Spark shuffle introduction
Spark shuffle introductionSpark shuffle introduction
Spark shuffle introduction
 
明日から始める! ソフトウェアのグリーン化(GSF MeetUp Tokyo 発表資料)
明日から始める! ソフトウェアのグリーン化(GSF MeetUp Tokyo 発表資料)明日から始める! ソフトウェアのグリーン化(GSF MeetUp Tokyo 発表資料)
明日から始める! ソフトウェアのグリーン化(GSF MeetUp Tokyo 発表資料)
 
Best Practices for Enabling Speculative Execution on Large Scale Platforms
Best Practices for Enabling Speculative Execution on Large Scale PlatformsBest Practices for Enabling Speculative Execution on Large Scale Platforms
Best Practices for Enabling Speculative Execution on Large Scale Platforms
 
C#で速度を極めるいろは
C#で速度を極めるいろはC#で速度を極めるいろは
C#で速度を極めるいろは
 
大規模データ処理の定番OSS Hadoop / Spark 最新動向 - 2021秋 -(db tech showcase 2021 / ONLINE 発...
大規模データ処理の定番OSS Hadoop / Spark 最新動向 - 2021秋 -(db tech showcase 2021 / ONLINE 発...大規模データ処理の定番OSS Hadoop / Spark 最新動向 - 2021秋 -(db tech showcase 2021 / ONLINE 発...
大規模データ処理の定番OSS Hadoop / Spark 最新動向 - 2021秋 -(db tech showcase 2021 / ONLINE 発...
 
Cosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle ServiceCosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle Service
 
Apache tinkerpopとグラフデータベースの世界
Apache tinkerpopとグラフデータベースの世界Apache tinkerpopとグラフデータベースの世界
Apache tinkerpopとグラフデータベースの世界
 
AWSのNoSQL入門
AWSのNoSQL入門AWSのNoSQL入門
AWSのNoSQL入門
 
Oracle GoldenGate for Big Data 12.2 セットアップガイド
Oracle GoldenGate for Big Data 12.2 セットアップガイドOracle GoldenGate for Big Data 12.2 セットアップガイド
Oracle GoldenGate for Big Data 12.2 セットアップガイド
 
Kafkaを活用するためのストリーム処理の基本
Kafkaを活用するためのストリーム処理の基本Kafkaを活用するためのストリーム処理の基本
Kafkaを活用するためのストリーム処理の基本
 
OpenLineage による Airflow のデータ来歴の収集と可視化(Airflow Meetup Tokyo #3 発表資料)
OpenLineage による Airflow のデータ来歴の収集と可視化(Airflow Meetup Tokyo #3 発表資料)OpenLineage による Airflow のデータ来歴の収集と可視化(Airflow Meetup Tokyo #3 発表資料)
OpenLineage による Airflow のデータ来歴の収集と可視化(Airflow Meetup Tokyo #3 発表資料)
 
データウェアハウスモデリング入門(ダイジェスト版)(事前公開版)
データウェアハウスモデリング入門(ダイジェスト版)(事前公開版) データウェアハウスモデリング入門(ダイジェスト版)(事前公開版)
データウェアハウスモデリング入門(ダイジェスト版)(事前公開版)
 
HDFS vs. MapR Filesystem
HDFS vs. MapR FilesystemHDFS vs. MapR Filesystem
HDFS vs. MapR Filesystem
 
Parquetはカラムナなのか?
Parquetはカラムナなのか?Parquetはカラムナなのか?
Parquetはカラムナなのか?
 
自律型データベース Oracle Autonomous Database 最新情報
自律型データベース Oracle Autonomous Database 最新情報自律型データベース Oracle Autonomous Database 最新情報
自律型データベース Oracle Autonomous Database 最新情報
 
Spark and S3 with Ryan Blue
Spark and S3 with Ryan BlueSpark and S3 with Ryan Blue
Spark and S3 with Ryan Blue
 
[Cloud OnAir] BigQuery の仕組みからベストプラクティスまでのご紹介 2018年9月6日 放送
[Cloud OnAir] BigQuery の仕組みからベストプラクティスまでのご紹介 2018年9月6日 放送[Cloud OnAir] BigQuery の仕組みからベストプラクティスまでのご紹介 2018年9月6日 放送
[Cloud OnAir] BigQuery の仕組みからベストプラクティスまでのご紹介 2018年9月6日 放送
 
MySQLからPostgreSQLへのマイグレーションのハマリ所
MySQLからPostgreSQLへのマイグレーションのハマリ所MySQLからPostgreSQLへのマイグレーションのハマリ所
MySQLからPostgreSQLへのマイグレーションのハマリ所
 
20221111_JPUG_CustomScan_API
20221111_JPUG_CustomScan_API20221111_JPUG_CustomScan_API
20221111_JPUG_CustomScan_API
 
Oracle Cloud Infrastructure:2021年10月度サービス・アップデート
Oracle Cloud Infrastructure:2021年10月度サービス・アップデートOracle Cloud Infrastructure:2021年10月度サービス・アップデート
Oracle Cloud Infrastructure:2021年10月度サービス・アップデート
 

Viewers also liked

SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013
SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013
SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013Amazon Web Services
 
[db tech showcase Tokyo 2015] A32:Amazon Redshift Deep Dive by アマゾン データ サービス ...
[db tech showcase Tokyo 2015] A32:Amazon Redshift Deep Dive by アマゾン データ サービス ...[db tech showcase Tokyo 2015] A32:Amazon Redshift Deep Dive by アマゾン データ サービス ...
[db tech showcase Tokyo 2015] A32:Amazon Redshift Deep Dive by アマゾン データ サービス ...Insight Technology, Inc.
 
こんなIntelli jはイヤだ
こんなIntelli jはイヤだこんなIntelli jはイヤだ
こんなIntelli jはイヤだ勝信 今井
 
JavaFX + NetBeans環境におけるJenkinsの活用(Jenkins第六回勉強会)
JavaFX + NetBeans環境におけるJenkinsの活用(Jenkins第六回勉強会)JavaFX + NetBeans環境におけるJenkinsの活用(Jenkins第六回勉強会)
JavaFX + NetBeans環境におけるJenkinsの活用(Jenkins第六回勉強会)Ryusaburo Tanaka
 
プログラマに贈るクラウドとの上手な付き合い方
プログラマに贈るクラウドとの上手な付き合い方プログラマに贈るクラウドとの上手な付き合い方
プログラマに贈るクラウドとの上手な付き合い方Keisuke Nishitani
 
20121019-jenkins-akiko_pusu.pdf
20121019-jenkins-akiko_pusu.pdf20121019-jenkins-akiko_pusu.pdf
20121019-jenkins-akiko_pusu.pdfakiko_pusu
 
Application Lifecycle Management in a Serverless World
Application Lifecycle Management in a Serverless WorldApplication Lifecycle Management in a Serverless World
Application Lifecycle Management in a Serverless WorldKeisuke Nishitani
 
AWS SAMで始めるサーバーレスアプリケーション開発
AWS SAMで始めるサーバーレスアプリケーション開発AWS SAMで始めるサーバーレスアプリケーション開発
AWS SAMで始めるサーバーレスアプリケーション開発真吾 吉田
 

Viewers also liked (10)

SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013
SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013
SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013
 
[db tech showcase Tokyo 2015] A32:Amazon Redshift Deep Dive by アマゾン データ サービス ...
[db tech showcase Tokyo 2015] A32:Amazon Redshift Deep Dive by アマゾン データ サービス ...[db tech showcase Tokyo 2015] A32:Amazon Redshift Deep Dive by アマゾン データ サービス ...
[db tech showcase Tokyo 2015] A32:Amazon Redshift Deep Dive by アマゾン データ サービス ...
 
こんなIntelli jはイヤだ
こんなIntelli jはイヤだこんなIntelli jはイヤだ
こんなIntelli jはイヤだ
 
JavaFX + NetBeans環境におけるJenkinsの活用(Jenkins第六回勉強会)
JavaFX + NetBeans環境におけるJenkinsの活用(Jenkins第六回勉強会)JavaFX + NetBeans環境におけるJenkinsの活用(Jenkins第六回勉強会)
JavaFX + NetBeans環境におけるJenkinsの活用(Jenkins第六回勉強会)
 
プログラマに贈るクラウドとの上手な付き合い方
プログラマに贈るクラウドとの上手な付き合い方プログラマに贈るクラウドとの上手な付き合い方
プログラマに贈るクラウドとの上手な付き合い方
 
第六回Jenkins勉強会
第六回Jenkins勉強会第六回Jenkins勉強会
第六回Jenkins勉強会
 
20121019-jenkins-akiko_pusu.pdf
20121019-jenkins-akiko_pusu.pdf20121019-jenkins-akiko_pusu.pdf
20121019-jenkins-akiko_pusu.pdf
 
Pokémon GOとGCP
Pokémon GOとGCPPokémon GOとGCP
Pokémon GOとGCP
 
Application Lifecycle Management in a Serverless World
Application Lifecycle Management in a Serverless WorldApplication Lifecycle Management in a Serverless World
Application Lifecycle Management in a Serverless World
 
AWS SAMで始めるサーバーレスアプリケーション開発
AWS SAMで始めるサーバーレスアプリケーション開発AWS SAMで始めるサーバーレスアプリケーション開発
AWS SAMで始めるサーバーレスアプリケーション開発
 

Similar to Dataflow shuffle service

GPU/SSD Accelerates PostgreSQL - challenge towards query processing throughpu...
GPU/SSD Accelerates PostgreSQL - challenge towards query processing throughpu...GPU/SSD Accelerates PostgreSQL - challenge towards query processing throughpu...
GPU/SSD Accelerates PostgreSQL - challenge towards query processing throughpu...Kohei KaiGai
 
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)Kohei KaiGai
 
Sprint 44 review
Sprint 44 reviewSprint 44 review
Sprint 44 reviewManageIQ
 
20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_Place20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_PlaceKohei KaiGai
 
Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...
Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...
Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...Chris Fregly
 
Get Your Head in the Cloud - Lessons in GPU Computing with Schlumberger
Get Your Head in the Cloud - Lessons in GPU Computing with SchlumbergerGet Your Head in the Cloud - Lessons in GPU Computing with Schlumberger
Get Your Head in the Cloud - Lessons in GPU Computing with Schlumbergerinside-BigData.com
 
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsPL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsKohei KaiGai
 
Google Cloud Platform for Data Science teams
Google Cloud Platform for Data Science teamsGoogle Cloud Platform for Data Science teams
Google Cloud Platform for Data Science teamsBarton Rhodes
 
Integrating ChatGPT with Apache Airflow
Integrating ChatGPT with Apache AirflowIntegrating ChatGPT with Apache Airflow
Integrating ChatGPT with Apache AirflowTatiana Al-Chueyr
 
GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化
GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化
GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化NVIDIA Taiwan
 
Strata Conference + Hadoop World NY 2013: Running On-premise Hadoop as a Busi...
Strata Conference + Hadoop World NY 2013: Running On-premise Hadoop as a Busi...Strata Conference + Hadoop World NY 2013: Running On-premise Hadoop as a Busi...
Strata Conference + Hadoop World NY 2013: Running On-premise Hadoop as a Busi...Sumeet Singh
 
20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storageKohei KaiGai
 
GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)Kohei KaiGai
 
20181116 Massive Log Processing using I/O optimized PostgreSQL
20181116 Massive Log Processing using I/O optimized PostgreSQL20181116 Massive Log Processing using I/O optimized PostgreSQL
20181116 Massive Log Processing using I/O optimized PostgreSQLKohei KaiGai
 
20180920_DBTS_PGStrom_EN
20180920_DBTS_PGStrom_EN20180920_DBTS_PGStrom_EN
20180920_DBTS_PGStrom_ENKohei KaiGai
 
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~Kohei KaiGai
 
Data pipelines from zero to solid
Data pipelines from zero to solidData pipelines from zero to solid
Data pipelines from zero to solidLars Albertsson
 
PGConf APAC 2018 - PostgreSQL performance comparison in various clouds
PGConf APAC 2018 - PostgreSQL performance comparison in various cloudsPGConf APAC 2018 - PostgreSQL performance comparison in various clouds
PGConf APAC 2018 - PostgreSQL performance comparison in various cloudsPGConf APAC
 
Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...
Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...
Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...Chris Fregly
 
20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGai20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGaiKohei KaiGai
 

Similar to Dataflow shuffle service (20)

GPU/SSD Accelerates PostgreSQL - challenge towards query processing throughpu...
GPU/SSD Accelerates PostgreSQL - challenge towards query processing throughpu...GPU/SSD Accelerates PostgreSQL - challenge towards query processing throughpu...
GPU/SSD Accelerates PostgreSQL - challenge towards query processing throughpu...
 
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)
 
Sprint 44 review
Sprint 44 reviewSprint 44 review
Sprint 44 review
 
20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_Place20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_Place
 
Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...
Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...
Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...
 
Get Your Head in the Cloud - Lessons in GPU Computing with Schlumberger
Get Your Head in the Cloud - Lessons in GPU Computing with SchlumbergerGet Your Head in the Cloud - Lessons in GPU Computing with Schlumberger
Get Your Head in the Cloud - Lessons in GPU Computing with Schlumberger
 
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsPL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
 
Google Cloud Platform for Data Science teams
Google Cloud Platform for Data Science teamsGoogle Cloud Platform for Data Science teams
Google Cloud Platform for Data Science teams
 
Integrating ChatGPT with Apache Airflow
Integrating ChatGPT with Apache AirflowIntegrating ChatGPT with Apache Airflow
Integrating ChatGPT with Apache Airflow
 
GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化
GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化
GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化
 
Strata Conference + Hadoop World NY 2013: Running On-premise Hadoop as a Busi...
Strata Conference + Hadoop World NY 2013: Running On-premise Hadoop as a Busi...Strata Conference + Hadoop World NY 2013: Running On-premise Hadoop as a Busi...
Strata Conference + Hadoop World NY 2013: Running On-premise Hadoop as a Busi...
 
20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage
 
GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)
 
20181116 Massive Log Processing using I/O optimized PostgreSQL
20181116 Massive Log Processing using I/O optimized PostgreSQL20181116 Massive Log Processing using I/O optimized PostgreSQL
20181116 Massive Log Processing using I/O optimized PostgreSQL
 
20180920_DBTS_PGStrom_EN
20180920_DBTS_PGStrom_EN20180920_DBTS_PGStrom_EN
20180920_DBTS_PGStrom_EN
 
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
 
Data pipelines from zero to solid
Data pipelines from zero to solidData pipelines from zero to solid
Data pipelines from zero to solid
 
PGConf APAC 2018 - PostgreSQL performance comparison in various clouds
PGConf APAC 2018 - PostgreSQL performance comparison in various cloudsPGConf APAC 2018 - PostgreSQL performance comparison in various clouds
PGConf APAC 2018 - PostgreSQL performance comparison in various clouds
 
Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...
Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...
Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...
 
20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGai20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGai
 

Recently uploaded

[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 

Recently uploaded (20)

[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 

Dataflow shuffle service