SlideShare a Scribd company logo
1 of 14
Download to read offline
Sadayuki Furuhashi
Scripting Embulk
Plugins
Embulk & Digdag Online Meetup 2020
A founder of Treasure Data, Inc. located in Silicon Valley.
OSS I designed:
An open-source hacker.
Github: @frsyuki
Sadayuki Furuhashi
Data integrationis a key for data-driven business.
AI/ML Analytics
Databases
Data is moving to the cloud & SaaS
SaaS
SaaS data integration is becoming more common
SaaS
Who develops new SaaS integrations?
Java developers
Low code
Scripting with SDKs
Scripting
Embulk plugin API
SaaS users
Dev vs user
gap!
Who develops new SaaS integrations?
Java developers
Low code
Scripting with SDKs
Scripting
Embulk plugin API
Embulk scripting
SaaS users
Dev = user
Scripting on the powerful framework
Embulk scripting plugin
Embulk core framework
Your script
SDK / library
✓High-performance
✓Choices of output plugins
Embulk plugins
How it works?
1. Run a script
3. Write rows as a CSV file
4. Read the CSV file
2. Load rows
named pipe
Embulk scripting plugin
Your script
SDK / library
Named pipe is like a file but not a file.
• It doesn’t consume disk space.
• It doesn’t cause disk IO (=fast).
• It transfers data as your script writes
rows (=fast).
How it works?
1. Run a script
3. Write rows as a CSV file
4. Read the CSV file
named pipe
Embulk scripting plugin
Your script
SDK / library
output plugin5. Pass rows to an

output plugin
2. Load rows
How to use embulk-input-script
1. Install
2. Create a config
3. Run
$ embulk gem install embulk-input-script
in:
type: script
run: ruby your_script.rb #-- any executable
out:
type: …
$ embulk run config.yaml
How to develop a script- your script runs 3 times
if ARGV[0] == “setup”
File.write(ARGV[2], “…”)
elsif ARGV[0] == “run”
CSV.open(ARGV[2], “w”) do |file|
file << row
…
end
elsif ARGV[0] == “finish”
puts “Done!”
end
$ script.rb setup <config.yaml> <setup.yaml>
$ script.rb run <setup.yaml> <N> <output.csv>
$ script.rb finish <setup.yaml>
First, write a setup file. It should include
column names, column types and parallelism.
Second, load rows and write them to a CSV file.
If the setup file says parallelism is bigger than 1,
this runs for multiple times with N=0, 1, 2, 3, …
Finally, do cleanup if necessary.
$ script.rb setup <config.yaml> <setup.yaml>
$ script.rb run <setup.yaml> <output.csv> <N>
$ script.rb finish <config.yaml> <setup.yaml>
Examples
• Importing server status from DataDog

https://github.com/embulk/embulk-input-script/tree/master/examples/datadog_hosts
• Importing AWS EC2 server list

https://github.com/embulk/embulk-input-script/tree/master/examples/aws_ec2_instances
Wanted
• Output support

embulk-output-script is not available.
• Converter from a script to an Embulk plugin gem

When you create a script, you want to release it so that other people can reuse it.

To do it, we need a tool that packages the script with embulk-input-script as a gem.

More Related Content

What's hot

Embulk - 進化するバルクデータローダ
Embulk - 進化するバルクデータローダEmbulk - 進化するバルクデータローダ
Embulk - 進化するバルクデータローダSadayuki Furuhashi
 
Using Embulk at Treasure Data
Using Embulk at Treasure DataUsing Embulk at Treasure Data
Using Embulk at Treasure DataMuga Nishizawa
 
Recent Updates at Embulk Meetup #3
Recent Updates at Embulk Meetup #3Recent Updates at Embulk Meetup #3
Recent Updates at Embulk Meetup #3Muga Nishizawa
 
AWS와 Docker Swarm을 이용한 쉽고 빠른 컨테이너 오케스트레이션 - AWS Summit Seoul 2017
AWS와 Docker Swarm을 이용한 쉽고 빠른 컨테이너 오케스트레이션 - AWS Summit Seoul 2017AWS와 Docker Swarm을 이용한 쉽고 빠른 컨테이너 오케스트레이션 - AWS Summit Seoul 2017
AWS와 Docker Swarm을 이용한 쉽고 빠른 컨테이너 오케스트레이션 - AWS Summit Seoul 2017Amazon Web Services Korea
 
Embulk at Treasure Data
Embulk at Treasure DataEmbulk at Treasure Data
Embulk at Treasure DataSatoshi Akama
 
Embulk and Machine Learning infrastructure
Embulk and Machine Learning infrastructureEmbulk and Machine Learning infrastructure
Embulk and Machine Learning infrastructureHiroshi Toyama
 
Fluentd - road to v1 -
Fluentd - road to v1 -Fluentd - road to v1 -
Fluentd - road to v1 -N Masahiro
 
Data Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby UsageData Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby UsageSATOSHI TAGOMORI
 
Google App Engine With Java And Groovy
Google App Engine With Java And GroovyGoogle App Engine With Java And Groovy
Google App Engine With Java And GroovyKen Kousen
 
Logging for Production Systems in The Container Era
Logging for Production Systems in The Container EraLogging for Production Systems in The Container Era
Logging for Production Systems in The Container EraSadayuki Furuhashi
 
Fargate 를 이용한 ECS with VPC 1부
Fargate 를 이용한 ECS with VPC 1부Fargate 를 이용한 ECS with VPC 1부
Fargate 를 이용한 ECS with VPC 1부Hyun-Mook Choi
 
Heat optimization
Heat optimizationHeat optimization
Heat optimizationRico Lin
 
Async and Non-blocking IO w/ JRuby
Async and Non-blocking IO w/ JRubyAsync and Non-blocking IO w/ JRuby
Async and Non-blocking IO w/ JRubyJoe Kutner
 
Microservices blue-green-deployment-with-docker
Microservices blue-green-deployment-with-dockerMicroservices blue-green-deployment-with-docker
Microservices blue-green-deployment-with-dockerKidong Lee
 
ECS & ECR Deep Dive - 김기완 솔루션즈 아키텍트 :: AWS Container Day
ECS & ECR Deep Dive - 김기완 솔루션즈 아키텍트 :: AWS Container DayECS & ECR Deep Dive - 김기완 솔루션즈 아키텍트 :: AWS Container Day
ECS & ECR Deep Dive - 김기완 솔루션즈 아키텍트 :: AWS Container DayAmazon Web Services Korea
 
Docker for PHP Developers - ZendCon 2016
Docker for PHP Developers - ZendCon 2016Docker for PHP Developers - ZendCon 2016
Docker for PHP Developers - ZendCon 2016Chris Tankersley
 
[AWS Dev Day] 앱 현대화 | AWS Fargate를 사용한 서버리스 컨테이너 활용 하기 - 삼성전자 개발자 포털 사례 - 정영준...
[AWS Dev Day] 앱 현대화 | AWS Fargate를 사용한 서버리스 컨테이너 활용 하기 - 삼성전자 개발자 포털 사례 - 정영준...[AWS Dev Day] 앱 현대화 | AWS Fargate를 사용한 서버리스 컨테이너 활용 하기 - 삼성전자 개발자 포털 사례 - 정영준...
[AWS Dev Day] 앱 현대화 | AWS Fargate를 사용한 서버리스 컨테이너 활용 하기 - 삼성전자 개발자 포털 사례 - 정영준...Amazon Web Services Korea
 

What's hot (20)

Embulk - 進化するバルクデータローダ
Embulk - 進化するバルクデータローダEmbulk - 進化するバルクデータローダ
Embulk - 進化するバルクデータローダ
 
Using Embulk at Treasure Data
Using Embulk at Treasure DataUsing Embulk at Treasure Data
Using Embulk at Treasure Data
 
Recent Updates at Embulk Meetup #3
Recent Updates at Embulk Meetup #3Recent Updates at Embulk Meetup #3
Recent Updates at Embulk Meetup #3
 
AWS와 Docker Swarm을 이용한 쉽고 빠른 컨테이너 오케스트레이션 - AWS Summit Seoul 2017
AWS와 Docker Swarm을 이용한 쉽고 빠른 컨테이너 오케스트레이션 - AWS Summit Seoul 2017AWS와 Docker Swarm을 이용한 쉽고 빠른 컨테이너 오케스트레이션 - AWS Summit Seoul 2017
AWS와 Docker Swarm을 이용한 쉽고 빠른 컨테이너 오케스트레이션 - AWS Summit Seoul 2017
 
Embulk at Treasure Data
Embulk at Treasure DataEmbulk at Treasure Data
Embulk at Treasure Data
 
Embulk and Machine Learning infrastructure
Embulk and Machine Learning infrastructureEmbulk and Machine Learning infrastructure
Embulk and Machine Learning infrastructure
 
Fluentd - road to v1 -
Fluentd - road to v1 -Fluentd - road to v1 -
Fluentd - road to v1 -
 
Data Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby UsageData Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby Usage
 
Google App Engine With Java And Groovy
Google App Engine With Java And GroovyGoogle App Engine With Java And Groovy
Google App Engine With Java And Groovy
 
Logging for Production Systems in The Container Era
Logging for Production Systems in The Container EraLogging for Production Systems in The Container Era
Logging for Production Systems in The Container Era
 
Fargate 를 이용한 ECS with VPC 1부
Fargate 를 이용한 ECS with VPC 1부Fargate 를 이용한 ECS with VPC 1부
Fargate 를 이용한 ECS with VPC 1부
 
Amazed by AWS Series #4
Amazed by AWS Series #4Amazed by AWS Series #4
Amazed by AWS Series #4
 
Oracle on AWS RDS Migration - 성기명
Oracle on AWS RDS Migration - 성기명Oracle on AWS RDS Migration - 성기명
Oracle on AWS RDS Migration - 성기명
 
Heat optimization
Heat optimizationHeat optimization
Heat optimization
 
Async and Non-blocking IO w/ JRuby
Async and Non-blocking IO w/ JRubyAsync and Non-blocking IO w/ JRuby
Async and Non-blocking IO w/ JRuby
 
Microservices blue-green-deployment-with-docker
Microservices blue-green-deployment-with-dockerMicroservices blue-green-deployment-with-docker
Microservices blue-green-deployment-with-docker
 
ECS & ECR Deep Dive - 김기완 솔루션즈 아키텍트 :: AWS Container Day
ECS & ECR Deep Dive - 김기완 솔루션즈 아키텍트 :: AWS Container DayECS & ECR Deep Dive - 김기완 솔루션즈 아키텍트 :: AWS Container Day
ECS & ECR Deep Dive - 김기완 솔루션즈 아키텍트 :: AWS Container Day
 
Docker for PHP Developers - ZendCon 2016
Docker for PHP Developers - ZendCon 2016Docker for PHP Developers - ZendCon 2016
Docker for PHP Developers - ZendCon 2016
 
[AWS Dev Day] 앱 현대화 | AWS Fargate를 사용한 서버리스 컨테이너 활용 하기 - 삼성전자 개발자 포털 사례 - 정영준...
[AWS Dev Day] 앱 현대화 | AWS Fargate를 사용한 서버리스 컨테이너 활용 하기 - 삼성전자 개발자 포털 사례 - 정영준...[AWS Dev Day] 앱 현대화 | AWS Fargate를 사용한 서버리스 컨테이너 활용 하기 - 삼성전자 개발자 포털 사례 - 정영준...
[AWS Dev Day] 앱 현대화 | AWS Fargate를 사용한 서버리스 컨테이너 활용 하기 - 삼성전자 개발자 포털 사례 - 정영준...
 
Bosh 2.0
Bosh 2.0Bosh 2.0
Bosh 2.0
 

Similar to Scripting Embulk Plugins

Writing Rust Command Line Applications
Writing Rust Command Line ApplicationsWriting Rust Command Line Applications
Writing Rust Command Line ApplicationsAll Things Open
 
Ruby on Rails
Ruby on RailsRuby on Rails
Ruby on RailsDelphiCon
 
Into The Box | Alexa and ColdBox Api's
Into The Box | Alexa and ColdBox Api'sInto The Box | Alexa and ColdBox Api's
Into The Box | Alexa and ColdBox Api'sOrtus Solutions, Corp
 
Phoenix for Rails Devs
Phoenix for Rails DevsPhoenix for Rails Devs
Phoenix for Rails DevsDiacode
 
Cocoapods and Most common used library in Swift
Cocoapods and Most common used library in SwiftCocoapods and Most common used library in Swift
Cocoapods and Most common used library in SwiftWan Muzaffar Wan Hashim
 
Richard Cole of Amazon Gives Lightning Tallk at BigDataCamp
Richard Cole of Amazon Gives Lightning Tallk at BigDataCampRichard Cole of Amazon Gives Lightning Tallk at BigDataCamp
Richard Cole of Amazon Gives Lightning Tallk at BigDataCampBigDataCamp
 
Ruby on Rails survival guide of an aged Java developer
Ruby on Rails survival guide of an aged Java developerRuby on Rails survival guide of an aged Java developer
Ruby on Rails survival guide of an aged Java developergicappa
 
AWS Summit 2013 | Auckland - Continuous Deployment Practices, with Production...
AWS Summit 2013 | Auckland - Continuous Deployment Practices, with Production...AWS Summit 2013 | Auckland - Continuous Deployment Practices, with Production...
AWS Summit 2013 | Auckland - Continuous Deployment Practices, with Production...Amazon Web Services
 
Java 6 [Mustang] - Features and Enchantments
Java 6 [Mustang] - Features and Enchantments Java 6 [Mustang] - Features and Enchantments
Java 6 [Mustang] - Features and Enchantments Pavel Kaminsky
 
SQL Server 2008 Integration Services
SQL Server 2008 Integration ServicesSQL Server 2008 Integration Services
SQL Server 2008 Integration ServicesEduardo Castro
 
Cloud-powered Continuous Integration and Deployment architectures - Jinesh Varia
Cloud-powered Continuous Integration and Deployment architectures - Jinesh VariaCloud-powered Continuous Integration and Deployment architectures - Jinesh Varia
Cloud-powered Continuous Integration and Deployment architectures - Jinesh VariaAmazon Web Services
 
Docker for developers on mac and windows
Docker for developers on mac and windowsDocker for developers on mac and windows
Docker for developers on mac and windowsDocker, Inc.
 
End-to-end web-testing in ruby ecosystem
End-to-end web-testing in ruby ecosystemEnd-to-end web-testing in ruby ecosystem
End-to-end web-testing in ruby ecosystemAlex Mikitenko
 
AWS July Webinar Series: Introducing AWS OpsWorks for Windows Server
AWS July Webinar Series: Introducing AWS OpsWorks for Windows ServerAWS July Webinar Series: Introducing AWS OpsWorks for Windows Server
AWS July Webinar Series: Introducing AWS OpsWorks for Windows ServerAmazon Web Services
 
Serverless in production, an experience report (LNUG)
Serverless in production, an experience report (LNUG)Serverless in production, an experience report (LNUG)
Serverless in production, an experience report (LNUG)Yan Cui
 
AWS Startup Webinar | Developing on AWS
AWS Startup Webinar | Developing on AWSAWS Startup Webinar | Developing on AWS
AWS Startup Webinar | Developing on AWSAmazon Web Services
 

Similar to Scripting Embulk Plugins (20)

AWS Serverless Workshop
AWS Serverless WorkshopAWS Serverless Workshop
AWS Serverless Workshop
 
Writing Rust Command Line Applications
Writing Rust Command Line ApplicationsWriting Rust Command Line Applications
Writing Rust Command Line Applications
 
Ruby on Rails
Ruby on RailsRuby on Rails
Ruby on Rails
 
Into The Box | Alexa and ColdBox Api's
Into The Box | Alexa and ColdBox Api'sInto The Box | Alexa and ColdBox Api's
Into The Box | Alexa and ColdBox Api's
 
Phoenix for Rails Devs
Phoenix for Rails DevsPhoenix for Rails Devs
Phoenix for Rails Devs
 
Cocoapods and Most common used library in Swift
Cocoapods and Most common used library in SwiftCocoapods and Most common used library in Swift
Cocoapods and Most common used library in Swift
 
Richard Cole of Amazon Gives Lightning Tallk at BigDataCamp
Richard Cole of Amazon Gives Lightning Tallk at BigDataCampRichard Cole of Amazon Gives Lightning Tallk at BigDataCamp
Richard Cole of Amazon Gives Lightning Tallk at BigDataCamp
 
Dev streams2
Dev streams2Dev streams2
Dev streams2
 
Smooth CoffeeScript
Smooth CoffeeScriptSmooth CoffeeScript
Smooth CoffeeScript
 
Ruby on Rails survival guide of an aged Java developer
Ruby on Rails survival guide of an aged Java developerRuby on Rails survival guide of an aged Java developer
Ruby on Rails survival guide of an aged Java developer
 
AWS Summit 2013 | Auckland - Continuous Deployment Practices, with Production...
AWS Summit 2013 | Auckland - Continuous Deployment Practices, with Production...AWS Summit 2013 | Auckland - Continuous Deployment Practices, with Production...
AWS Summit 2013 | Auckland - Continuous Deployment Practices, with Production...
 
Java 6 [Mustang] - Features and Enchantments
Java 6 [Mustang] - Features and Enchantments Java 6 [Mustang] - Features and Enchantments
Java 6 [Mustang] - Features and Enchantments
 
SQL Server 2008 Integration Services
SQL Server 2008 Integration ServicesSQL Server 2008 Integration Services
SQL Server 2008 Integration Services
 
Cloud-powered Continuous Integration and Deployment architectures - Jinesh Varia
Cloud-powered Continuous Integration and Deployment architectures - Jinesh VariaCloud-powered Continuous Integration and Deployment architectures - Jinesh Varia
Cloud-powered Continuous Integration and Deployment architectures - Jinesh Varia
 
Docker for developers on mac and windows
Docker for developers on mac and windowsDocker for developers on mac and windows
Docker for developers on mac and windows
 
Integration Testing in Python
Integration Testing in PythonIntegration Testing in Python
Integration Testing in Python
 
End-to-end web-testing in ruby ecosystem
End-to-end web-testing in ruby ecosystemEnd-to-end web-testing in ruby ecosystem
End-to-end web-testing in ruby ecosystem
 
AWS July Webinar Series: Introducing AWS OpsWorks for Windows Server
AWS July Webinar Series: Introducing AWS OpsWorks for Windows ServerAWS July Webinar Series: Introducing AWS OpsWorks for Windows Server
AWS July Webinar Series: Introducing AWS OpsWorks for Windows Server
 
Serverless in production, an experience report (LNUG)
Serverless in production, an experience report (LNUG)Serverless in production, an experience report (LNUG)
Serverless in production, an experience report (LNUG)
 
AWS Startup Webinar | Developing on AWS
AWS Startup Webinar | Developing on AWSAWS Startup Webinar | Developing on AWS
AWS Startup Webinar | Developing on AWS
 

More from Sadayuki Furuhashi

Performance Optimization Techniques of MessagePack-Ruby - RubyKaigi 2019
Performance Optimization Techniques of MessagePack-Ruby - RubyKaigi 2019Performance Optimization Techniques of MessagePack-Ruby - RubyKaigi 2019
Performance Optimization Techniques of MessagePack-Ruby - RubyKaigi 2019Sadayuki Furuhashi
 
DigdagはなぜYAMLなのか?
DigdagはなぜYAMLなのか?DigdagはなぜYAMLなのか?
DigdagはなぜYAMLなのか?Sadayuki Furuhashi
 
分散ワークフローエンジン『Digdag』の実装 at Tokyo RubyKaigi #11
分散ワークフローエンジン『Digdag』の実装 at Tokyo RubyKaigi #11分散ワークフローエンジン『Digdag』の実装 at Tokyo RubyKaigi #11
分散ワークフローエンジン『Digdag』の実装 at Tokyo RubyKaigi #11Sadayuki Furuhashi
 
Plugin-based software design with Ruby and RubyGems
Plugin-based software design with Ruby and RubyGemsPlugin-based software design with Ruby and RubyGems
Plugin-based software design with Ruby and RubyGemsSadayuki Furuhashi
 
Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Sadayuki Furuhashi
 
Presto - Hadoop Conference Japan 2014
Presto - Hadoop Conference Japan 2014Presto - Hadoop Conference Japan 2014
Presto - Hadoop Conference Japan 2014Sadayuki Furuhashi
 
Fluentd - Set Up Once, Collect More
Fluentd - Set Up Once, Collect MoreFluentd - Set Up Once, Collect More
Fluentd - Set Up Once, Collect MoreSadayuki Furuhashi
 
Prestogres, ODBC & JDBC connectivity for Presto
Prestogres, ODBC & JDBC connectivity for PrestoPrestogres, ODBC & JDBC connectivity for Presto
Prestogres, ODBC & JDBC connectivity for PrestoSadayuki Furuhashi
 
What's new in v11 - Fluentd Casual Talks #3 #fluentdcasual
What's new in v11 - Fluentd Casual Talks #3 #fluentdcasualWhat's new in v11 - Fluentd Casual Talks #3 #fluentdcasual
What's new in v11 - Fluentd Casual Talks #3 #fluentdcasualSadayuki Furuhashi
 
How we use Fluentd in Treasure Data
How we use Fluentd in Treasure DataHow we use Fluentd in Treasure Data
How we use Fluentd in Treasure DataSadayuki Furuhashi
 
How to collect Big Data into Hadoop
How to collect Big Data into HadoopHow to collect Big Data into Hadoop
How to collect Big Data into HadoopSadayuki Furuhashi
 
Programming Tools and Techniques #369 - The MessagePack Project
Programming Tools and Techniques #369 - The MessagePack ProjectProgramming Tools and Techniques #369 - The MessagePack Project
Programming Tools and Techniques #369 - The MessagePack ProjectSadayuki Furuhashi
 
gumiStudy#7 The MessagePack Project
gumiStudy#7 The MessagePack ProjectgumiStudy#7 The MessagePack Project
gumiStudy#7 The MessagePack ProjectSadayuki Furuhashi
 

More from Sadayuki Furuhashi (20)

Performance Optimization Techniques of MessagePack-Ruby - RubyKaigi 2019
Performance Optimization Techniques of MessagePack-Ruby - RubyKaigi 2019Performance Optimization Techniques of MessagePack-Ruby - RubyKaigi 2019
Performance Optimization Techniques of MessagePack-Ruby - RubyKaigi 2019
 
Making KVS 10x Scalable
Making KVS 10x ScalableMaking KVS 10x Scalable
Making KVS 10x Scalable
 
DigdagはなぜYAMLなのか?
DigdagはなぜYAMLなのか?DigdagはなぜYAMLなのか?
DigdagはなぜYAMLなのか?
 
分散ワークフローエンジン『Digdag』の実装 at Tokyo RubyKaigi #11
分散ワークフローエンジン『Digdag』の実装 at Tokyo RubyKaigi #11分散ワークフローエンジン『Digdag』の実装 at Tokyo RubyKaigi #11
分散ワークフローエンジン『Digdag』の実装 at Tokyo RubyKaigi #11
 
Plugin-based software design with Ruby and RubyGems
Plugin-based software design with Ruby and RubyGemsPlugin-based software design with Ruby and RubyGems
Plugin-based software design with Ruby and RubyGems
 
Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1
 
Prestogres internals
Prestogres internalsPrestogres internals
Prestogres internals
 
Presto+MySQLで分散SQL
Presto+MySQLで分散SQLPresto+MySQLで分散SQL
Presto+MySQLで分散SQL
 
Presto - Hadoop Conference Japan 2014
Presto - Hadoop Conference Japan 2014Presto - Hadoop Conference Japan 2014
Presto - Hadoop Conference Japan 2014
 
Fluentd - Set Up Once, Collect More
Fluentd - Set Up Once, Collect MoreFluentd - Set Up Once, Collect More
Fluentd - Set Up Once, Collect More
 
Prestogres, ODBC & JDBC connectivity for Presto
Prestogres, ODBC & JDBC connectivity for PrestoPrestogres, ODBC & JDBC connectivity for Presto
Prestogres, ODBC & JDBC connectivity for Presto
 
What's new in v11 - Fluentd Casual Talks #3 #fluentdcasual
What's new in v11 - Fluentd Casual Talks #3 #fluentdcasualWhat's new in v11 - Fluentd Casual Talks #3 #fluentdcasual
What's new in v11 - Fluentd Casual Talks #3 #fluentdcasual
 
How we use Fluentd in Treasure Data
How we use Fluentd in Treasure DataHow we use Fluentd in Treasure Data
How we use Fluentd in Treasure Data
 
Fluentd meetup at Slideshare
Fluentd meetup at SlideshareFluentd meetup at Slideshare
Fluentd meetup at Slideshare
 
How to collect Big Data into Hadoop
How to collect Big Data into HadoopHow to collect Big Data into Hadoop
How to collect Big Data into Hadoop
 
Fluentd meetup
Fluentd meetupFluentd meetup
Fluentd meetup
 
upload test 1
upload test 1upload test 1
upload test 1
 
Programming Tools and Techniques #369 - The MessagePack Project
Programming Tools and Techniques #369 - The MessagePack ProjectProgramming Tools and Techniques #369 - The MessagePack Project
Programming Tools and Techniques #369 - The MessagePack Project
 
Gumi study7 messagepack
Gumi study7 messagepackGumi study7 messagepack
Gumi study7 messagepack
 
gumiStudy#7 The MessagePack Project
gumiStudy#7 The MessagePack ProjectgumiStudy#7 The MessagePack Project
gumiStudy#7 The MessagePack Project
 

Recently uploaded

Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyFrank van der Linden
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningVitsRangannavar
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsMehedi Hasan Shohan
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfPower Karaoke
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 

Recently uploaded (20)

Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The Ugly
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learning
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software Solutions
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 

Scripting Embulk Plugins

  • 2. A founder of Treasure Data, Inc. located in Silicon Valley. OSS I designed: An open-source hacker. Github: @frsyuki Sadayuki Furuhashi
  • 3. Data integrationis a key for data-driven business. AI/ML Analytics Databases
  • 4. Data is moving to the cloud & SaaS SaaS
  • 5. SaaS data integration is becoming more common SaaS
  • 6. Who develops new SaaS integrations? Java developers Low code Scripting with SDKs Scripting Embulk plugin API SaaS users Dev vs user gap!
  • 7. Who develops new SaaS integrations? Java developers Low code Scripting with SDKs Scripting Embulk plugin API Embulk scripting SaaS users Dev = user
  • 8. Scripting on the powerful framework Embulk scripting plugin Embulk core framework Your script SDK / library ✓High-performance ✓Choices of output plugins Embulk plugins
  • 9. How it works? 1. Run a script 3. Write rows as a CSV file 4. Read the CSV file 2. Load rows named pipe Embulk scripting plugin Your script SDK / library Named pipe is like a file but not a file. • It doesn’t consume disk space. • It doesn’t cause disk IO (=fast). • It transfers data as your script writes rows (=fast).
  • 10. How it works? 1. Run a script 3. Write rows as a CSV file 4. Read the CSV file named pipe Embulk scripting plugin Your script SDK / library output plugin5. Pass rows to an
 output plugin 2. Load rows
  • 11. How to use embulk-input-script 1. Install 2. Create a config 3. Run $ embulk gem install embulk-input-script in: type: script run: ruby your_script.rb #-- any executable out: type: … $ embulk run config.yaml
  • 12. How to develop a script- your script runs 3 times if ARGV[0] == “setup” File.write(ARGV[2], “…”) elsif ARGV[0] == “run” CSV.open(ARGV[2], “w”) do |file| file << row … end elsif ARGV[0] == “finish” puts “Done!” end $ script.rb setup <config.yaml> <setup.yaml> $ script.rb run <setup.yaml> <N> <output.csv> $ script.rb finish <setup.yaml> First, write a setup file. It should include column names, column types and parallelism. Second, load rows and write them to a CSV file. If the setup file says parallelism is bigger than 1, this runs for multiple times with N=0, 1, 2, 3, … Finally, do cleanup if necessary. $ script.rb setup <config.yaml> <setup.yaml> $ script.rb run <setup.yaml> <output.csv> <N> $ script.rb finish <config.yaml> <setup.yaml>
  • 13. Examples • Importing server status from DataDog
 https://github.com/embulk/embulk-input-script/tree/master/examples/datadog_hosts • Importing AWS EC2 server list
 https://github.com/embulk/embulk-input-script/tree/master/examples/aws_ec2_instances
  • 14. Wanted • Output support
 embulk-output-script is not available. • Converter from a script to an Embulk plugin gem
 When you create a script, you want to release it so that other people can reuse it.
 To do it, we need a tool that packages the script with embulk-input-script as a gem.