SlideShare a Scribd company logo
Build DMP on top of
GCP
VMFive - Randy Huang
Agenda
• Migrated Pipeline to GCP
• Cost Comparison
• Business Use Case
• Fluentd Demo
ELK + AWS EMR
Kinesis Lambda
Pros & Cons
• Pros :
• Well Support.
• Well docs.
• Easy to find Reference.
• Cons :
• High Cost.
• Not open source.
• Have to set the scale at first.
Pipeline on GCP
Dataflow
BigQuery
Machine Learning
Data Visualization
Compute Engine
Global Load Balancing
Datastudio
The Products and Services logos may be used to accurately reference Google's technology and tools, for instance in architecture diagrams. 7
Batch
BI Analysis
Storage

Cloud Storage
Processing

Cloud DataflowStreaming
Time Series Streaming

Cloud Pub/Sub
Storage

BigQuery
The Products and Services logos may be used to accurately reference Google's technology and tools, for instance in architecture diagrams. 8
Targeting Engines
Data Sources
Machine Learning
Applications
API Backend

Compute Engine
Spark MLlib

Cloud Dataproc
App Engine
Transform Data
Hosted Models

Cloud Machine Learning
Real-Time

Prediction API
Device Related

Cloud Pub/Sub
Behavior Related

Cloud Pub/Sub
3rd Party Data

Cloud Pub/Sub
Redis

Compute Engine
Pros & Cons
• Pros :
• Cost-effective.
• Operation-effective.
• Google got your back.
• Cons :
• API/SDK changes everyday.
• Some still in beta mode.
• Docs everywhere.
Workflow Monitoring
• Digdag <Airflow/Oozie/Luigi>
• Native support Python & Ruby
• Multi-Cloud
• Modular
• Workflow as code
• Docker Support
• Altering to Slack
Digdag Sample
Digdag
Cost Comparison
• $2000 on AWS per month
• about $200 on GCP production
• about another $200 for dev
• 50M events per month
Business Use Case
• Digital Ads Targeting
• User Behavior Tagging
• BI
• GEO Reporting
• KPI Reporting
• User Demographic
Some Tips
• BigQuery
• https://status.cloud.google.com/incident/bigquery/
18022
• Solved by Fluentd’s Retry and HA
• Dataflow’s SDK & docs is not sync
• Dataflow Sideinput has a bug with Streaming mode
• Compute Engine SLB - TCP/UDP setup for forwarding
Flunetd Update
• Release note for v0.14
• sub second event flush
• New Plugin APIS
support formatting configurations dynamically
(e.g., path /my/dest/${tag}/mydata.%Y-%m-%d.log)
• Secure Forward
Demo
• Nginx -> Fluentd -> BigQuery -> DataStudio
• MySQL -> Fluentd -> BigQuery

More Related Content

What's hot

10 Things Learned Releasing Databricks Enterprise Wide
10 Things Learned Releasing Databricks Enterprise Wide10 Things Learned Releasing Databricks Enterprise Wide
10 Things Learned Releasing Databricks Enterprise Wide
Databricks
 
Moving 150 TB of data resiliently on Kafka With Quorum Controller on Kubernet...
Moving 150 TB of data resiliently on Kafka With Quorum Controller on Kubernet...Moving 150 TB of data resiliently on Kafka With Quorum Controller on Kubernet...
Moving 150 TB of data resiliently on Kafka With Quorum Controller on Kubernet...
HostedbyConfluent
 
How Docker Accelerates Continuous Development at ironSource: Containers #101 ...
How Docker Accelerates Continuous Development at ironSource: Containers #101 ...How Docker Accelerates Continuous Development at ironSource: Containers #101 ...
How Docker Accelerates Continuous Development at ironSource: Containers #101 ...
Brittany Ingram
 
Introduction to knime
Introduction to knimeIntroduction to knime
Introduction to knime
Bernardo Najlis
 
Google Cloud Platform
Google Cloud PlatformGoogle Cloud Platform
Google Cloud Platform
Balvinder Hira
 
Introduction to Modern DevOps Technologies
Introduction to  Modern DevOps TechnologiesIntroduction to  Modern DevOps Technologies
Introduction to Modern DevOps Technologies
Kriangkrai Chaonithi
 
Apache Airflow Architecture
Apache Airflow ArchitectureApache Airflow Architecture
Apache Airflow Architecture
Gerard Toonstra
 
Presto Summit 2018 - 03 - Starburst CBO
Presto Summit 2018  - 03 - Starburst CBOPresto Summit 2018  - 03 - Starburst CBO
Presto Summit 2018 - 03 - Starburst CBO
kbajda
 
Aengus Rooney [Grafana] | What's New with Grafana and InfluxDB | InfluxDays E...
Aengus Rooney [Grafana] | What's New with Grafana and InfluxDB | InfluxDays E...Aengus Rooney [Grafana] | What's New with Grafana and InfluxDB | InfluxDays E...
Aengus Rooney [Grafana] | What's New with Grafana and InfluxDB | InfluxDays E...
InfluxData
 
Elastic Stack Basic - All The Capabilities in 6.3!
Elastic Stack Basic - All The Capabilities in 6.3!Elastic Stack Basic - All The Capabilities in 6.3!
Elastic Stack Basic - All The Capabilities in 6.3!
brad_quarry
 
How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...
How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...
How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...
HostedbyConfluent
 
5 lessons learned for successful migration to Confluent cloud | Natan Silinit...
5 lessons learned for successful migration to Confluent cloud | Natan Silinit...5 lessons learned for successful migration to Confluent cloud | Natan Silinit...
5 lessons learned for successful migration to Confluent cloud | Natan Silinit...
HostedbyConfluent
 
Logging in The World of DevOps
Logging in The World of DevOps Logging in The World of DevOps
Logging in The World of DevOps
DevOps Indonesia
 
Accelerating Innovation with Apache Kafka, Heikki Nousiainen | Heikki Nousiai...
Accelerating Innovation with Apache Kafka, Heikki Nousiainen | Heikki Nousiai...Accelerating Innovation with Apache Kafka, Heikki Nousiainen | Heikki Nousiai...
Accelerating Innovation with Apache Kafka, Heikki Nousiainen | Heikki Nousiai...
HostedbyConfluent
 
Real-Time Vote Platform Benchmark
Real-Time Vote Platform BenchmarkReal-Time Vote Platform Benchmark
Real-Time Vote Platform Benchmark
Lahav Savir
 
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...
Databricks
 
Presto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix ContainersPresto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix Containers
kbajda
 
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
HostedbyConfluent
 
Martin Moucka [Red Hat] | How Red Hat Uses gNMI, Telegraf and InfluxDB to Gai...
Martin Moucka [Red Hat] | How Red Hat Uses gNMI, Telegraf and InfluxDB to Gai...Martin Moucka [Red Hat] | How Red Hat Uses gNMI, Telegraf and InfluxDB to Gai...
Martin Moucka [Red Hat] | How Red Hat Uses gNMI, Telegraf and InfluxDB to Gai...
InfluxData
 
Elastically Scaling Kafka Using Confluent
Elastically Scaling Kafka Using ConfluentElastically Scaling Kafka Using Confluent
Elastically Scaling Kafka Using Confluent
confluent
 

What's hot (20)

10 Things Learned Releasing Databricks Enterprise Wide
10 Things Learned Releasing Databricks Enterprise Wide10 Things Learned Releasing Databricks Enterprise Wide
10 Things Learned Releasing Databricks Enterprise Wide
 
Moving 150 TB of data resiliently on Kafka With Quorum Controller on Kubernet...
Moving 150 TB of data resiliently on Kafka With Quorum Controller on Kubernet...Moving 150 TB of data resiliently on Kafka With Quorum Controller on Kubernet...
Moving 150 TB of data resiliently on Kafka With Quorum Controller on Kubernet...
 
How Docker Accelerates Continuous Development at ironSource: Containers #101 ...
How Docker Accelerates Continuous Development at ironSource: Containers #101 ...How Docker Accelerates Continuous Development at ironSource: Containers #101 ...
How Docker Accelerates Continuous Development at ironSource: Containers #101 ...
 
Introduction to knime
Introduction to knimeIntroduction to knime
Introduction to knime
 
Google Cloud Platform
Google Cloud PlatformGoogle Cloud Platform
Google Cloud Platform
 
Introduction to Modern DevOps Technologies
Introduction to  Modern DevOps TechnologiesIntroduction to  Modern DevOps Technologies
Introduction to Modern DevOps Technologies
 
Apache Airflow Architecture
Apache Airflow ArchitectureApache Airflow Architecture
Apache Airflow Architecture
 
Presto Summit 2018 - 03 - Starburst CBO
Presto Summit 2018  - 03 - Starburst CBOPresto Summit 2018  - 03 - Starburst CBO
Presto Summit 2018 - 03 - Starburst CBO
 
Aengus Rooney [Grafana] | What's New with Grafana and InfluxDB | InfluxDays E...
Aengus Rooney [Grafana] | What's New with Grafana and InfluxDB | InfluxDays E...Aengus Rooney [Grafana] | What's New with Grafana and InfluxDB | InfluxDays E...
Aengus Rooney [Grafana] | What's New with Grafana and InfluxDB | InfluxDays E...
 
Elastic Stack Basic - All The Capabilities in 6.3!
Elastic Stack Basic - All The Capabilities in 6.3!Elastic Stack Basic - All The Capabilities in 6.3!
Elastic Stack Basic - All The Capabilities in 6.3!
 
How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...
How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...
How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...
 
5 lessons learned for successful migration to Confluent cloud | Natan Silinit...
5 lessons learned for successful migration to Confluent cloud | Natan Silinit...5 lessons learned for successful migration to Confluent cloud | Natan Silinit...
5 lessons learned for successful migration to Confluent cloud | Natan Silinit...
 
Logging in The World of DevOps
Logging in The World of DevOps Logging in The World of DevOps
Logging in The World of DevOps
 
Accelerating Innovation with Apache Kafka, Heikki Nousiainen | Heikki Nousiai...
Accelerating Innovation with Apache Kafka, Heikki Nousiainen | Heikki Nousiai...Accelerating Innovation with Apache Kafka, Heikki Nousiainen | Heikki Nousiai...
Accelerating Innovation with Apache Kafka, Heikki Nousiainen | Heikki Nousiai...
 
Real-Time Vote Platform Benchmark
Real-Time Vote Platform BenchmarkReal-Time Vote Platform Benchmark
Real-Time Vote Platform Benchmark
 
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...
 
Presto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix ContainersPresto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix Containers
 
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
 
Martin Moucka [Red Hat] | How Red Hat Uses gNMI, Telegraf and InfluxDB to Gai...
Martin Moucka [Red Hat] | How Red Hat Uses gNMI, Telegraf and InfluxDB to Gai...Martin Moucka [Red Hat] | How Red Hat Uses gNMI, Telegraf and InfluxDB to Gai...
Martin Moucka [Red Hat] | How Red Hat Uses gNMI, Telegraf and InfluxDB to Gai...
 
Elastically Scaling Kafka Using Confluent
Elastically Scaling Kafka Using ConfluentElastically Scaling Kafka Using Confluent
Elastically Scaling Kafka Using Confluent
 

Viewers also liked

A Microservice Architecture for Big Data Pipelines
A Microservice Architecture for Big Data PipelinesA Microservice Architecture for Big Data Pipelines
A Microservice Architecture for Big Data Pipelines
Daniel Mescheder
 
Building data pipelines
Building data pipelinesBuilding data pipelines
Building data pipelines
Jonathan Holloway
 
Building a Big Data Pipeline
Building a Big Data PipelineBuilding a Big Data Pipeline
Building a Big Data Pipeline
Jesus Rodriguez
 
Data pipelines from zero to solid
Data pipelines from zero to solidData pipelines from zero to solid
Data pipelines from zero to solid
Lars Albertsson
 
Apache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataApache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing data
DataWorks Summit/Hadoop Summit
 
Airflow - An Open Source Platform to Author and Monitor Data Pipelines
Airflow - An Open Source Platform to Author and Monitor Data PipelinesAirflow - An Open Source Platform to Author and Monitor Data Pipelines
Airflow - An Open Source Platform to Author and Monitor Data PipelinesDataWorks Summit
 
(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct
(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct
(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct
Amazon Web Services
 
Deploying deep learning models with Docker and Kubernetes
Deploying deep learning models with Docker and KubernetesDeploying deep learning models with Docker and Kubernetes
Deploying deep learning models with Docker and Kubernetes
PetteriTeikariPhD
 
Building a Data Pipeline from Scratch - Joe Crobak
Building a Data Pipeline from Scratch - Joe CrobakBuilding a Data Pipeline from Scratch - Joe Crobak
Building a Data Pipeline from Scratch - Joe Crobak
Hakka Labs
 

Viewers also liked (9)

A Microservice Architecture for Big Data Pipelines
A Microservice Architecture for Big Data PipelinesA Microservice Architecture for Big Data Pipelines
A Microservice Architecture for Big Data Pipelines
 
Building data pipelines
Building data pipelinesBuilding data pipelines
Building data pipelines
 
Building a Big Data Pipeline
Building a Big Data PipelineBuilding a Big Data Pipeline
Building a Big Data Pipeline
 
Data pipelines from zero to solid
Data pipelines from zero to solidData pipelines from zero to solid
Data pipelines from zero to solid
 
Apache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataApache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing data
 
Airflow - An Open Source Platform to Author and Monitor Data Pipelines
Airflow - An Open Source Platform to Author and Monitor Data PipelinesAirflow - An Open Source Platform to Author and Monitor Data Pipelines
Airflow - An Open Source Platform to Author and Monitor Data Pipelines
 
(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct
(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct
(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct
 
Deploying deep learning models with Docker and Kubernetes
Deploying deep learning models with Docker and KubernetesDeploying deep learning models with Docker and Kubernetes
Deploying deep learning models with Docker and Kubernetes
 
Building a Data Pipeline from Scratch - Joe Crobak
Building a Data Pipeline from Scratch - Joe CrobakBuilding a Data Pipeline from Scratch - Joe Crobak
Building a Data Pipeline from Scratch - Joe Crobak
 

Similar to The journey of Moving from AWS ELK to GCP Data Pipeline

GCP Meetup #3 - Approaches to Cloud Native Architectures
GCP Meetup #3 - Approaches to Cloud Native ArchitecturesGCP Meetup #3 - Approaches to Cloud Native Architectures
GCP Meetup #3 - Approaches to Cloud Native Architectures
nine
 
Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...
Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...
Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...
Openbar
 
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryCodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
Márton Kodok
 
Introduction to Google Cloud Services / Platforms
Introduction to Google Cloud Services / PlatformsIntroduction to Google Cloud Services / Platforms
Introduction to Google Cloud Services / Platforms
Nilanchal
 
DevOps KPIs as a Service: Daimler’s Solution
DevOps KPIs as a Service: Daimler’s SolutionDevOps KPIs as a Service: Daimler’s Solution
DevOps KPIs as a Service: Daimler’s Solution
VMware Tanzu
 
Serverless Comparison: AWS vs Azure vs Google vs IBM
Serverless Comparison: AWS vs Azure vs Google vs IBMServerless Comparison: AWS vs Azure vs Google vs IBM
Serverless Comparison: AWS vs Azure vs Google vs IBM
RightScale
 
Code first in the cloud: going serverless with Azure
Code first in the cloud: going serverless with AzureCode first in the cloud: going serverless with Azure
Code first in the cloud: going serverless with Azure
Jeremy Likness
 
Building real-time data analytics on Google Cloud
Building real-time data analytics on Google CloudBuilding real-time data analytics on Google Cloud
Building real-time data analytics on Google Cloud
Jonny Daenen
 
A Framework to Measure and Maximize Cloud ROI
A Framework to Measure and Maximize Cloud ROIA Framework to Measure and Maximize Cloud ROI
A Framework to Measure and Maximize Cloud ROI
RightScale
 
Best practices for developing your Magento Commerce on Cloud
Best practices for developing your Magento Commerce on CloudBest practices for developing your Magento Commerce on Cloud
Best practices for developing your Magento Commerce on Cloud
Oleg Posyniak
 
Gcp intro-20160721
Gcp intro-20160721Gcp intro-20160721
Gcp intro-20160721
Haeseung Lee
 
Accessing Google Cloud APIs
Accessing Google Cloud APIsAccessing Google Cloud APIs
Accessing Google Cloud APIs
wesley chun
 
Exploring Google APIs with Python
Exploring Google APIs with PythonExploring Google APIs with Python
Exploring Google APIs with Python
wesley chun
 
Session 4 GCCP.pptx
Session 4 GCCP.pptxSession 4 GCCP.pptx
Session 4 GCCP.pptx
DSCIITPatna
 
Kube journey 2017-04-19
Kube journey   2017-04-19Kube journey   2017-04-19
Kube journey 2017-04-19
Doug Davis
 
Azure and web sites hackaton deck
Azure and web sites hackaton deckAzure and web sites hackaton deck
Azure and web sites hackaton deck
Alexey Bokov
 
Powerful Google Cloud tools for your hack
Powerful Google Cloud tools for your hackPowerful Google Cloud tools for your hack
Powerful Google Cloud tools for your hack
wesley chun
 
Google Cloud Platform Introduction - 2016Q3
Google Cloud Platform Introduction - 2016Q3Google Cloud Platform Introduction - 2016Q3
Google Cloud Platform Introduction - 2016Q3
Simon Su
 
[Cloud OnAir] Talks by DevRel Vol.4 データ管理とデータ ベース 2020年8月27日 放送
[Cloud OnAir] Talks by DevRel Vol.4 データ管理とデータ ベース 2020年8月27日 放送[Cloud OnAir] Talks by DevRel Vol.4 データ管理とデータ ベース 2020年8月27日 放送
[Cloud OnAir] Talks by DevRel Vol.4 データ管理とデータ ベース 2020年8月27日 放送
Google Cloud Platform - Japan
 
!GDSC NYUST Infrastructure and Application Modernization with Google Cloud .pptx
!GDSC NYUST Infrastructure and Application Modernization with Google Cloud .pptx!GDSC NYUST Infrastructure and Application Modernization with Google Cloud .pptx
!GDSC NYUST Infrastructure and Application Modernization with Google Cloud .pptx
GangTingFan
 

Similar to The journey of Moving from AWS ELK to GCP Data Pipeline (20)

GCP Meetup #3 - Approaches to Cloud Native Architectures
GCP Meetup #3 - Approaches to Cloud Native ArchitecturesGCP Meetup #3 - Approaches to Cloud Native Architectures
GCP Meetup #3 - Approaches to Cloud Native Architectures
 
Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...
Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...
Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...
 
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryCodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
 
Introduction to Google Cloud Services / Platforms
Introduction to Google Cloud Services / PlatformsIntroduction to Google Cloud Services / Platforms
Introduction to Google Cloud Services / Platforms
 
DevOps KPIs as a Service: Daimler’s Solution
DevOps KPIs as a Service: Daimler’s SolutionDevOps KPIs as a Service: Daimler’s Solution
DevOps KPIs as a Service: Daimler’s Solution
 
Serverless Comparison: AWS vs Azure vs Google vs IBM
Serverless Comparison: AWS vs Azure vs Google vs IBMServerless Comparison: AWS vs Azure vs Google vs IBM
Serverless Comparison: AWS vs Azure vs Google vs IBM
 
Code first in the cloud: going serverless with Azure
Code first in the cloud: going serverless with AzureCode first in the cloud: going serverless with Azure
Code first in the cloud: going serverless with Azure
 
Building real-time data analytics on Google Cloud
Building real-time data analytics on Google CloudBuilding real-time data analytics on Google Cloud
Building real-time data analytics on Google Cloud
 
A Framework to Measure and Maximize Cloud ROI
A Framework to Measure and Maximize Cloud ROIA Framework to Measure and Maximize Cloud ROI
A Framework to Measure and Maximize Cloud ROI
 
Best practices for developing your Magento Commerce on Cloud
Best practices for developing your Magento Commerce on CloudBest practices for developing your Magento Commerce on Cloud
Best practices for developing your Magento Commerce on Cloud
 
Gcp intro-20160721
Gcp intro-20160721Gcp intro-20160721
Gcp intro-20160721
 
Accessing Google Cloud APIs
Accessing Google Cloud APIsAccessing Google Cloud APIs
Accessing Google Cloud APIs
 
Exploring Google APIs with Python
Exploring Google APIs with PythonExploring Google APIs with Python
Exploring Google APIs with Python
 
Session 4 GCCP.pptx
Session 4 GCCP.pptxSession 4 GCCP.pptx
Session 4 GCCP.pptx
 
Kube journey 2017-04-19
Kube journey   2017-04-19Kube journey   2017-04-19
Kube journey 2017-04-19
 
Azure and web sites hackaton deck
Azure and web sites hackaton deckAzure and web sites hackaton deck
Azure and web sites hackaton deck
 
Powerful Google Cloud tools for your hack
Powerful Google Cloud tools for your hackPowerful Google Cloud tools for your hack
Powerful Google Cloud tools for your hack
 
Google Cloud Platform Introduction - 2016Q3
Google Cloud Platform Introduction - 2016Q3Google Cloud Platform Introduction - 2016Q3
Google Cloud Platform Introduction - 2016Q3
 
[Cloud OnAir] Talks by DevRel Vol.4 データ管理とデータ ベース 2020年8月27日 放送
[Cloud OnAir] Talks by DevRel Vol.4 データ管理とデータ ベース 2020年8月27日 放送[Cloud OnAir] Talks by DevRel Vol.4 データ管理とデータ ベース 2020年8月27日 放送
[Cloud OnAir] Talks by DevRel Vol.4 データ管理とデータ ベース 2020年8月27日 放送
 
!GDSC NYUST Infrastructure and Application Modernization with Google Cloud .pptx
!GDSC NYUST Infrastructure and Application Modernization with Google Cloud .pptx!GDSC NYUST Infrastructure and Application Modernization with Google Cloud .pptx
!GDSC NYUST Infrastructure and Application Modernization with Google Cloud .pptx
 

Recently uploaded

Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Sreedhar Chowdam
 
AP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specificAP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specific
BrazilAccount1
 
Investor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptxInvestor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptx
AmarGB2
 
Hierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power SystemHierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power System
Kerry Sado
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
R&R Consult
 
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdfGoverning Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
WENKENLI1
 
CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
karthi keyan
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
Kamal Acharya
 
ML for identifying fraud using open blockchain data.pptx
ML for identifying fraud using open blockchain data.pptxML for identifying fraud using open blockchain data.pptx
ML for identifying fraud using open blockchain data.pptx
Vijay Dialani, PhD
 
English lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdfEnglish lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdf
BrazilAccount1
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation & Control
 
The Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdfThe Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdf
Pipe Restoration Solutions
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
obonagu
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
gerogepatton
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
Massimo Talia
 
ethical hacking in wireless-hacking1.ppt
ethical hacking in wireless-hacking1.pptethical hacking in wireless-hacking1.ppt
ethical hacking in wireless-hacking1.ppt
Jayaprasanna4
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
thanhdowork
 
space technology lecture notes on satellite
space technology lecture notes on satellitespace technology lecture notes on satellite
space technology lecture notes on satellite
ongomchris
 
block diagram and signal flow graph representation
block diagram and signal flow graph representationblock diagram and signal flow graph representation
block diagram and signal flow graph representation
Divya Somashekar
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
SamSarthak3
 

Recently uploaded (20)

Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
 
AP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specificAP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specific
 
Investor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptxInvestor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptx
 
Hierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power SystemHierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power System
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
 
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdfGoverning Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
 
CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
 
ML for identifying fraud using open blockchain data.pptx
ML for identifying fraud using open blockchain data.pptxML for identifying fraud using open blockchain data.pptx
ML for identifying fraud using open blockchain data.pptx
 
English lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdfEnglish lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdf
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
 
The Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdfThe Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdf
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
 
ethical hacking in wireless-hacking1.ppt
ethical hacking in wireless-hacking1.pptethical hacking in wireless-hacking1.ppt
ethical hacking in wireless-hacking1.ppt
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
 
space technology lecture notes on satellite
space technology lecture notes on satellitespace technology lecture notes on satellite
space technology lecture notes on satellite
 
block diagram and signal flow graph representation
block diagram and signal flow graph representationblock diagram and signal flow graph representation
block diagram and signal flow graph representation
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
 

The journey of Moving from AWS ELK to GCP Data Pipeline

  • 1. Build DMP on top of GCP VMFive - Randy Huang
  • 2. Agenda • Migrated Pipeline to GCP • Cost Comparison • Business Use Case • Fluentd Demo
  • 3. ELK + AWS EMR Kinesis Lambda
  • 4. Pros & Cons • Pros : • Well Support. • Well docs. • Easy to find Reference. • Cons : • High Cost. • Not open source. • Have to set the scale at first.
  • 5. Pipeline on GCP Dataflow BigQuery Machine Learning Data Visualization Compute Engine Global Load Balancing
  • 7. The Products and Services logos may be used to accurately reference Google's technology and tools, for instance in architecture diagrams. 7 Batch BI Analysis Storage
 Cloud Storage Processing
 Cloud DataflowStreaming Time Series Streaming
 Cloud Pub/Sub Storage
 BigQuery
  • 8. The Products and Services logos may be used to accurately reference Google's technology and tools, for instance in architecture diagrams. 8 Targeting Engines Data Sources Machine Learning Applications API Backend
 Compute Engine Spark MLlib
 Cloud Dataproc App Engine Transform Data Hosted Models
 Cloud Machine Learning Real-Time
 Prediction API Device Related
 Cloud Pub/Sub Behavior Related
 Cloud Pub/Sub 3rd Party Data
 Cloud Pub/Sub Redis
 Compute Engine
  • 9. Pros & Cons • Pros : • Cost-effective. • Operation-effective. • Google got your back. • Cons : • API/SDK changes everyday. • Some still in beta mode. • Docs everywhere.
  • 10. Workflow Monitoring • Digdag <Airflow/Oozie/Luigi> • Native support Python & Ruby • Multi-Cloud • Modular • Workflow as code • Docker Support • Altering to Slack
  • 13.
  • 14. Cost Comparison • $2000 on AWS per month • about $200 on GCP production • about another $200 for dev • 50M events per month
  • 15. Business Use Case • Digital Ads Targeting • User Behavior Tagging • BI • GEO Reporting • KPI Reporting • User Demographic
  • 16. Some Tips • BigQuery • https://status.cloud.google.com/incident/bigquery/ 18022 • Solved by Fluentd’s Retry and HA • Dataflow’s SDK & docs is not sync • Dataflow Sideinput has a bug with Streaming mode • Compute Engine SLB - TCP/UDP setup for forwarding
  • 17. Flunetd Update • Release note for v0.14 • sub second event flush • New Plugin APIS support formatting configurations dynamically (e.g., path /my/dest/${tag}/mydata.%Y-%m-%d.log) • Secure Forward
  • 18. Demo • Nginx -> Fluentd -> BigQuery -> DataStudio • MySQL -> Fluentd -> BigQuery