SlideShare a Scribd company logo
Airflow 4 Manager
Alex Pongpech
What is Airflow?
● Apache Airflow is an open-source workflow
management platform. It started at Airbnb in October
2014 as a solution to manage the company's
increasingly complex workflows
● Airflow is a platform to programmatically author,
schedule and monitor workflows.
● Use airflow to author workflows as directed acyclic
graphs (DAGs) of tasks. The airflow scheduler executes
your tasks on an array of workers while following the
specified dependencies.
2
Why use Airflow?
● Have you ever managed messy data pipeline? In
real life, data pipeline can be pretty messy
● Using CLI interface
● Had you ever have to adjust your workflow? Surely
you want to be able to scale your workflow up and
down quickly and effectively
3
When not to use Airflow
A sampling of examples that Airflow can not satisfy in a first-class
way includes:
● DAGs which need to be run off-schedule or with no schedule
at all
● DAGs that run concurrently with the same start time
● DAGs with complicated branching logic
● DAGs with many fast tasks
● DAGs which rely on the exchange of data
● Parametrized DAGs
● Dynamic DAGs
4
What Airflow is used for in general?
● Monitoring Cron jobs
● transferring data from one place to other.
● Automating your DevOps operations.
● Periodically fetching data from websites and
update the database for your awesome price
comparison system.
● Data processing for recommendation based
systems.
● Machine Learning Pipelines.
5
So how others have used Airflow
6
Robinhood
● Managing dependencies between jobs was difficult.
With cron we would use worst-case expected durations
for upstream jobs to schedule downstream jobs.
● Failure handling and alerting had to be managed by the
job. We would have to rely on the job, or the on-call
engineer to handle retries and upstream failures in the
case of dependent jobs.
● Retrospection was difficult. We would need to sift
through logs or alerts to check how a job may have
performed on a certain day in the past.
7
Google
● In May 2018 Google announced Google Cloud
Composer, a managed Apache Airflow service that is
fully integrated in the Google Cloud platform and has
thus become one of the cornerstones for
orchestrating managed services in Google Cloud.
●
8
References
1. https://medium.com/analytics-and-data/10-benefits-to-using-airflow-33d312537bae
2. https://medium.com/analytics-and-data/on-the-evolution-of-data-engineering-c5e5
6d273e37
3. https://towardsdatascience.com/getting-started-with-apache-airflow-df1aa77d7b1b
4. https://en.paradigmadigital.com/dev/apache-airflow/
5. https://robinhood.engineering/why-robinhood-uses-airflow-aed13a9a90c8
6. https://towardsdatascience.com/building-a-production-level-etl-pipeline-platform-us
ing-apache-airflow-a4cf34203fbd
7. https://medium.com/the-prefect-blog/why-not-airflow-4cfa423299c4
9

More Related Content

What's hot

Flink Forward Berlin 2018: Krzysztof Zarzycki & Alexey Brodovshuk - "Assistin...
Flink Forward Berlin 2018: Krzysztof Zarzycki & Alexey Brodovshuk - "Assistin...Flink Forward Berlin 2018: Krzysztof Zarzycki & Alexey Brodovshuk - "Assistin...
Flink Forward Berlin 2018: Krzysztof Zarzycki & Alexey Brodovshuk - "Assistin...
Flink Forward
 
How I learned to time travel, or, data pipelining and scheduling with Airflow
How I learned to time travel, or, data pipelining and scheduling with AirflowHow I learned to time travel, or, data pipelining and scheduling with Airflow
How I learned to time travel, or, data pipelining and scheduling with Airflow
PyData
 
From business requirements to working pipelines with apache airflow
From business requirements to working pipelines with apache airflowFrom business requirements to working pipelines with apache airflow
From business requirements to working pipelines with apache airflow
Derrick Qin
 
Building an analytics workflow using Apache Airflow
Building an analytics workflow using Apache AirflowBuilding an analytics workflow using Apache Airflow
Building an analytics workflow using Apache Airflow
Yohei Onishi
 
Flink Forward Berlin 2018: Stephan Ewen - Keynote: "Unlocking the next wave o...
Flink Forward Berlin 2018: Stephan Ewen - Keynote: "Unlocking the next wave o...Flink Forward Berlin 2018: Stephan Ewen - Keynote: "Unlocking the next wave o...
Flink Forward Berlin 2018: Stephan Ewen - Keynote: "Unlocking the next wave o...
Flink Forward
 
Flink Forward Berlin 2018: Raj Subramani - "A streaming Quantitative Analytic...
Flink Forward Berlin 2018: Raj Subramani - "A streaming Quantitative Analytic...Flink Forward Berlin 2018: Raj Subramani - "A streaming Quantitative Analytic...
Flink Forward Berlin 2018: Raj Subramani - "A streaming Quantitative Analytic...
Flink Forward
 
Flink Forward Berlin 2018: Viktor Klang - Keynote "The convergence of stream ...
Flink Forward Berlin 2018: Viktor Klang - Keynote "The convergence of stream ...Flink Forward Berlin 2018: Viktor Klang - Keynote "The convergence of stream ...
Flink Forward Berlin 2018: Viktor Klang - Keynote "The convergence of stream ...
Flink Forward
 
Flink Forward Berlin 2018: Brian Wolfe - "Upshot: distributed tracing using F...
Flink Forward Berlin 2018: Brian Wolfe - "Upshot: distributed tracing using F...Flink Forward Berlin 2018: Brian Wolfe - "Upshot: distributed tracing using F...
Flink Forward Berlin 2018: Brian Wolfe - "Upshot: distributed tracing using F...
Flink Forward
 
Apache Airflow overview
Apache Airflow overviewApache Airflow overview
Apache Airflow overview
NikolayGrishchenkov
 
Flink Forward Berlin 2018: Timo Walther - "Flink SQL in Action"
Flink Forward Berlin 2018: Timo Walther - "Flink SQL in Action"Flink Forward Berlin 2018: Timo Walther - "Flink SQL in Action"
Flink Forward Berlin 2018: Timo Walther - "Flink SQL in Action"
Flink Forward
 
Weavework Flagger Demo- AWS Container Day 2019 Barcelona
Weavework Flagger Demo- AWS Container Day 2019 BarcelonaWeavework Flagger Demo- AWS Container Day 2019 Barcelona
Weavework Flagger Demo- AWS Container Day 2019 Barcelona
Amazon Web Services
 
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
Flink Forward
 
Building GraphQL Applications with Relay - GraphQL Manchester #1
Building GraphQL Applications with Relay - GraphQL Manchester #1Building GraphQL Applications with Relay - GraphQL Manchester #1
Building GraphQL Applications with Relay - GraphQL Manchester #1
Chris Grice
 
Flink Forward Berlin 2018: Oleksandr Nitavskyi - "Data lossless event time st...
Flink Forward Berlin 2018: Oleksandr Nitavskyi - "Data lossless event time st...Flink Forward Berlin 2018: Oleksandr Nitavskyi - "Data lossless event time st...
Flink Forward Berlin 2018: Oleksandr Nitavskyi - "Data lossless event time st...
Flink Forward
 
Running Flink in Production: The good, The bad and The in Between - Lakshmi ...
Running Flink in Production:  The good, The bad and The in Between - Lakshmi ...Running Flink in Production:  The good, The bad and The in Between - Lakshmi ...
Running Flink in Production: The good, The bad and The in Between - Lakshmi ...
Flink Forward
 
Apache Beam: Lote portátil y procesamiento de transmisión
Apache Beam: Lote portátil y procesamiento de transmisiónApache Beam: Lote portátil y procesamiento de transmisión
Apache Beam: Lote portátil y procesamiento de transmisión
Globant
 
The Past, Present, and Future of Apache Flink
The Past, Present, and Future of Apache FlinkThe Past, Present, and Future of Apache Flink
The Past, Present, and Future of Apache Flink
Aljoscha Krettek
 
Tuning Flink For Robustness And Performance
Tuning Flink For Robustness And PerformanceTuning Flink For Robustness And Performance
Tuning Flink For Robustness And Performance
Stefan Richter
 
Caribbean Developers Conference - 201K8s
Caribbean Developers Conference - 201K8sCaribbean Developers Conference - 201K8s
Caribbean Developers Conference - 201K8s
Ravi Lachhman
 
Frameworks for geoprocessing on the web with R
Frameworks for geoprocessing on the web with RFrameworks for geoprocessing on the web with R
Frameworks for geoprocessing on the web with R
Daniel Nüst
 

What's hot (20)

Flink Forward Berlin 2018: Krzysztof Zarzycki & Alexey Brodovshuk - "Assistin...
Flink Forward Berlin 2018: Krzysztof Zarzycki & Alexey Brodovshuk - "Assistin...Flink Forward Berlin 2018: Krzysztof Zarzycki & Alexey Brodovshuk - "Assistin...
Flink Forward Berlin 2018: Krzysztof Zarzycki & Alexey Brodovshuk - "Assistin...
 
How I learned to time travel, or, data pipelining and scheduling with Airflow
How I learned to time travel, or, data pipelining and scheduling with AirflowHow I learned to time travel, or, data pipelining and scheduling with Airflow
How I learned to time travel, or, data pipelining and scheduling with Airflow
 
From business requirements to working pipelines with apache airflow
From business requirements to working pipelines with apache airflowFrom business requirements to working pipelines with apache airflow
From business requirements to working pipelines with apache airflow
 
Building an analytics workflow using Apache Airflow
Building an analytics workflow using Apache AirflowBuilding an analytics workflow using Apache Airflow
Building an analytics workflow using Apache Airflow
 
Flink Forward Berlin 2018: Stephan Ewen - Keynote: "Unlocking the next wave o...
Flink Forward Berlin 2018: Stephan Ewen - Keynote: "Unlocking the next wave o...Flink Forward Berlin 2018: Stephan Ewen - Keynote: "Unlocking the next wave o...
Flink Forward Berlin 2018: Stephan Ewen - Keynote: "Unlocking the next wave o...
 
Flink Forward Berlin 2018: Raj Subramani - "A streaming Quantitative Analytic...
Flink Forward Berlin 2018: Raj Subramani - "A streaming Quantitative Analytic...Flink Forward Berlin 2018: Raj Subramani - "A streaming Quantitative Analytic...
Flink Forward Berlin 2018: Raj Subramani - "A streaming Quantitative Analytic...
 
Flink Forward Berlin 2018: Viktor Klang - Keynote "The convergence of stream ...
Flink Forward Berlin 2018: Viktor Klang - Keynote "The convergence of stream ...Flink Forward Berlin 2018: Viktor Klang - Keynote "The convergence of stream ...
Flink Forward Berlin 2018: Viktor Klang - Keynote "The convergence of stream ...
 
Flink Forward Berlin 2018: Brian Wolfe - "Upshot: distributed tracing using F...
Flink Forward Berlin 2018: Brian Wolfe - "Upshot: distributed tracing using F...Flink Forward Berlin 2018: Brian Wolfe - "Upshot: distributed tracing using F...
Flink Forward Berlin 2018: Brian Wolfe - "Upshot: distributed tracing using F...
 
Apache Airflow overview
Apache Airflow overviewApache Airflow overview
Apache Airflow overview
 
Flink Forward Berlin 2018: Timo Walther - "Flink SQL in Action"
Flink Forward Berlin 2018: Timo Walther - "Flink SQL in Action"Flink Forward Berlin 2018: Timo Walther - "Flink SQL in Action"
Flink Forward Berlin 2018: Timo Walther - "Flink SQL in Action"
 
Weavework Flagger Demo- AWS Container Day 2019 Barcelona
Weavework Flagger Demo- AWS Container Day 2019 BarcelonaWeavework Flagger Demo- AWS Container Day 2019 Barcelona
Weavework Flagger Demo- AWS Container Day 2019 Barcelona
 
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
 
Building GraphQL Applications with Relay - GraphQL Manchester #1
Building GraphQL Applications with Relay - GraphQL Manchester #1Building GraphQL Applications with Relay - GraphQL Manchester #1
Building GraphQL Applications with Relay - GraphQL Manchester #1
 
Flink Forward Berlin 2018: Oleksandr Nitavskyi - "Data lossless event time st...
Flink Forward Berlin 2018: Oleksandr Nitavskyi - "Data lossless event time st...Flink Forward Berlin 2018: Oleksandr Nitavskyi - "Data lossless event time st...
Flink Forward Berlin 2018: Oleksandr Nitavskyi - "Data lossless event time st...
 
Running Flink in Production: The good, The bad and The in Between - Lakshmi ...
Running Flink in Production:  The good, The bad and The in Between - Lakshmi ...Running Flink in Production:  The good, The bad and The in Between - Lakshmi ...
Running Flink in Production: The good, The bad and The in Between - Lakshmi ...
 
Apache Beam: Lote portátil y procesamiento de transmisión
Apache Beam: Lote portátil y procesamiento de transmisiónApache Beam: Lote portátil y procesamiento de transmisión
Apache Beam: Lote portátil y procesamiento de transmisión
 
The Past, Present, and Future of Apache Flink
The Past, Present, and Future of Apache FlinkThe Past, Present, and Future of Apache Flink
The Past, Present, and Future of Apache Flink
 
Tuning Flink For Robustness And Performance
Tuning Flink For Robustness And PerformanceTuning Flink For Robustness And Performance
Tuning Flink For Robustness And Performance
 
Caribbean Developers Conference - 201K8s
Caribbean Developers Conference - 201K8sCaribbean Developers Conference - 201K8s
Caribbean Developers Conference - 201K8s
 
Frameworks for geoprocessing on the web with R
Frameworks for geoprocessing on the web with RFrameworks for geoprocessing on the web with R
Frameworks for geoprocessing on the web with R
 

Similar to Airflow 4 manager

Airflow Intro-1.pdf
Airflow Intro-1.pdfAirflow Intro-1.pdf
Airflow Intro-1.pdf
BagustTriCahyo1
 
Introduction to Apache Airflow
Introduction to Apache AirflowIntroduction to Apache Airflow
Introduction to Apache Airflow
mutt_data
 
Sashko Stubailo - The GraphQL and Apollo Stack: connecting everything together
Sashko Stubailo - The GraphQL and Apollo Stack: connecting everything togetherSashko Stubailo - The GraphQL and Apollo Stack: connecting everything together
Sashko Stubailo - The GraphQL and Apollo Stack: connecting everything together
React Conf Brasil
 
The Apollo and GraphQL Stack
The Apollo and GraphQL StackThe Apollo and GraphQL Stack
The Apollo and GraphQL Stack
Sashko Stubailo
 
apacheairflow-160827123852.pdf
apacheairflow-160827123852.pdfapacheairflow-160827123852.pdf
apacheairflow-160827123852.pdf
vijayapraba1
 
Serverless - DevOps Lessons Learned From Production
Serverless - DevOps Lessons Learned From ProductionServerless - DevOps Lessons Learned From Production
Serverless - DevOps Lessons Learned From Production
Steve Hogg
 
Breaking the Monolith
Breaking the MonolithBreaking the Monolith
Breaking the Monolith
VMware Tanzu
 
Serverless Days Ahmedabad - Dhaval Nagar.pptx.pdf
Serverless Days Ahmedabad - Dhaval Nagar.pptx.pdfServerless Days Ahmedabad - Dhaval Nagar.pptx.pdf
Serverless Days Ahmedabad - Dhaval Nagar.pptx.pdf
Dhaval Nagar
 
Managing transactions on Ethereum with Apache Airflow
Managing transactions on Ethereum with Apache AirflowManaging transactions on Ethereum with Apache Airflow
Managing transactions on Ethereum with Apache Airflow
Michael Ghen
 
Apache Airflow Introduction
Apache Airflow IntroductionApache Airflow Introduction
Apache Airflow Introduction
Liangjun Jiang
 
Prefect Workflow Applications 2023.pdf
Prefect Workflow Applications 2023.pdfPrefect Workflow Applications 2023.pdf
Prefect Workflow Applications 2023.pdf
Jeff Hale
 
Airflow 101
Airflow 101Airflow 101
Airflow 101
SaarBergerbest
 
Airflow techtonic template
Airflow   techtonic templateAirflow   techtonic template
Airflow techtonic template
Sampath Kumar
 
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020
Mariano Gonzalez
 
Apache Airflow
Apache AirflowApache Airflow
Apache Airflow
Knoldus Inc.
 
Apache Airflow
Apache AirflowApache Airflow
Apache Airflow
Knoldus Inc.
 
DataPipelineApacheAirflow.pptx
DataPipelineApacheAirflow.pptxDataPipelineApacheAirflow.pptx
DataPipelineApacheAirflow.pptx
John J Zhao
 
Airflow presentation
Airflow presentationAirflow presentation
Airflow presentation
Anant Corporation
 
DevOpsDays Tel Aviv DEC 2022 | Building A Cloud-Native Platform Brick by Bric...
DevOpsDays Tel Aviv DEC 2022 | Building A Cloud-Native Platform Brick by Bric...DevOpsDays Tel Aviv DEC 2022 | Building A Cloud-Native Platform Brick by Bric...
DevOpsDays Tel Aviv DEC 2022 | Building A Cloud-Native Platform Brick by Bric...
Haggai Philip Zagury
 
DevOps for TYPO3 Teams and Projects
DevOps for TYPO3 Teams and ProjectsDevOps for TYPO3 Teams and Projects
DevOps for TYPO3 Teams and ProjectsFedir RYKHTIK
 

Similar to Airflow 4 manager (20)

Airflow Intro-1.pdf
Airflow Intro-1.pdfAirflow Intro-1.pdf
Airflow Intro-1.pdf
 
Introduction to Apache Airflow
Introduction to Apache AirflowIntroduction to Apache Airflow
Introduction to Apache Airflow
 
Sashko Stubailo - The GraphQL and Apollo Stack: connecting everything together
Sashko Stubailo - The GraphQL and Apollo Stack: connecting everything togetherSashko Stubailo - The GraphQL and Apollo Stack: connecting everything together
Sashko Stubailo - The GraphQL and Apollo Stack: connecting everything together
 
The Apollo and GraphQL Stack
The Apollo and GraphQL StackThe Apollo and GraphQL Stack
The Apollo and GraphQL Stack
 
apacheairflow-160827123852.pdf
apacheairflow-160827123852.pdfapacheairflow-160827123852.pdf
apacheairflow-160827123852.pdf
 
Serverless - DevOps Lessons Learned From Production
Serverless - DevOps Lessons Learned From ProductionServerless - DevOps Lessons Learned From Production
Serverless - DevOps Lessons Learned From Production
 
Breaking the Monolith
Breaking the MonolithBreaking the Monolith
Breaking the Monolith
 
Serverless Days Ahmedabad - Dhaval Nagar.pptx.pdf
Serverless Days Ahmedabad - Dhaval Nagar.pptx.pdfServerless Days Ahmedabad - Dhaval Nagar.pptx.pdf
Serverless Days Ahmedabad - Dhaval Nagar.pptx.pdf
 
Managing transactions on Ethereum with Apache Airflow
Managing transactions on Ethereum with Apache AirflowManaging transactions on Ethereum with Apache Airflow
Managing transactions on Ethereum with Apache Airflow
 
Apache Airflow Introduction
Apache Airflow IntroductionApache Airflow Introduction
Apache Airflow Introduction
 
Prefect Workflow Applications 2023.pdf
Prefect Workflow Applications 2023.pdfPrefect Workflow Applications 2023.pdf
Prefect Workflow Applications 2023.pdf
 
Airflow 101
Airflow 101Airflow 101
Airflow 101
 
Airflow techtonic template
Airflow   techtonic templateAirflow   techtonic template
Airflow techtonic template
 
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020
 
Apache Airflow
Apache AirflowApache Airflow
Apache Airflow
 
Apache Airflow
Apache AirflowApache Airflow
Apache Airflow
 
DataPipelineApacheAirflow.pptx
DataPipelineApacheAirflow.pptxDataPipelineApacheAirflow.pptx
DataPipelineApacheAirflow.pptx
 
Airflow presentation
Airflow presentationAirflow presentation
Airflow presentation
 
DevOpsDays Tel Aviv DEC 2022 | Building A Cloud-Native Platform Brick by Bric...
DevOpsDays Tel Aviv DEC 2022 | Building A Cloud-Native Platform Brick by Bric...DevOpsDays Tel Aviv DEC 2022 | Building A Cloud-Native Platform Brick by Bric...
DevOpsDays Tel Aviv DEC 2022 | Building A Cloud-Native Platform Brick by Bric...
 
DevOps for TYPO3 Teams and Projects
DevOps for TYPO3 Teams and ProjectsDevOps for TYPO3 Teams and Projects
DevOps for TYPO3 Teams and Projects
 

More from Worapol Alex Pongpech, PhD

Blockchain based Customer Relation System
Blockchain based Customer Relation SystemBlockchain based Customer Relation System
Blockchain based Customer Relation System
Worapol Alex Pongpech, PhD
 
Fast analytics kudu to druid
Fast analytics  kudu to druidFast analytics  kudu to druid
Fast analytics kudu to druid
Worapol Alex Pongpech, PhD
 
Apache Kafka
Apache Kafka Apache Kafka
Building business intuition from data
Building business intuition from dataBuilding business intuition from data
Building business intuition from data
Worapol Alex Pongpech, PhD
 
10 basic terms so you can talk to data engineer
10 basic terms so you can  talk to data engineer10 basic terms so you can  talk to data engineer
10 basic terms so you can talk to data engineer
Worapol Alex Pongpech, PhD
 
Why are we using kubernetes
Why are we using kubernetesWhy are we using kubernetes
Why are we using kubernetes
Worapol Alex Pongpech, PhD
 
Fast Analytics
Fast Analytics Fast Analytics
In15orlesss hadoop
In15orlesss hadoopIn15orlesss hadoop
In15orlesss hadoop
Worapol Alex Pongpech, PhD
 

More from Worapol Alex Pongpech, PhD (9)

Blockchain based Customer Relation System
Blockchain based Customer Relation SystemBlockchain based Customer Relation System
Blockchain based Customer Relation System
 
Fast analytics kudu to druid
Fast analytics  kudu to druidFast analytics  kudu to druid
Fast analytics kudu to druid
 
Apache Kafka
Apache Kafka Apache Kafka
Apache Kafka
 
Building business intuition from data
Building business intuition from dataBuilding business intuition from data
Building business intuition from data
 
10 basic terms so you can talk to data engineer
10 basic terms so you can  talk to data engineer10 basic terms so you can  talk to data engineer
10 basic terms so you can talk to data engineer
 
Why are we using kubernetes
Why are we using kubernetesWhy are we using kubernetes
Why are we using kubernetes
 
Fast Analytics
Fast Analytics Fast Analytics
Fast Analytics
 
Dark data
Dark dataDark data
Dark data
 
In15orlesss hadoop
In15orlesss hadoopIn15orlesss hadoop
In15orlesss hadoop
 

Recently uploaded

By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Vladimir Iglovikov, Ph.D.
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 

Recently uploaded (20)

By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 

Airflow 4 manager

  • 2. What is Airflow? ● Apache Airflow is an open-source workflow management platform. It started at Airbnb in October 2014 as a solution to manage the company's increasingly complex workflows ● Airflow is a platform to programmatically author, schedule and monitor workflows. ● Use airflow to author workflows as directed acyclic graphs (DAGs) of tasks. The airflow scheduler executes your tasks on an array of workers while following the specified dependencies. 2
  • 3. Why use Airflow? ● Have you ever managed messy data pipeline? In real life, data pipeline can be pretty messy ● Using CLI interface ● Had you ever have to adjust your workflow? Surely you want to be able to scale your workflow up and down quickly and effectively 3
  • 4. When not to use Airflow A sampling of examples that Airflow can not satisfy in a first-class way includes: ● DAGs which need to be run off-schedule or with no schedule at all ● DAGs that run concurrently with the same start time ● DAGs with complicated branching logic ● DAGs with many fast tasks ● DAGs which rely on the exchange of data ● Parametrized DAGs ● Dynamic DAGs 4
  • 5. What Airflow is used for in general? ● Monitoring Cron jobs ● transferring data from one place to other. ● Automating your DevOps operations. ● Periodically fetching data from websites and update the database for your awesome price comparison system. ● Data processing for recommendation based systems. ● Machine Learning Pipelines. 5
  • 6. So how others have used Airflow 6
  • 7. Robinhood ● Managing dependencies between jobs was difficult. With cron we would use worst-case expected durations for upstream jobs to schedule downstream jobs. ● Failure handling and alerting had to be managed by the job. We would have to rely on the job, or the on-call engineer to handle retries and upstream failures in the case of dependent jobs. ● Retrospection was difficult. We would need to sift through logs or alerts to check how a job may have performed on a certain day in the past. 7
  • 8. Google ● In May 2018 Google announced Google Cloud Composer, a managed Apache Airflow service that is fully integrated in the Google Cloud platform and has thus become one of the cornerstones for orchestrating managed services in Google Cloud. ● 8
  • 9. References 1. https://medium.com/analytics-and-data/10-benefits-to-using-airflow-33d312537bae 2. https://medium.com/analytics-and-data/on-the-evolution-of-data-engineering-c5e5 6d273e37 3. https://towardsdatascience.com/getting-started-with-apache-airflow-df1aa77d7b1b 4. https://en.paradigmadigital.com/dev/apache-airflow/ 5. https://robinhood.engineering/why-robinhood-uses-airflow-aed13a9a90c8 6. https://towardsdatascience.com/building-a-production-level-etl-pipeline-platform-us ing-apache-airflow-a4cf34203fbd 7. https://medium.com/the-prefect-blog/why-not-airflow-4cfa423299c4 9