Introduction
Airflow – an Automation Tool
Metfone 2022
AGENDA
What is Airflow
Benefits of Airflow
Airflow Architecture
Core Concepts
How to create workflow
Demo
WHAT IS AIRFLOW
Airflow is a workflow orchestration tool. It:
• Manages the scheduling and running of jobs and data pipelines
• Ensures jobs run in the correct order based on their dependencies
• Provides mechanisms for tracking and monitoring the state of jobs and for recovering from failure
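The "tracking state and recovering from failure" idea can be illustrated with a small plain-Python sketch. This is not Airflow code and not how Airflow is implemented; `run_with_retries` and `flaky_job` are hypothetical names used only to show a job being retried and its final state recorded.

```python
def run_with_retries(job, max_retries=2):
    """Run job(); on an exception, retry up to max_retries more times.

    Returns a (state, attempts) tuple, where state is "success" or "failed".
    """
    attempts = 0
    while attempts <= max_retries:
        attempts += 1
        try:
            job()
            return "success", attempts
        except Exception:
            # The job failed; loop again if any retries are left.
            continue
    return "failed", attempts


# A hypothetical job that fails on its first call and succeeds afterwards.
calls = {"count": 0}


def flaky_job():
    calls["count"] += 1
    if calls["count"] < 2:
        raise RuntimeError("temporary failure")


result = run_with_retries(flaky_job)
print(result)
```

In Airflow itself, a comparable effect is configured per task (for example with the `retries` argument), and the state of each attempt is visible in the web interface.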
BENEFITS OF AIRFLOW
• Easy to use: Users need only basic Python knowledge.
• Open source: Free and open source, with a large community of contributors.
• Robust integrations: Ready to use with various platforms and systems: Bash, SFTP, MySQL, Postgres, Oracle, Python …
• Standard Python: Workflows are written in Python, which makes them very flexible.
• Visualization: Workflows can be monitored and managed through a web interface.
AIRFLOW ARCHITECTURE
CORE CONCEPTS
DAG (Directed Acyclic Graph): A DAG is the set of tasks that you want to run as part of
your workflow. These might include running a Bash command or performing a check
with a Python or database script ... In Airflow, each of these steps is written as an
individual task in a DAG.
Airflow also lets you specify the relationships between tasks, any dependencies
(e.g. data having been loaded into a table before a task runs), and the order in which the
tasks should run.
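To make the "directed acyclic graph" idea concrete, here is a minimal plain-Python sketch that derives a valid run order from a set of dependencies. It uses the standard-library `graphlib` module (Python 3.9+) rather than Airflow, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Each key is a task; its value is the set of tasks that must finish first.
dependencies = {
    "load_table": set(),
    "run_checklist": {"load_table"},
    "send_report": {"run_checklist"},
}


def execution_order(deps):
    """Return one valid run order for the tasks.

    Raises graphlib.CycleError if the dependencies contain a cycle,
    which is why the graph has to be acyclic.
    """
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

Because every task here depends on the previous one, only one order is possible: load the table, run the checklist, then send the report.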
CORE CONCEPTS
Task: A task represents one node of a defined DAG. Each task is one step of the
workflow and appears as a node in the graph view; the actual work it performs is
defined by an operator.
CORE CONCEPTS
Operators: An operator encapsulates the operation to be performed by a task in a DAG.
Airflow has a wide range of built-in operators that perform specific kinds of work, some of
which are platform-specific. It is also possible to create your own custom operators.
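The shape of a custom operator can be sketched without installing Airflow. In the sketch below, `Operator` is a hypothetical stand-in base class; in Airflow the real base class is `BaseOperator`, and a custom operator overrides its `execute(self, context)` method. `GreetOperator` and its `name` parameter are illustrative only.

```python
class Operator:
    """Stand-in base class (hypothetical); Airflow's real one is BaseOperator."""

    def __init__(self, task_id):
        self.task_id = task_id

    def execute(self, context):
        raise NotImplementedError("subclasses define the actual work")


class GreetOperator(Operator):
    """A custom operator: the work it does lives in execute()."""

    def __init__(self, task_id, name):
        super().__init__(task_id)
        self.name = name

    def execute(self, context):
        # context is a dict of run information; "ds" is the run date string.
        return f"Hello, {self.name}! (run date: {context['ds']})"


task = GreetOperator(task_id="greet", name="Metfone")
message = task.execute({"ds": "2022-01-01"})
print(message)
```

The point of the pattern is that the DAG only knows about tasks, while each operator class decides what a task actually does when it runs.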
HOW TO CREATE WORKFLOW
A DAG is saved as a .py file in the dags directory. The steps to create a DAG:
DEFINE DAG
• Define the dag_id, the start_date, and how often the tasks should run.
• Airflow accepts a cron expression to define the schedule.
CREATE TASKS
• Airflow provides a range of operators that cover most functions (Bash, databases (Oracle, Postgres, MySQL …), Python, FTP …) for creating tasks.
ORDER TASKS
• Define the dependencies between tasks and the order in which they run.
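The three steps above might look like the following DAG definition file. This is a minimal sketch that assumes Airflow 2.x is installed; the dag_id, task ids, schedule, and the `run_checklist` function are hypothetical examples, not taken from the demo. Newer Airflow releases rename `schedule_interval` to `schedule`, so adjust the argument to the installed version.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator


def run_checklist():
    """Placeholder check; a real task would query a system or database."""
    print("checklist passed")


# Step 1: define the DAG (dag_id, start_date, and schedule).
with DAG(
    dag_id="daily_checklist",        # hypothetical name
    start_date=datetime(2022, 1, 1),
    schedule_interval="0 6 * * *",   # cron expression: every day at 06:00
    catchup=False,                   # do not run for past, missed intervals
) as dag:
    # Step 2: create tasks from operators.
    prepare = BashOperator(task_id="prepare", bash_command="echo 'preparing'")
    check = PythonOperator(task_id="check", python_callable=run_checklist)
    report = BashOperator(task_id="report", bash_command="echo 'done'")

    # Step 3: order the tasks (prepare runs first, report runs last).
    prepare >> check >> report
```

Saving a file like this in the dags directory should be enough for the scheduler to pick it up and show it in the web interface.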
References:
https://airflow.apache.org/docs/apache-airflow/stable/tutorial.html
https://www.youtube.com/watch?v=CLkzXrjrFKg
DEMO
http://10.79.9.131:8080/home
THANKS

