SlideShare a Scribd company logo
1 of 44
Download to read offline
Polidea
Polidea
Airflow is a platform to programmatically author,
schedule and monitor workflows.
Dynamic/Elegant
Extensible
Scalable
Polidea
Polidea
● Developing Airflow is (was !) hard
● Road taken to developer productivity
● Improving first time experience for the developers
● Focus on teamwork
Polidea
Polidea
Polidea
Hi!
Principal Software Engineer @Polidea
Apache Airflow PMC member
Certified GCP Architect
ex-Googler, ex-CTO, ex-choir member
Polidea
TALENTS
PROJECTS
DELIVERED
USERS OF
OUR APPS
OF BUSINESS
THROUGH
REFERRALS
Polidea
Polidea
Polidea
Polidea
● 100+ operators
● 18+ GCP services
● Oozie-To-Airflow
Polidea
● 1 Apache Airflow committer, 1 PMC member
● Documentation improvements
● Breeze - improved development environment
● Py2 -> Py3
● Pylint compatibility
● CI environment reimplemented
● Operator scaffolding
Polidea
Polidea
Polidea
● Multiple backends: postgres, mysql, sqlite
● Multiple python versions (2.7) - 3.5, 3.6. 3.7
● Multiple executors: Local/Sequential/Kubernetes
● Automated static code analysis
● Automated documentation building
Polidea
● Long time to set it up
● Frustrations of fresh developer experience
● High friction/learning curve for Airflow development environment
● Slow iteration speed
● Complicated Development Environment
Polidea
● Scripts only designed for CI, not local environment
● Dependencies installed every time you start the environment
● Always full database reset
● Minutes to run one test
● No guidance how to iterate over tests
Polidea
Polidea
Polidea
● Focus on developer productivity
● Faster development cycle
● Decrease developer frustration
● Improve the teamwork
● Easy for ad-hoc contributors to code & test
Polidea
● AIP-10: Multi-layered and multi-stage official Airflow image
● AIP-7: Simplified Development Workflow
● AIP-26: Production-ready Airflow Docker Image and helm chart
● AIP-23: Migrate out of Travis CI
● AIP-4: Support for System Tests for external systems
Polidea
● Local virtualenv
● Own Travis CI fork
● Docker compose (Travis CI equivalent)
Polidea
● Total time: 7 minutes
● Running one test only
● Failure at the end (!)
● Re-run - 10-20 seconds for DB
● Re-enter - same time (!)
● No bash history
Polidea
Polidea
● Docker images built from master automatically (DockerHub)
● Local images use cached images
● Tests and static checks run using Docker Compose/Docker environment
● Can be run on Kubernetes Cluster (Docker-In-Docker)
● CI system - independent
● Base to build production image
Polidea
Polidea
● works out-of-the-box
● initializes DB when needed
● environment variables set
● sub-second test overhead
● ipdb debugging
● verbose output
Polidea
Polidea
● entering the environment: ./breeze --backend sqlite --python 3.5
● last-used environment: ./breeze
● automated image management
● autocomplete of options
● sub-second test execution overhead
● host sources mounted to Docker container
● ports forwarded
● hints for ad-hoc developers
Polidea
● run-tests tests.core<TAB><TAB> autocomplete
● bash history across sessions
● run static checks with Breeze
● build documentation with Breeze
● run licence checks with Breeze
● easy debugging (including debugging with IDE)
● pre-commit checks
Polidea
Polidea
Polidea
Polidea
● Docker image management
● Run-tests with DB initialisation
● Travis CI integration
● Run all tasks (docs/static/licence check ...)
● Pre-commit checks
● Comprehensive documentation - Google Season of Docs YAY!
Polidea
Polidea
Polidea
● easy to use
○ pre-commit install
○ pre-commit run
○ pre-commit run mypy
○ pre-commit run --all-files
● run only for changed files (fast)
● catches errors early
● make committers time efficient
● promotes good practices
Polidea
Polidea
● Production-ready Apache Airflow official image
● Simplifications (less images, easier scripts)
● Migrating out of Travis CI
○ GitLab CI (only CI) or GitHub Actions
○ Kubernetes Cluster on Google Kubernetes Engine (Thanks Google!)
● Automation of Performance Tests
● Automation of Release Tests
Polidea
hello@polidea.com

More Related Content

What's hot

FOSDEM 2017: GitLab CI
FOSDEM 2017:  GitLab CIFOSDEM 2017:  GitLab CI
FOSDEM 2017: GitLab CIOlinData
 
Deploy Multinode GitLab Runner in openSUSE 15.1 Instances with Ansible Automa...
Deploy Multinode GitLab Runner in openSUSE 15.1 Instances with Ansible Automa...Deploy Multinode GitLab Runner in openSUSE 15.1 Instances with Ansible Automa...
Deploy Multinode GitLab Runner in openSUSE 15.1 Instances with Ansible Automa...Samsul Ma'arif
 
An OpenShift Migration: From 3.9 to 4.5
An OpenShift Migration: From 3.9 to 4.5An OpenShift Migration: From 3.9 to 4.5
An OpenShift Migration: From 3.9 to 4.5Everett Toews
 
Using GitLab CI
Using GitLab CIUsing GitLab CI
Using GitLab CIColCh
 
Introduction to GitHub Actions
Introduction to GitHub ActionsIntroduction to GitHub Actions
Introduction to GitHub ActionsBo-Yi Wu
 
CodiLime Tech Talk - Dawid Trzebiatowski i Wojciech Urbański: Opening the Flo...
CodiLime Tech Talk - Dawid Trzebiatowski i Wojciech Urbański: Opening the Flo...CodiLime Tech Talk - Dawid Trzebiatowski i Wojciech Urbański: Opening the Flo...
CodiLime Tech Talk - Dawid Trzebiatowski i Wojciech Urbański: Opening the Flo...CodiLime
 
Git with the flow
Git with the flowGit with the flow
Git with the flowDana White
 
DevOps of Python applications using OpenShift (Italian version)
DevOps of Python applications using OpenShift (Italian version)DevOps of Python applications using OpenShift (Italian version)
DevOps of Python applications using OpenShift (Italian version)Francesco Fiore
 
Architecting Qt Mobile Applications: Frameworks, Code Generators and Beyond
Architecting Qt Mobile Applications: Frameworks, Code Generators and BeyondArchitecting Qt Mobile Applications: Frameworks, Code Generators and Beyond
Architecting Qt Mobile Applications: Frameworks, Code Generators and BeyondSandro Andrade
 
Training: Day Two - Eclipse, Git, Maven
Training: Day Two - Eclipse, Git, MavenTraining: Day Two - Eclipse, Git, Maven
Training: Day Two - Eclipse, Git, MavenArtur Ventura
 
Open Innovation Lab (OIL) - 20150227 - GIT Intro Workshop
Open Innovation Lab (OIL) - 20150227 - GIT Intro WorkshopOpen Innovation Lab (OIL) - 20150227 - GIT Intro Workshop
Open Innovation Lab (OIL) - 20150227 - GIT Intro WorkshopWong Hoi Sing Edison
 
Git Tutorial I
Git Tutorial IGit Tutorial I
Git Tutorial IJim Yeh
 
GitLab for CI/CD process
GitLab for CI/CD processGitLab for CI/CD process
GitLab for CI/CD processHYS Enterprise
 
Upgrading to Apache Airflow 2 | Airflow Summit 2021
Upgrading to Apache Airflow 2 | Airflow Summit 2021Upgrading to Apache Airflow 2 | Airflow Summit 2021
Upgrading to Apache Airflow 2 | Airflow Summit 2021Kaxil Naik
 

What's hot (20)

FOSDEM 2017: GitLab CI
FOSDEM 2017:  GitLab CIFOSDEM 2017:  GitLab CI
FOSDEM 2017: GitLab CI
 
Deploy Multinode GitLab Runner in openSUSE 15.1 Instances with Ansible Automa...
Deploy Multinode GitLab Runner in openSUSE 15.1 Instances with Ansible Automa...Deploy Multinode GitLab Runner in openSUSE 15.1 Instances with Ansible Automa...
Deploy Multinode GitLab Runner in openSUSE 15.1 Instances with Ansible Automa...
 
An OpenShift Migration: From 3.9 to 4.5
An OpenShift Migration: From 3.9 to 4.5An OpenShift Migration: From 3.9 to 4.5
An OpenShift Migration: From 3.9 to 4.5
 
Introduction to GIT
Introduction to GITIntroduction to GIT
Introduction to GIT
 
Using GitLab CI
Using GitLab CIUsing GitLab CI
Using GitLab CI
 
Introduction to GitHub Actions
Introduction to GitHub ActionsIntroduction to GitHub Actions
Introduction to GitHub Actions
 
Gitlab ci-cd
Gitlab ci-cdGitlab ci-cd
Gitlab ci-cd
 
Git Tutorial
Git TutorialGit Tutorial
Git Tutorial
 
Flow
FlowFlow
Flow
 
CodiLime Tech Talk - Dawid Trzebiatowski i Wojciech Urbański: Opening the Flo...
CodiLime Tech Talk - Dawid Trzebiatowski i Wojciech Urbański: Opening the Flo...CodiLime Tech Talk - Dawid Trzebiatowski i Wojciech Urbański: Opening the Flo...
CodiLime Tech Talk - Dawid Trzebiatowski i Wojciech Urbański: Opening the Flo...
 
Git with the flow
Git with the flowGit with the flow
Git with the flow
 
DevOps of Python applications using OpenShift (Italian version)
DevOps of Python applications using OpenShift (Italian version)DevOps of Python applications using OpenShift (Italian version)
DevOps of Python applications using OpenShift (Italian version)
 
Architecting Qt Mobile Applications: Frameworks, Code Generators and Beyond
Architecting Qt Mobile Applications: Frameworks, Code Generators and BeyondArchitecting Qt Mobile Applications: Frameworks, Code Generators and Beyond
Architecting Qt Mobile Applications: Frameworks, Code Generators and Beyond
 
Training: Day Two - Eclipse, Git, Maven
Training: Day Two - Eclipse, Git, MavenTraining: Day Two - Eclipse, Git, Maven
Training: Day Two - Eclipse, Git, Maven
 
Open Innovation Lab (OIL) - 20150227 - GIT Intro Workshop
Open Innovation Lab (OIL) - 20150227 - GIT Intro WorkshopOpen Innovation Lab (OIL) - 20150227 - GIT Intro Workshop
Open Innovation Lab (OIL) - 20150227 - GIT Intro Workshop
 
Git Tutorial I
Git Tutorial IGit Tutorial I
Git Tutorial I
 
GitLab for CI/CD process
GitLab for CI/CD processGitLab for CI/CD process
GitLab for CI/CD process
 
Upgrading to Apache Airflow 2 | Airflow Summit 2021
Upgrading to Apache Airflow 2 | Airflow Summit 2021Upgrading to Apache Airflow 2 | Airflow Summit 2021
Upgrading to Apache Airflow 2 | Airflow Summit 2021
 
Lets git to it
Lets git to itLets git to it
Lets git to it
 
Git and Github
Git and GithubGit and Github
Git and Github
 

Similar to It's a Breeze to develop Apache Airflow (Apache Con Berlin)

It's a Breeze to develop Apache Airflow (London Apache Airflow meetup)
It's a Breeze to develop Apache Airflow (London Apache Airflow meetup)It's a Breeze to develop Apache Airflow (London Apache Airflow meetup)
It's a Breeze to develop Apache Airflow (London Apache Airflow meetup)Jarek Potiuk
 
Gocd – Kubernetes/Nomad Continuous Deployment
Gocd – Kubernetes/Nomad Continuous DeploymentGocd – Kubernetes/Nomad Continuous Deployment
Gocd – Kubernetes/Nomad Continuous DeploymentLeandro Totino Pereira
 
Advanced Code Flow, Notes From the Field
Advanced Code Flow, Notes From the FieldAdvanced Code Flow, Notes From the Field
Advanced Code Flow, Notes From the FieldAriel Moskovich
 
Dev + DevOps для PHP розробника
Dev + DevOps для PHP розробникаDev + DevOps для PHP розробника
Dev + DevOps для PHP розробникаphpfriendsclub
 
Modern Web-site Development Pipeline
Modern Web-site Development PipelineModern Web-site Development Pipeline
Modern Web-site Development PipelineGlobalLogic Ukraine
 
DrupalCon Los Angeles - Continuous Integration Toolbox
DrupalCon Los Angeles - Continuous Integration ToolboxDrupalCon Los Angeles - Continuous Integration Toolbox
DrupalCon Los Angeles - Continuous Integration ToolboxAndrii Podanenko
 
CBDW2014- Intro to CommandBox; The ColdFusion CLI, Package Manager, and REPL ...
CBDW2014- Intro to CommandBox; The ColdFusion CLI, Package Manager, and REPL ...CBDW2014- Intro to CommandBox; The ColdFusion CLI, Package Manager, and REPL ...
CBDW2014- Intro to CommandBox; The ColdFusion CLI, Package Manager, and REPL ...Ortus Solutions, Corp
 
Why You Should be Using Multi-stage Docker Builds in 2019
Why You Should be Using Multi-stage Docker Builds in 2019Why You Should be Using Multi-stage Docker Builds in 2019
Why You Should be Using Multi-stage Docker Builds in 2019Codefresh
 
Docker based-Pipelines with Codefresh
Docker based-Pipelines with CodefreshDocker based-Pipelines with Codefresh
Docker based-Pipelines with CodefreshCodefresh
 
MoldCamp - multidimentional testing workflow. CIBox.
MoldCamp  - multidimentional testing workflow. CIBox.MoldCamp  - multidimentional testing workflow. CIBox.
MoldCamp - multidimentional testing workflow. CIBox.Andrii Podanenko
 
Chef - Administration for programmers
Chef - Administration for programmersChef - Administration for programmers
Chef - Administration for programmersmrsabo
 
Docker SQL Continuous Integration Flow
Docker SQL Continuous Integration FlowDocker SQL Continuous Integration Flow
Docker SQL Continuous Integration FlowAndrii Podanenko
 
Настройка окружения для кросскомпиляции проектов на основе docker'a
Настройка окружения для кросскомпиляции проектов на основе docker'aНастройка окружения для кросскомпиляции проектов на основе docker'a
Настройка окружения для кросскомпиляции проектов на основе docker'acorehard_by
 
Ruby microservices with Docker - Sergii Koba
Ruby microservices with Docker -  Sergii KobaRuby microservices with Docker -  Sergii Koba
Ruby microservices with Docker - Sergii KobaRuby Meditation
 
Automating Complex Setups with Puppet
Automating Complex Setups with PuppetAutomating Complex Setups with Puppet
Automating Complex Setups with PuppetKris Buytaert
 
Update on the open source browser space (16th GENIVI AMM)
Update on the open source browser space (16th GENIVI AMM)Update on the open source browser space (16th GENIVI AMM)
Update on the open source browser space (16th GENIVI AMM)Igalia
 

Similar to It's a Breeze to develop Apache Airflow (Apache Con Berlin) (20)

It's a Breeze to develop Apache Airflow (London Apache Airflow meetup)
It's a Breeze to develop Apache Airflow (London Apache Airflow meetup)It's a Breeze to develop Apache Airflow (London Apache Airflow meetup)
It's a Breeze to develop Apache Airflow (London Apache Airflow meetup)
 
Gocd – Kubernetes/Nomad Continuous Deployment
Gocd – Kubernetes/Nomad Continuous DeploymentGocd – Kubernetes/Nomad Continuous Deployment
Gocd – Kubernetes/Nomad Continuous Deployment
 
Advanced Code Flow, Notes From the Field
Advanced Code Flow, Notes From the FieldAdvanced Code Flow, Notes From the Field
Advanced Code Flow, Notes From the Field
 
Dev + DevOps для PHP розробника
Dev + DevOps для PHP розробникаDev + DevOps для PHP розробника
Dev + DevOps для PHP розробника
 
Continuous testing
Continuous testingContinuous testing
Continuous testing
 
Modern Web-site Development Pipeline
Modern Web-site Development PipelineModern Web-site Development Pipeline
Modern Web-site Development Pipeline
 
DrupalCon Los Angeles - Continuous Integration Toolbox
DrupalCon Los Angeles - Continuous Integration ToolboxDrupalCon Los Angeles - Continuous Integration Toolbox
DrupalCon Los Angeles - Continuous Integration Toolbox
 
CBDW2014- Intro to CommandBox; The ColdFusion CLI, Package Manager, and REPL ...
CBDW2014- Intro to CommandBox; The ColdFusion CLI, Package Manager, and REPL ...CBDW2014- Intro to CommandBox; The ColdFusion CLI, Package Manager, and REPL ...
CBDW2014- Intro to CommandBox; The ColdFusion CLI, Package Manager, and REPL ...
 
kpatch.kgraft
kpatch.kgraftkpatch.kgraft
kpatch.kgraft
 
Why You Should be Using Multi-stage Docker Builds in 2019
Why You Should be Using Multi-stage Docker Builds in 2019Why You Should be Using Multi-stage Docker Builds in 2019
Why You Should be Using Multi-stage Docker Builds in 2019
 
Docker based-Pipelines with Codefresh
Docker based-Pipelines with CodefreshDocker based-Pipelines with Codefresh
Docker based-Pipelines with Codefresh
 
Beyond Puppet
Beyond PuppetBeyond Puppet
Beyond Puppet
 
MoldCamp - multidimentional testing workflow. CIBox.
MoldCamp  - multidimentional testing workflow. CIBox.MoldCamp  - multidimentional testing workflow. CIBox.
MoldCamp - multidimentional testing workflow. CIBox.
 
Docking with Docker
Docking with DockerDocking with Docker
Docking with Docker
 
Chef - Administration for programmers
Chef - Administration for programmersChef - Administration for programmers
Chef - Administration for programmers
 
Docker SQL Continuous Integration Flow
Docker SQL Continuous Integration FlowDocker SQL Continuous Integration Flow
Docker SQL Continuous Integration Flow
 
Настройка окружения для кросскомпиляции проектов на основе docker'a
Настройка окружения для кросскомпиляции проектов на основе docker'aНастройка окружения для кросскомпиляции проектов на основе docker'a
Настройка окружения для кросскомпиляции проектов на основе docker'a
 
Ruby microservices with Docker - Sergii Koba
Ruby microservices with Docker -  Sergii KobaRuby microservices with Docker -  Sergii Koba
Ruby microservices with Docker - Sergii Koba
 
Automating Complex Setups with Puppet
Automating Complex Setups with PuppetAutomating Complex Setups with Puppet
Automating Complex Setups with Puppet
 
Update on the open source browser space (16th GENIVI AMM)
Update on the open source browser space (16th GENIVI AMM)Update on the open source browser space (16th GENIVI AMM)
Update on the open source browser space (16th GENIVI AMM)
 

More from Jarek Potiuk

Subtle Differences between Python versions
Subtle Differences between Python versionsSubtle Differences between Python versions
Subtle Differences between Python versionsJarek Potiuk
 
Caching in Docker - the hardest thing in computer science
Caching in Docker - the hardest thing in computer scienceCaching in Docker - the hardest thing in computer science
Caching in Docker - the hardest thing in computer scienceJarek Potiuk
 
Manageable Data Pipelines With Airflow (and kubernetes) - GDG DevFest
Manageable Data Pipelines With Airflow (and kubernetes) - GDG DevFestManageable Data Pipelines With Airflow (and kubernetes) - GDG DevFest
Manageable Data Pipelines With Airflow (and kubernetes) - GDG DevFestJarek Potiuk
 
Off time - how to use social media to be more out of social media
Off time - how to use social media to be more out of social mediaOff time - how to use social media to be more out of social media
Off time - how to use social media to be more out of social mediaJarek Potiuk
 
Berlin Apache Con EU Airflow Workshops
Berlin Apache Con EU Airflow WorkshopsBerlin Apache Con EU Airflow Workshops
Berlin Apache Con EU Airflow WorkshopsJarek Potiuk
 
Manageable data pipelines with airflow (and kubernetes) november 27, 11 45 ...
Manageable data pipelines with airflow (and kubernetes)   november 27, 11 45 ...Manageable data pipelines with airflow (and kubernetes)   november 27, 11 45 ...
Manageable data pipelines with airflow (and kubernetes) november 27, 11 45 ...Jarek Potiuk
 
React native introduction (Mobile Warsaw)
React native introduction (Mobile Warsaw)React native introduction (Mobile Warsaw)
React native introduction (Mobile Warsaw)Jarek Potiuk
 

More from Jarek Potiuk (8)

Subtle Differences between Python versions
Subtle Differences between Python versionsSubtle Differences between Python versions
Subtle Differences between Python versions
 
Caching in Docker - the hardest thing in computer science
Caching in Docker - the hardest thing in computer scienceCaching in Docker - the hardest thing in computer science
Caching in Docker - the hardest thing in computer science
 
Manageable Data Pipelines With Airflow (and kubernetes) - GDG DevFest
Manageable Data Pipelines With Airflow (and kubernetes) - GDG DevFestManageable Data Pipelines With Airflow (and kubernetes) - GDG DevFest
Manageable Data Pipelines With Airflow (and kubernetes) - GDG DevFest
 
Off time - how to use social media to be more out of social media
Off time - how to use social media to be more out of social mediaOff time - how to use social media to be more out of social media
Off time - how to use social media to be more out of social media
 
Berlin Apache Con EU Airflow Workshops
Berlin Apache Con EU Airflow WorkshopsBerlin Apache Con EU Airflow Workshops
Berlin Apache Con EU Airflow Workshops
 
Manageable data pipelines with airflow (and kubernetes) november 27, 11 45 ...
Manageable data pipelines with airflow (and kubernetes)   november 27, 11 45 ...Manageable data pipelines with airflow (and kubernetes)   november 27, 11 45 ...
Manageable data pipelines with airflow (and kubernetes) november 27, 11 45 ...
 
Ci for android OS
Ci for android OSCi for android OS
Ci for android OS
 
React native introduction (Mobile Warsaw)
React native introduction (Mobile Warsaw)React native introduction (Mobile Warsaw)
React native introduction (Mobile Warsaw)
 

Recently uploaded

Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Recently uploaded (20)

Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

It's a Breeze to develop Apache Airflow (Apache Con Berlin)

  • 1.
  • 3. Polidea Airflow is a platform to programmatically author, schedule and monitor workflows. Dynamic/Elegant Extensible Scalable
  • 5. Polidea ● Developing Airflow is (was !) hard ● Road taken to developer productivity ● Improving first time experience for the developers ● Focus on teamwork
  • 8. Polidea Hi! Principal Software Engineer @Polidea Apache Airflow PMC member Certified GCP Architect ex-Googler, ex-CTO, ex-choir member
  • 13. Polidea ● 100+ operators ● 18+ GCP services ● Oozie-To-Airflow
  • 14. Polidea ● 1 Apache Airflow committer, 1 PMC member ● Documentation improvements ● Breeze - improved development environment ● Py2 -> Py3 ● Pylint compatibility ● CI environment reimplemented ● Operator scaffolding
  • 17. Polidea ● Multiple backends: postgres, mysql, sqlite ● Multiple python versions (2.7) - 3.5, 3.6. 3.7 ● Multiple executors: Local/Sequential/Kubernetes ● Automated static code analysis ● Automated documentation building
  • 18. Polidea ● Long time to set it up ● Frustrations of fresh developer experience ● High friction/learning curve for Airflow development environment ● Slow iteration speed ● Complicated Development Environment
  • 19. Polidea ● Scripts only designed for CI, not local environment ● Dependencies installed every time you start the environment ● Always full database reset ● Minutes to run one test ● No guidance how to iterate over tests
  • 22. Polidea ● Focus on developer productivity ● Faster development cycle ● Decrease developer frustration ● Improve the teamwork ● Easy for ad-hoc contributors to code & test
  • 23. Polidea ● AIP-10: Multi-layered and multi-stage official Airflow image ● AIP-7: Simplified Development Workflow ● AIP-26: Production-ready Airflow Docker Image and helm chart ● AIP-23: Migrate out of Travis CI ● AIP-4: Support for System Tests for external systems
  • 24. Polidea ● Local virtualenv ● Own Travis CI fork ● Docker compose (Travis CI equivalent)
  • 25. Polidea ● Total time: 7 minutes ● Running one test only ● Failure at the end (!) ● Re-run - 10-20 seconds for DB ● Re-enter - same time (!) ● No bash history
  • 27. Polidea ● Docker images built from master automatically (DockerHub) ● Local images use cached images ● Tests and static checks run using Docker Compose/Docker environment ● Can be run on Kubernetes Cluster (Docker-In-Docker) ● CI system - independent ● Base to build production image
  • 29. Polidea ● works out-of-the-box ● initializes DB when needed ● environment variables set ● sub-second test overhead ● ipdb debugging ● verbose output
  • 31. Polidea ● entering the environment: ./breeze --backend sqlite --python 3.5 ● last-used environment: ./breeze ● automated image management ● autocomplete of options ● sub-second test execution overhead ● host sources mounted to Docker container ● ports forwarded ● hints for ad-hoc developers
  • 32. Polidea ● run-tests tests.core<TAB><TAB> autocomplete ● bash history across sessions ● run static checks with Breeze ● build documentation with Breeze ● run licence checks with Breeze ● easy debugging (including debugging with IDE) ● pre-commit checks
  • 36. Polidea ● Docker image management ● Run-tests with DB initialisation ● Travis CI integration ● Run all tasks (docs/static/licence check ...) ● Pre-commit checks ● Comprehensive documentation - Google Season of Docs YAY!
  • 39. Polidea ● easy to use ○ pre-commit install ○ pre-commit run ○ pre-commit run mypy ○ pre-commit run --all-files ● run only for changed files (fast) ● catches errors early ● make committers time efficient ● promotes good practices
  • 41. Polidea ● Production-ready Apache Airflow official image ● Simplifications (less images, easier scripts) ● Migrating out of Travis CI ○ GitLab CI (only CI) or GitHub Actions ○ Kubernetes Cluster on Google Kubernetes Engine (Thanks Google!) ● Automation of Performance Tests ● Automation of Release Tests
  • 42.
  • 43.