Apache Airflow & Apache Spark data pipelines in the cloud

Quick Intro into building distributed pipelines in the cloud

Published in: Internet

  1. Welcome! 4th Data Driven Rijnmond
  2. Program
     ‣ Apache Airflow & Apache Spark data pipelines in the cloud
     ‣ Collecting data in the food domain with apps
     ‣ Large-scale outlet matching and enrichment in the food service domain
  3. Datlinq
  4. Airflow & Spark in the cloud
  5. Data is garbage
  6. Data, information, knowledge, insight
  7. Better combined
  8. Cleaning data is hard
  9. Continuous inflow
  10. Apache Spark
  11. Decentralise & atomicise
  12. Apache Airflow
  13. Google Cloud Platform
  14. Demo
  15. Questions? Slides & links will be posted online
