SlideShare a Scribd company logo
1 of 13
S E T T I N G U P A D A T A P I P E L I N E F O R
A N A L Y S I S O F S A L E S D A T A O F V A R I O U S
T R A N S P O R T V E H I C L E S
The upcoming slides show the process of automating a data pipeline to:
• Load data from amazon S3 bucket into a snowflake table using Airflow running in an AWS EC2 instance.
• Then data is then visualized using Tableau.
• Technologies used: Airflow, Python, SQL, AWS (EC2, S3), Snowflake and Tableau.
A R C H I T E C T U R E
• Use S3KeySensor package to mark the presence of source file being present in S3 bucket.
• Create staging area for file to land in snowflake.
• Create table schema to hold the records.
• Copy the records into from S3 to Snpowflake.
• Then data is then visualized using Tableau.
• Technologies used: Airflow, Python, SQL, AWS (EC2, S3), Snowflake and Tableau.
S T E P 1 : I N S T A L L P A C K A G E S & D E P E N D E N C I E S
S T E P 2 : S E T - U P E C 2 , S 3 & U P L O A D S O U R C E F I L E
S T E P 3 : C R E A T E D A G
S T E P 3 : A D D T A S K 1 T O D A G , T O S E E K T H E S O U R C E F I L E I N S 3
S T E P 4 : S E T - U P S N O W F L A K E D B
S T E P 5 : A D D T A S K 2 T O D A G - T O C R E A T E T A B L E S T R U C T U R E I N D B
S T E P 6 : A D D T A S K 3 T O D A G - T O C O P Y D A T A F R O M S 3 T O D B
S T E P 6 : T R I G G E R T H E D A G , T O R U N A L L T A S K S S E Q U E N T I A L L Y
V E R I F Y : D A T A G O T L O A D E D I N T A B L E
C R E A T E D D A S H B O A R D U S I N G T A B L E A U

More Related Content

Similar to airflow_aws_snow.pptx

Designing a Horizontally Scalable Event-Driven Big Data Architecture with Apa...
Designing a Horizontally Scalable Event-Driven Big Data Architecture with Apa...Designing a Horizontally Scalable Event-Driven Big Data Architecture with Apa...
Designing a Horizontally Scalable Event-Driven Big Data Architecture with Apa...Ricardo Fanjul Fandiño
 
Competency-Based Learning and Learning Relationship Management #LRM
Competency-Based Learning and Learning Relationship Management #LRMCompetency-Based Learning and Learning Relationship Management #LRM
Competency-Based Learning and Learning Relationship Management #LRMGunnar Counselman
 
California Science Center (USC CSCI 588)
California Science Center (USC CSCI 588)California Science Center (USC CSCI 588)
California Science Center (USC CSCI 588)Sunny Chiu
 
Meteor - not just for rockstars
Meteor - not just for rockstarsMeteor - not just for rockstars
Meteor - not just for rockstarsStephan Hochhaus
 
10 d bs in 30 minutes
10 d bs in 30 minutes10 d bs in 30 minutes
10 d bs in 30 minutesDavid Simons
 
Pintrace: Distributed tracing@Pinterest
Pintrace: Distributed tracing@PinterestPintrace: Distributed tracing@Pinterest
Pintrace: Distributed tracing@PinterestSuman Karumuri
 
From Content Strategy to Drupal Site Building - Connecting the dots
From Content Strategy to Drupal Site Building - Connecting the dotsFrom Content Strategy to Drupal Site Building - Connecting the dots
From Content Strategy to Drupal Site Building - Connecting the dotsRonald Ashri
 
From Content Strategy to Drupal Site Building - Connecting the Dots
From Content Strategy to Drupal Site Building - Connecting the DotsFrom Content Strategy to Drupal Site Building - Connecting the Dots
From Content Strategy to Drupal Site Building - Connecting the DotsRonald Ashri
 
Pintrace: Distributed tracing @Pinterest
Pintrace: Distributed tracing @PinterestPintrace: Distributed tracing @Pinterest
Pintrace: Distributed tracing @PinterestSuman Karumuri
 
Whitepaper - Spinbackup Overview
Whitepaper - Spinbackup OverviewWhitepaper - Spinbackup Overview
Whitepaper - Spinbackup OverviewSpinbackup
 
Spring Roo 2.0 Preview at Spring I/O 2016
Spring Roo 2.0 Preview at Spring I/O 2016 Spring Roo 2.0 Preview at Spring I/O 2016
Spring Roo 2.0 Preview at Spring I/O 2016 DISID
 
Mapping as Strategy/Structure/Scaffold
Mapping as Strategy/Structure/ScaffoldMapping as Strategy/Structure/Scaffold
Mapping as Strategy/Structure/Scaffoldchar booth
 
Reducing Resistance: Deployment as Surface
Reducing Resistance: Deployment as SurfaceReducing Resistance: Deployment as Surface
Reducing Resistance: Deployment as SurfaceJeffrey Hulten
 
Choosing the right database
Choosing the right databaseChoosing the right database
Choosing the right databaseDavid Simons
 
InterCon 2016 - Internet of “Thinking” – IoT sem BS com ESP8266
InterCon 2016 - Internet of “Thinking” – IoT sem BS com ESP8266InterCon 2016 - Internet of “Thinking” – IoT sem BS com ESP8266
InterCon 2016 - Internet of “Thinking” – IoT sem BS com ESP8266iMasters
 
InterCon 2016 - Blockchain e smart-contracts em Ethereu
InterCon 2016 - Blockchain e smart-contracts em EthereuInterCon 2016 - Blockchain e smart-contracts em Ethereu
InterCon 2016 - Blockchain e smart-contracts em EthereuiMasters
 
Why the org_matters_shorter.jzt.2018sept25
Why the org_matters_shorter.jzt.2018sept25Why the org_matters_shorter.jzt.2018sept25
Why the org_matters_shorter.jzt.2018sept25Julie Tsai
 
Profiling Web Archives IIPC GA 2015
Profiling Web Archives IIPC GA 2015Profiling Web Archives IIPC GA 2015
Profiling Web Archives IIPC GA 2015Sawood Alam
 
Digital Data Commons - Emergence of AI Blockchain Convergence
Digital Data Commons - Emergence of AI Blockchain ConvergenceDigital Data Commons - Emergence of AI Blockchain Convergence
Digital Data Commons - Emergence of AI Blockchain ConvergenceGokul Alex
 

Similar to airflow_aws_snow.pptx (20)

Designing a Horizontally Scalable Event-Driven Big Data Architecture with Apa...
Designing a Horizontally Scalable Event-Driven Big Data Architecture with Apa...Designing a Horizontally Scalable Event-Driven Big Data Architecture with Apa...
Designing a Horizontally Scalable Event-Driven Big Data Architecture with Apa...
 
Competency-Based Learning and Learning Relationship Management #LRM
Competency-Based Learning and Learning Relationship Management #LRMCompetency-Based Learning and Learning Relationship Management #LRM
Competency-Based Learning and Learning Relationship Management #LRM
 
California Science Center (USC CSCI 588)
California Science Center (USC CSCI 588)California Science Center (USC CSCI 588)
California Science Center (USC CSCI 588)
 
Meteor - not just for rockstars
Meteor - not just for rockstarsMeteor - not just for rockstars
Meteor - not just for rockstars
 
10 d bs in 30 minutes
10 d bs in 30 minutes10 d bs in 30 minutes
10 d bs in 30 minutes
 
Pintrace: Distributed tracing@Pinterest
Pintrace: Distributed tracing@PinterestPintrace: Distributed tracing@Pinterest
Pintrace: Distributed tracing@Pinterest
 
From Content Strategy to Drupal Site Building - Connecting the dots
From Content Strategy to Drupal Site Building - Connecting the dotsFrom Content Strategy to Drupal Site Building - Connecting the dots
From Content Strategy to Drupal Site Building - Connecting the dots
 
From Content Strategy to Drupal Site Building - Connecting the Dots
From Content Strategy to Drupal Site Building - Connecting the DotsFrom Content Strategy to Drupal Site Building - Connecting the Dots
From Content Strategy to Drupal Site Building - Connecting the Dots
 
Pintrace: Distributed tracing @Pinterest
Pintrace: Distributed tracing @PinterestPintrace: Distributed tracing @Pinterest
Pintrace: Distributed tracing @Pinterest
 
Whitepaper - Spinbackup Overview
Whitepaper - Spinbackup OverviewWhitepaper - Spinbackup Overview
Whitepaper - Spinbackup Overview
 
Spring Roo 2.0 Preview at Spring I/O 2016
Spring Roo 2.0 Preview at Spring I/O 2016 Spring Roo 2.0 Preview at Spring I/O 2016
Spring Roo 2.0 Preview at Spring I/O 2016
 
Mapping as Strategy/Structure/Scaffold
Mapping as Strategy/Structure/ScaffoldMapping as Strategy/Structure/Scaffold
Mapping as Strategy/Structure/Scaffold
 
Reducing Resistance: Deployment as Surface
Reducing Resistance: Deployment as SurfaceReducing Resistance: Deployment as Surface
Reducing Resistance: Deployment as Surface
 
Choosing the right database
Choosing the right databaseChoosing the right database
Choosing the right database
 
InterCon 2016 - Internet of “Thinking” – IoT sem BS com ESP8266
InterCon 2016 - Internet of “Thinking” – IoT sem BS com ESP8266InterCon 2016 - Internet of “Thinking” – IoT sem BS com ESP8266
InterCon 2016 - Internet of “Thinking” – IoT sem BS com ESP8266
 
InterCon 2016 - Blockchain e smart-contracts em Ethereu
InterCon 2016 - Blockchain e smart-contracts em EthereuInterCon 2016 - Blockchain e smart-contracts em Ethereu
InterCon 2016 - Blockchain e smart-contracts em Ethereu
 
Why the org_matters_shorter.jzt.2018sept25
Why the org_matters_shorter.jzt.2018sept25Why the org_matters_shorter.jzt.2018sept25
Why the org_matters_shorter.jzt.2018sept25
 
Profiling Web Archives IIPC GA 2015
Profiling Web Archives IIPC GA 2015Profiling Web Archives IIPC GA 2015
Profiling Web Archives IIPC GA 2015
 
eHarmony @ Phoenix Con 2016
eHarmony @ Phoenix Con 2016eHarmony @ Phoenix Con 2016
eHarmony @ Phoenix Con 2016
 
Digital Data Commons - Emergence of AI Blockchain Convergence
Digital Data Commons - Emergence of AI Blockchain ConvergenceDigital Data Commons - Emergence of AI Blockchain Convergence
Digital Data Commons - Emergence of AI Blockchain Convergence
 

Recently uploaded

Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 

Recently uploaded (20)

Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 

airflow_aws_snow.pptx

  • 1. S E T T I N G U P A D A T A P I P E L I N E F O R A N A L Y S I S O F S A L E S D A T A O F V A R I O U S T R A N S P O R T V E H I C L E S
  • 2. The upcoming slides show the process of automating a data pipeline to: • Load data from amazon S3 bucket into a snowflake table using Airflow running in an AWS EC2 instance. • Then data is then visualized using Tableau. • Technologies used: Airflow, Python, SQL, AWS (EC2, S3), Snowflake and Tableau.
  • 3. A R C H I T E C T U R E • Use S3KeySensor package to mark the presence of source file being present in S3 bucket. • Create staging area for file to land in snowflake. • Create table schema to hold the records. • Copy the records into from S3 to Snpowflake. • Then data is then visualized using Tableau. • Technologies used: Airflow, Python, SQL, AWS (EC2, S3), Snowflake and Tableau.
  • 4. S T E P 1 : I N S T A L L P A C K A G E S & D E P E N D E N C I E S
  • 5. S T E P 2 : S E T - U P E C 2 , S 3 & U P L O A D S O U R C E F I L E
  • 6. S T E P 3 : C R E A T E D A G
  • 7. S T E P 3 : A D D T A S K 1 T O D A G , T O S E E K T H E S O U R C E F I L E I N S 3
  • 8. S T E P 4 : S E T - U P S N O W F L A K E D B
  • 9. S T E P 5 : A D D T A S K 2 T O D A G - T O C R E A T E T A B L E S T R U C T U R E I N D B
  • 10. S T E P 6 : A D D T A S K 3 T O D A G - T O C O P Y D A T A F R O M S 3 T O D B
  • 11. S T E P 6 : T R I G G E R T H E D A G , T O R U N A L L T A S K S S E Q U E N T I A L L Y
  • 12. V E R I F Y : D A T A G O T L O A D E D I N T A B L E
  • 13. C R E A T E D D A S H B O A R D U S I N G T A B L E A U