SlideShare a Scribd company logo
1 of 7
CO1
EXPERIMENT 2:Designofa Data Warehouse, Understandthe importance
of pre-processing by performing the ETL Operations
Date of the Session: ___/___/___ Time of the Session: _____to______
Pre-Lab:
1. Install Python of version 3.6.9 and Anaconda version 3. Launch Jupyter Notebook IDE and
load required python libraries
A.
Q2. Analyze data analytics in the context of data warehousing?
A.
3. Identify the importance of data modelling?
A.
4. Explore the need of Real-Time ETL in Data warehouse?
A.
5. Find out the role of impact analysis in the ETL system?
A.
6. Mention the significance of the ETL system in the Data warehouse?
A.
In-Lab:
Task #1:Understand the importance of pre-processing by performing the
ETL Operations implemented using MySql connector and python
programming.
1. Perform the following ETL operations by making the use of python built-in packages and
SQL commands.
a. Extracting
b. Transforming and
c. Loading tables of data.
Dataset: Use the following link to download the dataset
https://drive.google.com/file/d/1M7phlSaw5mal3RlEUCTeOlQCEpQ5iHY3/view?usp=shari
ng
i. From the given dataset perform Extract operation using python etl library to convert
the data stored in json file to zip file.
ii. From the extracted data perform Transform operation to convert the data into a single
standard dictionary format.
iii. Use MySQL cursor to load the data from CSV file to MySQL by creating the schema
and tables
Writing space of the problem (For Student’s use only)
(Writing space for In-Lab)
(Writing space for In-Lab)
Post-Lab:
1. Summarize all your findings from the In-Lab experiments and explain your conclusions.
Writing space of the problem (For Student’s use only)
2. Give a brief description about the libraries used and required from performing the ETL
operations.
Writing space of the problem (For Student’s use only)
(Writing space for Post-Lab)
Viva Voice:
1.What are the characteristics of Data Warehouse?
2.What is Operation Data Source?
3.What is the difference between Manual Testing and ETL Testing?
4.What are the steps followed in ETL testing process?
5.What is the difference between ETL tools and OLAP tools?

More Related Content

Similar to Lab manual etl

Oracle RI ETL process overview.
Oracle RI ETL process overview.Oracle RI ETL process overview.
Oracle RI ETL process overview.Puneet Kala
 
123448572 all-in-one-informatica
123448572 all-in-one-informatica123448572 all-in-one-informatica
123448572 all-in-one-informaticahomeworkping9
 
OAP: Optimized Analytics Package for Spark Platform with Daoyuan Wang and Yua...
OAP: Optimized Analytics Package for Spark Platform with Daoyuan Wang and Yua...OAP: Optimized Analytics Package for Spark Platform with Daoyuan Wang and Yua...
OAP: Optimized Analytics Package for Spark Platform with Daoyuan Wang and Yua...Databricks
 
OpenProdoc 0.8 Load Tests
OpenProdoc 0.8 Load TestsOpenProdoc 0.8 Load Tests
OpenProdoc 0.8 Load Testsjhierrot
 
To Study E T L ( Extract, Transform, Load) Tools Specially S Q L Server I...
To Study  E T L ( Extract, Transform, Load) Tools Specially  S Q L  Server  I...To Study  E T L ( Extract, Transform, Load) Tools Specially  S Q L  Server  I...
To Study E T L ( Extract, Transform, Load) Tools Specially S Q L Server I...Shahzad
 
ITFT - Java Coding
ITFT - Java CodingITFT - Java Coding
ITFT - Java CodingBlossom Sood
 
QTP Automation Testing Tutorial 6
QTP Automation Testing Tutorial 6QTP Automation Testing Tutorial 6
QTP Automation Testing Tutorial 6Akash Tyagi
 
Logging presentation
Logging presentationLogging presentation
Logging presentationJatan Malde
 
Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"
Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"
Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"Fwdays
 
Big Linked Data ETL Benchmark on Cloud Commodity Hardware
Big Linked Data ETL Benchmark on Cloud Commodity HardwareBig Linked Data ETL Benchmark on Cloud Commodity Hardware
Big Linked Data ETL Benchmark on Cloud Commodity HardwareLaurens De Vocht
 
Performance Analysis of Idle Programs
Performance Analysis of Idle ProgramsPerformance Analysis of Idle Programs
Performance Analysis of Idle Programsgreenwop
 
Pentaho Data Integration: Extrayendo, integrando, normalizando y preparando m...
Pentaho Data Integration: Extrayendo, integrando, normalizando y preparando m...Pentaho Data Integration: Extrayendo, integrando, normalizando y preparando m...
Pentaho Data Integration: Extrayendo, integrando, normalizando y preparando m...Alex Rayón Jerez
 
Elastic - ELK, Logstash & Kibana
Elastic - ELK, Logstash & KibanaElastic - ELK, Logstash & Kibana
Elastic - ELK, Logstash & KibanaSpringPeople
 
Btech IT Sem VII and VIII-1 (1).pdf
Btech IT Sem VII and VIII-1 (1).pdfBtech IT Sem VII and VIII-1 (1).pdf
Btech IT Sem VII and VIII-1 (1).pdfAdityaBhateja1
 
Qtp Interview Questions
Qtp Interview QuestionsQtp Interview Questions
Qtp Interview Questionskspanigra
 
Comp 220 ilab 5 of 7
Comp 220 ilab 5 of 7Comp 220 ilab 5 of 7
Comp 220 ilab 5 of 7ashhadiqbal
 
Etl interview questions
Etl interview questionsEtl interview questions
Etl interview questionsashokvirtual
 
Flink and Hive integration - unifying enterprise data processing systems
Flink and Hive integration - unifying enterprise data processing systemsFlink and Hive integration - unifying enterprise data processing systems
Flink and Hive integration - unifying enterprise data processing systemsBowen Li
 

Similar to Lab manual etl (20)

Oracle RI ETL process overview.
Oracle RI ETL process overview.Oracle RI ETL process overview.
Oracle RI ETL process overview.
 
123448572 all-in-one-informatica
123448572 all-in-one-informatica123448572 all-in-one-informatica
123448572 all-in-one-informatica
 
OAP: Optimized Analytics Package for Spark Platform with Daoyuan Wang and Yua...
OAP: Optimized Analytics Package for Spark Platform with Daoyuan Wang and Yua...OAP: Optimized Analytics Package for Spark Platform with Daoyuan Wang and Yua...
OAP: Optimized Analytics Package for Spark Platform with Daoyuan Wang and Yua...
 
OpenProdoc 0.8 Load Tests
OpenProdoc 0.8 Load TestsOpenProdoc 0.8 Load Tests
OpenProdoc 0.8 Load Tests
 
To Study E T L ( Extract, Transform, Load) Tools Specially S Q L Server I...
To Study  E T L ( Extract, Transform, Load) Tools Specially  S Q L  Server  I...To Study  E T L ( Extract, Transform, Load) Tools Specially  S Q L  Server  I...
To Study E T L ( Extract, Transform, Load) Tools Specially S Q L Server I...
 
ITFT - Java Coding
ITFT - Java CodingITFT - Java Coding
ITFT - Java Coding
 
QTP Automation Testing Tutorial 6
QTP Automation Testing Tutorial 6QTP Automation Testing Tutorial 6
QTP Automation Testing Tutorial 6
 
java slides
java slidesjava slides
java slides
 
Logging presentation
Logging presentationLogging presentation
Logging presentation
 
Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"
Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"
Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"
 
ETL DW-RealTime
ETL DW-RealTimeETL DW-RealTime
ETL DW-RealTime
 
Big Linked Data ETL Benchmark on Cloud Commodity Hardware
Big Linked Data ETL Benchmark on Cloud Commodity HardwareBig Linked Data ETL Benchmark on Cloud Commodity Hardware
Big Linked Data ETL Benchmark on Cloud Commodity Hardware
 
Performance Analysis of Idle Programs
Performance Analysis of Idle ProgramsPerformance Analysis of Idle Programs
Performance Analysis of Idle Programs
 
Pentaho Data Integration: Extrayendo, integrando, normalizando y preparando m...
Pentaho Data Integration: Extrayendo, integrando, normalizando y preparando m...Pentaho Data Integration: Extrayendo, integrando, normalizando y preparando m...
Pentaho Data Integration: Extrayendo, integrando, normalizando y preparando m...
 
Elastic - ELK, Logstash & Kibana
Elastic - ELK, Logstash & KibanaElastic - ELK, Logstash & Kibana
Elastic - ELK, Logstash & Kibana
 
Btech IT Sem VII and VIII-1 (1).pdf
Btech IT Sem VII and VIII-1 (1).pdfBtech IT Sem VII and VIII-1 (1).pdf
Btech IT Sem VII and VIII-1 (1).pdf
 
Qtp Interview Questions
Qtp Interview QuestionsQtp Interview Questions
Qtp Interview Questions
 
Comp 220 ilab 5 of 7
Comp 220 ilab 5 of 7Comp 220 ilab 5 of 7
Comp 220 ilab 5 of 7
 
Etl interview questions
Etl interview questionsEtl interview questions
Etl interview questions
 
Flink and Hive integration - unifying enterprise data processing systems
Flink and Hive integration - unifying enterprise data processing systemsFlink and Hive integration - unifying enterprise data processing systems
Flink and Hive integration - unifying enterprise data processing systems
 

Recently uploaded

UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...roncy bisnoi
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college projectTonystark477637
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)simmis5
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 

Recently uploaded (20)

UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 

Lab manual etl

  • 1. CO1 EXPERIMENT 2:Designofa Data Warehouse, Understandthe importance of pre-processing by performing the ETL Operations Date of the Session: ___/___/___ Time of the Session: _____to______ Pre-Lab: 1. Install Python of version 3.6.9 and Anaconda version 3. Launch Jupyter Notebook IDE and load required python libraries A. Q2. Analyze data analytics in the context of data warehousing? A. 3. Identify the importance of data modelling? A.
  • 2. 4. Explore the need of Real-Time ETL in Data warehouse? A. 5. Find out the role of impact analysis in the ETL system? A. 6. Mention the significance of the ETL system in the Data warehouse? A.
  • 3. In-Lab: Task #1:Understand the importance of pre-processing by performing the ETL Operations implemented using MySql connector and python programming. 1. Perform the following ETL operations by making the use of python built-in packages and SQL commands. a. Extracting b. Transforming and c. Loading tables of data. Dataset: Use the following link to download the dataset https://drive.google.com/file/d/1M7phlSaw5mal3RlEUCTeOlQCEpQ5iHY3/view?usp=shari ng i. From the given dataset perform Extract operation using python etl library to convert the data stored in json file to zip file. ii. From the extracted data perform Transform operation to convert the data into a single standard dictionary format. iii. Use MySQL cursor to load the data from CSV file to MySQL by creating the schema and tables Writing space of the problem (For Student’s use only)
  • 6. Post-Lab: 1. Summarize all your findings from the In-Lab experiments and explain your conclusions. Writing space of the problem (For Student’s use only) 2. Give a brief description about the libraries used and required from performing the ETL operations. Writing space of the problem (For Student’s use only) (Writing space for Post-Lab)
  • 7. Viva Voice: 1.What are the characteristics of Data Warehouse? 2.What is Operation Data Source? 3.What is the difference between Manual Testing and ETL Testing? 4.What are the steps followed in ETL testing process? 5.What is the difference between ETL tools and OLAP tools?