Data Wrangling
Week 5
Dr. Ferdin Joe John Joseph
Faculty of Information Technology
Thai – Nichi Institute of Technology, Bangkok
Today’s Lesson
• Introduction to XML
• XML - Theories
• XML parsing using Python
• Text files parsing using Python
• Demonstration
Faculty of Information Technology, Thai - Nichi Institute of
Technology, Bangkok
XML Introduction
• XML stands for eXtensible Markup Language
• Used for storing and transmitting data
• It is readable by both humans and machines
XML
XML
XML Parsing using Python
Make a file sample.xml
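The contents of the file shown on this slide did not survive extraction, so the following is only a minimal sketch of how such a file might be created. The element names (students, student, name, score), the values, and the temp-directory location are assumptions made for illustration, not the lecture's data.

```python
import tempfile
from pathlib import Path

# Hypothetical sample data; the lecture's actual sample.xml is not shown.
SAMPLE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<students>
  <student id="1"><name>Anna</name><score>82</score></student>
  <student id="2"><name>Ben</name><score>74</score></student>
  <student id="3"><name>Chai</name><score>91</score></student>
</students>
"""

# Written to the system temp directory so the sketch does not overwrite
# anything in the working directory.
sample_path = Path(tempfile.gettempdir()) / "sample.xml"
sample_path.write_text(SAMPLE_XML, encoding="utf-8")
print(f"Wrote {sample_path}")
```

Any text editor can be used to create the file instead; the script form is shown only so the later steps have a known input.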
Import the necessary libraries
Parse the XML file
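One common way to do this step is with the standard-library xml.etree.ElementTree module; whether the lecture used exactly this module is an assumption. The file name and its contents below are hypothetical stand-ins for the lecture's sample.xml.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical stand-in for the lecture's sample.xml.
xml_path = Path(tempfile.gettempdir()) / "sample_parse_demo.xml"
xml_path.write_text(
    "<students>"
    "<student id='1'><name>Anna</name><score>82</score></student>"
    "<student id='2'><name>Ben</name><score>74</score></student>"
    "</students>",
    encoding="utf-8",
)

# ET.parse reads the file and returns an ElementTree object;
# getroot() gives the top-level element of the document.
tree = ET.parse(xml_path)
root = tree.getroot()
print(type(tree).__name__, len(root))
```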
Printing the root
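A short sketch of what inspecting the root might look like, assuming ElementTree is the parser in use. The XML string, including the course attribute, is hypothetical data used in place of the lecture's file.

```python
import xml.etree.ElementTree as ET

# Hypothetical data standing in for sample.xml.
root = ET.fromstring(
    "<students course='DSA201'>"
    "<student id='1'><name>Anna</name></student>"
    "<student id='2'><name>Ben</name></student>"
    "</students>"
)

print(root.tag)     # name of the root element
print(root.attrib)  # dictionary of the root element's attributes

# Iterating over an element yields its direct children.
for child in root:
    print(child.tag, child.attrib)
```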
Find Occurrences
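Finding every occurrence of a tag could be sketched as follows, assuming ElementTree. The student, name, and score tags are hypothetical; the point is the difference between findall(), which looks at direct children, and iter(), which walks the whole subtree.

```python
import xml.etree.ElementTree as ET

# Hypothetical data standing in for sample.xml.
root = ET.fromstring(
    "<students>"
    "<student id='1'><name>Anna</name><score>82</score></student>"
    "<student id='2'><name>Ben</name><score>74</score></student>"
    "<student id='3'><name>Chai</name><score>91</score></student>"
    "</students>"
)

# findall() matches direct children of the element it is called on.
students = root.findall("student")
print(len(students))

# iter() walks the whole subtree, so it also finds nested elements.
names = [elem.text for elem in root.iter("name")]
print(names)
```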
Query from XML
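The query on the slide is not recoverable, so this is only an illustrative sketch of two ways to query with ElementTree: an attribute predicate in the path expression, and a plain Python filter on element text. The data and the threshold of 80 are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical data standing in for sample.xml.
root = ET.fromstring(
    "<students>"
    "<student id='1'><name>Anna</name><score>82</score></student>"
    "<student id='2'><name>Ben</name><score>74</score></student>"
    "<student id='3'><name>Chai</name><score>91</score></student>"
    "</students>"
)

# Attribute predicate: the student whose id attribute equals "2".
ben = root.find("student[@id='2']")
print(ben.findtext("name"))

# Element text is always a string, so convert before comparing numbers.
high_scorers = [
    s.findtext("name")
    for s in root.findall("student")
    if int(s.findtext("score")) >= 80
]
print(high_scorers)
```

ElementTree supports only a limited subset of XPath, so numeric comparisons such as the one above are done in Python rather than in the path expression.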
Query from XML
Store values in arrays
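A minimal sketch of collecting values into arrays, here plain Python lists with one list per field. The field names and data are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical data standing in for sample.xml.
root = ET.fromstring(
    "<students>"
    "<student id='1'><name>Anna</name><score>82</score></student>"
    "<student id='2'><name>Ben</name><score>74</score></student>"
    "<student id='3'><name>Chai</name><score>91</score></student>"
    "</students>"
)

names = []
scores = []
for student in root.findall("student"):
    names.append(student.findtext("name"))
    # Convert the text to int so the values can be used in arithmetic.
    scores.append(int(student.findtext("score")))

print(names)
print(scores)
```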
Print the arrays
Convert to pandas DataFrame
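Once the values are in lists, one way to build the table is to pass a dictionary of column name to list into pd.DataFrame. The column names and data are hypothetical, and pandas is assumed to be installed.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Hypothetical data standing in for sample.xml.
root = ET.fromstring(
    "<students>"
    "<student id='1'><name>Anna</name><score>82</score></student>"
    "<student id='2'><name>Ben</name><score>74</score></student>"
    "<student id='3'><name>Chai</name><score>91</score></student>"
    "</students>"
)

students = root.findall("student")
# Each dictionary key becomes a column; each list becomes that column's values.
df = pd.DataFrame(
    {
        "name": [s.findtext("name") for s in students],
        "score": [int(s.findtext("score")) for s in students],
    }
)
print(df)
```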
Print the DataFrame
Activity
• Make the data look like the schema shown below
Calculate Mean
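A sketch of the mean calculation, assuming the values arrived from XML as text. The score column and its values are hypothetical. Text taken from XML elements is a string, so the column is converted to a numeric type before averaging.

```python
import pandas as pd

# Hypothetical values as they would come out of XML: all strings.
df = pd.DataFrame(
    {"name": ["Anna", "Ben", "Chai"], "score": ["82", "74", "91"]}
)

# Convert the text column to numbers before computing statistics.
df["score"] = pd.to_numeric(df["score"])

mean_score = df["score"].mean()
print(round(mean_score, 2))
```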
Parsing Text files
Procedure
• Most tabular text files are comma-separated
• The CSV parser in pandas can be used to read them
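The procedure above can be sketched with pd.read_csv, which parses by content and does not depend on the file extension. The file name, columns, and values below are hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical comma-separated table saved with a .txt extension.
txt_path = Path(tempfile.gettempdir()) / "scores_demo.txt"
txt_path.write_text("name,score\nAnna,82\nBen,74\nChai,91\n", encoding="utf-8")

# The default separator is a comma, so no extra arguments are needed here.
df = pd.read_csv(txt_path)
print(df.dtypes)
```

For files that use another delimiter, the sep argument can be set accordingly, for example sep="\t" for tab-separated text.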
Activity
• Save a CSV file you used in a previous lecture in .txt format
• Use the same pandas parser to read the text file
XML from a URL
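The URL used in the lecture is not recoverable, so this is a hedged sketch of one way to read XML from a URL with the standard library. The helper name load_xml_from_url and the feed contents are hypothetical. To keep the sketch runnable offline, it is exercised on a file:// URL that points at a locally saved copy; an http or https URL would be passed in the same way.

```python
import tempfile
import urllib.request
import xml.etree.ElementTree as ET
from pathlib import Path


def load_xml_from_url(url: str) -> ET.Element:
    """Fetch an XML document from a URL and return its root element."""
    with urllib.request.urlopen(url, timeout=10) as response:
        # ET.parse accepts any file-like object that provides read().
        return ET.parse(response).getroot()


# Hypothetical news feed saved locally so the example needs no network.
feed_path = Path(tempfile.gettempdir()) / "news_demo.xml"
feed_path.write_text(
    "<rss><channel>"
    "<item><title>First headline</title></item>"
    "<item><title>Second headline</title></item>"
    "</channel></rss>",
    encoding="utf-8",
)

root = load_xml_from_url(feed_path.as_uri())
titles = [item.findtext("title") for item in root.iter("item")]
print(titles)
```

The Python documentation warns that xml.etree.ElementTree is not secure against maliciously constructed data, so extra care is advisable when the URL is not trusted.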
XML from a URL
Activity
• Copy the news XML and convert it into a pandas DataFrame
• Perform data analysis on the resulting DataFrame
Lesson for Next Week
https://github.com/ferdinjoe/DSA201
Data wrangling week 6

  • 1. Data Wrangling Week 5 Dr. Ferdin Joe John Joseph Faculty of Information Technology Thai – Nichi Institute of Technology, Bangkok
  • 2. Today’s Lesson
      • Introduction to XML
      • XML - Theories
      • XML parsing using Python
      • Text files parsing using Python
      • Demonstration
  • 3. XML Introduction
      • XML stands for eXtensible Markup Language
      • Used for storing and transmitting data
      • It is readable by both human and machine
  • 4. XML
  • 5. XML
  • 6. XML Parsing using Python
      • Make a file sample.xml
  • 7. Import the necessary libraries
  • 8. Parse the text file
  • 9. Printing the root
  • 10. Find Occurrences
  • 11. Query from XML
  • 12. Query from XML
  • 13. Store values in array
  • 14. Print the arrays
  • 15. Convert to Pandas Dataframe
  • 16. Print Data Frame
  • 17. Activity
      • Make the data look like the schema shown below
  • 18. Calculate Mean
  • 19. Parsing Text files
  • 20. Procedure
      • Most text file tables are comma separated
      • CSV parsing in pandas can be used
  • 21. Activity
      • Save a CSV file you used in previous lectures in .txt format
      • Use the same pandas library to parse the text
  • 22. XML from a URL
  • 23. XML from a URL
  • 24. Activity
      • Copy the news XML and convert it into a pandas data frame
      • Perform data analysis on the resulting data frame
  • 25. Lesson for Next Week https://github.com/ferdinjoe/DSA201
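
The code shown on slides 6 to 18 did not survive in this transcript, so the following is only a minimal sketch of the same sequence of steps (parse, print the root, find occurrences, query, store values, compute a mean). It uses Python's standard `xml.etree.ElementTree` module. The XML content, the element names `students`, `student`, `name` and `score`, and the helper `xml_to_rows` are all hypothetical stand-ins for the `sample.xml` used in the lecture.

```python
import statistics
import xml.etree.ElementTree as ET

# Hypothetical stand-in for sample.xml; the file used in the slides is not shown.
SAMPLE_XML = """
<students>
    <student id="1"><name>Alice</name><score>82</score></student>
    <student id="2"><name>Bob</name><score>74</score></student>
    <student id="3"><name>Chai</name><score>90</score></student>
</students>
"""


def xml_to_rows(xml_text):
    """Parse XML text and return the root tag plus one dict per <student>."""
    # Parse the document (ET.parse("sample.xml").getroot() would read a file instead).
    root = ET.fromstring(xml_text.strip())

    rows = []
    # Find every occurrence of the (assumed) <student> element under the root.
    for student in root.findall("student"):
        rows.append({
            "id": int(student.get("id")),               # attribute value
            "name": student.findtext("name"),           # child element text
            "score": float(student.findtext("score")),  # numeric child text
        })
    return root.tag, rows


root_tag, rows = xml_to_rows(SAMPLE_XML)
print(root_tag)
for row in rows:
    print(row)

# Mean of the assumed numeric column, as in the "Calculate Mean" step.
mean_score = statistics.mean(row["score"] for row in rows)
print(mean_score)
```

A list of dictionaries like `rows` can be passed directly to `pandas.DataFrame(rows)` to get the data frame named on slides 15 and 16. The sketch leaves pandas out so that it runs with the standard library alone.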