SlideShare a Scribd company logo
1 of 5
R-XML Files
XML is a file format which shares both the file format and the data on the World Wide
Web, intranets, and elsewhere using standard ASCII text. It stands for Extensible
Markup Language (XML). Similar to HTML it contains markup tags. But unlike HTML
where the markup tag describes structure of the page, in xml the markup tags
describe the meaning of the data contained into he file.
You can read a xml file in R using the "XML" package. This package can be installed
using following command.
install.packages("XML")
Input Data
Create a XMl file by copying the below data into a text editor like notepad. Save the
file with a .xml extension and choosing the file type as all files(*.*).
<RECORDS>
<STUDENT>
<ID>1</ID>
<NAME>Rick</NAME>
<DOB>1/1/2012</DOB>
<YEAR>1</YEAR>
<MARK>90</MARK>
</STUDENT>
<STUDENT>
<ID>2</ID>
<NAME>Rick</NAME>
<DOB>1/1/2012</DOB>
<YEAR>1</YEAR>
<MARK>50</MARK>
</STUDENT>
<STUDENT>
<ID>3</ID>
<NAME>Rick</NAME>
<DOB>1/1/2012</DOB>
<YEAR>1</YEAR>
<MARK>70</MARK>
</STUDENT>
</RECORDS>
Reading XML File
The xml file is read by R using the function xmlParse(). It is stored as a list in R.
# Load the package required to read XML files.
library("XML")
# Also load the other required package.
library("methods")
# Give the input file name to the function.
result <- xmlParse(file = "input.xml")
# Print the result.
print(result)
When we execute the above code, it produces the following result −
1
Rick
623.3
1/1/2012
IT
2
Dan
515.2
9/23/2013
Operations
3
Michelle
611
11/15/2014
IT
4
Ryan
729
5/11/2014
HR
5
Gary
843.25
3/27/2015
Finance
6
Nina
578
5/21/2013
IT
7
Simon
632.8
7/30/2013
Operations
8
Guru
722.5
6/17/2014
Finance
Get Number of Nodes Present in XML File
# Load the packages required to read XML files.
library("XML")
library("methods")
# Give the input file name to the function.
result <- xmlParse(file = "input.xml")
# Exract the root node form the xml file.
rootnode <- xmlRoot(result)
# Find number of nodes in the root.
rootsize <- xmlSize(rootnode)
# Print the result.
print(rootsize)
When we execute the above code, it produces the following result −
output
[1] 8
Details of the First Node
Let's look at the first record of the parsed file. It will give us an idea of the various
elements present in the top level node.
# Load the packages required to read XML files.
library("XML")
library("methods")
# Give the input file name to the function.
result <- xmlParse(file = "input.xml")
# Exract the root node form the xml file.
rootnode <- xmlRoot(result)
# Print the result.
print(rootnode[1])
When we execute the above code, it produces the following result −
$EMPLOYEE
1
Rick
623.3
1/1/2012
IT
attr(,"class")
[1] "XMLInternalNodeList" "XMLNodeList"
Get Different Elements of a Node
# Load the packages required to read XML files.
library("XML")
library("methods")
# Give the input file name to the function.
result <- xmlParse(file = "input.xml")
# Exract the root node form the xml file.
rootnode <- xmlRoot(result)
# Get the first element of the first node.
print(rootnode[[1]][[1]])
# Get the fifth element of the first node.
print(rootnode[[1]][[5]])
# Get the second element of the third node.
print(rootnode[[3]][[2]])
When we execute the above code, it produces the following result −
1
IT
Michelle
XML to Data Frame
To handle the data effectively in large files we read the data in the xml file as a data
frame. Then process the data frame for data analysis.
# Load the packages required to read XML files.
library("XML")
library("methods")
# Convert the input xml file to a data frame.
xmldataframe <- xmlToDataFrame("input.xml")
print(xmldataframe)
When we execute the above code, it produces the following result −
ID NAME SALARY STARTDATE DEPT
1 1 Rick 623.30 2012-01-01 IT
2 2 Dan 515.20 2013-09-23 Operations
3 3 Michelle 611.00 2014-11-15 IT
4 4 Ryan 729.00 2014-05-11 HR
5 NA Gary 843.25 2015-03-27 Finance
6 6 Nina 578.00 2013-05-21 IT
7 7 Simon 632.80 2013-07-30 Operations
8 8 Guru 722.50 2014-06-17 Finance
As the data is now available as a dataframe we can use data frame related function
to read and manipulate the file.

More Related Content

Similar to R-XML.docx (20)

XML notes.pptx
XML notes.pptxXML notes.pptx
XML notes.pptx
 
distributed system concerned lab sessions
distributed system concerned lab sessionsdistributed system concerned lab sessions
distributed system concerned lab sessions
 
Xml
XmlXml
Xml
 
Xml 215-presentation
Xml 215-presentationXml 215-presentation
Xml 215-presentation
 
Xml
XmlXml
Xml
 
Creating xml publisher documents with people code
Creating xml publisher documents with people codeCreating xml publisher documents with people code
Creating xml publisher documents with people code
 
Full xml
Full xmlFull xml
Full xml
 
PHP XML
PHP XMLPHP XML
PHP XML
 
XML-Unit 1.ppt
XML-Unit 1.pptXML-Unit 1.ppt
XML-Unit 1.ppt
 
Xml nisha dwivedi
Xml nisha dwivediXml nisha dwivedi
Xml nisha dwivedi
 
Xml And JSON Java
Xml And JSON JavaXml And JSON Java
Xml And JSON Java
 
Xml session
Xml sessionXml session
Xml session
 
treeview
treeviewtreeview
treeview
 
treeview
treeviewtreeview
treeview
 
Xml
XmlXml
Xml
 
java API for XML DOM
java API for XML DOMjava API for XML DOM
java API for XML DOM
 
XML Presentation-2
XML Presentation-2XML Presentation-2
XML Presentation-2
 
Unit 10: XML and Beyond (Sematic Web, Web Services, ...)
Unit 10: XML and Beyond (Sematic Web, Web Services, ...)Unit 10: XML and Beyond (Sematic Web, Web Services, ...)
Unit 10: XML and Beyond (Sematic Web, Web Services, ...)
 
XML Introduction
XML IntroductionXML Introduction
XML Introduction
 
WEB PROGRAMMING
WEB PROGRAMMINGWEB PROGRAMMING
WEB PROGRAMMING
 

More from MrRRajasekarCSE

More from MrRRajasekarCSE (13)

TOC question bank.pdf
TOC question bank.pdfTOC question bank.pdf
TOC question bank.pdf
 
QB104545.pdf
QB104545.pdfQB104545.pdf
QB104545.pdf
 
QB104543.pdf
QB104543.pdfQB104543.pdf
QB104543.pdf
 
QB104544.pdf
QB104544.pdfQB104544.pdf
QB104544.pdf
 
QB104541.pdf
QB104541.pdfQB104541.pdf
QB104541.pdf
 
Learning_Deep_Models_for_Face_Anti-Spoofing_Binary.pdf
Learning_Deep_Models_for_Face_Anti-Spoofing_Binary.pdfLearning_Deep_Models_for_Face_Anti-Spoofing_Binary.pdf
Learning_Deep_Models_for_Face_Anti-Spoofing_Binary.pdf
 
Swapna siripireddy vtu 11843 .pdf
Swapna siripireddy vtu 11843 .pdfSwapna siripireddy vtu 11843 .pdf
Swapna siripireddy vtu 11843 .pdf
 
Unit I - Theorems.pdf
Unit I - Theorems.pdfUnit I - Theorems.pdf
Unit I - Theorems.pdf
 
Swapna siripireddy vtu 11843 .pdf
Swapna siripireddy vtu 11843 .pdfSwapna siripireddy vtu 11843 .pdf
Swapna siripireddy vtu 11843 .pdf
 
R-XML.docx
R-XML.docxR-XML.docx
R-XML.docx
 
R-Json.docx
R-Json.docxR-Json.docx
R-Json.docx
 
Unit I - Theorems.pdf
Unit I - Theorems.pdfUnit I - Theorems.pdf
Unit I - Theorems.pdf
 
R-Json.docx
R-Json.docxR-Json.docx
R-Json.docx
 

Recently uploaded

Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxPoojaBan
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvLewisJB
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHC Sai Kiran
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfAsst.prof M.Gokilavani
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfROCENODodongVILLACER
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .Satyam Kumar
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girlsssuser7cb4ff
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineeringmalavadedarshan25
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx959SahilShah
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
 
DATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage exampleDATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage examplePragyanshuParadkar1
 

Recently uploaded (20)

Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvv
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECH
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdf
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girls
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineering
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 
DATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage exampleDATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage example
 

R-XML.docx

  • 1. R-XML Files XML is a file format which shares both the file format and the data on the World Wide Web, intranets, and elsewhere using standard ASCII text. It stands for Extensible Markup Language (XML). Similar to HTML it contains markup tags. But unlike HTML where the markup tag describes structure of the page, in xml the markup tags describe the meaning of the data contained into he file. You can read a xml file in R using the "XML" package. This package can be installed using following command. install.packages("XML") Input Data Create a XMl file by copying the below data into a text editor like notepad. Save the file with a .xml extension and choosing the file type as all files(*.*). <RECORDS> <STUDENT> <ID>1</ID> <NAME>Rick</NAME> <DOB>1/1/2012</DOB> <YEAR>1</YEAR> <MARK>90</MARK> </STUDENT> <STUDENT> <ID>2</ID> <NAME>Rick</NAME> <DOB>1/1/2012</DOB> <YEAR>1</YEAR> <MARK>50</MARK> </STUDENT> <STUDENT> <ID>3</ID> <NAME>Rick</NAME> <DOB>1/1/2012</DOB> <YEAR>1</YEAR> <MARK>70</MARK> </STUDENT> </RECORDS> Reading XML File The xml file is read by R using the function xmlParse(). It is stored as a list in R. # Load the package required to read XML files. library("XML") # Also load the other required package. library("methods")
  • 2. # Give the input file name to the function. result <- xmlParse(file = "input.xml") # Print the result. print(result) When we execute the above code, it produces the following result − 1 Rick 623.3 1/1/2012 IT 2 Dan 515.2 9/23/2013 Operations 3 Michelle 611 11/15/2014 IT 4 Ryan 729 5/11/2014 HR 5 Gary 843.25 3/27/2015 Finance 6 Nina 578 5/21/2013 IT 7 Simon 632.8 7/30/2013 Operations 8 Guru 722.5 6/17/2014
  • 3. Finance Get Number of Nodes Present in XML File # Load the packages required to read XML files. library("XML") library("methods") # Give the input file name to the function. result <- xmlParse(file = "input.xml") # Exract the root node form the xml file. rootnode <- xmlRoot(result) # Find number of nodes in the root. rootsize <- xmlSize(rootnode) # Print the result. print(rootsize) When we execute the above code, it produces the following result − output [1] 8 Details of the First Node Let's look at the first record of the parsed file. It will give us an idea of the various elements present in the top level node. # Load the packages required to read XML files. library("XML") library("methods") # Give the input file name to the function. result <- xmlParse(file = "input.xml") # Exract the root node form the xml file. rootnode <- xmlRoot(result) # Print the result. print(rootnode[1]) When we execute the above code, it produces the following result − $EMPLOYEE 1 Rick 623.3 1/1/2012 IT
  • 4. attr(,"class") [1] "XMLInternalNodeList" "XMLNodeList" Get Different Elements of a Node # Load the packages required to read XML files. library("XML") library("methods") # Give the input file name to the function. result <- xmlParse(file = "input.xml") # Exract the root node form the xml file. rootnode <- xmlRoot(result) # Get the first element of the first node. print(rootnode[[1]][[1]]) # Get the fifth element of the first node. print(rootnode[[1]][[5]]) # Get the second element of the third node. print(rootnode[[3]][[2]]) When we execute the above code, it produces the following result − 1 IT Michelle XML to Data Frame To handle the data effectively in large files we read the data in the xml file as a data frame. Then process the data frame for data analysis. # Load the packages required to read XML files. library("XML") library("methods") # Convert the input xml file to a data frame. xmldataframe <- xmlToDataFrame("input.xml") print(xmldataframe) When we execute the above code, it produces the following result − ID NAME SALARY STARTDATE DEPT 1 1 Rick 623.30 2012-01-01 IT 2 2 Dan 515.20 2013-09-23 Operations 3 3 Michelle 611.00 2014-11-15 IT 4 4 Ryan 729.00 2014-05-11 HR 5 NA Gary 843.25 2015-03-27 Finance 6 6 Nina 578.00 2013-05-21 IT 7 7 Simon 632.80 2013-07-30 Operations
  • 5. 8 8 Guru 722.50 2014-06-17 Finance As the data is now available as a dataframe we can use data frame related function to read and manipulate the file.