Web Technologies Unit 2 Print.pdf
1. 11/16/2023
XML
(eXtensible Markup Language)
Introduction:
• XML stands for eXtensible Markup Language. It is used for
storing and transferring data.
• XML does not depend on the platform or the software
(programming language), i.e. we can write a program in any
language on any platform (operating system) to send, receive
or store data using XML.
• An XML document is a plain text document that carries data, so
it can be used to store and transfer data between systems
irrespective of their hardware and software.
• Data held in XML is easy to read and to display on a GUI
(graphical user interface) using a markup language such as
HTML.
HTML Vs XML
XML (eXtensible Markup Language) | HTML (Hyper Text Markup Language)
XML is case sensitive. | HTML is not case sensitive.
XML is mainly used for storing and transporting data. | HTML is mainly concerned with the presentation of data.
XML is dynamic. | HTML is static.
In XML the closing tag is mandatory. | In HTML the closing tag is optional for some elements.
XML uses user-defined tags that we create while writing the XML document. | HTML uses predefined tags such as <b>, <br>, <img> etc.
XML preserves white space. | HTML does not preserve white space.
XML attribute values must always be quoted. | In HTML, quoting attribute values is optional.
Simple XML document example
An XML document structure looks like this:
<root>
  <child>
    <subchild>.....</subchild>
  </child>
  <child>
    <subchild>.....</subchild>
  </child>
</root>
• <?xml version="1.0" encoding="UTF-8"?> is called the XML Prolog (also
known as the XML declaration).
• It is optional, but when we include it in the XML document it must be
the first line of the document. The XML Prolog specifies the XML version
and the encoding used in the XML document.
• The XML code can be written in any plain text editor such as Notepad
and should be saved as “filename.xml”.
Students.xml
<?xml version="1.0" encoding="UTF-8"?>
<students>
  <student>
    <num>527</num>
    <name>Ravi</name>
    <age>20</age>
  </student>
  <student>
    <num>572</num>
    <name>Dev</name>
    <age>23</age>
  </student>
</students>
• The above XML document holds the details of a few students.
Here <students> is the root element, <student> is the child element and
<num>, <name> and <age> are sub-child elements.
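To show how a program can read this document, here is a minimal Python sketch using the standard library module xml.etree.ElementTree. The XML text is copied from Students.xml; the function name read_students and the choice to convert age to an integer are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Contents of Students.xml, embedded so the sketch is self-contained.
STUDENTS_XML = """<?xml version="1.0" encoding="UTF-8"?>
<students>
  <student>
    <num>527</num>
    <name>Ravi</name>
    <age>20</age>
  </student>
  <student>
    <num>572</num>
    <name>Dev</name>
    <age>23</age>
  </student>
</students>"""


def read_students(xml_text):
    """Return one dictionary per <student> child of the root element."""
    root = ET.fromstring(xml_text)           # root is the <students> element
    records = []
    for student in root.findall("student"):  # direct <student> children only
        records.append({
            "num": student.findtext("num"),
            "name": student.findtext("name"),
            "age": int(student.findtext("age")),
        })
    return records


students = read_students(STUDENTS_XML)
for record in students:
    print(record["num"], record["name"], record["age"])
```

The same data could be sent to another system as this text and read back there in any other language, which is the platform independence described in the introduction.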
XML Syntax
Root Element is mandatory in XML
• An XML document must have exactly one root element. The root
element can have child elements and sub-child elements.
• For example: In the following XML document, <message> is the
root element and <to>, <from>, <subject> and <text> are child
elements.
<?xml version="1.0" encoding="UTF-8"?>
<message>
  <to>Durga</to>
  <from>Madhu</from>
  <subject>Message from teacher to Student</subject>
  <text>You have an exam tomorrow at 10:00 AM</text>
</message>
XML Syntax Contd…
XML is case sensitive
XML is a case sensitive language, so the opening and closing tags must
use the same case.
For example:
<from>madhu</from> This is valid
<from>Prasad</FROM> This is invalid
• In the second line the opening tag is in lower case while the
closing tag is in capitals, so the tags do not match and the XML
is invalid.
XML Prolog
<?xml version="1.0" encoding="UTF-8"?>
• This line is called the XML Prolog. It is an optional line, however
it should be the first line when you mention it. It specifies the
XML version and the encoding used in the XML document.
XML Syntax Contd…
Elements should not overlap
• All the elements in XML should be properly nested and they should not
overlap.
<class><teacher>Madhu</class></teacher> -->Wrong (Not nested properly)
<class><teacher>Durga</teacher></class> -->Correct (Correctly nested)
XML elements must have a closing tag
• All XML elements must have a closing tag.
<text category="message">hello</text> -->Correct
<text category="message">hello -->Wrong (closing tag missing)
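The syntax rules on case, nesting and closing tags can be checked by handing each snippet to a parser. The following is a minimal Python sketch using the standard xml.etree.ElementTree module; the helper name is_well_formed and the labels in the samples dictionary are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Snippets taken from the syntax rules in these notes.
samples = {
    "matching case": "<from>madhu</from>",
    "mismatched case": "<from>Prasad</FROM>",
    "overlapping elements": "<class><teacher>Madhu</class></teacher>",
    "properly nested": "<class><teacher>Durga</teacher></class>",
    "missing closing tag": '<text category="message">hello',
}

results = {label: is_well_formed(xml) for label, xml in samples.items()}
for label, ok in results.items():
    print(label, "->", "accepted" if ok else "rejected")
```

Only the "matching case" and "properly nested" snippets are accepted; the parser rejects the other three.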
Comments in XML
• This is how a comment looks in an XML document.
<!-- This is just a comment -->
XML Attributes
• XML elements can have attributes. Attributes add additional
information about the element.
• Attributes contain data in the form of name and value pairs.
• XML attributes are used to describe the properties of the elements.
Note: XML attribute values must always be quoted. We can use single or
double quotes.
Example:
<book category="computers">
  <price>1000rs</price>
  <publisher>Tata McGraw Hill</publisher>
</book>
XML Attributes Contd…
XML attributes vs XML Sub Elements:
• Data can be stored in attributes or in child elements, but
attributes have some limitations compared with child elements.
The same information can be represented in two ways:
1st Way:
<book category="computers">
  <price>1000rs</price>
  <publisher>Tata McGraw Hill</publisher>
</book>
2nd Way:
<book>
  <category>computers</category>
  <price>1000rs</price>
  <publisher>Tata McGraw Hill</publisher>
</book>
XML Attributes Contd…
• In the first way category is used as an attribute and in the second
way category is used as an element.
• Both examples provide the same information, but it is good practice
to prefer elements over attributes in XML, because:
• An attribute cannot contain multiple values, but an element can
have several child elements of the same name.
• Attributes cannot contain a tree structure, but child elements can.
• Elements are easier to handle in a programming language than
attributes.
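The difference between the two ways shows up when a program reads the data. Below is a minimal Python sketch using xml.etree.ElementTree: an attribute is read with get(), a child element with findtext(). The third document, with two <category> elements, and the value "programming" are hypothetical additions used only to illustrate the multiple-values point.

```python
import xml.etree.ElementTree as ET

# 1st way: category stored as an attribute.
AS_ATTRIBUTE = """<book category="computers">
  <price>1000rs</price>
  <publisher>Tata McGraw Hill</publisher>
</book>"""

# 2nd way: category stored as a child element.
AS_ELEMENT = """<book>
  <category>computers</category>
  <price>1000rs</price>
  <publisher>Tata McGraw Hill</publisher>
</book>"""

# Hypothetical variant: a child element can simply be repeated.
MULTI_VALUED = """<book>
  <category>computers</category>
  <category>programming</category>
</book>"""

category_from_attribute = ET.fromstring(AS_ATTRIBUTE).get("category")
category_from_element = ET.fromstring(AS_ELEMENT).findtext("category")
all_categories = [c.text for c in ET.fromstring(MULTI_VALUED).findall("category")]

print(category_from_attribute)
print(category_from_element)
print(all_categories)
```

An attribute name may appear only once on an element, so the repeated form is possible only with child elements.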
XML Validation
Document Type Definition
(DTD)
XML – Validation
• Validation is the process of checking an XML document against
a set of rules.
• An XML document with correct syntax is known as a well-formed
XML document. A well-formed document that also follows the
structure defined in a DTD or an XML Schema is known as a valid
XML document.
Let’s see a few important rules to check for syntax errors.
• All XML documents must have a root element.
• XML is a case sensitive language, so the opening and closing
tags must use the same case.
• All XML elements must have a closing tag.
• An XML attribute name is not quoted, while its value must be
quoted.
• There are two ways to check whether an XML document is valid:
1. XML DTD (Document Type Definition)
2. XML Schema
Document Type Definition (DTD)
• DTD stands for Document Type Definition. It is used to define
the document structure with a list of legal elements and attributes.
• A DTD is a set of declarations that defines the structure of an
XML document.
• Each DTD contains a list of elements, which specifies the rules
for structuring a given XML document.
• Each DTD specifies the relationship between the root element,
child elements and sub-child elements.
• An XML document is “well formed” if its syntax is correct, and it
is “valid” if it is also successfully validated against the DTD.
Document Type Definition (DTD) Contd..
• DTDs are optional in XML, but recommended for clarity.
• DTDs can be declared internally (within the XML document) or as
an external file (a separate file with the “.dtd” extension).
Basic Building Blocks:
There are 4 building blocks for an XML document with a DTD:
1. Tags
2. Elements
3. Attributes
4. Entities
Document Type Definition (DTD) Contd..
1. Tags:
– XML allows users to create their own tags. Usually
<tagname> is an opening tag and </tagname> is its matching
closing tag.
Example:
<book>Let us C</book>
<author> Yashwant Kanetkar </author>
– In XML the tags are case sensitive, and the closing tag is mandatory.
– All the elements in XML should be properly nested and they
should not overlap.
Document Type Definition (DTD) Contd..
2. Elements in DTD:
– A DTD is composed of element declarations. These
declarations describe the tags allowed in the XML document.
Declaration:
<!ELEMENT name-of-element (content)>
– To declare an empty element
Syntax:
<!ELEMENT name-of-element EMPTY>
– To declare an element that carries character data
Syntax:
<!ELEMENT name-of-element (#PCDATA)>
Note: CDATA is an attribute type, not an element content model,
so element text is always declared as #PCDATA.
– To declare an element that contains child elements
Syntax:
<!ELEMENT name-of-element (child-names)>
Examples:
<!ELEMENT student (id,name,age)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT age EMPTY>
3.Attributes:
• Attributes provide additional information along with
elements.
• In a DTD, attributes are declared by using the “ATTLIST” keyword.
Syntax:
<!ATTLIST name-of-element name-of-attribute
attribute-type default-declaration>
• In the above syntax, the field ‘attribute-type’ can take the
following pre-defined values, among others.
• ID: The value is an identifier that must be unique within the document.
• CDATA: The value is character data.
• ENTITY: The value is the name of an entity declared in the DTD.
• In the same way, the field ‘default-declaration’ can take the
following forms.
#REQUIRED: A value for the attribute must be supplied.
#IMPLIED: The attribute is optional.
#FIXED "value": The attribute always has the given fixed value.
"value": The default value used when the attribute is omitted.
Example:
<!ATTLIST student address CDATA #REQUIRED>
Document Type Definition (DTD) Contd..
4. Entities:
• An entity is a named piece of replacement text (or other
content) declared in the DTD.
• Entities are used to define a small piece of data once and
reuse it throughout the document.
Syntax:
<!ENTITY name-of-entity "value">
Example:
<!ENTITY Book "Web Technologies">
Used as
<author> the &Book; author is Uttam K Roy</author>
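Entity replacement can be observed by parsing a small document that declares the entity in an internal DTD. The following is a minimal Python sketch using xml.etree.ElementTree, whose underlying parser expands entities declared in the internal subset. It assumes the reference is written in full as &Book; (with the closing semicolon); wrapping the declaration in a DOCTYPE for the author element is an assumption made to keep the sketch self-contained.

```python
import xml.etree.ElementTree as ET

# The entity Book is declared once in the internal DTD and referenced
# in the element content as &Book; (the semicolon is required).
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE author [
  <!ENTITY Book "Web Technologies">
]>
<author>the &Book; author is Uttam K Roy</author>"""

author = ET.fromstring(DOCUMENT)
expanded_text = author.text   # the parser has already replaced &Book;
print(expanded_text)
```

The program never sees the reference itself, only the replacement text supplied by the DTD.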
Document Type Definition (DTD) Contd..
Example with external DTD:
• Create a DTD for a reminder note. It has remainder as the root
element and the child elements heading, to, from and message.
“remainder.dtd”
<!ELEMENT remainder (heading,to,from,message)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT message (#PCDATA)>
Document Type Definition (DTD) Contd..
“remainder.xml”
<?xml version="1.0"?>
<!DOCTYPE remainder SYSTEM "remainder.dtd">
<remainder>
  <heading>Final Remainder</heading>
  <to>Mr.Madhu</to>
  <from> Ravi Gupta</from>
  <message> Date of joining is 5th Feb</message>
</remainder>
Document Type Definition (DTD) Contd..
Example with internal DTD:
“remainderdemo.xml”
<?xml version="1.0"?>
<!DOCTYPE remainder [
  <!ELEMENT remainder (heading,to,from,message)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT message (#PCDATA)>
]>
<remainder>
  <heading>Final Remainder</heading>
  <to>Mr.Madhu</to>
  <from> Ravi Gupta</from>
  <message> Date of joining is 5th Feb</message>
</remainder>
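Python's standard library does not include a validating parser, so the sketch below is not DTD validation. It is a hand-written check that only mirrors what the content model (heading,to,from,message) demands: the root element must be remainder and its children must appear exactly in that order. The function name follows_content_model and the second sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Order required by <!ELEMENT remainder (heading,to,from,message)>
EXPECTED_CHILDREN = ["heading", "to", "from", "message"]


def follows_content_model(xml_text):
    """Hand-written stand-in for one DTD rule; not a real validator."""
    root = ET.fromstring(xml_text)
    if root.tag != "remainder":
        return False
    # Iterating over an element yields its child elements in document order.
    return [child.tag for child in root] == EXPECTED_CHILDREN


GOOD = """<remainder>
  <heading>Final Remainder</heading>
  <to>Mr.Madhu</to>
  <from>Ravi Gupta</from>
  <message>Date of joining is 5th Feb</message>
</remainder>"""

# Hypothetical document: well formed, but <to> comes before <heading>.
WRONG_ORDER = """<remainder>
  <to>Mr.Madhu</to>
  <heading>Final Remainder</heading>
  <from>Ravi Gupta</from>
  <message>Date of joining is 5th Feb</message>
</remainder>"""

print(follows_content_model(GOOD))
print(follows_content_model(WRONG_ORDER))
```

The second document shows the difference between the two terms: it is well formed, yet it would not be valid against the DTD because the children are out of order.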
Document Type Definition (DTD) Contd..
Example:
• Create a DTD for a catalog of four-stroke engine motorbikes where
each motorbike has the following child elements: make, model,
year, color, engine, chassis number and accessories. The engine
element has the child elements engine number, number of
cylinders and type of fuel. The accessories element has the
attributes disc brake, auto start and radio, each of which is required
and has the possible values ‘Yes’ and ‘No’. Entities must be declared
for the names of the popular motorbike makes.
Document Type Definition (DTD) Contd..
“bikecatalog.dtd”
<!ELEMENT catalog (motorbike)*>
<!ELEMENT motorbike (make, model,year,color,engine, chasis_num, accessories)>
<!ELEMENT make (#PCDATA)>
<!ELEMENT model (#PCDATA)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT color (#PCDATA)>
<!ELEMENT engine (engine_num,cylinder_num,fuel_type)>
<!ELEMENT engine_num (#PCDATA)>
<!ELEMENT cylinder_num (#PCDATA)>
<!ELEMENT fuel_type (#PCDATA)>
<!ELEMENT chasis_num (#PCDATA)>
<!ELEMENT accessories (#PCDATA)>
<!ATTLIST accessories
  diskbrake (Yes|No) #REQUIRED
  autostart (Yes|No) #REQUIRED
  radio (Yes|No) #REQUIRED>
Document Type Definition (DTD) Contd..
“bikecatalog.xml”
<?xml version="1.0"?>
<!DOCTYPE catalog SYSTEM "bikecatalog.dtd">
<catalog>
  <motorbike>
    <make>Hero</make>
    <model>Glamour 120cc</model>
    <year>2014</year>
    <color>Red-Black</color>
    <engine>
      <engine_num>564789</engine_num>
      <cylinder_num>32</cylinder_num>
      <fuel_type>Premium</fuel_type>
    </engine>
    <chasis_num>546</chasis_num>
    <accessories diskbrake="Yes" autostart="Yes" radio="No"/>
  </motorbike>
</catalog>
XML Validation
XML Schema
XML Schema (Or) XML Schema Definition (XSD)
• XML Schema is mainly used for describing the structure of XML
documents. Similar to a DTD, an XML Schema is used to check
whether a given XML document is “valid”.
• An XML Schema can define the root element, the child elements,
their number as well as their order.
• It defines data types, default and fixed values for the
elements and attributes.
• For this purpose it has its own XML-based schema language,
which is also known as XML Schema Definition (XSD).
XML Schema Contd...
• XML Schemas are written using XML syntax, whereas DTDs
use a separate syntax.
• XML Schemas specify the type of textual data that can be
used within attributes and elements.
• If we use XML Schema for complex and large documents,
the processing of the XML document may slow down.
• The XML document cannot be validated if the corresponding
schema file is absent.
XML Schema Contd...
Data Types in XML Schema:
• Binary data type: It holds binary data in encoded form (hexBinary or base64Binary).
• Boolean data type: It holds either “true” or “false”.
• Number data type: There are 3 main number data types:
o float: It contains 32-bit floating point values.
o double: It contains 64-bit floating point values.
o decimal: It contains decimal numbers, either +ve or –ve.
• Date data type: It specifies a date in the form YYYY-MM-DD.
• Time data type: It specifies a time in the form hh:mm:ss.
• String data type: It contains a series of characters.
• Example: The following example is an XML schema file called
“remainder.xsd” that defines the elements of the XML document.
“remainder.xsd”
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="remainder">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="msg" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
“remaind.xml”
<?xml version="1.0" encoding="UTF-8"?>
<remainder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:noNamespaceSchemaLocation="remainder.xsd">
  <to>Madhu</to>
  <from>Durga</from>
  <heading>Notice</heading>
  <msg>This is My Last Remainder</msg>
</remainder>
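Because an XML Schema is itself written in XML syntax, an ordinary XML parser can read it. The following minimal Python sketch loads a well-formed copy of remainder.xsd with xml.etree.ElementTree and lists the child elements it declares together with their types. It only reads the schema and does not validate remaind.xml, since the Python standard library has no XSD validator; the variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree writes a namespaced tag as {namespace-uri}localname.
XS = "{http://www.w3.org/2001/XMLSchema}"

# Well-formed copy of remainder.xsd, embedded to keep the sketch self-contained.
XSD_TEXT = """<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="remainder">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="msg" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = ET.fromstring(XSD_TEXT)
root_declaration = schema.find(XS + "element")   # declaration of <remainder>
children = root_declaration.findall(
    XS + "complexType/" + XS + "sequence/" + XS + "element"
)
declared = [(child.get("name"), child.get("type")) for child in children]

print(root_declaration.get("name"))
for name, type_name in declared:
    print(name, type_name)
```

A DTD could not be read this way, because DTD declarations use their own syntax rather than XML elements.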
XML Parsers
(DOM & SAX)
XML Parsers
• An XML parser is a software library or package that
provides interfaces for client applications to work with an
XML document.
• The XML parser reads the XML document and provides an
interface (API) through which programs can use the XML data.
XML Parsers
Two Types of parsers
DOM Parser
SAX Parser
DOM (Document Object Model)
• DOM is a platform- and language-neutral interface that allows
programs and scripts to dynamically access and update the content
and structure of an XML document.
• The Document Object Model (DOM) is a programming API
for HTML and XML documents. It defines the logical
structure of documents and provides an interface (API) for
accessing them.
• The Document Object Model can be used with any
programming language.
• DOM exposes the whole document to applications.
DOM (Document Object Model)
• The XML DOM defines a standard way for accessing and
manipulating XML documents. It presents an XML document
as a tree-structure.
• The tree structure makes it easy to describe an XML document. A
tree structure contains a root element (as parent), child elements
and so on.
• The XML DOM makes a tree-structure view for an XML
document.
• We can access all elements through the DOM tree. We can
modify or delete their content and also create new elements.
DOM (Document Object Model)
<?xml version="1.0"?>
<college>
  <student>
    <firstname>Durga</firstname>
    <lastname>Madhu</lastname>
    <contact>999123456</contact>
    <email>dm@abc.com</email>
    <address>
      <city>Hyderabad</city>
      <state>TS</state>
      <pin>500088</pin>
    </address>
  </student>
</college>
DOM (Document Object Model)
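Let's see the tree-structure representation of the above example.
college
  student
    firstname (Durga)
    lastname (Madhu)
    contact (999123456)
    email (dm@abc.com)
    address
      city (Hyderabad)
      state (TS)
      pin (500088)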
DOM (Document Object Model)
• We need a parser to read the XML document into memory and
convert it into an XML DOM object that can be accessed with
any programming language (here we use PHP).
• The DOM parser functions are part of the PHP core. There is
no installation needed to use these functions.
• To load an XML document in PHP:
$xmlDoc = new DOMDocument();
This statement creates a DOMDocument object.
$xmlDoc->load("note.xml");
This statement loads an XML file into the object.
DOM (Document Object Model)
These are some typical DOM properties in PHP:
• X->nodeName - the name of X
• X->nodeValue - the value of X
• X->parentNode - the parent node of X
• X->childNodes - the child nodes of X
• X->attributes - the attribute nodes of X
Where X is a node object.
“note.xml”
<?xml version="1.0" encoding="UTF-8"?>
<student>
  <num>521</num>
  <name>xyz</name>
  <age>30</age>
</student>
DOM (Document Object Model)
“Note.php”
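<?php
$xmlDoc = new DOMDocument();
$xmlDoc->load("note.xml");       // parse note.xml into a DOM tree
$x = $xmlDoc->documentElement;   // the root element <student>
// Print the value of every child node of the root element
foreach ($x->childNodes as $item) {
    print $item->nodeValue . "<br>";
}
?>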
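Since the DOM can be used from any programming language, the same walk over note.xml can be sketched in Python with the standard xml.dom.minidom module. This is a minimal sketch, not a translation of the PHP API: in minidom the nodeValue of an element node is None, so the text is read from the element's first child, and the whitespace between tags appears as text nodes that the loop skips. The variable names are assumptions made for illustration.

```python
from xml.dom import minidom

# Contents of note.xml, embedded so the sketch is self-contained.
NOTE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<student>
  <num>521</num>
  <name>xyz</name>
  <age>30</age>
</student>"""

document = minidom.parseString(NOTE_XML)   # whole document is loaded as a tree
root = document.documentElement            # the <student> element

values = []
for node in root.childNodes:
    # childNodes also contains the whitespace text nodes between the tags.
    if node.nodeType == node.ELEMENT_NODE:
        values.append((node.nodeName, node.firstChild.nodeValue))

print(root.nodeName, "has parent", root.parentNode.nodeName)
for name, value in values:
    print(name, "=", value)
```

The properties nodeName, nodeValue, parentNode and childNodes used here are the same DOM properties listed for PHP above.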
SAX
Simple API for XML
XML Parsers
Two types of parser
– SAX (Simple API for XML)
• Event driven API
• Sends events to the application as the document is read
– DOM (Document Object Model)
• Reads the entire document into memory in a tree
structure
Simple API for XML
SAX Parser
When should I use it?
– Large documents
– Memory constrained devices
– When you do not need to modify the document
SAX Parser
Which languages are supported?
– Java
– Perl
– C++
– Python
SAX Implementation in Java
• Create a class which extends the SAX event handler
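import org.xml.sax.*;
import org.xml.sax.helpers.ParserFactory;

// HandlerBase and ParserFactory belong to the original SAX 1 API
// (deprecated in later SAX versions).
public class SaxApplication extends HandlerBase {
    public static void main(String args[]) {
    }
}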
SAX Implementation in Java
• Create a SAX Parser
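public static void main(String[] args) {
    // Fully qualified class name of the SAX parser implementation (Xerces here)
    String parserName = "org.apache.xerces.parsers.SAXParser";
    try {
        SaxApplication app = new SaxApplication();
        Parser parser = ParserFactory.makeParser(parserName);
        parser.setDocumentHandler(app);           // receives the document events
        parser.setErrorHandler(app);              // receives the error callbacks
        parser.parse(new InputSource(args[0]));   // args[0] names the XML file
    } catch (Throwable t) {
        // Handle exceptions
    }
}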
SAX Implementation in Java
• Most important methods to parse
– void startDocument()
• Called once when document parsing begins
– void endDocument()
• Called once when parsing ends
– void startElement(...)
• Called each time an element begin tag is encountered
– void endElement(...)
• Called each time an element end tag is encountered
– void error(...)
• Called when a parsing error occurs
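The same callbacks exist in Python's standard xml.sax module, so the event sequence can be observed with a few lines. This is a minimal sketch rather than a port of the Java program: the handler class EventLogger and the shortened message document are assumptions made for illustration.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Records each SAX event in the order the parser reports it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("startDocument")

    def endDocument(self):
        self.events.append("endDocument")

    def startElement(self, name, attrs):
        self.events.append("start:" + name)

    def endElement(self, name):
        self.events.append("end:" + name)


# Shortened version of the message document used earlier in these notes.
MESSAGE_XML = b"<message><to>Durga</to><from>Madhu</from></message>"

handler = EventLogger()
xml.sax.parseString(MESSAGE_XML, handler)   # events fire while the input is read

for event in handler.events:
    print(event)
```

The handler keeps only what it chooses to record. No tree of the document is built, which is why SAX suits large documents and memory constrained devices.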
SAX | DOM
Event based parser (sequence of events). | Tree model parser (object based, tree of nodes).
SAX parses the file as it reads it, i.e. node by node. | DOM loads the file into memory and then parses the file.
Low memory use, as it does not store the XML content in memory. | Has memory constraints since it loads the whole XML file before parsing.
SAX is read only, i.e. it cannot insert or delete nodes. | DOM is read and write (can insert or delete nodes).
Use the SAX parser when the XML content is large. | If the XML content is small, prefer the DOM parser.
SAX reads the XML file from top to bottom; backward navigation is not possible. | Backward and forward search is possible for finding tags and evaluating the information inside them.
Faster at run time. | Slower at run time.