XML
Mukesh N Tekwani
tekwani@email.com
www.myexamnotes.com

Disadvantages of HTML – Need for XML
HTML lacks syntax checking
HTML lacks structure
HTML is not suitable for data interchange
HTML is not context aware – HTML does not allow us to describe the information content or the semantics of the document
HTML is not object-oriented
HTML is not re-usable
HTML is not extensible

Introduction to XML
XML – Extensible Markup Language
Extensible – capable of being extended
Markup – it is a way of adding information to the text indicating the logical components of a document
How is it different from HTML?
HTML was designed to display data
XML was designed to store, describe and transport data
XML is also a markup language like HTML
XML tags are not predefined – we must design our own tags.

Differences between HTML and XML

Advantages (Features) of XML - 1
XML simplifies data sharing
Since XML data is stored in plain text format, data can be easily shared among different hardware and software platforms.
XML separates data from HTML
To display dynamic data in HTML, the code must be rewritten each time the data changes. With XML, data can be stored in separate files so that whenever the data changes it is automatically displayed correctly. We have to design the HTML for layout only once.

Advantages (Features) of XML - 2
XML simplifies data transport
With XML, data can be easily exchanged between different platforms.
XML makes data more available
Since XML is independent of hardware, software and application, XML can make your data more available and useful.
Different applications, not only HTML pages, can access your data.
XML provides a means to package almost any type of information (binary, text, voice, video) for delivery to a receiving end.

Advantages (Features) of XML - 3
Internationality
HTML relies heavily on ASCII, which makes using foreign characters very difficult. XML uses Unicode, so many European and Asian languages are also handled easily.

Applications of XML
File Converters - Many applications have been written to convert existing documents into the XML standard. An example is a PDF to XML converter.
Cell Phones - XML data is sent to some cell phones. This data is then formatted to display text or images, and even to play sounds!
VoiceXML - Converts XML documents into an audio format so that you can listen to an XML document.

XML Document – Example 1
<?xml version="1.0" encoding="ISO-8859-1"?>
<class_list>
    <student><name>Anamika</name><grade>A+</grade></student>
    <student><name>Veena</name><grade>B+</grade></student>
</class_list>

XML Document – Example 1 - Explained
The first line is the XML declaration.
<?xml version="1.0" encoding="ISO-8859-1"?>
It defines the XML version (1.0)
It gives the encoding used (ISO-8859-1 = Latin-1/West European character set)
The XML declaration looks like a processing instruction (PI): it is identified by the ? at its start and end. (Strictly, the XML specification defines the declaration as a separate construct.)
The next line describes the root element of the document (like saying: "this document is a class_list")
The next 2 lines describe the 2 child elements of the root (student); each student contains a name and a grade element
And finally the last line defines the end of the root element: </class_list>

Logical Structure
XML uses its start tags and end tags as containers.
The start tag, the content and the end tag form an element.
Elements are the building blocks out of which an XML document is assembled.
An XML document has a tree-like structure, with the root element at the top and all the other elements contained within it.

Tree structure
XML documents form a tree structure.
XML documents must contain a root element. This element is "the parent" of all other elements.
The elements in an XML document form a document tree. The tree starts at the root and branches to the lowest level of the tree.
All elements can have sub elements (child elements)
<root>
    <child>
        <subchild>.....</subchild>
    </child>
</root>
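
The tree structure described above can be walked programmatically. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module and the class_list document from Example 1; the helper name outline is made up for illustration and is not part of the slides.

```python
import xml.etree.ElementTree as ET

# The class_list document from Example 1, passed as bytes so the
# encoding named in the XML declaration is honoured by the parser.
DOC = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<class_list>
    <student><name>Anamika</name><grade>A+</grade></student>
    <student><name>Veena</name><grade>B+</grade></student>
</class_list>"""


def outline(element, depth=0):
    """Return one indented line per element, parent before children."""
    lines = ["  " * depth + element.tag]
    for child in element:  # iterating an element yields its direct children
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(DOC)      # root is the <class_list> element
tree_lines = outline(root)
print("\n".join(tree_lines))
```

The root is printed first, and each level of indentation corresponds to one level of the document tree.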

XML – Example 2
<bookstore>
    <book category = "COOKING">
        <title lang = "en">Everyday Italian</title>
        <author>Giada De Laurentiis</author>
        <year>2005</year>
        <price>30.00</price>
    </book>
    <book category = "CHILDREN">
        <title lang = "en">Harry Potter</title>
        <author>J K. Rowling</author>
        <year>2005</year>
        <price>29.99</price>
    </book>
    <book category = "WEB">
        <title lang = "en">Learning XML</title>
        <author>Erik T. Ray</author>
        <year>2003</year>
        <price>39.95</price>
    </book>
</bookstore>
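
To show how an application might read both element content and attributes from Example 2, here is a small sketch using Python's standard xml.etree.ElementTree module. The function name summarize and the tuple layout are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

BOOKSTORE = """<bookstore>
    <book category="COOKING">
        <title lang="en">Everyday Italian</title>
        <author>Giada De Laurentiis</author>
        <year>2005</year>
        <price>30.00</price>
    </book>
    <book category="CHILDREN">
        <title lang="en">Harry Potter</title>
        <author>J K. Rowling</author>
        <year>2005</year>
        <price>29.99</price>
    </book>
    <book category="WEB">
        <title lang="en">Learning XML</title>
        <author>Erik T. Ray</author>
        <year>2003</year>
        <price>39.95</price>
    </book>
</bookstore>"""


def summarize(root):
    """Collect (category, title, language, price) for every <book>."""
    rows = []
    for book in root.findall("book"):
        title = book.find("title")
        rows.append((
            book.get("category"),            # attribute on <book>
            title.text,                      # text content of <title>
            title.get("lang"),               # attribute on <title>
            float(book.findtext("price")),   # text converted to a number
        ))
    return rows


rows = summarize(ET.fromstring(BOOKSTORE))
for row in rows:
    print(row)
```

Attributes are read with get(), while element content is read through text or findtext().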

Important Definitions
XML Element
An element is a start tag, content, and an end tag.
E.g., <greeting>Hello World</greeting>
XML Attribute
An attribute provides additional information about elements.
E.g., <note priority = "high">

Important Definitions
Child elements – XML elements may have child elements
<employee id = "100">
    <name>
        <first>Anita</first>
        <initial>D</initial>
        <last>Singh</last>
    </name>
</employee>
Here name is the parent element, and first, initial and last are the children of that parent element.

XML Element
An XML element is everything from the element's start tag to the element's end tag.
An element can contain other elements, simple text or a mixture of both. Elements can also have attributes.

XML Syntax Rules
All XML elements must have a closing tag.
XML tags are case sensitive. The tag <Book> is different from the tag <book>.
Opening and closing tags must be written with the same case:
    <Message>This is incorrect</message>
    <message>This is correct</message>

XML Syntax Rules
XML elements must be properly nested
HTML permits this:
    <B><I>This text is bold and italic</B></I>
But in XML this is invalid. All elements must be properly nested within one another.
    <B><I>This text is bold and italic</I></B>
XML documents must have a root element. It is the parent of all other elements.
<root>
    <child>
        <subchild>.....</subchild>
    </child>
</root>
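
The syntax rules above can be checked with any XML parser. The sketch below assumes Python's standard xml.etree.ElementTree module; the helper parse_error and the sample labels are invented for illustration. The exact wording of the error messages depends on the parser, so none is shown here.

```python
import xml.etree.ElementTree as ET


def parse_error(text):
    """Return None if text parses as XML, otherwise the parser's message."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as exc:
        return str(exc)


# Snippets taken from the rules above, plus one with a missing closing tag.
samples = {
    "case mismatch": "<Message>This is incorrect</message>",
    "matching case": "<message>This is correct</message>",
    "bad nesting": "<B><I>This text is bold and italic</B></I>",
    "good nesting": "<B><I>This text is bold and italic</I></B>",
    "missing closing tag": "<root><child>text</root>",
}

results = {label: parse_error(xml) for label, xml in samples.items()}
for label, message in results.items():
    print(f"{label}: {'accepted' if message is None else 'rejected, ' + message}")
```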

XML Syntax Rules
XML Entity References
Some characters have a special meaning in XML. E.g., if you place a character like "<" inside an XML element, it will generate an error because the parser interprets it as the start of a new element.
    <message>if salary < 1000 then </message>
To avoid this error, replace the "<" character with an entity reference:
    <message>if salary &lt; 1000 then</message>
There are 5 predefined entity references in XML:
    &lt;      <    less than
    &gt;      >    greater than
    &amp;     &    ampersand
    &apos;    '    apostrophe
    &quot;    "    quotation mark
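
In practice the escaping described above is rarely done by hand. As a sketch, assuming Python's standard library, xml.sax.saxutils.escape replaces &, < and > in text, and accepts an extra mapping when quotes must be escaped too (for example inside attribute values). The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "if salary < 1000 then"

# escape() replaces &, < and > with their entity references.
escaped = escape(raw)
document = f"<message>{escaped}</message>"

# The parser turns the entity reference back into the original character.
round_trip = ET.fromstring(document).text

# Quotes are only replaced when an extra mapping is supplied.
attr_safe = escape('Tom & "Jerry"', {'"': "&quot;", "'": "&apos;"})

print(escaped)
print(round_trip)
print(attr_safe)
```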

XML Syntax Rules
Comments in XML (similar to HTML)
    <!-- This is a comment -->
White space is preserved in XML but not in HTML

XML Naming Rules
Names can contain letters, numbers, and other characters
Names cannot start with a number or punctuation character (an underscore is allowed)
Names cannot start with the letters xml (or XML, or Xml, etc)
Names cannot contain spaces
Any name can be used, no words are reserved.

XML Markup Delimiters
Every XML element is made up of the following parts:
    Symbol        Description
    <             Start tag open delimiter
    </            End tag open delimiter
    something     element name
    >             tag close delimiter
    />            empty tag close delimiter

Different Types of XML Markups
5 Types of Markup in XML
Elements
Entities
Comments
Processing Instructions
Ignored Sections

Element Markup
It is composed of 3 parts: the start tag, the content, and the end tag.
Example: <name>Neetu</name>
The start tag and the end tag can be treated as wrappers.
The element name that appears in the start tag must be exactly the same as the name that appears in the end tag.
Incorrect example: <Name>Neetu</name>

Different Types of XML Markups
Attribute Markup
Attributes are used to attach information to the information contained in an element.
General syntax for attributes is:
    <elementname property = 'value'>
        Or
    <elementname property = "value">
Attribute value must be enclosed within quotation marks.
Use either single quotes or double quotes, but don’t mix them.

Attribute Markup
If we specify the attributes for the same element more than once, the specifications are merged.
<?xml version = "1.0"?>
<myparas>
<para num = "first">This is Para 1 </para>
<para num = 'second' color = "red">This is Para 2</para>
</myparas>

Attribute Markup
When the XML processor encounters line 3, it will record the fact that the para element has the num attribute.
When it encounters the 4th line, it will record the fact that the para element has the color attribute.

Reserved Attribute
The xml:lang attribute is reserved to identify the human language in which the element was written.
The value of the attribute is a language code such as one of the following:
    en    English
    fr    French
    de    German

XML Attributes
Attribute provides additional information about the element
Similar to attributes in HTML, e.g., <IMG SRC="sky.jpg">. In this, SRC is the attribute.
XML attribute values must be quoted
XML elements can have attributes in name/value pairs just like in HTML.
In XML the attribute value must always be quoted.
    <note date = 01/01/2010>        <-- This is invalid
    <to>Priya</to>
    <from>Deeali</from>
    </note>
    <note date = "01/01/2010">      <-- Now OK since enclosed in double quotes
    <note date = '01/01/2010'>      <-- This is also OK since enclosed in single quotes

XML Attributes and Elements
Consider the following example, in which gender is an attribute:
<person gender = "female">
    <firstname>Geeta</firstname>
    <lastname>Shah</lastname>
</person>
In the next example, gender is an element:
<person>
    <gender>female</gender>
    <firstname>Geeta</firstname>
    <lastname>Shah</lastname>
</person>
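
The two forms carry the same information but are read differently by an application. A minimal sketch, assuming Python's standard xml.etree.ElementTree module (the variable names are illustrative):

```python
import xml.etree.ElementTree as ET

AS_ATTRIBUTE = """<person gender="female">
    <firstname>Geeta</firstname>
    <lastname>Shah</lastname>
</person>"""

AS_ELEMENT = """<person>
    <gender>female</gender>
    <firstname>Geeta</firstname>
    <lastname>Shah</lastname>
</person>"""

attr_person = ET.fromstring(AS_ATTRIBUTE)
elem_person = ET.fromstring(AS_ELEMENT)

# An attribute is looked up on the element itself ...
gender_from_attribute = attr_person.get("gender")
# ... while a child element is located first and then its text is read.
gender_from_element = elem_person.findtext("gender")

print(gender_from_attribute, gender_from_element)
```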

Problems with XML Attributes
Attributes cannot contain multiple values whereas elements can
Attributes are not easily expandable (for future changes)
Attributes are difficult to read and maintain
Use attributes for information that is not relevant to the data.

Illustrating Problematic Attributes
Consider the following example:
<note day="03" month="02" year="2010" to="Tina" from="Yasmin" heading="Reminder" body="Happy Birthday!"></note>
Better way:
<note>
    <date>
        <day>03</day>
        <month>02</month>
        <year>2010</year>
    </date>
    <to>Tina</to>
    <from>Yasmin</from>
    <heading>Reminder</heading>
    <body>Happy Birthday!</body>
</note>

When to use Attributes?
XML attributes can be used to assign ID references to elements.
Metadata – data about data – should be stored as attributes
The ID can then be used to identify the XML element
<messages>
    <note id="501">
        <to>Tina</to>
        <from>Yasmin</from>
        <heading>Reminder</heading>
        <body>Happy Birthday!</body>
    </note>
    <note id="502">
        <to>Yasmin</to>
        <from>Tina</from>
        <heading>Re: Reminder</heading>
        <body>Thank you, my dear</body>
    </note>
</messages>
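
As an illustration of using the id attribute to pick out one element, the sketch below relies on the limited XPath support in Python's standard xml.etree.ElementTree module. The function name find_note is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

MESSAGES = """<messages>
    <note id="501">
        <to>Tina</to>
        <from>Yasmin</from>
        <heading>Reminder</heading>
        <body>Happy Birthday!</body>
    </note>
    <note id="502">
        <to>Yasmin</to>
        <from>Tina</from>
        <heading>Re: Reminder</heading>
        <body>Thank you, my dear</body>
    </note>
</messages>"""


def find_note(root, note_id):
    """Return the <note> child whose id attribute equals note_id, or None."""
    return root.find(f"./note[@id='{note_id}']")


root = ET.fromstring(MESSAGES)
reply = find_note(root, "502")
missing = find_note(root, "999")

print(reply.findtext("heading"))
print(missing)
```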

What does Extensible mean in XML?
Consider the following XML example:
<note>
    <to>Anita</to>
    <from>Veena</from>
    <body>You have an exam tomorrow</body>
</note>
Suppose we create an application that extracts the <to>, <from> and <body> elements from the XML document to produce the result:
    MESSAGE
    To: Anita
    From: Veena
    You have an exam tomorrow
Now suppose the author of the XML document added some extra information to it:
<note>
    <date>2008-01-10</date>
    <to>Anita</to>
    <from>Veena</from>
    <heading>Reminder</heading>
    <body>You have an exam tomorrow</body>
</note>
This application will not crash because it will still find the <to>, <from> and <body> elements in the XML document and produce the same output.
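
The behaviour described above can be sketched as follows, assuming Python's standard xml.etree.ElementTree module. The function render_message stands in for the hypothetical application; it looks up only the three elements it knows about, so the added <date> and <heading> elements are simply ignored.

```python
import xml.etree.ElementTree as ET

ORIGINAL = """<note>
    <to>Anita</to>
    <from>Veena</from>
    <body>You have an exam tomorrow</body>
</note>"""

EXTENDED = """<note>
    <date>2008-01-10</date>
    <to>Anita</to>
    <from>Veena</from>
    <heading>Reminder</heading>
    <body>You have an exam tomorrow</body>
</note>"""


def render_message(xml_text):
    """Build the MESSAGE text from the <to>, <from> and <body> elements."""
    note = ET.fromstring(xml_text)
    return "\n".join([
        "MESSAGE",
        f"To: {note.findtext('to')}",
        f"From: {note.findtext('from')}",
        note.findtext("body"),
    ])


before = render_message(ORIGINAL)
after = render_message(EXTENDED)
print(after)
```

Because elements are looked up by name rather than by position, adding new elements does not change the output.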

XML Validation
What is a "well formed" XML document?
A "Well Formed" XML document is one that has correct XML syntax:
XML documents must have a root element
XML elements must have a closing tag
XML tags are case sensitive
XML elements must be properly nested
XML attribute values must be quoted

Well-formed Document - Rule 1
Elements are case-sensitive.
If you define your language to use lowercase elements, then all instances of those elements must be in lowercase.
Bad Examples…
    <H1>Sample Heading</H1>
    <h1>Sample Heading</H1>
    <H1>Sample Heading</h1>

Rule 2:
All elements that contain text or other elements must have both start and ending tags.
Rule 3:
All empty elements (commonly known as standalone tags) must have a slash (/) before the end of the tag.
Rule 4:
All attribute values must be contained in quotes, either single or double – no exceptions!
Rule 5:
Elements may not overlap.
Elements must be nested properly within other elements and cannot start before a sub-element and end within the sub-element.
Rule 6:
Isolated markup characters (characters essential to creating markup documents) may not appear in parsed content as is.

Isolated Markup Characters
Isolated markup characters must be represented as a character entity and include the following: <, [, ], >, ', " and &.
Both of the following examples are invalid because the semicolon that must follow the character entity is missing.
Bad Examples…
    <h1>Jack &amp Jill</h1>
    <equation>5 &lt 2</equation>
Good Examples…
    <h1>Jack &amp; Jill</h1>
    <equation>5 &lt; 2</equation>

Rule 7:
Element (and attribute) names must start with either a letter (uppercase or lowercase) or an underscore.
Element names may contain letters, numbers, hyphens, periods and underscores.
BAD EXAMPLES
    <bad*characters>
    <illegal space>
    <99number-start>
GOOD EXAMPLES
    <example-one>
    <_example2>
    <Example.Three>
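
Rule 7 can be approximated with a regular expression. The sketch below is a deliberate simplification that covers only what the rule states for ASCII names; the real XML name grammar also permits colons and many non-ASCII letters, and the reserved xml prefix is not checked here. The names NAME_RE and is_valid_name are invented for illustration.

```python
import re

# First character: a letter or underscore.
# Remaining characters: letters, digits, periods, underscores or hyphens.
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_valid_name(name):
    """Return True if name satisfies the simplified Rule 7 check."""
    return NAME_RE.fullmatch(name) is not None


bad = ["bad*characters", "illegal space", "99number-start"]
good = ["example-one", "_example2", "Example.Three"]

checks = {name: is_valid_name(name) for name in bad + good}
for name, ok in checks.items():
    print(f"{name!r}: {'valid' if ok else 'invalid'}")
```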

XML Validation
A "valid" XML document is a "well formed" XML document that also conforms to the rules of a Document Type Definition (DTD):
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE note SYSTEM "Note.dtd">
<note>
    <to>Tina</to>
    <from>Yasmin</from>
    <heading>Reminder</heading>
    <body>Happy Birthday!</body>
</note>

XML DTD – Document Type Definition
The DOCTYPE declaration in the example above is a reference to an external DTD file. The content of the file is shown below.
The purpose of a DTD is to define the structure of an XML document. It defines the structure with a list of legal elements:
<!DOCTYPE note [
    <!ELEMENT note (to,from,heading,body)>
    <!ELEMENT to (#PCDATA)>
    <!ELEMENT from (#PCDATA)>
    <!ELEMENT heading (#PCDATA)>
    <!ELEMENT body (#PCDATA)>
]>
More on DOCTYPE later

Viewing XML Files - 1

Viewing XML Files - 2
The XML document will be displayed with color-coded root and child elements. A plus (+) or minus sign (-) to the left of the elements can be clicked to expand or collapse the element structure. To view the raw XML source (without the + and - signs), select "View Page Source" or "View Source" from the browser menu.

Viewing XML Files - 3
Why do XML documents display like this?
XML documents do not carry information about how to display the data.
Since XML tags are created by the user of the XML document, browsers do not know if a tag like <table> describes an HTML table or a dining table.
Without any information about how to display the data, most browsers will just display the XML document as it is.

Using CSS to display XML Files
CSS (Cascading Style Sheets) can be used to format an XML document.
Consider this XML document:

Displaying Formatted XML document - 1
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type = "text/css" href = "birthdate.css"?>
<birthdate>
    <person>
        <name>
            <first>Anokhi</first>
            <last>Parikh</last>
        </name>
        <date>
            <month>01</month>
            <day>21</day>
            <year>1992</year>
        </date>
    </person>
</birthdate>

Displaying Formatted XML document - 2
Stylesheet – birthdate.css
birthdate { background-color: #ffffff; width: 100%; }
person { margin-left: 0; }
name { color: #FF0000; font-size: 20pt; }
month, day, year { display: block; color: #000000; margin-left: 20pt; }

XSLT
XSL is a language for style sheets
An XSL style sheet is a file that describes how to display an XML document
XSL contains a transformation language for XML documents: XSLT. XSLT is used for generating HTML web pages from XML data.
XSLT is used to transform an XML document into an HTML document
XSLT is the recommended style sheet language for XML