Introduction to XML

Published in: Education

  1. 1. XML<br />Mukesh N Tekwani<br />tekwani@email.com<br />www.myexamnotes.com<br />
  2. 2. Disadvantages of HTML – Need for XML<br /><ul><li>HTML lacks syntax checking
  3. 3. HTML lacks structure
  4. 4. HTML is not suitable for data interchange
  5. 5. HTML is not context aware – HTML does not allow us to describe the information content or the semantics of the document
  6. 6. HTML is not object-oriented
  7. 7. HTML is not re-usable
  8. 8. HTML is not extensible</li></li></ul><li>Introduction to XML<br /><ul><li>XML – Extensible Markup Language
  9. 9. Extensible – capable of being extended
  10. 10. Markup – it is a way of adding information to the text indicating the logical components of a document
  11. 11. How is it different from HTML?
  12. 12. HTML was designed to display data
  13. 13. XML was designed to store, describe and transport data
  14. 14. XML is also a markup language like HTML
  15. 15. XML tags are not predefined – we must design our own tags.</li></li></ul><li>Differences between HTML and XML<br />
  16. 16. Advantages (Features) of XML - 1<br />XML simplifies data sharing<br />Since XML data is stored in plain text format, data can be easily shared among different hardware and software platforms.<br />XML separates data from HTML<br />To display dynamic data in HTML, the code must be rewritten each time the data changes. With XML, data can be stored in separate files so that whenever the data changes it is automatically displayed correctly. We have to design the HTML for layout only once.<br />
  17. 17. Advantages (Features) of XML - 2<br /><ul><li>XML simplifies data transport
  18. 18. With XML, data can be easily exchanged between different platforms.
  19. 19. XML makes data more available
  20. 20. Since XML is independent of hardware, software and application, XML can make your data more available and useful.
  21. 21. Different applications can access your data in HTML pages
  22. 22. XML provides a means to package almost any type of information (binary, text, voice, video) for delivery to a receiving end.</li></li></ul><li>Advantages (Features) of XML - 3<br />Internationality<br />HTML relies heavily on ASCII which makes using foreign characters very difficult. XML uses Unicode so that many European and Asian languages are also handled easily<br />
  23. 23. Applications of XML<br />File Converters<br />Many applications have been written to convert existing documents into the XML standard. An example is a PDF to XML converter. <br />Cell Phones - XML data is sent to some cell phones. This data is then formatted to display text or images, and even to play sounds!<br />VoiceXML - Converts XML documents into an audio format so that you can listen to an XML document.<br />
  24. 24. XML Document – Example 1<br /><?xml version="1.0" encoding="ISO-8859-1"?><br /><class_list><br /><student><br /><name>Anamika</name><br /><grade>A+</grade><br /></student><br /><student><br /><name>Veena</name><br /><grade>B+</grade><br /></student><br /></class_list> <br />
  25. 25. XML Document–Example 1 - Explained<br /><ul><li>The first line is the XML declaration.
  26. 26. <?xml version="1.0" encoding="ISO-8859-1"?>
  27. 27. It defines the XML version (1.0)
  28. 28. It gives the encoding used (ISO-8859-1 = Latin-1/West European character set)
  29. 29. The XML declaration has the same form as a processing instruction (PI) and is identified by the ? at its start and end
  30. 30. The next line describes the root element of the document (like saying: "this document is a class_list")
  31. 31. The next lines describe the 2 student child elements of the root; each student has 2 child elements of its own (name and grade)
  32. 32. And finally the last line defines the end of the root element: </class_list></li></li></ul><li>Logical Structure<br />XML uses its start tags and end tags as containers.<br />The start tag, the content and the end tag form an element<br />Elements are the building blocks out of which an XML document is assembled.<br />An XML document has a tree-like structure with the root element at the top and all the other elements are contained within each other.<br />
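The tree structure of Example 1 can be explored with a parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the function name read_class_list is only illustrative and is not part of the slides.

```python
import xml.etree.ElementTree as ET

# Example 1 from the slides, passed as bytes so the parser can honour
# the encoding named in the XML declaration.
CLASS_LIST = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<class_list>
  <student><name>Anamika</name><grade>A+</grade></student>
  <student><name>Veena</name><grade>B+</grade></student>
</class_list>"""


def read_class_list(xml_bytes):
    """Return the root tag and a (name, grade) pair for each student."""
    root = ET.fromstring(xml_bytes)
    students = [
        (student.findtext("name"), student.findtext("grade"))
        for student in root.findall("student")  # direct children of the root
    ]
    return root.tag, students


root_tag, students = read_class_list(CLASS_LIST)
print(root_tag)
for name, grade in students:
    print(name, grade)
```

The root element is reached first, and every other element is found by walking down from it, which is what "tree-like structure" means in practice.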
  33. 33. Tree structure<br /><ul><li>XML documents form a tree structure.
  34. 34. XML documents must contain a root element. This element is "the parent" of all other elements.
  35. 35. The elements in an XML document form a document tree. The tree starts at the root and branches to the lowest level of the tree.
  36. 36. All elements can have sub elements (child elements)
  37. 37. <root> <child> <subchild>.....</subchild> </child></root></li></li></ul><li>XML – Example 2<br />
  38. 38. XML – Example 2<br /><bookstore><br /> <book category = "COOKING"> <title lang = "en">Everyday Italian</title> <author>Giada De Laurentiis</author> <year>2005</year> <price>30.00</price></book><br /> <book category = "CHILDREN"> <title lang = "en">Harry Potter</title> <author>J K. Rowling</author> <year>2005</year> <price>29.99</price></book><br /> <book category = "WEB"> <title lang = "en">Learning XML</title> <author>Erik T. Ray</author> <year>2003</year> <price>39.95</price></book><br /></bookstore><br />
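As a rough illustration of how an application might read Example 2, the sketch below uses Python's standard xml.etree.ElementTree module to collect each book's category attribute and title element and to add up the prices. The function name summarize is an assumption made for this example.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# The bookstore document from Example 2.
BOOKSTORE = """<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>"""


def summarize(xml_text):
    """Map each category attribute to its title and total the prices."""
    root = ET.fromstring(xml_text)
    titles = {}
    total = Decimal("0")
    for book in root.findall("book"):
        titles[book.get("category")] = book.findtext("title")  # attribute -> element text
        total += Decimal(book.findtext("price"))  # Decimal avoids float rounding
    return titles, total


titles, total = summarize(BOOKSTORE)
print(titles["WEB"], total)
```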
  39. 39. Important Definitions<br />XML Element<br />An element is a start tag, content, and an end tag.<br />E.g., <greeting>Hello World</greeting> <br />XML Attribute<br />An attribute provides additional information about elements<br />E.g., <note priority = "high"><br />
  40. 40. Important Definitions<br />Child elements – XML elements may have child elements<br /><employee id = "100"><br /> <name><br /> <first>Anita</first><br /> <initial>D</initial><br /> <last>Singh</last> <br /> </name><br /></employee><br />Parent element: name<br />Children of the parent element: first, initial and last<br />
  41. 41. XML Element<br />An XML element is everything from the element's start tag to the element's end tag.<br />An element can contain other elements, simple text or a mixture of both. <br />Elements can also have attributes.<br />
  42. 42. XML Syntax Rules<br />All XML elements must have a closing tag<br />XML tags are case sensitive. <br />The tag <Book> is different from the tag <book><br />Opening and closing tags must be written with the same case<br /> <Message>This is incorrect</message><br /> <message>This is correct</message><br />
  43. 43. XML Syntax Rules<br /><ul><li>XML elements must be properly nested
  44. 44. HTML permits this:</li></ul><B><I>This text is bold and italic</B></I><br /> But in XML this is invalid. All elements must be properly nested within one another.<br /><B><I>This text is bold and italic</I></B><br /><ul><li>XML documents must have a root element. It is the parent of all other elements.</li></ul><root><br /> <child> <subchild>.....</subchild></child><br /></root><br />
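The case and nesting rules above can be checked mechanically, because an XML parser rejects any document that breaks them. This is a small sketch with Python's standard xml.etree.ElementTree module; the helper name is_well_formed is made up for the illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


samples = [
    "<Message>This is incorrect</message>",            # case mismatch
    "<message>This is correct</message>",
    "<B><I>This text is bold and italic</B></I>",      # overlapping elements
    "<B><I>This text is bold and italic</I></B>",
    "<child>one</child><child>two</child>",            # no single root element
    "<root><child><subchild>.....</subchild></child></root>",
]

results = [is_well_formed(sample) for sample in samples]
for sample, ok in zip(samples, results):
    print("OK " if ok else "BAD", sample)
```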
  45. 45. XML Syntax Rules<br />XML Entity References<br />Some characters have a special meaning in XML. E.g., If you place a character like "<" inside an XML element, it will generate an error because the parser interprets it as the start of a new element.<br /><message>if salary < 1000 then </message><br />To avoid this error, replace the "<" character with an entity reference:<br /><message>if salary &lt; 1000 then</message><br />
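  46. 46. XML Syntax Rules<br />XML Entity References<br />There are 5 predefined entity references in XML:<br />&lt; for < (less than)<br />&gt; for > (greater than)<br />&amp; for & (ampersand)<br />&apos; for ' (apostrophe)<br />&quot; for " (quotation mark)<br />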
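Programs that generate XML usually do not type entity references by hand. As a sketch, Python's standard xml.sax.saxutils.escape function replaces &, < and > in text before it is placed inside an element, and the parser turns the references back into characters on reading. The helper name make_message is hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def make_message(text):
    """Wrap text in a <message> element, escaping &, < and > first."""
    return "<message>" + escape(text) + "</message>"


xml_text = make_message("if salary < 1000 then")
print(xml_text)

# The parser converts the entity reference back into the original character.
parsed = ET.fromstring(xml_text)
print(parsed.text)
```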
  47. 47. XML Syntax Rules<br /><ul><li>Comments in XML (similar to HTML)</li></ul> <!-- This is a comment --> <br /><ul><li>White space is preserved in XML but not in HTML
  48. 48. XML Naming Rules
  49. 49. Names can contain letters, numbers, and other characters
  50. 50. Names cannot start with a number or punctuation character
  51. 51. Names cannot start with the letters xml (or XML, or Xml, etc)
  52. 52. Names cannot contain spaces
  53. 53. Any name can be used, no words are reserved.</li></li></ul><li>XML Markup Delimiters<br />Every XML element is made up of the following parts:<br /> Symbol Description<br /> < Start tag open delimiter<br /> </ End tag open delimiter<br /> something element name<br /> > tag close delimiter<br /> /> empty tag close delimiter<br />
  54. 54. Different Types of XML Markups<br />5 Types of Markup in XML<br />Elements<br />Entities<br />Comments<br />Processing Instructions<br />Ignored Sections<br />
  55. 55. Element Markup<br />Element Markup<br />It is composed of 3 parts: start tag, the content, and the end tag.<br />Example: <name>Neetu</name><br />The start tag and the end tag can be treated as wrappers<br />The element name that appears in the start tag must be exactly the same as the name that appears in the end tag.<br />Incorrect example (the case of the two names differs): <Name>Neetu</name> <br />
  56. 56. Different Types of XML Markups<br />Attribute Markup<br />Attributes are used to attach information to the information contained in an element.<br />General syntax for attributes is:<br /><elementname property = 'value'><br /> Or<br /><elementname property = "value"><br />Attribute value must be enclosed within quotation marks<br />Use either single quotes or double quotes but don’t mix them.<br />
  57. 57. Attribute Markup<br />If we specify the attributes for the same element more than once, the specifications are merged.<br /><?xml version = "1.0"?><br /><myparas><br /><para num = "first">This is Para 1 </para><br /><para num = 'second' color = "red">This is Para 2</para><br /></myparas><br />
  58. 58. Attribute Markup<br />When the XML processor encounters line 3, it will record the fact that para element has the num attribute<br />When it encounters the 4th line it will record the fact that para element has the color attribute<br />
  59. 59. Reserved Attribute<br />The xml:lang attribute is reserved to identify the human language in which the content of the element is written<br />The value of the attribute is a language code, for example:<br />en English<br />fr French<br />de German<br />
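A program can read xml:lang like any other attribute, with one detail: the xml prefix is bound to a fixed namespace, so Python's standard xml.etree.ElementTree module exposes the attribute under a namespace-qualified key. The sketch below assumes a made-up greetings document; only the xml:lang attribute comes from the slides.

```python
import xml.etree.ElementTree as ET

# ElementTree stores xml:lang under the reserved XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical document used only for this illustration.
GREETINGS = """<greetings>
  <greeting xml:lang="en">Hello</greeting>
  <greeting xml:lang="fr">Bonjour</greeting>
  <greeting xml:lang="de">Guten Tag</greeting>
</greetings>"""


def greetings_by_language(xml_text):
    """Map each language code to the text of its greeting element."""
    root = ET.fromstring(xml_text)
    return {g.get(XML_LANG): g.text for g in root.findall("greeting")}


by_lang = greetings_by_language(GREETINGS)
print(by_lang["fr"])
```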
  60. 60. XML Attributes<br /><ul><li>Attribute provides additional information about the element
  61. 61. Similar to attributes in HTML, e.g., <IMG SRC="sky.jpg">. Here SRC is the attribute
  62. 62. XML Attribute values must be quoted
  63. 63. XML elements can have attributes in name/value pairs just like in HTML.
  64. 64. In XML the attribute value must always be quoted.</li></ul><note date = 01/01/2010> <---------- This is invalid<to>Priya</to><from>Deeali</from><br /></note><br /><note date = "01/01/2010"> --------- Now OK since enclosed in double quotes<br /><note date = '01/01/2010'> --------- This is also OK since enclosed in single quotes<br />
  65. 65. XML Attributes and Elements<br />Consider the following examples.<br />Here gender is an attribute:<br /><person gender = "female"> <firstname>Geeta</firstname> <lastname>Shah</lastname></person><br />Here gender is an element:<br /><person> <gender>female</gender> <firstname>Geeta</firstname> <lastname>Shah</lastname></person><br />
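The two person documents carry the same information, but a program reads them differently: an attribute is fetched from the element itself, while a child element has to be looked up. A minimal sketch with Python's standard xml.etree.ElementTree module follows; the helper name gender_of is an assumption.

```python
import xml.etree.ElementTree as ET

AS_ATTRIBUTE = """<person gender="female">
  <firstname>Geeta</firstname><lastname>Shah</lastname>
</person>"""

AS_ELEMENT = """<person>
  <gender>female</gender>
  <firstname>Geeta</firstname><lastname>Shah</lastname>
</person>"""


def gender_of(person_xml):
    """Read gender from the attribute if present, else from the child element."""
    person = ET.fromstring(person_xml)
    from_attribute = person.get("gender")      # None when the attribute is absent
    if from_attribute is not None:
        return from_attribute
    return person.findtext("gender")           # None when the element is absent


print(gender_of(AS_ATTRIBUTE), gender_of(AS_ELEMENT))
```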
  66. 66. Problems with XML Attributes<br /><ul><li>Attributes cannot contain multiple values whereas elements can
  67. 67. Attributes cannot contain tree structures
  68. 68. Attributes are not easily expandable (for future changes)
  69. 69. Attributes are difficult to read and maintain
  70. 70. Use elements for data.
  71. 71. Use attributes for information that is not relevant to the data.</li></li></ul><li>Illustrating Problematic Attributes<br /><ul><li>Consider the following example:</li></ul> <note day="03" month="02" year="2010" to="Tina" from="Yasmin" heading="Reminder" body="Happy Birthday!"></note><br /><ul><li>Better way:</li></ul> <note> <date> <day>03</day> <month>02</month> <year>2010</year> </date> <to>Tina</to> <from>Yasmin</from> <heading>Reminder</heading> <body>Happy Birthday!</body></note><br />
  72. 72. When to use Attributes?<br /><ul><li>XML Attributes can be used to assign ID references to elements.
  73. 73. Metadata – data about data – should be stored as attributes
  74. 74. The ID can then be used to identify the XML element</li></ul> <messages><note id="501"> <to>Tina</to> <from>Yasmin</from> <heading>Reminder</heading> <body>Happy Birthday!</body></note><note id="502"> <to>Yasmin</to> <from>Tina</from> <heading>Re: Reminder</heading> <body>Thank you, my dear</body></note></messages><br />
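To show how an id attribute identifies an element, the sketch below looks up one note by its id using the limited XPath support in Python's standard xml.etree.ElementTree module. The function name find_note is illustrative only.

```python
import xml.etree.ElementTree as ET

MESSAGES = """<messages>
  <note id="501">
    <to>Tina</to><from>Yasmin</from>
    <heading>Reminder</heading><body>Happy Birthday!</body>
  </note>
  <note id="502">
    <to>Yasmin</to><from>Tina</from>
    <heading>Re: Reminder</heading><body>Thank you, my dear</body>
  </note>
</messages>"""


def find_note(xml_text, note_id):
    """Return the <note> element with the given id, or None if there is none."""
    root = ET.fromstring(xml_text)
    # [@id='...'] selects child elements whose id attribute has that value.
    return root.find("note[@id='{}']".format(note_id))


reply = find_note(MESSAGES, "502")
print(reply.findtext("heading"), "from", reply.findtext("from"))
```

The id is metadata: it says nothing about the message itself, but it lets a program pick out one note among many.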
  75. 75. What does Extensible mean in XML?<br /><ul><li>Consider the following XML example:</li></ul> <note> <to>Anita</to> <from>Veena</from> <body>You have an exam tomorrow</body></note><br /> Suppose we create an application that extracts the <to>, <from> and <body> elements from the XML document to produce the result:<br /> MESSAGE<br /> To: Anita<br /> From: Veena<br /> You have an exam tomorrow<br />
  76. 76. What does Extensible mean in XML?<br />Now suppose the author of the XML document added some extra information to it:<br /><note><date>2008-01-10</date><to>Anita</to><from>Veena</from><heading>Reminder</heading><body>You have an exam tomorrow</body><br /></note><br />
  77. 77. What does Extensible mean in XML?<br />This application will not crash because it will still find the <to>, <from> and <body> elements in the XML document and produce the same output.<br />
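The extensibility argument can be tried out directly. In this sketch, written with Python's standard xml.etree.ElementTree module, a hypothetical render function stands in for the application described in the slides: it looks up only the elements it knows about, so the extra <date> and <heading> elements do not affect it.

```python
import xml.etree.ElementTree as ET

ORIGINAL = """<note>
  <to>Anita</to><from>Veena</from>
  <body>You have an exam tomorrow</body>
</note>"""

EXTENDED = """<note>
  <date>2008-01-10</date>
  <to>Anita</to><from>Veena</from>
  <heading>Reminder</heading>
  <body>You have an exam tomorrow</body>
</note>"""


def render(note_xml):
    """Format a note using only its <to>, <from> and <body> elements."""
    note = ET.fromstring(note_xml)
    return "MESSAGE\nTo: {}\nFrom: {}\n{}".format(
        note.findtext("to"), note.findtext("from"), note.findtext("body")
    )


print(render(ORIGINAL))
print(render(ORIGINAL) == render(EXTENDED))
```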
  78. 78. XML Validation<br />What is a “well formed” XML document?<br />A "Well Formed" XML document is one that has correct XML syntax:<br />XML documents must have a root element<br />XML elements must have a closing tag<br />XML tags are case sensitive<br />XML elements must be properly nested<br />XML attribute values must be quoted<br />
  79. 79. Wellformed Document - Rule 1<br /><ul><li>Elements are case-sensitive.
  80. 80. If you define your language to use lowercase elements, then all instances of those elements must be in lowercase.</li></li></ul><li>Bad Examples…<br /><H1>Sample Heading</H1><br /><h1>Sample Heading</H1><br /><H1>Sample Heading</h1><br />
  81. 81. Rule 2:<br /><ul><li>All elements that contain text or other elements must have both start and ending tags.</li></li></ul><li>Rule 3:<br /><ul><li>All empty elements (commonly known as standalone tags) must have a slash (/) before the end of the tag.</li></li></ul><li>Rule 4:<br /><ul><li>All attribute values must be contained in quotes, either single or double – no exceptions!</li></li></ul><li>Rule 5:<br /><ul><li>Elements may not overlap.
  82. 82. Elements must be nested properly within other elements and can not start before a sub-element and end within the sub-element. </li></li></ul><li>Rule 6:<br /><ul><li>Isolated markup characters (characters essential to creating markup documents) may not appear in parsed content as is.
  83. 83. Isolated markup characters include the following: <, [, ], >, ', " and &. Of these, < and & must always be represented as a character entity (&lt; and &amp;) in parsed content.</li></li></ul><li>Isolated Markup Characters<br />
  84. 84. Both of these examples are invalid because the semicolon that must follow the character entity is missing.<br />Bad Examples…<br /><h1>Jack &amp Jill</h1><br /><equation>5 &lt 2</equation><br />
  85. 85. Good Examples…<br /><h1>Jack &amp; Jill</h1><br /><equation>5 &lt; 2</equation><br />
  86. 86. Rule 7:<br /><ul><li>Element (and attribute) names must start with either a letter (uppercase or lowercase) or an underscore.
  87. 87. Element names may contain letters, numbers, hyphens, periods and underscores inclusively.</li></ul>BAD EXAMPLES<br /><bad*characters><br /><illegal space><br /><99number-start><br />GOOD EXAMPLES<br /><example-one><br /><_example2><br /><Example.Three><br />
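Rule 7 as stated on the slide can be written as a regular expression. The sketch below is a deliberate simplification that covers ASCII names only; the full XML specification also permits many non-ASCII letters and the colon used for namespaces. The names NAME_RE and is_valid_name are assumptions made for this example.

```python
import re

# First character: a letter or underscore.
# Remaining characters: letters, digits, periods, underscores or hyphens.
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_valid_name(name):
    """Check a name against the simplified, ASCII-only form of Rule 7."""
    return NAME_RE.fullmatch(name) is not None


for name in ["bad*characters", "illegal space", "99number-start",
             "example-one", "_example2", "Example.Three"]:
    print(name, is_valid_name(name))
```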
  88. 88. XML Validation<br /><ul><li>A “valid” XML document is a “well formed” document that also conforms to the rules of a Document Type Definition (DTD)</li></ul> <?xml version="1.0" encoding="ISO-8859-1"?><!DOCTYPE note SYSTEM "Note.dtd"><note> <to>Tina</to> <from>Yasmin</from> <heading>Reminder</heading> <body>Happy Birthday!</body></note><br />
  89. 89. XML DTD – Document Type Definition<br /><ul><li>The DOCTYPE declaration in the example above, is a reference to an external DTD file. The content of the file is shown in the paragraph below.
  90. 90. The purpose of a DTD is to define the structure of an XML document. It defines the structure with a list of legal elements:</li></ul> <!DOCTYPE note[ <!ELEMENT note (to,from,heading,body)> <!ELEMENT to (#PCDATA)> <!ELEMENT from (#PCDATA)> <!ELEMENT heading (#PCDATA)> <!ELEMENT body (#PCDATA)>]><br /><ul><li>More on DOCTYPE later</li></li></ul><li>Viewing XML Files - 1<br />
  91. 91. Viewing XML Files - 2<br />The XML document will be displayed with color-coded root and child elements. <br />A plus (+) or minus sign (-) to the left of the elements can be clicked to expand or collapse the element structure. <br />To view the raw XML source (without the + and - signs), select "View Page Source" or "View Source" from the browser menu.<br />
  92. 92. Viewing XML Files - 3<br />Why do XML documents display like this?<br />XML documents do not carry information about how to display the data.<br />Since XML tags are created by the user of the XML document, browsers do not know if a tag like <table> describes an HTML table or a dining table.<br />Without any information about how to display the data, most browsers will just display the XML document as it is.<br />
  93. 93. Using CSS to display XML Files<br />CSS (Cascading Style Sheets) can be used to format an XML document.<br />Consider this XML document:<br />
  94. 94. Displaying Formatted XML document-1<br /><?xml version="1.0" encoding="ISO-8859-1"?><br /><?xml-stylesheet type = "text/css" href = "birthdate.css"?><br /><birthdate><br /> <person> <br /> <name><br /> <first>Anokhi</first><br /> <last>Parikh</last><br /> </name><br /> <date> <br /> <month>01</month><br /> <day>21</day><br /> <year>1992</year><br /> </date> <br /> </person><br /></birthdate><br />
  95. 95. Displaying Formatted XML document-2<br />Stylesheet – birthdate.css<br />birthdate<br />{<br /> background-color: #ffffff;<br /> width: 100%;<br />}<br />person<br />{<br /> margin-left: 0;<br />}<br />name<br />{<br /> color: #FF0000;<br /> font-size: 20pt;<br />}<br />month, day, year<br />{<br />display:block;<br /> color: #000000;<br /> margin-left: 20pt;<br />}<br />
  96. 96. Final Output<br />
  97. 97. XSLT<br /><ul><li>XSL is a language for style sheets
  98. 98. An XSL style sheet is a file that describes how to display an XML document
  99. 99. XSL contains a transformation language for XML documents: XSLT. XSLT is used for generating HTML web pages from XML data.
  100. 100. XSLT - eXtensible Stylesheet Language Transformations
  101. 101. XSLT is used to transform an XML document into an HTML document
  102. 102. XSLT is the recommended style sheet language for XML</li>