An Introduction to XML Based on the  W3C XML Recommendations
Agenda
XML Syntax
  XML vs HTML
  Data Types – Elements, Attributes
  White Space – Optional, Mandatory, & Preserved
  Empty Content
  Valid vs Well Formed
XML Schema
  Used to Validate XML data
  Before XML Schema – DTDs
  Simple Types vs Complex Types
  Restricting data with Regular Expressions
Namespaces
  Avoiding Tag Name Conflicts
XML Tools
  XML Spy and Other Tools
  Corresponding Sample XML, XSD, DTD, XSL, and XHTML files
XML Resources on the web
  http://www.w3schools.com - an excellent site
  http://www.xml.com
  http://www.w3.org
XML vs HTML
As you can see, XML looks similar to HTML.
  <?xml version="1.0"?>
  <root>
    <child>
      <subchild attribute="metadata">Data</subchild>
    </child>
  </root>
XML vs HTML
Unlike HTML:
  XML is case sensitive
  Tags must be properly nested
  All start tags must have a corresponding end tag to close the element
  All XML documents must have a root element
  Attributes must use quotes (can be single or double)
  White space between tags is preserved
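The rules above can be checked with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample strings are illustrative assumptions, not part of the original slides.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = "<root><child>Data</child></root>"
bad_case = "<root><Child>Data</child></root>"             # start/end tag case differs
bad_nesting = "<root><a><b></a></b></root>"               # tags overlap
bad_quotes = "<root><child attribute=metadata/></root>"   # unquoted attribute value
two_roots = "<root/><root/>"                              # more than one root element

for sample in (good, bad_case, bad_nesting, bad_quotes, two_roots):
    print(is_well_formed(sample), sample)
```

Only the first sample is accepted; each of the others breaks one of the rules listed on the slide.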
XML vs HTML
Special characters are handled the same way as in HTML, by replacing them with entity references. For example:
  <   &lt;
  >   &gt;
  '   &apos;
  "   &quot;
  &   &amp;
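As a small illustration of the five entity references, the sketch below uses escape and unescape from Python's standard xml.sax.saxutils module. Those functions handle &, < and > by default; the extra mapping for the two quote characters and the sample string are assumptions made for this example.

```python
from xml.sax.saxutils import escape, unescape

# escape() covers &, < and > by default; quotes are supplied explicitly.
extra = {"'": "&apos;", '"': "&quot;"}

raw = "5 < 6 & \"six\" > 'five'"
encoded = escape(raw, extra)
decoded = unescape(encoded, {ref: char for char, ref in extra.items()})

print(encoded)
print(decoded == raw)
```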
Elements
XML elements are extensible and have parent/child relationships.
XML element names must follow these rules:
  Names can contain letters, numbers, underscores, periods, colons, and hyphens (the last three are not normally used in element names; colons are reserved for namespace prefixes)
  Names must start with a letter or an underscore, not with a number or other punctuation character
  Names must not start with the letters xml (or XML or Xml), which are reserved
  Names cannot contain spaces
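The naming rules can be approximated with a regular expression. This is a simplified sketch that assumes ASCII-only names (the XML Recommendation also allows many non-ASCII letters) and accepts a leading underscore, which the Recommendation permits; the function name is_valid_name is hypothetical.

```python
import re

# ASCII-only approximation: first character is a letter or underscore,
# later characters may also be digits, periods, colons, or hyphens.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.:-]*")

def is_valid_name(name: str) -> bool:
    """Check a candidate element name against the slide's naming rules."""
    if name.lower().startswith("xml"):
        return False  # names beginning with xml (any case) are reserved
    return NAME_PATTERN.fullmatch(name) is not None

for candidate in ("subchild", "first-name", "_note", "1child", "xmlData", "my child"):
    print(candidate, is_valid_name(candidate))
```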
Attributes
Attributes are normally used to store metadata (data about data), while the real data is stored in elements, between the start and end tags.
Attribute values must be quoted; single or double quotes can be used.
White Space
White space includes: carriage returns, line feeds, spaces, horizontal tabs
Optional White Space
  White space is optional in XML files
Mandatory White Space
  White space must occur between an element name and its attributes
Preserved White Space
  Between start/end tag pairs
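Optional White Space
Well formed (extra white space before the closing > of a start tag):
  <?xml version="1.0"?>
  <root  >
    <child  >
      <subchild>Data</subchild>
    </child>
  </root>
Well formed (no white space between tags):
  <?xml version="1.0"?><root><child><subchild>Data</subchild></child></root>
Note that white space is not allowed between the opening < and the element name.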
Mandatory White Space
Well formed:
  <?xml version="1.0"?>
  <root>
    <child attribute="metadata"/>
  </root>
Not well formed:
  <?xml version="1.0"?>
  <root>
    <childattribute="metadata"/>
  </root>
There must be white space between the element name (child) and the attribute name (attribute).
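The difference between optional and mandatory white space can be seen by feeding a few variants to a parser. This sketch uses Python's standard xml.etree.ElementTree; the helper name parses and the sample strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def parses(text: str) -> bool:
    """Return True if the text is accepted as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

compact = "<root><child><subchild>Data</subchild></child></root>"
padded = "<root  >\n  <child  >\n    <subchild>Data</subchild>\n  </child>\n</root>"
with_space = '<root><child attribute="metadata"/></root>'
without_space = '<root><childattribute="metadata"/></root>'

for sample in (compact, padded, with_space, without_space):
    print(parses(sample))
```

The first three variants parse; the last one fails because nothing separates the element name from the attribute name.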
Preserved White Space
Well formed:
  <?xml version="1.0"?>
  <root>
    <child>
      <subchild>White space between start/end tag pairs will be preserved</subchild>
    </child>
  </root>
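To make the preservation visible, the sketch below parses a small document with Python's standard xml.etree.ElementTree and inspects the text the parser hands back. The sample document is an assumption for this example; ElementTree exposes the white space that follows an element as that element's tail.

```python
import xml.etree.ElementTree as ET

doc = "<root>\n  <child>  padded   text  </child>\n</root>"
root = ET.fromstring(doc)
child = root.find("child")

# repr() makes leading, trailing, and repeated spaces visible.
print(repr(child.text))   # content of <child>, spaces intact
print(repr(root.text))    # white space between <root> and <child>
print(repr(child.tail))   # white space between </child> and </root>
```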
Empty Content
If no data is "held" between a start/end tag pair, two formats may be used:
  <tag></tag>
  <tag/>
The second format is called an empty-element tag (also known as an empty or null tag) and is commonly used when only an attribute is needed:
  <tag attribute="data"/>
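A parser treats the two formats as the same element. The following sketch, using Python's standard xml.etree.ElementTree, compares them; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring("<tag></tag>")
short_form = ET.fromstring("<tag/>")
with_attr = ET.fromstring('<tag attribute="data"/>')

# Neither form carries any text content.
print(long_form.text, short_form.text)
# Serializing both elements produces identical output.
print(ET.tostring(long_form) == ET.tostring(short_form))
# The attribute is still available on the empty element.
print(with_attr.attrib)
```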
Valid vs Well Formed
XML data is well formed if it follows the syntax rules of the W3C XML Recommendation, Version 1.0. This includes:
  The start/end tags matching up
  White space used properly
XML data is valid if it is well formed and also conforms to a definition of its structure. XML data is defined and validated most commonly by:
  XML Schemas
  DTDs (Document Type Definitions)
NOTE: XML Spy does both checks
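XML Schema
Used to Define and Validate XML
In order for the XML file to be validated by a schema, the schema's location is referenced as an attribute of the root element:
  <FirmOrder … schemaLocation="http://www.telcordia.com/SGG/FO C:\13.0\Documentation\xsd\firmOrder.xsd"/>
The attribute value is a pair: the namespace name, then the location of the schema file for that namespace. The schemaLocation attribute belongs to the XML Schema instance namespace (http://www.w3.org/2001/XMLSchema-instance), so it is normally written with a prefix bound to that namespace, such as xsi:schemaLocation.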
XML Schema
Before XML Schema, most XML documents were validated against a DTD. The same structure is shown below as a DTD and as an XML Schema (the original slide also showed it as a Mercator Type Tree, which is not reproduced here).
DTD:
  <!ELEMENT File (Record)*>
  <!ELEMENT Record (Fill)>
  <!ELEMENT Fill (#PCDATA)>
XML Schema:
  <?xml version="1.0"?>
  <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
              targetNamespace="http://www.sample.org/xml"
              xmlns="http://www.sample.org/xml"
              elementFormDefault="qualified">
    <xsd:element name="File">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element ref="Record" minOccurs="0" maxOccurs="unbounded"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
    <xsd:element name="Record">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element ref="Fill" minOccurs="1" maxOccurs="1"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
    <xsd:element name="Fill" type="xsd:string"/>
  </xsd:schema>
XML Schema
Element Data Types
XML Schema's Simple Types
  Similar to Items in Mercator
  44 simple types are built into XML Schema
XML Schema's Complex Types
  Similar to Groups in Mercator
  Defined by the schema author with xsd:complexType; xsd:anyType is the only built-in complex type
XML Schema's Attributes
  Similar to the Properties of Items and Groups in Mercator
XML Schema
Simple Types can be restricted using Regular Expressions:
  <xsd:simpleType name="alphaString">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="([A-Z]|[a-z]|[ ])*"/>
    </xsd:restriction>
  </xsd:simpleType>
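The effect of the alphaString pattern can be approximated outside a schema validator. This sketch uses Python's re module as a stand-in; it relies on the fact that an XML Schema pattern must match the whole value, which corresponds to re.fullmatch. The function name is_alpha_string and the sample values are assumptions.

```python
import re

# Same expression as the xsd:pattern value: letters and spaces only.
ALPHA_STRING = re.compile(r"([A-Z]|[a-z]|[ ])*")

def is_alpha_string(value: str) -> bool:
    """Return True if the whole value consists of letters and spaces."""
    return ALPHA_STRING.fullmatch(value) is not None

for value in ("Tea Table", "", "Table 80", "coffee-table"):
    print(repr(value), is_alpha_string(value))
```

Because the pattern ends in *, the empty string is accepted; a schema would need an additional facet such as xsd:minLength to reject it.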
Namespaces
XML Namespaces provide a method to avoid element name conflicts.
Because element names are not predefined as they are in HTML, a name conflict can occur when two documents that use the same name for two different elements are combined.
Namespaces, cont.
If the following two XML documents were combined, there would be an element name conflict, because both documents contain a <table> element with different content and definition.
  <table>
    <name>Tea Table</name>
    <width>80</width>
    <length>120</length>
  </table>

  <table>
    <tr>
      <td>Apples</td>
      <td>Bananas</td>
    </tr>
  </table>
Solving Name Conflicts using a Prefix
  <h:table>
    <h:tr>
      <h:td>Apples</h:td>
      <h:td>Bananas</h:td>
    </h:tr>
  </h:table>

  <f:table>
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
Using Namespaces
  <h:table xmlns:h="http://www.w3.org/TR/html4/">
    <h:tr>
      <h:td>Apples</h:td>
      <h:td>Bananas</h:td>
    </h:tr>
  </h:table>

  <f:table xmlns:f="http://www.w3schools.com/furniture">
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
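Once the prefixes are bound to namespace names, a parser can tell the two table elements apart. The sketch below uses Python's standard xml.etree.ElementTree; the wrapping <root> element is an assumption added so that both tables fit in one document.

```python
import xml.etree.ElementTree as ET

doc = """<root>
  <h:table xmlns:h="http://www.w3.org/TR/html4/">
    <h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr>
  </h:table>
  <f:table xmlns:f="http://www.w3schools.com/furniture">
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
</root>"""

root = ET.fromstring(doc)

# Prefixes used in the search paths; only the namespace names have to match.
ns = {"h": "http://www.w3.org/TR/html4/",
      "f": "http://www.w3schools.com/furniture"}

cells = [td.text for td in root.findall("h:table/h:tr/h:td", ns)]
furniture_name = root.find("f:table/f:name", ns).text
tags = [element.tag for element in root]

print(cells)
print(furniture_name)
print(tags)
```

ElementTree reports each tag as {namespace name}local name, so the two table elements no longer share a name.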
Namespaces
URIs are used as the namespace name
  The most commonly used kind of URI is a URL
  URLs are built on registered domain names, so each company can create names that no one else uses
  The URL does NOT need to point to an actual document
  URLs are used to create uniqueness, not to validate your tags
  Many companies publish "help" documentation about their namespace, tags, and/or XML Schemas at the namespace URL
XML Samples
The next five slides show different types of files that correspond to each other:
  XML Data Document
  XML Schema
  DTD (these are not written in XML)
  XSL – style sheet
  XHTML generated by the style sheet
XML Data Sample
  <?xml version="1.0"?>
  <?xml-stylesheet type="text/xsl" href="xmlxsl.xsl"?>
  <root>
    <child>
      <name>Optional name tag used in this child tag</name>
      <description>First description start/end tag pair in child tag</description>
    </child>
    <child>
      <name>Optional name tag used in this child tag</name>
      <description>First description start/end tag pair in child tag</description>
      <description>Second description start/end tag pair in child tag</description>
    </child>
  </root>
XML Schema Sample
  <?xml version="1.0"?>
  <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
              targetNamespace="http://www.sample.org/xml"
              xmlns="http://www.sample.org/xml"
              elementFormDefault="qualified">
    <xsd:element name="root">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element ref="child" minOccurs="0" maxOccurs="unbounded"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
    <xsd:element name="child">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element ref="name" minOccurs="0"/>
          <xsd:element ref="description" maxOccurs="unbounded"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
    <xsd:element name="name" type="xsd:string"/>
    <xsd:element name="description" type="xsd:string"/>
  </xsd:schema>
XML DTD Sample
  <!ELEMENT root (child)*>
  <!ELEMENT child (name?, description+)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT description (#PCDATA)>
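The content model (name?, description+) reads as "an optional name followed by one or more descriptions". The sketch below is not a DTD validator; it only illustrates that one rule by matching the sequence of sub-element names against an equivalent regular expression. The function name child_matches_model and the sample elements are assumptions.

```python
import re
import xml.etree.ElementTree as ET

# (name?, description+) rewritten over a comma-terminated list of tag names.
CHILD_MODEL = re.compile(r"(name,)?(description,)+")

def child_matches_model(child: ET.Element) -> bool:
    """Check the order and count of a child element's sub-elements."""
    sequence = "".join(f"{sub.tag}," for sub in child)
    return CHILD_MODEL.fullmatch(sequence) is not None

ok = ET.fromstring("<child><name>n</name><description>d</description></child>")
no_name = ET.fromstring("<child><description>a</description><description>b</description></child>")
no_description = ET.fromstring("<child><name>n</name></child>")
wrong_order = ET.fromstring("<child><description>d</description><name>n</name></child>")

for sample in (ok, no_name, no_description, wrong_order):
    print(child_matches_model(sample))
```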
Style Sheet (XSL) Sample
  <?xml version="1.0" encoding="ISO-8859-1"?>
  <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
      <html>
        <body>
          <h2>XHTML Sample</h2>
          <table border="1">
            <tr bgcolor="gray">
              <th>Name</th>
              <th>Description</th>
            </tr>
            <xsl:for-each select="root/child">
              <tr>
                <td><xsl:value-of select="name"/></td>
                <td><xsl:value-of select="description"/></td>
              </tr>
            </xsl:for-each>
          </table>
        </body>
      </html>
    </xsl:template>
  </xsl:stylesheet>
XHTML Generated
  <html>
    <body>
      <h2>XHTML Sample</h2>
      <table border="1">
        <tr bgcolor="gray">
          <th>Name</th>
          <th>Description</th>
        </tr>
        <tr>
          <td>Optional name tag used in this child tag</td>
          <td>First description start/end tag pair in child tag</td>
        </tr>
        <tr>
          <td>Optional name tag used in this child tag</td>
          <td>First description start/end tag pair in child tag</td>
        </tr>
      </table>
    </body>
  </html>
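The second row of the generated table shows only the first description because xsl:value-of outputs the first node its select expression matches. Python's standard library has no XSLT processor, so the sketch below does not run the style sheet; it only mimics the for-each and value-of steps with xml.etree.ElementTree to show where the second description is dropped. The function name table_rows is hypothetical.

```python
import xml.etree.ElementTree as ET

data = """<root>
  <child>
    <name>Optional name tag used in this child tag</name>
    <description>First description start/end tag pair in child tag</description>
  </child>
  <child>
    <name>Optional name tag used in this child tag</name>
    <description>First description start/end tag pair in child tag</description>
    <description>Second description start/end tag pair in child tag</description>
  </child>
</root>"""

def table_rows(xml_text: str) -> list:
    """Return one (name, description) pair per child element."""
    root = ET.fromstring(xml_text)
    rows = []
    for child in root.findall("child"):          # like xsl:for-each select="root/child"
        rows.append((
            child.findtext("name", default=""),         # first <name> only
            child.findtext("description", default=""),  # first <description> only
        ))
    return rows

rows = table_rows(data)
for row in rows:
    print(row)
```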
XML Spy
Accomplishes several XML tasks, including:
  Editing a variety of XML data graphically
  Allowing multiple views, including text, browser, grid, and structure (schema design)
  Creating test data from XML Schemas
  Generating XML Schemas from XML files
  Validating data
  Checking data for well-formedness
Other Tools
Internet Explorer
  Displays XML data using a default style sheet
  Checks XML for well-formedness and displays an error message for troubleshooting
UltraEdit
XML Resources on the web
There are hundreds of XML resources on the web.
  http://www.w3schools.com (an excellent site)
  http://www.xml.com
  http://www.w3.org
The easiest way to find information about a specific XML topic or syntax is to type it into google.com
Contact Us Barry DeBruin debruinconsulting.com 919-434-5399
