Published on

  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide


  1. 1.  It is Data Description Language. It is a flexible way to create common information formats. It also provide to share both the format and the data on the World Wide Web, intranets, and elsewhere. Computer users might agree on a standard or common way to describe a piece of information using a platform independent format with XML. Such a standard way of describing data would enable a user, to exchange the data with description between any type of platforms.
  2. 2.  XML → Extensible Markup Language Just a text file, with tags, attributes, free text... looks much like HTML Used to create special-purpose markup languages Facilitates sharing of structured text and information across the Internet. XML Structure facilitates parsing of data. XML tags are not pre-defined, you must define your own tags. Two Apps still have to agree on what the meaning of the "descriptive tags“.
  3. 3. HTML MMLSGML CML XML XML for Developers - Version 1a 4
  4. 4. HTML XML<H1>Cars for Sale</H1> <for_sale><h3>Audi 80</h3> <heading>Cars for Sale</heading>1800 cc<BR> <make>Audi 80</make>Blue<BR> <engine>1800 cc</engine>Manual<BR> <color>Blue</color>1988<BR> <transmission>Manual</transmission>$1250 <year>1988</year><P> <price>$1250</price><H3>Toyota Corolla</h3> <make>Toyota Corolla</make>1250 cc<BR> <engine>1250</engine>Red<BR> <color>Red</color>Automatic<BR> <transmission>Automatic</transmission>1984<BR> <year>1984</year>Red<BR> <price>$940</price>$940<BR> </for_sale> Example1.html Example1.xml XML for Developers - Version 1a 5
  5. 5. HTML XML in IE5 browserCars for SaleAudi 801800 ccBlueManual1988$1250Toyota Corolla1250 ccRedAutomatic1984Red$940 6
  6. 6. XML HTML It is free & extensible  Derived language from language. SGML. Tags are user-defined.  Tags are pre-defined It is about describing  It is about displaying information Extensible set of tags information. Content orientated  Fixed set of tags Standard Data  Presentation oriented infrastructure  No data validation Allows multiple output capabilities forms  Single presentation
  7. 7.  Many number of extensible languages are defined from XML. ◦ Wireless Markup Language (WML). ◦ Chemical Markup Language (CML) ◦ Bio-informatic Sequence Markup Language (BSML). ◦ Mathematical Markup Language (MathML). ◦ Open Office Markup Language ( OOML ) Directly usable over the internet, means any platform and any protocol can understand XML. Plain text documents and easier to write & transport Programs. Can be used with SGML without any conflict along with other web technologies. Clearly understandable by human. Accurate validation can be possible with the help of DTD and schema. We can define structure of data and along with description. Semi structured Data Bases are also defined using XML.
  8. 8.  Web publishing : XML allows the customer to customize web pages. With XML, you store the data once and then render that content for different viewers. Web searching and automating Web tasks: XML defines the type of information contained in a document, making it easier to return useful results when searching the Web. General applications: XML provides a standard method to use, store, transmit, and display data for all kinds of applications and devices. e-business applications: XML allows to make electronic data interchange (EDI) for both business-to-business transactions, and business-to-consumer transactions. Metadata applications: XML makes it easier to express metadata in a portable, reusable format. Pervasive computing: XML provides portable and structured information types for display on pervasive (wireless) computing devices such as personal digital assistants (PDAs), cellular phones.
  9. 9.  DTD (Document Type Definition) and XML Schemas are used to define legal XML tags and their attributes for particular purposes CSS (Cascading Style Sheets) describe how to display HTML or XML in a browser XSLT (eXtensible Stylesheet Language Transformations) and XPath are used to translate from one form of XML to another DOM (Document Object Model), SAX (Simple API for XML, and JAXP (Java API for XML Processing) are all APIs for XML parsing 10
  10. 10.  XML documents use a self-describing and simplesyntax.<?xml version=“1.0” encoding=“ISO-8859-1”?><note> <to> Ramesh </to> <from> Kiran</from> <heading> Reminder </heading> <body> Don’t forget me this weekend </body></note>
  11. 11.  XML Syntax consists of ◦ XML Declaration ◦ XML Elements ◦ XML Attributes The first line of an XML document should always consist of an XML declaration defining the version of XML. An XML document may start with one or more processing instructions (PIs) or directives: <?xml version="1.0"?> <?xml-stylesheet type="text/css" href="ss.css"?> Following the directives, there must be exactly one root element containing all the rest of the XML: <weatherReport> ... </weatherReport>
  12. 12.  XML elements have relationships Elements can have different content types Element Naming Rules: 1) Names contain letters, numbers & other characters. 2) Names must not start with a number or punctuation marks. 3) Names must not start with the letters xml. 4) Names cannot contain spaces. Names (as used for tags and attributes) must begin with a letter or underscore, and can consist of: ◦ Letters, both Roman (English) and foreign ◦ Digits, both Roman and foreign . (dot) - (hyphen) _ (underscore)
  13. 13. <person gender="male"> <firstname>Ravi</firstname> <lastname>Kumar</lastname></person>
  14. 14.  Attributes and elements are somewhat interchangeable Example using just elements: <name> <first>David</first> <last>Matuszek</last> </name> Example using attributes: <name first="David" last="Matuszek"></name> You will find that elements are easier to use in your programs--this is a good reason to prefer them Attributes often contain metadata, such as unique IDs Generally speaking, browsers display only elements (values enclosed by tags), not tags and attributes 15
  15. 15.  While elements can contain multiple values attributes cannot Attributes are not expandable Elements can describe structure but not Attributes Attributes are more difficult to manipulate by program code than elements Attribute values are difficult to validate against a DTD
  16. 16. <novel> <foreword> <paragraph> This is the great American novel.</paragraph> </foreword> <chapter number="1"> <paragraph>It was a dark and stormy night.</paragraph> <paragraph>Suddenly, a shot rang out!</paragraph> </chapter></novel> novel foreword chapter number="1" paragraph paragraph paragraph This is the great It was a dark Suddenly, a shot American novel. and stormy night. rang out! 17
  17. 17.  Every element must have both a start tag and an end tag, e.g. <name> ... </name> ◦ But empty elements can be abbreviated: <break />. ◦ XML tags are case sensitive ◦ XML tags may not begin with the letters xml, in any combination of cases Elements must be properly nested, e.g. not <b><i>bold and italic</b></i> Every XML document must have one and only one root element. The values of attributes must be enclosed in single or double quotes, e.g. <time unit="days"> Character data cannot contain < or & 18
  18. 18.  Start with <?xml version="1.0"?> XML is case sensitive You must have exactly one root element that encloses all the rest of the XML Every element must have a closing tag Elements must be properly nested Attribute values must be enclosed in double or single quotation marks There are only five pre-declared entities 19
  19. 19.  Five special characters must be written as entities: &amp; for & (almost always necessary) &lt; for < (almost always necessary) &gt; for > (not usually necessary) &quot; for " (necessary inside double quotes) &apos; for (necessary inside single quotes) These entities can be used even in places where they are not absolutely required These are the only predefined entities in XML 20
  20. 20.  The XML declaration looks like this: <?xml version="1.0" encoding="UTF-8" standalone="yes"?> ◦ The XML declaration is not required by browsers, but is required by most XML processors (so include it!) ◦ If present, the XML declaration must be first--not even whitespace should precede it ◦ Note that the brackets are <? and ?> ◦ version="1.0" is required (this is the only version so far) ◦ encoding can be "UTF-8" (ASCII) or "UTF-16" (Unicode), or something else, or it can be omitted ◦ standalone tells whether there is a separate DTD 21
  21. 21.  PIs (Processing Instructions) may occur anywhere in the XML document (but usually first) A PI is a command to the program processing the XML document to handle it in a certain way XML documents are typically processed by more than one program Programs that do not recognize a given PI should just ignore it General format of a PI: <?target instructions?> Example: <?xml-stylesheet type="text/css" href="mySheet.css"?> 22
  22. 22.  <!-- This is a comment in both HTML and XML --> Comments can be put anywhere in an XML document Comments are useful for: ◦ Explaining the structure of an XML document ◦ Commenting out parts of the XML during development and testing Comments are not elements and do not have an end tag The blanks after <!-- and before --> are optional The character sequence -- cannot occur in the comment The closing bracket must be --> Comments are not displayed by browsers, but can be seen by anyone who looks at the source code 23
  23. 23.  By default, all text inside an XML document is parsed. You can force text to be treated as unparsed character data by enclosing it in <![CDATA[ ... ]]> Any characters, even & and <, can occur inside a CDATA Whitespace inside a CDATA is (usually) preserved The only real restriction is that the character sequence ]]> cannot occur inside a CDATA CDATA is useful when your text has a lot of illegal characters (for example, if your XML document contains some HTML text) 24
  24. 24. <note> <to> Ramesh </to> <from> Kiran</from> <heading> Reminder </heading> <body> Don’t forget me this weekend </body></note><remainder> <heading> Reminder </heading> <to> Ramesh </to> <from> Kiran</from> <message> Don’t forget me this weekend </message></ remainder >
  25. 25.  Basically DTD is used to specify the set of rules for structuring data in xml file. It is used to define the building blocks of XML document. Using DTD we can specify the various elements types, attributes and their relationship. DTD constraints structure of XML data ◦ What elements can occur ◦ What attributes can/must an element have. ◦ What subelements can/must occur inside each element, and how many times. DTD syntax ◦ <!ELEMENT element (subelements-specification) > ◦ <!ATTLIST element (attributes) >
  26. 26.  A DTD adds syntactical requirements in addition to the well-formed requirement It helps in eliminating errors when creating or editing XML documents It clarifies the intended semantics It simplifies the processing of XML documents 27
  27. 27.  <?xml version="1.0"?> <!DOCTYPE note [ <!ELEMENT note (to,from,heading,body)> <!ELEMENT to (#PCDATA)> <!ELEMENT from (#PCDATA)> <!ELEMENT heading (#PCDATA)> <!ELEMENT body (#PCDATA)> ]> <note> <to>Raj</to> <from>AEC</from> <heading>Invitation</heading> <body>Welcome to Aditya!</body> </note>
  28. 28.  !DOCTYPE note defines that the root element of this document is note !ELEMENT note defines that the note element contains four child elements: "to,from,heading,body" !ELEMENT to defines the to element to be of type "#PCDATA" !ELEMENT from defines the from element to be of type "#PCDATA" !ELEMENT heading defines the heading element to be of type "#PCDATA" !ELEMENT body defines the body element to be of type "#PCDATA"
  29. 29. <person> <name> K.Vijay Kumar </name> Exactly one name <greet> Happy new year </greet> At most one greeting <addr>19-12, main road </addr> As many address <addr> Kakinada </addr> lines as needed <tel> 943786254 </tel> Mixed telephones <fax> 227862544 </fax> and faxes <tel> 227862551 </tel> <email> vkumar123@gmail.com </email> As many as needed</person> 30
  30. 30.  name to specify a name element greet? to specify an optional (0 or 1) greet elements name, greet? to specify a name followed by an optional greet addr* to specify 0 or more address lines tel | fax a tel or a fax element (tel | fax)* 0 or more repeats of tel or fax email* 0 or more email elements 31
  31. 31.  So the whole structure of a person entry is specified by name, greet?, addr*, (tel | fax)*, email* This is known as a regular expression 32
  32. 32. <?xml version="1.0" encoding="UTF-8"?> The name of<!DOCTYPE addressbook [ the DTD is <!ELEMENT addressbook (person*)> addressbook <!ELEMENT person (name, greet?, address*, (fax | tel)*, email*)> <!ELEMENT name (#PCDATA)> <!ELEMENT greet (#PCDATA)> The syntax <!ELEMENT address (#PCDATA)> of a DTD is <!ELEMENT tel (#PCDATA)> not XML <!ELEMENT fax (#PCDATA)> syntax <!ELEMENT email (#PCDATA)>]> “Internal” means that the DTD and the XML Document are in the same file 33
  33. 33.  Suffixes: ? optional foreword? + one or more chapter+ * zero or more appendix* Separators , both, in order foreword?, chapter+ | or section|chapter Grouping () grouping (section|chapter)+
  34. 34.  The syntax is <!ELEMENT name category> ◦ The name is the element name used in start and end tags ◦ The category may be EMPTY:  In the DTD: <!ELEMENT br EMPTY>  In the XML: <br></br> or just <br /> ◦ In the XML, an empty element may not have any content between the start tag and the end tag ◦ An empty element may (and usually does) have attributes
  35. 35.  The syntax is <!ELEMENT name category> ◦ The category may be ANY  This indicates that any content--character data, elements, even undeclared elements--may be used  Since the whole point of using a DTD is to define the structure of a document, ANY should be avoided wherever possible ◦ The category may be (#PCDATA), indicating that only character data may be used  In the DTD: <!ELEMENT paragraph (#PCDATA)>  In the XML: <paragraph>A shot rang out!</paragraph>  The parentheses are required!  Note: In (#PCDATA), whitespace is kept exactly as entered  Elements may not be used within parsed character data  Entities are character data, and may be used
  36. 36.  A category may describe one or more children: <!ELEMENT novel (foreword, chapter+)> ◦ Parentheses are required, even if there is only one child ◦ A space must precede the opening parenthesis ◦ Commas (,) between elements mean that all children must appear, and must be in the order specified ◦ “|” separators means any one child may be used ◦ All child elements must themselves be declared ◦ Children may have children ◦ Parentheses can be used for grouping: <!ELEMENT novel (foreword, (chapter+|section+))>
  37. 37.  # #PCDATA describes elements with only character data #PCDATA can be used in an “or” grouping: ◦ <!ELEMENT note (#PCDATA|message)*> ◦ This is called mixed content ◦ Certain (rather severe) restrictions apply:  #PCDATA must be first  The separators must be “|”  The group must be starred (meaning zero or more)
  38. 38.  The format of an attribute is: <!ATTLIST element-name name type requirement name type requirement> where the name-type-requirement may be repeated as many times as desired ◦ Note that only spaces separate the parts, so careful counting is essential ◦ The element-name tells which element may have these attributes ◦ The name is the name of the attribute ◦ Each element has a type, such as CDATA (character data) ◦ Each element may be required, optional, or “fixed” ◦ In the XML, attributes may occur in any order
  39. 39.  There are ten attribute types These are the most important ones: ◦ CDATA The value is character data ◦ (man|woman|child) The value is one of enumerated values ◦ ID The value is a unique identifier  ID values must be legal XML names and must be unique within the document ◦ NMTOKEN The value is a legal XML name  This is sometimes used to disallow whitespace in the name  It also disallows numbers, since an XML name cannot begin with a digit
  40. 40.  IDREF The ID of another element IDREFS A list of other IDs NMTOKENS A list of valid XML names ENTITY An entity ENTITIES A list of entities NOTATION A notation xml: A predefined XML value
  41. 41.  Recall that an attribute has the form <!ATTLIST element-name name type requirement> The requirement is one of: ◦ A default value, enclosed in quotes  Example: <!ATTLIST degree CDATA "PhD"> ◦ #REQUIRED  The attribute must be present ◦ #IMPLIED  The attribute is optional ◦ #FIXED "value"  The attribute always has the given value  If specified in the XML, the same value must be used
  42. 42. Invoice Element Declaration: <?xml version=“1.0” ?> <!ELEMENT employee (#PCDATA)><! ElementName AttributeName Type Default ><!ATTLIST employee type (FullTime | PartTime) “FullTime” > Usage in XML file: <?xml version=“1.0” ?> <employee type=“PartTime”/>
  43. 43.  CDATA ◦ CDATA attributes are strings , any text is allowed ID ◦ The values of an ID attribute must be a name. All id the ID attributes used in a document must be unique. IDs uniquely identify individual elements in a document.Elements can only have a single ID attrinute IDREF or IDREFS ◦ An IDREF attributes value must be the value of a single ID attribute on some element in the document. The value of an IDREFs attribute may contain multiple IDREF values seperated by white space. ENTITY or ENTITIES ◦ An ENTITY attribute’s must be the name of a single ENTITY. The value of an ENTITIES attribute may contain multiple entity names separated by white space. NMTOKEN or NMTOKENS ◦ Name token attributes are a restricted form of string attribute, but there are no other restrictions on the word. List of Names Enumerated ◦ You can specify that the value of an attribute must be taken from a specific list of names. This frequently called an enumerated type because each of the possible values must be explicitely enumerated in the declaration
  44. 44.  #REQUIRED ◦ The attribute must have an explicitly specified value for every occurrence of the element in the document #IMPLIED ◦ The attribute value is not required and no default value is provided. If a value is not specified the XMP processor must proceed without one. “value” ◦ An attrubute can be given any legal value as a default. The attribute value is not required on each element of the document, and if it is not present it will appear to be the specified default #FIXED “value” ◦ An attribute declaration may specify that an attribute has a fixed value. In this case, the attribute is not required, but if it occurrs, it must have the specified value. If it is not present, it will appear to be the specified defualt
  45. 45.  CDATA  ID ◦ Character data ◦ Unique ID NMTOKEN  IDREF ◦ Single token ◦ Match to ID NMTOKENS  IDREFS ◦ Multiple tokens ◦ Match to multiple IDs ENTITY  NOTATION ◦ Attribute is entity ref ◦ Describe non-XML data ENTITIES  Name group ◦ Multiple entity refs ◦ Restricted list
  46. 46.  CDATA ◦ name = "Tom Jones"  ID NMTOKEN ◦ ID = "P09567" ◦ color="red"  IDREF NMTOKENS ◦ IDREF="P09567" ◦ values=“A12 A15 A34"  IDREFS ENTITY ◦ IDREFS="A01 A02" ◦ photo="MyPic"  NOTATION ENTITIES ◦ FORMAT="TeX" ◦ photos="pic1 pic2"  Name group ◦ coord="X"
  47. 47.  Can specify a default attribute value for when its missing from XML document, or state that value must be entered ◦ #REQUIRED Must be specified ◦ #IMPLIED May be specifed ◦ "default" Default value if unspecified ◦ #FIXED Only one value allowed<ATTLIST tag name type default><!ATTLIST seqlist sepchar NMTOKEN #REQUIRED type (alpha|num) "num"
  48. 48.  There are exactly five predefined entities: &lt;, &gt;, &amp;, &quot;, and &apos; Additional entities can be defined in the DTD: <!ENTITY copyright "Copyright Dr. Dave"> Entities can be defined in another document: <!ENTITY copyright SYSTEM "MyURI"> Example of use in the XML: This document is &copyright; 2002.• Entities are a way to include fixed text (sometimes called “boilerplate”)• Entities should not be confused with character references, which are numerical values between & and # • Example: &233#; or &xE9#; to indicate the character é
  49. 49.  In XML, element names are defined by the developer. This often results in a conflict when trying to mix XML documents from different XML applications.This XML carries HTML table information: This XML carries information about a table<table> (a piece of furniture): <tr> <table> <td>Apples</td> <name>Wooden Table</name> <td>Bananas</td> <width>80</width> </tr> <length>120</length></table> </table> •If these both XML tags were added together, there would be a name conflict. • Both contain a <table> element, but the elements have different content and meaning. •An XML parser will not know how to handle these differences.
  50. 50. Name conflicts in XML can easily be avoided using a name prefix.This XML carries information about an HTML table, and a piece of furniture:<h:table> In the example above, there <h:tr> <h:td>Apples</h:td> will be no conflict because the <h:td>Bananas</h:td> two <table> elements have </h:tr> different names.</h:table><f:table> <f:name>Wooden Table</f:name> <f:width>80</f:width> <f:length>120</f:length></f:table>
  51. 51.  When using prefixes in XML, a so-called namespace for the prefix must be defined. The namespace is defined by the xmlns attribute in the start tag of an element. The namespace declaration has the following syntax. xmlns:prefix="URI".<root xmlns:h="http://www.w3.org/TR/html4/" xmlns:f=“http://www.w3schools.com/furniture”><h:table> <h:tr> <h:td>Apples</h:td> <h:td>Bananas</h:td> </h:tr></h:table><f:table> <f:name>African Coffee Table</f:name> <f:width>80</f:width> <f:length>120</f:length></f:table></root>
  52. 52.  Recall that DTDs are used to define the tags that can be used in an XML document An XML document may reference more than one DTD Namespaces are a way to specify which DTD defines a given tag XML, like Java, uses qualified names ◦ This helps to avoid collisions between names ◦ Java: myObject.myVariable ◦ XML: myDTD:myTag ◦ Note that XML uses a colon (:) rather than a dot (.) 53
  53. 53.  A namespace is defined as a unique string ◦ To guarantee uniqueness, typically a URI (Uniform Resource Indicator) is used, because the author “owns” the domain ◦ It doesnt have to be a “real” URI; it just has to be a unique string ◦ Example: http://www.matuszek.org/ns There are two ways to use namespaces: ◦ Declare a default namespace ◦ Associate a prefix with a namespace, then use the prefix in the XML to refer to the namespace 54
  54. 54.  In any start tag you can use the reserved attribute name xmlns: <book xmlns="http://www.matuszek.org/ns"> ◦ This namespace will be used as the default for all elements up to the corresponding end tag ◦ You can override it with a specific prefix You can use almost this same form to declare a prefix: <book xmlns:dave="http://www.matuszek.org/ns"> ◦ Use this prefix on every tag and attribute you want to use from this namespace, including end tags--it is not a default prefix <dave:chapter dave:number="1">To Begin</dave:chapter> You can use the prefix in the start tag in which it is defined: <dave:book xmlns:dave="http://www.matuszek.org/ns"> 55
  55. 55.  XSL stands for EXtensible Stylesheet Language, and is a style sheet language for XML documents. XSLT stands for XSL Transformations. XSLT is used to transform XML documents into other formats, like XHTML. XSLT is used to transform an XML document into another XML document, or another type of document that is recognized by a browser, like HTML and XHTML. XSLT does this by transforming each XML element into an (X)HTML element. With XSLT you can add/remove elements and attributes to or from the output file. You can also rearrange and sort elements. You can also perform tests and make decisions about which elements to hide and display, and a lot more.
  56. 56.  DTDs are a very weak specification language ◦ You can’t put any restrictions on element contents. ◦ It’s difficult to specify:  All the children must occur, but may be in any order.  This element must occur a certain number of times. ◦ There are only ten data types for attribute values. DTDs aren’t written in XML! ◦ If you want to do any validation, you need one parser for the XML and another for the DTD. ◦ This makes XML parsing harder than it needs to be. ◦ There is a newer and more powerful technology: XML Schemas. ◦ However, DTDs are still very much in use.
  57. 57.  An XML Schema describes the structure of an XML document. XML Schema is an XML-based alternative to DTD. The XML Schema language is also referred to as XML Schema Definition (XSD). Ex: remainder.xsd<?xml version="1.0"?>< xs:schema xmlns:xs=http://www.w3.org/2001/XMLSchema> <xs:element name="note"> <xs:complexType> <xs:sequence> <xs:element name="to" type="xs:string"/> <xs:element name="from" type="xs:string"/> <xs:element name="heading" type="xs:string"/> <xs:element name="body" type="xs:string"/> </xs:sequence> </xs:complexType> < /xs:element>< /xs:schema>
  58. 58. <?xml version="1.0"?>< note xmlns="http://www.w3schools.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3schools.com note.xsd"> < to>Ravi</to> < from>AEC</from> < heading>Reminder</heading> < body>Welcome to Aditya</body>< /note><?xml version="1.0"?>< note xmlns="http://www.w3schools.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation=“note.xsd"> < to>Ravi</to> < from>AEC</from> < heading>Reminder</heading> < body>Welcome to Aditya</body>< /note>
  59. 59.  The purpose of an XML Schema is to define the legal building blocks of an XML document, just like a DTD. An XML Schema: ◦ defines elements that can appear in a document. ◦ defines attributes that can appear in a document. ◦ defines which elements are child elements. ◦ defines the order of child elements. ◦ defines the number of child elements. ◦ defines whether an element is empty or can include text. ◦ defines data types for elements and attributes. ◦ defines default and fixed values for elements and attributes. XML Schemas are the Successors of DTDs ◦ XML Schemas are extensible to future additions. ◦ XML Schemas are richer and more powerful than DTDs. ◦ XML Schemas are written in XML. ◦ XML Schemas support data types. ◦ XML Schemas support namespaces. XML Schemas are much more powerful than DTDs.
  60. 60.  XML Schemas is the support for data types. ◦ It is easier to validate the correctness of data. ◦ It is easier to work with data from a database. ◦ It is easier to define data facets (restrictions on data), data patterns (data formats) and easy to convert data between different data types. XML Schemas is that they are written in XML. ◦ You dont have to learn a new language. ◦ You can use your XML editor to edit your Schema files. ◦ You can use your XML parser to parse your Schema files. XML Schemas provides Secure Data Communication. ◦ A date like: "03-11-2004" will be interpreted as in some countries, 3.November and in other as 11.March. ◦ However, an XML element with a data type like this: ◦ <date type="date">2004-03-11</date> ◦ ensures a mutual understanding between sender and reciever, i.e., the XML "date“ type requires the format "YYYY-MM-DD". XML Schemas are Extensible. ◦ Reuse your Schema in other Schemas. ◦ Create your own data types derived from the standard types. ◦ Reference multiple schemas in the same document.
  61. 61.  Defining Simple Element : <xs:element name="xxx" type="yyy"/> XML Schema has a lot of built-in data types. o xs:string o xs:decimal o xs:integer o xs:boolean o xs:date o xs:timeExample Here are some XML elements: <lastname>Refsnes</lastname> <age>36</age> <dateborn>1970-03-27</dateborn> Here are the corresponding simple element definitions in Schema: <xs:element name="lastname" type="xs:string"/> <xs:element name="age" type="xs:integer"/> <xs:element name="dateborn" type="xs:date"/> <xs:element name="color" type="xs:string" default="red"/> <xs:element name="color" type="xs:string" fixed="red"/>
  62. 62.  Syntex : <xs:attribute name="xxx" type="yyy"/> Example Here is an XML element with an attribute: <lastname lang="EN">Smith</lastname> And here is the corresponding attribute definition: <xs:attribute name="lang" type="xs:string"/> <xs:attribute name="lang" type="xs:string" default="EN"/> <xs:attribute name="lang" type="xs:string" fixed="EN"/> <xs:attribute name="lang" type="xs:string" use="required"/>
  63. 63.  Restrictions are used to define acceptable values for XML elements or attributes. Restrictions on XML elements are called facets.Restrictions on Values The example defines an element called "age" with a restriction.The value of age cannot be lower than 0 or greater than 100:<xs:element name="age"> <xs:simpleType> <xs:restriction base="xs:integer"> <xs:minInclusive value="0"/> <xs:maxInclusive value="100"/> </xs:restriction> </xs:simpleType></xs:element>
  64. 64.  The example below defines an element called "car" with arestriction.The only acceptable values are: Audi, Golf, BMW:<xs:element name="car"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:enumeration value="Audi"/> <xs:enumeration value="Golf"/> <xs:enumeration value="BMW"/> </xs:restriction> </xs:simpleType></xs:element>
  65. 65.  Below example defines an element "letter" with a restriction. The acceptable value is ONE of the LOWERCASE letters from a to z:<xs:element name="letter"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:pattern value="[a-z]"/> </xs:restriction> </xs:simpleType></xs:element> The only acceptable value is THREE of the UPPERCASE letters from ato z:<xs:element name="initials"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:pattern value="[A-Z][A-Z][A-Z]"/> </xs:restriction> </xs:simpleType></xs:element>
  66. 66.  The next example defines an element called "gender" with arestriction. The only acceptable value is male OR female:<xs:element name="gender"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:pattern value="male|female"/> </xs:restriction> </xs:simpleType></xs:element> The example defines an element “mobileno" with a restriction. There must be exactly 10 digits:<xs:element name=“mobileno"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:pattern value="[0-9]{10}"/> </xs:restriction> </xs:simpleType></xs:element>
  67. 67.  The whiteSpace constraint is set to "preserve", which means that the XMLprocessor WILL NOT remove any white space characters:<xs:element name="address"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:whiteSpace value="preserve"/> </xs:restriction> </xs:simpleType></xs:element> The whiteSpace constraint is set to "replace", which means that the XMLprocessor WILL REPLACE all white space characters (line feeds, tabs, spaces, andcarriage returns) with spaces:<xs:element name="address"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:whiteSpace value="replace"/> </xs:restriction> </xs:simpleType></xs:element>
  68. 68.  The value must be minimum five characters and maximum eight characters: <xs:element name="password"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:minLength value="5"/> <xs:maxLength value="8"/> </xs:restriction> </xs:simpleType> </xs:element>
  69. 69.  A complex element is an XML element that contains other elements and/or attributes. There are four kinds of complex elements: ◦ empty elements ◦ elements that contain only other elements ◦ elements that contain only text ◦ elements that contain both other elements and text
  70. 70.  It is a software library (or a package) that provides methods (or interfaces) for client applications to work with XML documents It checks the well-formattedness It may validate the documents It does a lot of other detailed things so that a client is shielded from that complexities
  71. 71.  DOM: Document Object Model SAX: Simple API for XML A DOM parser implements DOM API A SAX parser implement SAX API Most major parsers implement both DOM and SAX API’s
  72. 72.  A DOM document is an object containing all the information of an XML document It is composed of a tree (DOM tree) of nodes , and various nodes that are somehow associated with other nodes in the tree but are not themselves part of the DOM tree
  73. 73.  There are 12 types of nodes in a DOM Document object Document node Element node Text node Attribute node Processing instruction node …….
  74. 74. Sample XML document<?xml version="1.0"?><?xml-stylesheet type="text/css" href=“test.css"?><!-- Its an xml-stylesheet processing instruction. --><!DOCTYPE shapes SYSTEM “shapes.dtd"><shapes> …… <squre color=“BLUE”> <length> 20 </length> </squre> ……</shapes>
  75. 75.  A DOM parser creates an internal structure in memory which is a DOM document object Client applications get the information of the original XML document by invoking methods on this Document object or on other objects it contains DOM parser is tree-based (or DOM obj-based) Client application seems to be pulling the data actively, from the data flow point of view
  76. 76.  Advantage: (1) It is good when random access to widely separated parts of a document is required (2) It supports both read and write operations Disadvantage: (1) It is memory inefficient (2) It seems complicated, although not really
  77. 77.  It does not first create any internal structure Client does not specify what methods to call Client just overrides the methods of the API and place his own code inside there When the parser encounters start-tag, end- tag,etc., it thinks of them as events
  78. 78.  When such an event occurs, the handler automatically calls back to a particular method overridden by the client, and feeds as arguments the method what it sees SAX parser is event-based,it works like an event handler in Java (e.g. MouseAdapter) Client application seems to be just receiving the data inactively, from the data flow point of view
  79. 79.  Advantage: (1) It is simple (2) It is memory efficient (3) It works well in stream application Disadvantage: The data is broken into pieces and clients never have all the information as a whole unless they create their own data structure