Introduction to xml

  • 1. XML Basics
  • 2. Table of Contents
    CHAPTER 1: Introduction
      1.1 What is XML?
      1.2 Advantages of XML
      1.3 Differences between XML and HTML
      1.4 XML Related Technologies
    CHAPTER 2: How XML can be used?
      2.2 XML Benefits
      2.3 Uses of XML
      2.4 XML Tags
    CHAPTER 3: XML Editors
      3.1 EmEditor
      3.2 XML Spy
      3.3 XML Syntax Rules
      3.4 XML Viewing
    CHAPTER 4: XML Documents
      4.1 Well Formed XML
      4.2 Valid XML
      4.3 XML Parser
      4.4 Prolog
    CHAPTER 5: Document Type Definition
      5.1 DTD Elements
      5.2 Types of Elements
      5.3 Attributes
      5.4 Entities
    CHAPTER 6: Why we Need DTD?
      6.1 Classification of DTD
      6.2 Internal DTD
      6.3 External DTD
      6.4 Problems with DTD
      6.5 Design Principles
      6.6 XML Schema
  • 3. 1. Introduction
    XML stands for Extensible Markup Language. XML was developed around 1996 and is a subset of SGML (Standard Generalized Markup Language). XML was made less complicated than SGML to enable its use on the web. XML is a set of rules for encoding documents electronically. It was developed for the web and, unlike a scripting or programming language, it describes data rather than behavior.
    XML is used for the exchange of data. The language makes it possible to define data in a structured way. XML tags are not predefined as they are in HTML: XML lets you create your own unique tags that are meaningful for your data, hence the term "extensible".
    An XML document does not do anything by itself. It is just information wrapped in tags. You have to write a piece of software to send, receive or display it. XML is a Recommendation of the World Wide Web Consortium (W3C). XML is a meta-language, that is, a language that is used to define other languages. XML has become popular for use with web services.
    1.1 What is XML?
      • XML stands for Extensible Markup Language.
      • XML is a markup language much like HTML.
      • XML is designed to carry data, not to display data.
      • XML tags are not predefined; we can define our own tags.
      • XML is designed to be self-descriptive.
      • XML is a W3C Recommendation.
      • XML is designed to store data.
    1.2 Advantages of XML
      • It is simultaneously a human-readable and machine-readable format.
      • It supports Unicode, allowing almost any information in any written human language to be communicated.
      • It can represent the most general computer science data structures: records, lists and trees.
      • The strict syntax and parsing requirements make the necessary parsing algorithms simple, efficient, and consistent.
      • XML is heavily used as a format for document storage and processing, both online and offline.
      • It is based on international standards.
      • The hierarchical structure is suitable for most types of documents.
  • 4.
      • It is stored as plain text files, which are less restrictive than proprietary document formats.
      • It is platform-independent, and thus relatively immune to changes in technology.
      • An XML document is plain text: human readable and easy to edit and view.
      • An XML document has a tree structure which is powerful enough to express complex data and simple enough to understand.
      • XML documents are language neutral. For example, a Java program can generate XML which can be parsed by a program written in C++ or Perl.
      • XML files are operating system independent.
    1.3 Differences between XML and HTML
    XML and HTML have different goals and are designed for different purposes. Some people think that XML is an advanced version of HTML that has come to replace it. This is not the case. Both will remain, as they are used for different purposes.
    Some of the differences between XML and HTML:
      Extensible Markup Language (XML)       Hyper Text Markup Language (HTML)
      Designed to store data                 Designed to display data
      Focuses on what the data is            Focuses on how the data looks
      Lets us define our own tags            Has a predefined set of tags
      Used to transport data                 Used to format and display data
    1.4 XML Related Technologies
    DTD (Document Type Definition) and XML Schemas are used to define the legal XML tags and their attributes.
    CSS (Cascading Style Sheets) describe how HTML or XML is presented in a browser.
    XSLT (Extensible Stylesheet Language Transformations) and XPath are used to transform XML from one form into another.
    DOM (Document Object Model), SAX (Simple API for XML), and JAXP (Java API for XML Processing) are all APIs for XML parsing.
    2. How Can XML be used?
    XML can be used in many aspects of web development, often to simplify data storage and sharing.
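As a small illustration of using XML for data storage and sharing, the following Python sketch writes a record as XML text and reads it back with the standard library's xml.etree.ElementTree module. The element name "note" and the helper names record_to_xml and xml_to_record are hypothetical, chosen only for this example; a program written in another language could read the same text with its own parser.

```python
import xml.etree.ElementTree as ET


def record_to_xml(record: dict) -> str:
    """Serialize a flat dict as a <note> element with one child per key."""
    root = ET.Element("note")  # "note" is an assumed root element name
    for key, value in record.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_record(xml_text: str) -> dict:
    """Parse the XML text back into a dict of tag name to text."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


if __name__ == "__main__":
    text = record_to_xml({"to": "Tove", "from": "Jani"})
    print(text)
    print(xml_to_record(text))
```

The sketch only handles flat records whose keys are valid XML names; nested data would need nested elements.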
  • 5.
    XML Simplifies Data Sharing: XML data is stored in plain text format. This provides a software-independent and hardware-independent way of storing data, and makes it much easier to create data that different applications can share.
    XML Simplifies Data Transport: One of the most time-consuming challenges for developers is to exchange data between incompatible systems over the Internet. Exchanging data as XML greatly reduces this complexity, since the data can be read by different, otherwise incompatible applications.
    XML Simplifies Platform Changes: XML data is stored in text format. This makes it easier to expand or upgrade to new operating systems, new applications, or new browsers without losing data.
    XML Makes our Data More Available: Since XML is independent of hardware and software applications, it can make your data more available and useful.
    XML is used to Create New Internet Languages: Many new Internet languages are created with XML.
    2.2 XML Benefits
      • XML improves the functionality of web technologies through a more flexible and adaptable means of identifying information.
      • XML is a meta-language, that is, a language that describes other languages.
      • XML provides the facility to define tags and the structural relationships between them.
      • The extensibility and structured nature of XML allow it to be used for communication between different systems.
    2.3 Uses of XML
    Meta Content: To describe the contents of a document.
    Messaging: Where applications or organizations exchange data with each other.
    Database: Data extracted from a database can be preserved with its original information and used by more than one application in different ways.
    2.4 XML Tags
    The tags used in XML look like HTML tags. They are formed by a word (or a number of words) enclosed inside < > and </ > signs. The difference is that XML tags are not predefined as they are in HTML.
  • 6. <Composer> is an example of an opening tag. In XML all opening tags must have closing tags; in this case the closing tag would be </Composer>.
    Start Tag
    The beginning of every non-empty XML element is marked by a start-tag.
    An example of a start-tag: <Composer>
    End Tag
    The end of every non-empty XML element is marked by an end-tag.
    An example of an end-tag: </Composer>
    Element Content
    The text between the start-tag and the end-tag is called the element's content.
    For example, in <Composer>This is my home page</Composer> the element content is: This is my home page
    Empty Element Tag
    If an element is empty, it must be represented either by a start-tag immediately followed by an end-tag or by an empty-element tag.
    An empty-element tag takes a special form: <BR/>. The equivalent start-tag and end-tag pair is <BR></BR>.
    Empty-element tags may be used for any element which has no content, whether or not it is declared using the keyword EMPTY. For interoperability, the empty-element tag should be used, and should only be used, for elements which are declared EMPTY.
    By convention, HTML tags are often written in upper case and XML tags in lower case. Furthermore, XML is case sensitive: <Composer>, <composer> and <COMPOSER> are three different tags in XML.
    Tag names should begin with a letter, an underscore (_) or a colon (:), followed by any combination of letters, digits, periods (.), colons, underscores, or hyphens (-), but no white space. Names that begin with any form of "xml" are reserved and should not be used.
    3. XML Editor
    An XML editor is a markup language editor with added functionality to facilitate the editing of XML. XML can be edited in a plain text editor, with all the code visible, but XML editors add facilities such as tag completion and menus and buttons for tasks that are common in XML editing, based on data supplied with the document type definition (DTD) or the XML tree.
  • 7. An XML editor should be able to:
      • Add closing tags to your opening tags automatically.
      • Force you to write valid XML.
      • Verify your XML against a DTD.
      • Verify your XML against a Schema.
      • Color code your XML syntax.
    Here are some XML editors:
      • EmEditor
      • XML Notepad
      • XML Cooktop
      • XML Pro
      • XML Spy
      • eNotepad
    If you use Notepad for XML editing, you will soon run into problems. Notepad does not know that you are writing XML, so it cannot assist you. You will make many errors, and as your XML documents grow larger you will lose control. Today XML is an important technology, and it plays an increasingly critical role in new web development.
    When you start working with XML, you will soon find that it is better to edit XML documents using a professional XML editor. Good XML editors help you to write error-free XML documents, validate your text against a DTD or a schema, and force you to stick to a valid XML structure.
    3.1 EmEditor
    Why is EmEditor Professional the Best Text Editor?
    1. EmEditor can Launch very Quickly, Almost Instantaneously
    You are going to view or edit a large number of files every day, and you do not want to wait many seconds just to view a file. Many programs, including word processors and text editors, make you wait several seconds before you can start using them. You want to increase productivity by using a text editor, and waiting that long every time defeats the purpose. You should not have to wait more than one second. That is why EmEditor has been popular for such a long time.
  • 8. 2. Extendable with Plug-ins!
    EmEditor exposes many APIs, so programmers can easily write plug-ins that fit their needs. Features such as Spelling, Word Count, Explorer, Web Preview, and Compare Files are designed as plug-ins.
    3. Powerful Macros with your Favorite Script Language!
    You can write a macro to do almost whatever you want within EmEditor. The macros are based on the Windows Scripting Host (WSH) engine, so you can use all of the objects available under the Windows Scripting Host. You can program macros with popular script languages including JavaScript and VBScript. You can even program with PerlScript, Python, PHPScript, Ruby, and other Active Script languages, as long as the script engines you want to use are installed on your system.
    4. Unicode Support!
    EmEditor supports Unicode natively; the whole program is built as a Unicode application. EmEditor allows you to open a file with any encoding supported by the Windows system, and you can easily convert from one encoding to another within EmEditor. EmEditor allows you to open Unicode file names and to search for Unicode characters. With EmEditor plug-ins, you can convert selected text to HTML/XML Character References or Universal Character Names, and vice versa.
    5. Easy and Intuitive Design with Tabbed Windows!
    EmEditor is designed for Windows XP, so frequently used shortcut keys, such as Copy, Cut, Paste, Undo, and Redo, are the same as in other Windows applications. In addition, EmEditor uses tabbed windows similar to Slim Browser, Internet Explorer, Firefox and other tabbed browser applications. This allows you to open multiple documents in one window and jump between them quickly and easily.
    6. Other Features!
    There are many other useful features that are worth mentioning:
      • Keyword highlighting.
      • Regular expression search and highlighting.
      • External tools.
      • Plug-ins using custom bars.
      • Keyboard, toolbar, menu, font and color customization.
      • Drag and drop.
      • Auto save/backup.
  • 9.
      • Clickable URLs and e-mail addresses.
      • The window can be split into a maximum of 4 panes.
      • Can define multiple configurations and associate file extensions.
      • Can save backups to the recycle bin.
      • Can open recently used files from the tray icon on the taskbar.
      • Shortcut keys to insert accent marks and special characters.
      • Application error handler support.
      • 64-bit edition available.
      • Windows Vista ready.
      • Fast e-mail support.
    3.2 XML Spy
    XML Spy is an integrated development environment for XML that covers all major aspects of XML in one powerful and easy-to-use product.
      • Easy to use.
      • Syntax coloring.
      • Automatic tag completion.
      • Automatic well-formedness check.
      • Easy switching between text view and grid view.
      • Built-in DTD and/or Schema validation.
      • Built-in graphical XML Schema designer.
      • Powerful conversion utilities.
      • Database import and export.
      • Built-in templates for most XML document types.
      • Built-in XPath analyzer.
      • Full SOAP and WSDL capabilities.
      • Powerful project management.
    3.3 XML Syntax Rules
    XML, as we have seen, is a formal specification for markup languages. Every formal language specification has an associated syntax.
    XML documents comprise two basic components:
    Data: The actual content.
    Markup: Meta-information about the data that describes it.
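The split between data and markup can be made visible with a short Python sketch that walks a parsed document and collects tag names (markup) separately from element text (data). The function name split_markup_and_data and the sample document are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET


def split_markup_and_data(xml_text):
    """Return (tag names, text values) found in the document, in document order."""
    root = ET.fromstring(xml_text)
    tags, data = [], []
    for elem in root.iter():  # iter() visits the root first, then its descendants
        tags.append(elem.tag)
        # Only text directly inside the start-tag is collected; tail text is ignored.
        if elem.text and elem.text.strip():
            data.append(elem.text.strip())
    return tags, data


if __name__ == "__main__":
    sample = "<Address><Name>Harsh</Name><Company>Motorola</Company></Address>"
    print(split_markup_and_data(sample))
```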
  • 10. The syntax rules of XML are very simple and logical. The rules are easy to learn and easy to use.
    1. Every Element must have a Closing Tag
    In HTML, you might see elements without a closing tag:
      <p>This is a paragraph
      <p>This is another paragraph
    In XML, it is illegal to omit the closing tag. All elements must have a closing tag:
      <p>This is a paragraph</p>
      <p>This is another paragraph</p>
    An XML document usually begins with an XML declaration, which declares the document to be an XML document and can specify some optional attributes. If the declaration is present, it must be the first line of the document.
      <?xml version="1.0"?>
    The statement above declares the document to be an XML document, which means it complies with the XML syntax rules. The declaration is not an element, so it has no closing tag.
    2. XML Tags are Case Sensitive
    XML elements are defined using XML tags.
    XML tags are case sensitive. With XML, the tag <Letter> is different from the tag <letter>. Opening and closing tags must be written with the same case.
    3. XML Elements must be Properly Nested
    In HTML, you might see improperly nested elements:
      <b><i>This text is bold and italic</b></i>
    In XML, all elements must be properly nested within each other:
      <b><i>This text is bold and italic</i></b>
    4. XML Documents must have a Root Element
    XML documents must contain one element that is the parent of all other elements. This element is called the root element.
  • 11.
      <?xml version="1.0"?>
      <Root>
        <Child>
          <Subchild>.....</Subchild>
        </Child>
      </Root>
    5. XML Attribute Values must be Quoted
    XML elements can have attributes in name/value pairs just like in HTML. In XML, the attribute values must always be quoted.
      <?xml version="1.0" ?>
      <Address>
        <Bangalore>
          <Name Nickname="12">Sumana</Name>
          <Company>Testing</Company>
        </Bangalore>
        <Mysore>
          <Name>Sumith</Name>
          <Company EmpID="1675">Mac Studio</Company>
        </Mysore>
      </Address>
    6. Entity References
    Some characters have a special meaning in XML. If you place a character like "<" inside an XML element, it will generate an error because the parser interprets it as the start of a new element.
    To avoid this error, replace the "<" character with an entity reference.
    There are five predefined entity references in XML:
      Character               Entity reference
      <  (less than)          &lt;
      >  (greater than)       &gt;
      &  (ampersand)          &amp;
      '  (apostrophe)         &apos;
      "  (quotation mark)     &quot;
    Note: Only the characters "<" and "&" are strictly illegal in XML content. The greater-than character is legal, but it is a good habit to replace it.
    7. Comments in XML
    Comments should not appear on the first line or otherwise above the XML declaration, for XML processor compatibility. The string "--" (double hyphen) is not allowed inside a comment (as it is used to delimit comments), and entity references are not recognized within comments.
    The syntax for writing comments in XML is similar to that of HTML:
      <!-- This is a comment -->
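The predefined entity references described under rule 6 can be produced programmatically rather than typed by hand. The following Python sketch uses escape and unescape from the standard library's xml.sax.saxutils module, which handle &, < and > by default; the two quote characters are passed in explicitly. The helper names escape_all and unescape_all are assumptions made for this example.

```python
from xml.sax.saxutils import escape, unescape

# escape() covers &, < and > on its own; quotes are supplied as extra mappings.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}
QUOTE_CHARACTERS = {"&quot;": '"', "&apos;": "'"}


def escape_all(text: str) -> str:
    """Replace all five special characters with their entity references."""
    return escape(text, QUOTE_ENTITIES)


def unescape_all(text: str) -> str:
    """Turn the five predefined entity references back into characters."""
    return unescape(text, QUOTE_CHARACTERS)


if __name__ == "__main__":
    raw = 'if a < b & b > c then say "ok"'
    safe = escape_all(raw)
    print(safe)
    print(unescape_all(safe) == raw)
```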
  • 12. 8. White Space is Preserved in XML
    HTML collapses multiple white-space characters into one single white space. With XML, the white space in a document is preserved.
    3.4 XML Viewing
    XML files can be viewed in all major browsers.
    [Note: Don't expect XML files to be displayed as HTML pages]
      <?xml version="1.0"?>
      <Address>
        <Name>Harsh</Name>
        <Company>Motorola</Company>
      </Address>
    4. XML Documents
    XML documents are similar to HTML documents. They contain information and markup tags that define the information, and they are saved as plain text. The file name of an XML document has the extension .xml, for example "abc.xml". A data object is an XML document if it is well formed.
    A well-formed XML document may in addition be valid if it meets certain further constraints or rules. Well-formed XML documents contain text and XML tags which conform to the XML syntax.
    Valid XML documents must be well formed and are additionally checked against a document type definition (DTD). A DTD is a set of rules that defines which tags may appear in an XML document. DTDs also describe the structure of a document.
    4.1 Well Formed XML
    Well-formed XML documents simply mark up pages with descriptive tags. You don't need to describe or explain what these tags mean. In other words, a well-formed XML document does not need a DTD, but it must conform to the XML syntax rules. If all tags in a document are correctly formed and follow the XML syntax rules, then the document is considered well formed. Some of the rules are given below.
    1. XML documents must contain at least one element.
    Well formed: <title>Software</title>
    Not well formed: Software
  • 13. 2. XML documents must contain a unique opening and closing tag that contains the whole document, forming what is called a root element.
    Well formed: <title>DEL</title>
    Not well formed: <title>DEL
    3. Tags in XML are case sensitive: <Author> and <AUTHOR> are not the same. The XML declaration (<?xml ... ?>) must be all lowercase, but keywords in DTDs must be all UPPERCASE, such as ELEMENT, ATTLIST, #REQUIRED, #IMPLIED, NMTOKEN, ID, etc. Your own elements and attributes may be in any case you choose, as long as you are consistent.
    Well formed: <Author>Information</Author>
    Not well formed: <Author>Information</AUTHOR>
    4. Attribute values must always be quoted (as opposed to HTML).
    Well formed: <Name id="100">Asini</Name>
    Not well formed: <Name id=100>Asini</Name>
    4.2 Valid XML
    Valid XML is a more rigid or formal form of XML. All XML documents are well-formed documents. Some XML documents are additionally valid. Valid documents must conform not only to the syntax, but also to the DTD.
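The well-formedness rules of section 4.1 can be checked mechanically by handing a document to a parser and seeing whether it is accepted. The sketch below uses Python's standard xml.etree.ElementTree module; the function name is_well_formed is an assumption made for this example. It checks well-formedness only and does not validate against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # Raised for a missing root element, unclosed or mismatched tags,
        # unquoted attribute values, and other syntax errors.
        return False
    return True


if __name__ == "__main__":
    samples = [
        "<title>Software</title>",
        "Software",
        "<title>DEL",
        "<Author>Information</AUTHOR>",
        "<Name id=100>Asini</Name>",
    ]
    for sample in samples:
        print(sample, "->", is_well_formed(sample))
```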
  • 14. In the case of markup languages defined by XML, the DTD provides the grammatical structure that brings order to the elements of the language. The main difference between valid and well formed is that valid XML requires a DTD, whereas well-formed XML does not.
    4.3 XML Parser
    An XML parser is a processor that reads an XML document and determines the structure and properties of the data. If the parser goes beyond the XML rules for well-formedness and validates the document against a DTD, the parser is said to be a "validating" parser.
    A validating XML parser also checks the XML syntax and reports errors, so it can be used to check whether a document is both well formed and valid. In a browser, the XML parser reads XML and converts it into an XML DOM object that can be accessed with JavaScript. Most browsers have a built-in XML parser.
    4.4 Prolog
    The prolog refers to the information that appears before the start tag of the document or root element. It includes information that applies to the document as a whole, such as character encoding, document structure, and style sheets.
      <?xml version="1.0" encoding="UTF-8"?>
      <?xml-stylesheet type="text/xsl" href="show_book.xsl"?>
      <!DOCTYPE catalog SYSTEM "catalog.dtd">
    XML Declaration
    The XML declaration typically appears as the first line in an XML document. The XML declaration is not required; however, if used it must be the first line in the document, and no other content or white space can precede it.
    The XML declaration consists of the following:
    The version number, <?xml version="1.0"?>. This is mandatory. Although the number might change for future versions of XML, 1.0 is the version in general use.
    The encoding declaration, <?xml version="1.0" encoding="UTF-8"?>. This is optional. If used, the encoding declaration must appear immediately after the version information in the XML declaration, and must contain a value representing an existing character encoding.
    An XML declaration can also contain a standalone declaration, for example, <?xml version="1.0" encoding="UTF-8" standalone="yes"?>. Like the encoding declaration, the standalone declaration is optional. If used, the standalone declaration must appear last in the XML declaration.
  • 15. Encoding Declaration
    The encoding declaration identifies which encoding is used to represent the characters in the document. Although XML parsers can determine automatically whether a document uses the UTF-8 or UTF-16 Unicode encoding, this declaration should be used in documents that use other encodings. For example, the following is the encoding declaration for a document that uses ISO-8859-1 (Latin 1).
    Example: <?xml version="1.0" encoding="ISO-8859-1"?>
    Standalone Declaration
    The standalone declaration indicates whether a document relies on information from an external source, such as an external document type definition (DTD), for its content.
    Example: <?xml version="1.0" standalone="yes"?>
    With a value of "yes", the document declares that it does not depend on external declarations, and a validating parser can report an error if the document nevertheless relies on an external DTD or external entities. Leaving out the standalone declaration produces the same result as including a standalone declaration of "no": the XML parser will accept external resources, if there are any, without reporting an error.
    Comments
    Comments begin with <!-- and end with -->. Comments can appear in the document prolog, including the document type definition (DTD); after the document; or in the textual content. Comments cannot appear within attribute values or inside tags.
    5. Document Type Definition (DTD)
    An XML DTD, or document type definition, defines the formal grammar of an XML-based markup language. A DTD contains the list of elements that can occur in the markup, the list of attributes of each element, the possible attribute values or value types, and the content model that specifies the allowed nesting of elements.
  • 16. This information can be used in several ways:
      • One can use a DTD to validate a document, i.e., to check whether the document follows the formal rules defined in the DTD. In this way one can detect errors (such as misspelled element names, wrong attribute names or values, and wrongly nested elements) that would otherwise be difficult to notice.
      • One can use a DTD just to provide an accurate description of a markup language. Here much depends on the markup language itself, as not all XML applications can be accurately described using a DTD.
      • One can use a DTD to define character entities, specify default attributes and bind elements to XML namespaces.
    The main purpose of a DTD is to define the legal building blocks of an XML document. You can store a DTD at the beginning of a document or externally in a separate file.
    All XML documents (and HTML documents) are made up of the following building blocks:
      • Elements
      • Attributes
      • Entities
      • PCDATA
      • CDATA
    5.1 DTD Elements
    Elements are the main building blocks in the document structure. The elements represent the logical components of a document and how they are arranged into a hierarchical (tree) structure.
    Syntax: <!ELEMENT Name Content>
    Declaring Elements
    In a DTD, XML elements are declared with an ELEMENT declaration, which has the following syntax:
      <!ELEMENT element-name category>
    OR
      <!ELEMENT element-name (element-content)>
  • 17. Empty Elements
    Empty elements are declared with the category keyword EMPTY.
      <!ELEMENT element-name EMPTY>
    Elements with Parsed Character Data
    Elements with only parsed character data are declared with #PCDATA inside parentheses.
      <!ELEMENT element-name (#PCDATA)>
    Example: <!ELEMENT from (#PCDATA)>
    Elements with any Contents
    Elements declared with the category keyword ANY can contain any combination of parsable data.
      <!ELEMENT element-name ANY>
    Example: <!ELEMENT note ANY>
    Elements with Children (Sequences)
    Elements with one or more children are declared with the names of the child elements inside parentheses.
      <!ELEMENT element-name (child1)>
    OR
      <!ELEMENT element-name (child1, child2, ...)>
    Example: <!ELEMENT note (to, from, heading, body)>
    When children are declared in a sequence separated by commas, the children must appear in the same sequence in the document. In a full declaration, the children must also be declared, and the children can also have children. The full declaration of the "note" element is:
      <!ELEMENT note (to, from, heading, body)>
      <!ELEMENT to (#PCDATA)>
      <!ELEMENT from (#PCDATA)>
      <!ELEMENT heading (#PCDATA)>
  • 18.
      <!ELEMENT body (#PCDATA)>
    Declaring Only one Occurrence of an Element
      <!ELEMENT element-name (child-name)>
    Example: <!ELEMENT note (message)>
    The example above declares that the child element "message" must occur once, and only once, inside the "note" element.
    Declaring Minimum one Occurrence of an Element
      <!ELEMENT element-name (child-name+)>
    Example: <!ELEMENT note (message+)>
    The + sign in the example above declares that the child element "message" must occur one or more times inside the "note" element.
    Declaring Zero or More Occurrences of an Element
      <!ELEMENT element-name (child-name*)>
    Example: <!ELEMENT note (message*)>
    The * sign in the example above declares that the child element "message" can occur zero or more times inside the "note" element.
    Declaring Zero or One Occurrences of an Element
      <!ELEMENT element-name (child-name?)>
    Example: <!ELEMENT note (message?)>
    The ? sign in the example above declares that the child element "message" can occur zero times or one time inside the "note" element.
    Declaring Either/or Content
    Example: <!ELEMENT note (to, from, header, (message | body))>
  • 19. The example above declares that the "note" element must contain a "to" element, a "from" element, a "header" element, and either a "message" or a "body" element.
    Declaring Mixed Content
    Example: <!ELEMENT note (#PCDATA|to|from|header|message)*>
    The example above declares that the "note" element can contain zero or more occurrences of parsed character data, "to", "from", "header", or "message" elements.
    5.2 Types of Elements
    There are three primary types of elements:
    Simple elements: These are elements that contain text or "parsed character data" (represented as #PCDATA in your DTD).
    Compound elements: These elements contain other elements, and sometimes both PCDATA and other elements.
    Standalone elements: These are empty elements; they do not contain any PCDATA or other elements.
    5.3 Attributes
    Attributes allow an author to attach extra information to the elements in a document. One important difference from elements is that attributes cannot contain elements, and there is no such thing as a "sub-attribute".
    In a DTD, attributes are declared with an ATTLIST declaration.
  • 20. Declaring Attributes
    An attribute declaration has the following syntax:
      <!ATTLIST element-name attribute-name attribute-type default-value>
    DTD Example: <!ATTLIST Payment type CDATA "Check">
    XML Example: <Payment type="Check" />
    The attribute-type can be one of the following:
      Type            Description
      CDATA           The value is character data
      (en1|en2|..)    The value must be one from an enumerated list
      ID              The value is a unique id
      IDREF           The value is the id of another element
      IDREFS          The value is a list of other ids
      NMTOKEN         The value is a valid XML name token
      NMTOKENS        The value is a list of valid XML name tokens
      ENTITY          The value is an entity
      ENTITIES        The value is a list of entities
      NOTATION        The value is the name of a notation
      xml:            The value is a predefined xml value
    The default-value can be one of the following:
      Value           Explanation
      value           The default value of the attribute
      #REQUIRED       The attribute is required
      #IMPLIED        The attribute is not required
      #FIXED value    The attribute value is fixed
    Default Attribute Value
    DTD:
      <!ELEMENT square EMPTY>
      <!ATTLIST square width CDATA "0">
    Valid XML:
  • 21.
      <square width="100" />
    In the example above, the "square" element is defined to be an empty element with a "width" attribute of type CDATA. If no width is specified, it has a default value of 0.
    #REQUIRED
    Syntax: <!ATTLIST element-name attribute-name attribute-type #REQUIRED>
    Example
    DTD: <!ATTLIST Person Number CDATA #REQUIRED>
    Valid XML: <Person Number="5677" />
    Invalid XML: <Person />
    Use the #REQUIRED keyword if you don't have an option for a default value, but still want to force the attribute to be present.
    #IMPLIED
    Syntax: <!ATTLIST element-name attribute-name attribute-type #IMPLIED>
    Example
    DTD: <!ATTLIST Contact fax CDATA #IMPLIED>
    Valid XML: <Contact fax="555-667788" />
    Use the #IMPLIED keyword if you don't want to force the author to include an attribute, and you don't have an option for a default value.
  • 22. #FIXED
    Syntax: <!ATTLIST element-name attribute-name attribute-type #FIXED "value">
    Example
    DTD: <!ATTLIST Sender Company CDATA #FIXED "Microsoft">
    Valid XML: <Sender Company="Microsoft" />
    Invalid XML: <Sender Company="Software" />
    Use the #FIXED keyword when you want an attribute to have a fixed value without allowing the author to change it. If an author includes another value, the XML parser will return an error.
    Enumerated Attribute Values
    Syntax: <!ATTLIST element-name attribute-name (en1|en2|..) default-value>
    Example
    DTD: <!ATTLIST Payment type (Check | Cash) "Cash">
    XML Example: <Payment type="Check" />
    OR
      <Payment type="Cash" />
    Use enumerated attribute values when you want the attribute value to be one of a fixed set of legal values.
    5.4 Entities
    An entity is a name that represents a special character, additional text or a file. There are two kinds of entities in XML documents:
  • 23.
    1. General Entities: Used in the content of documents. References to general entities start with & and end with ;
    2. Parameter Entities: Used in a document's DTD. References to parameter entities start with % and end with ;
    6. Why we Need a DTD?
    XML is a language specification. Based on this specification, individuals and organizations develop their own markup languages, which they then use to communicate information. Whoever receives such a document:
      • Needs to know how the document is structured, and
      • Needs to check whether the content is indeed compliant with that structure.
    The Document Type Definition, also known as the DTD, holds information about the structure of an XML document.
    6.1 Why Use a DTD?
      • XML provides an application-independent way of sharing data.
      • With a DTD, different groups of people can agree on a common structure for interchanging data.
      • Your application can use a standard DTD to verify that data you receive from the outside world is valid.
      • The DTD can also be used to verify your own data.
    6.2 Internal DTDs
    An internal DTD is inserted within the DOCTYPE declaration. A DTD inserted this way applies only to that specific document.
    Syntax: <!DOCTYPE root-element [DTD specification]>
    Examples
    1.
      <?xml version="1.0"?>
      <!DOCTYPE Note [
  • 24. <!ELEMENT Note (To, From, Heading, Body)><!ELEMENT To (#PCDATA)><!ELEMENT From (#PCDATA)><!ELEMENT Heading (#PCDATA)><!ELEMENT Body (#PCDATA)>]><Note><To>Tove</To><From>Jani</From><Heading>Reminder</Heading><Body>Dont Forget Me This Weekend</Body></Note>2. <?xml version="1.0"?><!DOCTYPE message [<!ELEMENT message (to,from,subject,text)><!ELEMENT to (#PCDATA)><!ELEMENT from (#PCDATA)><!ELEMENT subject (#PCDATA)><!ELEMENT text (#PCDATA)>]><message><to>Dave</to><from>Susan</from><subject>Reminder</subject><text>Dont forget to buy milk on the way home.</text></message>3. <?xml version="1.0"?><!DOCTYPE Tutorials [<!ELEMENT Tutorials (Tutorial)+><!ELEMENT Tutorial (Name, URL)><!ELEMENT Name (#PCDATA)><!ELEMENT URL (#PCDATA)>]><Tutorials><Tutorial><Name>xml Tutorial</Name><URL>www.Test.COM </URL></Tutorial><Tutorial><Name>HTML Tutorial</Name><URL><URL></Tutorial></Tutorials>4. <?xml version="1.0"?><!DOCTYPE Address[<!ELEMENT Address (Street, City, State, Zip)><!ELEMENT Street (#PCDATA)><!ELEMENT City (#PCDATA)><!ELEMENT State (#PCDATA)><!ELEMENT Zip (#PCDATA)>]><Address><Street>12 City Road</Street><City>Melbourne</City><State>Victoria</State><Zip>8001</Zip></Address> 24
  • 25.
    5. <?xml version="1.0"?>
       <!DOCTYPE Note [
       <!ELEMENT Note (To, From, Heading, Body)>
       <!ELEMENT To (#PCDATA)>
       <!ELEMENT From (#PCDATA)>
       <!ELEMENT Heading (#PCDATA)>
       <!ELEMENT Body (#PCDATA)>
       ]>
       <Note>
         <To>Yashaswi</To>
         <From>Jan</From>
         <Heading>Head Lines</Heading>
         <Body>Software</Body>
       </Note>

    6. <?xml version="1.0"?>
       <!DOCTYPE Film [
       <!ENTITY COM "Comedy">
       <!ENTITY SF "Science Fiction">
       <!ELEMENT Film (Title, Genre, Year)+>
       <!ELEMENT Title (#PCDATA)>
       <!ATTLIST Title id CDATA #IMPLIED>
       <!ELEMENT Genre (#PCDATA)>
       <!ELEMENT Year (#PCDATA)>
       ]>
       <Film>
         <Title id="1">Tootsie</Title>
         <Genre>&COM;</Genre>
         <Year>1982</Year>
         <Title id="2">Jurassic Park</Title>
         <Genre>&SF;</Genre>
         <Year>1993</Year>
       </Film>

    7. <?xml version="1.0"?>
       <!DOCTYPE People_List [
       <!ELEMENT People_List (Person*)>
       <!ELEMENT Person (Name, Birthdate?, Gender?, Social_Security_Number?)>
       <!ELEMENT Name (#PCDATA)>
       <!ELEMENT Birthdate (#PCDATA)>
       <!ELEMENT Gender (#PCDATA)>
       <!ELEMENT Social_Security_Number (#PCDATA)>
       ]>
       <People_List>
         <Person>
           <Name>Aditya</Name>
           <Birthdate>27/11/2008</Birthdate>
           <Gender>Male</Gender>
         </Person>
       </People_List>

    8. <?xml version="1.0"?>
       <!DOCTYPE Newspaper [
       <!ELEMENT Newspaper (Article+)>
       <!ELEMENT Article (Headline, Byline, Lead, Body, Notes)>
       <!ELEMENT Headline (#PCDATA)>
       <!ELEMENT Byline (#PCDATA)>
       <!ELEMENT Lead (#PCDATA)>
  • 26.
       <!ELEMENT Body (#PCDATA)>
       <!ELEMENT Notes (#PCDATA)>
       <!ATTLIST Article Author CDATA #REQUIRED>
       <!ATTLIST Article Editor CDATA #IMPLIED>
       <!ATTLIST Article Date CDATA #IMPLIED>
       <!ATTLIST Article Edition CDATA #IMPLIED>
       <!ENTITY Newspaper "Times of India">
       <!ENTITY Publisher "Hasini">
       <!ENTITY Copyright "Copyright 2010 SOFTWARE ">
       ]>
       <Newspaper>
         <Article Author="Yashaswi" Editor="Anurag" Date="20/02/2010" Edition="First">
           <Headline>Temptation 2010</Headline>
           <Byline>New Year</Byline>
           <Lead>No &Publisher; Matter</Lead>
           <Body>&Newspaper;</Body>
           <Notes>All The Best The New Year &Copyright;</Notes>
         </Article>
       </Newspaper>

    9. <?xml version="1.0"?>
       <!DOCTYPE Parts [
       <!ELEMENT Parts (Title?, Part*)>
       <!ELEMENT Title (#PCDATA)>
       <!ELEMENT Part (Item, Manufacturer, Model, Cost)+>
       <!ATTLIST Part type (Computer|Auto|Airplane) #IMPLIED>
       <!ELEMENT Item (#PCDATA)>
       <!ELEMENT Manufacturer (#PCDATA)>
       <!ELEMENT Model (#PCDATA)>
       <!ELEMENT Cost (#PCDATA)>
       ]>
       <Parts>
         <Title>Main Heading</Title>
         <Part type="Computer">
           <Item></Item>
           <Manufacturer></Manufacturer>
           <Model></Model>
           <Cost></Cost>
         </Part>
       </Parts>

    10. <?xml version="1.0"?>
        <!DOCTYPE Videos [
        <!ELEMENT Videos (Music+)>
        <!ELEMENT Music (Title, Artist+)>
        <!ELEMENT Title (#PCDATA)>
        <!ELEMENT Artist (#PCDATA)>
        ]>
        <Videos>
          <Music>
            <Title>Video Title1</Title>
            <Artist>Artist1</Artist>
          </Music>
          <Music>
            <Title>Video Title2</Title>
            <Artist>Artist2</Artist>
            <Artist>Artist3</Artist>
          </Music>
        </Videos>
  • 27.
    11. <?xml version="1.0"?>
        <!DOCTYPE Document [
        <!ELEMENT Document (Customer)*>
        <!ELEMENT Customer (Name, Date, Orders)>
        <!ELEMENT Name (Last_Name, First_Name)>
        <!ELEMENT Last_Name (#PCDATA)>
        <!ELEMENT First_Name (#PCDATA)>
        <!ELEMENT Date (#PCDATA)>
        <!ELEMENT Orders (Item)*>
        <!ELEMENT Item (Product, Number, Price)>
        <!ELEMENT Product (#PCDATA)>
        <!ELEMENT Number (#PCDATA)>
        <!ELEMENT Price (#PCDATA)>
        ]>
        <Document>
          <Customer>
            <Name>
              <Last_Name>Kaif</Last_Name>
              <First_Name>Kat</First_Name>
            </Name>
            <Date>20/02/2010</Date>
            <Orders>
              <Item>
                <Product></Product>
                <Number></Number>
                <Price></Price>
              </Item>
            </Orders>
          </Customer>
        </Document>

    12. <?xml version="1.0"?>
        <!DOCTYPE book [
        <!ELEMENT book (title, chapter+)>
        <!ELEMENT title (#PCDATA)>
        <!ELEMENT chapter (heading, paragraph*)>
        <!ELEMENT heading (#PCDATA)>
        <!ELEMENT paragraph (#PCDATA)>
        <!ATTLIST chapter language CDATA #REQUIRED>
        ]>
        <book>
          <title/>
          <chapter language="markup">
            <heading>Introduction to xml</heading>
            <paragraph>Extensible markup language, used to describe the data</paragraph>
          </chapter>
        </book>
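    A DTD only has an effect when a validating parser checks the document against it. As a minimal sketch, reusing the Note DTD from Example 1, the following document is well formed but not valid, because the Heading element required by the content model is missing. A validating parser would be expected to report an error here, while a non-validating parser may accept the document. The Body text is illustrative and not taken from the slides.

```xml
<?xml version="1.0"?>
<!DOCTYPE Note [
<!ELEMENT Note (To, From, Heading, Body)>
<!ELEMENT To (#PCDATA)>
<!ELEMENT From (#PCDATA)>
<!ELEMENT Heading (#PCDATA)>
<!ELEMENT Body (#PCDATA)>
]>
<!-- Well formed, but NOT valid: the content model requires
     To, From, Heading, Body in that order, and Heading is absent. -->
<Note>
  <To>Tove</To>
  <From>Jani</From>
  <Body>Illustrative text only</Body>
</Note>
```

    Adding <Heading>Reminder</Heading> between From and Body would make the document valid again.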
  • 28. 6.3 External DTD
    An external DTD is one that resides in a separate document. The DTD is saved as a separate file with the extension .dtd and is then referenced from within the XML document.
    Syntax: <!DOCTYPE Root-Element SYSTEM "File-Name">

    Examples
    1. gold.dtd:
       <!ELEMENT JewelleryShop (Gold+)>
       <!ELEMENT Gold (Chain+, Bangles+, Earings+, Necklace?)>
       <!ELEMENT Chain (Longchain?, Shortchain+)>
       <!ELEMENT Longchain (#PCDATA)>
       <!ELEMENT Shortchain (#PCDATA)>
       <!ELEMENT Bangles (#PCDATA)>
       <!ELEMENT Earings (#PCDATA)>
       <!ELEMENT Necklace (#PCDATA)>

       XML document:
       <?xml version="1.0"?>
       <!DOCTYPE JewelleryShop SYSTEM "gold.dtd">
       <JewelleryShop>
         <Gold>
           <Chain>
             <Longchain>500grams</Longchain>
             <Shortchain>200grams</Shortchain>
           </Chain>
           <Bangles>200grams of 4 bangles</Bangles>
           <Earings>250 grams of 2 earings</Earings>
           <Necklace/>
         </Gold>
       </JewelleryShop>

    2. example.dtd:
       <!ELEMENT people_list (person*)>
       <!ELEMENT person (name, birthdate?, gender?)>
       <!ELEMENT name (#PCDATA)>
       <!ELEMENT birthdate (#PCDATA)>
       <!ELEMENT gender (#PCDATA)>

       XML document:
       <?xml version="1.0" encoding="UTF-8"?>
       <!DOCTYPE people_list SYSTEM "example.dtd">
       <people_list>
         <person>
           <name>Borne</name>
           <birthdate>04/02/1977</birthdate>
           <gender>Male</gender>
         </person>
       </people_list>

    3. AddressBook.dtd:
       <!ELEMENT addressbook (contact+)>
       <!ELEMENT contact (name, address+, city, state, zip, phone, email, web, company)>
       <!ELEMENT name (#PCDATA)>
       <!ELEMENT address (#PCDATA)>
  • 29.
       <!ELEMENT city (#PCDATA)>
       <!ELEMENT state (#PCDATA)>
       <!ELEMENT zip (#PCDATA)>
       <!ELEMENT phone (voice, fax?)>
       <!ELEMENT voice (#PCDATA)>
       <!ELEMENT fax (#PCDATA)>
       <!ELEMENT email (#PCDATA)>
       <!ELEMENT web (#PCDATA)>
       <!ELEMENT company (#PCDATA)>

       XML document:
       <?xml version="1.0"?>
       <!DOCTYPE addressbook SYSTEM "AddressBook.dtd" [
       <!ENTITY amp "&#38;#38;">
       <!ENTITY apos "&#39;">
       ]>
       <addressbook>
         <contact>
           <name>Frank Rizzo</name>
           <address>1212 W 304th Street</address>
           <city>New York</city>
           <state>New York</state>
           <zip>10011</zip>
           <phone>
             <voice>212-555-1212</voice>
             <fax>212-555-1213</fax>
           </phone>
           <email></email>
           <web></web>
           <company>Frank&apos;s Ratchet Service</company>
         </contact>
         <contact>
           <name>Sol Rosenberg</name>
           <address>1162 E 412th Street</address>
           <city>New York</city>
           <state>New York</state>
           <zip>10011</zip>
           <phone>
             <voice>212-555-1818</voice>
             <fax>212-555-1819</fax>
           </phone>
           <email></email>
           <web></web>
           <company>Rosenberg&apos;s Shoes &amp; Glasses</company>
         </contact>
       </addressbook>

    4. Movies.dtd:
       <!ELEMENT movies (movie)+>
       <!ELEMENT movie (title, writer+, producer+, director+, actor*, comments?)>
       <!ATTLIST movie
         type (drama | comedy | adventure | sci-fi | mystery | horror | romance | documentary) "drama"
         rating (G | PG | PG-13 | R | X) "PG"
         review (1 | 2 | 3 | 4 | 5) "3"
         year CDATA #IMPLIED>
       <!ELEMENT title (#PCDATA)>
       <!ELEMENT writer (#PCDATA)>
       <!ELEMENT producer (#PCDATA)>
       <!ELEMENT director (#PCDATA)>
       <!ELEMENT actor (#PCDATA)>
       <!ELEMENT comments (#PCDATA)>
  • 30.
       XML document:
       <?xml version="1.0" standalone="no"?>
       <?xml-stylesheet type="text/CSS" href="Movies.CSS"?>
       <!DOCTYPE movies SYSTEM "Movies.dtd">
       <movies>
         <movie type="comedy" rating="PG-13" review="5" year="1987">
           <title>Raising Arizona</title>
           <writer>Ethan Coen</writer>
           <writer>Joel Coen</writer>
           <producer>Ethan Coen</producer>
           <director>Joel Coen</director>
           <actor>Nicolas Cage</actor>
           <actor>Holly Hunter</actor>
           <actor>John Goodman</actor>
           <comments>A classic one-of-a-kind screwball love story.</comments>
         </movie>
         <movie type="comedy" rating="R" review="5" year="1988">
           <title>Midnight Run</title>
           <writer>George Gallo</writer>
           <producer>Martin Brest</producer>
           <director>Martin Brest</director>
           <actor>Robert De Niro</actor>
           <actor>Charles Grodin</actor>
           <comments>The quintessential road comedy.</comments>
         </movie>
         <movie type="mystery" rating="R" review="5" year="1995">
           <title>The Usual Suspects</title>
           <writer>Christopher McQuarrie</writer>
           <producer>Bryan Singer</producer>
           <producer>Michael McDonnell</producer>
           <director>Bryan Singer</director>
           <actor>Stephen Baldwin</actor>
           <actor>Gabriel Byrne</actor>
           <actor>Benicio Del Toro</actor>
           <actor>Chazz Palminteri</actor>
           <actor>Kevin Pollak</actor>
           <actor>Kevin Spacey</actor>
           <comments>A crime mystery with incredibly intricate plot twists.</comments>
         </movie>
         <movie type="sci-fi" rating="PG-13" review="4" year="1989">
           <title>The Abyss</title>
           <writer>James Cameron</writer>
           <producer>Gale Anne Hurd</producer>
           <director>James Cameron</director>
  • 31.
           <actor>Ed Harris</actor>
           <actor>Mary Elizabeth Mastrantonio</actor>
           <comments>A very engaging underwater odyssey.</comments>
         </movie>
       </movies>

    5. course.dtd:
       <!ELEMENT courses (BscIT+)>
       <!ELEMENT BscIT (details+)>
       <!ELEMENT details (firstsem?, secondsem?, thirdsem?, forthsem?, fifthsem?, sixthsem+)>
       <!ELEMENT firstsem (#PCDATA)>
       <!ELEMENT secondsem (#PCDATA)>
       <!ELEMENT thirdsem (#PCDATA)>
       <!ELEMENT forthsem (#PCDATA)>
       <!ELEMENT fifthsem (#PCDATA)>
       <!ELEMENT sixthsem (visual+, UNIX+, testing+, vbLab+, UNIXLab+, Project+)>
       <!ELEMENT visual (#PCDATA)>
       <!ELEMENT UNIX (#PCDATA)>
       <!ELEMENT testing (#PCDATA)>
       <!ELEMENT vbLab (#PCDATA)>
       <!ELEMENT UNIXLab (#PCDATA)>
       <!ELEMENT Project (#PCDATA)>

       XML document:
       <?xml version="1.0"?>
       <!DOCTYPE courses SYSTEM "course.dtd">
       <courses>
         <BscIT>
           <details>
             <firstsem/>
             <secondsem/>
             <thirdsem/>
             <forthsem/>
             <fifthsem/>
             <sixthsem>
               <visual/>
               <UNIX/>
               <testing/>
               <vbLab/>
               <UNIXLab/>
               <Project/>
             </sixthsem>
           </details>
         </BscIT>
       </courses>

    6. tutorials.dtd:
       <!ELEMENT tutorials (tutorial)+>
       <!ELEMENT tutorial (name, url)>
       <!ELEMENT name (#PCDATA)>
       <!ELEMENT url (#PCDATA)>
       <!ATTLIST tutorials type CDATA #IMPLIED>

       XML document:
       <?xml version="1.0" standalone="no"?>
       <!DOCTYPE tutorials SYSTEM "tutorials.dtd">
       <tutorials>
         <tutorial>
           <name>xml Tutorial</name>
           <url></url>
         </tutorial>
         <tutorial>
           <name>HTML Tutorial</name>
           <url></url>
         </tutorial>
       </tutorials>
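    The examples above use general entities only. Parameter entities, described earlier as the entities used inside a document's DTD, are typically used to share a piece of a declaration between several declarations. The following is a minimal sketch under stated assumptions: the file name contact.dtd, the entity name text, and the element names are all hypothetical and do not come from the slides.

```xml
<!-- contact.dtd: a hypothetical external DTD, not from the original slides -->
<!-- Parameter entity holding a content model that several elements share -->
<!ENTITY % text "(#PCDATA)">
<!ELEMENT contact (name, email)>
<!ELEMENT name %text;>
<!ELEMENT email %text;>
```

    A document would reference this DTD in the same way as the external examples above:

```xml
<?xml version="1.0"?>
<!DOCTYPE contact SYSTEM "contact.dtd">
<contact>
  <name>Example Name</name>
  <email>name@example.com</email>
</contact>
```

    The sketch uses an external DTD on purpose: inside the internal subset, a parameter entity reference may not appear within a markup declaration, so %text; could not be used this way there.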
  • 32. 6.4 Problems with DTD
    - DTDs are not themselves written in XML syntax.
    - No constraints on character data.
    - Attribute value models are too simple.
    - No support for namespaces.
    - Very limited support for modularity and reuse (the entity mechanism is too low-level).
    - No support for schema evolution, extension, or inheritance of declarations (large DTDs are difficult to write, maintain, and read, and it is hard to define families of related schemas).
    - Limited white-space control.
    - No embedded, structured self-documentation (<!-- comments --> are not enough).
    - Content and attribute declarations cannot depend on attributes or element context (many XML languages use that, but their DTDs have to "allow too much").
    - The ID attribute mechanism is too simple.
    - Defaults exist only for attributes, not for elements.
    - Cannot specify "any element" or "any attribute".
    - Defaults cannot be specified separately from the declarations.

    6.5 Design Principles
    The XML Schema Language shall be:
    - More expressive than XML DTDs.
    - Expressed in XML.
    - Self-describing.
    - Usable by a wide variety of applications that employ XML.
    - Straightforwardly usable on the Internet.
    - Optimized for interoperability.
    - Simple enough to be implemented with modest design and runtime resources.
    - Coordinated with relevant W3C specs.
    The XML Schema Language Specification shall:
    - Be prepared quickly.
    - Be precise, concise, human-readable, and illustrated with examples.
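    To make two of the points above concrete ("No constraints on character data" and "Expressed in XML"), here is a minimal sketch of how the Address document from the internal DTD examples might be described in XML Schema. The four-digit pattern on Zip is an assumption chosen for illustration; the slides do not define any schema for this document.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Address">
    <xs:complexType>
      <!-- Same order as the DTD content model (Street, City, State, Zip) -->
      <xs:sequence>
        <xs:element name="Street" type="xs:string"/>
        <xs:element name="City" type="xs:string"/>
        <xs:element name="State" type="xs:string"/>
        <!-- Assumed rule: exactly four digits. A DTD can only say #PCDATA here. -->
        <xs:element name="Zip">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:pattern value="[0-9]{4}"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

    Under this sketch, <Zip>8001</Zip> would be accepted and <Zip>eight</Zip> rejected, a distinction the DTD declaration <!ELEMENT Zip (#PCDATA)> cannot make.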
  • 33. Thank You