Document Type Definition XML http://yht4ever.blogspot.com [email_address] B070066 - NIIT Quang Trung 07/2007
Contents
  Parsers, well-formed and valid XML documents
  Introduction to DTD
  Assign a DTD to an XML document
  Declare a DTD document
Parsers, Well-formed and Valid XML Documents
  Parsers
    Validating
      Able to read the DTD
      Determines whether the XML document conforms to the DTD
        A valid document conforms to its DTD
        A valid document is well formed, by definition
        A document can be well formed but not valid
    Nonvalidating
      Able to read the DTD
      Cannot check the document against the DTD for conformity
Parsers, Well-formed and Valid XML Documents
  Documents must be well formed
    Document contains a single root element
    Elements are balanced and properly nested
    Attribute values are specified and quoted
    Text content contains only legal XML characters
  Documents may be valid
    Document structure and content follow the rules specified by a grammar (e.g. DTD, XML Schema)
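As a rough illustration of the difference between well formed and valid, the sketch below uses Python's standard-library xml.etree.ElementTree, which is built on the nonvalidating Expat parser. The sample documents and the helper name is_well_formed are made up for this example and are not part of the original slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the (nonvalidating) parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Single root, balanced and nested elements, quoted attribute value.
good = '<students><student id="1">An</student></students>'

# Not well formed: <student> is never closed before </students>.
unbalanced = '<students><student id="1">An</students>'

# Not well formed: two root elements.
two_roots = '<student/><student/>'

# Well formed but NOT valid: the DTD allows only text inside <students>,
# yet the document nests a <student> element. A nonvalidating parser
# reads the DTD but does not report the mismatch.
well_formed_not_valid = (
    '<!DOCTYPE students [ <!ELEMENT students (#PCDATA)> ]>'
    '<students><student/></students>'
)

for doc in (good, unbalanced, two_roots, well_formed_not_valid):
    print(is_well_formed(doc))
```

Checking validity itself requires a validating parser, which the Python standard library does not provide.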
What is a DTD?
  Document Type Definition
  Defined in the XML 1.0 specification
  Allows users to create new document grammars
  A subset borrowed from SGML
  Uses non-XML syntax
  Document-centric
    Focuses on document structure
    Lacks “normal” datatypes (e.g. int, float)
Document Type Declaration
  Introduces DTDs into XML documents
  Placed in the XML document's prolog
  Begins with <!DOCTYPE and ends with >
  Can point to
    External subsets
      Declarations outside the document
      Exist in a different file, typically ending with the .dtd extension
    Internal subsets
      Declarations inside the document
      Visible only within the document in which they reside
Document Type Declaration Example
  Internal subset
    <!DOCTYPE students [ <!ELEMENT students (#PCDATA)> ]>
  External subset, system identifier
    <!DOCTYPE students SYSTEM "students.dtd">
  External subset, public and system identifiers
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
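To see how a parser reads the three forms of document type declaration, the following sketch uses the Expat bindings in Python's standard library (xml.parsers.expat) and its StartDoctypeDeclHandler callback. The helper name read_doctype and the empty root elements appended to each declaration are assumptions added for illustration.

```python
import xml.parsers.expat


def read_doctype(text):
    """Return (name, system_id, public_id, has_internal_subset)."""
    found = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found.append((name, system_id, public_id, bool(has_internal_subset)))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(text, True)
    return found[0]


# Internal subset: the declarations live inside the document itself.
internal = '<!DOCTYPE students [ <!ELEMENT students (#PCDATA)> ]><students/>'

# External subset named by a system identifier (not fetched by default).
external = '<!DOCTYPE students SYSTEM "students.dtd"><students/>'

# External subset named by a public identifier plus a system identifier.
public = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    '"http://www.w3.org/TR/html4/strict.dtd"><HTML/>'
)

for doc in (internal, external, public):
    print(read_doctype(doc))
```

Note that the name after <!DOCTYPE matches the root element in each sample document.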
Exercises
  Assign myschool.dtd to myschool.xml and validate it.
  Repeat with myschool1.xml.
Element Type Definition
  Declares elements in XML documents
  Begins with <!ELEMENT and ends with >
    <!ELEMENT myElement ( #PCDATA ) >
  myElement is the generic identifier
  Parentheses specify the element's content (content specification)
  Keyword #PCDATA
    Element contains parsed character data (text only, no child elements)
  Do not declare the same element name in more than one element type declaration
Sequences, Pipe Characters and Occurrence Indicators
  Sequences
    Specify the order in which elements occur
    Comma ( , ) used as delimiter
      <!ELEMENT classroom ( teacher, student ) >
  Pipe characters ( | )
    Specify choices
      <!ELEMENT dessert ( ice-Cream | pastry ) >
Sequences, Pipe Characters and Occurrence Indicators
  Occurrence indicators
    Specify how often an element may occur
    No indicator means the element occurs exactly once
    Plus sign ( + ) indicates one or more occurrences
      <!ELEMENT album ( song+ ) >
    Asterisk ( * ) indicates zero or more occurrences
      <!ELEMENT library ( book* ) >
    Question mark ( ? ) indicates zero or one occurrence (optional element)
      <!ELEMENT seat ( person? ) >
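Sequences, choices and occurrence indicators behave much like a regular expression over the list of child element names. The sketch below makes that analogy concrete in Python; it is a simplified illustration, not a DTD validator. The function names compile_model and children_match are hypothetical, and the translation handles only element content (no #PCDATA, EMPTY or ANY).

```python
import re

# Simplified pattern for an XML element name (an assumption; real names
# allow more characters).
NAME = re.compile(r"[A-Za-z_][\w.\-]*")


def compile_model(model):
    """Translate a DTD content model such as '(name, address?)' to a regex."""
    compact = model.replace(" ", "")
    # Each element name becomes a group that consumes "name;" from the
    # encoded child list.
    pattern = NAME.sub(lambda m: "(?:" + re.escape(m.group()) + ";)", compact)
    # A sequence is plain concatenation in a regex, so DTD commas are dropped.
    # The characters | ? * + and parentheses keep their meaning.
    return re.compile(pattern.replace(",", ""))


def children_match(model, children):
    """Check a list of child element names against a content model."""
    encoded = "".join(name + ";" for name in children)
    return compile_model(model).fullmatch(encoded) is not None


print(children_match("( teacher, student )", ["teacher", "student"]))  # order matters
print(children_match("( ice-Cream | pastry )", ["pastry"]))            # one of the choices
print(children_match("( song+ )", []))                                 # + needs at least one
print(children_match("( book* )", []))                                 # * allows none
print(children_match("( person? )", ["person", "person"]))             # ? allows at most one
```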
Content specification types
  EMPTY
    Elements do not contain character data
    Elements do not contain child elements
      <!ELEMENT oven EMPTY>
    Markup for the oven element
      <oven/>
  Mixed content
    Combination of elements and #PCDATA
      <!ELEMENT myMessage ( #PCDATA | message )* >
    Markup for myMessage
      <myMessage>
        Here is some text, some
        <message>other text</message> and
        <message>even more text</message>
      </myMessage>
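The following sketch shows how the mixed-content example looks to a program once parsed, using Python's standard-library xml.etree.ElementTree. The declaration of the message element is an assumption added so the internal subset is complete, and the whitespace of the sample has been tightened.

```python
import xml.etree.ElementTree as ET

doc = (
    '<!DOCTYPE myMessage [\n'
    '  <!ELEMENT myMessage ( #PCDATA | message )* >\n'
    '  <!ELEMENT message ( #PCDATA ) >\n'   # assumed declaration
    ']>\n'
    '<myMessage>Here is some text, some '
    '<message>other text</message> and '
    '<message>even more text</message></myMessage>'
)

root = ET.fromstring(doc)

# Text and child elements interleave: ElementTree keeps the text before
# the first child in root.text and the text after each child in child.tail.
pieces = [("text", root.text)]
for child in root:
    pieces.append((child.tag, child.text))
    if child.tail:
        pieces.append(("text", child.tail))

for kind, value in pieces:
    print(kind, repr(value))
```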
Content specification types
  ANY
    Can contain any content
      #PCDATA, elements or a combination
      Can also be an empty element
    Commonly used in early DTD-development stages
      Replace with specific content as the DTD evolves
Exercises
  Create a DTD for students.xml
Attribute Type Definition
  Specifies an element's attribute list
  Uses the ATTLIST attribute-list declaration
    <!ATTLIST elementname attributename valuetype [#REQUIRED | #IMPLIED | #FIXED] ["default"]>
  Example
    <!ELEMENT students (student*)>
    <!ELEMENT student (name, address?)>
    <!ELEMENT name (#PCDATA)>
    <!ELEMENT address (#PCDATA)>
    <!ATTLIST student id CDATA #REQUIRED>
Attribute Type Definition (cont.)
  Attribute defaults
    Specify the attribute's default value
    #IMPLIED
      Attribute is optional and the DTD supplies no value
      The application may use its own default if the attribute is not specified
    #REQUIRED
      Attribute must appear in the element
      Document is not valid if the attribute is missing
    #FIXED
      Attribute value is constant
      Attribute value cannot differ in the XML document
Attribute Type Definition (cont.)
  Example
    <!ELEMENT students (student*)>
    <!ELEMENT student (name, address?)>
    <!ELEMENT name (#PCDATA)>
    <!ELEMENT address (#PCDATA)>
    <!ATTLIST student id CDATA #REQUIRED >
    <!ATTLIST student center-id CDATA #FIXED "30027" >
    <!ATTLIST student country-id CDATA #IMPLIED >
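A short sketch of how these declarations affect what a program sees, using Python's standard-library xml.etree.ElementTree (built on the nonvalidating Expat parser). The student data is invented for illustration. Expat fills in the #FIXED value from the internal subset, leaves the #IMPLIED attribute absent, and does not complain about the missing #REQUIRED attribute, so that check is done by hand here.

```python
import xml.etree.ElementTree as ET

doc = """<!DOCTYPE students [
  <!ELEMENT students (student*)>
  <!ELEMENT student (name, address?)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT address (#PCDATA)>
  <!ATTLIST student id CDATA #REQUIRED>
  <!ATTLIST student center-id CDATA #FIXED "30027">
  <!ATTLIST student country-id CDATA #IMPLIED>
]>
<students>
  <student id="s1"><name>An</name></student>
  <student><name>Binh</name></student>
</students>"""

root = ET.fromstring(doc)
first, second = root.findall("student")

# The #FIXED value is supplied by the parser even though neither
# element spells it out; the #IMPLIED attribute simply stays absent.
print(first.attrib)
print(second.attrib)

# A nonvalidating parser does not enforce #REQUIRED, so check by hand.
missing_id = [s.findtext("name") for s in root.findall("student")
              if "id" not in s.attrib]
print(missing_id)
```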
Attribute Type Definition (cont.)
  Attribute types
    Strings ( CDATA )
      No constraints on attribute values
      Except that < and & may not appear literally, and the quote character that delimits the value must be escaped
    Tokenized attributes
      Constraints on permissible characters for attribute values
    Enumerated attributes
      Most restrictive
      Take only one of the values listed in the attribute declaration
Attribute Type Definition (cont.)
  Tokenized attribute types ( ID, IDREF, ENTITY, NMTOKEN )
    Restrict attribute values
    ID
      Uniquely identifies an element
    IDREF
      Refers to an element by the value of its ID attribute
Attribute Type Definition (cont.)
  ENTITY tokenized attribute type
    Indicates that the attribute has an entity for its value
    (An attribute of type ENTITY must name an unparsed entity declared in the DTD; the example here shows a general entity referenced in element content)
  Entity declaration
    <!ENTITY digits "0123456789" >
  Entity may be used as follows:
    <useAnEntity>&digits;</useAnEntity>
  Entity reference &digits; is replaced by its value
    <useAnEntity>0123456789</useAnEntity>
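The replacement of &digits; can be observed with Python's standard-library xml.etree.ElementTree, as in the sketch below. Wrapping the entity declaration in an internal subset for the useAnEntity root is an assumption made so that the sample is a complete document.

```python
import xml.etree.ElementTree as ET

doc = (
    '<!DOCTYPE useAnEntity [ <!ENTITY digits "0123456789"> ]>'
    '<useAnEntity>&digits;</useAnEntity>'
)

root = ET.fromstring(doc)
# The parser has already replaced the reference with the entity's value.
print(root.text)

# Without the declaration, the reference is undefined and parsing fails.
try:
    ET.fromstring('<useAnEntity>&digits;</useAnEntity>')
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False
print(undeclared_ok)
```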
Attribute Type Definition (cont.)
  Enumerated attribute types
    Declare a list of possible values for the attribute
      <!ATTLIST person gender ( M | F ) "F" >
    Attribute gender can have either value M or F
    F is the default value
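The sketch below shows the default value of an enumerated attribute being applied, again with Python's standard-library xml.etree.ElementTree. The people wrapper element, the EMPTY content of person and the set ALLOWED_GENDERS are assumptions for this example. Because the underlying parser is nonvalidating, a value outside the list is not rejected, so the membership check is done by hand.

```python
import xml.etree.ElementTree as ET

ALLOWED_GENDERS = {"M", "F"}  # mirrors ( M | F ) in the ATTLIST below

doc = """<!DOCTYPE people [
  <!ELEMENT people (person*)>
  <!ELEMENT person EMPTY>
  <!ATTLIST person gender ( M | F ) "F">
]>
<people>
  <person gender="M"/>
  <person/>
  <person gender="X"/>
</people>"""

root = ET.fromstring(doc)

# The second person has no gender attribute, so the default "F" is used.
genders = [person.get("gender") for person in root]
print(genders)

# "X" is not in the enumeration; a validating parser would reject it.
not_allowed = [g for g in genders if g not in ALLOWED_GENDERS]
print(not_allowed)
```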
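Recursive DTDs
  We want to capture a person with a mother and a father
  First attempt:
    <!ELEMENT person (name, address, person, person)>
    where the first person is the mother while the second is the father
    Every person must then contain two more persons, so no finite document can be valid
  Second attempt:
    <!ELEMENT person (name, address, person?, person?)>
    The recursion can now end, but if only one parent is present it is unclear which one it is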
Recursive DTDs (cont.)
  Third attempt:
    <!ELEMENT person (name, address)>
    <!ATTLIST person id     ID    #REQUIRED
                     mother IDREF #IMPLIED
                     father IDREF #IMPLIED>
    Parents are referenced by ID instead of being nested
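With the third attempt, an application follows mother and father references by looking up IDs. The sketch below does this by hand in Python, since the standard-library parser does not enforce ID uniqueness or IDREF targets. The people wrapper element, the sample data and the function name check_references are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

doc = """<people>
  <person id="p1" mother="p2" father="p3"><name>Child</name><address>A</address></person>
  <person id="p2"><name>Mother</name><address>B</address></person>
  <person id="p3"><name>Father</name><address>C</address></person>
  <person id="p4" mother="p9"><name>Stranger</name><address>D</address></person>
</people>"""


def check_references(root):
    """Index persons by ID and report duplicate IDs and dangling IDREFs."""
    by_id = {}
    problems = []
    for person in root.iter("person"):
        pid = person.get("id")
        if pid in by_id:
            problems.append("duplicate ID " + pid)
        by_id[pid] = person
    for person in root.iter("person"):
        for role in ("mother", "father"):
            ref = person.get(role)
            if ref is not None and ref not in by_id:
                problems.append(
                    "%s: %s refers to unknown ID %s" % (person.get("id"), role, ref)
                )
    return by_id, problems


by_id, problems = check_references(ET.fromstring(doc))

# Follow the IDREF from the child to the mother element.
mother = by_id[by_id["p1"].get("mother")]
print(mother.findtext("name"))
print(problems)
```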
More…
  NMTOKEN?
  <!NOTATION> ?
  Conditional sections?
Q&A
  Feel free to post questions at http://yht4ever.blogspot.com or email to: [email_address]
http://yht4ever.blogspot.com Thank You !
