2. DTD Overview: DTD stands for Document Type Definition. A DTD lets an XML document go further than meeting the requirements of being well-formed by specifying the requirements to be valid. A valid XML document matches the DTD's definitions of allowable elements and attributes.
3. Validation: Validation can be done in application code (e.g., JavaScript or VB working through the DOM). A DTD instead allows the use of a validating parser that compares the document against the specification. This typically makes application changes and maintenance easier, and ties validation less closely to a particular programming language or environment.
4. Declarations: Declarations are used to specify document requirements. The kinds covered here are the document type declaration, element declarations, attribute-list declarations, and entity declarations.
5. Document Type Declaration: Includes the name of the root element and specifies where the DTD is located. The DTD can be embedded in the XML file (local), can refer to an external file through a Uniform Resource Identifier (URI), or both. Local declarations take precedence over external ones.
6. Element Declaration: An element declaration has three parts: the ELEMENT declaration keyword, the element name, and the element content. The element content can be a list of child elements or data.
7. Local DTD: The DTD is included in the XML document. Definition of a student: <!DOCTYPE student [ <!ELEMENT student (first, last, studentID)> <!ELEMENT first (#PCDATA)> <!ELEMENT last (#PCDATA)> <!ELEMENT studentID (#PCDATA)> ]> The outer <!DOCTYPE student [ ... ]> is the document type declaration; each <!ELEMENT ...> inside it is an element declaration. A student element is made up of first name, last name, and student ID elements.
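To make the local DTD concrete, here is a minimal sketch of a complete file: the student declarations from the slide followed by a document instance that satisfies them. The sample values (Ada, Lovelace, S1001) are made up for illustration.

```xml
<?xml version="1.0"?>
<!-- Local (internal) DTD followed by a document instance that matches it. -->
<!DOCTYPE student [
  <!ELEMENT student (first, last, studentID)>
  <!ELEMENT first (#PCDATA)>
  <!ELEMENT last (#PCDATA)>
  <!ELEMENT studentID (#PCDATA)>
]>
<student>
  <first>Ada</first>
  <last>Lovelace</last>
  <studentID>S1001</studentID>
</student>
```

The name after DOCTYPE (student) has to match the root element of the instance.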
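8. External Definition: The DTD exists in an external file or location, and a keyword specifies the type of location. SYSTEM is followed by a system identifier, a URI that typically points to a file on the local file system or on a web server. PUBLIC is followed by a public identifier that is looked up through a catalog. PUBLIC is always used together with a system identifier, so if the catalog reference can't be found, the parser can use the specified file.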
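As an illustration of PUBLIC and SYSTEM identifiers used together, the fragments below show two document type declarations taken from two different documents. The first is the well-known XHTML 1.0 Strict declaration; the public identifier in the second is hypothetical and only mirrors the pattern for the student DTD.

```xml
<!-- Fragment 1 (real-world): the XHTML 1.0 Strict document type declaration.
     The public identifier comes first, the system identifier second. -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<!-- Fragment 2 (hypothetical public identifier): same pattern for the student DTD.
     If no catalog entry matches, the parser can fall back to student.dtd. -->
<!DOCTYPE student PUBLIC "-//Example//DTD Student 1.0//EN" "student.dtd">
```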
9. Sample External Definition: Document type declaration in the XML file: <!DOCTYPE student SYSTEM "student.dtd"> Element declarations in the external file student.dtd: <!ELEMENT student (first, last, studentID)> <!ELEMENT first (#PCDATA)> <!ELEMENT last (#PCDATA)> <!ELEMENT studentID (#PCDATA)> The external file contains only the declarations; it has no <!DOCTYPE ... [ opener and no closing ]>.
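A sketch of how an external DTD and local declarations can be combined, which is where "local takes precedence" matters. It assumes student.dtd sits next to the XML file and holds the element declarations shown on the slide; the campus entity is hypothetical and exists only to show the precedence rule.

```xml
<?xml version="1.0"?>
<!DOCTYPE student SYSTEM "student.dtd" [
  <!-- Local declarations are read before student.dtd, so this one wins
       if student.dtd also declares an entity named campus (hypothetical). -->
  <!ENTITY campus "Main">
]>
<student>
  <first>Ada</first>
  <last>Lovelace</last>
  <studentID>S1001 (&campus;)</studentID>
</student>
```

Precedence of this kind applies to entity and attribute-list declarations. Declaring the same element in both places is a validity error rather than an override.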
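10. Working With Elements: The element name must match the name used in the XML document. If namespaces are used, the prefixes must match as well, because DTDs are not namespace-aware. The content model defines what the element can store: element content (child elements only), mixed content (i.e., data and elements), EMPTY, or ANY.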
11. Element Content: One element can contain another. The contained elements can be specified as a sequence or as a choice.
12. Content by Sequence: An error is raised if an element is missing, if there are extra elements, or if the elements appear in a different order. For a student, the content must be in first, last, studentID order. If an element "major" is found, that is an error; if the order varies, error; if first, last, or studentID is missing, error.
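The sequence rules above can be sketched with one valid instance plus, in a comment, three variants that a validating parser would be expected to reject. The sample values are invented.

```xml
<?xml version="1.0"?>
<!DOCTYPE student [
  <!ELEMENT student (first, last, studentID)>
  <!ELEMENT first (#PCDATA)>
  <!ELEMENT last (#PCDATA)>
  <!ELEMENT studentID (#PCDATA)>
]>
<!-- Valid: all three children, in the declared order. -->
<student>
  <first>Ada</first>
  <last>Lovelace</last>
  <studentID>S1001</studentID>
</student>
<!--
  Invalid variants of the root element:
  1. Wrong order:    <student><last>..</last><first>..</first><studentID>..</studentID></student>
  2. Missing child:  <student><first>..</first><last>..</last></student>
  3. Extra element:  <student><first>..</first><last>..</last><studentID>..</studentID><major>..</major></student>
-->
```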
13. Content by Choice: Content can be allowed to vary between elements; | (vertical bar or pipe) indicates OR. To add a grade element to a student that can be a letter or a percent: <!ELEMENT grade (letter | percent)> <!ELEMENT letter (#PCDATA)> <!ELEMENT percent (#PCDATA)> This indicates that grade must contain either a letter element or a percent element.
14. By Choice (Example): A name may be a full name (first, middle, last) or just first and last: <!ELEMENT name (fullName | (first, last))> <!ELEMENT fullName (first, middle, last)> <!ELEMENT first (#PCDATA)> <!ELEMENT middle (#PCDATA)> <!ELEMENT last (#PCDATA)>
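A short sketch of what the two branches of the name choice look like in a document. The instance uses the (first, last) branch; the fullName branch is shown in a comment. The names are sample data.

```xml
<?xml version="1.0"?>
<!DOCTYPE name [
  <!ELEMENT name (fullName | (first, last))>
  <!ELEMENT fullName (first, middle, last)>
  <!ELEMENT first (#PCDATA)>
  <!ELEMENT middle (#PCDATA)>
  <!ELEMENT last (#PCDATA)>
]>
<!-- Second branch of the choice: first and last directly inside name. -->
<name>
  <first>Grace</first>
  <last>Hopper</last>
</name>
<!-- First branch would instead be:
     <name><fullName><first>Grace</first><middle>Brewster</middle><last>Hopper</last></fullName></name> -->
```

A name element that contained both a fullName and a separate first/last pair would be invalid, because the choice allows exactly one branch.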
15. Mixed Content: Allows a combination of elements and parsed character data, so additional information (e.g., how to display the text) can be included within an element. Rules: the model is written as a choice (or); #PCDATA must appear first in the list; the list cannot include an inner content model (only simple element names); if there are child elements, the group must end with *, which indicates that its members may appear zero or more times.
16. Mixed Content (2): To include emphasis with the letter grade: Data: <letter><em>4</em></letter> Declaration: <!ELEMENT letter (#PCDATA | em)*> This describes a letter element whose content is character data (PCDATA) plus any number of em (emphasis) elements.
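A minimal sketch of mixed content in a full document, showing text and an em element interleaved. The declaration of em as (#PCDATA) is an assumption, since the slide does not declare it, and the grade text is sample data.

```xml
<?xml version="1.0"?>
<!DOCTYPE letter [
  <!ELEMENT letter (#PCDATA | em)*>
  <!-- Assumed declaration: the slide does not say how em is defined. -->
  <!ELEMENT em (#PCDATA)>
]>
<letter>Final grade: <em>A</em> (pending review)</letter>
```

Because of the trailing *, a letter element with only text, only em elements, or nothing at all would also be valid.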
17. Empty Content: An element can be empty, for example <br /> (it never has child elements or content). The declaration includes EMPTY: <!ELEMENT br EMPTY> This means that the element CANNOT contain content.
18. Any Content: An element can contain any kind of value (or be empty). Any elements declared in the DTD can occur, any number of times, but only elements that are part of the DTD can be part of the document! The element may be empty and may contain PCDATA. ANY is the least restrictive model.
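For illustration, a sketch of an ANY content model. The notes element and its text are hypothetical; em is declared so that it is allowed to appear inside notes.

```xml
<?xml version="1.0"?>
<!DOCTYPE notes [
  <!ELEMENT notes ANY>
  <!ELEMENT em (#PCDATA)>
]>
<!-- Text and any declared element may appear, in any order and any number of times. -->
<notes>Transferred in <em>2023</em>. <em>Honors</em> track.</notes>
```

An undeclared child such as <strong> inside notes would still be invalid, since ANY only admits elements that the DTD declares.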
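19. Cardinality: How many times can an element occur? How many times must an element occur? An indicator after the element name answers both: no indicator means exactly once, ? means zero or one, * means zero or more, and + means one or more.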
20. Cardinality (Example): A student must have a first name, may or may not have a last name, and may have one or more majors, or none (undeclared): <!ELEMENT student (first, last?, major*)> Note: a cardinality indicator doesn't affect the declaration of the element it is attached to (i.e., major is still declared as usual).
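A sketch of the cardinality example as a full document. The child element declarations as (#PCDATA) are assumptions, and the values are sample data. The instance omits last (allowed by ?) and repeats major (allowed by *).

```xml
<?xml version="1.0"?>
<!DOCTYPE student [
  <!ELEMENT student (first, last?, major*)>
  <!ELEMENT first (#PCDATA)>
  <!ELEMENT last (#PCDATA)>
  <!ELEMENT major (#PCDATA)>
]>
<student>
  <first>Ada</first>
  <major>Mathematics</major>
  <major>Computer Science</major>
</student>
```

A student containing only a first element would also be valid. Writing major+ instead of major* would require at least one major.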
21. Attributes and DTDs: Elements tend to be used to describe a logical unit of information; attributes are typically used to store data about characteristics (properties). For example, a Movie element may have attributes for Title, Rental Price, and Rental Days. There are no specific rules about when to use elements and when to use attributes.
22. Attributes and Elements: Attributes allow more limits on data: they can have a list of acceptable values, a default value, and some ability to specify a data type, and they are concise (a single name/value pair). Attributes also have limits: they aren't suited to long strings of text, they can't nest values, and their whitespace can't be ignored.
23. Specifying Attributes: Declaration: <!ATTLIST ElementName AttrName AttrType Default> Specify the element the attribute belongs to, the name of the attribute, the type of data the attribute stores, and the characteristics of its values: either the default value or another characteristic of the value, such as required or optional.
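To show the ATTLIST pattern filled in, here is a sketch based on the Movie example mentioned earlier. The attribute names (title, rentalPrice, rentalDays), the default of 3 days, and the sample values are all hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE movie [
  <!ELEMENT movie EMPTY>
  <!-- One ATTLIST can declare several attributes for the same element. -->
  <!ATTLIST movie
    title       CDATA #REQUIRED
    rentalPrice CDATA #IMPLIED
    rentalDays  CDATA "3">
]>
<!-- rentalDays is omitted here, so a validating parser supplies the default "3". -->
<movie title="Metropolis" rentalPrice="2.99"/>
```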
24. Sample Attribute Data Types: CDATA: character data (any text string). Enumerated: a series/list of allowed string values. ENTITY/ENTITIES: reference(s) to unparsed entity definition(s) in the DTD. ID: unique identifier for the element. IDREF: refers to the ID of another element. IDREFS: list of IDs of other elements, separated by whitespace. NMTOKEN/NMTOKENS: value(s) made up only of XML name characters (unlike a full XML name, a name token may start with a digit).
25. Enumerated Attributes: Specifies that the attribute value must be found in a particular list. Each value in the list must be a valid XML name token, which limits spaces and special characters. Use | (pipe) to separate members of the list. To specify a list of letter grades for a student: <!ATTLIST student grade (A | B | C | D | F | V | W | I) #IMPLIED> Here student is the element, grade is the attribute, and the values in parentheses are the enumerated list.
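A sketch of the enumerated grade attribute in use. To keep the example short, student is declared here as plain (#PCDATA), which is a simplification rather than the definition used on the earlier slides; the name is sample data.

```xml
<?xml version="1.0"?>
<!DOCTYPE student [
  <!-- Simplified content model, for this example only. -->
  <!ELEMENT student (#PCDATA)>
  <!ATTLIST student grade (A | B | C | D | F | V | W | I) #IMPLIED>
]>
<!-- Valid: B is in the list. grade="E" or grade="b" would be invalid (values are case-sensitive). -->
<student grade="B">Ada Lovelace</student>
```

Because each value must be a name token, a value such as A+ could not be added to the list: + is not an XML name character.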
26. ID, IDREF, IDREFS: An ID attribute specifies that the element must have a value that is unique within the document, which gives a reliable way to refer to a specific element. The value must be an XML name, so no spaces are allowed (typically a space is replaced with an underscore) and it cannot start with a digit. An element's attribute list can include only one ID attribute. IDREF and IDREFS allow an element to be associated with another element or with multiple other elements. A student element must have a student ID: <!ATTLIST student studentID ID #REQUIRED>
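A sketch of ID, IDREF, and IDREFS working together. The roster and team elements and the captain and members attributes are hypothetical; only the studentID declaration comes from the slide.

```xml
<?xml version="1.0"?>
<!DOCTYPE roster [
  <!ELEMENT roster (student+, team*)>
  <!ELEMENT student (#PCDATA)>
  <!ATTLIST student studentID ID #REQUIRED>
  <!ELEMENT team EMPTY>
  <!ATTLIST team
    captain IDREF  #REQUIRED
    members IDREFS #REQUIRED>
]>
<roster>
  <student studentID="S1001">Ada Lovelace</student>
  <student studentID="S1002">Grace Hopper</student>
  <!-- captain must match one ID in the document; members is a whitespace-separated list of IDs. -->
  <team captain="S1001" members="S1001 S1002"/>
</roster>
```

A validating parser would be expected to reject a second student with studentID="S1001" (duplicate ID) or a team whose captain is "S9999" (no matching ID).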
27. Entities and Attributes: Attributes can refer to entities. "Entity" refers to substituting a reference for a text value: &amp; refers to the & character. An unparsed entity is a reference to content that the XML parser doesn't parse. References can be reused for long values or for hard-to-manage characters (e.g., tab, line feed). Apart from the predefined ones such as &amp;, an entity must be declared in the DTD: <!ENTITY classTitle "XML"> When &classTitle; is found in the document, it is replaced with XML.
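A sketch of the classTitle entity used both in element content and inside an attribute value. The course element and its title attribute are hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE course [
  <!ELEMENT course (#PCDATA)>
  <!ATTLIST course title CDATA #REQUIRED>
  <!ENTITY classTitle "XML">
]>
<!-- &classTitle; expands to XML; &amp; is predefined and expands to the & character. -->
<course title="Intro to &classTitle;">&classTitle; &amp; DTDs</course>
```

After parsing, the title attribute reads "Intro to XML" and the element content reads "XML & DTDs".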
28. Attribute Value Declarations: Specifies how the value will appear in the document; a value declaration must always be given. A default value, written in quotes with no keyword, sets the value for an attribute if the document doesn't provide one. #FIXED followed by a quoted value sets a value that must occur; if the attribute has a different value, a validation error occurs. #REQUIRED specifies that the attribute (and a value) must exist. #IMPLIED means the attribute is optional.
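The four value declarations can be sketched side by side on one element. The nickname, status, and school attributes are hypothetical, as are their values, and student is again simplified to (#PCDATA).

```xml
<?xml version="1.0"?>
<!DOCTYPE student [
  <!ELEMENT student (#PCDATA)>
  <!ATTLIST student
    studentID ID              #REQUIRED
    nickname  CDATA           #IMPLIED
    status    (active | away) "active"
    school    CDATA           #FIXED "Example College">
]>
<!-- Only the required attribute is written out: nickname is optional,
     status falls back to its default, and school takes its fixed value. -->
<student studentID="S1001">Ada Lovelace</student>
```

Writing school="Other College" on the element would be a validation error, while omitting studentID would violate #REQUIRED.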