Schema
• Schemas
  • specify the structure of an XML document
  • place constraints on its content
• This is also the purpose of a DTD
  • a schema does it better
  • its syntax is XML
  • so existing XML tools can be used
Schema
• What you can't do with DTDs
  • constrain the #PCDATA, e.g.
    • a telephone number
    • a price
    • a single word
  • precisely constrain repetition
    • up to three children on a family ticket
  • require a precise selection of elements
    • in any combination or permutation
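As a rough illustration of the three gaps listed above, here is a minimal sketch in W3C XML Schema syntax. It assumes the namespace and element names of the final 2001 Recommendation, which may differ from the draft that was current when these slides were written. The names familyTicket, adult, child, phone, price, contact and keyword, and the telephone number format, are hypothetical and chosen only for the example.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Constrain text content: a telephone number in an assumed NNN-NNNN format -->
  <xs:simpleType name="phoneType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{3}-[0-9]{4}"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- A price: a non-negative decimal with at most two fraction digits -->
  <xs:simpleType name="priceType">
    <xs:restriction base="xs:decimal">
      <xs:minInclusive value="0"/>
      <xs:fractionDigits value="2"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- A single word: text containing no whitespace -->
  <xs:simpleType name="wordType">
    <xs:restriction base="xs:string">
      <xs:pattern value="\S+"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Precisely bounded repetition: zero to three children on a family ticket -->
  <xs:element name="familyTicket">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="adult" type="xs:string" minOccurs="1" maxOccurs="2"/>
        <xs:element name="child" type="xs:string" minOccurs="0" maxOccurs="3"/>
        <xs:element name="price" type="priceType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- A fixed selection of elements that may appear in any order -->
  <xs:element name="contact">
    <xs:complexType>
      <xs:all>
        <xs:element name="phone" type="phoneType"/>
        <xs:element name="keyword" type="wordType"/>
      </xs:all>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A DTD could declare the same element names, but it could not express the pattern, the numeric bounds, or the upper limit of three.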
Schema
• XML is a meta-language for defining tag languages
• A schema is a formal specification (in XML) of the grammar for one language
  • useful for validating content and for interchange
• XML Schema is a language for writing such specifications
Schemas
• Common Vocabularies
• Shared Applications
• Network effect
• Formal Sets of Rules
• Machine-based XML processing
• Not human-based document processing
• Building Contracts
• Core rules for a series of transactions
Schemas
• DTDs
• good at describing documents
• can't manage complex data structures
• syntax is not extensible
• available tools won't work
Schemas
• Schemas build on primitive types
  • integers, floating point, strings, dates
• Types can be based on other types
  • aggregations
  • specifications
  • restrictions
  • equivalences
• Distinction between types and elements
Schemas
• Schema building is very like
  • OO data design
  • E-R diagrams
• Schemas may be complex compared to the documents
  • because humans 'intuitively understand' tag names
Schema Standards
• XML Schema (current W3C standard)
• large, full-featured, unimplemented
• XML-Data
• early contender, supported by Microsoft
• reduced set of XML-Data is part of IE5.
• DCD
• joint creation of Microsoft and IBM
• simpler version of XML-Data
Schema Standards
• SOX
• XML structures via OO-inheritance
• Schematron
• uses XSLT for schemas
• DSD
• like Schematron with simpler XML syntax
• RELAX
• based on hedge automata theory
• much simpler than XML Schema
Schema
History
Schema Problems
• Legal implications of schemas as contracts
  • Eskimo Snow and Scottish Rain: Legal Considerations of Schema Design
    • http://www.w3.org/TR/md-policy-design
  • syntactic operability with semantic fault occurs because DTDs and schemas mix
    • syntax
    • semantics
Schema Problems
• W3C standard "XML Schemas"
• Too big, too complex
• XML 1.0 spec = 30 pages, Schemas >200
• Too much, too soon
• it isn't clear that many developers are sure
what to do with this enormous toolkit today.
• Competitors
Defining A Schema (IE5)
• Take an example XML document instance
<?xml version="1.0"?>
<pizzaOrder>
  <when>18:04:30</when>
  <cost>8.75</cost>
  <pizza>Hot n Spicy</pizza>
</pizzaOrder>
Defining A Schema (IE5)
• First declare that the document uses a schema definition via the default namespace
<?xml version="1.0"?>
<pizzaOrder xmlns="x-schema:pizzaOrderSchema.xml">
  <when>18:04:30</when>
  <cost>8.75</cost>
  <pizza>Hot n Spicy</pizza>
</pizzaOrder>
Defining a Schema (IE5)
• Now create an outline schema
<Schema xmlns="urn:schemas-microsoft-com:xml-data">
  ...
</Schema>
• i.e. an XML document whose elements come from the schema language namespace
Defining a Schema (IE5)
• First we declare the kinds of elements we have
<Schema xmlns="urn:schemas-microsoft-com:xml-data">
  <ElementType name="when"/>
  <ElementType name="cost"/>
  <ElementType name="pizza"/>
  <ElementType name="pizzaOrder"/>
</Schema>
Defining a Schema (IE5)
• and specify allowable content
<Schema xmlns="urn:schemas-microsoft-com:xml-data">
  <ElementType name="when" content="textOnly"/>
  <ElementType name="cost" content="textOnly"/>
  <ElementType name="pizza" content="textOnly"/>
  <ElementType name="pizzaOrder" content="eltOnly"/>
</Schema>
• content may be textOnly, eltOnly, mixed or empty
Defining a Schema (IE5)
• and then content model
<Schema xmlns="urn:schemas-microsoft-com:xml-data">
  <ElementType name="when" content="textOnly"/>
  <ElementType name="cost" content="textOnly"/>
  <ElementType name="pizza" content="textOnly"/>
  <ElementType name="pizzaOrder" content="eltOnly">
    <element type="when"/>
    <element type="cost"/>
    <element type="pizza"/>
  </ElementType>
</Schema>
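For comparison, roughly the same content model can be sketched as a DTD in the document's internal subset. This is an illustrative equivalent rather than part of the IE5 schema, and it assumes the XML-Data defaults of sequential order and exactly one occurrence of each child. It fixes the order of the children but says nothing about the text inside them.

```xml
<?xml version="1.0"?>
<!DOCTYPE pizzaOrder [
  <!-- pizzaOrder contains exactly one when, cost and pizza, in that order -->
  <!ELEMENT pizzaOrder (when, cost, pizza)>
  <!-- Each child holds text only; a DTD cannot restrict it further -->
  <!ELEMENT when  (#PCDATA)>
  <!ELEMENT cost  (#PCDATA)>
  <!ELEMENT pizza (#PCDATA)>
]>
<pizzaOrder>
  <when>18:04:30</when>
  <cost>8.75</cost>
  <pizza>Hot n Spicy</pizza>
</pizzaOrder>
```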
Defining a Schema (IE5)
• and even the content model for text
<Schema xmlns="urn:schemas-microsoft-com:xml-data">
  <ElementType name="when" content="textOnly"
               type="time"/>
  <ElementType name="cost" content="textOnly"
               type="float"/>
  <ElementType name="pizza" content="textOnly"/>
  <ElementType name="pizzaOrder" content="eltOnly">
    <element type="when"/>
    <element type="cost"/>
    <element type="pizza"/>
...
Defining a Schema (IE5)
• But that requires another namespace
<Schema xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <ElementType name="when" content="textOnly"
               dt:type="time"/>
  <ElementType name="cost" content="textOnly"
               dt:type="float"/>
  <ElementType name="pizza" content="textOnly"/>
  <ElementType name="pizzaOrder" content="eltOnly">
    <element type="when"/>
    <element type="cost"/>
...
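Assembled from the steps above, the complete schema file might look like the following sketch. The filename pizzaOrderSchema.xml is taken from the x-schema: reference in the instance document. The lines after the elided part of the listing are an assumption based on the earlier slides.

```xml
<?xml version="1.0"?>
<!-- pizzaOrderSchema.xml: XML-Data schema for the pizzaOrder example -->
<Schema xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">

  <!-- Leaf elements: text content, typed through the dt namespace where useful -->
  <ElementType name="when"  content="textOnly" dt:type="time"/>
  <ElementType name="cost"  content="textOnly" dt:type="float"/>
  <ElementType name="pizza" content="textOnly"/>

  <!-- Container element: child elements only -->
  <ElementType name="pizzaOrder" content="eltOnly">
    <element type="when"/>
    <element type="cost"/>
    <element type="pizza"/>
  </ElementType>
</Schema>
```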
Schema Components
• Schemas build on the declarations and usage of elements and attributes
  • ElementType elements declare a kind of element
  • AttributeType elements declare a kind of attribute
  • element elements show the use of an element within the context of another element
  • attribute elements show the use of an attribute on an element
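The four components can be seen together in a minimal sketch. The size attribute is a hypothetical addition to the pizza example, introduced only to show AttributeType and attribute side by side with ElementType and element.

```xml
<Schema xmlns="urn:schemas-microsoft-com:xml-data">

  <!-- AttributeType declares a kind of attribute (hypothetical name: size) -->
  <AttributeType name="size"/>

  <!-- ElementType declares a kind of element;
       the nested attribute element says that pizza may carry size -->
  <ElementType name="pizza" content="textOnly">
    <attribute type="size"/>
  </ElementType>

  <!-- The nested element says that pizza is used inside pizzaOrder -->
  <ElementType name="pizzaOrder" content="eltOnly">
    <element type="pizza"/>
  </ElementType>
</Schema>
```

Under this sketch, an instance such as `<pizza size="large">Hot n Spicy</pizza>` inside a pizzaOrder would match the declarations.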
Schema Components
• Various attributes specify the allowable properties of each element or attribute
  • model specifies whether the element may contain 'foreign' elements, not specified in the schema
  • minOccurs and maxOccurs put lower and upper bounds on the repetition of an element
  • order specifies whether subelements must appear in the order specified, or whether only a single subelement can be chosen
  • required states that an attribute must be present
  • default gives a default value for a missing attribute
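A sketch of these attributes in use follows. The attribute values shown (closed, seq, yes, and * for an unbounded maximum) are recalled from the XML-Data Reduced dialect and should be treated as assumptions to check against the MSXML documentation. The size and crust attributes are hypothetical.

```xml
<Schema xmlns="urn:schemas-microsoft-com:xml-data">

  <!-- required: size must be present on every pizza -->
  <AttributeType name="size" required="yes"/>
  <!-- default: crust is taken to be "thin" when it is omitted -->
  <AttributeType name="crust" default="thin"/>

  <ElementType name="when" content="textOnly"/>
  <ElementType name="pizza" content="textOnly">
    <attribute type="size"/>
    <attribute type="crust"/>
  </ElementType>

  <!-- model="closed": no elements other than those declared here
       order="seq": children must appear in the order listed -->
  <ElementType name="pizzaOrder" content="eltOnly" model="closed" order="seq">
    <!-- exactly one when -->
    <element type="when" minOccurs="1" maxOccurs="1"/>
    <!-- one or more pizzas -->
    <element type="pizza" minOccurs="1" maxOccurs="*"/>
  </ElementType>
</Schema>
```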
Schema Data Types
• Microsoft's Schema provides 23 built-in data types to which textual content can conform
  • various numeric types (float, ints)
  • date, time, uri, uuid, char, hex, boolean and blob
• No derived / extended types are allowed
• A separate namespace labels the data type vocabulary
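As a small sketch of how the data type vocabulary is attached, the fragment below types three leaf elements through the dt prefix. The element names orderDate, quantity and delivered are hypothetical, and the type names date, int and boolean are assumed to be among the built-in types.

```xml
<Schema xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <!-- The dt prefix marks type as belonging to the data type namespace -->
  <ElementType name="orderDate" content="textOnly" dt:type="date"/>
  <ElementType name="quantity"  content="textOnly" dt:type="int"/>
  <ElementType name="delivered" content="textOnly" dt:type="boolean"/>
</Schema>
```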
Using Data Types
• A node's validated, typed value is directly accessible within the IE DOM
  • DOMElement.nodeTypedValue
  • instead of .nodeValue or .text
• A node's schema definition is available
  • DOMElement.definition property
  • see weather.html

Xml schema
