Jan 21, 2016
JAXB
Java Architecture for XML Binding
What is JAXB?
 JAXB is Java Architecture for XML Binding
 SAX and DOM are generic XML parsers
 They will parse any well-formed XML
 JAXB creates a parser that is specific to your DTD
 A JAXB parser will parse only valid XML (as defined by your
DTD)
 DOM and JAXB both produce a tree in memory
 DOM produces a generic tree; everything is a Node
 JAXB produces a tree of Objects with names and attributes as
described by your DTD
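To make the "generic tree" point concrete, here is a minimal sketch that uses the DOM parser bundled with the JDK. The class name DomTitleReader and the sample document are assumptions made for illustration. Every lookup goes through Node objects and string tag names; with a JAXB-style tree you would instead call something like book.getTitle().

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper: reads <title> from a <book> document using generic DOM calls.
class DomTitleReader {
    static String readTitle(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            // The element is found by a string name; nothing is checked at compile time.
            Node title = root.getElementsByTagName("title").item(0);
            return title == null ? null : title.getTextContent();
        } catch (Exception e) {
            // DOM accepts any well-formed XML; this is thrown only for malformed input.
            throw new IllegalArgumentException("XML is not well-formed", e);
        }
    }
}
```

A misspelled tag name such as "titel" would compile and simply return null at run time, which is the kind of error a DTD-specific class avoids.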
Advantages and disadvantages
 Advantages:
 JAXB requires a DTD

Using JAXB ensures the validity of your XML
 A JAXB parser is actually faster than a generic SAX parser
 A tree created by JAXB is smaller than a DOM tree
 It’s much easier to use a JAXB tree for application-specific code
 You can modify the tree and save it as XML
 Disadvantages:
 JAXB requires a DTD

Hence, you cannot use JAXB to process generic XML (for example, if
you are writing an XML editor or other tool)
 You must do additional work up front to tell JAXB what kind of tree
you want it to construct

But this more than pays for itself by simplifying your application
 JAXB is new: Version 1.0 dates from Q4 (fourth quarter) 2002
How JAXB works
 JAXB takes as input two files: your DTD and a binding
schema (which you also write)
 A binding schema is an XML document written in a “binding
language” defined by JAXB (with extension .xjs)
 A binding schema is used to customize the JAXB output
 Your binding schema can be very simple or quite complex
 JAXB produces as output Java source code which you
compile and add to your program
 Your program will use the specific classes generated by JAXB
 Your program can then read and write XML files
 JAXB also provides an API for working directly with XML
 Some examples in this lecture are taken from the JAXB User’s guide,
http://java.sun.com/xml/jaxb/docs.html
A first example
 The DTD: <!ELEMENT book (title, author, chapter+) >
<!ELEMENT title (#PCDATA) >
<!ELEMENT author (#PCDATA)>
<!ELEMENT chapter (#PCDATA) >
 The schema: <xml-java-binding-schema>
<element name="book" type="class" root="true" />
</xml-java-binding-schema>
 The results: public Book(); // constructor
public String getTitle();
public void setTitle(String x);
public String getAuthor();
public void setAuthor(String x);
public List getChapter();
public void deleteChapter();
public void emptyChapter();
Note 1: In these slides we only show the class outline, but JAXB creates a complete class for you
Note 2: JAXB constructs names based on yours, with good capitalization style
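For readers who want to see the outline filled in, here is a hand-written approximation of what a complete Book class could look like. This is not actual JAXB output. The generic List<String> type and the exact behaviour of deleteChapter() and emptyChapter() are assumptions; the slides only give the method signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written sketch of the outline above; real generated code would differ.
class Book {
    private String title;
    private String author;
    private List<String> chapter = new ArrayList<>();

    public Book() { }

    public String getTitle() { return title; }
    public void setTitle(String x) { title = x; }

    public String getAuthor() { return author; }
    public void setAuthor(String x) { author = x; }

    // chapter+ in the DTD becomes a list-valued property.
    public List<String> getChapter() { return chapter; }

    // Assumed semantics: discard the list entirely.
    public void deleteChapter() { chapter = null; }

    // Assumed semantics: replace the list with a fresh empty one.
    public void emptyChapter() { chapter = new ArrayList<>(); }
}
```

Usage would look like: create a Book, call setTitle("Emma"), then getChapter().add("Chapter 1").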
Adding complexity
 Adding a choice can reduce the usefulness of the parser
 <!ELEMENT book (title, author, (prologue | preface), chapter+)>
<!ELEMENT prologue (#PCDATA) >
<!ELEMENT preface (#PCDATA) >
 With the same binding schema, this gives:

public Book();
public List getContent();
public void deleteContent();
public void emptyContent();
 An improved binding schema can give better results
Improving the binding schema
 <xml-java-binding-schema>
<element name="book" type="class" root="true">
<content>
<element-ref name="title" />
<element-ref name="author" />
<choice property="prologue-or-preface" />
</content>
</element>
</xml-java-binding-schema>
 Result is same as the original, plus methods for the choice:
 public Book(); // constructor
. . .
public void emptyChapter();
public MarshallableObject getPrologueOrPreface();
public void setPrologueOrPreface(MarshallableObject x);
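A sketch of how application code might consume the choice property. The MarshallableObject class below is an empty placeholder, not the real JAXB base class, and the sketch assumes that prologue and preface are bound as classes (Prologue and Preface are hypothetical names). The class is called ChoiceBook only to keep the example self-contained.

```java
// Placeholder standing in for JAXB's base class; the real one has marshalling logic.
abstract class MarshallableObject { }

// Hypothetical classes for the two alternatives of the choice.
class Prologue extends MarshallableObject {
    final String text;
    Prologue(String text) { this.text = text; }
}

class Preface extends MarshallableObject {
    final String text;
    Preface(String text) { this.text = text; }
}

class ChoiceBook {
    private MarshallableObject prologueOrPreface;

    public MarshallableObject getPrologueOrPreface() { return prologueOrPreface; }
    public void setPrologueOrPreface(MarshallableObject x) { prologueOrPreface = x; }

    // The caller must test which alternative is present.
    static String describe(ChoiceBook b) {
        MarshallableObject front = b.getPrologueOrPreface();
        if (front instanceof Prologue) return "prologue: " + ((Prologue) front).text;
        if (front instanceof Preface) return "preface: " + ((Preface) front).text;
        return "none";
    }
}
```

The single property holds whichever alternative the document contained, so the type test in describe() is unavoidable with this binding.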
Marshalling
 marshal, v.t.: to place or arrange in order
 marshalling: the process of producing an XML
document from Java objects
 unmarshalling: the process of producing a content tree
from an XML document
 JAXB only allows you to unmarshal valid XML
documents
 JAXB only allows you to marshal valid content trees
into XML
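The two directions can be illustrated without JAXB itself. The sketch below is hand-written: BookCodec, its Book class, and the simplified validate() check are all assumptions, and the JDK's DOM parser stands in for the generated parser. It shows unmarshalling (XML to content tree), marshalling (content tree to XML), and validation guarding both.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical codec for the DTD rule: book (title, author, chapter+)
class BookCodec {
    static class Book {
        String title;
        String author;
        final List<String> chapters = new ArrayList<>();
    }

    // Unmarshalling: XML document -> content tree.
    static Book unmarshal(String xml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("XML is not well-formed", e);
        }
        Book b = new Book();
        b.title = firstText(root, "title");
        b.author = firstText(root, "author");
        NodeList chapters = root.getElementsByTagName("chapter");
        for (int i = 0; i < chapters.getLength(); i++) {
            b.chapters.add(chapters.item(i).getTextContent());
        }
        validate(b); // only valid documents produce a tree
        return b;
    }

    // Marshalling: content tree -> XML document.
    static String marshal(Book b) {
        validate(b); // only valid trees are written out
        StringBuilder out = new StringBuilder("<book>");
        out.append("<title>").append(escape(b.title)).append("</title>");
        out.append("<author>").append(escape(b.author)).append("</author>");
        for (String c : b.chapters) {
            out.append("<chapter>").append(escape(c)).append("</chapter>");
        }
        return out.append("</book>").toString();
    }

    // Simplified check: required parts are present (element order is not checked).
    static void validate(Book b) {
        if (b.title == null || b.author == null || b.chapters.isEmpty()) {
            throw new IllegalStateException("content does not match (title, author, chapter+)");
        }
    }

    private static String firstText(Element root, String name) {
        NodeList nodes = root.getElementsByTagName(name);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

A document with no author or no chapter is well-formed but invalid, so unmarshal() rejects it with an IllegalStateException.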
Limitations of JAXB
 JAXB only supports DTDs and a subset of XML
Schemas
 Later versions may support more schema languages
 JAXB does not support the following legal DTD
constructs:
 Internal subsets
 NOTATIONs
 ENTITY and ENTITIES
 Enumerated NOTATION types
A minimal binding schema
 A JAXB binding schema is itself in XML
 Start with: <xml-java-binding-schema version="1.0ea">
 The version is optional
 “ea” stands for “early access,” that is, not yet released
 Put in:
<element name="rootName" type="class" root="true" />
for each possible root element
 An XML document can have only one root
 However, the DTD does not say what that root must be
 Any top-level element defined by the DTD may be a root
 The value of name must match exactly with the name in the DTD
 End with: </xml-java-binding-schema>
More complex schemata
 JAXB requires that you supply a binding schema
 As noted on the previous slide, this would be
<xml-java-binding-schema version="1.0ea">
<element name="rootName" type="class" root="true" />
</xml-java-binding-schema>
 With this binding schema, JAXB uses its default rule
set to generate your “bindings”
 A binding is an association between an XML element and
the Java code used to process that element
 By adding to this schema, you can customize the
bindings and thus the generated Java code
Default bindings, I
 A “simple element” is one that has no attributes and only
character contents:
 <!ELEMENT elementName (#PCDATA) >
 For simple elements, JAXB assumes:
<element name="elementName" type="value"/>
 JAXB will treat this element as an instance variable of the class
for its enclosing element
 This is the default binding, that is, this is what JAXB will assume
unless you tell it otherwise

For example, you could write this yourself, but set type="class"
 For simple elements, JAXB will generate these methods in the
class of the enclosing element:
void setElementName(String x);
String getElementName();
 We will see later how to convert the #PCDATA into some type
other than String
Default bindings, II
 If an element is not simple, JAXB will treat it as a class
 Attributes and simple subelements are treated as instance variables
 DTD: <!ELEMENT elementName (subElement1, subElement2) >
<!ATTLIST elementName attributeName CDATA #IMPLIED>
 Binding: <element name="elementName" type="class">
<attribute name="attributeName"/>
<content>
<element-ref name="subElement1" /> <!-- simple element -->
<element-ref name="subElement2" /> <!-- complex element -->
</content>
</element>
 Java: class ElementName extends MarshallableObject {
void setAttributeName(String x);
String getAttributeName();
String getSubElement1();
void setSubElement1(String x);
// Non-simple subElement2 is described on the next slide
}
Default bindings, III
 If an element contains a subelement that is defined by a
class, the code generated will be different
 <element name="elementName" type="class">
<content>
<element-ref name="subElement2" />
<!-- Note that "element-ref" means this is a reference to an
element that is defined elsewhere, not the element itself -->
</content>
</element>
 Results in:
class ElementName extends MarshallableObject {
SubElement2 getSubElement2();
void setSubElement2(SubElement2 x);
...}
 Elsewhere, the DTD definition for subElement2 will result in:
class SubElement2 extends MarshallableObject { ... }
Default bindings, IV
 A simple sequence is just a list of contents, in order, with no
+ or * repetitions
 Example: <!ELEMENT html (head, body) >
 For an element defined with a simple sequence, setters and getters are
created for each item in the sequence
 If an element’s definition isn’t simple, or if it contains
repetitions, JAXB basically “gives up” and says “it’s got
some kind of content, but I don’t know what”
 Example: <!ELEMENT book (title, foreword, chapter*)>
 Result:
public Book(); // constructor
public List getContent(); // "general content"--not too useful!
public void deleteContent();
public void emptyContent();
Customizing the binding schema
 You won’t actually see these default bindings anywhere--
they are just assumed
 If a default binding is OK with you, don’t do anything
 If you don’t like a default binding, just write your own
 Here’s the minimal binding you must write:
<xml-java-binding-schema>
<element name="rootElement" type="class" root="true" />
</xml-java-binding-schema>
 Start by “opening up” the root element:
<xml-java-binding-schema>
<element name="rootElement" type="class" root="true" >
</element>
</xml-java-binding-schema>
 Now you have somewhere to put your customizations
Primitive attributes
 By default, attributes are assumed to be Strings
 <!ATTLIST someElement someAttribute CDATA #IMPLIED>
 class SomeElement extends MarshallableObject {
void setSomeAttribute(String x);
String getSomeAttribute();
 You can define your own binding and use the convert attribute
to force the defined attribute to be a primitive, such as an int:
 <element name="someElement" type="class" >
<attribute name="someAttribute" convert="int" />
</element>
 class SomeElement extends MarshallableObject {
void setSomeAttribute(int x);
int getSomeAttribute();
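A sketch of what the int conversion implies at run time. The helper fromAttributeText is hypothetical; it only marks the point where the attribute's text would have to be converted during unmarshalling, here with Integer.parseInt.

```java
// Hand-written sketch; not actual JAXB output.
class SomeElement {
    private int someAttribute;

    public int getSomeAttribute() { return someAttribute; }
    public void setSomeAttribute(int x) { someAttribute = x; }

    // Hypothetical helper: converts the attribute text as found in the XML.
    static SomeElement fromAttributeText(String raw) {
        SomeElement e = new SomeElement();
        // Throws NumberFormatException if the text is not an integer.
        e.setSomeAttribute(Integer.parseInt(raw.trim()));
        return e;
    }
}
```

With the String binding, a value such as "abc" would be accepted silently; with the int binding, the error surfaces when the document is read.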
Conversions to Objects, I
 At the top level (within <xml-java-binding-schema>), add
a conversion declaration, such as:
 <conversion name="BigDecimal" type="java.math.BigDecimal" />

name is used in the binding schema

type is the actual class to be used
 Add a convert attribute where you need it:
 <element name="name" type="value" convert="BigDecimal" />
 The result should be:
 public java.math.BigDecimal getName();
public void setName(java.math.BigDecimal x);
 This works for BigDecimal because it has a
constructor that takes a String as its argument
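The String constructor is what makes the conversion possible in both directions. In this sketch, PriceHolder, fromText and toText are hypothetical names; only java.math.BigDecimal and its String constructor come from the JDK.

```java
import java.math.BigDecimal;

// Hand-written sketch of a class whose "name" element is bound to BigDecimal.
class PriceHolder {
    private BigDecimal name;

    public BigDecimal getName() { return name; }
    public void setName(BigDecimal x) { name = x; }

    // Reading: the #PCDATA text is passed to the String constructor.
    static PriceHolder fromText(String pcdata) {
        PriceHolder h = new PriceHolder();
        h.setName(new BigDecimal(pcdata));
        return h;
    }

    // Writing: toString() gives back text that the constructor accepts.
    String toText() { return name.toString(); }
}
```

Any class with a String constructor and a matching toString() could, under the same assumption, be plugged in the same way.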
Conversions to Objects, II
 There is a constructor for Date that takes a String as its
one argument, but this constructor is deprecated
 This is because there are many ways to write dates
 For an object like this, you need to supply methods to “parse”
and “print”
 <conversion name="MyDate" type="java.util.Date"
parse="MyDate.parseDate" print="MyDate.printDate"/>
 Your class, MyDate, would extend Date and provide
parseDate and printDate methods
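One possible shape for MyDate, as a sketch. The static signatures (String to Date, and Date to String), the yyyy-MM-dd format, and the UTC time zone are all assumptions; the slides only name the two methods.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Sketch of a helper class providing the parse and print methods.
class MyDate extends Date {
    // A new formatter per call, because SimpleDateFormat is not thread-safe.
    private static SimpleDateFormat format() {
        SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd");
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        f.setLenient(false); // reject impossible dates such as month 13
        return f;
    }

    // Used when reading XML: text -> Date.
    public static Date parseDate(String text) {
        try {
            return format().parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("not a yyyy-MM-dd date: " + text, e);
        }
    }

    // Used when writing XML: Date -> text.
    public static String printDate(Date d) {
        return format().format(d);
    }
}
```

Fixing one format and one time zone is exactly the decision that the deprecated Date(String) constructor left ambiguous.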
Creating enumerations
 <!ATTLIST shirt size (small | medium | large) #IMPLIED> defines an attribute
of shirt that can take on one of a predefined set of values
 A typesafe enum is a class whose instances are a predefined set of values
 To create a typesafe enum for size:
 <enumeration name="shirtSize" members="small medium large" />
 <element name="shirt" ...>
<attribute name="size" convert="shirtSize" />
</element>
 You get:
 public final class ShirtSize {
public final static ShirtSize SMALL;
public final static ShirtSize MEDIUM;
public final static ShirtSize LARGE;
public static ShirtSize parse(String x);
public String toString();
}
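The outline above follows the classic typesafe-enum pattern. A hand-written version of the full class might look like the sketch below; the private constructor, the value field, and the exception thrown by parse are assumptions. The class is package-private here only so the example fits in a single file.

```java
// Sketch of a typesafe enum: a fixed set of instances, no public constructor.
final class ShirtSize {
    public static final ShirtSize SMALL = new ShirtSize("small");
    public static final ShirtSize MEDIUM = new ShirtSize("medium");
    public static final ShirtSize LARGE = new ShirtSize("large");

    private final String value;

    // Private: no instances other than the three constants can exist.
    private ShirtSize(String value) { this.value = value; }

    // Maps the attribute text from the XML to one of the constants.
    public static ShirtSize parse(String x) {
        if ("small".equals(x)) return SMALL;
        if ("medium".equals(x)) return MEDIUM;
        if ("large".equals(x)) return LARGE;
        throw new IllegalArgumentException("not a shirt size: " + x);
    }

    // Gives back the text to write into the XML attribute.
    @Override
    public String toString() { return value; }
}
```

Because only three instances exist, values can be compared with ==, and an unknown size is rejected when parsed rather than later in the application.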
Content models
 The <content> tag describes one of two kinds of content
models:
 A general-content property binds a single property

You’ve seen this before:
<content property="my-content" />

Gives: public List getMyContent();
public void deleteMyContent();
public void emptyMyContent();
 A model-based content property can contain four types of
declarations:

element-ref says that this element contains another element

choice says that there are alternative contents

sequence says that contents must be in a particular order

rest can be used to specify any kind of content
Using JAXB
 JAXB is not currently a part of the standard Java distributions
 The steps involved in using JAXB are:
 Download, install, and configure JAXB
 Write a JAXB binding schema to describe the bindings you want for your XML
 Use JAXB to read the binding schema and the XML DTD (or XML
Schema) and produce Java code
 Add the Java code to your program and compile it
 Use the resultant program to:

Read and validate XML input files

Modify the XML tree

Optionally validate and output the modified XML
 Note: Validation is optional and can be performed during unmarshalling
or any time thereafter
The End


Editor's Notes

  • #2 This whole talk is based on http://java.sun.com/xml/jaxb/docs.html, which is actually pretty badly written; it uses many examples but the discussions seriously lack precision. I should go to the spec or find another description somewhere, and check these slides carefully.