Java and XML

A tutorial about processing XML documents using Java. It covers the DOM, SAX, and JDOM APIs.

Published in: Technology, Education

  1. Java and XML (DOM, SAX, JDOM). Raji GHAWI, 20/01/2009
  2. Outline
      1. DOM
      2. SAX
      3. JDOM
  3. Part 1: DOM (Document Object Model)
  4. Sample XML document (inventory.xml)
      <inventory>
        <book year="2000">
          <title>Snow Crash</title>
          <author>Neal Stephenson</author>
          <publisher>Spectra</publisher>
          <isbn>0553380958</isbn>
          <price>14.95</price>
        </book>
        <book year="2005">
          <title>Burning Tower</title>
          <author>Larry Niven</author>
          <author>Jerry Pournelle</author>
          <publisher>Pocket</publisher>
          <isbn>0743416910</isbn>
          <price>5.99</price>
        </book>
        <book year="1995">
          <title>Zodiac</title>
          <author>Neal Stephenson</author>
          <publisher>Spectra</publisher>
          <isbn>0553573862</isbn>
          <price>7.50</price>
        </book>
        <!-- more books... -->
      </inventory>
  5. Import required packages
      import javax.xml.parsers.*;
      import org.w3c.dom.*;
  6. Create the parser
      try {
          // DOM parser factory
          DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
          // DOM parser
          DocumentBuilder builder = factory.newDocumentBuilder();
          // ....
      } catch (Exception e) {
          // may be an IOException, ParserConfigurationException, or SAXException
          e.printStackTrace(System.out);
      }
  7. Parse an XML file
      // the result is the entire XML file as a tree (the Document Object Model)
      Document document = builder.parse("../inventory.xml");
  8. Root element
      // the root element
      Element root = document.getDocumentElement();
      System.out.println(root.getTagName());
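Slides 5 to 8 can be combined into one small program. The following is a minimal sketch, not the author's code: the class name `DomRootDemo`, the helper `rootTagName`, and the inline XML string (used instead of `../inventory.xml` so the example needs no file) are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; the name and the inline XML are illustrative only.
class DomRootDemo {

    // Parses an XML string into a DOM tree and returns the root element's tag name.
    static String rootTagName(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document document = builder.parse(new InputSource(new StringReader(xml)));
            Element root = document.getDocumentElement();
            return root.getTagName();
        } catch (Exception e) {
            // ParserConfigurationException, SAXException, or IOException
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<inventory><book year=\"2000\"><title>Snow Crash</title></book></inventory>";
        System.out.println(rootTagName(xml)); // prints "inventory"
    }
}
```

Reading from a string keeps the sketch self-contained; `builder.parse` also accepts a file path or URI, as on slide 7.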
  9. Nodes
      Node
        Element  (may have children)
        Text     (leaf)
        Attr     (leaf)

      Operations on Nodes
                         Element        Text            Attr
      getNodeName()      tag name       "#text"         name of attribute
      getNodeValue()     null           text contents   value of attribute
      getNodeType()      ELEMENT_NODE   TEXT_NODE       ATTRIBUTE_NODE
      getAttributes()    NamedNodeMap   null            null
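The table above can be checked against a tiny document. This is a sketch under stated assumptions: the class `DomNodeKindsDemo`, its helpers `parseRoot` and `describe`, and the shortened one-book XML are made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class showing name, value, and type for three node kinds.
class DomNodeKindsDemo {

    // Parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Formats one row of the table: node name, node value, numeric node type.
    static String describe(Node node) {
        return node.getNodeName() + " | " + node.getNodeValue() + " | " + node.getNodeType();
    }

    public static void main(String[] args) {
        Element book = parseRoot("<book year=\"1995\"><title>Zodiac</title></book>");
        Node title = book.getFirstChild();            // the <title> element
        System.out.println(describe(title));                        // title | null | 1
        System.out.println(describe(title.getFirstChild()));        // #text | Zodiac | 3
        System.out.println(describe(book.getAttributeNode("year"))); // year | 1995 | 2
    }
}
```

The numeric types printed are the values of `Node.ELEMENT_NODE` (1), `Node.ATTRIBUTE_NODE` (2), and `Node.TEXT_NODE` (3).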
  10. Distinguishing Node types
      switch (node.getNodeType()) {
          case Node.ELEMENT_NODE:
              Element element = (Element) node;
              // ...
              break;
          case Node.TEXT_NODE:
              Text text = (Text) node;
              // ...
              break;
          case Node.ATTRIBUTE_NODE:
              Attr attr = (Attr) node;
              // ...
              break;
          default:
              // ...
      }
  11. Operations on Nodes
      • getParentNode()
      • getFirstChild()
      • getNextSibling()
      • getPreviousSibling()
      • getLastChild()
      • hasAttributes()
      • hasChildNodes()
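A short sketch of how these navigation methods relate to each other. The class `DomNavigationDemo`, the helpers `parseRoot`, `name`, and `neighbours`, and the two-child XML (written without whitespace so that no extra text nodes appear) are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class for parent/sibling navigation.
class DomNavigationDemo {

    // Parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Returns the node name, or "none" when a neighbour does not exist (null).
    static String name(Node node) {
        return node == null ? "none" : node.getNodeName();
    }

    // Formats "parent > previousSibling [node] nextSibling".
    static String neighbours(Node node) {
        return name(node.getParentNode()) + " > " + name(node.getPreviousSibling())
                + " [" + node.getNodeName() + "] " + name(node.getNextSibling());
    }

    public static void main(String[] args) {
        Element book = parseRoot("<book><title>Zodiac</title><price>7.50</price></book>");
        System.out.println(neighbours(book.getFirstChild())); // book > none [title] price
        System.out.println(neighbours(book.getLastChild()));  // book > title [price] none
        System.out.println(book.hasChildNodes() + " " + book.hasAttributes()); // true false
    }
}
```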
  12. Travel through children nodes
      if (element.hasChildNodes()) {
          Node child = element.getFirstChild();
          while (child != null) {
              // ....
              child = child.getNextSibling();
          }
      }
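When the source XML is indented, the whitespace between tags appears in the tree as Text nodes, so a child loop usually filters by node type. A minimal sketch, assuming a hypothetical class `DomChildrenDemo` with helpers `parseRoot` and `childElementNames`:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class: walks the children and keeps only elements.
class DomChildrenDemo {

    // Parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Collects the tag names of the direct child elements, skipping text and comments.
    static List<String> childElementNames(Element parent) {
        List<String> names = new ArrayList<>();
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                names.add(child.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<book year=\"2005\">\n"
                + "  <title>Burning Tower</title>\n"
                + "  <author>Larry Niven</author>\n"
                + "  <author>Jerry Pournelle</author>\n"
                + "</book>";
        System.out.println(childElementNames(parseRoot(xml))); // [title, author, author]
    }
}
```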
  13. Operations for Elements
      • String getTagName()
      • boolean hasAttribute(String name)
      • String getAttribute(String name)
      • boolean hasAttributes()
      • NamedNodeMap getAttributes()
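One detail worth seeing in code: `getAttribute` returns an empty string, not `null`, when the attribute is absent, so `hasAttribute` is the way to tell "missing" from "empty". The sketch below assumes a hypothetical class `ElementAttributeDemo` with helpers `parseRoot` and `yearOrDefault`.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class for attribute lookup on an Element.
class ElementAttributeDemo {

    // Parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Returns the "year" attribute, or the fallback when the attribute is absent.
    static String yearOrDefault(Element book, String fallback) {
        return book.hasAttribute("year") ? book.getAttribute("year") : fallback;
    }

    public static void main(String[] args) {
        Element dated = parseRoot("<book year=\"2000\"/>");
        Element undated = parseRoot("<book/>");
        System.out.println(yearOrDefault(dated, "unknown"));      // 2000
        System.out.println(yearOrDefault(undated, "unknown"));    // unknown
        System.out.println(undated.getAttribute("year").isEmpty()); // true
    }
}
```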
  14. NamedNodeMap
      • Node getNamedItem(String name)
      • int getLength()
      • Node item(int index)

      NamedNodeMap map = element.getAttributes();
      for (int i = 0; i < map.getLength(); i++) {
          Attr attr = (Attr) map.item(i);
          System.out.println(attr.getNodeName() + "='" + attr.getNodeValue() + "'");
      }
  15. Operations on Texts
      • String getData()
      • int getLength()
      • String substringData(int offset, int count)
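A small sketch of the three Text operations applied to the contents of a `<title>` element. The class `TextOpsDemo` and its helper `firstText` are hypothetical names chosen for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

// Hypothetical demo class for the Text interface.
class TextOpsDemo {

    // Parses an XML string and returns the root element's first child as a Text node.
    // Assumes the root element starts with character data.
    static Text firstText(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Element root = builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
            return (Text) root.getFirstChild();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Text text = firstText("<title>Snow Crash</title>");
        System.out.println(text.getData());              // Snow Crash
        System.out.println(text.getLength());            // 10
        System.out.println(text.substringData(0, 4));    // Snow
    }
}
```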
  16. Operations on Attrs
      • String getName()
      • Element getOwnerElement()
      • String getValue()
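The Attr operations can be tried on the `year` attribute of a book. This is a sketch only; the class `AttrOpsDemo` and its helper `attribute` are assumed names, and the attribute is fetched with the standard `Element.getAttributeNode`.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class for the Attr interface.
class AttrOpsDemo {

    // Parses an XML string and returns the named attribute node of the root element
    // (null when the root has no such attribute).
    static Attr attribute(String xml, String name) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Element root = builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
            return root.getAttributeNode(name);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Attr year = attribute("<book year=\"2000\"><title>Snow Crash</title></book>", "year");
        System.out.println(year.getName());                       // year
        System.out.println(year.getValue());                      // 2000
        System.out.println(year.getOwnerElement().getTagName());  // book
    }
}
```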
  17. Part 2: SAX (Simple API for XML)
  18. Import required packages
      import javax.xml.parsers.*;
      import org.xml.sax.*;
      import org.xml.sax.helpers.*;
  19. Create the parser
      // Create a parser factory
      SAXParserFactory factory = SAXParserFactory.newInstance();
      // Tell the factory that the parser must understand namespaces
      factory.setNamespaceAware(true);
      try {
          // Make the parser
          SAXParser saxParser = factory.newSAXParser();
          XMLReader parser = saxParser.getXMLReader();
      } catch (Exception e) {
          // may be an IOException, ParserConfigurationException, or SAXException
          e.printStackTrace();
      }
  20. Parse an XML file
      // Create a handler
      Handler handler = new Handler();
      // Tell the parser to use this handler
      parser.setContentHandler(handler);
      // Finally, read and parse the document
      parser.parse("./inventory.xml");
  21. SAX handlers
      • A full-featured callback handler for SAX implements four interfaces:
          • interface ContentHandler
          • interface DTDHandler
          • interface EntityResolver
          • interface ErrorHandler
      • It is easier to use an adapter class
  22. Class DefaultHandler
      • DefaultHandler is in package org.xml.sax.helpers
      • DefaultHandler implements ContentHandler, DTDHandler, EntityResolver, and ErrorHandler
      • DefaultHandler is an adapter class
          • Provides empty methods for every method declared in each of the four interfaces
          • To use this class, extend it and override the methods that are important to your application
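Because DefaultHandler supplies defaults for everything, a subclass can override a single callback. The sketch below counts `<book>` start tags; the class `BookCounter`, its helper `countBooks`, and the inline XML are illustrative assumptions rather than part of the slides.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that overrides only startElement.
class BookCounter extends DefaultHandler {
    private int books;

    // Called by the parser for every start tag; all other callbacks keep their defaults.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if ("book".equals(qName)) {
            books++;
        }
    }

    // Parses the XML string with a fresh handler and returns the number of <book> elements.
    static int countBooks(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            BookCounter handler = new BookCounter();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.books;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<inventory><book year=\"2000\"/><book year=\"2005\"/><book year=\"1995\"/></inventory>";
        System.out.println(countBooks(xml)); // 3
    }
}
```

This sketch passes the handler directly to `SAXParser.parse`, which is a shortcut for the `XMLReader` plus `setContentHandler` steps on slides 19 and 20.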
  23. The Handler class
      class Handler extends DefaultHandler {

          // SAX calls this method when it encounters a start tag
          public void startElement(String namespaceURI, String localName,
                  String qualifiedName, Attributes attributes) throws SAXException {
              System.out.println("startElement: " + qualifiedName);
          }

          // SAX calls this method to pass in character data
          public void characters(char ch[], int start, int length) throws SAXException {
              System.out.println("characters: \"" + new String(ch, start, length) + "\"");
          }

          // SAX calls this method when it encounters an end tag
          public void endElement(String namespaceURI, String localName,
                  String qualifiedName) throws SAXException {
              System.out.println("endElement: /" + qualifiedName);
          }
      }
  24. Handler output for the sample document
      Input:
      <inventory>
        <book year="2000">
          <title>Snow Crash</title>
          <author>Neal Stephenson</author>
          <publisher>Spectra</publisher>
          <isbn>0553380958</isbn>
          <price>14.95</price>
        </book>
        <book year="2005">
          <title>Burning Tower</title>
          <author>Larry Niven</author>
          <author>Jerry Pournelle</author>
          <publisher>Pocket</publisher>
          <isbn>0743416910</isbn>
          <price>5.99</price>
        </book>
        <book year="1995">
          <title>Zodiac</title>
          <author>Neal Stephenson</author>
          <publisher>Spectra</publisher>
          <isbn>0553573862</isbn>
          <price>7.50</price>
        </book>
        <!-- more books... -->
      </inventory>

      Output:
      startElement: inventory
      characters: " "
      startElement: book
      characters: " "
      startElement: title
      characters: "Snow Crash"
      endElement: /title
      characters: " "
      startElement: author
      characters: "Neal Stephenson"
      endElement: /author
      characters: " "
      startElement: publisher
      characters: "Spectra"
      endElement: /publisher
      characters: " ... "
      endElement: /book
      ...
      endElement: /inventory
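The trace shows whitespace arriving through `characters` just like real text, and a SAX parser is also allowed to deliver one run of text in several `characters` calls. A common pattern is therefore to buffer text between the start and end tag. The following is a sketch of that pattern, with the class `TitleCollector` and helper `titles` as assumed names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that collects the text of every <title> element.
class TitleCollector extends DefaultHandler {
    private final List<String> titles = new ArrayList<>();
    private StringBuilder current; // non-null only while inside a <title>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if ("title".equals(qName)) {
            current = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (current != null) {
            current.append(ch, start, length); // may be called more than once per text run
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("title".equals(qName) && current != null) {
            titles.add(current.toString().trim());
            current = null;
        }
    }

    // Parses the XML string and returns the collected titles in document order.
    static List<String> titles(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            TitleCollector handler = new TitleCollector();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.titles;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<inventory>\n  <book><title>Snow Crash</title></book>\n"
                + "  <book><title>Burning Tower</title></book>\n</inventory>";
        System.out.println(titles(xml)); // [Snow Crash, Burning Tower]
    }
}
```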
  25. Attributes
      • getLength()
      • getLocalName(index)
      • getQName(index)
      • getValue(index)
      • getType(index)
      • int getIndex(String qualifiedName)
      • int getIndex(String uri, String localName)
      • String getValue(String qualifiedName)
      • String getValue(String uri, String localName)
  26. Attributes
      public void startElement(String namespaceURI, String localName,
              String qualifiedName, Attributes attributes) throws SAXException {
          // ....
          for (int i = 0; i < attributes.getLength(); i++) {
              String attName = attributes.getQName(i);
              String attValue = attributes.getValue(i);
              System.out.println(attName + "='" + attValue + "'");
          }
          // ....
      }
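Besides looping by index, an attribute can be looked up by name with `getValue(String qualifiedName)`, which returns `null` when the element has no such attribute. A minimal sketch, assuming a hypothetical class `YearCollector` with a helper `years`:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that collects the "year" attribute of every element that has one.
class YearCollector extends DefaultHandler {
    private final List<String> years = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        String year = attributes.getValue("year"); // null when the attribute is absent
        if (year != null) {
            years.add(year);
        }
    }

    // Parses the XML string and returns the collected year values in document order.
    static List<String> years(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            YearCollector handler = new YearCollector();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.years;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<inventory><book year=\"2000\"/><book year=\"2005\"/><book year=\"1995\"/></inventory>";
        System.out.println(years(xml)); // [2000, 2005, 1995]
    }
}
```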
  27. Part 3: JDOM (Java DOM)
  28. Import required packages
      import org.jdom.*;
      import org.jdom.input.*;
      import org.jdom.output.*;
      import org.jdom.adapters.*;

      Packages: org.jdom, org.jdom.adapters, org.jdom.input, org.jdom.output
  29. Create the parser
      try {
          SAXBuilder builder = new SAXBuilder();
          // ....
      } catch (IOException ioe) {
          ioe.printStackTrace();
      } catch (JDOMException je) {
          je.printStackTrace();
      }
  30. Parse an XML file
      Document document = builder.build("../inventory.xml");
  31. Root element
      Element root = document.getRootElement();
      System.out.println(root.getName());
  32. Print out the document (Advantage 1: output facility)
      // Write the document to standard output
      XMLOutputter outputter = new XMLOutputter();
      outputter.output(document, System.out);

      // Alternatively, capture the document as a String (needs java.io.StringWriter)
      StringWriter sw = new StringWriter();
      XMLOutputter outputter = new XMLOutputter();
      outputter.output(document, sw);
      String xml = sw.toString();
  33. Get children (Advantage 2: supports Java Collections)
      • Get all direct children
          List allChildren = element.getChildren();
      • Get all direct children with a given name
          List namedChildren = element.getChildren("book");
      • Get the first child with a given name
          Element child = element.getChild("book");
  34. Travel through children nodes
      List children = element.getChildren();
      for (int i = 0; i < children.size(); i++) {
          Element elem = (Element) children.get(i);
          // ....
      }
  35. Get attributes
      • Get all attributes
          List attrs = element.getAttributes();
          for (int i = 0; i < attrs.size(); i++) {
              Attribute attr = (Attribute) attrs.get(i);
              System.out.println(attr.getName() + " = " + attr.getValue());
          }
      • Get an attribute with a given name
          Attribute attr = element.getAttribute("year");
      • Get an attribute value with a given name
          String value = element.getAttributeValue("year");
  36. Reading Element Content
      • The text content is directly available
          String content = element.getText();
      • Remove extra whitespace
          String content = element.getTextTrim();
  37. Mixed Content
      • Sometimes an element may contain comments, text content, and children
          <table>
            <!-- Some comment -->
            Some text
            <tr>Some child</tr>
          </table>

          String text = table.getTextTrim();
          Element tr = table.getChild("tr");
  38. Mixed Content
      List mixedContent = table.getContent();
      Iterator iter = mixedContent.iterator();
      while (iter.hasNext()) {
          Object obj = iter.next();
          if (obj instanceof Comment) {
              System.out.println("Comment: " + obj);
          } else if (obj instanceof Text) {
              // JDOM 1.x returns character data as org.jdom.Text objects, not String
              System.out.println("Text: " + ((Text) obj).getText());
          } else if (obj instanceof Element) {
              System.out.println("Element: " + ((Element) obj).getName());
          }
      }
  39. References
      • Elliotte Rusty Harold, Processing XML with Java. http://cafeconleche.org/books/xmljava/chapters/index.html
