Java Course 12: XML & XSL, Web & Servlets


Lecture 12 from the IAG0040 Java course at TTÜ.
See the accompanying source code written during the lectures:

  1. Java course - IAG0040. Java and the Web: XML, XSL, Servlets. Anton Keks, 2011
  2. Introduction to XML
     ● XML = Extensible Markup Language
       – general-purpose markup language recommended by the W3C
       – includes text and extra information (markup)
       – “simplified SGML”
       – meta-language, can be used to create new languages
     ● XML has hit the “sweet spot” between simplicity and flexibility
       – very widely used for exchange of various data
       – even HTML has been retrofitted as XHTML
       – MathML, MusicXML, SVG, WSDL, RSS, OpenDocument, etc.
     Java course – IAG0040 Lecture 13, Anton Keks, Slide 2
  3. XML design goals
     ● Human-readable
       – human-readable and self-descriptive markup
       – text files, supports Unicode
     ● Easily machine-parseable
       – strict structure, well-defined formal rules
       – well-compressible for storage and transmission
       – platform-independent
     ● Multi-purpose and extensible
       – hierarchical structure: records, lists, trees
       – schemas, namespaces
  4. XML syntax
     ● Single element
       – <name attribute="value">content</name>
     ● Example document
       <?xml version="1.0" encoding="UTF-8"?>
       <recipe name="bread" prepTime="5 mins" cookTime="3 hours">
         <title>Basic bread</title>
         <ingredient amount="3" unit="cups">Flour</ingredient>
         <ingredient amount="0.25" unit="ounce">Yeast</ingredient>
         <ingredient amount="1.5" unit="cups" state="warm">Water</ingredient>
         <ingredient amount="1" unit="teaspoon">Salt</ingredient>
         <instructions>
           <step>Mix all ingredients together, and knead thoroughly.</step>
           <step>Cover with a cloth, and leave for one hour in warm room.</step>
           <step>Knead again, place in a tin, and then bake in the oven.</step>
         </instructions>
       </recipe>
  5. XML Structure
     ● XML declaration (version, encoding, external dependencies); the attributes must appear in this order
       – <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
     ● Document type definitions (DTD): <!DOCTYPE example [ ... ]>
     ● Single root element, nested elements, some with attributes and content
       – <name attribute="value">content</name> or <foo/>
       – starting and ending tag, content or nested elements between, no overlapping
       – case-sensitive
     ● Special chars and entities
       – predefined entities: &amp; &lt; &gt; &apos; &quot;
       – numeric character references: &#DDD; &#xHH;
       – more can be declared: <!ENTITY copy "©">
       – unescaped data: <![CDATA[ A & B ]]>
     ● Comments: <!-- Hello -->
  6. XML correctness
     ● Well-formed
       – conforms to all syntax rules
     ● Valid (only if well-formed)
       – data and structure conform to a set of rules describing correct data values and locations
       – must comply with a schema
       – DTD is a part of the XML spec
       – more functional: XML Schema (XSD), RELAX NG
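
To make the well-formedness rule concrete, here is a minimal sketch that treats "the parser accepts it" as the definition of well-formed. The class and method names are made up for this illustration; only standard JAXP calls are used.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the lecture code.
class WellFormedCheck {
    /** Returns true if the text can be parsed, i.e. it obeys all XML syntax rules. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // a syntax violation was reported by the parser
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b/></a>"));    // properly nested
        System.out.println(isWellFormed("<a><b></a></b>")); // overlapping tags
    }
}
```

Note that this only checks syntax; validity against a DTD or schema is a separate step.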
  7. DTD example
     ● DTD = Document Type Definition
     ● Declaration
       – <!DOCTYPE customer [ element declarations here ]> - internal DTD
       – <!DOCTYPE customer SYSTEM "customer.dtd"> - external DTD
       – <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "">
     ● Content
       <!ELEMENT people_list (person*)>
       <!ELEMENT person (name, birthdate?, gender?, personal_id?)>
       <!ATTLIST person index CDATA #REQUIRED>
       <!ELEMENT name (#PCDATA)>
       <!ELEMENT birthdate (#PCDATA)>
       <!ELEMENT gender (#PCDATA)>
       <!ELEMENT personal_id (#PCDATA)>
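
As an illustration of how such a DTD is enforced, the sketch below prepends a shortened internal DTD (based on the people_list example above) to a document and parses it with a validating DOM parser. The class name, the helper method and the sample data are assumptions made for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the lecture code.
class DtdValidation {
    // Internal DTD subset, shortened from the slide's example
    static final String DTD =
          "<!DOCTYPE people_list [\n"
        + "  <!ELEMENT people_list (person*)>\n"
        + "  <!ELEMENT person (name, birthdate?)>\n"
        + "  <!ATTLIST person index CDATA #REQUIRED>\n"
        + "  <!ELEMENT name (#PCDATA)>\n"
        + "  <!ELEMENT birthdate (#PCDATA)>\n"
        + "]>\n";

    /** Parses DTD + body and returns the validity errors (empty list = valid). */
    static List<String> validate(String body) {
        final List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // DTD validation is off by default
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage()); // validity errors are recoverable
                }
            });
            builder.parse(new InputSource(new StringReader(DTD + body)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(validate(
            "<people_list><person index='1'><name>Ann</name></person></people_list>"));
        // The required index attribute is missing here:
        System.out.println(validate(
            "<people_list><person><name>Ann</name></person></people_list>"));
    }
}
```

Validity problems arrive through ErrorHandler.error(), while well-formedness problems are fatal and abort parsing.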
  8. XML Namespaces
     ● Help to avoid naming conflicts
     ● Allow merging of XML documents with different semantics
     ● Use prefixes to distinguish namespaces
       – <xhtml:table><xhtml:tr/></xhtml:table>
       – prefix names are not fixed, they are defined in the declaration
         ● xmlns:prefix="namespaceURI"
         ● <h:table xmlns:h="">
       – default namespace can be declared with xmlns alone
         ● <table xmlns="">
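
The point that prefixes are arbitrary can be shown with a namespace-aware DOM parser: a prefixed element and a default-namespace element get the same expanded name. This is a sketch only; the class name and the namespace URI urn:example:html are placeholders invented for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the lecture code.
class NamespaceDemo {
    /** Returns the root element's expanded name as {namespaceURI}localName. */
    static String expandedRootName(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // JAXP parsers ignore namespaces by default
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return "{" + root.getNamespaceURI() + "}" + root.getLocalName();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // urn:example:html is a made-up namespace URI
        System.out.println(expandedRootName("<h:table xmlns:h='urn:example:html'><h:tr/></h:table>"));
        System.out.println(expandedRootName("<table xmlns='urn:example:html'/>"));
    }
}
```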
  9. XSD: W3C XML Schema
     ● XML-based
     ● Has more features than DTD
     ● Namespaces are directly supported
     ● Data model
       – the vocabulary
         ● element and attribute names
       – the content model
         ● relationships, structure, ordering
       – the data types
         ● semantics and validation rules
  10. XSD (cont)
      ● Schema example (country.xsd)
        <xs:schema xmlns:xs="">
          <xs:element name="country" type="countryType"/>
          <xs:complexType name="countryType">
            <xs:sequence>
              <xs:element name="name" type="xs:string"/>
              <xs:element name="population" type="xs:decimal"/>
            </xs:sequence>
          </xs:complexType>
        </xs:schema>
      ● Declaration (country.xml)
        <country xmlns:xsi="" xsi:noNamespaceSchemaLocation="country.xsd">
          <name>France</name><population>59.7</population>
        </country>

        <c:country xmlns:c="" xmlns:xsi="" xsi:schemaLocation=" country.xsd">
        </c:country>
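
A possible way to check a document against the country schema from Java is the javax.xml.validation API. In this sketch the schema is embedded as a string and the XML Schema namespace is taken from XMLConstants; the class and method names are invented for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the lecture code.
class XsdValidation {
    // Same structure as country.xsd on the slide, without a target namespace
    static final String COUNTRY_XSD =
          "<xs:schema xmlns:xs='" + XMLConstants.W3C_XML_SCHEMA_NS_URI + "'>"
        + "<xs:element name='country' type='countryType'/>"
        + "<xs:complexType name='countryType'><xs:sequence>"
        + "<xs:element name='name' type='xs:string'/>"
        + "<xs:element name='population' type='xs:decimal'/>"
        + "</xs:sequence></xs:complexType>"
        + "</xs:schema>";

    /** Returns true if the document is valid according to COUNTRY_XSD. */
    static boolean isValid(String xml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(COUNTRY_XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("broken schema", e);
        }
        try {
            // Without a custom ErrorHandler, the first validity error is thrown
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<country><name>France</name><population>59.7</population></country>"));
        System.out.println(isValid("<country><name>France</name><population>many</population></country>"));
    }
}
```

Passing the schema explicitly like this means the document itself does not need an xsi:noNamespaceSchemaLocation hint.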
  11. Unit testing
      ● XMLUnit is a 3rd party addition to JUnit
        – was designed for JUnit 3.x, but is perfectly usable with JUnit 4
        – provides the XMLAssert class that can be statically imported
          ● import static org.custommonkey.xmlunit.XMLAssert.*;
          ● assertXXX() methods take XML as String or Document
        – simplifies testing of code that works with XML
          ● XML equality and similarity checking
          ● Validation
          ● XPath evaluation and checking
          ● Transformation
  12. XPath
      ● XPath is a language for finding information in an XML document
        – uses path expressions to select nodes (elements, attributes)
        – has a library of built-in functions
        – XML documents are treated as trees of nodes
      ● Sample XPath expressions
        – /bookstore/book – all book elements under bookstore
        – //book – all book elements in the document
        – @lang – the value of the lang attribute of the current element
        – bookstore/book[price > 35.00] – all books costing more than 35
        – //book[@lang='en'] – all books in English (the string literal must be quoted)
        – book[1]/author[1]/name – first author of the first book
        – book[last() - 1] – the book before the last one
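
These expressions can be tried out directly with javax.xml.xpath. The following sketch evaluates an expression against a small bookstore document and returns the string value of the first selected node; the document content and the class name are invented for the illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper with made-up sample data.
class XPathSamples {
    static final String BOOKSTORE =
          "<bookstore>"
        + "<book lang='en'><title>Basic Java</title><price>29.99</price></book>"
        + "<book lang='et'><title>Java algajatele</title><price>39.95</price></book>"
        + "<book lang='en'><title>Advanced Java</title><price>49.99</price></book>"
        + "</bookstore>";

    /** Evaluates the expression as a string: the text of the first selected node, or "". */
    static String select(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(BOOKSTORE)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("/bookstore/book[1]/title"));
        System.out.println(select("/bookstore/book[price > 35.00]/title"));
        System.out.println(select("/bookstore/book[last() - 1]/@lang"));
        System.out.println(select("//book[@lang='en'][last()]/title"));
        // Without quotes, en is read as a child element name, so nothing matches
        System.out.println(select("//book[@lang=en]/title").isEmpty());
    }
}
```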
  13. Introduction to XSL
      ● The meaning of arbitrary XML tags is not well understood by e.g. a web browser
      ● XSL describes how the XML document should be displayed
      ● XSL = Extensible Stylesheet Language
        – XML-based, again
      ● XSLT = XSL Transformations
        – can be used to transform one XML format to another XML or other text format (very often HTML or XHTML)
      ● XSL-FO
        – a language for formatting XML documents (to produce, e.g. PDF documents, images, graphics, etc.)
  14. XSLT basics
      ● Assigning stylesheets
        – <?xml-stylesheet type="text/xsl" href="file.xsl"?>
      ● XSL stylesheet
        – <xsl:stylesheet version="1.0" xmlns:xsl="">
        – then, one or more templates are defined
          <xsl:template match="/">
            <contents><xsl:copy-of select="."/></contents>
          </xsl:template>
        – various xsl elements are used for querying of data using XPath
          ● all match, select, and test attributes take XPath expressions
          ● in other attributes, you can put XPath into {}
        – most used elements: copy-of, value-of, for-each, sort, if, choose, apply-templates, call-template
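
To show a few of these elements working together, the sketch below runs a small stylesheet (template, value-of, for-each, sort, if) through the JAXP Transformer and produces plain text. The stylesheet, the shortened recipe and the class name are written for this example only; the namespace used is the standard XSLT 1.0 one.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper with a made-up stylesheet.
class XsltDemo {
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/recipe'>"
        + "<xsl:value-of select='title'/><xsl:text>: </xsl:text>"
        + "<xsl:for-each select='ingredient'>"
        + "<xsl:sort select='.'/>"                       // alphabetical order
        + "<xsl:value-of select='.'/>"
        + "<xsl:if test='position() != last()'>, </xsl:if>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static final String RECIPE =
          "<recipe><title>Basic bread</title>"
        + "<ingredient>Flour</ingredient><ingredient>Yeast</ingredient>"
        + "<ingredient>Water</ingredient><ingredient>Salt</ingredient></recipe>";

    /** Applies the stylesheet to the XML and returns the result as a string. */
    static String transform(String xsl, String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(STYLESHEET, RECIPE));
    }
}
```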
  15. XML Parsing
      ● There are 3 different ways to work with XML
        – DOM = Document Object Model
          ● stores the full XML tree in memory as objects
          ● convenient to work with, but not suitable for very large XMLs
        – SAX = Simple API for XML
          ● reads (streams) XML data and produces events
          ● no access to the full document, state must be maintained manually
          ● no limits on XML size, generally faster
        – XPP, XML Pull Parser (SAX is a push parser)
          ● non-standard, not bundled with Java
          ● does not produce events, but rather waits for the client program to pull information about parsing, then continues processing
  16. Introduction to JAXP
      ● JAXP = Java API for XML Processing
        – javax.xml.parsers
          ● DocumentBuilder (DOM), SAXParser (SAX)
        – javax.xml.xpath
          ● XPath – compilation and evaluation of XPath expressions
        – javax.xml.transform
          ● Transformer – XSLT
        – JAXP defines only interfaces, implementations are pluggable
          ● access to the implementations is via factories: DocumentBuilderFactory, SAXParserFactory, etc.
          ● Java 1.6 bundles Apache Xerces and Xalan
  17. JAXP overview
      ● Other parts of the API are in packages named after the standards that define them
        – DOM is in org.w3c.dom
          ● Document, Node, Element, Attr, etc. interfaces for storing DOM trees
        – SAX is in org.xml.sax
          ● XMLReader, ContentHandler interfaces for producing and handling SAX events
  18. JAXP and DOM
      ● javax.xml.parsers.DocumentBuilderFactory - creates DocumentBuilder instances. Used to set various attributes for the parser, including its validating behavior.
      ● javax.xml.parsers.DocumentBuilder - performs parsing and creates DOM Documents representing parsed XML
      ● org.w3c.dom.Document - represents the root of the XML DOM tree. The node that contains the elements of the document.
      ● org.w3c.dom.Node - a single node in the document tree. A node can be an element, an attribute, an entity, a document, or a text node.
      ● org.w3c.dom.NodeList - an ordered collection of nodes
      ● org.w3c.dom.Element - a Node representing an XML element
      ● org.w3c.dom.Attr - an attribute attached to an Element
      ● org.w3c.dom.Text - a text Node (content of an element), extends CharacterData
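
The classes listed above fit together roughly as follows. This is a minimal sketch that parses a recipe-like document into a DOM tree and reads attributes and text from its ingredient elements; the class name and the output format are assumptions of the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the lecture code.
class DomIngredients {
    /** Returns one "amount unit name" line per ingredient element. */
    static List<String> ingredients(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            // The whole tree is now in memory and can be queried in any order
            NodeList nodes = doc.getElementsByTagName("ingredient");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element ingredient = (Element) nodes.item(i);
                result.add(ingredient.getAttribute("amount") + " "
                        + ingredient.getAttribute("unit") + " "
                        + ingredient.getTextContent());
            }
            return result;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(ingredients(
              "<recipe><ingredient amount='3' unit='cups'>Flour</ingredient>"
            + "<ingredient amount='1' unit='teaspoon'>Salt</ingredient></recipe>"));
    }
}
```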
  19. JAXP and SAX
      ● javax.xml.parsers.SAXParserFactory - creates SAXParser instances. Allows various parameters to be set for the creation of the parser.
      ● javax.xml.parsers.SAXParser - used to initiate parsing of XML documents. Encapsulates an XMLReader for generation of SAX events.
      ● org.xml.sax.XMLReader - used to register event handlers. Calls the callback methods as content is being scanned (generates SAX events).
      ● org.xml.sax.ContentHandler - the interface to implement in order to receive SAX events. An instance must be registered with XMLReader.
      ● org.xml.sax.ErrorHandler – the interface to implement in order to handle parsing errors.
      ● org.xml.sax.helpers.DefaultHandler - default implementation of ContentHandler, ErrorHandler and a couple of other interfaces; can be extended to simplify SAX event handling.
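
For comparison with DOM, here is a sketch of the same kind of task done with SAX: the handler extends DefaultHandler and has to keep its own state (whether it is currently inside a step element) because no tree is available. Class and method names are invented for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the lecture code.
class SaxSteps {
    /** Collects the text of every step element using SAX events. */
    static List<String> steps(String xml) {
        final List<String> steps = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside a step element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("step".equals(qName)) text = new StringBuilder();
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times for one text node
                if (text != null) text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("step".equals(qName)) {
                    steps.add(text.toString());
                    text = null;
                }
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(steps(
            "<instructions><step>Mix</step><step>Knead &amp; bake</step></instructions>"));
    }
}
```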
  20. JAXP and XSL
      ● javax.xml.transform.TransformerFactory - creates Transformer instances: either a simple one that just copies the source to the result, or one with an associated stylesheet that does the actual transformation
      ● javax.xml.transform.Transformer - represents the transformation rules (stylesheet); used to transform the source XML and write the result
      ● javax.xml.transform.Source - interface for sources of transformation. Used to provide both the stylesheet and the XML to the Transformer.
        – Implementations: DOMSource, SAXSource, StreamSource, etc.
      ● javax.xml.transform.Result - interface for writing the transformation result.
        – Implementations: DOMResult, SAXResult, StreamResult, etc.
      ● javax.xml.transform.ErrorListener – interface for customized error handling
  21. JAXP and XPath
      ● javax.xml.xpath.XPathFactory - creates XPath instances and can be used to define custom XPathFunctionResolver and XPathVariableResolver
      ● javax.xml.xpath.XPath - XPath evaluation environment. Used to compile and evaluate XPath expressions. Evaluation takes the context node as a parameter to evaluate the expression on.
      ● javax.xml.xpath.XPathExpression - compiled XPath expression, used directly for multiple evaluations of the same expression.
      ● javax.xml.xpath.XPathConstants - a mapping between XPath and Java data types
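
A sketch of how these pieces might be combined: the expression is compiled once into an XPathExpression and then evaluated against several DOM documents, with XPathConstants.NODESET selecting a NodeList as the Java return type. The class name and sample documents are made up for the illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the lecture code.
class CompiledXPath {
    /** Returns the English book titles of each document, in document order. */
    static List<List<String>> englishTitles(String... documents) {
        try {
            // Compile once, evaluate many times
            XPathExpression expression = XPathFactory.newInstance().newXPath()
                    .compile("//book[@lang='en']/title");
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            List<List<String>> result = new ArrayList<>();
            for (String xml : documents) {
                Document doc = builder.parse(new InputSource(new StringReader(xml)));
                // The document is the context node; NODESET maps to org.w3c.dom.NodeList
                NodeList nodes = (NodeList) expression.evaluate(doc, XPathConstants.NODESET);
                List<String> titles = new ArrayList<>();
                for (int i = 0; i < nodes.getLength(); i++) {
                    titles.add(nodes.item(i).getTextContent());
                }
                result.add(titles);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(englishTitles(
            "<bookstore><book lang='en'><title>A</title></book><book lang='et'><title>B</title></book></bookstore>",
            "<bookstore><book lang='en'><title>C</title></book><book lang='en'><title>D</title></book></bookstore>"));
    }
}
```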
  22. JAXB
      ● JAXB = Java Architecture for XML Binding
        – XML serialization of Java objects
        – javax.xml.bind
        – involves generation of Java classes according to the XML schema or vice versa
        – JAXBContext is a factory for Marshaller and Unmarshaller
  23. JDOM & DOM4J
      ● The org.w3c.dom API was designed for any OO language and was mapped to Java more or less directly
        – the resulting API is not very convenient for Java
      ● Two similar 3rd party DOM APIs address this
        – JDOM is more lightweight and was proposed for inclusion in Java SE
        – DOM4J has integrated support for XPath, provides better interoperability with W3C DOM and Transformer
        – most operations can be done using single method calls
        – Java Strings and Collections are used
  24. XML generation
      ● There are many options:
        – String concatenation
          ● inflexible, can easily produce broken XML
        – Programmatic creation of DOM tree
        – Manual generation of SAX events
        – JDOM/DOM4J
        – XML marshalling using JAXB or similar API
        – Template engines, e.g. StringTemplate, Velocity
          ● basically pre-created XML files with holes that can be filled with data
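
To illustrate why string concatenation is fragile, the sketch below builds an element programmatically as a DOM tree and serializes it with an identity Transformer, which escapes special characters automatically. The class and method names are invented for this example.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of the lecture code.
class XmlGeneration {
    /** Builds an ingredient element via DOM and serializes it to a string. */
    static String ingredient(String name, String amount) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element element = doc.createElement("ingredient");
            element.setAttribute("amount", amount);
            element.setTextContent(name); // stored as-is, escaped only on output
            doc.appendChild(element);

            // A Transformer without a stylesheet just copies source to result
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String name = "Salt & pepper";
        // Naive concatenation leaves the bare & in place, which is not well-formed
        System.out.println("<ingredient>" + name + "</ingredient>");
        System.out.println(ingredient(name, "1"));
    }
}
```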
  25. Servlets
      ● Servlets are server-side Java applications
      ● The javax.servlet API is now officially a part of Java EE
      ● They process asynchronous requests and generate responses
      ● Servlets are most often used in Web applications
      ● Servlets are deployed and run within containers (web application servers)
        – there are many commercial application servers
        – Jetty and Tomcat are open-source ones
      ● JSP (Java Server Pages) are PHP/ASP-like HTML files with embedded Java, but they must be compiled into servlets (usually on-the-fly)
  26. Servlet API
      ● A Servlet must implement the javax.servlet.Servlet interface. However, most servlets extend either javax.servlet.GenericServlet or javax.servlet.http.HttpServlet.
      ● A container creates a single instance of the servlet class using the default constructor, then it calls the init() method
      ● On every client request, the service() method is called
        – for HTTP, there are various higher-level methods defined, e.g. doGet(), doPost(), doPut(), doDelete(), etc.
        – these methods must be thread-safe because they are executed concurrently. The javax.servlet.SingleThreadModel interface (deprecated since Servlet 2.4) can tell the container not to call them concurrently.
        – all these methods take HttpServletRequest and HttpServletResponse as parameters
  27. HttpServletRequest
      ● HttpServletRequest is for reading user input
        – getParameter() is for reading HTTP request parameters
        – getHeader() is for reading HTTP headers
        – getCookies() is for examining the available cookies
        – getSession() creates/obtains the HTTP session
        – getReader() / getInputStream() are for reading large request payloads (e.g. uploaded files)
        – getLocalXXX() / getServerXXX() return various info about the host where the servlet is running and the server itself
        – getRemoteXXX() returns various info on the remote client
        – various other methods provide even more information
  28. HttpServletResponse
      ● HttpServletResponse is for generating the response to the user
        – addCookie() adds a cookie to the response
        – addHeader() adds an arbitrary HTTP header to the response
        – getWriter() / getOutputStream() provide a writer or stream for writing response content; no further header modifications are possible if isCommitted() returns true
        – sendError() / setStatus() are for setting response status codes
        – setContentLength() sets the size in bytes of the output content
        – setContentType() sets the MIME type of the output content (text/html for HTML content)
        – there are a lot of SC_XXX status code constants defined
        – there are many other useful methods
  29. Sessions
      ● Sessions are used to persist some information (state) about the client between asynchronous requests
      ● Provided by the HttpSession interface
        – request.getSession() returns an instance
        – session attributes are any Objects with String keys; they are persisted until the session is either invalidate()d or expired (after 30 min by default)
      ● Servlet container uses either cookies or URL-rewriting to pass/retrieve the session ID
        – response.encodeURL() must be used with any output URLs for URL-rewriting to work, in case cookies are not available
        – these URLs typically look like this: http://host/servlet;jsessionid=72183CAFE23?abc=hello
  30. Servlet Filters
      ● Filters can be used to pre- or post-process requests
        – called in a chain, one after another before the servlet, like the decorator pattern
        – can be used for access control, logging, context initialization, compression, etc.
      ● Need to implement javax.servlet.Filter
        – method doFilter(request, response, chain)
        – to delegate processing further down the chain (optional), call chain.doFilter(request, response)
        – or requests can be processed directly, just like in a servlet
  31. Deployment
      ● Web applications typically have a defined directory structure
        – the root of the application is the document root, e.g. where images and other static content are located
        – there is a WEB-INF directory; files contained there are hidden from direct access
          ● web.xml – deployment descriptor, defines URL patterns, deployed servlets, various parameters, etc.
          ● classes – directory with compiled .class files
          ● lib – directory with .jar files (all are automatically loaded)
      ● Another possibility is to put the same things into a single .war (Web ARchive) file, which is in the same format as .jar
  32. web.xml example
      <?xml version="1.0"?>
      <web-app xmlns="" xmlns:xsi="" xsi:schemaLocation="">
        <servlet>
          <servlet-name>Hello</servlet-name>
          <servlet-class></servlet-class>
          <init-param>
            <param-name>name</param-name>
            <param-value>mega_value</param-value>
          </init-param>
        </servlet>
        <servlet-mapping>
          <servlet-name>Hello</servlet-name>
          <url-pattern>/hello*</url-pattern>
        </servlet-mapping>
      </web-app>
  33. Apache Digester
      ● A 3rd party jar for reading XML
        – stores data directly in a Java domain object tree (not DOM), e.g. Customer, Order
        – similar to unmarshalling; stack-based approach
        – rules can be created either programmatically or put into an XML file
      ● Example
        Digester digester = new Digester();
        digester.push(this);
        digester.addObjectCreate("customers/customer", Customer.class);
        digester.addSetProperties("customers/customer");
        digester.addSetNext("customers/customer", "addCustomer", Customer.class.getName());
        digester.addCallMethod("customers/customer/address", "setAddress", 0);
        digester.parse("customers.xml");
  34. More info
      ● Good source of information and tutorials about all W3, XML and related technologies –