Published in: Technology, Education

  1. DOM and SAX
     Jussi Pohjolainen, TAMK University of Applied Sciences
  2. DOM and SAX
       • DOM and SAX
           • Platform- and language-independent APIs for reading or manipulating XML documents
               • API: Application Programming Interface, a set of functions, procedures, methods, classes and interfaces
       • DOM and SAX are implemented in most programming languages: Java, PHP, ...
  3. Differences between DOM and SAX

                          DOM                                   SAX
     Standardization      W3C Recommendation                    No formal specification
     Manipulation         Reading and writing (manipulation)    Reading only
     Memory consumption   Depends on the size of the source     Very low
                          XML file; can be large
     XML handling         Tree-based                            Event-based
  4. SAX
  5. Overview of SAX
       • SAX: Simple API for XML
       • Originally a Java-only API
           • Nowadays SAX is supported in almost all programming languages
       • Uses an event-driven model
       • Memory usage is low
       • Only for reading XML documents
  6. Event-driven?
       • SAX uses an event-driven model for reading XML documents
       • The basic idea is that the SAX parser reads the XML document sequentially from start to end ("one line at a time").
       • Handler functions react when the parser finds elements and other parts of the XML document.
           • When the parser finds a start tag, a certain function is called; when the parser finds an end tag, another function is called
  7. Example (Wikipedia)
       <?xml version="1.0" encoding="UTF-8"?>
       <RootElement param="value">
           <FirstElement>
               Some Text
           </FirstElement>
           <SecondElement param2="something">
               Pre-Text <Inline>Inlined text</Inline> Post-text.
           </SecondElement>
       </RootElement>
  8. Example (Wikipedia)
       • XML Processing Instruction, named xml, with attributes version equal to "1.0" and encoding equal to "UTF-8"
       • XML Element start, named RootElement, with an attribute param equal to "value"
       • XML Element start, named FirstElement
       • XML Text node, with data equal to "Some Text" (note: text processing, with regard to spaces, can be changed)
       • XML Element end, named FirstElement
       • ....
  9. PHP and SAX
       // Create an XML parser
       $xml_parser = xml_parser_create();
       // Register the handler functions
       xml_set_element_handler($xml_parser, "startElement", "endElement");
       xml_set_character_data_handler($xml_parser, "characterData");
       // Open the XML file ($file holds the path to the input document)
       if (!($fp = fopen($file, "r"))) {
           die("could not open XML input");
       }
       // Read and parse the XML file in 4096-byte chunks
       while ($data = fread($fp, 4096)) {
           if (!xml_parse($xml_parser, $data, feof($fp))) {
               die(sprintf("XML error: %s at line %d",
                   xml_error_string(xml_get_error_code($xml_parser)),
                   xml_get_current_line_number($xml_parser)));
           }
       }
       // Release the file handle and the parser
       fclose($fp);
       xml_parser_free($xml_parser);
  10. PHP and SAX
       // Called when the parser finds a start tag
       function startElement($parser, $name, $attrs)
       {
           // Do something
       }
       // Called when the parser finds an end tag
       function endElement($parser, $name)
       {
           // Do something
       }
       // Called for text between tags
       function characterData($parser, $data)
       {
           echo $data;
       }
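The two slides above can be combined into one small script. The following is a minimal sketch, not the author's code: `collectSaxEvents` is a hypothetical helper name, the handlers are closures instead of named functions, and the input is a shortened version of the Wikipedia example passed as a string rather than read from a file. Case folding is switched off because PHP's parser upper-cases element names by default.

```php
<?php
// Hypothetical helper: runs a SAX parse over an XML string and
// returns the events in the order the parser reported them.
function collectSaxEvents(string $xml): array
{
    $events = [];
    $parser = xml_parser_create();
    // Keep element names as written (the default is to upper-case them).
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$events) {
            $events[] = "start:$name";
        },
        function ($p, $name) use (&$events) {
            $events[] = "end:$name";
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$events) {
            // Ignore whitespace-only text between elements.
            if (trim($data) !== '') {
                $events[] = 'text:' . trim($data);
            }
        }
    );
    // Third argument true: this is the final (and only) chunk of input.
    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(
            xml_error_string(xml_get_error_code($parser))
        );
    }
    return $events;
}

$xml = '<RootElement param="value"><FirstElement>Some Text</FirstElement></RootElement>';
print_r(collectSaxEvents($xml));
```

The parser never holds the whole document: each handler sees one event and then the data is gone, which is why memory usage stays low.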
  11. Benefits of SAX
       • Excellent API when just reading the contents of an XML file
       • Easy and clean API
       • Does not require many resources (mobile devices!)
  12. DOM
  13. DOM
       • The Document Object Model (DOM) is a platform- and language-independent standard object model for representing HTML or XML and related formats.
       • W3C Recommendation
       • Can be used for manipulating XML documents
       • Different versions: DOM 1, DOM 2 and DOM 3
  14. Basic Idea behind DOM
       • API for manipulating XML documents
       • DOM loads the whole XML document into memory and creates a tree model of the XML data.
           • Can consume a lot of memory if documents are large
  15. Tree and Nodes
       • The tree consists of nodes
       • A node can be
           • Element (Element)
           • Text (Text)
           • Attribute (Attr)
           • CDATA (CDATASection)
           • Comment (Comment)
           • Etc.
  16. Nodes and Relationships
       • A node has references to its
           • first child (firstChild)
           • last child (lastChild)
           • next sibling (nextSibling)
           • previous sibling (previousSibling)
           • parent (parentNode)
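The references listed above are enough to walk a tree without any collection object. A minimal sketch, assuming a compact version of the `kirjat` document used later in these slides; `childNames` is a hypothetical helper, not part of the DOM API.

```php
<?php
// Hypothetical helper: lists the names of a node's children by
// following firstChild and then nextSibling until it runs out.
function childNames(DOMNode $node): array
{
    $names = [];
    for ($child = $node->firstChild; $child !== null; $child = $child->nextSibling) {
        $names[] = $child->nodeName;
    }
    return $names;
}

$dom = new DOMDocument();
// No whitespace between tags, so every child is an element node.
$dom->loadXML('<kirjat><kirja><nimi>Tuntematon Sotilas</nimi></kirja>'
    . '<kirja><nimi>Learn Java</nimi></kirja></kirjat>');

$root  = $dom->documentElement;   // <kirjat>
$first = $root->firstChild;       // first <kirja>
$last  = $root->lastChild;        // second <kirja>

print_r(childNames($root));
var_dump($first->nextSibling->isSameNode($last));  // siblings are linked
var_dump($first->parentNode->isSameNode($root));   // child points back to parent
var_dump($first->previousSibling === null);        // nothing before the first child
```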
  17. Node's contents
       • Some nodes have contents (nodeValue)
           • Attribute's value
           • Element's value (text); in the W3C DOM an element's nodeValue is null and the text lives in its child Text nodes, but PHP's DOM returns the element's text content here
           • Comment's value (text)
           • etc.
  18. Collections
       • NodeList (list of nodes)
           • length
           • item( index )
       • NamedNodeMap (list of attributes)
           • getNamedItem( name )
           • item( index )
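Both collection types can be seen in a few lines of PHP, where they appear as `DOMNodeList` and `DOMNamedNodeMap`. This is a sketch under assumptions: the `books` document and its `id` and `lang` attributes are invented for illustration and do not come from the slides.

```php
<?php
$dom = new DOMDocument();
// Assumed sample data; the attribute names are illustrative only.
$dom->loadXML('<books>'
    . '<book id="1" lang="fi"><name>A</name></book>'
    . '<book id="2" lang="en"><name>B</name></book>'
    . '</books>');

// NodeList: index-based access with length and item().
$books = $dom->getElementsByTagName('book');
$summary = [];
for ($i = 0; $i < $books->length; $i++) {
    // NamedNodeMap: the attributes of one element, looked up by name.
    $attrs = $books->item($i)->attributes;
    $summary[] = $attrs->getNamedItem('id')->nodeValue
        . ':' . $attrs->getNamedItem('lang')->nodeValue;
}

print_r($summary);
```

`item()` returns `null` for an index that is out of range, so the loop bound on `length` matters.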
  19. Example using PHP DOM
       // Load the XML document
       $dom = new DOMDocument();
       $dom->load("books.xml");
       // NodeList of name elements
       $listOfNodes = $dom->getElementsByTagName("name");
       // Browse all nodes
       foreach ($listOfNodes as $node)
       {
           print $node->nodeValue;
       }
  20. Example using PHP DOM
       // Load the XML document
       $dom = new DOMDocument();
       $dom->load("books.xml");
       // Create element <book></book>
       $book = $dom->createElement("book");
       // Create element <title>some contents</title>
       // (createTextNode() escapes the user input; the value argument
       // of createElement() is not escaped)
       $title = $dom->createElement("title");
       $title->appendChild($dom->createTextNode($_GET['title']));
       // <book><title>some contents</title></book>
       $book->appendChild($title);
       // Add the book under the root element of "books.xml"
       $dom->documentElement->appendChild($book);
       // Save
       $dom->save("books.xml");
  21. Removing element
       $elements = $dom->getElementsByTagName("kirja");
       $element  = $elements->item(0);
       // childNodes is a property, not a method
       $children = $element->childNodes;
       // With the whitespace shown below, item(0) is the whitespace text node
       // before <nimi> unless preserveWhiteSpace is set to false before loading
       $child = $element->removeChild($children->item(0));

       <kirjat>
           <kirja>
               <nimi>Tuntematon Sotilas</nimi>
           </kirja>
           <kirja>
               <nimi>Learn Java</nimi>
           </kirja>
       </kirjat>
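A self-contained variant of the removal above, as a sketch rather than the author's method: instead of relying on `childNodes->item(0)`, which may be a whitespace text node, it looks up the `nimi` element by name and removes that. The document is loaded from a string here; the slide presumably loads it from a file.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<kirjat>
  <kirja>
    <nimi>Tuntematon Sotilas</nimi>
  </kirja>
  <kirja>
    <nimi>Learn Java</nimi>
  </kirja>
</kirjat>');

// First <kirja> element and the <nimi> element inside it.
$first = $dom->getElementsByTagName('kirja')->item(0);
$nimi  = $first->getElementsByTagName('nimi')->item(0);

// removeChild() detaches the node from the tree and returns it.
$removed = $first->removeChild($nimi);

echo $removed->nodeValue, "\n";
echo $dom->saveXML();
```

The removed node still exists as an object, so it can be inspected or appended somewhere else in the same document.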
  22. PHP DOM and Encoding
       • PHP DOM uses UTF-8 internally
       • Everything you put into an XML document using PHP DOM must be converted to UTF-8.
       • utf8_encode(..) converts ISO-8859-1 to UTF-8 and utf8_decode(...) converts back; both are deprecated as of PHP 8.2, where mb_convert_encoding() is the usual replacement
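A minimal sketch of the conversion step, assuming the input arrives as ISO-8859-1 and that the mbstring extension is installed. The sample string and the `nimi` element are illustrative only.

```php
<?php
// Assumed input: "äiti" encoded as ISO-8859-1 (0xE4 is "ä" in that encoding).
$latin1 = "\xE4iti";

// Convert to UTF-8 before handing the text to the DOM.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

$dom = new DOMDocument('1.0', 'UTF-8');
$el  = $dom->createElement('nimi');
$el->appendChild($dom->createTextNode($utf8));
$dom->appendChild($el);

echo $dom->saveXML();
```

Passing the unconverted ISO-8859-1 bytes to the DOM instead would hand it invalid UTF-8, which is the situation the slide warns about.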