XML and XPath with PHP

                                Tobias Schlitt <toby@php.net>

                                    ...
About me



    Tobias Schlitt <toby@php.net>
    PHP since 2001
    Freelancing consultant
    Qualified IT Specialist
   ...
And who are you?




     What is your name?
     Where are you from?
     What is your business?
     What are your exper...
Overview




1 XML


2 XML in PHP


3 XPath




 Tobias Schlitt (IPC SE 2009)   XML and XPath with PHP   2009-05-29   4 / ...
Outline




1 XML
     Overview
     Terminology
     The XML tree

2 XML in PHP


3 XPath




Outline - XML: Overview
What is XML?



     General specification for creating markup languages
     Data exchange between computer systems
      ...
XML languages by example



    XHTML                  XML variant of the popular HTML
          RSS              Really S...
Related technologies - Schemas

Schema
A schema defines the structure of XML instance documents.

XMLSchema                  W...
Related technologies - Querying




Query
A query extracts a sub-set of information from a data source.

        XPath    ...
Outline - XML: Terminology
An XML document
<?xml version="1.0" encoding="UTF-8"?>
<bookshelf>

  <book id="1">
    <title lang="en">Beautiful code</t...
Preamble
<?xml version="1.0" encoding="UTF-8"?>
<bookshelf>

  <book id="1">
    <title lang="en">Beautiful code</title>
 ...
Element nodes
<?xml version="1.0" encoding="UTF-8"?>
<bookshelf>

  <book id="1">
    <title lang="en">Beautiful code</tit...
Attribute nodes
<?xml version="1.0" encoding="UTF-8"?>
<bookshelf>

  <book id="1">
    <title lang="en">Beautiful code</t...
Atomic value nodes
<?xml version="1.0" encoding="UTF-8"?>
<bookshelf>

  <book id="1">
    <title lang="en">Beautiful code...
Document element
<?xml version="1.0" encoding="UTF-8"?>
<bookshelf>

  <book id="1">
    <title lang="en">Beautiful code</...
Outline - XML: The XML tree
The XML tree

                                         <bookshelf>

                                <book>                ...
Children I

                                          <bookshelf>

                                 <book>                ...
Children II

                                          <bookshelf>

                                 <book>               ...
Parent

                                         <bookshelf>

                                <book>                     <...
Descendants

                                         <bookshelf>

                                <book>                 ...
Ancestors

                                         <bookshelf>

                                <book>                  ...
Siblings

                                          <bookshelf>

                                 <book>                  ...
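
The tree relations named on these slides (children, parent, siblings) correspond to properties of PHP's DOM extension. The following is a minimal sketch, not code from the talk. It assumes a shortened inline copy of the bookshelf document, written without whitespace between tags so that no whitespace-only text nodes appear among the children.

```php
<?php
// Shortened copy of the bookshelf example (an assumption for illustration).
$xml = '<bookshelf><book id="1">'
     . '<title lang="en">Beautiful code</title>'
     . '<author>A. Oram</author><author>G. Wilson</author>'
     . '<year>2007</year><price currency="Euro">35.95</price>'
     . '</book></bookshelf>';

$dom = new DOMDocument();
$dom->loadXML($xml);

$book  = $dom->documentElement->firstChild; // the <book> element
$title = $book->firstChild;                 // its first child, <title>

// Children: childNodes holds the element children, not the id attribute.
$childNames = [];
foreach ($book->childNodes as $child) {
    $childNames[] = $child->nodeName;
}

$parentName      = $title->parentNode->nodeName;  // parent of <title>
$nextSiblingName = $title->nextSibling->nodeName; // next sibling of <title>
$attributeCount  = $book->attributes->length;     // attributes live in a separate map

echo implode(', ', $childNames), "\n";
echo $parentName, ' / ', $nextSiblingName, ' / ', $attributeCount, "\n";
```

Because attributes are kept in a separate map, iterating over childNodes or following nextSibling never yields an attribute, which matches the remark that attributes are not siblings of elements.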
CDATA
CDATA
Avoid the escaping hell in text content.


Without CDATA
<bookshelf>
  <book id="1">
     <titl...
With CDATA
<bookshelf>
  <book id="1">
     <title ...
The CDATA dilemma
<bookshelf>
  <book id="1">
     <ti...
The CDATA dilemma workaround
<bookshelf>
  <book id="1">...
The CDATA dilemma solution
<bookshelf>
  <book id="1">
 ...
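
The full text of the "CDATA dilemma solution" slide (slide 30 in the transcript) stores the hint as Base64 text and points to the built-in functions base64_encode() and base64_decode(). A minimal sketch of that round trip follows. The use of DOMDocument to write and read the element is an assumption for illustration, not the speaker's code.

```php
<?php
// Text that cannot be placed inside a CDATA section unchanged,
// because it contains the CDATA end marker itself.
$hint = 'Some examples show the usage of <![CDATA[ ]]>';

// Encode before writing: Base64 output contains no XML special characters.
$encoded = base64_encode($hint);

$dom = new DOMDocument('1.0', 'UTF-8');
$dom->appendChild($dom->createElement('hint', $encoded));
$xml = $dom->saveXML();

// Decode after reading the document back.
$read = new DOMDocument();
$read->loadXML($xml);
$decoded = base64_decode($read->documentElement->textContent);

echo $encoded, "\n";
echo $decoded, "\n";
```

The price of this approach is that the content is no longer human readable in the XML source, and every consumer has to know that the element is Base64 encoded.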
Comments


Comments in XML
<bookshelf>

   <book id="1">
     <title lang="en">Beautiful co...
Namespaces
Namespaces
Help avoid naming conflicts between different XML sources.


Single, default namespace

<books...
Multiple namespaces

<bookshel...
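
To show how namespaces affect element and attribute lookup, here is a small sketch using the namespace-aware methods of PHP's DOM extension. The XML is a shortened copy of the "Multiple namespaces" document from the transcript (slide 33). The PHP code itself is an illustration and not taken from the talk.

```php
<?php
$xml = <<<'XML'
<bookshelf xmlns="http://example.com/book"
           xmlns:book="http://example.com/book"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <book id="1">
    <dc:title book:lang="en">Beautiful code</dc:title>
    <author>A. Oram</author>
    <author>G. Wilson</author>
  </book>
</bookshelf>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$bookNs = 'http://example.com/book';
$dcNs   = 'http://purl.org/dc/elements/1.1/';

// Elements and attributes are addressed by namespace URI plus local name,
// the prefix used in the source document does not matter.
$title       = $dom->getElementsByTagNameNS($dcNs, 'title')->item(0);
$titleText   = $title->textContent;
$titleLang   = $title->getAttributeNS($bookNs, 'lang');

// <author> has no prefix, so it is in the default namespace.
$authorCount = $dom->getElementsByTagNameNS($bookNs, 'author')->length;

echo $titleText, ' (', $titleLang, '), ', $authorCount, " authors\n";
```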
Outline



1 XML


2 XML in PHP
     DOM
         Introductory example
         Essential classes
         Practical DOM...
Overview



XML APIs
PHP has quite a few XML APIs.

     The most important are:
            DOM
            XMLReader / XM...
Overview - DOM


     Document Object Model
     Standardized API to access XML tree
            W3C recommendation
      ...
Overview - XMLReader/-Writer




     Popular approach to access XML data
     Similar implementations available in
      ...
Overview - SimpleXml




     Very simple access to XML data
     Unique (?) to PHP
     Represents XML structures as obje...
APIs compared

                                     DOM        XMLReader/-Writer   SimpleXML
        Read                 ...
Outline - XML in PHP: DOM
Outline - DOM




                                Introductory example
Getting started
Printing all authors

$dom = new DOMDocument();
$dom->load('sources/example.xml'...
Outline - DOM




                                Essential classes




DOMNode

Purpose
Base class for all node types (elements, attributes, ...).


     Typical tree operations               ...
DOMElement

Purpose
Representation of an element (extends DOMNode)


     Attribute related operations                       ...
DOMDocument
Purpose
Representation of an XML document


    Creation methods                                     Load /save...
DOMNodeList
Purpose
Collection of DOMNodes (elements, attributes, ...). Iterable.

     Operations
            item()
  ...
Outline - DOM




                                Practical DOM




Reading XML I



Reading

$dom = new DOMDocument();
$dom->load('sources/example.xml');

$ro...
Reading XML II

Reading II

foreach ($books as $book)
{
        echo 'Book with ID '
              . $bo...
Reading XML III




Output

Document is a bookshelf
 Book with ID 1 costs 35.95 Euro
 Book with ID ...
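
Since the slide code is cut off in this copy, the following sketch shows one way to produce output of this shape with DOM. It is an assumption-laden reconstruction and not the speaker's code: the document is loaded from an inline string instead of sources/example.xml, and the variable names are chosen for illustration.

```php
<?php
// Shortened inline copy of the bookshelf example (assumption for illustration).
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<bookshelf>
  <book id="1">
    <title lang="en">Beautiful code</title>
    <price currency="Euro">35.95</price>
  </book>
  <book id="2">
    <title lang="de">eZ Components - Das Entwicklerhandbuch</title>
    <price currency="Euro">39.95</price>
  </book>
</bookshelf>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

// The document element is the single root element of the tree.
$root  = $dom->documentElement;
$lines = ['Document is a ' . $root->nodeName];

// DOMNodeList can be iterated directly with foreach.
foreach ($root->getElementsByTagName('book') as $book) {
    $price   = $book->getElementsByTagName('price')->item(0);
    $lines[] = sprintf(
        'Book with ID %s costs %s %s',
        $book->getAttribute('id'),
        $price->nodeValue,
        $price->getAttribute('currency')
    );
}

echo implode("\n", $lines), "\n";
```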
Create XML I



Creating

$dom = new DOMDocument('1.0', 'UTF-8');
$dom->formatOutput = true;

$r...
Create XML II
Creating II

$dvd->setAttribute('id', 1);

$dvd->appendChild(
    $dom->cre...
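
The creation code above is truncated, so here is a self-contained sketch of the same pattern: create nodes through the document, then attach them with appendChild(). The element names dvdshelf, dvd and title and the text "Some movie" are hypothetical and only serve the illustration.

```php
<?php
$dom = new DOMDocument('1.0', 'UTF-8');
$dom->formatOutput = true; // indent the serialized output

// Nodes are created by the document and only become part of the tree
// once they are appended. All element names here are hypothetical.
$root = $dom->appendChild($dom->createElement('dvdshelf'));
$dvd  = $root->appendChild($dom->createElement('dvd'));

$dvd->setAttribute('id', '1');
$dvd->appendChild($dom->createElement('title', 'Some movie'));

$xml = $dom->saveXML();
echo $xml;
```

appendChild() returns the appended node, which is why the result can be assigned and used for further calls.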
Manipulate XML I



Source XML
<?xml version="1.0" encoding="UTF-8"?>
<bookshelf>         <book...
Manipulate XML II



Manipulating

$dom = new DOMDocument();
$dom->formatOutput = true;

$dom->load(...
Manipulate XML III

Manipulating II

$root = $dom->documentElement;

$newBook = $root->appendChild(
 ...
Manipulate XML IV


Output

<?xml version="1.0" encoding="UTF-8"?>
<bookshelf>
  <book id="1...
Outline - XML in PHP: XMLReader/-Writer (by example)
Reading XML I



Reading

$reader = new XMLReader();
$reader->open('sources/example.xml...
Reading XML II

Reading

        if ($reader->localName === 'book')
        {
             echo 'Book...
Reading XML III




Output

Book with ID 1 costs 35.95 Euro
Book with ID 2 costs 39.95 Euro
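
For comparison with the DOM version, this sketch produces the same two lines with XMLReader, which walks the document node by node instead of loading the full tree. It is an illustration under stated assumptions, not the truncated slide code: the XML comes from an inline string, and the book ID is remembered in a variable until the matching price element is reached.

```php
<?php
// Shortened inline copy of the bookshelf example (assumption for illustration).
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<bookshelf>
  <book id="1">
    <price currency="Euro">35.95</price>
  </book>
  <book id="2">
    <price currency="Euro">39.95</price>
  </book>
</bookshelf>
XML;

$reader = new XMLReader();
$reader->XML($xml);

$lines     = [];
$currentId = null;

// read() moves the cursor to the next node in document order.
while ($reader->read()) {
    if ($reader->nodeType !== XMLReader::ELEMENT) {
        continue; // skip text, whitespace and end-element nodes
    }
    if ($reader->localName === 'book') {
        $currentId = $reader->getAttribute('id');
    } elseif ($reader->localName === 'price') {
        $lines[] = sprintf(
            'Book with ID %s costs %s %s',
            $currentId,
            $reader->readString(),
            $reader->getAttribute('currency')
        );
    }
}
$reader->close();

echo implode("\n", $lines), "\n";
```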




Create XML I



Creating

$writer = new XMLWriter();
$writer->openUri('sources/xmlwr...
Create XML II


Creating II

$writer->startElement('dvd');
$writer->writeAttribu...
Create XML III




Output

<?xml version="1.0" encoding="UTF-8"?>
<bookshelf>
  <book id="1"...
Outline - XML in PHP: SimpleXml (by example)
Reading XML I



Reading

$xml = simplexml_load_file('sources/example.xml');

foreach ...
Reading XML II




Output

 Book with ID 1 costs 35.95 Euro
 Book with ID 2 costs 39.95 Euro
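
The loop body is cut off above. As a hedged sketch, the same output can be produced with SimpleXML as follows. The inline XML string and the use of simplexml_load_string() instead of simplexml_load_file() are assumptions made to keep the example self-contained.

```php
<?php
// Shortened inline copy of the bookshelf example (assumption for illustration).
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<bookshelf>
  <book id="1">
    <price currency="Euro">35.95</price>
  </book>
  <book id="2">
    <price currency="Euro">39.95</price>
  </book>
</bookshelf>
XML;

$shelf = simplexml_load_string($xml);
$lines = [];

// Child elements are exposed as properties, attributes as array keys.
foreach ($shelf->book as $book) {
    $lines[] = sprintf(
        'Book with ID %s costs %s %s',
        $book['id'],
        $book->price,
        $book->price['currency']
    );
}

echo implode("\n", $lines), "\n";
```

sprintf() casts the SimpleXMLElement objects to strings, so no explicit (string) casts are needed here.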




Create XML I

Creating

$xml = new SimpleXmlElement('<dvdshelf></dvdshelf>');

$xml->a...
Outline



1 XML


2 XML in PHP


3 XPath
     Overview
     Basics
     In depth
         Axis
         Predicates
     X...
Outline - XPath: Overview
Overview


XPath
Enables you to select information parts from XML documents.

     Traverse the XML tree
     Select XML n...
XML example reminder

<bookshelf>
  <book id="1">
     <title lang="en">Beautiful code</ti...
Introductory example

<bookshelf>
  <book id="1">
     <title lang="en">Beautiful code</ti...
Outline - XPath: Basics
Addressing

      Every XPath expression matches a set of nodes (0..n)
      It encodes an “address” for the selected node...
Contexts

       Every expression step creates a new context
       The next step is evaluated in the context created by the pr...
Attributes




Select attributes
Prepend the attribute name with an @.

Select all currency attributes

/bookshel...
Steps


Navigation
Navigation is not only possible in parent → child direction.

Navigate to parent

../
/bookshel...
Indexing




Access nodes by position
It is possible to access a specific node in a set by its position.

Indexing

/boo...
Wildcards




Wildcard search
A wildcard represents a node of a certain type with an arbitrary name.

Wildcards
/booksh...
Union




Union
Combine the node sets selected by multiple XPath expressions.

Union
/bookshelf/book/title ...
A first PHP example

Querying first book title

$dom = new DOMDocument();
$dom->load('sources/example...
Second PHP example
Querying all currencies

// ...

$currencies = $xpath->query('//price/@cu...
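
Both PHP examples are truncated in this copy. The sketch below shows the general pattern with DOMXPath: wrap the loaded document, call query(), and iterate the resulting DOMNodeList. The two expressions used here, /bookshelf/book[1]/title and //price/@currency, are assumptions based on the slide titles, and the XML is an inline copy rather than sources/example.xml.

```php
<?php
// Shortened inline copy of the bookshelf example (assumption for illustration).
$xml = <<<'XML'
<bookshelf>
  <book id="1">
    <title lang="en">Beautiful code</title>
    <price currency="Euro">35.95</price>
  </book>
  <book id="2">
    <title lang="de">eZ Components - Das Entwicklerhandbuch</title>
    <price currency="Euro">39.95</price>
  </book>
</bookshelf>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

// DOMXPath evaluates XPath 1.0 expressions against a DOMDocument.
$xpath = new DOMXPath($dom);

// Title of the first book: an absolute path with a positional index.
$firstTitle = $xpath->query('/bookshelf/book[1]/title')->item(0)->nodeValue;

// All currency attributes, wherever a <price> element occurs.
$currencies = [];
foreach ($xpath->query('//price/@currency') as $attribute) {
    $currencies[] = $attribute->nodeValue;
}

echo $firstTitle, "\n";
echo implode(', ', $currencies), "\n";
```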
Outline - XPath: In depth
XPath syntax


                                                  4 axis:
    An XPath query consists of                   ...
Outline - In depth




                                         Axis




Axis
13 dimensions
Imagine an XML document to be a 13 dimensional space...

                                 Axis         ...
Axis



Axis syntax

         <axisname>::<nodetest>



Child axis
              /bookshelf/bo...
Descendant-or-self axis
              //book
   ...
Attribute axis
              //book/@id
      ...
Parent axis
              //book/@id/..
   ...
Ancestor axis
XPath with ancestor axis
           //title/ancestor::*



Selected XML nodes
<?xml ver...
Following axis
XPath with following axis
           //book/following::*



Selected XML nodes
<?xml vers...
Following-sibling axis
XPath with following-sibling axis
           //book/following-sibling::*



Selected X...
Namespace axis
XPath with namespace axis

          namespace::*



XML with namespaces

<bookshelf
      xmlns...
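
The axis examples above can be tried out with DOMXPath. The following sketch runs four of the expressions from the slides against a shortened bookshelf document and collects the names of the selected nodes. The helper function selectNames() is hypothetical and only exists for this illustration. It sorts the names so that the result does not depend on the order in which the nodes are returned.

```php
<?php
// Hypothetical helper: names of all nodes selected by an expression, sorted.
function selectNames(DOMXPath $xpath, $expression)
{
    $names = [];
    foreach ($xpath->query($expression) as $node) {
        $names[] = $node->nodeName;
    }
    sort($names);
    return $names;
}

// Shortened inline copy of the bookshelf example (assumption for illustration).
$xml = <<<'XML'
<bookshelf>
  <book id="1">
    <title lang="en">Beautiful code</title>
    <price currency="Euro">35.95</price>
  </book>
  <book id="2">
    <title lang="de">eZ Components - Das Entwicklerhandbuch</title>
    <price currency="Euro">39.95</price>
  </book>
</bookshelf>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// ancestor: every element above a <title>, here <bookshelf> and both <book>s.
$ancestors = selectNames($xpath, '//title/ancestor::*');

// following-sibling: elements after a <book> on the same level.
$followingSiblings = selectNames($xpath, '//book/following-sibling::*');

// following: everything after the first <book> in document order,
// excluding its own descendants.
$following = selectNames($xpath, '//book[1]/following::*');

// parent (abbreviated as ..): the element that owns each id attribute.
$parents = selectNames($xpath, '//book/@id/..');

print_r([$ancestors, $followingSiblings, $following, $parents]);
```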
Outline - In depth




                                    Predicates




Predicate syntax




      You already saw indexing with numeric predicates
      Predicates can also be booleans
      Fu...
Predicate examples

Select all books with ID 1
              //book[@id = '1']

Predicate examples

Select all books that have any attribute at all

              //book[@*]
              //book/...
Predicate examples

Select all books whose price rounds to 40

              //book[round(price) = 40]

Predicate examples

Select all authors with first name initial T
                //book/author[substring...
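
The predicate examples can be checked with DOMXPath as sketched below. The first three expressions are taken from the slides and extended by /@id so that the ID of each matching book can be printed. The last expression is truncated on the slide, so the substring(., 1, 1) = 'T' form used here is an assumption about how the test could be written. The helper selectValues() is hypothetical.

```php
<?php
// Hypothetical helper: string values of all nodes selected by an expression.
function selectValues(DOMXPath $xpath, $expression)
{
    $values = [];
    foreach ($xpath->query($expression) as $node) {
        $values[] = $node->nodeValue;
    }
    return $values;
}

// Shortened inline copy of the bookshelf example (assumption for illustration).
$xml = <<<'XML'
<bookshelf>
  <book id="1">
    <author>A. Oram</author>
    <author>G. Wilson</author>
    <price currency="Euro">35.95</price>
  </book>
  <book id="2">
    <author>T. Schlitt</author>
    <author>K. Nordmann</author>
    <price currency="Euro">39.95</price>
  </book>
</bookshelf>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Boolean predicate on an attribute value.
$withId1 = selectValues($xpath, "//book[@id = '1']/@id");

// A node set in a predicate is true when it is not empty.
$withAnyAttribute = selectValues($xpath, '//book[@*]/@id');

// Function call inside a predicate: round(35.95) is 36, round(39.95) is 40.
$priceRoundsTo40 = selectValues($xpath, '//book[round(price) = 40]/@id');

// substring() positions start at 1 in XPath.
$initialT = selectValues($xpath, "//book/author[substring(., 1, 1) = 'T']");

print_r([$withId1, $withAnyAttribute, $priceRoundsTo40, $initialT]);
```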
Operator overview
Mathematical operators
 +, -, *      Addition, subtraction, multiplication
   div        Division
  mod ...
Functions by example
     String functions
     string-join() Concatenates 2 strings
      substring() Extracts a part fro...
Outline - XPath in action




                                 Namespaces in RDF




Find all namespaces in an RDF document

           RDF Resource Description Framework

      Semantic web
      Makes heav...
Outline - XPath in action




                                      pQuery




jQuery




     Cool JavaScript framework
     Allows selecting HTML elements by CSS selectors
     http://jquery.com/



...
pQuery




     Little example class
     Models a tiny bit of jQuery, using
            DOM
            XPath
Example HTML
<html>
  <head>
     <title>Some website</title>
  </head>
<body>
  <h1>Some headline...
Usage
require 'pquery/pquery.php';

$dom = new DOMDocument();
$dom->formatOutput = t...
The pQuery class
class pQuery
{
        protected $dom;

       protected $context;

       protecte...
Issuing a query
      // ...

      public function query($string)
      {
             $xpath...
Creating XPath
     // ...

     protected function createXPath($string)
     {
         $parts = e...
Creating XPath part I
     // ...

     protected function createSingleXPath($string)
     {
         ...
Creating XPath part II
              // ...

              if (count($parts) < 3)
              {
           ...
Creating selector
        // ...

        protected function createSelector(array $parts)
        {
    ...
Adding a class I
     // ...

     public function addClass($class)
     {
         foreach ($this->c...
Adding a class II
                      // ...

                      if (!in_array($class, $classes))
         ...
Simple example
require 'pquery/pquery.php';

$dom = new DOMDocument();
$dom->formatOutp...
Simple example result
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
          "http:/...
Advanced example
require 'pquery/pquery.php';

$dom = new DOMDocument();
$dom->formatOut...
Advanced example result
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
          "http:...
The end

                                Thank you for listening!
     Are there any questions left?
     I hope you learn...
XML and its related technologies are ubiquitous in today's web development. PHP offers many ways to create and process XML content. This workshop will give you an overview of the most important XML extensions for PHP, focusing on the use of XPath in cooperation with them. Do you still scrape web content using regular expressions? Ever wondered how people do all those nifty operations in their XSLTs? Don't know what axes are in terms of XPath? If you can answer any of the questions above with "yes" or are simply interested in XPath and XML in PHP, you should join this session.


XML and XPath with PHP

  1. XML and XPath with PHP, Tobias Schlitt <toby@php.net>, IPC SE 2009, 2009-05-29 (slide footer: Tobias Schlitt (IPC SE 2009), XML and XPath with PHP, 2009-05-29, 1 / 102)
  2. About me: Tobias Schlitt <toby@php.net>; PHP since 2001; freelancing consultant; Qualified IT Specialist; studying CS at TU Dortmund (expect to finish this year); OSS addicted: PHP, eZ Components, PHPUnit
  3. And who are you? What is your name? Where are you from? What is your business? What are your experiences with XML / XPath? What do you expect from this workshop?
  4. Overview: 1 XML; 2 XML in PHP; 3 XPath
  5. Outline: 1 XML (Overview, Terminology, The XML tree); 2 XML in PHP; 3 XPath
  6. Outline - XML: Overview
  7. What is XML? General specification for creating markup languages; data exchange between computer systems; system independent; human readable; most used on the web; successor of SGML; W3C recommendation: 1.0 in 1998 (last update 2008-11-26), 1.1 in 2004 (last update 2006-08-16)
  8. XML languages by example. XHTML: XML variant of the popular HTML. RSS: Really Simple Syndication; provides news / updates / ... of websites; read by special clients; aggregation on portals / planets. SVG: Scalable Vector Graphics; describes vector graphics in XML; potentially interactive / animated (via ECMAScript)
  9. Related technologies - Schemas. Schema: a schema defines the structure of XML instance documents. XMLSchema: written in XML; W3C recommendation; popular. RelaxNG: 2 syntax variants (XML based, short plain text based); OASIS / ISO standard; popular. DTD: plain text; W3C recommendation; deprecated
  10. Related technologies - Querying. Query: a query extracts a sub-set of information from a data source. XPath: W3C recommendation; navigation in XML documents; more on that later... XQuery: functional programming language; allows complex queries
  11. Outline - XML: Terminology
  12. An XML document
        <?xml version="1.0" encoding="UTF-8"?>
        <bookshelf>
          <book id="1">
            <title lang="en">Beautiful code</title>
            <author>A. Oram</author>
            <author>G. Wilson</author>
            <year>2007</year>
            <price currency="Euro">35.95</price>
          </book>
          <book id="2">
            <title lang="de">eZ Components - Das Entwicklerhandbuch</title>
            <author>T. Schlitt</author>
            <author>K. Nordmann</author>
            <year>2007</year>
            <price currency="Euro">39.95</price>
          </book>
        </bookshelf>
  13. Preamble: the XML declaration <?xml version="1.0" encoding="UTF-8"?> at the top of the document from slide 12
  14. Element nodes: <bookshelf>, <book>, <title>, <author>, <year> and <price> in the document from slide 12
  15. Attribute nodes: id, lang and currency in the document from slide 12
  16. Atomic value nodes: the text values in the document from slide 12, for example "Beautiful code", "A. Oram", "2007" and "35.95"
  17. Document element: <bookshelf>, the single outermost element of the document from slide 12
  18. Outline - XML: The XML tree
  19. The XML tree (diagram): <bookshelf> has two <book> children. The first <book> has the attribute id="1" and the children <title> (attribute lang="en", text "Beautiful code"), <author> (A. Oram), <author> (G. Wilson), <year> (2007) and <price> (attribute currency="Euro", text 39.95).
  20. Children I (same tree diagram as slide 19)
  21. Children II (same tree diagram as slide 19)
  22. Parent (same tree diagram as slide 19)
  23. Descendants (same tree diagram as slide 19)
  24. Ancestors (same tree diagram as slide 19)
  25. Siblings (same tree diagram as slide 19). Attribute siblings: attributes are not considered siblings to elements.
  26. CDATA: avoid the escaping hell in text content. Without CDATA:
        <bookshelf>
          <book id="1">
            <title lang="en">Beautiful code</title>
            <hint>
              Some examples make use of &lt;xml&gt;.
            </hint>
          </book>
        </bookshelf>
  27. CDATA: with CDATA:
        <bookshelf>
          <book id="1">
            <title lang="en">Beautiful code</title>
            <hint>
              <![CDATA[ Some examples make use of <xml>. ]]>
            </hint>
          </book>
        </bookshelf>
  28. CDATA: the CDATA dilemma:
        <bookshelf>
          <book id="1">
            <title lang="en">Beautiful code</title>
            <hint>
              <![CDATA[ Some examples show the usage of <![CDATA[ ]]> ]]>
            </hint>
          </book>
        </bookshelf>
  29. CDATA: the CDATA dilemma workaround:
        <bookshelf>
          <book id="1">
            <title lang="en">Beautiful code</title>
            <hint>
              <![CDATA[ Some examples show the usage of <![CDATA[ ]]]]><![CDATA[> ]]>
            </hint>
          </book>
        </bookshelf>
  30. CDATA: the CDATA dilemma solution:
        <bookshelf>
          <book id="1">
            <title lang="en">Beautiful code</title>
            <hint>
              U29tZSBleGFtcGxlcyBzaG93IHRoZSB1c2FnZSBvZiA8IVtDREFUQVsgXV0+
            </hint>
          </book>
        </bookshelf>
      Base 64 in PHP: use the built-in functions base64_encode() and base64_decode().
  31. 31. Comments Comments in XML <b o o k s h e l f> <book i d=” 1 ”> < t i t l e l a n g=” en ”> B e a u t i f u l c o d e</ t i t l e > <a u t h o r>A . Oram</ a u t h o r> <a u t h o r>G . W i l s o n</ a u t h o r> <y e a r>2007</ y e a r> < p r i c e c u r r e n c y=” Euro ”>3 5 . 9 5</ p r i c e> </ book> <!−− . . . more b o o k s . . . −−> </ b o o k s h e l f> Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 16 / 102
  32. Namespaces
      Namespaces avoid naming conflicts between different XML sources.
      Single, default namespace:
          <bookshelf xmlns="http://example.com/book">
            <book id="1">
              <title lang="en">Beautiful code</title>
              <author>A. Oram</author>
              <author>G. Wilson</author>
              <year>2007</year>
              <price currency="Euro">35.95</price>
            </book>
          </bookshelf>
  33. Namespaces
      Multiple namespaces:
          <bookshelf
              xmlns="http://example.com/book"
              xmlns:book="http://example.com/book"
              xmlns:dc="http://purl.org/dc/elements/1.1/">
            <book id="1">
              <dc:title book:lang="en">Beautiful code</dc:title>
              <author>A. Oram</author>
              <author>G. Wilson</author>
              <year>2007</year>
              <price currency="Euro">35.95</price>
            </book>
          </bookshelf>
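To see how the namespace declarations above surface in PHP, here is a small sketch using the [NS] variants of the DOM methods. It assumes a shortened inline copy of the slide's XML; the variable names are illustrative only.

```php
<?php
$xml = <<<'XML'
<bookshelf xmlns="http://example.com/book"
           xmlns:book="http://example.com/book"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <book id="1">
    <dc:title book:lang="en">Beautiful code</dc:title>
    <author>A. Oram</author>
  </book>
</bookshelf>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$dcNs   = 'http://purl.org/dc/elements/1.1/';
$bookNs = 'http://example.com/book';

// Elements and attributes are addressed by namespace URI + local name,
// never by the prefix chosen in the document.
$title = $dom->getElementsByTagNameNS($dcNs, 'title')->item(0);
$lang  = $title->getAttributeNS($bookNs, 'lang');

// Unprefixed elements live in the default namespace.
$authorNs = $dom->getElementsByTagName('author')->item(0)->namespaceURI;

echo $title->nodeValue, ' (', $lang, "), author namespace: ", $authorNs, "\n";
```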
  34. Outline
      1 XML
      2 XML in PHP
          DOM
              Introductory example
              Essential classes
              Practical DOM
          XMLReader/-Writer (by example)
          SimpleXml (by example)
      3 XPath
  35. Overview XML APIs
      PHP has quite a few XML APIs.
      The most important are:
          DOM
          XMLReader / XMLWriter
          SimpleXML
      Deprecated are:
          DOM XML
          XML Parser
  36. Overview - DOM
      Document Object Model
          Standardized API to access the XML tree
          W3C recommendation
              Level 1 in 1998
              Currently: Level 3 (2004)
          Available in many languages
              C, Java, Perl, Python, ...
          Represents XML nodes as objects
          Loads the full XML tree into memory
  37. Overview - XMLReader/-Writer
      Popular approach to access XML data
      Similar implementations available in
          Java
          C#
      Pull / push based
      Does not load XML fully into memory
  38. Overview - SimpleXml
      Very simple access to XML data
      Unique (?) to PHP
      Represents XML structures as objects
      Initial implementation hackish
      Loads the full XML tree into memory
      You don't want to use SimpleXML, seriously!
  39. APIs compared
      DOM XMLReader/-Writer SimpleXML Read • Write ◦ Manipulate • • Full control - Namespaces ◦ XPath - Validate DTD DTD - Schema Schema RelaxNG RelaxNG Comfort • ◦
      Legend: Fully supported • Supported but not nice ◦ Poorly supported - Not supported at all
  40. Outline - XML in PHP: DOM
  41. Outline - DOM: Introductory example
  42. Getting started
      Printing all authors:
          $dom = new DOMDocument();
          $dom->load('sources/example.xml');

          $authors = $dom->getElementsByTagName('author');
          foreach ($authors as $author) {
              echo 'Author: ' . $author->nodeValue . "\n";
          }
      Output:
          Author: A. Oram
          Author: G. Wilson
          Author: T. Schlitt
          Author: K. Nordmann
  43. Outline - DOM: Essential classes
  47. DOMNode
      Purpose: base class for all node types (elements, attributes, ...).
      Typical tree operations
          appendChild()
          removeChild()
          replaceChild()
          hasChildNodes()
          insertBefore()
      Typical tree properties
          $parentNode
          $childNodes
          $previousSibling
          $nextSibling
      DOM specific operations
          cloneNode()
          lookupNamespaceURI()
          lookupPrefix()
          isDefaultNamespace()
          normalize()
      DOM specific properties
          $nodeType
          $nodeValue
          $ownerDocument
          $namespaceURI
          $prefix
          $localName
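A few of the tree operations and properties listed for DOMNode, sketched on a tiny inline document. The document content and variable names are made up for illustration.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<bookshelf><book id="1"/><book id="3"/></bookshelf>');

$root  = $dom->documentElement;
$third = $root->childNodes->item(1);        // <book id="3"/>

// cloneNode() copies a node; insertBefore() places the copy in the tree.
$second = $third->cloneNode(true);
$second->setAttribute('id', '2');
$root->insertBefore($second, $third);

// removeChild() detaches a node from its parent.
$root->removeChild($root->firstChild);      // drops <book id="1"/>

// Walk the remaining children and collect their IDs.
$ids = [];
foreach ($root->childNodes as $child) {
    $ids[] = $child->getAttribute('id');
}

echo implode(', ', $ids), "\n";
echo 'Parent of the clone: ', $second->parentNode->nodeName, "\n";
```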
  50. DOMElement
      Purpose: representation of an element (extends DOMNode).
      Attribute related operations
          hasAttribute[NS]()
          getAttribute[NS]()
          getAttributeNode[NS]()
          setAttribute[NS]()
          removeAttribute[NS]()
      Element related operations
          getElementsByTagName[NS]()
          appendChild() (inherited)
          removeChild() (inherited)
          replaceChild() (inherited)
      Properties
          $tagName
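The attribute related operations of DOMElement could be exercised like this. It is only a sketch on an inline document; the "shelf" attribute is a hypothetical addition for the sake of the example.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<book id="1"><title lang="en">Beautiful code</title></book>');

$book  = $dom->documentElement;
$title = $book->getElementsByTagName('title')->item(0);

$id      = $book->getAttribute('id');     // attribute value as a string
$missing = $book->getAttribute('isbn');   // absent attribute: empty string
$hasLang = $title->hasAttribute('lang');  // true before the removal below

$book->setAttribute('shelf', 'A');        // hypothetical new attribute
$title->removeAttribute('lang');

$idNode = $book->getAttributeNode('id');  // the attribute as a DOMAttr object

echo $dom->saveXML($book), "\n";
```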
  55. DOMDocument
      Purpose: representation of an XML document.
      Creation methods
          createAttribute[NS]()
          createElement[NS]()
      Load / save methods
          load/save()
          load/saveHTMLFile()
      Element retrieval methods
          getElementsByTagName[NS]()
          getElementById()
      Misc operations
          registerNodeClass()
          validate()
          schemaValidate()
          relaxNGValidate()
      Properties
          $documentElement
          $documentURI
          $preserveWhiteSpace
          $formatOutput
          $doctype
          $xmlVersion
          $xmlEncoding
          $recover (not DOM!)
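Two of the DOMDocument properties, $preserveWhiteSpace and $formatOutput, are easiest to understand side by side. The sketch below assumes a small inline document; note that $preserveWhiteSpace has to be set before loading.

```php
<?php
$raw = "<bookshelf>\n  <book id=\"1\"><title>Beautiful code</title></book>\n</bookshelf>";

// Default: whitespace between elements is kept as text nodes.
$plain = new DOMDocument();
$plain->loadXML($raw);
$plainChildren = $plain->documentElement->childNodes->length;

// Drop ignorable whitespace on load and re-indent on save.
$tidy = new DOMDocument();
$tidy->preserveWhiteSpace = false;
$tidy->formatOutput = true;
$tidy->loadXML($raw);
$tidyChildren = $tidy->documentElement->childNodes->length;

$pretty = $tidy->saveXML();

echo "Children with whitespace: $plainChildren, without: $tidyChildren\n";
echo $pretty;
```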
  57. DOMNodeList
      Purpose: collection of DOMNodes (elements, attributes, ...). Iterable.
      Operations
          item()
      Properties
          $length
      Standard:
          for ($i = 0; $i < $authors->length; ++$i) {
              echo 'Author: ' . $authors->item($i)->nodeValue . "\n";
          }
      Foreach:
          foreach ($authors as $author) {
              echo 'Author: ' . $author->nodeValue . "\n";
          }
  58. Outline - DOM: Practical DOM
  59. Reading XML I
      Reading:
          $dom = new DOMDocument();
          $dom->load('sources/example.xml');

          $root = $dom->documentElement;
          echo 'Document is a ' . $root->tagName . "\n";

          $books = $root->getElementsByTagName('book');
  60. Reading XML II
      Reading II:
          foreach ($books as $book) {
              echo 'Book with ID ' . $book->getAttribute('id');
              foreach ($book->getElementsByTagName('price') as $price) {
                  echo ' costs ' . $price->nodeValue . ' '
                      . $price->getAttribute('currency') . "\n";
              }
          }
  61. Reading XML III
      Output:
          Document is a bookshelf
          Book with ID 1 costs 35.95 Euro
          Book with ID 2 costs 39.95 Euro
  62. Create XML I
      Creating:
          $dom = new DOMDocument('1.0', 'UTF-8');
          $dom->formatOutput = true;

          $root = $dom->appendChild(
              $dom->createElement('dvdshelf')
          );
          $dvd = $root->appendChild(
              $dom->createElement('dvd')
          );
  63. Create XML II
      Creating II:
          $dvd->setAttribute('id', 1);
          $dvd->appendChild(
              $dom->createElement('title', 'Star Trek')
          );
          $dom->save('sources/dom_create.xml');
      Output:
          <?xml version="1.0" encoding="UTF-8"?>
          <dvdshelf>
            <dvd id="1">
              <title>Star Trek</title>
            </dvd>
          </dvdshelf>
  64. Manipulate XML I
      Source XML:
          <?xml version="1.0" encoding="UTF-8"?>
          <bookshelf>
          <book id="1"><title lang="en">Beautiful code</title>
          <author>A. Oram</author>
          <author>G. Wilson</author>
          <year>2007</year>
          <price currency="Euro">35.95</price>
          </book>
          </bookshelf>
  65. Manipulate XML II
      Manipulating:
          $dom = new DOMDocument();
          $dom->formatOutput = true;
          $dom->load(
              'sources/example_weird_format.xml',
              LIBXML_NOBLANKS
          );
  66. Manipulate XML III
      Manipulating II:
          $root = $dom->documentElement;
          $newBook = $root->appendChild(
              $dom->createElement('book')
          );
          $newBook->setAttribute('id', 2);
          $newBook->appendChild(
              $dom->createElement(
                  'title',
                  'eZ Components - Das Entwicklerhandbuch'
              )
          );
          $dom->save('sources/dom_manipulate.xml');
  67. Manipulate XML IV
      Output:
          <?xml version="1.0" encoding="UTF-8"?>
          <bookshelf>
            <book id="1">
              <title lang="en">Beautiful code</title>
              <author>A. Oram</author>
              <author>G. Wilson</author>
              <year>2007</year>
              <price currency="Euro">35.95</price>
            </book>
            <book id="2">
              <title>eZ Components - Das Entwicklerhandbuch</title>
            </book>
          </bookshelf>
  68. Outline - XML in PHP: XMLReader/-Writer (by example)
  69. Reading XML I
      Reading:
          $reader = new XMLReader();
          $reader->open('sources/example.xml');

          while ($reader->read()) {
              if ($reader->nodeType !== XMLReader::ELEMENT) {
                  continue;
              }
  70. Reading XML II
      Reading (continued):
              if ($reader->localName === 'book') {
                  echo 'Book with ID ' . $reader->getAttribute('id');
              }
              if ($reader->localName === 'price') {
                  echo ' costs ' . $reader->readString() . ' '
                      . $reader->getAttribute('currency') . "\n";
              }
          }
  71. Reading XML III
      Output:
          Book with ID 1 costs 35.95 Euro
          Book with ID 2 costs 39.95 Euro
  72. Create XML I
      Creating:
          $writer = new XMLWriter();
          $writer->openUri('sources/xmlwriter_create.xml');

          $writer->setIndentString('  ');
          $writer->setIndent(true);

          $writer->startDocument('1.0', 'UTF-8');
          $writer->startElement('dvdshelf');
  73. Create XML II
      Creating II:
          $writer->startElement('dvd');
          $writer->writeAttribute('id', '1');
          $writer->writeElement('title', 'Star Trek');
          $writer->endElement(); // </dvd>
          $writer->flush();

          $writer->endElement(); // </dvdshelf>
          $writer->endDocument();
  74. Create XML III
      Output:
          <?xml version="1.0" encoding="UTF-8"?>
          <dvdshelf>
            <dvd id="1">
              <title>Star Trek</title>
            </dvd>
          </dvdshelf>
  75. Outline - XML in PHP: SimpleXml (by example)
  76. Reading XML I
      Reading:
          $xml = simplexml_load_file('sources/example.xml');

          foreach ($xml->book as $book) {
              echo 'Book with ID ' . $book['id'] . ' costs '
                  . $book->price . ' ' . $book->price['currency']
                  . "\n";
          }
  77. Reading XML II
      Output:
          Book with ID 1 costs 35.95 Euro
          Book with ID 2 costs 39.95 Euro
  78. Create XML I
      Creating:
          $xml = new SimpleXMLElement('<dvdshelf></dvdshelf>');
          $xml->addChild('dvd');
          $xml->dvd[0]->addAttribute('id', 1);
          $xml->dvd[0]->addChild('title', 'Star Trek');
          $xml->asXML('sources/simplexml_create.xml');
      Output:
          <?xml version="1.0"?>
          <dvdshelf><dvd id="1"><title>Star Trek</title></dvd></dvdshelf>
  79. Outline
      1 XML
      2 XML in PHP
      3 XPath
          Overview
          Basics
          In depth
              Axis
              Predicates
          XPath in action
              Namespaces in RDF
              pQuery
  80. Outline - XPath: Overview
  82. Overview
      XPath enables you to select parts of the information in XML documents.
          Traverse the XML tree
          Select XML nodes
      W3C recommendation
          Version 1: November 1999
          Version 2: January 2007
      Fields of application
          XSLT (Extensible Stylesheet Language Transformations)
          Fetching XML nodes within programming languages
  83. XML example reminder
          <bookshelf>
            <book id="1">
              <title lang="en">Beautiful code</title>
              <author>A. Oram</author>
              <author>G. Wilson</author>
              <year>2007</year>
              <price currency="Euro">35.95</price>
            </book>
            <book id="2">
              <title lang="de">eZ Components - Das Entwicklerhandbuch</title>
              <author>T. Schlitt</author>
              <author>K. Nordmann</author>
              <year>2007</year>
              <price currency="Euro">39.95</price>
            </book>
          </bookshelf>
  84. Introductory example (XML as on slide 83)
      2 variants to fetch all books:
          /bookshelf/book
          //book
  85. Introductory example (XML as on slide 83)
      2 variants to fetch all authors:
          /bookshelf/book/author
          //author
  86. Outline - XPath: Basics
  87. Addressing
      Every XPath expression matches a set of nodes (0..n)
      It encodes an "address" for the selected nodes
      Simple XPath expressions look similar to Unix file system addresses
      Two generally different ways of addressing are supported
      Absolute addressing:
          /bookshelf/book/title
          /bookshelf/book/author
      Relative addressing:
          book/author
          ../title
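In PHP, relative addressing means passing a context node as the second argument of DOMXPath::query(). The sketch below assumes a shortened inline copy of the example XML; variable names are illustrative.

```php
<?php
$xml = <<<'XML'
<bookshelf>
  <book id="1">
    <title lang="en">Beautiful code</title>
    <author>A. Oram</author>
    <author>G. Wilson</author>
  </book>
  <book id="2">
    <title lang="de">eZ Components - Das Entwicklerhandbuch</title>
    <author>T. Schlitt</author>
    <author>K. Nordmann</author>
  </book>
</bookshelf>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Absolute addressing: starts at the document root, no context needed.
$allAuthors = $xpath->query('/bookshelf/book/author');

// Relative addressing: evaluated against an explicit context node.
$relativeAuthors = $xpath->query('book/author', $dom->documentElement);

$secondBook  = $xpath->query('/bookshelf/book[2]')->item(0);
$itsAuthors  = $xpath->query('author', $secondBook);
$title       = $xpath->query('../title', $itsAuthors->item(0))->item(0);

echo $allAuthors->length, ' authors in total, ',
    $itsAuthors->length, ' of them wrote "', $title->nodeValue, "\"\n";
```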
  88. Contexts
      Every expression step creates a new context
      The next step is evaluated in the context created by the previous one
      Contexts in /bookshelf/book/title:
          /          resets the context to global
          bookshelf  selects all <bookshelf> elements in the global context
          /          creates a new context, all children of <bookshelf>
          book       selects all <book> elements in this context
          /          creates a new context, all children of selected <book>s
          title      selects all <title> elements in this context
      → A set of all title element nodes.
  89. Attributes
      Select attributes: prepend the attribute name with an @.
      Select all currency attributes:
          /bookshelf/book/price/@currency
  90. Steps
      Navigation is not only possible in parent → child direction.
      Navigate to parent (../):
          /bookshelf/book/title/../author
      Navigate to descendants (//):
          //title
          /bookshelf/book//@currency
  92. Indexing
      Access nodes by position: it is possible to access a specific node in a set by its position.
      Indexing:
          /bookshelf/book[2]
          /bookshelf/book/author[1]
      Start index:
          Indexing is generally 1-based
          Some Internet Explorer versions start with 0
  93. Wildcards
      Wildcard search: a wildcard represents a node of a certain type with an arbitrary name.
      Wildcards:
          /bookshelf/*/title
          /bookshelf/book/@*
  94. Union
      Unite the node sets selected by multiple XPath expressions.
      Union:
          /bookshelf/book/title | /bookshelf/book/author
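Wildcards and unions can be checked quickly by counting what they select. This sketch again assumes an inline copy of the example XML (shortened to titles and authors); the variable names are not from the talk.

```php
<?php
$xml = <<<'XML'
<bookshelf>
  <book id="1">
    <title lang="en">Beautiful code</title>
    <author>A. Oram</author>
    <author>G. Wilson</author>
  </book>
  <book id="2">
    <title lang="de">eZ Components - Das Entwicklerhandbuch</title>
    <author>T. Schlitt</author>
    <author>K. Nordmann</author>
  </book>
</bookshelf>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Element wildcard: any child element of <bookshelf> that has a <title>.
$titles = $xpath->query('/bookshelf/*/title');

// Attribute wildcard: every attribute of every <book>.
$bookAttributes = $xpath->query('/bookshelf/book/@*');

// Union: titles and authors in one node set.
$union = $xpath->query('/bookshelf/book/title | /bookshelf/book/author');

$names = [];
foreach ($union as $node) {
    $names[] = $node->nodeName;
}

echo $titles->length, ' titles, ', $bookAttributes->length,
    ' book attributes, ', $union->length, " nodes in the union\n";
```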
  95. A first PHP example
      Querying first book title:
          $dom = new DOMDocument();
          $dom->load('sources/example.xml');

          $xpath = new DOMXPath($dom);
          $titles = $xpath->query('/bookshelf/book[1]/title');

          echo 'Title of first book is '
              . $titles->item(0)->nodeValue . "\n";
      Output:
          Title of first book is Beautiful code
  96. Second PHP example
      Querying all currencies:
          // ...
          $currencies = $xpath->query('//price/@currency');

          echo "Following currencies occur:\n";
          foreach ($currencies as $currency) {
              echo $currency->nodeValue . "\n";
          }
      Output:
          Following currencies occur:
          Euro
          Euro
  97. Outline - XPath: In depth
  103. XPath syntax
      An XPath query consists of steps
      Each step consists of:
          1 an axis
          2 a node test
          3 a predicate
      You already saw examples
      4 axes:
          child (default)
          attribute (@)
          descendant-or-self (//)
          parent (..)
      Node tests:
          name of the node (default)
          wildcard (*)
      Predicate:
          accessing a node by index
      Step syntax:
          <axis>::<nodetest>[<predicate>]
  104. Outline - In depth: Axis
  105. Axis
      13 dimensions: imagine an XML document to be a 13-dimensional space...
          Axis                  Shortcut
          ancestor
          ancestor-or-self
          attribute             @
          child                 - (default)
          descendant
          descendant-or-self    //
          following
          following-sibling
          namespace
          parent                ..
          preceding
          preceding-sibling
          self                  .
  106-110. Axis
      Axis syntax:
          <axisname>::<nodetest>
      Child axis:
          /bookshelf/book
          /child::bookshelf/child::book
      Descendant-or-self axis:
          //book
          /descendant-or-self::node()/child::book
      Attribute axis:
          //book/@id
          /descendant-or-self::node()/child::book/attribute::id
      Parent axis:
          //book/@id/..
          /descendant-or-self::node()/child::book/attribute::id/parent::node()
      In each pair the second line is the unabbreviated form: // is short for /descendant-or-self::node()/.
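That the abbreviated and the unabbreviated forms select the same nodes can be verified mechanically. A sketch, assuming an inline two-book document; the comparison loop is only an illustration, not part of the talk.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML(
    '<bookshelf>'
    . '<book id="1"><title>Beautiful code</title></book>'
    . '<book id="2"><title>eZ Components - Das Entwicklerhandbuch</title></book>'
    . '</bookshelf>'
);
$xpath = new DOMXPath($dom);

// Abbreviated expression => unabbreviated expression.
$pairs = [
    '/bookshelf/book' => '/child::bookshelf/child::book',
    '//book'          => '/descendant-or-self::node()/child::book',
    '//book/@id'      => '/descendant-or-self::node()/child::book/attribute::id',
    '//book/@id/..'   => '/descendant-or-self::node()/child::book/attribute::id/parent::node()',
];

$allEqual = true;
foreach ($pairs as $short => $long) {
    $a = $xpath->query($short);
    $b = $xpath->query($long);

    // Same number of nodes, and pairwise the very same nodes.
    $same = $a->length === $b->length;
    for ($i = 0; $same && $i < $a->length; ++$i) {
        $same = $a->item($i)->isSameNode($b->item($i));
    }

    echo $short, ' matches ', $long, ': ', $same ? 'yes' : 'no', "\n";
    $allEqual = $allEqual && $same;
}
```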
  112. Ancestor axis
      XPath with ancestor axis:
          //title/ancestor::*
      Selected XML nodes (<bookshelf> and <book> are highlighted on the slide):
          <?xml version="1.0" encoding="UTF-8"?>
          <bookshelf>
            <book id="1">
              <title lang="en">Beautiful code</title>
              <author>A. Oram</author>
              <author>G. Wilson</author>
              <year>2007</year>
              <price currency="Euro">35.95</price>
            </book>
            <!-- ... -->
          </bookshelf>
  114. Following axis
      XPath with following axis:
          //book/following::*
      Selected XML nodes (the second <book> and everything inside it are highlighted on the slide):
          <?xml version="1.0" encoding="UTF-8"?>
          <bookshelf>
            <book id="1">
              <title lang="en">Beautiful code</title>
              <!-- ... -->
            </book>
            <book id="2">
              <title lang="de">eZ Components - Das Entwicklerhandbuch</title>
              <!-- ... -->
            </book>
          </bookshelf>
  116. Following-sibling axis
      XPath with following-sibling axis:
          //book/following-sibling::*
      Selected XML nodes (only the second <book> element itself is highlighted on the slide):
          <?xml version="1.0" encoding="UTF-8"?>
          <bookshelf>
            <book id="1">
              <title lang="en">Beautiful code</title>
              <!-- ... -->
            </book>
            <book id="2">
              <title lang="de">eZ Components - Das Entwicklerhandbuch</title>
              <!-- ... -->
            </book>
          </bookshelf>
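The three axes just shown differ mainly in how many nodes they pick up. A sketch that counts them, assuming a shortened inline document with one author per book; nodeNames() is a hypothetical helper.

```php
<?php
$xml = <<<'XML'
<bookshelf>
  <book id="1">
    <title lang="en">Beautiful code</title>
    <author>A. Oram</author>
  </book>
  <book id="2">
    <title lang="de">eZ Components - Das Entwicklerhandbuch</title>
    <author>T. Schlitt</author>
  </book>
</bookshelf>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Hypothetical helper: element names of all nodes in a node list.
function nodeNames(DOMNodeList $nodes): array
{
    $names = [];
    foreach ($nodes as $node) {
        $names[] = $node->nodeName;
    }
    return $names;
}

// ancestor: <bookshelf> plus both <book> elements.
$ancestors = nodeNames($xpath->query('//title/ancestor::*'));

// following: every element that starts after a <book> has ended,
// here the second <book> with its <title> and <author>.
$following = nodeNames($xpath->query('//book/following::*'));

// following-sibling: only later siblings, here the second <book>.
$siblings = $xpath->query('//book/following-sibling::*');

echo 'ancestor: ', count($ancestors), ', following: ', count($following),
    ', following-sibling: ', $siblings->length, "\n";
```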
  118. Namespace axis
      XPath with namespace axis:
          namespace::*
      XML with namespaces (the three xmlns declarations are highlighted on the slide):
          <bookshelf
              xmlns="http://example.com/book"
              xmlns:book="http://example.com/book"
              xmlns:dc="http://purl.org/dc/elements/1.1/">
            <book id="1">
              <dc:title book:lang="en">Beautiful code</dc:title>
              <author>A. Oram</author>
              <author>G. Wilson</author>
              <year>2007</year>
              <price currency="Euro">35.95</price>
            </book>
          </bookshelf>
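With DOMXPath the namespace axis can be used to list the namespaces in scope on an element, and registerNamespace() makes prefixes usable inside queries. A sketch under the assumption of a shortened inline document; the result list may contain additional implicit entries, so the example only looks for the URIs it expects.

```php
<?php
$xml = <<<'XML'
<bookshelf xmlns="http://example.com/book"
           xmlns:book="http://example.com/book"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <book id="1">
    <dc:title book:lang="en">Beautiful code</dc:title>
  </book>
</bookshelf>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Namespace axis, evaluated on the root element.
$uris = [];
foreach ($xpath->query('namespace::*', $dom->documentElement) as $ns) {
    $uris[] = $ns->nodeValue;
}

// Prefixes used in a query are registered on the DOMXPath object;
// they do not have to match the prefixes used in the document.
$xpath->registerNamespace('dc', 'http://purl.org/dc/elements/1.1/');
$titles = $xpath->query('//dc:title');

echo implode("\n", $uris), "\n";
echo 'First dc:title: ', $titles->item(0)->nodeValue, "\n";
```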
  119. Outline - In depth

       Predicates

  120. Predicate syntax

       - You already saw indexing with numeric predicates
       - Predicates can also be booleans
       - Functions and operators allow fine-grained tests

       Predicate syntax:

           <nodetest>[<predicate>]
  121. Predicate examples

       Select all books with ID 1:

           //book[@id = '1']

  122. Predicate examples

       Select all books that have any attribute at all:

           //book[@*]
           //book/@*/..

  123. Predicate examples

       Select all books whose price rounds to 40:

           //book[round(price) = 40]

  124. Predicate examples

       Select all authors whose first name initial is T:

           //book/author[substring(., 1, 1) = 'T']
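The predicate examples can be tried out as follows. This is only a sketch: the first book is taken from the slides, while the authors, year and price of the second book are placeholder values assumed so that every expression matches something.

```php
<?php
// Book 1 follows the slides; the details of book 2 are assumed placeholders.
$xml = <<<XML
<bookshelf>
  <book id="1">
    <title lang="en">Beautiful code</title>
    <author>A. Oram</author>
    <author>G. Wilson</author>
    <year>2007</year>
    <price currency="Euro">35.95</price>
  </book>
  <book id="2">
    <title lang="de">eZ Components - Das Entwicklerhandbuch</title>
    <author>T. Schlitt</author>
    <author>K. Nordmann</author>
    <year>2007</year>
    <price currency="Euro">39.95</price>
  </book>
</bookshelf>
XML;

$dom = new DOMDocument();
$dom->loadXML( $xml );
$xpath = new DOMXPath( $dom );

$expressions = array(
    "//book[@id = '1']",
    "//book[@*]",
    "//book/@*/..",
    "//book[round(price) = 40]",
    "//book/author[substring(., 1, 1) = 'T']",
);

// Count the nodes each predicate expression selects.
$counts = array();
foreach ( $expressions as $expression )
{
    $counts[$expression] = $xpath->query( $expression )->length;
    echo $expression, ' => ', $counts[$expression], "\n";
}
```

Both attribute variants select the same two books; the second one walks to the attributes first and then back up to their parents.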
  125. Operator overview

       Mathematical operators
           +, -, *     Addition, subtraction, multiplication
           div         Division
           mod         Modulo operation

       Comparison operators
           =           Check for equality
           !=          Check for inequality
           <, <=       Less than and less than or equal
           >, >=       Greater than and greater than or equal

       Logical operators
           or          Logical or
           and         Logical and

       Logical negation
           not() is a function in XPath!
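A few of these operators in action, as a sketch using `DOMXPath::evaluate()`, which returns typed results (numbers as floats, booleans as booleans). The tiny document is an assumption made for this example.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML(
    '<bookshelf><book><year>2007</year><price>35.95</price></book></bookshelf>'
);
$xpath = new DOMXPath( $dom );

// Division is written "div", because "/" is already the path separator.
$quotient  = $xpath->evaluate( '7 div 2' );
$remainder = $xpath->evaluate( '7 mod 2' );

// Comparison and logical operators combined inside a predicate.
$matches = $xpath->evaluate( 'count(//book[price > 30 and year <= 2007])' );

// Negation is a function call, not an operator.
$negated = $xpath->evaluate( 'not(//book/price = 40)' );

var_dump( $quotient, $remainder, $matches, $negated );
```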
  129. Functions by example

       String functions
           string-join()   Concatenates strings (an XPath 2.0 function;
                           XPath 1.0, which PHP's DOMXPath implements,
                           offers concat() instead)
           substring()     Extracts a part of a string

       Node set functions
           count()         Returns the number of nodes in a set
           position()      Returns the position of the current node
                           within the set being evaluated

       Boolean functions
           not()           Negates the received boolean expression
           true()          Boolean true

       Mathematical functions
           round()         Rounds the given number to the nearest integer
           floor()         Returns the largest integer that is not
                           greater than the given number

       Function overview
           An overview of all functions can be found at
           http://www.w3.org/TR/xpath-functions/
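The functions can be tried directly with `DOMXPath::evaluate()`. This sketch assumes a one-book document and uses `concat()` because DOMXPath only knows the XPath 1.0 function library.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML(
    '<bookshelf><book>'
    . '<title>Beautiful code</title>'
    . '<author>A. Oram</author><author>G. Wilson</author>'
    . '<price>35.95</price>'
    . '</book></bookshelf>'
);
$xpath = new DOMXPath( $dom );

// String functions
$joined = $xpath->evaluate( "concat('A. Oram', ', ', 'G. Wilson')" );
$start  = $xpath->evaluate( 'substring(//book/title, 1, 9)' );

// Node set functions
$authors = $xpath->evaluate( 'count(//author)' );
$second  = $xpath->evaluate( 'string(//author[position() = 2])' );

// Mathematical functions
$rounded = $xpath->evaluate( 'round(//price)' );
$floored = $xpath->evaluate( 'floor(//price)' );

var_dump( $joined, $start, $authors, $second, $rounded, $floored );
```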
  134. Outline - XPath

       1 XML
       2 XML in PHP
       3 XPath
           Overview
           Basics
           In depth
               Axis
               Predicates
           XPath in action
               Namespaces in RDF
               pQuery

  135. Outline - XPath in action

       Namespaces in RDF
  136. Find all namespaces in an RDF document

       RDF
           - Resource Description Framework
           - Semantic web
           - Makes heavy use of different namespaces

       Complex variant:

           //*[name(.) = 'rdf:Description']/*[
               namespace-uri(.) != namespace-uri(..)
               and namespace-uri(.) != ''
               and namespace-uri(.) != namespace-uri(preceding-sibling::*)
           ]

       Simpler variant:

           //*[name(.) = 'rdf:Description']/namespace::*
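Both variants can be run from PHP. The sketch below assumes a small RDF document made up for this example; the complex variant yields elements whose namespace URI is then read from each node, while the simpler variant yields the namespace nodes themselves.

```php
<?php
// Assumed sample document; the properties inside rdf:Description are made up.
$rdf = <<<XML
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <rdf:Description rdf:about="http://example.com/book/1">
    <dc:title>Beautiful code</dc:title>
    <dc:creator>A. Oram</dc:creator>
    <foaf:maker>G. Wilson</foaf:maker>
  </rdf:Description>
</rdf:RDF>
XML;

$dom = new DOMDocument();
$dom->loadXML( $rdf );
$xpath = new DOMXPath( $dom );

// Complex variant: child elements that introduce a namespace differing
// from their parent's and from that of their preceding siblings.
$complex = "//*[name(.) = 'rdf:Description']/*["
    . "namespace-uri(.) != namespace-uri(..) and "
    . "namespace-uri(.) != '' and "
    . "namespace-uri(.) != namespace-uri(preceding-sibling::*)]";

$fromElements = array();
foreach ( $xpath->query( $complex ) as $element )
{
    $fromElements[] = $element->namespaceURI;
}
$fromElements = array_values( array_unique( $fromElements ) );

// Simpler variant: ask for the namespace nodes directly.
$fromAxis = array();
foreach ( $xpath->query( "//*[name(.) = 'rdf:Description']/namespace::*" ) as $nsNode )
{
    $fromAxis[] = $nsNode->nodeValue;
}

print_r( $fromElements );
print_r( $fromAxis );
```

The simpler variant reports every namespace in scope, including the RDF namespace itself, whereas the complex variant only reports namespaces actually used by the child elements.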
  138. Outline - XPath in action

       pQuery

  139. jQuery

       - Cool JavaScript framework
       - Allows selecting HTML elements by CSS selectors
       - http://jquery.com/

  140. pQuery

       - Little example class
       - Models a tiny bit of jQuery, using DOM and XPath
  141. Example HTML

           <html>
             <head>
               <title>Some website</title>
             </head>
             <body>
               <h1>Some headline</h1>
               <p>
                 Some nice <a href="test.html" class="internal">content</a>.
               </p>
               <h1 id="second">Second headline</h1>
               <p>
                 More nice <a href="test.html">content</a>.
               </p>
             </body>
           </html>
  142. Usage

           require 'pquery/pquery.php';

           $dom = new DOMDocument();
           $dom->formatOutput = true;
           $dom->loadHTMLFile( '../example.html' );

           $q = new pQuery( $dom );
           $q->query( 'h1#second' );
           $q->addClass( 'someclass' );

           $dom->saveHTMLFile( '../pquerysimple.html' );
  143. The pQuery class

           class pQuery
           {
               protected $dom;
               protected $context;
               protected $xpath;

               public function __construct( DOMDocument $dom )
               {
                   $this->dom     = $dom;
                   $this->context = array( $dom->documentElement );
                   $this->xpath   = new DOMXPath( $dom );
               }
               // ...
  144. Issuing a query

               // ...
               public function query( $string )
               {
                   $xpath = $this->createXPath( $string );
                   echo "Querying with XPath '$xpath'.\n";

                   $newContext = array();
                   foreach ( $this->context as $node )
                   {
                       $nodeList = $this->xpath->query( $xpath, $node );
                       foreach ( $nodeList as $newNode )
                       {
                           $newContext[] = $newNode;
                       }
                   }
                   $this->context = $newContext;
               }
               // ...
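`query()` passes each context node as the second argument to `DOMXPath::query()`. Whether that node has any effect depends on the expression: a path starting with `//` is evaluated from the document root, a path starting with `.//` is evaluated relative to the context node. A small sketch with an assumed two-paragraph document:

```php
<?php
$dom = new DOMDocument();
$dom->loadXML(
    '<body>'
    . '<p><a href="one.html">one</a></p>'
    . '<p><a href="two.html">two</a></p>'
    . '</body>'
);
$xpath = new DOMXPath( $dom );

$firstParagraph = $dom->getElementsByTagName( 'p' )->item( 0 );

// Absolute path: searches the whole document, the context node is ignored.
$absolute = $xpath->query( '//a', $firstParagraph )->length;

// Relative path: only searches below the context node.
$relative = $xpath->query( './/a', $firstParagraph )->length;

echo "absolute: $absolute, relative: $relative\n";
```

Since the paths built by pQuery start with `//`, chained `query()` calls appear to search the whole document each time rather than narrowing the previous result.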
  145. Creating XPath

               // ...
               protected function createXPath( $string )
               {
                   $parts = explode( ' ', $string );
                   $xpath = array();
                   foreach ( $parts as $part )
                   {
                       $xpath[] = $this->createSingleXPath( $part );
                   }
                   return implode( ' | ', $xpath );
               }
               // ...
  146. Creating XPath part I

               // ...
               protected function createSingleXPath( $string )
               {
                   // Split at "#" or "." and keep the delimiter in the result.
                   $parts = preg_split(
                       '((#|\.))',
                       $string,
                       -1,
                       PREG_SPLIT_DELIM_CAPTURE
                   );
                   $xpath = '//';
                   // ...

  147. Creating XPath part II

                   // ...
                   if ( count( $parts ) < 3 )
                   {
                       $xpath .= $string;
                   }
                   else
                   {
                       $xpath .= $this->createSelector( $parts );
                   }
                   return $xpath;
               }
               // ...
  148. Creating selector

               // ...
               protected function createSelector( array $parts )
               {
                   switch ( $parts[1] )
                   {
                       case '.':
                           return $parts[0] . '[contains( @class, "' . $parts[2] . '" )]';
                           break;
                       case '#':
                           return $parts[0] . '[@id = "' . $parts[2] . '"]';
                           break;
                   }
               }
           }
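To see which XPath strings the three methods above produce, their logic can be condensed into one stand-alone function. This is a sketch, not part of pQuery: the name `selectorToXPath()` is hypothetical, and the backslash in the pattern is assumed to have been lost in the slide export.

```php
<?php
// Hypothetical stand-alone condensation of createXPath(),
// createSingleXPath() and createSelector() from the slides.
function selectorToXPath( $selector )
{
    $paths = array();
    foreach ( explode( ' ', $selector ) as $part )
    {
        // "h1#second" becomes array( 'h1', '#', 'second' ).
        $parts = preg_split( '((#|\.))', $part, -1, PREG_SPLIT_DELIM_CAPTURE );

        if ( count( $parts ) < 3 )
        {
            // Plain element name without id or class.
            $paths[] = '//' . $part;
        }
        elseif ( $parts[1] === '#' )
        {
            $paths[] = '//' . $parts[0] . '[@id = "' . $parts[2] . '"]';
        }
        else
        {
            $paths[] = '//' . $parts[0] . '[contains( @class, "' . $parts[2] . '" )]';
        }
    }
    // Space-separated parts are combined with the XPath union operator.
    return implode( ' | ', $paths );
}

$simple   = selectorToXPath( 'h1#second' );
$advanced = selectorToXPath( 'h1 a.internal' );

echo $simple, "\n", $advanced, "\n";
```

Two limits of this approach are visible here: a selector without an element name, such as `.internal`, would yield an invalid path, and `contains()` also matches class names that merely contain the given text.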
  149. Adding a class I

               // ...
               public function addClass( $class )
               {
                   foreach ( $this->context as $node )
                   {
                       if ( $node->nodeType !== XML_ELEMENT_NODE )
                       {
                           continue;
                       }
                       if ( !$node->hasAttribute( 'class' ) )
                       {
                           $classes = array();
                       }
                       else
                       {
                           $classes = explode(
                               ' ',
                               $node->getAttribute( 'class' )
                           );
                       }
                       // ...

  150. Adding a class II

                       // ...
                       if ( !in_array( $class, $classes ) )
                       {
                           $classes[] = $class;
                       }
                       $node->setAttribute(
                           'class',
                           implode( ' ', $classes )
                       );
                   }
               }
               // ...
  151. Simple example

           require 'pquery/pquery.php';

           $dom = new DOMDocument();
           $dom->formatOutput = true;
           $dom->loadHTMLFile( '../example.html' );

           $q = new pQuery( $dom );
           $q->query( 'h1#second' );
           $q->addClass( 'someclass' );

           $dom->saveHTMLFile( '../pquerysimple.html' );

  152. Simple example result

           <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
           <html>
             <head>
               <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
               <title>Some website</title>
             </head>
             <body>
               <h1>Some headline</h1>
               <p>
                 Some nice <a href="test.html" class="internal">content</a>.
               </p>
               <h1 id="second" class="someclass">Second headline</h1>
               <p>
                 More nice <a href="test.html">content</a>.
               </p>
             </body>
           </html>
  153. Advanced example

           require 'pquery/pquery.php';

           $dom = new DOMDocument();
           $dom->formatOutput = true;
           $dom->loadHTMLFile( '../example.html' );

           $q = new pQuery( $dom );
           $q->query( 'h1 a.internal' );
           $q->addClass( 'coolclass' );

           $dom->saveHTMLFile( '../pqueryadvanced.html' );

  154. Advanced example result

           <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
           <html>
             <head>
               <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
               <title>Some website</title>
             </head>
             <body>
               <h1 class="coolclass">Some headline</h1>
               <p>
                 Some nice <a href="test.html" class="internal coolclass">content</a>.
               </p>
               <h1 id="second" class="coolclass">Second headline</h1>
               <p>
                 More nice <a href="test.html">content</a>.
               </p>
             </body>
           </html>
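The result above shows that both headlines and the internal link received the class. That matches a union of the two selector parts rather than a CSS descendant selector. The sketch below reproduces this with plain DOMXPath on an inline copy of the example HTML; the union expression is the one the pQuery code would presumably generate for `h1 a.internal`.

```php
<?php
// Inline copy of the example HTML, so no file access is needed.
$html = <<<HTML
<html>
<head><title>Some website</title></head>
<body>
<h1>Some headline</h1>
<p>Some nice <a href="test.html" class="internal">content</a>.</p>
<h1 id="second">Second headline</h1>
<p>More nice <a href="test.html">content</a>.</p>
</body>
</html>
HTML;

$dom = new DOMDocument();
$dom->loadHTML( $html );
$xpath = new DOMXPath( $dom );

// Union: every h1 plus every a whose class contains "internal".
$union = $xpath->query( '//h1 | //a[contains( @class, "internal" )]' );

// What a CSS descendant selector "h1 a.internal" would mean instead.
$descendant = $xpath->query( '//h1//a[contains( @class, "internal" )]' );

foreach ( $union as $node )
{
    echo $node->nodeName, "\n";
}
echo 'descendant matches: ', $descendant->length, "\n";
```

In this document no link sits inside a headline, so the descendant form selects nothing while the union selects three elements.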
  155. The end

       - Thank you for listening!
       - Are there any questions left?
       - I hope you learned what you expected?
       - Contact me: Tobias Schlitt <toby@php.net>
       - Enjoy the conference!
