
Lumberjack XPath 101




  1. 1. #PBCAT The Lumberjack - XPath 101 Thomas Weinert
  2. 2. About Me ● Application Developer ● PHP ● JavaScript ● XSL ● papaya Software GmbH ● papaya CMS ● Technical Director ● FluentDOM
  3. 3. Questions! Please ask any time!
  4. 4. XPath 1 ● XML Path Language ● W3C Recommendation 16 November 1999 ● Used by ● XSLT 1 ● XPointer
  5. 5. XPath 2 ● W3C Recommendation 23 January 2007 ● Largely a superset of XPath 1 ● More data types
  6. 6. DOM ● Document Object Model ● Standard extension: ext/dom ● libxml2 ● XPath 1
  7. 7. DOMXPath ● Create after loading the document! ● evaluate()/query() <?php $str = '<sample><element/></sample>'; $dom = new DOMDocument(); $dom->loadXML($str); $xpath = new DOMXPath($dom); var_dump($xpath->evaluate('//element')); var_dump($xpath->evaluate('//noelement')); var_dump($xpath->evaluate('//noelement/@attr')); ?> object(DOMNodeList)[5]
  8. 8. SimpleXML ● xpath() returns an array of SimpleXMLElement objects <?php $str = '<sample><element/></sample>'; $xml = simplexml_load_string($str); var_dump($xml->xpath('//element')); var_dump($xml->xpath('//noelement')); var_dump($xml->xpath('//noelement/@attr')); ?> Output: array (0 => object(SimpleXMLElement)[2]) ● array (empty) ● boolean false
  9. 9. XSL ● libxslt ● based on libxml2 ● ext/xsl ● ext/xslcache
  10. 10. Syntax ● /element/child[@attr] ● leading / = absolute path ● element = step 1 ● / = separator ● child = step 2 ● [@attr] = predicate
  11. 11. Nodes ● node() ● * or qualified-name ● text() ● comment() ● processing-instruction()
  12. 12. Axes ● axis::... ● Full syntax ● Short Syntax ● Default Axis
  13. 13. child <barcamps> <barcamp title="PHP Unconference Hamburg" id="phpuchh"> <link href="http://www.php-unconference.de/" /> </barcamp> <barcamp title="PHP Barcamp Salzburg" id="phpbcat"> <link href="http://www.phpbarcamp.at/cms/" /> <speakers-featured> <speaker>Bastian Feder</speaker> </speakers-featured> <speakers> <speaker>Thomas Weinert</speaker> </speakers> </barcamp> <barcamp title="PHP Unconference Europe" id="phpuceu"> <link href="http://www.phpuceu.org/" /> </barcamp> </barcamps>
  14. 14. descendant <barcamps> <barcamp title="PHP Unconference Hamburg" id="phpuchh"> <link href="http://www.php-unconference.de/" /> </barcamp> <barcamp title="PHP Barcamp Salzburg" id="phpbcat"> <link href="http://www.phpbarcamp.at/cms/" /> <speakers-featured> <speaker>Bastian Feder</speaker> </speakers-featured> <speakers> <speaker>Thomas Weinert</speaker> </speakers> </barcamp> <barcamp title="PHP Unconference Europe" id="phpuceu"> <link href="http://www.phpuceu.org/" /> </barcamp> </barcamps>
  15. 15. parent <barcamps> <barcamp title="PHP Unconference Hamburg" id="phpuchh"> <link href="http://www.php-unconference.de/" /> </barcamp> <barcamp title="PHP Barcamp Salzburg" id="phpbcat"> <link href="http://www.phpbarcamp.at/cms/" /> <speakers-featured> <speaker>Bastian Feder</speaker> </speakers-featured> <speakers> <speaker>Thomas Weinert</speaker> </speakers> </barcamp> <barcamp title="PHP Unconference Europe" id="phpuceu"> <link href="http://www.phpuceu.org/" /> </barcamp> </barcamps>
  16. 16. following-sibling <barcamps> <barcamp title="PHP Unconference Hamburg" id="phpuchh"> <link href="http://www.php-unconference.de/" /> </barcamp> <barcamp title="PHP Barcamp Salzburg" id="phpbcat"> <link href="http://www.phpbarcamp.at/cms/" /> <speakers-featured> <speaker>Bastian Feder</speaker> </speakers-featured> <speakers> <speaker>Thomas Weinert</speaker> </speakers> </barcamp> <barcamp title="PHP Unconference Europe" id="phpuceu"> <link href="http://www.phpuceu.org/" /> </barcamp> </barcamps>
  17. 17. More Axes ● ancestor ● attribute ● ancestor-or-self ● namespace ● descendant-or-self ● following ● preceding ● preceding-sibling ● self
  18. 18. Short Syntax ● self::node()/descendant-or-self::node()/child::para ● .//para ● Axis → Short ● child:: → (default axis, omitted) ● self::node() → . ● parent::node() → .. ● attribute:: → @ ● /descendant-or-self::node()/ → //
  19. 19. Cast Functions ● string() ● number() ● boolean() echo $xpath->evaluate('string(/html/head/title)');
  20. 20. Node Functions ● count() ● name() ● last() ● local-name() ● position() ● namespace-uri() $list = $xpath->evaluate( '//*[local-name() = "li" and position() = last()]' );
  21. 21. String Functions ● concat() ● normalize-space() ● starts-with() ● translate() ● contains() ● substring-before() ● substring-after() ● substring() ● string-length()
  22. 22. Match A Class ● normalize-space() ● concat() ● contains()
  23. 23. Namespaces ● URN ● Prefix ● Default Namespace ● Own Prefixes ● Attributes
  24. 24. Bug #49490 ● Namespace prefix conflict $dom = new DOMDocument(); $dom->loadXML( '<foobar><a:foo xmlns:a="urn:a">'. '<b:bar xmlns:b="urn:b"/></a:foo>'. '</foobar>' ); $xpath = new DOMXPath($dom); $context = $dom->documentElement->firstChild; $xpath->registerNamespace('a', 'urn:b'); var_dump( $xpath->evaluate('descendant-or-self::a:*', $context) ->item(0)->tagName );
  25. 25. Tools ● Firebug ● Firefox AddOns
  26. 26. CSS Selectors ● JavaScript libraries ● element nodes ● * ● no axes ● descendant-or-self::* ● can ignore namespaces ● descendant-or-self::*[local-name() = '...']
  27. 27. Thanks ● Web: ● http://www.papaya-cms.com/ ● http://www.a-basketful-of-papayas.net/ ● Twitter ● @ThomasWeinert ● Joind.in ● http://joind.in/1621
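
Slides 7 and 19 mention evaluate()/query() and the cast functions. As a rough sketch (the sample document is the one from slide 7; the variable names are only illustrative), evaluate() can return scalars as well as node lists, which query() cannot:

```php
<?php
// Sketch only: same sample document as slide 7.
$dom = new DOMDocument();
$dom->loadXML('<sample><element/></sample>');
$xpath = new DOMXPath($dom);

// Location paths yield a DOMNodeList, even when nothing matches.
$found = $xpath->evaluate('//element');
$missing = $xpath->evaluate('//noelement');

// Expressions with a scalar result need evaluate(); the XPath
// number type arrives in PHP as a float.
$count = $xpath->evaluate('count(//element)');
$name = $xpath->evaluate('name(/*)');
$hasElement = $xpath->evaluate('boolean(//element)');

var_dump($found->length, $missing->length, $count, $name, $hasElement);
```

Checking `->length` on the returned list is usually the simplest way to tell "no match" apart from a match.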
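
The axes from slides 13 to 18 can be tried against the barcamps sample. The following is a minimal sketch, assuming the Salzburg barcamp as context node (the slides do not say which node was highlighted); the `$results` keys are only labels:

```php
<?php
// Sample document from slides 13-16 (last <link> closed so it is well-formed).
$xml = <<<'XML'
<barcamps>
  <barcamp title="PHP Unconference Hamburg" id="phpuchh">
    <link href="http://www.php-unconference.de/" />
  </barcamp>
  <barcamp title="PHP Barcamp Salzburg" id="phpbcat">
    <link href="http://www.phpbarcamp.at/cms/" />
    <speakers-featured>
      <speaker>Bastian Feder</speaker>
    </speakers-featured>
    <speakers>
      <speaker>Thomas Weinert</speaker>
    </speakers>
  </barcamp>
  <barcamp title="PHP Unconference Europe" id="phpuceu">
    <link href="http://www.phpuceu.org/" />
  </barcamp>
</barcamps>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Assumed context node for the relative expressions below.
$context = $xpath->evaluate('/barcamps/barcamp[@id = "phpbcat"]')->item(0);

$results = [
  // child: element children of the context node
  'child' => $xpath->evaluate('count(child::*)', $context),
  // descendant: speaker elements at any depth below the context node
  'descendant' => $xpath->evaluate('count(descendant::speaker)', $context),
  // parent: name of the parent element
  'parent' => $xpath->evaluate('name(parent::*)', $context),
  // following-sibling: id of the first barcamp after the context node
  'following-sibling' => $xpath->evaluate(
    'string(following-sibling::barcamp/@id)', $context
  ),
  // short syntax: .//speaker, string value of the first match
  'short' => $xpath->evaluate('string(.//speaker)', $context),
];
var_dump($results);
```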
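
Slide 22 lists only the three functions used to match a class. One common way to combine them, shown here as a sketch rather than the expression from the talk (the markup and the class name "item" are made up for illustration), is to pad the normalized class attribute with spaces and look for the padded token:

```php
<?php
// Hypothetical markup: only the first and third <li> carry the class "item".
$html = '<ul><li class="item first">A</li><li class="items">B</li>'
  . '<li class="  last   item ">C</li><li>D</li></ul>';

$dom = new DOMDocument();
$dom->loadXML($html);
$xpath = new DOMXPath($dom);

// normalize-space() trims and collapses whitespace, concat() pads the
// value with spaces, contains() then matches the token as a whole word.
$expression =
  '//li[contains(concat(" ", normalize-space(@class), " "), " item ")]';

$texts = [];
foreach ($xpath->evaluate($expression) as $node) {
  $texts[] = $node->textContent;
}
var_dump($texts);
```

The padding is what keeps `items` from matching; a plain `contains(@class, "item")` would accept it.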
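
For the namespace points on slide 23 (default namespace, own prefixes, attributes), a small sketch may help. It assumes an Atom-like document invented for this example; the prefix `atom` is chosen freely and does not appear in the document itself:

```php
<?php
// Hypothetical document using a default namespace (no prefix in the source).
$xml = '<feed xmlns="http://www.w3.org/2005/Atom">'
  . '<entry id="e1"><title>XPath 101</title></entry></feed>';

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// An unprefixed name test in XPath 1 only matches nodes in no namespace,
// so the default namespace of the document is not applied here.
$unprefixed = $xpath->evaluate('count(//entry)');

// Register an own prefix for the namespace and use it in every step.
$xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');
$prefixed = $xpath->evaluate('count(//atom:entry)');
$title = $xpath->evaluate('string(//atom:entry/atom:title)');

// Unprefixed attributes are in no namespace, so @id needs no prefix.
$id = $xpath->evaluate('string(//atom:entry/@id)');

var_dump($unprefixed, $prefixed, $title, $id);
```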
