VTD-XML: The Future of XML Processing
 

    Presentation Transcript

    • VTD-XML: The Future of XML Processing
      Albert Guo
      junyuo@gmail.com
    • Agenda
      VTD-XML
      Motivations Behind VTD-XML
      Why VTD-XML?
      When to Use VTD-XML?
      Known Limitations
      Basic Concept
      Essential Classes
      Shortcomings
      Typical Programming Flows
      Demo
      Reference
    • Motivations Behind VTD-XML
      The older XML processing models have *numerous* well-known issues; the following comparison summarizes a few:
      Comparison with DOM, SAX and PULL
      http://vtd-xml.sourceforge.net/userGuide/5.html
    • Why VTD-XML?
      The next-generation XML processing model, which is simultaneously:
      • The world's most memory-efficient random-access XML parser
      • The world's fastest XML parser
      • The world's fastest XPath 1.0 implementation
      • The world's most efficient XML indexer, one that integrates seamlessly with your XML applications
      • The world's only XML parser capable of incremental updates: cutting, pasting, splitting and assembling XML documents with maximum efficiency
      • The world's only XML parser that lets you use XPath to process XML documents as large as 256 GB
    • When to Use VTD-XML?
      Scenarios in which you may want to consider VTD-XML:
      Large XML files that DOM can’t handle
      Performance-critical transactional Web Services/SOA applications
      Native XML database applications
      Network-based XML content switching/routing/security applications
    • Known Limitations
      • External entities (those declared within the DTD) are not yet supported
      • DTDs are not yet processed (the DTD is returned as a single VTD record)
      • Schema validation is planned for a future release
      • Extremely long element/attribute names (>= 512 chars) or ultra-deep documents (>= 255 levels) cause a parse exception
      • http://vtd-xml.sourceforge.net/userGuide/0.html
    • Basic Concept
      • Non-extractive tokenization based on Virtual Token Descriptor (VTD): use 64-bit integers to encode offsets, lengths, token types, depths
      • The XML document is kept intact and un-decoded.
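
      To make the 64-bit encoding concrete, here is a minimal sketch that packs a token's type, depth, length and offset into one long and reads them back. The field widths (4-bit type, 8-bit depth, 20-bit length, 30-bit offset), the token type value, and the class and method names are assumptions for illustration; check the VTD-XML documentation for the layout the library actually uses.

```java
import java.nio.charset.StandardCharsets;

public class VtdRecordSketch {
    // Assumed bit positions, for illustration only.
    static final int LENGTH_SHIFT = 32; // two unused bits sit above the 30-bit offset
    static final int DEPTH_SHIFT = 52;
    static final int TYPE_SHIFT = 60;

    // Hypothetical token type value for character data.
    public static final int TOKEN_TEXT = 5;

    /** Packs the four fields into a single 64-bit record. */
    public static long pack(int type, int depth, int length, int offset) {
        return ((long) (type & 0xF) << TYPE_SHIFT)
             | ((long) (depth & 0xFF) << DEPTH_SHIFT)
             | ((long) (length & 0xFFFFF) << LENGTH_SHIFT)
             | (offset & 0x3FFFFFFFL);
    }

    public static int type(long r)   { return (int) (r >>> TYPE_SHIFT) & 0xF; }
    public static int depth(long r)  { return (int) (r >>> DEPTH_SHIFT) & 0xFF; }
    public static int length(long r) { return (int) (r >>> LENGTH_SHIFT) & 0xFFFFF; }
    public static int offset(long r) { return (int) (r & 0x3FFFFFFFL); }

    public static void main(String[] args) {
        // The document stays as raw bytes; the record only points into it.
        byte[] doc = "<a><b>hi</b></a>".getBytes(StandardCharsets.UTF_8);
        long record = pack(TOKEN_TEXT, 1, 2, 6); // the text "hi" starts at byte 6

        // A String is built only when the application explicitly asks for one.
        String text = new String(doc, offset(record), length(record), StandardCharsets.UTF_8);
        System.out.println(text + " at depth " + depth(record));
    }
}
```

      Because the record stores only positions, the original bytes never need to be copied or decoded during tokenization.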
    • Basic Concept – cont.
      • In the vast majority of cases, string allocation is *unnecessary* and nothing but a waste of CPU and memory
      • VTD-XML performs many string operations directly on VTD records:
      • String to VTD record comparison (both boolean and lexicographic)
      • Direct conversion from VTD records to ints, longs, floats and doubles
      • VTD record to String conversion is also provided, but avoid it whenever possible for performance reasons
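
      The idea of operating on the raw bytes instead of on String objects can be sketched without the library. The helper names below are hypothetical and the code assumes ASCII content; VTD-XML's own methods also deal with character encodings and entity references, which this sketch ignores.

```java
import java.nio.charset.StandardCharsets;

public class TokenOpsSketch {
    /** Compares the token at doc[offset, offset + length) with s without building a String. */
    public static boolean matches(byte[] doc, int offset, int length, String s) {
        if (s.length() != length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if ((char) (doc[offset + i] & 0xFF) != s.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    /** Parses a decimal int straight from the bytes; overflow is not checked in this sketch. */
    public static int parseInt(byte[] doc, int offset, int length) {
        if (length <= 0) {
            throw new NumberFormatException("empty token");
        }
        int i = offset;
        int end = offset + length;
        boolean negative = doc[i] == '-';
        if (negative) {
            i++;
        }
        if (i == end) {
            throw new NumberFormatException("no digits");
        }
        int value = 0;
        for (; i < end; i++) {
            int digit = doc[i] - '0';
            if (digit < 0 || digit > 9) {
                throw new NumberFormatException("not a digit at offset " + i);
            }
            value = value * 10 + digit;
        }
        return negative ? -value : value;
    }

    public static void main(String[] args) {
        byte[] doc = "<person><age>30</age></person>".getBytes(StandardCharsets.US_ASCII);
        // The name "age" starts at byte 9 (length 3); the text "30" starts at byte 13 (length 2).
        System.out.println(matches(doc, 9, 3, "age"));
        System.out.println(parseInt(doc, 13, 2));
    }
}
```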
    • Basic Concept – cont.
      • VTD-XML’s document hierarchy consists *exclusively* of elements
      • Navigation works by moving a single, global cursor to different locations in the document tree
      • Many of VTDNav’s methods identify a VTD record by its index value
      • -1 corresponds to “no such record”
    • Essential Classes
    • Essential Classes – cont.
    • Shortcomings
      Poor exception handling
      If the parseFile method does not execute properly, it simply returns false and does not report any exception message.
    • Shortcomings – cont.
      Add a BufferedInputStream to the parseFile method to avoid exceeding the maximum read buffer size on UNIX platforms
      You need to modify build.bat to rebuild the VTD-XML jar file, then add the new jar to the class path.
      //add the commons-io jar file to the classpath in the first line
      javac -classpath .;D:\lib\commons-io-1.4\commons-io-1.4.jar com\ximpleware\*.java
      javac com\ximpleware\xpath\*.java
      javac com\ximpleware\parser\*.java

      Finally, execute build.bat; it generates the new jar file for you.
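
      One way to work around both shortcomings without patching the library might be to read the file yourself, so that I/O failures surface as exceptions, and then pass the bytes to the parser as a byte buffer. The sketch below uses only the JDK; the class and method names are hypothetical, and the hand-off to VTDGen is left out so the block stays self-contained.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class XmlFileReaderSketch {
    /** Reads the whole file through a BufferedInputStream; any failure is reported as an IOException. */
    public static byte[] readFully(String path) throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            // read() may return fewer bytes than requested, so loop until end of stream.
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.err.println("usage: XmlFileReaderSketch <file.xml>");
            return;
        }
        try {
            byte[] xml = readFully(args[0]);
            System.out.println("read " + xml.length + " bytes");
        } catch (IOException e) {
            // Unlike a bare "false" return value, the cause is visible here.
            System.err.println("could not read XML file: " + e.getMessage());
        }
    }
}
```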
    • Typical Programming Flows
      Parse or load the document in one of three ways:
      Call VTDGen’s parseFile(…)
      Start with a byte buffer containing the content of the XML, call VTDGen’s setDoc(), then call VTDGen’s parse()
      Call VTDGen’s loadIndex(…)
      Obtain an instance of VTDNav from VTDGen
      Then either move VTDNav’s cursor manually to various locations and perform the corresponding application logic,
      or instantiate AutoPilot for node iteration and XPath evaluation and perform the corresponding application logic
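
      The byte-buffer flow, followed by both styles of navigation, might look like the sketch below. It needs the VTD-XML jar on the class path. The sample document, its element names and the class name VtdFlowSketch are assumptions made for illustration, and the methods declare `throws Exception` only to keep the sketch short.

```java
import com.ximpleware.AutoPilot;
import com.ximpleware.VTDGen;
import com.ximpleware.VTDNav;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class VtdFlowSketch {
    // Hypothetical sample document; the element names are assumptions.
    public static final String SAMPLE =
        "<people><person><name>Albert</name></person>"
        + "<person><name>Mandy</name></person></people>";

    /** Byte buffer -> setDoc -> parse -> getNav. */
    public static VTDNav parse(String xml) throws Exception {
        VTDGen vg = new VTDGen();
        vg.setDoc(xml.getBytes(StandardCharsets.UTF_8));
        vg.parse(false); // false: parse without namespace awareness
        return vg.getNav();
    }

    /** Manual navigation: count the <person> children of the root element. */
    public static int countPersons(VTDNav vn) throws Exception {
        int count = 0;
        vn.toElement(VTDNav.ROOT);
        if (vn.toElement(VTDNav.FIRST_CHILD, "person")) {
            do {
                count++;
            } while (vn.toElement(VTDNav.NEXT_SIBLING, "person"));
            vn.toElement(VTDNav.PARENT); // move the cursor back to the root
        }
        return count;
    }

    /** AutoPilot navigation: collect the text of every <name> element via XPath. */
    public static List<String> names(VTDNav vn) throws Exception {
        AutoPilot ap = new AutoPilot(vn);
        ap.selectXPath("/people/person/name");
        List<String> result = new ArrayList<>();
        while (ap.evalXPath() != -1) {     // -1: no more matching nodes
            int textIndex = vn.getText();  // -1: the element has no text
            if (textIndex != -1) {
                result.add(vn.toNormalizedString(textIndex));
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        VTDNav vn = parse(SAMPLE);
        System.out.println(countPersons(vn));
        System.out.println(names(vn));
    }
}
```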
    • Demo
    • 1. Add <age> tag after <gender>
    • 1. Add <age> tag after <gender> – cont.
      Compiled the XPath expression
      Bound it to VTDNav
      Assigned the age value
      Moved the cursor to <gender> and added the <age> tag after the <gender> tag
      Wrote the output to a new XML file
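
      The steps listed for this demo can be sketched as follows. This is not the presenter's code, which is not included in the transcript; the document shape, the XPath expression and the class name AddAgeSketch are assumptions. The sketch needs the VTD-XML jar on the class path and writes to an in-memory stream rather than to a file.

```java
import com.ximpleware.AutoPilot;
import com.ximpleware.VTDGen;
import com.ximpleware.VTDNav;
import com.ximpleware.XMLModifier;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class AddAgeSketch {
    /** Inserts an <age> element after every <gender> element and returns the new document. */
    public static String addAge(String xml, String age) throws Exception {
        VTDGen vg = new VTDGen();
        vg.setDoc(xml.getBytes(StandardCharsets.UTF_8));
        vg.parse(false);
        VTDNav vn = vg.getNav();

        AutoPilot ap = new AutoPilot(vn);   // bind the AutoPilot to the navigator
        ap.selectXPath("//person/gender");  // compile the XPath expression (assumed path)

        XMLModifier xm = new XMLModifier(vn);
        while (ap.evalXPath() != -1) {
            // The cursor now sits on a <gender> element; queue an insertion after it.
            xm.insertAfterElement("<age>" + age + "</age>");
        }

        // The original bytes are untouched; the changes are applied while writing the output.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        xm.output(out);
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(addAge("<person><gender>M</gender></person>", "30"));
    }
}
```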
    • 2. Remove <age> tag
    • 2. Remove <age> tag – cont.
      Compiled the XPath expression
      Bound it to VTDNav
      Removed <age>
      Wrote the output to a new XML file
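
      A removal along the same lines might look like this. Again, this is a sketch rather than the presenter's code: the document shape, the XPath expression and the class name RemoveAgeSketch are assumptions, and the VTD-XML jar must be on the class path.

```java
import com.ximpleware.AutoPilot;
import com.ximpleware.VTDGen;
import com.ximpleware.VTDNav;
import com.ximpleware.XMLModifier;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class RemoveAgeSketch {
    /** Removes every <age> element under <person> and returns the new document. */
    public static String removeAge(String xml) throws Exception {
        VTDGen vg = new VTDGen();
        vg.setDoc(xml.getBytes(StandardCharsets.UTF_8));
        vg.parse(false);
        VTDNav vn = vg.getNav();

        AutoPilot ap = new AutoPilot(vn);
        ap.selectXPath("//person/age"); // assumed path

        XMLModifier xm = new XMLModifier(vn);
        while (ap.evalXPath() != -1) {
            xm.remove(); // queue removal of the element under the cursor
        }

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        xm.output(out);
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(removeAge("<person><gender>M</gender><age>30</age></person>"));
    }
}
```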
    • 3. Add Contact info after <age> tag
    • 3. Add Contact info after <age> tag – cont.
      Compiled the XPath expression
      Bound it to VTDNav
      Assigned the age value
      Inserted the new value after the <gender> tag
      Wrote the output to a new XML file
    • 4. Visit XML file
      2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:339) - name=Albert, gender=男
      2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:341) - phone=02-11111111
      2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:341) - phone=0911111111
      2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:341) - phone=0915555555
      2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:339) - name=Mandy, gender=女
      2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:341) - phone=02-22222222
      2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:341) - phone=0912222222
      2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:339) - name=Verio, gender=男
      2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:341) - phone=0913333333
    • 4. Visit XML file – cont.
      Compiled the XPath expression
      Bound it to VTDNav
    • 4. Visit XML file – cont.
    • Reference
      Official site
      http://vtd-xml.sourceforge.net/
      Jar files, source code, sample code
      http://sourceforge.net/projects/vtd-xml/files/
      JavaDoc
      http://vtd-xml.sourceforge.net/javadoc/
      Accelerate WSS applications with VTD-XML
      http://www.javaworld.com/javaworld/jw-01-2007/jw-01-vtd.html
    • Reference – cont.
      Process SOAP with VTD-XML
      http://jimmyzhang.sys-con.com/node/48764/mobile
      XPath Tutorial
      http://www.w3schools.com/XPath/default.asp
      Sample code
      http://sites.google.com/site/junyuo/Home/code