VTD-XML: The Future of XML Processing
Published in: Technology, News & Politics

Transcript

  • 1. VTD-XML: The Future of XML Processing
    Albert Guo
    junyuo@gmail.com
  • 2. VTD-XML: Agenda
    Motivations Behind VTD-XML
    Why VTD-XML?
    When to Use VTD-XML?
    Known Limitations
    Basic Concept
    Essential Classes
    Shortcomings
    Typical Programming Flows
    Demo
    Reference
  • 3. Motivations Behind VTD-XML
    The older XML processing models have numerous well-known issues; a few are summarized in the comparison with DOM, SAX and PULL:
    http://vtd-xml.sourceforge.net/userGuide/5.html
  • 4. Why VTD-XML?
    The next-generation XML processing model, which the project describes as simultaneously:
    • The world's most memory-efficient random-access XML parser
    • The world's fastest XML parser
    • The world's fastest XPath 1.0 implementation
    • The world's most efficient XML indexer, one that integrates seamlessly with your XML applications
    • The world's only XML parser capable of incremental update: cutting, pasting, splitting and assembling XML documents with maximum efficiency
    • The world's only XML parser that lets you use XPath to process XML documents as large as 256 GB
  • 11. When to Use VTD-XML?
    Scenarios in which you may consider using VTD-XML:
    Large XML files that DOM can’t handle
    Performance-critical transactional Web Services/SOA applications
    Native XML database applications
    Network-based XML content switching/routing/security applications
  • 12. Known Limitations
    • External entities (those declared within a DTD) are not yet supported
    • DTDs are not yet processed (a DTD is returned as a single VTD record)
    • Schema validation is planned for a future release
    • Extremely long element/attribute names (>= 512 chars) or ultra-deep documents (>= 255 levels) cause a parse exception
    • http://vtd-xml.sourceforge.net/userGuide/0.html
  • 17. Basic Concept
    • Non-extractive tokenization based on the Virtual Token Descriptor (VTD): each token is described by a 64-bit integer that encodes its offset, length, token type and depth
    • The XML document is kept intact and undecoded
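To make the idea of a 64-bit token descriptor concrete, here is a minimal sketch of packing an offset, length, depth and token type into one `long`. The field widths and bit positions (30-bit offset, 20-bit length, 8-bit depth, 4-bit type) and the class name `VtdRecordSketch` are assumptions chosen for illustration; they are not taken from the slides and may differ from what the library actually uses.

```java
// Illustrative only: field widths and positions are assumptions, not the library's documented layout.
class VtdRecordSketch {
    static final int OFFSET_BITS = 30;
    static final int LENGTH_BITS = 20;
    static final int DEPTH_BITS = 8;
    static final int TYPE_BITS = 4;

    static final int LENGTH_SHIFT = 32; // bits 30-31 are left unused in this sketch
    static final int DEPTH_SHIFT = 52;
    static final int TYPE_SHIFT = 60;

    static long mask(int bits) {
        return (1L << bits) - 1;
    }

    // Pack the four fields into one 64-bit record; each field is truncated to its width.
    static long pack(int type, int depth, int length, int offset) {
        return ((type & mask(TYPE_BITS)) << TYPE_SHIFT)
             | ((depth & mask(DEPTH_BITS)) << DEPTH_SHIFT)
             | ((length & mask(LENGTH_BITS)) << LENGTH_SHIFT)
             | (offset & mask(OFFSET_BITS));
    }

    static int type(long record)   { return (int) ((record >>> TYPE_SHIFT) & mask(TYPE_BITS)); }
    static int depth(long record)  { return (int) ((record >>> DEPTH_SHIFT) & mask(DEPTH_BITS)); }
    static int length(long record) { return (int) ((record >>> LENGTH_SHIFT) & mask(LENGTH_BITS)); }
    static int offset(long record) { return (int) (record & mask(OFFSET_BITS)); }

    public static void main(String[] args) {
        // A hypothetical token: type 5, nesting depth 3, 6 bytes long, starting at byte 14.
        long record = pack(5, 3, 6, 14);
        System.out.println("type=" + type(record) + " depth=" + depth(record)
                + " length=" + length(record) + " offset=" + offset(record));
    }
}
```

Because a record only points into the original bytes, the document itself never has to be copied or decoded to describe a token.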
  • 19. Basic Concept – cont.
    • In the vast majority of cases, string allocation is *unnecessary* and nothing but a waste of CPU and memory
    • VTD-XML performs many string operations directly on VTD records:
    • String-to-VTD-record comparison (both equality and lexicographic)
    • Direct conversion from VTD records to ints, longs, floats and doubles
    • VTD-record-to-String conversion is also provided, but avoid it whenever possible for performance reasons
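The following sketch illustrates what operating directly on a token, without allocating a String, can look like. It is not the library's code: the class `TokenOps` and its methods are hypothetical, the token is identified by a plain offset and length into the document bytes, and only ASCII content and overflow-free integers are handled.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: works on (offset, length) ranges of the raw document bytes.
class TokenOps {
    // Compare the token at [offset, offset + length) with s, byte by byte (ASCII only).
    static boolean matches(byte[] doc, int offset, int length, String s) {
        if (s.length() != length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if ((doc[offset + i] & 0xFF) != s.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    // Convert the token to an int straight from the bytes; overflow is not checked in this sketch.
    static int parseInt(byte[] doc, int offset, int length) {
        int i = offset;
        int end = offset + length;
        boolean negative = length > 0 && doc[i] == '-';
        if (negative) {
            i++;
        }
        if (i >= end) {
            throw new NumberFormatException("no digits in token");
        }
        int value = 0;
        for (; i < end; i++) {
            int digit = doc[i] - '0';
            if (digit < 0 || digit > 9) {
                throw new NumberFormatException("not a digit at byte " + i);
            }
            value = value * 10 + digit;
        }
        return negative ? -value : value;
    }

    public static void main(String[] args) {
        // Sample document; for ASCII text, character indexes equal byte offsets.
        String xml = "<person><name>Albert</name><age>30</age></person>";
        byte[] doc = xml.getBytes(StandardCharsets.US_ASCII);
        int nameOffset = xml.indexOf("Albert");
        int ageOffset = xml.indexOf("30");
        System.out.println(matches(doc, nameOffset, 6, "Albert"));
        System.out.println(parseInt(doc, ageOffset, 2) + 1);
    }
}
```

Neither method creates a String for the token, which is the saving the slide describes.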
  • 24. Basic Concept – cont.
    • VTD-XML’s document hierarchy consists *exclusively* of elements
    • Navigation moves a single, global cursor to different locations in the document tree
    • Many of VTDNav’s methods identify a VTD record by its index value
    • -1 corresponds to “no such record”
  • 28. Essential Classes
  • 29. Essential Classes – cont.
  • 30. Shortcomings
    Poor exception handling:
    if the parseFile method does not execute properly, it just returns false and does not report any exception message.
  • 31. Shortcomings – cont.
    Add a BufferedInputStream to the parseFile method to avoid exceeding the maximum read buffer size on UNIX platforms.
    You need to modify build.bat to rebuild the VTD-XML jar file, then put the rebuilt jar on the class path.
    //add the commons-io jar file to the first line
    javac -classpath .;D:\lib\commons-io-1.4\commons-io-1.4.jar com\ximpleware\*.java
    javac com\ximpleware\xpath\*.java
    javac com\ximpleware\parser\*.java

    Finally, execute build.bat; it generates the new jar file for you.
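As an alternative sketch that does not require patching the library, the file can be read through a BufferedInputStream in application code, and the resulting byte buffer handed to the parser through the byte-buffer route listed under Typical Programming Flows. The class `XmlFileLoader` and method `readAll` are hypothetical names; only standard JDK classes are used. Because the method declares IOException, a read failure surfaces with a message instead of a bare false.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: loads a whole file into memory in fixed-size chunks.
class XmlFileLoader {
    static byte[] readAll(File file) throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream(file));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] chunk = new byte[8192];
            int n;
            // read() may return fewer bytes than requested, so loop until end of stream.
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java XmlFileLoader <file.xml>");
            return;
        }
        byte[] doc = readAll(new File(args[0]));
        System.out.println("read " + doc.length + " bytes");
    }
}
```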
  • 32. Typical Programming Flows
    Step 1: load the document in one of three ways:
      Call VTDGen’s parseFile(…)
      Start with a byte buffer containing the XML content, call VTDGen’s setDoc(), then call VTDGen’s parse()
      Call VTDGen’s loadIndex(…)
    Step 2: obtain an instance of VTDNav from VTDGen
    Step 3: process the document in one of two ways:
      Move VTDNav’s cursor manually to various locations and perform the corresponding application logic
      Instantiate AutoPilot for node iteration and XPath, and perform the corresponding application logic
  • 33. Demo
  • 34. 1. Add <age> tag after <gender>
  • 35. 1. Add <age> tag after <gender> – cont.
    Compiled the XPath expression
    Bound it to VTDNav
    Assigned the age value
    Moved the cursor to <gender> and added the <age> tag after the <gender> tag
    Output the result to a new XML file
  • 36. 2. Remove <age> tag
  • 37. 2. Remove <age> tag – cont.
    Compiled the XPath expression
    Bound it to VTDNav
    Removed <age>
    Output the result to a new XML file
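The code for these two demos was not captured in the transcript. The sketch below only illustrates the underlying idea of incremental update: once the byte offset of an element is known, new content can be spliced in, or a byte range dropped, without re-parsing or re-serializing the rest of the document. It is not the VTD-XML API; the class `SpliceSketch`, its methods, the sample document and the age value 30 are all hypothetical, and the offsets are found with a plain string search rather than with VTD records.

```java
import java.nio.charset.StandardCharsets;

// Conceptual sketch only: byte-level splicing on an untouched source document.
class SpliceSketch {
    // Return a copy of doc with insertion placed at byte offset pos.
    static byte[] insertAt(byte[] doc, int pos, byte[] insertion) {
        byte[] out = new byte[doc.length + insertion.length];
        System.arraycopy(doc, 0, out, 0, pos);
        System.arraycopy(insertion, 0, out, pos, insertion.length);
        System.arraycopy(doc, pos, out, pos + insertion.length, doc.length - pos);
        return out;
    }

    // Return a copy of doc without the bytes in [start, start + length).
    static byte[] removeRange(byte[] doc, int start, int length) {
        byte[] out = new byte[doc.length - length];
        System.arraycopy(doc, 0, out, 0, start);
        System.arraycopy(doc, start + length, out, start, doc.length - start - length);
        return out;
    }

    public static void main(String[] args) {
        String xml = "<person><name>Albert</name><gender>M</gender></person>";
        byte[] doc = xml.getBytes(StandardCharsets.US_ASCII);

        // Demo 1 in miniature: add <age> right after the end of <gender>.
        String age = "<age>30</age>";
        int afterGender = xml.indexOf("</gender>") + "</gender>".length();
        byte[] withAge = insertAt(doc, afterGender, age.getBytes(StandardCharsets.US_ASCII));
        System.out.println(new String(withAge, StandardCharsets.US_ASCII));

        // Demo 2 in miniature: remove the <age> element again.
        byte[] withoutAge = removeRange(withAge, afterGender, age.length());
        System.out.println(new String(withoutAge, StandardCharsets.US_ASCII));
    }
}
```

In the real demos the offsets would come from the cursor position reached through the XPath evaluation rather than from a string search.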
  • 38. 3. Add Contact info after <age> tag
  • 39. 3. Add Contact info after <age> tag – cont.
    Compiled the XPath expression
    Bound it to VTDNav
    Assigned the age value
    Inserted the new value after the <gender> tag
    Output the result to a new XML file
  • 40. 4. Visit XML file
    2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:339) - name=Albert, gender=男
    2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:341) - phone=02-11111111
    2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:341) - phone=0911111111
    2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:341) - phone=0915555555
    2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:339) - name=Mandy, gender=女
    2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:341) - phone=02-22222222
    2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:341) - phone=0912222222
    2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:339) - name=Verio, gender=男
    2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:341) - phone=0913333333
  • 41. 4. Visit XML file – cont.
    Compiled the XPath expression
    Bound it to VTDNav
  • 42. 4. Visit XML file – cont.
  • 43. Reference
    Official site
    http://vtd-xml.sourceforge.net/
    Jar files, source code, sample code
    http://sourceforge.net/projects/vtd-xml/files/
    JavaDoc
    http://vtd-xml.sourceforge.net/javadoc/
    Accelerate WSS applications with VTD-XML
    http://www.javaworld.com/javaworld/jw-01-2007/jw-01-vtd.html
  • 44. Reference – cont.
    Process SOAP with VTD-XML
    http://jimmyzhang.sys-con.com/node/48764/mobile
    XPath tutorial
    http://www.w3schools.com/XPath/default.asp
    Sample code
    http://sites.google.com/site/junyuo/Home/code
