VTD-XML: The Future of XML Processing
Albert Guo
junyuo@gmail.com

Agenda
VTD-XML
Motivations Behind VTD-XML
Why VTD-XML?
When to Use VTD-XML?
Known Limitations
Basic Concept
Essential Classes
Shortcomings
Typical Programming Flows
Demo
Reference

Motivations Behind VTD-XML
Older XML processing models have *numerous* well-known issues. The following page summarizes a few:
Comparison with DOM, SAX and PULL: http://vtd-xml.sourceforge.net/userGuide/5.html

Why VTD-XML?
The next-generation XML processing model that is simultaneously:
The world's most memory-efficient random-access XML parser.
The world's fastest XML parser.
The world's fastest XPath 1.0 implementation.
The world's most efficient XML indexer that seamlessly integrates with your XML applications.
The world's only XML parser capable of incremental updates: cutting, pasting, splitting and assembling XML documents with maximum efficiency.
The world's only XML parser that allows you to use XPath to process 256 GB XML documents.

When to Use VTD-XML?
Scenarios in which you may consider using VTD-XML:
Large XML files that DOM can't handle
Performance-critical transactional Web Services/SOA applications
Native XML database applications
Network-based XML content switching/routing/security applications

Known Limitations
External entities (those declared within the DTD) are not yet supported.
DTDs are not yet processed (the DTD is returned as a single VTD record).
Schema validation is planned for a future release.
Extremely long element/attribute names (>= 512 characters) or ultra-deep documents (>= 255 levels) cause a parse exception.
http://vtd-xml.sourceforge.net/userGuide/0.html

Basic Concept
Non-extractive tokenization based on the Virtual Token Descriptor (VTD): VTD-XML uses 64-bit integers to encode offsets, lengths, token types and depths.
The XML document is kept intact and un-decoded.
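
To make the idea of a 64-bit token descriptor concrete, here is a minimal Java sketch of packing an offset, a length, a token type and a depth into one `long`. The class name, the field widths and positions, and the `TOKEN_TEXT` code are all assumptions chosen for illustration; they are not taken from this deck and should not be read as VTD-XML's actual record format.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only. The field widths and positions are assumptions
// made for this example, not a statement of VTD-XML's actual record format.
final class VtdRecordSketch {
    // Assumed layout, from high bits to low bits:
    // 4 bits token type | 8 bits depth | 20 bits length | 2 unused | 30 bits offset
    static final int OFFSET_BITS = 30;
    static final int LENGTH_BITS = 20;
    static final int DEPTH_BITS = 8;
    static final int TYPE_BITS = 4;
    static final int LENGTH_SHIFT = 32;
    static final int DEPTH_SHIFT = 52;
    static final int TYPE_SHIFT = 60;

    // Hypothetical token type code for character data.
    static final int TOKEN_TEXT = 5;

    // Packs the four fields into a single 64-bit record.
    static long pack(int type, int depth, int length, int offset) {
        return ((long) checked(type, TYPE_BITS, "type") << TYPE_SHIFT)
             | ((long) checked(depth, DEPTH_BITS, "depth") << DEPTH_SHIFT)
             | ((long) checked(length, LENGTH_BITS, "length") << LENGTH_SHIFT)
             | (long) checked(offset, OFFSET_BITS, "offset");
    }

    static int type(long rec)   { return field(rec, TYPE_SHIFT, TYPE_BITS); }
    static int depth(long rec)  { return field(rec, DEPTH_SHIFT, DEPTH_BITS); }
    static int length(long rec) { return field(rec, LENGTH_SHIFT, LENGTH_BITS); }
    static int offset(long rec) { return field(rec, 0, OFFSET_BITS); }

    // Extracts an unsigned bit field of the given width.
    private static int field(long rec, int shift, int bits) {
        return (int) ((rec >>> shift) & ((1L << bits) - 1));
    }

    // Rejects values that would not fit in the assumed field width.
    private static int checked(int value, int bits, String name) {
        if (value < 0 || value >= (1 << bits)) {
            throw new IllegalArgumentException(
                name + " does not fit in " + bits + " bits: " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        // The document bytes stay untouched; the record only points into them.
        byte[] doc = "<a><b>hi</b></a>".getBytes(StandardCharsets.US_ASCII);
        long rec = pack(TOKEN_TEXT, 1, 2, 6); // text "hi": offset 6, length 2, depth 1
        String text = new String(doc, offset(rec), length(rec), StandardCharsets.US_ASCII);
        System.out.println(text + " at depth " + depth(rec)); // prints "hi at depth 1"
    }
}
```

In this sketch a token costs eight bytes regardless of its length, and the original bytes never have to be copied or decoded until a caller asks for a string.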

Basic Concept – cont.
In other words, in the vast majority of cases, string allocation is *unnecessary* and nothing but a waste of CPU and memory.
VTD-XML performs many string operations directly on VTD records:
String to VTD record comparison (both boolean and lexicographic)
Direct conversions from VTD records to ints, longs, floats and doubles
VTD record to String conversion is also provided, but avoid it whenever possible for performance reasons.
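
The operations listed above can be sketched without any library: given the document bytes plus a token's offset and length, comparison and integer parsing can read the bytes in place and never build a `String`. The class and method names below are hypothetical and this is not VTD-XML's implementation; the sketch also assumes ASCII content with no entity references or surrounding whitespace.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: hypothetical helpers, ASCII content assumed.
final class TokenOpsSketch {
    // Boolean comparison: do the bytes doc[offset, offset+length) spell s?
    static boolean matches(byte[] doc, int offset, int length, String s) {
        if (s.length() != length) return false;
        for (int i = 0; i < length; i++) {
            if ((doc[offset + i] & 0xFF) != s.charAt(i)) return false;
        }
        return true;
    }

    // Lexicographic comparison: negative, zero or positive, like String.compareTo.
    static int compare(byte[] doc, int offset, int length, String s) {
        int n = Math.min(length, s.length());
        for (int i = 0; i < n; i++) {
            int d = (doc[offset + i] & 0xFF) - s.charAt(i);
            if (d != 0) return d;
        }
        return length - s.length();
    }

    // Parses a decimal int directly from the token's bytes.
    static int parseInt(byte[] doc, int offset, int length) {
        if (length == 0) throw new NumberFormatException("empty token");
        int i = offset;
        int end = offset + length;
        boolean negative = doc[i] == '-';
        if (negative || doc[i] == '+') i++;
        if (i == end) throw new NumberFormatException("no digits");
        long value = 0;
        for (; i < end; i++) {
            int digit = doc[i] - '0';
            if (digit < 0 || digit > 9) throw new NumberFormatException("bad digit at " + i);
            value = value * 10 + digit;
            // Allow one past Integer.MAX_VALUE so that Integer.MIN_VALUE can be parsed.
            if (value > (long) Integer.MAX_VALUE + 1) throw new NumberFormatException("overflow");
        }
        if (negative) value = -value;
        if (value > Integer.MAX_VALUE) throw new NumberFormatException("overflow");
        return (int) value;
    }

    public static void main(String[] args) {
        byte[] doc = "<price>1299</price>".getBytes(StandardCharsets.US_ASCII);
        // Element name "price" sits at offset 1 (length 5); its text at offset 7 (length 4).
        System.out.println(matches(doc, 1, 5, "price"));
        System.out.println(compare(doc, 1, 5, "quantity") < 0);
        System.out.println(parseInt(doc, 7, 4) + 1);
    }
}
```

None of the three helpers allocates an object on the success path, which is the point the slide makes about avoiding string conversion.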

Basic Concept – cont.
VTD-XML's document hierarchy consists *exclusively* of elements.
Navigation moves a single, global cursor to different locations in the document tree.
Many of VTDNav's methods identify a VTD record by its index value.
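
The cursor model can be illustrated with a toy that holds element records in document order, each tagged with its depth, plus one mutable index that serves as the cursor. This is a simplified stand-in written for this explanation: the class and method names are hypothetical and do not reproduce VTDNav's API, and a real implementation would not need to scan linearly.

```java
// Illustrative sketch only: a toy element table with a single cursor.
final class CursorSketch {
    private final String[] names; // element names in document order
    private final int[] depths;   // nesting depth of each element; the root is 0
    private int cursor = 0;       // index of the current element; starts at the root

    CursorSketch(String[] names, int[] depths) {
        if (names.length == 0 || names.length != depths.length) {
            throw new IllegalArgumentException("need one depth per element");
        }
        this.names = names;
        this.depths = depths;
    }

    int index()   { return cursor; }
    String name() { return names[cursor]; }

    // The first child, if any, is the next record and is exactly one level deeper.
    boolean toFirstChild() {
        int next = cursor + 1;
        if (next < depths.length && depths[next] == depths[cursor] + 1) {
            cursor = next;
            return true;
        }
        return false;
    }

    // Scan forward past descendants; stop when the parent's subtree ends.
    boolean toNextSibling() {
        int d = depths[cursor];
        for (int i = cursor + 1; i < depths.length && depths[i] >= d; i++) {
            if (depths[i] == d) {
                cursor = i;
                return true;
            }
        }
        return false;
    }

    // The parent is the nearest preceding record that is one level shallower.
    boolean toParent() {
        int d = depths[cursor];
        for (int i = cursor - 1; i >= 0; i--) {
            if (depths[i] == d - 1) {
                cursor = i;
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Models <order><item/><item><sku/></item><note/></order>
        CursorSketch nav = new CursorSketch(
            new String[] {"order", "item", "item", "sku", "note"},
            new int[]    {0,       1,      1,      2,     1});
        // Visit every child of the root by moving the one cursor.
        if (nav.toFirstChild()) {
            do {
                System.out.println(nav.index() + ": " + nav.name());
            } while (nav.toNextSibling());
            nav.toParent(); // restore the cursor to the root
        }
    }
}
```

Because there is only one cursor, a failed move leaves it where it was, and the caller is responsible for moving it back after a traversal, as the last line of `main` does.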
