3. XProc: An XML Pipeline Language
W3C Candidate Recommendation
http://www.w3.org/TR/xproc/
Language for describing operations to be performed on XML documents
Declarative, XML-based, extensible
4. XProc Status
XProc processor implementations
Calabash (http://xmlcalabash.com)
Calumet (http://developer.emc.com/xmltech)
…more in development
XProc.org
Informal website about XProc and its use
http://xproc.org
XProc Test Suite
http://tests.xproc.org
6. XProc Basics
Step
Performs a well-defined task
Validate, XSLT, XInclude, identity transformation, rename elements, …
Three main step types:
Atomic, compound, and built-in language constructs
Pipeline
Sequence (possibly non-linear) of steps
Is a step itself
Built-ins
For-each, choose, try/catch, …
XPath as expression language
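A minimal sketch of how these pieces fit together: one atomic step (p:xinclude) followed by a built-in construct (p:choose) whose branch is selected by an XPath expression. The "version" attribute being tested and the value "1.0" written to it are illustrative assumptions, not part of the original slides.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Atomic step: expand XInclude references in the input document -->
  <p:xinclude/>

  <!-- Built-in construct: branch on an XPath test against the current document -->
  <p:choose>
    <p:when test="/*[@version]">
      <!-- Root element already has a version attribute (assumed name): pass through -->
      <p:identity/>
    </p:when>
    <p:otherwise>
      <!-- Otherwise add the attribute; "1.0" is an illustrative value -->
      <p:add-attribute match="/*" attribute-name="version" attribute-value="1.0"/>
    </p:otherwise>
  </p:choose>
</p:declare-step>
```

The pipeline is itself a step: it has a "source" input and a "result" output, and the steps inside it are connected implicitly in document order.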
7. Anatomy of an XProc Step
Step declaration
Type, input ports, output ports, options
When used in a pipeline, a step is a black box that
Expects zero or more XML documents on its input ports
Produces zero or more XML documents on its output ports
[Diagram: a Step box with a "source" input port and a "result" output port]
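As a sketch of what a step declaration can look like, the following declares a step with a type, one input port, one output port, and one option. The step type ex:rename-root and its namespace URI are hypothetical names chosen for illustration; only p:rename and its options come from the standard step library.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:ex="http://example.org/ns/steps"
                type="ex:rename-root" version="1.0">
  <!-- Input and output ports: documents flow in on "source", out on "result" -->
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Option: the caller must supply the new root element name -->
  <p:option name="new-name" required="true"/>

  <!-- Body: rename the root element using the option value -->
  <p:rename match="/*">
    <p:with-option name="new-name" select="$new-name"/>
  </p:rename>
</p:declare-step>
```

A pipeline that imports this declaration could then call <ex:rename-root new-name="book"/> without knowing how the step is implemented.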
11. Application Integration with XProc
XProc as the XML processing layer
Integration using standard XProc facilities
Interoperable
Integration using extensions
Implementation-dependent
Limited interoperability
12. XProc – Enabling Technology
Some XML standards depend on XML processing capabilities
XForms
The XRX architecture
XForms/REST/X......
End-to-end XML model
XProc is a natural fit
[Diagram: XForms, XML over HTTP, XProc, XML, Native XML DB]
13. Validate/XQuery/Transform Pipeline
XProc itself integrates multiple XML technologies
Easy to use, robust
Focus on WHAT, not on the low-level HOW
Better maintainability and customizability
[Diagram: p:validate-with-xml-schema -> p:xquery -> p:xslt]
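A rough sketch of such a pipeline, under stated assumptions: the file names schema.xsd and style.xsl and the inline query are placeholders invented for illustration, and the query is written to return a single element so that the following XSLT step receives one document.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step" version="1.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- 1. Validate the input; schema.xsd is a hypothetical file -->
  <p:validate-with-xml-schema>
    <p:input port="schema">
      <p:document href="schema.xsd"/>
    </p:input>
  </p:validate-with-xml-schema>

  <!-- 2. Query the validated document; the query text is illustrative -->
  <p:xquery>
    <p:input port="query">
      <p:inline>
        <c:query>element summary { count(//*) }</c:query>
      </p:inline>
    </p:input>
    <p:input port="parameters">
      <p:empty/>
    </p:input>
  </p:xquery>

  <!-- 3. Transform the query result; style.xsl is a hypothetical file -->
  <p:xslt>
    <p:input port="stylesheet">
      <p:document href="style.xsl"/>
    </p:input>
    <p:input port="parameters">
      <p:empty/>
    </p:input>
  </p:xslt>
</p:declare-step>
```

The pipeline states which operations run and in what order; parsing, serialization, and passing documents between steps are left to the processor.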
15. Executing external programs
p:exec
Program to execute
Command-line arguments
Working directory
Standard input/output
Support for non-XML data
Error handling
[Diagram: p:exec invoking an external command (cmd)]
16. Executing external programs – Example
Pipeline that counts the words in the input document
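<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <p:input port="source"/>
  <p:output port="result"/>
  <!-- wc -w reads the document's text from standard input and prints only the word count -->
  <p:exec command="/usr/bin/wc"
          args="-w"
          source-is-xml="false"
          result-is-xml="false"/>
</p:declare-step>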
17. Integration with REST Web Services
p:http-request
Request URL
Request method
Authentication
Headers
Multipart messages
[Diagram: p:http-request sends an HTTP request to a REST service and receives the response]
18. Integration with REST Web Services – Example
Pipeline that retrieves the Twitter public timeline
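<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step" version="1.0">
  <p:output port="result"/>
  <!-- The request is described by a c:request document on the source port -->
  <p:http-request>
    <p:input port="source">
      <p:inline>
        <c:request method="GET"
                   href="http://twitter.com/statuses/public_timeline.xml"/>
      </p:inline>
    </p:input>
  </p:http-request>
</p:declare-step>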
19. Custom Atomic Steps
Custom step that provides the integration logic
Implementation in the processor’s host language
Step declaration + import
Potentially not interoperable
p:step-available() XPath function
EXProc.org
http://exproc.org
[Diagram: ext:custom-step]
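One way to soften the interoperability problem is to guard the custom step with p:step-available() and fall back to a standard step. The sketch below assumes a hypothetical library file ext-library.xpl that declares ext:custom-step with one primary input and one primary output; the namespace URI is also an assumption.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:ext="http://example.org/ns/ext" version="1.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Hypothetical library containing the declaration of ext:custom-step -->
  <p:import href="ext-library.xpl"/>

  <p:choose>
    <!-- True only if this processor can actually run the step -->
    <p:when test="p:step-available('ext:custom-step')">
      <ext:custom-step/>
    </p:when>
    <!-- Fallback for processors without an implementation -->
    <p:otherwise>
      <p:identity/>
    </p:otherwise>
  </p:choose>
</p:declare-step>
```

The import makes the step type known to every processor, while the p:choose keeps the pipeline runnable where the host-language implementation is missing.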
21. Integration on the I/O level
Resource access in XProc is URI-based
XProc processors should support file and http(s)
Support for other URI schemes is implementation-defined
Transparent support for additional storage models/systems
[Diagram: p:load and p:store reading from and writing to a storage system]
22. Integration on the I/O Level - Example
Pipeline that loads a document from a native XML database
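<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <p:output port="result"/>
  <!-- The xhive: URI scheme is resolved by the processor, not by XProc itself -->
  <p:load href="xhive:/books/book24.xml"/>
</p:declare-step>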
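The same URI-based approach applies in the other direction with p:store. As a sketch, the following pipeline copies the document from the database to the local file system; the target path file:///tmp/book24.xml is an illustrative assumption, and support for the xhive: scheme remains implementation-defined.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <!-- Read from the native XML database via its URI scheme -->
  <p:load href="xhive:/books/book24.xml"/>

  <!-- Write the loaded document to a file; the path is a placeholder -->
  <p:store href="file:///tmp/book24.xml"/>
</p:declare-step>
```

Swapping the two href values would move a document from the file system into the database, provided the processor supports writing to that scheme.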
23. Other Types of Integration
XML data model integration
Persistent DOM-based native XML databases
Integration via other XML languages
XSLT and XQuery extensions
24. Conclusions
XProc deals with composing XML processes
Extensible by nature
Easy to integrate with the external world
Declarative code is easier to develop and maintain
25. Questions and Answers
Vojtech Toman
toman_vojtech@emc.com
http://www.emc.com
http://developer.emc.com/xmltech