Painless OO <-> XML
  with XML::Pastor
     (2009 remix)

   Joel Bernstein
  YAPC::EU 2009
It’s all Greek to me

      schema
      σχήμα (skhēma)
      shape, plan
How many of you?

• Use XML
• Hate XML
• Like XML
A Confession


• I do not like XML
• People use it wrong
XML Data Binding
• Binding XML documents to objects
  specifically designed for the data in
  those documents.
• I often ha...
XML is hard, right?
   Some hard things:

• Roundtripping data
• Manipulating XML via DOM API
• Preserving element sibling...
Typical horrendous XML document
Sales Order XML   Logical data model




                     XML DOM
I shouldn’t need to
   care about this
How this makes me feel:
Fundamental problem

• I don’t think in elements and attributes
• I think about my data, not how it’s stored
• This is Per...
Solution
Tools should make both the syntax and the details of
         the manipulation of XML invisible
Do you write XML

• By hand?
• Programmatically?
• Schemata?
• Validation?
• Transformation?
XML::Pastor is for
  all of you.
XML::Pastor

• Available on CPAN
• Abstracts away some of the pain of XML
• Ayhan Ulusoy is the author
• I am just a user
What does it do?

• Generates Perl code from W3C XML
  Schema (XSD)
• Roundtrip and validate XML to/from Perl
  without lo...
pastorize

• Automates codegen process
• Conceptually similar to DBIC::Schema::Loader
• TMTOWTDI - offline or runtime
• Wor...
pastorize in use
pastorize   --mode offline --style multiple 
            --destination /tmp/lib/perl     
            --c...
Very simple contrived
 Album XML demo
Album XML document
Album XML schema
Pastorize the Album XML schema:




    Resulting code tree like:
Modify some XML
Roundtrip and modify XML data using Pastor:




                               # Load XML
                                ...
The result!
Real world Pastor
$HASH1 = {
      1 => 'Vodafone UK',
      2 => 'O2 UK',
      3 => 'Orange UK',
      4 => 'T-Mobile UK',
      8 => 'Hut...
Country XML
Dynamic schema parsing of Country XML
Query the Country object
Modify elements and attributes
     with uniform syntax
Manipulate array-like data
Create new City data and
combine with existing Country object
Validate modified data
against the stored schema
Turn Pastor objects back into XML, or
  transform to XML::LibXML DOM
Parsing with Pastor

• Parse entire XML into XML::LibXML::DOM
  object
• Convert XML DOM tree into native Perl
  objects
•...
Reasons to not use
      XML::Pastor
• When you have no XML Schema
 • Although several tools can infer XML
    schemata fr...
XML::Pastor Scope

• Good for “data XML”
• Unsuitable for “mixed markup”
 • e.g. XHTML
• Unsuitable for “huge” documents
XML::Pastor
    known limitations
• Mixed elements unsupported
• Substitution groups unsupported
• ‘any’ and ‘anyAttribute...
Other XML modules
•   XML::Twig

•   XML::Compile

•   XML::Simple

•   XML::Smart
XML::Twig
• Manipulates XML directly
 • Using code is coupled closely to
    document structure
• Optimised for processing...
XML::Compile
• Original design rationale is to deal with
  SOAP envelopes and WSDL documents
• Different approach but simi...
XML::Compile pt. 2

• Schema support incomplete
 • Shaky support for imports, includes
    • Include restriction on target...
XML::Simple
• Working roundtrip binding for simple cases
 • e.g. XMLout(XMLin($file))
    works
• Simple API
• Produces si...
XML::Simple pt. 2

• No schemata, no validation
• Can be teamed with a SAX parser
• More suitable for configuration files?
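
A commonly cited XML::Simple gotcha is element multiplicity: with default options, a child element that occurs once comes back as a single value, while the same element occurring twice comes back as an array reference, so calling code can break when the data grows. The sketch below illustrates the shape change in core Perl only. It does not load XML::Simple; the two structures are written by hand in the shape XMLin() is understood to return by default, and as_list is a hypothetical helper name, not part of any module.

```perl
use strict;
use warnings;

# Hand-written structures in the shape XML::Simple's XMLin() returns with
# default options (assumption: XML::Simple itself is not loaded here).
# Document: <albums><album><title>A</title></album></albums>
my $one = { album => { title => 'A' } };
# The same document with a second <album> element added.
my $two = { album => [ { title => 'A' }, { title => 'B' } ] };

# Hypothetical helper: always return a list, whatever the multiplicity.
sub as_list {
    my ($value) = @_;
    return ()      unless defined $value;
    return @$value if ref $value eq 'ARRAY';
    return ($value);
}

for my $doc ($one, $two) {
    my @titles = map { $_->{title} } as_list($doc->{album});
    print scalar(@titles), " album(s): @titles\n";
}
```

XML::Simple also has a ForceArray option that avoids the shape change at parse time. A schema-driven binding sidesteps the problem differently, because the schema states up front which elements may repeat.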
XML::Smart

• Similar implementation to XML::Pastor
• Uses tie() and lots of crac^H^H^H^Hmagic
• Gathers structure informa...
XML::Smart pt. 2
• No schemata, so no schema validation
• Based on Object::MultiType - overloaded
  objects as HASH, ARRAY...
Any questions?
Thanks for coming
            See you next year
http://search.cpan.org/dist/XML-Pastor/
Bonus Material
  If we have enough time
XML::Pastor Supported
XML Schema Features
• Simple and Complex Types
• Global Elements
• Groups, Attributes, AttributeGrou...
XML Schema Inference
• Create an XML schema from an XML
  document instance
• Every document has an (implicit) schema
• To...
Simple D::HA object
Rekeying data
Rekeying data deeper
Warning, boring bit
XML::Pastor
    Code Generation
• Write out static code to tree of .pm files
• Write out static code to single .pm file
• Cr...
How Pastor works
    Code generation
• Parse schemata into schema model
 • Perl data structures containing all the
    glo...
How Pastor works
   Generated classes
• Each generated class (i.e. type) has classdata
  “XmlSchemaType” containing schema...
How Pastor works
        In use
• If classes generated offline, then “use”
  them, if online then they are already loaded
•...
Thanks for coming
            See you next year
http://search.cpan.org/dist/XML-Pastor/

Painless OO XML with XML::Pastor - 2009 Remix

How to build Perl classes with roundtrip data binding to XML, painlessly, using W3C XML Schema and XML::Pastor

Slides from a previous revision of this talk are online at:
http://www.slideshare.net/joelbernstein/painless-oo-xml-with-xmlpastorq-presentation/

I will be presenting an expanded, more practical, 2009 version of this talk. Now with more code and less theory!

- XML is hard, right? Some things which are hard.
- XML data binding
- Comparisons of modules
- XML::Twig
- XML::Smart
- XML::Simple
- XML::Pastor
- Pastor howto
- XML schema inference
- Trang, Relaxer
- Relaxer howto
- The future?

For more information on XML::Pastor see:
http://search.cpan.org/~aulusoy/XML-Pastor/

Relaxer download:
http://www.relaxer.jp/download/relaxer-1.0.zip

Relaxer book (Japanese...):
http://www.amazon.co.jp/exec/obidos/ASIN/4894715279/

Trang:
http://www.thaiopensource.com/download/trang-20030619.zip

Painless OO XML with XML::Pastor - 2009 Remix

  1. Painless OO <-> XML with XML::Pastor (2009 remix) Joel Bernstein YAPC::EU 2009
  2. It’s all Greek to me schema σχήμα (skhēma) shape, plan
  3. How many of you?
  4. How many of you? • Use XML
  5. How many of you? • Use XML • Hate XML
  6. How many of you? • Use XML • Hate XML • Like XML
  7. A Confession • I do not like XML • People use it wrong
  8. XML Data Binding • Binding XML documents to objects specifically designed for the data in those documents. • I often have to do this.
  9. XML is hard, right? Some hard things: • Roundtripping data • Manipulating XML via DOM API • Preserving element sibling order, comments, XML entities etc.
  10. Typical horrendous XML document
  11. Sales Order XML Logical data model XML DOM
  12. I shouldn’t need to care about this
  13. How this makes me feel:
  14. Fundamental problem • I don’t think in elements and attributes • I think about my data, not how it’s stored • This is Perl. DWIM.
  15. Solution Tools should make both the syntax and the details of the manipulation of XML invisible
  16. Do you write XML
  17. Do you write XML • By hand?
  18. Do you write XML • By hand? • Programmatically?
  19. Do you write XML • By hand? • Programmatically? • Schemata?
  20. Do you write XML • By hand? • Programmatically? • Schemata? • Validation?
  21. Do you write XML • By hand? • Programmatically? • Schemata? • Validation? • Transformation?
  22. XML::Pastor is for all of you.
  23. XML::Pastor • Available on CPAN • Abstracts away some of the pain of XML • Ayhan Ulusoy is the author • I am just a user
  24. What does it do? • Generates Perl code from W3C XML Schema (XSD) • Roundtrip and validate XML to/from Perl without loss of schema information • Lets you program without caring about XML structure
  25. pastorize • Automates codegen process • Conceptually similar to DBIC::Schema::Loader • TMTOWTDI - offline or runtime • Works on multiple XSDs (caveat, collisions)
  26. pastorize in use pastorize --mode offline --style multiple --destination /tmp/lib/perl --class_prefix MyApp::Data /some/path/to/schema.xsd
  27. Very simple contrived Album XML demo
  28. Album XML document
  29. Album XML schema
  30. Pastorize the Album XML schema: Resulting code tree like:
  31. Modify some XML
  32. Roundtrip and modify XML data using Pastor: # Load XML # Accessors # Modify # Write XML
  33. The result!
  34. Real world Pastor
  35. Real world Pastor $HASH1 = { 1 => 'Vodafone UK', 2 => 'O2 UK', 3 => 'Orange UK', 4 => 'T-Mobile UK', 8 => 'Hutchison 3 UK' };
  36. Country XML
  37. Dynamic schema parsing of Country XML
  38. Query the Country object
  39. Modify elements and attributes with uniform syntax
  40. Manipulate array-like data
  41. Create new City data and combine with existing Country object
  42. Validate modified data against the stored schema
  43. Turn Pastor objects back into XML, or transform to XML::LibXML DOM
  44. Parsing with Pastor • Parse entire XML into XML::LibXML::DOM object • Convert XML DOM tree into native Perl objects • Throw away DOM, no longer needed
  45. Reasons to not use XML::Pastor • When you have no XML Schema • Although several tools can infer XML schemata from documents • It’s a code-generator • No stream parsing
  46. XML::Pastor Scope • Good for “data XML” • Unsuitable for “mixed markup” • e.g. XHTML • Unsuitable for “huge” documents
  47. XML::Pastor known limitations • Mixed elements unsupported • Substitution groups unsupported • ‘any’ and ‘anyAttribute’ elements unsupported • Encodings (only UTF-8 officially supported) • Default values for attributes - help needed
  48. Other XML modules • XML::Twig • XML::Compile • XML::Simple • XML::Smart
  49. XML::Twig • Manipulates XML directly • Using code is coupled closely to document structure • Optimised for processing huge documents as trees • No schemata, no validation
  50. XML::Compile • Original design rationale is to deal with SOAP envelopes and WSDL documents • Different approach but similar goals to Pastor - processes XML based on XSD into Perl data structures • More like XML::Simple with Schema support
  51. XML::Compile pt. 2 • Schema support incomplete • Shaky support for imports, includes • Include restriction on targetNamespace • I haven’t used it yet but it looks good
  52. XML::Simple • Working roundtrip binding for simple cases • e.g. XMLout(XMLin($file)) works • Simple API • Produces single deep data structure • Gotchas with element multiplicity
  53. XML::Simple pt. 2 • No schemata, no validation • Can be teamed with a SAX parser • More suitable for configuration files?
  54. XML::Smart • Similar implementation to XML::Pastor • Uses tie() and lots of crac^H^H^H^Hmagic • Gathers structure information from XML instance, rather than schema • No code generation!
  55. XML::Smart pt. 2 • No schemata, so no schema validation • Based on Object::MultiType - overloaded objects as HASH, ARRAY, SCALAR, CODE & GLOB • Like Pastor, overloads array/hashref access to the data - promotes decoupling • Reasonable docs, some community growing
  56. Any questions?
  57. Thanks for coming See you next year http://search.cpan.org/dist/XML-Pastor/
  58. Bonus Material If we have enough time
  59. XML::Pastor Supported XML Schema Features • Simple and Complex Types • Global Elements • Groups, Attributes, AttributeGroups • Derive simpleTypes by extension • Derive complexTypes by restriction • W3C built-in Types, Unions, Lists • (Most) Restriction Facets for Simple types • External Schema import, include, redefine
  60. XML Schema Inference • Create an XML schema from an XML document instance • Every document has an (implicit) schema • Tools like Relaxer and Trang, as well as the System.Xml.Serializer in the .NET Framework, can all infer XML Schemata from document instances
  61. Simple D::HA object
  62. Rekeying data
  63. Rekeying data deeper
  64. Warning, boring bit
  65. XML::Pastor Code Generation • Write out static code to tree of .pm files • Write out static code to single .pm file • Create code in a scalar in memory • Create code and eval() it for use
  66. How Pastor works Code generation • Parse schemata into schema model • Perl data structures containing all the global elements, types, attributes, ... • “Resolve” Model - determine class names, resolve references, etc • Create boilerplate code, write out / eval
  67. How Pastor works Generated classes • Each generated class (i.e. type) has classdata “XmlSchemaType” containing schema model • If the class isa SimpleType it may contain restriction facets • If the class isa ComplexType it will contain info about child elements and attributes
  68. How Pastor works In use • If classes generated offline, then “use” them, if online then they are already loaded • These classes have methods to create, retrieve, save object to/from XML • Manipulate/query data using OO API to complexType fields • Validate modified objects against schema
  69. Thanks for coming See you next year http://search.cpan.org/dist/XML-Pastor/
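
The code generation steps listed under "How Pastor works" (build a model, create boilerplate code in a scalar, then write it out or eval it) can be illustrated with a toy generator in core Perl. This is a minimal sketch of the general technique, not XML::Pastor's actual implementation: the %model hash is written by hand rather than parsed from an XSD, and the names MyApp::Data::Album, generate_class_source and @FIELDS are assumptions made up for the example.

```perl
use strict;
use warnings;

# Hypothetical, hand-written "schema model": one entry per complex type,
# listing its fields. A real tool would build this by parsing an XSD.
my %model = (
    'MyApp::Data::Album' => { fields => [qw(title artist year)] },
);

# Turn one model entry into Perl source code held in a scalar.
sub generate_class_source {
    my ($class, $info) = @_;
    my $fields = join ' ', @{ $info->{fields} };
    my $src = <<"END_SRC";
package $class;
use strict;
use warnings;

# Class data describing the type, loosely analogous to the
# "XmlSchemaType" classdata mentioned on the slides.
our \@FIELDS = qw($fields);

sub new {
    my (\$class, %args) = \@_;
    return bless { %args }, \$class;
}
END_SRC
    # One combined getter/setter per field.
    for my $field (@{ $info->{fields} }) {
        $src .= <<"END_ACC";

sub $field {
    my \$self = shift;
    \$self->{$field} = shift if \@_;
    return \$self->{$field};
}
END_ACC
    }
    return $src . "\n1;\n";
}

# "Online" use: generate the source and eval it straight away.
for my $class (sort keys %model) {
    my $source = generate_class_source($class, $model{$class});
    eval $source or die "Could not compile $class: $@";
}

my $album = MyApp::Data::Album->new(title => 'Kind of Blue', artist => 'Miles Davis');
$album->year(1959);
printf "%s (%s, %d)\n", $album->title, $album->artist, $album->year;
```

Writing $source to a .pm file instead of passing it to eval would correspond to the offline mode, where the generated classes are loaded later with "use".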
