Painless OO <-> XML
    with XML::Pastor


Joel Bernstein - LPW 2008
It’s all Greek to me

      schema (pl. schemata)
      σχήμα (skhēma)
      shape, plan
I do not like XML
  People use it wrong

• Apple Property Lists
• Tag soup
• Data transfer format vs data storage format
How many of you?

• Use XML
• Hate XML
• Like XML
Do you write XML

• By hand?
• Programmatically?
• Schemata?
• Validation?
• Transformation?
XML::Pastor is for
  all of you.
XML is hard, right?
   Some hard things:

• Roundtripping data
• Manipulating XML via DOM API
• Preserving element sibling order, comments,
  XML entities etc.
Solution
Tools should make both the syntax and the details of
         the manipulation of XML invisible
XML::Pastor

• I didn’t write it
• Written by Ayhan Ulusoy
• Available on CPAN
• Abstracts away some of the pain of XML
What does it do?

• Generates Perl code from W3C XML
  Schema (XSD)
• Roundtrip and validate XML to/from Perl
  without loss of schema information
• Lets you program without caring about XML
  structure
Parsing with Pastor

• Parse entire XML into XML::LibXML::DOM
  object
• Convert XML DOM tree into native Perl
  objects
• Throw away DOM, no longer needed
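
A rough Python analogy of the parse, convert, discard sequence on this slide, using only the standard library. The Album and Track classes and the sample document are invented for illustration; this is not XML::Pastor's API, only the general pattern.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List

# Hypothetical sample document, not taken from the talk.
ALBUM_XML = """<album title="Demo" year="2008">
  <track number="1">First song</track>
  <track number="2">Second song</track>
</album>"""


@dataclass
class Track:
    number: int
    name: str


@dataclass
class Album:
    title: str
    year: int
    tracks: List[Track] = field(default_factory=list)


def album_from_xml(text: str) -> Album:
    # Step 1: parse the whole document into a tree (the "DOM").
    root = ET.fromstring(text)
    # Step 2: copy the data out of the tree into plain objects.
    album = Album(
        title=root.get("title"),
        year=int(root.get("year")),
        tracks=[Track(int(t.get("number")), t.text) for t in root.findall("track")],
    )
    # Step 3: the result holds no reference to the tree, so it can be discarded.
    return album


album = album_from_xml(ALBUM_XML)
print(album.title, [t.name for t in album.tracks])
```

Because the whole tree is built before conversion, this approach shares the "no stream parsing" trade-off the talk mentions for Pastor.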
Reasons to not use
      XML::Pastor
• When you have no XML Schema
 • Although several tools can infer XML
    schemata from documents
• It’s a code-generator
• No stream parsing
XML::Pastor
    Code Generation
• Write out static code to tree of .pm files
• Write out static code to single .pm file
• Create code in a scalar in memory
• Create code and eval() it for use
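
The in-memory and eval() modes can be pictured with a small Python sketch. The template, function names and field list below are assumptions made up for this example and say nothing about how Pastor lays out its generated code.

```python
# Hypothetical class template; a real generator would emit far more than this.
CLASS_TEMPLATE = '''class {name}:
    """Generated from schema type {name!r}."""
    FIELDS = {fields!r}

    def __init__(self, **values):
        for field in self.FIELDS:
            setattr(self, field, values.get(field))
'''


def generate_source(name, fields):
    # "Create code in a scalar in memory": here the scalar is a Python string.
    return CLASS_TEMPLATE.format(name=name, fields=tuple(fields))


def generate_and_load(name, fields):
    # "Create code and eval() it for use": execute the string, return the class.
    namespace = {}
    exec(generate_source(name, fields), namespace)
    return namespace[name]


source = generate_source("Album", ["title", "year"])
Album = generate_and_load("Album", ["title", "year"])
album = Album(title="Demo", year=2008)
print(album.title, album.year)
```

Writing `source` to one file, or to one file per class, would correspond to the two static modes on the slide.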
Warning, boring bit
How Pastor works
    Code generation
• Parse schemata into schema model
 • Perl data structures containing all the
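    global elements, types, attributes, ...
• “Resolve” Model - determine class names,
  resolve references, etc
• Create boilerplate code, write out / eval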
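
A minimal sketch of the first two stages (schema to model, then "resolve"), in Python with the standard library. The tiny schema, the dictionary layout and the `MyApp.Data.` prefix are all invented for illustration; Pastor's real model is richer and written in Perl.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema with one global element and one complex type.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="album" type="AlbumType"/>
  <xs:complexType name="AlbumType">
    <xs:sequence>
      <xs:element name="track" type="xs:string" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="title" type="xs:string"/>
  </xs:complexType>
</xs:schema>"""


def build_model(schema_text):
    # Stage 1: collect global elements and named complex types into plain dicts.
    root = ET.fromstring(schema_text)
    model = {"elements": {}, "types": {}}
    for el in root.findall(XS + "element"):
        model["elements"][el.get("name")] = {"type": el.get("type")}
    for ct in root.findall(XS + "complexType"):
        model["types"][ct.get("name")] = {
            "children": [e.get("name") for e in ct.iter(XS + "element")],
            "attributes": [a.get("name") for a in ct.findall(XS + "attribute")],
        }
    return model


def resolve(model, class_prefix="MyApp.Data."):
    # Stage 2: give every type a class name, then point each global element
    # at the class of the type it references.
    for name, info in model["types"].items():
        info["class"] = class_prefix + name
    for info in model["elements"].values():
        info["class"] = model["types"][info["type"]]["class"]
    return model


model = resolve(build_model(SCHEMA))
print(model["elements"]["album"]["class"])
```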
How Pastor works
Code Generation pt. 2
How Pastor works
   Generated classes
• Each generated class (i.e. type) has classdata
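  “XmlSchemaType” containing schema model
• If the class isa SimpleType it may contain
  restriction facets
• If the class isa ComplexType it will contain
  info about child elements and attributes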
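
The idea of class-level schema data can be sketched in Python as a class attribute that each generated class overrides. The SCHEMA_TYPE name, the facet keys handled and the CountryCode and City classes are assumptions for this example, not Pastor's data layout.

```python
import re


class SimpleType:
    # Class-level metadata, playing the role of the "XmlSchemaType" classdata.
    SCHEMA_TYPE = {"name": "string", "facets": {}}

    def __init__(self, value):
        self.value = value

    def is_valid(self):
        # Check the value against whichever restriction facets the class declares.
        facets = self.SCHEMA_TYPE["facets"]
        text = str(self.value)
        if "maxLength" in facets and len(text) > facets["maxLength"]:
            return False
        if "pattern" in facets and not re.fullmatch(facets["pattern"], text):
            return False
        if "enumeration" in facets and text not in facets["enumeration"]:
            return False
        return True


class CountryCode(SimpleType):
    # Hypothetical simple type restricted by pattern and length.
    SCHEMA_TYPE = {
        "name": "CountryCode",
        "facets": {"pattern": "[A-Z]{2}", "maxLength": 2},
    }


class ComplexType:
    SCHEMA_TYPE = {"name": None, "elements": {}, "attributes": {}}


class City(ComplexType):
    # Hypothetical complex type: the metadata lists child elements and attributes.
    SCHEMA_TYPE = {
        "name": "City",
        "elements": {"name": "string", "population": "integer"},
        "attributes": {"code": "string"},
    }


print(CountryCode("GB").is_valid(), CountryCode("gbr").is_valid())
```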
How Pastor works
        In use
• If classes generated offline, then “use”
  them, if online then they are already loaded
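• These classes have methods to create,
  retrieve, save object to/from XML
• Manipulate/query data using OO API to
  complexType fields
• Validate modified objects against schema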
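
To make the create / retrieve / modify / validate / save cycle concrete, here is a hand-written Python stand-in for one generated class. The Country class, its method names and its "required fields" check are hypothetical; real Pastor classes are generated from the schema and validate against it.

```python
import xml.etree.ElementTree as ET


class Country:
    # Stand-in for schema knowledge: fields that must be present.
    REQUIRED = ("code", "name")

    def __init__(self, code=None, name=None, cities=None):
        self.code = code
        self.name = name
        self.cities = list(cities or [])

    @classmethod
    def from_xml(cls, text):
        # Retrieve: build the object from an XML string.
        root = ET.fromstring(text)
        return cls(
            root.get("code"),
            root.findtext("name"),
            [c.text for c in root.findall("city")],
        )

    def to_xml(self):
        # Save: serialise the object back to an XML string.
        root = ET.Element("country", {"code": self.code})
        ET.SubElement(root, "name").text = self.name
        for city in self.cities:
            ET.SubElement(root, "city").text = city
        return ET.tostring(root, encoding="unicode")

    def is_valid(self):
        return all(getattr(self, f) for f in self.REQUIRED)


country = Country.from_xml(
    '<country code="gb"><name>United Kingdom</name><city>London</city></country>'
)
country.cities.append("Leeds")   # manipulate through ordinary attributes
ok = country.is_valid()          # validate before writing back out
xml_out = country.to_xml()
print(ok, xml_out)
```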
Very simple Album
   XML demo
Album XML document
Album XML schema
Pastorize creates Perl classes from
       Album XML schema:



     Resulting code tree like:
Roundtrip and modify XML data using Pastor:
The result!
Real world Pastor
Moose::Role for Pastor
Country XML
Dynamic XML::Pastor usage
Query the Country object
Modify elements and attributes
     with uniform syntax
NodeArray syntax
Create new City data and
combine with existing Country object
Validate modified data
against the stored schema
Turn Pastor objects back into XML, or
  transform to XML::LibXML DOM
Simple D::HA object
Rekeying data
Rekeying data deeper
XML::Pastor Scope

• Good for “data XML”
• Unsuitable for “mixed markup”
 • e.g. XHTML
• Unsuitable for “huge” documents
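
A short Python illustration of why field-per-element binding suits "data XML" but not mixed markup. The as_fields helper and both sample documents are invented for this sketch.

```python
import xml.etree.ElementTree as ET

data_xml = ET.fromstring(
    "<city><name>London</name><population>8000000</population></city>"
)
mixed_xml = ET.fromstring("<p>Visit <b>London</b> in <i>spring</i>.</p>")


def as_fields(element):
    # One field per child element, which is how a simple binding sees a record.
    return {child.tag: child.text for child in element}


data_fields = as_fields(data_xml)
mixed_fields = as_fields(mixed_xml)
full_text = "".join(mixed_xml.itertext())

print(data_fields)    # every piece of data is captured
print(mixed_fields)   # the text between the child elements is lost
print(full_text)
```

In the mixed document the words around the inline elements belong to no field, so a roundtrip through such objects cannot reproduce the original.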
XML::Pastor Supported
XML Schema Features
• Simple and Complex Types
• Global Elements
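• Groups, Attributes, AttributeGroups
• Derive simpleTypes by extension
• Derive complexTypes by restriction
• W3C built-in Types, Unions, Lists
• (Most) Restriction Facets for Simple types
• External Schema import, include, redefine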
XML::Pastor
    known limitations
• Mixed elements unsupported
• Substitution groups unsupported
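• ‘any’ and ‘anyAttribute’ elements unsupported
• Encodings (only UTF-8 officially supported)
• Default values for attributes - help needed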
XML Data Binding

• Binding XML documents to objects
  specifically designed for the data in those
  documents
• Allows e.g. data-centric applications to
  manipulate data more naturally than by
  using DOM API
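
The contrast between DOM access and bound objects, sketched in Python with xml.dom.minidom. The order document and the Order class are hypothetical and are not the sales order example shown in the talk.

```python
from dataclasses import dataclass
from xml.dom.minidom import parseString

ORDER_XML = "<order><customer><name>Ann</name></customer><total>12.50</total></order>"

# DOM style: calling code walks nodes and has to know the document structure.
dom = parseString(ORDER_XML)
customer = dom.getElementsByTagName("customer")[0]
dom_name = customer.getElementsByTagName("name")[0].firstChild.data
dom_total = float(dom.getElementsByTagName("total")[0].firstChild.data)


# Binding style: the structure knowledge lives in one place, and the rest of
# the program works with typed attributes.
@dataclass
class Order:
    customer_name: str
    total: float

    @classmethod
    def from_dom(cls, doc):
        name = doc.getElementsByTagName("name")[0].firstChild.data
        total = doc.getElementsByTagName("total")[0].firstChild.data
        return cls(customer_name=name, total=float(total))


order = Order.from_dom(dom)
print(order.customer_name, order.total)
```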
Sales Order XML
Sales Order XML   Logical data model




                     XML DOM
XML DOM
How this makes me feel:
Other XML modules
•   XML::Twig

•   XML::Compile

•   XML::Simple

•   XML::Smart
XML::Twig
• Manipulates XML directly
 • Using code is coupled closely to
    document structure
• Optimised for processing huge documents as trees
• No schemata, no validation
XML::Compile
• Original design rationale is to deal with
  SOAP envelopes and WSDL documents
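• Different approach but similar goals to Pastor -
  processes XML based on XSD into Perl data
  structures
• More like XML::Simple with Schema support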
XML::Compile pt. 2

• Schema support incomplete
 • Shaky support for imports, includes
    • Include restriction on targetNamespace
• I haven’t used it yet but it looks good
XML::Simple
• Working roundtrip binding for simple cases
 • e.g. XMLout(XMLin($file))
    works
• Simple API
• Produces single deep data structure
• Gotchas with element multiplicity
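
The multiplicity gotcha is easy to reproduce with any schema-less converter. The naive_to_dict function below is a Python sketch written for this example, not XML::Simple's implementation; XML::Simple itself offers the ForceArray option to work around the problem.

```python
import xml.etree.ElementTree as ET


def naive_to_dict(element):
    # Leaf elements become their text. A child name seen once stays a scalar;
    # a child name seen again is promoted to a list.
    children = list(element)
    if not children:
        return element.text
    result = {}
    for child in children:
        value = naive_to_dict(child)
        if child.tag in result:
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


one = naive_to_dict(ET.fromstring("<album><track>A</track></album>"))
two = naive_to_dict(ET.fromstring("<album><track>A</track><track>B</track></album>"))
print(one)  # track is a plain string
print(two)  # track is a list
```

Code that consumes the result has to handle both shapes, because without a schema the converter cannot know that track may repeat.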
XML::Simple pt. 2

• No schemata, no validation
• Can be teamed with a SAX parser
• More suitable for configuration files?
XML::Smart

• Similar implementation to XML::Pastor
• Uses tie() and lots of crac^H^H^H^Hmagic
• Gathers structure information from XML
  instance, rather than schema
• No code generation!
XML::Smart pt. 2
• No schemata, so no schema validation
• Based on Object::MultiType - overloaded
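  objects as HASH, ARRAY, SCALAR, CODE & GLOB
• Like Pastor, overloads array/hashref access to
  the data - promotes decoupling
• Reasonable docs, some community growing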
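
Overloaded access of this kind can be imitated in Python with __getitem__. The Node class is a hypothetical sketch of the idea (one object answering to both name and index lookups) and does not reflect how XML::Smart or Object::MultiType are implemented.

```python
import xml.etree.ElementTree as ET


class Node:
    """Wraps one or more elements and answers to both key and index access."""

    def __init__(self, elements):
        self._elements = list(elements)

    def __getitem__(self, key):
        if isinstance(key, int):
            # Array-style access: pick the n-th of the wrapped elements.
            return Node([self._elements[key]])
        # Hash-style access: children with that name under the first element.
        return Node(self._elements[0].findall(key))

    def __len__(self):
        return len(self._elements)

    def __str__(self):
        # Scalar-style access: the text of the first wrapped element.
        return self._elements[0].text or ""


doc = Node([ET.fromstring("<country><city>London</city><city>Leeds</city></country>")])
first = str(doc["city"])
second = str(doc["city"][1])
count = len(doc["city"])
print(first, second, count)
```

The caller writes the same expression whether a name occurs once or many times, which is the decoupling the slide refers to.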
Any questions?
Thanks for coming
    See you next year
Bonus Material
  If we have enough time
XML Schema Inference
• Create an XML schema from an XML
  document instance
• Every document has an (implicit) schema
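• Tools like Relaxer, Trang, as well as the
  System.Xml.Serializer in the .NET Framework
  can all infer XML Schemata from document
  instances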
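
A toy version of schema inference in Python, to show what "every document has an implicit schema" means in practice. The infer_structure function and its output format are assumptions for this sketch; real tools such as Trang produce actual schema documents and handle far more cases.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def infer_structure(root):
    # Maps element name -> {"attributes": set of names,
    #                       "children": {child name: "one" or "many"}}
    structure = {}
    for element in root.iter():
        entry = structure.setdefault(
            element.tag, {"attributes": set(), "children": {}}
        )
        entry["attributes"].update(element.attrib)
        counts = Counter(child.tag for child in element)
        for tag, count in counts.items():
            seen_many = entry["children"].get(tag) == "many"
            entry["children"][tag] = "many" if count > 1 or seen_many else "one"
    return structure


doc = ET.fromstring(
    '<country code="gb"><name>UK</name><city>London</city><city>Leeds</city></country>'
)
inferred = infer_structure(doc)
print(inferred["country"])
```

A single instance only shows what happened to occur in it, so an inferred schema is a starting point that usually needs review by hand.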
Schema diff
An introduction to XML::Pastor, comparison with other modules etc
Painless OO XML with XML::Pastor

  1. Painless OO <-> XML with XML::Pastor Joel Bernstein - LPW 2008
  2. It’s all Greek to me schema (pl. schemata) σχήμα (skhēma) shape, plan
  3. I do not like XML People use it wrong • Apple Property Lists • Tag soup • Data transfer format vs data storage format
  4. How many of you? • Use XML • Hate XML • Like XML
  5. Do you write XML • By hand? • Programmatically? • Schemata? • Validation? • Transformation?
  6. XML::Pastor is for all of you.
  7. XML is hard, right? Some hard things: • Roundtripping data • Manipulating XML via DOM API • Preserving element sibling order, comments, XML entities etc.
  8. Solution Tools should make both the syntax and the details of the manipulation of XML invisible
  9. XML::Pastor • I didn’t write it • Written by Ayhan Ulusoy • Available on CPAN • Abstracts away some of the pain of XML
  10. What does it do? • Generates Perl code from W3C XML Schema (XSD) • Roundtrip and validate XML to/from Perl without loss of schema information • Lets you program without caring about XML structure
  11. Parsing with Pastor • Parse entire XML into XML::LibXML::DOM object • Convert XML DOM tree into native Perl objects • Throw away DOM, no longer needed
  12. Reasons to not use XML::Pastor • When you have no XML Schema • Although several tools can infer XML schemata from documents • It’s a code-generator • No stream parsing
  13. XML::Pastor Code Generation • Write out static code to tree of .pm files • Write out static code to single .pm file • Create code in a scalar in memory • Create code and eval() it for use
  14. Warning, boring bit
  15. How Pastor works Code generation • Parse schemata into schema model • Perl data structures containing all the global elements, types, attributes, ... • “Resolve” Model - determine class names, resolve references, etc • Create boilerplate code, write out / eval
  16. How Pastor works Code Generation pt. 2
  17. How Pastor works Generated classes • Each generated class (i.e. type) has classdata “XmlSchemaType” containing schema model • If the class isa SimpleType it may contain restriction facets • If the class isa ComplexType it will contain info about child elements and attributes
  18. How Pastor works In use • If classes generated offline, then “use” them, if online then they are already loaded • These classes have methods to create, retrieve, save object to/from XML • Manipulate/query data using OO API to complexType fields • Validate modified objects against schema
  19. Very simple Album XML demo
  20. Album XML document
  21. Album XML schema
  22. Pastorize creates Perl classes from Album XML schema: Resulting code tree like:
  23. Roundtrip and modify XML data using Pastor:
  24. The result!
  25. Real world Pastor
  26. Moose::Role for Pastor
  27. Country XML
  28. Dynamic XML::Pastor usage
  29. Query the Country object
  30. Modify elements and attributes with uniform syntax
  31. NodeArray syntax
  32. Create new City data and combine with existing Country object
  33. Validate modified data against the stored schema
  34. Turn Pastor objects back into XML, or transform to XML::LibXML DOM
  35. Simple D::HA object
  36. Rekeying data
  37. Rekeying data deeper
  38. XML::Pastor Scope • Good for “data XML” • Unsuitable for “mixed markup” • e.g. XHTML • Unsuitable for “huge” documents
  39. XML::Pastor Supported XML Schema Features • Simple and Complex Types • Global Elements • Groups, Attributes, AttributeGroups • Derive simpleTypes by extension • Derive complexTypes by restriction • W3C built-in Types, Unions, Lists • (Most) Restriction Facets for Simple types • External Schema import, include, redefine
  40. XML::Pastor known limitations • Mixed elements unsupported • Substitution groups unsupported • ‘any’ and ‘anyAttribute’ elements unsupported • Encodings (only UTF-8 officially supported) • Default values for attributes - help needed
  41. XML Data Binding • Binding XML documents to objects specifically designed for the data in those documents • Allows e.g. data-centric applications to manipulate data more naturally than by using DOM API
  42. Sales Order XML
  43. Sales Order XML Logical data model XML DOM
  44. XML DOM
  45. How this makes me feel:
  46. Other XML modules • XML::Twig • XML::Compile • XML::Simple • XML::Smart
  47. XML::Twig • Manipulates XML directly • Using code is coupled closely to document structure • Optimised for processing huge documents as trees • No schemata, no validation
  48. XML::Compile • Original design rationale is to deal with SOAP envelopes and WSDL documents • Different approach but similar goals to Pastor - processes XML based on XSD into Perl data structures • More like XML::Simple with Schema support
  49. XML::Compile pt. 2 • Schema support incomplete • Shaky support for imports, includes • Include restriction on targetNamespace • I haven’t used it yet but it looks good
  50. XML::Simple • Working roundtrip binding for simple cases • e.g. XMLout(XMLin($file)) works • Simple API • Produces single deep data structure • Gotchas with element multiplicity
  51. XML::Simple pt. 2 • No schemata, no validation • Can be teamed with a SAX parser • More suitable for configuration files?
  52. XML::Smart • Similar implementation to XML::Pastor • Uses tie() and lots of crac^H^H^H^Hmagic • Gathers structure information from XML instance, rather than schema • No code generation!
  53. XML::Smart pt. 2 • No schemata, so no schema validation • Based on Object::MultiType - overloaded objects as HASH, ARRAY, SCALAR, CODE & GLOB • Like Pastor, overloads array/hashref access to the data - promotes decoupling • Reasonable docs, some community growing
  54. Any questions?
  55. Thanks for coming See you next year
  56. Bonus Material If we have enough time
  57. XML Schema Inference • Create an XML schema from an XML document instance • Every document has an (implicit) schema • Tools like Relaxer, Trang, as well as the System.Xml.Serializer in the .NET Framework can all infer XML Schemata from document instances
  58. Schema diff
