Painless OO XML with XML::Pastor - 2009 Remix

How to build Perl classes with roundtrip data binding to XML, painlessly, using W3C XML Schema and XML::Pastor

Slides from a previous revision of this talk are online at:
http://www.slideshare.net/joelbernstein/painless-oo-xml-with-xmlpastorq-presentation/

I will be presenting an expanded, more practical, 2009 version of this talk. Now with more code and less theory!

- XML is hard, right? Some things which are hard.
- XML data binding
- Comparisons of modules
- XML::Twig
- XML::Smart
- XML::Simple
- XML::Pastor
- Pastor howto
- XML schema inference
- Trang, Relaxer
- Relaxer howto
- The future?

For more information on XML::Pastor see:
http://search.cpan.org/~aulusoy/XML-Pastor/

Relaxer download:
http://www.relaxer.jp/download/relaxer-1.0.zip

Relaxer book (Japanese...):
http://www.amazon.co.jp/exec/obidos/ASIN/4894715279/

Trang:
http://www.thaiopensource.com/download/trang-20030619.zip

Transcript

  • 1. Painless OO <-> XML with XML::Pastor (2009 remix) Joel Bernstein YAPC::EU 2009
  • 2. It’s all Greek to me schema σχήμα (skhēma) shape, plan
  • 3. How many of you?
  • 4. How many of you? • Use XML
  • 5. How many of you? • Use XML • Hate XML
  • 6. How many of you? • Use XML • Hate XML • Like XML
  • 7. A Confession • I do not like XML • People use it wrong
  • 8. XML Data Binding • Binding XML documents to objects specifically designed for the data in those documents. • I often have to do this.
  • 9. XML is hard, right? Some hard things: • Roundtripping data • Manipulating XML via DOM API • Preserving element sibling order, comments, XML entities etc.
  • 10. Typical horrendous XML document
  • 11. Sales Order XML Logical data model XML DOM
  • 12. I shouldn’t need to care about this
  • 13. How this makes me feel:
  • 14. Fundamental problem • I don’t think in elements and attributes • I think about my data, not how it’s stored • This is Perl. DWIM.
  • 15. Solution Tools should make both the syntax and the details of the manipulation of XML invisible
  • 16. Do you write XML
  • 17. Do you write XML • By hand?
  • 18. Do you write XML • By hand? • Programmatically?
  • 19. Do you write XML • By hand? • Programmatically? • Schemata?
  • 20. Do you write XML • By hand? • Programmatically? • Schemata? • Validation?
  • 21. Do you write XML • By hand? • Programmatically? • Schemata? • Validation? • Transformation?
  • 22. XML::Pastor is for all of you.
  • 23. XML::Pastor • Available on CPAN • Abstracts away some of the pain of XML • Ayhan Ulusoy is the author • I am just a user
  • 24. What does it do? • Generates Perl code from W3C XML Schema (XSD) • Roundtrip and validate XML to/from Perl without loss of schema information • Lets you program without caring about XML structure
  • 25. pastorize • Automates codegen process • Conceptually similar to DBIC::Schema::Loader • TMTOWTDI - offline or runtime • Works on multiple XSDs (caveat, collisions)
  • 26. pastorize in use: pastorize --mode offline --style multiple --destination /tmp/lib/perl --class_prefix MyApp::Data /some/path/to/schema.xsd
  • 27. Very simple contrived Album XML demo
  • 28. Album XML document
  • 29. Album XML schema
  • 30. Pastorize the Album XML schema: Resulting code tree like:
  • 31. Modify some XML
  • 32. Roundtrip and modify XML data using Pastor: # Load XML # Accessors # Modify # Write XML
  • 33. The result!
  • 34. Real world Pastor
  • 35. Real world Pastor $HASH1 = { 1 => 'Vodafone UK', 2 => 'O2 UK', 3 => 'Orange UK', 4 => 'T-Mobile UK', 8 => 'Hutchinson 3 UK' };
  • 36. Country XML
  • 37. Dynamic schema parsing of Country XML
  • 38. Query the Country object
  • 39. Modify elements and attributes with uniform syntax
  • 40. Manipulate array-like data
  • 41. Create new City data and combine with existing Country object
  • 42. Validate modified data against the stored schema
  • 43. Turn Pastor objects back into XML, or transform to XML::LibXML DOM
  • 44. Parsing with Pastor • Parse entire XML into XML::LibXML::DOM object • Convert XML DOM tree into native Perl objects • Throw away DOM, no longer needed
  • 45. Reasons to not use XML::Pastor • When you have no XML Schema • Although several tools can infer XML schemata from documents • It’s a code-generator • No stream parsing
  • 46. XML::Pastor Scope • Good for “data XML” • Unsuitable for “mixed markup” • e.g. XHTML • Unsuitable for “huge” documents
  • 47. XML::Pastor known limitations • Mixed elements unsupported • Substitution groups unsupported • ‘any’ and ‘anyAttribute’ elements unsupported • Encodings (only UTF-8 officially supported) • Default values for attributes - help needed
  • 48. Other XML modules • XML::Twig • XML::Compile • XML::Simple • XML::Smart
  • 49. XML::Twig • Manipulates XML directly • Using code is coupled closely to document structure • Optimised for processing huge documents as trees • No schemata, no validation
  • 50. XML::Compile • Original design rationale is to deal with SOAP envelopes and WSDL documents • Different approach but similar goals to Pastor - processes XML based on XSD into Perl data structures • More like XML::Simple with Schema support
  • 51. XML::Compile pt. 2 • Schema support incomplete • Shaky support for imports, includes • Include restriction on targetNamespace • I haven’t used it yet but it looks good
  • 52. XML::Simple • Working roundtrip binding for simple cases • e.g. XMLout(XMLin($file)) works • Simple API • Produces single deep data structure • Gotchas with element multiplicity
  • 53. XML::Simple pt. 2 • No schemata, no validation • Can be teamed with a SAX parser • More suitable for configuration files?
  • 54. XML::Smart • Similar implementation to XML::Pastor • Uses tie() and lots of crac^H^H^H^Hmagic • Gathers structure information from XML instance, rather than schema • No code generation!
  • 55. XML::Smart pt. 2 • No schemata, so no schema validation • Based on Object::MultiType - overloaded objects as HASH, ARRAY, SCALAR, CODE & GLOB • Like Pastor, overloads array/hashref access to the data - promotes decoupling • Reasonable docs, some community growing
  • 56. Any questions?
  • 57. Thanks for coming See you next year http://search.cpan.org/dist/XML-Pastor/
  • 58. Bonus Material If we have enough time
  • 59. XML::Pastor Supported XML Schema Features • Simple and Complex Types • Global Elements • Groups, Attributes, AttributeGroups • Derive simpleTypes by extension • Derive complexTypes by restriction • W3C built-in Types, Unions, Lists • (Most) Restriction Facets for Simple types • External Schema import, include, redefine
  • 60. XML Schema Inference • Create an XML schema from an XML document instance • Every document has an (implicit) schema • Tools like Relaxer, Trang, as well as the System.Xml.Serializer of the .NET Framework can all infer XML Schemata from document instances
  • 61. Simple D::HA object
  • 62. Rekeying data
  • 63. Rekeying data deeper
  • 64. Warning, boring bit
  • 65. XML::Pastor Code Generation • Write out static code to tree of .pm files • Write out static code to single .pm file • Create code in a scalar in memory • Create code and eval() it for use
  • 66. How Pastor works Code generation • Parse schemata into schema model • Perl data structures containing all the global elements, types, attributes, ... • “Resolve” Model - determine class names, resolve references, etc • Create boilerplate code, write out / eval
  • 67. How Pastor works Generated classes • Each generated class (i.e. type) has classdata “XmlSchemaType” containing schema model • If the class isa SimpleType it may contain restriction facets • If the class isa ComplexType it will contain info about child elements and attributes
  • 68. How Pastor works In use • If classes generated offline, then “use” them, if online then they are already loaded • These classes have methods to create, retrieve, save object to/from XML • Manipulate/query data using OO API to complexType fields • Validate modified objects against schema
  • 69. Thanks for coming See you next year http://search.cpan.org/dist/XML-Pastor/
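
Slides 65 and 66 describe generating code from a schema model, holding it in a scalar, and eval()ing it for use. The following is a minimal sketch of that general technique in core Perl only. It is not XML::Pastor's actual implementation: the %model hash, the templates, and generate_class_source are hypothetical names invented for illustration, and the hard-coded model stands in for one that would normally be parsed from an XSD.

```perl
use strict;
use warnings;

# Hypothetical schema model: class name => field names. XML::Pastor builds
# its model by parsing an XSD; this one is hard-coded for illustration.
my %model = (
    'MyApp::Data::Album' => [qw(title artist year)],
);

# Source templates, kept in single-quoted heredocs so nothing interpolates.
my $class_template = <<'PERL';
package __CLASS__;
use strict;
use warnings;

sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}
PERL

my $accessor_template = <<'PERL';
sub __FIELD__ {
    my $self = shift;
    $self->{__FIELD__} = shift if @_;   # act as a setter when given a value
    return $self->{__FIELD__};
}
PERL

# Build the complete source of one class as a string ("code in a scalar").
sub generate_class_source {
    my ($class, $fields) = @_;
    (my $source = $class_template) =~ s/__CLASS__/$class/g;
    for my $field (@$fields) {
        (my $accessor = $accessor_template) =~ s/__FIELD__/$field/g;
        $source .= $accessor;
    }
    return $source . "1;\n";
}

# Compile each generated class at run time ("create code and eval() it").
for my $class (sort keys %model) {
    my $source = generate_class_source($class, $model{$class});
    eval $source or die "Could not compile $class: $@";
}

my $album = MyApp::Data::Album->new(title => 'Kind of Blue', artist => 'Miles Davis');
$album->year(1959);
print join(' / ', $album->title, $album->artist, $album->year), "\n";
# prints "Kind of Blue / Miles Davis / 1959"
```

Writing the string returned by generate_class_source to a .pm file instead of passing it to eval would correspond to the offline modes listed on slide 65.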