Painless OO XML with XML::Pastor

An introduction to XML::Pastor, comparison with other modules etc

Presentation Transcript

  • Painless OO <-> XML with XML::Pastor Joel Bernstein - LPW 2008
  • It’s all Greek to me: schema (pl. schemata), from σχήμα (skhēma): shape, plan
  • I do not like XML: people use it wrong • Apple Property Lists • Tag soup • Data transfer format vs. data storage format
  • How many of you? • Use XML • Hate XML • Like XML
  • Do you write XML • By hand? • Programmatically? • Schemata? • Validation? • Transformation?
  • XML::Pastor is for all of you.
  • XML is hard, right? Some hard things: • Roundtripping data • Manipulating XML via DOM API • Preserving element sibling order, comments, XML entities etc.
  • Solution: tools should make both the syntax of XML and the details of manipulating it invisible
  • XML::Pastor • I didn’t write it • Written by Ayhan Ulusoy • Available on CPAN • Abstracts away some of the pain of XML
  • What does it do? • Generates Perl code from W3C XML Schema (XSD) • Roundtrip and validate XML to/from Perl without loss of schema information • Lets you program without caring about XML structure
  • Parsing with Pastor • Parse the entire XML document into an XML::LibXML DOM object • Convert the XML DOM tree into native Perl objects • Throw away the DOM, which is no longer needed
  • Reasons not to use XML::Pastor • When you have no XML Schema (although several tools can infer XML schemata from documents) • It’s a code generator • No stream parsing
  • XML::Pastor Code Generation • Write out static code to tree of .pm files • Write out static code to single .pm file • Create code in a scalar in memory • Create code and eval() it for use
  • Warning, boring bit
  • How Pastor works: code generation • Parse schemata into a schema model (Perl data structures containing all the global elements, types, attributes, ...) • “Resolve” the model: determine class names, resolve references, etc. • Create boilerplate code, then write it out or eval it
  • How Pastor works: code generation, pt. 2
  • How Pastor works: generated classes • Each generated class (i.e. type) has class data “XmlSchemaType” containing the schema model • If the class isa SimpleType it may contain restriction facets • If the class isa ComplexType it will contain info about child elements and attributes
  • How Pastor works: in use • If the classes were generated offline, “use” them; if they were generated online, they are already loaded • These classes have methods to create, retrieve and save objects to/from XML • Manipulate/query data using an OO API to complexType fields • Validate modified objects against the schema
  • Very simple Album XML demo
  • Album XML document
  • Album XML schema
  • Pastorize creates Perl classes from the Album XML schema; the resulting code tree looks like:
  • Roundtrip and modify XML data using Pastor:
  • The result!
  • Real world Pastor
  • Moose::Role for Pastor
  • Country XML
  • Dynamic XML::Pastor usage
  • Query the Country object
  • Modify elements and attributes with uniform syntax
  • NodeArray syntax
  • Create new City data and combine with existing Country object
  • Validate modified data against the stored schema
  • Turn Pastor objects back into XML, or transform to XML::LibXML DOM
  • Simple D::HA object
  • Rekeying data
  • Rekeying data deeper
  • XML::Pastor scope • Good for “data XML” • Unsuitable for “mixed markup”, e.g. XHTML • Unsuitable for “huge” documents
  • XML::Pastor supported XML Schema features • Simple and Complex Types • Global Elements • Groups, Attributes, AttributeGroups • Derive simpleTypes by restriction • Derive complexTypes by extension • W3C built-in Types, Unions, Lists • (Most) Restriction Facets for Simple types • External Schema import, include, redefine
  • XML::Pastor known limitations • Mixed elements unsupported • Substitution groups unsupported • ‘any’ and ‘anyAttribute’ elements unsupported • Encodings (only UTF-8 officially supported) • Default values for attributes - help needed
  • XML Data Binding • Binding XML documents to objects specifically designed for the data in those documents • Allows e.g. data-centric applications to manipulate data more naturally than by using DOM API
  • Sales Order XML
  • Sales Order XML: logical data model, XML DOM
  • How this makes me feel:
  • Other XML modules • XML::Twig • XML::Compile • XML::Simple • XML::Smart
  • XML::Twig • Manipulates XML directly • Using code is coupled closely to document structure • Optimised for processing huge documents as trees • No schemata, no validation
  • XML::Compile • Original design rationale is to deal with SOAP envelopes and WSDL documents • Different approach but similar goals to Pastor - processes XML based on XSD into Perl data structures • More like XML::Simple with Schema support
  • XML::Compile pt. 2 • Schema support incomplete • Shaky support for imports, includes • Include restriction on targetNamespace • I haven’t used it yet but it looks good
  • XML::Simple • Working roundtrip binding for simple cases • e.g. XMLout(XMLin($file)) works • Simple API • Produces single deep data structure • Gotchas with element multiplicity
  • XML::Simple pt. 2 • No schemata, no validation • Can be teamed with a SAX parser • More suitable for configuration files?
  • XML::Smart • Similar implementation to XML::Pastor • Uses tie() and lots of crac^H^H^H^Hmagic • Gathers structure information from XML instance, rather than schema • No code generation!
  • XML::Smart pt. 2 • No schemata, so no schema validation • Based on Object::MultiType - overloaded objects as HASH, ARRAY, SCALAR, CODE & GLOB • Like Pastor, overloads array/hashref access to the data - promotes decoupling • Reasonable docs, some community growing
  • Any questions?
  • Thanks for coming. See you next year.
  • Bonus material (if we have enough time)
  • XML Schema Inference • Create an XML schema from an XML document instance • Every document has an (implicit) schema • Tools like Relaxer and Trang, as well as the System.Xml.Serializer of the .NET Framework, can all infer XML Schemata from document instances
  • Schema diff
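
The schema inference idea from the bonus material ("every document has an implicit schema") can be illustrated with a minimal sketch. It is written in Python using only the standard library. It is not XML::Pastor code, and it is not how Relaxer or Trang work internally. The function name `infer_structure` and the sample country document are hypothetical, invented for illustration. The sketch records, for each element name, the attributes seen and whether each child element occurred once or repeatedly under a single parent.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def infer_structure(xml_text):
    """Infer a crude structural model from one XML document instance.

    Returns {element_name: {"attributes": set_of_names,
                            "children": {child_name: "one" | "many"}}}.
    Hypothetical helper for illustration only; not a real schema language.
    """
    root = ET.fromstring(xml_text)
    model = {}
    for elem in root.iter():  # visits the root and every descendant
        entry = model.setdefault(elem.tag, {"attributes": set(), "children": {}})
        entry["attributes"].update(elem.attrib)  # adds the attribute names
        counts = Counter(child.tag for child in elem)
        for tag, n in counts.items():
            # Once a child has been seen repeated under any parent of this
            # name, keep it marked as "many".
            if n > 1 or entry["children"].get(tag) == "many":
                entry["children"][tag] = "many"
            else:
                entry["children"][tag] = "one"
    return model


# Hypothetical sample document (the talk's own Country XML is not in the transcript).
SAMPLE = """<country code="fr">
  <name>France</name>
  <city><name>Paris</name></city>
  <city><name>Lyon</name></city>
</country>"""

model = infer_structure(SAMPLE)
for name, info in model.items():
    print(name, sorted(info["attributes"]), info["children"])
```

A single instance can only show what it happens to contain. This sketch cannot tell optional children from required ones, and it infers no data types or facets. That is part of why a hand-written or tool-refined XSD is still the better input for a code generator.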