Painless OO <-> XML
    with XML::Pastor


Joel Bernstein - LPW 2008
It’s all Greek to me

      schema (pl. schemata)
      σχήμα (skhēma)
      shape, plan
I do not like XML
  People use it wrong

• Apple Property Lists
• Tag soup
• Data transfer format vs data storage format
How many of you?

• Use XML
• Hate XML
• Like XML
Do you write XML

• By hand?
• Programmatically?
• Schemata?
• Validation?
• Transformation?
XML::Pastor is for
  all of you.
XML is hard, right?
   Some hard things:

• Roundtripping data
• Manipulating XML via DOM API
• Preserving element sibling order,
  comments, XML entities etc.
Solution
Tools should make both the syntax and the details of
         the manipulation of XML invisible
XML::Pastor

• I didn’t write it
• Written by Ayhan Ulusoy
• Available on CPAN
• Abstracts away some of the pain of XML
What does it do?

• Generates Perl code from W3C XML
  Schema (XSD)
• Roundtrip and validate XML to/from Perl
  without loss of schema information
• Lets you program without caring about
  XML structure
Parsing with Pastor

• Parse entire XML into an XML::LibXML DOM
  object
• Convert XML DOM tree into native Perl
  objects
• Throw away DOM, no longer needed
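
A minimal sketch of the parse, convert, discard flow above, using XML::LibXML directly. This is not Pastor's implementation: Pastor builds typed objects from schema-generated classes, whereas the hypothetical `element_to_perl` helper here only produces plain hashes. The album data and the `@name` / `#text` key conventions are assumptions made for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: recursively convert an element into plain Perl data.
sub element_to_perl {
    my ($node) = @_;
    my @attrs    = $node->findnodes('@*');   # attribute nodes only
    my @children = $node->findnodes('*');    # child elements only

    # A leaf element without attributes collapses to its text.
    return $node->textContent unless @attrs || @children;

    my %data;
    $data{ '@' . $_->nodeName } = $_->value for @attrs;
    if (@children) {
        # Every child element name maps to an array ref, preserving sibling order.
        push @{ $data{ $_->nodeName } }, element_to_perl($_) for @children;
    }
    else {
        $data{'#text'} = $node->textContent;
    }
    return \%data;
}

# Step 1: parse the entire document into a DOM.
my $dom = XML::LibXML->load_xml(string => <<'XML');
<album year="2008">
  <title>Example Album</title>
  <track>First Track</track>
  <track>Second Track</track>
</album>
XML

# Step 2: convert the DOM tree into native Perl data.
my $album = element_to_perl($dom->documentElement);

# Step 3: throw the DOM away, it is no longer needed.
undef $dom;

printf "%s (%s): %d tracks\n",
    $album->{title}[0], $album->{'@year'}, scalar @{ $album->{track} };
```

Because the whole document is held in memory as a DOM before conversion, this approach shares the "no stream parsing" limitation.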
Reasons to not use
      XML::Pastor
• When you have no XML Schema
 • Although several tools can infer XML
    schemata from documents
• It’s a code-generator
• No stream parsing
XML::Pastor
    Code Generation
• Write out static code to tree of .pm files
• Write out static code to single .pm file
• Create code in a scalar in memory
• Create code and eval() it for use
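
A sketch of the first option (static code written to a tree of .pm files). The `generate` arguments follow the module's CPAN synopsis as I recall it, so check them against the installed version. The schema, the `Demo::Album::` prefix and the temporary directory are all hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);
use File::Find qw(find);
use XML::Pastor;

my $dir    = tempdir(CLEANUP => 1);
my $schema = "$dir/album.xsd";
my $dest   = "$dir/lib";
make_path($dest);

# Hypothetical schema, written out so that the sketch is self-contained.
open my $fh, '>', $schema or die "Cannot write $schema: $!";
print {$fh} <<'XSD';
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="album" type="Album"/>
  <xs:complexType name="Album">
    <xs:sequence>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="track" type="xs:string" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="year" type="xs:int"/>
  </xs:complexType>
</xs:schema>
XSD
close $fh or die "Cannot close $schema: $!";

# Offline mode, one module per generated class.
my $pastor = XML::Pastor->new();
$pastor->generate(
    mode         => 'offline',
    style        => 'multiple',
    schema       => $schema,
    class_prefix => 'Demo::Album::',
    destination  => "$dest/",
);

# List whatever modules were written under the destination.
my @modules;
find(sub { push @modules, $File::Find::name if /\.pm\z/ }, $dest);
print "$_\n" for sort @modules;
```

The remaining bullets appear to correspond to `style => 'single'` and to the `return` and `eval` modes of the same method; the module documentation has the exact arguments.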
Warning, boring bit
How Pastor works
    Code generation
• Parse schemata into schema model
 • Perl data structures containing all the
    global elements, types, attributes, ...
• “Resolve” Model - determine class names,
  resolve references, etc
• Create boilerplate code, write out / eval
How Pastor works
Code Generation pt. 2
How Pastor works
   Generated classes
• Each generated class (i.e. type) has classdata
  “XmlSchemaType” containing schema
  model
• If the class isa SimpleType it may contain
  restriction facets
• If the class isa ComplexType it will contain
  info about child elements and attributes
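
To make the idea of class data carrying the schema model concrete, here is a hand-written sketch of a simple type with restriction facets. It is not what Pastor actually generates: the package name, the layout of `%XmlSchemaType` and the `is_valid` method are assumptions chosen for illustration.

```perl
use strict;
use warnings;

package Demo::Type::Genre;    # hypothetical simple type

# Class data describing the schema type: base type plus restriction facets.
our %XmlSchemaType = (
    name   => 'Genre',
    base   => 'string',
    facets => {
        enumeration => [qw(rock jazz folk)],
        maxLength   => 10,
    },
);

sub new {
    my ($class, $value) = @_;
    return bless { value => $value }, $class;
}

# Check the stored value against each facet held in the class data.
sub is_valid {
    my ($self)  = @_;
    my $facets  = $XmlSchemaType{facets};
    my $value   = $self->{value};
    return 0 if length($value) > $facets->{maxLength};
    return 0 unless grep { $_ eq $value } @{ $facets->{enumeration} };
    return 1;
}

package main;

for my $candidate (qw(jazz polka)) {
    my $genre = Demo::Type::Genre->new($candidate);
    printf "%s: %s\n", $candidate, $genre->is_valid ? 'valid' : 'invalid';
}
```

Keeping the facets as data rather than as hard-coded checks means validation can be driven generically from the stored schema information.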
How Pastor works
        In use
• If classes were generated offline, “use” them;
  if generated online, they are already loaded
• These classes have methods to create,
  retrieve, save object to/from XML
• Manipulate/query data using OO API to
  complexType fields
• Validate modified objects against schema
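
A sketch of that workflow with classes generated at run time. Method names (`from_xml_string`, `xml_validate`, `to_xml_string`) and the `generate` arguments are as I recall them from the module's CPAN documentation, so treat them as assumptions to verify. The schema, the `Demo::Album::` prefix and the album data are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use XML::Pastor;

my $dir    = tempdir(CLEANUP => 1);
my $schema = "$dir/album.xsd";

# Hypothetical schema with one global element, "album".
open my $fh, '>', $schema or die "Cannot write $schema: $!";
print {$fh} <<'XSD';
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="album" type="Album"/>
  <xs:complexType name="Album">
    <xs:sequence>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="track" type="xs:string" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="year" type="xs:int"/>
  </xs:complexType>
</xs:schema>
XSD
close $fh or die "Cannot close $schema: $!";

# "Online" generation: the classes are created and eval'ed, so no "use" is needed.
XML::Pastor->new()->generate(
    mode         => 'eval',
    schema       => $schema,
    class_prefix => 'Demo::Album::',
);

# Retrieve an object from XML (the global element becomes a class).
my $album = Demo::Album::album->from_xml_string(<<'XML');
<album year="2008">
  <title>Example Album</title>
  <track>First Track</track>
  <track>Second Track</track>
</album>
XML

# Query and modify through accessors rather than the DOM.
print "Before: ", $album->title, "\n";
$album->title('Renamed Album');

# Validate against the schema information stored in the classes (dies on failure).
$album->xml_validate();

# Save back to XML.
my $xml = $album->to_xml_string();
print $xml;
```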
Very simple Album
   XML demo
Album XML document
Album XML schema
Pastorize creates Perl classes from
       Album XML schema:
     Resulting code tree like:
Roundtrip and modify XML data using Pastor:
The result!
Real world Pastor
Moose::Role for Pastor
Country XML
Dynamic XML::Pastor usage
Query the Country object
Modify elements and attributes
     with uniform syntax
NodeArray syntax
Create new City data and
combine with existing Country object
Validate modified data
against the stored schema
Turn Pastor objects back into XML, or
  transform to XML::LibXML DOM
Simple D::HA object
Rekeying data
Rekeying data deeper
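
The code from the rekeying slides did not survive in this transcript. As a stand-in, here is a plain-Perl sketch of the general idea: take a list of records and index it by a key computed from each record, then do the same one level deeper. The `rekey` helper and the city records are hypothetical, and this does not use Pastor's NodeArray or Data::HashArray API.

```perl
use strict;
use warnings;

# Hypothetical helper: index a list of records by $key_of->($record).
sub rekey {
    my ($records, $key_of) = @_;
    my %by_key;
    $by_key{ $key_of->($_) } = $_ for @$records;
    return \%by_key;
}

my @cities = (
    { code => 'AAA', name => 'First City',  district => 'north' },
    { code => 'BBB', name => 'Second City', district => 'north' },
    { code => 'CCC', name => 'Third City',  district => 'south' },
);

# Rekey: look cities up by code instead of by position.
my $by_code = rekey(\@cities, sub { $_[0]{code} });
print $by_code->{BBB}{name}, "\n";

# Rekey deeper: group by district first, then key each group by code.
my %grouped;
push @{ $grouped{ $_->{district} } }, $_ for @cities;

my %by_district;
for my $district (keys %grouped) {
    $by_district{$district} = rekey($grouped{$district}, sub { $_[0]{code} });
}
print join(', ', sort keys %{ $by_district{north} }), "\n";
```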
XML::Pastor Scope

• Good for “data XML”
• Unsuitable for “mixed markup”
 • e.g. XHTML
• Unsuitable for “huge” documents
XML::Pastor Supported
XML Schema Features
• Simple and Complex Types
• Global Elements
• Groups, Attributes, AttributeGroups
• Derive simpleTypes by restriction
• Derive complexTypes by extension
• W3C built-in Types, Unions, Lists
• (Most) Restriction Facets for Simple types
• External Schema import, include, redefine
XML::Pastor
    known limitations
• Mixed elements unsupported
• Substitution groups unsupported
• ‘any’ and ‘anyAttribute’ elements
  unsupported
• Encodings (only UTF-8 officially supported)
• Default values for attributes - help needed
XML Data Binding

• Binding XML documents to objects
  specifically designed for the data in those
  documents
• Allows e.g. data-centric applications to
  manipulate data more naturally than by
  using DOM API
Sales Order XML
Sales Order XML: logical data model and XML DOM
XML DOM
How this makes me feel:
Other XML modules
•   XML::Twig

•   XML::Compile

•   XML::Simple

•   XML::Smart
XML::Twig
• Manipulates XML directly
 • Calling code is closely coupled to
    document structure
• Optimised for processing huge documents
  as trees
• No schemata, no validation
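
A short sketch of the XML::Twig style described above, assuming a hypothetical `<collection>` of `<album>` elements. The handler names the elements it cares about, which is the coupling to document structure, and purges parsed content as it goes, which is what keeps memory flat on huge documents.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document; in practice this would be a large file.
my $xml = <<'XML';
<collection>
  <album year="2007"><title>First Album</title></album>
  <album year="2008"><title>Second Album</title></album>
</collection>
XML

my @seen;
my $twig = XML::Twig->new(
    twig_handlers => {
        # Called each time an <album> element has been completely parsed.
        album => sub {
            my ($t, $album) = @_;
            push @seen, sprintf '%s (%s)',
                $album->first_child_text('title'), $album->att('year');
            $t->purge;    # free everything parsed so far
        },
    },
);
$twig->parse($xml);

print "$_\n" for @seen;
```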
XML::Compile
• Original design rationale is to deal with
  SOAP envelopes and WSDL documents
• Different approach but similar goals to
  Pastor - processes XML based on XSD into
  Perl data structures
• More like XML::Simple with Schema
  support
XML::Compile pt. 2

• Schema support incomplete
 • Shaky support for imports, includes
    • Include restriction on targetNamespace
• I haven’t used it yet but it looks good
XML::Simple
• Working roundtrip binding for simple cases
 • e.g. XMLout(XMLin($file))
    works
• Simple API
• Produces single deep data structure
• Gotchas with element multiplicity
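
A sketch of the multiplicity gotcha and of a roundtrip, using hypothetical album data. With default options, the shape of the data depends on how many `<track>` elements the document happens to contain; `ForceArray` makes it predictable.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin XMLout);

my $one_track  = '<album><title>A</title><track>t1</track></album>';
my $two_tracks = '<album><title>B</title><track>t1</track><track>t2</track></album>';

# Gotcha: the same element is a plain string or an array ref
# depending on how often it occurs in the instance document.
my $single = XMLin($one_track);
my $multi  = XMLin($two_tracks);
printf "one track:  %s\n", ref($single->{track}) || 'plain string';
printf "two tracks: %s\n", ref($multi->{track})  || 'plain string';

# Forcing 'track' into an array gives calling code one shape to handle.
my $forced = XMLin($one_track, ForceArray => ['track'], KeyAttr => []);
printf "forced:     %s\n", ref($forced->{track}) || 'plain string';

# Roundtrip: keep the root element and force arrays everywhere so that
# XMLout writes nested elements back as elements rather than attributes.
my $data = XMLin($one_track, ForceArray => 1, KeepRoot => 1, KeyAttr => []);
my $out  = XMLout($data, KeepRoot => 1, KeyAttr => []);
print $out;
```

Note that XMLout sorts hash keys by default, so the original sibling order of elements is not guaranteed to survive the roundtrip.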
XML::Simple pt. 2

• No schemata, no validation
• Can be teamed with a SAX parser
• More suitable for configuration files?
XML::Smart

• Similar implementation to XML::Pastor
• Uses tie() and lots of crac^H^H^H^Hmagic
• Gathers structure information from XML
  instance, rather than schema
• No code generation!
XML::Smart pt. 2
• No schemata, so no schema validation
• Based on Object::MultiType - overloaded
  objects as HASH, ARRAY, SCALAR, CODE
  & GLOB
• Like Pastor, overloads array/hashref access
  to the data - promotes decoupling
• Reasonable docs, a small but growing community
Any questions?
Thanks for coming
    See you next year
Bonus Material
  If we have enough time
XML Schema Inference
• Create an XML schema from an XML
  document instance
• Every document has an (implicit) schema
• Tools like Relaxer, Trang, as well as the
  System.Xml.Serializer in the .NET Framework
  can all infer XML Schemata from document
  instances
Schema diff
