SlideShare a Scribd company logo
1 of 69
Download to read offline
Painless OO <-> XML
  with XML::Pastor
     (2009 remix)

   Joel Bernstein
  YAPC::EU 2009
It’s all Greek to me

      schema
      σχήµα (skhēma)
      shape, plan
How many of you?
How many of you?

• Use XML
How many of you?

• Use XML
• Hate XML
How many of you?

• Use XML
• Hate XML
• Like XML
A Confession


• I do not like XML
• People use it wrong
XML Data Binding
• Binding XML documents to objects
  specifically designed for the data in
  those documents.
• I often have to do this.
XML is hard, right?
   Some hard things:

• Roundtripping data
• Manipulating XML via DOM API
• Preserving element sibling order,
  comments, XML entities etc.
Typical horrendous XML document
Sales Order XML   Logical data model




                     XML DOM
I shouldn’t need to
   care about this
How this makes me feel:
Fundamental problem

• I don’t think in elements and attributes
• I think about my data, not how it’s stored
• This is Perl. DWIM.
Solution
Tools should make both the syntax and the details of
         the manipulation of XML invisible
Do you write XML
Do you write XML

• By hand?
Do you write XML

• By hand?
• Programmatically?
Do you write XML

• By hand?
• Programmatically?
• Schemata?
Do you write XML

• By hand?
• Programmatically?
• Schemata?
• Validation?
Do you write XML

• By hand?
• Programmatically?
• Schemata?
• Validation?
• Transformation?
XML::Pastor is for
  all of you.
XML::Pastor

• Available on CPAN
• Abstracts away some of the pain of XML
• Ayhan Ulusoy is the author
• I am just a user
What does it do?

• Generates Perl code from W3C XML
  Schema (XSD)
• Roundtrip and validate XML to/from Perl
  without loss of schema information
• Lets you program without caring about
  XML structure
pastorize

• Automates codegen process
• Conceptually similar to DBIC::Schema::Loader
• TMTOWTDI - offline or runtime
• Works on multiple XSDs (caveat, collisions)
pastorize in use
pastorize   --mode offline --style multiple 
            --destination /tmp/lib/perl     
            --class_prefix MyApp::Data      
            /some/path/to/schema.xsd
Very simple contrived
 Album XML demo
Album XML document
Album XML schema
Pastorize the Album XML schema:




    Resulting code tree like:
Modify some XML
Roundtrip and modify XML data using Pastor:




                               # Load XML
                                              # Accessors
        # Modify


                          # Write XML
The result!
Real world Pastor
$HASH1 = {
      1 => 'Vodafone UK',
      2 => 'O2 UK',
      3 => 'Orange UK',
      4 => 'T-Mobile UK',
      8 => 'Hutchinson 3 UK'
   };




Real world Pastor
Country XML
Dynamic schema parsing of Country XML
Query the Country object
Modify elements and attributes
     with uniform syntax
Manipulate array-like data
Create new City data and
combine with existing Country object
Validate modified data
against the stored schema
Turn Pastor objects back into XML, or
  transform to XML::LibXML DOM
Parsing with Pastor

• Parse entire XML into XML::LibXML::DOM
  object
• Convert XML DOM tree into native Perl
  objects
• Throw away DOM, no longer needed
Reasons to not use
      XML::Pastor
• When you have no XML Schema
 • Although several tools can infer XML
    schemata from documents
• It’s a code-generator
• No stream parsing
XML::Pastor Scope

• Good for “data XML”
• Unsuitable for “mixed markup”
 • e.g. XHTML
• Unsuitable for “huge” documents
XML::Pastor
    known limitations
• Mixed elements unsupported
• Substitution groups unsupported
• ‘any’ and ‘anyAttribute’ elements
  unsupported
• Encodings (only UTF-8 officially supported)
• Default values for attributes - help needed
Other XML modules
•   XML::Twig

•   XML::Compile

•   XML::Simple

•   XML::Smart
XML::Twig
• Manipulates XML directly
 • Using code is coupled closely to
    document structure
• Optimised for processing huge documents
  as trees
• No schemata, no validation
XML::Compile
• Original design rationale is to deal with
  SOAP envelopes and WSDL documents
• Different approach but similar goals to
  Pastor - processes XML based on XSD into
  Perl data structures
• More like XML::Simple with Schema
  support
XML::Compile pt. 2

• Schema support incomplete
 • Shaky support for imports, includes
    • Include restriction on targetNamespace
• I haven’t used it yet but it looks good
XML::Simple
• Working roundtrip binding for simple cases
 • e.g. XMLout(XMLin($file))
    works
• Simple API
• Produces single deep data structure
• Gotchas with element multiplicity
XML::Simple pt. 2

• No schemata, no validation
• Can be teamed with a SAX parser
• More suitable for configuration files?
XML::Smart

• Similar implementation to XML::Pastor
• Uses tie() and lots of crac^H^H^H^Hmagic
• Gathers structure information from XML
  instance, rather than schema
• No code generation!
XML::Smart pt. 2
• No schemata, so no schema validation
• Based on Object::MultiType - overloaded
  objects as HASH, ARRAY, SCALAR, CODE
  & GLOB
• Like Pastor, overloads array/hashref access
  to the data - promotes decoupling
• Reasonable docs, some community growing
Any questions?
Thanks for coming
            See you next year
http://search.cpan.org/dist/XML-Pastor/
Bonus Material
  If we have enough time
XML::Pastor Supported
XML Schema Features
• Simple and Complex Types
• Global Elements
• Groups, Attributes, AttributeGroups
• Derive simpleTypes by extension
• Derive complexTypes by restriction
• W3C built-in Types, Unions, Lists
• (Most) Restriction Facets for Simple types
• External Schema import, include, redefine
XML Schema Inference
• Create an XML schema from an XML
  document instance
• Every document has an (implicit) schema
• Tools like Relaxer, Trang, as well as the
  System.Xml.Serializer the .NET Framework
  can all infer XML Schemata from document
  instances
Simple D::HA object
Rekeying data
Rekeying data deeper
Warning, boring bit
XML::Pastor
    Code Generation
• Write out static code to tree of .pm files
• Write out static code to single .pm file
• Create code in a scalar in memory
• Create code and eval() it for use
How Pastor works
    Code generation
• Parse schemata into schema model
 • Perl data structures containing all the
    global elements, types, attributes, ...
• “Resolve” Model - determine class names,
  resolve references, etc
• Create boilerplate code, write out / eval
How Pastor works
   Generated classes
• Each generated class (i.e. type) has classdata
  “XmlSchemaType” containing schema
  model
• If the class isa SimpleType it may contain
  restriction facets
• If the class isa ComplexType it will contain
  info about child elements and attributes
How Pastor works
        In use
• If classes generated offline, then “use”
  them, if online then they are already loaded
• These classes have methods to create,
  retrieve, save object to/from XML
• Manipulate/query data using OO API to
  complexType fields
• Validate modified objects against schema
Thanks for coming
            See you next year
http://search.cpan.org/dist/XML-Pastor/

More Related Content

Recently uploaded

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 

Recently uploaded (20)

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 

Featured

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by HubspotMarius Sescu
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTExpeed Software
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsPixeldarts
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthThinkNow
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfmarketingartwork
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024Neil Kimberley
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsKurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summarySpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best PracticesVit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 

Featured (20)

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPT
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage Engineerings
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
 
Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

Painless OO XML with XML::Pastor - 2009 Remix

  • 1. Painless OO <-> XML with XML::Pastor (2009 remix) Joel Bernstein YAPC::EU 2009
  • 2. It’s all Greek to me schema σχήµα (skhēma) shape, plan
  • 3. How many of you?
  • 4. How many of you? • Use XML
  • 5. How many of you? • Use XML • Hate XML
  • 6. How many of you? • Use XML • Hate XML • Like XML
  • 7. A Confession • I do not like XML • People use it wrong
  • 8. XML Data Binding • Binding XML documents to objects specifically designed for the data in those documents. • I often have to do this.
  • 9. XML is hard, right? Some hard things: • Roundtripping data • Manipulating XML via DOM API • Preserving element sibling order, comments, XML entities etc.
  • 11. Sales Order XML Logical data model XML DOM
  • 12. I shouldn’t need to care about this
  • 13. How this makes me feel:
  • 14. Fundamental problem • I don’t think in elements and attributes • I think about my data, not how it’s stored • This is Perl. DWIM.
  • 15. Solution Tools should make both the syntax and the details of the manipulation of XML invisible
  • 17. Do you write XML • By hand?
  • 18. Do you write XML • By hand? • Programmatically?
  • 19. Do you write XML • By hand? • Programmatically? • Schemata?
  • 20. Do you write XML • By hand? • Programmatically? • Schemata? • Validation?
  • 21. Do you write XML • By hand? • Programmatically? • Schemata? • Validation? • Transformation?
  • 22. XML::Pastor is for all of you.
  • 23. XML::Pastor • Available on CPAN • Abstracts away some of the pain of XML • Ayhan Ulusoy is the author • I am just a user
  • 24. What does it do? • Generates Perl code from W3C XML Schema (XSD) • Roundtrip and validate XML to/from Perl without loss of schema information • Lets you program without caring about XML structure
  • 25. pastorize • Automates codegen process • Conceptually similar to DBIC::Schema::Loader • TMTOWTDI - offline or runtime • Works on multiple XSDs (caveat, collisions)
  • 26. pastorize in use pastorize --mode offline --style multiple --destination /tmp/lib/perl --class_prefix MyApp::Data /some/path/to/schema.xsd
  • 27. Very simple contrived Album XML demo
  • 30. Pastorize the Album XML schema: Resulting code tree like:
  • 32. Roundtrip and modify XML data using Pastor: # Load XML # Accessors # Modify # Write XML
  • 35. $HASH1 = { 1 => 'Vodafone UK', 2 => 'O2 UK', 3 => 'Orange UK', 4 => 'T-Mobile UK', 8 => 'Hutchinson 3 UK' }; Real world Pastor
  • 37. Dynamic schema parsing of Country XML
  • 39. Modify elements and attributes with uniform syntax
  • 41. Create new City data and combine with existing Country object
  • 42. Validate modified data against the stored schema
  • 43. Turn Pastor objects back into XML, or transform to XML::LibXML DOM
  • 44. Parsing with Pastor • Parse entire XML into XML::LibXML::DOM object • Convert XML DOM tree into native Perl objects • Throw away DOM, no longer needed
  • 45. Reasons to not use XML::Pastor • When you have no XML Schema • Although several tools can infer XML schemata from documents • It’s a code-generator • No stream parsing
  • 46. XML::Pastor Scope • Good for “data XML” • Unsuitable for “mixed markup” • e.g. XHTML • Unsuitable for “huge” documents
  • 47. XML::Pastor known limitations • Mixed elements unsupported • Substitution groups unsupported • ‘any’ and ‘anyAttribute’ elements unsupported • Encodings (only UTF-8 officially supported) • Default values for attributes - help needed
  • 48. Other XML modules • XML::Twig • XML::Compile • XML::Simple • XML::Smart
  • 49. XML::Twig • Manipulates XML directly • Using code is coupled closely to document structure • Optimised for processing huge documents as trees • No schemata, no validation
  • 50. XML::Compile • Original design rationale is to deal with SOAP envelopes and WSDL documents • Different approach but similar goals to Pastor - processes XML based on XSD into Perl data structures • More like XML::Simple with Schema support
  • 51. XML::Compile pt. 2 • Schema support incomplete • Shaky support for imports, includes • Include restriction on targetNamespace • I haven’t used it yet but it looks good
  • 52. XML::Simple • Working roundtrip binding for simple cases • e.g. XMLout(XMLin($file)) works • Simple API • Produces single deep data structure • Gotchas with element multiplicity
  • 53. XML::Simple pt. 2 • No schemata, no validation • Can be teamed with a SAX parser • More suitable for configuration files?
  • 54. XML::Smart • Similar implementation to XML::Pastor • Uses tie() and lots of crac^H^H^H^Hmagic • Gathers structure information from XML instance, rather than schema • No code generation!
  • 55. XML::Smart pt. 2 • No schemata, so no schema validation • Based on Object::MultiType - overloaded objects as HASH, ARRAY, SCALAR, CODE & GLOB • Like Pastor, overloads array/hashref access to the data - promotes decoupling • Reasonable docs, some community growing
  • 57. Thanks for coming See you next year http://search.cpan.org/dist/XML-Pastor/
  • 58. Bonus Material If we have enough time
  • 59. XML::Pastor Supported XML Schema Features • Simple and Complex Types • Global Elements • Groups, Attributes, AttributeGroups • Derive simpleTypes by extension • Derive complexTypes by restriction • W3C built-in Types, Unions, Lists • (Most) Restriction Facets for Simple types • External Schema import, include, redefine
  • 60. XML Schema Inference • Create an XML schema from an XML document instance • Every document has an (implicit) schema • Tools like Relaxer, Trang, as well as the System.Xml.Serializer the .NET Framework can all infer XML Schemata from document instances
  • 65. XML::Pastor Code Generation • Write out static code to tree of .pm files • Write out static code to single .pm file • Create code in a scalar in memory • Create code and eval() it for use
  • 66. How Pastor works Code generation • Parse schemata into schema model • Perl data structures containing all the global elements, types, attributes, ... • “Resolve” Model - determine class names, resolve references, etc • Create boilerplate code, write out / eval
  • 67. How Pastor works Generated classes • Each generated class (i.e. type) has classdata “XmlSchemaType” containing schema model • If the class isa SimpleType it may contain restriction facets • If the class isa ComplexType it will contain info about child elements and attributes
  • 68. How Pastor works In use • If classes generated offline, then “use” them, if online then they are already loaded • These classes have methods to create, retrieve, save object to/from XML • Manipulate/query data using OO API to complexType fields • Validate modified objects against schema
  • 69. Thanks for coming See you next year http://search.cpan.org/dist/XML-Pastor/