SlideShare a Scribd company logo
1 of 69
Download to read offline
Painless OO <-> XML
  with XML::Pastor
     (2009 remix)

   Joel Bernstein
  YAPC::EU 2009
It’s all Greek to me

      schema
      σχήµα (skhēma)
      shape, plan
How many of you?
How many of you?

• Use XML
How many of you?

• Use XML
• Hate XML
How many of you?

• Use XML
• Hate XML
• Like XML
A Confession


• I do not like XML
• People use it wrong
XML Data Binding
• Binding XML documents to objects
  specifically designed for the data in
  those documents.
• I often have to do this.
XML is hard, right?
   Some hard things:

• Roundtripping data
• Manipulating XML via DOM API
• Preserving element sibling order,
  comments, XML entities etc.
Typical horrendous XML document
Sales Order XML   Logical data model




                     XML DOM
I shouldn’t need to
   care about this
How this makes me feel:
Fundamental problem

• I don’t think in elements and attributes
• I think about my data, not how it’s stored
• This is Perl. DWIM.
Solution
Tools should make both the syntax and the details of
         the manipulation of XML invisible
Do you write XML
Do you write XML

• By hand?
Do you write XML

• By hand?
• Programmatically?
Do you write XML

• By hand?
• Programmatically?
• Schemata?
Do you write XML

• By hand?
• Programmatically?
• Schemata?
• Validation?
Do you write XML

• By hand?
• Programmatically?
• Schemata?
• Validation?
• Transformation?
XML::Pastor is for
  all of you.
XML::Pastor

• Available on CPAN
• Abstracts away some of the pain of XML
• Ayhan Ulusoy is the author
• I am just a user
What does it do?

• Generates Perl code from W3C XML
  Schema (XSD)
• Roundtrip and validate XML to/from Perl
  without loss of schema information
• Lets you program without caring about
  XML structure
pastorize

• Automates codegen process
• Conceptually similar to DBIC::Schema::Loader
• TMTOWTDI - offline or runtime
• Works on multiple XSDs (caveat, collisions)
pastorize in use
pastorize   --mode offline --style multiple 
            --destination /tmp/lib/perl     
            --class_prefix MyApp::Data      
            /some/path/to/schema.xsd
Very simple contrived
 Album XML demo
Album XML document
Album XML schema
Pastorize the Album XML schema:




    Resulting code tree like:
Modify some XML
Roundtrip and modify XML data using Pastor:




                               # Load XML
                                              # Accessors
        # Modify


                          # Write XML
The result!
Real world Pastor
$HASH1 = {
      1 => 'Vodafone UK',
      2 => 'O2 UK',
      3 => 'Orange UK',
      4 => 'T-Mobile UK',
      8 => 'Hutchinson 3 UK'
   };




Real world Pastor
Country XML
Dynamic schema parsing of Country XML
Query the Country object
Modify elements and attributes
     with uniform syntax
Manipulate array-like data
Create new City data and
combine with existing Country object
Validate modified data
against the stored schema
Turn Pastor objects back into XML, or
  transform to XML::LibXML DOM
Parsing with Pastor

• Parse entire XML into XML::LibXML::DOM
  object
• Convert XML DOM tree into native Perl
  objects
• Throw away DOM, no longer needed
Reasons to not use
      XML::Pastor
• When you have no XML Schema
 • Although several tools can infer XML
    schemata from documents
• It’s a code-generator
• No stream parsing
XML::Pastor Scope

• Good for “data XML”
• Unsuitable for “mixed markup”
 • e.g. XHTML
• Unsuitable for “huge” documents
XML::Pastor
    known limitations
• Mixed elements unsupported
• Substitution groups unsupported
• ‘any’ and ‘anyAttribute’ elements
  unsupported
• Encodings (only UTF-8 officially supported)
• Default values for attributes - help needed
Other XML modules
•   XML::Twig

•   XML::Compile

•   XML::Simple

•   XML::Smart
XML::Twig
• Manipulates XML directly
 • Using code is coupled closely to
    document structure
• Optimised for processing huge documents
  as trees
• No schemata, no validation
XML::Compile
• Original design rationale is to deal with
  SOAP envelopes and WSDL documents
• Different approach but similar goals to
  Pastor - processes XML based on XSD into
  Perl data structures
• More like XML::Simple with Schema
  support
XML::Compile pt. 2

• Schema support incomplete
 • Shaky support for imports, includes
    • Include restriction on targetNamespace
• I haven’t used it yet but it looks good
XML::Simple
• Working roundtrip binding for simple cases
 • e.g. XMLout(XMLin($file))
    works
• Simple API
• Produces single deep data structure
• Gotchas with element multiplicity
XML::Simple pt. 2

• No schemata, no validation
• Can be teamed with a SAX parser
• More suitable for configuration files?
XML::Smart

• Similar implementation to XML::Pastor
• Uses tie() and lots of crac^H^H^H^Hmagic
• Gathers structure information from XML
  instance, rather than schema
• No code generation!
XML::Smart pt. 2
• No schemata, so no schema validation
• Based on Object::MultiType - overloaded
  objects as HASH, ARRAY, SCALAR, CODE
  & GLOB
• Like Pastor, overloads array/hashref access
  to the data - promotes decoupling
• Reasonable docs, some community growing
Any questions?
Thanks for coming
            See you next year
http://search.cpan.org/dist/XML-Pastor/
Bonus Material
  If we have enough time
XML::Pastor Supported
XML Schema Features
• Simple and Complex Types
• Global Elements
• Groups, Attributes, AttributeGroups
• Derive simpleTypes by extension
• Derive complexTypes by restriction
• W3C built-in Types, Unions, Lists
• (Most) Restriction Facets for Simple types
• External Schema import, include, redefine
XML Schema Inference
• Create an XML schema from an XML
  document instance
• Every document has an (implicit) schema
• Tools like Relaxer, Trang, as well as the
  System.Xml.Serializer the .NET Framework
  can all infer XML Schemata from document
  instances
Simple D::HA object
Rekeying data
Rekeying data deeper
Warning, boring bit
XML::Pastor
    Code Generation
• Write out static code to tree of .pm files
• Write out static code to single .pm file
• Create code in a scalar in memory
• Create code and eval() it for use
How Pastor works
    Code generation
• Parse schemata into schema model
 • Perl data structures containing all the
    global elements, types, attributes, ...
• “Resolve” Model - determine class names,
  resolve references, etc
• Create boilerplate code, write out / eval
How Pastor works
   Generated classes
• Each generated class (i.e. type) has classdata
  “XmlSchemaType” containing schema
  model
• If the class isa SimpleType it may contain
  restriction facets
• If the class isa ComplexType it will contain
  info about child elements and attributes
How Pastor works
        In use
• If classes generated offline, then “use”
  them, if online then they are already loaded
• These classes have methods to create,
  retrieve, save object to/from XML
• Manipulate/query data using OO API to
  complexType fields
• Validate modified objects against schema
Thanks for coming
            See you next year
http://search.cpan.org/dist/XML-Pastor/

More Related Content

Recently uploaded

Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 

Recently uploaded (20)

Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 

Featured

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by HubspotMarius Sescu
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTExpeed Software
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsPixeldarts
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthThinkNow
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfmarketingartwork
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024Neil Kimberley
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsKurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summarySpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best PracticesVit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 

Featured (20)

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPT
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage Engineerings
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
 
Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

Painless OO XML with XML::Pastor - 2009 Remix

  • 1. Painless OO <-> XML with XML::Pastor (2009 remix) Joel Bernstein YAPC::EU 2009
  • 2. It’s all Greek to me schema σχήµα (skhēma) shape, plan
  • 3. How many of you?
  • 4. How many of you? • Use XML
  • 5. How many of you? • Use XML • Hate XML
  • 6. How many of you? • Use XML • Hate XML • Like XML
  • 7. A Confession • I do not like XML • People use it wrong
  • 8. XML Data Binding • Binding XML documents to objects specifically designed for the data in those documents. • I often have to do this.
  • 9. XML is hard, right? Some hard things: • Roundtripping data • Manipulating XML via DOM API • Preserving element sibling order, comments, XML entities etc.
  • 11. Sales Order XML Logical data model XML DOM
  • 12. I shouldn’t need to care about this
  • 13. How this makes me feel:
  • 14. Fundamental problem • I don’t think in elements and attributes • I think about my data, not how it’s stored • This is Perl. DWIM.
  • 15. Solution Tools should make both the syntax and the details of the manipulation of XML invisible
  • 17. Do you write XML • By hand?
  • 18. Do you write XML • By hand? • Programmatically?
  • 19. Do you write XML • By hand? • Programmatically? • Schemata?
  • 20. Do you write XML • By hand? • Programmatically? • Schemata? • Validation?
  • 21. Do you write XML • By hand? • Programmatically? • Schemata? • Validation? • Transformation?
  • 22. XML::Pastor is for all of you.
  • 23. XML::Pastor • Available on CPAN • Abstracts away some of the pain of XML • Ayhan Ulusoy is the author • I am just a user
  • 24. What does it do? • Generates Perl code from W3C XML Schema (XSD) • Roundtrip and validate XML to/from Perl without loss of schema information • Lets you program without caring about XML structure
  • 25. pastorize • Automates codegen process • Conceptually similar to DBIC::Schema::Loader • TMTOWTDI - offline or runtime • Works on multiple XSDs (caveat, collisions)
  • 26. pastorize in use pastorize --mode offline --style multiple --destination /tmp/lib/perl --class_prefix MyApp::Data /some/path/to/schema.xsd
  • 27. Very simple contrived Album XML demo
  • 30. Pastorize the Album XML schema: Resulting code tree like:
  • 32. Roundtrip and modify XML data using Pastor: # Load XML # Accessors # Modify # Write XML
  • 35. $HASH1 = { 1 => 'Vodafone UK', 2 => 'O2 UK', 3 => 'Orange UK', 4 => 'T-Mobile UK', 8 => 'Hutchinson 3 UK' }; Real world Pastor
  • 37. Dynamic schema parsing of Country XML
  • 39. Modify elements and attributes with uniform syntax
  • 41. Create new City data and combine with existing Country object
  • 42. Validate modified data against the stored schema
  • 43. Turn Pastor objects back into XML, or transform to XML::LibXML DOM
  • 44. Parsing with Pastor • Parse entire XML into XML::LibXML::DOM object • Convert XML DOM tree into native Perl objects • Throw away DOM, no longer needed
  • 45. Reasons to not use XML::Pastor • When you have no XML Schema • Although several tools can infer XML schemata from documents • It’s a code-generator • No stream parsing
  • 46. XML::Pastor Scope • Good for “data XML” • Unsuitable for “mixed markup” • e.g. XHTML • Unsuitable for “huge” documents
  • 47. XML::Pastor known limitations • Mixed elements unsupported • Substitution groups unsupported • ‘any’ and ‘anyAttribute’ elements unsupported • Encodings (only UTF-8 officially supported) • Default values for attributes - help needed
  • 48. Other XML modules • XML::Twig • XML::Compile • XML::Simple • XML::Smart
  • 49. XML::Twig • Manipulates XML directly • Using code is coupled closely to document structure • Optimised for processing huge documents as trees • No schemata, no validation
  • 50. XML::Compile • Original design rationale is to deal with SOAP envelopes and WSDL documents • Different approach but similar goals to Pastor - processes XML based on XSD into Perl data structures • More like XML::Simple with Schema support
  • 51. XML::Compile pt. 2 • Schema support incomplete • Shaky support for imports, includes • Include restriction on targetNamespace • I haven’t used it yet but it looks good
  • 52. XML::Simple • Working roundtrip binding for simple cases • e.g. XMLout(XMLin($file)) works • Simple API • Produces single deep data structure • Gotchas with element multiplicity
  • 53. XML::Simple pt. 2 • No schemata, no validation • Can be teamed with a SAX parser • More suitable for configuration files?
  • 54. XML::Smart • Similar implementation to XML::Pastor • Uses tie() and lots of crac^H^H^H^Hmagic • Gathers structure information from XML instance, rather than schema • No code generation!
  • 55. XML::Smart pt. 2 • No schemata, so no schema validation • Based on Object::MultiType - overloaded objects as HASH, ARRAY, SCALAR, CODE & GLOB • Like Pastor, overloads array/hashref access to the data - promotes decoupling • Reasonable docs, some community growing
  • 57. Thanks for coming See you next year http://search.cpan.org/dist/XML-Pastor/
  • 58. Bonus Material If we have enough time
  • 59. XML::Pastor Supported XML Schema Features • Simple and Complex Types • Global Elements • Groups, Attributes, AttributeGroups • Derive simpleTypes by extension • Derive complexTypes by restriction • W3C built-in Types, Unions, Lists • (Most) Restriction Facets for Simple types • External Schema import, include, redefine
  • 60. XML Schema Inference • Create an XML schema from an XML document instance • Every document has an (implicit) schema • Tools like Relaxer, Trang, as well as the System.Xml.Serializer the .NET Framework can all infer XML Schemata from document instances
  • 65. XML::Pastor Code Generation • Write out static code to tree of .pm files • Write out static code to single .pm file • Create code in a scalar in memory • Create code and eval() it for use
  • 66. How Pastor works Code generation • Parse schemata into schema model • Perl data structures containing all the global elements, types, attributes, ... • “Resolve” Model - determine class names, resolve references, etc • Create boilerplate code, write out / eval
  • 67. How Pastor works Generated classes • Each generated class (i.e. type) has classdata “XmlSchemaType” containing schema model • If the class isa SimpleType it may contain restriction facets • If the class isa ComplexType it will contain info about child elements and attributes
  • 68. How Pastor works In use • If classes generated offline, then “use” them, if online then they are already loaded • These classes have methods to create, retrieve, save object to/from XML • Manipulate/query data using OO API to complexType fields • Validate modified objects against schema
  • 69. Thanks for coming See you next year http://search.cpan.org/dist/XML-Pastor/