A Short PMML Tutorial by LatentView

Published in: Technology

  1. PMML Tutorial. Ramesh Hariharan, 12-Feb-2009. www.LatentView.com, www.latentview.com/blog
     This presentation is solely for the use of LatentView. No part of this presentation may be circulated, quoted, or reproduced for distribution without prior written approval from LatentView.
  2. Agenda • PMML Overview • Constructing a PMML • XSD Overview • Reading the PMML Specification • Next Steps… LatentView Analytics Pvt. Ltd (Confidential)
  4. PMML Overview
     PMML: Predictive Model Markup Language
     • Used for model scoring
     • An XML document
     • Owned by the DMG (Data Mining Group), a consortium led by SPSS, SAS, IBM, Microsoft, Oracle and others
     • Currently in version 3.2 (as of February 2009)
     Advantages of PMML
     • Portability of models
     • Metadata standardization
     • Model once, score anywhere (MOSA ☺)
     Drawbacks of PMML
     • Least common denominator
     • Potential loss of precision
     • Lack of support for complex transformations
     • Lack of support from tools
     Some of the model types supported
     • Association Rules, Clustering, General Regression, Naïve Bayes, Neural Networks, Support Vector Machines
     Capabilities of PMML
     • Model composition: model sequencing and model selection
     • Built-in and user-defined functions
     • Usual data types: date, numbers, category
     • Model verification: sample results for testing
     • Output fields: create output tables based on the models
     • Extension mechanisms
  5. PMML in the Decision Management Architecture
     [Diagram. Recoverable labels include Operational Systems, Model Repository, Enterprise Decision Engine, and Enterprise Data.]
  6. Agenda • PMML Overview • Constructing a PMML • XSD Overview • Reading the PMML Specification • Next Steps…
  7. Constructing a PMML
         <?xml version="1.0"?>
         <PMML version="3.2"
               xmlns="http://www.dmg.org/PMML-3_2"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
           <Header copyright="Example.com"/>
           <DataDictionary> ... </DataDictionary>
           ... a model ...
         </PMML>
     References:
     • www.dmg.org
     • http://dmg.org/v3-2/GeneralStructure.html
     • http://dmg.org/v3-2/pmml-3-2.xsd
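
The skeleton on this slide can also be generated programmatically. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module; the helper name `build_skeleton` and the two sample fields are illustrative choices, and the model element is left out just as it is on the slide.

```python
import xml.etree.ElementTree as ET

PMML_NS = "http://www.dmg.org/PMML-3_2"  # namespace URI shown on the slide


def build_skeleton(copyright_text, fields):
    """Return a PMML 3.2 skeleton (no model element) as a string.

    `fields` is a list of (name, optype, dataType) tuples.
    build_skeleton is a hypothetical helper, not part of any PMML library.
    """
    # Setting xmlns as a plain attribute is a shortcut that keeps the
    # serialized output free of generated "ns0:" prefixes.
    root = ET.Element("PMML", {"version": "3.2", "xmlns": PMML_NS})
    ET.SubElement(root, "Header", {"copyright": copyright_text})
    dictionary = ET.SubElement(
        root, "DataDictionary", {"numberOfFields": str(len(fields))}
    )
    for name, optype, data_type in fields:
        ET.SubElement(
            dictionary,
            "DataField",
            {"name": name, "optype": optype, "dataType": data_type},
        )
    return ET.tostring(root, encoding="unicode")


skeleton = build_skeleton(
    "Example.com",
    [("ind_Sale", "continuous", "double"), ("STATE", "categorical", "string")],
)
print(skeleton)
```

A document built this way is only a starting point: a model element still has to be added before the file describes anything that can be scored.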
  9. Agenda • PMML Overview • Constructing a PMML • XSD Overview • Reading the PMML Specification • Next Steps…
  10. XSD Overview
      XSD: XML Schema Definition
      The purpose of an XML Schema is to define the legal building blocks of an XML document, just like a DTD. An XML Schema:
      • defines elements that can appear in a document
      • defines attributes that can appear in a document
      • defines which elements are child elements
      • defines the order of child elements
      • defines the number of child elements
      • defines whether an element is empty or can include text
      • defines data types for elements and attributes
      • defines default and fixed values for elements and attributes
  11. A First Example
      Look at this simple XML document called "note.xml":
          <?xml version="1.0"?>
          <note>
            <to>Tove</to>
            <from>Jani</from>
            <heading>Reminder</heading>
            <body>Don't forget me this weekend!</body>
          </note>
      The XML Schema for the same document:
          <?xml version="1.0"?>
          <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
                     targetNamespace="http://www.w3schools.com"
                     xmlns="http://www.w3schools.com"
                     elementFormDefault="qualified">
            <xs:element name="note">
              <xs:complexType>
                <xs:sequence>
                  <xs:element name="to" type="xs:string"/>
                  <xs:element name="from" type="xs:string"/>
                  <xs:element name="heading" type="xs:string"/>
                  <xs:element name="body" type="xs:string"/>
                </xs:sequence>
              </xs:complexType>
            </xs:element>
          </xs:schema>
  12. Simple Elements
          <xs:element name="xxx" type="yyy"/>
      XML Schema has a lot of built-in data types. The most common types are:
      • xs:string
      • xs:decimal
      • xs:integer
      • xs:boolean
      • xs:date
      • xs:time
      Example elements:
          <lastname>Refsnes</lastname>
          <age>36</age>
          <dateborn>1970-03-27</dateborn>
      The corresponding declarations:
          <xs:element name="lastname" type="xs:string"/>
          <xs:element name="age" type="xs:integer"/>
          <xs:element name="dateborn" type="xs:date"/>
  13. XSD Attributes
      Simple elements cannot have attributes. If an element has attributes, it is considered to be of a complex type. But the attribute itself is always declared as a simple type.
          <xs:attribute name="xxx" type="yyy"/>
      where xxx is the name of the attribute and yyy specifies the data type of the attribute.
      XML Schema has a lot of built-in data types. The most common types are:
      • xs:string
      • xs:decimal
      • xs:integer
      • xs:boolean
      • xs:date
      • xs:time
      Example:
          <lastname lang="EN">Smith</lastname>
          <xs:attribute name="lang" type="xs:string"/>
  14. Simple Elements: Restrictions
      Restrictions are used to define acceptable values for XML elements or attributes. Restrictions on XML elements are called facets.
      Restrictions on values:
          <xs:element name="age">
            <xs:simpleType>
              <xs:restriction base="xs:integer">
                <xs:minInclusive value="0"/>
                <xs:maxInclusive value="120"/>
              </xs:restriction>
            </xs:simpleType>
          </xs:element>
      Restrictions on a set of values:
          <xs:element name="car" type="carType"/>
          <xs:simpleType name="carType">
            <xs:restriction base="xs:string">
              <xs:enumeration value="Audi"/>
              <xs:enumeration value="Golf"/>
              <xs:enumeration value="BMW"/>
            </xs:restriction>
          </xs:simpleType>
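
To make the two facets above concrete, here is a rough Python sketch of the checks a schema validator would perform for `age` and `car`. It is hand-written for illustration and is not a schema validator; the function names `check_age` and `check_car` are assumptions introduced here.

```python
import re

# Allowed values copied from the xs:enumeration facets of carType.
CAR_TYPES = {"Audi", "Golf", "BMW"}


def check_age(text):
    """Mirror xs:integer with minInclusive=0 and maxInclusive=120."""
    # Accept only an optional sign followed by ASCII digits.
    if not re.fullmatch(r"[+-]?[0-9]+", text):
        return False
    return 0 <= int(text) <= 120


def check_car(text):
    """Mirror the enumeration restriction on carType."""
    return text in CAR_TYPES


for sample in ("36", "121", "abc"):
    print(sample, check_age(sample))
for sample in ("BMW", "Volvo"):
    print(sample, check_car(sample))
```

In practice the schema itself should do this work; the sketch only shows what "acceptable values" means for each restriction.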
  15. Complex Elements
          <employee>
            <firstname>John</firstname>
            <lastname>Smith</lastname>
          </employee>
      Declared with a named type:
          <xs:element name="employee" type="personinfo"/>
          <xs:complexType name="personinfo">
            <xs:sequence>
              <xs:element name="firstname" type="xs:string"/>
              <xs:element name="lastname" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
      Or declared with the type inline:
          <xs:element name="employee">
            <xs:complexType>
              <xs:sequence>
                <xs:element name="firstname" type="xs:string"/>
                <xs:element name="lastname" type="xs:string"/>
              </xs:sequence>
            </xs:complexType>
          </xs:element>
  16. More Complex Elements
      You can also base a complex element on an existing complex element and add some elements, like this:
          <xs:element name="employee" type="fullpersoninfo"/>
          <xs:complexType name="personinfo">
            <xs:sequence>
              <xs:element name="firstname" type="xs:string"/>
              <xs:element name="lastname" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
          <xs:complexType name="fullpersoninfo">
            <xs:complexContent>
              <xs:extension base="personinfo">
                <xs:sequence>
                  <xs:element name="address" type="xs:string"/>
                  <xs:element name="city" type="xs:string"/>
                  <xs:element name="country" type="xs:string"/>
                </xs:sequence>
              </xs:extension>
            </xs:complexContent>
          </xs:complexType>
  17. XSD Indicators
      Indicators control how elements are used within a complex type. There are seven indicators:
      Order indicators:
      • All
      • Choice
      • Sequence
      Occurrence indicators:
      • maxOccurs
      • minOccurs
      Group indicators:
      • Group name
      • attributeGroup name
  18. Complex Type: Example
      Let's have a look at this XML document called "shiporder.xml":
          <?xml version="1.0" encoding="ISO-8859-1"?>
          <shiporder orderid="889923"
                     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                     xsi:noNamespaceSchemaLocation="shiporder.xsd">
            <orderperson>John Smith</orderperson>
            <shipto>
              <name>Ola Nordmann</name>
              <address>Langgt 23</address>
              <city>4000 Stavanger</city>
              <country>Norway</country>
            </shipto>
            <item>
              <title>Empire Burlesque</title>
              <note>Special Edition</note>
              <quantity>1</quantity>
              <price>10.90</price>
            </item>
            <item>
              <title>Hide your heart</title>
              <quantity>1</quantity>
              <price>9.90</price>
            </item>
          </shiporder>
  19. Complex Type: Example Solution
      The XSD for the file:
          <?xml version="1.0" encoding="ISO-8859-1" ?>
          <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
            <xs:simpleType name="stringtype">
              <xs:restriction base="xs:string"/>
            </xs:simpleType>
            <xs:simpleType name="inttype">
              <xs:restriction base="xs:positiveInteger"/>
            </xs:simpleType>
            <xs:simpleType name="dectype">
              <xs:restriction base="xs:decimal"/>
            </xs:simpleType>
            <xs:simpleType name="orderidtype">
              <xs:restriction base="xs:string">
                <xs:pattern value="[0-9]{6}"/>
              </xs:restriction>
            </xs:simpleType>
            <xs:complexType name="shiptotype">
              <xs:sequence>
                <xs:element name="name" type="stringtype"/>
                <xs:element name="address" type="stringtype"/>
                <xs:element name="city" type="stringtype"/>
                <xs:element name="country" type="stringtype"/>
              </xs:sequence>
            </xs:complexType>
      (continued on the next slide)
  20. Complex Type: Example Solution
      The XSD for the file (continued from the previous slide):
            <xs:complexType name="itemtype">
              <xs:sequence>
                <xs:element name="title" type="stringtype"/>
                <xs:element name="note" type="stringtype" minOccurs="0"/>
                <xs:element name="quantity" type="inttype"/>
                <xs:element name="price" type="dectype"/>
              </xs:sequence>
            </xs:complexType>
            <xs:complexType name="shipordertype">
              <xs:sequence>
                <xs:element name="orderperson" type="stringtype"/>
                <xs:element name="shipto" type="shiptotype"/>
                <xs:element name="item" maxOccurs="unbounded" type="itemtype"/>
              </xs:sequence>
              <xs:attribute name="orderid" type="orderidtype" use="required"/>
            </xs:complexType>
            <xs:element name="shiporder" type="shipordertype"/>
          </xs:schema>
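
As a rough illustration of what the shiporder schema enforces, the sketch below re-implements a few of its rules by hand in Python: the six-digit `orderid` pattern, the order of the top-level children, the positive-integer `quantity`, and the decimal `price`. It is not a substitute for validating against the XSD, it covers only some of the rules, and the function name `check_shiporder` is an assumption made for this example. The XML declaration and `xsi` attributes are omitted from the embedded document for brevity.

```python
import re
import xml.etree.ElementTree as ET
from decimal import Decimal, InvalidOperation

SHIPORDER_XML = """
<shiporder orderid="889923">
  <orderperson>John Smith</orderperson>
  <shipto>
    <name>Ola Nordmann</name>
    <address>Langgt 23</address>
    <city>4000 Stavanger</city>
    <country>Norway</country>
  </shipto>
  <item>
    <title>Empire Burlesque</title>
    <note>Special Edition</note>
    <quantity>1</quantity>
    <price>10.90</price>
  </item>
  <item>
    <title>Hide your heart</title>
    <quantity>1</quantity>
    <price>9.90</price>
  </item>
</shiporder>
"""


def check_shiporder(xml_text):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    root = ET.fromstring(xml_text)

    # orderidtype: xs:string restricted by the pattern [0-9]{6}
    if not re.fullmatch(r"[0-9]{6}", root.get("orderid", "")):
        problems.append("orderid must be exactly six digits")

    # shipordertype: orderperson, shipto, then one or more item elements
    tags = [child.tag for child in root]
    if tags[:2] != ["orderperson", "shipto"] or set(tags[2:]) != {"item"}:
        problems.append("expected orderperson, shipto, then one or more items")

    for index, item in enumerate(root.findall("item"), start=1):
        # inttype: xs:positiveInteger
        quantity = item.findtext("quantity", default="")
        if not re.fullmatch(r"\+?[0-9]+", quantity) or int(quantity) < 1:
            problems.append(f"item {index}: quantity must be a positive integer")
        # dectype: xs:decimal (approximated here with Python's Decimal)
        try:
            Decimal(item.findtext("price", default=""))
        except InvalidOperation:
            problems.append(f"item {index}: price must be a decimal")
    return problems


print(check_shiporder(SHIPORDER_XML))
```

Note that the second item has no `note` element; that is allowed because the schema declares `note` with `minOccurs="0"`.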
  21. Agenda • PMML Overview • Constructing a PMML • XSD Overview • Reading the PMML Specification • Next Steps…
  22. PMML: Headers
          <Header copyright="Copyright (c) 2009 LatentView"
                  description="LatentView Logit Model v1.0">
            <Extension name="timestamp" value="2009-01-19 19:38:13" extender="Rattle" />
            <Extension name="description" value="Administrator" extender="Rattle" />
            <Application name="Rattle/PMML" version="1.2.0" />
          </Header>
  23. PMML: Data Dictionary
          <DataDictionary numberOfFields="23">
            <DataField name="ind_Sale" optype="continuous" dataType="double" />
            …
            <DataField name="STATE" optype="categorical" dataType="string" />
          </DataDictionary>
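
A scoring application typically starts by reading the data dictionary to learn the field names and types. Here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The fragment is shortened to the two fields that are visible on the slide (so `numberOfFields` is set to 2 here, not 23), it omits the PMML namespace for brevity, and `read_data_dictionary` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Two-field fragment for illustration; the slide's dictionary declares
# 23 fields, most of which are elided there.
DATA_DICTIONARY_XML = """
<DataDictionary numberOfFields="2">
  <DataField name="ind_Sale" optype="continuous" dataType="double"/>
  <DataField name="STATE" optype="categorical" dataType="string"/>
</DataDictionary>
"""


def read_data_dictionary(xml_text):
    """Return {field name: (optype, dataType)} and check numberOfFields."""
    dictionary = ET.fromstring(xml_text)
    fields = {
        field.get("name"): (field.get("optype"), field.get("dataType"))
        for field in dictionary.findall("DataField")
    }
    # numberOfFields is compared against the actual count when present.
    declared = dictionary.get("numberOfFields")
    if declared is not None and int(declared) != len(fields):
        raise ValueError(
            f"numberOfFields={declared} but {len(fields)} DataField elements found"
        )
    return fields


fields = read_data_dictionary(DATA_DICTIONARY_XML)
for name, (optype, data_type) in fields.items():
    print(f"{name}: {optype} / {data_type}")
```

The count check is an example of a consistency rule that is easy to break when a PMML file is edited by hand.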
  24. PMML Transformations
      PMML defines various kinds of simple data transformations:
      • Normalization: map values to numbers; the input can be continuous or discrete.
      • Discretization: map continuous values to discrete values.
      • Value mapping: map discrete values to discrete values.
      • Functions: derive a value by applying a function to one or more parameters.
      • Aggregation: summarize or collect groups of values, e.g., compute an average.
      Value mapping:
          <DerivedField name="ETHNICGROUPCODE_02" optype="ordinal" dataType="integer">
            <MapValues outputColumn="derived" defaultValue="0" mapMissingTo="0">
              <FieldColumnPair field="ETHNICGROUPCODE" column="original" />
              <InlineTable>
                <row>
                  <original>02</original>
                  <derived>1</derived>
                </row>
              </InlineTable>
            </MapValues>
          </DerivedField>
      Built-in function:
          <DerivedField name="I1EXACTAGE_dr" optype="continuous" dataType="double">
            <Apply function="sum">
              <FieldRef field="I1EXACTAGE"/>
              <FieldRef field="I1ESTIMATEDAGE"/>
            </Apply>
          </DerivedField>
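
To show how a scoring engine might interpret the two derived fields above, here is a small Python sketch. It handles only a single-column `MapValues` and an `Apply` with `function="sum"`, which is far from the full PMML transformation language; the function name `evaluate` and the record layout (a plain dict of field values) are assumptions for illustration, and the PMML namespace is omitted.

```python
import xml.etree.ElementTree as ET

MAP_VALUES_XML = """
<DerivedField name="ETHNICGROUPCODE_02" optype="ordinal" dataType="integer">
  <MapValues outputColumn="derived" defaultValue="0" mapMissingTo="0">
    <FieldColumnPair field="ETHNICGROUPCODE" column="original"/>
    <InlineTable>
      <row><original>02</original><derived>1</derived></row>
    </InlineTable>
  </MapValues>
</DerivedField>
"""

APPLY_SUM_XML = """
<DerivedField name="I1EXACTAGE_dr" optype="continuous" dataType="double">
  <Apply function="sum">
    <FieldRef field="I1EXACTAGE"/>
    <FieldRef field="I1ESTIMATEDAGE"/>
  </Apply>
</DerivedField>
"""


def evaluate(derived_field, record):
    """Evaluate one DerivedField element against a dict of input values."""
    map_values = derived_field.find("MapValues")
    if map_values is not None:
        pair = map_values.find("FieldColumnPair")
        value = record.get(pair.get("field"))
        # Results are cast with int() because this example declares
        # dataType="integer"; a general engine would honor dataType.
        if value is None:
            return int(map_values.get("mapMissingTo"))
        for row in map_values.iter("row"):
            if row.findtext(pair.get("column")) == value:
                return int(row.findtext(map_values.get("outputColumn")))
        return int(map_values.get("defaultValue"))

    apply = derived_field.find("Apply")
    if apply is not None and apply.get("function") == "sum":
        refs = apply.findall("FieldRef")
        return sum(float(record[ref.get("field")]) for ref in refs)

    raise NotImplementedError("only MapValues and Apply/sum are sketched here")


flag = ET.fromstring(MAP_VALUES_XML)
total = ET.fromstring(APPLY_SUM_XML)
print(evaluate(flag, {"ETHNICGROUPCODE": "02"}))
print(evaluate(total, {"I1EXACTAGE": 30, "I1ESTIMATEDAGE": 5}))
```

The value mapping acts as an indicator variable: it yields 1 when the code is "02" and falls back to `defaultValue` or `mapMissingTo` otherwise.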
  25. PMML: Mining Schema
  26. PMML: Mining Schema
          <MiningSchema>
            <MiningField name="ind_Sale" usageType="predicted"
                         missingValueReplacement="-1" missingValueTreatment="asValue" />
            <MiningField name="I1ESTIMATEDAGE" usageType="active"
                         missingValueReplacement="-1" missingValueTreatment="asValue"/>
            <MiningField name="I2ESTIMATEDAGE" usageType="active"
                         missingValueReplacement="-1" missingValueTreatment="asValue"/>
            …
            <MiningField name="I1EXACTAGE" usageType="active"
                         missingValueReplacement="-1" missingValueTreatment="asValue"/>
          </MiningSchema>
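
The mining schema tells the scorer which fields are inputs and what to substitute when a value is missing. Below is a minimal Python sketch of that step, using a shortened three-field version of the schema above without the PMML namespace. The helper name `prepare_inputs` is hypothetical, and casting the replacement to `float` is an assumption that holds only for numeric inputs.

```python
import xml.etree.ElementTree as ET

MINING_SCHEMA_XML = """
<MiningSchema>
  <MiningField name="ind_Sale" usageType="predicted"
               missingValueReplacement="-1" missingValueTreatment="asValue"/>
  <MiningField name="I1ESTIMATEDAGE" usageType="active"
               missingValueReplacement="-1" missingValueTreatment="asValue"/>
  <MiningField name="I1EXACTAGE" usageType="active"
               missingValueReplacement="-1" missingValueTreatment="asValue"/>
</MiningSchema>
"""


def prepare_inputs(schema_xml, record):
    """Return the active fields of `record`, with missing values replaced."""
    schema = ET.fromstring(schema_xml)
    prepared = {}
    for field in schema.findall("MiningField"):
        # usageType defaults to "active" when the attribute is absent.
        if field.get("usageType", "active") != "active":
            continue
        name = field.get("name")
        value = record.get(name)
        if value is None:
            replacement = field.get("missingValueReplacement")
            # Assumption: inputs are numeric, so the replacement is a float.
            value = float(replacement) if replacement is not None else None
        prepared[name] = value
    return prepared


print(prepare_inputs(MINING_SCHEMA_XML, {"I1EXACTAGE": 42.0}))
```

The predicted field `ind_Sale` is skipped because it is the model's target, not an input.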
  27. Agenda • PMML Overview • Constructing a PMML • XSD Overview • Reading the PMML Specification • Next Steps…
  28. Next Steps
      • Create a PMML file from your models: one each for Logistic, Clustering and Decision Tree models
      • Build PMML manually, and validate it using an XML editor such as XMLFox (a syntactically valid PMML may not be logically valid)
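
The gap between syntactic and logical validity can be illustrated with one simple cross-check: every `MiningField` should name a field declared in the `DataDictionary`. The sketch below is one such check, written with Python's standard library (the `{*}` namespace wildcard needs Python 3.8 or later). The abbreviated `RegressionModel` fragment, the deliberately misspelled field name, and the function name `undeclared_mining_fields` are all made up for this example.

```python
import xml.etree.ElementTree as ET

# Well-formed XML, but the second MiningField name contains a typo
# ("I1EXACTAGEE"), so it does not match any DataField.
# The model element is abbreviated and would not pass schema validation.
PMML_XML = """
<PMML version="3.2" xmlns="http://www.dmg.org/PMML-3_2">
  <Header copyright="Example.com"/>
  <DataDictionary numberOfFields="2">
    <DataField name="ind_Sale" optype="continuous" dataType="double"/>
    <DataField name="I1EXACTAGE" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="ind_Sale" usageType="predicted"/>
      <MiningField name="I1EXACTAGEE" usageType="active"/>
    </MiningSchema>
  </RegressionModel>
</PMML>
"""


def undeclared_mining_fields(xml_text):
    """Return MiningField names that have no matching DataField."""
    # ET.fromstring raises ParseError if the text is not well-formed XML.
    root = ET.fromstring(xml_text)
    declared = {f.get("name") for f in root.findall(".//{*}DataField")}
    return [
        f.get("name")
        for f in root.findall(".//{*}MiningField")
        if f.get("name") not in declared
    ]


print(undeclared_mining_fields(PMML_XML))
```

An XML editor would accept this file as well-formed, which is why a separate logical check along these lines is worth running on hand-built PMML.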
  29. Thank You!
      Chennai: JVL Plaza, Ground Floor, 626 Anna Salai, Teynampet, Chennai – 600 018. Phone: +91-44-4509 4039/40
      New York: 80, Broad Street, 5th Floor, New York, NY 10004. Phone: +1-212-837-7874