Your SlideShare is downloading. ×
PMML - Predictive Model Markup Language
Upcoming SlideShare
Loading in...5

Thanks for flagging this SlideShare!

Oops! An error has occurred.

Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

PMML - Predictive Model Markup Language


Published on

PMML (Predictive Model Markup Language) provides a standard way to represent data mining models so that these can be shared between different statistical applications. Not only PMML can represent a …

PMML (Predictive Model Markup Language) provides a standard way to represent data mining models so that these can be shared between different statistical applications. Not only PMML can represent a wide range of statistical techniques, but it can also be used to represent the data transformations necessary to transform raw data into meaningful feature detectors. In this way, PMML offers a standard to represent data manipulation and modeling in a single and concise way.

Published in: Technology

  • Be the first to comment

No Downloads
Total Views
On Slideshare
From Embeds
Number of Embeds
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

No notes for slide


  • 1. PMML Overview
  • 2. PMML defines a standard not only to represent data-mining models, but also data handling and data transformations (pre- and post-processing) PMML Predictive Model Markup Language Transformations
    • PMML is an XML-based language used to define statistical and data mining models and to share these between compliant applications.
    • It is a mature standard developed by the DMG (Data Mining Group) to avoid proprietary issues and incompatibilities and to deploy models.
    • PMML allows for the clear separation of tasks: Model development vs. model deployment. As a consequence, scientists can focus on building the best model.
    • PMML eliminates need for custom model deployment and ensures scalability and reliability.
  • 3. Matured and Supported by Industry PMML PMML Industry Support
    • Data Mining Group http://
    • Mature standard
      • Current version 4.0 (just released)
      • Active group and constant enhancements
    • Vendor independent consortium
    • Industry supporters
      • Major Players: IBM, Oracle, SAP, Microsoft
      • Analytics: KXEN, SAS, Salford, SPSS, Zementis
      • BI: Microstrategy, Teradata, Tibco
      • Open Source: KNIME, R, Rapid-I
      • Others: Equifax, FICO, Open Data Group, Visa, Pervasive
  • 4. PMML Components
    • A Data Dictionary defines all the raw data fields (including missing value strategy and outlier treatment).
    • Several Data Transformations strategies allow for intelligent extraction of feature detectors from raw data (“data massaging”).
    • A comprehensive list of Data-Mining Models offers power and flexibility.
    • Post-processing of results allow for tailored decisions.
    • Model Explanation allows for performance evaluation.
  • 5. PMML Files 3) Post-Processing Scaling of model outputs can be performed with PMML element Targets 1) Pre-Processing PMML elements Transformations, Mining Schema and Functions allow for effective pre-processing 2) Models PMML allows for several predictive modeling techniques to be fully expressed PMML
  • 6. PMML: Data Pre-Processing
    • Data Dictionary : Allows for the explicit specification of valid, invalid and missing values.
    • Mining Schema : Used to define the appropriate treatment to be applied to missing and invalid values.
    • Transformations : Allow for variable discretization, normalization, and mapping with handling of missing and default values.
    • Built-in Functions : Arithmetic expressions, handling of date and time as well as strings. Also used for implementing IF-THEN-ELSE logic and Boolean operations.
    Data Pre-Processing 1
  • 7. Data Pre-Processing: PMML Example Arbitrary Piecewise Linear Function This PMML code implements: Var_b:=interpolate(Var_a,((100,0),(200,1),(800,3),(900,4))) See - look for element NormContinuous.
  • 8. Modeling Elements
    • PMML allows for several predictive modeling techniques to be expressed directly. Supported techniques which have their own elements are:
      • Regression and General Regression
      • Neural Networks
      • Support Vector Machines
      • Decision Trees
      • Naïve Bayes
      • Clustering
      • Sequences
      • Rule Sets
      • Association Rules
      • Time-Series (as of PMML 4.0)
      • Text Models
      • Support for Multiple Models
    Easy Expression of Predictive Models 2
  • 9. Modeling Elements: PMML Example for Neural Network
  • 10. The PMML code below implements score post-processing. It uses the PMML element Targets for checking boundaries ( min and max ) and to rescale ( rescaleConstant and rescaleFactor ) the original score generated by model See Data Post-Processing: PMML Example 3
  • 11. Applications Service Providers External Vendors Divisions One Standard, One Process
  • 12. PMML = Easy Model Deployment Model Deployment Model Building PMML
  • 13. PMML - Zementis Contributions
    • ADAPA : A decision engine that deploys models expressed in PMML and executes them in real-time. Now available as a service on the Amazon Cloud.
    • PMML Converter : Validates, converts, and corrects old and new PMML code. Available at the DMG website and at .
    • Contributing Member of the DMG : Submitted several proposals for PMML 4.0 and already working with other members on PMML 4.1.
    • Code contributor for the R PMML package (available on CRAN).
    • PMML Articles : R Journal and SIGKDD Explorations Newsletter. Available for downloading at http://
    • PMML Blogs : Several blogs on PMML topics ( and http :// ).
  • 14. Thank You! U.S.A Headquarters Asia Office E-mail: [email_address] 19/F., Unit A Ho Lee Commercial Building 38-44 D’Aguilar Street Central, Hong Kong (S.A.R.) Tel: +852 2868-0878 Fax: +852 2845-6027 6125 Cornerstone Court East Suite 250 San Diego, CA, 92121 Tel: +1 619 330-0780 Fax: +1 858 535-0227