Your SlideShare is downloading. ×
PMML - Predictive Model Markup Language
Upcoming SlideShare
Loading in...5

Thanks for flagging this SlideShare!

Oops! An error has occurred.


Introducing the official SlideShare app

Stunning, full-screen experience for iPhone and Android

Text the download link to your phone

Standard text messaging rates apply

PMML - Predictive Model Markup Language


Published on

PMML (Predictive Model Markup Language) provides a standard way to represent data mining models so that these can be shared between different statistical applications. Not only PMML can represent a …

PMML (Predictive Model Markup Language) provides a standard way to represent data mining models so that these can be shared between different statistical applications. Not only PMML can represent a wide range of statistical techniques, but it can also be used to represent the data transformations necessary to transform raw data into meaningful feature detectors. In this way, PMML offers a standard to represent data manipulation and modeling in a single and concise way.

Published in: Technology

  • Be the first to comment

No Downloads
Total Views
On Slideshare
From Embeds
Number of Embeds
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

No notes for slide


  • 1. PMML Overview
  • 2. PMML defines a standard not only to represent data-mining models, but also data handling and data transformations (pre- and post-processing) PMML Predictive Model Markup Language Transformations
    • PMML is an XML-based language used to define statistical and data mining models and to share these between compliant applications.
    • It is a mature standard developed by the DMG (Data Mining Group) to avoid proprietary issues and incompatibilities and to deploy models.
    • PMML allows for the clear separation of tasks: Model development vs. model deployment. As a consequence, scientists can focus on building the best model.
    • PMML eliminates need for custom model deployment and ensures scalability and reliability.
  • 3. Matured and Supported by Industry PMML PMML Industry Support
    • Data Mining Group http://
    • Mature standard
      • Current version 4.0 (just released)
      • Active group and constant enhancements
    • Vendor independent consortium
    • Industry supporters
      • Major Players: IBM, Oracle, SAP, Microsoft
      • Analytics: KXEN, SAS, Salford, SPSS, Zementis
      • BI: Microstrategy, Teradata, Tibco
      • Open Source: KNIME, R, Rapid-I
      • Others: Equifax, FICO, Open Data Group, Visa, Pervasive
  • 4. PMML Components
    • A Data Dictionary defines all the raw data fields (including missing value strategy and outlier treatment).
    • Several Data Transformations strategies allow for intelligent extraction of feature detectors from raw data (“data massaging”).
    • A comprehensive list of Data-Mining Models offers power and flexibility.
    • Post-processing of results allow for tailored decisions.
    • Model Explanation allows for performance evaluation.
  • 5. PMML Files 3) Post-Processing Scaling of model outputs can be performed with PMML element Targets 1) Pre-Processing PMML elements Transformations, Mining Schema and Functions allow for effective pre-processing 2) Models PMML allows for several predictive modeling techniques to be fully expressed PMML
  • 6. PMML: Data Pre-Processing
    • Data Dictionary : Allows for the explicit specification of valid, invalid and missing values.
    • Mining Schema : Used to define the appropriate treatment to be applied to missing and invalid values.
    • Transformations : Allow for variable discretization, normalization, and mapping with handling of missing and default values.
    • Built-in Functions : Arithmetic expressions, handling of date and time as well as strings. Also used for implementing IF-THEN-ELSE logic and Boolean operations.
    Data Pre-Processing 1
  • 7. Data Pre-Processing: PMML Example Arbitrary Piecewise Linear Function This PMML code implements: Var_b:=interpolate(Var_a,((100,0),(200,1),(800,3),(900,4))) See - look for element NormContinuous.
  • 8. Modeling Elements
    • PMML allows for several predictive modeling techniques to be expressed directly. Supported techniques which have their own elements are:
      • Regression and General Regression
      • Neural Networks
      • Support Vector Machines
      • Decision Trees
      • Naïve Bayes
      • Clustering
      • Sequences
      • Rule Sets
      • Association Rules
      • Time-Series (as of PMML 4.0)
      • Text Models
      • Support for Multiple Models
    Easy Expression of Predictive Models 2
  • 9. Modeling Elements: PMML Example for Neural Network
  • 10. The PMML code below implements score post-processing. It uses the PMML element Targets for checking boundaries ( min and max ) and to rescale ( rescaleConstant and rescaleFactor ) the original score generated by model See Data Post-Processing: PMML Example 3
  • 11. Applications Service Providers External Vendors Divisions One Standard, One Process
  • 12. PMML = Easy Model Deployment Model Deployment Model Building PMML
  • 13. PMML - Zementis Contributions
    • ADAPA : A decision engine that deploys models expressed in PMML and executes them in real-time. Now available as a service on the Amazon Cloud.
    • PMML Converter : Validates, converts, and corrects old and new PMML code. Available at the DMG website and at .
    • Contributing Member of the DMG : Submitted several proposals for PMML 4.0 and already working with other members on PMML 4.1.
    • Code contributor for the R PMML package (available on CRAN).
    • PMML Articles : R Journal and SIGKDD Explorations Newsletter. Available for downloading at http://
    • PMML Blogs : Several blogs on PMML topics ( and http :// ).
  • 14. Thank You! U.S.A Headquarters Asia Office E-mail: [email_address] 19/F., Unit A Ho Lee Commercial Building 38-44 D’Aguilar Street Central, Hong Kong (S.A.R.) Tel: +852 2868-0878 Fax: +852 2845-6027 6125 Cornerstone Court East Suite 250 San Diego, CA, 92121 Tel: +1 619 330-0780 Fax: +1 858 535-0227