Easy Expression and Execution of Data Mining Models through PMML

  1. Easy Expression and Execution of Data Mining Models through PMML
     Alex Guazzelli, Ph.D., Director of Analytics, Zementis, Inc.
     Forum on Analytics, November 12, 2008
     Zementis ©
  2. Development, Deployment, and Execution of Predictive Models
     Open Standards
      • Development: R allows for reliable data manipulation and model building.
      • Deployment: PMML allows for easy expression and deployment of data transformations and data-mining models.
      • Execution: Real-time execution of models via web-services calls.
  3. Model Development: The R Project
      • R is an integrated suite of software facilities for data manipulation, calculation and graphical display.
      • R provides a wide variety of statistical techniques and is highly extensible.
      • R is similar to the S language and environment developed at Bell Labs.
      • It is Open Source and a GNU project.
      • R is available for free at http://www.r-project.org/
  4. Deployment: Predictive Model Markup Language (PMML)
      • PMML is an XML-based language to define statistical and data mining models and to share models between compliant applications.
      • Standard for exchange of models to avoid proprietary issues and incompatibilities and to deploy models in operational infrastructure.
      • Clear separation of tasks: model development vs. model deployment.
      • Scientists focus on building the best model.
      • Eliminates need for custom model deployment.
      • Ensures scalability and reliability.
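To make "XML-based language to define models" concrete, here is a minimal Python sketch that reads a small regression model written in PMML and lists what it declares. The PMML fragment is hand-written for illustration and is not taken from the slides: the field names, the model name `risk_model`, and the coefficients are all made up, and the `xmlns` declaration and `Header` element that a real PMML file carries are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Hand-written, illustrative PMML fragment (hypothetical fields and numbers).
# A real file would also declare the PMML namespace and a Header element.
PMML_TEXT = """
<PMML version="3.2">
  <DataDictionary numberOfFields="3">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="income" optype="continuous" dataType="double"/>
    <DataField name="risk" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel modelName="risk_model" functionName="regression">
    <MiningSchema>
      <MiningField name="age"/>
      <MiningField name="income"/>
      <MiningField name="risk" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="0.5">
      <NumericPredictor name="age" coefficient="0.01"/>
      <NumericPredictor name="income" coefficient="-0.002"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""


def describe(pmml_text):
    """Summarize the data fields and the model declared in a PMML string."""
    root = ET.fromstring(pmml_text.strip())
    # Every raw field is declared once in the DataDictionary.
    fields = {f.get("name"): f.get("dataType") for f in root.iter("DataField")}
    model = root.find("RegressionModel")
    return {
        "version": root.get("version"),
        "model": model.get("modelName"),
        "function": model.get("functionName"),
        "fields": fields,
    }


summary = describe(PMML_TEXT)
print(summary)
```

Because the model is plain XML, any application with an XML parser can inspect it, which is the point of sharing models between compliant applications.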
  5. PMML Industry Support: Matured and Supported by Industry
      • Data Mining Group: http://www.dmg.org
      • Mature standard, current version 3.2
      • Active group and constant enhancements
      • Vendor independent consortium
      • Industry supporters:
         Major Players: IBM, Oracle, SAP, Microsoft
         Analytics: SAS, SPSS, Fair Isaac, Zementis
         Business Intelligence: MicroStrategy, Teradata
         Open Source: R
  6. PMML: Bringing Data and Models Together (Predictive Model Markup Language)
     PMML defines a standard not only to represent data-mining models, but also data handling and data transformations (pre- and post-processing).
      • A Data Dictionary defines all the raw data fields (including missing value strategy and outlier treatment).
      • Several Data Transformation strategies allow for intelligent extraction of feature detectors from raw data ("data massaging").
      • A comprehensive list of Data-Mining Models offers power and flexibility.
      • Post-processing of results allows for tailored decisions.
     Data Transformations and Data-Mining Models come together in PMML.
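The four stages named on this slide (data handling, transformation, model, post-processing) can be sketched as a short chain of Python functions. This illustrates the flow only, not PMML syntax or any tool's API: every function name, default value, range, coefficient, and threshold below is an assumption chosen for the example.

```python
def replace_missing(value, default):
    """Data handling: substitute a default when the raw value is missing."""
    return default if value is None else value


def clip_outlier(value, low, high):
    """Data handling: pull values outside [low, high] back to the nearest bound."""
    return max(low, min(high, value))


def normalize(value, low, high):
    """Transformation: rescale a raw value to the range 0..1."""
    return (value - low) / (high - low)


def linear_score(features, intercept, coefficients):
    """Model: a plain linear regression over the derived features."""
    return intercept + sum(coefficients[name] * v for name, v in features.items())


def decide(score, threshold):
    """Post-processing: turn the numeric score into a decision."""
    return "review" if score >= threshold else "approve"


def run(record):
    # Hypothetical settings: default age 40, valid range 18..90.
    age = replace_missing(record.get("age"), 40.0)
    age = clip_outlier(age, 18.0, 90.0)
    features = {"age_norm": normalize(age, 18.0, 90.0)}
    score = linear_score(features, 0.2, {"age_norm": 0.6})
    return score, decide(score, 0.5)


print(run({"age": None}))   # missing value falls back to the default
print(run({"age": 120.0}))  # outlier is clipped to the upper bound
```

In a PMML file all of these settings would travel with the model itself, so the deployment side does not need to re-implement them by hand.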
  9. Got Models…
     Data Analysis, Statistical Model, PMML Export. What Now?
  10. Execution: The ADAPA Example (Predictive Analytics Scoring Engine)
      • Data transformations and model execution in real-time (via web-services calls) or batch-mode.
      • Environment to manage and deploy many predictive models or rule sets.
      • Framework for SOA-based IT integration.
      • Completely standards based and easily integrated with any existing infrastructure.
      • Not a model building environment.
  13. 1 through 6: From Raw Data to Smart Decisions
      1 Data Extraction and Analysis
      2 Model Building
      3 PMML Export
      4 PMML Import
      5 Web-Service Calls
      6 Model Execution
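Steps 4 and 6 above (PMML import and model execution) can be sketched in a few lines of Python. This is a local stand-in, not ADAPA's interface: the PMML fragment, the field names `x1` and `x2`, and the functions `import_model` and `execute` are hypothetical, and the web-service call of step 5 is replaced by a direct function call. The fragment also omits the namespace, header, and data dictionary that a real exported file would contain.

```python
import xml.etree.ElementTree as ET

# Illustrative stand-in for a file produced in step 3 (PMML export).
PMML_TEXT = """<PMML version="3.2">
  <RegressionModel functionName="regression">
    <RegressionTable intercept="1.5">
      <NumericPredictor name="x1" coefficient="2.0"/>
      <NumericPredictor name="x2" coefficient="-0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""


def import_model(pmml_text):
    """Step 4: read the intercept and coefficients out of the PMML text."""
    table = ET.fromstring(pmml_text).find("./RegressionModel/RegressionTable")
    intercept = float(table.get("intercept"))
    coefficients = {
        p.get("name"): float(p.get("coefficient"))
        for p in table.findall("NumericPredictor")
    }
    return intercept, coefficients


def execute(model, record):
    """Step 6: score one input record with the imported linear model."""
    intercept, coefficients = model
    return intercept + sum(c * record[name] for name, c in coefficients.items())


model = import_model(PMML_TEXT)
# In a deployed setup this record would arrive through a web-service call (step 5).
result = execute(model, {"x1": 3.0, "x2": 4.0})
print(result)  # 1.5 + 2.0*3.0 - 0.5*4.0 = 5.5
```

The scoring side here needs nothing from the tool that built the model; the PMML text is the only hand-off between development and execution.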
  14. Thank You!
      E-mail: info@zementis.com
      U.S.A: 6125 Cornerstone Court East, Suite 250, San Diego, CA, 92121. Tel: +1 619 330-0780. Fax: +1 858 535-0227.
      Asia: 19/F., Unit A, Ho Lee Commercial Building, 38-44 D’Aguilar Street, Central, Hong Kong (S.A.R.). Tel: +852 2868-0878. Fax: +852 2845-6027.
