
The case for (J)PMML

Overview of a typical machine learning workflow, and the role of Predictive Model Markup Language (PMML) in it.

Published in: Data & Analytics


  1. The case for (J)PMML
     Villu Ruusmann, Openscoring OÜ
  2. "Machines that learn"
     https://twitter.com/chrisalbon/status/889987842675429376
  3. Data science
     ● Business design
     ● All existing data records in batch mode
     ● Mathematical optimization problem
  4. ML algorithms
  5. DevOps
     ● Business implementation
     ● Single new data records in real-time mode, multiple new data records in streaming and batch modes
     ● Ease of use and robustness typically preferable to raw performance
  6. Workflow
  7. Data and Labels
     ● Unstructured data:
       ○ Low-level
       ○ Binary (video, images, voice)
     ● Structured data:
       ○ High-level
       ○ Conceptual, relational (SQL); possibly with sequential and/or temporal dimensions
       ○ Metadata
  8. Rules
     ● "Black box" (aka "Artificial Intelligence") models:
       ○ Deep and wide neural networks
     ● "Grey box" (aka "Machine Learning") models:
       ○ Shallow and narrow neural networks
       ○ Ensemble models
     ● "White box" (aka "Statistical") models:
       ○ Linear and logistic regression
       ○ Decision tree
  9. Predictive Model Markup Language (PMML)
     ● A DMG.org effort since 1999
     ● Representing rules (functions) in terms of standardized data structures:
       ○ "Conventions over configuration"
       ○ Backward and forward compatibility
       ○ Vendor extensions
     ● Platform-, language- and framework-agnostic
     ● Human- and machine-readable, executable, writable
  10. JPMML software stack (1/2)
     ● "The (Java) API is the product"
       ○ PMML producer (aka model conversion) vertical
       ○ PMML consumer (aka model scoring) vertical
       ○ ML framework vs. PMML integration testing suite
     ● Layered library approach
       ○ Lower layer(s): BSD 3-Clause License
       ○ Higher layers: Affero GPLv3
  11. JPMML software stack (2/2)
  12. Q&A
     villu@openscoring.io
     https://github.com/jpmml
     https://github.com/openscoring
     https://groups.google.com/forum/#!forum/jpmml
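
The producer/consumer split from slides 9 and 10 can be illustrated with a minimal sketch. It is not the JPMML API (which is Java) and not a complete PMML engine: it uses only the Python standard library, and it handles only a "white box" linear regression written as a PMML RegressionModel. The function names `produce_pmml` and `load_scorer`, the field names `x1`, `x2`, `y`, and the coefficient values are all assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

PMML_NS = "http://www.dmg.org/PMML-4_4"  # XML namespace of PMML schema version 4.4


def produce_pmml(intercept, coefficients):
    """Toy 'producer' (hypothetical helper): write a linear model as PMML text."""
    ET.register_namespace("", PMML_NS)

    def q(tag):
        # Qualify a tag name with the PMML namespace.
        return "{%s}%s" % (PMML_NS, tag)

    root = ET.Element(q("PMML"), version="4.4")
    ET.SubElement(root, q("Header"))
    dictionary = ET.SubElement(root, q("DataDictionary"))
    for name in list(coefficients) + ["y"]:
        ET.SubElement(dictionary, q("DataField"), name=name,
                      optype="continuous", dataType="double")
    model = ET.SubElement(root, q("RegressionModel"), functionName="regression")
    schema = ET.SubElement(model, q("MiningSchema"))
    for name in coefficients:
        ET.SubElement(schema, q("MiningField"), name=name)
    ET.SubElement(schema, q("MiningField"), name="y", usageType="target")
    table = ET.SubElement(model, q("RegressionTable"), intercept=str(intercept))
    for name, coefficient in coefficients.items():
        ET.SubElement(table, q("NumericPredictor"), name=name,
                      coefficient=str(coefficient))
    return ET.tostring(root, encoding="unicode")


def load_scorer(pmml_text):
    """Toy 'consumer' (hypothetical helper): turn PMML text into a scoring function.

    Only the first RegressionTable and its NumericPredictor terms are read;
    everything else in the document is ignored.
    """
    ns = {"p": PMML_NS}
    root = ET.fromstring(pmml_text)
    table = root.find("p:RegressionModel/p:RegressionTable", ns)
    intercept = float(table.get("intercept"))
    terms = [
        (e.get("name"), float(e.get("coefficient")), int(e.get("exponent", "1")))
        for e in table.findall("p:NumericPredictor", ns)
    ]

    def score(record):
        # intercept + sum of coefficient * value ** exponent over all predictors
        return intercept + sum(c * float(record[n]) ** e for n, c, e in terms)

    return score


# "Data science" side: the fitted rule is exported once.
pmml_text = produce_pmml(1.5, {"x1": 2.0, "x2": -0.5})

# "DevOps" side: the same document serves single-record and batch scoring.
score = load_scorer(pmml_text)
single = score({"x1": 3.0, "x2": 4.0})
batch = [score(r) for r in [{"x1": 0.0, "x2": 0.0}, {"x1": 1.0, "x2": 2.0}]]
print(single, batch)
```

The point of the sketch is the hand-off: the scoring side never sees the training code, only the PMML text, so the two sides could run on different platforms and in different languages.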
