This document discusses how to scale machine learning models from a laboratory setting to production. It proposes the Predictive Model Markup Language (PMML), a standardized XML-based representation, to capture models produced in R and Scikit-Learn. Because a PMML file describes a model independently of the tool that trained it, the same model can be deployed across different frameworks and languages. The document outlines APIs for evaluating, maintaining, and integrating models as reusable functions within data pipelines built on Hadoop ecosystem tools such as Spark, Pig, and Cascading. The goal is a portable, platform-agnostic architecture for operationalizing machine learning based on open standards.
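To make the idea of a model as a reusable function concrete, the following is a minimal sketch in Python, not the API the document describes. It assumes a small hand-written PMML regression document with invented coefficients, and a hypothetical helper named `load_regression` that reads only linear `NumericPredictor` terms. A real PMML evaluator covers far more of the standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical PMML document: field names and coefficients are invented
# purely for illustration, not taken from any real model.
PMML_DOC = """<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2">
  <DataDictionary numberOfFields="3">
    <DataField name="x1" optype="continuous" dataType="double"/>
    <DataField name="x2" optype="continuous" dataType="double"/>
    <DataField name="y" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="x1"/>
      <MiningField name="x2"/>
      <MiningField name="y" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="x1" coefficient="2.0"/>
      <NumericPredictor name="x2" coefficient="-0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

# Namespace of the PMML 4.2 schema, needed to address elements by name.
NS = {"p": "http://www.dmg.org/PMML-4_2"}


def load_regression(pmml_text):
    """Hypothetical helper: turn a PMML linear regression into a function.

    Only the intercept and first-order NumericPredictor terms are read;
    everything else in the PMML standard is ignored in this sketch.
    """
    root = ET.fromstring(pmml_text)
    table = root.find(".//p:RegressionTable", NS)
    if table is None:
        raise ValueError("no RegressionTable found in PMML document")
    intercept = float(table.get("intercept", "0"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("p:NumericPredictor", NS)
    }

    def score(record):
        # record is a dict mapping field name to numeric value.
        return intercept + sum(
            coef * float(record[name]) for name, coef in coefficients.items()
        )

    return score


# The loaded model is an ordinary function applied record by record.
score = load_regression(PMML_DOC)
rows = [{"x1": 1.0, "x2": 2.0}, {"x1": 0.0, "x2": 4.0}]
predictions = [score(row) for row in rows]
```

Because `score` is a plain function over one record, it could in principle be applied inside a per-record step of a data pipeline. How that step is wired into Spark, Pig, or Cascading is not shown here.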