×
  • Share
  • Email
  • Embed
  • Like
  • Save
  • Private Content
 

Pattern - an open source project for migrating predictive models from SAS, etc., onto Hadoop

by on Jul 11, 2013

  • 2,585 views

"Pattern" is an open source project which takes models trained in popular analytics frameworks, such as SAS, Microstrategy, SQL Server, etc., and runs them at scale on Apache Hadoop. This machine ...

"Pattern" is an open source project which takes models trained in popular analytics frameworks, such as SAS, Microstrategy, SQL Server, etc., and runs them at scale on Apache Hadoop. This machine learning library works by translating PMML -- an established XML standard for predictive model markup -- into data workflows based on the Cascading API in Java. PMML models can be run in a pre-defined JAR file with no coding required. PMML can also be combined with other flows based on ANSI SQL (Lingual), Scala (Scalding), Clojure (Cascalog), etc. Multiple companies have collaborated to implement parallelized algorithms: Random Forest, Logistic Regression, K-Means, Hierarchical Clustering, etc., with more machine learning support being added. Benefits include greatly reduced development costs and less licensing issues at scale ?- while leveraging a combination of Apache Hadoop clusters, existing intellectual property in predictive models, and the core competencies of analytics staff. Sample code in the talk will show apps using predictive models built in SAS and R, e.g., anti-fraud classifiers. In addition, examples will show how to compare variations of models for large-scale customer experiments. Portions of this material come from the O`Reilly book "Enterprise Data Workflows with Cascading", due June 2013.

Statistics

Views

Total Views
2,585
Views on SlideShare
2,424
Embed Views
161

Actions

Likes
3
Downloads
53
Comments
0

1 Embed 161

http://www.scoop.it 161

Accessibility

Categories

Upload Details

Uploaded via SlideShare as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
Post Comment
Edit your comment

Pattern - an open source project for migrating predictive models from SAS, etc., onto Hadoop Pattern - an open source project for migrating predictive models from SAS, etc., onto Hadoop Presentation Transcript