Data Science on Cloud Foundry
Ian Huston @ianhuston
Alexander Kagoshima @akagoshima
Who are we?
•  Data Scientists at Pivotal Labs
•  Using Cloud Foundry since 2013
•  Working with enterprises to get value out of their data
Image by Drew Conway: http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
Data Scientist (n.): 

Person who is better at statistics than any
software engineer and better at software
engineering than any statistician.

- Josh Wills
Typical Projects
•  Risk Analysis
•  Predictive Maintenance
•  Understanding Your Customer
Data Services
Easy control of incoming data
Data Services
Bind and scale system services 
–  Databases, NoSQL, message queues etc.
$ cf create-service rediscloud PLAN_NAME INSTANCE_NAME
$ cf bind-service APP_NAME INSTANCE_NAME
Add User Provided Services
–  Standalone Hadoop or Apache Spark cluster, Big Data System
$ cf cups SERVICE_INSTANCE -p "host, port, username, password"
[Diagram: Data Service alongside multiple Apps]
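Once a service instance is bound, Cloud Foundry passes its credentials to the app as JSON in the VCAP_SERVICES environment variable, with user-provided services listed under the "user-provided" label. The following is a minimal sketch of looking up credentials by instance name. The instance names (my-redis, my-spark), host names, and credential keys in the sample are hypothetical; the actual keys depend on the service broker or on what was passed to cf cups.

```python
import json
import os


def find_service_credentials(instance_name, environ=None):
    """Return the credentials dict of the bound service called instance_name.

    VCAP_SERVICES is a JSON object that maps each service label (for example
    "rediscloud" or "user-provided") to a list of bound instances.
    Returns None if no instance with that name is bound.
    """
    environ = os.environ if environ is None else environ
    services = json.loads(environ.get("VCAP_SERVICES", "{}"))
    for instances in services.values():
        for instance in instances:
            if instance.get("name") == instance_name:
                return instance.get("credentials", {})
    return None


# Hypothetical sample of what a bound app might see. The names and
# credential keys are made up for illustration only.
sample_env = {
    "VCAP_SERVICES": json.dumps({
        "rediscloud": [{
            "name": "my-redis",
            "credentials": {
                "hostname": "redis.example.com",
                "port": "6379",
                "password": "secret",
            },
        }],
        "user-provided": [{
            "name": "my-spark",
            "credentials": {
                "host": "spark.example.com",
                "port": "7077",
                "username": "ds",
                "password": "secret",
            },
        }],
    })
}

redis_creds = find_service_credentials("my-redis", sample_env)
spark_creds = find_service_credentials("my-spark", sample_env)
missing = find_service_credentials("not-bound", sample_env)
```

Inside a running app, calling find_service_credentials("INSTANCE_NAME") without the second argument reads the real environment instead of the sample.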
Deploy a Model Prediction API
Control distributed computation
https://github.com/ihuston/python-conda-buildpack
Install PyData packages with binary builds using conda
https://github.com/alexkago/cf-buildpack-r
R interpreter and package setup, ready for RShiny
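An app deployed as a model prediction API has to answer HTTP requests on the port that Cloud Foundry assigns through the PORT environment variable. Below is a minimal sketch using only the Python standard library. The request shape ({"x": number}), the placeholder model (doubling the input), and the fallback port 8080 are assumptions for illustration, not part of either buildpack.

```python
import json
import os
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """Tiny WSGI app: send a JSON body {"x": number}, get a JSON prediction."""
    headers = [("Content-Type", "application/json")]
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
        payload = json.loads(environ["wsgi.input"].read(length) or b"{}")
        x = float(payload["x"])
    except (KeyError, TypeError, ValueError):
        start_response("400 Bad Request", headers)
        message = 'expected a JSON body like {"x": 1.5}'
        return [json.dumps({"error": message}).encode()]
    # Placeholder model: a real app would load a trained model instead.
    body = json.dumps({"prediction": 2.0 * x}).encode()
    start_response("200 OK", headers)
    return [body]


def serve():
    """Run the app on the port Cloud Foundry provides (8080 when run locally)."""
    port = int(os.environ.get("PORT", "8080"))
    make_server("0.0.0.0", port, app).serve_forever()
```

The block only defines the app; calling serve() as the app's start command would start the server.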
  
HOW TO DEPLOY MODELS?
[Diagram labels: Siloed Data; Siloed Systems; Data Extract; Distributed Big Data Platform; Big Data Platform; Big Data Storage; multiple Apps; "?"]
(Model development happens here!)
(Business needs model predictions here!)
REST API
Send data as JSON
Data Ingest
Model
Create Model
Redis
Kicking off periodic retraining
Save training data
Save model object
Send JSON data without label
Receive prediction from trained model instance
Deployed at:
http://dsoncf.cfapps.io
Code:
https://github.com/pivotalsoftware/ds-cfpylearning
PREDICTION API ARCHITECTURE
$ cf create-service rediscloud PLAN_NAME INSTANCE_NAME
MODEL INTERFACE
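The flow in the architecture diagram (create a model, ingest labelled JSON data, retrain, then predict on unlabelled JSON data) can be sketched in a few lines. This is an illustrative in-process sketch, not the code from the ds-cfpylearning repository: a plain dict stands in for Redis, a single-feature least-squares line stands in for the model, and the names PredictionService and fit_line as well as the "x"/"y" field names are hypothetical.

```python
import json


def fit_line(rows):
    """Least-squares fit of y = slope * x + intercept for a single feature."""
    n = len(rows)
    mean_x = sum(r["x"] for r in rows) / n
    mean_y = sum(r["y"] for r in rows) / n
    var_x = sum((r["x"] - mean_x) ** 2 for r in rows)
    cov_xy = sum((r["x"] - mean_x) * (r["y"] - mean_y) for r in rows)
    slope = cov_xy / var_x if var_x else 0.0
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


class PredictionService:
    """In-process sketch of the flow; a dict stands in for Redis."""

    def __init__(self, store=None):
        self.store = {} if store is None else store

    def create_model(self, name):
        # Register an untrained model with an empty training set.
        self.store["data:" + name] = json.dumps([])
        self.store["model:" + name] = json.dumps(None)

    def ingest(self, name, json_body):
        # Append one labelled JSON observation to the saved training data.
        rows = json.loads(self.store["data:" + name])
        rows.append(json.loads(json_body))
        self.store["data:" + name] = json.dumps(rows)

    def retrain(self, name):
        # Called explicitly here; a deployment would kick this off periodically.
        rows = json.loads(self.store["data:" + name])
        if rows:
            self.store["model:" + name] = json.dumps(fit_line(rows))

    def predict(self, name, json_body):
        # Score an unlabelled JSON observation with the saved model.
        model = json.loads(self.store["model:" + name])
        if model is None:
            raise RuntimeError("model %r has not been trained yet" % name)
        x = json.loads(json_body)["x"]
        prediction = model["slope"] * x + model["intercept"]
        return json.dumps({"prediction": prediction})


service = PredictionService()
service.create_model("demo")
for x, y in [(1, 2), (2, 4), (3, 6)]:
    service.ingest("demo", json.dumps({"x": x, "y": y}))
service.retrain("demo")
response = service.predict("demo", json.dumps({"x": 4}))
```

Storing both the training data and the fitted parameters as serialized values under per-model keys mirrors the "Save training data" and "Save model object" steps, so any app instance bound to the same store could serve predictions.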
Data Driven Applications
SIMPLE HTML + JS
MODEL PREDICTIONS
http://ds-demo-transport.cfapps.io
RSHINY APP
INTERACTIVE EXPLORATION
https://ak-insurance-demo.cfapps.io:4443/
Show off your data science related Cloud Foundry apps:

Twitter: @dsoncf
http://dsoncf.com
@ianhuston
@akagoshima

CFSummit: Data Science on Cloud Foundry