Josh Wills, MLconf 2013

Josh Wills, Senior Director of Data Science, Cloudera: Building a Production Machine Learning Infrastructure (Quickly)

Published in: Technology, Education

  1. From the Lab to the Factory: Building a Production Machine Learning Infrastructure. Josh Wills, Senior Director of Data Science, Cloudera
  2. About Me
  3. Data Science: Another Definition
  4. Data Scientists Build Data Products.
  5. All* Products Become Data Products
  6. Identifying the Bottlenecks
  7. Oryx: Model Building and Serving
       • Algorithms
           • ALS Recommenders
           • K-Means Parallel
           • RDF
       • Batch model building via MapReduce
       • Server for real-time scoring and updates
       • PMML 4.1 Models
  8. Gertrude: Evaluation via Experiments
       • Multivariate Testing
           • Define and explore a space of parameters
       • Overlapping Experiments
           • Tang et al. (2010)
           • Runs multiple independent experiments on every request
  9. Planning For The Future
  10. Thank you! Josh Wills, Director of Data Science, Cloudera, @josh_wills
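
Slide 8 describes overlapping experiments as running multiple independent experiments on every request. A minimal sketch of that idea follows, assuming a layered design in which each layer owns a disjoint set of parameters and each request is hashed separately per layer. It is not Gertrude's API: the layer names, experiment names, parameters, and the `bucket` and `assign` helpers are all hypothetical and exist only for illustration.

```python
import hashlib

NUM_BUCKETS = 1000

# Hypothetical layers. Each layer controls its own parameters, and each
# experiment in a layer claims a disjoint range of buckets in that layer.
LAYERS = {
    "ranking": [
        {"name": "control", "buckets": range(0, 500), "params": {"model": "als_v1"}},
        {"name": "new_model", "buckets": range(500, 1000), "params": {"model": "als_v2"}},
    ],
    "ui": [
        {"name": "control", "buckets": range(0, 900), "params": {"num_results": 10}},
        {"name": "more_results", "buckets": range(900, 1000), "params": {"num_results": 20}},
    ],
}


def bucket(request_id: str, layer: str) -> int:
    """Hash the request id together with the layer name, so that bucket
    assignments in different layers do not depend on each other."""
    digest = hashlib.sha256(f"{layer}:{request_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_BUCKETS


def assign(request_id: str):
    """Pick at most one experiment per layer and merge their parameters."""
    assignments = {}
    params = {}
    for layer, experiments in LAYERS.items():
        b = bucket(request_id, layer)
        for exp in experiments:
            if b in exp["buckets"]:
                assignments[layer] = exp["name"]
                params.update(exp["params"])
                break
    return assignments, params


if __name__ == "__main__":
    print(assign("user-42"))
```

Because the layer name is part of the hash input, a request that lands in the `new_model` arm of the ranking layer is no more or less likely to land in `more_results` in the UI layer, which is what lets one request take part in several experiments at once.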
