Predictive Models for
Production Applications
July 2013
Why building analytical apps is hard
Overcoming the challenge
Case study: building a beer recommender
“If you double the number of
experiments you do per year, you're
going to double your inventiveness.”
- Jeff Bezos
We need to reduce
churn. Okay. I'll look into it.
Lots of conversations like this
I figured out that....some
complex stuff about
vector space that'll
improve...
....and that's how we'll
reduce churn.
Sounds good. Let's do that...
The "a ha" moment isn't the end.
Now what?
Any of you know
what Gradient
Boosting is?
So when can we go
live with the new
model?
It's hard to incorporate
analytical work into
day-to-day operations
We know finding a data scientist is tough.
http://drewconway.com/
Building applications
from their insights is tougher.
"cool. what do we do now?" scenarios
http://scikit-learn.org/stable/auto_examples/index.html
Product Page Search Results
Order Confirmation / Checkout Page
Reduce QA time for classifying purchases made online
How to go from prototype to product?
How do companies
address this problem?
Rewriting Code
Common Approaches Challenge
Rewriting Code
Common Approaches
Cross-environment validation
Challenge
Rewriting Code
Batch Jobs
Common Approaches
Cross-environment validation
Challenge
Rewriting Code
Batch Jobs
Common Approaches
Cross-environment validation
High maintenance and config
Challenge
Rewriting Code
Batch Jobs
PMML
Common Approaches
Cross-environment validation
High maintenance and config
Challenge
Rewriting Code
Batch Jobs
PMML
Common Approaches
Cross-environment validation
High maintenance and config
Limited to certain libraries, still rewriting
Challenge
Rewriting Code
Batch Jobs
PMML
Common Approaches
Cross-environment validation
High maintenance and config
Limited to certain libraries, still rewriting
Challenge
More people, more tools, more time to market.
Can we build and bring
to market smarter
applications faster?
Rewriting Code
Batch Jobs
PMML
Common Approaches
Cross-environment validation
High maintenance and config
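Limited to certain libraries, still rewriting
Challenge
A platform for running predictive models in production applications.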
Key Tenets
1. Work with the tools you already know
Key Tenets
1. Work with the tools you already know
2. Iterate quickly
Key Tenets
1. Work with the tools you already know
2. Iterate quickly
3. Low touch
Key Tenets
1. Work with the tools you already know
2. Iterate quickly
3. Low touch
4. No rewriting code
Key Tenets
demo
A Beer Recommender in Python
What beer should I drink?
Tell us a beer you like
We'll tell you some
other beers you'll like
1. Import the data
2. Find common reviewers
3. Calculate review distance
4. Rank beers
Plan
1. Import the data
2. Find common reviewers
3. Calculate review distance
4. Rank beers
Plan
The Dataset
The Dataset
The Dataset
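The "Import the data" step might look roughly like the sketch below. It uses only the standard library and a tiny in-memory sample; the column names (`reviewer`, `beer`, `overall`) and the sample rows are assumptions made for illustration, not the layout of the dataset used in the talk.

```python
import csv
import io

# Hypothetical sample rows; the real dataset's columns may differ.
SAMPLE_CSV = """reviewer,beer,overall
alice,Fat Tire Amber Ale,4.0
alice,Dale's Pale Ale,4.5
bob,Fat Tire Amber Ale,3.5
bob,Michelob Ultra,2.0
"""


def load_reviews(handle):
    """Read review rows into a list of dicts, converting the score to float."""
    rows = []
    for row in csv.DictReader(handle):
        row["overall"] = float(row["overall"])
        rows.append(row)
    return rows


# io.StringIO stands in for an open file; open("reviews.csv") would work the same way.
reviews = load_reviews(io.StringIO(SAMPLE_CSV))
print(len(reviews), "reviews loaded")
```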
1. Import the data
2. Find common reviewers
3. Calculate review distance
4. Rank beers
Plan
Find common reviewers
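One way to sketch this step: group reviewers by beer, then intersect the two sets. The reviewer names and the helper names `reviewers_by_beer` and `common_reviewers` are hypothetical, chosen only for this example.

```python
from collections import defaultdict

# Hypothetical (reviewer, beer) pairs; a real dataset has many more rows.
REVIEWS = [
    ("alice", "Dale's Pale Ale"),
    ("alice", "Fat Tire Amber Ale"),
    ("bob", "Dale's Pale Ale"),
    ("bob", "Fat Tire Amber Ale"),
    ("carol", "Fat Tire Amber Ale"),
    ("carol", "Michelob Ultra"),
]


def reviewers_by_beer(reviews):
    """Map each beer to the set of people who reviewed it."""
    by_beer = defaultdict(set)
    for reviewer, beer in reviews:
        by_beer[beer].add(reviewer)
    return dict(by_beer)


def common_reviewers(by_beer, beer_a, beer_b):
    """Return reviewers who rated both beers (empty set if either is unknown)."""
    return by_beer.get(beer_a, set()) & by_beer.get(beer_b, set())


by_beer = reviewers_by_beer(REVIEWS)
print(sorted(common_reviewers(by_beer, "Dale's Pale Ale", "Fat Tire Amber Ale")))
```

Only reviewers who rated both beers are kept, so the later comparison is made on like-for-like opinions.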
1. Import the data
2. Find common reviewers
3. Calculate distance
4. Rank beers
Plan
Comparing 2 Similar Beers
vs
Dale's Pale Ale and Fat Tire Amber Ale
Dale's Pale Ale and Fat Tire Amber Ale
"Perfect Agreement"
Dale's Pale Ale and Fat Tire Amber Ale
Dale's Pale Ale and Fat Tire Amber Ale
Dale's Pale Ale and Fat Tire Amber Ale
Dale's Pale Ale and Fat Tire Amber Ale
Similar reviews
Comparing 2 Dissimilar Beers
vs
Dale's Pale Ale and Fat Tire Amber Ale
Michelob Ultra and Fat Tire Amber Ale
Measuring distance
Measuring distance
...yes, there are other ways to do this.
Calculating Distance
Distance Implementation
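The slides do not say which metric was used, so the sketch below assumes plain Euclidean distance over the scores of reviewers who rated both beers. The scores and the function name `review_distance` are hypothetical.

```python
import math


def review_distance(scores_a, scores_b):
    """Euclidean distance over reviewers who scored both beers.

    Each argument maps reviewer -> score. Returns None when the two beers
    share no reviewers, since no comparison is possible.
    """
    common = scores_a.keys() & scores_b.keys()
    if not common:
        return None
    return math.sqrt(sum((scores_a[r] - scores_b[r]) ** 2 for r in common))


# Hypothetical scores, for illustration only.
dales = {"alice": 4.5, "bob": 4.0, "carol": 3.5}
fat_tire = {"alice": 4.0, "bob": 4.0, "dave": 3.0}
michelob = {"alice": 1.5, "bob": 1.0}

similar = review_distance(dales, fat_tire)
dissimilar = review_distance(michelob, fat_tire)
print(similar, dissimilar)
```

A small distance means the shared reviewers scored the two beers alike; a large one means they disagreed.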
Calculate the Distance
● Generate all beer pairs
● Calculate distance between each pair
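The two bullets above can be sketched with `itertools.combinations`, which yields each unordered pair exactly once. The ratings are hypothetical and Euclidean distance is an assumed metric, not one confirmed by the slides.

```python
import itertools
import math

# Hypothetical ratings: beer -> {reviewer: score}.
RATINGS = {
    "Dale's Pale Ale": {"alice": 4.5, "bob": 4.0},
    "Fat Tire Amber Ale": {"alice": 4.0, "bob": 4.0},
    "Michelob Ultra": {"alice": 1.5, "bob": 1.0},
}


def euclidean(scores_a, scores_b):
    """Euclidean distance over shared reviewers, or None if there are none."""
    common = scores_a.keys() & scores_b.keys()
    if not common:
        return None
    return math.sqrt(sum((scores_a[r] - scores_b[r]) ** 2 for r in common))


def pairwise_distances(ratings):
    """Compute the distance for every unordered pair of beers."""
    distances = {}
    for beer_a, beer_b in itertools.combinations(sorted(ratings), 2):
        d = euclidean(ratings[beer_a], ratings[beer_b])
        if d is not None:  # skip pairs with no reviewers in common
            distances[(beer_a, beer_b)] = d
    return distances


distances = pairwise_distances(RATINGS)
for pair, d in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, round(d, 2))
```

With n beers this produces n(n-1)/2 pairs, so a large catalogue may need the beer list narrowed first.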
1. Import the data
2. Find common reviewers
3. Calculate review distance
4. Rank results
Plan
So if I like Coors Light, what other
beers might I like?
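Answering that question comes down to sorting the precomputed pair distances for the chosen beer. A minimal sketch, assuming distances are stored per unordered pair; the numbers and the name `rank_similar` are made up for illustration.

```python
# Hypothetical pair distances; smaller means more similar.
DISTANCES = {
    ("Bud Light", "Coors Light"): 1.0,
    ("Budweiser", "Coors Light"): 2.5,
    ("Coors Light", "Sierra Nevada"): 6.0,
    ("Bud Light", "Sierra Nevada"): 5.5,
}


def rank_similar(distances, beer, top_n=3):
    """Return up to top_n (other_beer, distance) tuples, closest first."""
    candidates = []
    for (beer_a, beer_b), d in distances.items():
        if beer_a == beer:
            candidates.append((beer_b, d))
        elif beer_b == beer:
            candidates.append((beer_a, d))
    return sorted(candidates, key=lambda item: item[1])[:top_n]


ranked = rank_similar(DISTANCES, "Coors Light")
print(ranked)
```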
shipping your work
with
Make analytical routines available to other apps
{ "beer": "Coors Light",
"weights": [3, 2, 0, 1]}
[
["Bud Light", 9.2],
["Budweiser", 12.2],
["Sierra Nevada", 21.2]
]
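The slide does not explain what `weights` means. The sketch below assumes each weight scales one of four per-aspect distances, and that the response lists candidates by ascending weighted score. The aspect distances and the function name `handle_request` are hypothetical.

```python
import json

# Hypothetical per-aspect distances from the requested beer to each candidate.
# What the four positions stand for is an assumption; the slide does not say.
ASPECT_DISTANCES = {
    "Coors Light": {
        "Bud Light": [1.0, 2.0, 5.0, 1.5],
        "Sierra Nevada": [4.0, 3.0, 1.0, 2.5],
    }
}


def handle_request(body):
    """Turn a JSON request into a JSON list of [beer, weighted_distance]."""
    request = json.loads(body)
    weights = request["weights"]
    candidates = ASPECT_DISTANCES.get(request["beer"], {})
    scored = [
        [name, sum(w * d for w, d in zip(weights, dists))]
        for name, dists in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[1])  # closest match first
    return json.dumps(scored)


response = handle_request('{"beer": "Coors Light", "weights": [3, 2, 0, 1]}')
print(response)
```

A weight of 0 drops that aspect from the comparison entirely, which is one plausible reason a caller would send such a vector.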
1. Pre-processing & transformations
2. Prediction & post-processing
Use the same code you wrote
during exploration and modeling.
2. Prediction & post-processing
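The two-stage split can be illustrated with a plain Python class that has one method per stage. This is a stand-in written for this example and is not Yhat's API; the class name, method names, and distances are all assumptions.

```python
class BeerRecommender:
    """Stand-in model with a pre-processing step and a prediction step."""

    def __init__(self, distances):
        # distances: beer -> {other_beer: distance}, computed during exploration
        self.distances = distances

    def transform(self, raw):
        """Pre-processing: normalize the incoming request."""
        return {"beer": raw["beer"].strip()}

    def predict(self, data):
        """Prediction and post-processing: rank neighbours, closest first."""
        neighbours = self.distances.get(data["beer"], {})
        ranked = sorted(neighbours.items(), key=lambda item: item[1])
        return [[name, round(dist, 1)] for name, dist in ranked]

    def execute(self, raw):
        """Run both stages in order, as a hosting layer presumably would."""
        return self.predict(self.transform(raw))


# Hypothetical distances, for illustration only.
model = BeerRecommender({"Coors Light": {"Budweiser": 2.5, "Bud Light": 1.0}})
result = model.execute({"beer": "  Coors Light "})
print(result)
```

Keeping both stages as ordinary methods means the exploration code runs unchanged when the object is handed to whatever serves it.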
We're ready to deploy
Analytical routine is now ready to be deployed
Go to market with as little overhead as possible
Deploy
Pass objects you'd like included in
your model as named arguments
Specify User Defined
Functions you want to
include in your project
Deploy
Pass the name of your
model and your BaseModel
object to the deploy
function
Deploy
Execute deploy to host
your model on Yhat
Deploy
Make predictions in a
production app
pydata-beer.herokuapp.com
https://github.com/yhat/Beer-Rec-Flask
Data/Code Bundle
Webapp: https://github.com/yhat/Beer-Rec-Flask
IPython Notebook: http://bit.ly/1bkCTHz
Dataset: http://bit.ly/14Wl64k
Want to try?
yhathq.com
We're Hiring
info@yhathq.com
yhathq.com/jobs
Questions?
greg@yhathq.com
austin@yhathq.com
yhathq.com
@YhatHQ
blog.yhathq.com
Appendix
Learn by iteration from the context of
real-world business applications.
Deployment
Execute the deploy
function to host your
model on Yhat
Pass the name of your
model and your BaseModel
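object to the deploy function
Pass objects you'd like included in
your model as named arguments
Specify User Defined Functions you
want to include in your project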
Predictive Models for Production Apps with Yhat
Yhat presentation at PyData Boston 2013. Predictive Models for Production Apps with Yhat. Building a beer recommender with Python and Yhat.

  1. 1. Predictive Models for Production Applications July 2013
  2. 2. Why building analytical apps is hard Overcoming the challenge Case study: building a beer recommender
  3. 3. “If you double the number of experiments you do per year, you're going to double your inventiveness.” - Jeff Bezos
  4. 4. We need to reduce churn. Okay. I'll look into it. Lots of conversations like this
  5. 5. I figured out that....some complex stuff about vector space that'll improve... ....and that's how we'll reduce churn. Sounds good. Let's do that... The "a ha" moment isn't the end.
  6. 6. Now what? Any of you know what Gradient Boosting is? So when can we go live with the new model?
  7. 7. It's hard to incorporate analytical work into day-to-day operations
  8. 8. We know finding a data scientist is tough. http://drewconway.com/
  9. 9. Building applications from their insights is tougher.
  10. 10. "cool. what do we do now?" scenarios http://scikit-learn.org/stable/auto_examples/index.html
  11. 11. Product Page Search Results Order Confirmation / Checkout Page Reduce QA time for classifying purchases made online
  12. 12. How to go from prototype to product?
  13. 13. How do companies address this problem?
  14. 14. Rewriting Code Common Approaches Challenge
  15. 15. Rewriting Code Common Approaches Cross-environment validation Challenge
  16. 16. Rewriting Code Batch Jobs Common Approaches Cross-environment validation Challenge
  17. 17. Rewriting Code Batch Jobs Common Approaches Cross-environment validation High maintenance and config Challenge
  18. 18. Rewriting Code Batch Jobs PMML Common Approaches Cross-environment validation High maintenance and config Challenge
  19. 19. Rewriting Code Batch Jobs PMML Common Approaches Cross-environment validation High maintenance and config Limited to certain libraries, Still rewriting Challenge
  20. 20. Rewriting Code Batch Jobs PMML Common Approaches Cross-environment validation High maintenance and config Limited to certain libraries, Still rewriting Challenge More people, more tools, more time to market.
  21. 21. Can we build and bring to market smarter applications faster?
  22. 22. Rewriting Code Batch Jobs PMML Common Approaches Cross-environment validation High maintenance and config Limited to certain libraries, Still rewriting Challenge A platform for running predictive models in production applications.
  23. 23. Key Tenets
  24. 24. 1. Work with the tools you already know Key Tenets
  25. 25. 1. Work with the tools you already know 2. Iterate quickly Key Tenets
  26. 26. 1. Work with the tools you already know 2. Iterate quickly 3. Low touch Key Tenets
  27. 27. 1. Work with the tools you already know 2. Iterate quickly 3. Low touch 4. No rewriting code Key Tenets
  28. 28. demo
  29. 29. A Beer Recommender in Python
  30. 30. What beer should I drink?
  31. 31. Tell us a beer you like
  32. 32. We'll tell you some other beers you'll like
  33. 33. 1. Import the data 2. Find common reviewers 3. Calculate review distance 4. Rank beers Plan
  34. 34. 1. Import the data 2. Find common reviewers 3. Calculate review distance 4. Rank beers Plan
  35. 35. The Dataset
  36. 36. The Dataset
  37. 37. The Dataset
  38. 38. 1. Import the data 2. Find common reviewers 3. Calculate review distance 4. Rank beers Plan
  39. 39. Find common reviewers
  40. 40. 1. Import the data 2. Find common reviewers 3. Calculate distance 4. Rank beers Plan
  41. 41. Comparing 2 Similar Beers vs
  42. 42. Dale's Pale Ale and Fat Tire Amber Ale
  43. 43. Dale's Pale Ale and Fat Tire Amber Ale "Perfect Agreement"
  44. 44. Dale's Pale Ale and Fat Tire Amber Ale
  45. 45. Dale's Pale Ale and Fat Tire Amber Ale
  46. 46. Dale's Pale Ale and Fat Tire Amber Ale
  47. 47. Dale's Pale Ale and Fat Tire Amber Ale Similar reviews
  48. 48. Comparing 2 Dissimilar Beers vs
  49. 49. Dale's Pale Ale and Fat Tire Amber Ale
  50. 50. Michelob Ultra and Fat Tire Amber Ale
  51. 51. Measuring distance
  52. 52. Measuring distance ...yes, there are other ways to do this.
  53. 53. Calculating Distance
  54. 54. Distance Implementation
  55. 55. Calculate the Distance ● Generate all beer pairs ● Calculate distance between each pair
  56. 56. 1. Import the data 2. Find common reviewers 3. Calculate review distance 4. Rank results Plan
  57. 57. So if I like Coors Light, what other beers might I like?
  58. 58. shipping your work with
  59. 59. Make analytical routines available to other apps { "beer": "Coors Light", "weights": [3, 2, 0, 1]} [ ["Bud Light", 9.2], ["Budweiser", 12.2], ["Sierra Nevada", 21.2] ]
  60. 60. 1. Pre-processing & transformations
  61. 61. 2. Prediction & post-processing
  62. 62. Use the same code you wrote during exploration and modeling. 2. Prediction & post-processing
  63. 63. We're ready to deploy
  64. 64. Analytical routine is now ready to be deployed. Go to market with as little overhead as possible.
  65. 65. Deploy Pass objects you'd like included in your model as named arguments
  66. 66. Specify User Defined Functions you want to include in your project Deploy
  67. 67. Pass the name of your model and your BaseModel object to the deploy function Deploy
  68. 68. Execute deploy to host your model on Yhat Deploy
  69. 69. Make predictions in a production app
  70. 70. pydata-beer.herokuapp.com https://github.com/yhat/Beer-Rec-Flask
  71. 71. Data/Code Bundle Webapp: https://github.com/yhat/Beer-Rec-Flask IPython Notebook: http://bit.ly/1bkCTHz Dataset: http://bit.ly/14Wl64k
  72. 72. Want to try? yhathq.com
  73. 73. We're Hiring info@yhathq.com yhathq.com/jobs
  74. 74. Questions? greg@yhathq.com austin@yhathq.com yhathq.com @YhatHQ blog.yhathq.com
  75. 75. Appendix
  76. 76. Learn by iteration from the context of real-world business applications.
  77. 77. Deployment Execute the deploy function to host your model on Yhat Pass the name of your model and your BaseModel object to the deploy function Pass objects you'd like included in your model as named arguments Specify User Defined Functions you want to include in your project