Big data meetup

Published in: Technology, Education

  1. 1. Data Science Data Meetup Jan. 12
  2. 2. What is data science? Besides a reason to have beer and pizza…
  3. 3. What does the literature say?
  4. 4. Hacking: “Good data scientists understand, in a deep way, that the heavy lifting of cleanup and preparation isn’t something that gets in the way of solving the problem… it is the problem.” (DJ Patil) bash/awk/sed
  5. 5. Statistics: What’s the probability that 2 people in the front 2 rows share a birthday? (1) ~10% (2) ~20% (3) ~50% (4) ~90%. What’s the probability that a positive result from a 99% accurate test correctly diagnoses a disease that affects 1 in 1,000 people? (1) ~10% (2) ~50% (3) ~90% (4) ~99%
  6. 6. Domain Expertise
  7. 7. Intelligence Cookbook: Just follow the steps
  8. 8. The Recipe: First, make it valuable. Then, make it possible. Then, make it beautiful. Then, make it smart.
  9. 9. Example: E-commerce website
  10. 10. Make it valuable: Find a KPI that is correlated with bottom-line revenue, e.g. the number of products the visitor browses through
  11. 11. Make it possible: Develop the simplest heuristic, e.g. show the visitor one of the top 10 selling products
  12. 12. Make it beautiful: Create a method to quickly test new algorithms against old ones, e.g. create a framework that split-tests two models and reports which one is better
  13. 13. Make it smart: Figure out which field your problem belongs to and choose an off-the-shelf algorithm, e.g. recognize that the problem is product recommendation and use collaborative filtering
  14. 14. Common ML problems: Supervised learning (classification, regression, anomaly detection); Unsupervised learning (clustering, separation); Recommendation (feature-based recommendation, collaborative filtering); Search (indexing, ranking)
  15. 15. To sum it all up: Real data science is hard, but… real data science is the last step in data science, not the first, and besides… the most important thing in data science is the business, not the science
  16. 16. Questions? Email: vitalyp@liveperson.com Twitter: @bigdatasc

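The two quiz questions on the Statistics slide (slide 5) can be checked numerically. The sketch below is not from the deck: the function names are hypothetical, the birthday calculation assumes 365 equally likely birthdays, and "99% accurate" is read as 99% sensitivity and 99% specificity. The slide does not say how many people sat in the front two rows, so the group sizes used here are illustrative only.

```python
from math import prod


def birthday_collision_probability(n_people: int, days: int = 365) -> float:
    """Probability that at least two of n_people share a birthday.

    Assumes birthdays are uniform over `days` and independent (an assumption,
    not something the slide states).
    """
    if n_people > days:
        return 1.0  # pigeonhole: a shared birthday is certain
    # Probability that all birthdays are distinct, built up person by person.
    p_all_distinct = prod((days - k) / days for k in range(n_people))
    return 1.0 - p_all_distinct


def posterior_given_positive(prevalence: float,
                             sensitivity: float,
                             specificity: float) -> float:
    """Bayes' rule: P(disease | positive test result)."""
    true_positive = sensitivity * prevalence
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)


if __name__ == "__main__":
    # Hypothetical audience sizes; the real head count is unknown.
    for n in (10, 23, 41):
        print(n, round(birthday_collision_probability(n), 3))

    # Reading "99% accurate" as sensitivity = specificity = 0.99 (assumption).
    print(round(posterior_given_positive(1 / 1000, 0.99, 0.99), 3))
```

Under these assumptions, about 23 people are enough to reach the ~50% answer for the birthday question, and a positive result on the test corresponds to roughly a 9% chance of actually having the disease, because false positives from the healthy majority outnumber true positives.
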
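The "make it possible" and "make it beautiful" steps (slides 11 and 12) can be sketched in a few lines. This is a minimal illustration, not the speaker's framework: `top_sellers` and `split_test` are hypothetical names, the order log and visitor counts are made up, and the comparison assumes a binary outcome per visitor (for example, whether the visitor clicked a recommended product) evaluated with a standard two-proportion z-test.

```python
from collections import Counter
from math import erf, sqrt


def top_sellers(order_log, k=10):
    """Simplest heuristic: the k most frequently sold product IDs."""
    return [product for product, _ in Counter(order_log).most_common(k)]


def split_test(successes_a, visitors_a, successes_b, visitors_b, alpha=0.05):
    """Compare two models with a two-sided, two-proportion z-test.

    Returns the observed rates, the p-value, and which arm (if any) is
    significantly better at level `alpha`.
    """
    rate_a = successes_a / visitors_a
    rate_b = successes_b / visitors_b
    pooled = (successes_a + successes_b) / (visitors_a + visitors_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if std_err == 0:
        # Both arms have identical all-or-nothing outcomes; nothing to test.
        return {"rate_a": rate_a, "rate_b": rate_b,
                "p_value": 1.0, "winner": "inconclusive"}
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    if p_value >= alpha:
        winner = "inconclusive"
    else:
        winner = "B" if z > 0 else "A"
    return {"rate_a": rate_a, "rate_b": rate_b,
            "p_value": p_value, "winner": winner}


if __name__ == "__main__":
    # Made-up order log of product IDs.
    orders = ["shoes", "hat", "shoes", "scarf", "shoes", "hat"]
    print(top_sellers(orders, k=2))

    # Made-up traffic: model A is the top-sellers baseline, B is a challenger.
    print(split_test(successes_a=100, visitors_a=1000,
                     successes_b=150, visitors_b=1000))
```

Keeping the comparison behind one function means a new model, such as a collaborative filtering recommender, can be swapped in as arm B without changing how the result is reported.
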