How to Become a Data Scientist

SF Data Science Meetup, June 30, 2014
Video of this talk is available here: https://www.youtube.com/watch?v=c52IOlnPw08
More information at: http://www.zipfianacademy.com


Zipfian Academy @ Crowdflower

  1. Ryan Orban, Co-Founder & CEO, ryan@zipfianacademy.com, @ryanorban
  2. Why are we talking about data science?
  3. Data Analyst Shortage (Source: http://www.delphianalytics.net/wp-content/uploads/2013/04/GrowthOfDataVsDataAnalysts.png)
  4. What is data science?
  5. Perfect Storm
  6. Technology [Chart: Capacity (GB) and Cost per GB (USD), 1992 to 2012. Source: http://www.jcmit.com/diskprice.htm]
  7. Unprecedented Data Growth
  8. Enter the Data Scientist
  9. What is Data Science? + Communication
  10. What do people look for in a data scientist?
  11. T-Shaped Skillset: broad-range generalist, deep expertise
  12. T-Shaped Skillset: Machine Learning, Statistics, Domain Knowledge, Software Engineering, Business Acumen, Distributed Computing, Communication
  13. Data Science Roles
  14. How do I become a data scientist?
  15. Data scientists need to know how to code.
  16. Learn to Code: Python, R, Julia, Java, C++/Go, Scala/Clojure [Diagram: languages placed on a high-level to lower-level spectrum]
  17. Learn to Code
  18. Data scientists need to be comfortable with mathematics & statistics.
  19. Mathematics & Statistics. Mathematics: Linear Algebra (Matrix Factorization), Calculus (Integrals, Derivatives, etc.), Graph Theory, Probability/Combinatorics. Statistical Analysis: Distributions (Binomial, Poisson, etc.), Summary Statistics (Mean, Variance, etc.), Hypothesis Testing, Bayesian Analysis
  20. Mathematics & Statistics
  21. Data scientists need to know machine learning & software engineering.
  22. Machine Learning & Software Engineering: Supervised (SVM, Random Forest), Unsupervised (K-means, LDA), Validation, Model Comparison, NLP / Information Retrieval, Algorithms & Data Structures, Distributed Computing, Data Visualization, Data Munging
  23. Open-Source Data Science Masters
  24. SlideRule
  25. DataTau
  26. Learning data science can be really hard.
  27. ≠ Data Science
  28. Learning data science can be really hard.
  29. Context is King
  30. It’s about putting the pieces together
  31. Pathways: MS/PhD in Data Science, Internship, Immersive Programs, Self-study
  32. You don’t need a PhD to do data science.
  33. Backgrounds: Educational Background [Chart: number of students by degree (BS, MS, PhD)]
  34. Backgrounds: Disciplines [Chart: number of students by discipline: Software Engineering, Analysts, Finance/Economics, Engineering, Physics, Physical Sciences, Mathematics, Statistics, Astronomy, Linguistics, Professional Poker]
  35. Backgrounds: 94% Placement Rate, 91% Placement, $115k avg. salary
  36. The Program: 12-week immersive bootcamp in San Francisco; project-based curriculum with real datasets, solving actual problems; guest lectures from leaders in the field; personal mentorship to help students grow
  37. Program Timeline: Structured Curriculum, Hiring Day, Capstone Project, Graduation, Interview Prep (week markers: 1, 8, 11, 12)
  38. Learning Techniques
  39. Hiring Partners
  40. What We Look For: working knowledge of programming; background in a quantitative discipline; comfortable with mathematics and statistics; child-like curiosity
  41. Zipfian Academy: Data Science Immersive, Data Fellowship, Data Engineering Immersive, Weekend Workshops
  42. Zipfian Academy (@ZipfianAcademy). Data Science Immersive, 12 weeks (Sep 8th), apply at http://zipfianacademy.com/apply | Weekend Workshops, see http://zipfianacademy.com/workshops | Next: Interactive Visualizations w/ d3.js (July 19)
  43. The best way to learn data science is by doing data science.
  44. https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks
  45. Checklist: Learn the fundamentals, Build out a project portfolio, Apply!, Blog about your experience
  46. A Practical Intro to Data Science: http://bit.ly/learndatascience
  47. Thank You! Ryan Orban, Co-Founder, ryan@zipfianacademy.com, @ryanorban