# JHU Data Science MOOCs - Behind the Scenes

A talk given at the Harvard big data seminar

Published in: Education

### JHU Data Science MOOCs - Behind the Scenes

1. @simplystats
2. Tophat2 + derfinder (HMMs)
3. Tophat2 + derfinder (bumphunter)
4. Rail-RNA (cloud computing) + derfinder (bumphunter)
5. 7/1/2012 (3:45PM): “Hopkins has a deal w/ Coursera”; 7/1/2012 (5:00PM): Roger has bad timing; 7/2/2012: Roger + Jeff sign on; 7/5/2012: We make advertising videos; 7/17/2012: Official JHU + Coursera announcement
6. “Wouldn’t it be amazing if we got 2,000 people to learn statistics!” - Jeff Leek, 7/17/12
7. date: 7/19/12; from: jtleek@gmail.com. Roger let me know you gave him a ballpark figure for the number of students registered for his course “Computing for Data Analysis”. Could you give me an idea of how many have registered for my course “Data Analysis”?
8. date: 7/19/12; from: pangwei@coursera.org. Hi Jeff, 7,000 students! It's pretty awesome. (You'll be able to check this out yourself next week, once the class sites are up.)
9. date: 7/19/12; from: rdpeng@gmail.com. You are f**ed. -roger
10. 7/2012: Official JHU + Coursera announcement; 9/2012: Brian/Roger run classes; 1/2013: Jeff runs “data analysis”
11. A MOOC is Videos
12. A MOOC is Quizzes
13. A MOOC is Forums
14. A MOOC is Peer grading
15. Formatting: (1) Does the analysis have an introduction, analysis, and conclusions? (wt = 10) (2) Does the analysis include references for the statistical methods used? (wt = 2) …. The Question: (1) Is the type of question specified (exploratory, inferential, predictive, causal)? (wt = 10) (2) Does the analysis answer the scientific question? (wt = 10) (3) Does the analysis report a measure of uncertainty about the answer? (wt = 10) …. …
16. Leek & Peng 2015 PNAS
17. Experiment 1
18. Fisher et al. 2014 PeerJ, n=2,048
19. Fisher et al. 2014 PeerJ
20. Fisher et al. 2014 PeerJ
21. Experiment 2
22. 69% vs 40%, n=1,985
23. Understanding scale: Mathematical Biostatistics Bootcamp (~15K enrolled); Computing for Data Analysis (~50K enrolled); Data Analysis (~100K enrolled)
24. Understanding scale: 6,503 Data analysis completers; 6,761* M.S. in Statistics. * http://community.amstat.org/blogs/steve-pierson/2014/02/09/largest-graduate-programs-in-statistics
25. Understanding cost: Laptop; iPhone w/ tripod mount; Tripod; Microphone (a good one); Camtasia (screen recording); Final Cut Pro X (video editing). Total Cost: \$2,877
26. 7/2012: Official JHU + Coursera announcement; 9/2012: Brian/Roger run classes; 1/2013: Jeff runs “data analysis”; 11/2013: Daphne Koller visits
27. 7/2012: Official JHU + Coursera announcement; 9/2012: Brian/Roger run classes; 1/2013: Jeff runs “data analysis”; 11/2013: Daphne Koller visits. We claim to have data science sequence
28. This is false
29. Failure is not an option
30. 1/2013: Jeff runs “data analysis”; 11/2013: Daphne Koller visits. We claim to have data science sequence; 12/2013: We start making DSS; 2/2014: We start testing DSS; 4/2014: We launch!
31. 9 classes, 1 month long, every month
32. Less standard content: Github, Data cleaning, Interactive graphics, Presentations, Capstone. Standard content: Probability, Inference, Regression and GLMs, EDA
33. Moore Data Science Environments: 0/3 directors, 1/25 speakers statisticians; NAS Big Data Workshop: 2/13 speakers statisticians; NIH BD2K Proposal Workshop: 0/18 participants; Big Data Rollout from White House: 0/4 thought leaders in statistics
34. (1/n) reasons: speed
35. “Should we teach the Lasso?”
36. “No”
37. (2/n) reasons: infrastructure
38. Less standard content: Github, Data cleaning, Interactive graphics, Presentations, Capstone. Standard content: Probability, Inference, Regression and GLMs, EDA
39. swirl + Coursera
40. “Want to do a capstone?”
41. “Ok guy I just met”
42. LinkedIn Certification
43. This is not a degree! Portfolio based; open content; Johns Hopkins backing; alumni “social network”
44. Enrollment
45. Sigtrack
46. Completion percentage
47. Sigtrack completion
48. Total Time Running: 13 months; Avg. Monthly Enrollment: 170,837; Avg. Monthly SigTrack: 12,486 (7.3%); Overall Completion Rate: 10%; SigTrack Completion Rate: 85%; First Capstone Enrollment: 663
49. Cost comparison
50. Revenue for 2014 (Q2-Q4): \$1.75M; Revenue to Biostatistics: \$1.24M; Resources req’d to date: 0.5 staff; Low overhead: no admissions process, no student supervision, no administrative support; Student population: Orthogonal?
51. Why I think we were successful
52. http://www.provost.umd.edu/announcements/new_coursera_mooc.cfm
53. @jtleek jtleek.com/talks
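
As a quick check on the summary figures in the enrollment and revenue slides, the Python sketch below recomputes two ratios: the SigTrack share of average monthly enrollment, and the share of 2014 revenue that went to Biostatistics. The numbers are copied from the slides. The variable names and the `share` helper are illustrative only and do not come from the talk.

```python
# Figures copied from the enrollment and revenue summary slides.
avg_monthly_enrollment = 170_837   # Avg. Monthly Enrollment
avg_monthly_sigtrack = 12_486      # Avg. Monthly SigTrack
revenue_total_musd = 1.75          # Revenue for 2014 (Q2-Q4), in $M
revenue_biostat_musd = 1.24        # Revenue to Biostatistics, in $M


def share(part, whole):
    """Return part / whole as a percentage, rounded to one decimal place.

    Hypothetical helper written for this sketch; not from the talk.
    """
    return round(100 * part / whole, 1)


sigtrack_share = share(avg_monthly_sigtrack, avg_monthly_enrollment)
biostat_share = share(revenue_biostat_musd, revenue_total_musd)

print(f"SigTrack share of monthly enrollment: {sigtrack_share}%")
print(f"Share of revenue to Biostatistics: {biostat_share}%")
```

The first ratio comes out at 7.3%, matching the percentage shown next to the SigTrack figure. The second, roughly 70.9%, is derived here and is not stated in the slides.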