data science in academia and the real world

talk given 2014-04-29 to the New York Open Statistical Programming Meetup (http://www.meetup.com/nyhackr/) by Chris Wiggins (columbia/NYT/hackNY)

Transcript

  • 1. what is a computational biologist doing at the New York Times? (and what can academia do for a 163-year-old company?) chris.wiggins@columbia.edu chris.wiggins@nytimes.com chris.wiggins@hackNY.org @chrishwiggins
  • 2. context/background
  • 3. context/background (before ‘the talk’)
  • 4. biology: 1892 vs. 1995 biology changed for good.
  • 5. genetics: 1837 vs. 2012 from “segments” to algorithms
  • 6. genetics: 1837 vs. 2012 from intuition to prediction
  • 7. data science: web scale
  • 8. example: 163 yr old
  • 9. bit.ly/nyt-interactive-2013
  • 10. R+D: nytlabs.com
  • 11. developer.nytimes.com: 2008
  • 12. example: millions of views per hour
  • 13. from “segments” to algorithms insert figure here
  • 14. from intuition to prediction insert figure here
  • 15. data science: the web
  • 16. data science: the web is your “online presence”
  • 17. data science: the web is a microscope
  • 18. data science: the web is an experimental tool
  • 19. data science: the web is an optimization tool
  • 20. </header>
  • 21. </header> i.e., <body>
  • 22. common requirements in data science:
  • 23. common requirements in data science: 1. practices 2. skills 3. culture
  • 26. data science: practice
  • 27. data science: practice - reframe domain questions as machine learning tasks
  • 28. data science: practice - better wrong than "nice"
  • 29. data science: practice - be relevant
  • 32. data science: practice - hypotheses are not data jeopardy
  • 33. data science: practice - befriend experimentalists
  • 36. data science: skills
  • 37. data science: skills - find quantifiables
  • 38. data science: skills - find quantifiables (choose carefully)
  • 39. data science: skills - straw man first
  • 41. data science: skills - small wins before feature engineering
  • 42. data science: skills - data engineering before data science
  • 43. data science: culture
  • 44. data science: culture - be communicative
  • 45. data science: culture - be communicative (promote rhetorical literacy)
  • 46. data science: culture - be communicative (promote rhetorical literacy) - related: strive to build models which are both predictive and interpretable
  • 47. data science: culture - be skeptical (promote critical literacy)
  • 48. data science: culture - be empowering
  • 49. data science: culture - be transparent
  • 50. data science: culture - promote literacy: functional, critical, rhetorical (cf. Selber, Multiliteracies for a Digital Age, 2004)
  • 51. data science: culture - promote literacies: 1. functional 2. critical 3. rhetorical (cf. Selber, Multiliteracies for a Digital Age, 2004)
  • 55. </body> i.e., <footer>
  • 56. summary:
  • 57. summary: pay attention to: 1. practices 2. skills 3. culture
  • 58. practices: 1. reframe questions as ML 2. better wrong than "nice" 3. be relevant 4. aim for hypothesis vs data jeopardy 5. befriend experimentalists
  • 59. skills: 1. find quantifiables 2. straw man first 3. small wins before feature engineering 4. data engineering before data science
  • 60. culture: 1. be communicative 2. be skeptical 3. be empowering 4. be transparent 5. promote literacies
  • 61. find out more! 1. postdoc/student opportunities: chris.wiggins@columbia.edu 2. always hiring: chris.wiggins@nytimes.com 3. let’s talk: - @chrishwiggins - gist.github.com/chrishwiggins/
  • 62. what is a computational biologist doing at the New York Times? (and what can academia do for a 163-year-old company?) chris.wiggins@columbia.edu chris.wiggins@nytimes.com chris.wiggins@hackNY.org @chrishwiggins