"Human Cloning: The Data Scientist Bottleneck Resolved" Dr. Alex Farquhar @ds_ldn
Presentation by Dr. Alex Farquhar, Data Scientist @ForwardTek, at Data Science London @ds_ldn, on the scarcity of data scientists and how data scientists can maximise their output.

© All Rights Reserved

    Presentation Transcript

    • HUMAN CLONING: The Data Scientist bottleneck resolved. Dr Alex Farquhar, Friday, 24 February 2012
    • [Chart: data in exabytes by year, 2008 to 2017; vertical axis 0 to 20,000 (IDC/EMC report 2008)]
    • By 2018, the United States alone could face a shortage of 140,000 to 190,000 data people...
    • WE’RE ALL DOOMED
    • DATA PEOPLE? © Drew Conway
    • MAYBE WE CAN JUST...
        • 1 statistician + 1 developer ≈ 1 data scientist?
    • HOW ABOUT...
        • 4 statisticians + 4 developers ≈ 4 data scientists?
    • WHAT CAN WE DO?
        • Train more new data scientists (not fast enough)
        • Cross-train people
        • Cobble together different skills in teams (see above)
    • WHAT CAN WE DO?
        • Do more work
    • DOING MORE
        • simplify (fob the work off)
        • automate (fob even more work off)
        • choose/build the right tools
        • parallelise
        • iterate
    • SIMPLIFY & AUTOMATE
        • Counting stuff is not much fun
    • SIMPLIFY & AUTOMATE [Diagram labels: Hive, TSV files, Hadoop]
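As a rough illustration of the "counting stuff" that this slide hands off to Hive and Hadoop, here is a minimal Python sketch that counts rows per value in tab-separated data. It is a local stand-in only, not the speaker's actual setup: the function name `count_by_column`, the sample data, and the column names are all hypothetical.

```python
import csv
import io
from collections import Counter


def count_by_column(tsv_text, column):
    """Count rows per distinct value of `column` in tab-separated text.

    Hypothetical helper: roughly what a GROUP BY / COUNT query would
    compute, done in memory instead of on a cluster.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return Counter(row[column] for row in reader)


# Made-up sample data; the first line is the header row.
SAMPLE_TSV = "user\tpage\nann\thome\nbob\thome\nann\tpricing\n"

counts = count_by_column(SAMPLE_TSV, "page")
for page, n in counts.most_common():
    print(page, n)
```

Once a count like this is wrapped in a function or a saved query, repeating it on new data no longer needs a data scientist's attention, which is the sense of "fob the work off".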
    • AUTOMATE / PARALLELISE [Diagram labels: magic, Hadoop, Job]
    • AUTOMATE / PARALLELISE [Diagram labels: magic, Hadoop; "Lots of jobs at once": Job 1, Job 2, Job 3, Job 4]
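The "lots of jobs at once" idea can be sketched in Python with the standard-library `concurrent.futures` module. This is an illustration under assumptions, not the tooling shown in the talk: `run_job` is a hypothetical placeholder for submitting one job and waiting for its result, and the job ids are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(job_id):
    """Hypothetical stand-in for submitting one job and waiting for it.

    A real version might launch a cluster job; here it just does a
    small computation so the sketch is self-contained.
    """
    return job_id, sum(range(job_id * 1000))


def run_all(job_ids, max_workers=4):
    """Run every job concurrently and collect results keyed by job id."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(run_job, job_ids))


results = run_all([1, 2, 3, 4])
for job_id in sorted(results):
    print("Job", job_id, "->", results[job_id])
```

Threads suit this pattern when each job mostly waits on an external system; for CPU-bound work inside Python itself, `ProcessPoolExecutor` from the same module would be the usual alternative.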
    • TOOLS
        • something that allows fast iteration, i.e. not Java
        • R, Ruby, Python
    • PARALLELISE
    • ITERATE
        • try different things
        • improve what works
        • dump what doesn’t
        • constant improvement & learning → get faster
    • WE’RE NOT ALL DOOMED