1. The Building Data Science
Summit | Boston 2017
Saturday, June 24
In my presentation, entitled ‘The Rise of the Data Scientist,’ I will be discussing what it takes to truly become great in this industry. I’ll explain the difference between data science & data analysis, and I’ll take you on a whistle-stop tour of the history that made Data Science what it is today.
An introduction to Adam Keene
Adam Keene
Head of Data Science, Boston MA
Harnham Inc.
2. THE RISE OF THE DATA SCIENTIST
By Adam Keene
Head of Data Science - Boston, at Harnham
4. Who am I?
- Adam Keene (not from around these parts)
- Liverpool FC
- Head of Data Science Practice
- 8 years recruitment experience
- 4 years on agency-side in analytics
- UK to Chicago
- Chicago to New York
5. Content
- A short history of Data Science & Machine Learning
- Exposition of Data Analysis & Data Science
- What it takes to be a great Data Scientist
6. Defining Data Science
“Data science is an interdisciplinary field pertaining to scientific methods, processes, and systems to extract knowledge or insights from data in various forms, either structured or unstructured”
- Vasant Dhar
7. Before we get into this…
- The opportunity to share
- Aspiring, Practicing, Leadership
- Evolve your craft, expand your horizons, stand out
- Mentors & thought leaders, on tap
- Now, let’s go back to the beginning…
9. 1974
- Peter Naur
- Computer Science Pioneer
- Turing Award recipient
- Concise Survey of Computer Methods
- Computer Science to Data Science
10. 1977
- The International Association for Statistical Computing (IASC) established
- Section of the ISI (International Statistical Institute)
11. 1989
- Gregory Piatetsky
- Knowledge Discovery in Databases workshop (KDD)
- KDnuggets = a very useful resource
12. 1994
- BusinessWeek cover story on Database Marketing
“Companies are collecting mountains of information about you, crunching it to predict how likely you are to buy a product, and using that knowledge to craft a marketing message precisely calibrated to get you to do so”
13. 1996
- International Federation of Classification Societies (IFCS)
- Data Science used in the title of their conference for the first time:
“Data Science, Classification and Related Methods”
14. 1997
- C.F. Jeff Wu
- Professor at University of Michigan (now at Georgia Tech)
- Coca-Cola Chair in Engineering Statistics
- Called for Statisticians to be renamed Data Scientists
15. 2002
- William S. Cleveland
- Distinguished Professor of Statistics, Purdue
“Data Science: An action plan for expanding the technical areas of the field of statistics”
16. 2009
- Hal Varian
- Google Chief Economist
“I keep saying that the sexiest job of the next ten years will be statisticians”
18. 2012 – Data Science gets sexy!
Thomas H Davenport & DJ Patil
Data Scientist: The Sexiest Job of the 21st Century
- Harvard Business Review October 2012
19. Key milestones in Machine Learning
- 1812 – Bayes’ Theorem completed
- 1913 – Creation of Markov Chains
- 1950 – Creation of Alan Turing’s Learning Machine
- 1951 – First Neural Networks (The rat in a maze simulation)
- 1952 – Computers play checkers
- 1982 – The first Recurrent Neural Network
- 1992 – Computers play backgammon
- 1995 – Random Forests & Support Vector Machines (SVM)
- 1997 – IBM Deep Blue beats Kasparov
- 2006 – Netflix Prize (user rating prediction algorithm)
- 2010 – First Kaggle competition
- 2011 – Watson wins Jeopardy
- 2014 – Facebook publish DeepFace (Facial recognition)
- 2016 – Watson beats humans at Pokémon Go
21. Defining a Data Analyst
“A Data Analyst performs the process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making”
- Wikipedia
22. Data Analysis & Data Science – Key Differences
Data Analyst
- High level of education not a prerequisite
- Collection & processing of data
- Analysis of data using tools such as SAS & SQL or BI Tools
- Analyzes and mines business data, often from single data sources (such as CRM)
- Creates reports to aid business decision-making
- Finds the answers to prescribed business questions
Data Scientist
- High level of education a necessity
- A highly capable programmer in the likes of Python, R, Spark etc.
- Using multiple, often disparate & uncategorized data sources
- A storyteller who can communicate internally/externally
- Identifying new business questions that can add value
- T-shaped people
23. A couple more simplifications…
- A great Data Analyst can analyze what they have, and come up with insightful answers to prescribed business questions
BUT
- A great Data Scientist can discover the questions that the business should have been asking all along
24. Something that stuck with me
- A Data Analyst is like a tour guide, able to take what we already know, and provide a unique insight into it
- A Data Scientist is an explorer, discovering uncharted lands and bringing insight and new questions from the unknown
26. Great Foundations
- A Masters/Ph.D. education in a statistical, math, computer science or related discipline
- Or, something industry specific
- A curious, obsessed problem solver
- An excellent level of technical ability using key tools such as Python & R
- Statistical rigor (Scientific Method)
- These things are the new normal
27. The Triple Threat
1. Business acumen
2. Top-tier communication skills
3. A hobbyist mindset
28. Business acumen
“Business acumen is keenness and speed in understanding and deciding on a business situation.”
- The Financial Times lexicon
- There is no doubt that, whether liaising internally or externally, communicating the ROI of your work will be integral to your success
29. Top-tier communication skills
- A great Data Scientist:
- Eloquently breaks down their most complex thoughts into simple words
- Tells stories which define, with simplicity & brevity, the problem they were trying to solve, how they solved it, and most critically, the outcome and what it means for the business
- Is capable of sticking to their guns with diplomacy, confident in their statistical rigor
- Evangelizes the value of their work to the entire business
- Is a collaborator who shares their ideas and works well with others, with open lines of communication
30. A hobbyist mindset
“A hobbyist is someone that would spend their time doing Data Science even if it wasn’t a career option”
- Adam Keene, 2017
- A hobbyist will:
- Have projects they work on in their spare time, especially those which expose them to new areas in their field
- Stay up to date, and take part in conversations online as well as in person through Meetups and hackathons
- Cultivate a great GitHub repository, sharing their best work with pride
- Compete with their peers on Kaggle, creating a profile that makes them proud
31. Kaggle & GitHub
- There are huge & untold benefits to making the most of Kaggle & GitHub
32. A note on the hiring process
Rockstar, Purple Squirrel, Weapon
33. Getting the interview
- Tangibility and a focus on outcomes are a necessity
- Create a segment on your resume for times when you’ve presented, whether at Meetups or otherwise, and include the topic you spoke on
- Show evidence of both personal successes and fruitful collaboration
- Outline your communication skills in your opening statement; be specific
- Make sure your hobbies & interests include your passion for Data Science, with examples
- Put links to your Kaggle & GitHub on your resume; be proud of them
34. Hiring
- Partner closely with your recruiter, whether internal or external
- If you use a 3rd party, stick with one initially
- But, ensure expertise
- Define what you wish to see in order to interview
- A convoluted process is like picking pizza toppings
- Know why you want to hire, not just how
- Bring your team in
- Don’t just pick the best, pick the best for you
36. THANK YOU!
Adam Keene
Head of Data Science – Boston, MA
adamkeene@harnham.com
www.harnham.com/us
Or, find me on LinkedIn
Editor's Notes
My name is Adam Keene, and if my accent hasn’t already given it away, I’m not from around these parts
I’m originally from the UK, and as you may already know, football (soccer) is life there, and as such I’ve been a devoted LFC fan since I was as young as I am in this flattering photo (!)
I head up the Data Science practice at Harnham, and my role is focused solely right here in Boston. My role is to help the very best companies find the very best Data Science talent
I have 8 years’ experience in recruitment, and on top of that, 4 years working on the agency-side as an Account Director on Digital Marketing & Analytics projects
Combining my experience, I took the opportunity to move to the US in 2015 to work with a generalist technology recruitment company in Chicago
My role here was to head up a Data Science practice for the Midwest. I enjoyed the role, but learned some lessons too – role was within a generalist with little specialism, the geography was too large (explain)
This led me to Harnham, where I moved in October last year – we’re the largest dedicated data & analytics company in the world (75 people, 3 offices, starting 2006 in London leading to offices in NY & SF)
Discuss DS, D&A, M&I, Cred Risk & Data & Tech
It also allows me to focus entirely on what is, in my opinion one of the very best tech communities in the world – Boston, with the Data Science community being right at the heart of that
Quote - Vasant Dhar, Professor of Information Systems, NYU
Time to go back to the start – sources include ‘A Very Short History of Data Science’ – Gil Press, Forbes 2013
Before we dive in here, I want to let you know how much this means to me
To have the opportunity today to share my knowledge & insight with a room comprising some of the very best aspirants, practitioners & leaders in the Data Science industry
By being here you’ve taken a step toward evolving your craft, expanding your horizons and ultimately, standing out from the crowd
You have the opportunity today to pick the brains of some of the very best people in this industry, make the most of it!
The truth is, we’re appreciative of the chance to do this, so once again – thank you for being here
Now, let’s go back to the beginning…
Peter Naur, esteemed computer science pioneer & Turing award winner – he unfortunately passed away last year
Called for the term of Computer Science to be renamed Data Science
IASC is established as part of the ISI
Gregory Piatetsky chairs the first KDD workshop in 1989
Big suggestion from me for the audience to check out KDnuggets
September cover story in BusinessWeek
Quote
Book published by Wiley in 2002
More detail on this available at KDnuggets - Data Science Tools – Are Proprietary Vendors Still Relevant?
In 2012, Harvard Business Review published “Data Scientist: The Sexiest Job of the 21st Century”
The article was written by Thomas H Davenport (Babson) (Harvard Alma Mater) and DJ Patil, former U.S. Chief Data Scientist at the White House
- Wikipedia actually has a phenomenal, and more detailed background for those interested
DeZyre.com wrote an article in February of this year that expanded on this - “A data scientist job roles involves estimating the unknown whilst a data analyst job roles involves looking at the known from new perspectives.”
I will preface this by telling you that I shamelessly stole this from a contact that I spoke to back in December, it resonated with me, it stuck with me and I’d like to share it with you today…
Our data – 92% of DS have an advanced degree (44% Masters, 48% PhD) & 69% of DS have less than 10 years’ experience. So it’s key that you realize that, for all the education you’ve gained, this is the new norm. Not only that, it gets more competitive with every single graduating class
Therefore, to really stand out, you have to have more…
KDnuggets article March 2017
So, if a great education, and great technical skills are the new norm, then what else is needed to succeed in this game?
Let me introduce the triple threat… Discuss how it came about
Define why it’s important to deliver ROI, and how the FT definition points towards that
Get into a business mindset – ask explicitly on every brief what the business objective is, write this down and stick to it like glue. Take business courses on Coursera in your spare time to help yourself understand how what you do can impact a business. Do a program such as Insight, where you can work on a live brief with specific outcomes you need to achieve
Remind the room of their level of intelligence & complexity vs most rooms that they are in
Evangelism – show & tell example
Importance of collaboration
Take every chance you can get to present, go to Meetups and offer to speak, present to your classmates, step up at work and offer to do a lunch & learn where you can present on a topic. Really look to put yourself out of your comfort zone and build up your confidence
Emphasize the Kaggle/Git aspect and what employers look for on resumes
- We are an open source society – we’ve embraced open source tools, and this now extends to our work itself. Exposing your work in the public domain like this is a surefire way to spur yourself on to bigger and better things, and will open you up to feedback from your peers and potential employers about how you can reach your own next version number
I just wanted to give some quick examples, from my own experience, of how recruiters and businesses work to identify the very best talent
- Today will show you how you can become the next Rockstar/purple squirrel or weapon to make an impact in this industry
Discuss Meetup benefits – there’s a community here!
I’ve defined this as getting the interview, and not getting hired – the rest of the day will give you plenty of knowledge on that