WHO AM I?
• 3+ years in Data Science
• MS in Applied Mathematics
• Professional interests: recommendation systems, natural language
processing, scalable data science solutions
• Author of two blogs: energyfirefox.blogspot.com,
• Fan of online education (20+ finished MOOCs)
• What is Data Science and why do we need it?
• Data Scientists: who they are and what they do
• How to start?
• Practical case
WHO DATA SCIENTISTS ARE
AND WHAT DO THEY DO?
TYPES OF DATA SCIENTISTS
A - Analysis
B - Building
DS TYPE “A” - ANALYSIS
• making sense of data or working with it in a fairly static way.
• very similar to a statistician (and may be one)
• knows all the practical details of working with data that
aren’t taught in the statistics curriculum: data cleaning,
methods for dealing with very large data sets, visualization,
deep knowledge of a particular domain, writing well
DS TYPE “B” - BUILDING
• share some statistical background with Type A
• very strong coders and may be trained software engineers
• mainly interested in using data “in production.”
• build models which interact with users, often serving
recommendations (products, people you may know, ads,
movies, search results).
WHAT DO THEY DO?
TYPICAL DATA SCIENCE
• Preparing to run a model (Gathering, cleaning,
• Running the model
• Interpreting the results
“80% of work” - Aaron Kimball
“Other 80% of the work”
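The three steps above (preparing, running, interpreting) can be sketched in a few lines. This is only a minimal illustration, not a recommended pipeline: the sample records, the function names `prepare`, `run_model` and `interpret`, and the choice of a one-variable least-squares line as "the model" are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical raw records (x, y) as strings; some values are missing or malformed.
raw_rows = [("1", "2.1"), ("2", "3.9"), ("", "5.0"),
            ("3", "6.2"), ("4", "n/a"), ("5", "9.8")]


def prepare(rows):
    """Step 1: gather and clean. Keep only rows where both fields parse as numbers."""
    cleaned = []
    for x, y in rows:
        try:
            cleaned.append((float(x), float(y)))
        except ValueError:
            continue  # drop rows that cannot be parsed
    return cleaned


def run_model(points):
    """Step 2: fit y = slope * x + intercept by ordinary least squares."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def interpret(slope, intercept):
    """Step 3: turn the fitted numbers into a statement a reader can act on."""
    direction = "increases" if slope > 0 else "decreases"
    return (f"y {direction} by about {abs(slope):.2f} per unit of x "
            f"(baseline {intercept:.2f} at x = 0)")


if __name__ == "__main__":
    points = prepare(raw_rows)
    slope, intercept = run_model(points)
    print(f"kept {len(points)} of {len(raw_rows)} rows")
    print(interpret(slope, intercept))
```

Even in this toy version, the cleaning and interpretation steps take about as much code as the model fit itself, which is the point the two "80%" labels make.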
DOMAIN KNOWLEDGE AND
• Passionate about the business
• Curious about data
• Influence without authority
• Hacker mindset
• Problem solver
• Strategic, proactive, creative, innovative and collaborative
• Computer science fundamentals
• Scripting language
• Statistical computing language
• Relational algebra
• Distributed computations
• Ability to engage with senior management
• Storytelling skills
• Visual art design
• Knowledge of a visualization tool
• Translate data-driven insights into decisions and actions