Tech Talks: How to Setup a Data Science Business Function
Jun 2015
www.applied.ai
How to Setup a Data
Science Business Function
Applied AI Tech Talk
● We are data scientists:
○ variously quants, statisticians, actuarial & machine learning types
● We are consultants:
○ we do complex data analysis, predictive modelling etc
○ and we also help to do the soft stuff...
… enabling companies to learn from their data in a sustainable way
This is a totally biased talk
Like any collaborative business effort involving research & development,
a data science function should be built carefully in order to enable the
best expertise and technologies.
- Me, ~2 weeks ago
http://blog.applied.ai/how-to-build-a-data-science-business-function/
How to Setup a Data Science Business Function a.k.a.
Making in-house Data Science sustainable
Data Science is a broad discipline
● Including, for example:
○ one-off scenario-specific modelling exercises
○ on-line predictive modelling of user actions
○ regular analysis of campaigns and customer discovery
… and a significant amount of data acquisition, preparation, storage etc
● To be sustainable and minimise
risk, we need to combine:
○ great people
○ advanced maths
○ scientific experimentation
○ software engineering
○ high-quality data
○ solid business practices
○ communication
The most important thing is communication
https://www.quora.com/How-could-the-Data-Science-Venn-Diagram-be-improved
Four main areas to cover:
1. Setting up and sizing the team
2. Defining and operating projects
3. Systemising the data pipeline and analyses
4. Ensuring effective communication
… to help us make in-house Data Science sustainable
● The practitioner will use a wide variety of tools to:
○ acquire, manipulate, store and access data efficiently
○ design surveys and scientific experiments to test hypotheses
○ undertake statistically valid analyses
○ implement high-quality, optimised predictive models
○ derive and communicate actionable insights
… requiring diverse skills covering database management, software
engineering, statistical analysis, machine learning, graphic design, ethics,
social responsibility, domain knowledge and communication.
1. Setting up and sizing the team
Data Scientists need a lot of skills!
● But the days of hiring a single, unicorn-like, 'full-stack' data scientist are
pretty much gone, and probably never really existed.
1. Setting up and sizing the team
Don’t believe in unicorns
The team needs to be small, agile and focused:
● 2-6 data scientists is ample
● they should be proven generalists, team-players and pragmatists
● able to cope with vague requirements, messy data and high failure rates
“The first hire(s) should help get three things ready: your data; a clear problem
to be solved; and a process to evaluate the business impact of any new
solution.”
- Simon Chan, Forbes, April 2015
http://www.forbes.com/sites/theyec/2015/04/30/how-to-do-your-first-data-science-hire-right/
1. Setting up and sizing the team
Start with a small, focused team
Any piece of research or development likely to last more than a few
days and/or involve more than one person should have:
● A primary sponsor and a project leader
● A well-defined goal (SMART), and a written spec
● Progress meetings to validate and update the plan, with full and frank
communication between major stakeholders
● Knowledge sharing upon completion
● Consider maintaining a basic RACI and risks & issues register.
2. Defining and operating projects
Automate good workflows and deal with technical debt:
● Understand and map the data 'pipeline'
● Stop when the models are good enough
● Encourage a systematic, shared approach to the creation of all machine
learning tools and analyses, with:
○ proper source control and documentation
○ code reviews & 'lunch and learn' seminar sessions
○ regular refactoring of algorithms, applications and data preparation
scripts where appropriate.
3. Systemising the data pipeline and analyses
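The "stop when the models are good enough" point above can be made concrete with a simple stopping rule. What follows is a minimal sketch, not something taken from the talk: the function name `should_stop`, the `target` score agreed with the project sponsor, and the `min_gain` threshold are all hypothetical, and a real project would choose its own metric and thresholds.

```python
from typing import Sequence


def should_stop(scores: Sequence[float], target: float, min_gain: float = 0.01) -> bool:
    """Return True when another modelling iteration is hard to justify.

    scores   -- validation scores of successive model iterations (higher is better)
    target   -- hypothetical score agreed with the business sponsor
    min_gain -- hypothetical smallest improvement worth another iteration
    """
    if not scores:
        return False  # nothing evaluated yet, keep going
    if scores[-1] >= target:
        return True  # the agreed business target has been met
    if len(scores) >= 2 and scores[-1] - scores[-2] < min_gain:
        return True  # the latest iteration barely helped: diminishing returns
    return False


if __name__ == "__main__":
    history = [0.71, 0.78, 0.784]  # illustrative numbers only
    print(should_stop(history, target=0.85))
```

Writing the rule down before modelling starts gives the sponsor and the team a shared definition of "good enough", which ties in with the written spec recommended for each project.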
Strong communication within & without the team is vital, helping to
ensure that projects stay on-track and issues are spotted early:
● Daily stand-up meetings (<10 mins), sharing immediate activities & issues
● An up-to-date communal task schedule - e.g. the Kanban methodology
● Simplified and centralised comms tech; move written discussions away
from email and towards wikis, message boards, and group chats (e.g. Slack)
● Try to allow data scientists / software engineers the time & space to get
into a productive flow state without meetings and interruptions.
4. Ensuring effective communication
● Start with a small team of capable generalists and work hard to define the
business problems and success criteria, set timescales, and understand &
access the available data
● Allow for and embrace failure; give data scientists time and space to
research and experiment
● Specialise when necessary, automate where possible and embed into an
ongoing cycle of development, maintenance and support
● Require a corporate sponsor with clout and encourage strong
communication within the team and with the rest of the business
http://blog.applied.ai/how-to-build-a-data-science-business-function/
In review
Applied AI is a data science consultancy
We provide data-driven insights and solutions using applied artificial intelligence
www.applied.ai
Thank You
Any questions?