How to Build a Successful
Data Team ?
Hi !
Self-Introduction
Florian Douetteau, Dataiku
Dataiku - Data Tuesday
Meet Hal Alowne
Big Guys
• $10B+ revenue
• 100M+ customers
• 100+ data scientists
Hal Alowne
BI Manager
Dim’s Private Showroom
“Hey Hal! We need
a big data platform
like the big guys.
Let’s just do as they do!”
Average E-commerce Web Site
• $100M revenue
• 1 million customers
• 1 data analyst (Hal himself)
Dim Sum
CEO & Founder
Dim’s Private Showroom
Big Data
Copy Cat
Project
TECHNOLOGY DISCONNECT
Welcome to Technoslavia !
LOL PLATFORM ANTI-PATTERN
Test and Invest in Infrastructure == Skilled People
or
Go For Cloud / Packaged Infrastructure
Your brand-new Hadoop cluster is perceived as slow, underused, and unreliable
TECHNO MISMATCH ANTI-PATTERN
Assume Being Polyglot
or
Be a Dictator
The Python Clan vs. The R Tribe
The Old Elephant Fraternity vs. The New Elephant Club
PREDICTIVE ANALYTICS DEPLOYMENT STRATEGY
Winners of the 2000s (websites):
companies that were able to release fast
Winners of the 2010s ("Artificial Intelligence with Data for Internet of Things"):
companies able to put intelligence in production
Design a way to put “PREDICTIVE MODELS”
IN PRODUCTION
PEOPLE DISCONNECT
Classic BI Team Org
Business Leader
Data Consumer
Line-of-business
Data Consumer
Business Project Sponsor
BI Solution Architect
Model Designer
ETL Developer
Dashboard / Report Designer
DBA / IT Data Owner
Specs
Data Science Team Org
Business Leader
Data Consumer
Line-of-business
Data Consumer
Business Project Sponsor
Data Team Manager
Data Engineer
Data Analyst
Data System Engineer /
Data Architect
Specs
Data Scientist
Built From Scratch
Business Leader
Data Consumer
Line-of-business
Data Consumer
Business Project
Sponsor
DBA / IT Data Owner
Specs
Built From Engineering
Business Leader
Data Consumer
Line-of-business
Data Consumer
Business Project
Sponsor
Specs
Built From Analysts
Business Leader
Data Consumer
Line-of-business
Data Consumer
Business Project
Sponsor
Specs
Manage Expectations
Real job: Data Plumber, Data Waiter, Data Cleaner
Dream job: Data Engineer, Data Scientist, Data Analyst
Perfectly Natural Hidden thoughts
Business Project
Sponsor
Data Team Manager
Data Engineer
Data Analyst
Data Scientist
Managing Extreme Personalities
Data SCIENTIST
Highly Creative
Passionate
Hard to hire?
Hard to manage?
Wants to take your job?
Ambitious
Paired for Data
Data Analyst
Discover patterns
Fight data entropy
Data Engineer
Make things work
Fight tech entropy
What do you prefer?
One analyst and one engineer who work together?
Two data scientists?
DATA DISCONNECT
What is the main reason data projects fail?
DATA NOT AVAILABLE
BUT FOR ONLY INCREMENTAL GAIN
[Chart: contribution to the overall project performance, on a 0% to 100% scale. Segments: Business Goal Definition and Data; Feature Engineering; Algorithm. Segment values shown: 20%, 30%, 50%]
How to Get Data If You Don’t Have It
THE CICADA, THE SPIDER, THE FOX
The Cicada: Optimistic and Opportunistic Data
As a startup
- Build a new product using open data
As a group inside a company
- Benefit from the data-sharing initiative within your company
- Wait for data to be available in your data lake
The Spider: Power of the Network
As a startup
- Create a network of (web trackers | sensors)
- Make it available for free
- Build your service on people’s collected data
As a group inside a company
- Make a web service available to collect data
- Promote it internally so that people use it
The Fox: Hunt for the Big Money First
As a startup
- Hunt for a business group within a large company with a problem
- Build a SaaS solution using their data
- Replicate to competitors
As a group inside a company
- Take charge of a critical problem at the CEO’s request
- Build your own integrated tech team to solve it
- Use those resources to reset data services internally
PRODUCT DISCONNECT
What is Big Data about?
The Age of Distributed Intelligence
Global, personalised, and real-time data-driven services
Data to Visualize or Data to Automate?
[Timeline chart, 2013 to 2018: from "Visualize to Decide" to "Automated Decision"]
Moving to a world of automated decision making
Where is your added value?
[Decision flowchart]
Questions:
- Is the problem at the core of my business process?
- Is it a common problem, with shared data?
- Can I solve it on my own? Really?
Outcomes:
- Go for a best-of-breed SaaS solution
- Built by the data team
- Hire consultants and learn
Branch labels: Yes / No / "I can’t" / "OK, I can try"
Be aware of the comfort zone
[Chart: use cases placed on two axes. Axis labels: Mission Critical / Sheer Curiosity; Small, Structured / Large, Diverse]
- Reporting for finance in any industry
- Analyze each tweet
- Web navigation for e-merchant
- Ticket data for discounts in retail
- Phone call logs for security
- RTB data for advertising
- Customer consumption for anti-churn in utilities
- Optimization
- Filings for fraud in insurance
Not enough data to learn from?
Not enough "hard" examples to learn from?
Infuse the Data and Try Mindset
Brendan Stern is now a Specialist for Data Science in Healthcare at Dataiku
“When I was 20, working as a manager at my Starbucks shop, I realised that I could probably increase sales of ground coffee. Depending on the day and time of day, I kept moving the ground coffee around. I managed to run some A/B tests that improved the average sale amount by 12%.”
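The kind of comparison described in the quote can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sale amounts are made up, the function names `relative_lift` and `permutation_p_value` are hypothetical, and the one-sided permutation test is one common way to check such a result, not necessarily what was done in the story.

```python
import random
from statistics import mean


def relative_lift(control, variant):
    """Relative change in average sale amount, variant versus control."""
    base = mean(control)
    if base == 0:
        raise ValueError("control average is zero; lift is undefined")
    return (mean(variant) - base) / base


def permutation_p_value(control, variant, n_iter=2000, seed=0):
    """One-sided permutation test: how often does a random relabelling
    of the sales produce a difference at least as large as the observed one?"""
    rng = random.Random(seed)
    observed = mean(variant) - mean(control)
    pooled = list(control) + list(variant)
    n_control = len(control)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if diff >= observed:
            hits += 1
    return hits / n_iter


# Made-up sale amounts, for illustration only.
placement_a = [10.0, 12.0, 8.0, 10.0]   # current shelf position
placement_b = [11.0, 13.0, 9.0, 11.8]   # alternative shelf position

lift = relative_lift(placement_a, placement_b)
p_value = permutation_p_value(placement_a, placement_b)
print(f"lift: {lift:.1%}, one-sided p-value: {p_value:.3f}")
```

With samples this small the p-value will rarely be convincing, which is itself a useful reminder that a lift figure needs enough observations behind it.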
Involve the product team
Product feature: Personalised item ranking
Product feature: Notify user only when needed
Product feature: Historical data for path optimisation
Have Product Management Deeply Involved in the Data Team
Create an "API" Culture
Do not share
• Random Piece of Code
• Flat File
Do share
• Reproducible, documented workflows
• Clean, documented APIs
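As a minimal sketch of what "clean, documented API" could mean in contrast to handing over a flat file, the Python below exposes a result through a typed, validated, documented function. Everything here is hypothetical and chosen for illustration: the `CustomerScore` record, the `top_churn_risks` function, and the churn use case are assumptions, not part of the original deck.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class CustomerScore:
    """One scored customer (hypothetical record type)."""
    customer_id: str
    churn_probability: float  # expected to lie in [0.0, 1.0]


def top_churn_risks(scores: Iterable[CustomerScore], limit: int = 10) -> List[CustomerScore]:
    """Return up to `limit` customers, highest churn probability first.

    Raises ValueError if `limit` is negative or if any probability
    falls outside [0.0, 1.0], so consumers never see silently bad data.
    """
    if limit < 0:
        raise ValueError("limit must be non-negative")
    rows = list(scores)
    for row in rows:
        if not 0.0 <= row.churn_probability <= 1.0:
            raise ValueError(f"invalid probability for {row.customer_id}")
    return sorted(rows, key=lambda r: r.churn_probability, reverse=True)[:limit]


# Illustrative values only.
example = [
    CustomerScore("c1", 0.2),
    CustomerScore("c2", 0.9),
    CustomerScore("c3", 0.5),
]
top_two = top_churn_risks(example, limit=2)
print([row.customer_id for row in top_two])
```

The point of the design is that the contract (types, ordering, validation, failure mode) lives in one documented place instead of being rediscovered by every consumer of a shared file.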
WAITING FOR QUESTIONS SLIDE
More food for thought on Dataiku’s blog: http://www.dataiku.com/blog/
Find us on Twitter: @fdouetteau @dataiku
More Related Content

What's hot

Dataiku productive application to production - pap is may 2015
Dataiku    productive application to production - pap is may 2015 Dataiku    productive application to production - pap is may 2015
Dataiku productive application to production - pap is may 2015
Dataiku
 
Online Games Analytics - Data Science for Fun
Online Games Analytics - Data Science for FunOnline Games Analytics - Data Science for Fun
Online Games Analytics - Data Science for Fun
Dataiku
 
BreizhJUG - Janvier 2014 - Big Data - Dataiku - Pages Jaunes
BreizhJUG - Janvier 2014 - Big Data -  Dataiku - Pages JaunesBreizhJUG - Janvier 2014 - Big Data -  Dataiku - Pages Jaunes
BreizhJUG - Janvier 2014 - Big Data - Dataiku - Pages Jaunes
Dataiku
 
Back to Square One: Building a Data Science Team from Scratch
Back to Square One: Building a Data Science Team from ScratchBack to Square One: Building a Data Science Team from Scratch
Back to Square One: Building a Data Science Team from Scratch
Klaas Bosteels
 
Dataiku r users group v2
Dataiku   r users group v2Dataiku   r users group v2
Dataiku r users group v2
Cdiscount
 
Dataiku - hadoop ecosystem - @Epitech Paris - janvier 2014
Dataiku  - hadoop ecosystem - @Epitech Paris - janvier 2014Dataiku  - hadoop ecosystem - @Epitech Paris - janvier 2014
Dataiku - hadoop ecosystem - @Epitech Paris - janvier 2014
Dataiku
 
Applied Data Science Course Part 1: Concepts & your first ML model
Applied Data Science Course Part 1: Concepts & your first ML modelApplied Data Science Course Part 1: Concepts & your first ML model
Applied Data Science Course Part 1: Concepts & your first ML model
Dataiku
 
Dataiku - Big data paris 2015 - A Hybrid Platform, a Hybrid Team
Dataiku -  Big data paris 2015 - A Hybrid Platform, a Hybrid Team Dataiku -  Big data paris 2015 - A Hybrid Platform, a Hybrid Team
Dataiku - Big data paris 2015 - A Hybrid Platform, a Hybrid Team
Dataiku
 
Walmart Big Data Expo
Walmart Big Data ExpoWalmart Big Data Expo
Walmart Big Data Expo
BigDataExpo
 
The paradox of big data - dataiku / oxalide APEROTECH
The paradox of big data - dataiku / oxalide APEROTECHThe paradox of big data - dataiku / oxalide APEROTECH
The paradox of big data - dataiku / oxalide APEROTECH
Dataiku
 
How to build a data science team 20115.03.13v6
How to build a data science team 20115.03.13v6How to build a data science team 20115.03.13v6
How to build a data science team 20115.03.13v6
Zhihao Lin
 
Driving Datascience at scale using Postgresql, Greenplum and Dataiku - Greenp...
Driving Datascience at scale using Postgresql, Greenplum and Dataiku - Greenp...Driving Datascience at scale using Postgresql, Greenplum and Dataiku - Greenp...
Driving Datascience at scale using Postgresql, Greenplum and Dataiku - Greenp...
VMware Tanzu
 
The Rise of the DataOps - Dataiku - J On the Beach 2016
The Rise of the DataOps - Dataiku - J On the Beach 2016 The Rise of the DataOps - Dataiku - J On the Beach 2016
The Rise of the DataOps - Dataiku - J On the Beach 2016
Dataiku
 
The Big Data Dream Team
The Big Data Dream TeamThe Big Data Dream Team
The Big Data Dream Team
Accenture Analytics
 
How to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
How to Build a Successful Data Team - Florian Douetteau @ PAPIs ConnectHow to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
How to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
PAPIs.io
 
Data Science Salon: Introduction to Machine Learning - Marketing Use Case
Data Science Salon: Introduction to Machine Learning - Marketing Use CaseData Science Salon: Introduction to Machine Learning - Marketing Use Case
Data Science Salon: Introduction to Machine Learning - Marketing Use Case
Formulatedby
 
Machine Learning Services Benchmark - Inês Almeida @ PAPIs Connect
Machine Learning Services Benchmark - Inês Almeida @ PAPIs ConnectMachine Learning Services Benchmark - Inês Almeida @ PAPIs Connect
Machine Learning Services Benchmark - Inês Almeida @ PAPIs Connect
PAPIs.io
 
PASS Summit Data Storytelling with R Power BI and AzureML
PASS Summit Data Storytelling with R Power BI and AzureMLPASS Summit Data Storytelling with R Power BI and AzureML
PASS Summit Data Storytelling with R Power BI and AzureML
Jen Stirrup
 
Making big data work
Making big data work Making big data work
Making big data work
Ed Thewlis
 
Your Data Nerd Friends Need You!
Your Data Nerd Friends Need You!Your Data Nerd Friends Need You!
Your Data Nerd Friends Need You!
DataKitchen
 

What's hot (20)

Dataiku productive application to production - pap is may 2015
Dataiku    productive application to production - pap is may 2015 Dataiku    productive application to production - pap is may 2015
Dataiku productive application to production - pap is may 2015
 
Online Games Analytics - Data Science for Fun
Online Games Analytics - Data Science for FunOnline Games Analytics - Data Science for Fun
Online Games Analytics - Data Science for Fun
 
BreizhJUG - Janvier 2014 - Big Data - Dataiku - Pages Jaunes
BreizhJUG - Janvier 2014 - Big Data -  Dataiku - Pages JaunesBreizhJUG - Janvier 2014 - Big Data -  Dataiku - Pages Jaunes
BreizhJUG - Janvier 2014 - Big Data - Dataiku - Pages Jaunes
 
Back to Square One: Building a Data Science Team from Scratch
Back to Square One: Building a Data Science Team from ScratchBack to Square One: Building a Data Science Team from Scratch
Back to Square One: Building a Data Science Team from Scratch
 
Dataiku r users group v2
Dataiku   r users group v2Dataiku   r users group v2
Dataiku r users group v2
 
Dataiku - hadoop ecosystem - @Epitech Paris - janvier 2014
Dataiku  - hadoop ecosystem - @Epitech Paris - janvier 2014Dataiku  - hadoop ecosystem - @Epitech Paris - janvier 2014
Dataiku - hadoop ecosystem - @Epitech Paris - janvier 2014
 
Applied Data Science Course Part 1: Concepts & your first ML model
Applied Data Science Course Part 1: Concepts & your first ML modelApplied Data Science Course Part 1: Concepts & your first ML model
Applied Data Science Course Part 1: Concepts & your first ML model
 
Dataiku - Big data paris 2015 - A Hybrid Platform, a Hybrid Team
Dataiku -  Big data paris 2015 - A Hybrid Platform, a Hybrid Team Dataiku -  Big data paris 2015 - A Hybrid Platform, a Hybrid Team
Dataiku - Big data paris 2015 - A Hybrid Platform, a Hybrid Team
 
Walmart Big Data Expo
Walmart Big Data ExpoWalmart Big Data Expo
Walmart Big Data Expo
 
The paradox of big data - dataiku / oxalide APEROTECH
The paradox of big data - dataiku / oxalide APEROTECHThe paradox of big data - dataiku / oxalide APEROTECH
The paradox of big data - dataiku / oxalide APEROTECH
 
How to build a data science team 20115.03.13v6
How to build a data science team 20115.03.13v6How to build a data science team 20115.03.13v6
How to build a data science team 20115.03.13v6
 
Driving Datascience at scale using Postgresql, Greenplum and Dataiku - Greenp...
Driving Datascience at scale using Postgresql, Greenplum and Dataiku - Greenp...Driving Datascience at scale using Postgresql, Greenplum and Dataiku - Greenp...
Driving Datascience at scale using Postgresql, Greenplum and Dataiku - Greenp...
 
The Rise of the DataOps - Dataiku - J On the Beach 2016
The Rise of the DataOps - Dataiku - J On the Beach 2016 The Rise of the DataOps - Dataiku - J On the Beach 2016
The Rise of the DataOps - Dataiku - J On the Beach 2016
 
The Big Data Dream Team
The Big Data Dream TeamThe Big Data Dream Team
The Big Data Dream Team
 
How to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
How to Build a Successful Data Team - Florian Douetteau @ PAPIs ConnectHow to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
How to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
 
Data Science Salon: Introduction to Machine Learning - Marketing Use Case
Data Science Salon: Introduction to Machine Learning - Marketing Use CaseData Science Salon: Introduction to Machine Learning - Marketing Use Case
Data Science Salon: Introduction to Machine Learning - Marketing Use Case
 
Machine Learning Services Benchmark - Inês Almeida @ PAPIs Connect
Machine Learning Services Benchmark - Inês Almeida @ PAPIs ConnectMachine Learning Services Benchmark - Inês Almeida @ PAPIs Connect
Machine Learning Services Benchmark - Inês Almeida @ PAPIs Connect
 
PASS Summit Data Storytelling with R Power BI and AzureML
PASS Summit Data Storytelling with R Power BI and AzureMLPASS Summit Data Storytelling with R Power BI and AzureML
PASS Summit Data Storytelling with R Power BI and AzureML
 
Making big data work
Making big data work Making big data work
Making big data work
 
Your Data Nerd Friends Need You!
Your Data Nerd Friends Need You!Your Data Nerd Friends Need You!
Your Data Nerd Friends Need You!
 

Viewers also liked

Introduction à la Data Science l data business
Introduction à la Data Science l data businessIntroduction à la Data Science l data business
Introduction à la Data Science l data business
Vincent de Stoecklin
 
P Sizemore Data Team
P Sizemore Data TeamP Sizemore Data Team
P Sizemore Data Team
Paula Sizemore
 
Building Data Science Teams
Building Data Science TeamsBuilding Data Science Teams
Building Data Science Teams
EMC
 
APIEC SCIP Signaux Faibles
APIEC SCIP Signaux FaiblesAPIEC SCIP Signaux Faibles
APIEC SCIP Signaux Faibles
henri Laude
 
"Machine Learning and Internet of Things, the future of medical prevention", ...
"Machine Learning and Internet of Things, the future of medical prevention", ..."Machine Learning and Internet of Things, the future of medical prevention", ...
"Machine Learning and Internet of Things, the future of medical prevention", ...
Dataconomy Media
 
Dataiku pig - hive - cascading
Dataiku   pig - hive - cascadingDataiku   pig - hive - cascading
Dataiku pig - hive - cascading
Dataiku
 
Big Data Easy BI
Big Data Easy BI Big Data Easy BI
Big Data Easy BI
DataWorks Summit
 
Présentation Micropole 3eme Forum MDM - 19 novembre 2014
Présentation Micropole 3eme Forum MDM - 19 novembre 2014Présentation Micropole 3eme Forum MDM - 19 novembre 2014
Présentation Micropole 3eme Forum MDM - 19 novembre 2014
Micropole Group
 
DATA FORUM MICROPOLE - 2015
DATA FORUM MICROPOLE - 2015DATA FORUM MICROPOLE - 2015
DATA FORUM MICROPOLE - 2015
Micropole Group
 
Dataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex ChallengeDataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku
 
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku
 
Lambda Architecture - Storm, Trident, SummingBird ... - Architecture and Over...
Lambda Architecture - Storm, Trident, SummingBird ... - Architecture and Over...Lambda Architecture - Storm, Trident, SummingBird ... - Architecture and Over...
Lambda Architecture - Storm, Trident, SummingBird ... - Architecture and Over...
Dataiku
 
Building a Data Driven Organization
Building a Data Driven OrganizationBuilding a Data Driven Organization
Building a Data Driven Organization
IT Weekend
 
Idiots guide to setting up a data science team
Idiots guide to setting up a data science teamIdiots guide to setting up a data science team
Idiots guide to setting up a data science team
Ashish Bansal
 

Viewers also liked (14)

Introduction à la Data Science l data business
Introduction à la Data Science l data businessIntroduction à la Data Science l data business
Introduction à la Data Science l data business
 
P Sizemore Data Team
P Sizemore Data TeamP Sizemore Data Team
P Sizemore Data Team
 
Building Data Science Teams
Building Data Science TeamsBuilding Data Science Teams
Building Data Science Teams
 
APIEC SCIP Signaux Faibles
APIEC SCIP Signaux FaiblesAPIEC SCIP Signaux Faibles
APIEC SCIP Signaux Faibles
 
"Machine Learning and Internet of Things, the future of medical prevention", ...
"Machine Learning and Internet of Things, the future of medical prevention", ..."Machine Learning and Internet of Things, the future of medical prevention", ...
"Machine Learning and Internet of Things, the future of medical prevention", ...
 
Dataiku pig - hive - cascading
Dataiku   pig - hive - cascadingDataiku   pig - hive - cascading
Dataiku pig - hive - cascading
 
Big Data Easy BI
Big Data Easy BI Big Data Easy BI
Big Data Easy BI
 
Présentation Micropole 3eme Forum MDM - 19 novembre 2014
Présentation Micropole 3eme Forum MDM - 19 novembre 2014Présentation Micropole 3eme Forum MDM - 19 novembre 2014
Présentation Micropole 3eme Forum MDM - 19 novembre 2014
 
DATA FORUM MICROPOLE - 2015
DATA FORUM MICROPOLE - 2015DATA FORUM MICROPOLE - 2015
DATA FORUM MICROPOLE - 2015
 
Dataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex ChallengeDataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex Challenge
 
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...
 
Lambda Architecture - Storm, Trident, SummingBird ... - Architecture and Over...
Lambda Architecture - Storm, Trident, SummingBird ... - Architecture and Over...Lambda Architecture - Storm, Trident, SummingBird ... - Architecture and Over...
Lambda Architecture - Storm, Trident, SummingBird ... - Architecture and Over...
 
Building a Data Driven Organization
Building a Data Driven OrganizationBuilding a Data Driven Organization
Building a Data Driven Organization
 
Idiots guide to setting up a data science team
Idiots guide to setting up a data science teamIdiots guide to setting up a data science team
Idiots guide to setting up a data science team
 

Similar to How to Build Successful Data Team - Dataiku ?

Building & Scaling Data Teams
Building & Scaling Data TeamsBuilding & Scaling Data Teams
Building & Scaling Data Teams
Outreach Digital
 
Why Everything You Know About bigdata Is A Lie
Why Everything You Know About bigdata Is A LieWhy Everything You Know About bigdata Is A Lie
Why Everything You Know About bigdata Is A Lie
Sunil Ranka
 
3 джозеп курто превращаем вашу организацию в big data компанию
3 джозеп курто превращаем вашу организацию в big data компанию3 джозеп курто превращаем вашу организацию в big data компанию
3 джозеп курто превращаем вашу организацию в big data компанию
antishmanti
 
Big Data : From HindSight to Insight to Foresight
Big Data : From HindSight to Insight to ForesightBig Data : From HindSight to Insight to Foresight
Big Data : From HindSight to Insight to Foresight
Sunil Ranka
 
Patternbuilders Founder Showcase Deck
Patternbuilders Founder Showcase DeckPatternbuilders Founder Showcase Deck
Patternbuilders Founder Showcase Deck
MaryLudloff
 
Building an AI Startup: Realities & Tactics
Building an AI Startup: Realities & TacticsBuilding an AI Startup: Realities & Tactics
Building an AI Startup: Realities & Tactics
Matt Turck
 
Adoption is the only option hadoop is changing our world and changing yours f...
Adoption is the only option hadoop is changing our world and changing yours f...Adoption is the only option hadoop is changing our world and changing yours f...
Adoption is the only option hadoop is changing our world and changing yours f...
DataWorks Summit
 
Just ask Watson Seminar
Just ask Watson SeminarJust ask Watson Seminar
Just ask Watson Seminar
Certus Solutions
 
Big data and the data quality imperative
Big data and the data quality imperativeBig data and the data quality imperative
Big data and the data quality imperative
Trillium Software
 
World’s 10 Best Data Integration Solution Providers 2022.pdf
World’s 10 Best Data Integration Solution Providers 2022.pdfWorld’s 10 Best Data Integration Solution Providers 2022.pdf
World’s 10 Best Data Integration Solution Providers 2022.pdf
InsightsSuccess4
 
Data and data scientists are not equal to money david hoyle
Data and data scientists are not equal to money   david hoyleData and data scientists are not equal to money   david hoyle
Data and data scientists are not equal to money david hoyle
Institute of Contemporary Sciences
 
Your AI Transformation
Your AI Transformation Your AI Transformation
Your AI Transformation
Sri Ambati
 
Day 2 aziz apj aziz_big_datakeynote_press
Day 2 aziz apj aziz_big_datakeynote_pressDay 2 aziz apj aziz_big_datakeynote_press
Day 2 aziz apj aziz_big_datakeynote_press
IntelAPAC
 
Big Data and Analytics - 2016 CFO
Big Data and Analytics - 2016 CFOBig Data and Analytics - 2016 CFO
Big Data and Analytics - 2016 CFO
John-Paul Della-Putta
 
Big Data Trends - WorldFuture 2015 Conference
Big Data Trends - WorldFuture 2015 ConferenceBig Data Trends - WorldFuture 2015 Conference
Big Data Trends - WorldFuture 2015 Conference
David Feinleib
 
Making Sense of your data - eLearning Network April 2014
Making Sense of your data - eLearning Network April 2014Making Sense of your data - eLearning Network April 2014
Making Sense of your data - eLearning Network April 2014
Andy Wooler
 
What Managers Need to Know about Data Science
What Managers Need to Know about Data ScienceWhat Managers Need to Know about Data Science
What Managers Need to Know about Data Science
Annie Flippo
 
Finding the right digital tools for your internal communication strategy
Finding the right digital tools for your internal communication strategyFinding the right digital tools for your internal communication strategy
Finding the right digital tools for your internal communication strategy
Stephan Schillerwein
 
Best Practices for Scaling Data Science Across the Organization
Best Practices for Scaling Data Science Across the OrganizationBest Practices for Scaling Data Science Across the Organization
Best Practices for Scaling Data Science Across the Organization
Chasity Gibson
 
Big Data at a Glance
Big Data at a GlanceBig Data at a Glance
Big Data at a Glance
Softweb Solutions
 

Similar to How to Build Successful Data Team - Dataiku ? (20)

Building & Scaling Data Teams
Building & Scaling Data TeamsBuilding & Scaling Data Teams
Building & Scaling Data Teams
 
Why Everything You Know About bigdata Is A Lie
Why Everything You Know About bigdata Is A LieWhy Everything You Know About bigdata Is A Lie
Why Everything You Know About bigdata Is A Lie
 
3 джозеп курто превращаем вашу организацию в big data компанию
3 джозеп курто превращаем вашу организацию в big data компанию3 джозеп курто превращаем вашу организацию в big data компанию
3 джозеп курто превращаем вашу организацию в big data компанию
 
Big Data : From HindSight to Insight to Foresight
Big Data : From HindSight to Insight to ForesightBig Data : From HindSight to Insight to Foresight
Big Data : From HindSight to Insight to Foresight
 
Patternbuilders Founder Showcase Deck
Patternbuilders Founder Showcase DeckPatternbuilders Founder Showcase Deck
Patternbuilders Founder Showcase Deck
 
Building an AI Startup: Realities & Tactics
Building an AI Startup: Realities & TacticsBuilding an AI Startup: Realities & Tactics
Building an AI Startup: Realities & Tactics
 
Adoption is the only option hadoop is changing our world and changing yours f...
Adoption is the only option hadoop is changing our world and changing yours f...Adoption is the only option hadoop is changing our world and changing yours f...
Adoption is the only option hadoop is changing our world and changing yours f...
 
Just ask Watson Seminar
Just ask Watson SeminarJust ask Watson Seminar
Just ask Watson Seminar
 
Big data and the data quality imperative
Big data and the data quality imperativeBig data and the data quality imperative
Big data and the data quality imperative
 
World’s 10 Best Data Integration Solution Providers 2022.pdf
World’s 10 Best Data Integration Solution Providers 2022.pdfWorld’s 10 Best Data Integration Solution Providers 2022.pdf
World’s 10 Best Data Integration Solution Providers 2022.pdf
 
Data and data scientists are not equal to money david hoyle
Data and data scientists are not equal to money   david hoyleData and data scientists are not equal to money   david hoyle
Data and data scientists are not equal to money david hoyle
 
Your AI Transformation
Your AI Transformation Your AI Transformation
Your AI Transformation
 
Day 2 aziz apj aziz_big_datakeynote_press
Day 2 aziz apj aziz_big_datakeynote_pressDay 2 aziz apj aziz_big_datakeynote_press
Day 2 aziz apj aziz_big_datakeynote_press
 
Big Data and Analytics - 2016 CFO
Big Data and Analytics - 2016 CFOBig Data and Analytics - 2016 CFO
Big Data and Analytics - 2016 CFO
 
Big Data Trends - WorldFuture 2015 Conference
Big Data Trends - WorldFuture 2015 ConferenceBig Data Trends - WorldFuture 2015 Conference
Big Data Trends - WorldFuture 2015 Conference
 
Making Sense of your data - eLearning Network April 2014
Making Sense of your data - eLearning Network April 2014Making Sense of your data - eLearning Network April 2014
Making Sense of your data - eLearning Network April 2014
 
What Managers Need to Know about Data Science
What Managers Need to Know about Data ScienceWhat Managers Need to Know about Data Science
What Managers Need to Know about Data Science
 
Finding the right digital tools for your internal communication strategy
Finding the right digital tools for your internal communication strategyFinding the right digital tools for your internal communication strategy
Finding the right digital tools for your internal communication strategy
 
Best Practices for Scaling Data Science Across the Organization
Best Practices for Scaling Data Science Across the OrganizationBest Practices for Scaling Data Science Across the Organization
Best Practices for Scaling Data Science Across the Organization
 
Big Data at a Glance
Big Data at a GlanceBig Data at a Glance
Big Data at a Glance
 

More from Dataiku

Applied Data Science Part 3: Getting dirty; data preparation and feature crea...
Applied Data Science Part 3: Getting dirty; data preparation and feature crea...Applied Data Science Part 3: Getting dirty; data preparation and feature crea...
Applied Data Science Part 3: Getting dirty; data preparation and feature crea...
Dataiku
 
Applied Data Science Course Part 2: the data science workflow and basic model...
Applied Data Science Course Part 2: the data science workflow and basic model...Applied Data Science Course Part 2: the data science workflow and basic model...
Applied Data Science Course Part 2: the data science workflow and basic model...
Dataiku
 
The US Healthcare Industry
The US Healthcare IndustryThe US Healthcare Industry
The US Healthcare Industry
Dataiku
 
Before Kaggle : from a business goal to a Machine Learning problem
Before Kaggle : from a business goal to a Machine Learning problem Before Kaggle : from a business goal to a Machine Learning problem
Before Kaggle : from a business goal to a Machine Learning problem
Dataiku
 
04Juin2015_Symposium_Présentation_Coyote_Dataiku
04Juin2015_Symposium_Présentation_Coyote_Dataiku 04Juin2015_Symposium_Présentation_Coyote_Dataiku
04Juin2015_Symposium_Présentation_Coyote_Dataiku
Dataiku
 
Coyote & Dataiku - Séminaire Dixit GFII du 13 04-2015
Coyote & Dataiku - Séminaire Dixit GFII du 13 04-2015Coyote & Dataiku - Séminaire Dixit GFII du 13 04-2015
Coyote & Dataiku - Séminaire Dixit GFII du 13 04-2015
Dataiku
 
OWF 2014 - Take back control of your Web tracking - Dataiku
OWF 2014 - Take back control of your Web tracking - DataikuOWF 2014 - Take back control of your Web tracking - Dataiku
OWF 2014 - Take back control of your Web tracking - Dataiku
Dataiku
 
Dataiku big data paris - the rise of the hadoop ecosystem
Dataiku   big data paris - the rise of the hadoop ecosystemDataiku   big data paris - the rise of the hadoop ecosystem
Dataiku big data paris - the rise of the hadoop ecosystem
Dataiku
 
Dataiku - for Data Geek Paris@Criteo - Close the Data Circle
Dataiku  - for Data Geek Paris@Criteo - Close the Data CircleDataiku  - for Data Geek Paris@Criteo - Close the Data Circle
Dataiku - for Data Geek Paris@Criteo - Close the Data Circle
Dataiku
 
Data Disruption for Insurance - Perspective from th
Data Disruption for Insurance - Perspective from thData Disruption for Insurance - Perspective from th
Data Disruption for Insurance - Perspective from th
Dataiku
 
Dataiku Flow and dctc - Berlin Buzzwords
Dataiku Flow and dctc - Berlin BuzzwordsDataiku Flow and dctc - Berlin Buzzwords
Dataiku Flow and dctc - Berlin Buzzwords
Dataiku
 
Dataiku - Paris JUG 2013 - Hadoop is a batch
Dataiku - Paris JUG 2013 - Hadoop is a batch Dataiku - Paris JUG 2013 - Hadoop is a batch
Dataiku - Paris JUG 2013 - Hadoop is a batch
Dataiku
 

How to Build a Successful Data Team - Dataiku

  • 1. How to Build a Successful Data Team ?
  • 3. Dataiku - Data Tuesday. Meet Hal Alowne. Hal Alowne, BI Manager at Dim’s Private Showroom, an average e-commerce web site: $100M revenue, 1 million customers, 1 data analyst (Hal himself). The big guys: $10B+ revenue, 100M+ customers, 100+ data scientists. Dim Sum, CEO & Founder of Dim’s Private Showroom: “Hey Hal! We need a big data platform like the big guys. Let’s just do as they do!” Big Data Copy Cat Project.
  • 6. LOL PLATFORM ANTI-PATTERN: Your brand new Hadoop cluster is perceived as slow, little used, and unreliable. Test and invest in infrastructure (== skilled people), or go for cloud / packaged infrastructure.
  • 7. TECHNO MISMATCH ANTI-PATTERN: The Python Clan vs. The R Tribe; The Old Elephant Fraternity vs. The New Elephant Club. Assume being polyglot, or be a dictator.
  • 8. PREDICTIVE ANALYTICS DEPLOYMENT STRATEGY: Website 2000s winners: companies that were able to release fast. "Artificial Intelligence with Data for Internet of Things" 2010s winners: companies able to put intelligence in production. Design a way to put “PREDICTIVE MODELS” IN PRODUCTION.
  • 10. Classic BI Team Org: Business Leader Data Consumer Line-of-business Data Consumer Business Project Sponsor BI Solution Architect Model Designer ETL Developer Dashboard / Report Designer DBA / IT Data Owner Specs
  • 11. Data Science Team Org: Business Leader Data Consumer Line-of-business Data Consumer Business Project Sponsor Data Team Manager Data Engineer Data Analyst Data System Engineer / Data Architect Specs Data Scientist
  • 12. Built From Scratch: Business Leader Data Consumer Line-of-business Data Consumer Business Project Sponsor DBA / IT Data Owner Specs
  • 13. Built From Engineering: Business Leader Data Consumer Line-of-business Data Consumer Business Project Sponsor Specs
  • 14. Built From Analysts: Business Leader Data Consumer Line-of-business Data Consumer Business Project Sponsor Specs
  • 16. Perfectly Natural Hidden Thoughts: Business Project Sponsor, Data Team Manager, Data Engineer, Data Analyst, Data Scientist
  • 17. Managing Extreme Personalities: the Data SCIENTIST. Highly creative, passionate, ambitious. Hard to hire? Hard to manage? Wants to take your job?
  • 18. Paired for Data: Data Analyst (discover patterns, fight data entropy); Data Engineer (make things work, fight tech entropy)
  • 19. What do you prefer? One analyst and one engineer who work together, or two data scientists?
  • 21. What is the main reason for data projects to fail? DATA NOT AVAILABLE
  • 22. BUT FOR ONLY INCREMENTAL GAIN. [Bar chart: contribution to the overall project performance, on a 0 % to 100 % axis; segments labeled 20 %, 30 %, 50 %; categories: Business Goal Definition and Data, Feature Engineering, Algorithm]
  • 23. How to Get Data if you don’t have it: THE GRASSHOPPER, THE SPIDER, THE FOX
  • 24. The Cicada: Optimistic and Opportunistic Data. As a startup: build a new product using open data. As a group inside a company: benefit from the data sharing initiative within your company; wait for data to be available in your data lake.
  • 25. The Spider: Power of the Network. As a startup: create a network of (web trackers | sensors); make it available for free; build your service on people’s collected data. As a group inside a company: make a web service available to collect data; promote it internally so that people use it.
  • 26. The Fox: Hunt for the Big Money first. As a startup: hunt for a business group within a large company with a problem; build a SaaS solution using their data; replicate to competitors. As a group inside a company: take charge of a critical problem at the CEO’s request; build your own integrated tech team to solve it; use those resources to reset data services internally.
  • 28. What is Big Data about?
  • 29. The Age Of Distributed Intelligence: Global, Personalised and Real Time Data Driven Services
  • 30. Data to Visualize or Data to Automate? [Timeline, 2013 to 2018: from "Visualize To Decide" to "Automated Decision"] Moving to a world of automated decision making.
  • 31. Where is your added value? [Decision flowchart] Questions: Is the problem at the core of my business process? Is it a common problem / with shared data? Can I solve it on my own? Really? Outcomes: go for a best-of-breed SaaS solution; build by the data team; hire consultants and learn. Branch labels: Yes, No, "I can’t", "Ok, I can try".
  • 32. Be aware of the comfort zone. Axis labels: Mission Critical, Sheer Curiosity, Small Structured, Large Diverse. Examples: Reporting for Finance in Any Industry; Analyze Each Tweet; Web Navigation For E-Merchant; Ticket Data For Discounts in Retail; Phone Call Logs for Security; RTB Data For Advertising; Customer Consumption For Anti-Churn in Utilities Optimization; Filings For Fraud in Insurance. Not enough data to learn from? Not enough “hard" examples so that you can learn.
  • 33. Infuse the Data and Try Mindset. Brendan Stern is now a Specialist for Data Science in Healthcare at Dataiku: “When I was 20, working as a manager at my Starbucks shop, I realised that I could probably increase the sales of ground coffee. Depending on the day and the time of day, I kept moving the ground coffee around. I managed to run some A/B tests that improved the average sale amount by 12%.”
  • 35. Create an "API" Culture. Do not share: random pieces of code, flat files. Do share: reproducible, documented workflows; clean, documented APIs.
  • 36. WAITING FOR QUESTIONS SLIDE. More food for thought on Dataiku’s blog: http://www.dataiku.com/blog/ Find us on Twitter: @fdouetteau @dataiku