The Data Errors We Make
Sean J Taylor
Core Data Science Team
Facebook
About Me
• 5 years at Facebook as a Research Scientist
• PhD in Information Systems from New York University
• Research Interests:
  • Field Experiments
  • Forecasting
  • Sports and sports fans
https://facebook.github.io/prophet/
Strategic Decisions Micro-decisions at Scale
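[Diagram: Data, Algorithm, and Human Choices feed an Estimate, which leads to a Decision and then an Outcome. The Truth corresponds to the Optimal Decision and the Optimal Outcome. Statistical error and practical error label the gaps between the two paths.]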
Simplest Error Model
H0: You are not pregnant.
H1: You are pregnant.
                                              H0 is True                      H1 is True
                                              (Product is Bad)                (Product is Good)
Accept Null Hypothesis (Don’t ship product)   Right decision                  Type II Error (wrong decision)
Reject Null Hypothesis (Ship Product)         Type I Error (wrong decision)   Right decision
Receiver Operating Characteristic (ROC) Curve
tells us Type I and II error rates
[Plot: ROC curve. x-axis: Type I error rate; y-axis: (1 - Type II error rate)]
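A minimal sketch of how the points on such a curve can be computed, assuming a hypothetical score for each product (higher means "more likely good") and a rule that ships whenever the score meets a threshold. The function name `roc_points` and the toy data are illustrative assumptions, not part of the talk.

```python
def roc_points(scores, labels):
    """Return (type_i_rate, power) pairs, one per decision threshold.

    labels: 1 if the product is truly good (H1), 0 if it is bad (H0).
    We "reject the null" (ship) when score >= threshold.
    power is 1 - Type II error rate.
    """
    n_bad = labels.count(0)
    n_good = labels.count(1)
    points = []
    # Sweep thresholds from strictest to most lenient.
    for threshold in sorted(set(scores), reverse=True):
        shipped = [s >= threshold for s in scores]
        false_pos = sum(1 for ship, y in zip(shipped, labels) if ship and y == 0)
        true_pos = sum(1 for ship, y in zip(shipped, labels) if ship and y == 1)
        type_i_rate = false_pos / n_bad   # bad products we shipped
        power = true_pos / n_good         # good products we shipped
        points.append((type_i_rate, power))
    return points


# Toy data (made up for illustration).
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
for type_i_rate, power in roc_points(scores, labels):
    print(f"Type I rate = {type_i_rate:.2f}, 1 - Type II rate = {power:.2f}")
```

Each threshold is one operating point: lowering it ships more good products (fewer Type II errors) at the price of shipping more bad ones (more Type I errors).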
Outline
1. Refinements to the Type I/II error model
2. A simple causal model of how we make errors
3. What we can effectively do about errors
Refinements
Refinement 1: Assign Costs to Errors
                                              H0 is True          H1 is True
                                              (Product is Bad)    (Product is Good)
Accept Null Hypothesis (Don’t ship product)   0                   -100
Reject Null Hypothesis (Ship Product)         -200                +100
Example: Expected value of a product launch
P(Type I) is 1% and P(Type II) is 20%, with P(good) = .5
    P(good)       * (100 * .80 + -100 * .20)
  + (1 - P(good)) * (-200 * .01 + 0 * .99)
= (.5 * 60) + (.5 * -2)
= 30 - 1
= 29
Example 2: Expected value of a product launch
Allowing more Type I errors lowers the Type II rate.
Optimal choice depends on payoffs and P(H1).
P(Type I) is 5% and P(Type II) is 7%, with P(good) = .5
    P(good)       * (100 * .93 + -100 * .07)
  + (1 - P(good)) * (-200 * .05 + 0 * .95)
= (.5 * 86) + (.5 * -10)
= 43 - 5
= 38 > 29
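The two examples can be reproduced with a short sketch. The payoffs and error rates are the ones from the slides; the function name `expected_launch_value` and its parameter names are assumptions made for illustration.

```python
def expected_launch_value(p_good, p_type1, p_type2,
                          ship_good=100, skip_good=-100,
                          ship_bad=-200, skip_bad=0):
    """Expected payoff of a ship/don't-ship decision rule.

    p_good:  prior probability that the product is good (H1 is true)
    p_type1: probability of shipping a bad product
    p_type2: probability of not shipping a good product
    """
    value_if_good = (1 - p_type2) * ship_good + p_type2 * skip_good
    value_if_bad = p_type1 * ship_bad + (1 - p_type1) * skip_bad
    return p_good * value_if_good + (1 - p_good) * value_if_bad


strict = expected_launch_value(p_good=0.5, p_type1=0.01, p_type2=0.20)
lenient = expected_launch_value(p_good=0.5, p_type1=0.05, p_type2=0.07)
print(round(strict, 2), round(lenient, 2))
```

Changing `p_good` or the payoffs in this sketch shows the point of the slide: which operating point is best depends on both.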
Refinement 2:
Opportunity Cost
Key Idea: If we devote resources to minimizing Type I
and II errors for one problem, we will have fewer
resources for other problems.
• Few organizations make a single decision; we usually make many of them.
• Acquiring more data and investing more time in a problem both have diminishing marginal returns.
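One concrete way to see the diminishing returns from more data, offered as an illustration rather than something from the slides: the standard error of a sample mean shrinks with the square root of the sample size, so each halving of the error costs four times as many observations. The function name `standard_error` is an assumption.

```python
import math


def standard_error(sd, n):
    """Standard error of a sample mean with population standard deviation sd."""
    return sd / math.sqrt(n)


# Each 4x increase in sample size only halves the standard error.
for n in (100, 400, 1600, 6400):
    print(n, round(standard_error(1.0, n), 4))
```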
Examples of Constraints
• Sample size for online
experiments
• Gathering more data
• Analyst time
Refinement 3:
Mosteller’s Type III Errors


Type III error: “correctly rejecting the null hypothesis
for the wrong reason” -- Frederick Mosteller
More clearly: The process you used worked this time,
but is unlikely to continue working in the future.
Good Process vs. Good Outcome
                Good Outcome        Bad Outcome
Good Process    Deserved Success    Bad Break
Bad Process     Dumb Luck           Poetic Justice
Refinement 4:
Kimball’s Type III Errors


Type III error: “the error committed by giving the right
answer to the wrong problem” -- Allyn W. Kimball
Why we make errors
[Diagram: Data, Algorithm, and Human Choices feed into the Estimate]
Cause 1: Data
• Inadequate data
• Non-representative data
• Measuring the wrong thing
• Made data: designed to be adequate
• Found data: adequate if we are fortunate
Non-representative data
2014 World Cup: First Facebook Check-ins in Brazil from non-Brazilian users
Bias?
[Chart: 2014 World Cup Check-ins by Country]
Measuring the wrong thing
Common Pattern
• High volume of a cheap, easy-to-measure “surrogate” (e.g. steps, clicks)
• Surrogate is correlated with the true measurement of interest (e.g. overall health, purchase intention)
• Key question: sign and magnitude of “interpretation bias”
Cause 2: Algorithms
• The model/procedure we choose primarily
concerns what side of the bias-variance tradeoff
we'd like to be on.
• Common mistakes are:
• Using a model that’s too complex for the data.
• Focusing too much on algorithms instead of
gathering the right data or ensuring correctness.
Optimizing models
Reducing bias
• Choose a more flexible model.
Reducing variance
• Choose a less flexible model.
• Get more data.
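A small simulation can make these levers concrete. This is a sketch under invented assumptions: the true curve is x squared, the "less flexible" model predicts the global mean of y, and the "more flexible" model averages only the nearest 10% of points. None of these choices, nor the name `simulate`, come from the talk.

```python
import random
import statistics


def truth(x):
    return x * x


def simulate(n, flexible, reps=500, noise_sd=0.3, x0=0.0, seed=0):
    """Estimate bias and variance of a prediction at x0 over repeated datasets.

    flexible=False: predict with the mean of all y values (rigid model).
    flexible=True:  predict with the mean of the nearest 10% of points.
    """
    rng = random.Random(seed)
    xs = [-1 + 2 * i / (n - 1) for i in range(n)]
    k = max(1, n // 10) if flexible else n
    nearest = sorted(range(n), key=lambda i: abs(xs[i] - x0))[:k]
    preds = []
    for _ in range(reps):
        ys = [truth(x) + rng.gauss(0, noise_sd) for x in xs]
        preds.append(statistics.fmean([ys[i] for i in nearest]))
    bias = statistics.fmean(preds) - truth(x0)
    variance = statistics.pvariance(preds)
    return bias, variance


for n, flexible in [(30, False), (30, True), (300, True)]:
    bias, variance = simulate(n, flexible)
    print(f"n={n:3d} flexible={flexible!s:5} bias={bias:+.3f} variance={variance:.4f}")
```

In this setup the flexible model has much lower bias but higher variance at the same sample size, and giving it ten times more data brings its variance back down.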
Tree Induction vs. Logistic Regression: A Learning-Curve Analysis
Perlich et al. (2003)
• Logistic regression is better for smaller training sets and tree induction for larger data sets
• Logistic regression is usually better when the signal-to-noise ratio is lower
Cause 3: Human choices
Many analysts, one dataset: Making transparent how variations in analytical choices affect results
(Silberzahn et al. 2017)
• 29 teams involving 61 analysts used the same
dataset to address the same research question
• Are soccer ⚽ referees more likely to give red cards to dark-skin-toned players than to light-skin-toned players?
• effect sizes ranged from 0.89 to 2.93 in odds ratio units
• 20 teams (69%) found a statistically significant positive effect
• 9 teams (31%) observed a nonsignificant relationship
Overconfidence
Incentives
Ways Forward
• prevent errors
• opinionated analysis development
• test driven data analysis
• be honest about uncertainty
• estimate uncertainty using the bootstrap
Opinionated Analysis Development

(by Hilary Parker)
Test-Driven Data Analysis
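As a rough sketch of what testing an analysis can look like, assuming a hypothetical table of check-in counts: state expectations about the data as checks and run them before any estimate is computed. The column names and the `validate_rows` helper are invented for illustration.

```python
def validate_rows(rows):
    """Return a list of problems found; an empty list means every check passed."""
    if not rows:
        return ["dataset is empty"]
    problems = []
    seen_ids = set()
    for i, row in enumerate(rows):
        # Each user should appear once.
        if row["user_id"] in seen_ids:
            problems.append(f"row {i}: duplicate user_id {row['user_id']}")
        seen_ids.add(row["user_id"])
        # Country is required for the by-country breakdown.
        if not row["country"]:
            problems.append(f"row {i}: missing country")
        # Counts cannot be negative.
        if row["checkins"] < 0:
            problems.append(f"row {i}: negative checkins")
    return problems


clean = [
    {"user_id": 1, "country": "BR", "checkins": 3},
    {"user_id": 2, "country": "US", "checkins": 0},
]
dirty = [
    {"user_id": 1, "country": "BR", "checkins": 3},
    {"user_id": 1, "country": "US", "checkins": -2},
]
print(validate_rows(clean))
print(validate_rows(dirty))
```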
Estimating Uncertainty
Most algorithms in scikit-learn will not estimate uncertainty for you.
The Bootstrap
1. Start with all your data.
2. Generate random sub-samples (drawn with replacement): R1, R2, …, R500.
3. Compute statistics or estimate model parameters on each sub-sample: s1, s2, …, s500.
4. Get a distribution over the statistic of interest (usually the prediction).
[Histogram: Count vs. Statistic]
- take mean
- CIs == 2.5% and 97.5% quantiles (a 95% interval)
- SEs == standard deviation
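The procedure can be sketched with the standard library alone. This is a minimal percentile bootstrap, assuming the statistic of interest is the sample mean and using 500 resamples as on the slide; the function name `bootstrap` and the simulated data are illustrative assumptions.

```python
import random
import statistics


def bootstrap(data, statistic, n_resamples=500, seed=0):
    """Percentile bootstrap: resample with replacement, recompute the statistic."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(data, k=len(data))  # same size, with replacement
        estimates.append(statistic(resample))
    # n=40 yields cut points every 2.5%; the first and last bound a 95% interval.
    cuts = statistics.quantiles(estimates, n=40)
    return {
        "mean": statistics.fmean(estimates),
        "ci": (cuts[0], cuts[-1]),
        "se": statistics.stdev(estimates),
    }


# Simulated data, made up for illustration.
data_rng = random.Random(42)
sample = [data_rng.gauss(10, 2) for _ in range(200)]
result = bootstrap(sample, statistics.fmean)
print(result)
```

Any function of a resample can be passed as `statistic`, including one that refits a model and returns a prediction.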
Summary
Think about errors!
• What kind of errors are we making?
• Where did they come from?
Prevent errors!
• Use a reasonable and reproducible
process.
• Test your analysis as you test your code.
Estimate uncertainty!
• Models that estimate uncertainty are more
useful than those that don’t.
• They facilitate better learning and
experimentation.

The Data Errors We Make, by Sean Taylor, at Big Data Spain 2017
AAPOR 2012 Langer Probability
AAPOR 2012 Langer ProbabilityAAPOR 2012 Langer Probability
AAPOR 2012 Langer ProbabilityLangerResearch
 
Mayo O&M slides (4-28-13)
Mayo O&M slides (4-28-13)Mayo O&M slides (4-28-13)
Mayo O&M slides (4-28-13)jemille6
 
Answer questions Minimum 100 words each and reference (questions.docx
Answer questions Minimum 100 words each and reference (questions.docxAnswer questions Minimum 100 words each and reference (questions.docx
Answer questions Minimum 100 words each and reference (questions.docxamrit47
 
How NOT to Aggregrate Polling Data
How NOT to Aggregrate Polling DataHow NOT to Aggregrate Polling Data
How NOT to Aggregrate Polling DataDataCards
 
Turning Data into Infographics: An Interactive Workshop for Problem Solvers
Turning Data into Infographics: An Interactive Workshop for Problem SolversTurning Data into Infographics: An Interactive Workshop for Problem Solvers
Turning Data into Infographics: An Interactive Workshop for Problem SolversUNCResearchHub
 
IE_expressyourself_EssayH
IE_expressyourself_EssayHIE_expressyourself_EssayH
IE_expressyourself_EssayHjk6653284
 
Identification1
Identification1Identification1
Identification1veesingh
 
Designing Indicators
Designing IndicatorsDesigning Indicators
Designing Indicatorsclearsateam
 
sience 2.0 : an illustration of good research practices in a real study
sience 2.0 : an illustration of good research practices in a real studysience 2.0 : an illustration of good research practices in a real study
sience 2.0 : an illustration of good research practices in a real studywolf vanpaemel
 
Investigating Performance: Design & Outcomes with xAPI | LSCon 2017
Investigating Performance: Design & Outcomes with xAPI | LSCon 2017Investigating Performance: Design & Outcomes with xAPI | LSCon 2017
Investigating Performance: Design & Outcomes with xAPI | LSCon 2017HT2 Labs
 
The Art and Science of Survey Research
The Art and Science of Survey ResearchThe Art and Science of Survey Research
The Art and Science of Survey ResearchSiobhan O'Dwyer
 
School customer service presentation
School customer service presentationSchool customer service presentation
School customer service presentationsteve muzzy
 
Big Data and the Social Sciences
Big Data and the Social SciencesBig Data and the Social Sciences
Big Data and the Social SciencesAbe Usher
 
Introduction to Hypothesis Testing
Introduction to Hypothesis TestingIntroduction to Hypothesis Testing
Introduction to Hypothesis Testingjasondroesch
 
A Journalist’s Guide to Survey Research and Election Polls by Cliff Zuskin
A Journalist’s Guide to Survey Research and Election Polls by Cliff ZuskinA Journalist’s Guide to Survey Research and Election Polls by Cliff Zuskin
A Journalist’s Guide to Survey Research and Election Polls by Cliff ZuskinFincher Consulting
 
Module 1 introduction to machine learning
Module 1  introduction to machine learningModule 1  introduction to machine learning
Module 1 introduction to machine learningSara Hooker
 

Similar to The Data Errors we Make by Sean Taylor at Big Data Spain 2017 (20)

intro_big_data.pptx
intro_big_data.pptxintro_big_data.pptx
intro_big_data.pptx
 
AAPOR 2012 Langer Probability
AAPOR 2012 Langer ProbabilityAAPOR 2012 Langer Probability
AAPOR 2012 Langer Probability
 
Mayo O&M slides (4-28-13)
Mayo O&M slides (4-28-13)Mayo O&M slides (4-28-13)
Mayo O&M slides (4-28-13)
 
Jsm big-data
Jsm big-dataJsm big-data
Jsm big-data
 
Answer questions Minimum 100 words each and reference (questions.docx
Answer questions Minimum 100 words each and reference (questions.docxAnswer questions Minimum 100 words each and reference (questions.docx
Answer questions Minimum 100 words each and reference (questions.docx
 
How NOT to Aggregrate Polling Data
How NOT to Aggregrate Polling DataHow NOT to Aggregrate Polling Data
How NOT to Aggregrate Polling Data
 
Turning Data into Infographics: An Interactive Workshop for Problem Solvers
Turning Data into Infographics: An Interactive Workshop for Problem SolversTurning Data into Infographics: An Interactive Workshop for Problem Solvers
Turning Data into Infographics: An Interactive Workshop for Problem Solvers
 
IE_expressyourself_EssayH
IE_expressyourself_EssayHIE_expressyourself_EssayH
IE_expressyourself_EssayH
 
Identification1
Identification1Identification1
Identification1
 
1.1 statistical and critical thinking
1.1 statistical and critical thinking1.1 statistical and critical thinking
1.1 statistical and critical thinking
 
Designing Indicators
Designing IndicatorsDesigning Indicators
Designing Indicators
 
sience 2.0 : an illustration of good research practices in a real study
sience 2.0 : an illustration of good research practices in a real studysience 2.0 : an illustration of good research practices in a real study
sience 2.0 : an illustration of good research practices in a real study
 
CS194Lec0hbh6EDA.pptx
CS194Lec0hbh6EDA.pptxCS194Lec0hbh6EDA.pptx
CS194Lec0hbh6EDA.pptx
 
Investigating Performance: Design & Outcomes with xAPI | LSCon 2017
Investigating Performance: Design & Outcomes with xAPI | LSCon 2017Investigating Performance: Design & Outcomes with xAPI | LSCon 2017
Investigating Performance: Design & Outcomes with xAPI | LSCon 2017
 
The Art and Science of Survey Research
The Art and Science of Survey ResearchThe Art and Science of Survey Research
The Art and Science of Survey Research
 
School customer service presentation
School customer service presentationSchool customer service presentation
School customer service presentation
 
Big Data and the Social Sciences
Big Data and the Social SciencesBig Data and the Social Sciences
Big Data and the Social Sciences
 
Introduction to Hypothesis Testing
Introduction to Hypothesis TestingIntroduction to Hypothesis Testing
Introduction to Hypothesis Testing
 
A Journalist’s Guide to Survey Research and Election Polls by Cliff Zuskin
A Journalist’s Guide to Survey Research and Election Polls by Cliff ZuskinA Journalist’s Guide to Survey Research and Election Polls by Cliff Zuskin
A Journalist’s Guide to Survey Research and Election Polls by Cliff Zuskin
 
Module 1 introduction to machine learning
Module 1  introduction to machine learningModule 1  introduction to machine learning
Module 1 introduction to machine learning
 

More from Big Data Spain

Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017Big Data Spain
 
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...Big Data Spain
 
AI: The next frontier by Amparo Alonso at Big Data Spain 2017
AI: The next frontier by Amparo Alonso at Big Data Spain 2017AI: The next frontier by Amparo Alonso at Big Data Spain 2017
AI: The next frontier by Amparo Alonso at Big Data Spain 2017Big Data Spain
 
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017Big Data Spain
 
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...Big Data Spain
 
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...Big Data Spain
 
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...Big Data Spain
 
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...Big Data Spain
 
State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...Big Data Spain
 
Trading at market speed with the latest Kafka features by Iñigo González at B...
Trading at market speed with the latest Kafka features by Iñigo González at B...Trading at market speed with the latest Kafka features by Iñigo González at B...
Trading at market speed with the latest Kafka features by Iñigo González at B...Big Data Spain
 
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...Big Data Spain
 
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
 The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a... The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...Big Data Spain
 
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...Big Data Spain
 
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...Big Data Spain
 
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...Big Data Spain
 
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...Big Data Spain
 
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...Big Data Spain
 
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...Big Data Spain
 
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017Big Data Spain
 
Feature selection for Big Data: advances and challenges by Verónica Bolón-Can...
Feature selection for Big Data: advances and challenges by Verónica Bolón-Can...Feature selection for Big Data: advances and challenges by Verónica Bolón-Can...
Feature selection for Big Data: advances and challenges by Verónica Bolón-Can...Big Data Spain
 

More from Big Data Spain (20)

Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
 
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
 
AI: The next frontier by Amparo Alonso at Big Data Spain 2017
AI: The next frontier by Amparo Alonso at Big Data Spain 2017AI: The next frontier by Amparo Alonso at Big Data Spain 2017
AI: The next frontier by Amparo Alonso at Big Data Spain 2017
 
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
 
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
 
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
 
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
 
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
 
State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...
 
Trading at market speed with the latest Kafka features by Iñigo González at B...
Trading at market speed with the latest Kafka features by Iñigo González at B...Trading at market speed with the latest Kafka features by Iñigo González at B...
Trading at market speed with the latest Kafka features by Iñigo González at B...
 
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
 
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
 The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a... The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
 
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
 
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
 
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
 
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
 
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
 
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
 
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
 
Feature selection for Big Data: advances and challenges by Verónica Bolón-Can...
Feature selection for Big Data: advances and challenges by Verónica Bolón-Can...Feature selection for Big Data: advances and challenges by Verónica Bolón-Can...
Feature selection for Big Data: advances and challenges by Verónica Bolón-Can...
 

The Data Errors we Make by Sean Taylor at Big Data Spain 2017