A	Framework	for	Evaluating	Empirical	Studies
InfoQ
SRITNE	KNOWLEDGE	SEMINAR,	ISB,	Feb	2016
Galit Shmueli 徐茉莉
Analytics
Humanity
Responsibility
eBay	dataset	with	millions	of	auctions
• All	camera	auctions	in	2008
• All	camera	auctions	in	2016
• Auctions	for	all	items	in	1/2016
• Sample	from	all	items	in	2016
How	much	will	you	pay?
Or	maybe	Craigslist data?
2
“statisticians	working	in	a	research	environment…	may	
well	have	to	explain	that	the	data	are	inadequate to	
answer	a	particular	question.”
Statistics:	A	Very	Short	Introduction (Hand	2008)
Pre-data,	post-data,	post-analysis
What	is	the	potential	of	a	dataset	to	
generate	knowledge?
3
Analysis	goal
g X
Available	data	
f
Data	analysis
method	
Utility	measure
U
4
Analysis	goal
g
Domain	Goal
What,	why,	when,	where,	how
→	Analysis	Goal
Explain,	predict,	describe
Enumerative,	analytic
Exploratory,	confirmatory
Quality	of	Goal	Specification
• “error	of	the	third	kind”	- giving	the	right	answer	to	the	wrong	
question– Kimball
• “Far	better	an	approximate	answer	to	the	right	question,	which	
is	often	vague,	than	an	exact	answer	to	the	wrong	question,	
which	can	always	be	made	precise”	- Tukey 5
X
Available	data	
Data	Source
• Primary,	secondary
• Observational,	experiment
• Single,	multiple	sources
• Collection	instrument,	protocol
Data	Type
• Continuous,	categorical,	mix
• Structured,	un-,	semi-structured
• Cross-sectional,	time	series,	panel,		
network,	geographical
Data	Quality	U(X|g)
• “Zeroth Problem	- How	do	the	data	relate	to	the	problem,	and	
what	other	data	might	be	relevant?”	- Mallows
• MIS/Database	- usefulness	of	queried	data	to	person	querying	it.	
• Quality	of	Statistical	Data (IMF,	OECD)	- usefulness	of	summary	
statistics	for	a	particular	goal	 (7	dimensions)
Data	Size	and	
Dimension
• #	observations
• #	variables
6
f
Data	analysis
method	
Analysis	Quality
• “poor	models	and	poor	analysis	techniques,	or	even	analyzing	the	
data	in	a	totally	incorrect	way.”		- Godfrey
• Analyst	expertise
• Software	availability
• The	focus	of	statistics/econometrics/DM	education
Statistical	models	and	methods	
• Parametric,	semi-,	non-parametric
• Classic,	Bayesian
Econometric	models
Data	mining	algorithms
Graphical	methods
Operations	research	methods	
7
Utility	measure
U
Quality	of	Utility	Measure
• Adequate	metric	from	analysis	standpoint		(R2,		holdout	data)
• Adequate	metric	from	domain	standpoint
Domain	goal	→	Analysis	goal
• Predictive	accuracy,	lift	
• Goodness-of-fit	
• Statistical	power,	statistical	significance
• Strength-of-fit
• Expected	costs,	gains
• Bias	reduction,	bias-variance	tradeoff
Analysis	utility	→ Domain	utility
8
InfoQ(f,X,g)	=	U(	f(X|g)	)	
Depends	on	quality	of	g,	X,	f,	U	and	relationship	between	them
The	potential	of	a	particular	dataset	
to	achieve	a	particular	goal	using	a	
given	empirical	analysis	method	
9
Statistical	Approaches	for	Increasing	InfoQ
Study	Design	(Pre-Data)
• DOE
• Clinical	trials
• Survey	sampling
• Computer	experiments
Post-Data-Collection
• Data	cleaning	and	
preprocessing
• Re-weighting,	bias	
adjustment
• Meta	analysis
Randomization,	Stratification,	
Blinding,	Placebo,	Blocking,	
Replication,	Sampling	frame,
Link	data	collection	protocol	
with	appropriate	design
Recovering	“real	data”	vs.	
“cleaning	for	the	goal”
Handling	missing	values,	
outlier	detection,	re-
weighting,	combining	results 10
Assessing	InfoQ
“Quality	of	Statistical	Data”	
(Eurostat,	OECD,	NCSES,…)
• Relevance
• Accuracy
• Timeliness	and	punctuality
• Accessibility
• Interpretability
• Coherence
• Credibility
InfoQ	dimensions
1. Data	resolution
2. Data	structure
3. Data	integration
4. Temporal	relevance
5. Chronology	of	data	and	goal
6. Generalizability
7. Construct	operationalization
8. Communication
3	V’s	of	Big	Data	
• Volume
• Variety
• Velocity
Marketing	Research
• Recency
• Accuracy
• Availability
• Relevance	 11
#1	Data	Resolution
Measurement	scale	and	aggregation	level	
12
Dutch	Flower	Auctions
Van	Heck
Ketter
Gupta
Koppius
Lu
Kambil
How	many	flowers?
How	many	auctions?
Bidder	level?
#2	Data	Structure
Data	Types
• Time	series,	cross-sectional,	panel
• Geographic,	spatial,	network
• Text,	audio,	video,	semantic
• Structured,	semi-,	non-structured
• Discrete,	continuous
Data	Characteristics	
Corrupted	and	missing	values	due	to	
study	design	or	data	collection	
mechanism	
13
Social	TV:	Real-Time	Media	
Response	to	TV	Advertising
Hill,	Nalavade &	Benton,	KDD	2012
Prediction	in	Economic	Networks
Dhar,	Geva,	Oestreicher-Singer	&	
Sundararajan,	ISR	2012
Consumer	Surplus	in	Online	Auctions
Bapna,	Jank &	Shmueli,	 ISR	2008
#3	Data	Integration
Utility	of	Linkage	
Dangers:	Privacy
Increase	or	decrease	InfoQ?
14
Music	Blogging,	Online	Sampling,	and	the	
Long	Tail,	Dewan &	Ramprasad,	ISR	2013
Song	radio	play	data	from	Nielsen	SoundScan +	
music	blog	sampling	data	from	The	Hype	Machine
#4	Temporal	Relevance
Analysis	Timeliness
(solving	the	right	
problem	too	late)	
Data	
Collection
Data	
Analysis
Study	
Deployment
t1 t2 t3	 t4	 t5 t6
Collection	Timeliness
(relevance	to	g)
g:	Prospective	vs.	retrospective;	longitudinal	vs.	snapshot
Nature	of	X,	complexity	of	f
forecast
15
#5	Chronology	of	Data	&	Goal
g1:	Explain	price
g2:	Forecast	price
Retrospective/prospective
Ex-post	availability
Endogeneity
16
#6	Generalizability
Statistical	
generalizability	
Scientific	
generalizability	
Definition	of	g
Choice	of	X,	f,	U 17
The	Hidden	Cost	of	Accommodating	Crowdfunder
Privacy	Preferences:	A	Randomized	Field	Experiment
Burtch,	Ghose,	Wattal
“We	found	that	our	
treatment	had	a	large,	highly	
negative	effect	on	hiding	
behavior…
(β	=	-0.279,	p	<	0.001)”
“It	is	possible	that	our	results	would	
not	extend	to	a	[offline]	purchase	
context,	where	issues	of	social	
capital,	reputation,	etc.	might	be	less	
pronounced”
18
19
#7a	Construct	Operationalization
χ construct	
X	=	θ(χ)		operationalization	(measurable)
• Causal	explanation	vs.	
prediction,	description
• Theory	vs.	data
• Data:	Questionnaire,	
physical	measurement
20
Online	
Seller	
Reputation
Total	#ratings
{#pos,	#neg}	ratings
#	recent	negative	ratings
Text	content	of	feedback
The Digitization of Word-of-Mouth:
Promise and Challenges of Online Reputation Systems
Dellarocas, Management Science
#7b	Action	Operationalization
21
Derive	concrete	actions from	the	information	
provided	by	a	study
The	Management	Insights	editor	
will	write	a	Management	Insights	
paragraph	for	every	paper	that	is	
accepted	for	publication.
One	Way	Mirrors	in	Online	Dating:	
A	Randomized	Experiment	
Bapna,	Ramprasad,	Shmueli	&	Umyarov
Online	Reputation	Systems:	How	to	
Design	One	That	Does	What	You	Need
Dellarocas
Social	TV:	Real-Time	Media	
Response	to	TV	Advertising
Hill,	Nalavade &	Benton
#8	Communication
Visual,	written,	verbal	presentations	&	reports	
Knowledge	must	reach	the	right	person	at	the	right	time
• Mentoring
• Manuscript	reviewing
• Data	made	available	to	
others
• EDA	and	shared	
visualization	dashboards
• Seminars	+	conferences!
22
23
“In	the	last	three	years,	there	has	
been	a	concerted	effort	by	those	in	
Washington	to	reduce	government	
spending	and	reign	in	the	national	
debt.
One	reason for	the	budget	cuts?	
Research	by	two	Harvard	
economists,	Ken	Rogoff and	Carmen	
Reinhart.	The	pair	found	that	when	a	
country	owes	more	than	90	percent	
of	their	GDP,it	slides	into	
recession.”
…	Fixing	this	Excel	error	transforms	
high-debt	countries	from	recession	
to	growth
www.marketplace.org/topics/economy/excel-mistake-heard-round-world
a brief conversation about marriage equality with
a canvasser who revealed that he or she was gay
had a big, lasting effect on the voters’ views, as
measured by separate online surveys
administered before and after the conversation.
24
May	2015:	Independent	researchers	fail	to	
replicate;	noted	statistical	irregularities	in	
LaCour’s data (baseline	outcome	data	statistically	
indistinguishable	from	a	national	survey	;	over-time	
changes	indistinguishable	from	perfect	normally	
distributed	noise)
High or
Low InfoQ?
Assessing	InfoQ	in	Practice
Rating-based	assessment
1-5	scale	on	each	dimension:
InfoQ	Score	=	[d1(Y1)	 d2(Y2)	 …	 d8(Y8)]1/8
Experience	from	two	Research	Methods	courses
Shmueli	and	Kenett (2013),	“An	Information	Quality	(InfoQ)	Framework	for	Ex-Ante	and	Ex-Post	
Evaluation	of	Empirical	Studies”,	Proceedings	of	IDAM	2013,	Springer-Verlag. 25
Assessing	Research	Proposals:	
Ex-Ante	(Prospective)	InfoQ
• 2009	Research	methods	workshop	for	PhD	students
• Help	students	develop	and	present	research	proposal
• 50	students	from	OB,	OR,	marketing,	economics
• Each	student	evaluates	his/her	proposal	on	the	8	
InfoQ	dimensions.	Will	proposal	likely	lead	to	the	
intended	goal?
• Students’	grades	derived	from	an	InfoQ	score	of	
their	proposal	submission	(presentation	+	written)
InfoQ	Integration	(Prospective)
Goal:	
Make	students’	research	journey	
more	efficient	and	more	effective
Assessing	Research	Proposals:
Ex-Post	(Retrospective)	InfoQ
Professional	masters	degree	program	that	emphasizes	
statistical	practice,	methods,	data	analysis	and	practical	
workplace	skills.	The	MSP	is	for	students	who	are	
interested	in	professional	careers	in	business,	industry,	
government,	or	scientific	research.
Masters	Program	in	
Statistical	Practice
Assessing	Research	Proposals:
Ex-Post	(Retrospective)	InfoQ
In	2012:	16	students
Students	instructed	to:	(60-90	min)
1. Read	short	description	of	InfoQ	and	its	dimensions
2. Read	reports	on	five	studies	
3. Describe	goal,	data,	analysis,	utility	for	each	study
4. Evaluate	the	8	dimensions	for	each	study
Predicting	days	with	
unhealthy	air	quality	in	
Washington	DC
Predicting	Changes	in	
Quarterly	Corporate	
Earnings	Using	
Economic	Indicators
Quality-of-care	factors	
in	U.S.	nursing	homes
Predicting	
ZILLOW.com’s
Zestimate accuracy
goo.gl/erNPF
Predicting	First	Day	
Returns	for	
Japanese	IPOs
Feedback:	
InfoQ	approach	helped	“sort	out	all	of	the	
information”
Several	reported	that	they	will	adopt	this	
evaluation	approach	for	future	studies
More	with	InfoQ:
Guidelines	for	Journal	Article
Writing	and	Reviewing
32
33
34
InfoQ:	Strengths	and	Challenges
InfoQ	approach	streamlines	questioning	of	data	value
• “Why	should	we	invest	in	data?”	– management
• Compare	value	of	potential	datasets,	analyses
• Prioritize/rank	projects
• Strengthen	functional	– analytical	relationship
• Evaluation	study:	InfoQ	useful	for	developing	an	analysis	plan	
(prospective)	and	evaluating	an	empirical	study	(retrospective)
To	Do:
• Improve	InfoQ	assessment	(heterogeneity	across	different	raters)
• Alternative	InfoQ	assessment	approaches	(pilot	study,	EDA,	other)
• Further	dimensions	(data	privacy,	ethics/moral,	novelty)
• Effect	of	technological	advances	on	InfoQ
35
36
With	
discussion
Forthcoming	(2016)

Information Quality: A Framework for Evaluating Empirical Studies