Presentation Outline
• Introduction
• Primary issues/problems
• Publication bias, substandard science, p-hacking, & scientific misconduct
• Replication science
• Protocol registration and pre-specified analysis plans
• Transparent reporting
• Data de-identification, anonymization, & storage
• Related Tools & Tips
• Data work-flow
• Data and code organization and storage
• Citation management
• Ethics and IRB process overview
2016-10-17 | UC Berkeley
Alasdair Cohen | Lecture for Public Health 250B
Introduction
Video Clip
(2:24 – 6:36)
• Scientific Studies: Last Week Tonight with John Oliver (HBO)
Growing Focus in the
Research Community
• “Research findings more likely to be false if…
• Study has a small sample size and may be underpowered
• The effect size is small
• A large number of effects are tested
• Studies with flexible designs, definitions, outcomes, analyses
• Studies for which there is great financial or other interest in the outcome”
Source: Jade Benjamin-Chung, 2016-05-04 presentation to icddr,b Environmental Interventions Unit
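The first two bullets can be made concrete with the positive predictive value (PPV) of a significant finding: the share of "significant" results that reflect a true effect. The formula below is a standard one from the research-reliability literature, not something shown on the slide. This is a minimal sketch assuming a single unbiased study; the function name, the power values, and the prior odds of 1:4 are illustrative assumptions.

```python
def positive_predictive_value(power, alpha, prior_odds):
    """Share of significant findings that are true, for one unbiased study.

    power      -- probability of detecting a true effect (1 - beta)
    alpha      -- significance threshold (false-positive rate under the null)
    prior_odds -- assumed odds that a tested relationship is true (hypothetical)
    """
    true_positives = power * prior_odds
    false_positives = alpha
    return true_positives / (true_positives + false_positives)


# Hypothetical field in which 1 in 5 tested relationships is real (odds 1:4).
well_powered = positive_predictive_value(power=0.8, alpha=0.05, prior_odds=0.25)
underpowered = positive_predictive_value(power=0.2, alpha=0.05, prior_odds=0.25)

print(f"PPV with 80% power: {well_powered:.2f}")
print(f"PPV with 20% power: {underpowered:.2f}")
```

Under these assumed inputs, lowering power alone lowers the PPV, which is one way to read the "small sample size" and "small effect size" bullets.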
Growing Focus in
Mainstream Media
Primary Issues/Problems
Publication Bias
Negligent/Substandard Science
P-hacking & a Brief
History of p-values
P-hacking:
How to do it?
• “Stop collecting data once p<.05
• Analyze many measures, but report only those with p<.05
• Collect and analyze many conditions, but only report those with p<.05
• Use covariates to get p<.05
• Exclude participants to get p<.05
• Transform the data to get p<.05”
• …other “methods”?
• Selective exclusion of “outlier” observations
• Try different analyses/models until you get p<0.05
Source: Leif D. Nelson, UCB, 2016-06 BITSS presentation “False-positives, p-hacking, statistical power, and evidential value”
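The first item in the list, stopping data collection once p<.05, can be illustrated with a small simulation. This is a sketch under stated assumptions, not the analysis from the cited presentation: the data are pure noise (no true effect), the test is a two-sided z-test with known standard deviation 1, and the batch size, maximum sample size, and function names are all hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def two_sided_p(sample):
    """Two-sided z-test p-value for mean = 0, assuming known sd = 1."""
    n = len(sample)
    z = (sum(sample) / n) * math.sqrt(n)
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


def simulate_false_positive_rates(n_studies=2000, batch=10, max_n=100,
                                  alpha=0.05, seed=1):
    """Compare one planned test at max_n with peeking after every batch."""
    rng = random.Random(seed)
    fixed_hits = 0
    peeking_hits = 0
    for _ in range(n_studies):
        # The null is true by construction: every observation is noise.
        data = [rng.gauss(0, 1) for _ in range(max_n)]
        # Honest analysis: a single test at the planned sample size.
        if two_sided_p(data) < alpha:
            fixed_hits += 1
        # Optional stopping: test after each batch, stop at the first p < alpha.
        looks = range(batch, max_n + 1, batch)
        if any(two_sided_p(data[:n]) < alpha for n in looks):
            peeking_hits += 1
    return fixed_hits / n_studies, peeking_hits / n_studies


fixed_rate, peeking_rate = simulate_false_positive_rates()
print(f"False-positive rate, single test at n=100: {fixed_rate:.3f}")
print(f"False-positive rate, peeking every 10 obs: {peeking_rate:.3f}")
```

Because the final look is the same test as the honest analysis, peeking can only add false positives in this setup; the simulation shows how many.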
P-hacking
• “Why left skew with p-hacking?
• Because p-hackers have limited ambition
• p=.21
• → Drop if >2.5 SD
• p=.13
• → Control for gender
• p=.04
• → Write Intro
• If we stop p-hacking as soon as p<.05, won’t get to p=.02 very often.”
Source: Leif D. Nelson, UCB, 2016-06 BITSS presentation “False-positives, p-hacking, statistical power, and evidential value”
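The "limited ambition" argument can be checked with a rough simulation of the resulting p-curve. This sketch uses optional stopping (re-testing after every added observation) as a stand-in for the analytic choices on the slide. It assumes no true effect and a two-sided z-test with known standard deviation 1; the sample-size limits and function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def p_hacked_p_value(rng, start_n=20, max_n=100, alpha=0.05):
    """Return the first p < alpha found while adding noise observations,
    or None if the study never reaches significance."""
    total = 0.0
    for n in range(1, max_n + 1):
        total += rng.gauss(0, 1)
        if n >= start_n:
            z = total / math.sqrt(n)
            p = 2 * (1 - STD_NORMAL.cdf(abs(z)))
            if p < alpha:
                return p  # "limited ambition": stop as soon as p < alpha
    return None


def p_curve(n_studies=2000, seed=7):
    """Count significant p-values in five bins: [0,.01), [.01,.02), ..., [.04,.05)."""
    rng = random.Random(seed)
    bins = [0] * 5
    for _ in range(n_studies):
        p = p_hacked_p_value(rng)
        if p is not None:
            # min() guards against floating-point rounding at the upper edge.
            bins[min(int(p * 100), 4)] += 1
    return bins


bins = p_curve()
for i, count in enumerate(bins):
    print(f"p in [{i / 100:.2f}, {(i + 1) / 100:.2f}): {count}")
```

Without p-hacking, significant p-values under a true null would be spread evenly across the five bins. Stopping at the first p<.05 piles them up just below .05, which is the left skew the slide describes.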
Scientific Misconduct

Transparency1