3. I am of a social psychological breed, so I feel a bit of a spotlight shining on
me, considering the recent fraud cases…
4.
5. But also:
-I hate the spotlight on experimental lab studies
-I have experienced my fair share of “too good to be true” moments
-Sometimes, I am just like a child when it comes to ethical standards, and
then…
8. IN MY OPINION…
… it is not a problem of particular researchers. Fraudsters will always be
there, but the major threat is dark-grey research practices
… it is not a problem of one single discipline (social psychology) or one
single research method (experimental research)
10. The evolution of a whole range of research disciplines depends on how we deal
with the situation right now.
-Singular fraud vs. systemic questionable practices?
And I fear the day this will inspire policy makers to cut down on our resources
(there is a crisis, you know) or business people deciding to design their own
educational system (or even worse, their own flawed research).
11. FRAUD
-Diederik Stapel (50 retractions)
-Yoshitaka Fujii (183 retractions)
-Dirk Smeesters
-Data fabrication
-Paper duplication
-Plagiarism
-Lack of IRB approval
-P-hacking
-File drawer
-One-sided lit. review
-Biased content analysis
-Biased interviews
-Other questionable research practices
We? (communication sciences; KU Leuven; IMS)
13. -Retractions of flawed or fraudulent research papers; mega-corrections to
published articles
-Research on fraud and questionable research practices (and fierce
discussions among protagonists)
-Calls for replication studies; publication based on reviewed study design
-Open science networks: e.g. publication/replication repositories – the
Open Science Framework
-Post-publication review: Less intrusive than a letter to the editor; more open
access; closer to true academic discussion
-Judicial sanctions for busted researchers
-…
23. -You ARE a scientist. So trust your feelings when they say “too good to be
true…”
An extreme example: Greg Francis’ research (though criticized by, among
others, Uri Simonsohn)
***For the following slides:
ALL CREDITS to Greg’s presentation on February 5th, 2013 in Brussels***
24. Experimental methods
• Suppose you hear about two sets of experiments that investigate
phenomena A and B
• Which effect is more believable?
                              Effect A   Effect B
Number of experiments            10         19
Experiments that reject H0        9         10
Replication rate                0.90       0.53
25. • Effect A is Bem’s (2011) precognition study that reported evidence of
people’s ability to get information from the future
– I do not know any scientist who believes this effect is real
• Effect B is from a meta-analysis of a version of the bystander effect, where
people tend to not help someone in need if others are around
– I do not know any scientist who does not believe this is a real effect
• So why are we running experiments?
                              Effect A   Effect B
Number of experiments            10         19
Experiments that reject H0        9         10
Replication rate                0.90       0.53
26. Hypothesis testing (for means)
• We start with a null hypothesis: no effect, H0
• Identify a sampling distribution that describes variability in a test
statistic
t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}}
27. Hypothesis testing (for two means)
• We can identify rare test statistic values as those in the tails of the sampling distribution
• If we get a test statistic in either tail, we say it is so rare (the usual threshold is 0.05) that we
should consider the null hypothesis to be unlikely
• We reject the null
[Figure: sampling distribution under H0 with rejection regions in both tails]
t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}}
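The rejection rule above can be tried on toy numbers. A minimal Python sketch, assuming two small made-up samples and a rough two-tailed critical value of 2.0 (the exact cutoff depends on the degrees of freedom; these samples are illustrative, not from the slides):

```python
import math

def two_sample_t(a, b):
    """Two-sample t statistic with a pooled standard error."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    # unbiased sample variances
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (ma - mb) / se

# illustrative (made-up) samples
t = two_sample_t([5, 6, 7, 8, 9], [1, 2, 3, 4, 5])
reject = abs(t) > 2.0  # rough two-tailed 5% cutoff for moderate df
print(t, reject)       # t = 4.0 here, so H0 is rejected
```

With these numbers both sample variances are 2.5, the standard error is 1.0, and t = 4.0, which falls in the rejection region.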
28. Alternative hypothesis
• If the null hypothesis is not true, then the data came from some other
sampling distribution (Ha)
[Figure: sampling distributions under H0 and Ha]
29. Power
• If the alternative hypothesis is true
• Power is the probability you will reject H0
• If you repeated the experiment many times, you would expect to reject H0 with a
proportion that reflects the power
[Figure: sampling distributions under H0 and Ha]
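The “repeat the experiment many times” reading of power can be checked by simulation. A hedged Python sketch, assuming a one-sample z-test with known σ = 1 and made-up values d = 0.5, n = 30 (none of these numbers come from the slides):

```python
import math
import random

def simulated_power(d, n, reps=20000, z_crit=1.96, seed=1):
    """Fraction of simulated experiments that reject H0 when Ha (true mean d) holds."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(d, 1.0) for _ in range(n)) / n
        z = xbar * math.sqrt(n)   # z statistic with known sigma = 1
        if abs(z) > z_crit:       # two-tailed test at the 0.05 level
            rejections += 1
    return rejections / reps

print(simulated_power(0.5, 30))  # close to the analytic power of about 0.78
```

The rejection proportion converges on the analytic power as the number of repetitions grows, which is exactly the claim on the slide.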
30. Power
• Use the pooled effect size to compute the pooled power of each
experiment (the probability that the experiment would reject the null
hypothesis)
• Pooled effect size: g* = 0.1855
• The sum of the power values (E = 6.27) is the expected number of times
these experiments would reject the null hypothesis
(Ioannidis & Trikalinos, 2007)

                   Sample size   Effect size (g)   Power
Exp. 1                 100           0.249         0.578
Exp. 2                 150           0.194         0.731
Exp. 3                  97           0.248         0.567
Exp. 4                  99           0.202         0.575
Exp. 5                 100           0.221         0.578
Exp. 6 Negative        150           0.146         0.731
Exp. 6 Erotic          150           0.144         0.731
Exp. 7                 200           0.092         0.834
Exp. 8                 100           0.191         0.578
Exp. 9                  50           0.412         0.363
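The table’s bookkeeping can be reproduced approximately. A Python sketch, assuming a one-tailed test at α = .05 and a normal approximation to the t-test (so the numbers land near, not exactly on, the slide’s values); the sample sizes and g* = 0.1855 are taken from the table:

```python
import math

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def approx_power(g, n, z_crit=1.645):
    """Normal-approximation power of a one-sample, one-tailed test."""
    return 1.0 - normal_cdf(z_crit - g * math.sqrt(n))

# sample sizes from the table, pooled effect size g* = 0.1855
ns = [100, 150, 97, 99, 100, 150, 150, 200, 100, 50]
powers = [approx_power(0.1855, n) for n in ns]
expected_rejections = sum(powers)   # close to the slide's E = 6.27

# Francis-style "excess success" check: probability that 9 or more of the
# 10 experiments reject H0, given these powers (Poisson binomial tail)
p_all_10 = math.prod(powers)
p_exactly_9 = p_all_10 * sum((1 - p) / p for p in powers)
p_at_least_9 = p_all_10 + p_exactly_9  # roughly 0.06
print(expected_rejections, p_at_least_9)
```

With an expected 6.27 rejections, observing 9 out of 10 has a probability of only about 6%, which is why this set of results looks “too good to be true.”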
31. Take-home message of Greg’s studies
-The file drawer phenomenon might be immense. Don’t put your money on
published studies
-Think not only about the p of your failed studies, but also about their power.
-For most studies in our discipline, there is about a 50% chance of discovering a
true phenomenon (since many studies are underpowered)
-Increase your N per hypothesis! It increases your “power” to discover an
effect (Ha = true) and (a bit) to refute an effect’s existence (H0 = true)
Note:
To “detect” that men weigh more than women at an adequate power of .8,
you need n = 46 per group! (Simmons et al., 2013).
Are we studying effects that are stronger than men outweighing women?
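The n = 46 figure can be checked with the same normal approximation. A sketch, assuming the sex difference in weight has a standardized effect size of roughly d = 0.59 (an assumed value, not one given on the slide) and a two-tailed test at α = .05:

```python
import math

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def power_two_sample(d, n_per_group, z_crit=1.96):
    """Approximate power of a two-sample, two-tailed test (normal approximation)."""
    ncp = d * math.sqrt(n_per_group / 2.0)
    return 1.0 - normal_cdf(z_crit - ncp)   # far-tail contribution is negligible here

def n_for_power(d, target=0.8):
    """Smallest per-group n reaching the target power."""
    n = 2
    while power_two_sample(d, n) < target:
        n += 1
    return n

print(n_for_power(0.59))  # 46 per group under these assumptions
```

Any effect smaller than this very visible one needs an even larger sample, which is the point of the rhetorical question above.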
32. -You ARE a scientist. So trust your feelings when they say “too good to be
true…”
-Engage in post-publication reviewing: do some active blogging about your
own studies; engage in discussions about others’ research
-Replicate! Or make others replicate. That is, investigate what others have
done already. Use all available data for your insights and do not take any
single study’s results for granted. Go ahead and p-hack your own data, but
replicate your own results
-Document your studies in a good way. Genuinely question yourself: is this
really everything one should need to know in order to replicate my study?
-Openness in reporting and reviewing. Be honest, and confront reviewers if
they fetishize immaculate papers
-Preferably, collaborate with other researchers and use shared repositories to
store data, analyses, notes, etc.
33. IRONICALLY,
the net result will be that more papers will be published rather than fewer, I
guess.
Standards for what is good enough to be published should go down. As a
result, more will be published, and meta-analysis, rather than single studies,
will become the true catalyst of scientific progress.