Statistical significance

In statistics, statistical significance (or a statistically
significant result) is attained when a p-value is less than
the significance level.[1][2][3][4][5][6][7]
The p-value is the
probability of obtaining a result at least as extreme as the
one observed, given that the null hypothesis is true, whereas
the significance or alpha (α) level is the probability of
rejecting the null hypothesis given that it is true.[8]
As a matter of good scientific practice, a
significance level is chosen before data collection and is
usually set to 0.05 (5%).[9]
Other significance levels (e.g.,
0.01) may be used, depending on the field of study.[10]
Statistical significance is fundamental to statistical hy-
pothesis testing.[11][12]
In any experiment or observation
that involves drawing a sample from a population, there is
always the possibility that an observed effect would have
occurred due to sampling error alone.[13][14]
But if the p-
value is less than the significance level (e.g., p < 0.05),
then an investigator may conclude that the observed ef-
fect actually reflects the characteristics of the population
rather than just sampling error.[11]
An investigator may
then report that the result attains statistical significance,
thereby rejecting the null hypothesis.[15]
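The meaning of the significance level as a long-run error rate can be illustrated with a small simulation. The following is a minimal sketch, not part of the cited sources: it assumes normally distributed data with a known standard deviation, uses a two-sided z-test, and the function names, sample size and number of trials are all illustrative choices. When the null hypothesis is true in every simulated experiment, the fraction of experiments declared significant should be close to α.

```python
import random
from statistics import NormalDist, fmean


def two_sided_p(sample, mu0, sigma):
    """Two-sided z-test p-value for H0: population mean == mu0 (sigma known)."""
    z = (fmean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * NormalDist().cdf(-abs(z))


def false_positive_rate(alpha=0.05, n=30, trials=20_000, seed=0):
    """Fraction of experiments that reject H0 even though H0 is true."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population in which the null hypothesis holds.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_sided_p(sample, mu0=0.0, sigma=1.0) < alpha:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    print(f"observed type I error rate: {false_positive_rate():.3f}")
```

With α = 0.05 the printed rate should land near 0.05, differing only by simulation noise.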
The present-day concept of statistical significance origi-
nated with Ronald Fisher when he developed statistical
hypothesis testing based on p-values in the early 20th
century.[2][16][17]
It was Jerzy Neyman and Egon Pearson
who later recommended that the significance level be set
ahead of time, prior to any data collection.[18][19]
The term significance does not imply importance and the
term statistical significance is not the same as research,
theoretical, or practical significance.[11][12][20]
For exam-
ple, the term clinical significance refers to the practical
importance of a treatment effect.
1 History
Main article: History of statistics
The concept of statistical significance was originated by
Ronald Fisher when he developed statistical hypothesis
testing, which he described as “tests of significance”,
in his 1925 publication, Statistical Methods for Research
Workers.[2][16][17]
Fisher suggested a probability of one in
twenty (0.05) as a convenient cutoff level to reject the null
hypothesis.[18]
In their 1933 paper, Jerzy Neyman and
Egon Pearson recommended that the significance level
(e.g. 0.05), which they called α, be set ahead of time,
prior to any data collection.[18][19]
Despite his initial suggestion of 0.05 as a significance
level, Fisher did not intend this cutoff value to be fixed,
and in his 1956 publication Statistical Methods and Scientific
Inference he recommended that significance levels be
set according to specific circumstances.[18]
2 Role in statistical hypothesis testing
Main articles: Statistical hypothesis testing, Null hypoth-
esis, p-value and Type I and type II errors
[Figure: In a two-tailed test, the rejection region for a significance level of
α=0.05 is partitioned to both ends of the sampling distribution
and makes up 5% of the area under the curve (white areas).]
Statistical significance plays a pivotal role in statistical
hypothesis testing, where it is used to determine whether
a null hypothesis should be rejected or retained. A null
hypothesis is the general or default statement that nothing
happened or changed.[21]
For a null hypothesis to be
rejected as false, the result has to be identified as being
statistically significant, i.e. unlikely to have occurred due
to sampling error alone.
To determine whether a result is statistically significant,
a researcher would have to calculate a p-value, which is
the probability of obtaining a result at least as extreme as
the one observed, given that the null hypothesis is true.[7]
The null hypothesis is rejected if the
p-value is less than the significance or α level. The α level
is the probability of rejecting the null hypothesis given
that it is true (type I error) and is most often set at 0.05
(5%). If the α level is 0.05, then the conditional probabil-
ity of a type I error, given that the null hypothesis is true,
is 5%.[22]
Then a statistically significant result is one in
which the observed p-value is less than 5%, which is for-
mally written as p < 0.05.[22]
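The decision rule described above can be sketched in a few lines. This example is an illustration only: it assumes a coin-tossing experiment with the null hypothesis that the coin is fair, and it computes an exact two-sided binomial p-value by summing the probabilities of all outcomes at least as far from the expected count as the observed one (one of several conventions for two-sided exact tests). The function name and the counts are hypothetical.

```python
from math import comb


def binomial_two_sided_p(successes, n, p0=0.5):
    """Exact two-sided p-value for H0: success probability == p0.

    Sums the probability of every outcome whose distance from the
    expected count n * p0 is at least that of the observed outcome.
    """
    expected = n * p0
    observed_dev = abs(successes - expected)
    total = sum(
        comb(n, k) * p0**k * (1 - p0) ** (n - k)
        for k in range(n + 1)
        if abs(k - expected) >= observed_dev
    )
    return min(1.0, total)  # guard against rounding slightly above 1


ALPHA = 0.05  # significance level, chosen before looking at the data

if __name__ == "__main__":
    for heads in (60, 61):
        p = binomial_two_sided_p(heads, 100)
        verdict = "reject H0" if p < ALPHA else "retain H0"
        print(f"{heads} heads in 100 tosses: p = {p:.4f} -> {verdict}")
```

Under these assumptions, 60 heads in 100 tosses falls just short of significance at α = 0.05, while 61 heads attains it, which shows how sharp the cutoff is.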
If an observed p-value is not lower than the significance
level, then rather than simply accepting the null hypothesis,
it is often appropriate, where feasible, to increase the
sample size of the study and see whether the result then
reaches significance.[23]
Nevertheless, increasing the number of subjects
may result in even the smallest effect attaining statistical
significance.[24]
In these cases, reporting effect sizes becomes
particularly important.
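The dependence of significance on sample size can be made concrete. The sketch below assumes a fixed, very small true effect (0.02 standard deviations, an arbitrary illustrative value) and a two-sided z-test in which the sample mean happens to equal the true effect exactly; the function name is hypothetical. Only the sample size changes between the three cases.

```python
from statistics import NormalDist


def p_value_for_fixed_effect(n, effect=0.02):
    """Two-sided z-test p-value when the observed mean shift equals
    `effect` standard deviations and the sample has n observations."""
    z = effect * n**0.5  # z = effect / (sigma / sqrt(n)), with sigma = 1
    return 2 * NormalDist().cdf(-abs(z))


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(f"n = {n:>9,}: p = {p_value_for_fixed_effect(n):.3g}")
```

The effect is identical in all three cases; it is not significant at α = 0.05 with 100 observations but is with 10,000 or more, which is why the effect size needs to be reported alongside the p-value.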
If the α level is set at 0.05, it means that the rejection re-
gion comprises 5% of the sampling distribution.[25]
These
5% can be allocated to one side of the sampling distribu-
tion, as in a one-tailed test, or partitioned to both sides
of the distribution as in a two-tailed test, with each tail
(or rejection region) containing 2.5% of the distribution.
One-tailed tests are more powerful than two-tailed tests
when the effect lies in the predicted direction, as the null
hypothesis can then be rejected with a less extreme result.
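The difference between the two rejection regions can be shown by computing the cutoffs directly. This is a minimal sketch that assumes a standard normal sampling distribution; the observed test statistic of 1.8 is a hypothetical value chosen to fall between the two cutoffs.

```python
from statistics import NormalDist

ALPHA = 0.05
std_normal = NormalDist()  # standard normal sampling distribution (assumed)

# One-tailed: all 5% of the rejection region sits in the upper tail.
one_tailed_cutoff = std_normal.inv_cdf(1 - ALPHA)
# Two-tailed: 2.5% in each tail, so the cutoff is further out.
two_tailed_cutoff = std_normal.inv_cdf(1 - ALPHA / 2)

z_observed = 1.8  # hypothetical test statistic
rejects_one_tailed = z_observed > one_tailed_cutoff
rejects_two_tailed = abs(z_observed) > two_tailed_cutoff

if __name__ == "__main__":
    print(f"one-tailed cutoff: {one_tailed_cutoff:.3f}")
    print(f"two-tailed cutoff: {two_tailed_cutoff:.3f}")
    print(f"z = {z_observed}: one-tailed rejects = {rejects_one_tailed}, "
          f"two-tailed rejects = {rejects_two_tailed}")
```

The cutoffs come out at roughly 1.645 and 1.960, so a statistic of 1.8 is significant in the one-tailed test but not in the two-tailed test.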
3 Stringent significance thresholds in specific fields
Main articles: Standard deviation and Normal distribu-
tion
In specific fields such as particle physics and
manufacturing, statistical significance is often ex-
pressed in multiples of the standard deviation or sigma
(σ) of a normal distribution, with significance thresholds
set at a much stricter level (e.g. 5σ).[26][27]
For instance,
the certainty of the Higgs boson particle’s existence was
based on the 5σ criterion, which corresponds to a p-value
of about 1 in 3.5 million.[27][28]
In other fields of scientific research such as genome-wide
association studies, significance levels as low as 5×10⁻⁸ are
not uncommon.[29][30]
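The correspondence between a sigma threshold and a p-value, such as the 5σ criterion mentioned above, can be checked numerically. The sketch below assumes a normal distribution and a one-sided tail; the helper name is illustrative.

```python
from statistics import NormalDist


def one_sided_p_from_sigma(n_sigma):
    """Probability of a standard normal value exceeding n_sigma."""
    return NormalDist().cdf(-n_sigma)  # upper tail, by symmetry


if __name__ == "__main__":
    for n_sigma in (2, 3, 5):
        p = one_sided_p_from_sigma(n_sigma)
        print(f"{n_sigma} sigma: p = {p:.3g} (about 1 in {1 / p:,.0f})")
```

For 5σ the one-sided tail probability is about 2.9×10⁻⁷, in line with the figure of roughly 1 in 3.5 million.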
4 Effect size
Main article: Effect size
Researchers focusing solely on whether their results are
statistically significant might report findings that are not
substantive[31] and not replicable.[32]
To gauge the re-
search significance of their result, researchers are there-
fore encouraged to always report the effect size along with
p-values (in cases where the effect being tested for is de-
fined in terms of an effect size): the effect size quantifies
the strength of an effect, such as the distance between
two means (cf. Cohen’s d), the correlation between two
variables or its square, and other measures.[33]
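As an illustration of one such measure, Cohen's d for two independent groups can be computed as the difference between the group means divided by the pooled standard deviation. The sketch below uses that common definition; the function name and the data are made up for the example.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: difference of means in units of the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


if __name__ == "__main__":
    treatment = [2, 4, 6]  # hypothetical measurements
    control = [1, 3, 5]
    print(f"Cohen's d = {cohens_d(treatment, control):.2f}")  # prints 0.50
```

Here the means differ by 1 and the pooled standard deviation is 2, so d = 0.5: the groups differ by half a standard deviation regardless of whether a test on so few observations reaches significance.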
5 See also
• A/B testing
• ABX test
• Confidence level, the complement of the signifi-
cance level
• Effect size
• Fisher’s method for combining independent tests of
significance
• Look-elsewhere effect
• Multiple comparisons problem
• Texas sharpshooter fallacy (gives examples of tests
where the significance level was set too high)
• Reasonable doubt
• Statistical hypothesis testing
6 References
[1] Redmond, Carol; Colton, Theodore (2001). “Clinical sig-
nificance versus statistical significance”. Biostatistics in
Clinical Trials. Wiley Reference Series in Biostatistics
(3rd ed.). West Sussex, United Kingdom: John Wiley &
Sons Ltd. pp. 35–36. ISBN 0-471-82211-6.
[2] Cumming, Geoff (2012). Understanding The New Statis-
tics: Effect Sizes, Confidence Intervals, and Meta-Analysis.
New York, USA: Routledge. pp. 27–28.
[3] Krzywinski, Martin; Altman, Naomi (30 October 2013).
“Points of significance: Significance, P values and t-
tests”. Nature Methods (Nature Publishing Group) 10
(11): 1041–1042. doi:10.1038/nmeth.2698. Retrieved
3 July 2014.
[4] Sham, Pak C.; Purcell, Shaun M (17 April 2014).
“Statistical power and significance testing in large-scale
genetic studies”. Nature Reviews Genetics (Nature Pub-
lishing Group) 15 (5): 335–346. doi:10.1038/nrg3706.
Retrieved 3 July 2014.
[5] Johnson, Valen E. (October 9, 2013). “Revised stan-
dards for statistical evidence”. Proceedings of the National
Academy of Sciences (National Academies of Science).
doi:10.1073/pnas.1313476110. Retrieved 3 July 2014.
[6] Altman, Douglas G. (1999). Practical Statistics for Med-
ical Research. New York, USA: Chapman & Hall/CRC.
p. 167. ISBN 978-0412276309.
[7] Devore, Jay L. (2011). Probability and Statistics for Engi-
neering and the Sciences (8th ed.). Boston, MA: Cengage
Learning. pp. 300–344. ISBN 0-538-73352-7.
[8] Schlotzhauer, Sandra (2007). Elementary Statistics Using
JMP (SAS Press) (PAP/CDR ed.). Cary, NC: SAS Insti-
tute. pp. 166–169. ISBN 1-599-94375-1.

[9] Craparo, Robert M. (2007). “Significance level”. In
Salkind, Neil J. Encyclopedia of Measurement and Statis-
tics 3. Thousand Oaks, CA: SAGE Publications. pp.
889–891. ISBN 1-412-91611-9.
[10] Sproull, Natalie L. (2002). “Hypothesis testing”. Hand-
book of Research Methods: A Guide for Practitioners and
Students in the Social Science (2nd ed.). Lanham, MD:
Scarecrow Press, Inc. pp. 49–64. ISBN 0-810-84486-9.
[11] Sirkin, R. Mark (2005). “Two-sample t tests”. Statistics
for the Social Sciences (3rd ed.). Thousand Oaks, CA:
SAGE Publications, Inc. pp. 271–316. ISBN 1-412-
90546-X.
[12] Borror, Connie M. (2009). “Statistical decision making”.
The Certified Quality Engineer Handbook (3rd ed.). Mil-
waukee, WI: ASQ Quality Press. pp. 418–472. ISBN
0-873-89745-5.
[13] Babbie, Earl R. (2013). “The logic of sampling”. The
Practice of Social Research (13th ed.). Belmont, CA: Cen-
gage Learning. pp. 185–226. ISBN 1-133-04979-6.
[14] Faherty, Vincent (2008). “Probability and statistical sig-
nificance”. Compassionate Statistics: Applied Quantitative
Analysis for Social Services (With exercises and instruc-
tions in SPSS) (1st ed.). Thousand Oaks, CA: SAGE Pub-
lications, Inc. pp. 127–138. ISBN 1-412-93982-8.
[15] McKillup, Steve (2006). “Probability helps you make a
decision about your results”. Statistics Explained: An In-
troductory Guide for Life Scientists (1st ed.). Cambridge,
United Kingdom: Cambridge University Press. pp. 44–
56. ISBN 0-521-54316-9.
[16] Poletiek, Fenna H. (2001). “Formal theories of testing”.
Hypothesis-testing Behaviour. Essays in Cognitive Psy-
chology (1st ed.). East Sussex, United Kingdom: Psy-
chology Press. pp. 29–48. ISBN 1-841-69159-3.
[17] Fisher, Ronald A. (1925). Statistical Methods for Research
Workers. Edinburgh, UK: Oliver and Boyd. p. 43. ISBN
0-050-02170-2.
[18] Quinn, Geoffrey R.; Keough, Michael J. (2002). Experi-
mental Design and Data Analysis for Biologists (1st ed.).
Cambridge, UK: Cambridge University Press. pp. 46–69.
ISBN 0-521-00976-6.
[19] Neyman, J.; Pearson, E.S. (1933). “The testing of statisti-
cal hypotheses in relation to probabilities a priori”. Math-
ematical Proceedings of the Cambridge Philosophical So-
ciety 29: 492–510. doi:10.1017/S030500410001152X.
[20] Myers, Jerome L.; Well, Arnold D.; Lorch Jr, Robert F.
(2010). “The t distribution and its applications”. Research
Design and Statistical Analysis: Third Edition (3rd ed.).
New York, NY: Routledge. pp. 124–153. ISBN 0-805-
86431-8.
[21] Meier, Kenneth J.; Brudney, Jeffrey L.; Bohte, John
(2011). Applied Statistics for Public and Nonprofit Admin-
istration (3rd ed.). Boston, MA: Cengage Learning. pp.
189–209. ISBN 1-111-34280-6.
[22] Healy, Joseph F. (2009). The Essentials of Statistics: A
Tool for Social Research (2nd ed.). Belmont, CA: Cen-
gage Learning. pp. 177–205. ISBN 0-495-60143-8.
[23] Cohen, Barry H. (2008). Explaining Psychological Statis-
tics (3rd ed.). Hoboken, NJ: John Wiley and Sons. pp.
46–83. ISBN 0-470-00718-4.
[24] Friston, Karl (2012). “Ten ironic rules for non-
statistical reviewers”. NeuroImage 61 (4): 1300–1310.
[25] Heath, David (1995). An Introduction To Experimental
Design And Statistics For Biology (1st ed.). Boston, MA:
CRC press. pp. 123–154. ISBN 1-857-28132-2.
[26] Vaughan, Simon (2013). Scientific Inference: Learning
from Data (1st ed.). Cambridge, UK: Cambridge Uni-
versity Press. pp. 146–152. ISBN 1-107-02482-X.
[27] Bracken, Michael B. (2013). Risk, Chance, and Causa-
tion: Investigating the Origins and Treatment of Disease
(1st ed.). New Haven, CT: Yale University Press. pp.
260–276. ISBN 0-300-18884-6.
[28] Franklin, Allan (2013). “Prologue: The rise of the sig-
mas”. Shifting Standards: Experiments in Particle Physics
in the Twentieth Century (1st ed.). Pittsburgh, PA: Univer-
sity of Pittsburgh Press. pp. Ii–Iii. ISBN 0-822-94430-8.
[29] Clarke, GM; Anderson, CA; Pettersson, FH; Cardon, LR;
Morris, AP; Zondervan, KT (February 6, 2011). “Basic
statistical analysis in genetic case-control studies”. Nature
Protocols 6 (2): 121–33. doi:10.1038/nprot.2010.182.
PMID 21293453.
[30] Barsh, GS; Copenhaver, GP; Gibson, G; Williams, SM
(July 5, 2012). “Guidelines for Genome-Wide As-
sociation Studies”. PLoS Genetics 8 (7): e1002812.
doi:10.1371/journal.pgen.1002812. PMID 22792080.
[31] Carver, Ronald P. (1978). “The Case Against Statistical
Significance Testing”. Harvard Educational Review 48:
378–399.
[32] Ioannidis, John P. A. (2005). “Why most published re-
search findings are false”. PLoS Medicine 2: e124.
[33] Pedhazur, Elazar J.; Schmelkin, Liora P. (1991). Mea-
surement, Design, and Analysis: An Integrated Approach
(Student ed.). New York, NY: Psychology Press. pp.
180–210. ISBN 0-805-81063-3.
7 Further reading
• Ziliak, Stephen and Deirdre McCloskey (2008), The
Cult of Statistical Significance: How the Standard Er-
ror Costs Us Jobs, Justice, and Lives. Ann Arbor,
University of Michigan Press, 2009. ISBN 978-0-
472-07007-7. Reviews and reception: (compiled by
Ziliak)
• Thompson, Bruce (2004). “The “signifi-
cance” crisis in psychology and education”.
Journal of Socio-Economics 33: 607–613.
doi:10.1016/j.socec.2004.09.034.

• Chow, Siu L., (1996). Statistical Significance: Ra-
tionale, Validity and Utility, Volume 1 of series In-
troducing Statistical Methods, Sage Publications Ltd,
ISBN 978-0-7619-5205-3 – argues that statistical
significance is useful in certain circumstances.
• Kline, Rex, (2004). Beyond Significance Testing:
Reforming Data Analysis Methods in Behavioral Re-
search Washington, DC: American Psychological
Association.
8 External links
• The article "Earliest Known Uses of Some of the
Words of Mathematics (S)" contains an entry on
Significance that provides some historical information.
• "The Concept of Statistical Significance Testing"
(February 1994): article by Bruce Thompson hosted
by the ERIC Clearinghouse on Assessment and
Evaluation, Washington, D.C.
• "What does it mean for a result to be “statistically
significant”?" (no date): an article from the Statistical
Assessment Service at George Mason University,
Washington, D.C.

9 Text and image sources, contributors, and licenses
9.1 Text
• Statistical significance Source: http://en.wikipedia.org/wiki/Statistical_significance?oldid=649544777 Contributors: Bryan Derksen, The
Anome, William Avery, Michael Hardy, Kku, Gabbe, Dcljr, Ellywa, Nichtich~enwiki, Den fjättrade ankan~enwiki, Nerd~enwiki, Cherkash,
Topbanana, Paranoid, Gak, Henrygb, Giftlite, BrendanH, Pgan002, Antandrus, L353a1, DanielCD, Rich Farmbrough, Yknott, Kndiaye,
Slb, Cretog8, Arcadian, Andrewpmk, John Quiggin, Seans Potato Business, Alkarex, Woohookitty, Btyner, Rjwilmsi, Smoe, Thomas Arelatensis,
Thisismikesother, ElKevbo, Cjpuffin, EvanSeeds, Lborelli~enwiki, Mathbot, Riki, Preslethe, Vonkje, Chobot, YurikBot, Wavelength,
Gaius Cornelius, ENeville, Nephron, DRosenbach, Jon Olav Vik, Doc pune, Lt-wiki-bot, Davril2020, Badgettrg, Darrel francis,
SmackBot, McGeddon, Jtneill, Robfuller, Ohnoitsjamie, Josefec, Nbarth, Danielkueh, Richard001, G716, Arodb, Euchiasmus, Tim bates,
Nijdam, Tommyzee, Mmiller0712, Mdgross50, Grapplequip, DwightKingsbury, Joseph Solis in Australia, Abeg92, Tawkerbot4, LarryQ,
Thijs!bot, Tallred, Wildthing61476, Tillman, Erxnmedia, Fetchcomms, Magioladitis, Torchiest, Inhumandecency, MartinBot, ChemN-
erd, Lilac Soul, Coppertwig, Yym1997, Kenneth M Burke, Spellcast, Philip Trueman, Don Quixote de la Mancha, MuanN, Seraphim,
Sprasad.ee, SQL, Wangerin, Lavers, Jasondet, Strasburger, The-G-Unit-Boss, Melcombe, Wjmummert, Martarius, ClueBot, Binkster-
net, Srudes2, Winsteps, Pwestfall, Lot49a, Qwfp, Staticshakedown, Dthomsen8, SilvonenBot, Mifter, Aam aadmi, ZooFari, Jmkim dot
com, Tayste, Addbot, Eric Drexler, DOI bot, Fgnievinski, Bulletproofman19, MrOllie, Palmerabollo, Numbo3-bot, Ehrenkater, Zorrobot,
Luckas-bot, AnomieBOT, ChristopheS, Materialscientist, SvartMan, Xqbot, Bbarkley, Sylwia Ufnalska, M12107, Constructive editor,
FrescoBot, Sławomir Biały, Pinethicket, Edderso, Georg Hurtig, RedBot, Gjsis, Cerebis, Animalparty, Indicedigini, Raylyons, Billare, Sir
Arthur Williams, Rgmooney C109, GoingBatty, Schwa dk, HiW-Bot, Kostya 888, Muditjai, Mysticyx, L Kensington, Mikhail Ryazanov,
ClueBot NG, Mathstat, Michael D. Stephens, Helpful Pixie Bot, BG19bot, Wikstar7, Lilingxi, Matthieu Vergne, Manoguru, Minsbot,
MathewTownsend, BattyBot, HankW512, ChrisGualtieri, Eggingerik, BetseyTrotwood, NicenFriendlyPerson, Sa publishers, Soranoch,
Thewikiguru1, Rgiordan, EmilKarlsson, 1980na, Isambard Kingdom, ChrisLloyd58 and Anonymous: 154
9.2 Images
• File:Commons-logo.svg Source: http://upload.wikimedia.org/wikipedia/en/4/4a/Commons-logo.svg License: ? Contributors: ? Original
artist: ?
• File:Fisher_iris_versicolor_sepalwidth.svg Source: http://upload.wikimedia.org/wikipedia/commons/4/40/Fisher_iris_versicolor_
sepalwidth.svg License: CC BY-SA 3.0 Contributors: en:Image:Fisher iris versicolor sepalwidth.png Original artist: en:User:Qwfp (origi-
nal); Pbroks13 (talk) (redraw)
• File:Folder_Hexagonal_Icon.svg Source: http://upload.wikimedia.org/wikipedia/en/4/48/Folder_Hexagonal_Icon.svg License: Cc-by-
sa-3.0 Contributors: ? Original artist: ?
• File:NormalDist1.96.png Source: http://upload.wikimedia.org/wikipedia/en/b/bf/NormalDist1.96.png License: Cc-by-sa-3.0 Contributors: self-made Original artist: Qwfp (talk)
• File:People_icon.svg Source: http://upload.wikimedia.org/wikipedia/commons/3/37/People_icon.svg License: CC0 Contributors: Open-
Clipart Original artist: OpenClipart
• File:Portal-puzzle.svg Source: http://upload.wikimedia.org/wikipedia/en/f/fd/Portal-puzzle.svg License: Public domain Contributors: ?
Original artist: ?
• File:Wikiversity-logo.svg Source: http://upload.wikimedia.org/wikipedia/commons/9/91/Wikiversity-logo.svg License: CC BY-SA 3.0
Contributors: Snorky (optimized and cleaned up by verdy_p) Original artist: Snorky (optimized and cleaned up by verdy_p)
9.3 Content license
• Creative Commons Attribution-Share Alike 3.0
