How to choose a statistical test?
Dr. S. A. Rizwan, M.D.
Public Health Specialist
SBCM, Joint Program – Riyadh
Ministry of Health, Kingdom of Saudi Arabia
Learning objectives
Demystifying statistics! – Lecture 6, SBCM, Joint Program – Riyadh
• Name the various commonly used statistical tests
• Describe the preconditions for selecting a statistical test
• Apply the correct test for the problem at hand
• Interpret the conclusions of the test appropriately
Introduction
• We apply a statistical test to assess whether our study results could have arisen by chance alone or reflect a true effect
• Applying the correct test is essential for a valid conclusion
• Several considerations must be weighed before selecting a test
Introduction
• Hundreds of statistical tests are available
• This book is a useful and simple guide
• Apart from tests, there are many statistical methods that are also loosely referred to as statistical tests
Prerequisites for deciding on a test
• How many variables are there?
• What is the nature of the dependent and independent variables?
• How many categories does the categorical variable have?
• Does the continuous variable follow a normal distribution? (Parametric vs. non-parametric)
• Is there any pairing in the data/variables?
* Recall the types of variables / data
What is meant by "data follow a normal distribution"?
• As a general rule
  • With small samples (<30), normality cannot be assumed and is hard to verify
  • With large samples (≥30), the sampling distribution of the mean is approximately normal (central limit theorem), so parametric tests are usually acceptable
• Normality can be checked
  • Informally:
    • Visual methods: histogram, P-P, Q-Q plots
    • Comparing SD and mean (for a variable that cannot be negative, an SD larger than half the mean suggests skewness)
  • Formally:
    • Statistical tests (Kolmogorov–Smirnov, Shapiro–Wilk)
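The informal "comparing SD and mean" check can be sketched in a few lines of Python. This is only a rule of thumb, not a formal test: it assumes the variable cannot take negative values, and it flags likely skewness when the mean is smaller than twice the standard deviation. The function name and both data sets are hypothetical, made up for illustration.

```python
from statistics import mean, stdev


def looks_skewed(values):
    """Rule of thumb for a non-negative variable: skewness is likely when mean < 2 * SD."""
    if min(values) < 0:
        raise ValueError("rule of thumb applies only to variables that cannot be negative")
    return mean(values) < 2 * stdev(values)


# Hypothetical data: length of hospital stay (long right tail) and adult heights.
hospital_stay_days = [1, 1, 2, 2, 3, 4, 5, 8, 13, 40]
heights_cm = [160, 165, 170, 172, 175, 168, 171, 169]

print(looks_skewed(hospital_stay_days))  # flagged as probably skewed
print(looks_skewed(heights_cm))          # not flagged
```

A flag from this check is a prompt to look at a histogram or Q-Q plot, or to run a formal test such as Shapiro–Wilk in a statistics package, before settling on a parametric test.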
Dependent / independent variable?
• Height
• Weight
• Diet
• Socio-economic status
• Drug type
• Muscle mass
• Quality of life
• Wealth
• Employment
• Sex
• Treatment type
• Cure rate
• Survival index
• Maternal mortality rate
• Teenage pregnancy
• Immunisation coverage
• Education
• House type
• Health accessibility
• Mortality
• Tobacco use
• Domestic violence
* Can you classify these variables into dependent and independent?
Paired / unpaired data?
• Paired data/samples are those in which natural or matched couplings occur
  • Pre-test/post-test samples
  • Cross-over trials
  • Matched samples
  • Cluster samples
How to choose a statistical test?
One-sample tests
• One dependent variable compared to a hypothesized population value
• There are no independent variables
• Rarely used in community medicine
Nature of variable | Test name
Normal | One-sample t-test
Ordinal/interval | One-sample median test
Categorical (2 levels) | Binomial test
Categorical (>2 levels) | Chi-square goodness-of-fit (GOF) test
* If the sample size is large and the population parameters (notably the standard deviation) are known, the Z test is used.
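Two of the one-sample tests above are simple enough to sketch with the Python standard library alone. The sketch below is illustrative, not a replacement for a statistics package: it assumes a known population standard deviation for the Z test, and it defines the two-sided exact binomial p-value as the total probability of all outcomes no more likely than the observed one. Function names and the numbers are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n):
    """Two-sided Z test of a sample mean against mu0, with known population SD sigma."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def binomial_test(successes, n, p0):
    """Two-sided exact binomial test of an observed proportion against p0."""
    probs = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
    observed = probs[successes]
    # Sum every outcome that is no more likely than the observed one
    # (small tolerance guards against floating-point ties).
    return min(1.0, sum(pr for pr in probs if pr <= observed * (1 + 1e-9)))


# Hypothetical example: mean systolic BP of 125 in 36 people vs. a population mean of 120 (SD 15).
z, p_z = one_sample_z_test(sample_mean=125, mu0=120, sigma=15, n=36)

# Hypothetical example: 8 cures out of 10 patients vs. an expected cure rate of 50%.
p_binom = binomial_test(successes=8, n=10, p0=0.5)

print(round(z, 2), round(p_z, 4), round(p_binom, 4))
```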
Comparison of 2 variables - 1
• One independent variable
  • Categorical
  • Independent groups
  • 2 levels
• One dependent variable
• Some of the most commonly used tests
• Also known as bivariate analysis
Nature of dependent variable (DV) | Test name
Normal | Independent-samples t-test
Ordinal/interval (not normal) | Wilcoxon-Mann-Whitney test
Categorical (assumptions met) | Chi-square test
Categorical (assumptions NOT met) | Fisher’s exact test
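The choice between the chi-square test and Fisher’s exact test for a categorical DV can be sketched for a 2×2 table as follows. The sketch assumes the commonly taught criterion that the chi-square approximation is acceptable when every expected cell count is at least 5; it uses the uncorrected (no Yates correction) chi-square statistic and a two-sided Fisher p-value that sums all tables no more likely than the observed one. The function name and the counts are hypothetical.

```python
from math import comb, erfc, sqrt


def compare_two_proportions(a, b, c, d):
    """2x2 table [[a, b], [c, d]]: rows are the two groups, columns the two outcomes."""
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    n = row1 + row2
    expected = [r * k / n for r in (row1, row2) for k in (col1, col2)]

    if min(expected) >= 5:
        # Pearson chi-square with 1 degree of freedom; its upper tail equals erfc(sqrt(x / 2)).
        chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
        return "chi-square test", erfc(sqrt(chi2 / 2))

    # Fisher's exact test: hypergeometric weights for every table with the same margins.
    def weight(x):
        return comb(row1, x) * comb(row2, col1 - x)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    extreme = sum(weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed)
    return "Fisher's exact test", extreme / comb(n, col1)


# Hypothetical counts: cured / not cured under two treatments.
print(compare_two_proportions(30, 20, 20, 30))  # large counts: chi-square is used
print(compare_two_proportions(3, 1, 1, 3))      # small counts: Fisher's exact is used
```

Statistical packages may report slightly different chi-square p-values because many apply a continuity correction to 2×2 tables by default.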
Comparison of 2 variables - 2
• One independent variable
  • Categorical
  • Independent groups
  • >2 levels
• One dependent variable
• Some of the most commonly used tests
• Also known as bivariate analysis
Nature of DV | Test name
Normal | One-way ANOVA
Ordinal/interval (not normal) | Kruskal-Wallis test
Categorical (assumptions met) | Chi-square test
Categorical (assumptions NOT met) | Fisher’s exact test
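To show what one-way ANOVA actually compares, the following sketch computes the F statistic (between-group variance divided by within-group variance) by hand. It stops at the statistic and its degrees of freedom, because the p-value needs the F distribution, which the Python standard library does not provide; a statistics package would normally be used for that step. The function name and the data are hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of independent groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical outcome scores under three treatments.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_stat, 2), df_b, df_w)
```

A large F means the group means differ by more than the spread within groups would explain; it does not say which groups differ.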
Comparison of 2 variables - 3
• One independent variable
  • Categorical
  • Dependent (paired) groups
  • 2 levels
• One dependent variable
Nature of DV | Test name
Normal | Paired t-test
Ordinal/interval (not normal) | Wilcoxon signed-rank test
Categorical | McNemar test
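For paired categorical data, McNemar’s test looks only at the discordant pairs, that is, the pairs whose outcome changed between the two measurements. A minimal sketch of the exact (binomial) version is shown below; it assumes that under the null hypothesis a change in either direction is equally likely. The function name and the counts are hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar test.

    b = pairs that changed from positive to negative,
    c = pairs that changed from negative to positive.
    Concordant pairs carry no information and are ignored.
    """
    n = b + c
    if n == 0:
        return 1.0
    # Under H0 the smaller discordant count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2**n
    return min(1.0, 2 * tail)


# Hypothetical pre/post survey: 10 people stopped smoking, 2 started.
p_change = mcnemar_exact(10, 2)
print(round(p_change, 4))
```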
Comparison of 2 variables - 4
• One independent variable
  • Categorical
  • Dependent (paired) groups
  • >2 levels
• One dependent variable
Nature of DV | Test name
Normal | One-way repeated measures ANOVA
Ordinal/interval (not normal) | Friedman test
Categorical | Repeated measures logistic regression
Comparison of 2 variables - 5
• One independent variable
  • Interval
• One dependent variable
Nature of DV | Test name
Normal | Correlation, regression
Ordinal/interval (not normal) | Non-parametric correlation
Categorical | Simple logistic regression
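The difference between parametric and non-parametric correlation can be sketched by computing both coefficients on the same data. The sketch below assumes Pearson’s r as the parametric coefficient and Spearman’s rho (Pearson’s r applied to ranks, with ties given their average rank) as the non-parametric one. The function names and the data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data with a perfectly monotonic but curved relationship.
dose = [1, 2, 3, 4, 5]
response = [1, 4, 9, 16, 25]
print(round(pearson_r(dose, response), 3), round(spearman_rho(dose, response), 3))
```

Because the relationship is monotonic but not linear, Spearman’s rho is 1 while Pearson’s r falls slightly short of 1, which is the practical reason to prefer the rank-based coefficient for non-normal or ordinal data.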
Other statistical methods - 1
• One or more independent variables
  • Interval or categorical
• One dependent variable
Nature of DV | Test name
Interval and normal | Multiple regression, ANCOVA
Categorical | Multiple logistic regression, discriminant analysis
Other statistical methods - 2
• One independent variable
  • Independent groups
  • 2 or more levels
• 2 or more dependent variables
Nature of DV | Test name
Interval and normal | One-way MANOVA
Other statistical methods - 3
• 2 or more independent variables
  • Categorical
  • Independent groups
• One dependent variable
• Some of the most commonly used tests
Nature of DV | Test name
Normal | Factorial ANOVA
Ordinal/interval (not normal) | Ordered logistic regression
Categorical | Factorial logistic regression
Other statistical methods - 4
• 2 or more independent variables
• 2 or more dependent variables
Nature of DV | Test name
Interval and normal | Multivariate multiple linear regression
Other statistical methods - 5
• 2 or more dependent variables
Nature of DV | Test name
Interval and normal | Canonical correlation, factor analysis
Summary – Tests you must know
Type of bivariate comparison | Number of groups | Independent samples, parametric (normal distribution) | Independent samples, non-parametric (non-normal distribution) | Paired samples, parametric (normal distribution) | Paired samples, non-parametric (non-normal distribution)
Categorical vs. categorical (e.g. treatment type vs. gender) | 2 | - | Chi-square test (or Fisher’s exact test) | - | McNemar test
Categorical vs. categorical (e.g. disease severity vs. religion) | >2 | - | Chi-square test (or Fisher’s exact test) | - | -
Categorical vs. quantitative (e.g. two treatment types vs. mean BP) | 2 | Student’s t test | Wilcoxon rank sum test | Paired t test | Wilcoxon signed-rank test
Categorical vs. quantitative (e.g. three treatment types vs. mean BP) | >2 | One-way ANOVA (analysis of variance) | Kruskal-Wallis test | Repeated measures ANOVA | Friedman’s test
Quantitative vs. quantitative (e.g. age vs. weight) | - | Pearson’s correlation, regression analysis | Spearman’s rank correlation | Generalized linear models | -
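The summary table is effectively a lookup from four questions (type of comparison, number of groups, pairing, normality) to a test name, and it can be sketched as such. The function below simply encodes the table cell by cell; the function name, argument names and string labels are hypothetical, and a return value of None means the table lists no test for that combination.

```python
def choose_test(comparison, groups=None, paired=False, normal=True):
    """Look up the test named in the summary table.

    comparison: "categorical-categorical", "categorical-quantitative"
                or "quantitative-quantitative"
    groups:     number of groups (levels) of the categorical variable, if any
    """
    if comparison == "categorical-categorical":
        if not paired:
            return "Chi-square test (or Fisher's exact test)"
        return "McNemar test" if groups == 2 else None

    if comparison == "categorical-quantitative":
        two_groups = groups == 2
        table = {
            (True, False, True): "Student's t test",
            (True, False, False): "Wilcoxon rank sum test",
            (True, True, True): "Paired t test",
            (True, True, False): "Wilcoxon signed-rank test",
            (False, False, True): "One-way ANOVA",
            (False, False, False): "Kruskal-Wallis test",
            (False, True, True): "Repeated measures ANOVA",
            (False, True, False): "Friedman's test",
        }
        return table[(two_groups, paired, normal)]

    if comparison == "quantitative-quantitative":
        if not paired:
            return "Pearson's correlation / regression" if normal else "Spearman's rank correlation"
        return "Generalized linear models" if normal else None

    raise ValueError(f"unknown comparison type: {comparison}")


# Example: mean BP compared across three treatment groups, not normally distributed.
print(choose_test("categorical-quantitative", groups=3, paired=False, normal=False))
```

A lookup like this only names a candidate test; the assumptions of that test still have to be checked against the data.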
Summary
Take home messages
• A number of things must be considered before selecting a test
• The assumptions of the chosen test must be satisfied
• The results must be interpreted cautiously and in context
Thank you!
Email your queries to sarizwan1986@outlook.com
