8. The variables determine the graph(s) you should use:
1 continuous = dot plot, histogram or box and whisker
2 continuous = scatter plot
1 categorical = bar chart or pie chart
2 categorical = clustered bar chart
1 continuous + 1 categorical = 2 box and whiskers (one per category)
15. Central tendency and spread
Depending on the type of data you have, you use different types of
measure for central tendency and spread of the data
Measures of central tendency
Mean = the average number
Median = the middle number
Mode = the most occurring number
Measures of spread of the data
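Standard deviation = roughly, the typical distance from the mean
(strictly, the square root of the mean squared distance)
Variance = standard deviation²
Range = largest value - smallest value
IQR = interquartile range (75% value - 25% value)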
16. Standard deviation (σ) ≈ typical distance from the mean
e.g. 1, 2, 3, 4, 5
Mean = 3
Distances from the mean are 2, 1, 0, 1, 2, so the mean distance from the mean: 6/5 = 1.2
(Strictly, σ squares the distances before averaging: σ = √(10/5) ≈ 1.41)
e.g. 3, 3, 3, 3, 3
Mean = 3
But every distance from the mean is 0, so σ = 0
Although the mean is the same for both, the 1st set is more
spread out. This is reflected in the standard deviation (σ)
being bigger.
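As a quick check of the two examples above, here is a minimal Python sketch (the helper name `mean_abs_distance` is made up for illustration). It prints the simple "mean distance" of 1.2 next to the textbook population standard deviation, which squares the distances before averaging.

```python
from statistics import mean, pstdev

def mean_abs_distance(values):
    """Average absolute distance of each value from the mean."""
    m = mean(values)
    return sum(abs(v - m) for v in values) / len(values)

spread_out = [1, 2, 3, 4, 5]
identical = [3, 3, 3, 3, 3]

for data in (spread_out, identical):
    print(data,
          "mean:", mean(data),
          "mean distance:", mean_abs_distance(data),
          "population SD:", round(pstdev(data), 2))
```

Both sets have a mean of 3, but only the first has any spread.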
21. Sample vs Population
Suppose I take a random sample of 16 people from this room.
n = 16.
If I measure each person's height, I
can work out the sample mean (x̄) height. (e.g. x̄ = 1.75m)
This gives an indication of the population mean (μ), but does NOT exactly
give the mean height of the whole lecture theatre
I'll only know the population mean if I actually measure everyone's height:
→ n would have to match the population
In Medicine (or life) we are hardly ever able to examine a whole
population! We can only take a sample… (n) and then extrapolate to the
population to make judgements…
This leads on to what Confidence Intervals are.
22. Summary: Introductory Statistics
1. Central tendency measures:
1. mean,
2. median,
3. mode
2. Measures of spread:
1. standard deviation,
2. variance,
3. range,
4. IQR
3. If you have non-parametric data, you can:
i. log-transform it, then do a parametric test
OR
ii. just do a non-parametric test
4. Use mean and s.d for parametric tests (normal distribution)
5. Use median and IQR for non-parametric tests (+ve/-ve skewed distributions)
23. Sample mean (x̄) and population mean (μ) are two very different things.
All Normal Distributions can be described by just 2 parameters: 1. Mean
2. Standard deviation
Standard deviation = a measure of variability in the data.
It shows the spread (distance) of data values from their mean.
Variance measures the average squared distance of each point from the mean.
Variance = standard deviation squared
MSMSM from A-Levels!
27. What is a CI?
→ When taking a sample of size n, you can calculate an x̄.
BUT we want to know the population mean, μ.
→ As said before, we usually cannot know μ exactly because we only have a sample
to work with.
→ So instead, we can calculate a range that we are x% confident contains μ.
→ That is, we estimate its smallest and largest plausible values.
→ This is a Confidence Interval.
Using the example earlier…
I get an x̄ = 1.75m from n = 16 of the lecture theatre.
I then calculate from the heights that σ = 0.16m (for example)
By doing a few calculations, I could then calculate a range in which I am
95% sure that the population mean height (μ) lies.
28. CI Equation
CI = x̄ ± Z × (σ/√n), where x̄ is the sample mean
Z = a number that determines
to what % we know that μ is
within the confidence interval.
For us, Z = 1.96.
This gives us 95% certainty
that μ is within the interval.
Standard Error = σ/√n
Our example…
Z = 1.96
n = 16
x̄ = 1.75m
σ = 0.16m
95% CI = 1.75 ± (1.96) × (0.16/√16)
= 1.75 ± 0.08
Therefore, 95% CI = 1.67m - 1.83m
We are 95% sure that the average
height of the people in this room is
between 1.67m and 1.83m
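The worked example can be reproduced in a few lines. This is a minimal Python sketch using the slide's numbers; the function name `confidence_interval` is made up for illustration, and it assumes the Z = 1.96 approach used here.

```python
from math import sqrt

def confidence_interval(sample_mean, sd, n, z=1.96):
    """Return (lower, upper) limits: sample_mean ± z × standard error."""
    standard_error = sd / sqrt(n)   # σ/√n
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# Numbers from the lecture theatre example: x̄ = 1.75m, σ = 0.16m, n = 16
lower, upper = confidence_interval(1.75, 0.16, 16)
print(f"95% CI = {lower:.2f}m to {upper:.2f}m")
```

Making n bigger shrinks the standard error, so the interval gets narrower.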
29. Confidence Intervals in Normal Distribution
[Figure: normal curve with the 95% CI marked; the mean height is somewhere in this region, with 95% certainty]
How this relates to medicine:
• When new drugs are being tested, their
hazardous effects need to be examined.
• This is often done as a hazard ratio,
compared to current treatment
Hazard ratio (NEW : old) =
likelihood that new drug B
causes an adverse effect
compared to old drug A.
E.g., If old Drug A has a S/E of cardiomyopathy in 1/25 patients
but new Drug B has a S/E of cardiomyopathy in 1/50 patients.
The hazard ratio (HR) of B:A is 0.5 or 1/2 (HR of A:B is the RECIPROCAL = 2)
You can see from BOTH hazard ratios → to see improvement when considering a new drug,
we want HR (new:old) to be LESS than 1 or as LOW as possible.
30. However, it is NOT that simple…
We want a low HR, but we can't always be 100% sure because… we only tend to
get HRs as confidence intervals, e.g., with 95% CI, HR: 0.4 - 0.6 … WHY?
Confounding factors are NOT considered,
as only 2 parameters are considered in the N.D.
HR of Cardiomyopathy in Drug B : Drug A
[Figure: 95% CI for the HR running from 0.4 to 0.6; we are 95% sure that the HR is somewhere here]
So here, we are fairly certain that Drug B generally causes less cardiomyopathy than
Drug A, as the UPPER LIMIT of the HR (0.6) < 1. → This is statistically significant.
There is only a 2.5% or less possibility that the new drug has a higher chance of causing
cardiomyopathy than the old drug.
31. In this scenario, UPPER LIMIT > 1
[Figure: 95% CI for the HR that extends above 1; the HR is somewhere here, with 95% certainty]
We can NOT be certain (to 95%) that
Drug B causes less of side effect "X"
than Drug A (HR could be > 1).
So, this is statistically insignificant.
This principle applies to other
measures as well…
1. If Hazard Ratio or Odds Ratio, check if the interval (lower limit to upper limit) crosses 1.
If it does = statistical insignificance
If it does not = statistical significance
2. If just a number (i.e. NOT a ratio), check if the interval crosses 0.
If it does = statistical insignificance
If it does not = statistical significance
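The "does it cross 1 or 0?" rule can be written as a tiny check. This is only an illustrative Python sketch; the function name and the example intervals other than 0.4 - 0.6 are made up.

```python
def is_statistically_significant(lower, upper, is_ratio):
    """Apply the rule of thumb: a CI that contains the 'no effect' value is not significant.

    The 'no effect' value is 1 for ratios (HR, OR) and 0 for plain numbers.
    """
    no_effect = 1.0 if is_ratio else 0.0
    crosses = lower <= no_effect <= upper
    return not crosses

print(is_statistically_significant(0.4, 0.6, is_ratio=True))    # HR 0.4 - 0.6
print(is_statistically_significant(0.7, 1.3, is_ratio=True))    # made-up HR interval
print(is_statistically_significant(-0.2, 0.5, is_ratio=False))  # made-up difference
```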
32. Note the difference between statistical significance (if the value crosses 0/1)
and clinical significance
CIs come again later, when talking
about statistical tests and P-values
33. Summary: Confidence Intervals
1. A confidence interval is an interval that you are x% sure
contains the population mean.
2. In medicine we use z = 1.96, which corresponds to 95%
3. For hazard ratios and odds ratios, the threshold for
statistical significance is 1
4. For absolute numbers, the threshold for statistical significance
is 0
5. Statistical significance is not the same as clinical significance
41. Validity
Validity describes how well your measurements represent
the real, objective phenomenon.
e.g., Does [oxytocin]plasma correspond to happiness?
Does a +ve biopsy mean cancer?
There are 3 categories of validity:
1. Criterion Validity
2. Construct Validity
3. Content (Face) Validity
42. BEST
1. Criterion Validity =
comparing to a gold standard
(done in diagnostic tests)
2. Construct Validity = comparing
to a good indicator.
E.g. Comparing biomarkers
measuring QoL to a questionnaire
measuring QoL
3. Content (Face) Validity =
a panel of experts deciding if
it seems reasonable
WORST
43. Reliability
e.g., Endoscopes are more reliable in determining GI status
than questionnaires enquiring about QoL.
This is because endoscopes have a smaller standard error.
All tests that measure something have a standard error.
Smaller standard error = more reliable test results
Standard error = a measure of how much a test's result varies from one measurement to the next
No matter how reliable a test is, mistakes will always occur, so every test has an
associated standard error,
and from this standard error we give each test a:
Reliability Coefficient: how reliable the test is.
44. Reliability Coefficient
A measure of how reliable something is!
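Reliability Coefficient = Varsubject / (Varsubject + Varrepeat)
Varsubject = the variance between subjects
(how much the true values genuinely differ from person to person)
Varrepeat = the variance between repeated measurements on the same subject
(this is the measurement error, i.e. effectively the standard error squared)
The smaller Varrepeat is, the closer the coefficient gets to 1.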
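To see how the formula behaves, here is a small Python sketch. It assumes the usual reading of the two terms (Varsubject = variance between subjects, Varrepeat = variance between repeat measurements on the same subject), and the variances plugged in are made up.

```python
def reliability_coefficient(var_subject, var_repeat):
    """Varsubject / (Varsubject + Varrepeat); closer to 1 means more reliable."""
    return var_subject / (var_subject + var_repeat)

# Made-up variances for two hypothetical tests
precise_test = reliability_coefficient(var_subject=9.0, var_repeat=1.0)
noisy_test = reliability_coefficient(var_subject=4.0, var_repeat=4.0)

print(precise_test, noisy_test)
```

The test with little repeat-to-repeat noise scores close to 1; the noisy one scores only 0.5.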
45. Summary: Accuracy, Validity & Reliability
1. Accuracy is the lack of random and systematic error
(to have precision with lack of bias)
2. Validity can be classified into (best) criterion,
(middle) construct,
(worst) content validity.
3. Reliability is associated with standard error,
for which each test has a reliability coefficient.
4. Reliability Coefficient =
Varsubject / ( Varsubject + Varrepeat)
51. P-value
When experiments / studies are done, there are 2 hypotheses made:
1. Null hypothesis, H0 = there is no link
2. Alternative hypothesis, H1 = there is a link
(but no magnitude stated - how big or small, positive or negative)
Statistical Test → P value
The P-value is the probability of getting results at least as extreme as those observed, if H0 were true.
(It is often loosely described as the probability that, given the results, H0 is correct.)
If P > 0.05, do not reject H0. The default position for any claim should be skepticism.
If P < 0.05, the results would be too unlikely under H0 → reject H0 and accept H1
52. Errors
There are 2 errors that can occur when accepting/
rejecting hypotheses.
Type I (α) error = Too gullible.
Accepting H1 when H0 is true.
False +ve.
Type II (β) error = Too skeptical.
Accepting H0 when H1 is true.
False -ve.
53. Decreasing Errors
Decreasing Type I errors:
1. Decrease the number of tests you do
The more tests you run, the more likely you are to find
a pattern that does not exist
2. Decrease the number of secondary outcomes
you have
Decreasing Type II errors:
Increase the power of your study
Power = 1 - Type II error rate
Accepted α error rate = 5% (this matches the P < 0.05 cut-off)
Accepted β error rate = 20%
54. What statistical test you use depends on a few factors:
1. Variable (continuous/classification/regression)
2. Number of variables
3. Parametric vs Non-parametric
55. Continuous Variables
1 Variable: e.g. IQ tests
Parametric = 1 sample T test
Non-parametric = Sign Test (doesn't give magnitude)
Wilcoxon Signed Rank Test (gives magnitude)
2 Variables (paired): e.g. [glucose]plasma before & after meal
Parametric = Paired sample T test
Non-parametric = Wilcoxon Signed Rank test (WSRT)
2 Variables (independent): most common
Parametric = 2 sample T test
Non-parametric = Mann-Whitney U test
56. >2 Variables (paired):
Area under the curve
>2 Variables (independent):
Parametric = ANOVA (One-way analysis of variance) (main one) or
Bonferroni test or
Dunnett's test.
Non-parametric = Kruskal-Wallis test
57. … from before
Normal Distribution
"Bell shape". This is known as parametric data
Display:
1. Mean for central tendency
2. Standard Deviation for spread
Skewed Distribution
This is known as non-parametric data
Display:
1. Median for central tendency
2. IQR for spread
58. Categorical variables in classification tables
These are mutually
exclusive variables
1. If all of the numbers are 5 or more = Chi-squared test
2. If any of the numbers are <5 = Fisher's exact test
3. If paired data = McNemar's test
59. Regression Testing
This tests a correlation between 2 variables, not causation.
(Although, correlations do have causes)
GCSE maths
y = mx + c
m = gradient
c = y intercept
Statistics
y = α + βx
β = gradient
α = y intercept
e.g. y = 2.5x + 5 (gradient = 2.5, y intercept = 5)
60. In regression testing, you input loads of data points (like a scatter)
and the test puts a line through it…
You get a correlation coefficient (r) that describes the strength and
direction of the correlation between the variables
r = 1. Perfect positive correlation.
r = 0. No correlation.
r = -1. Perfect negative correlation.
There are parametric and non-parametric tests of correlation:
Parametric = Pearson's Correlation Coefficient
Non-parametric = Spearman's Rank Correlation
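To make the line-fitting and r less abstract, here is a minimal Python sketch of least-squares fitting and Pearson's r written out by hand. The function name `fit_line_and_r` and the data points are made up; the points are chosen to lie exactly on y = 2.5x + 5.

```python
from math import sqrt

def fit_line_and_r(xs, ys):
    """Return (alpha, beta, r) for the least-squares line y = alpha + beta*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                  # gradient
    alpha = mean_y - beta * mean_x    # y intercept
    r = sxy / sqrt(sxx * syy)         # Pearson's correlation coefficient
    return alpha, beta, r

xs = [0, 1, 2, 3, 4]
ys = [2.5 * x + 5 for x in xs]        # made-up points exactly on y = 2.5x + 5

alpha, beta, r = fit_line_and_r(xs, ys)
print(alpha, beta, r)
```

Because every point sits on the line, r comes out as a perfect positive correlation; real data scatter around the line and give |r| < 1.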
61. Multiple Linear Regression
However, an outcome often depends on several variables at once.
Modelling them together is called multiple linear regression.
This is better at accounting for confounding factors.
Simple linear regression: (what we just did)
y = α + βx
Multiple linear regression:
y = α + β1x1 + β2x2 + β3x3 + … + βnxn
The key rules are:
1. If the lines cross, the variables must interact with each other
2. If lines do not cross (have the same β), the interaction can be ignored
62. Interactions of confounding factors can be positive or negative.
[Figure: lines A and B]
i.e. "More than the sum of its parts"
63. Model Selection
For MLR, the best covariates need to be selected to increase the
likelihood of statistical significance.
Therefore, model selection must take place.
For model selection, there are 4 options:
1. Forward: Speed dating
2. Backward: Big brother
3. Stepwise: Clothes shopping
4. Collet: Speed dating + Big brother + Speed dating
64. Types of Regression
1. Linear regression = uses absolute numbers
2. Logistic = uses odds ratios. For binary categorical variables.
3. Cox = uses hazard ratios. Used in survival analysis.
Remember!
For ratios = if the CI does not cross 1 it is statistically significant
For numbers = if the CI does not cross 0 it is statistically significant
65. Survival analysis
These are "time to event" analyses.
Just measure the time to an event!
In survival analysis, the event is usually
death of the patient.
66. Censoring is when a patient's time to event cannot be fully documented:
• Right censoring (more common) = when the patient is lost to follow-up
• Left censoring = when the event has happened, but start time was unknown
67. Survival analysis graphically
compared between populations
When comparing the survival
of one pop. to another,
you use a Kaplan-Meier
graph.
This is non-parametric,
and the p-value is determined
by a log-rank test.
The assumption of proportional
hazards holds if the 2 curves
DO NOT cross.
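For a feel of where the steps on a Kaplan-Meier graph come from, here is a rough Python sketch of the estimator for one group. The function name `kaplan_meier` and the five patients are made up; right-censored patients simply leave the "at risk" pool without causing a step.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time an event occurs.

    times  = follow-up time for each patient
    events = True if the event (e.g. death) was observed,
             False if the patient was right-censored at that time
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk   # step down at each event time
            curve.append((t, survival))
        at_risk -= leaving                     # events and censored both leave
    return curve

# Made-up data: months of follow-up; False = lost to follow-up (censored)
times = [2, 3, 3, 5, 8]
events = [True, True, False, True, False]

for t, s in kaplan_meier(times, events):
    print(f"month {t}: survival = {s:.2f}")
```

Comparing two such curves with a log-rank test is a separate step not shown here.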
68. Summary: Statistical Tests
CONTINUOUS VARIABLES
1 variable:
Parametric = 1 sample t test
Non-parametric = Sign or WSRT
2 paired variables:
Parametric = Paired sample t test
Non-parametric = WSRT
2 independent variables:
Parametric = 2 sample t test
Non-parametric = Mann Whitney U test
>2 paired variables:
AUC
>2 independent variables:
Parametric = One-way ANOVA
Non-parametric = Kruskal-Wallis test
CLASSIFICATION TABLES
5 or more = Chi-squared test
<5 = Fisher's exact test
Paired = McNemar's test
REGRESSION TESTING
y = α + βx, correlation coefficients
indicate the strength of the correlation
Pearson's Correlation Coefficient
Spearman's Rank Correlation
When the lines for confounders cross, they interact.
Forward, Backward, Stepwise and Collet
methods of model selection for MLR
1. Linear
2. Logistic = ORs (binary variables)
3. Cox = HRs (survival analysis)
Survival analysis = time to events.
Kaplan Meier graphs are non-parametric.
P-value is used to determine if H0 is rejected or not. Type I and Type II errors.
Power = 1 - Type II error rate.
75. Types of Risk
1. Absolute Risk
2. Relative Risk
3. Odds Ratio
1. Absolute Risk
Make a quadrant with a, b, c, d in order
and then filling the 2-way table is easy.
              Lung cancer   No lung cancer
Smoker        a = 16        b = 84
Non-smoker    c = 4         d = 96
Now, remember what each letter
stands for and apply it to the formulas.
Absolute Risk (a) = a / (a + b)
= 16/100
= 16%
Therefore, 16% chance of getting lung cancer if you
are a smoker (made up numbers)
76. 2. Relative Risk
RR = Absolute Risk (a) / Absolute Risk (c)
= [a/(a+b)] / [c/(c+d)]
= 0.16/0.04
= 4
Smokers have 4x the risk of getting lung
cancer relative to
non-smokers.
3. Odds Ratio
OR Lung Cancer = Odds A / Odds B
= [a/c] / [b/d]
= ad/bc
= (16 x 96) / (84 x 4)
= 1536 / 336
= 4.57
Smokers have 4.57x the
odds of developing lung cancer.
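The three risk measures can be checked with the slide's made-up smoking numbers (a = 16, b = 84, c = 4, d = 96, read off the calculations above). This is just a sketch in Python; the variable names are illustrative.

```python
# 2-way table (made-up numbers from the slides)
a, b = 16, 84   # smokers:     lung cancer, no lung cancer
c, d = 4, 96    # non-smokers: lung cancer, no lung cancer

absolute_risk_smokers = a / (a + b)
absolute_risk_non_smokers = c / (c + d)
relative_risk = absolute_risk_smokers / absolute_risk_non_smokers
odds_ratio = (a * d) / (b * c)

print(f"Absolute risk (smokers) = {absolute_risk_smokers:.0%}")
print(f"Relative risk = {relative_risk:.2f}")
print(f"Odds ratio = {odds_ratio:.2f}")
```

Note that the odds ratio (4.57) and the relative risk (4) are close but not the same number.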
87. Diagnostic Tests
✓ Sensitivity = a/(a+c)
✓ Specificity = d/(b+d)
✓ PPV = a/(a+b)
✓ NPV = d/(c+d)
Prevalence =
No. of diseased people / total population
( (a + c) / (a + b + c + d) )
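Here is a small Python sketch of these formulas on a made-up 2-way table (rows = test result, columns = disease status, so a = true +ve, b = false +ve, c = false -ve, d = true -ve). PPV is taken as a/(a+b), i.e. the proportion of positive tests that are true positives. The function name and all the counts are invented for illustration.

```python
def diagnostic_summary(a, b, c, d):
    """a = true +ve, b = false +ve, c = false -ve, d = true -ve."""
    return {
        "sensitivity": a / (a + c),          # diseased people who test +ve
        "specificity": d / (b + d),          # healthy people who test -ve
        "ppv": a / (a + b),                  # +ve tests that are correct
        "npv": d / (c + d),                  # -ve tests that are correct
        "prevalence": (a + c) / (a + b + c + d),
    }

# Made-up counts: 1000 people, 100 of whom have the disease
results = diagnostic_summary(a=90, b=30, c=10, d=870)
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```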
88. Diagnostic Tests (continued)
+ve Likelihood Ratio = Sensitivity / (1 - Specificity)
The probability of a person who has the disease testing positive DIVIDED BY
The probability of a person who does not have the disease testing positive
-ve Likelihood Ratio = (1 - Sensitivity) / Specificity
The probability of a person who has the disease testing negative DIVIDED BY
The probability of a person who does not have the disease testing negative
e.g., for +ve Likelihood Ratio = Sensitivity / (1 - Specificity)
If on a graph…
Y = Sensitivity
X = 1 - Specificity
M (dy/dx) = +LR
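The two likelihood ratios are quick to compute once sensitivity and specificity are known. A minimal Python sketch follows; the function name and the 90% / 80% figures are made up.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (+ve LR, -ve LR) from sensitivity and specificity (as proportions)."""
    positive_lr = sensitivity / (1 - specificity)
    negative_lr = (1 - sensitivity) / specificity
    return positive_lr, negative_lr

# Made-up test: 90% sensitive, 80% specific
positive_lr, negative_lr = likelihood_ratios(0.9, 0.8)
print(f"+ve LR = {positive_lr:.2f}, -ve LR = {negative_lr:.3f}")
```

A good test has a +ve LR well above 1 and a -ve LR well below 1.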
89. Receiver Operating Characteristic (ROC) Curves
+ve Likelihood Ratio = Sensitivity / (1 - Specificity) (m = dy/dx)
Therefore, the gradient is +ve LR
The "fatter" the better.
The closer the area under the curve is
to 1, the better. Why?
92. Methods of agreement
For when you have 2 ways to measure something but don't know which is better!
It's like diagnostic tests, but there isn't a gold standard.
2 continuous variables
e.g., Are nurses or doctors better at measuring BP?
Bland-Altman method
2 categorical/binary variables
e.g., Are oncologists or pathologists better at identifying the presence of symptom X?
Kappa Coefficients
93. Summary: Risk and Diagnostic Tests
1. Risks:
• Absolute risk
• Relative Risk (down left column)
• Odds Ratio (across top row)
2. Diagnostic Tests:
• Sensitivity = a/(a+c),
• Specificity = d/(b+d),
• PPV = a/(a+b),
• NPV = d/(c+d)
• Prevalence = (a+c)/(a+b+c+d)
3. +ve Likelihood Ratio = Sensitivity / (1 - Specificity),
-ve Likelihood Ratio = (1 - Sensitivity) / Specificity
94. 4. Receiver operating characteristic curves are used to help set the threshold
at which a test turns +ve for a disease.
→ You alter the threshold depending on the ideal balance of
sensitivity to specificity.
5. Lower threshold if you want: higher sensitivity but lower specificity.
6. Increase threshold if you want: higher specificity but lower sensitivity.
7. Methods of agreement include the:
• Bland-Altman method (continuous)
• Kappa Coefficients (categorical/binary)
99. Some main biases/fallacies
Regarding diagnostic tests
• Spectrum Bias: selecting a group of patients who have a higher prevalence of disease,
e.g. the elderly (not random)
• Verification bias: only some patients, those in
whom you suspect the disease, get the gold standard (not random).
• Incorporation bias: gold standard includes part of index test
• Differential bias: only some patients with the index test get the reference test
• Observer Bias: the observer is not blinded and knows something about the diagnosis and test
Some more general ones (there are lots!)
• Selection Bias: Incomplete randomisation, at any stage
• Appeal to nature
• Regression to the mean: Isn't really a bias, but ignoring it can lead to falsehoods
• Ecological fallacy
• Attrition Bias
100. Appeal to nature:
Thinking that compounds/molecules in nature are somehow better than identical
compounds/molecules artificially synthesised.
Because something is more natural, it is more morally acceptable or desirable
(often propagated by social darwinists).
Would you prefer Vitamin C extracted from an orange or Vitamin C
synthesised in a lab?
"I'm a meat eater because it is natural to do so"
101. Regression to the mean:
If you did really badly at something once, you are more likely to do better the second
time (and regress to the mean) or vice versa.
e.g., A footballer that does really well one season (Vardy?) is more likely to do worse the
next season (because the previous season's success had a large component of luck).
HOWEVER, we would tend to attempt to give a causal reason for Vardy having a
worse next season e.g. His rise in fame has gone to his head, leading to a drop in form
the following season.
It is instead more likely to be just a regression to the mean (humans love to give
causal reasons for phenomena - even if it is just down to luck)
102. Ecological fallacy:
Inferences about the nature of individuals are deduced from the group to
which those individuals belong.
"I don't like The Moose. You
are part of The Moose.
I don't like you"
Exhibits an ecological fallacy.
"I like individuals who are part of The Moose.
I just don't like them when they're all together"
Does not exhibit an ecological fallacy.
103. Deduction vs Induction
The difference between deduction and induction came up last year
Deduction (top-down logic)
Made of premises. If the premises hold true, the conclusion is 100% correct logically.
Premise 1: All boys eat parsnips
Premise 2: Phil is a boy
Conclusion: Phil eats parsnips
(Valid but not sound)
Premise 1: Bachelors are single males
Premise 2: Max is a single male
Conclusion: Max is a Bachelor
(Valid and sound)
Induction (bottom-up logic)
You observe a phenomenon (conclusion) and then try to explain it.
This is probabilistic: scientifically, you can't disprove anything.
e.g. the cookie monster or unicorns.
The burden of proof is on whoever is making the claim.
You can only provide a probability that something/a phenomenon exists.
104. Types of studies
1. Interventional Studies
2. Observational Studies
3. Quantitative Studies
+ a few extra types
105. Types of studies
Interventional studies
1. Randomised Control Trial:
RCT for individuals /
Cluster RCTs for groups, e.g. general practices or schools
2. Open label/single-blind/double-blind trials
3. Controlled study. There is a control group who have no
intervention, for comparison
4. Cross-over study: paired data. The patient is their own control.
(this is relevant for tests e.g. paired sample t test, WSRT, AUC)
e.g. BP before and after exercise.
106. Types of studies
Observational studies
1. Case-control study. Retrospective and therefore utilises odds ratios
(OR). Group have Y. What are the odds that they were exposed to X?
2. Cohort study. Prospective and therefore utilises relative risk (RR).
Group have been exposed to X. What is their risk of getting Y?
3. Cross sectional study. Find population of interest, sample them
(cross sectional) and calculate point prevalence
e.g. STIs in students
4. Ecological study. Examines the population as a whole rather than individuals. Not as good as case-control
or cohort.
5. Longitudinal study. Repeated observations over a long period of time.
107. Types of studies
Quantitative
1. Service evaluation: evaluate a service/initiative
2. Audit: evaluate a product, person, system, organisation etc
3. Economic Analysis: cost-benefit analysis.
This is done as part of health economics.
Other terms
1. Translational Research Study: for the translation from bench to bedside
2. Action Research: Research that is fast tracked to solve an
immediate problem e.g. swine flu, ebola etc
108. Health Economics
Some terms that will be useful to learn… these came up last year
Opportunity Cost = spending resources on one activity means sacrificing
resources that could have been spent elsewhere
£/QALY = Price per quality adjusted life year. A QALY is scored on a scale where 0 =
death and 1 = one disease-free life year.
Tests:
1. Cost-minimisation analysis: only considers cost and cheapest option is taken
2. Cost-benefit analysis: considers cost and all health/non-health effects
3. Cost-consequence analysis: shown only when best treatment decided
4. Cost effectiveness analysis: £ per life year (not as good as QALY)
5. Cost utility analysis: £ per quality adjusted life year (this is the best form)
109. Some institutions/terms to learn:
NICE = Sets guidelines. Their cost-effectiveness threshold is around
£20,000 per QALY. No more than £30,000, as the money is better spent elsewhere
PPI = Patient and public involvement. Participation vs Involvement
Participation = they are involved in the study, as part of the sample
Involvement = they are involved in the study, giving advice on how the study is done
NIHR = National Institute for Health Research.
Bridges gap between Universities and industry.
Look these up, and more!
113. Meta-analysis
A meta-analysis statistically combines the results of the studies found in a systematic review of all of the literature on a particular topic.
They are shown as forest plots.
Each box size represents the weight (roughly, the power) of that study.
The lines represent the upper and lower limits of a 95% CI.
The diamond sums it all up.
You can note the P value.
114. Meta-analyses are the gold standard for reviewing the literature.
HOWEVER, they are affected by the publication bias of the papers.
Funnel plots are used to determine if publication bias has likely occurred.
e.g. Pharmaceutical companies!
115. Summary: Bias, Types of Study and Other
1. There are loads of biases, but try to learn a few so you can recognise them in
the exam.
e.g. Selection, Publication, Appeal to Nature, Ecological fallacy etc
2. Interventional studies include: RCT (individuals), Cluster RCT (groups),
open label/single blind/double blind, Cross-over (paired data), Controlled
3. Observational studies include: Case-control (look to past for odds of
exposure), Cohort (calculate risk for potential future disease), Cross-
sectional (point prevalence), Ecological and Longitudinal
116. 4. Quantitative studies include: Service evaluation, Audits and Health
Economics studies (such as Cost-Utility Analysis)
5. Meta-analyses are the gold standard for reviewing the literature, but are
prone to publication bias. Published papers tend to display positive results
6. Publication bias is shown by a funnel plot.
121. 30 QUESTION QUIZ
Check original PowerPoints in 2-slide view to carefully see
any added notes and see why the other options are NOT the
answer.
Retry the Questions again via "Empty RMH Quiz" doc.
123. Which of the following is an example of a nominal variable?
Question 1
124. For height and gender, what graphical display should be used?
Question 2
125. For IQ and weight, what graphical display should be used?
Question 3
126. Data that is normally distributed is analysed using a parametric
test. What values should be stated at the end of the test?
Question 4
127. A confidence interval is best described as what?
a) A range for which you are x% confident that the sample
median lies within
b) A range for which you are x% confident that the
population standard deviation lies within
c) A range for which you are x% confident that the sample
mean lies within
d) A range for which you are x% confident that the
population mean lies within
e) An interval of time for which someone is x% confident
Question 5
128. What is the general rule to determine statistical significance
of all confidence intervals?
a) If the interval between the lower and upper limits crosses 1, it is statistically insignificant.
If it does not cross 1, it is statistically significant.
b) If the interval between the lower and upper limits crosses 1, it is statistically significant. If
it does not cross 1, it is statistically insignificant
c) If the upper limit is twice as large as the lower limit, it is statistically
significant. If it is less than twice as large, it is statistically insignificant
d) If a test is not clinically significant, it is therefore not statistically significant
either.
e) None of the above
Question 6
130. To have a lack of precision coupled with
bias leads to what errors?
a) Lack of precision leads to systematic error.
Bias leads to publication error
b)Lack of precision leads to random error. Bias
leads to systematic error
c) Lack of precision leads to systematic error.
Bias leads to ecological error
d)Lack of precision leads to random error. Bias
leads to publication error
e) Lack of precision leads to ecological error.
Bias leads to inductive error
Question 8
131. The best type of validity is undertaken when comparing a test to a
gold standard. What type of validity is this?
a) Content validity
b) Criterion validity
c) Face validity
d) Construct validity
e) Closed validity
Question 9
132. The reliability coefficient is dependent on what 2 variables?
a) Variance of the test and cost of the test
b) Repeatability and clinical significance
c) Variance of the test and variance of repeat
d) Variance of the test and clinical significance
e) Repeatability and variance of repeat
Question 10
133. How is the P-value best described?
a) The probability that, given the results, the H1 is correct
b) The probability that there may be a link if further tests are taken
c) The probability that, given the results, the H0 is correct
d) The probability that there is sometimes a link and sometimes no link
e) None of the above
Question 11
134. Type I (α) errors are false positives. However, Type II (β) errors are
false negatives.
a) True
b) False
Question 12
135. What is the most suitable statistical test for the following data?
a) McNemar's test
b) Fisher's exact test
c) Chi-squared test
d) 2 sample T test
e) Paired sample T test
Question 13
136. What is the most suitable statistical test for the following non-parametric
data? Analysing [glucose]plasma of type 2 diabetics, before and after a meal
a) 1 sample t test
b) Sign test
c) Area under the curve
d) Wilcoxon Signed Rank test
e) Kruskal-Wallis test
Question 14
137. What is the most suitable statistical test for the following parametric data?
Testing the [oxytocin]plasma between pregnant women and non-pregnant
women
a) One way ANOVA
b) Mann-Whitney U
c) Paired sample t test
d) 2 sample t test
e) Dunnett's test
Question 15
138. Estimate the correlation coefficient of the following scatter
a) r = 0.00
b) r = 0.30
c) r = 0.60
d) r = 0.90
e) r = 1.00
Question 16
139. For multiple linear regression, what occurs if lines do not have the
same slope (β)?
a) Nothing. The slope has no relevance
b) They cross, leading to no net effect
c) They cross, cancelling each other out
d) They cross, leading to an addition effect
e) They cross, leading to an interaction effect
Question 17
140. Which model selection process, for MLR, consists of putting all
the functions into the model. Taking out the worst. Taking out the
second worst. Then putting the first worst back in again?
a) Stepwise
b) Backward
c) Forward
d) Collet
e) All of the above
Question 18
141. There are 3 types of regression testing. Which type is used for
binary variables? Also, what output does it analyse?
a) Linear regression, absolute numbers
b) Logistic regression, hazard ratio
c) Logistic regression, odds ratio
d) Cox regression, hazard ratio
e) Cox regression, odds ratio
Question 19
142. Cox regression uses hazard ratios for survival analysis (time to
event analysis). How is this data graphically displayed? What type
of data is it? What is the statistical test?
a) Kaplan-Meier graphs, parametric, log-rank test
b) Kaplan-Meier graphs, non-parametric, Kruskal-Wallis test
c) Scatter graph, non-parametric, log-rank test
d) Kaplan-Meier graph, non-parametric, log-rank test
e) Scatter graph, parametric, Kruskal-Wallis test
Question 20
143. What is the relative risk of a smoker developing heart disease
compared to a non-smoker?
a) 5
b) 1/9
c) 1/2
d) 5/2
e) 10
Question 21
What are the odds of a smoker
developing heart disease compared
to non-smokers?
a) 1/8
b) 2/5
c) 1/2
d) 1
e) 8/3
Question 22
144. Positive predictive value may be defined as…
a) The probability, given a positive test, the person has the disease
b) The probability of a diagnostic test giving a person without the disease
a positive result
c) The probability, given a positive test, the person does not have the
disease
d) The probability of a diagnostic test giving a person without the disease
a negative result
e) None of the above
Question 23
145. The gradient of a ROC that has 1-sensitivity as the dependent
variable (y) and specificity as the independent (x) is what?
a) +ve Likelihood Ratio
b) -ve Likelihood Ratio
c) Positive Predictive Value
d) Negative Predictive Value
e) None of the above
Question 24
146. How does lowering the diagnostic threshold for a disease affect
sensitivity, specificity and +LR?
a) Decreases sensitivity, decreases specificity and decreases +LR
b) Increases sensitivity, increases specificity and decreases +LR
c) Increases sensitivity, increases specificity and increases +LR
d) Increases sensitivity, decreases specificity and decreases +LR
e) Decreases sensitivity, increases specificity and decreases +LR
Question 25
147. Which of the following is true for
inductive arguments?
a) They are the only types of arguments
used in scientific enquiry
b) If their premises are true, the
conclusion follows logically
c) They can be especially prone to
ecological fallacy when concluding
d) They follow from an observation and
are probabilistic
e) They can be valid but unsound if their
premises are incorrect
Question 26
150. Questions 27, 28 and 29
a) Randomised Control Trial
b) Ecological study
c) Cross sectional study
d) Service Evaluation
e) Cost-effectiveness analysis
f) Cost-utility analysis
g) Questionnaire
h) Cluster Randomized Control Trial
i) Longitudinal Study
j) Audit
k) Cost-minimization analysis
l) Case control study
151. Meta-analyses are vulnerable to publication bias.
What type of graph exposes publication bias?
a) Bamboo plots
b) Dot plots
c) Forest plots
d) Kappa coefficients
e) None of the above
Question 30
153. Extra Notes
Cross sectional study = takes a cross section and determines point prevalence.