CLINICAL AUDIT AND STATISTICS
Dr. Nadir Mehmood
At the end of this discussion, we will all be able to:
1. Define audit and its clinical importance
2. Enumerate the types of audit and use of statistics in
3. Enumerate the components of clinical audit
4. Define statistics and distinguish between population
5. Use real life examples to illustrate the goal of
6. Define and calculate different descriptive statistics
7. Calculate AUC of a normal distribution
“I shut my eyes in order to see.”
Paul Gauguin (1848-1903)
1. What is audit?
2. What is medical audit?
3. Why audit?
4. Audit versus research
5. The quality cycle
6. Use of statistics in medical audit
What is audit?
Evaluation of data, documents and
resources to check that the performance of systems
meets specified standards.
Audit in the wider sense is simply a tool to find
out what you do now; this is often compared
with what you have done in the past, or what you
think you may wish to do in the future.
What is medical audit?
“A quality improvement process that seeks to improve
patient care and outcomes through systematic review of
care against explicit criteria and the implementation of
change.” (NICE guides)
• An audit is a cyclical process
- collecting data,
- identifying areas for improvement,
- making necessary changes
- back round to defining new standards.
• Maintain participant and staff safety.
• Maintain data quality.
• Protect reputation of staff, host and sponsor
• Protect current and future funding
• Improve quality.
• It does not involve experiments
• It uses data that already exists
Audit:- are we doing the best thing in the
• Measures current practice against specific standards
• Never experimental
• Uses data in existence by virtue of practice
• May require ethical approval
• Aims to improve delivery of patient care
Research:- What is the best thing to do/the
best way to do it
• Provides sound basis for medical audit
• Involves experimental trials
• Uses detailed data collection
• Needs ethical approval and registration
• Aims to add to body of scientific knowledge
• LET'S DISCUSS THE FOLLOWING STATEMENTS
• Audit usually consumes an extensive amount of
resources (of time, money etc.).
• Rare conditions should be audited.
• The higher the amount of data the practitioner
collects, the easier is the decision making process in
• The most challenging stage in Audit is implementing
• The agreed standards can be reset at realistic
percentages after the first round of data collection.
When to Use What
Method When to use it Why
Research Good practice is not
To define good
Data Collection or
Audit Good practice is
defined but we want
to know how much
we are sticking to it
To improve current
Does Audit Lead to Change?
• Hearnshaw et al, BJGP 1998
• Of 1257 audits
• 80% on clinical care
• 65% led to change
WHAT CLINICAL AUDIT
• Big brother
• A cost cutting exercise
• Worthless and a waste of time
Nor is it…
• Buying computers
• Buying software
• Having others put in all the
data and you pushing the
• Extra work
• Optional work
We don’t need to
I’m too busy
If only we had
Can’t you do it
• We use different languages
• We have different purposes for doing an audit
• We report differently
• We need to recognise diversity
• We have different expected outcomes
Don’t change for change’s sake
DON’T RE-INVENT THE WHEEL
Clinical Audit Cycle
5. Analysis and
Do I need to do a complex statistical
• Generally no, unlike research, most
audits will not require heavy number
• A simple graphical display is often the
most effective method of sharing your
• Don’t be tempted to overcomplicate
things just because your computer will
• Topics of audit need to be chosen with care
• Refined to make them suitable
• Standard setting requires clarity of thought
• Data collection to observe practice can
consume endless time and money
• Lasting change is notoriously difficult to
• Statistics is the art/science of summarizing data
• Better yet…summarizing data so that non-statisticians
can understand it
• Clinical investigations usually involve collecting a lot of
• But, at the end of your trial, what you really want is a
– Did the new treatment work?
– Are the two groups being compared the same or different?
– Is the new method more precise than the old method?
• Statistical inference is the answer!
Do you need a statistician as part of your
clinical research team?
• Simplest reasons: s/he will help to optimize
– Interpretation of results
THERE ARE LIES, DAMN LIES AND STATISTICS
British Prime Minister Benjamin Disraeli
Popularized by Mark Twain
have been done
on misuse of
statistics in medicine
“It is easy to lie with
statistics, but it is easier to lie
without them.” Frederick Mosteller
ANTECEDENT(A) BEHAVIOUR(B) CONSEQUENCE(C) POSSIBLE
on, access to
Student (PG) in
texting on cell Teacher says, “Mr. X, stop that”
ABC OF Behaviour
Basic Components of Research
Starts with a hypothesis or “educated
–Not all hypotheses are testable.
–Hypotheses in science are formulated so
that they are testable.
Statistical versus Clinical Significance
• Statistical methods – branch of mathematics
– Helps to protect against biases in evaluating data
• Statistical vs. clinical significance
– Statistical significance – are results due to chance?
– Clinical significance – are results clinically
– Statistical significance does not imply clinical
• Balancing statistical versus clinical
–Evaluate effect size
–Evaluate social validity
• Generalizability and the patient uniformity
• The “average” client
Consider flipping a coin and recording the
relative frequency of heads.
When the number of
coin flips is small, there
is a lot of variability in
the relative frequency of
“heads” (as shown in
What do you notice in
the graph at the right?
The graph at the right
shows the relative
frequency when the
coin is flipped a large
number of times.
What do you notice
in this graph at the right?
Law of Large Numbers
Notice how the relative
frequency of heads approaches ½
the larger the number of trials!
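The coin-flip behaviour described above can be reproduced with a short simulation. This is only a sketch: the function name `relative_frequency_of_heads`, the seed and the chosen numbers of flips are illustrative assumptions, not part of the original slides.

```python
import random


def relative_frequency_of_heads(n_flips, seed=0):
    """Simulate n_flips fair coin flips and return the fraction that were heads."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


# Small samples wander; large samples settle close to 0.5.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"{n:>7} flips: relative frequency of heads = {relative_frequency_of_heads(n):.4f}")
```

With few flips the printed frequency can sit well away from ½; as the number of flips grows it should settle near ½, which is what the Law of Large Numbers predicts.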
Types of statistics / analyses
DESCRIPTIVE STATISTICS: Describing phenomena
Frequencies: How many…
Basic measurements: Meters, seconds, cm³, IQ
INFERENTIAL STATISTICS: Inferences about phenomena
Hypothesis testing: Proving or disproving theories
Confidence intervals: Whether the sample relates to the larger population
Correlation: Associations between phenomena
Significance testing: e.g. diet and health
Statisticians Require Precise Statement
of the Hypothesis
• H0: There is no association between the
exposure of interest and the outcome
• H1: There is an association between the
exposure and the outcome.
– This association is not due to chance.
– The direction of this association is not typically
• Directional (H1)
– Physical activity program will affect body
composition such that physically active
individuals will lose more fat than sedentary
• Null (H0)
– Physical activity will not affect body
– Physical activity will affect body composition.
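One way to see how a directional hypothesis is tested is an exact permutation test. The sketch below is an illustration only: the fat-loss figures, the group sizes and the function name `one_sided_permutation_p` are hypothetical, and a permutation test is a technique chosen here for simplicity, not one named in the slides.

```python
from itertools import combinations

# Hypothetical fat loss in kg (invented for illustration only).
active = [3.1, 2.8, 3.6, 2.9, 3.3]
sedentary = [1.2, 1.9, 0.8, 1.5, 2.0]


def one_sided_permutation_p(group_a, group_b):
    """P value for H1: mean(group_a) > mean(group_b), under H0 of no difference.

    Every way of relabelling the pooled values into two groups of the
    original sizes is enumerated; the P value is the share of relabellings
    whose mean difference is at least as large as the observed one.
    """
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)

    def mean_difference(indices_a):
        sum_a = sum(pooled[i] for i in indices_a)
        return sum_a / n_a - (total - sum_a) / n_b

    observed = mean_difference(range(n_a))
    splits = list(combinations(range(len(pooled)), n_a))
    # Small tolerance guards against floating-point rounding in the comparison.
    extreme = sum(mean_difference(s) >= observed - 1e-12 for s in splits)
    return extreme / len(splits)


p_value = one_sided_permutation_p(active, sedentary)
print(f"one-sided P value = {p_value:.4f}")
```

Because every hypothetical active value exceeds every sedentary value, only one of the 252 possible relabellings is as extreme as the observed one, so the P value is 1/252.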
…permit the researcher to describe
many pieces of data with a few
Types of descriptive statistics…
2. measures of central tendency
3. measures of variability
…representations of data enabling the
researcher to see what the
distribution of scores looks like
measures of central tendency...
…indices enabling the researcher to
determine the typical or average
score of a group of scores
2. Measures of central tendency…
Mode: the score attained by more
participants than any other score
Median: the point in a distribution above and
below which are 50% of the scores
Mean: the arithmetic average of the scores
Mean versus Median
• Large sample values tend to inflate the mean.
This will happen if the histogram of the data
• The median is not influenced by large sample
values and is a better measure of centrality if
the distribution is skewed.
• Note if mean=median=mode then the data
are said to be symmetrical
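The effect of one large value on the mean, but not on the median, can be checked with a few lines of Python. The data below are hypothetical lengths of stay invented for illustration; only the standard-library `statistics` module is used.

```python
import statistics

# Hypothetical lengths of stay in days; the single value of 30 skews the data.
length_of_stay = [2, 3, 3, 4, 5, 6, 30]

mean_stay = statistics.mean(length_of_stay)      # pulled upwards by the 30
median_stay = statistics.median(length_of_stay)  # middle value, unaffected by the 30
mode_stay = statistics.mode(length_of_stay)      # most frequent value

print(f"mean   = {mean_stay:.2f}")   # mean   = 7.57
print(f"median = {median_stay}")     # median = 4
print(f"mode   = {mode_stay}")       # mode   = 3
```

Here the mean is almost twice the median, which is the pattern to expect when a distribution is skewed by a few large values.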
• Normal (symmetrical) Distribution (bell shaped)
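The area under a normal curve between two points can be calculated from the cumulative distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `area_between` and the blood-pressure parameters are assumptions made for illustration.

```python
from statistics import NormalDist


def area_between(low, high, mu=0.0, sigma=1.0):
    """Area under a normal curve with mean mu and SD sigma between low and high."""
    dist = NormalDist(mu=mu, sigma=sigma)
    return dist.cdf(high) - dist.cdf(low)


# Area within 1, 2 and 3 standard deviations of the mean of a standard normal.
for k in (1, 2, 3):
    print(f"within {k} SD: {area_between(-k, k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973

# Hypothetical example: a measurement with mean 120 and SD 15.
print(f"{area_between(105, 135, mu=120, sigma=15):.4f}")  # 0.6827, again 1 SD either side
```

The total area under the curve is 1, so each result can be read directly as a proportion of the population.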
• The larger the sample size the greater the
• The larger the effect size the greater the
• The larger the significance level the greater
What to do when you need more
• Increase sample size
• Reduce number of variables
• Show your data graphically
P Values and Statistical Significance
• Based on notion that we can disprove, but not prove, things.
• Therefore, we need something to disprove.
• Let's assume the true effect is zero: the null hypothesis.
• If the value of the observed effect is unlikely under this
assumption, we reject (disprove) the null hypothesis.
• "Unlikely" is related to (but not equal to) a probability or P
• P < 0.05 is regarded as unlikely enough to reject the null
hypothesis (i.e., to conclude the effect is not zero).
– We say the effect is statistically significant at the 0.05 or 5% level.
– Some folks also say "there is a real effect".
• P > 0.05 means not enough evidence to reject the null.
– We say the effect is statistically non-significant.
– Some folks accept the null and say "there is no effect".
• Problems with this philosophy
– We can disprove things only in pure mathematics, not in real
– Failure to reject the null doesn't mean we have to accept the
– In any case, true effects in real life are never zero. Never.
– So, THE NULL HYPOTHESIS IS ALWAYS FALSE!
– Therefore, to assume that effects are zero until disproved is
illogical, and sometimes impractical or even unethical.
– 0.05 is arbitrary.
• The answer? We need better ways to represent the
uncertainties of real life:
– Better interpretation of the classical P value
– More emphasis on (im)precision of estimation, through use of
confidence limits for the true value
– Better types of P value, representing probabilities of clinical or
practical benefit and harm
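To make the P value and confidence limit ideas concrete, the sketch below tests whether a coin is fair. The observed count (65 heads in 100 flips) is hypothetical, the function names are illustrative, and the Wald interval is one simple approximation chosen for brevity, not the only way to compute confidence limits.

```python
import math


def two_sided_binomial_p(heads, flips):
    """Exact two-sided P value for the null hypothesis P(heads) = 0.5.

    With a fair coin the distribution is symmetric, so the two-sided
    P value is twice the tail beyond the observed count (capped at 1).
    """
    k = max(heads, flips - heads)  # fold the result into the upper tail
    upper_tail = sum(math.comb(flips, i) for i in range(k, flips + 1)) / 2 ** flips
    return min(1.0, 2 * upper_tail)


def wald_95_interval(heads, flips):
    """Approximate 95% confidence limits for the true proportion of heads."""
    p_hat = heads / flips
    half_width = 1.96 * math.sqrt(p_hat * (1 - p_hat) / flips)
    return p_hat - half_width, p_hat + half_width


heads, flips = 65, 100  # hypothetical observation
p_value = two_sided_binomial_p(heads, flips)
low, high = wald_95_interval(heads, flips)
print(f"P value = {p_value:.4f}")
print(f"95% confidence limits for P(heads): {low:.3f} to {high:.3f}")
```

The P value only says how unlikely the result is if the coin were fair; the confidence limits additionally show how large the departure from ½ could plausibly be, which is the emphasis on precision of estimation argued for above.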
…a computer should not be used to
perform an analysis that a
researcher has never completed by
hand or, at least, studied
General QUESTION: ANSWERS
• HOW DO PEOPLE LET YOU
KNOW THEY ARE AT YOUR
DOOR AND WANT TO
• They ring the doorbell.
• They knock.
• They stand outside,
studying kinetics, until you
open the door for your own
A Possible Investigation:
Possible research questions
– How do people knock
on someone’s door?
– How many times do
– Do people speak when
Data Sources?
• Search literature and
results of previous
studies on this subject
• Survey people and ask
them how they knock
• Observe people as they
knock and record data
Study 1: American Knocking Practices
– People generally approach a residence and knock when
they wish to enter.
– Describe how people knock when at someone’s door.
– Review available data
– Design survey, experiment, interviews or some
– Descriptive Statistics
• Number of events observed (also known as “n” or sample size) was 35.
• Sheldon knocked between 0 and 30,000 (self-reported) times when approaching
• He used 1, 2, 6 and 30,000 knocks each one time. (The “1” was the robot)
• He knocked for Leonard, then Penny, 5 times, with one instance where he
knocked for Penny first.
• Penny knocked one time on Sheldon’s door, in this case she knocked three times.
• In one instance, he knocked, then approached an interior door where he knocked
a second time.
– Parametric Statistics
• The average number of knocks was 860.06 (mean)
• The most common number of knocks was 3 (mode)
• The median number of knocks was 3 (1, 2, 3, 6, 30000)
• The standard deviation of the number of knocks was 4997.46
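These summary figures can be reproduced from the counts given above. The sketch assumes that the 31 observations not listed individually were all three knocks; this is an inference from n = 35 and the reported mean, not something stated on the slide. It also assumes the population formula (`statistics.pstdev`), since that is the one that matches the reported standard deviation.

```python
import statistics

# Four one-off counts from the slide, plus 31 episodes of three knocks
# (assumed, so that n = 35 and the mean matches the reported 860.06).
knocks = [1, 2, 6, 30000] + [3] * 31

print(len(knocks))                          # 35
print(round(statistics.mean(knocks), 2))    # 860.06
print(statistics.mode(knocks))              # 3
print(statistics.median(knocks))            # 3
print(round(statistics.pstdev(knocks), 2))  # 4997.46 (population SD, divides by n)
```

The sample formula (`statistics.stdev`, dividing by n - 1) would give a somewhat larger value; either way, a single observation of 30,000 dominates both the mean and the standard deviation.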
– Without any other information, which of the following can we infer:
• In this sample, three knocks were used to alert the resident that someone was at
• People in general knock three times.
• Knocking three times is always effective in getting someone to answer the door.
• Tony Orlando and Dawn (http://www.youtube.com/watch?v=k7Jvsbcxunc) were wrong in the 70’s when they concluded that:
– You should knock three times on the ceiling…
– You should knock twice on the pipe if the answer is no…
• In our data, knocks were always associated with the calling out of a name and this
process was repeated.
• If someone is at your door and they knock three times, followed by your name
three times, and this is repeated three times, it is likely to be Sheldon.
• Sheldon has issues.
Let’s take one of these conclusions and explore
it more thoroughly from a statistical
• People in general knock 3 times.
– How would our results have changed if we had seen only
a subset of the data? (Smaller sample size…) For example
what if we missed the “flash” – how would the results
– The average number of knocks was 3 (mean)
– The most common number of knocks was 3 (mode)
– The median number of knocks was 3 (1, 2, 3, 6)
– The standard deviation of the number of knocks was
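The effect of leaving out the 30,000-knock episode can be checked in the same way. As before, this is a sketch under an assumption: the 31 observations not listed individually are taken to be three knocks each, and the population formula for the standard deviation is used.

```python
import statistics

# Same data as the full sample but without the single 30,000-knock episode.
# Assumption: the 31 unlisted observations were three knocks each.
knocks_without_outlier = [1, 2, 6] + [3] * 31

print(len(knocks_without_outlier))                          # 34
print(statistics.mean(knocks_without_outlier))              # 3
print(statistics.mode(knocks_without_outlier))              # 3
print(statistics.median(knocks_without_outlier))            # 3.0
print(round(statistics.pstdev(knocks_without_outlier), 2))  # 0.64
```

Dropping one observation leaves the mode and median unchanged but collapses the mean to 3 and shrinks the spread dramatically, which shows how sensitive the mean and standard deviation are to a single extreme value.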
Direction for future research:
• Good research always poses new questions.
• Additional research questions for this
– Is there a time when two knocks are sufficient?
– Are mechanical/technological means of knocking
just as effective as in person knocking?
– How hard would it be to find a new apartment?