Ch 2 basic concepts in pharmacoepidemiology (10 hrs)

2. Basic Concepts in Pharmacoepidemiology (10 hrs)
Outline
2.1. Overview of the Scientific Method
2.2. Study Designs in Pharmacoepidemiology
2.3. Qualitative Research in Pharmacoepidemiology
2.4. Measures of Association or Risk
2.5. Evaluation of evidence and errors in
pharmacoepidemiology studies
2.5.1. Overview
2.5.2. Types of Association Between Factors Under Study
2.5.3. Types of Errors That One Can Make in Performing a Study
2.5.4. Criteria for the Causal Nature of an Association
2.6. Sampling Considerations in pharmacoepidemiology
2.6.1. Sample Size Determination
2.6.2. Sampling Techniques

2.1. Overview of the Scientific Method
 The whole point of science is to UNCOVER
THE TRUTH
 Two tools to pursue SCIENTIFIC INQUIRY are
1. OUR SENSES – through which we experience
the world and make observations
2. OUR ABILITY TO REASON – which enables us
to make logical inferences

Scientific Method…
 In science we impose logic on observations
 There are two kinds of logic we impose:
 Deductive Inferences:
 From the GENERAL (theory) to the SPECIFIC, of the
observations
 Involving inferences from general principles
 Inductive Inferences:
 Also called logic of reasoning
 From SPECIFIC to the GENERAL.
 Proceeding from particular facts to a general conclusion
 We make many observations
 Discern a pattern,
 Make generalizations, and
 Infer an explanation

Scientific Method…
 The scientific method, from the perspective of the
inductive approach, is a three-stage process
1st stage: One studies a sample of study participants
2nd stage: One generalizes the information obtained
from this sample of study participants
 Drawing a conclusion about a population in
general
→ An association
3rd stage: One establishes causation
→ Making inferences

Example:
One might perform an RCT of the efficacy of methyldopa in
lowering BP, randomly allocating a total of 40 middle-aged
hypertensive men to receive either methyldopa or placebo
and observing their BP six weeks later. One might expect
that the BP of the 20 men treated with the active drug would
decrease more than the BP of the 20 men treated with placebo.
In this example
 The 40 study participants would represent the study sample
 One would make generalization that methyldopa lowers BP in
middle aged hypertensive men (establishing association).
 One must further explore whether this observation is by
chance or due to other factors.
 And further decide whether the association is causal in
nature or not.
RCT- Randomized Controlled Trial
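The random allocation step in the methyldopa example can be sketched as follows. This is a minimal illustration, assuming simple (unstratified) randomization into two equal arms; the function name `allocate`, the participant IDs 1 to 40 and the seed are hypothetical and not taken from any trial protocol.

```python
import random


def allocate(participant_ids, seed=None):
    """Randomly split participants into two equal-sized arms."""
    rng = random.Random(seed)          # local generator, so the split is reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "methyldopa": sorted(shuffled[:half]),
        "placebo": sorted(shuffled[half:]),
    }


# 40 hypothetical participant IDs; the seed value is arbitrary
arms = allocate(range(1, 41), seed=2024)
print(len(arms["methyldopa"]), len(arms["placebo"]))
```

Because allocation depends only on chance, known and unknown prognostic factors tend to be balanced between the two arms.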

Inductive Research Framework (flowchart)
1. Ask questions
2. Do background research
3. Construct hypothesis
4. Test with an experiment
5. Analyze results, draw conclusion
• Hypothesis is TRUE: report results
• Hypothesis is FALSE or partially true: report results, then think and try again (back to constructing a hypothesis)

2.2. Study Designs in Pharmacoepidemiology
 Pharmacoepidemiology applies the methods of
epidemiology to the content area of clinical
pharmacology
 Different study designs can be applied within
pharmacoepidemiology, all with their own
 Specific indications
 Advantages and
 Disadvantages

Epidemiologic Study Designs
Descriptive
• Case report
• Case series
• Ecologic
• Cross-sectional
Analytic
• Non-experimental: cohort, case-control
• Experimental: RCT
Fig. Types of Epidemiologic Study Designs

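Hierarchy of Designs and Strength of Evidence
Primary study designs, from strongest evidence (1) to weakest evidence (5):
1. Randomized Controlled Trial (RCT)
2. Cohort: Prospective or Retrospective
3. Case-Control: Prospective or Retrospective
4. Cross-Sectional
5. Case Reports/Case Series
6. Systematic Reviews, Meta-Analysis: secondary data analysis of primary
studies. These are not the weakest evidence; a well-conducted review of
RCTs is usually ranked above a single RCT.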

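Study Design (timeline figure: Past, Present, Future)
• Retrospective cohort: from the past up to the present
• Prospective cohort: from the present into the future
• Case-control (retrospective): looks back from the present into the past
• Cross-sectional: at the present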

Retrospective and Prospective Designs
 Retrospective Design
 The research question or hypothesis is conceived and
studied using data that were collected and recorded
previously (before the design of the current study).
 Prospective, or Longitudinal, Design
 The collection of data is planned in advance and
actually occurs after the study has begun.

Comparison of Retrospective and Prospective Approaches
Retrospective | Prospective
Inexpensive to conduct | Expensive to conduct
Completed in a shorter time period | Completed over a longer time period
Easier to access a larger number of subjects | More difficult to access subjects and usually requires a larger number of subjects
Allows results to be obtained more quickly | Exposure status and diagnostic methods for disease may change
Useful for studying exposures that no longer occur | Loss of subjects from the study over time may be substantial
Information and data may be less complete and less accurate | Information and data may be more complete and accurate
Subjects may not remember past information | Direct access to study subjects enhances reliability of data

Descriptive Studies
 Studies that describe the frequency and relative
distributions of health and disease in populations
 Attempt to uncover and portray the occurrence of the
condition or problem
 Useful in
 Signal detection
 Hypothesis generating
 Identifying previously unrecognised safety issues
 Explain the 5WH Questions (What, Who, Where,
When, How much)
 Person Place Time model

Analytical Studies
 The aspect of epidemiology concerned with the search
for health problem-related causes and effects
 Determine causes of health problems
 Based on natural exposure
 Use controls or comparison groups
 Focus on the determinants (causes) of diseases
 Ultimate goal is judging whether a particular exposure
causes or prevents disease
 Test cause–effect relationships

Case Reports
 Case reports are simply reports of events observed in
single patients
 In pharmacoepidemiology,
 A case report describes a single patient who was exposed
to a drug and experiences a particular, usually adverse,
outcome.
(Individual Assignment 10%)
 Case reports are useful
 For raising hypotheses about drug effects, to be tested
with more rigorous study designs.

Case Series
 Case series are collections of patients, all of whom
have a single exposure, whose clinical outcomes
are then evaluated and described.
 Examples
 Serious liver damage following use of XTC
 Birth defects after use of Thalidomide (Softenon)
XTC: 3,4-methylenedioxymethamphetamine (MDMA or 'Ecstasy')

LETTER TO THE EDITOR
McBride WG. The Lancet, December 16, 1961: page 1358

Ecological Studies (Correlational)
 The study of association between two factors
 Macro-level study (findings for groups cannot be assumed to hold for each individual)
 Studying the trend of health problems
 Examine trends in an exposure that is a presumed cause
and trends in a disease that is a presumed effect and
test whether the trends coincide.
 These trends can be examined over time or across
geographic boundaries.
 Example: the association between amount of anti-
asthma drug and the increase of asthma mortality
among American people
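Testing whether two trends coincide is often done with a correlation coefficient computed on group-level data. The sketch below is one way to do that, assuming a Pearson correlation is appropriate; the function `pearson_r` and the five pairs of regional values are hypothetical and are not taken from any real study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical group-level data: one pair of values per region, not per person
drug_sales = [10, 20, 30, 40, 50]          # e.g. units sold per 1,000 inhabitants
mortality = [1.1, 1.9, 3.2, 3.8, 5.0]      # e.g. deaths per 100,000 inhabitants
r = pearson_r(drug_sales, mortality)
print(f"r = {r:.3f}")
```

A high r at group level says nothing about whether the individuals who used the drug are the ones who died, which is why ecological findings only generate hypotheses.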

Fig. Ecologic Studies: Breast Cancer Incidence by National Fat Intake (hypothetical example)
x-axis: Fat intake (kcal/d), 500 to 1700; y-axis: Incidence per 100,000 p-y, 0 to 250
Countries plotted: Romania, Yugoslavia, Hong Kong, Israel, Italy, Hungary, Poland, Spain, Sweden, UK, N Zealand, France, Switzerland, USA

Cross-Sectional Studies
 Prevalence study
 A study design that shows concurrently existing
characteristics and health outcomes
 Information about the status of an individual with
respect to the presence or absence of exposure and
disease is assessed at a point in time.
 Cross sectional studies also show the picture of social,
environmental, or other problems or events in a
population.
 The point in time may be as short as few minutes or as
long as two or three months.
 The time frame of "point in time" is based on the speed of
data collection.

Case-Control Studies
 Comparing cases with a disease to controls
without the disease, looking for differences in
antecedent exposures.
 Comparing the diseased and non-diseased groups
 Look back in time to measure exposures of the study
subjects
 Retrospective study
 Rare diseases
 Ex. Limb defect and Thalidomide in German
babies

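Case-control studies (figure)
Direction of inquiry: from effect (disease status) back to cause (exposure)
• Cases, from the population at risk: Exposed - a, Unexposed - c
• Controls, from the population at risk: Exposed - b, Unexposed - d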

Steps in conducting case control study
Step 1: Define cases
Step 2: Select cases
Step 3: Select controls
• The controls should be similar to the cases except
that the cases have the disease or other outcome of
interest.
Step 4: Check the exposure status of individuals both in the
cases and controls
Step 5: Analysis
• Prepare 2X2 table
• Calculate Odds Ratio (OR)
• Perform statistical tests to check whether there is
a significant association
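The analysis step can be sketched as follows. This is a minimal illustration, assuming an unmatched design and using the Pearson chi-square test without continuity correction as the statistical test (the slides do not name a specific test). The counts are hypothetical; a, b, c and d follow the layout exposed cases, exposed controls, unexposed cases, unexposed controls.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 table (1 df, no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


CRITICAL_5PCT_1DF = 3.841  # chi-square critical value for alpha = 0.05 with 1 df

# Hypothetical 2x2 table
a, b = 30, 10   # exposed: cases, controls
c, d = 70, 90   # unexposed: cases, controls

odds_ratio = (a * d) / (b * c)
chi2 = chi_square_2x2(a, b, c, d)
significant = chi2 > CRITICAL_5PCT_1DF

print(f"OR = {odds_ratio:.2f}, chi-square = {chi2:.2f}, significant at 5%: {significant}")
```

For matched designs or small expected cell counts, other tests (for example McNemar's or Fisher's exact test) would be the usual choice.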

Example case-control study
 What is the risk of breast cancer with the use of
SSRI antidepressants?
 Cases:
 women with breast cancer
 Controls:
 women with no breast cancer
 Exposure:
 SSRIs

Coogan et al. Am J Epidemiol 2005

Calculation of the OR
              Case    Control
Exposure +     a        b
Exposure -     c        d
Total        a + c    b + d
Odds among the exposed = a/b
Odds among the non-exposed = c/d
Odds Ratio (OR) = (a/b) / (c/d) = (a*d) / (b*c) ≈ RR (when the outcome is rare)

Rofecoxib and risk of MI
Nested Case Control design:
• 9218 MI cases, of whom 93 used rofecoxib < 3 months ago
• 86349 controls, of whom 634 used rofecoxib < 3 months ago
               MI cases   Controls
Rofecoxib +        93        634
Rofecoxib -      9125      85715
OR (MI) = (93 x 85715) / (634 x 9125) = 1.38
BMJ 2005;330:1366
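The arithmetic above can be checked with a few lines of code. The sketch below also adds a 95% confidence interval using the log-based (Woolf) method, which is an assumption of this example and not necessarily the method used in the cited paper; the result is the crude (unadjusted) odds ratio only.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Woolf (log-based) confidence interval.

    a: exposed cases, b: exposed controls,
    c: unexposed cases, d: unexposed controls.
    """
    or_ = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(or_) - z * se_log_or)
    upper = exp(log(or_) + z * se_log_or)
    return or_, lower, upper


# Counts from the rofecoxib table
or_, lower, upper = odds_ratio_ci(93, 634, 9125, 85715)
print(f"OR = {or_:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that excludes 1 suggests that chance alone is an unlikely explanation; bias and confounding still have to be assessed separately.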

Selection of cases
 Establish strict diagnostic criteria for the outcome:
Examples:
 Type 1 diabetes in children: severe symptoms, very
high BG, marked glycosuria, and ketonuria.
 Type 2 diabetes: few if any symptoms, Slightly
elevated BG, diagnosis “complicated”.

Selection of cases
 Population-based cases: include all subjects or a
random sample of all subjects with the disease at a
single point or during a given period of time in the
defined population:
 Disease registers
 Hospital-based cases:
 All patients in a hospital department at a given time

Selection of Controls
Principles of Control Selection:
 Study base:
 Controls can be used to characterise the distribution of exposure
 Comparable-accuracy
 Equal reliability in the information obtained from cases and controls, so that
there is no systematic misclassification
 Overcome confounding
 Elimination of confounding through control selection, by matching or
stratified sampling
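Individual matching can be sketched as follows. This is a minimal illustration, assuming 1:1 matching on sex and age band; the function `match_controls`, the field names and the records are all hypothetical.

```python
import random


def match_controls(cases, candidates, keys=("sex", "age_band"), seed=0):
    """Pick one unused control per case with identical values on the matching keys."""
    rng = random.Random(seed)
    pool = list(candidates)
    rng.shuffle(pool)                      # avoid always taking the first eligible control
    matched = {}
    for case in cases:
        for i, candidate in enumerate(pool):
            if all(candidate[k] == case[k] for k in keys):
                matched[case["id"]] = pool.pop(i)["id"]   # each control is used once
                break
    return matched


# Hypothetical records
cases = [
    {"id": "case1", "sex": "F", "age_band": "40-49"},
    {"id": "case2", "sex": "M", "age_band": "50-59"},
]
candidates = [
    {"id": "ctl1", "sex": "M", "age_band": "50-59"},
    {"id": "ctl2", "sex": "F", "age_band": "40-49"},
    {"id": "ctl3", "sex": "F", "age_band": "60-69"},
]
matched = match_controls(cases, candidates)
print(matched)
```

A matched design calls for a matched analysis, and a factor used for matching can no longer be studied as an exposure.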

Selection of Controls
 General population controls:
 registries, households, telephone sampling
 costly and time consuming
 recall bias
 Possibly high non-response rate
 Hospitalised controls:
 Patients at the same hospital as the cases
 Easy to identify
 Less recall bias
 Higher response rate

Ascertainment of outcome and exposure
status
 External sources:
 Death certificates, disease registries, Hospital and
physicians records etc.
 Internal sources:
 Questionnaires and interviews, information from a
surrogate (spouse, or mother of a child), biological
sampling (e.g. antibody)

Strengths and Limitations of Case-Control
Strengths
 Quick, inexpensive
 Well-suited to the evaluation of diseases with long latency period
 Rare diseases
 Examine multiple etiologic factors for a single disease
Limitations
 Not suited to rare exposures
 Incidence rates cannot be estimated unless the study is population based
 Selection bias and recall bias

Cohort Studies
 Are studies that identify subsets of a defined population
and follow them over time, looking for differences in their
outcome.
 Cohort studies generally are used to compare exposed
patients to unexposed patients
 Although they can also be used to compare one exposure to
another.
 The design best allows for estimates of the probability or
risk of developing the outcome
 Groups are defined by whether they are exposed or not exposed to a
particular risk factor
 Measuring incidence
 The observational design best suited to assessing
causation of health problems

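Cohort study / Follow-up study (figure)
Direction of inquiry: from cause (exposure) forward to effect (disease)
• Study population, exposed: followed to Disease + or Disease -
• Study population, non-exposed: followed to Disease + or Disease -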

Strengths and Limitations of Cohort Study
Strengths
 Rare exposure
 Examine multiple effects of a single exposure
 Minimizes bias in the exposure determination
 Direct measurements of incidence of the disease
Limitations
 Not suited to rare diseases
 Prospective: expensive and time consuming
 Retrospective: inadequate records
 Validity can be affected by losses to follow-up

Steps in conducting cohort study
Step 1: Define exposure
Step 2: Select exposed group
Step 3: Select non-exposed group
Step 4: Identify sources of data for exposure and
outcome
Step 5: Collect data
Step 6: Analyze data
• Prepare 2X2 table
• Calculate Relative Risk (RR)
• Perform statistical tests to check whether there
is a statistically significant association

Prospective or Concurrent vs. Retrospective Cohort (figure)

Frequency measures cohort study (figure)
• Exposure: followed over time for P1 person-years, yielding A1 with disease and B1 with no disease
• No exposure: followed over time for P0 person-years, yielding A0 with disease and B0 with no disease

Frequency measures
 Incidence
 Cumulative incidence (CI)
 Incidence rate (IR)
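The two incidence measures differ only in their denominator: cumulative incidence divides by the number of people at risk, while the incidence rate divides by the person-time they contribute. A minimal sketch, assuming each participant is described by a hypothetical pair (years of follow-up, whether disease developed):

```python
# Hypothetical follow-up records: (years of follow-up, developed disease?)
records = [(2.0, False), (1.5, True), (3.0, False), (0.5, True), (3.0, False)]


def cumulative_incidence(records):
    """New cases divided by the number of people at risk at the start."""
    cases = sum(1 for _, diseased in records if diseased)
    return cases / len(records)


def incidence_rate(records):
    """New cases divided by the total person-years of follow-up."""
    cases = sum(1 for _, diseased in records if diseased)
    person_years = sum(years for years, _ in records)
    return cases / person_years


ci = cumulative_incidence(records)
ir = incidence_rate(records)
print(f"CI = {ci:.2f}, IR = {ir:.2f} per person-year")
```

Here 2 cases among 5 people give a cumulative incidence of 0.40, while the same 2 cases over 10 person-years give an incidence rate of 0.20 per person-year.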

Cumulative incidence (risk)
              Disease   No disease   Total
Exposure +      A1         B1          N1
Exposure -      A0         B0          N0
Risk disease + | exposure + = A1 / N1 = CI1
Risk disease + | exposure - = A0 / N0 = CI0

Incidence rate (force of morbidity)
              Disease   Person-years
Exposure +      A1          P1
Exposure -      A0          P0
Force of morbidity | exposure + = A1 / P1 = IR1
Force of morbidity | exposure - = A0 / P0 = IR0

Effect measures cohort study
Measure                    Symbol   From risks            From rates
Risk difference            RD       CI1 - CI0             IR1 - IR0
Relative Risk              RR       CI1 / CI0             IR1 / IR0
Attributable Risk          AR       (CI1 - CI0) / CI1     (IR1 - IR0) / IR1
Relative Risk Reduction    RRR      1 - RR
Number needed to treat     NNT      1 / RD
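The effect measures can be computed directly from the four counts of a cohort 2x2 table. The sketch below uses the risk-based (cumulative incidence) formulas; the function name `effect_measures` and the counts are hypothetical. Taking the absolute value of RD for the NNT is a choice made here so that the result is positive whether the exposure is harmful or protective.

```python
def effect_measures(a1, n1, a0, n0):
    """Effect measures from cumulative incidence in exposed (a1/n1) and unexposed (a0/n0)."""
    ci1 = a1 / n1   # risk in the exposed
    ci0 = a0 / n0   # risk in the unexposed
    rd = ci1 - ci0
    rr = ci1 / ci0
    return {
        "RD": rd,
        "RR": rr,
        "AR": rd / ci1,       # share of the risk in the exposed attributable to exposure
        "RRR": 1 - rr,        # meaningful when the exposure is protective (RR < 1)
        "NNT": 1 / abs(rd),   # read as number needed to harm when RD > 0
    }


# Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed develop the outcome
measures = effect_measures(30, 100, 10, 100)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

With these counts the exposure is harmful (RR = 3), so AR and the number needed to harm are the informative measures, while RRR is negative and not interpretable.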

Pill and Deep Venous Thrombosis
Risk no pill = 3.9 per 100 000 py
Risk pill gen. 2 = 10.3 per 100 000 py
Risk pill gen. 3 = 21.3 per 100 000 py
Comparing gen. 3 with gen. 2:
RR = 21.3 / 10.3 = 2.07
RD = 21.3 - 10.3 = 11.0 per 100 000 py
AR = 11.0 / 21.3 = 52%
'NNH' = 1 / RD = 9091 per year
Lancet 1995; 346: 1582 - 1588
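The figures on this slide can be reproduced from the two incidence rates. The sketch below is only a check of the arithmetic, comparing third-generation with second-generation pills; the variable names are illustrative.

```python
PER_PY = 100_000  # rates on the slide are expressed per 100 000 person-years

rate_gen2 = 10.3 / PER_PY
rate_gen3 = 21.3 / PER_PY

rr = rate_gen3 / rate_gen2    # relative risk, gen. 3 vs. gen. 2
rd = rate_gen3 - rate_gen2    # risk (rate) difference per person-year
ar = rd / rate_gen3           # attributable risk among gen. 3 users
nnh = 1 / rd                  # women using gen. 3 for one year per extra case

print(f"RR  = {rr:.2f}")
print(f"RD  = {rd * PER_PY:.1f} per 100 000 py")
print(f"AR  = {ar:.0%}")
print(f"NNH = {nnh:.0f} per year")
```

The relative risk doubles, yet the absolute excess is about 11 cases per 100 000 women per year, which is why relative and absolute measures are best reported together.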

Prospective vs. Retrospective Cohort Studies
 Prospective Cohort Studies
 Time consuming, expensive
 More valid information on exposure
 Measurements on potential confounders
 Retrospective Cohort Studies
 Quick, cheap
 Appropriate to examine outcome with long latency periods
 Depend on access to existing exposure data
 Difficult to obtain information on exposure
 Risk of confounding

Selection of the Exposed Population
 Sample of the general population:
 Geographic area, special age groups, birth cohorts (Framingham
Study)
 A group that is easy to identify:
 Nurses' Health Study
 Special population (often occupational epidemiology):
 Rare and special exposure
 Permits the evaluation of rare outcomes

Selection of the Comparison Population
 Internal Control Group
 Exposed and non-exposed in the same study population
(Framingham Study, Nurses' Health Study)
 Minimise the differences between exposed and non-exposed
 External Control Group
 Chosen from another group or cohort (occupational
epidemiology: asbestos workers vs. cotton workers)
 The General Population

Applications of different observational designs
Application                   Ecological   X-sectional   Cohort   Case-control
Rare disease                  ++++         -             -        +++++
Rare exposure                 ++           -             +++++    -
Test multiple effects         +            ++            +++++    -
Study multiple exposures      ++           ++            +++      ++++
Time relationship             ++           -             +++++    +
Direct measure of incidence   -            -             +++++    +

Bias, Cost, and Time
                      Ecological   X-sectional   Cohort   Case-control
Probability of:
  Selection bias      -            medium        low      high
  Recall bias         -            high          low      high
  Loss to follow-up   -            -             high     low
  Confounding         high         medium        low      medium
Time required         low          medium        high     medium
Cost                  low          medium        high     medium

Experimental/Intervention studies
 Longitudinal in design
 There is random allocation of subjects to either
group
 Treatment & Control group
 Individuals are allocated by the investigator
 Artificial manipulation of study factors
 Can produce high quality data

Experimental/Intervention studies
 Classification based on the population studied
1. Clinical trial
• Usually performed in clinical settings and the
subjects are patients
2. Field trial
• Used in testing medicine for preventive purpose
• Subjects are healthy people e.g. vaccine trial
3. Community trial
• Unit of the study is group of people/community
e.g. Fluoridation of water to prevent dental caries

 Classification based on objective
1. Phase I
 Trial in a small number of subjects to test a new drug at a low
dose and determine its toxic effects
2. Phase II
 Trial in a small group to determine the therapeutic effect
3. Phase III
 Study in a large population
 Usually a randomized controlled trial

Randomised controlled trial (flow diagram)
1. Source population: apply in- and exclusion criteria
2. Randomisation (method, blinding): gives prognostically comparable groups
3. Index group and control group: treatment, double blind
4. Follow-up of both groups: loss-to-follow-up
5. Outcome in both groups: blinded measurement of outcome

 In a clinical trial,
 the experimental group receives the drug or treatment
to be evaluated,
 while the control group receives
 A placebo, no treatment, or
 The standard of care.
 Both groups are followed for the outcome(s) of interest.
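 Placebo effect: Even “inert” treatments might
result in significant improvements in the patient’s
condition. (The Hawthorne effect is different: participants
change their behavior because they know they are being observed.)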

Crossover Design
 A crossover study is a special design of controlled
intervention study that is sometimes used in drug
trials.
 In this design,
 half of the participants are randomly assigned to start
with the placebo and then switch to active treatment,
while the other half does the opposite.
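Because every participant receives both treatments, a crossover trial is analysed on within-subject differences. The sketch below is a simplified illustration using a paired t statistic; it assumes no carry-over or period effects, and the blood pressure values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical systolic BP (mmHg) for six participants under each condition
placebo = [150, 160, 155, 148, 162, 158]
active = [142, 151, 150, 140, 155, 149]

# Each participant serves as his or her own control
diffs = [a - p for a, p in zip(active, placebo)]

mean_diff = mean(diffs)
se_diff = stdev(diffs) / sqrt(len(diffs))
t_stat = mean_diff / se_diff   # compare with a t distribution on n - 1 df

print(f"mean difference = {mean_diff:.2f} mmHg, t = {t_stat:.2f}")
```

Only the variability of the differences enters the standard error, which is why a crossover design needs fewer subjects than a parallel-group trial.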

Advantages and Disadvantages of Crossover
Study design (RCT)
Advantages:
 It reduces the number of subjects required,
 since each subject serves as both an experimental subject
and a control
 It decreases the biological variability inherent in
comparing different subjects by comparing each
subject with himself or herself
Disadvantages:
 Increases the duration of the study
 Carry-over effect
 Fatigue

Quasi-Experimental Designs
 Artificial manipulation of the study factor without
randomization (e.g. program evaluation)
 One group (Internal) comparison
 Each experimental unit serves as its own control
 The control is the unit's own past (historical) experience
(before-and-after study)
 Multiple group (External) comparison
 Treatment or intervention group compared to control
or comparison groups.

Quasi-Experimental Designs
Designs with Historic Control
 A before and after study is a method of control in
which results from experimental subjects are
compared with outcomes from patients treated
before the new intervention was available.
 These are called historic controls.

Advantages and limitations of RCT
(Experimental studies)
 Advantages
 Gold standard for therapeutic evaluation
 The Gold standard is achieved through
 Randomization
 Blinding
 Use of placebo
 Limitations
 Ethical considerations (e.g. smoking, birth defects)
 Feasibility/Practical issues (e.g. rare adverse effects)
 Cost

Table: Advantages and disadvantages of epidemiologic study
designs

2.3. Qualitative Research in
Pharmacoepidemiology
Presentation Outline
2.3.1. Introduction
2.3.2. Qualitative Research Designs and Methods
2.3.3. Methods of data collection
2.3.4. Sampling strategies in qualitative research
2.3.5. Qualitative Methods of Analysis
2.3.6. Trustworthiness

Learning Objectives
 Define qualitative research
 List key features of qualitative research
 Describe basic design questions in qualitative
research methods,
 Identify different methods used for addressing
different research questions
 Describe sampling strategies used in qualitative
approaches
 Discuss the trustworthiness of qualitative methods

2.3.1. Introduction
Definition
 Qualitative research is a type of formative research
that offers specialized techniques for obtaining in-
depth responses about what people think and
how they feel.
 It enables programme management to gain insight
into attitudes, beliefs, motives and behaviors of
the target population.
Formative research is the process by which researchers or public
health practitioners define a community of interest, determine how to
access that community, and describe the attributes of the community
that are relevant to a specific public health issue.

Why qualitative research?
Qualitative research:
 Provides greater depth of response and, therefore, greater
consequent understanding than can be acquired through
quantitative techniques.
 Good source of descriptions and explanations of processes in
identifiable local contexts.
 Describe chronological flow, which events led to which
consequences and derive fruitful explanations.
 Could help researchers to get beyond initial conceptions and to
generate or revise conceptual frameworks.
 Well suited for locating meanings people place on events,
processes, and structures of their lives and for connecting
these meanings to the social world.

 Qualitative research (cont.)
 Deals with emotional and contextual aspects of human
response rather than with objective measurable
behaviors and attitudes.
 Can be designed to explore concepts, develop
hypotheses or theories, develop research tools, and
clarify the findings of a quantitative study.

 There are three domains in which qualitative
research tends to be used in public health:
1. First domain includes economic, political, social and
cultural, environmental and organizational factors
which influence health.
2. Second domain focuses on gaining understanding of
how people make sense of their experiences of
health and disease.
3. Third domain includes interaction of actors involved
in different public health activities.

How is Qualitative research used?
 Qualitative research is used largely in four general
ways as:
1. Tool to generate ideas
2. Step in developing a quantitative study
3. Aid in evaluating a quantitative study
4. Primary data collection method for a research topic

Difference between qualitative and quantitative research
Qualitative | Quantitative
Provides depth of understanding | Measures level of occurrence
Asks why? | Asks how many? How often?
Studies motivations | Studies action
Is subjective | Is objective
Enables discovery | Provides proof
Is exploratory | Is definitive
Allows insights into behavior, trends and so on | Measures level of actions, trends, and so on
Interprets | Describes
Inductive | Deductive

Characteristics of qualitative research
 Qualitative research methods have many
distinguishing characteristics.
1. Qualitative methods take the views of informants,
whereas quantitative research takes the ideas of the
researcher as points of departure.
2. Lines of reasoning differ between the two methods.
 Reasoning in quantitative research is deductive
 Typically starting with the generation of a hypothesis based on
existing theory, then testing the hypothesis against existing reality
 Reasoning in qualitative research is inductive
 Qualitative researchers may also test emerging hypotheses or
theories against data, and thus oscillate between data and theory.

Characteristics of qualitative research
3. Concerned with reliability and validity.
 Strength of the quantitative approach lies in its reliability
(repeatability)
 that is the same measurements should yield the same results
time after time,
 Strength of qualitative research lies in validity (closeness
to the truth)
 that is good qualitative research should touch the core of
what is going on rather than just skimming the surface.
 Validity of qualitative methods is greatly improved
 by a process known as triangulation and
 by independent analysis of the data by two or more
researchers.

2.3.2. Qualitative Research Designs and Methods
 Qualitative research has its own designs and
methods.
 There are four major types of qualitative research
designs
1. Phenomenology
2. Ethnography
3. Grounded theory
4. Case study

Phenomenology
 Phenomenology literally means the study of
phenomena.
 It is a way of describing something that exists as part
of the world in which we live.
 Phenomena may be
 Events,
 Situations,
 Experiences or concepts.

Phenomenology
 Phenomenological research begins with the
acknowledgement that there is a gap in our
understanding and that clarification or illumination
will be of benefit.
 Phenomenological research will not necessarily
provide definitive explanations but it does raise
awareness and increases insight.

Ethnography
 Ethnography is “[an] analytical description of social
scenes and groups that recreate for the reader the
shared beliefs, practices, artifacts, folk knowledge,
and behaviors of those people.” Goetz and LeCompte
(1984, pp. 2-3)
 Ethnography has a background in anthropology.
 The term means “portrait of a people” and it is a
methodology for descriptive studies of cultures and
peoples.
 The cultural parameter is that the people under
investigation have something in common.

Ethnography
 Rooted in anthropology, ethnography involves the
study of an intact group, logically defined, in its
natural context for a sustained time interval.
 The researcher is typically an observer or a
participant observer (Creswell, 1994, p. 11).
 Examples of parameters include:
 geographical - a particular region or country
 Religious
 Tribal
 shared experience

Ethnography
 In health care settings, researchers may choose an
ethnographic approach because the cultural
parameter is suspected of affecting the
population’s response to care or treatment.
 For example, cultural rules about contact between
males and females may contribute to reluctance of
women from an Asian subgroup to take up cervical
screening.
 Ethnography helps health care professionals to
develop cultural awareness and sensitivity and
enhances the provision and quality of care for
people from all cultures.

Ethnography
 Ethnographic studies entail extensive fieldwork by
the researcher.
 Data collection techniques include both formal
and informal interviewing, often interviewing
individuals on several occasions, and participant
observation.
 Because of this, ethnography is extremely time
consuming as it involves the researcher spending long
periods of time in the field.

Ethnography
 Ethnographic research is very labor and time
intensive, involving extensive fieldwork in a
natural setting.
 Usually a general research question(s) is (are)
identified.
 Once entry is gained and rapport (or trust) is
established, the research questions are continually
refined becoming more focused.
 It is not uncommon for the larger research
question(s) to be segmented into more numerous,
focused ones.

Grounded theory
 This methodology originated with Glaser and Strauss
and their work on the interactions between health
care professionals and dying patients.
 The main feature is the development of new theory
through the collection and analysis of data about a
phenomenon.
 It goes beyond phenomenology because the
explanations that emerge are genuinely new
knowledge and are used to develop new theories
about a phenomenon.
 In health care settings, the new theories can be
applied enabling us to approach existing problems in a
new way.

Case study
 Case study research is used to describe an entity
that forms a single unit such as a person, an
organization or an institution.
 Some research studies describe a series of cases.

Dimensions compared across Narrative, Phenomenology, Grounded Theory, Ethnography and Case Study
Focus
• Narrative: Exploring the life of an individual
• Phenomenology: Understanding the essence of experiences about a phenomenon
• Grounded Theory: Developing a theory grounded from data in the field
• Ethnography: Describing and interpreting a cultural or social group
• Case Study: Developing an in-depth analysis of a single case or multiple cases
Data Collection
• Narrative: Primary interviews and documents
• Phenomenology: Long interviews with up to 10 people
• Grounded Theory: Interviews with 20-30 individuals to “saturate” categories and detail a theory
• Ethnography: Primarily observations and interviews with additional artifacts during extended time in the field (e.g. 6 months to a year)
• Case Study: Multiple sources including documents, archival records, interviews, observations, physical artifacts
Data Analysis
• Narrative: Stories, epiphanies, historical content
• Phenomenology: Statements, meanings, meaning themes, general description of the experience
• Grounded Theory: Open coding, axial coding, selective coding, conditional matrix
• Ethnography: Description, analysis, interpretation
• Case Study: Description, themes, assertions
Narrative Form
• Narrative: Detailed picture of an individual’s life
• Phenomenology: Description of the “essence” of the experience
• Grounded Theory: Theory or theoretical model
• Ethnography: Description of the cultural behavior of a group or an individual
• Case Study: In-depth study of a “case” or “cases”

2.3.3. Methods of data collection
 The most common methods of data collection in
qualitative research are:
1. Participant observation
 Overt observation
 Covert observation
2. Interviews (Unstructured, Semi-structured, Structured)
 Face to face interviews
 Telephone interviews
3. Focus groups
 researcher(s) plus 2-10 participants - guided group discussion
on topic(s)
4. Historical methods

Design questions in qualitative research
 Qualitative research can address questions
 What is the society like?
 Why do certain behaviors occur?
 What is this experience like?

 The following points discuss some of the key design
questions necessary in designing qualitative
research
1. Defining an area of inquiry
 Drawn from personal experience, reviewing literature and
auditing earlier studies.
2. Stating the research problem
 Entails gap in scientific knowledge
 Helps to describe what has been done so far and identify
questions that have been unanswered.
 Forwards the ways in which the findings of the present
study might be utilized

3. Developing a conceptual framework
 It is an alternative way of depicting a set of related
variables and outcomes in the study in an elaborative
schematic diagram.
 It shows the key factors, presumed relationships and
possible outcomes of the research problem.
 It helps to outline the research questions and core
variables included in the data collection instrument.

4. Formulating qualitative research questions
 A thoroughly defined research problem helps to
examine the issue with more specific and relevant
questions.
 Research questions serve to narrow the purpose. There
are two types:
1. Central
 The most general questions you could ask
2. Sub-questions
 Subdivides central question into more specific topical
questions
 Limited number

 Use good qualitative wording for questions.
 Begin with words such as “how” or “what”
 Tell the reader what you are attempting to “discover,”
“generate,” “explore,” “identify,” or “describe”
 Ask “what happened?” to help craft your description
 Ask “what was the meaning to people of what
happened?” to understand your results
 Ask “what happened over time?” to explore the process

2.3.4. Sampling strategies in qualitative research
 Does not need to be representative of population - not
statistical
 Saturation – recruitment of additional cases no longer
provides additional information or insights
 The sampling techniques used in qualitative research are:
1. Purposive (purposeful) sampling
2. Homogeneous sampling
3. Theoretical sampling
4. Extreme or deviant
5. Maximum variation
6. Convenience sampling
7. Snowball or chain sampling
8. Opportunistic
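Saturation can be monitored during fieldwork by tracking whether each new interview contributes codes that have not been seen before. The sketch below is one possible stopping rule and not a standard: the function name, the `patience` parameter (number of consecutive interviews without new codes) and the coded interviews are all hypothetical.

```python
def interviews_until_saturation(coded_interviews, patience=2):
    """Return the interview count at which `patience` consecutive interviews
    added no new codes, or None if that point is never reached."""
    seen = set()
    run_without_new = 0
    for count, codes in enumerate(coded_interviews, start=1):
        new_codes = set(codes) - seen
        seen |= new_codes
        run_without_new = 0 if new_codes else run_without_new + 1
        if run_without_new >= patience:
            return count
    return None


# Hypothetical codes assigned to five successive interviews
coded = [
    {"cost", "side effects"},
    {"cost", "trust"},
    {"trust"},
    {"side effects", "cost"},
    {"stigma"},
]
print(interviews_until_saturation(coded))
```

In this example the rule would stop after the fourth interview and miss the new theme in the fifth, which illustrates why saturation is a judgement and not a mechanical threshold.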

Sampling strategies in qualitative research
 Purposive sampling
 Also known as judgemental sampling, purposive sampling is
a non-probability technique that involves the conscious
selection by the researcher of certain people to include in a
study.
 Participants are selected because they have particular
characteristics that are of interest to the researcher.
 For example, they have had the experience in which the
researchers are interested, or there are certain aspects of
their lives in which the researchers are interested.

Homogeneous sampling
 includes people with basically similar characteristics
to study the group in depth.
 The selection of participants is usually done within
certain strata; participants with similar demographic
or social characteristics being included in the same
strata.
 Focus groups usually use this type of sampling.
 The group interaction stimulates people within the
group to discuss their experiences.
 The main advantage of homogeneous sampling is
that it focuses on a similar type of respondents
thereby simplifying analysis and group interviewing
91

Theoretical sampling
 It is the process of selecting "incidents, slices of life, time periods, or people
on the basis of their potential manifestation or representation of important
theoretical constructs" (Patton, 2001, p. 238).
 Theoretical sampling is an important component in the development of
grounded theories.
 Glaser and Strauss (1967) describe an iterative sampling process that is
based on emerging theoretical concepts.
 This sampling approach has the goal of developing a rich understanding of
the dimensions of a concept across a range of settings and conditions.
Extreme or deviant
 chooses extreme cases of outstanding successes or crisis events after knowing the typical case in
order to highlight and understand the situation.
 For example, a researcher may be interested in studying two health facilities, one whose family
planning clients are highly satisfied and another whose clients are not satisfied, in order to identify
factors that favor or discourage the utilization of services.
 This type of sampling is valuable to test emerging theories by learning from highly unusual
manifestations.
92

Maximum variation
 Also known as heterogeneous sampling.
 Useful for obtaining maximum differences among information-rich
informants or group.
 Subjects included in the study are different from each other based
on predetermined criteria.
 E.g. A study of rural, urban and suburban or merchants and
academicians or high activity/low activity college students, etc.
Convenience sampling
 Study participants are selected based on their ease, accessibility
and availability.
 Researcher selects those individuals who are most readily
available.
 Helps to save time, money and effort.
 However, it may be the weakest sampling scheme due to its low
credibility.
93

Snowball or chain sampling
 depends on locating participants by asking others to identify
individuals or groups with rich information on the
phenomenon under study.
 implies that the first subject is used to identify the next person or
group to facilitate the identification of cases of interest.
 the sample gradually increases in size, like a snowball being rolled
down a hill.
 Valuable when the researcher is new to the study site, and
also important for identifying individuals who have rich
information but are difficult to reach.
Opportunistic
 Additional study subjects may be selected to take advantage
of unexpected opportunities at the field level.
94

2.3.5. Qualitative Methods of Analysis
1. Thematic analysis
 Focuses on identifiable themes and patterns of living and/or
behaviour. From the conversations that take place in a
therapy session or those that are encouraged for the sake of
researching a process, ideas emerge that can be better
understood under the control of a thematic analysis.
2. Content Analysis
 Involves counting word frequencies.
 assumption made is that the words that are mentioned most
often are the words that reflect the greatest concerns.
3. Discourse Analysis
 Discourse analysis focuses on talk and texts as social
practices, and on the resources that are drawn on to enable
those practices.
95

2.3.6. Trustworthiness
 Ensuring the quality of data based on certain
established criteria is the main activity of the
researcher both in qualitative and quantitative
research traditions.
 This is particularly so for qualitative research, where the
challenge of understanding and making meaning is put upon the
researcher.
96

Trustworthiness
 Four common criteria for assessing the
trustworthiness of qualitative research findings are:
1. Truth value/Credibility
2. Applicability/Transferability
3. Consistency/Dependability
4. Neutrality/Confirmability
97

Trustworthiness
Truth value/Credibility
• It refers to the ability of the study to detect what the
research really aimed at studying.
• It asks whether the researcher has established confidence
in the truth of the findings for the subjects or informants
and the context in which the study was undertaken (Lincoln
& Guba, 1985).
98

Trustworthiness
Applicability/Transferability
• Applicability refers to the ability to determine the extent
to which the findings are applicable in other settings,
situations, populations or circumstances.
99

Trustworthiness
Consistency
• The basic question asked by researchers while dealing with
consistency is “can the findings be repeated with the same
(or similar) respondents in the same context?”
• Consistency of findings in both quantitative and qualitative
research designs can be explained by reliability and
dependability, respectively.
100

Trustworthiness
Neutrality
• It refers to the role of the researchers mainly during data
collection.
• It is assessed by objectivity in quantitative research and
confirmability in the qualitative approach.
101

Trustworthiness
102
Conventional inquiry | Naturalistic inquiry | Methods to ensure quality
Internal validity | Credibility | Member checks; prolonged engagement in the field; data triangulation
External validity | Transferability | Thick description of setting and/or participants
Reliability | Dependability | Audit (researcher’s documentation of data, methods and decisions); researcher triangulation
Objectivity | Confirmability | Audit and reflexivity
Table 1 – Lincoln and Guba’s translation of terms

2.4. Measures of Association or Risk
2.4.1. Introduction to risk and harm
2.4.2. Methods of Risk Measures
105

Learning Objectives
 Define risk, harm, hazard and risk assessment
 List risk measure methods
 Differentiate types of risk measures
 Interpret and apply risk estimates
106

2.4.1. Introduction to risk and harm
Definitions
Risk
1. It is the probability that an event will occur, e.g.,
that an individual will become ill or die within a stated
period of time or by a certain age.
Last's Dictionary of Epidemiology
2. It is the hazard: the probability that an event will occur at
time t, given that it has not occurred by time t-1.
 In pharmacoepidemiology, this term designates the probability
that a subject (whether exposed to a drug or not) will present
an event at any given time, knowing that the subject had not
presented it in the preceding time interval.
Dictionary of Pharmacoepidemiology
107

2.4.1. Introduction …(continued)
Harm
 Harm is the nature and extent of actual damage
that could be caused by a drug.
Hazard
1. It is the potential to cause harm.
2. It refers to a property or situation that in particular
circumstances could lead to harm (Royal Society,
1992).
Royal Society (1992). Risk Analysis, Perception and Management. The Royal Society, London
108

2.4.1. Introduction …(continued)
 Risk assessment
 It is the procedure in which the risks posed by inherent
hazards involved in processes or situations are
estimated either quantitatively or qualitatively.
109

2.4.2. Methods of Risk Measures
 Risk measures are estimates that describe amount
of risk associated with particular exposure in sample
population
 Risk estimates can
 Describe quantitatively risk associated with particular
exposure and development or prevention of disease
 Quantify association between exposure to particular
drug and adverse drug reaction
110

2.4.2. Methods of Risk …
 Risk estimates are part of our daily lives
 Measures of risk are communicated to
 Patients via
 Newspapers
 Television
 Internet
 Practitioners via
 Studies published in medical journals
 Thus, risk measures are important in clinical
decision-making process for both patients and
practitioners.
111

2.4.2. Methods of risk …
 Understanding of risk measures is important to
 Interpret appropriately
 Apply the estimates
 Risk measures are difficult to use for many reasons:
1. When conflicting results reported from different studies
 When conflicting information pertaining to risk is published
 It becomes difficult for both practitioners and patients to use risk
estimates for clinical decision-making
2. When confusion in interpretation of results of study exists
on risk estimates
 Two readers may interpret, communicate, and use the results of a
study very differently
112

Risk Measure Types
1. Prevalence
2. Incidence
3. Relative Risk
4. Odds Ratio
5. Attributable Risk (Risk difference)
6. Attributable Risk Percent
7. Number Needed to Treat
113

Prevalence
 Prevalence (P) is defined as the proportion of a population
that has the disease (or any outcome, e.g., adverse
drug reaction, drug use) at a particular point in time.
\[ P = \frac{\text{Number of existing cases in a population}}{\text{Total number of people in that population}} \]
114

Prevalence
Types of prevalence
1. Point prevalence rate
 comprises all the cases of a disease that exist at a point in
time.
2. Period prevalence
 Numerator is all cases whether old, new or recurrent, arising
over a defined period, say a year or two.
 Denominator is the average population over the period (or
mid-point estimate)
3. Lifetime prevalence
 Proportion of the population who have ever had the disease
115

Incidence
 Incidence refers to the number of new cases of disease
that develop in a population at risk over a specified
time period.
 Incidence is used to determine how often the
disease is occurring.
 Incidence is typically described as either
 Cumulative incidence (CI), or
 Incidence rate (IR), also called the person-time rate
116

Incidence
Cumulative incidence (CI)
 CI assumes that all of the subjects were followed
for the entire study period.
 CI does not reflect study dropouts or losses to
follow-up.
\[ CI = \frac{\text{Number of new disease cases during given time}}{\text{Total population at risk}} \]
117

Incidence
Incidence Rate
 IR is also referred to as incidence density
 It is a more accurate means of measuring disease
occurrence.
 IR takes into account the actual observation time of each
subject during the study period
 It does not assume all subjects were followed for the entire
study period.
\[ IR = \frac{\text{Number of new cases during given time}}{\text{Total person-time of observation in population at risk}} \times 10^{n}, \quad n = 1, 2, 3 \]
118

Incidence
Example:
Sample of population recruited = 100 persons
Outcome: incidence of side effects
Study period = 3 years
Year 1: 5 new cases, 100 person-years (py) at risk
Year 2: 10 new cases, 95 py at risk
Year 3: 15 new cases, 85 py at risk
\[ IR = \frac{5 + 10 + 15}{100 + 95 + 85} \times 10^{3} = \frac{30}{280 \text{ person-years}} \times 10^{3} \approx 107 \text{ per 1000 person-years} \]
That is, about 107 new cases of side effects per 1000 person-years of observation.
119
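
The incidence rate arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch; the function name `incidence_rate` and the list names are illustrative and not part of the original slides.

```python
def incidence_rate(new_cases, person_time, per=1000):
    """Return the number of new cases per `per` units of person-time."""
    return sum(new_cases) / sum(person_time) * per

# Values taken from the worked example: new cases and person-years per year.
cases_by_year = [5, 10, 15]
person_years = [100, 95, 85]

ir = incidence_rate(cases_by_year, person_years)
print(f"{ir:.0f} per 1000 person-years")
```

Changing `per` to 100 or 10000 corresponds to choosing a different exponent n in the multiplier 10^n.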

Incidence
2 by 2 Contingency Table
              | Disease/Outcome: Yes | Disease/Outcome: No | Total
Exposure: Yes | A                    | B                   | A + B
Exposure: No  | C                    | D                   | C + D
Total         | A + C                | B + D               | A + B + C + D
120

Relative Risk
 Relative risk (RR) is the likelihood of developing the
disease in the exposed group relative to the
unexposed group;
 It measures the association between exposure and disease.
 Once the incidence of an outcome or disease has been
measured in both exposed and unexposed groups,
 it is useful to know the relationship between exposure and
development of disease
121

Relative Risk
 RR is simply the ratio of the CI in the exposed group to the CI in the
unexposed group
 It can be calculated as follows:
\[ RR = \frac{CI_{\text{exposed}}}{CI_{\text{unexposed}}} = \frac{A/(A+B)}{C/(C+D)} \]
 Ho in a comparison of two groups states that the
proportion of subjects with the outcome of interest is
equal in the exposed and unexposed groups.
 In other words, RR equals 1.
122

Relative risk
 RR can be used to measure association between
exposure and outcome in
 Cohort studies
 Clinical trials
 RR cannot be calculated directly in a case-control study, because
incidence is not measured; the odds ratio is used instead.
123

Relative risk
Table. Interpretation of Relative Risk
Relative Risk | Association between Exposure and Outcome
1 | No association
< 1 | Negative association/Decreased risk
> 1 | Positive association/Increased risk
124

Odds Ratio
 Odds is the ratio of the probability of occurrence of an
event to that of nonoccurrence.
 Odds ratio (OR) is a means of estimating the relative
risk in case-control studies.
 In case-control studies, subjects are chosen based on
disease status and then compared for rates of
exposure.
125

Odds ratio
 OR is used to estimate RR in a case-control study.
 The odds ratio is calculated as follows:
\[ OR = \frac{\text{odds of exposure among cases}}{\text{odds of exposure among controls}} = \frac{A/C}{B/D} = \frac{AD}{BC} \]
 The null hypothesis when using the OR states, H0: OR = 1.
126

Odds ratio
Table. Interpretation of Odds Ratio
Odds Ratio | Association between Exposure and Outcome
1 | No association between exposure and outcome
< 1 | Negative association/Decreased risk/Protective effect
> 1 | Positive association/Increased risk: an increased risk of outcome associated with the exposure
127

Confidence Intervals (CI)
 Confidence intervals are defined as the range within
which the true effect lies with a certain degree of
assurance.
 Confidence intervals determine the reliability of the
risk estimate obtained in the sample.
128

Confidence intervals
 The 95% confidence intervals of the relative risk and
odds ratio can be calculated as follows:
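\[ CI \text{ for } OR = \exp\left( \ln(OR) \pm Z_{\alpha/2} \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}} \right) \]
\[ CI \text{ for } RR = \exp\left( \ln(RR) \pm Z_{\alpha/2} \sqrt{\frac{1 - \frac{a}{a+b}}{a} + \frac{1 - \frac{c}{c+d}}{c}} \right) \]
Here a, b, c and d are the cell counts A, B, C and D of the 2 by 2 contingency table.
129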
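
As an illustration of the two interval formulas, the following sketch computes RR and OR with their 95% confidence intervals from a 2 by 2 table. The cell counts (30, 70, 10, 90) are hypothetical, and the function name `ratio_with_ci` is an assumption made for this example.

```python
import math
from statistics import NormalDist

def ratio_with_ci(a, b, c, d, measure="RR", level=0.95):
    """Point estimate and confidence interval for RR or OR from a 2x2 table.

    a, b = exposed with / without the outcome
    c, d = unexposed with / without the outcome
    """
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    if measure == "RR":
        estimate = (a / (a + b)) / (c / (c + d))
        se_log = math.sqrt((1 - a / (a + b)) / a + (1 - c / (c + d)) / c)
    elif measure == "OR":
        estimate = (a * d) / (b * c)
        se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    else:
        raise ValueError("measure must be 'RR' or 'OR'")
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper

# Hypothetical counts, for illustration only.
rr, rr_low, rr_high = ratio_with_ci(30, 70, 10, 90, measure="RR")
odds_ratio, or_low, or_high = ratio_with_ci(30, 70, 10, 90, measure="OR")
print(f"RR = {rr:.2f} ({rr_low:.2f}, {rr_high:.2f})")
print(f"OR = {odds_ratio:.2f} ({or_low:.2f}, {or_high:.2f})")
```

Both intervals are built on the log scale and then exponentiated, which is why they are not symmetric around the point estimate.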

Confidence intervals (CI)
 The confidence interval can be calculated at various
degrees of confidence (e.g. 90%, 95%,99%).
 Confidence intervals are most frequently reported at
the 95% level, corresponding with a z-value of 1.96.
130
Confidence interval | Z-value (Zα/2)
90% (α = 0.10) | 1.64
95% (α = 0.05) | 1.96
99% (α = 0.01) | 2.58

Confidence intervals (CI)
 Consider the results of three different studies. The
relative risk and the corresponding 95% confidence
interval for each study were as follows:
Study 1 RR = 2.2; 95% CI (1.8, 2.6)
Study 2 RR = 2.2; 95% CI (1.1, 3.3)
Study 3 RR = 2.2; 95% CI (0.9, 3.0)
 Studies 1 and 2 show statistically significant results
 because the number 1 is not included within the bounds of
the 95% confidence interval
 Study 3 does not show a statistically significant result
 because its 95% confidence interval includes the number 1
131

Attributable Risk or Risk difference
 Attributable risk (AR) or risk difference (RD) is another
measure of risk used in studies.
 AR provides information on absolute effect of the
exposure.
 AR describes excess risk of disease in those exposed
compared with those who were unexposed.
132

Attributable risk or risk difference
 AR is calculated as follows:
\[ AR = CI_{\text{exposed}} - CI_{\text{unexposed}} = \frac{A}{A+B} - \frac{C}{C+D} \]
 AR allows one to determine how morbidity and mortality are
affected by removing the exposure.
 An AR of 0 corresponds to the null hypothesis and
 means there is no association between exposure and outcome.
 AR provides information on the size of the effect that can be
achieved by decreasing or eliminating the exposure.
133

Attributable Risk Percent
 AR can be converted to the attributable risk percent (AR%),
which may be easier to interpret.
 AR% provides an estimate of the proportion of the disease
among the exposed that is attributable to the exposure.
 Like the AR, it provides information pertaining to the
proportion of the disease in the exposed group that could be
prevented by eliminating the exposure.
134

Attributable Risk Percent
 The attributable risk percent is calculated as follows:
\[ AR\% = \frac{CI_{\text{exposed}} - CI_{\text{unexposed}}}{CI_{\text{exposed}}} \times 100 = \frac{\frac{A}{A+B} - \frac{C}{C+D}}{\frac{A}{A+B}} \times 100 \]
135

Number Needed to Treat (NNT)
 Refers to the number of patients who would need to be
treated to prevent one clinical event or
adverse outcome, such as one death.
 Like the attributable risk, the number needed to
treat is used by administrators to allocate health
resources.
136

Number needed to treat
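 It can be calculated as follows:
\[ NNT = \frac{1}{|AR|} = \frac{1}{\left| CI_{\text{exposed}} - CI_{\text{unexposed}} \right|} = \frac{1}{\left| \frac{A}{A+B} - \frac{C}{C+D} \right|} \]
 Here the exposure is the treatment, so the absolute value of AR is the
reduction in risk between the untreated and the treated group.
 Example
 If the mortality rate because of disease A in the untreated
group is 17% and mortality in the group treated with
Drug X is 12%, calculate the NNT to prevent one death.
 Solution: NNT = 1/(17% - 12%) = 1/0.05 = 20 people
137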
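
The three absolute measures (AR, AR% and NNT) can be sketched together in Python. The NNT call reuses the 17% versus 12% mortality figures from the example; the cumulative incidences 0.30 and 0.10 used for AR and AR% are hypothetical, and all function names are illustrative.

```python
import math

def attributable_risk(ci_exposed, ci_unexposed):
    """Risk difference: excess risk in the exposed group."""
    return ci_exposed - ci_unexposed

def attributable_risk_percent(ci_exposed, ci_unexposed):
    """Share of the risk among the exposed that is attributable to exposure."""
    return (ci_exposed - ci_unexposed) / ci_exposed * 100

def number_needed_to_treat(risk_untreated, risk_treated):
    """Reciprocal of the absolute risk difference between the two groups."""
    return 1 / abs(risk_untreated - risk_treated)

# Figures from the NNT example: 17% mortality untreated, 12% treated.
nnt = number_needed_to_treat(0.17, 0.12)

# Hypothetical cumulative incidences, for illustration only.
ar = attributable_risk(0.30, 0.10)
ar_pct = attributable_risk_percent(0.30, 0.10)

print(f"NNT = {nnt:.0f}, AR = {ar:.2f}, AR% = {ar_pct:.1f}")
```

Taking the absolute value in `number_needed_to_treat` keeps the result positive whichever group is passed first.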

Summary
 Measures of risk allow the quantification of degree of risk
associated with any number of exposures.
 Risk estimates are point estimates, the estimates obtained in
the particular study population.
 Risk estimates may or may not represent the true, or actual, risk
that exists in the general population.
 When evaluating risk estimates, it is important to
 consider baseline risk of developing the disease.
 evaluate confidence intervals around the risk estimate to
determine the stability of the risk estimate.
 consider other factors that may be responsible for the disease,
including confounding variables.
138

2.5. Evaluation of Evidence and Errors in
Pharmacoepidemiology Studies
2.5.1. Types of Association Between Factors Under
Study
2.5.2. Types of Errors That One Can Make in
Performing a Study
2.5.3. Criteria for the Causal Nature of an Association
139

2.5.1. Types of Association Between Factors
Under Study
 There are four basic types of associations that can be
observed in a study:
1. No Association (independent)
2. Artefactual Association (Spurious or False)
a. Chance (Unsystematic Variation)
b. Bias (Systematic Variation)
3. Indirect Association(Confounded Association)
4. Causal Association (Direct or True)
 The basic purpose of research is to differentiate
among them.
140

Hypothesis Testing
 Establishing associations requires hypothesis testing
 There are Two Types of Hypothesis in the scientific
investigation:
 Null Hypothesis (Ho) and
 Alternative Hypothesis (HA)
 The hypothesis we test statistically is called the null
Hypothesis
141

Hypothesis Testing
 Example: Suppose we are testing the efficacy of a new
drug on patients with myocardial infarction
1. We divide the patients into two groups: drug
and no drug
2. We measure mortality in the two groups
3. We state our hypothesis:
 That the drug makes no difference
 What we hope to do is to reject the ‘no difference’
hypothesis, based on evidence from our sample
patients
142

4. We specify our test hypothesis as follows:
 Ho (null hypothesis): death rate in group treated with
Drug A = death rate in group treated with Drug B
 That is equivalent to saying:
Ho: death rate in group A – death rate in group B = 0
 We test this against an alternate hypothesis known
as HA,
 The difference in death rate between the two
groups is not equal to 0
 That is equivalent to saying:
HA: death rate in group A – death rate in group B ≠ 0
143

 If the observed difference is sufficiently far from
zero,
 We reject the null hypothesis.
 If we reject the null hypothesis of no difference,
 We accept the alternate hypothesis
 We can never be certain that we are right in either
accepting or rejecting a hypothesis
 because of errors that can be produced in a study
144

2.5.2. Types Of Errors That One Can Make In
Performing A Study
 There are three possible types of errors that
can be produced in a study:
1. Chance or Random Error
• Type I (alpha) error
• Type II (beta) error
2. Bias
3. Confounding
145

Chance or Random Error
 The purpose of statistical testing in science is to
 Evaluate the role of chance and
 Estimate the probability that the result observed in a
study could have happened purely by chance
 The two kinds of errors:
 Rejecting the null or test hypothesis when it is true
 Type I error
 Failing to reject the null hypothesis when it is false
 Type II error
146

Consider the hypothesis below:
 Null Hypothesis:
 Drug has no effect ↔ no difference in mortality
between patients using drug and not using drug
 Alternate Hypothesis:
 Drug has effect ↔ reduces mortality
147

148
Decision on the basis of the sample | True state of nature: Ho True | True state of nature: HA True
Reject Ho | Type I Error | No Error
Do not reject Ho | No Error | Type II Error
 If we reject Ho and accept HA,
 we conclude that there is a relationship between drug and
mortality
 If we do not reject Ho,
 we conclude that the sample gives no evidence of a relationship
between drug and mortality.

Actions to be taken based on decisions:
1. If we reject the null hypothesis in favor of the alternative hypothesis,
we will use the drug → risk of Type I error (α)
→ Consequence of wrong decision (Type I error): we will use the drug
but the patients do not benefit.
Presuming the drug is not harmful in itself,
 we do not directly hurt the patients
 but since we think that we have found the cure, we may
no longer test other drugs
2. If we believe the null hypothesis (i.e., fail to reject the null
hypothesis), we will not use the drug → risk of Type II error (β)
→ Consequence of wrong decision (Type II error):
 Since in reality the drug is beneficial, by withholding it,
 we will allow patients to die who might otherwise have
survived.
149

 We cannot eliminate the risk of making one of these
types of errors but
 can lower the probabilities that we will make these errors
 The probability of making a Type I Error is known as
the significance level of a statistical test
 To lower the probabilities of both the Type I and II
Errors in a study
 It is necessary to increase the number of observations
150

Confidence Interval and P – value
 Two options to answer whether an observed association in
a sample is large enough to be evidence of a true
association in the population from which the sample was
drawn.
 The 95% confidence interval
 P - value
 A 95% confidence interval
 It gives a plausible RANGE OF VALUES that should contain the
true association in the population.
 A P-value
 It is the probability of getting the observed association, or one more
extreme, in the sample purely by chance from a population
where there is no true association (e.g., RR = 1).
151

Bias (Systematic Variation)
 Bias refers to any systematic error in the design, conduct
or analysis of a study that results in a mistaken estimate
of an exposure's effect on the risk of disease.
 It is systematic variation: a consistent manner in which two
study groups are treated or evaluated differently
 This consistent difference can create an apparent
association where one actually does not exist
 It can also mask a true association
 Biases once present cannot be corrected
 They represent errors in the study design
 Proper study design is the only protection against biases
152

 Two main types of bias
1. Selection bias
 Errors in the process of selecting the study population
 Factors that influence study participation
2. Information bias
 Occurs during data collection.
 Errors in the way the information is collected
 Some Types and Sources of Information Bias
 Bias in abstracting records
 Bias in interviewing
 Bias from surrogate (substitute or proxy) interviews
 Surveillance bias
 Recall bias
 Reporting bias
153

Selection Bias
 Occurs when selection of cases or control is related
to exposure
 Selection of patients from hospitals, specialised
centres
 Selection of “healthy” controls from hospitals
 Response rate bias
 Self selection bias
 Survival bias
154

Information Bias
 It is a systematic distortion or error that arises from
the procedures used for classification or measurement
of the disease, the exposure, or other relevant
variables.
 Types of Information Bias
1. Misclassification
2. Observer Bias
3. Recall Bias
155

Information Bias – Misclassification bias
 Misclassification bias is a systematic error that can
occur at any stage in the research process.
 It occurs when an individual is assigned to a
different category than the one to which they
should be assigned.
 For example, if a patient appears to be non-
hypertensive because of medication-controlled
blood pressure, resulting in systolic and diastolic
measures that are within the ‘normal range’, this
may constitute an incorrect classification.
156

Information Bias – Observer Bias
 The observer knows the underlying hypothesis and asks
more probing questions of those exposed than of
controls
 Remedies for observer bias
 Blind the observer
 Use highly structured interview
157

Information Bias - Recall Bias
 Disease status affects patients’ responses
 Patients with musculoskeletal diseases are more likely to
remember minor trauma
 A particular problem in case-control studies
 Remedies for recall bias
 Find reliable records
 Use control with other illnesses
158

Confounding
 Confounding refers to the mixing of the effect of an extraneous
variable with the effects of the exposure and disease of
interest.
 It arises when some cause other than the exposure under
study is more, or less, prevalent in the exposed group than in
the unexposed.
 A confounder is an extraneous (third) variable which
is associated with the exposure and, independent of that
exposure, is a risk factor for the disease.
159
A
C
B
Note:
A - Exposure
B - Outcome variable
C - Confounder variable

160
Whereas
• A mediator is a factor in the causal chain (1),
• A confounder is a spurious factor incorrectly implying
causation (2)

 Example
 The RR is determined from an observational study
 This raises the question
 whether the exposed and the reference groups are
similar in all respects except the exposure under study
RR- Relative Risk
161
Confounding

 In addition to the two principal factors, which
determine the RR, namely the outcome of the
adverse event and the drug exposure status,
 We must also consider tertiary factors of potentially
large importance
 These factors are related to both the outcome and
exposure status – confounding factors
 Example: See table I and II next slide
162
Confounding

163
Table I. Data from a cohort study by Careson et al. (1987) on the
effects of NSAIDs on the risk of UGIB
Group | No. of cases of UGIB | Person-months | Rate of UGIB/10000 person-months
Exposed to NSAIDs | 155 | 1220000 | 1.27
Not exposed to NSAIDs | 96 | 1157000 | 0.83
Rate ratio (exposed vs. not exposed): 1.5
UGIB- Upper Gastrointestinal Bleeding
Confounding

164
Group | Measure | Heavy alcohol consumption | Light alcohol consumption
Exposed to NSAIDs | No. of cases of UGIB | 143 | 12
Exposed to NSAIDs | Person-months | 492000 | 728000
Exposed to NSAIDs | Rate of UGIB/10000 person-months | 2.91 | 0.16
Not exposed to NSAIDs | No. of cases of UGIB | 95 | 1
Not exposed to NSAIDs | Person-months | 982000 | 175000
Not exposed to NSAIDs | Rate of UGIB/10000 person-months | 0.97 | 0.06
Rate ratio | | 3.0 | 2.9
Table 2. Hypothetical stratification to illustrate confounding by alcohol
consumption of data by Careson et al. (1987) on the effects of NSAIDs
on the risk of UGIB
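
The effect of stratifying by alcohol consumption can be reproduced from the counts in the two tables. The sketch below computes the crude rate ratio and the stratum-specific rate ratios; the dictionary layout and function names are illustrative choices, not part of the original slides.

```python
def rate_per_10000(cases, person_months):
    """Rate of UGIB per 10000 person-months."""
    return cases / person_months * 10_000

def rate_ratio(exposed, unexposed):
    """Ratio of the exposed rate to the unexposed rate.

    Each argument is a (cases, person_months) pair.
    """
    return rate_per_10000(*exposed) / rate_per_10000(*unexposed)

# (cases, person-months) by alcohol stratum, taken from Table 2.
strata = {
    "heavy": {"exposed": (143, 492_000), "unexposed": (95, 982_000)},
    "light": {"exposed": (12, 728_000), "unexposed": (1, 175_000)},
}

def pooled(group):
    """Sum cases and person-months over the strata (gives the Table I totals)."""
    cases = sum(s[group][0] for s in strata.values())
    person_months = sum(s[group][1] for s in strata.values())
    return cases, person_months

crude_rr = rate_ratio(pooled("exposed"), pooled("unexposed"))
heavy_rr = rate_ratio(**strata["heavy"])
light_rr = rate_ratio(**strata["light"])

print(f"crude: {crude_rr:.1f}, heavy: {heavy_rr:.1f}, light: {light_rr:.1f}")
```

The stratum-specific ratios are close to each other and about twice the crude ratio, which is the pattern expected when the stratifying variable confounds the crude comparison.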
Confounding

Characteristics of a Confounding Variable
1. Associated with the disease of interest in the
absence of exposure.
 Risk factor for the study outcome among exposed
group
 Risk factor for the study outcome among non exposed
2. Associated with the study exposure but not as a
consequence of the exposure.
165
Confounding

Effect of Confounding
1. Totally or partially accounts for the apparent
effect
2. Mask an underlying true association
3. Reverse the actual direction of the association
166
Confounding

Remedies to Confounding
1. In the DESIGN, confounding can be minimized by:
 Matching (match cases and controls for gender and age)
 Restriction (limit study to certain groups)
 Randomisation (limit to treatment)
2. In the ANALYSIS
 Stratification
 Standardisation
 Statistical modelling (multivariate analysis)
167
Confounding

 Probability of random error can be quantified
using statistics.
 Bias needs to be prevented by designing the study
properly and
 Confounding can be controlled in either the design
of the study or in its analysis.
 If the three types of errors are excluded,
 then one is left with a True Causal Association
168
Summary

2.5.3. Criteria for the Causal Nature of an
Association
 The “Criteria for the causal nature
of an association” were first put
forth by Sir Austin Bradford Hill
 But have been described in
various forms since,
 Each with some modification.
 Probably the best known
description of them was in the
first Surgeon General’s Report on
Smoking and Health, published in
1964.
169
https://profiles.nlm.nih.gov/NN/B/B/M/Q/_/nnbbmq.pdf

Criteria for the Causal Nature of an Association
1. Coherence with existing information (also called
biological plausibility)
2. Consistency of the association
3. Time sequence
4. Specificity of the association
5. Strength of the association
a. Quantitative strength
b. Dose–response relationship
c. Study design
170

1. Coherence with existing information or biological
plausibility
 Refers to whether the association makes sense,
 in light of other types of information available in the literature.
 These other types of information could include:
 Data from other human studies,
 Data from studies of other related questions,
 Data from animal studies, or
 Data from in vitro studies, as well as
 Scientific or pathophysiologic theory.
 Example: The association between cigarettes and lung
cancer
 cigarette smoke is a known carcinogen
 based on animal data
 Cause cancers of the head and neck, the pancreas, and the bladder
in human
171

2. The Consistency of the Association
 Reproducibility of the association in different
settings which includes
 Different geographic settings,
 Different study designs,
 Different populations
 For example, in the case of cigarettes and lung
cancer, the association has now been reproduced
 In many different studies,
 In different geographic locations,
 Using different study designs.
172

3. Time Sequence
 A cause must precede an effect.
 Although this may seem obvious,
 there are study designs from which this cannot be
determined.
 Example: Cigarette smoking usually precedes the
lung cancer
173

4. Specificity
 Refers to whether the cause ever occurs without the presumed
effect and whether the effect ever occurs without
the presumed cause.
 This criterion is almost never met in biology, with the
occasional exception of infectious diseases.
 It provides extremely strong support for a conclusion
that an association is causal if the criterion is met
174

5. Strength of the Association
5a. Quantitative Strength of an Association
 Refers to the effect size. To evaluate this,
 One asks whether the magnitude of the observed difference
between the two study groups is large.
 A quantitatively large association can only be created by a
causal association or a large error,
 which should be apparent in evaluating the methodology of a
study.
 A quantitatively small association may still be causal,
 but it could be created by a subtle error, which would not be
apparent in evaluating the study.
175

5b. Dose – Response Relationship
 Exists when an increase in the intensity of an exposure
results in an increased risk of the disease under study.
 Equivalent to this is a duration–response relationship
 which exists when a longer exposure causes an increased
risk of the disease.
 The presence of either a dose–response relationship or a
duration–response relationship strongly implies that an
association is, in fact, a causal association.
176

5c. Study Design Used
 It refers to whether the study was well designed,
 whether the study was subject to one of the three errors
namely
 Random error,
 Bias, and
 Confounding.
 And which study design was used in the studies in
question.
177

2.6. Sample Size Consideration for
Pharmacoepidemiology Studies
2.6.1. Background
2.6.2.1. Factors That Affect Sample Size Calculations
2.6.2.2. Sample Size Calculations For Cohort Studies
2.6.2.3. Sample Size Calculations For Case-Control
Studies
2.6.2.4. Sample Size Calculations For Case Series
178

2.6.1. Background
 In premarketing studies
 Sample sizes are typically between 500 and 3000
 To be 95% certain of detecting an ADR that occurs in 6 and 1 per 1000
exposed, respectively
 In post marketing studies
 Larger sample sizes than in premarketing studies are needed
 To detect rarer effects with adequate statistical power
 The requirement for large sample sizes raises
 Logistical obstacles to cost-effective studies
 Thus,
 One needs to know how to calculate the minimum sample size
necessary for a pharmacoepidemiology study
 To avoid the problem of a study with a sample size that is too small

2.6.1. Background
 Research studies are conducted with many different aims in mind.
 A study may be conducted to establish
 The difference, or conversely the similarity
 Between two groups defined in terms of a particular
 Risk factor or
 Treatment regimen.
 Alternatively, it may be conducted to estimate some quantity
 For example, the prevalence of disease
 In a specified population
 With a given degree of precision.
 Regardless of the motivation for the study
 It is essential that the study be of an appropriate size to achieve its aims.
 The most common aim is probably
 That of determining some difference between two groups
180

2.6.1. Background
The difference between two groups in a study will usually
be explored in terms of
 An estimate of effect
 It is the size of the effect to be detected
 Appropriate P value or confidence interval
 The confidence interval indicates
 The likely range of values for the true effect in the population
 The P value indicates
 How likely an effect at least as large as the one observed in the sample would be if chance alone were at work.
 The statistical power of the study
 It is the probability of correctly identifying a difference between the
two groups in the study sample
 When one genuinely exists in the populations from which the
samples were drawn.
181


2.6.2.1. Factors that Affect Sample Size
Calculations
 There are three main factors that must be considered when calculating an
appropriate sample size.
 Once these three factors have been established,
 there are tabulated values and formulae available for calculating the
required sample size for different study designs.
183

2.6.2.2. Sample Size Calculations for
Cohort Studies
 The sample size required for a cohort study depends
on what you are expecting from the study.
 To calculate sample sizes for a cohort study, one
needs to specify five variables
1. Type I error (α)
 considered tolerable, and
 Whether it is one-tailed or two-tailed
2. Type II error (β) considered tolerable
3. Minimum relative risk (RR) to be detected
4. Incidence of the disease in the unexposed control group
5. Ratio of unexposed controls to exposed study subjects
184

Type I error (α)
 The type I error rate that one is willing to tolerate in the study
 α is the probability of concluding
 there is a difference between the groups being compared
 when in fact a difference does not exist.
 Using diagnostic tests as an analogy
 A type I error is a false positive study finding
185

Type II error (β)
 The type II error rate that one is willing to tolerate in the study.
 β is the probability of concluding
 There is no difference between the groups being compared
 When in fact a difference does exist.
 A type II error is the probability of missing a real difference
 Using diagnostic tests as an analogy,
 A type II error is a false negative study finding.
 The complement of β is the power of a study
 The probability of detecting a difference if a difference really exists.
 Power is calculated as (1 − β).
186

 The third variable is the minimum effect size one wants to be able to detect.
 For a cohort study, this is expressed as a relative risk.
 The fourth is the expected incidence of the outcome of interest in the unexposed control
group.
 The fifth is the number of unexposed control subjects to be included in the study for
each exposed study subject.
 A study has the most statistical power for a given number of study
subjects
 If it has the same number of controls as exposed subjects.
 However, sometimes the number of exposed subjects is limited and,
 therefore, inadequate to provide sufficient power to detect a relative risk of
interest.
 In that case, additional power can be gained by increasing the number of
controls alone.
 Doubling the number of controls,
 that is including two controls for each exposed subject, results in a modest increase in the
statistical power, but it does not double it.
 Including three controls for each exposed subject increases the power further.
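
The effect of adding controls can be checked numerically. The sketch below is an illustration only: it assumes a two-tailed normal-approximation test comparing two incidences, and the function name cohort_power and the figures used (1000 exposed subjects, incidence 0.01 in the unexposed, relative risk 2) are hypothetical, not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

def cohort_power(n_exposed, p, R, K=1.0, alpha=0.05):
    """Approximate power of a two-tailed comparison of two incidences.

    n_exposed exposed subjects (incidence p * R) are compared with
    K * n_exposed unexposed controls (incidence p).
    """
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    U = (K * p + p * R) / (K + 1)              # pooled incidence if there is no effect
    se_null = sqrt((1 + 1 / K) * U * (1 - U))  # spread of the difference under no effect
    se_alt = sqrt(p * R * (1 - p * R) + p * (1 - p) / K)  # spread under the true effect
    z = (abs(p * R - p) * sqrt(n_exposed) - z_alpha * se_null) / se_alt
    return nd.cdf(z)

# Hypothetical scenario: the number of exposed subjects is fixed at 1000.
for K in (1, 2, 3):
    print(K, round(cohort_power(1000, p=0.01, R=2, K=K), 2))
```

Under these assumptions the power rises with each additional control per exposed subject, but each step adds less than the one before, and going from one to two controls does not double the power.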
187

Example: Sample Size Formula For Cohort
Study
where
 p is the incidence of the disease in the unexposed,
 R is the minimum relative risk to be detected,
 α is the type I error rate which is acceptable,
 β is the type II error rate which is acceptable,
 Z(1−α) and Z(1−β) refer to the unit normal deviates corresponding to α and β,
 K is the ratio of number of control subjects to the number of exposed subjects, and
 Z(1−α) is replaced by Z(1−α/2) if one is planning to analyze the study using a two-tailed α.
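
The formula itself did not survive on this slide. As a hedged sketch, the code below uses a standard normal-approximation formula for comparing two proportions that is consistent with the symbols listed above:
N = [Z(1−α)·sqrt((1 + 1/K)·U·(1−U)) + Z(1−β)·sqrt(pR(1−pR) + p(1−p)/K)]² / [p(1−R)]², with U = (Kp + pR)/(K + 1).
Here N is the number of exposed subjects and K·N the number of controls. The pooled incidence U and the function name cohort_sample_size are assumptions introduced for this illustration. They are not defined on the slide.

```python
from math import sqrt, ceil
from statistics import NormalDist

def cohort_sample_size(p, R, alpha=0.05, beta=0.20, K=1.0, two_tailed=True):
    """Number of exposed subjects N needed; K * N unexposed controls are also needed."""
    z = NormalDist().inv_cdf                     # unit normal deviate
    z_alpha = z(1 - alpha / 2) if two_tailed else z(1 - alpha)
    z_beta = z(1 - beta)
    U = (K * p + p * R) / (K + 1)                # assumed pooled incidence
    null_term = z_alpha * sqrt((1 + 1 / K) * U * (1 - U))
    alt_term = z_beta * sqrt(p * R * (1 - p * R) + p * (1 - p) / K)
    return ceil((null_term + alt_term) ** 2 / (p * (1 - R)) ** 2)

# Hypothetical inputs: incidence 0.01 in the unexposed, relative risk 2,
# two-tailed alpha 0.05, beta 0.10 (90% power), one control per exposed subject.
print(cohort_sample_size(p=0.01, R=2, alpha=0.05, beta=0.10, K=1))
```

With these hypothetical inputs the result is on the order of 3000 exposed subjects. A smaller relative risk or a rarer outcome pushes the requirement up sharply.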
188

2.6.2.3. Sample Size Calculations for
Case-Control Studies
 The approach to calculating sample sizes for case–
control studies is similar to the approach for cohort
studies.
 There are five variables that need to be specified
1. Type I error (α) considered tolerable, and whether it is
one-tailed or two-tailed
2. Type II error (β) considered tolerable
3. Minimum odds ratio to be detected
4. Prevalence of the exposure in the un-diseased control
group
5. Ratio of un-diseased controls to diseased study subjects
189

 In a case–control study
 one selects subjects based on the presence or absence of the disease of
interest,
 And then investigates the prevalence of the exposure of interest in each
study group.
 This is in contrast to a cohort study,
 In which one selects subjects based on the presence or absence of an
exposure,
 And then studies whether or not the disease of interest develops in each
group.
 Therefore, the fourth variable to be specified for a case–control
study is
 The expected prevalence of the exposure in the un-diseased control
group,
 Rather than the incidence of the disease of interest in the unexposed
control group of a cohort study.
190

Example: Sample Size Formula For Case–
control Study
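
The formula for this slide is missing from the text. As a hedged sketch, the code below uses a standard normal-approximation formula that matches the five variables listed for case–control studies:
N = [Z(1−α)·sqrt((1 + 1/K)·U·(1−U)) + Z(1−β)·sqrt(p(1−p)/K + V(1−V))]² / (p − V)², with V = pR / (1 + p(R − 1)) and U = (Kp + V)/(K + 1).
Here N is the number of cases, K·N the number of controls, p the prevalence of exposure in the controls, and R the minimum odds ratio to be detected. V (the exposure prevalence expected among cases), U (the pooled exposure prevalence), and the function name case_control_sample_size are assumptions introduced for this illustration.

```python
from math import sqrt, ceil
from statistics import NormalDist

def case_control_sample_size(p, R, alpha=0.05, beta=0.20, K=1.0, two_tailed=True):
    """Number of cases N needed; K * N un-diseased controls are also needed."""
    z = NormalDist().inv_cdf                     # unit normal deviate
    z_alpha = z(1 - alpha / 2) if two_tailed else z(1 - alpha)
    z_beta = z(1 - beta)
    V = p * R / (1 + p * (R - 1))                # exposure prevalence implied among cases
    U = (K * p + V) / (K + 1)                    # assumed pooled exposure prevalence
    null_term = z_alpha * sqrt((1 + 1 / K) * U * (1 - U))
    alt_term = z_beta * sqrt(p * (1 - p) / K + V * (1 - V))
    return ceil((null_term + alt_term) ** 2 / (p - V) ** 2)

# Hypothetical inputs: 10% of controls exposed, odds ratio 2,
# two-tailed alpha 0.05, beta 0.20 (80% power), one control per case.
print(case_control_sample_size(p=0.10, R=2, alpha=0.05, beta=0.20, K=1))
```

With these hypothetical inputs the requirement is a few hundred cases, far fewer subjects than the cohort design needs for a rare disease, because the calculation depends on the prevalence of exposure and not on the incidence of the disease.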
191

NB. Sample size determinations in cohort and
Case–control studies
 Assume one is able to obtain information
 on each of the five variables that factor into these sample size
calculations.
 This may seem unrealistic, but in fact
 Four of the variables are entirely under the control of the investigator,
subject to his or her specification:
 α, β, the ratio of control subjects to study subjects, and the
minimum relative risk (or odds ratio) to be detected.
 Only one of the variables requires data derived from other sources.
 For Cohort Studies,
 This is the expected incidence of the disease in the unexposed control
group.
 For Case–Control Studies,
 This is the expected prevalence of the exposure in the un-diseased
control group.
192

2.6.2.4. Sample size calculations for
Case series
 Case series are usually used in pharmacoepidemiology
 to better quantify the incidence of a particular disease in
patients exposed to a newly marketed drug.
Example 1:
 In the “Phase IV” postmarketing drug surveillance study
conducted for prazosin, the investigators collected a case series
of 10 000 newly exposed subjects recruited through the
manufacturer’s sales force, to better quantify the incidence of
first-dose syncope, which was a well-recognized adverse effect of
this drug.
193

 Case series are usually used to determine
 Whether a disease occurs more frequently than some
predetermined incidence in exposed patients.
 Most often, the predetermined incidence of interest is
zero.
 And one is looking for any occurrences of an extremely
rare illness.
194

Example 2:
 When CIMETIDINE was first marketed,
 there was a concern over whether it could cause agranulocytosis,
 since it was closely related chemically to METIAMIDE, another H-2
blocker, which had been removed from the market in Europe
because it caused agranulocytosis.
 A postmarketing study of cimetidine therefore enrolled 10 000 subjects.
 It found only two cases of neutropenia, one in a patient also
receiving chemotherapy.
 There were no cases of agranulocytosis.
195

Poisson Distribution Method
 To establish drug safety, a study must include a sufficient
number of subjects to detect an elevated incidence of a
disease, if it exists.
 Generally, this is calculated by assuming the frequency of the
event in question is vanishingly small, so that the occurrence of
the event follows a Poisson distribution, and then one generally
calculates 95% confidence intervals around the observed
results.
196
P(X = x) = e^(−μ) · μ^x / x!
μ = the mean number of successes (occurrences of an event) in a
specified period of time, μ = λ × t, where λ is the event rate and t is the length of the period
X = the actual number of successes that occur in a specified period of time.

 In order to apply the Poisson distribution, one first calculates
the incidence rate observed from the study’s results
IR observed = (no. of subjects who develop the disease during the specified time interval) / (total no. of individuals in the population at risk)
 For example,
 If three cases of liver disease were observed in a population of
1000 patients exposed to a new nonsteroidal anti-inflammatory
drug during a specified period of time, the incidence would be
0.003.
 The number of subjects who develop the disease is the “Observed
number on which estimate is based (n)” in Table A17. In this
example, it is 3.
197

 The lower boundary of the 95% confidence interval for the
incidence rate is then the corresponding “Lower limit
factor (L)” multiplied by the observed incidence rate.
 In the example above, it would be 0.206 × 0.003 = 0.000618.
 Analogously, the upper boundary would be the
corresponding “Upper limit factor (U)” multiplied by the
observed incidence rate.
 In the above example, this would be 2.92 × 0.003 = 0.00876.
 In other words, the incidence rate (95% confidence interval)
would be 0.003 (0.000618 to 0.00876).
 Thus, the best estimate of the incidence rate would be 30
per 10 000, and one can be 95% confident that it lies between
6.18 per 10 000 and 87.6 per 10 000.
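
Table A17 is not reproduced in these slides. As a rough sketch, the limit factors can be recomputed under the assumption that they are the exact two-sided Poisson confidence limits for the observed count n, divided by n. The function names below are hypothetical and only the Python standard library is used.

```python
from math import exp, sqrt

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summed term by term."""
    term = exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total

def _solve_decreasing(g, lo, hi, iterations=100):
    """Bisection for a decreasing function g with g(lo) > 0 > g(hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def poisson_limits(n, confidence=0.95):
    """Exact two-sided confidence limits for the mean, given an observed count n."""
    tail = (1 - confidence) / 2
    hi = n + 10 * sqrt(n + 1) + 20               # generous upper bracket for the search
    upper = _solve_decreasing(lambda lam: poisson_cdf(n, lam) - tail, 0.0, hi)
    if n == 0:
        return 0.0, upper
    lower = _solve_decreasing(lambda lam: poisson_cdf(n - 1, lam) - (1 - tail), 0.0, hi)
    return lower, upper

# The slide's example: 3 cases among 1000 exposed patients.
n, population = 3, 1000
lower, upper = poisson_limits(n)
print("limit factors:", lower / n, upper / n)
print("95% CI for the incidence:", lower / population, upper / population)
```

For n = 3 the computed factors agree closely with the 0.206 and 2.92 used in the example above, which supports the assumption about how the table was built.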
198

“Rule of threes” Method
 Simple guide which is useful in the common
situation
 where no events of a particular kind are observed.
 Specifically, if no events of a particular type are
observed in a study of X individuals,
 Then one can be 95% certain that the event occurs no
more often than 3/X.
Upper 95% limit of the event rate = 3/X
200

“Rule of threes” Method
 Example,
 If 500 patients are studied prior to marketing a drug, then one can
be 95% certain that any event which does not occur in any of those
patients may occur with a frequency of 3 or less in 500 exposed
subjects, or that it has an incidence rate of less than 0.006.
 If 3000 subjects are exposed prior to drug marketing, then one can
be 95% certain that any event which does not occur in this
population may occur no more than 3 in 3000 subjects, or the
events have an incidence rate of less than 0.001.
 If 10 000 subjects are studied in a post-marketing drug surveillance
study, then one can be 95% certain that any events which are not
observed may occur no more than 3 in 10 000 exposed individuals,
or that they have an incidence rate of less than 0.0003.
 In other words, events not detected in the study may occur less
often than 1 in 3333 subjects.
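
The rule follows from a short calculation. If the true risk of the event is p, the chance of seeing no events among X independent subjects is (1 − p)^X. Setting this equal to 0.05 gives p = 1 − 0.05^(1/X), which is approximately −ln(0.05)/X, or about 3/X, because −ln(0.05) is about 2.996. The sketch below compares the exact value with 3/X for the three study sizes in the example. The function name is hypothetical, and the calculation assumes independent subjects with the same risk.

```python
def zero_event_upper_limit(n_subjects, confidence=0.95):
    """Largest event risk still compatible, at the given confidence,
    with observing zero events among n_subjects independent subjects."""
    return 1 - (1 - confidence) ** (1 / n_subjects)

for n in (500, 3000, 10_000):
    exact = zero_event_upper_limit(n)
    print(n, exact, 3 / n)
```

The exact limit is always slightly below 3/X, so the rule of threes is a marginally conservative shortcut.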
201
