Case Control Study
Dr. A.P. Kulkarni
BSc; MD; DPH; PhD; FIAPSM
drapkulkarni@gmail.com
Indications
• The research question (RQ) is NOT of the descriptive type, but of the
association type
• Hypothesis to be tested exists and is
based on previous studies
• Disease is rare
• Exposure-disease interval is long **
• Association of the disease with more than one
exposure / cause is to be tested **
** Not mandatory
RQ in Case control study
• Is there an association between smoking
and lung cancer?
• Are obese persons at greater risk of
hypertension than non-obese persons?
• Is the risk of diabetes higher in women
with abnormal GTT during pregnancy than
in women with normal GTT during
pregnancy?
Some important discoveries made in case control studies
1950s
• Cigarette smoking and lung cancer
1970s
• Diethylstilbestrol and vaginal adenocarcinoma
• Post-menopausal estrogens and endometrial cancer
1980s
• Aspirin and Reye's syndrome
• Tampon use and toxic shock syndrome
• L-tryptophan and eosinophilia-myalgia syndrome
• AIDS and sexual practices
1990s
• Diet and breast cancer
Study Design
• Cases: Individuals with the disease in question
• Controls: Individuals without the disease
Timeline: Exposure -> Disease -> Study starts
Case control studies are therefore backward-looking studies
Scheme
[Diagram: from the STUDY BASE, a SAMPLE of CASES and a SAMPLE of CONTROLS are drawn; each sample is divided into Exposed and Unexposed]
Steps
• Selection of cases
• Selection of controls
• Matching
• Measurement of exposure in cases &
controls
• Analysis
• Reporting
Step-1: Selection of Cases
• Diagnostic criteria clearly defined
• Inclusion / exclusion criteria for cases
• Based on history, examination, investigations
• New or old cases? Usually new
Sources of Cases
• Hospital
• Workplace
• Population
Step-2: Selection of Controls
• Do not have disease / problem under study
• May have other disease / problem
Sources of Controls
• Hospital
• Relatives
• Neighbors
• Occupational associates
• General population
• Inclusion / exclusion criteria for controls
• Exclude persons with conditions known to be associated
with the exposure factor under study
Step-3: Matching
Care is taken to avoid unequal distribution of
key variables such as age and gender
• Selection from same source
• Selection from same groups
• Selection after cases
Variables under study are never matched
Individual matching:
• If there is a male case aged 50-54, there will
be a male control aged 50-54; if there is a
female case aged 30-34, there will be a
female control aged 30-34
• Requires a matched design in the analysis
• One has to wait until the required controls are
obtained for ALL cases
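The individual matching rule above can be sketched in a few lines of Python. This is only an illustration, not a prescribed procedure: the 5-year age bands follow the 50-54 and 30-34 example, while the record layout (`id`, `sex`, `age`), the function names and the sample data are all hypothetical.

```python
def age_band(age, width=5):
    """Return the (low, high) age band containing `age`, e.g. 52 -> (50, 54)."""
    low = (age // width) * width
    return (low, low + width - 1)


def match_controls(cases, pool):
    """Pair each case with one unused control of the same sex and age band.

    Returns (pairs, unmatched_cases). Records are hypothetical dicts with
    'id', 'sex' and 'age' keys.
    """
    available = list(pool)
    pairs, unmatched = [], []
    for case in cases:
        key = (case["sex"], age_band(case["age"]))
        match = next(
            (c for c in available if (c["sex"], age_band(c["age"])) == key),
            None,
        )
        if match is None:
            unmatched.append(case)  # an exact match may not be found
        else:
            available.remove(match)  # each control is used only once
            pairs.append((case, match))
    return pairs, unmatched


# Hypothetical data for illustration only
cases = [{"id": "K1", "sex": "M", "age": 52}, {"id": "K2", "sex": "F", "age": 31}]
pool = [
    {"id": "C1", "sex": "F", "age": 33},
    {"id": "C2", "sex": "M", "age": 50},
    {"id": "C3", "sex": "M", "age": 61},
]
pairs, unmatched = match_controls(cases, pool)
for case, control in pairs:
    print(case["id"], "matched with", control["id"])
```

Cases left in `unmatched` correspond to the situation where one has to wait until suitable controls become available.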
Disadvantages of Matching
• Time consuming
• Costly
• Exact match may not be found
• Once a factor is matched, its association with the disease cannot be analysed
• If you match for neighborhood, you may ALSO
be matching for socioeconomic status
Matching: Precautions
• If matching factor is associated with disease but
not exposure, matched analysis will be
statistically less efficient
• If matching factor is associated with exposure
but not disease, we must use matched analysis,
otherwise odds ratio will be biased towards null
in unmatched analysis.
• Do not match unless the matching variable is known
to be associated with both disease and exposure.
Step-4: Measurement of Exposure
• Yes / No type (Smoking)
• Frequency (Blood donations received, X-rays done)
• Magnitude of exposure (e.g. distance from factory)
• Duration (Smoking, service duration in an occupation)
Sources of Data on Exposure
• Questionnaire
• Physical examination
• Lab investigations
• Records
Step-5: Analysis: Unmatched & Frequency matched
• Check comparability of cases and controls for
key variables (other than the ones being
studied). If comparable ->
• Odds Ratio
Exposure   Cases   Controls
Yes        a       b
No         c       d
O.R. = (a × d) / (b × c)
Calculation of OR: Unmatched
Exposure   Cases   Controls   Total
Yes        42      24         66
No         28      46         74
Total      70      70         140
OR = (42 × 46) / (28 × 24) = 2.875
The odds of exposure among cases are about 2.9 times
the odds of exposure among controls.
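The same calculation can be reproduced with a short Python sketch. The odds ratio uses the cross-product formula from the slide; the 95% confidence interval uses Woolf's log-based method, which is an addition not shown in the slides and is only one of several possible methods. The function name is illustrative.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-based) confidence interval.

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    z = 1.96 gives an approximate 95% interval.
    """
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log)
    upper = math.exp(math.log(or_) + z * se_log)
    return or_, lower, upper


or_, lower, upper = odds_ratio(42, 24, 28, 46)
print(f"OR = {or_:.3f}, 95% CI {lower:.2f} to {upper:.2f}")
```

For this table the interval lies entirely above 1, which agrees with reading the association as positive.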
OR in Matched Case Control Study
                     Cases exposed
Controls exposed     Yes      No       Total
Yes                  a        b        a + b
No                   c        d        c + d
Total                a + c    b + d    n
a & d are concordant pairs
b & c are discordant pairs
OR = c / b: Also called the "Discordant Ratio"
OR in Matched Case Control Study
Exposure: Oral use of conjugated estrogen
Case: Cervical cancer
                     Cases exposed
Controls exposed     Yes      No       Total
Yes                  12       7        19
No                   43       121      164
Total                55       128      183
FOR THIS EXAMPLE
OR = 43 / 7 = 6.14
Chi-square (McNemar) = (|43 - 7| - 1)² ÷ (43 + 7) = 35² ÷ 50 = 24.5
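As a check on the matched-pair arithmetic, here is a minimal Python sketch. Only the discordant pairs enter the calculation: b = 7 pairs in which only the control was exposed, and c = 43 pairs in which only the case was exposed. The chi-square uses the continuity-corrected McNemar formula written on the slide; the function names are illustrative.

```python
def matched_odds_ratio(b, c):
    """Matched-pair OR: discordant pairs with only the case exposed (c)
    divided by discordant pairs with only the control exposed (b)."""
    return c / b


def mcnemar_chi_square(b, c):
    """McNemar chi-square with continuity correction (1 degree of freedom)."""
    return (abs(c - b) - 1) ** 2 / (b + c)


or_matched = matched_odds_ratio(b=7, c=43)
chi_sq = mcnemar_chi_square(b=7, c=43)
print(f"OR = {or_matched:.2f}, McNemar chi-square = {chi_sq:.1f}")
# prints: OR = 6.14, McNemar chi-square = 24.5
```

Since (43 - 7 - 1)² = 1225 and 1225 / 50 = 24.5, the statistic is far above 3.84, the 5% critical value for 1 degree of freedom.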
Interpretation of Odds Ratio
Theoretical Range: 0 to ∞
Point estimate   Lower value   Higher value   Interpretation
0.4              0.21          0.89           Negative association, significant
0.8              0.6           3.8            Negative, not significant
2.1              0.8           6.1            Positive, not significant
4.3              2.7           11.2           Positive, significant
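The reading rule behind this table can be written as a small Python sketch. It assumes that the lower and higher values are the limits of a confidence interval, and that an interval which excludes 1 is read as significant; the function name and the output wording are illustrative.

```python
def interpret_or(point, lower, upper):
    """Classify an odds ratio from its point estimate and interval limits.

    Direction comes from the point estimate (above or below 1);
    significance from whether the interval excludes 1.
    """
    if point > 1:
        direction = "Positive"
    elif point < 1:
        direction = "Negative"
    else:
        direction = "No association"
    significant = lower > 1 or upper < 1
    return f"{direction}, {'significant' if significant else 'not significant'}"


rows = [(0.4, 0.21, 0.89), (0.8, 0.6, 3.8), (2.1, 0.8, 6.1), (4.3, 2.7, 11.2)]
for point, lower, upper in rows:
    print(point, lower, upper, "->", interpret_or(point, lower, upper))
```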
Reporting of Case-control study
• STROBE guidelines
• Check-list, Flow charts
Cohort Study
Cohort Study Design
• Exposed: Individuals with exposure to the
risk factor under study
• Controls: Individuals not exposed
Timeline: Exposure -> Study starts -> Disease
Cohort studies are therefore forward-looking studies
Cohort: Design
[Diagram: the POPULATION is divided into EXPOSED and UNEXPOSED groups; both are followed up to see who develops the disease]
Indications
• The research question (RQ) is NOT of the descriptive type, but of the
association type
• Hypothesis to be tested exists and is
based on previous studies
• Disease is fairly common
• Exposure-disease interval is short
• Association of the exposure with more than
one outcome is to be tested **
** Not mandatory
Steps in a typical cohort study
1. Selection of source population: Co-operation
2. Measurement of exposure: Valid method
   Nominal variable: Yes / No, categories
   Ratio variable
4. Follow-up for outcome: Diagnosis of outcome.
   Standard test
5. Analysis
6. Reporting: STROBE guidelines
Calculations
Group        Disease   No disease   Total
Exposed      12        488          500
Un-exposed   4         496          500
Total        16        984          1000
Incidence in exposed = IE = 12 × 1000 / 500 = 24 per 1000
Incidence in unexposed = IU = 4 × 1000 / 500 = 8 per 1000
Incidence can also be calculated by years of exposure
(incidence in 1st year of exposure, 2nd year, etc.)
Relative Risk = RR = IE / IU = 24 / 8 = 3.0
95% CI of RR (software)
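The slide leaves the confidence interval to software. As one possible sketch, the Python below recomputes the incidences and the relative risk, and adds a 95% CI using the log-based (Katz) standard error. That method is an assumption chosen for illustration, not necessarily what any particular package reports, and the function name is hypothetical.

```python
import math


def relative_risk(cases_exp, n_exp, cases_unexp, n_unexp, z=1.96):
    """Relative risk with a log-based (Katz) confidence interval."""
    ie = cases_exp / n_exp        # incidence in exposed
    iu = cases_unexp / n_unexp    # incidence in unexposed
    rr = ie / iu
    se_log = math.sqrt(
        1 / cases_exp - 1 / n_exp + 1 / cases_unexp - 1 / n_unexp
    )
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return ie, iu, rr, lower, upper


ie, iu, rr, lower, upper = relative_risk(12, 500, 4, 500)
print(f"IE = {ie * 1000:.0f} per 1000, IU = {iu * 1000:.0f} per 1000")
print(f"RR = {rr:.1f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With only 16 events in total, this interval is wide and its lower limit falls just below 1, so the point estimate of 3.0 on its own overstates how firmly these data establish the association.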
Comparison of CC & Cohort Study
Particulars             Case control                                     Cohort
Study starts            After disease & exposure                         Before disease
Useful in               Rare disease                                     Rare exposure
Useful if               Exposure-effect duration long                    Exposure-effect duration short
Can test                Association of disease & many exposure factors   Association of exposure and many diseases
Cost                    Cheaper                                          Costly
Bias most likely        Recall                                           Follow up
Incidence calculation   Not possible                                     Possible
Special issues in cohort studies
Recruitment
• All participants recruited simultaneously or over a very short
period. This is very rare. Here the "entry time"
of each participant is almost the same.
• Recruitment over a period of time, as and
when participants come. Here the "entry time"
of individuals differs.
Terminologies
Follow-up & Exit
A. Censored: Participants are followed up
for a fixed duration only, so not all subjects
experience the endpoint of interest. Such
observations are called "censored".
B. Un-censored: Follow-up continues until
development of the disease in question or
death. A rare method. These observations are
called "un-censored".
Outcomes possible
A. Alive at end
   • Developed the disease under study
   • Did not develop the disease under study
B. Dead during study period
   • Cause: Disease under study
   • Cause: Not the disease under study
C. Lost to follow-up, outcome unknown
Special issues in cohort studies
Follow-up
• Quality of follow-up
• Uniformity of follow-up
• Loss to follow-up
Reversal / Contamination
• Exposed turning non-exposed: e.g. quits smoking
• Non-exposed turning exposed: e.g. starts smoking
Diagnostic improvements during the study
Concept of Person-time
• Variable period of follow-up
• Concept of person-time (Person-years)
[Diagram: four participants followed for 3, 5, 4 and 2 person-years respectively]
Total: 3 + 5 + 4 + 2 = 14 person-years
1 case in 14 person-years
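The person-time arithmetic can be sketched in Python as follows. The four follow-up durations and the single case come from the slide; expressing the rate per 1000 person-years is an added convenience, and the variable names are illustrative.

```python
# Follow-up contributed by each participant, in years (from the slide)
follow_up_years = [3, 5, 4, 2]
cases = 1

person_years = sum(follow_up_years)   # total person-time at risk
rate = cases / person_years           # incidence rate per person-year
rate_per_1000 = rate * 1000

print(f"{cases} case in {person_years} person-years "
      f"= {rate_per_1000:.1f} per 1000 person-years")
# prints: 1 case in 14 person-years = 71.4 per 1000 person-years
```

Using person-time as the denominator lets participants with different lengths of follow-up contribute in proportion to the time they were actually observed.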
Nested Case Control Study
What is Nested Case Control Study?
• By-product of a cohort study
• A nested case-control study is conducted
within a defined cohort in which exposure
data and population characteristics are
available to some extent, often from the
time of enrollment into the cohort
• It draws its cases and controls from a
cohort population that has been followed
for some period of time.
Concept of Nested Case-Control
Example
• Cohort Study for risk of breast cancer &
Coronary Heart Disease
• Cohort of 90,000 women
• The women provide baseline information
on a host of exposures, and they also
provide baseline blood and urine samples
that are frozen for possible future use
• The women are then followed for about
eight years; 1,439 develop breast cancer
Example…. continued
• Investigators now want to test the
hypothesis that past exposure to
pesticides such as DDT is a risk factor for
breast cancer.
• Exposure to DDT can be detected by
examining frozen blood samples
Example…..Continued
• Since they froze blood samples at
baseline, they have the option of analyzing
all of the blood samples in order to
ascertain exposure to DDT at the
beginning of the study before any cancers
occurred.
• The problem is that there are almost
90,000 women and it would cost $20 to
analyze each of the blood samples.
Example…. Continued
Economical option:
• Examine stored blood samples of all
those who developed breast cancer (Cases)
• Examine stored blood samples of only selected
participants from those who did not
develop breast cancer (Controls)
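A rough cost comparison makes the saving concrete. The cohort size, the number of cases and the $20 cost per sample come from the example; the ratio of two controls per case is purely an assumption made for illustration, since the slides do not state how many controls were selected.

```python
cohort_size = 90_000       # "almost 90,000" women in the cohort
cost_per_sample = 20       # dollars per blood sample analysed
n_cases = 1_439            # women who developed breast cancer
controls_per_case = 2      # ASSUMPTION: not given in the example

full_cohort_cost = cohort_size * cost_per_sample
n_controls = n_cases * controls_per_case
nested_cost = (n_cases + n_controls) * cost_per_sample

print(f"Analysing every sample: ${full_cohort_cost:,}")
print(f"Nested case-control:    ${nested_cost:,}")
# prints: Analysing every sample: $1,800,000
# prints: Nested case-control:    $86,340
```

Changing `controls_per_case` shows how the cost scales with the number of controls chosen.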
Advantages
• Utilize the exposure and confounder data
originally collected before the onset of the
disease, thus reducing potential recall bias
and temporal ambiguity
• Include cases and controls drawn from the
same cohort, decreasing the likelihood of
selection bias.
Concerns
• A concern, usually minor, is that the
remaining non-diseased persons from
whom the controls are selected when it is
decided to do the nested study, may not
be fully representative of the original
cohort due to death or losses to follow-up.
Sampling issues
Sampling in nested study
Controls can be selected from a pool of:
• All those who had not developed the disease by
the time the nested study started
• Persons who did not have breast
cancer at the time when the case was
diagnosed, irrespective of final
outcome (Period matching)
• Controls selected based on duration of
preservation of blood (Batch-matching)
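The second option (period matching) can be sketched in Python as sampling from the "risk set" of each case. This is a simplified illustration: the record layout (`id`, `exit_time`, `case`), the function name, the two controls per case and the six-person cohort are all hypothetical.

```python
import random


def sample_risk_sets(cohort, controls_per_case=2, seed=0):
    """For each case, sample controls from those still disease-free and
    under follow-up at the time the case was diagnosed.

    cohort: list of dicts with 'id', 'exit_time' (years from entry to
    diagnosis, loss to follow-up or end of study) and 'case' (bool).
    """
    rng = random.Random(seed)
    sampled = {}
    for person in cohort:
        if not person["case"]:
            continue
        t = person["exit_time"]  # time of diagnosis for this case
        # Eligible irrespective of final outcome: a later case may serve
        # as a control for an earlier case.
        risk_set = [
            p["id"] for p in cohort
            if p["id"] != person["id"] and p["exit_time"] > t
        ]
        k = min(controls_per_case, len(risk_set))
        sampled[person["id"]] = rng.sample(risk_set, k)
    return sampled


# Hypothetical cohort for illustration only
cohort = [
    {"id": 1, "exit_time": 3.0, "case": True},
    {"id": 2, "exit_time": 8.0, "case": False},
    {"id": 3, "exit_time": 6.0, "case": True},
    {"id": 4, "exit_time": 8.0, "case": False},
    {"id": 5, "exit_time": 2.0, "case": False},  # lost to follow-up at 2 years
    {"id": 6, "exit_time": 8.0, "case": False},
]
sampled = sample_risk_sets(cohort)
for case_id, control_ids in sampled.items():
    print("case", case_id, "-> controls", control_ids)
```

Participant 5, who was lost before either diagnosis, is never eligible, while participant 3 can be drawn as a control for case 1 despite becoming a case later.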
Matching in Nested Study
Matching can be done for
• Age and gender
• Selected confounders such as ethnicity
