Case control study


Epidemiological studies and details of case control study

Published in: Health & Medicine, Business

  1. 1. Seminar Presentation by: Dr. Timiresh Kumar Das Moderator: Dr. D. K. Raut, Director Professor, Dept. of Community Medicine, VMMC & Safdarjung Hospital
  2. 2. Outline:
     • Epidemiological study cycle
     • Analytical studies: types
     • Case control vs cohort
     • Case control study: definitions, history, design, outcomes, limitations, advantages and applications
     • Nested case control studies
     • Selected examples of case control studies
  3. 3. Epidemiological study cycle: the sequence of events that starts with a description of a disease or health-related event in relation to time, place and person; searches for and finds differences in occurrence between populations; formulates hypotheses regarding possible causative factors; tests them; and analyses the results. The results may lead to further descriptive studies or new hypotheses.
  4. 4. The study cycle applied to smoking and Ca lung:
     • DESCRIPTIVE STUDY: Ca lung is increasing, mostly in smokers; death rates are higher in populations with higher per capita cigarette consumption. Hypothesis: smoking causes Ca lung.
     • CASE CONTROL STUDY: compares Ca lung patients and non-patients; clarifies whether it was smokers who contributed to the high Ca lung rates.
     • COHORT STUDY: follows a cohort of smokers and non-smokers without Ca lung; smokers develop Ca lung more frequently.
     • INTERVENTIONAL TRIAL (RCT): proves the hypothesis conclusively; gives inputs regarding other factors and control measures.
     • Studies cited on the slide: Ochsner, 1939; Doll, 1947-52; Hill, 1951-61.
  5. 5. Analytical studies: types
     • Observational: case control (retrospective) studies and cohort (prospective) studies. The difference between study groups is ONLY observed and analysed, NOT created experimentally.
     • Experimental (interventional): animal experiments and human studies (therapeutic trials, preventive trials). The difference between study groups is CREATED EXPERIMENTALLY and the outcomes are observed.
  6. 6. Purpose: to produce a valid estimate of a hypothesised cause-effect relationship between a suspected risk factor and disease.

     | Case control study | Cohort study |
     |---|---|
     | Starts with diseased (cases) and not diseased (controls) | Starts with people who are not diseased, some exposed and some not exposed |
     | Determines whether the 2 groups differ in exposure to a specific factor or factors | Groups are followed up to determine the difference in rates at which disease develops in relation to exposure |
     | So called because of the way in which the study group is assembled | So called because of the use of a "cohort" (a group of people who share a common characteristic or experience) |
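  7. 7. A fourfold table (Mausner, 1985):

     | Exposure | Disease present (cases) | Disease absent (controls) | Total |
     |---|---|---|---|
     | Present (exposed) | a | b | a + b |
     | Absent (not exposed) | c | d | c + d |
     | Total | a + c | b + d | |

     A retrospective (case-control) study starts from the disease columns and looks back at exposure; a prospective (cohort) study starts from the exposure rows and follows subjects forward to disease.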
  8. 8. Case control studies compared with cohort studies:

     | Case control studies | Cohort studies |
     |---|---|
     | Proceeds from effect to cause | Proceeds from cause to effect |
     | Starts with the disease | Starts with people exposed to the risk factor or suspected cause |
     | Tests whether the suspected cause occurs more frequently in those with disease than in those without disease | Tests whether disease occurs more frequently in those exposed than in those not exposed |
     | Usually the first approach to the testing of a hypothesis, but also useful for exploratory studies | Reserved for the testing of precisely formulated hypotheses |
     | Involves fewer study subjects | Involves a larger number of subjects |
     | Yields results relatively quickly | Long follow-up, delayed results |
     | Suitable for study of rare diseases | Inappropriate when disease or exposure under investigation is rare |
     | Generally yields only an estimate of relative risk (odds ratio) | Yields incidence rates, relative risk, attributable risk |
     | Cannot yield information about diseases other than that under study | Can give information about more than one disease outcome |
     | Relatively inexpensive | Expensive |
  9. 9. Case control study synonyms: case comparison study, case compeer study, case history study, case referent study, retrospective study.
     Case control study definition: The observational epidemiologic study of persons with the disease (or other outcome variable) of interest and a suitable control (comparison/reference) group of persons without the disease. (Dictionary of Epidemiology: 3rd ed; John M Last. 2000)
  10. 10. Case control study definitions:
      • A study that compares two groups of people: those with the disease or condition under study (cases) and a very similar group of people who do not have the disease or condition (controls). (National Institutes of Health, USA)
      • A case control study involves two populations, cases and controls, and has three distinct features: (1) both exposure and outcome have occurred before the start of the study; (2) the study proceeds backwards from effect to cause; (3) it uses a control or comparison group to support or refute an inference. (Park’s Textbook of Preventive and Social Medicine – 20th ed; K. Park. 2009)
  11. 11. Case: a person in the population or study group identified as having the particular disease, health disorder or condition under investigation. (Dictionary of Epidemiology: 3rd ed; John M Last. 2000)
      Control: person or persons in a comparison group that differs from the cases in disease experience (or other health-related outcome) in not having the outcome being studied. (Dictionary of Epidemiology: 3rd ed; John M Last. 2000)
  12. 12. Bias: any systematic error in the design, conduct, or analysis of a study that results in mistaken estimates of the effect of the exposure on disease.
      Confounding: when a measure of the effect of an exposure on risk is distorted because of the association of the exposure with other factors that influence the outcome. It creates data in which it is not possible to separate the contribution that any single causal factor has made to an effect.
  13. 13. History of the case control study:
      • PCA Louis (1788-1875): numerical method; William Augustus Guy (1843); Baker (1862): case control comparisons of marriage and fertility in breast cancer
      • Lane-Claypon's breast cancer study, 1926: early beginnings
      • Lung cancer and smoking, 1950: establishment and acceptance
  14. 14. Six essential elements which developed separately over time in medical history:
      1. Idea of the case
      2. Interest in disease etiology and prevention
      3. Focus on individual, as opposed to group, etiologies
      4. Anamnesis, or history taking from patients
      5. Grouping individual cases together into series
      6. Making comparisons of the differences between groups, in order to elicit average risk at the level of the individual
  15. 15. Early history:
      • The concept is found in the works of the Parisian physician PCA Louis (1788-1875): the “numerical method”, a technique whose principal tool was the tabulation of aggregated data about patients with similar pathologic and clinical findings.
      • First explicit description by William Augustus Guy (1843): analysis of the relationship between prior occupational exposure and the occurrence of pulmonary consumption.
      • Baker (1862): case control comparisons of marriage and fertility in breast cancer patients.
  16. 16. Lane-Claypon’s breast cancer study, 1926: ‘‘A further report on cancer of the breast: reports on public health and medical subjects.’’ (Lane-Claypon 1926a).
      • 500 hospitalised cases and 500 controls with noncancerous illnesses
      • 22% lower fertility in the case group
      1950: four studies implicating cigarette smoking in cancer of the lung, published in 1950 in the United States (Levin et al 1950; Wynder & Graham 1950; Schrek et al. 1950) and in Britain (Doll & Hill 1950), established several features of the modern form of the case-control study.
      • Doll & Hill’s study is perhaps the best known in history.
  17. 17. The investigator selects cases with the disease and appropriate controls without the disease and obtains data regarding past exposure to possible etiologic factors in both groups. The investigator then compares the frequency of exposure of the two groups.
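  18. 18. Design: those with disease (“CASES”) are classified as exposed or not exposed; those with no disease (“CONTROLS”) are classified as exposed or not exposed.
      Hallmark of case control study: starts from cases and controls and searches for exposure.
  19. 19. FIRST: select cases (with disease) and controls (without disease). THEN: measure exposure.

      | | CASES (with disease) | CONTROLS (without disease) |
      |---|---|---|
      | Were exposed | a | b |
      | Were not exposed | c | d |
      | Totals | a + c | b + d |
      | Proportion exposed | a / (a + c) | b / (b + d) |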
  20. 20. Selection of CASES: 1. Representativeness
      • Ideally, cases are a random sample of all cases of interest in the source population (e.g. from vital data, registry data). More commonly they are a selection of available cases from a medical care facility (e.g. from hospitals, clinics).
      • Information can be collected from the cases themselves, from a respondent by proxy (relative/friend), from records, or from a combination of these.
  21. 21. Selection of CASES: 2. Method of selection. Selection may be from incident or prevalent cases:
      • Incident cases are those derived from ongoing ascertainment of cases over time.
      • Prevalent cases are derived from a cross-sectional survey.
  22. 22. Selection of CASES: 2. Method of selection
      • Selection of INCIDENT CASES is OPTIMAL.
      • These should be all newly diagnosed cases over a given period of time in a defined population.
      • However, this excludes patients who died before diagnosis, which is a difficult problem.
      • Prevalent cases do NOT include patients with a short course of disease, so patients who recovered early and those who died will not be included.
      • Additional protection against bias is gained by including deceased cases as well as those alive.
  23. 23. Selection of CASES: 3. Diagnostic criteria for cases: a) specificity, b) diagnostic bias, c) validation
      • Diagnostic criteria regarding diagnosis of cases, types of cases and stage of disease to be included should be predefined.
      • Validity is more important than generalizability, i.e. the need to establish an etiologic relationship is more important than the need to generalise results to the population.
  24. 24. Selection of CASES: 3. Diagnostic criteria for cases
      • Example: in a study on breast cancer we can include all cases, OR we can include only premenopausal women with lobular cancer.
      • If we take the latter group as cases, we can elicit the etiology better.
  25. 25. Selection of CONTROLS:
      (i) Should the controls be similar to the cases in all respects other than having the disease? i.e. COMPARABLE
      (ii) Should the controls be representative of all non-diseased people in the population from which the cases are selected? i.e. REPRESENTATIVE
  26. 26. Selection of CONTROLS: comparability vs representativeness
      • The control group should be representative of the general population in terms of probability of exposure to the risk factor
      • AND they should also have had the same opportunity to be exposed as the cases have.
      • This does not mean that cases and controls are equally exposed, only that they have had the same opportunity for exposure.
  27. 27. Selection of CONTROLS: comparability vs representativeness
      • Usually, cases in a case-control study are not a random sample of all cases in the population. If so, the controls must be selected in the same way (and with the same biases) as the cases.
      • It follows from the above that a pool of potential controls must be defined. This is the universe of people from whom controls may be selected (the study base).
  28. 28. Selection of CONTROLS:
      • The study base is composed of a population at risk of exposure over a period of risk of exposure.
      • Cases emerge within a study base. Controls should emerge from the same study base, except that they are not cases.
      • For example, if cases are selected exclusively from hospitalized patients, controls must also be selected from hospitalized patients.
  29. 29. [Figure: “Total” population, containing the reference population, from which cases and controls are drawn]
  30. 30. Selection of CONTROLS: criteria
      • Comparability is more important than representativeness in the selection of controls.
      • The control should be at risk of the disease.
      • The control should resemble the case in all respects except for the presence of disease (and any as yet undiscovered risk factors for disease).
  31. 31. Selection of CONTROLS: sources

      | Source | Advantage | Disadvantage |
      |---|---|---|
      | Hospital based | Easily identified. Available for interview. More willing to cooperate. Tend to give complete and accurate information (reduced recall bias). | Not typical of general population. Possess more risk factors for disease. Some diseases may share risk factors with the disease under study (whom to exclude?). Berksonian bias. |
      | Population based (registry cases) | Most representative of the general population. Generally healthy. | Time, money, energy. Opportunity of exposure may not be the same as that of cases (location, occupation). |
      | Neighbourhood controls / telephone exchange random dialing | Controls and cases similar in residence. Easier than sampling the population. | Non-cooperation. Security issues. Not representative of general population. |
      | Best friend control / sibling control | Accessible, cooperative. Similar to cases in most aspects. | Overmatching. |
  32. 32. Selection of controls: number
      • Large study: cases : controls = 1:1
      • Small study: cases : controls = 1:2, 1:3 or 1:4
      • Use of multiple controls:
        1. Controls of the same type: more than one control per case. For rare diseases the number of cases cannot be increased in the available time, so adding controls increases the power of the study.
        2. Multiple controls of different types, e.g. one hospital control and one neighbourhood control per case. Example: cases are children with brain tumour; controls are children with other cancers and normal children; the risk factor is a history of radiation exposure.
  33. 33. Cases: children with brain tumours. Controls: children with other cancers, and children without cancer. Hypotheses compared: radiation causes cancers in general, or radiation causes brain cancers only.
      • Multiple controls of different types are valuable for exploring alternative hypotheses and for taking into account possible recall bias.
      • (From Gold EB, Gordis L, Tonascia J, Szklo M; Risk factors for brain tumors in children. Am J Epidemiol 1979)
  34. 34. Selection of controls: objectives
      • Elimination of selection bias: selection
      • Minimization of information bias: blinding
      • Minimization of confounding: matching
  35. 35. Problems in control selection: confounding variables
      • Confounding variables are factors associated with the exposure of interest and causally related to the disease of interest.
      • They may lead to a spurious or biased relationship between risk factor and disease.
      • Common confounding variables are age, sex, educational status, socioeconomic level, etc.
      • These can be adjusted for by designing the study with matching, or by statistical techniques such as stratification and regression.
  36. 36. Matching:
      • Definition: it is the selection of controls so that they are similar to the cases in specified characteristics. (Epidemiology: An Introductory Text; Mausner & Bahn, 1985)
      • Matching is defined as the process of selecting controls so that they are similar to cases in certain characteristics such as age, sex, race, socioeconomic status and occupation. (Epidemiology; Leon Gordis, 2004)
  37. 37. Matching:
      • Matching variables (e.g. age) and matching criteria (e.g. within the same 5 year age group) must be set up in advance.
      • Controls can be individually matched (most common) or frequency matched.
      • Individual matching (matched pairs): search for one (or more) controls who meet the required matching criteria. Paired (triplet) matching is when there is one control (two controls) individually matched to each case.
      • Group matching (frequency matching): select a population of controls such that its overall characteristics match those of the cases, e.g. if 15% of cases are under age 20, 15% of the controls are also.
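Frequency matching on 5 year age groups can be sketched as follows. This is a minimal illustration, not a procedure taken from the slides: the function names `age_band` and `frequency_match`, the record layout, and the example ages are all hypothetical.

```python
import random
from collections import Counter

def age_band(age, width=5):
    """Lower bound of the age band, e.g. 20 for ages 20-24 when width is 5."""
    return (age // width) * width

def frequency_match(cases, pool, per_case=1, seed=0):
    """Draw controls so that each age band holds per_case controls per case."""
    rng = random.Random(seed)
    needed = Counter(age_band(c["age"]) for c in cases)
    by_band = {}
    for person in pool:
        by_band.setdefault(age_band(person["age"]), []).append(person)
    controls = []
    for band, n_cases in needed.items():
        candidates = by_band.get(band, [])
        # Take fewer controls if the pool cannot supply the full quota.
        k = min(len(candidates), n_cases * per_case)
        controls.extend(rng.sample(candidates, k))
    return controls

# Hypothetical data: six cases and a pool of 300 potential controls.
cases = [{"id": i, "age": a} for i, a in enumerate([18, 19, 33, 41, 42, 67])]
pool = [{"id": 100 + i, "age": 15 + (i % 60)} for i in range(300)]
controls = frequency_match(cases, pool)
```

With these inputs the age-band distribution of the selected controls equals that of the cases, which is the group-level property that frequency matching aims for.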
  38. 38. Matching:
      • Avoid over-matching; match only on factors KNOWN to be a cause of the disease.
      • Obtain POWER by matching MORE THAN ONE CONTROL per case. In general, the number of controls per case should be ≤ 4, because there is little further gain in power above that.
      • Obtain generalizability by matching more than one type of control.
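The diminishing return from extra controls can be illustrated with a commonly quoted approximation, which is an assumption here and not stated on the slide: with r controls per case, the statistical efficiency relative to an unlimited number of controls is roughly r / (r + 1) when the true association is weak. The function name below is hypothetical.

```python
def relative_efficiency(r):
    """Approximate efficiency of r controls per case relative to unlimited controls.

    Assumes the r / (r + 1) rule of thumb, which holds near the null hypothesis.
    """
    return r / (r + 1)

# Efficiency for 1 to 6 controls per case.
efficiency = {r: round(relative_efficiency(r), 3) for r in range(1, 7)}
gain_beyond_four = relative_efficiency(5) - relative_efficiency(4)
```

Under this approximation, going from one to four controls per case raises efficiency from 0.5 to 0.8, while a fifth control adds only about three percentage points.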
  39. 39. Matching: problems
      • Individual matching on too many variables is time consuming, costly and cumbersome, and may leave too few controls.
      • The possible association of disease with any variable on which cases and controls have been matched cannot be explored. Therefore match only on factors that are already known to be associated with the disease.
      • Example: suppose we know that breast cancer rates are higher among single women than among married women. Matching cases and controls on marital status would make it impossible to detect any relation with this factor.
  40. 40. Matching: problems
      • Overmatching: matching on variables other than those that are risk factors for the disease under study, either in a planned manner or inadvertently.
      • Example: in a study on OCP use as a risk factor for cancer, if we use “best friend controls”, it is most likely that the controls would also be OCP users. In effect we would have matched for the very factor we want to study.
      • Example: if we use neighbourhood controls in a study on nutrition and tuberculosis, we would be inadvertently matching for socioeconomic status and thus nutrition.
  41. 41. Bias: any systematic error in the design, conduct, or analysis of a study that results in mistaken estimates of the effect of the exposure on disease.
      Types of bias in case control studies: selection bias, information bias, confounding bias.
  42. 42. Selection bias: sources
      1. Selective loss to follow-up
      2. Incomplete ascertainment of cases (detection or diagnostic bias)
      3. Inappropriate control group
      4. Differential motivation to participate
  43. 43. Selection bias: selective survival. Only surviving subjects are available to be studied, and those surviving differ from those dying in potentially important ways. Solution: rapid case ascertainment and interview.
  44. 44. Information bias:
      • Occurs due to: 1. imperfect definitions of study variables, OR 2. flawed data collection procedures.
      • Leads to misclassification of disease and exposure.
      • Types of information bias: recall bias, interviewer bias.
  45. 45. Some of the cases or controls who were actually exposed will be erroneously classified as unexposed, and some who were actually not exposed will be erroneously classified as exposed. This generally results in an underestimate of the true risk of the disease associated with the exposure, e.g. cervical cancer and sexual intercourse with uncircumcised men.

      Comparison of patients’ statements with examination findings concerning circumcision status, Roswell Park Memorial Institute, New York:

      | Examination finding | Patient's statement: Yes (no.) | Yes (%) | Patient's statement: No (no.) | No (%) |
      |---|---|---|---|---|
      | Circumcised | 37 | 66.1 | 47 | 34.6 |
      | Not circumcised | 19 | 33.9 | 89 | 65.4 |
      | Total | 56 | 100.0 | 136 | 100.0 |
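The underestimate can be made concrete with a small calculation. The sketch below treats the examination finding as the truth and derives the sensitivity (37 of 84) and specificity (89 of 108) of the patients' statements from the table; the "true" fourfold table it is applied to is hypothetical, and applying the same error rates to cases and controls is an assumption (non-differential misclassification).

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio for a fourfold table (a, b exposed; c, d unexposed)."""
    return (a * d) / (b * c)

def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected observed (exposed, unexposed) counts under imperfect reporting."""
    obs_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    obs_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return obs_exposed, obs_unexposed

# Error rates of self-report, from the table (examination taken as truth).
sensitivity = 37 / (37 + 47)   # circumcised men who said "yes"
specificity = 89 / (19 + 89)   # uncircumcised men who said "no"

# Hypothetical true exposure counts; not from the source.
true_a, true_c = 60, 40        # cases: exposed, unexposed
true_b, true_d = 30, 70        # controls: exposed, unexposed
true_or = odds_ratio(true_a, true_b, true_c, true_d)

# The same error rates are applied to both groups (non-differential).
obs_a, obs_c = misclassify(true_a, true_c, sensitivity, specificity)
obs_b, obs_d = misclassify(true_b, true_d, sensitivity, specificity)
observed_or = odds_ratio(obs_a, obs_b, obs_c, obs_d)
```

Here the true odds ratio is 3.5, and the observed odds ratio falls between 1 and 3.5, i.e. the association is pulled toward the null.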
  46. 46. Recall bias (usually in case-control studies): cases who are aware of their disease status may be more likely to recall exposures than controls, e.g. congenital malformation and prenatal infections. This results in misclassification. Solutions:
      • Achieving similarity in the procedures used to obtain information from cases and controls
      • Verifying exposure with existing records
      • Objective measures of exposure
      • Use of information recorded prior to the time of diagnosis
  47. 47. Interviewer bias: when the interviewer is not blinded to (i.e. knows) the case status of subjects, there is potential for interviewer bias.
      • Leads to: if the interviewer knows case status, differential misclassification is likely; if the interviewer does not know case status, non-differential misclassification is still possible.
      • Solution: blinding of the interviewer as to case status; equal interview time for all participants.
  48. 48. Confounding: when a measure of the effect of an exposure on risk is distorted because of the association of the exposure with other factors that influence the outcome. It is then not possible to separate the contribution that any single causal factor has made.
      • Confounding factor: one which is associated with both exposure and disease, and is distributed unequally in study and control groups.
      • E.g. alcohol and esophageal Ca; confounding factor: smoking.
      • Solution: in study design, matching; in analysis, stratification and regression.
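Stratification can be illustrated with the Mantel-Haenszel summary odds ratio, one standard way of pooling stratum-specific fourfold tables (the slides name stratification but not this particular estimator). The counts below are hypothetical, chosen so that alcohol has no effect within either smoking stratum while the crude table suggests a strong association.

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio; a, b = exposed cases, controls; c, d = unexposed."""
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio over a list of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Hypothetical counts (exposure = alcohol, disease = esophageal cancer).
smokers = (80, 40, 20, 10)       # stratum odds ratio = 1.0
non_smokers = (5, 20, 15, 60)    # stratum odds ratio = 1.0

# Crude table: add the strata together cell by cell, ignoring smoking.
crude = tuple(x + y for x, y in zip(smokers, non_smokers))
crude_or = odds_ratio(*crude)
adjusted_or = mantel_haenszel_or([smokers, non_smokers])
```

In this constructed example the crude odds ratio is about 2.8, but the smoking-adjusted odds ratio is 1.0: the apparent effect of alcohol is entirely due to confounding by smoking.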
  49. 49. On analysis of a case control study we find out:
      • Exposure rates: the frequency of exposure to the suspected risk factor in cases and in controls
      • Estimation of the disease risk associated with exposure (odds ratio)
  50. 50. Exposure rates: a case control study provides a direct estimate of the exposure rates (frequency of exposure) to the suspected factor in the disease and non-disease groups.

      | | Cases (lung cancer) | Controls (without lung cancer) |
      |---|---|---|
      | Smokers | 33 (a) | 55 (b) |
      | Non-smokers | 2 (c) | 27 (d) |
      | Total | 35 (a + c) | 82 (b + d) |

      Doll R. and Hill AB. (1950) Brit. Med. J.
      • Exposure rate in cases = a / (a + c) = 33/35 = 94.3%
      • Exposure rate in controls = b / (b + d) = 55/82 = 67.1%
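The exposure rates can be checked with a few lines of code. The counts are those quoted from Doll and Hill (1950) on the slide; the function name is hypothetical.

```python
def exposure_rates(a, b, c, d):
    """Proportion exposed among cases, a/(a+c), and among controls, b/(b+d)."""
    return a / (a + c), b / (b + d)

# a, b = smokers among cases, controls; c, d = non-smokers among cases, controls.
cases_rate, controls_rate = exposure_rates(a=33, b=55, c=2, d=27)
cases_pct = round(cases_rate * 100, 1)
controls_pct = round(controls_rate * 100, 1)
```

Rounded to one decimal place, the rates are 94.3% in cases and 67.1% in controls.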
  51. 51. Odds ratio / relative odds (estimate of relative risk)
      • Odds: the odds of an event is defined as the ratio of the number of ways the event can occur to the number of ways the event cannot occur. (Epidemiology; Leon Gordis. 2004)
      • If the probability of event X occurring is P, then the odds of it occurring is P / (1 − P).
      • Odds ratio: the ratio of the odds that the cases were exposed to the odds that the controls were exposed.
  52. 52. Odds ratio, using the fourfold table:

      | | Diseased / cases | Not diseased / controls |
      |---|---|---|
      | Exposed | a | b |
      | Not exposed | c | d |

      Odds ratio = (odds that a case was exposed) / (odds that a control was exposed) = (a/c) / (b/d) = ad / bc
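As a worked example, the odds ratio can be computed for the smoking and lung cancer counts quoted earlier from Doll and Hill (a = 33, b = 55, c = 2, d = 27). The confidence interval is an addition not found on the slides: it uses Woolf's logit method, a standard large-sample approximation that behaves poorly when a cell is as small as c = 2, so it is shown only as a sketch.

```python
import math

def odds(p):
    """Odds corresponding to a probability p, i.e. p / (1 - p)."""
    return p / (1 - p)

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio ad/bc with an approximate 95% interval (Woolf's logit method)."""
    or_ = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper

or_, lower, upper = odds_ratio_ci(a=33, b=55, c=2, d=27)

# The same ratio obtained from the two exposure odds, (a/c) / (b/d).
or_from_odds = odds(33 / 35) / odds(55 / 82)
```

The odds ratio is 8.1, meaning the odds of having smoked were about eight times higher among cases than among controls.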
  53. 53. The odds ratio (= cross-products ratio) can also be viewed as the ratio of the product of the two cells that support the hypothesis of an association (cells a & d: diseased people who were exposed and non-diseased people who were not exposed) to the product of the two cells which negate the hypothesis of an association (cells b & c: non-diseased people who were exposed and diseased people who were not exposed).
  54. 54. When is the odds ratio a good estimate of the relative risk in the population?
      • When the cases studied are representative, with regard to history of exposure, of all people with the disease in the population from which the cases are drawn.
      • When the controls studied are representative, with regard to history of exposure, of all people without the disease in the population from which the cases are drawn.
      • When the disease being studied does NOT occur frequently.
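The rare-disease condition can be illustrated numerically. In the sketch below the risks in the exposed and unexposed are hypothetical; in both scenarios the relative risk is 2, but the odds ratio is close to it only when the disease is rare.

```python
def rr_and_or(risk_exposed, risk_unexposed):
    """Return (relative risk, odds ratio) for given risks in exposed and unexposed."""
    rr = risk_exposed / risk_unexposed
    odds_exposed = risk_exposed / (1 - risk_exposed)
    odds_unexposed = risk_unexposed / (1 - risk_unexposed)
    return rr, odds_exposed / odds_unexposed

# Hypothetical risks: a rare disease and a common disease, both with RR = 2.
rare_rr, rare_or = rr_and_or(0.002, 0.001)
common_rr, common_or = rr_and_or(0.4, 0.2)
```

For the rare disease the odds ratio differs from the relative risk by well under 1%, whereas for the common disease the odds ratio (about 2.67) clearly overstates the relative risk of 2.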
  55. 55. Limitations:
      1. Susceptible to bias if not carefully designed
      2. Especially susceptible to exposure misclassification
      3. Especially susceptible to recall bias
      4. Restricted to a single outcome
      5. Incidence rates not usually calculable
      6. Cannot assess effects of matching variables
  56. 56. Advantages:
      1. Only realistic study design for uncovering etiology in rare diseases
      2. Important in understanding new diseases
      3. Commonly used in outbreak investigations
      4. Useful if the induction period is long
      5. Relatively inexpensive
  57. 57. Rare disease: case-control approaches are the most efficient for rare diseases, e.g. idiopathic pulmonary fibrosis, most cancers. Cohort approaches would require large populations and prohibitive expense and follow-up time.
  58. 58. Case ascertainment system in place: the conduct of a case-control study may be facilitated by the availability of a case ascertainment system: a) population-based cancer registry, b) hospital-based surveillance systems, c) mandated disease reporting systems. A case-control study is also appropriate when funding and time constraints are not compatible with a cohort study.
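  59. 59. Nested case control study (design diagram): at TIME 1, interviews, blood, urine, etc. are obtained from the study population. After follow-up over YEARS, at TIME 2 those who develop disease become the CASES and a subgroup of those who do not develop disease become the CONTROLS for the CASE-CONTROL STUDY.
  60. 60. Consider the following hypothetical cohort followed over time points t1, t2 and t3 (X = lung cancer case, O = loss to follow-up). [Figure: cohort timeline marking cases (X) and losses to follow-up (O)]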
  61. 61. Advantages of the nested case control design:
      1. Possibility of recall bias is eliminated, since data on exposure are obtained before disease develops.
      2. Exposure data are more likely to represent the pre-illness state, since they are obtained years before clinical illness is diagnosed.
      3. Costs are reduced compared to those of a prospective study, since laboratory tests need to be done only on specimens from subjects who are later chosen as cases or as controls.
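The cost argument can be sketched in code. The example below is a simplified illustration under stated assumptions: the cohort is hypothetical, controls are a simple random subgroup of those who did not develop disease (more refined schemes sample controls at the time each case occurs), and the function name and the two-controls-per-case ratio are illustrative choices.

```python
import random

def nested_case_control(cohort, controls_per_case=2, seed=0):
    """Pick all cases plus a random subgroup of non-cases from a followed cohort."""
    rng = random.Random(seed)
    cases = [p for p in cohort if p["diseased"]]
    non_cases = [p for p in cohort if not p["diseased"]]
    k = min(len(non_cases), controls_per_case * len(cases))
    controls = rng.sample(non_cases, k)
    return cases, controls

# Hypothetical cohort of 1000 people with specimens stored at baseline;
# every 50th person develops the disease during follow-up.
cohort = [{"id": i, "diseased": i % 50 == 0} for i in range(1000)]
cases, controls = nested_case_control(cohort)

# Only the stored specimens of selected subjects need laboratory analysis.
assays_needed = len(cases) + len(controls)
```

In this example 60 specimens are analysed instead of 1000, while exposure information still dates from before disease onset.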
  62. 62. Selected examples of case control studies:
      • 1950s: cigarette smoking and lung cancer
      • 1970s: diethylstilbestrol and vaginal adenocarcinoma; post-menopausal estrogens and endometrial cancer
      • 1980s: aspirin and Reye's syndrome; tampon use and toxic shock syndrome; L-tryptophan and eosinophilia-myalgia syndrome; AIDS and sexual practices
      • 1990s: vaccine effectiveness; diet and cancer
  63. 63. References:
      • Park’s Textbook of Preventive and Social Medicine – 21st ed; Park JE. 2010.
      • Mausner & Bahn Epidemiology: An Introductory Text – 2nd ed; Mausner JS, Kramer S. 1985.
      • A Dictionary of Epidemiology – 3rd ed; Last JM. 2000.
      • Epidemiology – 3rd ed; Gordis L. 2004.
      • Origins and early development of the case-control study by Nigel Paneth, Ezra Susser, Mervyn Susser. Available from