The prevalence of the EXPOSURE (Frequent vs. Rare)
The prevalence of the OUTCOME (Frequent vs. Rare)
How much is known about the relationship?
How strong you want the evidence to be?
How are subjects sampled (on disease status or exposure status)?
Data collection methods (survey, observations, experimental)
Timing of data collection (when)
Unit of observation (individuals or groups)
Number of observations made (data collection points)
Availability of subjects
Cost ($$$$$)
Choosing a good study design is essential in epidemiology
The common choices are:
Case reports and case series
Ecologic
Cross-sectional
Case-control
Cohort
Randomized controlled trial (experimental design)
Observational vs. Experimental Approaches
Manipulation of study factor
Was exposure of interest controlled by the investigator? (No=observational; Yes=Experimental)
Randomization of study subjects
Was there use of a random process to determine exposure of study subjects?
Observational/descriptive studies: the investigator observes the natural occurrence of events. These studies help us generate hypotheses.
Case Report
"I have a headache and fever, and purple polka dots on my feet."
A physician or group of physicians sees a handful of cases of a very rare disease, or an unusual presentation of a disease.
Case reports are useful as a starting point for thinking about etiology (cause).
A case report has no comparison group. Without any comparison group, we CANNOT assume any associations.
Case Series
A case series is an extension of a case report. You now have several cases of the disease
You can start looking for commonalities, which may help you develop a hypothesis.
You cannot make ANY argument about an exposure being associated with disease.
For that, we need a comparison group.
Selection bias (major disadvantage)
Error due to systematic differences in characteristics between those who take part in a study and those who do not (nonresponse problems).
Ecologic Studies (group-level analysis)
The units are aggregates of individuals: geographic region, school, health care facility.
Question: does the overall occurrence of disease in a population correlate with the occurrence of the exposure?
There are no individual-level data.
Example: people argue that a high-fat diet is associated with breast cancer because countries with high-fat diets have higher breast cancer incidence.
Ecologic Study Drawbacks
Can we be sure that the pattern we are seeing reflects the real cause? No, not with this design.
We are looking at data on groups, not individuals.
Ecologic fallacy: the tendency to apply a relationship seen at the group level to an individual, that is, to think that what is meaningful for the group is meaningful for the individual.
There are likely to be many confounding factors we haven’t considered.
So, to look at causal relationships, I should have a comparison group…
…and I should have data on individuals, not groups.
Cross-Sectional Studies (Surveys)
Collect data on individuals at a single point in TIME and allow for comparison between groups.
Sampling is done on a single group of people of interest (e.g., college students and drinking).
The main tool is a survey.
After collecting data, you classify everyone by exposure and outcome status (e.g., drinkers vs. nondrinkers).
Compute a measure of association to compare by either exposure or disease status.
Considers prevalence of the disease or condition
Cross-Sectional 2 x 2
Analysis starts with the total N, since you sampled one whole group.
Now, you distribute your total N based on E (exposure) and D (disease) status, e.g., obesity and diabetes.
Example: Cross-Sectional Study
Total subjects surveyed
Comparison of the prevalence of exposure in diseased and nondiseased:
OR = (a/c) / (b/d) = ad / bc = (150 x 125) / (50 x 75) = 18750 / 3750 = 5.0
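The step of distributing the total N into the 2 x 2 cells can be sketched in a few lines of Python. This is only an illustration: the cell layout (a = exposed and diseased, b = exposed and not diseased, c = not exposed and diseased, d = not exposed and not diseased) and the individual records are assumptions, rebuilt so that the cross-products match the example above.

```python
from collections import Counter

def tabulate(records):
    """Count (exposed, diseased) pairs into the 2 x 2 cells a, b, c, d."""
    counts = Counter(records)
    a = counts[(True, True)]    # exposed, diseased
    b = counts[(True, False)]   # exposed, not diseased
    c = counts[(False, True)]   # not exposed, diseased
    d = counts[(False, False)]  # not exposed, not diseased
    return a, b, c, d

# Hypothetical survey records; the assignment of 50 to b and 75 to c is assumed.
records = ([(True, True)] * 150 + [(True, False)] * 50
           + [(False, True)] * 75 + [(False, False)] * 125)

a, b, c, d = tabulate(records)
total = a + b + c + d                        # one whole group was sampled
prevalence_of_disease = (a + c) / total      # prevalence, not incidence
prevalence_odds_ratio = (a * d) / (b * c)    # OR = ad / bc
print(total, prevalence_of_disease, prevalence_odds_ratio)
```

Because everyone is surveyed at one point in time, the quantities computed here describe prevalence only.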
Cross-Sectional Studies
Pros
Quick and inexpensive
Usually easy to conduct
Representative sample
Describe patterns of disease occurrence
Can study several diseases or exposures
Cons
We have prevalence data only, no incidence data.
We collect exposure and outcome information at the same time, so we cannot argue any temporal relationship between the two (exposure must precede disease in order to be causal)
Selection bias (non-response)
So what is our next step?
So far we have discussed descriptive designs, which allow us to look at correlations and associations and provide knowledge for hypothesis development.
But we need to consider incident cases and time sequence (temporality) if we want to look at cause-effect relationships.
For that, we must turn to the analytic study designs.
case-control, and cohort (hypothesis testing)
Analytic study designs
Design of a Case-Control Study
e.g., lung cancer patients and controls are asked how much they smoked in the past
Source: www.socialresearchmethods.net/.../cohort.html
Case-control Study
Analytic study
The study subjects are selected on the basis of whether they have the disease (cases) or not (controls), and then we determine how many in each group had the risk factor (exposed).
A case definition is a statement of the characteristics a subject must have to be considered a “case” of the disease to be studied.
It is usually confirmed by clinical signs or a laboratory test.
Signs must distinguish the specific disease from similar conditions that might be caused by different exposures, genetic traits, interactions, etc.
It may consist of combinations of symptoms (e.g., flu vs. cold).
Case-control Study
Exposure collection is retrospective: you ask cases and controls about their exposures before they became a case (or before the date they were enrolled as a control).
You need to identify a comparison group (controls). These could be people in the hospital with other conditions, people from the neighborhood where the case lives, or people identified from voter registration.
The goal is to identify a control group that represents the source population of the cases. Possible sources:
• Population-based controls
• Patients from the same hospital as the cases
• Relatives of cases
• Friends of cases (helps control for SES)
Any exclusion or inclusion criteria applied to the selection of cases must also be applied in the selection of controls.
Case-control 2 x 2
Odds Ratio
Definition of odds: the ratio of the probability of an event (disease) occurring to the probability of it not occurring = P / (1 - P)
You sampled on disease status, so the totals of cases and controls are known. You then assign exposure status within the case and control groups, and this completes the table.
Calculating Odds Ratio

               Cases    Controls
Exposed          a         b
Not exposed      c         d
Total           a+c       b+d

Odds of exposure among cases:
(proportion of exposed among cases) / (proportion of non-exposed among cases) = [a/(a+c)] / [c/(a+c)] = a/c
Odds of exposure among controls:
(proportion of exposed among controls) / (proportion of non-exposed among controls) = [b/(b+d)] / [d/(b+d)] = b/d
OR = (a/c) / (b/d) = ad / bc
Data from a case-control study of current oral contraceptive (OC) use and myocardial infarction (MI) in premenopausal female nurses

                   Myocardial infarction
Current OC use     Yes        No          Total
Yes                23 (a)     304 (b)     327
No                 133 (c)    2816 (d)    2949
Total              156        3120        3276

Data from L. Rosenberg et al., Oral contraceptive use in relation to non-fatal myocardial infarction. Am J. Epidemiol. 111:59, 1980.
OR = ad / bc = (23 x 2816) / (304 x 133) = 1.6
Interpretation: current OC users had 1.6 times the odds of MI (about 60% higher odds) compared with women who did not use OCs. Because the subjects were sampled on disease status, this is a ratio of odds, not a directly measured risk.
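As a check on the arithmetic, the odds-ratio derivation can be reproduced from the four cell counts of the OC/MI table. The short Python sketch below is only an illustration; the function and variable names are made up for this example. It computes the odds of exposure as P / (1 - P) in each group and confirms that the ratio equals the cross-product ad / bc.

```python
def exposure_odds(exposed, unexposed):
    """Odds of exposure within one group: P / (1 - P)."""
    p = exposed / (exposed + unexposed)  # proportion exposed in the group
    return p / (1 - p)                   # algebraically equal to exposed / unexposed

# Cell counts from the OC use and MI table (cases = MI yes, controls = MI no)
a, b, c, d = 23, 304, 133, 2816

odds_cases = exposure_odds(a, c)         # a/c
odds_controls = exposure_odds(b, d)      # b/d
odds_ratio = odds_cases / odds_controls  # (a/c) / (b/d)
cross_product = (a * d) / (b * c)        # ad / bc

print(round(odds_ratio, 1), round(cross_product, 1))  # 1.6 1.6
```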
Case-control studies Pros
Relatively inexpensive, small sample sizes, fast study technique
Great for rare diseases (sampling on disease status lets you guarantee enough cases to make comparisons)
Can study multiple exposures
Cons
Hard to find an appropriate control group (should match the cases in other characteristics)
Potential for recall bias – biggest problem (limited recall or selective memory)
Can assume, but cannot confirm the temporal sequence of exposure and disease
Cohort Design
[Figure: a population of people without disease is split into exposed and not exposed groups; each group is followed to see who develops disease and who does not]
A cohort is defined as a population group, or subset thereof, that is followed over a period of time.
Instead of sampling on disease status, we sample on exposure. (Subjects are defined on the basis of presence or absence of exposure to a risk factor, but you are not assigning an exposure to occur.)
Follow people prospectively until disease occurs or the study ends (thus, you can demonstrate that exposure precedes disease).
At the time exposure status is defined subjects are outcome negative.
Cohort Design
The purpose of following a cohort is to measure the occurrence of one or more specific diseases/outcomes during the period of follow-up, usually with the aim of comparing the disease rates for two or more cohorts.
What are the requirements for the cohort population?
All members of the cohort population must:
• Be free of the disease at the start of the study period.
• Be at risk of developing the disease
– Must be alive (exclude dead people).
– Must not be immune (exclude people who have been immunized or have had the disease before, if the first episode confers immunity). Example: measles
– Must not be in a non-susceptible group (men don’t get cervical cancer).
TYPES OF COHORT STUDIES A. TIMING
PROSPECTIVE (OR CONCURRENT)
RETROSPECTIVE (OR NON-CONCURRENT)
cohort studies with sampling unrelated to exposure (common)
cohort studies with exposure-based sampling (rare exposure)
C. POPULATION BASE
D. TYPE OF COHORTS
• OPEN - people moving in and out
• CLOSED - fixed population
Types of Cohorts Closed Cohort:
cohort is defined at the start of the study.
No subjects are added after the start,
some cohort members may drop out or die before the end of the study.
Investigators try to follow all cohort members to the study end.
Examples: clinical trials, Framingham study, Nurses' Health Study
Open Cohort:
The cohort takes on new members and may lose members as the study progresses.
Also called a Dynamic Cohort or Dynamic Population.
Examples: cancer registries, school studies, hospital infection surveillance studies
[Figure: timelines for prospective and retrospective cohort studies, showing where the study starts relative to exposure and disease occurrence]
Prospective vs. Retrospective studies
Prospective Study: Observes a cohort of subjects over a long period for an event to occur (e.g., disease, death, or cure) during the study period and records suspected risk or protective factor(s). The outcome of interest should be clear and defined. Uses the incidence of an outcome, or the relative risk of an outcome based on exposure.
Retrospective Study: Looks backwards and examines exposures to suspected risk or protective factors in relation to an outcome (e.g., cancer). Many valuable case-control studies are retrospective and ask for patient histories. The odds ratio provides an estimate of relative risk (when the disease is rare).
The Framingham Heart Study began in 1948 by recruiting an Original Cohort of 5,209 men and women between the ages of 30 and 62 from the town of Framingham, Massachusetts, who had not yet developed overt symptoms of cardiovascular disease or suffered a heart attack or stroke.
Since that time the Study has added an Offspring Cohort in 1971, the Omni Cohort in 1994, a Third Generation Cohort in 2002, a New Offspring Spouse Cohort in 2003, and a Second Generation Omni Cohort in 2003.
identification of major CVD risk factors
CARDIOVASCULAR DISEASE
Framingham, MA
Tecumseh, MI
Evans County, GA (biracial)
Muscatine, IA
Bogalusa, LA (children)
OCCUPATION BASED TO STUDY EXPOSURES
Benzene workers (leukemia)
Coke-oven workers (lung cancer)
Asbestos workers (lung cancer)
Radium dial painters (oral cancer)
VETERANS
http://www.nationalchildrensstudy.gov/Pages/default.aspx
The National Children’s Study will examine the effects of environmental influences on the health and development of 100,000 children across the United States, following them from before birth until age 21.
There are 105 Study locations (counties or groups of counties) across the United States: 79 metropolitan areas (urban, suburban, and small cities) and 26 rural communities.
All locations were selected using a probability-based sampling method to ensure that children and families across the nation, from diverse ethnic, racial, economic, religious, geographic, and social groups, are represented in the Study.
The Study will enroll women who are either pregnant or likely to have a child during the recruitment period.
Each Study location will recruit enough women for 250 infant births per year during the four-year enrollment period.
Research efforts geared toward studying children’s health and development and will form the basis of child health guidance, interventions, and policy for generations to come.
The NCS will examine important health issues to establish links between children’s environments and their health, including:
birth defects and pregnancy-related problems
behavior, learning, and mental health disorders
[Figure: Study Locations state map; * = to become active 2009-2010]
Families will join the Study, or enroll, beginning in some communities in the winter of 2009. Other communities will begin enrolling families over the next couple of years.
Cohort Data Analysis The tabulation and analysis of morbidity or mortality rates
The study base is the person-time experience of the individuals in whom the outcome is ascertained.
Calculation of person-years at risk is the means of achieving equivalence of study base in cohort studies.
The denominator depends on whether all subjects were followed till the end of the study or not.
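A minimal sketch of the person-years idea, in Python. The follow-up times and outcomes are hypothetical and made up purely for illustration: each subject contributes only the time they were actually observed and at risk, so subjects who develop the outcome or are lost early add less to the denominator.

```python
def incidence_rate(follow_up_years, had_event):
    """Events divided by total person-years at risk."""
    person_years = sum(follow_up_years)  # study base: summed person-time
    events = sum(had_event)              # True counts as 1
    return events / person_years

# Hypothetical cohort of six subjects in a 5-year study.
# Subjects 3 and 5 developed the outcome; subject 4 was lost at 4 years.
follow_up = [5.0, 5.0, 2.5, 4.0, 1.5, 5.0]
events = [False, False, True, False, True, False]

rate = incidence_rate(follow_up, events)  # 2 events / 23.0 person-years
print(f"{rate * 1000:.1f} per 1,000 person-years")
```

If every subject had been followed to the end of the study, the denominator could simply be the number of subjects at baseline, which gives a cumulative incidence rather than a rate.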
Measuring Exposure: The Scale
• There are different ways to measure exposure, and their association with outcomes will vary.
• For example, measuring smoking exposure:
– Smoker
  • Yes
  • No
  • Never
  • Past but quit (pack-years, type of smoking)
  • Current (pack-years, type of smoking)
– Pack-years of smoking
  How many cigarettes do you currently smoke per day? Then we can categorize accordingly.
Cohort Study 2 x 2
We can calculate the relative risk as a valid measure of association.
You selected study subjects by exposure status, so you know the row totals, (a+b) and (c+d).
Now you separate your two exposure groups by disease status.
Risk ratios or Relative Risk
What if we started with exposure instead of disease status? We could compute risk directly, so the measure of association is the risk ratio.
Essentially, you compute the risk in the exposed group and divide by the risk in the unexposed group.
Risk in exposed = a/(a+b)
Risk in unexposed = c/(c+d)
RR = [a/(a+b)] / [c/(c+d)]
Example
Imagine that the data from Example 1 were actually from a cohort (meaning we followed OJ drinkers and non-drinkers over time).
The relative risk would be 2.5, meaning orange juice drinkers had a 150% increase (2.5 - 1) in the risk of stomach ulcer as compared to non-drinkers.
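The relative risk calculation can be sketched in Python as follows. The cell counts are hypothetical; they are NOT the Example 1 data, and were chosen only so that the result is also 2.5.

```python
def relative_risk(a, b, c, d):
    """RR = [a/(a+b)] / [c/(c+d)] for a cohort 2 x 2 table."""
    risk_exposed = a / (a + b)      # incidence among the exposed
    risk_unexposed = c / (c + d)    # incidence among the unexposed
    return risk_exposed / risk_unexposed

# Hypothetical cohort: 200 exposed (50 develop disease), 200 unexposed (20 develop disease)
a, b, c, d = 50, 150, 20, 180

rr = relative_risk(a, b, c, d)
percent_increase = (rr - 1) * 100   # excess risk relative to the unexposed
print(round(rr, 2), round(percent_increase))
```

The row totals (a+b) and (c+d) are the denominators because subjects were selected by exposure status.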
Life Table Methods
Give estimates for survival during time intervals and present the cumulative survival probability at the end of the interval.
Example: life tables can be constructed to portray the survival times of patients in clinical trials.
There are two life table methods:
Cohort life table: shows the mortality experience of all persons born during a particular year.
Period life table: enables us to project the future life expectancy of persons born during the year as well as the remaining life expectancy of persons who have attained a certain age.
Survival Curves
A method for portraying survival times
In order to construct a survival curve, the following information is required:
• Time of entry into the study
• Time of death or other outcome
• Status of patient at time of outcome, e.g., dead or censored (patient is lost to follow-up)
Example
15 subjects followed over 36 months; all entered the study at the same time.
Nine died at different points of the study.
Deaths of two patients caused a steep drop at 19 months.
Each step indicates the death(s) of one or more patients.
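One common way to build a stepped survival curve from this kind of information is the Kaplan-Meier (product-limit) estimate. The slides do not name the method, so treating the example curve as Kaplan-Meier is an assumption, and the follow-up times below are hypothetical values that merely match the description (15 subjects, 9 deaths, two of them at 19 months, 36 months of follow-up).

```python
def kaplan_meier(times, died):
    """Return [(time, survival probability)] at each time with at least one death."""
    data = sorted(zip(times, died))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]   # True counts as 1 death, False is censored
            leaving += 1
            i += 1
        if deaths:
            survival *= (at_risk - deaths) / at_risk  # one step down in the curve
            curve.append((t, survival))
        at_risk -= leaving         # deaths and censored subjects leave the risk set
    return curve

# Hypothetical data: months of follow-up and whether the subject died (False = censored)
death_times = [3, 7, 11, 15, 19, 19, 24, 28, 33]
censor_times = [10, 22, 36, 36, 36, 36]
times = death_times + censor_times
died = [True] * len(death_times) + [False] * len(censor_times)

for month, prob in kaplan_meier(times, died):
    print(month, round(prob, 3))
```

The two deaths at 19 months produce a single, larger step, which is the steep drop described in the example.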
Advantages of Cohort Studies
- Can examine rare exposures (e.g., asbestos and lung cancer)
- Temporal order between exposure and outcome can be established (temporality), especially in a prospective design, because the outcome is measured after exposure
- Time-to-event analysis is possible
- Multiple outcomes of a single exposure can be studied (e.g., smoking and lung cancer, COPD, larynx cancer); may uncover unanticipated associations with outcome
- Cohort studies are usually, but not exclusively, prospective
Disadvantages of Cohort Studies
- Lengthy, time-consuming, and expensive (if prospective)
- If retrospective, requires availability of adequate records
- May require very large samples
- Inefficient for rare diseases: when outcomes are rare, large populations need to be followed
- Not suitable for diseases with long latency
- Unexpected environmental changes may influence the association
- Nonresponse, migration, and loss to follow-up (subjects must be tracked for a long time) can introduce bias and affect validity
- Sampling, ascertainment, and observer biases are still possible
Bias or Issues in Interpretation of the Results
1. Loss to Follow Up
Large loss to follow up (more than 30%) will raise issues about validity.
Obtain as much data as possible.
Conduct a loss-to-follow-up analysis, using baseline data to compare those interviewed with those lost to follow-up.
Loss to follow-up is the major source of bias in cohort studies.
Retention rate is related to length of follow-up.
2. Non-Response
Those who agree to participate may differ from non-participants.
Non-response affects generalizability and MAY affect validity.
Non-response affects validity if it is associated with both exposure and other risk factors.