This document discusses sources of error and bias in epidemiological studies. It describes how selection bias can occur when the study population is not representative of the target population, due to factors like differential participation rates or loss to follow up. Selection bias can lead the study to produce either overestimates or underestimates of exposure-disease relationships. The document provides examples to illustrate how selection bias may influence both cohort and case-control study designs.
This document discusses selection bias, which occurs when there are systematic differences in the characteristics of those selected for a study compared to the overall population from which they were selected. It provides examples of selection bias in case-control and cohort studies. Selection bias can be introduced when the probability of being included in the study depends on both the exposure and outcome of interest. This distorts the association observed in the study sample compared to the true association in the population. Differential participation or loss to follow up based on exposure and outcome status are common causes of selection bias.
Bias, confounding and fallacies in epidemiology, by Tauseef Jawaid
This document discusses three major threats to internal validity in epidemiology: bias, confounding, and fallacies. It focuses on defining and providing examples of bias, specifically selection bias and information bias. Selection bias can occur when comparison groups are not representative of the target populations due to factors like non-random selection or differential loss to follow up. Information bias, also called misclassification bias, results from errors in measuring exposures or outcomes, which can be differential or non-differential. Methods to control for bias, such as blinding subjects and using multiple questions, are also outlined.
This document discusses various types of biases that can occur in epidemiological studies. It defines bias as systematic error that results in an incorrect estimate of the association between exposure and disease. Several types of biases are described, including selection bias, information bias, and confounding. Selection bias occurs when the study population is not representative of the target population and can arise from inappropriate definitions of eligibility, sampling frames, or follow-up. Information bias results from inaccurate or missing data.
This document defines and describes different types of bias that can affect studies, including confounding, selection bias, and information bias. Bias can result in incorrect estimates of associations between exposures and outcomes. Selection bias occurs when subjects have different probabilities of being included based on exposure or outcome. Information bias results from poor measurement of variables. Confounding bias is due to the influence of a third variable, called a confounder, that is associated with both the exposure and outcome. The document provides examples and discusses approaches to control for different types of bias, such as matching, stratification, and adjustment in data analysis.
A retrospective cohort study examines existing data to investigate associations between exposures and outcomes without prospective follow-up. The document discusses:
1) Retrospective cohort studies identify exposed and unexposed groups from past data and determine current disease status, requiring less time than prospective studies.
2) Limitations include potential for poor quality or incomplete past data, and lack of information on confounding factors.
3) As an example, a study used employee health records to retrospectively examine the association between chemical exposure in tire manufacturing and mortality. However, past data may not fully account for smoking, diet, or other risk factors.
The document discusses bias in epidemiology. It defines bias as systematic error that results in a mistaken estimate of an exposure's effect. It describes several types of bias including selection bias, information bias, and confounding. Selection bias can occur if cases and controls are selected in a way that distorts the exposure-disease association. Information bias arises from inadequate means of obtaining information. Confounding occurs when a third factor is associated with both exposure and outcome. The document outlines various specific biases like self-selection, recall bias, and healthy worker effect. It emphasizes the importance of minimizing bias through proper study design, conduct, and analysis to obtain valid results.
This document discusses measures of association used to quantify the relationship between an exposure and outcome in epidemiological studies. There are two types of measures: absolute measures, which are based on differences in disease frequency between exposed and unexposed groups, and relative measures, which are ratios of disease frequency. Common measures include risk difference, risk ratio, and odds ratio. Measures are calculated using data organized in 2x2 tables and can be interpreted as showing the strength and direction of association.
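The measures summarized above can be sketched from a 2x2 table in a few lines of Python. The cell labels a, b, c, d and the counts below are illustrative assumptions, not values from the document:

```python
# Conventional 2x2 table layout (rows: exposure; columns: disease status):
#                 Disease+   Disease-
#   Exposed          a           b
#   Unexposed        c           d

def risk_difference(a, b, c, d):
    # Absolute measure: risk among exposed minus risk among unexposed.
    return a / (a + b) - c / (c + d)

def risk_ratio(a, b, c, d):
    # Relative measure: ratio of the two risks.
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    # Cross-product ratio of the table cells.
    return (a * d) / (b * c)

# Hypothetical counts, for illustration only:
a, b, c, d = 30, 70, 10, 90
print(risk_difference(a, b, c, d))  # approx. 0.20
print(risk_ratio(a, b, c, d))       # approx. 3.0
print(odds_ratio(a, b, c, d))       # approx. 3.86
```

A positive risk difference, or a ratio above 1, indicates higher disease frequency among the exposed; a negative difference or a ratio below 1 indicates lower frequency.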
Cross-sectional studies examine the relationship between a disease and exposure in a population at a single point in time. They provide a snapshot of disease prevalence and exposure prevalence simultaneously. While they can describe disease burden and identify potential risk factors, the temporal relationship between exposure and disease is unclear since they involve simultaneous rather than longitudinal measurement.
This document discusses nested case-control studies, case-cohort studies, and case-crossover studies. It provides examples and discusses the advantages and disadvantages of each study design. Nested case-control studies select controls from within a prospective cohort study. Case-cohort studies select a random subcohort of controls from the entire cohort. Case-crossover studies use individuals as their own controls by comparing exposure during case periods to control periods.
This document discusses various types of bias, confounding, and causation that can occur in epidemiological studies. It defines a confounder as a variable that is associated with the exposure and affects the outcome but is not in the causal pathway. Three main types of bias are described: selection bias, information bias, and confounding. Specific biases like recall bias, observer bias, and non-respondent bias are explained. Methods for controlling confounding like matching, stratification, and multivariate analysis are also outlined. The document concludes by discussing Hill's criteria for determining a causal association and threats to the internal and external validity of experimental studies.
This document discusses different types of error and bias that can occur in epidemiological studies. It defines random error as occurring due to chance and resulting in imprecise measures, while systematic error or bias results in invalid measures that are not true. Types of bias discussed include selection bias, information bias, and confounding. Selection bias can arise from how cases and controls are selected, while information bias occurs when exposure or disease status is incorrectly classified. The document emphasizes the importance of reducing both random and systematic errors to obtain valid study results.
This document discusses bias and confounding in epidemiological studies. It defines bias as systematic error that results in incorrect estimation of exposure-outcome associations. Selection bias and information bias are two common types of bias. Confounding occurs when another exposure is associated with both the disease and exposure being studied, independently of the exposure-disease relationship. Methods to control for confounding include restriction, matching, randomization, stratification, and multivariate analysis at the design and analysis stages of a study.
The document discusses odds ratios, which are used to measure the association between an exposure and an outcome. An odds ratio is calculated by dividing the odds of an event in one group (e.g. exposed to a drug) by the odds of the event in another unexposed group. Odds ratios can be calculated in both cohort and case-control studies. While relative risk can only be calculated in cohort studies, odds ratios are commonly used to approximate relative risk in case-control studies when the outcome is rare. The document provides examples of how to calculate odds ratios from 2x2 contingency tables and interprets what different values mean.
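The rare-outcome approximation mentioned above can be checked numerically. The cohort counts below are hypothetical; with a rare outcome the non-case cells dominate their row totals, so the odds ratio tracks the risk ratio closely:

```python
# Hypothetical cohort counts (illustrative only); the outcome is rare (~0.1-0.2%).
exposed_cases, exposed_noncases = 20, 9980
unexposed_cases, unexposed_noncases = 10, 9990

risk_ratio = (exposed_cases / (exposed_cases + exposed_noncases)) / (
    unexposed_cases / (unexposed_cases + unexposed_noncases))
odds_ratio = (exposed_cases * unexposed_noncases) / (
    exposed_noncases * unexposed_cases)

print(risk_ratio)  # approx. 2.00
print(odds_ratio)  # approx. 2.00 -- nearly equal because the outcome is rare
```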
This document discusses different risk measures used in epidemiology, including relative risk, odds ratio, and attributable risk. Relative risk measures the strength of association between an exposure and disease based on prospective studies. Odds ratio is used similarly in case-control studies when relative risk cannot be directly calculated. Attributable risk determines how much disease can be attributed to a specific exposure by comparing disease rates in exposed and unexposed groups. These measures provide important information for evaluating disease causation and determining potential disease prevention through reducing exposures.
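The attributable-risk comparison described above reduces to a subtraction of rates; the incidence rates below are hypothetical:

```python
# Hypothetical incidence rates per 1,000 person-years (illustrative only):
rate_exposed = 4.0
rate_unexposed = 1.0

# Attributable risk: the excess rate among the exposed.
attributable_risk = rate_exposed - rate_unexposed

# Attributable risk percent: the fraction of disease among the exposed
# that is attributable to the exposure.
ar_percent = attributable_risk / rate_exposed * 100

print(attributable_risk)  # 3.0 per 1,000 person-years
print(ar_percent)         # 75.0
```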
This document summarizes a seminar presentation on case control studies. It begins by defining epidemiological study cycles and analytical study types such as case control and cohort studies. It then focuses on case control studies, defining them, discussing their history, design, outcomes, limitations, advantages and applications. Examples of notable case control studies are provided, such as Lane Claypon's 1926 breast cancer study and studies from the 1950s linking smoking to lung cancer. Key aspects of case control study methodology like selection of cases and controls, and matching to control for confounding variables are explained.
This document discusses bias and validity in clinical research. It defines clinical epidemiology as the study of health-related states and events in populations to control health problems. It describes how epidemiologic studies compare outcomes like disease rates between exposed and unexposed groups. Validity is important: internal validity indicates a study design free from bias and error, while external validity indicates generalizability to other populations. Bias and confounding can threaten validity and lead to erroneous associations if not avoided or controlled for.
This document provides information on epidemiological study designs, including analytical studies, case-control studies, and cohort studies. It defines epidemiology as the study of health-related states in populations. Case-control studies look backward from the outcome to exposures, comparing cases to controls. Cohort studies follow groups over time to examine exposure-outcome relationships. The key difference is that cohort studies can measure disease incidence directly, whereas case-control studies can only estimate association through odds ratios. Selection of appropriate study populations and controls is important to minimize biases.
1. Effect modification and confounding are two aspects to consider when examining the relationship between an exposure and outcome. Effect modification provides useful information about how a third variable impacts the relationship, while confounding can distort the observed effect.
2. Stratifying the data and calculating stratum-specific odds ratios is a way to check for effect modification and confounding. Effect modification is present if the odds ratios differ substantially between strata. Confounding is present if the crude odds ratio differs from the stratified odds ratios.
3. It is important to measure potential confounding factors so they can be controlled for through stratification or multivariate analysis. This prevents confounding from distorting the observed effect between the exposure and outcome.
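The stratified check described in points 1-3 can be sketched as follows. The stratum labels and counts are hypothetical; the example is constructed so that both stratum-specific odds ratios equal 1.0 while the crude odds ratio does not, which is the classic signature of confounding rather than effect modification:

```python
def odds_ratio(a, b, c, d):
    # Cross-product odds ratio from 2x2 cell counts.
    return (a * d) / (b * c)

# Each tuple: (exposed cases, exposed controls, unexposed cases, unexposed controls).
# Hypothetical data stratified by a potential confounder (e.g. smoking status):
stratum_smokers = (80, 20, 40, 10)
stratum_nonsmokers = (10, 40, 20, 80)

# Crude table: collapse the strata cell by cell.
crude = tuple(s + n for s, n in zip(stratum_smokers, stratum_nonsmokers))

or_smokers = odds_ratio(*stratum_smokers)        # 1.0
or_nonsmokers = odds_ratio(*stratum_nonsmokers)  # 1.0
or_crude = odds_ratio(*crude)                    # 2.25

# Stratum ORs agree -> no effect modification; the crude OR differs from
# both stratum ORs -> the crude association is confounded.
print(or_smokers, or_nonsmokers, or_crude)
```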
This document describes different types of epidemiological study designs, including observational studies like cross-sectional, case-control, cohort, and experimental studies like randomized controlled trials. It provides details on descriptive versus analytical epidemiology and cross-sectional studies specifically. Cross-sectional studies measure prevalence at a single point in time by surveying exposures and disease status simultaneously in a population cross-section. They are useful for assessing disease burden, comparing prevalence between populations, and examining trends over time.
This document discusses intervention studies and randomized controlled trials. It begins by defining causality using the counterfactual model and comparing exposed and unexposed groups. It then describes the importance of randomization in intervention studies, noting that randomization helps ensure the unexposed group is a valid control, controls for unknown confounders, facilitates blinding, and provides a foundation for statistical tests. The document discusses types of intervention studies, issues like compliance, analysis approaches like intention-to-treat, and ethical considerations.
Cross sectional study, by Dr Abhishek Kumar
This document discusses cross-sectional studies. It defines a cross-sectional study as an observational analytical study that determines exposure and disease simultaneously. Both chronic and acute diseases can be studied, and it provides a snapshot of disease distribution in a population. Cross-sectional studies can be descriptive, providing information on single or multiple variables, or analytical, assessing associations between variables. Key steps include identifying a reference population, determining sample size, sampling, data collection, and analysis of prevalence and prevalence ratios. Advantages are quick results and cost-effectiveness while disadvantages are inability to determine disease incidence or prove causality.
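The prevalence and prevalence-ratio analysis mentioned in the summary reduces to simple proportions; the survey counts below are hypothetical:

```python
# Hypothetical cross-sectional survey counts (illustrative only):
exposed_diseased, exposed_total = 30, 200
unexposed_diseased, unexposed_total = 15, 300

prevalence_exposed = exposed_diseased / exposed_total        # 0.15
prevalence_unexposed = unexposed_diseased / unexposed_total  # 0.05
prevalence_ratio = prevalence_exposed / prevalence_unexposed

print(prevalence_ratio)  # approx. 3.0
```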
1) Field trials evaluate prevention strategies by testing interventions on healthy individuals at risk of disease. They aim to determine if an agent or procedure reduces disease risk.
2) Field trials are generally conducted in real-world settings rather than clinical settings and can involve individuals, groups, or entire communities.
3) Random allocation of participants to intervention and control groups is important to reduce bias and ensure groups are comparable outside of the intervention being tested.
1. Bias and confounding can systematically skew epidemiological study results if not properly addressed. Bias results from errors in study design, data collection, analysis, or reporting. Confounding occurs when another variable is associated with both the exposure and outcome under study.
2. Common types of bias include selection bias, information bias, and surveillance bias. Selection bias results when study groups systematically differ. Information bias stems from misclassification of exposures or outcomes.
3. Confounding can be controlled for through restriction, matching, randomization, and statistical techniques like stratification and multivariate analysis. These methods help equalize the effects of extraneous variables to isolate the exposure-outcome association being studied.
This document describes different types of epidemiological study designs, including observational studies like cross-sectional, case-control, and cohort studies as well as experimental studies like randomized controlled trials. It provides details on descriptive versus analytical epidemiology and discusses cross-sectional, case-control, and cohort study designs, including their characteristics, uses, strengths, limitations, and how to conduct them. Key points covered include how observational studies observe exposures and outcomes without intervention, while experimental studies test interventions, and how case-control and cohort studies are used to test hypotheses about disease risk factors.
This document describes a cross-sectional study and its methodology. A cross-sectional study involves collecting data on exposure and outcome variables from a population at a single point in time. The document discusses the differences between descriptive and analytical cross-sectional studies. Descriptive studies measure prevalence, while analytical studies test associations between exposures and outcomes. The document provides examples of cross-sectional study design, biases, advantages, and guidelines for evaluating validity.
Descriptive and Analytical Epidemiology
This document provides an overview of a training session on descriptive and analytic epidemiology. Descriptive epidemiology involves describing disease frequency, distribution, and determinants in populations using measures like prevalence and incidence. Analytic epidemiology aims to understand why diseases occur using study designs like cohort studies and case-control studies to test hypotheses. Key terms discussed include measures of association like relative risk and odds ratio, and statistical tests like confidence intervals and p-values.
This document contains 26 slides presented by Dr. Rizwan S A on cohort studies. It defines cohort studies as prospective longitudinal studies that follow healthy populations over time to determine the causes of diseases. Key aspects covered include classifying cohort studies as prospective, retrospective or combined; describing the elements of cohort studies such as selecting and following subjects, measuring exposure and outcomes, and analyzing results using measures like relative risk, risk difference and attributable risk. Examples of famous cohort studies on smoking, heart disease and oral contraceptives are also provided.
This document provides an overview of non-randomized control trials. It discusses reasons why non-randomized studies are sometimes necessary, including ethical or feasibility concerns. It describes different types of non-randomized study designs like uncontrolled trials, natural experiments, before-after studies with and without controls, and quasi-experimental designs. It also discusses threats to internal validity in these designs like selection bias, and methods to adjust for these biases like regression and propensity score matching. The document emphasizes that while non-randomized studies can provide useful evidence, randomization is preferable when possible to minimize biases.
This document discusses types of observational research studies and sources of error. It describes case-control and cohort studies, as well as cross-sectional and prospective cohort studies. Electronic health records are increasingly common but often lack discrete data elements, making it difficult to extract useful outcomes data. While observational studies are faster and less expensive than randomized trials, they are subject to biases like selection, measurement, and confounding. Careful design can limit biases, but multiple studies are typically needed to prove causation. Observational research generates hypotheses that require further prospective research to determine causality.
Etiologic research aims to establish causal relationships between determinants and disease outcomes. There are two main observational study designs for etiologic research: cohort studies and case-control studies. Cohort studies follow groups of individuals over time based on exposure to a determinant and compare disease outcome rates. Case-control studies identify cases of a disease and controls without the disease and compare past exposure to determinants. Both study designs are prone to biases like selection bias, information bias, and confounding, which can distort the true determinant-outcome relationship if not adequately addressed.
This document discusses nested case-control studies, case-cohort studies, and case-crossover studies. It provides examples and discusses the advantages and disadvantages of each study design. Nested case-control studies select controls from within a prospective cohort study. Case-cohort studies select a random subcohort of controls from the entire cohort. Case-crossover studies use individuals as their own controls by comparing exposure during case periods to control periods.
This document discusses various types of bias, confounding, and causation that can occur in epidemiological studies. It defines a confounder as a variable that is associated with the exposure and affects the outcome but is not in the causal pathway. Three main types of bias are described: selection bias, information bias, and confounding. Specific biases like recall bias, observer bias, and non-respondent bias are explained. Methods for controlling confounding like matching, stratification, and multivariate analysis are also outlined. The document concludes by discussing Hill's criteria for determining a causal association and threats to the internal and external validity of experimental studies.
This document discusses different types of error and bias that can occur in epidemiological studies. It defines random error as occurring due to chance and resulting in imprecise measures, while systematic error or bias results in invalid measures that are not true. Types of bias discussed include selection bias, information bias, and confounding. Selection bias can arise from how cases and controls are selected, while information bias occurs when exposure or disease status is incorrectly classified. The document emphasizes the importance of reducing both random and systematic errors to obtain valid study results.
This document discusses bias and confounding in epidemiological studies. It defines bias as systematic error that results in incorrect estimation of exposure-outcome associations. Selection bias and information bias are two common types of bias. Confounding occurs when another exposure is associated with both the disease and exposure being studied, independently of the exposure-disease relationship. Methods to control for confounding include restriction, matching, randomization, stratification, and multivariate analysis at the design and analysis stages of a study.
The document discusses odds ratios, which are used to measure the association between an exposure and an outcome. An odds ratio is calculated by dividing the odds of an event in one group (e.g. exposed to a drug) by the odds of the event in another unexposed group. Odds ratios can be calculated in both cohort and case-control studies. While relative risk can only be calculated in cohort studies, odds ratios are commonly used to approximate relative risk in case-control studies when the outcome is rare. The document provides examples of how to calculate odds ratios from 2x2 contingency tables and interprets what different values mean.
This document discusses different risk measures used in epidemiology, including relative risk, odds ratio, and attributable risk. Relative risk measures the strength of association between an exposure and disease based on prospective studies. Odds ratio is used similarly in case-control studies when relative risk cannot be directly calculated. Attributable risk determines how much disease can be attributed to a specific exposure by comparing disease rates in exposed and unexposed groups. These measures provide important information for evaluating disease causation and determining potential disease prevention through reducing exposures.
This document summarizes a seminar presentation on case control studies. It begins by defining epidemiological study cycles and analytical study types such as case control and cohort studies. It then focuses on case control studies, defining them, discussing their history, design, outcomes, limitations, advantages and applications. Examples of notable case control studies are provided, such as Lane Claypon's 1926 breast cancer study and studies from the 1950s linking smoking to lung cancer. Key aspects of case control study methodology like selection of cases and controls, and matching to control for confounding variables are explained.
This document discusses bias and validity in clinical research. It defines clinical epidemiology as the study of health-related states and events in populations to control health problems. It describes how epidemiologic studies compare outcomes like disease rates between exposed and unexposed groups. Validity is important, with internal validity indicating good construct free from bias/errors, and external validity showing generalizability. Bias and confounding can threaten validity and lead to erroneous associations if not avoided or controlled for.
This document provides information on epidemiological study designs, including analytical studies, case-control studies, and cohort studies. It defines epidemiology as the study of health-related states in populations. Case-control studies look backward from the outcome to exposures, comparing cases to controls. Cohort studies follow groups over time to examine exposure-outcome relationships. The key difference is that cohort studies measure incidence while case-control studies measure odds ratios. Selection of appropriate study populations and controls is important to minimize biases.
1. Effect modification and confounding are two aspects to consider when examining the relationship between an exposure and outcome. Effect modification provides useful information about how a third variable impacts the relationship, while confounding can distort the observed effect.
2. Stratifying the data and calculating stratum-specific odds ratios is a way to check for effect modification and confounding. Effect modification is present if the odds ratios differ substantially between strata. Confounding is present if the crude odds ratio differs from the stratified odds ratios.
3. It is important to measure potential confounding factors so they can be controlled for through stratification or multivariate analysis. This prevents confounding from distorting the observed effect between the exposure and outcome.
This document describes different types of epidemiological study designs, including observational studies like cross-sectional, case-control, cohort, and experimental studies like randomized controlled trials. It provides details on descriptive versus analytical epidemiology and cross-sectional studies specifically. Cross-sectional studies measure prevalence at a single point in time by surveying exposures and disease status simultaneously in a population cross-section. They are useful for assessing disease burden, comparing prevalence between populations, and examining trends over time.
This document discusses intervention studies and randomized controlled trials. It begins by defining causality using the counterfactual model and comparing exposed and unexposed groups. It then describes the importance of randomization in intervention studies, noting that randomization helps ensure the unexposed group is a valid control, controls for unknown confounders, facilitates blinding, and provides a foundation for statistical tests. The document discusses types of intervention studies, issues like compliance, analysis approaches like intention-to-treat, and ethical considerations.
Cross sectional study by Dr Abhishek Kumar (ak07mail)
This document discusses cross-sectional studies. It defines a cross-sectional study as an observational analytical study that determines exposure and disease simultaneously. Both chronic and acute diseases can be studied, and it provides a snapshot of disease distribution in a population. Cross-sectional studies can be descriptive, providing information on single or multiple variables, or analytical, assessing associations between variables. Key steps include identifying a reference population, determining sample size, sampling, data collection, and analysis of prevalence and prevalence ratios. Advantages are quick results and cost-effectiveness while disadvantages are inability to determine disease incidence or prove causality.
1) Field trials evaluate prevention strategies by testing interventions on healthy individuals at risk of disease. They aim to determine if an agent or procedure reduces disease risk.
2) Field trials are generally conducted in real-world settings rather than clinical settings and can involve individuals, groups, or entire communities.
3) Random allocation of participants to intervention and control groups is important to reduce bias and ensure groups are comparable outside of the intervention being tested.
1. Bias and confounding can systematically skew epidemiological study results if not properly addressed. Bias results from errors in study design, data collection, analysis, or reporting. Confounding occurs when another variable is associated with both the exposure and outcome under study.
2. Common types of bias include selection bias, information bias, and surveillance bias. Selection bias results when study groups systematically differ. Information bias stems from misclassification of exposures or outcomes.
3. Confounding can be controlled for through restriction, matching, randomization, and statistical techniques like stratification and multivariate analysis. These methods help equalize the effects of extraneous variables to isolate the exposure-outcome association being studied.
This document describes different types of epidemiological study designs, including observational studies like cross-sectional, case-control, and cohort studies as well as experimental studies like randomized controlled trials. It provides details on descriptive versus analytical epidemiology and discusses cross-sectional, case-control, and cohort study designs, including their characteristics, uses, strengths, limitations, and how to conduct them. Key points covered include how observational studies observe exposures and outcomes without intervention, while experimental studies test interventions, and how case-control and cohort studies are used to test hypotheses about disease risk factors.
This document describes a cross-sectional study and its methodology. A cross-sectional study involves collecting data on exposure and outcome variables from a population at a single point in time. The document discusses the differences between descriptive and analytical cross-sectional studies. Descriptive studies measure prevalence, while analytical studies test associations between exposures and outcomes. The document provides examples of cross-sectional study design, biases, advantages, and guidelines for evaluating validity.
Descriptive and Analytical Epidemiology (coolboy101pk)
This document provides an overview of a training session on descriptive and analytic epidemiology. Descriptive epidemiology involves describing disease frequency, distribution, and determinants in populations using measures like prevalence and incidence. Analytic epidemiology aims to understand why diseases occur using study designs like cohort studies and case-control studies to test hypotheses. Key terms discussed include measures of association like relative risk and odds ratio, and statistical tests like confidence intervals and p-values.
This document contains 26 slides presented by Dr. Rizwan S A on cohort studies. It defines cohort studies as prospective longitudinal studies that follow healthy populations over time to determine the causes of diseases. Key aspects covered include classifying cohort studies as prospective, retrospective or combined; describing the elements of cohort studies such as selecting and following subjects, measuring exposure and outcomes, and analyzing results using measures like relative risk, risk difference and attributable risk. Examples of famous cohort studies on smoking, heart disease and oral contraceptives are also provided.
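The cohort measures named above (relative risk, risk difference, attributable risk) can be computed directly from incidence in the exposed and unexposed groups. The counts below are hypothetical round numbers for illustration, not figures from the smoking, heart disease, or oral contraceptive studies mentioned.

```python
# Hypothetical cohort of 3000 exposed and 3000 unexposed individuals.
exposed_cases, exposed_total = 84, 3000
unexposed_cases, unexposed_total = 21, 3000

risk_exposed = exposed_cases / exposed_total        # cumulative incidence in the exposed
risk_unexposed = unexposed_cases / unexposed_total  # cumulative incidence in the unexposed

relative_risk = risk_exposed / risk_unexposed            # RR: how many times more likely
risk_difference = risk_exposed - risk_unexposed          # absolute excess risk in the exposed
attributable_fraction = risk_difference / risk_exposed   # share of exposed cases attributable to exposure

print(f"RR={relative_risk:.1f} RD={risk_difference:.3f} AF={attributable_fraction:.2f}")
```

With these numbers the exposed group has four times the risk (RR = 4.0), an absolute excess of 2.1 cases per 100 people, and 75% of exposed cases attributable to the exposure.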
This document provides an overview of non-randomized control trials. It discusses reasons why non-randomized studies are sometimes necessary, including ethical or feasibility concerns. It describes different types of non-randomized study designs like uncontrolled trials, natural experiments, before-after studies with and without controls, and quasi-experimental designs. It also discusses threats to internal validity in these designs like selection bias, and methods to adjust for these biases like regression and propensity score matching. The document emphasizes that while non-randomized studies can provide useful evidence, randomization is preferable when possible to minimize biases.
This document discusses types of observational research studies and sources of error. It describes case-control and cohort studies, as well as cross-sectional and prospective cohort studies. Electronic health records are increasingly common but often lack discrete data elements, making it difficult to extract useful outcomes data. While observational studies are faster and less expensive than randomized trials, they are subject to biases like selection, measurement, and confounding. Careful design can limit biases, but multiple studies are typically needed to prove causation. Observational research generates hypotheses that require further prospective research to determine causality.
Etiologic research aims to establish causal relationships between determinants and disease outcomes. There are two main observational study designs for etiologic research: cohort studies and case-control studies. Cohort studies follow groups of individuals over time based on exposure to a determinant and compare disease outcome rates. Case-control studies identify cases of a disease and controls without the disease and compare past exposure to determinants. Both study designs are prone to biases like selection bias, information bias, and confounding, which can distort the true determinant-outcome relationship if not adequately addressed.
This document discusses case control studies and provides examples to illustrate their use. It defines a case control study as an epidemiological approach that starts with identified "cases" who have a disease and compares them to "controls" who do not have the disease. The study then examines past exposure history to identify potential risk factors.
Key aspects of case control studies covered include selecting appropriate cases and controls, matching on important variables, measuring past exposure, calculating odds ratios to estimate disease risk associated with exposures, and potential biases like selection bias, recall bias, and survivorship bias. Examples are provided of early case control studies that helped identify links between smoking and lung cancer, and between rubella infection and cataracts.
This document discusses various types of bias that can occur in epidemiological studies, including selection bias and information bias. Selection bias occurs when the study population is not representative of the target population, such as when there are differential participation rates related to exposure or disease status. Information bias, also called misclassification bias, arises from errors in measuring or recording exposure, disease status, or other variables. There is often no perfect way to measure exposures, and tools to assess past exposures like diet are especially imprecise. The direction and magnitude of bias introduced can impact study results and conclusions.
This document outlines different types of epidemiological study designs including observational studies like descriptive studies, analytical studies, ecological studies, cross-sectional studies and case-control studies. It also discusses experimental study designs like randomized controlled trials, field trials and community trials. Key features and steps are provided for case-control studies and cohort studies. Sources of bias and errors in epidemiological studies are also summarized.
This document discusses sources of information bias in epidemiological studies. It describes different types of information bias, including classification or measurement bias, differential versus non-differential bias, and the direction of bias. Examples of sources of error are provided, such as respondent bias from an inability to recall or disclose accurately, data-collector bias from unclear questions, and selection bias, illustrated by a classic study in which only a portion of mailed letters arrived at their destination. Validation techniques and ways of avoiding data collection errors are also discussed.
Case-control study is a variety of analytical studies. This is a brief presentation regarding history, design, issues, advantages - disadvantages and examples of Case-control study.
Excelsior College PBH 321: BIAS IN EPIDE.docx (gitagrimston)
Excelsior College PBH 321
Page 1
BIAS IN EPIDEMIOLOGY
THE MEANING AND CONTEXT OF BIAS
We briefly touched upon bias earlier in the course. The word bias probably has an intuitive meaning to you;
recall the definition from Module 3:
“Deviation of results or inferences from the truth, or processes leading to such deviation. Any trend in the
collection, analysis, interpretation, publication, or review of data that can lead to conclusions that are
systematically different from the truth.” (J.M. Last, A Dictionary of Epidemiology, 4th ed.)
Bias is a systematic error that results in an incorrect (invalid) estimate of the measure of association. Bias can
therefore create the appearance of an association when there really is none (bias away from the null), or mask
an association when there really is one (bias towards the null). Bias can arise in all study types: experimental,
cohort, and case-control designs. It is primarily introduced by the investigator or study participants in the
design and conduct of a study; once it occurs, it cannot be removed, but it can be evaluated during the
analysis phase of a study. Bias is a direct threat to the validity of any epidemiologic study, and influences
whether we can believe the observed association is a true association.
DIRECTION OF BIAS
We can describe the impact of bias on the measure of association in a few different ways. Remember, the null
value is 1.0; the value that means there is no observed association between exposure and disease. If the true
association is protective (less than 1.0), bias towards the null will make it seem less protective as the measure
of association gets closer to the null value.
• Positive bias: the observed value is higher than the true value
• Negative bias: the observed value is lower than the true value
• Bias towards the null: the observed value is closer to 1.0 than the true value.
• Bias away from the null: the observed value is farther away from 1.0 than the true value.
[Figure: a number line for the measure of association centered at 1.0 (no association), with protective effects below 1.0 and harmful effects above. On each side, bias towards the null moves the observed value closer to 1.0, while bias away from the null moves it farther from 1.0.]
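The four bullet definitions above can be expressed as a small helper function. This is an illustrative sketch only; it assumes a ratio measure (RR or OR) with a null value of 1.0.

```python
def bias_direction(true_value, observed_value, null=1.0):
    """Classify bias in a ratio measure (RR or OR) relative to the null value.
    Illustrative sketch; assumes both values are on a ratio scale."""
    if observed_value == true_value:
        return "none"
    if (true_value - null) * (observed_value - null) < 0:
        return "crossed the null"      # observed value on the opposite side of 1.0
    if abs(observed_value - null) < abs(true_value - null):
        return "towards the null"
    return "away from the null"

# A truly protective exposure (RR = 0.5) observed as RR = 0.8 looks less
# protective, so the bias is towards the null.
print(bias_direction(0.5, 0.8))   # towards the null
print(bias_direction(2.0, 3.0))   # away from the null
```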
TYPES OF BIAS
Two main types of bias are selection and information (observation) bias. You had an introduction to these
biases in Module 4.
1. Selection Bias
Selection bias is error that occurs if selection of subjects into a study is related to both exposure and disease.
As a result, the measure of association differs from what would have been obtained if we could examine the
entire population targeted for study. Selection bias is most likely to occur in case-control or retrospective
cohort studies because exposure and outcome have already occurred at the time of study selection, and may influence
the willing ...
A cohort study involves observing a group of individuals over time to examine exposure-outcome relationships. Key characteristics include prospectively following exposed and unexposed groups to compare disease outcomes. Major biases include loss to follow up and misclassification of exposure status. Cohort studies are well-suited for rare exposures and allow examination of multiple outcomes, but require large sample sizes and are time-consuming.
Observational studies are divided into descriptive and analytical studies. They are non-experimental: observational because there is no individual intervention, treatments and exposures occur in a "non-controlled" environment, and individuals can be observed prospectively or retrospectively.
COHORT STUDY: an "observational" design comparing individuals with a known risk factor or exposure to others without that risk factor or exposure, looking for a difference in the risk (incidence) of a disease over time. It is often regarded as the strongest observational design; data are usually collected prospectively (sometimes retrospectively).
CASE-CONTROL STUDY: proceeds from effect to cause. It is retrospective and is useful when the disease is rare.
The document discusses key concepts regarding natural history of disease and population screening:
- Understanding natural history is fundamental for effective disease prevention at primary, secondary, and tertiary levels.
- Requirements for screening programs include the disease being suitable for screening, having a suitable screening test, implementing a suitable screening program, and making good use of resources.
- Validity of screening tests depends on their sensitivity to correctly classify cases and specificity to correctly classify non-cases. Interpreting screening results requires considering predictive values.
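A minimal sketch of how these validity measures are computed from a screening 2x2 table, using invented counts for a low-prevalence disease (1% of 10,000 screened):

```python
# Hypothetical screening of 10,000 people; 100 have the disease (1% prevalence).
tp, fn = 90, 10      # true cases: test positive vs test negative
fp, tn = 190, 9710   # non-cases: test positive vs test negative

sensitivity = tp / (tp + fn)   # probability the test correctly classifies a case     (0.90)
specificity = tn / (tn + fp)   # probability the test correctly classifies a non-case (0.98)
ppv = tp / (tp + fp)           # probability a positive result is a true case         (0.32)
npv = tn / (tn + fn)           # probability a negative result is a true non-case

print(f"sens={sensitivity:.2f} spec={specificity:.2f} PPV={ppv:.2f} NPV={npv:.3f}")
```

Note that even with high sensitivity and specificity, the positive predictive value is low when prevalence is low, which is why interpreting screening results requires considering predictive values, not just test validity.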
This document provides an overview of observational (non-experimental) study designs used in epidemiology. It describes descriptive studies such as case reports, case series, cross-sectional studies, and correlational studies. It also describes analytical studies including case-control and cohort studies. Case-control studies compare exposure history in cases versus controls, while cohort studies follow groups exposed versus not exposed over time to compare incidence rates. Both are used to test hypotheses about risk factors but cohort studies provide a direct measure of risk through relative risk. Bias is a potential limitation of all observational studies.
This document discusses cross-sectional studies and ecologic studies. It provides examples of national health surveys that use complex sampling designs to study populations. Cross-sectional studies measure prevalence at a single point in time and can examine multiple exposures and outcomes simultaneously. However, they cannot determine whether an exposure preceded an outcome. Ecologic studies examine relationships at a group level but inferences from group to individual level may not be valid.
2010-Epidemiology (Dr. Sameem) basics and priciples.ppt (AmirRaziq1)
Epidemiology is the study of the distribution and determinants of health-related states in populations. There are three main types of epidemiological studies: observational studies which examine risk factors without interfering; experimental (interventional) studies which manipulate factors; and descriptive studies which show disease patterns and frequencies. Case-control studies are retrospective and compare exposures in cases (diseased) and controls (non-diseased) to identify risk factors. Cohort studies are prospective and follow exposure groups over time to calculate disease incidence and identify risk factors. Cross-sectional studies provide a snapshot of disease prevalence and help generate hypotheses for further research.
The document discusses different types of epidemiological studies, including descriptive studies like case reports and case series that focus on person, place and time to create hypotheses. Analytical studies like case-control and cohort studies are used to test hypotheses by being either observational or interventional. Randomized controlled trials are the gold standard for comparing new interventions. Observational analytical studies include cross-sectional, cohort and case-control designs, while interventional analytical studies are clinical trials. The appropriate study design depends on the research goals and objectives.
A cohort study is a longitudinal study that follows groups of individuals who are alike in many ways but differ by a certain characteristic, such as exposure to a risk factor. The study monitors the groups for a specified period of time to determine if those exposed to the risk factor experience more of the outcome than the unexposed group. Key features include identifying cohorts prior to the outcome occurring and observing them over time from cause to effect. Cohort studies can be prospective, following groups forward in time, or retrospective, looking back at records to identify exposed and unexposed groups. They are useful for rare exposures and provide direct estimates of risk but require large sample sizes and long follow-up periods.
This document discusses various types of errors and biases that can occur in epidemiological studies. It defines error as a phenomenon where a study's results do not reflect the true facts. There are two basic types of error: random error, which occurs by chance and makes observed values differ from true values; and systematic error or bias, which is due to factors in the study design that cause results to depart from the truth. Types of bias discussed include selection bias, information bias, and confounding. Strategies for controlling biases such as randomization, restriction, matching, and statistical modeling are also outlined.
This chapter introduces communicable diseases and their epidemiology in Ethiopia. It defines key epidemiological terms used to describe diseases. Communicable diseases pose a major health burden in Ethiopia. Many factors contribute to their transmission, including poverty, poor sanitation and lack of access to health care. The major communicable diseases affecting Ethiopia are described.
This document is a manual published by the World Health Organization in 1997 on vector control methods for use by individuals and communities. It contains 10 chapters that describe the biology, public health importance, and control measures for various disease vectors, including mosquitoes, tsetse flies, triatomine bugs, fleas, lice, ticks, mites, cockroaches, houseflies, freshwater snails, and cyclops. For each vector, the manual provides details on its life cycle, disease transmission, and recommends methods for personal protection as well as community-based control strategies.
The 10-step approach to outbreak investigations involves:
1) Identifying an investigation team and resources.
2) Establishing the existence of an outbreak.
3) Verifying the diagnosis, constructing a case definition, and finding cases systematically.
Descriptive epidemiology is then used to develop hypotheses, which are evaluated through additional studies if needed, before implementing control measures, communicating findings, and maintaining surveillance to confirm the outbreak has ended. Being systematic and following these steps is key to determining the source and controlling outbreaks.
According to a new assessment by the UN Food and Agriculture Organization and Famine Early Warning Systems Network, around 731,000 Somalis face acute food insecurity and 2.3 million more are at risk. This brings the total number of people in need of humanitarian assistance to 3 million. Malnutrition rates remain high, with nearly 203,000 children acutely malnourished. The humanitarian situation has improved in some areas due to above average rainfall and increased aid, but concerns remain for 2015. The humanitarian response plan requests $863 million to address ongoing needs and prevent a major crisis from undoing recent peace and state building progress in Somalia.
This document discusses communicable diseases. It defines communicable diseases as diseases that can spread from one person to another through various modes of transmission like air, water, food, or contact. Some common communicable diseases mentioned include influenza, polio, typhoid, measles, mumps, chickenpox, tuberculosis, and AIDS. It also discusses immunity and how the body develops immunity to diseases either naturally after suffering from an illness or artificially through vaccination. Preventing the spread of communicable diseases requires measures like maintaining hygiene, immunization, and promptly treating illnesses.
This document outlines the Canadian Nurses Association's position on primary health care. It believes primary health care is integral to improving health outcomes for Canadians and that its principles, such as accessibility, health promotion, and intersectoral collaboration, are the most effective way to provide equitable healthcare. The CNA also believes primary health care and nursing are closely connected, and nursing standards and education should be grounded in primary health care principles. Adopting a primary health care approach could help address rising healthcare costs and improve Canada's performance on health indicators relative to other countries.
This document provides an overview of general nutrition concepts. It defines key terms like food, nutrition, diet, and malnutrition. It outlines the six major nutrients - carbohydrates, proteins, fats, vitamins, minerals, and water. The document discusses dietary guidelines and food groups. It explains that human beings need food to provide energy for essential physiological functions like respiration, circulation, digestion, metabolism, maintaining body temperature, growth, and repair of tissues. The most vulnerable groups who require adequate nutrition are infants, young children, pregnant women, and lactating mothers.
The document outlines a road map to accelerate HIV prevention efforts to meet global targets of reducing new HIV infections by 75% by 2020. It finds that while progress has been made, declines in new infections have been too slow, with only 1.7 million new infections in 2016, an 11% decline since 2010. Of 25 focus countries, only 3 saw over 30% declines, while 8 had no decline or increases. No country met the 2015 target of 50% reduction. Faster progress is needed to avoid increased treatment costs and continued mother-to-child transmission programs. The road map proposes intensified prevention programs, especially for adolescent girls, young women and key populations.
This document discusses the key ethical issues that arise in public health surveillance programs. It begins with a brief history of public health surveillance and definitions of key terms. The main ethical problem discussed is the potential conflict between individual interests/rights and collective interests. While clinical ethics focuses on individual physician-patient relationships, public health ethics must consider the broader community. Some argue the ethics of public health and clinical practice are distinctly different given this shift from individual to collective interests. The document examines how tools and checklists can help evaluate the ethical acceptability of surveillance programs.
This document provides an overview of planning and management for health extension workers. It defines management as a process of reaching organizational goals through people and resources. The key functions of management are planning, organizing, staffing, directing, and controlling. Planning involves setting objectives and strategies, while evaluation assesses progress towards objectives. Communication and decision-making are also integral to the management process. Effective management applies principles like management by objectives and learning from experience. The roles of administration and management are also distinguished, with administration focusing more on policy and management on execution.
The document provides recommendations for surveillance of acute viral hepatitis. It defines clinical and laboratory criteria for diagnosing hepatitis A, B, and non-A/non-B. Surveillance is recommended to guide control measures like ensuring blood and injection safety and immunization programs. Countries should monitor cases of acute jaundice and increase in liver enzymes to detect hepatitis outbreaks and evaluate prevention programs. Standardized case definitions and laboratory tests are important for comparable surveillance data.
This document provides an introduction to a module on the Expanded Program on Immunization (EPI) in Ethiopia. The module aims to train health center teams and other health professionals to increase immunization coverage and reduce morbidity and mortality from six childhood diseases. Despite initiatives over the years, immunization coverage remains low in Ethiopia due to factors like lack of transportation, ineffective cold chains, shortage of trained staff, poor collaboration, and inadequate community involvement. The module seeks to address this through training and bringing about significant changes in EPI coverage.
This document provides a handbook on water programming published by UNICEF in 1999. It aims to guide field professionals in implementing UNICEF's water, environment and sanitation strategies. The handbook covers topics such as water and sustainable development, community participation and management, cost effectiveness, appropriate water technologies, and maintenance of water supply systems. It emphasizes the importance of community-based management of water resources, cost-effective solutions, and involvement from all levels of government and communities in water sector issues.
This document provides an introduction to the Somali PHAST Step-by-Step Guide, which uses participatory methods to help communities improve hygiene behaviors, prevent diarrheal diseases, and encourage community management of water and sanitation facilities. The guide contains 7 steps to take communities through developing a plan for preventing diarrheal diseases. Section 2 provides background concepts, defining hygiene, sanitation, the link between the two, and that hygiene and sanitation promotion requires more than just asking people to change - it requires understanding disease transmission and being motivated to promote positive behaviors.
The development of this lecture note for training Health Extension workers is an arduous assignment for Dr. Meseret Yazachew and Dr. Yihenew Alem at Jimma University.
This document was developed with inputs from many institutions and experts. Several individuals deserve special mention. Mary Arimond, Kathryn Dewey and Marie Ruel developed the analytical framework and provided technical oversight throughout the project. Eunyong Chung and Anne Swindale provided technical support. Nita Bhandari, Roberta Cohen, Hilary Creed de Kanashiro, Christine Hotz, Mourad Moursi, Helena Pachon and Cecilia C. Santos-Acuin conducted analysis of data sets. Chessa Lutter coordinated a working group to update the breastfeeding indicators. Mary Arimond and Megan Deitchler coordinated the working group that developed the Operational Guide on measurement issues which is a companion to this document. Bernadette Daelmans and José Martines coordinated the project throughout its phases. Participants in the consensus meetings held in Geneva 3–4 October 2006 and in Washington, DC 6–8 November 2007 provided invaluable inputs to formulate the recommendations put forward in this document.
POLICY MAKING PROCESS
Policy
• A statement of intent for achieving an objective; a deliberate statement aimed at achieving a specific objective.
• Policies are formulated by the Government in order to provide a guideline for attaining certain objectives for the benefit of the people.
Importance and objective of any policy
• To solve existing challenges/problems in any society.
• Used as a tool to safeguard and ensure better services to members of the society.
Reasons for formulating a Policy
• Reforms (socio-economic, technological advancements, etc.) within and outside the country.
This document describes a case-control study conducted to determine the reason for many students failing an exam. The study found that students who did not attend lectures had an 80 times higher chance of failing compared to students who did attend, and that this result was statistically significant with a p-value less than 0.05, suggesting not attending lectures was the likely cause of failure.
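The reported odds ratio can be reproduced from a 2x2 table. The counts below are hypothetical, chosen only so that the cross-product ratio comes out to 80, since the summary does not give the actual numbers from the study.

```python
# Hypothetical counts consistent with an odds ratio of about 80
# (the summary above does not report the actual cell counts).
failed_skipped, failed_attended = 40, 10    # cases: students who failed the exam
passed_skipped, passed_attended = 5, 100    # controls: students who passed

# Cross-product odds ratio for a case-control 2x2 table:
# (cases exposed * controls unexposed) / (cases unexposed * controls exposed)
odds_ratio = (failed_skipped * passed_attended) / (failed_attended * passed_skipped)
print(odds_ratio)   # 80.0
```

In a case-control design the odds ratio is the appropriate measure because incidence cannot be computed; statistical significance would then be assessed with a test such as chi-square.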
Aim of nutritional assessment
To identify nutritional problems of the community
To find the underlying cause for malnutrition
To plan and implement control of malnutrition
Maintain good nutrition of community
Ancylostomiasis, or hookworm infection, is an important global public health problem caused by parasitic hookworms that infect humans. It is transmitted when larvae penetrate the skin and enter the body, usually through walking barefoot on contaminated soil. In Libya, hookworm infection is very rare, with most cases found in farmers who come into contact with infected feces in soil. The hookworms live in the intestine and feed on blood, potentially causing iron deficiency anemia and related health issues if left untreated. Prevention relies on sanitary disposal of human waste and health education to avoid transmission.
Integrating Ayurveda into Parkinson’s Management: A Holistic Approach (Ayurveda ForAll)
Explore the benefits of combining Ayurveda with conventional Parkinson's treatments. Learn how a holistic approach can manage symptoms, enhance well-being, and balance body energies. Discover the steps to safely integrate Ayurvedic practices into your Parkinson’s care plan, including expert guidance on diet, herbal remedies, and lifestyle modifications.
NVBDCP.pptx: National Vector Borne Disease Control Program (Sapna Thakur)
The NVBDCP was launched in 2003-2004. A vector-borne disease results from an infection transmitted to humans and other animals by blood-feeding arthropods such as mosquitoes, ticks, and fleas. Examples of vector-borne diseases include dengue fever, West Nile virus, Lyme disease, and malaria.
These lecture slides, by Dr Sidra Arshad, offer a quick overview of the physiological basis of a normal electrocardiogram.
Learning objectives:
1. Define an electrocardiogram (ECG) and electrocardiography
2. Describe how dipoles generated by the heart produce the waveforms of the ECG
3. Describe the components of a normal electrocardiogram of a typical bipolar lead (limb II)
4. Differentiate between intervals and segments
5. Enlist some common indications for obtaining an ECG
6. Describe the flow of current around the heart during the cardiac cycle
7. Discuss the placement and polarity of the leads of electrocardiograph
8. Describe the normal electrocardiograms recorded from the limb leads and explain the physiological basis of the different records that are obtained
9. Define mean electrical vector (axis) of the heart and give the normal range
10. Define the mean QRS vector
11. Describe the axes of leads (hexagonal reference system)
12. Comprehend the vectorial analysis of the normal ECG
13. Determine the mean electrical axis of the ventricular QRS and appreciate the mean axis deviation
14. Explain the concepts of current of injury, J point, and their significance
Study Resources:
1. Chapter 11, Guyton and Hall Textbook of Medical Physiology, 14th edition
2. Chapter 9, Human Physiology - From Cells to Systems, Lauralee Sherwood, 9th edition
3. Chapter 29, Ganong’s Review of Medical Physiology, 26th edition
4. Electrocardiogram, StatPearls - https://www.ncbi.nlm.nih.gov/books/NBK549803/
5. ECG in Medical Practice by ABM Abdullah, 4th edition
6. Chapter 3, Cardiology Explained, https://www.ncbi.nlm.nih.gov/books/NBK2214/
7. ECG Basics, http://www.nataliescasebook.com/tag/e-c-g-basics
Recomendações da OMS sobre cuidados maternos e neonatais para uma experiência pós-natal positiva.
Em consonância com os ODS – Objetivos do Desenvolvimento Sustentável e a Estratégia Global para a Saúde das Mulheres, Crianças e Adolescentes, e aplicando uma abordagem baseada nos direitos humanos, os esforços de cuidados pós-natais devem expandir-se para além da cobertura e da simples sobrevivência, de modo a incluir cuidados de qualidade.
Estas diretrizes visam melhorar a qualidade dos cuidados pós-natais essenciais e de rotina prestados às mulheres e aos recém-nascidos, com o objetivo final de melhorar a saúde e o bem-estar materno e neonatal.
Uma “experiência pós-natal positiva” é um resultado importante para todas as mulheres que dão à luz e para os seus recém-nascidos, estabelecendo as bases para a melhoria da saúde e do bem-estar a curto e longo prazo. Uma experiência pós-natal positiva é definida como aquela em que as mulheres, pessoas que gestam, os recém-nascidos, os casais, os pais, os cuidadores e as famílias recebem informação consistente, garantia e apoio de profissionais de saúde motivados; e onde um sistema de saúde flexível e com recursos reconheça as necessidades das mulheres e dos bebês e respeite o seu contexto cultural.
Estas diretrizes consolidadas apresentam algumas recomendações novas e já bem fundamentadas sobre cuidados pós-natais de rotina para mulheres e neonatos que recebem cuidados no pós-parto em unidades de saúde ou na comunidade, independentemente dos recursos disponíveis.
É fornecido um conjunto abrangente de recomendações para cuidados durante o período puerperal, com ênfase nos cuidados essenciais que todas as mulheres e recém-nascidos devem receber, e com a devida atenção à qualidade dos cuidados; isto é, a entrega e a experiência do cuidado recebido. Estas diretrizes atualizam e ampliam as recomendações da OMS de 2014 sobre cuidados pós-natais da mãe e do recém-nascido e complementam as atuais diretrizes da OMS sobre a gestão de complicações pós-natais.
O estabelecimento da amamentação e o manejo das principais intercorrências é contemplada.
Recomendamos muito.
Vamos discutir essas recomendações no nosso curso de pós-graduação em Aleitamento no Instituto Ciclos.
Esta publicação só está disponível em inglês até o momento.
Prof. Marcus Renato de Carvalho
www.agostodourado.com
Osteoporosis - Definition , Evaluation and Management .pdfJim Jacob Roy
Osteoporosis is an increasing cause of morbidity among the elderly.
In this document , a brief outline of osteoporosis is given , including the risk factors of osteoporosis fractures , the indications for testing bone mineral density and the management of osteoporosis
Basavarajeeyam is a Sreshta Sangraha grantha (Compiled book ), written by Neelkanta kotturu Basavaraja Virachita. It contains 25 Prakaranas, First 24 Chapters related to Rogas& 25th to Rasadravyas.
1. 3/15/2011 Sources of error: Selection bias 1
Sources of error: Selection bias
Victor J. Schoenbach, PhD
Department of Epidemiology
Gillings School of Global Public Health
University of North Carolina at Chapel Hill
www.unc.edu/epid600/
Principles of Epidemiology for Public Health (EPID600)
2.
Excerpts from (allegedly) actual student
history essays collected by teachers from
8th grade through college
Richard Lederer, St. Paul's School
http://www.edfac.usyd.edu.au/staff/souters/Humour/Student.Blooper.World.html
3.
Excerpts from (allegedly) actual student history essays collected by teachers from
8th grade through college
Richard Lederer, St. Paul's School
“Finally Magna Carta provided that no man should
be hanged twice for the same offense.”
“Another story was William Tell, who shot an
arrow through an apple while standing on his
son's head.”
4.
Excerpts from (allegedly) actual student history essays collected by teachers from
8th grade through college
Richard Lederer, St. Paul's School
“It was an age of great inventions and discoveries.
Gutenberg invented removable type and the
Bible. Another important invention was the
circulation of blood. Sir Walter Raleigh is a
historical figure because he invented cigarettes
and started smoking. And Sir Francis Drake
circumcised the world with a 100 foot clipper.”
5.
Winning tongue in cheek headlines selected for
humorous magazine cover contest (Epimonitor.net)
•“Don’t Ask, Don’t Tell---Double Blinding In Military
Studies” (Melissa Adams)
•Sensitivity Analysis---Are You Really A New Man?
(Mark Colvin)
•Absent Sex Life? Lucky You! 38 Sexually Transmitted
Diseases You Won’t Get (Mark Colvin)
•Do You Have Survey Phobia? Take Our Quiz and Find
Out!! (Mary Anne Pietrusiak)
6.
“To error is human”
• Science emphasizes systematic,
repeatable, carefully-conducted
observation
• Laboratory investigations are
highly controlled, to minimize
unwanted influences
• Human sciences must contend
with many threats to validity
www.cfa.harvard.edu/poem/
7.
“To error is human”
Any epidemiologic study presents many, many
opportunities for error in relation to:
• Selection of study participants
• Classification and measurement
• Comparison and interpretation
8.
Accuracy, Bias, Error, Precision, Reliability, Validity
• Accuracy: lack of error – getting the correct result
• Bias: systematic error – independent of study size
• Error: discrepancy between the observed result and the true
value
• Precision: absence of random error
• Reliability: repeatability of a measure
• Validity: absence of bias (or absence of all error)
9.
Internal validity versus external validity
• Internal validity: whether the study provides an
unbiased estimate of what it claims to estimate
• External validity: whether the results from the study
can be generalized to some other population
10.
“Random error”
• “that part of our experience that we cannot
predict” (Rothman and Greenland, p. 78)
• Usually most easily conceptualized as sampling
variability
11.
Draw 1,000 random samples from population with 50% women
Estimate of % women

Distribution    N = 100     N = 1,000
Highest         68          54.9
10%             above 56    above 51.9
25%             above 53    above 51.0
25%             below 47    below 48.9
10%             below 44    below 48.0
Lowest          33          45.0
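The table above can be reproduced with a short simulation (a sketch; the exact percentiles will vary from run to run and with the seed):

```python
import random

random.seed(1)

def pct_women(n, draws=1000):
    # Draw `draws` samples of size n from a population that is 50% women;
    # return the sorted sample estimates of the percentage of women.
    return sorted(100 * sum(random.random() < 0.5 for _ in range(n)) / n
                  for _ in range(draws))

for n in (100, 1000):
    ests = pct_women(n)
    # Lowest, 25th-percentile, 75th-percentile, and highest estimates
    print(n, ests[0], ests[250], ests[750], ests[-1])
```

The spread of the N = 100 estimates is roughly three times that of the N = 1,000 estimates, mirroring the table.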
12.
Random error can be problematic, but . . .
• Influence can be reduced
– increase sample size
– change design
– improve instrumentation
• Probability of some types of influence can be
quantified (e.g., confidence interval width)
13.
Systematic error (“bias”) is more problematic
• May be present without investigator being
aware
• Sources may be difficult to identify
• Influence may be difficult to assess
14.
Direction of bias - “which way is up?”
• Positive bias – observed value is higher than the true
value
• Negative bias – observed value is lower than the true
value
• Bias towards the null – observed value is closer to 1.0
than is the true value
• Bias away from the null – observed value is farther
from 1.0 than is the true value
[Diagram: number lines locating the observed value above the true value (positive bias), below it (negative bias), and closer to or farther from the null than the true value]
15.
Classifying types of bias
• Selection bias – differential access to the study
population
• Information bias – inaccuracy in measurement or
classification
• Confounding bias – unfair comparison
16.
Possible sources of selection bias
• Participant selection procedure, e.g., exposure
affects case ascertainment (“detection bias”) or
control selection
• Differential unavailability due to death (“selective
survival”), illness, migration, or refusal
(nonresponse bias)
• Loss to follow-up / attrition / missing data
17.
Case-control studies
• Case-control studies are highly vulnerable to
selection bias, particularly in the control group.
• The purpose of the control group is to estimate
exposure in the base population.
• Selection bias results if control selection is not
neutral with respect to exposure.
18.
Example: differential loss to follow-up (cohort
study)
             Complete cohort       Observed cohort
             Cases    Noncases     Cases    Noncases
Exposed      40       160          32       144
Unexposed    20       180          18       162
Total        60       340          50       306
Risk ratio       2.0                   1.8
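The two risk ratios in this table follow directly from the cell counts; a minimal check:

```python
def risk_ratio(exp_cases, exp_noncases, unexp_cases, unexp_noncases):
    # Cumulative incidence in the exposed divided by that in the unexposed
    ci_exposed = exp_cases / (exp_cases + exp_noncases)
    ci_unexposed = unexp_cases / (unexp_cases + unexp_noncases)
    return ci_exposed / ci_unexposed

print(round(risk_ratio(40, 160, 20, 180), 1))  # complete cohort: 2.0
print(round(risk_ratio(32, 144, 18, 162), 1))  # observed cohort: 1.8
```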
19.
Example: non-response bias in case-control study
             Target population      Study population
             Cases    Pop-time      Cases    Noncases
Exposed      200      20,000        180      200
Unexposed    400      40,000        300      400
Total        600      60,000        480      600
Odds ratio       1.0                    1.2
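The biased odds ratio in the study population can be checked from the cell counts (a sketch; the differential response fractions of 90% for exposed cases and 75% for unexposed cases come from the editor's notes for this slide, and the control counts are taken as given):

```python
def odds_ratio(a, b, c, d):
    # a, c = exposed/unexposed cases; b, d = exposed/unexposed controls
    return (a * d) / (b * c)

# 90% of the 200 exposed cases and 75% of the 400 unexposed cases participate
exposed_cases = 0.90 * 200    # 180
unexposed_cases = 0.75 * 400  # 300
print(odds_ratio(exposed_cases, 200, unexposed_cases, 400))  # 1.2
```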
20.
Examples
• Case-control study of socioeconomic factors and
risk of anencephaly in Mexico
• Case-control study of physical fitness and heart
attacks
• Cohort study of smoking cessation methods (Free
& Clear)
21.
Selective survival can affect a cohort study
Natural history of HIV infection
1 Recruit participants at the time they become HIV
infected
2 Recruit participants by means of a serologic survey to
detect prevalent infections.
People who progress to AIDS and die relatively soon after
infection will be underrepresented in study 2.
22.
Prevalent cohort – cohort of survivors
Effect of hypertension in an elderly cohort
If people who die before becoming elderly were more
vulnerable to end-organ damage from hypertension, then
the cohort study may observe less morbidity associated
with hypertension than would be observed if the study
had enrolled younger participants.
23.
Example – cohort of survivors
Effect of environmental tobacco smoke (ETS) in newborns
If ETS increases early fetal losses, possibly undetected,
then the most susceptible fetuses may be unavailable to
be part of the cohort study of effects of ETS on infants.
In this example, the early fetal losses are “informative”.
24.
Conceptual framework
(Kleinbaum, Kupper, Morgenstern)
• Target population
• Actual population
• Study population
• “Selection probabilities”
25.
Target population vs. actual population
• Target population – the population from which we
think we are studying a sample
• Actual population – the population from which we
are actually studying a sample
• Study population – a random sample from the
actual population
26.
Actual population vs. study population
• “Selection probabilities” – the probability that
someone in the target population will be in the
actual population
Selection probability (for each exposure-disease
subgroup) = probability (target → actual)
27.
Selection probabilities from the target
population to the actual population
             Target population        Actual population
             Exposed    Unexposed     Exposed    Unexposed
Cases        A          B             A₀         B₀
Noncases     C          D             C₀         D₀

alpha = A₀/A, beta = B₀/B, gamma = C₀/C, delta = D₀/D
28.
Example of selection probabilities
Endometrial cancer and estrogen use
• Case-control studies found high OR’s
• Horwitz and Feinstein claimed “detection bias”:
women may have asymptomatic tumors; if they have
vaginal bleeding from estrogens their cancer will be
discovered
29.
Horwitz and Feinstein’s argument
                     Target population         Actual population
                     Estrogen    Unexposed     Estrogen     Unexposed
Endometrial cancer   2,000       8,000         1,800        900
Noncases             2,000,000   8,000,000     1,600,000    6,400,000
OR                        1.0                       8.0
30.
Horwitz and Feinstein’s argument
Actual population / Target population
            Estrogen                          Unexposed
Cases       alpha = 1,800 / 2,000             beta = 900 / 8,000
Noncases    gamma = 1,600,000 / 2,000,000     delta = 6,400,000 / 8,000,000
31.
Horwitz and Feinstein’s argument
Actual population / Target population
            Estrogen       Unexposed
Cases       alpha = 0.9    beta = 0.11
Noncases    gamma = 0.8    delta = 0.8

alpha/beta = 8 > gamma/delta = 1
32.
Horwitz and Feinstein’s solution
Actual population / Target population
            Estrogen        Unexposed
Cases       alpha = 0.9     beta = 0.11
Noncases    gamma = 0.36    delta = 0.06

alpha/beta = 8, gamma/delta = 6
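A standard result in the Kleinbaum–Kupper–Morgenstern framework cited earlier (not spelled out on the slides, but consistent with them) is that the observed odds ratio equals the true odds ratio multiplied by the selection-bias factor (alpha × delta) / (beta × gamma). Applying it to the two tables above:

```python
def bias_factor(alpha, beta, gamma, delta):
    # Multiplicative distortion of the odds ratio induced by selection
    return (alpha * delta) / (beta * gamma)

beta = 900 / 8000  # 0.1125, shown rounded to 0.11 on the slides

# Horwitz and Feinstein's argument: a true OR of 1.0 is observed as 8.0
print(round(1.0 * bias_factor(0.9, beta, 0.8, 0.8), 1))

# Their proposed control sampling: a residual bias of 8/6 remains
print(round(1.0 * bias_factor(0.9, beta, 0.36, 0.06), 2))
```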
33.
Summary
• Epidemiologists differentiate between random
error and systematic error (“bias”).
• Bias: selection, information, and confounding.
• Bias can arise in any type of epidemiologic
study.
• Ask: bias from what, how much, what effect?
34.
Excerpts from (allegedly) actual student history essays collected by teachers from 8th grade
through college
Richard Lederer, St. Paul's School
“The greatest writer of the Renaissance was
William Shakespeare. He was born in the year
1564, on his birthday. He never made much
money and is famous only because of his plays.
He wrote tragedies, comedies, and hysterectomies,
all in Islamic pentameter. Romeo and Juliet are an
example of a heroic couplet. Romeo's last wish
was to be laid by Juliet.”
35.
Excerpts from (allegedly) actual student history essays collected by teachers from 8th grade
through college
Richard Lederer, St. Paul's School
“Writing at the same time as Shakespeare was
Miguel Cervantes. He rote Donkey Hote. The next
great author was John Milton. Milton wrote Paradise
Lost. Then his wife died and he wrote Paradise
Regained.”
Editor's Notes
Greetings. Namaste, konnichiwa, drasvuitsia, Ni-hau, merhaba, oseo
This lecture is on Sources of Error, particularly Selection Bias.
To put us in the right frame of mind, here are some excerpts from (allegedly) actual student history essays collected by teachers from 8th grade through college. They come from a web page by Richard Lederer, St. Paul’s School, www.edfac.usyd.edu.au/staff/souters/Humour/Student.Blooper.World.html
“Finally Magna Carta provided that no man should be hanged twice for the same offense.”
“Another story was William Tell, who shot an arrow through an apple while standing on his son's head.”
“It was an age of great inventions and discoveries. Gutenberg invented removable type and the Bible. Another important invention was the circulation of blood. Sir Walter Raleigh is a historical figure because he invented cigarettes and started smoking. And Sir Francis Drake circumcised the world with a 100 foot clipper.”
Here are some winning tongue-in-cheek headlines selected for a humorous magazine cover contest (Epimonitor.net):
“Don’t Ask, Don’t Tell---Double Blinding In Military Studies” (Melissa Adams)
Sensitivity Analysis---Are You Really A New Man? (Mark Colvin)
Absent Sex Life? Lucky You! 38 Sexually Transmitted Diseases You Won’t Get (Mark Colvin)
Do You Have Survey Phobia? Take Our Quiz and Find Out!! (Mary Anne Pietrusiak)
Science emphasizes systematic, repeatable, carefully-conducted observation. By designing and performing experiments, a scientist attempts to control all aspects of a situation in order to observe the effect of varying only those elements s/he is trying to investigate.
My brother-in-law is an astrophysicist at the Harvard-Smithsonian Center for Astrophysics in Cambridge, Massachusetts. He has been conducting an experiment to use laser interferometry to verify the law of falling objects to a higher degree of precision than has heretofore been obtainable. A great deal of the experimental set-up consisted in insulating highly-sensitive measurement apparatus from outside perturbations such as vibration of the building. Measurements had to be made at night, since the apparatus would pick up even the gravitational force from a car parking outside the building.
Epidemiology and other human sciences have nowhere near the precision of measurement that is available in the physical sciences and nowhere near the ability to control threats to validity. So recognizing and minimizing the effect of myriad sources of error is a constant theme in epidemiologic research.
Sources of systematic error in epidemiologic studies are generally considered under three headings: errors that may arise from the way that study participants are selected into the study, errors from classification and measurement procedures, and errors from comparisons and their interpretation. These sources of systematic error are in addition to sampling variability, which is often termed “random error”.
The more important the concept, the richer the vocabulary that evolves to express its various aspects. Given the importance of error, we would expect a wealth of terms to characterize it, and we will not be disappointed in that expectation. Here are some of the major terms, including several that we have already encountered.
Accuracy is a general term for getting the right result. Accuracy is the opposite of error, which is used to refer to a discrepancy between an observed result and the true value. Of course, the true value is rarely known.
Error is typically regarded as arising from two types of processes – random processes, such as simple random sampling or flipping a fair coin, and systematic processes which move an estimate in one direction. Epidemiologists use the term bias to refer to systematic error.
Random error can be reduced by increasing sample size. As the sample size increases, observed results tend to converge toward their underlying values. Systematic error, on the other hand, is unaffected by sample size.
Precision refers to the absence of random error. Similarly, reliability refers to the extent to which a measurement or finding will be observed consistently by different observers, using different instruments, at different times, or across other circumstances where the underlying construct is believed not to have changed.
Validity refers to the absence of systematic error (“bias”) or, sometimes, the absence of all error.
Epidemiologists sometimes use the terms “internal validity” and “external validity”. Internal validity concerns whether the study provides an unbiased (valid) estimate of what it claims to estimate. For example, suppose investigators say that they are studying Alzheimer’s disease in North Carolina elderly. If the study findings do in fact accurately depict the situation among NC elderly, then the study is (internally) valid.
External validity concerns whether the results from a study apply to some wider population that the study may have relevance for but does not claim to estimate. For example, do the results from a study carried out in North Carolina apply to people in the southeast U.S. as a whole, the entire U.S., all of North America, all industrialized countries, etc.? The term generalizability is often used interchangeably with “external validity”.
The term “random error” is widely employed yet less widely understood. Rothman and Greenland (Modern epidemiology, p78) define it as “that part of our experience that we cannot predict” (some might say the part that we do not attempt to predict).
The concept can easily be demonstrated in the form of sampling variability, where we can readily show that a series of samples from a single population will in general differ from one another and from the population, at least slightly.
Sampling variability can be reduced by increasing sample size. For example, suppose that we have a large population for which we want to estimate the percentage of females, which happens to be 50% in this population. If we draw 1,000 samples of 1,000 people each – that’s the right-hand column in the table – and compute the percentage of women in each sample, we might observe the percentage to range across the samples from as low as 45% to as high as nearly 55%, with about half of the samples having a percentage between about 49% and 51%.
On the other hand, if the samples had only 100 people each, the percentages of women in the samples would vary more widely around the population value of 50%. As shown in the left-hand column, about one-fourth of the estimated percentages from the smaller samples would be below 47% and one-fourth above 53%.
So the larger sample size has reduced the amount of random error in our estimates of the population sex ratio. In general, if the magnitude of the error can be reduced by increasing the sample size, then we classify the error as random.
Random error can certainly be problematic, but it is generally not as problematic as systematic error. For one, we know a great deal about random error and how to reduce it. We can, for one, increase sample size. It may also be possible to reduce random error by changing the design of our sampling plan or using more precise measurement instruments. Moreover, the probability of various degrees of influence from random error can be quantified, so that we can use confidence intervals to express the uncertainty inherent in our estimates due to sampling variability.
The precision of an estimate is defined as the reciprocal of the standard error of the estimate, but the width of the confidence interval is often used in epidemiology. The difference between the upper and lower 95% confidence limits for a rate or proportion is approximately four (2 x 1.96) times the standard error of the point estimate. The standard errors for many ratio measures (e.g., risk ratio, rate ratio, odds ratio) are obtained with a log transformation, so the width of their confidence intervals is quantified as the ratio of the upper to the lower confidence limit (called the confidence limit ratio, or CLR). The natural log of the CLR is 2 x 1.96 times the standard error of the log of the estimate.
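As an illustration, the CLR for an odds ratio from a 2×2 table can be computed with the Woolf (log-transform) standard error — an assumption here, since the note does not specify a variance formula, and the cell counts below are hypothetical:

```python
import math

def clr(a, b, c, d, z=1.96):
    # Confidence limit ratio = upper / lower 95% confidence limit of the OR.
    # Woolf standard error of the log odds ratio: sqrt(1/a + 1/b + 1/c + 1/d)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(2 * z * se_log_or)

print(round(clr(180, 200, 300, 400), 2))  # 1.65
```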
Whatever the problems with people’s understanding of random error and statistics, systematic error, also called “bias”, is more problematic for epidemiologists. Systematic error may be present without the investigator’s being aware of it, and sources of systematic error may be difficult to identify. Even when we can identify sources of error, their influence may be difficult to assess.
When we talk about systematic error, we are often interested in its direction. Random error, being random, goes in both directions. A specific source of systematic error, however, usually shifts the observed value in a particular direction. So we need some terminology for giving directions.
Most often we are referring to bias in an estimated relative risk (incidence density ratio, cumulative incidence ratio, or odds ratio). For a relative risk, the null value (no association) is 1.0. A positive bias means that whatever the true relative risk (e.g., 2.3, 1.0, 0.5), the observed relative risk is higher, i.e., shifted in a positive direction (e.g., 3.3 observed vs. 2.3 true; 1.3 vs. 1.0; 0.8 vs. 0.5).
Negative bias is the inverse of positive bias. For the three example true relative risks just quoted, the following observed relative risks would indicate negative bias: 1.9 observed vs. 2.3 true; 0.8 vs. 1.0; 0.3 vs. 0.5.
Some error processes tend to bias the measure of relative risk in a positive or negative direction. In other cases, however, the error process may tend to shift the observed relative risk closer to 1.0 (the null value) or farther from 1.0, regardless of whether the direction is positive or negative. For these situations we have the terms “towards the null” and “away from the null”. If the true relative risks were 2.3, 1.0, and 0.5, then bias towards the null would be exemplified by 1.8 observed vs. 2.3 true and by 0.8 vs. 0.5. A true relative risk of 1.0 is not susceptible to bias towards the null since it is already at the null value. That’s handy to know, since it means that a bias towards the null can never create the completely false impression of an association.
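This direction terminology can be pinned down in a few lines of code (a sketch; the function name and interface are my own, and the null is 1.0 for ratio measures):

```python
def bias_direction(true_rr, observed_rr, null=1.0):
    # Sign of the bias: where the observed value sits relative to the truth
    sign = ("positive" if observed_rr > true_rr
            else "negative" if observed_rr < true_rr else "none")
    # Orientation relative to the null value of the ratio measure
    if abs(observed_rr - null) < abs(true_rr - null):
        orientation = "towards the null"
    elif abs(observed_rr - null) > abs(true_rr - null):
        orientation = "away from the null"
    else:
        orientation = "same distance from the null"
    return sign, orientation

print(bias_direction(2.3, 1.8))  # ('negative', 'towards the null')
print(bias_direction(0.5, 0.8))  # ('positive', 'towards the null')
print(bias_direction(1.0, 1.3))  # ('positive', 'away from the null')
```

The last call illustrates the point above: a true relative risk of 1.0 can only be biased away from the null, never towards it.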
As noted in an earlier slide, epidemiologists usually conceptualize sources of systematic error as falling into one of three categories, and here are the terms used to refer to the types of error that result.
Selection bias refers to systematic error resulting from differential access to the study population for different exposure-disease subgroups.
Information bias refers to systematic error resulting from inaccuracy in measurement or classification of study variables, including disease, exposure, and other risk factors.
Confounding bias differs somewhat from the other two forms, in that with confounding bias the data themselves may be fully accurate. Confounding bias is a systematic error of interpretation that results from making an unfair comparison, i.e., from a failure to take into account important differences between exposed and unexposed groups.
We will take up each of these types of bias. The rest of this lecture will examine selection bias. The following lectures take up information bias and confounding.
Selection bias can arise from any aspect of the procedures for selecting participants for a study. Several common examples are situations where the exposure affects the ascertainment of cases for a case-control study (a situation sometimes referred to as “detection bias”) or affects the selection of controls. For example, doctors are more likely to suspect the presence of heart disease in a man who is overweight, hypertensive, and/or smokes cigarettes, so a middle-aged person with these characteristics who comes to a doctor complaining of chest pain may be more likely to have a thorough workup for angina pectoris than someone without these heart disease risk factors.
Another example is what we might call differential unavailability due to death, illness, migration, or refusal. “Selective survival” refers to the fact that people with different characteristics are more or less likely to survive after contracting a disease. Similarly, characteristics of people who are more likely to refuse to participate in a study are therefore likely to be underrepresented.
Attrition of participants from a study can produce bias if the incidence rates in people who drop out differ from those in people who complete the study. Loss to follow-up in a cohort study and participants not included in the data analysis due to missing data items are also potential sources of selection bias.
Case-control studies are highly vulnerable to selection bias, particularly from the identification and recruitment of the control group. Clear thinking and considerable effort are required to avoid introducing bias.
The essential purpose of the control group is to provide an estimate of exposure in the base population, the population from which the cases arise. This understanding represents a shift in emphasis from the traditional view of case-control studies, which emphasized the comparability of controls to cases.
If the study includes all cases from a defined geographical area, then a representative sample of residents from that area should provide a valid estimate of exposure. However, such a sample will be difficult and expensive to recruit. Also, unless the response rate is high, the sample may yield a biased estimate due to differential nonresponse. Furthermore, a population-based sample may not provide accurate information on certain exposures, such as medical history.
A control group drawn from among hospitalized patients will be easier and less expensive to recruit, have a higher response rate, and provide better information about medical exposures. But if the conditions that bring people to the hospital are related to the exposure under study, then the control group and the overall study will be biased. For example, some early case-control studies of lung cancer and smoking underestimated the association because they drew their controls from hospitalized patients.
Let’s look at selection bias with two numerical examples. Suppose that we are studying a cohort of 400 people, 200 of whom are exposed and 200 of whom are unexposed. The left-hand table shows the complete cohort. 40 of the 200 exposed people develop the disease (cumulative incidence 40/200), compared to 20 of the 200 unexposed (20/200), so the relative risk is (40/200) divided by (20/200), or 2.0.
Now suppose that there is loss to follow-up among both exposed and unexposed, including both people who will develop the disease and people who will not. Suppose that the proportion who drop out is 10% in all exposure-disease groups except for some reason 20% drop out among people who are exposed and are going to develop the disease (or who were exposed and have developed the disease but not yet been counted as cases). Under this scenario, the right-hand table shows the cohort that we will observe. The relative risk in the observed cohort is 32/176 divided by 18/180, or 1.8, so that due to selection bias the observed relative risk is 10% less than the true relative risk.
Here is a numerical example of how selection bias due to non-response might distort the observed odds ratio in a case-control study. Again, the left-hand table represents the true situation in the population, while the right-hand table represents what we find in the case-control study.
In this case there are 600 cases of the disease in the target population, shown in the left-hand table. Only 90% of exposed cases agree to be in the study, so of the 200 exposed cases, 180 appear in the study population in the right-hand table. Let’s assume that only 75% of unexposed cases participate, so of the 400 unexposed cases, only 300 enter the study population. The odds ratio in the case-control study will then be 1.2, suggesting an association where in fact none exists.
Let’s consider some examples of studies where selection bias would occur.
Muñoz et al. conducted a case-control study of the risk of anencephaly (a congenital neural tube defect) and socioeconomic status (SES) in Mexico. Cases were newborns diagnosed with anencephaly (or for stillbirths, with anencephaly listed as the cause of death) and reported to a national neural tube defects registry. Controls were obtained from live newborns in the same hospital systems. Mothers were contacted and interviewed regarding SES. However, low SES mothers were less likely to be interviewed due to less-than-reliable reporting to the national registry in low SES areas. If low SES increased the risk of anencephaly, the cases included in the study would underrepresent cases from low SES families, biasing the estimated association (Muñoz JB, et al. Socioeconomic factors and the risk of anencephaly in a Mexican population: A case-control study. Public Health Reports 2005; 120:39-45.)
Or, suppose we are conducting a case-control study of the protection from heart attack among people who are physically fit. Cases are persons newly-admitted to the coronary care unit at area hospitals. Controls are obtained through random digit dialing in that same geographical area. Since physical fitness is protective, we should find the prevalence of being physically fit to be lower among the heart attack victims than in the population control group.
However, many people who suffer a heart attack die before they are admitted to the hospital. What if being physically fit improves the chances of surviving a heart attack? Then hospitalized heart attack victims will tend to be more fit than heart attack victims in general, thereby biasing our odds ratio toward finding less of a benefit (and possibly even a risk) from physical fitness.
Here is a third example. A number of years ago some colleagues and I conducted a randomized trial of quit-smoking interventions. The usual practice in such studies has been to count dropouts as continuing smokers, in order to have a more conservative estimate of the quit rate. However, what if quitters in the control group, feeling that they received nothing from the study, were more likely to drop out than quitters in the group that received the quitting intervention? Then counting all dropouts as smokers would exaggerate the apparent effect of the intervention.
So selection bias can affect cohort and case-control studies, as well as cross-sectional studies.
Ordinarily we think of loss to follow-up as being the primary source of selection bias in cohort studies, since the study base is the cohort of people being followed. However, there are situations in which something like selection bias can occur from other causes.
For example, here are two study designs for examining prognosis after HIV infection. In the first, participants are recruited when they are newly-infected with HIV. In the second, participants are recruited through a serologic survey that identifies people with prevalent infections. The second study, which is sometimes called a “prevalent cohort”, will underrepresent people who have an especially rapid course to death, since they will have died before being enrolled. Such a process has been suggested as a reason that the Women’s Health Initiative randomized trial of hormone replacement therapy found higher CVD rates in women taking HRT, whereas observational studies (or prevalent cohorts) had found lower CVD death rates in women taking HRT.
Another hypothetical example of how selective survival can affect a cohort study is a prospective investigation of end-organ damage (e.g., blindness, kidney disorder) among elderly persons with hypertension. If people who die before becoming elderly were more vulnerable to suffering end-organ damage from hypertension, then a cohort study that enrolls people at age 65 years may observe less morbidity associated with hypertension than would be observed among younger participants. Although the different observations reflect a true difference in what is seen with younger and older hypertensives, the results could easily be misinterpreted.
Yet another hypothetical example of how selection bias might occur in a cohort of survivors is a situation in which environmental tobacco smoke (ETS) causes early fetal losses, possibly undetected. If the most susceptible fetuses did not survive to birth, a study of the effects of ETS among newborns will see less of an effect than has actually occurred.
These examples fall into the category of what are called “competing risks”. Loss of participants from competing risks, dropout, nonresponse, missing data, and other causes is said to be “noninformative” when the losses do not affect estimates (e.g., proportional losses in all groups being compared) and “informative” when the losses bias the association of interest.
A helpful conceptual framework for thinking about situations where selection bias may operate is the one elaborated by David Kleinbaum, Larry Kupper, and Hal Morgenstern in their book Epidemiologic Research: Principles and Quantitative Methods. The KKM framework envisions a “target” population, an “actual” population, a “study” population, and “selection probabilities”.
The target population consists of the people whom we believe we are studying. If, for example, we are trying to study depression among all teenagers in North Carolina, then the target population consists of all North Carolina teenagers.
The actual population consists of the people in the target population who might actually contribute data to our study. For example, if we were collecting our data by telephone interviews with a random sample of households listed in the telephone directory, then the actual population would consist of teenagers who live in households with a listed telephone number, whose parents would give permission for the interview, who would participate if invited, and who would be interviewed (e.g., an interviewer and the participant speak the same language).
The study population consists of the people whose data we have collected and analyzed. In the KKM framework, the study population is regarded as an unbiased random sample from the actual population. If the sample size were increased to its maximum, then the study population would be the same as the actual population. This framework divides up sources of error into random error (divergence between the actual population and a particular study population) and systematic error (divergence between the actual population and the target population).
The final component of the KKM conceptual framework is the selection probabilities that describe the relation between the target population and the actual population. For each exposure-disease subgroup, there is a selection probability that describes the proportion of the target population for each subgroup that is available in the actual population.
In this slide, the left-hand table represents the target population – the population whose disease-exposure relations we seek to measure. The right-hand table represents the actual population – the proportions of the target population that are actually available to our study.
The selection probabilities are the proportions of each subgroup in the target population that make it into the actual population. There is a selection probability for each exposure-disease subgroup.
So, for example, if we call the selection probability for exposed-cases “alpha”, as KKM do, then:
alpha = Ao / A
Similarly, the remaining selection probabilities are beta = Bo / B, gamma = Co / C, and delta = Do / D.
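The KKM cells and selection probabilities lend themselves to a short calculation. A minimal sketch, using illustrative numbers of my own (a true OR of 1.0, with differential selection among cases only):

```python
def selection_probabilities(target, actual):
    """Return (alpha, beta, gamma, delta) for the KKM cells.

    Each argument is a dict with keys 'A' (exposed cases),
    'B' (unexposed cases), 'C' (exposed controls), 'D' (unexposed controls).
    """
    return tuple(actual[k] / target[k] for k in "ABCD")

def odds_ratio(cells):
    """Cross-product odds ratio for a 2x2 table given as a KKM dict."""
    return (cells["A"] * cells["D"]) / (cells["B"] * cells["C"])

# Illustrative numbers: true OR = 1.0, but only 90% of exposed cases and
# 75% of unexposed cases are selected, while all controls are selected.
target = {"A": 200, "B": 400, "C": 800, "D": 1600}
actual = {"A": 180, "B": 300, "C": 800, "D": 1600}
alpha, beta, gamma, delta = selection_probabilities(target, actual)
print(alpha, beta, gamma, delta)               # 0.9 0.75 1.0 1.0
print(odds_ratio(target), odds_ratio(actual))  # 1.0 1.2
```

The helper names and the example numbers are mine, not KKM's; the point is simply that the four selection probabilities fully determine how the actual population's odds ratio departs from the target population's.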
To see how these selection probabilities can be used, consider the following example based on studies of endometrial (uterine) cancer and postmenopausal estrogens and the argument by Ralph Horwitz and Alvan Feinstein that detection bias was responsible for much or all of the observed increase in risk among users of exogenous estrogens.
During the 1970’s, several case-control studies found a strong association between risk of endometrial cancer and prior use of estrogen replacement medications, with odds ratios on the order of 6 - 10. Horwitz and Feinstein proposed that these associations were the result of a “detection bias” in which endometrial cancers were more likely to come to medical attention in women who took estrogen than in women who did not.
Exogenous estrogens often cause vaginal bleeding as a side effect. When a postmenopausal woman experiences bleeding, she is likely to see her doctor and undergo a dilation and curettage (D&C) procedure in which the lining of her uterus is removed and examined. Endometrial cancers are found in this way. Thus, among women with asymptomatic endometrial cancer, if women who take estrogens are more likely to bleed and to have a D&C, then their cancers are more likely to be detected. In this way, detection bias could result in a higher prevalence of estrogen use among detected cases of endometrial cancer than would exist among all cases. For the Horwitz and Feinstein argument to explain the observed endometrial cancer-estrogen association, there would need to be a sizable reservoir of undiagnosed endometrial cancer waiting to be detected.
Here are some hypothetical data that illustrate this argument as it would be presented in the KKM framework. Again, the left-hand table shows a target (true) population that investigators believe they are studying. In this particular target population of just over 10,000,000 postmenopausal women, there are 10,000 postmenopausal women with endometrial cancer. We assume that 20% of the women – overall and those who develop endometrial cancer – have used postmenopausal estrogens.
Case-control studies, however, typically rely on the health-care system to identify cases. If women who take estrogens are more likely to experience bleeding and a D&C, then the proportion of endometrial cancers detected among women who take estrogen (1,800 / 2,000 = 90%) would be higher than the proportion among women with endometrial cancer who do not take estrogen (900 / 8,000 = 11.25%). Thus, a case-control study with no other biases would be expected to observe an OR of 8.0, whereas the OR for the hypothetical true data is assumed to be 1.0.
If we examine the selection probabilities, we see that alpha = 1,800 / 2,000 = 0.9, beta = 900 / 8,000 = 0.1125, gamma = 1,600,000 / 2,000,000 = 0.8, and delta = 6,400,000 / 8,000,000 = 0.8. These are shown on the next slide.
The selection probabilities can be used to assess the presence, direction, and extent of selection bias in the measure of association. For example, in a case-control study, the odds ratio is computed as the odds of exposure among cases divided by the odds of exposure among controls.
The odds ratio is computed from the study population, which is assumed to be an unbiased sample of the actual population. So for the odds ratio to be unbiased, the exposure odds in the actual population should be the same as the exposure odds in the target population. (The OR can also be unbiased if both exposure odds are biased in the same direction and by the same [proportionate] amount, in which case the biases offset one another.)
The bias in the exposure odds in cases is equal to the ratio of the selection probabilities in cases, alpha/beta. The bias in the exposure odds in controls is equal to the ratio of the selection probabilities in controls, gamma/delta. So in the detection bias scenario we have just presented, alpha/beta = 0.9 / 0.1125 = 8, indicating a substantial bias in the observed exposure odds in cases. Since the exposure odds in controls is unbiased (gamma / delta = 0.8 / 0.8 = 1), the bias in the odds ratio will be the same as the bias in the exposure odds in cases. Since the true OR is 1.0, the observed OR is 8.0.
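These calculations can be reproduced directly from the hypothetical target and actual populations given on the preceding slides:

```python
# Hypothetical endometrial cancer data from the preceding slides.
# Target population (true OR = 1.0):
A, B = 2_000, 8_000              # exposed / unexposed cases
C, D = 2_000_000, 8_000_000      # exposed / unexposed controls
# Actual population after differential detection of cases:
A0, B0 = 1_800, 900
C0, D0 = 1_600_000, 6_400_000

alpha, beta = A0 / A, B0 / B     # 0.9, 0.1125
gamma, delta = C0 / C, D0 / D    # 0.8, 0.8

true_or = (A * D) / (B * C)               # cross-product OR in the target
bias = (alpha / beta) / (gamma / delta)   # bias factor in the OR
observed_or = (A0 * D0) / (B0 * C0)       # cross-product OR in the actual
print(true_or, bias, observed_or)         # 1.0 8.0 8.0
```

The observed OR equals the true OR multiplied by the bias factor (alpha/beta) / (gamma/delta), so a single distorted selection probability among cases is enough to turn a null association into an eight-fold one.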
Interestingly, however, the remedy proposed by Horwitz and Feinstein involved the controls rather than the cases. Arguing that there were undetected cases of endometrial cancer in the population, Horwitz and Feinstein recommended that the control group be recruited from among women who had undergone the same diagnostic procedure for endometrial cancer (i.e., D&C) as had the cases, so that any endometrial cancers among the controls would be detected and could be added to the cases.
This proposal has a superficial appeal, since it makes it appear that the cases and controls are being treated the same (recall that the conventional understanding of the case-control study emphasized making a “fair” comparison of cases and controls). However, only a very small proportion of controls would have undetected endometrial cancer, and unless the D&C were being carried out specifically for the study, these women would already have been counted as cases. So the impact on the case group was negligible. In contrast, there was a substantial effect on the control group: instead of being selected from the general population, to provide an estimate of estrogen use in the population from which cases arose, the control group provided an estimate of estrogen use in a population of women who attended gynecology clinics for evaluation of symptoms. These women were more likely to be receiving postmenopausal estrogens, of course, since evaluation of bleeding and other side effects would be a reason for getting a D&C. In terms of selection probabilities, we might guess that if 6% of postmenopausal women in the general population attend a gynecology clinic, then 36% of postmenopausal women using estrogens attend a gynecology clinic. As a result, gamma/delta now equals 36% / 6% = 6, and the observed OR becomes 8/6 ≈ 1.3.
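Using the guessed clinic-attendance figures (6% of postmenopausal women overall, 36% of estrogen users; illustrative values, not empirical estimates), the net effect of the proposed control group can be computed:

```python
# Effect of the Horwitz-Feinstein control group, using the guessed
# clinic-attendance figures (illustrative values only).
case_bias = 0.9 / 0.1125      # alpha / beta from the detection-bias scenario
gamma, delta = 0.36, 0.06     # clinic attendance: estrogen users vs. nonusers
control_bias = gamma / delta  # = 6: controls now over-represent estrogen use

true_or = 1.0                 # assumed, as in the detection-bias scenario
observed_or = true_or * case_bias / control_bias
print(round(observed_or, 2))  # 1.33
```

In other words, the remedy does not remove the case-selection bias at all; it introduces a nearly offsetting control-selection bias, which is why the OR drops from 8 toward (but not to) the null.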