2/22/2011 Cross-sectional studies 1
Study designs: Cross-sectional studies,
ecologic studies (and confidence intervals)
Victor J. Schoenbach, PhD
Department of Epidemiology
Gillings School of Global Public Health
University of North Carolina at Chapel Hill
www.unc.edu/epid600/
Principles of Epidemiology for Public Health (EPID600)
2
Signs from around the world
In a Copenhagen airline ticket office:
“We take your bags and send them in all directions.”
3
Signs from around the world
In a Norwegian cocktail lounge:
“Ladies are requested not to have children in the bar.”
4
Signs from around the world
Rome laundry:
“Ladies, leave your clothes here and spend the afternoon having a good time.”
5
Faster keyboarding - 1
I cdnuolt blveiee taht I cluod aulaclty uesdnatnrd waht I
was rdanieg. The phaonmneal pweor of the hmuan mnid,
aoccdrnig to a rscheearch at Cmabrigde Uinervtisy. It
dn'seot mttaer in waht oredr the ltteers in a wrod are, the
olny iprmoatnt tihng is taht the frist and lsat ltteer be in
the rghit pclae. The rset can be a taotl mses and you can
sitll raed it wouthit a porbelm.
• Gary C. Ramseyer's First Internet Gallery of Statistics Jokes
http://davidmlane.com/hyperstat/humorf.html (#162)
6
Faster keyboarding - 2
Most of my friends could read this with understanding
and rather quickly I might add. Then I had them read a
statistical bit of literature:
• Miittluvraae asilyans sattes an idtenossiy ctuoonr epilsle
is the itternoiecsno of a panle pleralal to the xl-yapne and
the sruacfe of a btiiarave nmarol dbttiisruein.
Gary C. Ramseyer's First Internet Gallery of Statistics Jokes
http://davidmlane.com/hyperstat/humorf.html (#162)
10/15/2001 Cross-sectional studies 8
Today – outline
• Cross-sectional studies (and sampling)
• Ecologic studies
• Confidence intervals
2/10/2009 Cross-sectional studies 9
Cross-sectional studies
• Cross-sectional studies include surveys
• People are studied at a “point” in time, without follow-up.
• Can combine a cross-sectional study with follow-up to create a cohort study.
• Can conduct repeated cross-sectional studies to measure change in a population.
2/22/2011 Cross-sectional studies 10
Cross-sectional studies
• Number of uninsured Americans rises to 50.7 million. (USA Today, 9/17/2010; data from Census Bureau)
• In 2007-2008, almost one in five children older than 5 years was obese. (Health, United States, 2010; data from the National Health and Nutrition Examination Survey)
• 35% (~7.4 million) of births to U.S. women during the preceding 5 years were mistimed or unwanted. (2002 National Survey of Family Growth, Series 23, No. 25, Table 21)
[Source: www.cdc.gov/nchs/]
2/10/2009 Cross-sectional studies 11
Cross-sectional studies
• Incidence information is not available from a typical cross-sectional study
• Sometimes can reconstruct incidence from historical information
• Example: the incidence proportion of quitting smoking, called the “quit ratio”:
ex-smokers / ever-smokers
is calculated from survey data.
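As a rough illustration of how the quit ratio can come out of a single survey, the sketch below computes ex-smokers / ever-smokers from a handful of made-up respondent records. The status labels, the sample responses, and the `quit_ratio` helper are all hypothetical and do not come from any real survey.

```python
from collections import Counter

def quit_ratio(statuses):
    """Quit ratio = ex-smokers / ever-smokers, where ever = current + former."""
    counts = Counter(statuses)
    ever = counts["current"] + counts["former"]
    if ever == 0:
        raise ValueError("no ever-smokers in the sample")
    return counts["former"] / ever

# Hypothetical survey responses (illustrative only).
responses = ["never", "former", "current", "former", "never",
             "current", "former", "never", "never", "current"]

ratio = quit_ratio(responses)
print(f"quit ratio = {ratio:.2f}")
```

Never-smokers drop out of the denominator, which is what makes the ratio read as the proportion of ever-smokers who have quit.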
10/15/2001 Cross-sectional studies 12
Measure prevalence at a “point” in time
• “Snapshot” of a population, a “still life”
• Can measure attitudes, beliefs, behaviors, personal or family history, genetic factors, existing or past health conditions, or anything else that does not require follow-up to assess.
• The source of most of what we know about the population
2/22/2011 Cross-sectional studies 13
Population census
• A cross-sectional study of an entire population
• Provides the denominator data for many purposes (e.g., estimation of rates, assessing generalizability, projecting from smaller studies)
• A huge effort – people can be difficult to find and to count; may not want to provide data
• Some countries maintain accurate and current registries of the entire country
2/22/2011 Cross-sectional studies 14
National surveys conducted by NCHS
National Health Interview Survey (NHIS) – household interviews
National Health and Nutrition Examination Survey (NHANES) – interviews and physical examinations
National Survey of Family Growth (NSFG) – household interviews
National Health Care Survey (NHCS) – medical records
2/22/2011 Cross-sectional studies 15
National surveys
• Designed to be representative of the entire country
• Modes: household interview, telephone, mail
• Employ complex sampling designs to optimize efficiency (tradeoff between information and cost)
• Logistically challenging (answering machines, cellphones, . . .)
See presentation by Dr. Anjani Chandra at
www.minority.unc.edu/institute/2003/materials/slides/Chandra-20030522.ppt
10/15/2001 Cross-sectional studies 16
Example: National Health Interview Survey
• Conducted every year in the U.S. by the National Center for Health Statistics (CDC)
• “Stratified, multistaged, household survey that covers the civilian noninstitutionalized population of the United States”
• Redesigned every decade to use the new census
2/10/2009 Cross-sectional studies 17
“multistaged”
• Improves logistical feasibility and reduces costs (though reduces precision)
1. Divide population into primary sampling units (PSUs)
PSU = primary sampling unit: metropolitan statistical area, county, group of adjacent counties
2/10/2009 Cross-sectional studies 18
“multistaged”
2. Select sample of census block groups (SSUs) within each selected PSU
3. Map each selected census block group or examine building permits
4. Select one cluster of 4-8 housing units dispersed evenly throughout the block
NCHS draws a new representative sample for each week’s interviews
10/15/2001 Cross-sectional studies 19
“stratified”
• US divided into 1,900 PSUs
• Largest 52 PSUs are “self-representing”
• Rest of PSUs divided into 73 categories (“strata”), based on socioeconomic and demographic variables
• Sampling takes place separately within each category (“stratum”)
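The idea of sampling separately within each stratum, and in stages within each selected PSU, can be sketched in a few lines. This is a toy illustration only: the frame, the stratum names, the function name, and the use of simple random sampling at both stages are assumptions made for the example, not a description of the actual NHIS procedure.

```python
import random

def stratified_two_stage_sample(strata, psus_per_stratum, units_per_psu, seed=0):
    """strata maps stratum name -> {PSU name -> list of housing units}."""
    rng = random.Random(seed)
    sample = []
    for stratum, psus in strata.items():
        # Stage 1: draw PSUs separately within each stratum.
        chosen_psus = rng.sample(sorted(psus), min(psus_per_stratum, len(psus)))
        for psu in chosen_psus:
            units = psus[psu]
            # Stage 2: draw housing units within each selected PSU.
            chosen_units = rng.sample(units, min(units_per_psu, len(units)))
            sample.extend((stratum, psu, unit) for unit in chosen_units)
    return sample

# Hypothetical toy frame: 2 strata, 3 PSUs each, 20 housing units per PSU.
frame = {
    s: {f"{s}-psu{p}": [f"{s}-psu{p}-hu{h}" for h in range(20)] for p in range(3)}
    for s in ("urban", "rural")
}

sample = stratified_two_stage_sample(frame, psus_per_stratum=2, units_per_psu=4)
print(len(sample), "housing units selected")
```

Because only the selected PSUs need to be visited, field costs fall, at the price of some precision since units within a PSU tend to resemble each other.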
7/30/2010 Cross-sectional studies 20
Sample size and precision
Sample size   Lower 95%   Point estimate   Upper 95%   Width
100           0.17        0.25             0.33        0.16
400           0.21        0.25             0.29        0.08
900           0.22        0.25             0.28        0.06
1600          0.23        0.25             0.27        0.04
(Inputs: p = 0.25, p(1 – p) = 0.188, √[p(1 – p)] = 0.43301)
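The bounds in the table above are consistent with a normal-approximation (Wald) interval, p ± 1.96 × √[p(1 – p)/n]. The slide does not name its method, so treat the formula and the `wald_interval` helper below as an assumption that happens to reproduce the rounded bounds.

```python
import math

def wald_interval(p, n, z=1.96):
    """Normal-approximation interval for a proportion p from a sample of size n."""
    se = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
    return p - z * se, p + z * se

for n in (100, 400, 900, 1600):
    lo, hi = wald_interval(0.25, n)
    print(f"n={n:5d}  lower={lo:.2f}  upper={hi:.2f}  width={hi - lo:.3f}")
```

Quadrupling the sample size halves the width, since the standard error shrinks with the square root of n. Widths computed from unrounded bounds can differ in the second decimal from widths obtained by subtracting rounded bounds.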
3/6/2006 Cross-sectional studies 21
Weighted sampling
Age group    Hypothetical pop. (1,000's)   Unweighted sample   Weighted sample
20-39 yrs    18,000                        900                 400
40-59 yrs    18,000                        900                 400
60-69 yrs    8,000                         400                 400
Total        44,000                        2,200               1,200
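One way to read the table: with 400 respondents per age group, each respondent stands for a different number of people, so estimates for the whole population need sampling weights. The sketch below works this out for the table's population counts. The stratum prevalences are invented for illustration, and reading the equal-allocation column as the design that requires weighting is an interpretation rather than something the slide states.

```python
# Population counts in persons (the table lists them in thousands).
population = {"20-39": 18_000_000, "40-59": 18_000_000, "60-69": 8_000_000}
sample_size = {"20-39": 400, "40-59": 400, "60-69": 400}

# Sampling weight = number of people each respondent represents.
weights = {g: population[g] / sample_size[g] for g in population}

# Hypothetical stratum prevalences, made up for this example.
prevalence = {"20-39": 0.10, "40-59": 0.20, "60-69": 0.40}

# Weighted estimate: each stratum counts in proportion to its population.
total_pop = sum(population.values())
weighted = sum(population[g] * prevalence[g] for g in population) / total_pop

# Naive estimate: pools respondents as if every stratum were equally large.
total_n = sum(sample_size.values())
unweighted = sum(sample_size[g] * prevalence[g] for g in population) / total_n

print(weights)
print(f"weighted = {weighted:.3f}, unweighted = {unweighted:.3f}")
```

With these made-up prevalences the unweighted figure overstates the population prevalence, because the oversampled 60-69 group carries more influence than its share of the population.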
10/15/2001 Cross-sectional studies 22
“stratified”
• Also place census blocks into categories and sample within each
• Oversample some strata
2/10/2009 Cross-sectional studies 23
“Defined population”
• Studies, especially cross-sectional studies, are easiest to interpret when they are based in a population that has some existence apart from the study itself (“defined population”)
1. Political subdivision (city, county, state)
2. Institutional (HMO, employer, profession)
• Probability sampling enables statistical generalizability to the defined population
2/22/2011 Cross-sectional studies 24
Surveys of sentinel populations
• HIV seroprevalence survey in three county STD clinics in central NC in 1988
• 3,000 anonymous, unlinked, leftover sera
• Anonymous questionnaire for demographics and risk factors
[Schoenbach VJ, Landis SE, Weber DJ, Mittal M, Koch GG, Levine PH. HIV seroprevalence in sexually transmitted disease clients in a low-prevalence southern state. Ann Epidemiol 1993;3:281-288]
10/15/2001 Cross-sectional studies 25
HIV seroprevalence
[Schoenbach VJ, Landis SE, Weber DJ, Mittal M, Koch GG, Levine PH. HIV seroprevalence in sexually transmitted disease clients in a low-prevalence southern state. Ann Epidemiol 1993;3:281-288]
Group               % HIV+
Homosexual men      46
Bisexual men        25
Heterosexual men    1.6
Women               0.6
Total               2.5
10/14/2003 Cross-sectional studies 26
Seroprevalence (% HIV+) by risk factors
Characteristic               Gay   Hetero   Women
Syphilis (history/current)   53    9.0      3
Gonorrhea (history)          37    2.6      1
Anal intercourse             41    1.7      2
Paid for sex                 5.2 [column not recoverable from the extracted text]
[Schoenbach VJ, Landis SE, Weber DJ, Mittal M, Koch GG, Levine PH. HIV seroprevalence in sexually transmitted disease clients in a low-prevalence southern state. Ann Epidemiol 1993;3:281-288]
10/15/2001 Cross-sectional studies 27
Interpretation
• Measures prevalence – if incidence is our real interest, prevalence is often not a good surrogate measure
• Studies only “survivors” and “stayers”
• May be difficult to determine whether a “cause” came before an “effect” (exception: genetic factors)
10/15/2001 Cross-sectional studies 28
Other points
• Can select participants by exposure status or from the population overall
• Can select participants by disease status – may then not be distinguishable from a case-control study with prevalent cases
10/15/2001 Cross-sectional studies 29
Outline
• Cross-sectional studies (and sampling)
• Ecologic studies
• Confidence intervals
10/15/2001 Cross-sectional studies 30
“Ecologic” studies
• Most study designs – cross-sectional, case-control, cohort, intervention trials – can be carried out with individuals or with groups
• Group-level studies which use routinely collected data are easier and less costly
• Group-level studies that involve interventions may not be easier or less costly
3/6/2006 Cross-sectional studies 31
Types of group-level variables
• Summary of individual-level variable (e.g., median household income, % with high school diploma)
• Property of the aggregate (e.g., neighborhood grocery stores, seat belt legislation, “community competence”)
2/22/2011 Cross-sectional studies 32
Interpretation
• Link between summary exposure variable and individual-level outcome must be inferred
• Inference from group to individual is not always sound
Example: Male Circumcision and HIV
Source: Bongaarts J, et al. The relationship between male circumcision and HIV infection in African populations. AIDS 1989; 3(6): 373-7.
2/22/2011 Cross-sectional studies 33
(Slope indicates strength of relationship;
r indicates linearity)
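The distinction between slope and r can be checked numerically. The sketch below uses three small made-up data sets: two perfectly linear ones with very different slopes, and one curved one. The data and the `slope_and_r` helper are illustrative assumptions, not taken from the circumcision and HIV example.

```python
import math

def slope_and_r(xs, ys):
    """Least-squares slope and Pearson correlation for paired data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

xs = [0, 1, 2, 3, 4]
steep = [10 * x for x in xs]     # perfectly linear, large slope
shallow = [0.1 * x for x in xs]  # perfectly linear, small slope
curved = [x ** 2 for x in xs]    # strong but nonlinear relationship

for name, ys in (("steep", steep), ("shallow", shallow), ("curved", curved)):
    slope, r = slope_and_r(xs, ys)
    print(f"{name:8s} slope={slope:6.2f}  r={r:.3f}")
```

The steep and shallow sets share essentially the same r despite a hundredfold difference in slope, while the curved set has a sizeable slope but r below 1.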
10/15/2001 Cross-sectional studies 34
Outline
• Cross-sectional studies (and sampling)
• Ecologic studies
• Confidence intervals
3/8/2006 Cross-sectional studies 35
Confidence intervals
• Provide a plausible range for the quantity being estimated
• Width indicates the precision of an estimate for a given level of “confidence”
• Confidence intervals quantify only random error from sampling variation, not systematic error from nonresponse, study design, etc.
10/15/2001 Cross-sectional studies 36
Confidence level vs. precision
• The more vague my estimate, the more confident I can be that it includes the population parameter: “I am 100% confident that the prevalence of HIV is between 0 and 100%”.
• The more specific my estimate, the lower my confidence: “I am 0% confident that the prevalence of HIV is 5.23%”
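The tradeoff between confidence level and precision can be made concrete: for the same data, a higher confidence level requires a wider interval. The sketch below assumes a normal-approximation interval and a made-up survey result (5% prevalence in a sample of 1,000); both are illustrative choices.

```python
import math
from statistics import NormalDist

def interval(p_hat, n, level):
    """Normal-approximation interval for a proportion at a given confidence level."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for level = 0.95
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

# Hypothetical survey result: 5% prevalence observed among 1,000 people.
p_hat, n = 0.05, 1000
widths = []
for level in (0.50, 0.80, 0.95, 0.99):
    lo, hi = interval(p_hat, n, level)
    widths.append(hi - lo)
    print(f"{level:.0%} interval: {lo:.3f} to {hi:.3f} (width {hi - lo:.3f})")
```

As the level approaches 100% the multiplier z grows without bound, which matches the 0-to-100% statement in the first bullet.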
10/12/2004 Cross-sectional studies 37
Confidence intervals – interpretation
• Simple interpretations are typically not precise
• Precise interpretations are typically not simple
10/15/2001 Cross-sectional studies 38
Simple but imprecise
• “There is 95% confidence that the interval contains the true value”
– True, but begs the question of how to define “confidence”
10/15/2001 Cross-sectional studies 39
Simple but imprecise
• “There is a 95% probability that the interval contains the true value”
– Not quite correct: probability (as conventionally defined) applies to a process, not to a single instance
3/7/2006 Cross-sectional studies 40
Probability applies to a process: example
A 95% confidence interval can be viewed as a
measurement or estimation process that will
be correct (the interval includes the true
value of the parameter) 95% of the time and
incorrect 5% of the time.
Let us make up another estimation process
that will be correct (about) 95% of the time.
6/29/2002 Cross-sectional studies 41
Why probability applies to a process
• Estimate your gender by flipping a coin 5 times: if the result is 5 heads, estimate your gender to be its opposite; otherwise estimate your gender to be what you think it is now.
• Probability that estimate will be correct is (1 – Probability of 5 heads) = 1 – (1/2)^5 ≈ 0.97 = 97%
• Probability that estimate will be incorrect is 3%
6/29/2002 Cross-sectional studies 42
Why probability applies to a process
So we now have a measurement process that
will be correct 97% of the time. We will use it
to measure your gender.
Flip the coin 5 times, and suppose you get 5
heads
– Is there a 97% probability that you are of the
opposite sex?
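The coin-flip procedure is easy to simulate, which helps separate the long-run accuracy of the process from what can be said after a particular outcome. The sketch assumes a fair coin and that your own belief about your gender is correct; the function name and the seed are arbitrary.

```python
import random

# Exact probability that the procedure gives the right answer.
p_correct = 1 - 0.5 ** 5  # 1 - 1/32 = 0.96875, about 97%

def estimate_is_correct(rng):
    """One run of the procedure: wrong only when all five flips are heads."""
    flips = [rng.random() < 0.5 for _ in range(5)]
    return not all(flips)

rng = random.Random(1)
trials = 100_000
rate = sum(estimate_is_correct(rng) for _ in range(trials)) / trials
print(f"exact = {p_correct:.5f}, simulated = {rate:.5f}")
```

Among the runs that produce 5 heads, the estimate is wrong every time, so the 97% describes the procedure over many repetitions rather than any single result.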
2/22/2011 Cross-sectional studies 43
Precise but not simple
A 95% confidence interval is:
1. obtained by using a procedure that will include the population parameter being estimated 95% of the time
2. the set of all population values which are “likely” to yield a sample like the one we obtained
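The first definition can be checked by simulation: repeat the same study many times, build an interval each time, and count how often the interval contains the true value. The sketch below assumes a normal-approximation interval for a proportion and arbitrary illustrative settings (true prevalence 25%, 400 people per study, 2,000 repeated studies).

```python
import math
import random

def wald_ci(cases, n, z=1.96):
    """Normal-approximation 95% interval for a proportion."""
    p_hat = cases / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

def coverage(true_p, n, studies, seed=0):
    """Fraction of repeated studies whose interval contains true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(studies):
        cases = sum(rng.random() < true_p for _ in range(n))
        lo, hi = wald_ci(cases, n)
        hits += lo <= true_p <= hi
    return hits / studies

cov = coverage(true_p=0.25, n=400, studies=2000)
print(f"share of intervals containing the true value: {cov:.3f}")
```

The share should land near 0.95. Any single interval either contains the true value or does not; the 95% belongs to the procedure.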
10/15/2001 Cross-sectional studies 44
Suppose that this line represents the value
of the parameter we are trying to estimate
True value
10/15/2001 Cross-sectional studies 45
Possible estimates of that parameter in N identical studies (shows sampling variation)
              o
             oo
            oooo
           oooooo
          oooooooo
         oooooooooo
     o o ooooooooooo o
o o o ooooooooooooooooo o o
Study estimates
True value
10/15/2001 Cross-sectional studies 46
One possible “true” value and how it would manifest, on average, in N identical studies
[Figure: bell-shaped dot plot of study estimates centered on the true value, with the central 95% of the distribution marked]
True value
10/15/2001 Cross-sectional studies 47
Estimate from one study of a given size
Estimate
?
10/14/2003 Cross-sectional studies 48
A possible “true” value under which a result at or beyond the estimate has < 2.5% chance of being observed
[Figure: bell-shaped dot plot of the sampling distribution around this candidate true value, with the central 95% of the distribution marked and the observed estimate shown for comparison]
10/15/2001 Cross-sectional studies 49
A possible true value under which a result at or beyond the estimate has > 2.5% probability of being observed
[Figure: bell-shaped dot plot of the sampling distribution around this candidate true value, with the central 95% of the distribution marked and the observed estimate shown for comparison]
10/15/2001 Cross-sectional studies 50
A possible true value under which a result at or beyond the estimate has > 2.5% probability of being observed
[Figure: bell-shaped dot plot of the sampling distribution around this candidate true value, with the central 95% of the distribution marked and the observed estimate shown for comparison]
10/15/2001 Cross-sectional studies 51
A possible true value under which a result at or beyond the estimate has < 2.5% probability of being observed
[Figure: bell-shaped dot plot of the sampling distribution around this candidate true value, with the central 95% of the distribution marked and the observed estimate shown for comparison]
10/14/2003 Cross-sectional studies 52
What the confidence interval represents
[Figure: four bell-shaped dot plots, one for each of four candidate true values (?), with the 95% confidence interval marked]
10/15/2001 Cross-sectional studies 53
What the confidence interval represents
[Figure: nine bell-shaped dot plots, one for each of nine candidate true values, with the 95% confidence interval marked]
3/8/2006 Cross-sectional studies 54
One possible “true” value and how it would manifest, on average, in N identical studies
[Figure: bell-shaped dot plot of study estimates centered on the true value]
1.96 x s.e. | 1.96 x s.e.
True value
10/15/2001 Cross-sectional studies 55
Confidence intervals – another take
[Figure: diagram of the study population, no cases marked]
10/15/2001 Cross-sectional studies 56
One possible population
[Figure: population diagram with 1 case marked (O)]
10/15/2001 Cross-sectional studies 57
Another possible population
[Figure: population diagram with 1 case marked (O)]
10/15/2001 Cross-sectional studies 58
A 3rd possible population
[Figure: population diagram with 1 case marked (O)]
10/15/2001 Cross-sectional studies 59
A 4th possible population
[Figure: population diagram with 1 case marked (O)]
10/15/2001 Cross-sectional studies 60
A 5th possible population
[Figure: population diagram with 1 case marked (O)]
10/15/2001 Cross-sectional studies 61
A 6th possible population
[Figure: population diagram with 4 cases marked (O)]
10/15/2001 Cross-sectional studies 62
etc.
[Figure: population diagram with 4 cases marked (O)]
10/15/2001 Cross-sectional studies 63
There are 1.6 x 10^60 possible populations (from no cases to all cases)
[Figure: population diagram with 4 cases marked (O)]
10/15/2001 Cross-sectional studies 64
Suppose this is the population (prevalence = 15%)
[Figure: population diagram with 30 cases marked (O)]
10/15/2001 Cross-sectional studies 65
Take a sample (n=10)
[Figure: population diagram with 30 cases marked (O)]
10/15/2001 Cross-sectional studies 66
The sample
[Figure: the 10 sampled individuals, 2 of them cases (O)]
10/15/2001 Cross-sectional studies 67
Make point estimate of prevalence
[Figure: the 10 sampled individuals, 2 of them cases (O), i.e., 2/10 = 20%]
6/29/2005 Cross-sectional studies 68
Interval estimate
• What are all the possible populations that would be expected to yield this prevalence in a sample of size 10?
10/15/2001 Cross-sectional studies 69
This one is not possible
[Figure: population diagram with 1 case marked (O)]
3/8/2006 Cross-sectional studies 70
Possible, but VERY UNLIKELY
[Figure: population diagram with 2 cases marked (O)]
3/8/2006 Cross-sectional studies 71
Not quite 2.5% probability (2.1%, in fact)
[Figure: population diagram with 5 cases marked (O)]
3/8/2006 Cross-sectional studies 72
Yields just about 2.5% (3%, actually) probability of selecting 2 (or more) cases in 10
[Figure: population diagram with 6 cases marked (O)]
3/8/2006 Cross-sectional studies 73
One possible “true” value and how it would manifest, on average, in N identical studies
[Figure: bell-shaped dot plot of study estimates centered on the true value, with the central 95% of the distribution marked]
True value
3/8/2006 Cross-sectional studies 74
Just above 2.5% (actually 2.6%) probability of selecting 2 (or fewer) cases in 10
[Figure: population diagram with a large number of cases marked (O)]
3/8/2006 Cross-sectional studies 75
Just below 2.5% (actually 2.4%) probability of selecting 2 (or fewer) cases in 10
[Figure: population diagram with a large number of cases marked (O)]
3/8/2006 Cross-sectional studies 76
Interval estimate for 2/10
• Lower bound: 2.5% (5 cases)
• Upper bound: 55% (110 cases)
Meaning: Our sample of 10 with 2 cases provides evidence to exclude, at conventional error tolerance, populations with fewer than 5 cases or more than 110 cases. Populations with 5-110 cases cannot be excluded as likely sources for this sample.
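The tail probabilities behind these bounds can be reproduced by treating the sample as a draw without replacement, which gives a hypergeometric distribution. The sketch assumes a population of N = 200, which is inferred from the bounds above (5 cases = 2.5%, 110 cases = 55%) rather than stated on the slides; the function names are hypothetical.

```python
from math import comb

N, n, observed = 200, 10, 2  # N = 200 is inferred, not stated in the slides

def pmf(k, K):
    """P(exactly k cases in the sample) when the population holds K cases."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def p_at_least(K, x=observed):
    """P(x or more cases in the sample) for a population with K cases."""
    return sum(pmf(k, K) for k in range(x, min(n, K) + 1))

def p_at_most(K, x=observed):
    """P(x or fewer cases in the sample) for a population with K cases."""
    return sum(pmf(k, K) for k in range(0, x + 1))

# Each person is either a case or not, so there are 2**N possible populations.
print(f"possible populations: {2 ** N:.1e}")

for K in (1, 2, 5, 6):
    print(f"K={K:3d}  P(2 or more cases in 10) = {p_at_least(K):.3f}")
for K in (105, 110, 115):
    print(f"K={K:3d}  P(2 or fewer cases in 10) = {p_at_most(K):.3f}")
```

Candidate populations whose tail probability falls below 2.5% are the ones the sample gives grounds to exclude; the rest make up the interval estimate.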
3/8/2006 Cross-sectional studies 77
Interval estimate for 2/10
• Actual population prevalence was 15%, which in fact is between 2.5% and 55%.
• 2.5% to 55% is a very wide interval, i.e., a very imprecise estimate
• To make it more precise, we need a larger sample
78
Signs from around the world – Germany
A sign posted in Germany's Black Forest:
“It is strictly forbidden on our black forest camping site that people of different sex, for instance, men and women, live together in one tent unless they are married with each other for that purpose.”
79
Signs from around the world – Finland
On the faucet in a Finnish washroom:
“To stop the drip, turn cock to right.”

Β 
Indicators for assessing infant and young child feeding practices Part 1 Defi...
Indicators for assessing infant and young child feeding practices Part 1 Defi...Indicators for assessing infant and young child feeding practices Part 1 Defi...
Indicators for assessing infant and young child feeding practices Part 1 Defi...Abdiwali Abdullahi Abdiwali
Β 
POLICY MAKING PROCESS training policy influence appiah-kubi, 16 dec 2015
POLICY MAKING PROCESS training policy influence appiah-kubi, 16 dec 2015POLICY MAKING PROCESS training policy influence appiah-kubi, 16 dec 2015
POLICY MAKING PROCESS training policy influence appiah-kubi, 16 dec 2015Abdiwali Abdullahi Abdiwali
Β 

More from Abdiwali Abdullahi Abdiwali (20)

Communicable disease control
Communicable disease controlCommunicable disease control
Communicable disease control
Β 
Vector control
Vector controlVector control
Vector control
Β 
Outbreak investigation steps
Outbreak investigation stepsOutbreak investigation steps
Outbreak investigation steps
Β 
STATEMENT BY THE HUMANITARIAN COORDINATOR FOR SOMALIA, PHILIPPE LAZZARINI
STATEMENT BY THE HUMANITARIAN COORDINATOR FOR SOMALIA, PHILIPPE LAZZARINISTATEMENT BY THE HUMANITARIAN COORDINATOR FOR SOMALIA, PHILIPPE LAZZARINI
STATEMENT BY THE HUMANITARIAN COORDINATOR FOR SOMALIA, PHILIPPE LAZZARINI
Β 
Basic introduction communicable
Basic introduction communicableBasic introduction communicable
Basic introduction communicable
Β 
PRIMARY HEALTH CARE
PRIMARY HEALTH CAREPRIMARY HEALTH CARE
PRIMARY HEALTH CARE
Β 
Lecture Notes of Nutrition For Health Extension Workers
Lecture Notes of Nutrition For Health Extension WorkersLecture Notes of Nutrition For Health Extension Workers
Lecture Notes of Nutrition For Health Extension Workers
Β 
HIV prevention-2020-road-map
HIV prevention-2020-road-mapHIV prevention-2020-road-map
HIV prevention-2020-road-map
Β 
Ethics in public health surveillance
Ethics in public health surveillanceEthics in public health surveillance
Ethics in public health surveillance
Β 
Health planing and management
Health planing and managementHealth planing and management
Health planing and management
Β 
WHO–recommended standards for surveillance of selected vaccine-preventable di...
WHO–recommended standards for surveillance of selected vaccine-preventable di...WHO–recommended standards for surveillance of selected vaccine-preventable di...
WHO–recommended standards for surveillance of selected vaccine-preventable di...
Β 
Expended program in Immunization
Expended program in ImmunizationExpended program in Immunization
Expended program in Immunization
Β 
Water &amp; sanitation handbook
Water &amp; sanitation handbookWater &amp; sanitation handbook
Water &amp; sanitation handbook
Β 
Somali phast step guide.
Somali phast step guide.Somali phast step guide.
Somali phast step guide.
Β 
Introduction to Health education
Introduction to Health educationIntroduction to Health education
Introduction to Health education
Β 
Indicators for assessing infant and young child feeding practices Part 1 Defi...
Indicators for assessing infant and young child feeding practices Part 1 Defi...Indicators for assessing infant and young child feeding practices Part 1 Defi...
Indicators for assessing infant and young child feeding practices Part 1 Defi...
Β 
POLICY MAKING PROCESS training policy influence appiah-kubi, 16 dec 2015
POLICY MAKING PROCESS training policy influence appiah-kubi, 16 dec 2015POLICY MAKING PROCESS training policy influence appiah-kubi, 16 dec 2015
POLICY MAKING PROCESS training policy influence appiah-kubi, 16 dec 2015
Β 
Case controlstudyi
Case controlstudyi  Case controlstudyi
Case controlstudyi
Β 
Assessmentof nutritional status
Assessmentof nutritional statusAssessmentof nutritional status
Assessmentof nutritional status
Β 
Ancylostomiasis
AncylostomiasisAncylostomiasis
Ancylostomiasis
Β 

Recently uploaded

Vip Call Girls Anna Salai Chennai πŸ‘‰ 8250192130 β£οΈπŸ’― Top Class Girls Available
Vip Call Girls Anna Salai Chennai πŸ‘‰ 8250192130 β£οΈπŸ’― Top Class Girls AvailableVip Call Girls Anna Salai Chennai πŸ‘‰ 8250192130 β£οΈπŸ’― Top Class Girls Available
Vip Call Girls Anna Salai Chennai πŸ‘‰ 8250192130 β£οΈπŸ’― Top Class Girls AvailableNehru place Escorts
Β 
Bangalore Call Girls Majestic πŸ“ž 9907093804 High Profile Service 100% Safe
Bangalore Call Girls Majestic πŸ“ž 9907093804 High Profile Service 100% SafeBangalore Call Girls Majestic πŸ“ž 9907093804 High Profile Service 100% Safe
Bangalore Call Girls Majestic πŸ“ž 9907093804 High Profile Service 100% Safenarwatsonia7
Β 
Russian Call Girls in Pune Tanvi 9907093804 Short 1500 Night 6000 Best call g...
Russian Call Girls in Pune Tanvi 9907093804 Short 1500 Night 6000 Best call g...Russian Call Girls in Pune Tanvi 9907093804 Short 1500 Night 6000 Best call g...
Russian Call Girls in Pune Tanvi 9907093804 Short 1500 Night 6000 Best call g...Miss joya
Β 
Call Girl Number in Vashi MumbaiπŸ“² 9833363713 πŸ’ž Full Night Enjoy
Call Girl Number in Vashi MumbaiπŸ“² 9833363713 πŸ’ž Full Night EnjoyCall Girl Number in Vashi MumbaiπŸ“² 9833363713 πŸ’ž Full Night Enjoy
Call Girl Number in Vashi MumbaiπŸ“² 9833363713 πŸ’ž Full Night Enjoybabeytanya
Β 
Aspirin presentation slides by Dr. Rewas Ali
Aspirin presentation slides by Dr. Rewas AliAspirin presentation slides by Dr. Rewas Ali
Aspirin presentation slides by Dr. Rewas AliRewAs ALI
Β 
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escorts
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore EscortsCall Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escorts
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escortsvidya singh
Β 
Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...
Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...
Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...astropune
Β 
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...Miss joya
Β 
Bangalore Call Girls Hebbal Kempapura Number 7001035870 Meetin With Bangalor...
Bangalore Call Girls Hebbal Kempapura Number 7001035870  Meetin With Bangalor...Bangalore Call Girls Hebbal Kempapura Number 7001035870  Meetin With Bangalor...
Bangalore Call Girls Hebbal Kempapura Number 7001035870 Meetin With Bangalor...narwatsonia7
Β 
Bangalore Call Girls Nelamangala Number 7001035870 Meetin With Bangalore Esc...
Bangalore Call Girls Nelamangala Number 7001035870  Meetin With Bangalore Esc...Bangalore Call Girls Nelamangala Number 7001035870  Meetin With Bangalore Esc...
Bangalore Call Girls Nelamangala Number 7001035870 Meetin With Bangalore Esc...narwatsonia7
Β 
β™›VVIP Hyderabad Call Girls ChintalkuntaπŸ–•7001035870πŸ–•Riya Kappor Top Call Girl ...
β™›VVIP Hyderabad Call Girls ChintalkuntaπŸ–•7001035870πŸ–•Riya Kappor Top Call Girl ...β™›VVIP Hyderabad Call Girls ChintalkuntaπŸ–•7001035870πŸ–•Riya Kappor Top Call Girl ...
β™›VVIP Hyderabad Call Girls ChintalkuntaπŸ–•7001035870πŸ–•Riya Kappor Top Call Girl ...astropune
Β 
VIP Service Call Girls Sindhi Colony πŸ“³ 7877925207 For 18+ VIP Call Girl At Th...
VIP Service Call Girls Sindhi Colony πŸ“³ 7877925207 For 18+ VIP Call Girl At Th...VIP Service Call Girls Sindhi Colony πŸ“³ 7877925207 For 18+ VIP Call Girl At Th...
VIP Service Call Girls Sindhi Colony πŸ“³ 7877925207 For 18+ VIP Call Girl At Th...jageshsingh5554
Β 
Call Girls Service Pune Vaishnavi 9907093804 Short 1500 Night 6000 Best call ...
Call Girls Service Pune Vaishnavi 9907093804 Short 1500 Night 6000 Best call ...Call Girls Service Pune Vaishnavi 9907093804 Short 1500 Night 6000 Best call ...
Call Girls Service Pune Vaishnavi 9907093804 Short 1500 Night 6000 Best call ...Miss joya
Β 
CALL ON βž₯9907093804 πŸ” Call Girls Hadapsar ( Pune) Girls Service
CALL ON βž₯9907093804 πŸ” Call Girls Hadapsar ( Pune)  Girls ServiceCALL ON βž₯9907093804 πŸ” Call Girls Hadapsar ( Pune)  Girls Service
CALL ON βž₯9907093804 πŸ” Call Girls Hadapsar ( Pune) Girls ServiceMiss joya
Β 
Lucknow Call girls - 8800925952 - 24x7 service with hotel room
Lucknow Call girls - 8800925952 - 24x7 service with hotel roomLucknow Call girls - 8800925952 - 24x7 service with hotel room
Lucknow Call girls - 8800925952 - 24x7 service with hotel roomdiscovermytutordmt
Β 
Russian Escorts Girls Nehru Place ZINATHI πŸ”9711199012 β˜ͺ 24/7 Call Girls Delhi
Russian Escorts Girls  Nehru Place ZINATHI πŸ”9711199012 β˜ͺ 24/7 Call Girls DelhiRussian Escorts Girls  Nehru Place ZINATHI πŸ”9711199012 β˜ͺ 24/7 Call Girls Delhi
Russian Escorts Girls Nehru Place ZINATHI πŸ”9711199012 β˜ͺ 24/7 Call Girls DelhiAlinaDevecerski
Β 
Low Rate Call Girls Pune Esha 9907093804 Short 1500 Night 6000 Best call girl...
Low Rate Call Girls Pune Esha 9907093804 Short 1500 Night 6000 Best call girl...Low Rate Call Girls Pune Esha 9907093804 Short 1500 Night 6000 Best call girl...
Low Rate Call Girls Pune Esha 9907093804 Short 1500 Night 6000 Best call girl...Miss joya
Β 
Call Girls Colaba Mumbai ❀️ 9920874524 πŸ‘ˆ Cash on Delivery
Call Girls Colaba Mumbai ❀️ 9920874524 πŸ‘ˆ Cash on DeliveryCall Girls Colaba Mumbai ❀️ 9920874524 πŸ‘ˆ Cash on Delivery
Call Girls Colaba Mumbai ❀️ 9920874524 πŸ‘ˆ Cash on Deliverynehamumbai
Β 
(Rocky) Jaipur Call Girl - 9521753030 Escorts Service 50% Off with Cash ON De...
(Rocky) Jaipur Call Girl - 9521753030 Escorts Service 50% Off with Cash ON De...(Rocky) Jaipur Call Girl - 9521753030 Escorts Service 50% Off with Cash ON De...
(Rocky) Jaipur Call Girl - 9521753030 Escorts Service 50% Off with Cash ON De...indiancallgirl4rent
Β 

Recently uploaded (20)

Russian Call Girls in Delhi Tanvi ➑️ 9711199012 πŸ’‹πŸ“ž Independent Escort Service...
Russian Call Girls in Delhi Tanvi ➑️ 9711199012 πŸ’‹πŸ“ž Independent Escort Service...Russian Call Girls in Delhi Tanvi ➑️ 9711199012 πŸ’‹πŸ“ž Independent Escort Service...
Russian Call Girls in Delhi Tanvi ➑️ 9711199012 πŸ’‹πŸ“ž Independent Escort Service...
Β 
Vip Call Girls Anna Salai Chennai πŸ‘‰ 8250192130 β£οΈπŸ’― Top Class Girls Available
Vip Call Girls Anna Salai Chennai πŸ‘‰ 8250192130 β£οΈπŸ’― Top Class Girls AvailableVip Call Girls Anna Salai Chennai πŸ‘‰ 8250192130 β£οΈπŸ’― Top Class Girls Available
Vip Call Girls Anna Salai Chennai πŸ‘‰ 8250192130 β£οΈπŸ’― Top Class Girls Available
Β 
Bangalore Call Girls Majestic πŸ“ž 9907093804 High Profile Service 100% Safe
Bangalore Call Girls Majestic πŸ“ž 9907093804 High Profile Service 100% SafeBangalore Call Girls Majestic πŸ“ž 9907093804 High Profile Service 100% Safe
Bangalore Call Girls Majestic πŸ“ž 9907093804 High Profile Service 100% Safe
Β 
Russian Call Girls in Pune Tanvi 9907093804 Short 1500 Night 6000 Best call g...
Russian Call Girls in Pune Tanvi 9907093804 Short 1500 Night 6000 Best call g...Russian Call Girls in Pune Tanvi 9907093804 Short 1500 Night 6000 Best call g...
Russian Call Girls in Pune Tanvi 9907093804 Short 1500 Night 6000 Best call g...
Β 
Call Girl Number in Vashi MumbaiπŸ“² 9833363713 πŸ’ž Full Night Enjoy
Call Girl Number in Vashi MumbaiπŸ“² 9833363713 πŸ’ž Full Night EnjoyCall Girl Number in Vashi MumbaiπŸ“² 9833363713 πŸ’ž Full Night Enjoy
Call Girl Number in Vashi MumbaiπŸ“² 9833363713 πŸ’ž Full Night Enjoy
Β 
Aspirin presentation slides by Dr. Rewas Ali
Aspirin presentation slides by Dr. Rewas AliAspirin presentation slides by Dr. Rewas Ali
Aspirin presentation slides by Dr. Rewas Ali
Β 
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escorts
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore EscortsCall Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escorts
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escorts
Β 
Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...
Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...
Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...
Β 
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
Β 
Bangalore Call Girls Hebbal Kempapura Number 7001035870 Meetin With Bangalor...
Bangalore Call Girls Hebbal Kempapura Number 7001035870  Meetin With Bangalor...Bangalore Call Girls Hebbal Kempapura Number 7001035870  Meetin With Bangalor...
Bangalore Call Girls Hebbal Kempapura Number 7001035870 Meetin With Bangalor...
Β 
Bangalore Call Girls Nelamangala Number 7001035870 Meetin With Bangalore Esc...
Bangalore Call Girls Nelamangala Number 7001035870  Meetin With Bangalore Esc...Bangalore Call Girls Nelamangala Number 7001035870  Meetin With Bangalore Esc...
Bangalore Call Girls Nelamangala Number 7001035870 Meetin With Bangalore Esc...
Β 
β™›VVIP Hyderabad Call Girls ChintalkuntaπŸ–•7001035870πŸ–•Riya Kappor Top Call Girl ...
β™›VVIP Hyderabad Call Girls ChintalkuntaπŸ–•7001035870πŸ–•Riya Kappor Top Call Girl ...β™›VVIP Hyderabad Call Girls ChintalkuntaπŸ–•7001035870πŸ–•Riya Kappor Top Call Girl ...
β™›VVIP Hyderabad Call Girls ChintalkuntaπŸ–•7001035870πŸ–•Riya Kappor Top Call Girl ...
Β 
VIP Service Call Girls Sindhi Colony πŸ“³ 7877925207 For 18+ VIP Call Girl At Th...
VIP Service Call Girls Sindhi Colony πŸ“³ 7877925207 For 18+ VIP Call Girl At Th...VIP Service Call Girls Sindhi Colony πŸ“³ 7877925207 For 18+ VIP Call Girl At Th...
VIP Service Call Girls Sindhi Colony πŸ“³ 7877925207 For 18+ VIP Call Girl At Th...
Β 
Call Girls Service Pune Vaishnavi 9907093804 Short 1500 Night 6000 Best call ...
Call Girls Service Pune Vaishnavi 9907093804 Short 1500 Night 6000 Best call ...Call Girls Service Pune Vaishnavi 9907093804 Short 1500 Night 6000 Best call ...
Call Girls Service Pune Vaishnavi 9907093804 Short 1500 Night 6000 Best call ...
Β 
CALL ON βž₯9907093804 πŸ” Call Girls Hadapsar ( Pune) Girls Service
CALL ON βž₯9907093804 πŸ” Call Girls Hadapsar ( Pune)  Girls ServiceCALL ON βž₯9907093804 πŸ” Call Girls Hadapsar ( Pune)  Girls Service
CALL ON βž₯9907093804 πŸ” Call Girls Hadapsar ( Pune) Girls Service
Β 
Lucknow Call girls - 8800925952 - 24x7 service with hotel room
Lucknow Call girls - 8800925952 - 24x7 service with hotel roomLucknow Call girls - 8800925952 - 24x7 service with hotel room
Lucknow Call girls - 8800925952 - 24x7 service with hotel room
Β 
Russian Escorts Girls Nehru Place ZINATHI πŸ”9711199012 β˜ͺ 24/7 Call Girls Delhi
Russian Escorts Girls  Nehru Place ZINATHI πŸ”9711199012 β˜ͺ 24/7 Call Girls DelhiRussian Escorts Girls  Nehru Place ZINATHI πŸ”9711199012 β˜ͺ 24/7 Call Girls Delhi
Russian Escorts Girls Nehru Place ZINATHI πŸ”9711199012 β˜ͺ 24/7 Call Girls Delhi
Β 
Low Rate Call Girls Pune Esha 9907093804 Short 1500 Night 6000 Best call girl...
Low Rate Call Girls Pune Esha 9907093804 Short 1500 Night 6000 Best call girl...Low Rate Call Girls Pune Esha 9907093804 Short 1500 Night 6000 Best call girl...
Low Rate Call Girls Pune Esha 9907093804 Short 1500 Night 6000 Best call girl...
Β 
Call Girls Colaba Mumbai ❀️ 9920874524 πŸ‘ˆ Cash on Delivery
Call Girls Colaba Mumbai ❀️ 9920874524 πŸ‘ˆ Cash on DeliveryCall Girls Colaba Mumbai ❀️ 9920874524 πŸ‘ˆ Cash on Delivery
Call Girls Colaba Mumbai ❀️ 9920874524 πŸ‘ˆ Cash on Delivery
Β 
(Rocky) Jaipur Call Girl - 9521753030 Escorts Service 50% Off with Cash ON De...
(Rocky) Jaipur Call Girl - 9521753030 Escorts Service 50% Off with Cash ON De...(Rocky) Jaipur Call Girl - 9521753030 Escorts Service 50% Off with Cash ON De...
(Rocky) Jaipur Call Girl - 9521753030 Escorts Service 50% Off with Cash ON De...
Β 

07 Cross-sectional studies

  • 1. Study designs: Cross-sectional studies, ecologic studies (and confidence intervals)
      Victor J. Schoenbach, PhD
      Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill
      www.unc.edu/epid600/
      Principles of Epidemiology for Public Health (EPID600)
      2/22/2011
  • 2. Signs from around the world
      In a Copenhagen airline ticket office: “We take your bags and send them in all directions.”
  • 3. Signs from around the world
      In a Norwegian cocktail lounge: “Ladies are requested not to have children in the bar.”
  • 4. Signs from around the world
      Rome laundry: “Ladies, leave your clothes here and spend the afternoon having a good time.”
  • 5. Faster keyboarding - 1
      I cdnuolt blveiee taht I cluod aulaclty uesdnatnrd waht I was rdanieg. The phaonmneal pweor of the hmuan mnid, aoccdrnig to a rscheearch at Cmabrigde Uinervtisy. It dn'seot mttaer in waht oredr the ltteers in a wrod are, the olny iprmoatnt tihng is taht the frist and lsat ltteer be in the rghit pclae. The rset can be a taotl mses and you can sitll raed it wouthit a porbelm.
      Source: Gary C. Ramseyer's First Internet Gallery of Statistics Jokes, http://davidmlane.com/hyperstat/humorf.html (#162)
  • 6. Faster keyboarding - 2
      Most of my friends could read this with understanding and rather quickly I might add. Then I had them read a statistical bit of literature:
      Miittluvraae asilyans sattes an idtenossiy ctuoonr epilsle is the itternoiecsno of a panle pleralal to the xl-yapne and the sruacfe of a btiiarave nmarol dbttiisruein.
      Source: Gary C. Ramseyer's First Internet Gallery of Statistics Jokes, http://davidmlane.com/hyperstat/humorf.html (#162)
  • 8. Today: outline
      • Cross-sectional studies (and sampling)
      • Ecologic studies
      • Confidence intervals
  • 9. Cross-sectional studies
      • Cross-sectional studies include surveys.
      • People are studied at a “point” in time, without follow-up.
      • Can combine a cross-sectional study with follow-up to create a cohort study.
      • Can conduct repeated cross-sectional studies to measure change in a population.
  • 10. Cross-sectional studies
      • Number of uninsured Americans rises to 50.7 million. (USA Today, 9/17/2010; data from Census Bureau)
      • In 2007-2008, almost one in five children older than 5 years was obese. (Health, United States, 2010; data from the National Health and Nutrition Examination Survey)
      • 35% (~7.4 million) of births to U.S. women during the preceding 5 years were mistimed or unwanted. (2002 National Survey of Family Growth, Series 23, No. 25, Table 21)
      [Source: www.cdc.gov/nchs/]
  • 11. Cross-sectional studies
      • Incidence information is not available from a typical cross-sectional study.
      • Incidence can sometimes be reconstructed from historical information.
      • Example: the incidence proportion of quitting smoking, called the “quit ratio” (ex-smokers / ever-smokers), is calculated from survey data.
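The quit ratio on slide 11 can be sketched in a few lines. This is a minimal illustration, assuming that ever-smokers are counted as ex-smokers plus current smokers; the function name `quit_ratio` and the survey counts are hypothetical and do not come from the slides.

```python
def quit_ratio(ex_smokers: int, current_smokers: int) -> float:
    """Quit ratio = ex-smokers / ever-smokers, with ever = ex + current."""
    ever_smokers = ex_smokers + current_smokers
    if ever_smokers == 0:
        raise ValueError("no ever-smokers in the sample")
    return ex_smokers / ever_smokers


# Hypothetical survey counts, not taken from any real survey.
ratio = quit_ratio(ex_smokers=450, current_smokers=550)
print(f"quit ratio = {ratio:.2f}")
```

Because both counts come from a single survey, the ratio summarizes quitting that has already happened among people who ever smoked; it does not say when the quitting occurred.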
  • 12. Measure prevalence at “point” in time
      • “Snapshot” of a population, a “still life”
      • Can measure attitudes, beliefs, behaviors, personal or family history, genetic factors, existing or past health conditions, or anything else that does not require follow-up to assess.
      • The source of most of what we know about the population
  • 13. Population census
      • A cross-sectional study of an entire population
      • Provides the denominator data for many purposes (e.g., estimation of rates, assessing generalizability, projecting from smaller studies)
      • A huge effort: people can be difficult to find and to count, and may not want to provide data
      • Some countries maintain accurate and current registries of the entire population
  • 14. National surveys conducted by NCHS
      • National Health Interview Survey (NHIS): household interviews
      • National Health and Nutrition Examination Survey (NHANES): interviews and physical examinations
      • National Survey of Family Growth (NSFG): household interviews
      • National Health Care Survey (NHCS): medical records
  • 15. National surveys
      • Designed to be representative of the entire country
      • Modes: household interview, telephone, mail
      • Employ complex sampling designs to optimize efficiency (tradeoff between information and cost)
      • Logistically challenging (answering machines, cellphones, ...)
      See presentation by Dr. Anjani Chandra at www.minority.unc.edu/institute/2003/materials/slides/Chandra-20030522.ppt
  • 16. Example: National Health Interview Survey
      • Conducted every year in the U.S. by the National Center for Health Statistics (CDC)
      • “Stratified, multistaged, household survey that covers the civilian noninstitutionalized population of the United States”
      • Redesigned every decade to use the new census
  • 17. “multistaged”
      • Improves logistical feasibility and reduces costs (though reduces precision)
      1. Divide population into primary sampling units (PSUs). PSU = primary sampling unit: metropolitan statistical area, county, or group of adjacent counties
  • 18. “multistaged”
      2. Select sample of census block groups (SSUs) within each selected PSU
      3. Map each selected census block group or examine building permits
      4. Select one cluster of 4-8 housing units dispersed evenly throughout the block
      NCHS draws a new representative sample for each week's interviews
  • 19. “stratified”
      • US divided into 1,900 PSUs
      • Largest 52 PSUs are “self-representing”
      • Rest of PSUs divided into 73 categories (“strata”), based on socioeconomic and demographic variables
      • Sampling takes place separately within each category (“stratum”)
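The idea that sampling takes place separately within each stratum can be sketched as follows. This is only an illustration under simplifying assumptions: the stratum names, PSU labels, and the function `stratified_sample` are hypothetical, and a simple random draw within each stratum stands in for whatever selection procedure the survey actually uses.

```python
import random


def stratified_sample(strata: dict[str, list[str]], per_stratum: int,
                      seed: int = 0) -> dict[str, list[str]]:
    """Draw a simple random sample of PSUs separately within each stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(units, min(per_stratum, len(units)))
        for name, units in strata.items()
    }


# Hypothetical strata and PSU labels (not the actual NHIS strata).
strata = {
    "urban_high_income": [f"PSU-U{i}" for i in range(1, 9)],
    "urban_low_income": [f"PSU-L{i}" for i in range(1, 7)],
    "rural": [f"PSU-R{i}" for i in range(1, 11)],
}

selected = stratified_sample(strata, per_stratum=2)
for name, psus in selected.items():
    print(name, psus)
```

Every stratum contributes to the sample by construction, which is the point of stratifying. Self-representing PSUs would simply be included with certainty rather than drawn.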
  • 20. Sample size and precision
        Sample size   Lower 95%   Point estimate   Upper 95%   Width
        100           0.17        0.25             0.33        0.16
        400           0.21        0.25             0.29        0.08
        900           0.22        0.25             0.28        0.06
        1600          0.23        0.25             0.27        0.04
      (Working values: p = 0.25; p(1 - p) = 0.188; sqrt(p(1 - p)) = 0.43301)
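The limits in the slide 20 table are consistent with the usual normal-approximation (Wald) interval for a proportion, p ± 1.96 × sqrt(p(1 - p)/n). That the slide used this formula is an assumption; the sketch below, with the hypothetical helper `wald_ci`, shows how the bounds can be recomputed.

```python
import math

Z_95 = 1.96  # standard normal critical value for a two-sided 95% interval


def wald_ci(p_hat: float, n: int, z: float = Z_95) -> tuple[float, float]:
    """Normal-approximation confidence interval for a proportion."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se


for n in (100, 400, 900, 1600):
    lower, upper = wald_ci(0.25, n)
    print(f"n={n:>4}  lower={lower:.2f}  upper={upper:.2f}  width={upper - lower:.3f}")
```

The standard error shrinks with the square root of n, so quadrupling the sample size (100 to 400, or 400 to 1600) halves the width. The width column on the slide appears to be the difference of the rounded limits, so it can differ slightly from the unrounded width printed here.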
  • 21. Weighted sampling (hypothetical)
        Age group   Pop (1,000's)   Unweighted sample   Weighted sample
        20-39 yrs   18,000          900                 400
        40-59 yrs   18,000          900                 400
        60-69 yrs   8,000           400                 400
        Total       44,000          2,200               1,200
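One way to read the slide 21 table: the "unweighted" sample takes the same fraction of every age group, while the "weighted" sample takes 400 from each, so respondents from different groups represent different numbers of people. The sketch below computes those sampling weights from the table's figures. The stratum prevalences are made up for illustration, and the names `weights` and `weighted_prevalence` are hypothetical.

```python
# Population sizes (persons) and sample sizes from the "weighted" design in the table.
population = {"20-39": 18_000_000, "40-59": 18_000_000, "60-69": 8_000_000}
sample_n = {"20-39": 400, "40-59": 400, "60-69": 400}

# Hypothetical prevalences observed in each stratum's sample (made up).
observed_prev = {"20-39": 0.10, "40-59": 0.20, "60-69": 0.30}

# Sampling weight: number of people in the population each respondent represents.
weights = {g: population[g] / sample_n[g] for g in population}


def weighted_prevalence(prev: dict, n: dict, w: dict) -> float:
    """Population prevalence estimate using sampling weights."""
    cases = sum(prev[g] * n[g] * w[g] for g in prev)
    total = sum(n[g] * w[g] for g in prev)
    return cases / total


# Ignoring the weights treats the oversampled oldest group as if it were a third
# of the population.
naive = sum(observed_prev[g] * sample_n[g] for g in sample_n) / sum(sample_n.values())
weighted = weighted_prevalence(observed_prev, sample_n, weights)

print(weights)
print(f"naive = {naive:.3f}, weighted = {weighted:.3f}")
```

Equal allocation gives each age group the same precision, at the cost of requiring weights whenever an estimate for the whole population is wanted.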
  • 22. “stratified”
      • Also place census blocks into categories and sample within each
      • Oversample some strata
  • 23. “Defined population”
      • Studies, especially cross-sectional studies, are easiest to interpret when they are based in a population that has some existence apart from the study itself (“defined population”)
          1. Political subdivision (city, county, state)
          2. Institutional (HMO, employer, profession)
      • Probability sampling enables statistical generalizability to the defined population
  • 24. Surveys of sentinel populations
      • HIV seroprevalence survey in three county STD clinics in central NC in 1988
      • 3,000 anonymous, unlinked, leftover sera
      • Anonymous questionnaire for demographics and risk factors
      [Schoenbach VJ, Landis SE, Weber DJ, Mittal M, Koch GG, Levine PH. HIV seroprevalence in sexually transmitted disease clients in a low-prevalence southern state. Ann Epidemiol 1993;3:281-288]
  • 25. HIV seroprevalence
        Group              % HIV+
        Homosexual men     46
        Bisexual men       25
        Heterosexual men   1.6
        Women              0.6
        Total              2.5
      [Schoenbach VJ, Landis SE, Weber DJ, Mittal M, Koch GG, Levine PH. HIV seroprevalence in sexually transmitted disease clients in a low-prevalence southern state. Ann Epidemiol 1993;3:281-288]
  • 26. Seroprevalence (% HIV+) by risk factors
        Characteristic               Gay    Hetero   Women
        Syphilis (history/current)   53     9.0      3
        Gonorrhea (history)          37     2.6      1
        Anal intercourse             41     1.7      2
        Paid for sex                 5.2 (single value; column not indicated)
      [Schoenbach VJ, Landis SE, Weber DJ, Mittal M, Koch GG, Levine PH. HIV seroprevalence in sexually transmitted disease clients in a low-prevalence southern state. Ann Epidemiol 1993;3:281-288]
  • 27. Interpretation
      • Measures prevalence: if incidence is our real interest, prevalence is often not a good surrogate measure
      • Studies only “survivors” and “stayers”
      • May be difficult to determine whether a “cause” came before an “effect” (exception: genetic factors)
  • 28. Other points
      • Can choose participants by exposure or overall
      • Can choose participants by disease, in which case the study may not be distinguishable from a case-control study with prevalent cases
  • 29. Outline
      • Cross-sectional studies (and sampling)
      • Ecologic studies
      • Confidence intervals
  • 30. “Ecologic” studies
      • Most study designs (cross-sectional, case-control, cohort, intervention trials) can be carried out with individuals or with groups
      • Group-level studies which use routinely collected data are easier and less costly
      • Group-level studies that involve interventions may not be easier or less costly
  • 31. Types of group-level variables
      • Summary of individual-level variable (e.g., median household income, % with high school diploma)
      • Property of the aggregate (e.g., neighborhood grocery stores, seat belt legislation, “community competence”)
  • 32. Interpretation
      • Link between summary exposure variable and individual-level outcome must be inferred
      • Inference from group to individual is not always sound
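A small constructed example can show why inference from group to individual is not always sound. All counts below are hypothetical and chosen only to make the point: across the three communities, a higher percentage exposed goes with a higher percentage with the outcome, yet within every community the exposed individuals have the lower risk. The helper `pearson_r` is written out by hand for the illustration.

```python
import math

# Hypothetical communities: (exposed n, exposed cases, unexposed n, unexposed cases).
communities = {
    "A": (20, 1, 80, 8),
    "B": (50, 10, 50, 15),
    "C": (80, 32, 20, 10),
}


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


pct_exposed, pct_outcome, risk_differences = [], [], {}
for name, (n_exp, cases_exp, n_unexp, cases_unexp) in communities.items():
    total = n_exp + n_unexp
    pct_exposed.append(n_exp / total)                     # group-level exposure summary
    pct_outcome.append((cases_exp + cases_unexp) / total)  # group-level outcome summary
    risk_differences[name] = cases_exp / n_exp - cases_unexp / n_unexp

group_r = pearson_r(pct_exposed, pct_outcome)
print(f"group-level r = {group_r:.3f}")
for name, rd in risk_differences.items():
    print(f"community {name}: risk difference (exposed - unexposed) = {rd:+.2f}")
```

An analyst who sees only the three group summaries would observe a strong positive association and could not tell that the individual-level association runs the other way.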
  • 33. Example: Male Circumcision and HIV
      (Slope indicates strength of relationship; r indicates linearity)
      Source: Bongaarts J, et al. The relationship between male circumcision and HIV infection in African populations. AIDS 1989; 3(6): 373-7.
  • 34. Outline
      • Cross-sectional studies (and sampling)
      • Ecologic studies
      • Confidence intervals
  • 35. Confidence intervals
      • Provide a plausible range for the quantity being estimated
      • Width indicates the precision of an estimate for a given level of “confidence”
      • Confidence intervals quantify only random error from sampling variation, not systematic error from nonresponse, study design, etc.
  • 36. Confidence level vs. precision
      • The more vague my estimate, the more confident I can be that it includes the population parameter: “I am 100% confident that the prevalence of HIV is between 0 and 100%.”
      • The more specific my estimate, the lower my confidence: “I am 0% confident that the prevalence of HIV is 5.23%.”
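The tradeoff between confidence level and precision can be made concrete by holding the data fixed and varying the confidence level. The sketch below assumes a normal-approximation interval for a proportion and uses a hypothetical estimate (p = 0.25 from n = 400); `ci_width` is an illustrative name.

```python
import math
from statistics import NormalDist


def ci_width(p_hat: float, n: int, level: float) -> float:
    """Width of a normal-approximation interval at the given confidence level."""
    if not 0 < level < 1:
        raise ValueError("level must be strictly between 0 and 1")
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return 2 * z * se


# Hypothetical estimate: prevalence 0.25 from a sample of 400.
for level in (0.50, 0.80, 0.95, 0.99, 0.999):
    print(f"{level:.1%} confidence -> width {ci_width(0.25, 400, level):.3f}")
```

With the same data, asking for more confidence always buys a wider, less specific interval, and the width grows without bound as the level approaches 100%.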
  • 37. Confidence intervals: interpretation
      • Simple interpretations are typically not precise
      • Precise interpretations are typically not simple
  • 38. Simple but imprecise
      • “There is 95% confidence that the interval contains the true value”
          True, but begs the question of how to define “confidence”
  • 39. Simple but imprecise
      • “There is a 95% probability that the interval contains the true value”
          Not quite correct: probability (as conventionally defined) applies to a process, not to a single instance
  • 40. Probability applies to a process: example
      A 95% confidence interval can be viewed as a measurement or estimation process that will be correct (the interval includes the true value of the parameter) 95% of the time and incorrect 5% of the time.
      Let us make up another estimation process that will be correct (about) 95% of the time.
  • 41. Why probability applies to a process
      • Estimate your gender by flipping a coin 5 times: if the result is 5 heads, estimate your gender to be its opposite; otherwise estimate your gender to be what you think it is now.
      • Probability that the estimate will be correct is (1 - probability of 5 heads) = 1 - (1/2)^5 ≈ 0.97 = 97%
      • Probability that the estimate will be incorrect is about 3%
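The 97% figure on slide 41 follows from the probability of five heads in a row with a fair coin. The sketch below computes it exactly and then checks it by simulating the process many times; the function name `simulate_process` and the number of repetitions are arbitrary choices for the illustration.

```python
import random
from fractions import Fraction

# Exact calculation: the process errs only when all 5 flips are heads.
p_five_heads = Fraction(1, 2) ** 5
p_correct = 1 - p_five_heads
print(p_five_heads, p_correct, float(p_correct))


def simulate_process(trials: int, seed: int = 0) -> float:
    """Fraction of repetitions in which the coin-flip 'estimator' is correct."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        five_heads = all(rng.random() < 0.5 for _ in range(5))
        correct += not five_heads  # correct unless all five flips are heads
    return correct / trials


print(simulate_process(100_000))
```

The long-run frequency describes the process. Once a particular run has produced five heads, the "estimate" for that run is simply wrong, however reliable the process is on average.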
  • 42. Why probability applies to a process
      So we now have a measurement process that will be correct 97% of the time. We will use it to measure your gender.
      Flip the coin 5 times, and suppose you get 5 heads. Is there a 97% probability that you are of the opposite sex?
  • 43. Precise but not simple
      A 95% confidence interval is:
      1. obtained by using a procedure that will include the population parameter being estimated 95% of the time
      2. the set of all population values which are “likely” to yield a sample like the one we obtained
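The first definition on slide 43, a procedure that includes the population parameter 95% of the time, can be checked by simulation. This is a sketch under stated assumptions: the true prevalence (0.25), the sample size (400), the number of repeated studies, and the use of a normal-approximation interval are all choices made for the illustration, and `coverage` is a hypothetical helper.

```python
import math
import random


def wald_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation 95% interval from a count of successes."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se


def coverage(true_p: float, n: int, n_studies: int, seed: int = 0) -> float:
    """Fraction of simulated studies whose interval contains the true value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        successes = sum(rng.random() < true_p for _ in range(n))
        lower, upper = wald_ci(successes, n)
        hits += lower <= true_p <= upper
    return hits / n_studies


print(coverage(true_p=0.25, n=400, n_studies=2000))
```

Each simulated study yields a different interval, and roughly 95% of them contain the fixed true value. Any single interval either contains it or does not, which is why the 95% describes the procedure rather than one result.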
  • 44. 10/15/2001 Cross-sectional studies 44 Suppose that this line represents the value of the parameter we are trying to estimate True value
  • 45. 10/15/2001 Cross-sectional studies 45 Possible estimates of that parameter in N identical studies (shows sampling variation) [Figure: dot-plot distribution of study estimates around the True value]
  • 46. 10/15/2001 Cross-sectional studies 46 One possible “true” value and how it would manifest, on average, in N identical studies [Figure: dot-plot distribution; labels: 95% of the distribution, True value]
  • 47. 10/15/2001 Cross-sectional studies 47 Estimate from one study of a given size Estimate ?
  • 48. 10/14/2003 Cross-sectional studies 48 A possible “true” value with < 2.5% chance of being observed at or beyond the estimate [Figure: dot-plot distribution around the candidate value; labels: 95% of the distribution, Estimate ?]
  • 49. 10/15/2001 Cross-sectional studies 49 A possible true value with > 2.5% probability of being observed at or beyond the estimate [Figure: dot-plot distribution around the candidate value; labels: 95% of the distribution, Estimate ?]
  • 50. 10/15/2001 Cross-sectional studies 50 A possible true value with > 2.5% probability of being observed at or beyond the estimate [Figure: dot-plot distribution around the candidate value; labels: 95% of the distribution, Estimate ?]
  • 51. 10/15/2001 Cross-sectional studies 51 A possible true value with < 2.5% probability of being observed at or beyond the estimate [Figure: dot-plot distribution around the candidate value; labels: 95% of the distribution, Estimate ?]
  • 52. 10/14/2003 Cross-sectional studies 52 What the confidence interval represents [Figure: several overlapping dot-plot distributions; labels: 95% confidence interval, ?]
  • 53. 10/15/2001 Cross-sectional studies 53 What the confidence interval represents [Figure: a larger set of overlapping dot-plot distributions; label: 95% confidence interval]
  • 54. 3/8/2006 Cross-sectional studies 54 One possible “true” value and how it would manifest, on average, in N identical studies [Figure: dot-plot distribution; labels: 1.96 x s.e. on either side of the True value]
  • 55. 10/15/2001 Cross-sectional studies 55 Confidence intervals – another take [Figure: population diagram]
  • 56. 10/15/2001 Cross-sectional studies 56 One possible population [Figure: population diagram with cases marked O]
  • 57. 10/15/2001 Cross-sectional studies 57 Another possible population [Figure: population diagram with cases marked O]
  • 58. 10/15/2001 Cross-sectional studies 58 A 3rd possible population [Figure: population diagram with cases marked O]
  • 59. 10/15/2001 Cross-sectional studies 59 A 4th possible population [Figure: population diagram with cases marked O]
  • 60. 10/15/2001 Cross-sectional studies 60 A 5th possible population [Figure: population diagram with cases marked O]
  • 61. 10/15/2001 Cross-sectional studies 61 A 6th possible population [Figure: population diagram with cases marked O]
  • 62. 10/15/2001 Cross-sectional studies 62 etc. [Figure: population diagram with cases marked O]
  • 63. 10/15/2001 Cross-sectional studies 63 There are 1.6 x 10^60 possible populations (from no cases to all cases) [Figure: population diagram with cases marked O]
  • 64. 10/15/2001 Cross-sectional studies 64 Suppose this is the population (prevalence = 15%) [Figure: population diagram with cases marked O]
  • 65. 10/15/2001 Cross-sectional studies 65 Take a sample (n=10) [Figure: population diagram with cases marked O]
  • 66. 10/15/2001 Cross-sectional studies 66 The sample [Figure: the sampled individuals, 2 of them cases (O)]
  • 67. 10/15/2001 Cross-sectional studies 67 Make point estimate of prevalence [Figure: the sampled individuals, 2 of them cases (O)]
  • 68. 6/29/2005 Cross-sectional studies 68 Interval estimate • What are all the possible populations that would be expected to yield this prevalence in a sample of size 10?
  • 69. 10/15/2001 Cross-sectional studies 69 This one is not possible [Figure: candidate population with 1 case]
  • 70. 3/8/2006 Cross-sectional studies 70 Possible, but VERY UNLIKELY [Figure: candidate population with 2 cases]
  • 71. 3/8/2006 Cross-sectional studies 71 Not quite 2.5% probability (2.1%, in fact) [Figure: candidate population with 5 cases]
  • 72. 3/8/2006 Cross-sectional studies 72 Yields just about 2.5% (3%, actually) probability of selecting 2 (or more) cases in 10 [Figure: candidate population with 6 cases]
  • 73. 3/8/2006 Cross-sectional studies 73 One possible “true” value and how it would manifest, on average, in N identical studies [Figure: dot-plot distribution; labels: 95% of the distribution, True value]
  • 74. 3/8/2006 Cross-sectional studies 74 Just above 2.5% (actually 2.6%) probability of selecting 2 (or fewer) cases in 10 [Figure: candidate population with a large number of cases marked O]
  • 75. 3/8/2006 Cross-sectional studies 75 Just below 2.5% (actually 2.4%) probability of selecting 2 (or fewer) cases in 10 [Figure: candidate population with a large number of cases marked O]
  • 76. 3/8/2006 Cross-sectional studies 76 Interval estimate for 2/10 • Lower bound: 2.5% (5 cases) • Upper bound: 55% (110 cases) Meaning: Our sample of 10 with 2 cases provides evidence to exclude, at conventional error tolerance, populations with fewer than 5 cases or more than 110 cases. Populations with 5-110 cannot be excluded as likely sources for this sample.
  • 77. 3/8/2006 Cross-sectional studies 77 Interval estimate for 2/10 • Actual population prevalence was 15%, which in fact is between 2.5% and 55%. • 2.5% to 55% is a very wide interval, i.e., a very imprecise estimate • To make it more precise, we need a larger sample
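
The interval estimate on slides 68 to 77 can be explored by computing, for each candidate population, the probability of drawing a sample like the one observed. The sketch below is an illustration under stated assumptions: a population size of 200 is inferred from the slides (5 cases = 2.5%, 110 cases = 55%), sampling is assumed to be without replacement (hypergeometric), and the names `pmf`, `p_at_least`, and `p_at_most` are hypothetical.

```python
from math import comb

N = 200        # population size inferred from "5 cases = 2.5%" (an assumption)
n = 10         # sample size from the slides
observed = 2   # cases found in the sample


def pmf(k, cases):
    """Probability of exactly k cases in a sample of n drawn without
    replacement from a population of N that contains `cases` cases."""
    return comb(cases, k) * comb(N - cases, n - k) / comb(N, n)


def p_at_least(k_min, cases):
    """Probability of k_min or more cases in the sample."""
    return sum(pmf(k, cases) for k in range(k_min, min(n, cases) + 1))


def p_at_most(k_max, cases):
    """Probability of k_max or fewer cases in the sample."""
    return sum(pmf(k, cases) for k in range(0, k_max + 1))


# Candidate populations with few cases: is 2 or more in 10 plausible?
for cases in (1, 2, 5, 6):
    print(f"{cases:3d} cases: P(2 or more in sample) = {p_at_least(observed, cases):.4f}")

# Candidate populations with many cases: is 2 or fewer in 10 plausible?
for cases in (105, 110, 115):
    print(f"{cases:3d} cases: P(2 or fewer in sample) = {p_at_most(observed, cases):.4f}")
```

With these assumptions, a population with 1 case gives a probability of zero, and populations with 5 and 6 cases give about 2.1% and 3.0%, in line with slides 69, 71, and 72. The upper-tail values depend on the exact candidate populations shown on slides 74 and 75, which the transcript does not preserve.
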
  • 78. 78 Signs from around the world – Germany “A sign posted in Germany's Black Forest: It is strictly forbidden on our black forest camping site that people of different sex, for instance, men and women, live together in one tent unless they are married with each other for that purpose.”
  • 79. 79 Signs from around the world – Finland On the faucet in a Finnish washroom: “To stop the drip, turn cock to right.”

Editor's Notes

  1. Greetings, Buenos dias, Karibuni, Sawadee-hlap, Bienvenu, Wilkommen, Namaste, Ya-teh habeen, Xin chao, Merhaba, Salam, Kedu, Buna zjua, Yahshi mi siz, Unjani, Sdrasvuitsia, Anyeong haseyo This lecture covers cross-sectional studies, ecologic (aggregate-level) studies [lightly], and confidence intervals. But first, something to put ourselves in the mood for epidemiology.
  2. Whenever one writes something, there’s always the danger that it doesn’t come out quite the way one intended it. Here are what are reportedly real signs observed by Air France employees, courtesy of an Ann Landers column. A sign in a Copenhagen airline ticket office: “We take your bags and send them in all directions.”
  3. A sign in a Norwegian cocktail lounge: “Ladies are requested not to have children in the bar.”
  4. A sign in a Rome laundry: “Ladies, leave your clothes here and spend the afternoon having a good time.”
  5. “Cdnuolt blveiee taht I cluod aulaclty uesdnatnrd waht I was rdanieg. The phaonmneal pweor of the hmuan mnid, aoccdrnig to a rscheearch at Cmabrigde Uinervtisy. It dn'seot mttaer in waht oredr the ltteers in a wrod are, the olny iprmoatnt tihng is taht the frist and lsat ltteer be in the rghit pclae. The rset can be a taotl mses and you can sitll raed it wouthit a porbelm.” Gary C. Ramseyer's First Internet Gallery of Statistics Jokes, http://davidmlane.com/hyperstat/humorf.html (#162)
  6. Most of my friends could read this with understanding and rather quickly I might add. Then I had them read a statistical bit of literature: “Miittluvraae asilyans sattes an idtenossiy ctuoonr epilsle is the itternoiecsno of a panle pleralal to the xl-yapne and the sruacfe of a btiiarave nmarol dbttiisruein.” Gary C. Ramseyer's First Internet Gallery of Statistics Jokes, http://davidmlane.com/hyperstat/humorf.html (#162)
  7. Greetings, Buenos dias, Karibuni, Sawadee-hlap, Bienvenu, Wilkommen, Namaste, Ya-teh habeen, Xin chao, Merhaba, Salam, Kedu, Buna zjua, Yahshi mi siz, Unjani, Sdrasvuitsia, Anyeong haseyo This lecture covers cross-sectional studies, ecologic (aggregate-level) studies [lightly], and confidence intervals. But first, something to put ourselves in the mood for epidemiology.
  8. Since some of the most important cross-sectional studies are national surveys, I will include some explanation of how these are conducted, particularly their approach to sampling, which also affects data analysis.
  9. Cross-sectional studies are most familiar to us as surveys. In a cross-sectional study, people are studied at a “point” in time. Follow-up is not generally a part of the cross-sectional study, though sometimes a cross-sectional study serves as the baseline for a cohort study or intervention trial. Sometimes cross-sectional studies are carried out repeatedly in the same population(s), so that results can be compared across time and changes observed. However, the basic cross-sectional design is simply a single survey.
  10. The statistics on the slide illustrate findings obtained from national cross-sectional studies conducted by the U.S. Census Bureau and the National Center for Health Statistics. For example, the number of uninsured Americans is now over 50 million (source: USA Today, 9/17/2010, based on data from the Census Bureau). Data from the National Health and Nutrition Examination Survey (NHANES) show that almost one in five children older than 5 years was obese (Health, United States, 2010). According to the 2002 National Survey of Family Growth, 35%, or about 7.4 million, of births to U.S. women during the five years before the interview were mistimed or unwanted. [Source: www.cdc.gov/nchs/] Source for Health, United States, 2010: www.cdc.gov/nchs/data/hus/hus10_InBrief.pdf
  11. Note that incidence information is not available from a typical cross-sectional study, except as it can be reconstructed from historical information. An example of such a reconstruction is something called the “quit ratio” in the smoking cessation literature. The quit ratio is the incidence proportion of quitting smoking. It is calculated as the ratio of ex-smokers to ever-smokers. For the quit ratio to represent the actual incidence proportion of quitting, mortality and migration before the age at which it is measured must be minimal or unrelated to smoking status, and smoking initiation must be rare in the relevant age range. In order to construct incidence from cross-sectional data, it must be possible to identify a cohort at the beginning of the period during which the events occurred and be able to trace it forward in time. That is typically not possible when the population under study may have experienced significant migration and/or death.
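
The quit ratio described in note 11 is simple arithmetic, sketched here for concreteness. The function name `quit_ratio` and the counts are hypothetical, chosen only to illustrate the calculation; ever-smokers are taken to be ex-smokers plus current smokers.

```python
def quit_ratio(ex_smokers, current_smokers):
    """Ratio of ex-smokers to ever-smokers in a cross-sectional sample.

    Ever-smokers = ex-smokers + current smokers.
    """
    ever_smokers = ex_smokers + current_smokers
    if ever_smokers == 0:
        raise ValueError("no ever-smokers in the sample")
    return ex_smokers / ever_smokers


# Hypothetical survey counts, for illustration only
ratio = quit_ratio(ex_smokers=300, current_smokers=200)
print(f"quit ratio: {ratio:.2f}")
```

Whether this ratio can be read as the incidence proportion of quitting depends on the conditions listed in the note (minimal or smoking-unrelated mortality and migration, and little smoking initiation in the age range).
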
  12. Cross-sectional studies measure prevalence of health conditions, exposures, and other characteristics of the population. They provide a “snapshot” or “still life portrait”. The cross-sectional design can be used to measure any factor that can be reported by respondents or assayed noninvasively and that does not require follow-up to assess. Cross-sectional studies are the source of most of what we know about the population other than vital statistics.
  13. A type of cross-sectional study, not usually described as such, is a population census, where an entire population is enumerated. A census provides denominator data for many purposes, such as estimation of birth and mortality rates, assessing the generalizability of studies conducted in subpopulations, and projecting the results from smaller studies. Population censuses involve a huge effort. People can be difficult to find and to count, and may not want to provide data. The 2000 census used a short form, pictured here, for most people and a longer version (46 additional questions) for an approximately 1-in-6 sample. (See http://www.census.gov/dmd/www/pdf/d-61b.pdf for the long form.) In order to separate the task of enumerating the population from the effort to gather data on them, after 2000 the Census Bureau replaced the long form with a continuous survey of about 3 million households, called the American Community Survey (http://www.census.gov/acs/www/), so that for Census 2010 only the short form was used. Some European countries, particularly the Scandinavian nations, maintain population registries which include demographic, household, health, and other information, making them incredibly useful sources of information for research.
  14. Although the American Community Survey is larger, most U.S. national data on health topics come from national surveys carried out by the National Center for Health Statistics (NCHS), a division of the Centers for Disease Control and Prevention (CDC) (www.cdc.gov/nchs/) and other organizations. NCHS conducts a number of different series of national surveys, including the National Health Interview Survey (NHIS), the National Health and Nutrition Examination Survey (NHANES), the National Survey of Family Growth (NSFG), the National Health Care Survey (NHCS), and the National Immunization Survey (NIS). NHIS and NSFG are household interview surveys. NIS is a random-digit dial telephone interview survey. NHANES includes both interviews and physical examinations, using a mobile clinic. By conducting surveys with the same methodology on a periodic basis, detection and analysis of trends over time are possible. In addition, in recent decades NCHS has linked vital statistics data to records of people who have been respondents in some national surveys, creating large-scale mortality cohort studies. NCHS has also linked survey respondents’ data to information from the U.S. Census, to enable multilevel studies examining relations between individual-level factors and small area characteristics. http://www.cdc.gov/nchs/surveys.htm
  15. What distinguishes a national survey, of course, is the intent to provide data that are statistically representative of the entire country. If an estimate, such as the percentage of children not covered by health insurance, is statistically representative, then the number of people with that characteristic (lack of insurance) can be estimated by combining that estimate with the number of people enumerated in the national census. Surveys can be carried out by one or multiple modes, particularly household interviews, telephone, or mail. To avoid the prohibitive expense of sending interviewers to all parts of the nation, household surveys are conducted using multi-stage cluster sampling techniques, which increase their conceptual and analytic complexity. However, any large study is logistically challenging to conduct successfully. Obtaining a representative sample of a large, diverse, dispersed population is an expensive proposition, so complex sampling designs are employed to optimize the trade-off between cost and informativeness. For example, subsets of the population may be oversampled as a means of obtaining sufficient information on important subgroups without having to study an unnecessarily large number of the rest of the population.
  16. The largest and most frequently conducted national survey in the U.S. is the NHIS, which NCHS conducts every year. The NHIS is a “stratified, multistaged, household survey that covers the civilian noninstitutionalized population of the United States”. I explain these terms in the next several slides. The survey includes information on households (plus non-institutional group quarters, such as college dormitories), on families, and on a “sample person” in each household. The NHIS is redesigned every decade so that its sampling plan can take advantage of data from the new census.
  17. A population sample selected by drawing people at random from a list of everyone in the country is called a “simple random sample”. Everyone in the population has an equal, known probability of being in the sample. However, surveying such a sample by any method other than mail is prohibitively expensive. Household surveys, therefore, use cluster sampling techniques so that interviews can be carried out in a smaller number of geographical locations. After the geographical locations are selected (the first stage of sampling), then subdivisions of those locations are sampled, and finally individuals are selected (the final stage of sampling). While it makes national surveys more feasible and less costly, such multistaged sampling procedures may entail some loss in the precision of the estimates they produce. However, that loss of precision may be more than made up by the ability to survey a larger sample due to lower interview costs. A multistage sample begins by dividing the country into Primary Sampling Units (PSU’s). In the NHIS, PSU’s are metropolitan statistical areas, large counties, or groups of counties.
  18. Within each PSU selected into the sample, a sample of secondary sampling units, SSU’s, is selected. In the NHIS, SSU’s are usually census block groups. Each selected census block group (usually containing at least several hundred dwelling units) is mapped by driving around the area and/or examining building permits, to create a list of dwelling units in the block group. From this list, a cluster of 4-8 housing units dispersed evenly throughout the block is then selected. Interviewers then visit the sampled housing units to ask the ages and gender of each person who lives there and then, often, randomly select one eligible person for interview. Using the lists and maps, NCHS draws a new representative sample for each week’s interviews, so that the people who have been surveyed at any given time are approximately representative of the entire U.S. population, after the sampling design is taken into account, of course.
  19. Another important concept is that of a “stratified” sample. Stratification is the statistical approach that corresponds to the political strategy of “divide and conquer”. Stratification consists of dividing the population to be sampled along some lines (usually geographic or demographic variables such as gender and age). Sampling is then carried out independently within each category, or “stratum”. Stratified sampling can increase precision of estimates, because it eliminates a source of random variability, by fixing the proportions of one or more variables. Stratified sampling can also increase precision in another way.
  20. In order to appreciate another way that stratified sampling can improve the precision of estimates, we need to understand something about the relationship between precision and sample size. Suppose that you need estimates of the proportion of people who engage in vigorous physical activity on a regular basis. Let’s say that you were to collect data using a simple random sample and you obtained a point estimate of 0.25, or 25%. The confidence interval is an indicator of precision of an estimate, and in this case the 95% confidence interval would be (0.17,0.33), or 17% to 33%, if the sample size had been 100. It would be (0.21,0.29) for a sample size of 400; (0.22,0.28) for a sample size of 900, and (0.23,0.27) for a sample size of 1,600. These results are shown in the table on the slide. The precision of an estimate is sometimes quantified as the width of the 95% confidence interval (if the confidence interval was computed using a logarithmic transformation, as is done for ratio measures such as the prevalence ratio, then the confidence interval width is measured by the confidence limit ratio, CLR, the ratio of the upper to the lower limit). The widths for the four intervals on the slide are 0.16*, 0.08, 0.06, and 0.04. As you can see, there is a law of diminishing returns, in that adding 300 people to move from 100 to 400 has a much stronger impact on the width of the confidence interval than does adding 700 people to move from 900 to 1,600. * The width is 0.17 if calculated without rounding the confidence limits.
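
The intervals quoted in note 20 can be reproduced with the usual normal-approximation formula for a proportion, p ± 1.96 × sqrt(p(1 − p)/n). That the slide's table was computed this way is an assumption, although the numbers agree; `proportion_ci` is a hypothetical helper name.

```python
import math


def proportion_ci(p, n, z=1.96):
    """Normal-approximation confidence interval for a proportion p
    estimated from a simple random sample of size n."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


intervals = {}
for n in (100, 400, 900, 1600):
    lo, hi = proportion_ci(0.25, n)
    intervals[n] = (lo, hi)
    print(f"n={n:5d}  95% CI=({lo:.2f}, {hi:.2f})  width={hi - lo:.2f}")
```

Because the width is proportional to 1/sqrt(n), quadrupling the sample size from 100 to 400 halves the width, while going from 900 to 1,600 shrinks it by only a quarter. This is the diminishing return described in the note.
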
  21. If independent samples are drawn within each stratum, it is possible to sample some strata more intensively and others less so. In this way we can obtain greater precision within specific subgroups without paying for more precision than we need for other groups. Such weighted sampling is used to achieve greater efficiency in meeting the survey objectives. For example, suppose that in your survey of physical activity you need to have estimates for age groups 20 years-39 years old, 40-59 years old, and 60-69 years old. In a simple random sample (SRS), which is unweighted, you will have fewer than half as many people in the 60-69 year-old stratum as in the 40-59 year-old stratum, since the older stratum covers only 10 birth years and will also have experienced higher mortality (not counting changes in the size of the birth cohort, such as the baby boom). So, in order to obtain at least 400 people ages 60-69 years old it would be necessary to have around 900 people who are 40-59 years old. A more efficient solution, yielding almost the same precision, is to oversample people aged 60-69 years old, so that there are twice as many of them in the sample as their proportion in the population. In this way you can improve the precision of your estimate for the 60-69 year-olds and obtain similar precision for the other groups without having to increase your overall sample size as much as if you did not choose a stratified sample. You can even obtain a more precise estimate for the whole population, while saving money.
  22. Besides sampling PSU’s within strata (e.g., metropolitan PSU’s, nonmetropolitan PSU’s), the NHIS also samples census blocks within strata, oversampling some strata.
  23. Most epidemiologic studies are easier to interpret when they are carried out in a “defined population”, a population that has some existence outside the study itself. (If I dared to coin a term, I would call that a “pre-defined population”.) Defined populations are especially advantageous for cross-sectional studies, which are often used to make population estimates. Defined populations are typically some political unit, such as a country or state, or an institution, such as an HMO, employer, or profession. The use of probability sampling then enables the results to be statistically generalized to the known population.
  24. An example of a small cross-sectional study that was not conducted in a defined population, but which I had a hand in, is an HIV seroprevalence survey in three county sexually transmitted disease clinics in central NC in 1988. At that time in the HIV epidemic there was genuine uncertainty about how much HIV had disseminated beyond the “epicenters” of NYC, San Francisco, and several other large cities. STD patients are a “sentinel population”, since they should be one of the first places that HIV will show up when it arrives in an area. The survey was carried out by analyzing leftover sera from syphilis tests for about 3,000 patients. Anonymous questionnaires were linked to the serum specimens and then all remaining identifying information was removed. [Citation: Schoenbach VJ, Landis SE, Weber DJ, Mittal M, Koch GG, Levine PH. HIV seroprevalence in sexually transmitted disease clients in a low-prevalence southern state. Ann Epidemiol 1993;3:281-288]
  25. The survey found an overall HIV seroprevalence of 2.5%. The seroprevalence ranged from 0.6% among female patients to 1.6% among men who denied homosexual activity to 46% among men whose sexual partners were exclusively men.
  26. Because of the dramatically different seroprevalences we examined risk factor information from the questionnaire separately for each demographic or sexual preference group. HIV seroprevalence was higher in patients who had syphilis or reported a history of syphilis, who reported a history of gonorrhea, who had had anal intercourse or (among heterosexual men) who had paid for sex.
  27. Interpretation of data from a cross-sectional study such as this one must keep in mind a number of considerations. First, cross-sectional studies provide data on prevalence, not incidence. If incidence is our real interest, as it generally is for etiologic research, prevalence may not be a good surrogate. An important problem with prevalence data is that cross-sectional studies include only “survivors” and “stayers”. Rapidly fatal conditions will be greatly underrepresented in a cross-sectional study, compared to the total number of people who are affected during a given time interval. Conditions and characteristics associated with outmigration will also be underrepresented. In addition, for associations where causation is a possibility, it may be difficult to determine whether the “cause” preceded the “effect”. With the HIV seroprevalence survey there is the additional uncertainty about what population to apply the results to. Since a major objective in this instance was to find out if heterosexually-acquired HIV was present in NC, generalizability was less of a concern. Even so there was naturally interest in knowing whether these results might be mirrored in other STD clinics in the state, both in terms of the seroprevalence figures and also the associations, as well as whether these results would be similar in other subpopulations, such as injection drug users.
  28. A couple of other points about cross-sectional studies: the study population can be chosen by exposure. For example, workers in a particular industry or job title can be chosen and compared to a group of people regarded as “unexposed”. Or, the study population can be chosen without regard to exposure. Similarly, the study population can be chosen by disease status: people who are bothered by frequent headaches, for example, and people without that condition. In this case it may be difficult or impossible to differentiate between a cross-sectional study and a case-control study with prevalent cases (as distinct from newly-reported cases).
  29. Now we turn to the topic of “ecologic” studies, which are studies where the unit of observation is some group of people rather than the person.
  30. Although ecologic studies are often thought of as only cross-sectional in design, in fact any study design can be carried out either at the individual level or the group level. So it is possible to conduct a case-control study where communities with some condition (for example, increases in crime rates) are compared to communities without that condition. Or communities can be followed over time, either after an intervention or without one. Because of the availability of population statistics, group level studies are sometimes easier and less costly to conduct than individual-level studies, which is one of the major motivations for conducting them. However, group-level studies can also be quite expensive, even more expensive than individual-level studies. Group-level intervention studies, for example, may be quite costly.
  31. The variables studied in group-level studies have been classified by some authors into two categories. One category consists of summary statistics of individual-level variables. For example, each household has an income and each person has or does not have a high-school diploma. So median household income and percentage of the subgroup with a high school diploma, both of which measures are available in Census data, are summaries of individual-level variables. Other variables may be properties of the aggregate, with no obvious individual-level counterpart. Examples are measures such as the availability of a neighborhood grocery store selling fresh fruits and vegetables, the presence of legislation, and characteristics of the community infrastructure or functioning.
  32. A principal concern in interpreting data from ecological studies is that the link between a summary exposure variable (e.g., number of liquor stores per capita) and an individual-level outcome (e.g., motor vehicle crashes) must be inferred. For this reason, inference to the individual level of findings observed at the group level may not be valid.
  33. An ecologic study of 37 African countries computed a correlation between the national prevalence of circumcision and the seroprevalence of HIV in the capital city of each country. The result was a surprisingly strong correlation coefficient of -0.9 (-1.0 is a perfect, negative correlation). In the map of Africa, the blackened countries represent countries where men are predominantly uncircumcised, the gray represent countries where men are predominantly circumcised, and the white are not included in the sample. In the figure on the right, the slope of the line indicates the epidemiologic strength of association; the correlation coefficient (r) indicates the extent to which the data points lie on a straight line. Note that the slope of the line is heavily influenced by a small number of data points.
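The distinction between the slope and the correlation coefficient can be sketched in a few lines of Python. The numbers below are hypothetical and are not the data from the circumcision and HIV study; the function name `slope_and_r` is also just an illustration. Both made-up datasets lie exactly on a straight line, so both have r = -1, yet their slopes differ tenfold.

```python
from math import sqrt

def slope_and_r(xs, ys):
    """Return (least-squares slope, Pearson correlation r) for paired values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / sqrt(sxx * syy)

# Hypothetical group-level exposure prevalences (%), one value per group.
x = [10, 30, 50, 70, 90]
steep = [20 - 0.2 * v for v in x]     # outcome falls 0.2 points per point of x
shallow = [20 - 0.02 * v for v in x]  # outcome falls 0.02 points per point of x

print(slope_and_r(x, steep))    # strong association, points on a line
print(slope_and_r(x, shallow))  # weak association, points still on a line
```

The slope answers "how much does the outcome change per unit of exposure", while r only answers "how tightly do the points follow a line".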
  34. Now we turn to the topic of confidence intervals.
  35. What is a confidence interval and what does it do? A confidence interval is a plausible range of values for a quantity being estimated, based on a sample of data. The width of the confidence interval indicates the precision of the estimate for a given level of “confidence”. The narrower the interval, the more precise the estimate. It is important to note that the width of the confidence interval quantifies only random error from sampling variation, not systematic error from nonresponse, study design, and any other nonrandom source. For example, the mean of a random sample serves as an estimate of the mean of the population from which it was drawn. If the mean of a random sample is 5.3, then our best estimate of the mean value in the population is also 5.3. It is very unlikely that it is exactly 5.3, however, due to variability from sampling. In order to quantify the effect of sampling variability, we construct a confidence interval for the estimate. We might think of the confidence interval as giving us some “wiggle room” so that we can state our estimate as a range, rather than as a single value. The confidence interval for a mean or proportion is constructed as the point estimate ± the amount of sampling variability expected from 1) the estimated variability in the population itself, 2) the size of the sample, and 3) the level of confidence we seek. The distance between the two confidence limits can serve as an estimate of precision.
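As a minimal sketch of that construction, the following Python computes a confidence interval for a mean as the point estimate plus or minus a multiple of the standard error. The sample values are hypothetical (chosen so that the mean is 5.3), the function name is an assumption, and the normal approximation is used for simplicity; with a sample this small a t multiplier would normally be preferred.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(values, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    n = len(values)
    point = mean(values)                             # point estimate
    se = stdev(values) / sqrt(n)                     # variability and sample size
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # level of confidence
    return point - z * se, point + z * se

# Hypothetical sample with mean 5.3.
sample = [4.1, 6.0, 5.2, 5.9, 4.8, 5.5, 5.6, 5.3]

low, high = mean_confidence_interval(sample)
print(f"point estimate {mean(sample):.2f}, 95% CI {low:.2f} to {high:.2f}")
```

The three ingredients named in the note appear as `stdev(values)`, `n`, and `z`.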
  36. While the correct interpretation of confidence intervals is a subtle business and somewhat controversial, there are some basic mechanical aspects that everyone should understand. The first of these is the tradeoff between precision and confidence level. This tradeoff can be illustrated in a general sense with a familiar experience: when we want to be really confident that we are not mistaken, we hedge our statements, we “waffle”. So, I can be 100% confident that the prevalence of HIV is between 0 and 100% because it has to be. The 100% confidence is reassuring, but the prevalence estimate is totally useless. If I am willing to be less confident in being correct, then I can make a more precise estimate. Taking that to the extreme, I can estimate the prevalence of HIV as 5.23% (on the basis of no data) but I have zero confidence that that estimate is correct. More seriously, we set a level of confidence (conventionally 95%) and then compute a confidence interval. For any given estimate, if we want a higher level of confidence, e.g., 99%, the interval will be wider. If we want a narrower interval, we can reduce the confidence level or increase the number of observations on which the estimate is based.
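The tradeoff can be made concrete with a small sketch. Assuming a hypothetical population standard deviation of 2.0 and a normal approximation, the half-width of the interval grows with the confidence level and shrinks with the square root of the sample size. The names `half_width` and `SD` are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def half_width(sd, n, confidence):
    """Half-width of a normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sd / sqrt(n)

SD = 2.0  # hypothetical standard deviation, for illustration only

for n in (100, 400):
    for confidence in (0.90, 0.95, 0.99):
        print(n, confidence, round(half_width(SD, n, confidence), 3))
```

Going from 95% to 99% confidence widens the interval, and quadrupling the sample size halves it.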
  37. There is also a tradeoff between the simplicity of interpretation of the meaning of a confidence interval and the precision of that interpretation. Simple interpretations are typically imprecise. Precise interpretations are typically not simple.
  38. A simple interpretation that is imprecise is that “there is 95% confidence that the interval contains the true value”. Although true, this interpretation begs the question, since we then need to know the meaning of “confidence”.
  39. Another example of a simple but imprecise interpretation is that “there is 95% probability that the interval contains the true value.” This statement is not quite correct, at least not according to the frequentist school of statistics, the one from which most of us learn statistics. In the frequentist school, probability applies to a repeatable process but not to a single instance. The statement that the interval will contain the true value with 95% probability is true for the procedure one uses to create a 95% confidence interval. The procedure is designed to do just that. But once we have a specific interval, then at least for the frequentist, there is no way to define probability for that specific interval and that true value.
  40. Let’s create a scenario to illustrate why interpreting a single confidence interval as a probability statement can be problematic. A confidence interval can be viewed as a measurement, or estimation, process which gives the correct result (that is, the interval includes the true parameter) 95% of the time and an incorrect result (that is, the interval does not include the true parameter) 5% of the time. Suppose we make up another estimation process that will be correct about 95% of the time.
  41. The estimation process I propose is for estimating your gender. Of course, you could tell just by looking or from memory, but suppose you want to use a method that is accurate only about 95% of the time. Flip a coin 5 times in a row. If you don’t get heads every time, that is, if tails comes up at least once, then declare your gender to be what you usually regard it as. But if the coin comes up as heads every time, a rather rare occurrence, then declare your gender to be the opposite of what you know it to be. The probability of obtaining 5 heads on 5 independent coin flips is about 3%, so this estimation process will be correct about 97% of the time and incorrect about 3% of the time.
  42. So we now have a measurement process with an even higher level of confidence than a 95% confidence interval! Let’s try it out: flip a coin 5 times in a row. For most of you, most of the time, there will be fewer than 5 heads. But for several of you, the coin will come up as heads on all five flips. So now you have an estimate that comes from a procedure that 97% of the time produces the correct result, and indeed, about 97 in 100 of the measurements you produce this way will come out right. But if you happen to be one of the people who got 5 heads in a row, would you be willing to say there is “a 97% probability that the estimate includes the true value”?
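The coin-flip procedure is easy to check numerically. The sketch below computes the exact chance of five heads in five fair flips (1/32, or 3.125%, which the notes round to about 3%) and then simulates the procedure many times. The function names, the seed, and the number of trials are arbitrary choices for illustration.

```python
import random

# Exact probability that five independent fair flips are all heads.
P_ALL_HEADS = 0.5 ** 5  # 1/32 = 0.03125

def procedure_is_correct(rng):
    """Run the procedure once: it is wrong only when all five flips are heads."""
    flips = [rng.random() < 0.5 for _ in range(5)]  # True means heads
    return not all(flips)

def simulate(trials=100_000, seed=0):
    """Fraction of runs in which the procedure gives the correct answer."""
    rng = random.Random(seed)
    correct = sum(procedure_is_correct(rng) for _ in range(trials))
    return correct / trials

print("exact probability of a wrong answer:", P_ALL_HEADS)
print("simulated proportion correct:", simulate())
```

The long-run proportion correct describes the procedure. Any single run is simply right or wrong, which is the point of the example.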
  43. We can formulate a more precise definition of a confidence interval, but it is not simple. A 95% confidence interval is an interval (a set of values) obtained from a procedure that will include the population parameter being estimated 95% of the time (and, therefore, it will fail to include it 5% of the time). Another precise, but not simple, definition is that a 95% confidence interval is the set of all population values which have at least 2.5% probability of yielding a sample at or beyond the point estimate. Why 2.5%? We seek a 95% probability that the interval will include the true value, leaving a 5% probability of failing to include the true value. If we divide that 5% between the times we will overestimate the true value and the times we will underestimate it, then we have 2.5% probability for either side of the point estimate. Let’s see this latter formulation in a diagram.
  44. Suppose that this line (in pink, if you have color) represents the value of the population parameter we are trying to estimate. For example, suppose we are trying to estimate the blood lead level in children in a certain city, and the line represents the true value, although we do not know what it is.
  45. We do know that if we were to take identical, independent samples of a certain size from that population, the means for those samples would be distributed around the true value. The larger the samples, the narrower the distribution. Since we know this, we do not expect to get the exact true value in any particular sample of reasonable size. The slide here shows the line with the true value, and then it shows little circles on each side of the slide in what is meant to be the shape of a normal distribution. Each of those circles represents the estimate from a single study, let us say the mean of a single study.
  46. We can take the central 95% of the distribution of sample means and say that these are all sample results we would “expect” to see if we selected a single sample of that same size. The samples at the extremes of the distribution, intended to represent the extreme upper and lower 2.5% of the distribution, we might regard as “unexpected”. If someone tells us that the true value was X, and we obtained a value within the 95% distribution for X in our study, we would say that our data were consistent with the estimate we had been told. On the other hand, if our sample yielded an estimate at the lower or upper extreme, we might doubt the accuracy of the estimate we had been told or we might worry that our study was biased. Essentially, our data are not consistent with the value we were given. It could nevertheless be possible that the estimate we were given is correct and our study was not biased; we were just “unlucky” in terms of our sample.
  47. The previous slides describe the sampling process. Now let’s suppose that we have drawn a sample and calculated the mean lead level. The result, which is our estimate of the population mean, is represented by this line (in blue if you can see blue).
  48. We realize that our sample mean is unlikely to coincide exactly with the population mean, since we anticipate that there will be sampling variability. So with which population means is our estimate compatible? In principle, the population parameter can have (almost) any value, since in rare instances a sample could differ greatly from the population from which it was taken. But as a practical matter we expect samples not to differ too greatly from the populations from which they are taken, which is why we do sample surveys. If the circles in the diagram represent the sampling distribution for our estimate of the population mean, the diagram suggests that there is less than 2.5% probability that the population value represented by the pink line would produce a sample whose mean is as far away as the blue line in the center. We would conventionally regard that possible population value as “incompatible” with our sample data. Conversely, our sample data are “incompatible” with that particular true population value.
  49. However, this next population value (represented by the pink line again) although quite far from our estimate, is not so far that we are comfortable declaring it incompatible. There is a 2.5% or greater probability that if that is the true value we would draw a sample whose mean is as far or farther to the right than the estimate we obtained.
  50. Now, looking to the right of our estimate, here is a possible population value that is just barely compatible with our estimate, since if it were indeed the true value, then the probability of obtaining a sample whose mean was as far or farther to the left than the estimate we have (the blue line) is just 2.5%.
  51. And just to the right of that possible population value is one that would yield a sample such as ours (or with a mean further to the left) less than 2.5% of the time.
  52. The 95% confidence interval consists of all of those possible population values we just considered that we were unwilling to reject as “incompatible” with the point estimate from our sample. If the circles represent the sampling distributions of samples from populations with the four possible parameters we just considered, then we see there is just less than 2.5% probability of getting a sample such as ours from one of the population values whose distribution is represented by red circles in the slide and just 2.5% or greater probability of getting a sample such as ours from one of the population values whose distribution is represented by blue circles. So the population values whose sampling distributions are indicated by blue circles, and all population values that lie between them, form our 95% confidence interval. If we form our 95% confidence interval in this way, then there is a 95% probability that the interval will cover the actual population value. That is, the procedure “works” 95% of the time, and does not “work” 5% of the time. As before, once an interval has been constructed, it either includes the actual population value or it does not, just as when you estimated your gender your estimate was either correct or not. Once you had estimated it, there was no more probability about it.
  53. Here is the same diagram showing a few more of the possible population values whose distributions (represented by black circles) are compatible with the sample we observed.
  54. Another way to think about what we have done is the following. We know that if the true value is the pink line, then repeated samples of a given size will have means distributed around the true mean, and that 95% of those sample means will be no more than 1.96 standard errors from the population mean. Conversely, if 95% of the sample means will be within 1.96 standard errors from the population mean, then we can take any sample mean we get and know that 95% of the time the distance to the true population mean will be no greater than 1.96 standard errors in either direction. So the procedure of taking the estimate and going 1.96 standard errors to the left and to the right will give us an interval that should include the population mean 95% of the time and will not include it 5% of the time. Once we have created the interval, however, we don’t know whether this interval is one of the 95% that includes the true value or one of the 5% that does not. This interpretation is pretty simple, and quite precise.
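This "works 95% of the time" property can be checked by simulation. The sketch below draws many samples from a normal population with a known mean, builds the interval as the sample mean plus or minus 1.96 standard errors, and counts how often the interval covers the true mean. The population mean, standard deviation, sample size, and seed are all hypothetical, and the standard error is computed from the known population standard deviation to keep the example simple.

```python
import random
from math import sqrt

def coverage(true_mean=10.0, sd=3.0, n=50, repetitions=4000, seed=1):
    """Proportion of 1.96-standard-error intervals that cover the true mean."""
    rng = random.Random(seed)
    se = sd / sqrt(n)  # standard error of the mean, using the known sd
    hits = 0
    for _ in range(repetitions):
        sample_mean = sum(rng.gauss(true_mean, sd) for _ in range(n)) / n
        if sample_mean - 1.96 * se <= true_mean <= sample_mean + 1.96 * se:
            hits += 1
    return hits / repetitions

print("proportion of intervals covering the true mean:", coverage())
```

The printed proportion should be close to 0.95. For any one of the simulated intervals, though, the true mean is either inside it or not.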
  55. The previous example illustrates the meaning of a confidence interval for a population value that we measure on a continuous scale. Here is an example based on cases of a disease for which we are estimating prevalence or incidence, our familiar population of little people.
  56. Here is a possible population of 200 little people, with one case.
  57. Here is another possible population of 200 people, also with one case, but a different person.
  58. And another possible population with one case, yet a different person.
  59. And a fourth possible population.
  60. A fifth possible population.
  61. A sixth possible population, this one with four cases.
  62. And another population with four cases. Although we do not have time to look at them all, we can imagine a very large (but finite) number of populations of 200 people, with numbers of cases varying from 0 to 200.
  63. There are actually about 1.6 × 10^60 possible populations with some number of cases. That is a very large number!
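One way to arrive at a count of that size, assuming each of the 200 people is either a case or a non-case and every distinct pattern counts as a separate population, is 2 to the power 200. This reading is an assumption about how the figure was obtained; the arithmetic itself is easy to check in Python.

```python
# Each of 200 people is a case or a non-case, so there are 2**200 patterns.
n_people = 200
n_populations = 2 ** n_people

print(n_populations)            # the exact integer
print(f"{n_populations:.3e}")   # the same number in scientific notation
```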
  64. Suppose that this is the population in which we are actually conducting our study. The true prevalence of the disease is 15%. Count the cases and make sure that you agree with that prevalence.
  65. Suppose we take a random sample of 10 people (indicated by the small rectangles) from that population.
  66. Here is that sample, without the rest of the population shown on the slide.
  67. From this sample, we can make a point estimate of the prevalence of the disease. In this case there are two cases in our sample of 10. The prevalence in our sample is therefore 20%, and therefore our point estimate of the prevalence in the population is 20%.
  68. To form a 95% confidence interval around our sample estimate of 0.2, or 20%, we need to know what are all the possible populations that would be expected to yield this prevalence in a sample of size 10, where “expected to” means that there is at least 2.5% probability of observing that prevalence or one farther from the true population value.
  69. This population (prevalence 0.5%) is obviously not one of them, since it has only one case. So this one is not possible.
  70. This possible population (prevalence 1%) could conceivably yield a sample of 10 with 2 cases, but that event is very unlikely (0.226%), since we would have to have randomly selected both of the cases. [For those with a mathematical turn of mind and interest, the probability comes from the hypergeometric distribution. The probability of selecting a sample with 0 cases from this population is 90.226% and of a sample with 1 case 9.548%, leaving 0.226% for the probability of selecting a sample of size 10 with exactly 2 cases.]
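The hypergeometric probabilities quoted here can be reproduced with the standard library. This is a minimal sketch; the function name `hypergeom_pmf` and its argument names are chosen for illustration.

```python
from math import comb

def hypergeom_pmf(k, population, cases, sample):
    """P(exactly k cases in a simple random sample drawn without replacement)."""
    return (comb(cases, k) * comb(population - cases, sample - k)
            / comb(population, sample))

# Population of 200 with 2 cases (prevalence 1%), sample of size 10.
for k in (0, 1, 2):
    print(k, f"{100 * hypergeom_pmf(k, 200, 2, 10):.3f}%")

# Population of 200 with 5 cases (prevalence 2.5%): two or more cases in the sample.
p_two_or_more = 1 - hypergeom_pmf(0, 200, 5, 10) - hypergeom_pmf(1, 200, 5, 10)
print(f"{100 * p_two_or_more:.2f}%")
```

The first loop gives 90.226%, 9.548%, and 0.226%; the last line gives the 2.08% quoted for the population with prevalence 2.5%.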
  71. This population (prevalence 2.5%) is a more likely candidate. Here the probability is 2.08% that a sample of size 10 would contain two or more cases.
  72. This population, with 6 cases (prevalence 3%), would yield samples of size 10 with two or more cases about 3% of the time (probability 3.04%).
  73. What we are doing now is examining the probability with which populations having various prevalences could give rise to samples of size 10 with 2 cases. The graph on this slide reminds us that sample means (and proportions) vary about the population value, shown here by the pink line in the center.
  74. At the other end, a population with 109 cases (prevalence 54.5%) would yield samples of size ten with 2 or fewer cases only 2.6% of the time. [The hypergeometric distribution for selecting 2 or fewer cases in a sample of 10 from a population of 200 with 109 cases is 2.637%.]
  75. But this next population (110 cases, prevalence 55%) has so many cases that the probability of drawing a sample of size 10 with only 2 cases (or fewer) is less than 2.5%. [Probability of drawing a sample of 10 with 2 or fewer cases from a population of 200 with 110 cases is 2.441%.]
  76. So, unless my calculations are in error (which they have been in some earlier versions), the lower bound of our 95% confidence interval for our prevalence estimate of 0.2 with our sample of size 10 should lie between 2.5% and 3%, and the upper bound should lie between 54.5% and 55%. In order to ensure 95% coverage, we would set the confidence interval conservatively at (2.5%, 55%), which assures us that we have excluded only populations which have less than 2.5% probability of producing a sample on the other side of the confidence limit. This interval says that we cannot exclude, at traditional error tolerances, that a true population prevalence somewhere between 2.5% and 55% gave rise to the observed sample. [The 2.5% represents the population with 5 cases (5/200); the 55% represents the population with 110 cases (110/200).]
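The whole search can be sketched as a scan over every possible number of cases in the population of 200. For each candidate, the code asks whether a sample of 10 with 2 cases (or a result farther from the candidate prevalence) has at least 2.5% probability on both sides. The variable and function names are illustrative, and the 2.5% threshold is applied exactly as stated in the notes.

```python
from math import comb

N, n, observed = 200, 10, 2  # population size, sample size, cases in the sample

def pmf(k, cases):
    """Hypergeometric probability of exactly k cases in the sample."""
    return comb(cases, k) * comb(N - cases, n - k) / comb(N, n)

def upper_tail(cases):
    """P(sample has `observed` or more cases)."""
    return sum(pmf(k, cases) for k in range(observed, n + 1))

def lower_tail(cases):
    """P(sample has `observed` or fewer cases)."""
    return sum(pmf(k, cases) for k in range(0, observed + 1))

# Populations we cannot reject: both tail probabilities are at least 2.5%.
compatible = [c for c in range(N + 1)
              if upper_tail(c) >= 0.025 and lower_tail(c) >= 0.025]

print("fewest cases still compatible:", min(compatible), f"({min(compatible) / N:.1%})")
print("most cases still compatible:", max(compatible), f"({max(compatible) / N:.1%})")
```

The scan finds compatible populations from 6 cases (3%) to 109 cases (54.5%). The conservative limits in the notes, 5 cases (2.5%) and 110 cases (55%), are the first populations just outside that set on each side.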
  77. To summarize this example, the actual population prevalence was, by assumption, 15%, which is in fact between 2.5% and 55%. However, 2.5% to 55% is a very wide interval, that is, a very imprecise estimate. We probably knew that the prevalence was within that range even before we collected the sample. To make a more precise estimate, we would need to draw a larger sample.