3. Correlational studies are also called
ecological or aggregate studies.
This type of study uses population-level
data to examine the relationship between
exposure rates and disease rates.
4. The units of analysis in such a study are
populations or groups of people rather than
individuals.
That is, the focus is on comparing
populations or groups rather than individual patients
or participants.
5. Examples
Does the percentage of adults with multiple sclerosis
tend to be higher in countries farther from the
equator?
Does the rate of asthma tend to be higher in cities with
higher levels of air pollution?
Does the prevalence of diabetes tend to be higher
in populations with a higher prevalence of obesity?
7. Data Sources
At least one data source (if not more) that
contains comparable information about the
population characteristics of interest must be
identified.
Information about all the variables of interest
must be available for a suitable number of
populations, which can be grouped by place or
by time.
8. Examples of Populations
All Western European countries
The largest 25 metropolitan areas in the Arab
world
All sub-Saharan African countries
A random sample of survey areas in London
Historical data from past decades for one or
more place-based populations
9. Exposures and Outcomes
At least one characteristic of the populations
being examined is designated as an exposure
Exposures are often environmental measures likely to be fairly consistent across an
entire population
At least one characteristic is designated as an
outcome
10. Aggregate Data
Population characteristics are in the form of
aggregate (grouped) data, such as:
the proportion of each population with a particular characteristic
the average value of the variable in the population
11. Examples of Exposures
The percentage of adults older than 30 who have not completed
at least 12 years of education
The mean income in the population
The median age
The number of rainy days over a given year in the population
The average ultraviolet radiation index during midday in the
hottest month of the year
12. Examples of Diseases
The prevalence of obesity among adults
The mean BMI (body mass index) among adults
The annual mortality rate from asthma
13. Cautions
Correlational studies are valid only if the data
points are comparable.
A data point is a discrete unit of information. Generally, any single fact is a data point.
In a statistical or analytical context, a data point is usually derived from a
measurement or research and can be represented numerically and/or graphically.
In some populations, exposures and diseases may
be routinely undercounted or routinely over-
diagnosed compared to other populations.
14. Cautions
If multiple sources of data are used or if the data
were collected over a lengthy period of time,
then the definition of exposure or disease may
differ from one population to another and may
not be comparable.
15. Data Management Example
Data should be entered into a spreadsheet
Each population (A, B, C, etc.) is in its own row
Each exposure and each outcome is in its own
column
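The layout above can be sketched in Python with only the standard library. This is a minimal illustration: the population labels, column names, and all values are hypothetical placeholders, not data from any study.

```python
import csv
import io

# Hypothetical aggregate data: one row per population, one column per variable.
# All values are made-up placeholders for illustration only.
rows = [
    {"population": "A", "pct_low_education": 12.0, "mean_income": 41000, "obesity_prevalence": 0.21},
    {"population": "B", "pct_low_education": 25.5, "mean_income": 28000, "obesity_prevalence": 0.30},
    {"population": "C", "pct_low_education": 18.3, "mean_income": 35000, "obesity_prevalence": 0.26},
]


def to_csv(records):
    """Serialize the records as CSV text: a header row plus one row per population."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv(rows)
print(csv_text)
```

The resulting text can be saved to a file and opened in any spreadsheet program.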
16. Analysis: Correlation
On a scatterplot used to illustrate correlation, each point
represents one population in the study.
The exposure is plotted on the x-axis, and the outcome or
disease is plotted on the y-axis.
19. Analysis: Correlation
1. When all the points fall neatly in a line, then the
correlation is strong.
2. When the points are not exactly linear but a line for
trend can be drawn, then the correlation is mild or
moderate.
3. When the points appear to be randomly placed
and no obvious line can be drawn through them,
then the correlation is weak or nonexistent.
21. Analysis: Correlation
If higher levels of exposure are linked to higher rates of
disease, then the slope is positive.
If higher levels of exposure are linked to lower rates of
disease, then the slope is negative.
22. Analysis: Correlation
For continuous variables and other variables with
responses that can be plotted on a number line, a
Pearson correlation coefficient (r) should be used to
calculate the correlation.
For variables that assign a rank to responses or that have
ordered categories, use the Spearman rank-order
correlation (designated by the letter r or the Greek letter
ρ (rho) in most statistical programs).
23. Analysis: Correlation
The Pearson method is built on the idea that if
Measurement 1 tracks Measurement 2 (directly or
inversely), you can gauge how closely they are
linked by calculating Pearson's r, the
correlation coefficient. This quantity is
derived from the products of the differences
between each M1 value and the M1 average and
each M2 value and the M2 average.
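Written out, the standard formula for Pearson's r is shown below. The symbols are notation chosen here for illustration: x_i stands for Measurement 1 and y_i for Measurement 2 in population i, with x-bar and y-bar their averages across the n populations.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator is the sum of the products of the differences from the averages. The denominator rescales that sum so that r always falls between -1 and +1.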
24. Analysis: Correlation
Spearman's rank coefficient is similar to Pearson's in
producing a value from -1 to +1, but you would
use Spearman when the rank order of the data
is important in some way.
The Pearson test is more widely used.
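The two coefficients can be compared on a small example. The sketch below computes both from scratch in Python. The function names and the five data points are hypothetical, and in practice a statistics package would normally be used instead.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: sum of products of deviations from the means,
    divided by the square root of the product of the summed squared deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


def ranks(values):
    """Rank values from 1 (smallest), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            result[order[position]] = average_rank
        start = end + 1
    return result


def spearman_rho(xs, ys):
    """Spearman correlation: Pearson's r computed on the ranks of the data."""
    return pearson_r(ranks(xs), ranks(ys))


# Hypothetical exposure and outcome values for five populations.
exposure = [1.0, 2.0, 3.0, 4.0, 5.0]
outcome = [1.0, 4.0, 9.0, 16.0, 25.0]  # rises with exposure, but not in a straight line

r = pearson_r(exposure, outcome)
rho = spearman_rho(exposure, outcome)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Because the outcome always rises with the exposure, the rank orders agree perfectly and Spearman's coefficient is 1. Pearson's r is slightly below 1 because the points do not fall on a straight line.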
25. Analysis: Correlation
r = -1: all points lie perfectly on a line with a negative slope
r = 1: all points lie perfectly on a line with a positive slope
r = 0: no linear association between the exposure and outcome
r² shows how strong a correlation is without indicating the
direction of the association
26. Analysis
Use linear regression models when the goal is to:
compare more than two variables
understand the relationship between two variables
while controlling or adjusting for the effects of other
variables
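As an illustration of adjusting for another variable, the sketch below fits an ordinary least squares model with one exposure and one covariate in plain Python. The function names and data are hypothetical. The outcome is constructed from known coefficients so the fit can be checked, and a real analysis would normally use a statistics package that also reports standard errors.

```python
def solve_linear_system(matrix, vector):
    """Solve matrix * beta = vector by Gaussian elimination with partial pivoting."""
    n = len(vector)
    # Build an augmented copy so the inputs are not modified.
    aug = [list(row) + [value] for row, value in zip(matrix, vector)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution.
    beta = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * beta[k] for k in range(row + 1, n))
        beta[row] = (aug[row][n] - known) / aug[row][row]
    return beta


def fit_linear_regression(predictors, outcome):
    """Ordinary least squares via the normal equations (X'X) beta = X'y.
    Returns [intercept, coefficient_1, coefficient_2, ...]."""
    design = [[1.0] + list(row) for row in predictors]
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, outcome)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical data for six populations: (exposure, covariate) pairs.
predictors = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0), (6.0, 5.0)]
# Outcome constructed as 1 + 2*exposure + 3*covariate so the result is known.
outcome = [1.0 + 2.0 * x + 3.0 * z for x, z in predictors]

intercept, b_exposure, b_covariate = fit_linear_regression(predictors, outcome)
print(intercept, b_exposure, b_covariate)
```

The coefficient on the exposure estimates how the outcome changes with the exposure when the covariate is held fixed, which is what "controlling or adjusting for" another variable means here.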
27. Age Adjustment
When the populations being compared have
very different age structures, age adjustment
may be necessary to make a fair comparison
among populations.
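One common approach is direct standardization, in which each population's age-specific rates are weighted by the age structure of a shared standard population. The slides do not name a method, so the sketch below is an assumption for illustration. All rates and counts are hypothetical.

```python
def crude_rate(age_specific_rates, population_counts):
    """Overall rate: expected cases across all age groups divided by total people."""
    cases = sum(age_specific_rates[group] * population_counts[group] for group in population_counts)
    return cases / sum(population_counts.values())


def directly_standardized_rate(age_specific_rates, standard_counts):
    """Apply a population's age-specific rates to the standard population's age structure."""
    return crude_rate(age_specific_rates, standard_counts)


# Hypothetical data: both populations have identical age-specific disease rates...
rates_a = {"0-39": 0.001, "40-64": 0.005, "65+": 0.020}
rates_b = {"0-39": 0.001, "40-64": 0.005, "65+": 0.020}
# ...but population A is much younger than population B.
counts_a = {"0-39": 7000, "40-64": 2500, "65+": 500}
counts_b = {"0-39": 4000, "40-64": 3500, "65+": 2500}
# Hypothetical standard population used for both.
standard = {"0-39": 5000, "40-64": 3500, "65+": 1500}

crude_a = crude_rate(rates_a, counts_a)
crude_b = crude_rate(rates_b, counts_b)
adjusted_a = directly_standardized_rate(rates_a, standard)
adjusted_b = directly_standardized_rate(rates_b, standard)

print(f"Crude rates:    A = {crude_a:.5f}, B = {crude_b:.5f}")
print(f"Adjusted rates: A = {adjusted_a:.5f}, B = {adjusted_b:.5f}")
```

The crude rate in the older population B is more than double that in A even though the risk within every age group is the same. After adjustment the two rates are equal, so the crude difference reflects age structure alone.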
28. Avoiding the Ecological Fallacy
Correlational studies compare groups rather than
individuals.
No individual-level data are included in the analysis,
only population-level data.
The incorrect attribution of population-level
associations to individuals is called the ecological
fallacy.
29. Avoiding the Ecological Fallacy
Even though a population with a higher rate of
exposure has a higher rate of disease than
populations with lower exposure rates,
individuals in that population who have a
high level of exposure do not necessarily have
the disease.
30. Avoiding the Ecological Fallacy
The experience of an individual in a population
may vary significantly from the population
average.
It would be incorrect to assume that any one
individual from a country with a high average
body mass index (BMI) will be obese or that an
individual from a country with a low average BMI
will not be obese.
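A small numeric illustration of this point follows. The BMI values are hypothetical, and a real correlational study would have only the averages, not the individual values. The cutoff of BMI 30 or above is the usual adult definition of obesity.

```python
OBESITY_THRESHOLD = 30.0  # adult BMI of 30 or above is classified as obese

# Hypothetical individual BMI values for five adults in each country.
country_high = [23.0, 26.0, 31.0, 33.0, 37.0]
country_low = [19.0, 21.0, 22.0, 24.0, 34.0]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def count_obese(values):
    """Number of individuals at or above the obesity threshold."""
    return sum(1 for bmi in values if bmi >= OBESITY_THRESHOLD)


mean_high = mean(country_high)
mean_low = mean(country_low)

print(f"High-average country: mean BMI {mean_high}, obese {count_obese(country_high)} of 5")
print(f"Low-average country:  mean BMI {mean_low}, obese {count_obese(country_low)} of 5")
```

In this made-up example, two of the five people in the high-average country are not obese, and one person in the low-average country is.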
31. Avoiding the Ecological Fallacy
However, it is appropriate to identify trends in
populations and to use those observations to
generate hypotheses for individual-level studies
that will test for relationships between the
characteristics of interest in individuals.
34. Uses of Case Series
Describing the characteristics of and similarities
among a group of individuals with the same signs
and/or symptoms of disease
Identifying new syndromes and refining case
definitions.
Clarifying typical disease progression
Developing hypotheses for future research
35. Sample Size
Some case series for rare conditions may
require only a few participants
Other studies may include several hundred
individuals
36. Getting Started…
Select one disease or condition of interest
Determine what will be new and interesting about
the study
Identify an appropriate and available source of
cases
Establish a clear case definition that spells out
inclusion criteria and exclusion criteria.
37. Case Definitions
Specify characteristics related to:
The disease or procedure
ICD codes (International Classification of Diseases codes) are often used as
part of the definition
Person
Place
Time
39. Data Collection
Primary data: interviews of cases using a
questionnaire and/or qualitative techniques
Secondary data from patient charts (medical
records)
It is often helpful to create a questionnaire that guides the extraction of
information from medical records
Be aware that patient charts are often incomplete; missing information
about a symptom does not mean that the patient did not experience it
40. Most case series do not require
any advanced analyses or any
numbers beyond simple counts
and frequencies.
43. Overview
The goal of a cross-sectional survey, also
called a prevalence study, is to measure the
proportion of a population with a particular
exposure or disease at one point in time based
on a representative sample of a population.
44. Cross-sectional surveys are among the
most popular study approaches in the
health sciences because they allow for
the relatively rapid collection of new
data.
45. Uses
Cross-sectional surveys are used to:
Describe communities
Assess population needs
Evaluate programs
Establish baseline data prior to the initiation
of longitudinal studies
46. Representative Populations
Cross-sectional studies use a simple study design:
The researcher asks a few hundred people to
complete a short questionnaire and then analyzes the
data.
However, there is one very important requirement: the
participants must be reasonably representative of
some larger population.
47. Representative Populations
If a researcher wants the results of a survey to be
generalizable to all town residents, it is NOT acceptable to
use a convenience population such as:
Friends
Fans attending a football game
Shoppers at a store at a given time on a chosen day
Individuals attending a clinic
Pupils attending a neighbourhood school
48. Representative Populations
If the results of a cross-sectional survey are
intended to reflect the profile of an entire
town (or other population group), then the
study’s sampling strategy must recruit a
population that is as diverse as the town.
49. Analysis: Prevalence
Prevalence = the proportion of the population with a given trait at
the time of the survey
Analysis: Comparative Statistics
Prevalence rate ratios (PRRs) compare the prevalence of a characteristic in
two population subgroups by taking the ratio of their prevalence rates
Note: An exposure can be said to be “associated” or “related” to a disease,
but a cross-sectional survey cannot show that an exposure caused a
disease.
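Both calculations are simple arithmetic, as the sketch below shows. The function names and survey counts are hypothetical.

```python
def prevalence(cases, sample_size):
    """Proportion of the sample with the trait at the time of the survey."""
    return cases / sample_size


def prevalence_rate_ratio(cases_a, n_a, cases_b, n_b):
    """Prevalence in subgroup A divided by prevalence in subgroup B."""
    return prevalence(cases_a, n_a) / prevalence(cases_b, n_b)


# Hypothetical survey: 800 respondents, split into exposed and unexposed subgroups.
exposed_cases, exposed_n = 30, 200
unexposed_cases, unexposed_n = 45, 600

overall = prevalence(exposed_cases + unexposed_cases, exposed_n + unexposed_n)
prev_exposed = prevalence(exposed_cases, exposed_n)
prev_unexposed = prevalence(unexposed_cases, unexposed_n)
prr = prevalence_rate_ratio(exposed_cases, exposed_n, unexposed_cases, unexposed_n)

print(f"Overall prevalence: {overall:.4f}")
print(f"Exposed: {prev_exposed:.3f}, unexposed: {prev_unexposed:.3f}, PRR: {prr:.2f}")
```

A PRR above 1 means the trait is more common in the first subgroup. It describes an association at the time of the survey and does not show causation.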
51. PHC215
By Dr. Khaled Ouanes Ph.D.
E-mail: k.ouanes@seu.edu.sa
Twitter: @khaled_ouanes
HEALTHCARE RESEARCH METHODS
Based on the textbook Introduction to Health Research Methods by K.H. Jacobsen