INTRODUCTION TO HEALTHCARE RESEARCH METHODS: Correlational Studies, Case Series and Cross-Sectional Surveys

PHC215
By Dr. Khaled Ouanes Ph.D.
E-mail: k.ouanes@seu.edu.sa
Twitter: @khaled_ouanes

Correlational/Ecological
Studies

Correlational studies are also called
ecological or aggregate studies.
This type of study uses population-level
data to examine the relationship between
exposure rates and disease rates.

In such a study, the units of analysis are
populations or groups of people rather than
individuals.
That is, the focus is on comparing populations or
groups rather than individual patients or
participants.

Examples
 Does the percentage of adults with multiple sclerosis
tend to be higher in countries farther from the
equator?
 Does the rate of asthma tend to be higher in cities with
higher levels of air pollution?
 Does the prevalence of diabetes tend to be higher
in populations with a higher prevalence of obesity?

Population-level data are used
to look for associations between
two or more group
characteristics

Data Sources
At least one data source (if not more) that
contains comparable information about the
population characteristics of interest must be
identified.
Information about all the variables of interest
must be available for a suitable number of
populations, which can be grouped by place or
by time.

Examples of Populations
All Western European countries
The largest 25 metropolitan areas in the Arab
world
All sub-Saharan African countries
A random sample of survey areas in London
Historic data for the past decades from one or
more place-based populations

Exposures and Outcomes
At least one characteristic of the populations
being examined is designated as an exposure
Exposures are often environmental measures likely to be fairly consistent across an
entire population
At least one characteristic is designated as an
outcome

Aggregate Data
Population characteristics are in the form of
aggregate (grouped) data, such as:
 the proportion of each population with a particular characteristic
 the average value of the variable in the population

Examples of Exposures
 The percentage of adults older than 30 who have not completed
at least 12 years of education
 The mean income in the population
 The median age
 The number of rainy days over a given year in the population
 The average ultraviolet radiation index during midday in the
hottest month of the year

Examples of Outcomes (Diseases)
The prevalence of obesity among adults
The mean BMI (body mass index) among adults
The annual mortality rate from asthma

Cautions
Correlational studies are valid only if the data
points are comparable.
A data point is a discrete unit of information. Generally, any single fact is a data point.
In a statistical or analytical context, a data point is usually derived from a
measurement or research and can be represented numerically and/or graphically.
In some populations, exposures and diseases may
be routinely undercounted or routinely
overdiagnosed compared with other populations.

Cautions
If multiple sources of data are used or if the data
were collected over a lengthy period of time,
then the definition of exposure or disease may
differ from one population to another and may
not be comparable.

Data Management Example
Data should be entered into a spreadsheet
Each population (A, B, C, etc.) is in its own row
Each exposure and each outcome is in its own
column
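
The layout described above can be sketched in a few lines of Python. This is only an illustration: the population labels, column names and numbers are invented, and any spreadsheet program would serve equally well.

```python
import csv
import io

# Hypothetical aggregate data: one row per population, one column per variable.
# All values are invented for illustration only.
rows = [
    {"population": "A", "obesity_prevalence_pct": 12.0, "diabetes_prevalence_pct": 5.1},
    {"population": "B", "obesity_prevalence_pct": 20.5, "diabetes_prevalence_pct": 7.8},
    {"population": "C", "obesity_prevalence_pct": 31.0, "diabetes_prevalence_pct": 10.2},
]


def to_csv(records):
    """Return the records as CSV text: a header row, then one row per population."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv(rows)
print(csv_text)
```

Keeping one row per population means each row becomes one point on the scatterplot used in the analysis.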

Analysis: Correlation
 On a scatterplot used to illustrate correlation, each point
represents one population in the study.
 The exposure is plotted on the x-axis, and the outcome or
disease is plotted on the y-axis.

1. When all the points fall neatly in a line, then the
correlation is strong.
2. When the points are not exactly linear but a line for
trend can be drawn, then the correlation is mild or
moderate.
3. When the points appear to be randomly placed
and no obvious line can be drawn through them,
then the correlation is weak or nonexistent.

 If higher levels of exposure are linked to higher rates of
disease, then the slope is positive.
 If higher levels of exposure are linked to lower rates of
disease, then the slope is negative.

 For continuous variables and other variables with
responses that can be plotted on a number line, a
Pearson correlation coefficient (r) should be used to
calculate the correlation.
 For variables that assign a rank to responses or that have
ordered categories, use the Spearman rank-order
correlation (designated by the letter r or the Greek letter
ρ (rho) in most statistical programs).

The Pearson method is built on the notion that if
Measurement 1 tracks Measurement 2 (directly or
inversely), you can gauge how closely they are
linked by calculating Pearson's r, the correlation
coefficient. It is derived from the products of the
differences between each M1 value and the M1
average and each M2 value and the M2 average,
scaled so that the result falls between -1 and +1.
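
As a minimal sketch of that calculation, the function below computes Pearson's r from two lists of population-level values. The function name `pearson_r` and the sample numbers are assumptions made for illustration; statistical packages provide their own equivalents.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of the products of each pair's deviations from the two averages.
    sum_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations, used to scale the result into [-1, 1].
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sum_products / math.sqrt(ss_x * ss_y)


# Hypothetical exposure and outcome values for five populations.
exposure = [10.0, 15.0, 20.0, 25.0, 30.0]
outcome = [3.1, 4.0, 5.2, 5.9, 7.1]
r = pearson_r(exposure, outcome)
print(round(r, 3), round(r ** 2, 3))  # r and r squared
```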

Spearman's rank coefficient is similar to Pearson's
in producing a value from -1 to +1, but you would
use Spearman when the rank order of the data
is what matters, for example with ordered categories.
The Pearson test is more widely used.
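
One way to see the difference is that Spearman's coefficient can be computed as a Pearson correlation on the ranks of the data rather than on the raw values. The sketch below assumes that approach, with tied values sharing the average of their ranks; the function names and data are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of the 1-based positions
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rank-order correlation: Pearson's r applied to the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    sum_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    ss_x = sum((a - mean_x) ** 2 for a in rx)
    ss_y = sum((b - mean_y) ** 2 for b in ry)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return sum_products / math.sqrt(ss_x * ss_y)


# The outcome rises with the exposure but not in a straight line,
# so the rank order agrees perfectly even though the raw values are not linear.
rho = spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])
print(rho)
```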

 r = –1: all points lie perfectly on a line with a negative slope
 r = 1: all points lie perfectly on a line with a positive slope
 r = 0: no linear association between the exposure and outcome
 r² shows how strong a correlation is without indicating the
direction of the association

Analysis
Use linear regression models when the goal is to:
compare more than two variables
understand the relationship between two variables
while controlling or adjusting for the effects of other
variables
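
As a rough sketch of what such a model does, the code below fits an ordinary least-squares regression with an intercept and two predictors by solving the normal equations. The helper names and the data are invented for illustration; in practice a statistical package would be used, and it would also report standard errors and p-values.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; the model cannot be fitted")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_linear_regression(predictor_rows, outcomes):
    """Return [intercept, slope_1, slope_2, ...] from ordinary least squares."""
    design = [[1.0] + list(row) for row in predictor_rows]  # add intercept column
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, outcomes)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data for five populations: (exposure, second variable to adjust for).
predictors = [(0, 1), (1, 0), (2, 2), (3, 1), (4, 3)]
outcome = [4, 3, 11, 10, 18]  # constructed as 1 + 2 * exposure + 3 * second variable
coefficients = fit_linear_regression(predictors, outcome)
print([round(c, 6) for c in coefficients])
```

The slope for the exposure is interpreted as the change in the outcome per unit of exposure with the other variable held constant.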

Age Adjustment
When the populations being compared have
very different age structures, age adjustment
may be necessary to make a fair comparison
among populations.
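
The slides do not name a specific method, so the sketch below assumes direct standardization, one common approach: each population's age-specific rates are applied to the same standard population. The age groups, rates and population sizes are invented for illustration.

```python
def weighted_rate(age_specific_rates, age_structure):
    """Average the age-specific rates, weighting each by the size of its age group."""
    total = sum(age_structure.values())
    return sum(age_specific_rates[g] * age_structure[g] for g in age_structure) / total


# Hypothetical disease rates per 1,000 people, identical in both populations.
rates = {"0-39": 1.0, "40-64": 5.0, "65+": 20.0}

# Hypothetical age structures (number of people in each age group).
young_population = {"0-39": 800, "40-64": 150, "65+": 50}
old_population = {"0-39": 400, "40-64": 350, "65+": 250}
standard_population = {"0-39": 600, "40-64": 300, "65+": 100}

# Crude rates use each population's own age structure.
crude_young = weighted_rate(rates, young_population)
crude_old = weighted_rate(rates, old_population)

# Age-adjusted rates apply the same standard structure to both populations.
adjusted_young = weighted_rate(rates, standard_population)
adjusted_old = weighted_rate(rates, standard_population)

print(crude_young, crude_old, adjusted_young, adjusted_old)
```

In this made-up case the crude rates differ only because one population is older; after adjustment the two rates are equal, which is the fair comparison.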

Avoiding the Ecological Fallacy
 Correlational studies compare groups rather than
individuals.
 No individual-level data are included in the analysis,
only population-level data.
 The incorrect attribution of population-level
associations to individuals is called the ecological
fallacy.

Even though a population with a higher rate of
exposure to something has a higher rate of
disease than populations with lower exposure
rates, individuals in that population who have a
high level of exposure do not necessarily have
the disease.

The experience of an individual in a population
may vary significantly from the population
average.
It would be incorrect to assume that any one
individual from a country with a high average
body mass index (BMI) will be obese or that an
individual from a country with a low average BMI
will not be obese.

However, it is appropriate to identify trends in
populations and to use those observations to
generate hypotheses for individual-level studies
that will test for relationships between the
characteristics of interest in individuals.

Key Characteristics of Correlational
(Ecological) Studies

Uses of Case Series
 Describing the characteristics of and similarities
among a group of individuals with the same signs
and/or symptoms of disease
 Identifying new syndromes and refining case
definitions.
 Clarifying typical disease progression
 Developing hypotheses for future research

Sample Size
Some case series for rare conditions may
require only a few participants
Other studies may include several hundred
individuals

Getting Started…
 Select one disease or condition of interest
 Determine what will be new and interesting about
the study
 Identify an appropriate and available source of
cases
 Establish a clear case definition that spells out
inclusion criteria and exclusion criteria.

Case Definitions
Specify characteristics related to:
The disease or procedure
 ICD codes (International Classification of Diseases codes) are often used as
part of the definition
Person
Place
Time

Data Collection
Primary data: interviews of cases using a
questionnaire and/or qualitative techniques
Secondary data from patient charts (medical
records)
 It is often helpful to create a questionnaire that guides the extraction of
information from medical records
 Be aware that patient charts are often incomplete; missing information
about a symptom does not mean that the patient did not experience it

Most case series do not require
any advanced analyses or any
numbers beyond simple counts
and frequencies.
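
Simple counts and frequencies of this kind can be sketched as follows. The case records and the symptom name are hypothetical; `None` marks a symptom that was not recorded in the chart, which is kept separate from "absent" because missing information does not mean the patient did not experience the symptom.

```python
from collections import Counter

# Hypothetical extracted chart data: True = present, False = absent,
# None = not recorded in the medical record.
cases = [
    {"id": 1, "rash": True},
    {"id": 2, "rash": None},
    {"id": 3, "rash": False},
    {"id": 4, "rash": True},
]


def summarize_symptom(records, symptom):
    """Count how many cases had, did not have, or had no record of a symptom."""
    counts = Counter(record[symptom] for record in records)
    recorded = counts[True] + counts[False]
    percent = 100 * counts[True] / recorded if recorded else None
    return {
        "present": counts[True],
        "absent": counts[False],
        "not_recorded": counts[None],
        "percent_present_among_recorded": percent,
    }


summary = summarize_symptom(cases, "rash")
print(summary)
```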

Key Characteristics of a
Case Series

Overview
The goal of a cross-sectional survey, also
called a prevalence study, is to measure the
proportion of a population with a particular
exposure or disease at one point in time based
on a representative sample of a population.

Cross-sectional surveys are among the
most popular study approaches in the
health sciences because they allow for
the relatively rapid collection of new
data.

Uses
Cross-sectional surveys are used to:
Describe communities
Assess population needs
Evaluate programs
Establish baseline data prior to the initiation
of longitudinal studies

Representative Populations
Cross-sectional studies use a simple study design:
The researcher asks a few hundred people to
complete a short questionnaire and then analyzes the
data.
However, there is one very important requirement: the
participants must be reasonably representative of
some larger population.

If a researcher wants the results of a survey to be
generalizable to all town residents, it is NOT acceptable to
use a convenience population such as:
 Friends
 Fans attending a football game
 Shoppers at a store at a given time on a chosen day
 Individuals attending a clinic
 Pupils attending a neighbourhood school

If the results of a cross-sectional survey are
intended to reflect the profile of an entire
town (or other population group), then the
study’s sampling strategy must recruit a
sample that is as diverse as the town.

Analysis: Prevalence
Prevalence = the proportion of the population with a given trait at
the time of the survey

Analysis: Comparative Statistics
 Prevalence rate ratios (PRRs) compare the prevalence of a characteristic in
two population subgroups by taking a ratio of their prevalence rates
 Note: An exposure can be said to be “associated” or “related” to a disease,
but a cross-sectional survey cannot show that an exposure caused a
disease.
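
The two calculations above can be sketched as follows. The function names and the survey counts are hypothetical, chosen only to show the arithmetic.

```python
def prevalence(number_with_trait, sample_size):
    """Proportion of survey participants with the trait at the time of the survey."""
    if sample_size <= 0:
        raise ValueError("sample size must be positive")
    return number_with_trait / sample_size


def prevalence_rate_ratio(with_trait_a, size_a, with_trait_b, size_b):
    """Prevalence in subgroup A divided by prevalence in subgroup B."""
    prevalence_b = prevalence(with_trait_b, size_b)
    if prevalence_b == 0:
        raise ValueError("the ratio is undefined when the comparison prevalence is 0")
    return prevalence(with_trait_a, size_a) / prevalence_b


# Hypothetical survey: 40 of 200 exposed participants and 30 of 300
# unexposed participants report the disease.
prevalence_exposed = prevalence(40, 200)
prevalence_unexposed = prevalence(30, 300)
prr = prevalence_rate_ratio(40, 200, 30, 300)
print(prevalence_exposed, prevalence_unexposed, prr)
```

A PRR above 1 means the trait is more common in the first subgroup; it describes an association at one point in time, not a cause.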

Key Characteristics of
Cross-Sectional Surveys

Based on the textbook of introduction to health research methods – K.H. Jacobsen
