PUH 6301, Public Health Research 1 Course Learning Ou

PUH 6301, Public Health Research 1
Course Learning Outcomes for Unit VI
Upon completion of this unit, students should be able to:
4. Evaluate strategies for data analysis to determine the best
statistical tests needed for research
methods.
4.1 Determine the four levels of measurement as valid research
statistical techniques in the public
health research process.
4.2 Explain why proper data and statistical analysis is
important.
4.3 Describe the basic types of statistical tests.
Course/Unit Learning Outcome | Learning Activity
4.1 | Unit Lesson; Chapters 28, 29, 30, 31, and 33; Blog: “Descriptive vs. Inferential Statistics: What’s the Difference?”; Unit VI Essay
4.2 | Unit Lesson; Chapter 28; Unit VI Essay
4.3 | Unit Lesson; Chapter 29; Unit VI Essay
Required Unit Resources
Chapter 28: Data Management
Chapter 29: Descriptive Statistics
Chapter 30: Comparative Statistics
Chapter 31: Regression Analysis
Chapter 33: Additional Analysis Tools
In order to access the following resource, click the link below:
The website below provides a good summary of how the public
health researcher can use descriptive and
inferential statistics methods to conduct public health research.
Market Research Guy. (2011, December 1). Descriptive vs.
inferential statistics: What’s the difference? [Blog post].
http://www.mymarketresearchmethods.com/descriptive-inferential-statistics-difference/
UNIT VI STUDY GUIDE
Data Analysis Plan
Unit Lesson
Introduction
This unit covers the statistical procedures used to analyze the
data collected from research tools. During this
stage of research, you may begin to draw conclusions and be
able to answer the research question(s) and
sub-question(s) you developed in Unit I. Use statistics in this
stage of research to manipulate the data and
make it understandable for others to read. Shi (2008)
encourages researchers to know and understand basic
statistics and statistical procedures. The data analysis phase of
research is important because it makes sense of the data, which
can then be used for future research studies (Jacobsen, 2021).
Data Management
Data management is the entire process of keeping a record of all
the results of clinical assessments
conducted during a research study (Jacobsen, 2021). Record
keeping includes listing details on potential
articles, pulling information from patient charts, tracking
responses from surveys, or recording assessment
results from cohorts or studies. It is vital that those responsible
for collecting and keeping data maintain
confidentiality and the integrity of data sets from all outside
sources. Once researchers enter the data into the
spreadsheet or database, the data should be recoded and
double-checked prior to beginning statistical analysis.
One way to keep research data organized, analyzed, and entered
correctly and easily is by coding it. Creating
a codebook to describe each variable and explain how the
information that was collected will be entered into
the computer database is valuable to a research study (Jacobsen,
2021). When you use quantitative surveys
to collect data, it is easier to use numeric or alphabetic codes to
organize answers from the questionnaire or
survey. For example, when coding yes and no answers, you can
use the number 1 to represent yes and the number 2 to represent
no. In like manner, if you use a Likert scale, you can use the
numbers 1-5 to represent worded answers. Strongly agree may be
represented by number 1, agree by number 2, neutral
by number 3, and so on. When coding answers from
qualitative surveys, using a codebook is even more
important because the information from open-ended questions
can yield unnecessary data; therefore, the
researcher must provide clear and detailed instructions for how
to code these responses and comments. The
codebook should also specify the following things:
• the name of each variable,
• the type of variable,
• the way the question was asked,
• the options listed on the survey instrument as possible answers
to the question,
• the way answers should be entered into the computer database,
and
• the way to handle missing responses (Jacobsen, 2021).
Researchers can also use the codebook to anticipate data
problems and decide how those problems will be handled.
The codebook will also describe specifically how to code answers
that are missing or blank. For example, if a participant
leaves a question blank on the survey or questionnaire, the
codebook will specify that the data should be
entered as “missing” or as a designated numeric code (Jacobsen,
2021). That code should not overlap with a valid answer; a 0 or
1, for instance, would be ambiguous for a question whose answers
are already coded with those numbers.
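The coding rules described above can be sketched in a few lines
of Python. This is only an illustration, not a prescribed
format: the variable names, question wording, the codes 4 and 5
for the Likert item, and the choice to label blank or
unrecognized answers as "missing" are all assumptions made for
the example.

```python
# Hypothetical codebook for two survey items; names and codes are illustrative only.
MISSING = "missing"  # label entered when a question is left blank

CODEBOOK = {
    "smokes": {
        "type": "nominal (yes/no)",
        "question": "Do you currently smoke?",
        "codes": {"yes": 1, "no": 2},
    },
    "satisfaction": {
        "type": "Likert scale",
        "question": "I am satisfied with my clinic visit.",
        "codes": {
            "strongly agree": 1,
            "agree": 2,
            "neutral": 3,
            "disagree": 4,           # assumed continuation of "and so on"
            "strongly disagree": 5,  # assumed continuation of "and so on"
        },
    },
}


def encode(variable, answer):
    """Return the code for one answer, or MISSING if it is blank or not in the codebook."""
    if answer is None or not answer.strip():
        return MISSING
    return CODEBOOK[variable]["codes"].get(answer.strip().lower(), MISSING)


# Two hypothetical completed surveys; the second leaves one item blank.
responses = [
    {"smokes": "Yes", "satisfaction": "Agree"},
    {"smokes": "no", "satisfaction": ""},
]
coded = [{var: encode(var, ans) for var, ans in row.items()} for row in responses]
print(coded)
```

Keeping the codes in one place means that every person entering
data applies the same rule to the same answer.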
Steps in Preparing Data for Analysis
Preparing the data collected by the data collection tool can be
cumbersome; however, DePoy and Gitlin
(2005) provide seven steps to help researchers organize and
prepare the data for analysis:
1. Check data collection tool for accuracy. Make sure
information is completed correctly. It is important
to check for missing information and make sure all missing data
is corrected before data entry and
data analysis begins.
2. Label all variables on the tool. Variables are anything that
has quality or quantity. A variable can be
time, distance, food intake, or even sea level. If you are using
statistical software for analysis, then it
is important to make sure your variables are labeled logically.
3. Assign variable labels to locations on computer. In this step,
the researcher determines the order in
which the variables are entered into the computer program.
Generally, researchers assign a coded
identification number to every participant. These numbers are
considered the raw data.
4. Develop a codebook and master file. This is considered the
dictionary of data. This written record
provides copies of the variable labels and the values that
represent all of the labels.
5. Enter data into the computer-based system. Double verify or
use another quality control procedure. Sometimes, Microsoft
Excel is used for smaller data sets; however, it is preferable
to use a statistical package such as the Statistical Analysis
System (SAS) or the Statistical Package
for the Social Sciences (SPSS Statistics) for entering large data
sets. Many statistical packages allow
researchers to check ranges, which flags values that fall
outside the acceptable limits for a variable. Double
verification is an important quality control measure. It
involves entering the same set of data on two
separate occasions. The program then compares the two sets of
entries and alerts the researchers if there is a
discrepancy.
6. Clean raw data files. Cleaning data occurs when the
researchers check the data that were entered
and make sure they were accurately transcribed. This is another
step to help determine missing
information and to make sure information was coded and
entered correctly.
7. Develop scores. This step involves summarizing information.
The main form of summarizing
information is to use descriptive statistics, which is discussed
later in this lesson.
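The quality control ideas in steps 5 and 6 (double verification
and range checks) can be sketched in Python as follows. This is
a minimal illustration under stated assumptions, not how SAS or
SPSS Statistics implement these features: the function names,
participant IDs, and data values are hypothetical.

```python
def find_discrepancies(first_entry, second_entry):
    """Compare two independent entries of the same records, keyed by participant ID."""
    problems = []
    for pid in sorted(set(first_entry) | set(second_entry)):
        a = first_entry.get(pid)
        b = second_entry.get(pid)
        if a != b:
            problems.append((pid, a, b))
    return problems


def out_of_range(entries, variable, low, high):
    """Return participant IDs whose value for `variable` falls outside [low, high]."""
    return [pid for pid, row in sorted(entries.items())
            if not low <= row[variable] <= high]


# Hypothetical data: the same three surveys keyed in on two separate occasions.
entry_1 = {
    101: {"age": 34, "smokes": 1},
    102: {"age": 29, "smokes": 2},
    103: {"age": 51, "smokes": 2},
}
entry_2 = {
    101: {"age": 34, "smokes": 1},
    102: {"age": 92, "smokes": 2},  # digits transposed during the second entry
    103: {"age": 51, "smokes": 2},
}

print(find_discrepancies(entry_1, entry_2))   # records that need to be rechecked
print(out_of_range(entry_2, "smokes", 1, 2))  # codes outside the codebook's 1-2 range
```

A discrepancy only shows that the two entries differ; the
researcher still has to go back to the original survey to see
which value is correct.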
Computer software or database programs are commonly used to
analyze data received in a research study.
The software packages most commonly used in public health
research are SAS, Stata, and SPSS Statistics. These
computer programs are “designed to be visually appealing, to
facilitate consistent entry of the acceptable
responses for each question, and to perform automatic skips
between questions” (Jacobsen, 2021, p. 218),
which allows for uniformity and consistency of data entry.
Alternatively, you can use a spreadsheet to enter
data. A spreadsheet does not require a data entry form, coding,
or system testing, but it makes it easy to mix up data and
makes data cleaning more complex.
Descriptive Statistics
It is important for a public health professional to know about
variables when conducting research. Variables
are important in statistics and data analysis. Variables are any
things or any characteristics that can have
values assigned to them. Variables are generally broken down
into four levels of measurement: ratio, interval,
ordinal, and nominal (Jacobsen, 2021).
• Ratio: This level of measurement has numeric responses on a
scale where a value of zero indicates the total absence of the
characteristic, so ratios between values are meaningful.
For example, if weight is measured in pounds, a person who weighs
30 pounds is twice as heavy as a
person who weighs 15 pounds. Therefore, the ratio is 2 to 1.
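• Interval: This level also uses numeric values, and the
spacing between values is equal; however, a value of zero does
not indicate the total absence of the characteristic. For
example, temperature measured in degrees Fahrenheit is an
interval variable: the distance between each degree is equal,
but a day with a high of 40°F is not twice as hot as a day with
a high of 20°F. Likert scale responses (strongly disagree,
disagree, neutral, agree, and strongly agree) are sometimes
treated as interval data on the assumption that the distance
between responses is equal, although the textbook classifies
such scales as ordinal (Jacobsen, 2021).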
• Ordinal or ranked: This level of measurement assigns a rank
order to each variable. For instance,
responses can be ordered from least=1 to greatest=4, from
best=1 to worst=3, or from high
income=4 to low income=1. No matter the scale, the rank or
order of the responses is indicated by
numeric value so it is easily coded.
• Nominal or categorical: This level has responses where the
values represent groups with no real
rank or order (Jacobsen, 2021). For example, race, ethnicity,
and blood type are considered
nominal variables. A subtype of nominal variables is dichotomous
variables, which have only
two possible answers, such as yes and no. A dichotomous variable
that has been coded with values of only 0 and 1 is called a
binomial variable (Jacobsen, 2021).
Ratio variables and interval variables can be further categorized
as continuous variables or discrete variables.
Continuous variables “can take on any value within a range”
(Jacobsen, 2021, p. 236). Discrete variables come
from counting something; therefore, there are gaps between
values (Jacobsen, 2021).
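As a rough way to keep the four levels straight, the sketch
below records a level for each variable and answers two
questions about it: whether its values have a rank order and
whether ratio statements such as "twice as heavy" are
meaningful. The class, function, and variable names are
hypothetical, and the levels assigned to the example variables
are assumptions made for illustration.

```python
from enum import Enum


class Level(Enum):
    NOMINAL = 1   # categories with no real rank or order
    ORDINAL = 2   # ranked responses
    INTERVAL = 3  # numeric, but zero does not mean total absence
    RATIO = 4     # numeric, and zero means total absence


def has_rank_order(level):
    """Every level except nominal carries a meaningful order."""
    return level is not Level.NOMINAL


def ratios_are_meaningful(level):
    """Statements such as 'twice as heavy' only make sense for ratio variables."""
    return level is Level.RATIO


# Hypothetical study variables and the level assumed for each.
variables = {
    "weight_lb": Level.RATIO,
    "temperature_f": Level.INTERVAL,
    "education_level": Level.ORDINAL,
    "blood_type": Level.NOMINAL,
}

for name, level in variables.items():
    print(name, level.name, has_rank_order(level), ratios_are_meaningful(level))

# 30 lb versus 15 lb: a 2-to-1 ratio is a valid comparison for a ratio variable.
print(30 / 15)
```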
Measure of Central Tendency
Descriptive statistics are probably some of the most commonly
used statistics in research. Researchers use
descriptive statistics to summarize the typical value in a set
of responses. There are three measures of central
tendency, or average, for numerical values:
mean, median, and mode.

• The mean of a sample is most frequently used and is
calculated by adding together the values of all responses
to a question, then dividing that sum by the total number of
individuals who answered the question
(Jacobsen, 2021). This is a good summary measure but can be
affected by extreme values or
outliers.
• The median is calculated by putting all the answers given in
order from least to greatest, then
identifying the number that is in the middle (Jacobsen, 2021).
That middle number is called the
median. If there is an even number of responses, then the
researcher should add the middle two
numbers and divide by two, which will yield the median
(Jacobsen, 2021).
• The mode is the measure of central tendency that is the
number or answer given by participants that
appears the most (Jacobsen, 2021).
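The three measures can be worked through with Python's standard
statistics module. The numbers below are made-up responses
(for example, days of hospital stay) chosen so that one extreme
value shows how the mean reacts to an outlier while the median
and mode barely move.

```python
from statistics import mean, median, mode

# Hypothetical responses; 40 is an extreme value (outlier).
with_outlier = [2, 3, 3, 5, 7, 40]
without_outlier = [2, 3, 3, 5, 7]

# Mean: sum of the values divided by the number of responses (60 / 6).
print(mean(with_outlier))

# Median: six responses is an even count, so average the middle two, (3 + 5) / 2.
print(median(with_outlier))

# Mode: the value that appears most often.
print(mode(with_outlier))

# Dropping the outlier changes the mean far more than the median.
print(mean(without_outlier), median(without_outlier))
```

With the outlier the mean is 10 and the median is 4; without it
the mean is 4 and the median is 3. The mode is 3 in both cases.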
Advanced Statistics
One consideration you must make as the researcher is that the
results or analysis can be inaccurate or
skewed. Numeric variables often follow a normal distribution,
but they may be skewed. Skewness may be shown
with “responses that extend farther from the peak on either the
left…or the right…side of the histogram”
(Jacobsen, 2021). Understanding whether an error happened as
a data collection problem or an analysis
problem is important. Ensuring that the data analysis is not
falsified, fabricated, or plagiarized is also important
because such data can corrupt a research project
if left unchecked. As the researcher, you
should be sure that statistical honesty is adhered to by following
the established rules of scientific research
practices. If at any time you are not sure of your findings or do
not understand the statistics, you are always
encouraged to check with a professional (Jacobsen, 2021).
Professional statisticians are available to make
sure that the sampling methods and sample size fit your
research, that the questionnaire or survey used will
provide usable data, and that the analysis plan is reasonable and
reliable.
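One quick check for skewness is to compare the mean with the
median and to compute a skewness coefficient. The sketch below
uses the moment coefficient of skewness (third central moment
divided by the second central moment raised to the power 1.5),
which is one common definition and an assumption here rather
than the textbook's method. The data values are hypothetical,
and the function does not handle a data set in which every value
is identical.

```python
from statistics import mean, median


def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5, using population moments."""
    n = len(values)
    center = mean(values)
    m2 = sum((x - center) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - center) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]
right_skewed = [2, 3, 3, 5, 7, 40]  # long tail to the right of the peak

for label, data in (("symmetric", symmetric), ("right-skewed", right_skewed)):
    print(label, "mean:", mean(data), "median:", median(data),
          "skewness:", round(skewness(data), 2))
```

A coefficient near zero suggests a roughly symmetric
distribution, while a positive value with the mean above the
median suggests a tail extending to the right.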
References
DePoy, E., & Gitlin, L. N. (2005). Introduction to research:
Understanding and applying multiple strategies (3rd
ed.). Elsevier Mosby.
Jacobsen, K. H. (2021). Introduction to health research
methods: A practical guide (3rd ed.). Jones & Bartlett Learning.
Shi, L. (2008). Health services research methods (2nd ed.).
Delmar.

Suggested Unit Resources
In order to access the following resources, click the links
below:
You are encouraged to explore the article below. It offers a
good review of how public health utilizes
descriptive research in epidemiology research.
Naito, M. (2014). Editorial: Utilization and application of
public health data in descriptive epidemiology. Journal
of Epidemiology, 24(6), 435-436.
http://search.proquest.com.libraryresources.columbiasouthern.edu/docview/1622120440?accountid=33337
You are encouraged to explore the website below.
Health data tools and statistics. (n.d.).
https://phpartners.org/health_stats.html
Learning Activities (Nongraded)
Nongraded Learning Activities are provided to aid students in
their course of study. You do not have to submit
them. If you have questions, contact your instructor for further
guidance and information.
Click to access the Unit VI terminology flash cards PowerPoint.
Click to access the PDF version.

https://online.columbiasouthern.edu/bbcswebdav/xid-131487526_1
https://online.columbiasouthern.edu/bbcswebdav/xid-131487525_1
DISCUSSION BOARDS
(REMEMBER THIS IS RELATING TO THE RESEARCH
PAPERS THAT YOU HAVE BEEN WRITING FOR ME)
Unit 3
Discuss what potential ethical issues relate to a research topic
that you would like to investigate. What are the ethical issues,
and how would you approach them? What are your plans for
taking formal research ethics training?
Unit 6
After reading the lesson and section 29.2 in the textbook,
describe the variables (e.g., ratio, interval, ordinal/ranked,
nominal/categorical, binomial) that will be used in your research
project.
29.2 Types of Variables
A variable is a characteristic that can be assigned more than
one value. Examples of variables that could be examined during
a population health study include age, sex, annual income,
languages spoken at home, frequency of alcohol ingestion,
cholesterol level, history of chickenpox, and use of contact
lenses. The value of a variable for an individual does not have
to vary over time, but the response among individuals within a
population should be something that might differ.
In most statistical analysis software programs, responses from
individual participants are displayed in the rows of a data table
and each column represents one variable. If one column presents
the data for sex, one value for sex—such as an F or 0 for
females or an M or 1 for males—will be listed in each row of
that column. Another column may represent age in years, and
one value for age—usually a whole number—will be listed in
each row.
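The row-and-column layout described above can be sketched in
plain Python, with one dictionary per participant (row) and one
key per variable (column). The participant records and the
helper name are hypothetical; the 0 and 1 codes for sex follow
the example in the text.

```python
# Each row is one participant; each key (column) is one variable.
rows = [
    {"id": 1, "sex": "F", "age": 34},
    {"id": 2, "sex": "M", "age": 29},
    {"id": 3, "sex": "F", "age": 51},
]

# Numeric codes for the sex column, as in the example above.
SEX_CODES = {"F": 0, "M": 1}


def column(table, variable):
    """Return every participant's value for one variable, in row order."""
    return [row[variable] for row in table]


sex_coded = [SEX_CODES[value] for value in column(rows, "sex")]

print(column(rows, "age"))  # one age value per row
print(sex_coded)            # one coded sex value per row
```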
There are several ways to classify variables (Figure 29-2).
· A ratio variable is a numeric variable that can be plotted on a
scale on which a value of zero indicates the total absence of the
characteristic. For example, if height is measured in feet, a
measurement of 0 feet tall means there was no height. As a
result, the ratio of heights is meaningful. A person who is 6 feet
tall is twice as tall as a person who is 3 feet tall, yielding a ratio
of 2 to 1.
· An interval variable is a numeric variable for which a value of
zero does not indicate the total absence of the characteristic. An
outside temperature of 0°C does not mean there is no heat. If
the weather turns colder, the temperature may fall to –10°C or
lower. A day with a high temperature of 40°F is not twice as hot
as a day with a maximum temperature of 20°F.
· An ordinal variable, also called a ranked variable, is a variable
with responses that span from first to last, from best to worst,
from most favorable to least favorable, or from always to never,
or that are expressed using other types of ranked scales. (Figure
21-4 provides examples of other types of ranked responses.) The
rank order can be assigned a number. For example, the
responses to a survey that asks participants to indicate their
level of agreement with a statement can be coded with agree as
“3,” neutral as “2,” and disagree as “1.” Alternatively,
responses could be coded with agree as “1” and disagree as “3,”
or neutral could be set as “0,” agree as “1,” and disagree as
“–1.” No matter what the scale is, the order of the responses is
indicated by their numeric values.
· A nominal variable, also called a categorical variable, has
values that represent no inherent rank or order. For example,
there is no obvious way to numerically rank the favorite
recreational sports activities of participants or their blood types.
A dichotomous variable is a subtype of categorical variable with
only two possible answers. A binomial variable is a
dichotomous variable that has been coded as having values of
only “0” and “1,” such as coding yes as 1 and no as 0 or coding
adults as 1 and children as 0.
FIGURE 29.2 Types of Variables
Variable Type | Definition | Examples
Ratio | Numbers on a scale for which zero indicates the complete absence of the characteristic | Blood pressure, height, weight (The ratio of 20 kg to 10 kg is meaningful because the weight doubles when it increases from 10 kg to 20 kg.)
Interval | Numbers on a scale for which zero does not indicate the complete absence of the characteristic | Temperature (°F or °C) (The heat does not double if the temperature increases from 20° to 40°, because 0° does not represent the absence of all heat.)
Ordinal/ranked | An ordered series that assigns a rank to responses (from first to last in the series) but for which the numbers assigned to the values are not meaningful | Highest educational level completed, scales for never (1) to always (5), scales for strongly disagree (1) to strongly agree (5)
Nominal/categorical | Categories with no inherent rank or order | Employment sector, blood type
Binomial | Categorical variables for which only two responses are possible | Yes/no, male/female, case/control
Ratio and interval variables can be further classified as either
continuous variables or discrete variables. A continuous
variable is a numeric variable that can take on any value within
a range. For example, although height is often rounded to the
nearest inch when it is measured, a person’s height could
actually be 59½ inches or 68¾ inches or 77.1529 inches.
A discrete variable is a numeric variable that is not continuous.
Discrete variables often are generated by counting items, so
there are gaps between the acceptable values. For example, a
family can own 2 egg-laying chickens or 17 chickens, but
cannot own 2½ chickens or 5¼ chickens.
Unit 7
After reading the unit lesson and the required unit resources,
post your abstract in the discussion board and discuss why you
chose this topic and how it relates to research you plan to
continue or to your future career.
