Chapter 022
- 1. Copyright © 2013, 2009, 2005, 2001, 1997 by Saunders, an imprint of Elsevier Inc.
Chapter 22
Using Statistics To Describe
Variables
- 2.
Using Statistics to Describe
Variables
Two major classes of statistics
Descriptive statistics
• To reveal characteristics of the sample dataset
Inferential statistics
• To gain information about effects in the population being
studied
- 3.
Using Statistics to Describe
All quantitative research uses descriptive
statistics
For description of the sample
For initial description of variables
For analysis of the primary research problem
Descriptive statistics for descriptive research
Inferential statistics for interventional and
correlational research
- 4.
Using Statistics to Summarize
Data
Terms: the number of elements in a sample is
the “n” of the sample
Data set: 45, 26, 59, 51, 42, 28, 26, 32, 31, 55, 43,
47, 67, 39, 52, 48, 36, 42, 61, 57
n = 20
Descriptive statistics
Frequency distributions
Measures of central tendency
Measures of dispersion
- 5.
Frequency Distributions
Table or figure (line graph, pie chart, etc.)
Continuous variable: the higher numbers
represent more of that variable, and the lower
numbers represent less of that variable
- 6.
Frequency Table
Listing every possible value in the first
column of numbers, and the frequency (tally)
of each value as the second column of
numbers
Data set: 45, 26, 59, 51, 42, 28, 26, 32, 31,
55, 43, 47, 67, 39, 52, 48, 36, 42, 61, 57
(ages)
Sort from lowest to highest values
Tally each value
- 7.
Ungrouped Frequency
Distribution
List every category (value) of the variable
for which there are data, and tally
each datum against that list
Age Frequency
26 2
28 1
31 1
32 1
36 1
39 1
42 2
43 1
45 1
47 1
48 1
51 1
52 1
55 1
57 1
59 1
61 1
67 1
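The sort-and-tally steps for an ungrouped frequency distribution can be sketched in Python. This is a minimal illustration using the age data set from the slides; the variable names are assumptions made for the example, not part of the original material.

```python
from collections import Counter

# Age data set from the slides (n = 20).
ages = [45, 26, 59, 51, 42, 28, 26, 32, 31, 55,
        43, 47, 67, 39, 52, 48, 36, 42, 61, 57]

# Tally each value, then list the values from lowest to highest.
tally = Counter(ages)
frequency_table = sorted(tally.items())

print("Age  Frequency")
for age, count in frequency_table:
    print(f"{age:<4} {count}")
```

Only values that actually occur are listed, which matches the table above.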
- 8.
Grouped Frequency Distribution
Categories are grouped into ranges
Ranges must be exhaustive (every value fits a range) and
mutually exclusive (no value fits more than one range)
Age Frequency
20 - 29 3
30 - 39 4
40 - 49 6
50 - 59 5
60 - 69 2
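One possible way to build the grouped distribution is to map each age to the lower bound of its 10-year range. The sketch below assumes fixed-width ranges starting at multiples of 10, as in the table above; the names are illustrative only.

```python
from collections import Counter

# Age data set from the slides (n = 20).
ages = [45, 26, 59, 51, 42, 28, 26, 32, 31, 55,
        43, 47, 67, 39, 52, 48, 36, 42, 61, 57]

# Integer division maps each age to the lower bound of its range
# (for example 45 -> 40), so every age lands in exactly one range.
grouped = Counter((age // 10) * 10 for age in ages)

print("Age      Frequency")
for lower in sorted(grouped):
    print(f"{lower} - {lower + 9}  {grouped[lower]}")
```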
- 9.
Grouped Frequency Distribution
with Percentages
Adult Age Range   Frequency (f)   Percentage (%)   Cumulative Percentage
20 – 29           3               15               15
30 – 39           4               20               35
40 – 49           6               30               65
50 – 59           5               25               90
60 – 69           2               10               100
Total             20              100
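The percentage and cumulative percentage columns follow directly from the frequencies. A small sketch, assuming the grouped frequencies shown in the table (the dictionary layout is a choice made for this example):

```python
from itertools import accumulate

# Grouped frequencies from the slide (age range -> f).
frequencies = {"20-29": 3, "30-39": 4, "40-49": 6, "50-59": 5, "60-69": 2}
n = sum(frequencies.values())

# Percentage = 100 * f / n; cumulative percentage is the running total.
percentages = [100 * f / n for f in frequencies.values()]
cumulative = list(accumulate(percentages))

for label, f, pct, cum in zip(frequencies, frequencies.values(),
                              percentages, cumulative):
    print(f"{label}  {f:>2}  {pct:>5.0f}  {cum:>5.0f}")
```

The last cumulative percentage should always be 100, which is a quick check that no case was dropped.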
- 10.
Frequency Distributions
Presented in Figures
Graphs
Charts
Histograms
Frequency polygons
- 11.
Line Graph
- 12.
Frequency Table of Smoking
Status
Smoking Status Frequency Percent
Current smoker 1 10
Former Smoker 6 60
Never Smoked 3 30
Total 10 100
- 13.
Histogram of Smoking Status
- 14.
Measures of Central Tendency
Statistics that provide the center or hallmark
value of a data set
Mode
Median (MD)
Mean
- 15.
Mode
The most common value in a data set
Bimodal: two modes exist
Multimodal: more than two modes
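As a rough illustration, Python's standard library can report every value tied for the highest frequency. The sketch below applies it to the age data set from the earlier slides; the `kind` label is an assumption added for the example.

```python
from statistics import multimode

# Age data set from the slides (n = 20).
ages = [45, 26, 59, 51, 42, 28, 26, 32, 31, 55,
        43, 47, 67, 39, 52, 48, 36, 42, 61, 57]

# multimode returns every value tied for the highest frequency.
modes = multimode(ages)

if len(modes) == 1:
    kind = "unimodal"
elif len(modes) == 2:
    kind = "bimodal"
else:
    kind = "multimodal"

print(sorted(modes), kind)
```

Note that when no value repeats, `multimode` returns every value, so the label is only meaningful when at least one value occurs more than once.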
- 16.
Median (MD)
The middle value in the data set (after sorting
values from lowest to highest)
If the “n” is even, the two values in the middle
are averaged
The 50th percentile
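The median rule, including the even-n case, can be written out by hand. This is a minimal sketch; the function name `median` is chosen for the example, and the standard library's `statistics.median` does the same job.

```python
def median(values):
    """Return the middle value; average the two middle values if n is even."""
    ordered = sorted(values)          # sort from lowest to highest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd n: a single middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: average the pair


# Age data set from the slides (n = 20, so the two middle values are averaged).
ages = [45, 26, 59, 51, 42, 28, 26, 32, 31, 55,
        43, 47, 67, 39, 52, 48, 36, 42, 61, 57]

print(median(ages))
```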
- 17.
Mean
Arithmetic average of all a variable's values
Most commonly reported measure of central
tendency
Sum of the scores divided by the number of
values in the data set
Formula: $\bar{X} = \frac{\sum X}{n}$
- 18.
When to Use Mean
Mean: normally distributed values measured
at the interval or ratio level
Ordinal-level data from a rating scale, if:
The n is large
The data are normally distributed
Small values denote very little of the measured
quantity; large ones denote a lot
Mean is sensitive to extreme scores such as
outliers
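The mean's sensitivity to extreme scores can be seen by adding a single outlier to the age data set and comparing how far the mean and the median move. The value 120 below is a hypothetical extreme score invented for this sketch; it does not appear in the slides.

```python
from statistics import mean, median

# Age data set from the slides (n = 20).
ages = [45, 26, 59, 51, 42, 28, 26, 32, 31, 55,
        43, 47, 67, 39, 52, 48, 36, 42, 61, 57]

# Hypothetical outlier, added only to illustrate the effect.
with_outlier = ages + [120]

mean_shift = mean(with_outlier) - mean(ages)
median_shift = median(with_outlier) - median(ages)

print(f"Mean moves by   {mean_shift:.2f}")
print(f"Median moves by {median_shift:.2f}")
```

In this sketch the mean shifts by several years while the median shifts by one, which is why the median is often preferred for skewed data.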
- 19.
When to Use Median and Mode
Median: used for non-normal distributions
with small n
Mode: used for nominal values
- 20.
Using Statistics to Explore
Deviations in the Data
Using measures of central tendency to
describe the nature of a data set obscures
the impact of extreme values or deviations in
the data
Measures of dispersion provide important
insight into the nature of the data
- 21.
Measures of Dispersion
Quantifications of how tightly clustered
around the mean the sample is:
Tightly clustered = fairly homogeneous
Widely dispersed = heterogeneous
Range
Difference score
Variance
Standard deviation
- 22.
Range
Presented in two ways:
The lowest score and the highest score (2 through 17)
The difference between the highest and the lowest
score (range of 15)
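Both ways of presenting the range can be computed in a couple of lines. The sketch below uses the age data set from the earlier slides rather than the 2-through-17 example, so the numbers differ from those on this slide.

```python
# Age data set from the slides (n = 20).
ages = [45, 26, 59, 51, 42, 28, 26, 32, 31, 55,
        43, 47, 67, 39, 52, 48, 36, 42, 61, 57]

lowest, highest = min(ages), max(ages)
range_width = highest - lowest

print(f"Range: {lowest} through {highest}")   # first presentation
print(f"Range of {range_width}")              # second presentation
```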
- 23.
Difference Score
Subtract the mean from each score
Sometimes referred to as a deviation score
The difference score is positive when score is
above the mean, and negative when score is
below the mean
The total of all the difference scores is zero
Formula: $X - \bar{X}$
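A short check that the difference scores sum to zero, using the age data set from the earlier slides. Because of floating-point rounding the total may be a tiny nonzero number, so the sketch compares against zero with a tolerance; that tolerance is a choice made for this example.

```python
import math

# Age data set from the slides (n = 20).
ages = [45, 26, 59, 51, 42, 28, 26, 32, 31, 55,
        43, 47, 67, 39, 52, 48, 36, 42, 61, 57]

mean_age = sum(ages) / len(ages)

# Difference (deviation) score: each score minus the mean.
difference_scores = [x - mean_age for x in ages]
total = sum(difference_scores)

print(math.isclose(total, 0.0, abs_tol=1e-9))
```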
- 24.
Mean Deviation
Average difference score, using the absolute
values
Example:
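As one possible illustration (not the example from the original slide), the mean deviation of the age data set from the earlier slides can be computed as the average of the absolute difference scores:

```python
# Age data set from the slides (n = 20).
ages = [45, 26, 59, 51, 42, 28, 26, 32, 31, 55,
        43, 47, 67, 39, 52, 48, 36, 42, 61, 57]

mean_age = sum(ages) / len(ages)

# Absolute values stop positive and negative differences cancelling out.
absolute_differences = [abs(x - mean_age) for x in ages]
mean_deviation = sum(absolute_differences) / len(ages)

print(f"Mean deviation: {mean_deviation:.2f}")
```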
- 25.
Variance (s²)
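Variance is a commonly used measure of dispersion
“s²” is used to represent a sample variance
“σ²” is used to represent a population variance
Never negative (zero only when every value is identical); has no upper limit
Bigger variances = more spread
Formula: $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$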
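The sample variance can be sketched as the sum of squared difference scores divided by n − 1. The example below assumes the n − 1 (sample) denominator and uses the age data set from the earlier slides; it also cross-checks the hand calculation against the standard library.

```python
import math
import statistics

# Age data set from the slides (n = 20).
ages = [45, 26, 59, 51, 42, 28, 26, 32, 31, 55,
        43, 47, 67, 39, 52, 48, 36, 42, 61, 57]

n = len(ages)
mean_age = sum(ages) / n

# Squaring removes the signs, so the terms no longer cancel to zero.
sum_of_squares = sum((x - mean_age) ** 2 for x in ages)
sample_variance = sum_of_squares / (n - 1)

# statistics.variance also uses the n - 1 denominator.
print(math.isclose(sample_variance, statistics.variance(ages)))
```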
- 26.
Standard Deviation (s)
Square root of the variance
Sometimes reported as SD
Most commonly reported measure of
dispersion
Formula: $s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$
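Taking the square root returns the measure to the original units (years, for the age data set from the earlier slides). A minimal sketch using the standard library:

```python
import math
import statistics

# Age data set from the slides (n = 20).
ages = [45, 26, 59, 51, 42, 28, 26, 32, 31, 55,
        43, 47, 67, 39, 52, 48, 36, 42, 61, 57]

# Sample standard deviation = square root of the sample variance.
sd = math.sqrt(statistics.variance(ages))

print(f"s = {sd:.2f} years")
print(math.isclose(sd, statistics.stdev(ages)))
```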
- 27.
The Normal Curve
- 28.
The Normal Curve (Cont’d)
Represents the frequency distribution of a variable
that is perfectly normally distributed
Signifies:
The mean is the most commonly occurring value
There are just as many values above the mean as there are
below the mean
When frequency table is constructed, values are perfectly
symmetric
About 68% of values fall within –1 to +1 standard deviations of the mean
About 95% of values fall within –2 to +2 standard deviations of the mean
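The 68% and 95% figures are rounded. They can be checked against a standard normal distribution (mean 0, standard deviation 1) with Python's standard library; the variable names below are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of the curve between -k and +k standard deviations.
within_1_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_2_sd = standard_normal.cdf(2) - standard_normal.cdf(-2)

print(f"Within 1 SD: {within_1_sd:.1%}")
print(f"Within 2 SD: {within_2_sd:.1%}")
```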
- 29.
z-Score
- 30.
z-Score (Cont’d)
Synonymous with a standard deviation unit
A z value of 1.0 represents 1 standard deviation
unit above the mean
A z value of –1.0 represents 1 standard deviation
unit below the mean
Formula: $z = \frac{X - \bar{X}}{s}$
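A z-score expresses a raw score as a number of standard deviation units from the mean. The sketch below assumes the sample mean and sample standard deviation of the age data set from the earlier slides; `z_score` is a helper name invented for the example.

```python
from statistics import mean, stdev

# Age data set from the slides (n = 20).
ages = [45, 26, 59, 51, 42, 28, 26, 32, 31, 55,
        43, 47, 67, 39, 52, 48, 36, 42, 61, 57]


def z_score(x, values):
    """Distance of x from the sample mean, in standard deviation units."""
    return (x - mean(values)) / stdev(values)


z_oldest = z_score(67, ages)     # above the mean, so positive
z_youngest = z_score(26, ages)   # below the mean, so negative

print(f"z for age 67: {z_oldest:.2f}")
print(f"z for age 26: {z_youngest:.2f}")
```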
- 31.
Sampling Error
Described by the statistic “standard error”
Standard error of the mean is calculated to determine
the magnitude of the variability associated with the
mean
Formula: $s_{\bar{X}} = \frac{s}{\sqrt{n}}$
where
$s_{\bar{X}}$ = standard error of the mean
s = standard deviation
n = sample size
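The standard error of the mean is the standard deviation divided by the square root of the sample size. A minimal sketch with the age data set from the earlier slides:

```python
import math
from statistics import stdev

# Age data set from the slides (n = 20).
ages = [45, 26, 59, 51, 42, 28, 26, 32, 31, 55,
        43, 47, 67, 39, 52, 48, 36, 42, 61, 57]

n = len(ages)
s = stdev(ages)                      # sample standard deviation
standard_error = s / math.sqrt(n)    # shrinks as n grows

print(f"Standard error of the mean: {standard_error:.3f}")
```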
- 32.
Confidence Interval
Determines how closely a sample value
approximates a population value
Can be created for many statistics, such as a
mean, proportion, and odds ratio
The t-value for the desired interval (usually
95%) is looked up in a table of statistical
values
- 33.
Confidence Interval (Cont’d)
To calculate a 95% confidence interval
around a mean, for example:
Calculate the mean
Calculate the standard error of the mean
Calculate the degrees of freedom (df) [df = n – 1]
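Look up the two-tailed t-value for α = 0.05 at that df
Multiply the t-value by the standard error, then subtract the result from and add it to the mean to get the lower and upper limits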
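The steps for a 95% confidence interval around a mean can be sketched with the age data set from the earlier slides. Python's standard library has no t distribution, so the critical value 2.093 (two-tailed, α = 0.05, df = 19) is hard-coded here as a value read from a standard t table; that value and the variable names are assumptions of this example.

```python
import math
from statistics import mean, stdev

# Age data set from the slides (n = 20).
ages = [45, 26, 59, 51, 42, 28, 26, 32, 31, 55,
        43, 47, 67, 39, 52, 48, 36, 42, 61, 57]

n = len(ages)
sample_mean = mean(ages)                       # step 1: the mean
standard_error = stdev(ages) / math.sqrt(n)    # step 2: standard error
df = n - 1                                     # step 3: degrees of freedom

# Step 4: two-tailed t for alpha = 0.05 at df = 19, taken from a t table.
T_CRITICAL_DF19 = 2.093

margin = T_CRITICAL_DF19 * standard_error
lower, upper = sample_mean - margin, sample_mean + margin

print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

A different sample size would need a different table value, since the t-value depends on df.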
- 34.
Degrees of Freedom
The number of independent pieces of
information that are free to vary
For confidence interval, the degrees of
freedom (df) are n – 1
This means that there are n – 1 independent
observations in the sample that are free to vary (to
be any value) to estimate the lower and upper
limits of the confidence interval