GROUP 1
GROUP 1
TariAgustin
25021340008
Karina Septiani
25021340011
Andika Guruh S.
25021340017
M. Dani Andhika P.
25021340018
3.
Introduction to DescriptiveStatistics
Definition: Statistical methods used to summarize, organize, and present data in
a meaningful way.
Descriptive analysis provides a concise and meaningful way to convey the
main features of a dataset using both numerical measures and visual
representations (Blbas, 2024).
b. Purpose:
To describe patterns in data.
To make large amounts of information easier to understand.
To prepare data for further analysis (e.g., inferential statistics).
c. Difference from Inferential Statistics:
Descriptive →summarizes data you already have.
Inferential →makes predictions or generalizations beyond the data.
4.
Organizing Data andGraphing Data
It is essential to create frequency distributions to facilitate researchers in
describing, summarizing, and reporting their data.
a. Frequency Distributions
A frequency distribution is a data presentation tool in the form of
columns and rows (tables), which contains numbers that depict the
frequency distribution of the variables being studied.
5.
Organizing Data andGraphing Data
Example:
Table 1. Scores of Twenty-five
Students on a Thirty-item Social
Studies Test
Table 2. A Frequency Distribution of a Thirty-
item Test Ordered from the Highest to the
Lowest Score: Test Scores of Twenty-five
Students
6.
Organizing Data andGraphing Data
Various types of frequency distributions
Single Data Frequency Distribution
Single data frequency distribution is the
distribution of numerical data
without grouping the variable values
(ungrouped data).
Group Data Frequency Distribution
Group Data Frequency Distributionis a presentation
of data from variable values that are large or
varied, to facilitate analysis and presentation.
Steps:
1. Determine the range
Range = Largest data – smallest data
2. Determine the number of interval classes using Sturges' formula
Number of classes = 1 +(3.3) log n
3. Determine the length of the interval class (p) using the following
formula:
p = n/range
7.
Organizing Data andGraphing Data
Various types of frequency distributions
Absolute frequency distribution is a number that
indicates the amount of data in a particular group.
Absolute Frequency Distribution Relative Frequency Distribution
A relative frequency distribution is a percentage that
indicates the amount of data in a particular group.
8.
Organizing Data andGraphing Data
Class Intervals
Class Interval is a range of values into which data is grouped for the purpose
of organizing, summarizing, and analyzing data efficiently.
When we have many scores (20 or more), it is better to put them into
groups.
These groups are called class intervals (e.g., 20–25, 26–30).
Problem: We lose details → if 4 students are in 20–25, we don’t know their
exact scores.
9.
Organizing Data andGraphing Data
Rules for Class Intervals
All intervals must be the same size.
A score can be in only one interval, not two.
Better to use an odd number (5, 7, 9) →midpoint is a whole number.
10.
Organizing Data andGraphing Data
Cumulative Frequency Distribution
A cumulative frequency distribution shows the total number of data values that
fall at or below a certain point.
It is useful for finding how many values are below or above a specific score.
A cumulative frequency table usually includes:
Score / Class Interval →the data values
Frequency →how often each value occurs
Percent Frequency →frequency as a percentage
Cumulative Frequency →running total of frequencies
Cumulative Percentage →running total in percentages
11.
Organizing Data andGraphing Data
Graphing Data
a. Histogram - Helps identify
gaps in data.
b. Frequency polygon - Easier to
compare multiple distributions on the
same graph
12.
Organizing Data andGraphing Data
Graphing Data
c. Pie Graph - Visually intuitive for
showing percentages.
d. Bar Graph - Works well for
continuous data and comparisons
13.
Organizing Data andGraphing Data
Graphing Data
e. Line Graph - Highlights
increases, decreases, and
fluctuations
f. Box Plot (box and whiskers) - Shows
the spread and decrease or increase of
a group of data.
14.
Organizing Data andGraphing Data
Graphing Data
g. Comparing Histograms and
Frequency Polygons
15.
Measures of CentralTendency
A measure of central tendency is a summary score
that represents a set of scores. (Ravid Ruth, 2011)
Mode
Definition: The mode of a
distribution is the score that
occurs with the greatest
frequency in that distribution.
Example :
16.
Measures of CentralTendency
Median
Definition: The middle value when a set of data values has been ordered
from lowest to highest value.
Example:
Suppose we have the following set of 6 scores:
Scores: 10, 12, 13, 13, 15, 16
Step 1: Arrange the scores in order (already done)
Step 2: Count the number of scores
Step 3: Find the middle two scores
Step 4: Calculate the median
17.
Measures of CentralTendency
Mean
Definition: The mean, which is also called the arithmetic mean, is obtained by
adding up the scores and dividing that sum by the number of scores. The
mean is sometimes called the arithmetic mean and the average.
Formula for the Mean:
Mean :
Example:
Suppose we have the following test scores:
70, 80, 85, 90, 95
Step 1: Add all the values
Step 2: Count the number of values
Step 3: Apply the formula
Measures of Variability
TheRange
Range is the difference between the highest and lowest values in a dataset. It
gives a quick sense of how spread out the numbers are.
Example: You have data 2, 4, 4, 6, 10 →Range = 10 − 2 = 8
Strengths: Easy to compute.
Weaknesses: Based only on 2 values →highly affected by outliers.
Standard Deviation and Variance
a. The deviation score is the distance of the raw score from the mean,
indicated by X – X– (i.e., the score minus the mean). The sum of the deviation
scores (i.e., the distances between the raw scores and the mean of that
distribution) is always 0 (zero).
20.
Measures of Variability
b.The variance is the mean of the squared deviations. To calculate it, square
each deviation score, add all the squared deviations, and divide their sum by
n – 1 (the number of scores minus 1) for the sample variance.
21.
Measures of Variability
a.Step 1: Find the mean (average)
Add them: 2 + 4 + 6 = 12
Divide by (n) 3 →mean = 4
b. Step 2: Find the deviation (difference from the mean)
2 − 4 = −2
4 − 4 = 0
6 − 4 = +2
So deviations are: −2, 0, +2
c. Step 3: Square the deviations (to remove minus)
(−2)² = 4
0² = 0
(+2)² = 4
So squared deviations: 4, 0, 4
22.
Measures of Variability
Step4: Find the variance
Add them: 4 + 0 + 4 = 8
If we use population variance (all data included) →divide by 3 →8 ÷ 3 = 2.67
If we use sample variance (just a small part of bigger data) → divide by (3 −
1) = 2 →8 ÷ 2 = 4
Step 5: Find the standard deviation (the square root of variance)
Population SD = √2.67 ≈1.63
Sample SD = √4 = 2
✅Simple rule:
If you have all the data →use divide by n (population).
If you only have a sample from a bigger population → use divide by n − 1
(sample).
23.
Measures of Variability
Computingthe Variance and SD for Populations and Samples
1.Population: When you have all the data (the whole group), divide by N
(the number of scores).
2.Sample: When you only have part of the group (a sample), divide by n-1.
This adjustment (called Bessel’s correction) makes the result more
accurate.
3.Steps (conceptual, no formula):
a.Find the mean.
b.See how far each score is from the mean (deviation).
c.Square those deviations (to avoid negatives).
d.Average them →this is variance.
e.Take the square root →this is standard deviation (SD).
24.
Measures of Variability
Usingthe Variance and SD
1.Variance and SD show how spread out the data is.
2.Small SD = scores are close to the mean (less variation).
3.Big SD = scores are spread out (more variation).
4.Uses in research:
To compare two groups (which group is more consistent?).
To understand reliability (stable or unstable scores?).
To identify how much individual scores differ from the average
25.
Measures of Variability
Varianceand SD in Distributions with Extreme Scores
1.Extreme scores (outliers) increase the variance and SD a lot.
2.Example: if most students score 80–90, but one student scores 20, the SD
becomes much larger.
3.That’s why researchers check data for outliers before analysis.
26.
Measures of Variability
FactorsAffecting the Variance and SD
1.Range of scores – wider range = bigger SD.
2.Mean differences – scores clustered near
the mean = smaller SD.
3.Outliers – extreme values increase SD.
4.Sample size – smaller samples tend to have
more unstable SD; bigger samples give a
more reliable SD.