DESCRIPTIVE STATISTICS
AND DATA PRESENTATION
DATA ANALYSIS
BY GROUP 1
GROUP 1
Tari Agustin
25021340008
Karina Septiani
25021340011
Andika Guruh S.
25021340017
M. Dani Andhika P.
25021340018
Introduction to Descriptive Statistics
a. Definition: Statistical methods used to summarize, organize, and present data in
a meaningful way.
Descriptive analysis provides a concise and meaningful way to convey the
main features of a dataset using both numerical measures and visual
representations (Blbas, 2024).
b. Purpose:
To describe patterns in data.
To make large amounts of information easier to understand.
To prepare data for further analysis (e.g., inferential statistics).
c. Difference from Inferential Statistics:
Descriptive → summarizes data you already have.
Inferential → makes predictions or generalizations beyond the data.
Organizing Data and Graphing Data
Frequency distributions help researchers describe, summarize, and report
their data.
a. Frequency Distributions
A frequency distribution is a table (rows and columns) that shows how often
each value of the variable being studied occurs.
Example:
Table 1. Scores of Twenty-five Students on a Thirty-item Social Studies Test
Table 2. A Frequency Distribution of a Thirty-item Test Ordered from the
Highest to the Lowest Score: Test Scores of Twenty-five Students
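The tables themselves are not reproduced in this text. As a minimal sketch of how a frequency distribution like Table 2 could be tallied, the snippet below counts each score and lists the scores from highest to lowest. The scores are hypothetical and are not the data from Table 1.

```python
from collections import Counter

# Hypothetical test scores (NOT the data from Table 1).
scores = [27, 25, 30, 25, 22, 27, 25, 30, 22, 25]

# Count how often each score occurs.
frequency = Counter(scores)

# Order the (score, frequency) pairs from the highest to the lowest score.
table = sorted(frequency.items(), reverse=True)

print("Score  Frequency")
for score, count in table:
    print(f"{score:>5}  {count:>9}")
```

The frequencies always add up to the number of scores, which is a quick check that no score was missed.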
Various types of frequency distributions
Single Data Frequency Distribution
A single data frequency distribution lists numerical data without grouping
the variable values (ungrouped data).
Group Data Frequency Distribution
A group data frequency distribution presents data whose values are numerous
or widely varied by grouping them into classes, to facilitate analysis and
presentation.
Steps:
1. Determine the range
Range = Largest data – smallest data
2. Determine the number of interval classes using Sturges' formula
Number of classes = 1 + 3.3 log n (base-10 logarithm; n = number of data values)
3. Determine the length of the interval class (p) using the following
formula:
p = range / number of classes
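The three steps can be sketched in Python as follows, taking the class length as the range divided by the number of classes. The scores are hypothetical, and rounding both results up to whole numbers is an assumption of this sketch; other rounding conventions are also used in practice.

```python
import math

# Hypothetical scores, used only to illustrate the three steps.
scores = [45, 52, 58, 61, 63, 67, 70, 72, 75, 78,
          80, 82, 85, 88, 90, 91, 93, 95, 97, 99]

# Step 1: range = largest value - smallest value.
data_range = max(scores) - min(scores)

# Step 2: Sturges' formula with a base-10 logarithm.
# Rounding up is an assumption of this sketch.
n = len(scores)
num_classes = math.ceil(1 + 3.3 * math.log10(n))

# Step 3: class length = range / number of classes (rounded up).
class_length = math.ceil(data_range / num_classes)

print(data_range, num_classes, class_length)
```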
Absolute Frequency Distribution
An absolute frequency distribution gives the number (count) of data values
in each group.
Relative Frequency Distribution
A relative frequency distribution gives the percentage of data values in
each group.
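A short sketch of the difference, using hypothetical letter grades: the absolute frequency is the count in each group, and the relative frequency is that count expressed as a percentage of the total.

```python
from collections import Counter

# Hypothetical letter grades for ten students.
grades = ["A", "B", "B", "C", "B", "A", "C", "B", "D", "B"]

# Absolute frequency: how many values fall in each group.
absolute = Counter(grades)

# Relative frequency: each count as a percentage of the total.
total = sum(absolute.values())
relative = {grade: 100 * count / total for grade, count in absolute.items()}

for grade in sorted(absolute):
    print(f"{grade}: {absolute[grade]} students, {relative[grade]:.1f}%")
```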
Class Intervals
A class interval is a range of values into which data are grouped for the purpose
of organizing, summarizing, and analyzing data efficiently.
When we have many scores (20 or more), it is better to put them into
groups.
These groups are called class intervals (e.g., 21–25, 26–30).
Problem: We lose details → if 4 students are in 21–25, we don’t know their
exact scores.
Rules for Class Intervals
All intervals must be the same size.
A score can be in only one interval, not two.
Better to use an odd interval width (5, 7, 9) → the midpoint is a whole number.
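As an illustration of these rules, the sketch below groups hypothetical scores into intervals of width 5. The starting point (11) and the helper name `interval_for` are assumptions made for this example. Every interval has the same size, each score falls into exactly one interval, and the odd width gives whole-number midpoints.

```python
# Hypothetical scores on a 30-item test.
scores = [12, 14, 15, 17, 18, 18, 21, 22, 23, 23,
          24, 26, 27, 29, 30, 16, 19, 20, 25, 28]

WIDTH = 5     # odd interval width, so midpoints are whole numbers
LOWEST = 11   # assumed lower limit of the first interval

def interval_for(score):
    """Return the (lower, upper) limits of the interval containing score."""
    index = (score - LOWEST) // WIDTH
    lower = LOWEST + index * WIDTH
    return (lower, lower + WIDTH - 1)

# Tally how many scores fall into each interval.
counts = {}
for score in scores:
    key = interval_for(score)
    counts[key] = counts.get(key, 0) + 1

for (lower, upper), count in sorted(counts.items()):
    midpoint = (lower + upper) // 2
    print(f"{lower}-{upper}  midpoint {midpoint}  frequency {count}")
```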
Cumulative Frequency Distribution
A cumulative frequency distribution shows the total number of data values that
fall at or below a certain point.
It is useful for finding how many values are below or above a specific score.
A cumulative frequency table usually includes:
Score / Class Interval → the data values
Frequency → how often each value occurs
Percent Frequency → frequency as a percentage
Cumulative Frequency → running total of frequencies
Cumulative Percentage → running total in percentages
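The columns listed above can be built with a running total. This is a minimal sketch with a hypothetical grouped distribution; the interval labels and frequencies are made up for illustration.

```python
from itertools import accumulate

# Hypothetical grouped frequency distribution, lowest interval first.
intervals = ["11-15", "16-20", "21-25", "26-30"]
frequencies = [3, 6, 6, 5]

total = sum(frequencies)

# Percent frequency: each frequency as a percentage of the total.
percent = [100 * f / total for f in frequencies]

# Cumulative frequency: running total of the frequencies.
cumulative = list(accumulate(frequencies))

# Cumulative percentage: running total expressed as a percentage.
cumulative_percent = [100 * c / total for c in cumulative]

print("Interval  Freq  Percent  Cum.Freq  Cum.Percent")
for row in zip(intervals, frequencies, percent, cumulative, cumulative_percent):
    print("{:<8}  {:>4}  {:>7.1f}  {:>8}  {:>11.1f}".format(*row))
```

The last cumulative frequency equals the number of scores, and the last cumulative percentage is 100.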
Graphing Data
a. Histogram - Helps identify gaps in data.
b. Frequency polygon - Makes it easier to compare multiple distributions on
the same graph.
c. Pie Graph - Visually intuitive for showing percentages.
d. Bar Graph - Works well for categorical (discrete) data and comparisons.
e. Line Graph - Highlights increases, decreases, and fluctuations.
f. Box Plot (box and whiskers) - Shows the spread of a group of data: the
median, the quartiles, and the lowest and highest values.
g. Comparing Histograms and Frequency Polygons
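For the box plot (item f), the numbers behind the picture are the five-number summary: lowest value, first quartile, median, third quartile, and highest value. The sketch below computes them for the data 2, 4, 4, 6, 10. The "inclusive" quartile method is an assumption of this example; other quartile conventions can give slightly different values.

```python
import statistics

data = [2, 4, 4, 6, 10]

# Quartiles split the ordered data into four parts.
# method="inclusive" is an assumed convention for this sketch.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Five-number summary used to draw a box plot.
five_number = (min(data), q1, median, q3, max(data))

print(five_number)
```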
Measures of Central Tendency
A measure of central tendency is a summary score
that represents a set of scores (Ravid, 2011).
Mode
Definition: The mode of a distribution is the score that occurs with the
greatest frequency in that distribution.
Example:
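The example from the slide is not reproduced in this text. As a stand-in, here is a minimal sketch with hypothetical scores, including a case with two modes.

```python
import statistics

# Hypothetical scores: 8 occurs three times, more than any other score.
scores = [6, 7, 8, 8, 8, 9, 10]
mode = statistics.mode(scores)

# A distribution can have more than one mode (here 4 and 6 each occur twice).
bimodal_scores = [3, 4, 4, 5, 6, 6, 7]
modes = statistics.multimode(bimodal_scores)

print(mode, modes)
```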
Median
Definition: The middle value when a set of data values has been ordered
from lowest to highest value. With an even number of values, the median is
the mean of the two middle values.
Example:
Suppose we have the following set of 6 scores:
Scores: 10, 12, 13, 13, 15, 16
Step 1: Arrange the scores in order (already done)
Step 2: Count the number of scores → n = 6
Step 3: Find the middle two scores → the 3rd and 4th scores: 13 and 13
Step 4: Calculate the median → (13 + 13) ÷ 2 = 13
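The same four steps can be sketched in Python for an even number of scores, with the standard library's `statistics.median` as a cross-check.

```python
import statistics

scores = [10, 12, 13, 13, 15, 16]

# Step 1: arrange the scores in order.
ordered = sorted(scores)

# Step 2: count the number of scores.
n = len(ordered)

# Step 3: find the middle two scores (n is even here).
middle_two = (ordered[n // 2 - 1], ordered[n // 2])

# Step 4: the median is the mean of the two middle scores.
median = sum(middle_two) / 2

print(middle_two, median, statistics.median(scores))
```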
Mean
Definition: The mean, also called the arithmetic mean or the average, is
obtained by adding up the scores and dividing that sum by the number of scores.
Formula for the Mean:
Mean (X̄) = ΣX ÷ n, where ΣX is the sum of the scores and n is the number of scores
Example:
Suppose we have the following test scores:
70, 80, 85, 90, 95
Step 1: Add all the values → 70 + 80 + 85 + 90 + 95 = 420
Step 2: Count the number of values → n = 5
Step 3: Apply the formula → Mean = 420 ÷ 5 = 84
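A minimal sketch of the three steps for these test scores:

```python
scores = [70, 80, 85, 90, 95]

# Step 1: add all the values.
total = sum(scores)

# Step 2: count the number of values.
n = len(scores)

# Step 3: apply the formula (sum divided by count).
mean = total / n

print(total, n, mean)
```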
Comparing the Mode, Median, and Mean
Measures of Variability
The Range
Range is the difference between the highest and lowest values in a dataset. It
gives a quick sense of how spread out the numbers are.
Example: For the data 2, 4, 4, 6, 10 → Range = 10 − 2 = 8
Strengths: Easy to compute.
Weaknesses: Based only on 2 values → highly affected by outliers.
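The weakness can be seen by changing a single value. In this sketch the second list is hypothetical: it replaces the highest score with an outlier of 100, and the range jumps even though the other four values are unchanged.

```python
data = [2, 4, 4, 6, 10]

# Range = highest value - lowest value.
data_range = max(data) - min(data)

# Hypothetical variation: the highest score becomes an outlier.
with_outlier = [2, 4, 4, 6, 100]
outlier_range = max(with_outlier) - min(with_outlier)

print(data_range, outlier_range)
```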
Standard Deviation and Variance
a. The deviation score is the distance of the raw score from the mean,
indicated by X − X̄ (i.e., the score minus the mean). The sum of the deviation
scores (i.e., the distances between the raw scores and the mean of that
distribution) is always 0 (zero).
b. The variance is the mean of the squared deviations. To calculate it, square
each deviation score, add all the squared deviations, and divide their sum by
N (the number of scores) for the population variance, or by n − 1 (the number
of scores minus 1) for the sample variance.
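A short sketch of why the deviations are squared before averaging: the raw deviation scores cancel out to zero, so they cannot measure spread by themselves. The data reuse the values 2, 4, 4, 6, 10 from the range example; the comparison uses a small tolerance because of floating-point rounding.

```python
data = [2, 4, 4, 6, 10]

mean = sum(data) / len(data)

# Deviation score: each raw score minus the mean.
deviations = [x - mean for x in data]

# The deviations sum to zero (up to floating-point rounding).
total_deviation = sum(deviations)

# Squaring removes the signs, so the squared deviations do not cancel.
sum_of_squares = sum(d ** 2 for d in deviations)

print(mean, total_deviation, sum_of_squares)
```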
Example: the data are 2, 4, 6.
Step 1: Find the mean (average)
Add them: 2 + 4 + 6 = 12
Divide by n = 3 → mean = 4
Step 2: Find the deviation (difference from the mean)
2 − 4 = −2
4 − 4 = 0
6 − 4 = +2
So deviations are: −2, 0, +2
Step 3: Square the deviations (to remove the minus signs)
(−2)² = 4
0² = 0
(+2)² = 4
So squared deviations: 4, 0, 4
Step 4: Find the variance
Add them: 4 + 0 + 4 = 8
If we use population variance (all data included) → divide by 3 → 8 ÷ 3 ≈ 2.67
If we use sample variance (just a small part of bigger data) → divide by
(3 − 1) = 2 → 8 ÷ 2 = 4
Step 5: Find the standard deviation (the square root of variance)
Population SD = √2.67 ≈ 1.63
Sample SD = √4 = 2
✅ Simple rule:
If you have all the data → divide by n (population).
If you only have a sample from a bigger population → divide by n − 1
(sample).
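The five steps can be sketched as a small function and cross-checked against Python's standard `statistics` module. The function name `variance` and its `sample` flag are choices made for this example, not part of any library.

```python
import math
import statistics

def variance(data, sample=True):
    """Steps 1-4: mean, deviations, squared deviations, then divide."""
    n = len(data)
    mean = sum(data) / n
    squared_deviations = [(x - mean) ** 2 for x in data]
    divisor = n - 1 if sample else n  # n - 1 for a sample, n for a population
    return sum(squared_deviations) / divisor

data = [2, 4, 6]

population_variance = variance(data, sample=False)  # 8 / 3
sample_variance = variance(data)                    # 8 / 2

# Step 5: the standard deviation is the square root of the variance.
population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)

print(round(population_variance, 2), sample_variance)
print(round(population_sd, 2), sample_sd)

# Cross-check with the standard library.
print(statistics.pvariance(data), statistics.variance(data))
```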
Computing the Variance and SD for Populations and Samples
1. Population: When you have all the data (the whole group), divide by N
(the number of scores).
2. Sample: When you only have part of the group (a sample), divide by n − 1.
This adjustment (called Bessel’s correction) makes the sample variance an
unbiased estimate of the population variance.
3. Steps (conceptual, no formula):
a. Find the mean.
b. See how far each score is from the mean (deviation).
c. Square those deviations (to avoid negatives).
d. Average them (divide by N or n − 1) → this is the variance.
e. Take the square root → this is the standard deviation (SD).
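For readers who prefer formulas, the conceptual steps correspond to the standard textbook expressions below. The symbols are conventional notation and are an addition here, not taken from the slides: \mu and \sigma for the population mean and SD, \bar{X} and s for the sample mean and SD.

```latex
% Population (all N scores are available)
\sigma^2 = \frac{\sum (X - \mu)^2}{N},
\qquad
\sigma = \sqrt{\sigma^2}

% Sample (n scores drawn from a larger population)
s^2 = \frac{\sum (X - \bar{X})^2}{n - 1},
\qquad
s = \sqrt{s^2}
```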
Using the Variance and SD
1. Variance and SD show how spread out the data is.
2. Small SD = scores are close to the mean (less variation).
3. Big SD = scores are spread out (more variation).
4. Uses in research:
To compare two groups (which group is more consistent?).
To understand reliability (stable or unstable scores?).
To identify how much individual scores differ from the average.
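As a sketch of the first use, the two hypothetical groups below have the same mean but very different SDs, so the mean alone would hide the fact that one group is far more consistent.

```python
import statistics

# Two hypothetical groups with the same mean (80).
group_a = [78, 80, 82, 79, 81]   # scores close to the mean
group_b = [60, 95, 80, 70, 95]   # scores spread out

mean_a = statistics.mean(group_a)
mean_b = statistics.mean(group_b)

# Sample SD (divides by n - 1).
sd_a = statistics.stdev(group_a)
sd_b = statistics.stdev(group_b)

print(f"Group A: mean {mean_a}, SD {sd_a:.2f}")
print(f"Group B: mean {mean_b}, SD {sd_b:.2f}")
```

The group with the smaller SD is the more consistent one.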
Variance and SD in Distributions with Extreme Scores
1. Extreme scores (outliers) increase the variance and SD a lot.
2. Example: if most students score 80–90, but one student scores 20, the SD
becomes much larger.
3. That’s why researchers check data for outliers before analysis.
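The effect described in the example can be checked with a few hypothetical scores: five students in the 80–90 band, then the same group with one added score of 20.

```python
import statistics

# Hypothetical scores in the 80-90 band.
typical = [80, 82, 85, 87, 90]

# Same group plus one extreme score.
with_outlier = typical + [20]

sd_typical = statistics.stdev(typical)
sd_with_outlier = statistics.stdev(with_outlier)

print(f"SD without outlier: {sd_typical:.2f}")
print(f"SD with outlier:    {sd_with_outlier:.2f}")
```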
Factors Affecting the Variance and SD
1. Range of scores – wider range = bigger SD.
2. Mean differences – scores clustered near the mean = smaller SD.
3. Outliers – extreme values increase SD.
4. Sample size – smaller samples tend to have a more unstable SD; bigger
samples give a more reliable SD.
THANKS FOR YOUR KIND ATTENTION
DISCUSSION SESSION
QUIZ SESSION!
- LINK -
