The document provides information on statistics, frequency distributions, measures of central tendency (mean, median, mode), and how to calculate and interpret them. It defines statistics, descriptive and inferential statistics, and frequency distributions. It outlines the steps to construct a frequency distribution and calculate the mean, median, and mode for both ungrouped and grouped data. Examples are provided to demonstrate calculating each measure of central tendency.
2. A. Definition of Statistics
Statistics is a branch of science, which
deals with the collection, presentation,
analysis and interpretation of quantitative
data.
4. B. Frequency Distribution
Frequency distribution is a tabular arrangement of data into
appropriate categories showing the number of observations in
each category or group.
There are two major advantages:
(a) it condenses the data into a table of manageable size and
(b) it makes the data easier to interpret.
5. Parts of Frequency Table
1. Class limits are the groupings or categories defined by the lower limit (LL) and the upper limit (UL).
Example: LL – UL
2. Class size (c.i) is the width of each class interval.
Example: LL – UL
10 – 14
15 – 19
20 – 24
The class size in this score distribution is 5.
6. Parts of Frequency Table
3. Class boundaries are the numbers used to separate each category in the frequency distribution without the gaps created by the class limits. When the scores are discrete (whole numbers), add 0.5 to the upper limit to get the upper class boundary and subtract 0.5 from the lower limit to get the lower class boundary of each group or category.
4. Class mark (Xm) is the midpoint of the lower and upper class limits. The formula is $X_m = \frac{LL + UL}{2}$.
Example: LL – UL   Xm
10 – 14   12
15 – 19   17
20 – 24   22
7. Steps in Constructing
Frequency Distribution
1. Compute the value of the range (R). The range is the difference between the highest score (HS) and the lowest score (LS).
$R = HS - LS$
Determine the class size (c.i). The class size is the quotient when you divide the range by the desired number of classes or categories. The desired number of classes is usually 5, 10 or 15, depending on the number of scores in the distribution. If the desired number of classes is not given, find the value of k, where $k = 1 + 3.3 \log n$.
$c.i = \frac{R}{\text{desired number of classes}}$ or $c.i = \frac{R}{k}$
8. Steps in Constructing
Frequency Distribution
2. Set up the class limits of each class or category. Each class is defined by its lower limit and upper limit. Use the lowest score as the lower limit of the first class.
3. Set up the class boundaries if needed. Use the formula
$C_b = \frac{LL \text{ of the second class} - UL \text{ of the first class}}{2}$
4. Tally the scores in the appropriate classes.
5. Find the other parts if necessary, such as class marks, among others.
9. Example: Raw scores of 40 students in a 50-item mathematics
quiz. Construct a frequency distribution following the steps given.
17 25 30 33 25 45 23 19
27 35 45 48 20 38 39 18
44 22 46 26 36 29 15 21
50 47 34 26 37 25 33 49
22 33 44 38 46 41 37 32
Highest score (HS) = 50, lowest score (LS) = 15
R = HS – LS
= 50 – 15
R = 35
n = 40
10. Solve for the value of k and the class size
k = 1 + 3.3 log n
= 1 + 3.3 log 40
= 1 + 3.3(1.6021)
= 1 + 5.2868
= 6.287
k ≈ 6
$c.i = \frac{R}{k} = \frac{35}{6} = 5.833$
c.i ≈ 6
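The computation above can be checked with a short Python sketch. It assumes a base-10 logarithm and simple rounding to the nearest whole number, as the worked solution does; the variable names are illustrative only.

```python
import math

n = 40                       # number of scores
highest, lowest = 50, 15     # HS and LS from the raw scores

score_range = highest - lowest            # R = HS - LS
k_exact = 1 + 3.3 * math.log10(n)         # k = 1 + 3.3 log n
k = round(k_exact)                        # whole number of classes
class_size = round(score_range / k)       # c.i = R / k, rounded

print(score_range, round(k_exact, 3), k, class_size)
```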
11. Construct the class limits, starting with the lowest score as the lower limit of the first category. The last category should contain the highest score in the distribution. Each category (x) should have a width of 6. Count the number of scores that fall in each category (f).
x Tally Frequency (f)
15-20 ⁄ ⁄ ⁄ ⁄ 4
21-26 ⁄ ⁄ ⁄ ⁄ ⁄ ⁄ ⁄ ⁄ ⁄ 9
27-32 ⁄ ⁄ ⁄ 3
33-38 ⁄ ⁄ ⁄ ⁄ ⁄ ⁄ ⁄ ⁄ ⁄ ⁄ 10
39-44 ⁄ ⁄ ⁄ ⁄ 4
45-50 ⁄ ⁄ ⁄ ⁄ ⁄ ⁄ ⁄ ⁄ ⁄ ⁄ 10
n=40
12. Find the class boundaries and class marks of the
given score distribution.
x       Frequency (f)   Class boundaries   Xm
15-20   4               14.5 – 20.5        17.5
21-26   9               20.5 – 26.5        23.5
27-32   3               26.5 – 32.5        29.5
33-38   10              32.5 – 38.5        35.5
39-44   4               38.5 – 44.5        41.5
45-50   10              44.5 – 50.5        47.5
n=40
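As a rough illustration, the class boundaries and class marks in the table can be derived from the class limits alone. The sketch below assumes whole-number class limits listed from lowest to highest; the names are hypothetical.

```python
class_limits = [(15, 20), (21, 26), (27, 32), (33, 38), (39, 44), (45, 50)]

# Half the gap between the UL of the first class and the LL of the second class.
half_gap = (class_limits[1][0] - class_limits[0][1]) / 2

# Widen each class by half the gap on both sides to close the gaps.
boundaries = [(ll - half_gap, ul + half_gap) for ll, ul in class_limits]

# Class mark: midpoint of the lower and upper class limits.
class_marks = [(ll + ul) / 2 for ll, ul in class_limits]

for limits, bounds, mark in zip(class_limits, boundaries, class_marks):
    print(limits, bounds, mark)
```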
13. Graphical Representation of Scores in
Frequency Distribution
A histogram consists of a set of rectangles with bases on the horizontal axis, centered at the class marks. The base widths correspond to the class size and the heights of the rectangles correspond to the class frequencies. The histogram is best used for graphical representation of discrete or non-continuous data.
14. Graphical Representation of Scores in
Frequency Distribution
A frequency polygon is constructed by plotting the class marks against the class frequencies. The x-axis corresponds to the class marks and the y-axis corresponds to the class frequencies. Connect the points consecutively using straight lines. The frequency polygon is best used in representing continuous data such as the scores of students in a given test.
15. Construct a histogram and frequency polygon
using the frequency distribution of 40
students in 50-item mathematics quiz.
x Frequency (f)
15-20 4
21-26 9
27-32 3
33-38 10
39-44 4
45-50 10
n=40
16. [Figure: Histogram of the scores of 40 students in a 50-item mathematics quiz. Horizontal axis: class boundaries (14.5, 20.5, 26.5, 32.5, 38.5, 44.5); vertical axis: frequency (0 to 12).]
17. [Figure: Frequency polygon of the scores of 40 students in a 50-item mathematics quiz. Horizontal axis: class marks (17.5, 23.5, 29.5, 35.5, 41.5, 47.5); vertical axis: frequency (0 to 12). Plotted frequencies: 4, 9, 3, 10, 4, 10.]
19. MEAN
The arithmetic mean, often simply called the mean, is the most frequently used measure of central tendency. It is the sum of all n values divided by the total frequency.
20. MEAN FOR UNGROUPED DATA
FORMULA:
$\text{Mean} = \frac{\text{sum of all values}}{\text{number of values}}$
$\bar{x} = \frac{\sum x}{n}$
Where: $\bar{x}$ = sample mean (read "x bar")
x = the value of an observation
$\sum x$ = the sum of all x values
n = the total number of observations
21. Example 1. The scores of 15 students in a 25-item Mathematics 1 quiz are given below. The highest score is 25 and the lowest score is 10. Here are the scores: 25, 20, 18, 18, 17, 15, 15, 15, 14, 14, 13, 12, 12, 10, 10. Find the mean of the scores.
X(scores)
25
20
18
18
17
15
15
15
14
14
13
12
12
10
10
∑x = 228
n = 15
$\bar{x} = \frac{\sum x}{n} = \frac{228}{15} = 15.2$
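The same result can be reproduced in a few lines of Python. This is only a sketch of the ungrouped mean formula applied to the 15 scores listed in the example.

```python
scores = [25, 20, 18, 18, 17, 15, 15, 15, 14, 14, 13, 12, 12, 10, 10]

total = sum(scores)     # sum of all values
n = len(scores)         # number of values
mean = total / n        # x-bar = (sum of x) / n

print(total, n, round(mean, 2))
```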
22. Example 2: Find the Grade point Average (GPA) of Ritz Glenn
for the first semester of the school year 2019-2020. Use the table
below
Subjects   Grade (Xi)   Units (Wi)   (Xi)(Wi)
BM 112     1.25         3            3.75
BM 101     1.00         3            3.00
AC 103N    1.25         6            7.50
BEC 111    1.00         3            3.00
MGE 101    1.50         3            4.50
MKM 101    1.25         3            3.75
FM 111     1.50         3            4.50
PEN 2      1.00         2            2.00
∑(Wi) = 26   ∑(Xi)(Wi) = 32.00
$\bar{x} = \frac{\sum w_i x_i}{\sum w_i} = \frac{32}{26} = 1.23$
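A weighted mean such as this GPA can be sketched in Python as follows. The list of (grade, units) pairs is copied from the table; the variable names are illustrative.

```python
# (grade, units) for each subject in the table
grades_units = [
    (1.25, 3), (1.00, 3), (1.25, 6), (1.00, 3),
    (1.50, 3), (1.25, 3), (1.50, 3), (1.00, 2),
]

weighted_sum = sum(grade * units for grade, units in grades_units)  # sum of Wi * Xi
total_units = sum(units for _, units in grades_units)               # sum of Wi
gpa = weighted_sum / total_units

print(weighted_sum, total_units, round(gpa, 2))
```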
23. MEAN FOR GROUPED DATA
FORMULA:
$\bar{x} = \frac{\sum f x_m}{n}$
Where:
$\bar{x}$ = mean value
f = frequency of each class or category
$x_m$ = midpoint of each class or category
$\sum f x_m$ = sum of the products of f and $x_m$
n = total number of scores
24. Steps in Solving Mean for
Grouped Data
1. Find the midpoint or class mark (Xm) of each class or category using the formula $X_m = \frac{LL + UL}{2}$.
2. Multiply each frequency by the corresponding class mark to get fXm.
3. Find the sum of the results in step 2.
4. Solve for the mean using the formula $\bar{x} = \frac{\sum f x_m}{n}$.
25. Example 3: The scores of 40 students in a 60-item science class test are tabulated below.
X       f   Xm   fXm
10-14   5   12   60
15-19   2   17   34
20-24   3   22   66
25-29   5   27   135
30-34   2   32   64
35-39   9   37   333
40-44   6   42   252
45-49   3   47   141
50-54   5   52   260
n = 40   ∑fXm = 1345
$\bar{x} = \frac{\sum f x_m}{n} = \frac{1345}{40} = 33.63$
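The four steps for the grouped mean can be sketched in Python. The function name `grouped_mean` and the (lower limit, upper limit, frequency) tuple layout are assumptions made for this illustration.

```python
def grouped_mean(classes):
    """classes: list of (lower_limit, upper_limit, frequency)."""
    n = sum(f for _, _, f in classes)
    # Step 1 and 2: class mark (LL + UL) / 2 times the frequency.
    products = [f * (ll + ul) / 2 for ll, ul, f in classes]
    # Step 3 and 4: sum the products and divide by n.
    return sum(products) / n

science_scores = [
    (10, 14, 5), (15, 19, 2), (20, 24, 3), (25, 29, 5), (30, 34, 2),
    (35, 39, 9), (40, 44, 6), (45, 49, 3), (50, 54, 5),
]

mean = grouped_mean(science_scores)
print(mean)  # 1345 / 40
```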
26. Properties of Mean
1. It measures stability. Mean is the most stable among other measures of
central tendency because every score contributes to the value of the
mean.
2. The sum of each score’s distance from the mean is zero
3. It is easily affected by the extreme scores
4. It may not be an actual score in the distribution
5. It can be applied to interval level of measurement
6. It is very easy to compute.
27. When to Use the Mean
1.Sampling stability is desired
2.Other measures are to be computed such as standard
deviation, coefficient of variation and skewness.
28. MEDIAN
Median is the middle value in a set of
observations arranged from highest to lowest or
vice versa.
29. MEDIAN FOR UNGROUPED DATA
To determine the value of the median for ungrouped data, we need to consider two rules:
1. If n is odd, the median is the middle-ranked value.
2. If n is even, the median is the average of the two middle-ranked values.
Median (ranked value) = $\frac{n+1}{2}$
30. Example: Find the median of the ages of 9 middle-management employees of a certain company. The ages are 53, 45, 59, 48, 54, 46, 51, 58 and 55.
Step 1: Arrange the data in order
45, 46, 48, 51, 53, 54, 55, 58, 59
Step 2: Select the middle rank value using the formula
Median (ranked value) = $\frac{n+1}{2} = \frac{9+1}{2} = \frac{10}{2} = 5$
Step 3: Identify the median in the data set
45, 46, 48, 51, 53, 54, 55, 58, 59
The 5th ranked value is 53, so the median is 53.
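Both rules for the ungrouped median can be captured in one small function. This is a minimal sketch; the function name `median` is chosen here for illustration.

```python
def median(values):
    ordered = sorted(values)          # Step 1: arrange the data in order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # n is odd: the (n + 1) / 2-th ranked value (index mid, counting from 0)
        return ordered[mid]
    # n is even: average of the two middle-ranked values
    return (ordered[mid - 1] + ordered[mid]) / 2

ages = [53, 45, 59, 48, 54, 46, 51, 58, 55]
print(median(ages))
```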
31. MEDIAN FOR GROUPED DATA
Formula:
Median (ranked value) = $\frac{n}{2}$
Median = $LB + \left(\frac{\frac{n}{2} - cf}{f_m}\right) ci$
Where: LB = lower boundary of the median class
n = sample size
cf = cumulative frequency before the median class
fm = frequency of the median class
ci = size of the class interval
32. Steps
1. Determine the median rank using the formula
Median (ranked value) = $\frac{n}{2} = \frac{40}{2} = 20$
2. Construct a cumulative frequency column in the table
3. Identify the median class by locating the 20th ranked score in the table; here the median class is 35-39
4. Determine the values of LB, cf, fm, ci and n
LB = 35 – 0.5 = 34.5
ci = 15 – 10 = 5 or 20 – 15 = 5
cf = cumulative frequency before the median class when the scores are arranged from lowest to highest value; here cf = 17
fm = frequency of the median class; here fm = 9
33. Example 3: The scores of 40 students in a 60-item science class test are tabulated below. The highest score is 54 and the lowest is 10.
X       f   Xm   fXm   cf<
10-14   5   12   60    5
15-19   2   17   34    7
20-24   3   22   66    10
25-29   5   27   135   15
30-34   2   32   64    17
35-39   9   37   333   26
40-44   6   42   252   32
45-49   3   47   141   35
50-54   5   52   260   40
n = 40   ∑fXm = 1345
34. Steps
5. Apply the formula
Median = $LB + \left(\frac{\frac{n}{2} - cf}{f_m}\right) ci$
= $34.5 + \left(\frac{\frac{40}{2} - 17}{9}\right) 5$
= $34.5 + \left(\frac{20 - 17}{9}\right) 5$
= 34.5 + 1.67
= 36.17
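The grouped median steps can be sketched as a Python function. It assumes whole-number class limits (so each lower boundary is the lower limit minus 0.5), equal class sizes, and classes listed from lowest to highest; `grouped_median` is a hypothetical name.

```python
def grouped_median(classes):
    """classes: list of (lower_limit, upper_limit, frequency), lowest class first."""
    n = sum(f for _, _, f in classes)
    rank = n / 2                                   # median rank
    class_size = classes[1][0] - classes[0][0]     # ci
    cumulative = 0                                 # cf before the current class
    for lower, upper, f in classes:
        if cumulative + f >= rank:                 # this is the median class
            lb = lower - 0.5                       # lower boundary
            return lb + (rank - cumulative) / f * class_size
        cumulative += f

science_scores = [
    (10, 14, 5), (15, 19, 2), (20, 24, 3), (25, 29, 5), (30, 34, 2),
    (35, 39, 9), (40, 44, 6), (45, 49, 3), (50, 54, 5),
]

print(round(grouped_median(science_scores), 2))
```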
35. Properties of the Median
1.It may not be an actual observation in the
data set
2.It can be applied in ordinal level
3.It is not affected by extreme values
because median is a positional measure
36. When to Use the Median
1.The exact midpoint of the score
distribution is desired.
2.There are extreme scores in the
distribution
37. MODE
Mode is the observation which occurs
the most often in a set of values.
38. MODE
A data set with only one value that occurs with the greatest frequency is said to be UNIMODAL. If the data set has two values with the same greatest frequency, both values are considered modes and the data set is BIMODAL. A TRIMODAL distribution of scores has three modes, and a MULTIMODAL distribution has more than two modes. When all values in a data set occur with the same frequency, the data set is said to have NO MODE.
39. Example 1: Scores of 10 students in Section A, Section B, and Section C.
Scores of Section A   Scores of Section B   Scores of Section C
25                    25                    25
24                    24                    25
24                    24                    25
20                    20                    22
20                    18                    21
20                    18                    21
16                    17                    21
12                    10                    18
10                    9                     18
7                     7                     18
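One way to find the mode or modes of each section is to count how often each score occurs. The sketch below uses Python's `collections.Counter`; the function name `modes` and the convention of returning an empty list for "no mode" are choices made for this illustration.

```python
from collections import Counter

def modes(values):
    counts = Counter(values)
    top = max(counts.values())
    # All distinct values equally frequent (and more than one value): no mode.
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == top)

section_a = [25, 24, 24, 20, 20, 20, 16, 12, 10, 7]
section_b = [25, 24, 24, 20, 18, 18, 17, 10, 9, 7]
section_c = [25, 25, 25, 22, 21, 21, 21, 18, 18, 18]

print(modes(section_a))   # one mode: unimodal
print(modes(section_b))   # two modes: bimodal
print(modes(section_c))   # three modes: trimodal
```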
40. MODE FOR GROUPED DATA
Formula:
Mode = $LB + \left(\frac{d_1}{d_1 + d_2}\right) c.i$
Where: LB = lower boundary of the modal class
Modal class (MC) = the category containing the highest frequency
$d_1$ = difference between the frequency of the modal class and the frequency above it, when the scores are arranged from lowest to highest
$d_2$ = difference between the frequency of the modal class and the frequency below it, when the scores are arranged from lowest to highest
c.i = size of the class interval
41. Example 2: The scores of 40 students in a 60-item science class test are tabulated below.
X       f   cf<
10-14   5   5
15-19   2   7
20-24   3   10
25-29   5   15
30-34   2   17
35-39   9   26
40-44   6   32
45-49   3   35
50-54   5   40
n = 40
42. Solution
Formula:
Mode = $LB + \left(\frac{d_1}{d_1 + d_2}\right) c.i$
Where: LB = lower boundary of the modal class
Modal class (MC) = the category containing the highest frequency
$d_1$ = difference between the frequency of the modal class and the frequency above it, when the scores are arranged from lowest to highest
$d_2$ = difference between the frequency of the modal class and the frequency below it, when the scores are arranged from lowest to highest
c.i = size of the class interval
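Applying the formula to the science class table can be sketched in Python. The sketch assumes that "the frequency above it" means the class listed just before the modal class (the next lower class) and "the frequency below it" means the class listed just after; it also assumes whole-number limits and a single modal class. The name `grouped_mode` is hypothetical.

```python
def grouped_mode(classes):
    """classes: list of (lower_limit, upper_limit, frequency), lowest class first."""
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))                       # modal class position
    prev_f = freqs[i - 1] if i > 0 else 0             # class listed before it
    next_f = freqs[i + 1] if i < len(freqs) - 1 else 0  # class listed after it
    d1 = freqs[i] - prev_f
    d2 = freqs[i] - next_f
    lb = classes[i][0] - 0.5                          # lower boundary
    class_size = classes[1][0] - classes[0][0]        # c.i
    return lb + d1 / (d1 + d2) * class_size

science_scores = [
    (10, 14, 5), (15, 19, 2), (20, 24, 3), (25, 29, 5), (30, 34, 2),
    (35, 39, 9), (40, 44, 6), (45, 49, 3), (50, 54, 5),
]

# Modal class 35-39: LB = 34.5, d1 = 9 - 2 = 7, d2 = 9 - 6 = 3, c.i = 5
print(round(grouped_mode(science_scores), 2))
```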
44. Properties of the Mode
1.It can be used when the data are qualitative
as well as quantitative
2.It may not be unique
3.It is not affected by extreme values
4.It may not exist
45. When to Use the Mode
1.When the “typical” value is desired
2.When the data set is measured on a
nominal scale.
50. Quartiles for Ungrouped data
$Q_k = \frac{k(N+1)}{4}$
Where: Qk = quartile
N = population
k = quartile location
51. Example 1: Find the first, second and third quartiles of the ages of 9 middle-management employees of a certain company. The ages are 53, 45, 59, 48, 54, 46, 51, 58 and 55.
Step 1. Arrange the data in order
45, 46, 48, 51, 53, 54, 55, 58, 59
Step 2. Select the positions of the first, second and third quartiles using the formula.
$Q_1 = \frac{1(N+1)}{4} = \frac{1(9+1)}{4} = \frac{10}{4} = 2.5$
$Q_2 = \frac{2(N+1)}{4} = \frac{2(9+1)}{4} = \frac{2(10)}{4} = 5$
$Q_3 = \frac{3(N+1)}{4} = \frac{3(9+1)}{4} = \frac{3(10)}{4} = 7.5$
52. Step 3. Identify the first, second, and third quartiles in the data set
45, 46, 48, 51, 53, 54, 55, 58, 59
Positions: 2.5th, 5th, 7.5th
$Q_1 = \frac{46+48}{2} = \frac{94}{2} = 47$
$Q_2 = 53$ (the 5th ranked value)
$Q_3 = \frac{55+58}{2} = \frac{113}{2} = 56.5$
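The position formula and the averaging step can be sketched in Python. The sketch uses linear interpolation between neighbouring ranked values, which coincides with averaging the two neighbours when the position ends in .5 as it does here; that generalisation and the function name `quartile` are assumptions, and positions outside the data are not handled.

```python
def quartile(values, k):
    ordered = sorted(values)
    position = k * (len(ordered) + 1) / 4     # 1-based rank, Qk = k(N + 1) / 4
    lower_rank = int(position)
    fraction = position - lower_rank
    lower_value = ordered[lower_rank - 1]
    if fraction == 0:
        return lower_value                    # position falls on a ranked value
    upper_value = ordered[lower_rank]
    return lower_value + fraction * (upper_value - lower_value)

ages = [53, 45, 59, 48, 54, 46, 51, 58, 55]
print(quartile(ages, 1), quartile(ages, 2), quartile(ages, 3))
```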
53. Quartiles for Grouped data
$Q_k = LB + \left(\frac{\frac{kN}{4} - cf}{f}\right) i$
Where: Qk = quartile
N = population
k = quartile location
LB = lower boundary of the quartile class
f = frequency of the quartile class
cf = cumulative frequency before the quartile class
i = class interval
54. Example 2: Determine Q1, and Q2 of the frequency distribution on the
ages of 50 people taking travel tours.
Class Limit Frequency
18-26 3
27-35 5
36-44 9
45-53 14
54-62 11
63-71 6
72-80 2
55. Steps
1. Construct a cumulative frequency column in the table
Class Limit Frequency cf<
18-26 3 3
27-35 5 8
36-44 9 17
45-53 14 31
54-62 11 42
63-71 6 48
72-80 2 50
n=50
56. Steps
2. Determine the Q1 class
$Q_1 \text{ (rank value)} = \frac{N}{4} = \frac{50}{4} = 12.5$
3. Identify the Q1 class by locating the 12.5th ranked value in the table
4. Determine the values of LB, cf, f, and c.i
LB = 35.5   c.i = 9
cf = 8   f = 9
58. Steps
5. Apply the same procedure to obtain the value of Q2
Locate the rank of Q2: $\frac{2N}{4} = \frac{2(50)}{4} = 25$
$Q_k = LB + \left(\frac{\frac{2N}{4} - cf}{f}\right) i$
$Q_2 = 44.5 + \left(\frac{\frac{2(50)}{4} - 17}{14}\right)(9)$
$Q_2 = 44.5 + \left(\frac{25 - 17}{14}\right)(9)$
Q2 = 44.5 + 5.14
Q2 = 49.64
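The steps for grouped quartiles (build the cumulative frequency column, locate the quartile class, substitute into the formula) can be sketched in Python for the travel tour table. Whole-number class limits and equal class sizes are assumed, and the helper name `quartile` is illustrative.

```python
from itertools import accumulate

# (lower limit, upper limit, frequency) for the ages of 50 people
classes = [
    (18, 26, 3), (27, 35, 5), (36, 44, 9), (45, 53, 14),
    (54, 62, 11), (63, 71, 6), (72, 80, 2),
]

freqs = [f for _, _, f in classes]
cum_freqs = list(accumulate(freqs))           # the cf< column
n = cum_freqs[-1]
class_size = classes[1][0] - classes[0][0]    # i = 9

def quartile(k):
    rank = k * n / 4
    # First class whose cumulative frequency reaches the rank.
    i = next(idx for idx, cf in enumerate(cum_freqs) if cf >= rank)
    lb = classes[i][0] - 0.5
    cf_before = cum_freqs[i - 1] if i > 0 else 0
    return lb + (rank - cf_before) / freqs[i] * class_size

print(cum_freqs)
print(round(quartile(1), 2), round(quartile(2), 2))
```

With LB = 35.5, cf = 8 and f = 9 for the Q1 class, the same formula gives Q1 = 35.5 + ((12.5 − 8)/9)(9) = 40.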
59. Deciles for Grouped data
$D_k = LB + \left(\frac{\frac{kN}{10} - cf}{f}\right) c.i$
Where: Dk = decile
N = population
k = decile location
LB = lower boundary of the decile class
f = frequency of the decile class
cf = cumulative frequency before the decile class
c.i = class interval
60. Example 2: Determine D7 of the frequency distribution on the ages of
50 people taking travel tours.
Class Limit Frequency cf<
18-26 3 3
27-35 5 8
36-44 9 17
45-53 14 31
54-62 11 42
63-71 6 48
72-80 2 50
n=50
62. Percentile for Grouped data
$P_k = LB + \left(\frac{\frac{kN}{100} - cf}{f}\right) c.i$
Where: Pk = percentile
N = population
k = percentile location
LB = lower boundary of the percentile class
f = frequency of the percentile class
cf = cumulative frequency before the percentile class
c.i = class interval
63. Example 3: Determine P22 of the frequency distribution on the ages of
50 people taking travel tours.
Class Limit Frequency cf<
18-26 3 3
27-35 5 8
36-44 9 17
45-53 14 31
54-62 11 42
63-71 6 48
72-80 2 50
n=50
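Deciles and percentiles for grouped data follow the same pattern and differ only in the divisor (10 or 100). A single hedged sketch can therefore cover both D7 and P22; the function name `grouped_fractile` and its `parts` parameter are assumptions introduced here, and whole-number class limits with equal class sizes are assumed.

```python
def grouped_fractile(classes, k, parts):
    """k-th fractile when the data are divided into `parts` equal parts
    (parts=10 for deciles, parts=100 for percentiles)."""
    n = sum(f for _, _, f in classes)
    rank = k * n / parts
    class_size = classes[1][0] - classes[0][0]
    cumulative = 0                              # cf before the current class
    for lower, upper, f in classes:
        if cumulative + f >= rank:              # fractile class found
            return (lower - 0.5) + (rank - cumulative) / f * class_size
        cumulative += f

ages = [
    (18, 26, 3), (27, 35, 5), (36, 44, 9), (45, 53, 14),
    (54, 62, 11), (63, 71, 6), (72, 80, 2),
]

d7 = grouped_fractile(ages, 7, 10)      # rank 35 falls in class 54-62
p22 = grouped_fractile(ages, 22, 100)   # rank 11 falls in class 36-44
print(round(d7, 2), round(p22, 2))
```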
66. MEASURES OF VARIATION
A measure of variation is a single value that is used to describe the spread of the scores in a distribution.
67. MEASURES OF VARIATION
RANGE = the difference between the highest value and the lowest value in the data set.
AVERAGE DEVIATION = the average of the absolute differences between each element and a given point. Typically the point from which the deviation is measured is a measure of central tendency. It is a summary statistic of statistical dispersion or variability. It is also called the mean absolute deviation.
68. Range for Ungrouped Data
R = HS – LS
where:
R = range value
HS = highest score
LS = lowest score
69. Example 1: Find the range of the two groups of score distribution
Group A   Group B
10        15
12        16
15        16
17        17
25        17
26        23
28        25
30        26
35        30
$R_A$ = HS – LS = 35 – 10 = 25
$R_B$ = HS – LS = 30 – 15 = 15
Analysis
The range of Group A (25) is greater than the range of Group B (15). This implies that the scores in Group A are more spread out than the scores in Group B; equivalently, the scores in Group B are less scattered than the scores in Group A.
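For completeness, the two ranges can be checked with a very short Python sketch using the built-in `max` and `min` functions.

```python
group_a = [10, 12, 15, 17, 25, 26, 28, 30, 35]
group_b = [15, 16, 16, 17, 17, 23, 25, 26, 30]

range_a = max(group_a) - min(group_a)   # R = HS - LS
range_b = max(group_b) - min(group_b)

print(range_a, range_b)
```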
70. Range for Grouped Data
$R = HS_{UB} - LS_{LB}$
where:
R = range value
$HS_{UB}$ = upper boundary of the highest score
$LS_{LB}$ = lower boundary of the lowest score
71. Example 2: Find the value of range of the scores of 50
students in mathematics achievement test.
X f
25-32 3
33-40 7
41-48 5
49-56 4
57-64 12
65-72 6
73-80 8
81-88 3
89-97 2
n=50
$R = HS_{UB} - LS_{LB}$
R = 97.5 – 24.5
R = 73
72. Properties of Range
1. It is quick and easy to understand
2. It is a rough estimation of variation
3. It is easily affected by the extreme scores
Interpretation of Range value
When the range value is large, the scores in the distribution are more dispersed,
widespread or heterogeneous. On the other hand, when the range value is small the scores
in the distribution are less dispersed, less scattered, or homogeneous.
74. Mean Deviation for Ungrouped
Data
$MD = \frac{\sum |x - \bar{x}|}{n}$
Where: MD = mean deviation value
x = individual score
µ = population mean
$\bar{x}$ = sample mean
n = number of cases
75. Example 1: Find the mean deviation of the scores of 10 students in a Mathematics test. Given the scores: 35, 30, 26, 24, 20, 18, 18, 16, 15, 10
Step 1: Compute the mean of the data set.
Step 2: Subtract the mean from each value in the data set.
Step 3: Get the absolute values of $X - \bar{X}$, then get their sum.
Step 4: Solve for the mean deviation using the formula.
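The four steps can be followed in Python for the ten scores given. This is a sketch only; the numeric results come from running the steps on the listed scores and the variable names are illustrative.

```python
scores = [35, 30, 26, 24, 20, 18, 18, 16, 15, 10]
n = len(scores)

mean = sum(scores) / n                              # Step 1: mean
deviations = [x - mean for x in scores]             # Step 2: x - mean
abs_sum = sum(abs(d) for d in deviations)           # Step 3: sum of |x - mean|
mean_deviation = abs_sum / n                        # Step 4: MD

print(round(mean, 2), round(abs_sum, 2), round(mean_deviation, 2))
```

For these scores the mean is 21.2, the sum of the absolute deviations is 60.4, and the mean deviation is 6.04.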
77. Mean Deviation for Grouped Data
$MD = \frac{\sum f |x_m - \bar{x}|}{n}$
Where: MD = mean deviation value
f = frequency
$x_m$ = class mark or midpoint of each category
$\bar{x}$ = mean value
n = number of cases
78. Steps in solving Mean Deviation for Grouped Data
1. Solve for the value of mean
2. Subtract the mean value from each midpoint or class mark
3. Take the absolute value of each difference
4. Multiply the absolute value and the corresponding class frequency
5. Find the sum of the results in step 4
6. Solve for the mean deviation using the formula for grouped data
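The six steps can be sketched in Python. Because no data table accompanies these steps, the sketch reuses the science class frequency table from the grouped mean example; that choice of data, and the name `grouped_mean_deviation`, are assumptions made purely for illustration.

```python
def grouped_mean_deviation(classes):
    """classes: list of (lower_limit, upper_limit, frequency)."""
    n = sum(f for _, _, f in classes)
    marks = [(ll + ul) / 2 for ll, ul, _ in classes]
    freqs = [f for _, _, f in classes]
    mean = sum(f * xm for f, xm in zip(freqs, marks)) / n         # step 1
    weighted = [f * abs(xm - mean) for f, xm in zip(freqs, marks)]  # steps 2 to 4
    return sum(weighted) / n                                      # steps 5 and 6

science_scores = [
    (10, 14, 5), (15, 19, 2), (20, 24, 3), (25, 29, 5), (30, 34, 2),
    (35, 39, 9), (40, 44, 6), (45, 49, 3), (50, 54, 5),
]

md = grouped_mean_deviation(science_scores)
print(round(md, 2))
```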
81. VARIANCE OF UNGROUPED DATA
Formula:
Population Variance ($\sigma^2$) = $\frac{\sum (x - \mu)^2}{N}$
Sample Variance ($s^2$) = $\frac{\sum (x - \bar{x})^2}{n - 1}$
82. Steps in solving variance of Ungrouped data
1. Solve for the value of mean
2. Subtract the mean value from each score
3. Square the difference between the mean and each score
4. Find the sum of the results in step 3
5. Solve for the population and sample variance using the formula for
ungrouped data
83. Example 1: Using the data below, find the variance of the scores of 10 students in a science quiz.
X    $X - \bar{X}$   $(X - \bar{X})^2$
19   4.4             19.36
17   2.4             5.76
16   1.4             1.96
16   1.4             1.96
15   0.4             0.16
14   -0.6            0.36
14   -0.6            0.36
13   -1.6            2.56
12   -2.6            6.76
10   -4.6            21.16
∑x = 146   $\sum (X - \bar{X})^2$ = 60.40
$\bar{X} = 14.6$
$\sigma^2 = \frac{\sum (x - \bar{x})^2}{N} = \frac{60.4}{10} = 6.04$
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{60.4}{10 - 1} = \frac{60.4}{9} = 6.71$
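The table and both variance values can be reproduced with a short Python sketch that follows the steps for ungrouped data; the variable names are illustrative.

```python
scores = [19, 17, 16, 16, 15, 14, 14, 13, 12, 10]
n = len(scores)

mean = sum(scores) / n                                    # 146 / 10
squared_diffs = [(x - mean) ** 2 for x in scores]         # (X - mean)^2 column
sum_squares = sum(squared_diffs)

population_variance = sum_squares / n                     # divide by N
sample_variance = sum_squares / (n - 1)                   # divide by n - 1

print(round(sum_squares, 2), round(population_variance, 2), round(sample_variance, 2))
```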
84. VARIANCE OF GROUPED DATA
Formula:
Population Variance ($\sigma^2$) = $\frac{\sum f (x_m - \mu)^2}{N}$
Sample Variance ($s^2$) = $\frac{\sum f (x_m - \bar{x})^2}{n - 1}$
85. Steps in solving Variance of Grouped Data
1. Solve for the value of mean
2. Subtract the mean value from each midpoint or classmark
3. Square the difference between the mean value and midpoint or class
mark
4. Multiply the squared difference and the corresponding class
frequency
5. Find the sum of the results in step 4
6. Solve for the population or sample variance using the formula for
grouped data
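A minimal Python sketch of these steps follows. Since no table accompanies the steps here, it reuses the science class frequency table from the grouped mean example; the data choice and the name `grouped_variance` are assumptions for illustration only.

```python
def grouped_variance(classes, sample=False):
    """classes: list of (lower_limit, upper_limit, frequency)."""
    n = sum(f for _, _, f in classes)
    marks = [(ll + ul) / 2 for ll, ul, _ in classes]
    freqs = [f for _, _, f in classes]
    mean = sum(f * xm for f, xm in zip(freqs, marks)) / n            # step 1
    # Steps 2 to 5: f * (xm - mean)^2, summed over all classes.
    sum_squares = sum(f * (xm - mean) ** 2 for f, xm in zip(freqs, marks))
    divisor = n - 1 if sample else n                                 # step 6
    return sum_squares / divisor

science_scores = [
    (10, 14, 5), (15, 19, 2), (20, 24, 3), (25, 29, 5), (30, 34, 2),
    (35, 39, 9), (40, 44, 6), (45, 49, 3), (50, 54, 5),
]

pop_var = grouped_variance(science_scores)
samp_var = grouped_variance(science_scores, sample=True)
print(round(pop_var, 2), round(samp_var, 2))
```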
91. Interpretation of Standard Deviation
1. If the value of standard deviation is large, on the average, the scores
in the distribution will be far from the mean. Therefore, the scores are
spread out around the mean value. The distribution is also known as
heterogeneous.
2. If the value of standard deviation is small, on the average, the scores
in the distribution will be close to the mean. Hence, the scores are
less dispersed or the scores in the distribution are homogeneous.