STATISTICS-E.pdf

University of La Salette
Santiago City, Inc
A Course Presentation in
STATISTICS
Melissa B. Bacena, MED, MOM

Course Content
• Basic Concepts
• Measures of Central Tendency
• Measures of Variability
• Measures of Correlation
• Non-parametric Test
• Parametric Tests of Hypothesis
• One-Way ANOVA

Introduction to
Statistics
• Definition
• The Nature of Data
• Uses of Statistics
• Methods of Sampling
• Statistics and Computers

Statistics
What is Statistics?

Definition
❖Statistics is a collection of
methods for planning
experiments, obtaining data and
then organizing, summarizing,
presenting, analyzing,
interpreting and drawing
conclusions based on the data.

collection
• Refers to the gathering of
information

Organization/presentation
• Involves summarizing data or
information in textual,
graphical, or tabular forms

analysis
• Involves describing the data
by using statistical methods
and procedures.

interpretation
• Refers to the process of
making conclusions based on
the analyzed data.

Descriptive statistics
• If you have gathered data from a survey
and have organized them in a systematic,
easy-to-read manner then you have
succeeded in applying the basic principles
of descriptive statistics.

Descriptive statistics
• Among the measurements falling under
descriptive statistics are the measures of
central tendency, measures of variability,
skewness, kurtosis, minimum, maximum,
summation and other items which help in
describing a data set.

Descriptive statistics answers
the questions such as
• How many students are interested in taking online
classes?
• Which months have the highest and the lowest
numbers of COVID-19 positive cases?
• What are the most-liked Netflix series according
to students?
• Who performed better in the entrance
examination?
• What proportion of ULS college students like
online classes?

Inferential Statistics
• is a statistical procedure used to draw
inferences for the population on the basis of
the information obtained from the sample.
With inferential statistics, you are going to
try to arrive at conclusions extending beyond
the data alone. You may use it to make
judgments of the possibility that an observed
difference between groups/data is a
dependable one or it just happened due to
chance. It is a matter of deciding between
reality and coincidence.

Inferential statistics can
answer questions such as:
• Is there a significant difference in
the academic performance of
students enrolled in an online and
modular class?
• Is there a significant difference
between the proportions of students
who are interested to take statistics

business
• A business firm collects and gathers data or information
from its everyday operation. Statistics is used to
summarize and describe those data such as the amount of
sales, expenditures, and production to enable the
management to understand and determine the status of
the firm. Data that have been organized and analyzed
provide the management baseline data to make wise
decisions pertaining to the operation of the business.

education
• Through statistical tools, a teacher can
determine the effectiveness of a
particular teaching method by analyzing
test scores obtained by their students.
Results of this study may be used to
improve teaching-learning activities.

Psychology
• Psychologists are able to
interpret meaningful aptitude
tests, IQ tests, and other
psychological tests using
statistical procedures or tools.

medicine
• Statistics is also used in
determining the effectiveness
of new drug products in
treating a particular type of
disease.

agriculture
• Through statistical tools, an agriculturist
can determine the effectiveness of a
new fertilizer in the growth of plants or
crops. Moreover, crop production and
yield can be better analyzed through the
use of statistical methods.

entertainment
• The most popular actresses and actors can be
determined by using surveys. Ratings given by the
members of the board of judges in a beauty
contest are statistically analyzed. Interviews
are used to determine the most widely viewed
television show. The top-grossing movies for
the year are reported based on statistical
records of movie houses. All of these activities
involve the use of statistics.

Everyday life
• The number of cars passing through the
streets or a highway is recorded to enable
traffic enforcers to manage traffic efficiently.
Even the number of pedestrians crossing the
street, the number of people entering a
warehouse of a department store, and the
number of people engaged in video games
involve the use of statistics. In short,
statistics is found and used in everyday life.

Population vs. Sample
❑A population is the complete
collection of elements (scores,
people, measurements, and so on)
to be studied.
❑A sample is a sub-collection of
elements drawn from a population.

Parameter vs. Statistic
➢A parameter is a numerical
measurement describing some
characteristic of a population.
➢A statistic is a numerical
measurement describing some
characteristic of a sample.

Qualitative vs.
Quantitative Data
❖Qualitative (or categorical or
attribute) data can be separated into
different categories that are
distinguished by some non-numerical
characteristic.
❖Quantitative data consist of numbers
representing counts or measurements.

Discrete vs. Continuous Data
❖Discrete data result from either a
finite number of possible values or a
countable number of possible values.
(That is, the number of possible
values is 0, 1, 2 or more)
❖Continuous data result from
infinitely many possible values that
can be associated with points on a
continuous scale in such a way that
there are no gaps or interruptions.

Nominal Level of
Measurement
➢The nominal level of measurement is
characterized by data that consists
of names, labels, or categories only.
The data cannot be arranged in an
ordering scheme.
➢Example: gender, civil status,
nationality, religion, etc.

Ordinal Level of
Measurement
❖The ordinal level of measurement
involves data that may be arranged in
some order, but differences between
data values either cannot be
determined or are meaningless.
❖Example: good, better or best
speakers; 1 star, 2 star, 3 star movie;
employee rank

Interval Level of
Measurement
❖The interval level of measurement is
like the ordinal level, with the
additional property that meaningful
amounts of differences between data
can be determined. However, there
is no inherent (natural) zero starting
point.
❖Example: body temperature, year
(1955, 1843, 1776, 1123, etc.)

Ratio Level of
Measurement
❑The ratio level of measurement is
the interval modified to include the
inherent zero starting point. For
values at this level, differences and
ratios are meaningful.
❑Example: weights of plastic, lengths
of movies, distances traveled by cars

Determining Adequate
Sample Size

Definition
• Sampling may be defined as
measuring a small portion of
something and then making a general
statement about the whole thing
(Bradfield & Moredock, 1957)

Why we need sampling…
➢Sampling makes possible the study of
a large, heterogeneous population.
➢Sampling is for economy.
➢Sampling is for speed.
➢Sampling is for accuracy.
➢Sampling saves the sources of data
from being all consumed.

There are two general
types of sampling…
❑Probability Sampling
❑Non-Probability Sampling

Probability Sampling
➢The sample is a proportion (a certain
percent) of the population and such
sample is selected from the
population by means of some
systematic way in which every
element of the population has a
chance of being included in the
sample.

Non-Probability Sampling
➢The sample is not a proportion of the
population and there is no system in
selecting the sample. The selection is
dependent on the situation from
which the sample is taken.

Types of Non-Probability
Sampling are…
➢Accidental Sampling
➢Quota Sampling
➢Convenience Sampling

Accidental Sampling
➢The sample elements are
selected by chance.
➢Example: the researcher stands
on a street corner and interviews
everyone who passes by.

Quota Sampling
➢Specified number of
elements of certain types
are included in the sample.
➢Example: the number of
viewers to a TV show

Convenience Sampling
➢A process of picking out
elements to constitute a
sample in the most convenient
and fastest way.
➢Example: samples to get
reactions to hot and
controversial issues.

Types of Probability
Sampling are…
➢Pure Random Sampling
➢Systematic Sampling
➢Stratified Sampling
➢Purposive Sampling (strictly a non-probability method, since respondents are chosen by judgment)
➢Cluster Sampling

RANDOM SAMPLING
❖Random Sampling is a sampling technique
where members of the population are
selected in such a way that each member
has an equal chance of being selected.
❖It is also called the lottery or raffle type
of sampling. A table of random numbers
may also be used.

Stratified Sampling
➢With stratified sampling, the
population is subdivided into at least
two different subpopulations (or
strata) that share the same
characteristics (such as gender), and
then a sample is drawn from each
stratum.

Systematic Sampling
❖In systematic sampling, one chooses
a starting point and then selects every
kth (such as every 5th) element in
the population.

Purposive Sampling
❖In purposive sampling, the respondents are
chosen on the basis of their knowledge of
the information desired.
❖Ex: If research is to be conducted on
the history of a place, the old people of
the place must be consulted and included
in the sample.

Cluster Sampling
❑In cluster sampling, the population
area is divided into sections (or
clusters), a few of those sections are
randomly selected, and then all the
members from the selected sections
are chosen as samples.
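
The random, systematic, stratified, and cluster procedures described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the population of 100 student ID numbers, the two strata, and the four clusters are all hypothetical and do not come from the course material.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Lottery-style draw: every member has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, k, start=0):
    """Choose a starting point, then take every k-th element."""
    return population[start::k]

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a separate random sample from each stratum (subpopulation)."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick a few clusters, then keep ALL members of those clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

# Hypothetical population: student ID numbers 1 to 100
students = list(range(1, 101))
strata = {"first_half": students[:50], "second_half": students[50:]}
clusters = {"A": students[:25], "B": students[25:50],
            "C": students[50:75], "D": students[75:]}

print(simple_random_sample(students, 5, seed=1))
print(systematic_sample(students, k=5, start=2))   # begins 3, 8, 13, ...
print(stratified_sample(strata, 3, seed=1))
print(len(cluster_sample(clusters, 2, seed=1)))    # 50 members from 2 clusters
```

The seed argument is there only to make the draws repeatable for checking; an actual study would normally leave it unset.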

1. The Direct or Interview Method
2. The Indirect or Questionnaire Method
3. The Registration Method
4. The Experimental Method

Characteristics of a good
questionnaire:
1. It should contain a short letter to the respondents
which includes:
- the purpose of the study
- an assurance of confidentiality
- the name of the researcher
2. There is a title for the questionnaire.
3. It is designed to achieve the objectives.
4. The directions are clear.
5. It is designed for easy tabulation.
6. It avoids the use of double negatives.
7. It avoids double-barreled questions.

Types of Questionnaire
• Open
• Closed
• Combination
Types of questions
• Multiple choice
• Ranking
• Scales
• Open-ended

RESEARCH DESIGN
DESCRIPTIVE RESEARCH DESIGN
-Describe a given state of affairs as fully and carefully
as possible.
-summarizes and describes the characteristics (abilities,
opinions, perceptions, etc.) of individuals or groups which
are under study.

1. DESCRIPTIVE - COMPARATIVE
-this method involves comparing 2 or more known
groups which differ in some characteristics or
attributes to determine possible differences in
views regarding a particular issue or topic of
interest.
Examples of titles which fall under descriptive-
comparative:
*The Level of Teaching Expertise of Male and Female
Teachers in Magaling University.

*Student Services in XYZ University: An
Assessment
*The Degree of Satisfaction of Freshmen and
Senior Students Regarding Their Teachers’
Teaching Expertise

DESCRIPTIVE- CORRELATIONAL
This method seeks to investigate whether a
relationship exists between two or more variables.
Examples:
Self-confidence in Learning Mathematics and Academic
Performance in Mathematics of Freshmen Students at
ULS.
*Degree of Learner-centeredness Among University
Teachers and the Level of Students’ Intrinsic
Motivation to Learn

Measures of Central
Tendency
• Mean
• Median
• Mode

Mean
• The most reliable and the most
sensitive measure of position.
• It is the most widely used
measure.
• It is commonly known as the
“average” although the median and
the mode are also known as
averages.

Mean:
• It comes in two different
forms:
1) Simple Mean
2) Weighted Mean

Example 1:
A study was done on 5 typical fast-food
meals in Metro Manila. The following table
shows the amount of fat, in number of
teaspoons, present in each meal. Calculate
the mean amount of fat for these 5 fast-
food meals.
Fast-food meal A B C D E
Fat (in tsp) 14 18 22 10 16

How to solve the simple
mean:
• The simple mean is obtained by
adding all the values/
observations of a certain
variable and dividing the sum by
the total number of values,
cases or observations.

Fast-food meal A B C D E
Fat (in tsp) 14 18 22 10 16
• To obtain the simple mean amount
of fat for the 5 fast-food meals:
• Mean = (14+18+22+10+16)/5
• Mean = 80/5 = 16
• This means that the mean fat
content of the 5 fast-food meals
is 16 teaspoons.
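
As a quick check of the computation above, a minimal Python sketch of the simple mean, using the fat data from Example 1 (the function and variable names are illustrative only):

```python
def simple_mean(values):
    """Sum of all observations divided by the number of observations."""
    return sum(values) / len(values)

fat_tsp = [14, 18, 22, 10, 16]   # meals A to E from Example 1
print(simple_mean(fat_tsp))      # 16.0
```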

Exercise #2: Find the simple
mean for the following set of data:
• Data A: 17, 19, 25, 14,
18, 24, 11,19
• Data B: 79, 75, 82, 84,
82, 75, 79
• Data C: 35, 32, 37, 42, 45,
33, 41, 44, 35, 38

The simple means for the
given data sets are …
• Data A: 18.38
• Data B: 79.43
• Data C: 38.20

Example 2:
• The following represents the final
grades obtained by a nursing
student one summer term:
• Anatomy (5 units) - - - 93
• Chemistry (3 units) - - - 88
• SOT 2 (2 units) - - - 89
– Find the weighted average of the
student.

To solve for the weighted average
of the student we have...
$$\text{Mean} = \frac{\sum w_i x_i}{\sum w_i}$$
$$\text{Mean} = \frac{93(5) + 88(3) + 89(2)}{10}$$
$$\text{Mean} = \frac{465 + 264 + 178}{10} = \frac{907}{10} = 90.7 \text{ (Excellent)}$$
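
The weighted average in Example 2 can be reproduced with a short Python sketch, where the grades are the values and the units are the weights (names are illustrative only):

```python
def weighted_mean(values, weights):
    """Sum of (weight x value) divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

grades = [93, 88, 89]   # Anatomy, Chemistry, SOT 2
units = [5, 3, 2]       # units serve as the weights
print(weighted_mean(grades, units))   # 90.7
```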

Example 3:
• The following represents the responses of
50 randomly chosen respondents in one
item of a research questionnaire:
• Very Strongly Agree (5) - - - 17
• Strongly Agree (4) - - - 11
• Agree (3) - - - 9
• Disagree (2) - - - 12
• Strongly Disagree (1) - - - 1
– Find the weighted response of the
respondents.

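To solve for the weighted
response we have...
$$\text{Mean} = \frac{\sum xw}{\sum w}$$
where: x = the point value on the scale
w = the weight of x (the number of respondents who chose x)
∑xw = the sum of the products of each x and its weight
∑w = the sum of the weights (the total number of respondents)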

Table of Interpretation
(5 pt. Likert Scale)
4.50 – 5.00 Very Strongly Agree
3.50 – 4.49 Strongly Agree
2.50 – 3.49 Agree
1.50 – 2.49 Disagree
1.00 – 1.49 Strongly Disagree

$$\text{Mean} = \frac{5(17) + 4(11) + 3(9) + 2(12) + 1(1)}{50}$$
$$\text{Mean} = \frac{85 + 44 + 27 + 24 + 1}{50} = \frac{181}{50} = 3.62 \text{ (Strongly Agree)}$$
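
The weighted response and its verbal interpretation can be sketched in Python as follows. The function names are hypothetical, and the cut-offs use the lower bound of each range in the Table of Interpretation, so a mean such as 4.495 falls in the lower category; that tie-handling is an assumption, since the table does not say.

```python
def likert_mean(counts):
    """counts maps each scale point to the number of respondents choosing it."""
    total = sum(counts.values())
    return sum(point * n for point, n in counts.items()) / total

def interpret(mean):
    """Verbal interpretation based on the 5-point table of interpretation."""
    if mean >= 4.50:
        return "Very Strongly Agree"
    if mean >= 3.50:
        return "Strongly Agree"
    if mean >= 2.50:
        return "Agree"
    if mean >= 1.50:
        return "Disagree"
    return "Strongly Disagree"

responses = {5: 17, 4: 11, 3: 9, 2: 12, 1: 1}   # Example 3
mean = likert_mean(responses)
print(round(mean, 2), interpret(mean))   # 3.62 Strongly Agree
```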

Table 1
Items 5 4 3 2 1
1. I think Statistics is a worthwhile,
necessary subject
15 26 9
2. I’ll need a good understanding of
Statistics for my research work
5 6 29 10
3. I would like working with
computers
28 12 10
4. I think working with computers
would be enjoyable and stimulating.
30 18 2
Item Mean Distribution on Online Readiness Survey of
Students Taking Statistics

The Median
What is the Median?

The median is . . .
• A positional measure that divides
the set of data into two equal
parts.
• It is the score/observation that is
centrally located between the
highest and the lowest observation.
• Determined by rearranging the data
into an array.

Median for Odd Sample:
$$\tilde{X} = X_{\frac{n+1}{2}}$$
Median for Even Sample:
$$\tilde{X} = \frac{X_{\frac{n}{2}} + X_{\frac{n}{2}+1}}{2}$$
where $X_k$ is the kth score in the array.

Using the data
in Example 1,
find the
median fat
content of the
5 meals.

The array for the data in Example 1 is:
10, 14, 16, 18, 22
• To obtain the median fat
content of the 5 meals we have
to use the median formula for
odd sample since n = 5.
• Median = [(n + 1)/2]th score
• Median = [(5 + 1)/2]th score
• Median = 3rd score = 16

Median for
Even Sample
What is
even?

The following are sample scores
obtained from a 75-item summative test:
(n = 12) 48, 53, 63, 65, 45, 47, 52, 48,
63, 54, 63, 55
Array: 45, 47, 48, 48, 52, 53, 54, 55, 63, 63, 63, 65
• Since n = 12 (even):
• Median = (6th score + 7th score)/2
• Median = (53 + 54)/2 = 53.5
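
Both median rules (odd and even n) can be sketched in one short Python function. The first call uses the fat data from Example 1 and the second uses the ordered array shown above; the function name is illustrative only.

```python
def median(values):
    """Middle score of the array; average of the two middle scores if n is even."""
    ordered = sorted(values)          # arrange the data into an array
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd n: the [(n + 1)/2]th score
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n

print(median([14, 18, 22, 10, 16]))   # 16
array = [45, 47, 48, 48, 52, 53, 54, 55, 63, 63, 63, 65]
print(median(array))                  # 53.5
```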

Find the median for
Exercise #2.

The mode is …
➢The most popular score.
➢The score having the highest
frequency.
➢The most frequently occurring score.
➢The least reliable measure of position
➢Determined by way of inspection.

A set of data is said to
be …
• Unimodal or monomodal if it
has only one mode.
• Example: 33, 35, 35, 38,
40, 46
• Its mode is 35.

A set of data is said to
be …
• Bimodal if it has two modes.
• Example: 33, 35, 35, 38,
40, 40, 46
• Its modes are 35 and 40.

A set of data is said to be …
• Multimodal if it has more than
two modes.
• Example: 33, 35, 35, 38, 40,
40, 46, 46, 51, 58, 58, 60
• Its modes are 35, 40, 46 and
58.
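
Finding the mode by inspection amounts to counting how often each score occurs. A minimal Python sketch using the three example data sets above (function names are illustrative; note that if every score occurs equally often, this sketch reports all of them as modes):

```python
from collections import Counter

def modes(values):
    """Return every score that has the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

def modality(values):
    """Classify the data set by its number of modes."""
    k = len(modes(values))
    return {1: "unimodal", 2: "bimodal"}.get(k, "multimodal")

print(modes([33, 35, 35, 38, 40, 46]))        # [35]
print(modes([33, 35, 35, 38, 40, 40, 46]))    # [35, 40]
multi = [33, 35, 35, 38, 40, 40, 46, 46, 51, 58, 58, 60]
print(modes(multi), modality(multi))          # [35, 40, 46, 58] multimodal
```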

Assignment #1: Find the mean,
median and the mode of the following:
1. 85, 82, 83, 88, 85, 87, 89,
90
2. 12, 14, 20, 19, 23, 22, 28
3. 24, 34, 27, 27, 34, 24
4. 102, 100, 111, 100, 106, 102
5. 75, 86, 78, 84, 88, 86, 84,
85, 81, 84, 80

THREE METHODS OF PRESENTING DATA
1. TEXTUAL PRESENTATION:
-The data gathered are presented in paragraph
form
- Data are written and read
-it is a combination of texts and figures

2. TABULAR PRESENTATION
- Data collected are presented in the form of rows and columns
-data presented can be easily understood and easily be used for
comparison and facilitate analysis of relationship between and
among the variables
MAJOR PARTS OF STATISTICAL TABLE
1. Table Heading – consists of table number and title
2. Stubs – classification or categories are found at the left side of the
body of the table.
3. Box head – the top of the column
- It identifies what are contained in the column

-included in the box head are the stub head, the
master caption and the column captions
4. Body – the main part of the table
This contains the substance or the figures on one’s
data
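[Figure: layout of a statistical table. The table heading (table number and title) is at the top; the box head contains the stub head, the master caption, and the column captions; row captions run down the left side; the body holds the figures, ending with a total row.]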

Other ways of
presenting
data are . . .

BAR CHART
[Figure: bar chart of East, West, and North values for the 1st to 4th quarters]

LINE GRAPH
[Figure: line graph of East, West, and North values for the 1st to 4th quarters]

PIE CHART
[Figure: pie chart with slices for the 1st, 2nd, 3rd, and 4th quarters]

Scatter Plot
[Figure: scatter plot of East, West, and North values]

What is a Frequency
Distribution?
• A Frequency
Distribution is a tabular
representation of data
consisting of intervals
and their respective
frequencies.

How to construct a
Frequency Distribution:
• Determine the range: R = HO - LO
(highest observation minus lowest observation).
• Determine the class size (c) using
the formula c = (R + 1) / #CI, where #CI is
the number of class intervals.
• Construct the intervals.
• Tally the data and determine the
frequency for each interval.

The class intervals in a
frequency distribution must:
• Not overlap.
• Be exhaustive, so that every data
value can be tallied in one of the
intervals.
• Have a uniform class size.
• Number not fewer than 7 and not
more than 15.

Data:
77 77 85 72 69 80 75 69 80 64
72 68 48 60 44 87 52 74 72 76
63 81 56 71 54 76 81 78 55 74
82 59 40 73 61 80 58 75 63 48
46 51 80 42 65 54 79 57 72 67

Frequency Distribution
Class Interval    f     %      CF<   Cum. %
82-87             3     6%     50    100%
76-81             12    24%    47    94%
70-75             10    20%    35    70%
64-69             6     12%    25    50%
58-63             6     12%    19    38%
52-57             6     12%    13    26%
46-51             4     8%     7     14%
40-45             3     6%     3     6%
Total             50    100%
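
The construction steps can be sketched in Python using the 50 data values listed above. This is a minimal illustration: the function name is hypothetical, 8 class intervals are assumed to match the table, and the class size is rounded up when (R + 1) / #CI is not a whole number, which is a design choice of this sketch.

```python
import math

data = [77, 77, 85, 72, 69, 80, 75, 69, 80, 64,
        72, 68, 48, 60, 44, 87, 52, 74, 72, 76,
        63, 81, 56, 71, 54, 76, 81, 78, 55, 74,
        82, 59, 40, 73, 61, 80, 58, 75, 63, 48,
        46, 51, 80, 42, 65, 54, 79, 57, 72, 67]

def frequency_distribution(values, n_classes):
    """Return (lower, upper, f, cf) rows, lowest class interval first."""
    low, high = min(values), max(values)
    size = math.ceil((high - low + 1) / n_classes)   # c = (R + 1) / #CI
    rows, cumulative = [], 0
    for i in range(n_classes):
        lower = low + i * size
        upper = lower + size - 1
        f = sum(lower <= v <= upper for v in values)  # tally
        cumulative += f
        rows.append((lower, upper, f, cumulative))
    return rows

# Print highest class first, as in the table above
for lower, upper, f, cf in reversed(frequency_distribution(data, 8)):
    print(f"{lower}-{upper}  f={f:2d}  {100 * f / len(data):.0f}%  CF<={cf}")
```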

Class Interval    f     Lb      CF<
82-87             3     81.5    50
76-81             12    75.5    47
70-75             10    69.5    35
64-69             6     63.5    25
58-63             6     57.5    19
52-57             6     51.5    13
46-51             4     45.5    7
40-45             3     39.5    3
Total             50
$$\text{Median} = 63.5 + 6\left(\frac{25 - 19}{6}\right)$$
Median = 69.5
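
The computation above follows the usual grouped-data rule, Median = Lb + c × (N/2 − cumulative frequency below the median class) / (frequency of the median class). The slide shows only the substituted numbers, so the general form and the Python sketch below are a reading of that computation, with hypothetical names and integer class limits assumed.

```python
def grouped_median(rows, size):
    """rows: (lower_limit, frequency) pairs, lowest class first."""
    n = sum(f for _, f in rows)
    below = 0                                  # cumulative frequency so far
    for lower, f in rows:
        if below + f >= n / 2:                 # first class reaching N/2
            lb = lower - 0.5                   # lower class boundary
            return lb + size * (n / 2 - below) / f
        below += f

rows = [(40, 3), (46, 4), (52, 6), (58, 6),
        (64, 6), (70, 10), (76, 12), (82, 3)]
print(grouped_median(rows, size=6))            # 69.5
```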

$$\text{Mode} = Lb_{o} + \frac{C\,d_1}{d_1 + d_2}$$
Where: Lb_o is the lower boundary of the modal class
C is the class size
d1 is the difference between the frequency of the modal
class and the frequency of the class interval
just below the modal class
d2 is the difference between the frequency of the modal
class and the frequency of the class interval
just above the modal class
Modal Class is the class interval with the highest frequency

Class Interval    f     Lb      CF<
82-87             3     81.5    50
76-81             12    75.5    47
70-75             10    69.5    35
64-69             6     63.5    25
58-63             6     57.5    19
52-57             6     51.5    13
46-51             4     45.5    7
40-45             3     39.5    3
Total             50
$$\text{Mode} = 75.5 + 6\left(\frac{2}{2 + 9}\right)$$
Mode = 76.59
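
A minimal Python sketch of the grouped-data mode formula, applied to the same frequency distribution. The names are illustrative; integer class limits are assumed, a missing neighbouring class is treated as having frequency 0, and if several classes tie for the highest frequency only the lowest one is used.

```python
def grouped_mode(rows, size):
    """rows: (lower_limit, frequency) pairs, lowest class first."""
    freqs = [f for _, f in rows]
    i = freqs.index(max(freqs))                      # modal class position
    below = freqs[i - 1] if i > 0 else 0             # class just below
    above = freqs[i + 1] if i < len(freqs) - 1 else 0  # class just above
    d1 = freqs[i] - below
    d2 = freqs[i] - above
    lb = rows[i][0] - 0.5                            # lower class boundary
    return lb + size * d1 / (d1 + d2)

rows = [(40, 3), (46, 4), (52, 6), (58, 6),
        (64, 6), (70, 10), (76, 12), (82, 3)]
print(round(grouped_mode(rows, size=6), 2))          # 76.59
```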

Savings     Frequency
94-105      4
82-93       6
70-81       9
58-69       5
46-57       6
34-45       6
22-33       6
10-21       8
N = 50
The table shows the weekly savings of 50
students at University of La Salette from
their allowances.
Compute their mean savings, median, and mode.

Savings     f     x       <cf    fx
94-105      4     99.5    50     398
82-93       6     87.5    46     525
70-81       9     75.5    40     679.5
58-69       5     63.5    31     317.5
46-57       6     51.5    26     309
34-45       6     39.5    20     237
22-33       6     27.5    14     165
10-21       8     15.5    8      124
N = 50                           2755
The table shows the weekly savings of 50
students at University of La Salette from
their allowances.
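
The fx column leads directly to the grouped mean, the sum of fx divided by N. The short Python sketch below carries out that step; it assumes the fifth class interval is 46-57, which is what its listed midpoint of 51.5 implies, and the names are illustrative only.

```python
# (lower limit, upper limit, frequency) for each class interval
classes = [(94, 105, 4), (82, 93, 6), (70, 81, 9), (58, 69, 5),
           (46, 57, 6), (34, 45, 6), (22, 33, 6), (10, 21, 8)]

def grouped_mean(classes):
    """Sum of (frequency x class midpoint) divided by total frequency."""
    n = sum(f for _, _, f in classes)
    total_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    return total_fx / n

print(grouped_mean(classes))   # 55.1
```

Here the sum of fx is 2755 and N is 50, so the mean weekly savings is 2755/50 = 55.1.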

Uses of the Measures
of Central Tendency

The Mean is used…
✓ For interval and ratio measurements
✓ When there are no extreme values in a
distribution since it is easily affected by
extremely high or extremely low scores
✓ When higher statistical computations are
wanted
✓ When the greatest reliability of the
measure of central tendency is wanted
since its computations include all the given
values

The Median is used…
✓ For ordinal and ranked measurements
✓ When there are extreme values, thus the
distribution is markedly skewed
✓ For an open-end distribution; that is, when the
lowest or the highest class interval, or both,
are open-ended (e.g., 50 and below or 100 and
above)
✓ When one desires to know whether the
cases fall within the upper half or the
lower half of a distribution.

The Mode is used…
✓For nominal and categorical data
✓When a rough or quick estimate of a
central value is wanted
✓When the most popular or the most
typical case or value in a distribution
is wanted

Limitations of the
Measures of Central
Tendency

The Limitations of the Mean…
✓ It is the most widely used average, since it
is the most familiar. However, it is often
misused. It cannot be used if the
clustering of values or items is not
substantial.
✓ If the given values do not tend to cluster
around a central value, the mean is a poor
measure of central location.
✓ It is easily affected by extremely large or
small values. One small value can easily pull
down the mean.

The Limitations of the Mean…
✓ The mean alone cannot be used to compare
distributions since the means of 2 or more
distributions may be the same but their
other characteristics may be entirely
different. The means of distribution A
whose values are 80, 85 and 90 and
distribution B whose values are 86, 85, 84
are both 85. We cannot conclude, however,
that both distributions possess the same
characteristics since their patterns of
dispersion or variation are markedly
different despite having the same mean.

The Limitations of the Median…
✓ It is easily affected by the number of
items in a distribution.
✓ It cannot be determined if the given values
are not arranged according to magnitude.
✓ If many values are contained in a
distribution, it becomes a laborious task to
arrange them according to magnitude.
✓ Its value is not as accurate as the mean
since it is just an ordinal statistic.

The Limitations of the Mode…
✓It is seldom or rarely used since it
does not always exist.
✓Its value is just a rough estimate of
the center of concentration of a
distribution.
✓It is very unstable since its value
easily changes depending on the
approaches used in finding it.

MEASURE OF POSITION or QUANTILES
Used to describe the location of a specific piece of data in
relation to the rest of the sample data.
1. QUARTILES are number values of the variable that divide the
ranked data into quarters. Each set of data has 3 quartiles.
2. DECILES are number values of the variable that divide a set of
ranked data into 10 equal parts. Each set of data has 9 deciles.
3. PERCENTILES are number values of the variable that divide a
set of ranked data into 100 equal parts. Each set of data has 99
percentiles.
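
The counts of 3 quartiles, 9 deciles, and 99 percentiles can be seen with Python's standard `statistics.quantiles` function, applied here to the 50 scores used earlier for the frequency distribution. This is only an illustration: textbooks differ on how cut points between two scores are interpolated, so values computed by hand with another rule may differ slightly from the library's default method.

```python
import statistics

scores = [77, 77, 85, 72, 69, 80, 75, 69, 80, 64,
          72, 68, 48, 60, 44, 87, 52, 74, 72, 76,
          63, 81, 56, 71, 54, 76, 81, 78, 55, 74,
          82, 59, 40, 73, 61, 80, 58, 75, 63, 48,
          46, 51, 80, 42, 65, 54, 79, 57, 72, 67]

quartiles = statistics.quantiles(scores, n=4)      # Q1, Q2, Q3
deciles = statistics.quantiles(scores, n=10)       # D1 ... D9
percentiles = statistics.quantiles(scores, n=100)  # P1 ... P99

print(len(quartiles), len(deciles), len(percentiles))   # 3 9 99
print(quartiles)   # the second quartile is the median
```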

Measures of Variability
• The statistical tool used to
describe the degree to
which scores/ observations
are scattered/dispersed.
• It is also used to determine
the degree of consistency/
homogeneity of scores.

Measures of Variability
✓Range
✓Mean Deviation
✓Standard Deviation
✓Variance
✓Coefficient of Variation

The following are the scores obtained by
two groups of 2nd year Education
students in Math 001:
Group A: 30, 28, 27, 25, 25, 23, 21, 20, 18, 12
Group B: 30, 20, 18, 16, 15, 15, 14, 13, 12, 12

GROUP A
X       |X - Mean|    (X - Mean)^2
30      7.1           50.41
28      5.1           26.01
27      4.1           16.81
25      2.1           4.41
25      2.1           4.41
23      0.1           0.01
21      1.9           3.61
20      2.9           8.41
18      4.9           24.01
12      10.9          118.81
Mean = 22.9    Sum = 41.2    Sum = 256.9
Range = 30 - 12 = 18
Mean Dev'n = 41.2/10
= 4.12
Standard dev'n = $\sqrt{256.9/(10-1)}$
= $\sqrt{28.54}$
= 5.34
Variance = (5.34)^2
= 28.54
CV = (5.34/22.9) × 100
= 23.32%
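
The Group A measures can be checked with a short Python sketch; the same function can be reused for Group B. The function name is illustrative, and the standard deviation and variance use the n − 1 divisor, as in the computation above. Because the code keeps the unrounded standard deviation, its coefficient of variation comes out at about 23.33% rather than the 23.32% obtained from the rounded 5.34.

```python
import statistics

def variability(values):
    """Range, mean deviation, variance, standard deviation and CV (%)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)            # divides by n - 1
    return {
        "range": max(values) - min(values),
        "mean_deviation": sum(abs(x - mean) for x in values) / len(values),
        "variance": statistics.variance(values),   # divides by n - 1
        "standard_deviation": sd,
        "cv_percent": sd / mean * 100,
    }

group_a = [30, 28, 27, 25, 25, 23, 21, 20, 18, 12]
for name, value in variability(group_a).items():
    print(f"{name}: {value:.2f}")
```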

Do the same computation
for Group B…

Problem:
• Two seemingly equally excellent BSN
students are vying for an academic
honor where only one can be
chosen to get the award. The
following are their grades used as
basis for the award:
• Franzen: 91, 90, 94, 93, 92
• Rico: 92, 92, 90, 94, 92
• Who do you think deserves to get
the award?

Guiding Principle
• The smaller the value of the
measure of variability, the more consistent,
the more homogeneous and
the less scattered are the
observations in the set of
data.
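
One way to apply the guiding principle to the two students' grades is to compare their means and sample standard deviations, as in the Python sketch below. Using the standard deviation as the deciding measure is an assumption of this sketch; the range or the coefficient of variation could be used in the same way.

```python
import statistics

grades = {
    "Franzen": [91, 90, 94, 93, 92],
    "Rico": [92, 92, 90, 94, 92],
}

for name, marks in grades.items():
    mean = statistics.mean(marks)
    sd = statistics.stdev(marks)          # sample SD, divides by n - 1
    print(name, mean, round(sd, 2))
# Franzen 92 1.58
# Rico 92 1.41

# The student with the smaller standard deviation has the more consistent grades
most_consistent = min(grades, key=lambda n: statistics.stdev(grades[n]))
print(most_consistent)                    # Rico
```

Both students have the same mean of 92, so the comparison rests on variability alone.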
