PROBABILITY AND STATISTICS
COURSE OUTLINE
I. Introduction to Statistics
II. Tabular and Graphical representation of Data
III. Measures of Central Tendencies, Locations
and Variations
IV. Measure of Dispersion and Correlation
V. Probability and Combinatorics
VI. Discrete and Continuous Distributions
VII. Hypothesis Testing
Text and References
Statistics: a simplified approach by Punsalan and
Uriarte, 1998, Rex Texbook
Probability and Statistics by Johnson, 2008,
Wiley
Counterexamples in Probability and Statistics by
Romano and Siegel, 1986, Chapman and Hall
Introduction to Statistics
Definition
1. In its plural sense, statistics is a set of
numerical data e.g. Vital statistics, monthly
sales, exchange rates, etc.
2. In its singular sense, statistics is a branch of
science that deals with the
collection, presentation, analysis and
interpretation of data.
General uses of Statistics
a. Aids in decision making by providing
comparisons of data, explaining actions that
have taken place, justifying a claim or
assertion, predicting future outcomes and
estimating unknown quantities
b. Summarizes data for public use
Examples on the role of Statistics
- In biological and medical sciences, it helps researchers
discover relationships worthy of further attention.
Ex. A doctor can use statistics to determine to what
extent an increase in blood pressure depends
upon age.
- In social sciences, it guides researchers and helps them
support theories and models that cannot stand on
rationale alone.
Ex. Empirical studies are using statistics to obtain socio-
economic profile of the middle class to form new
socio-political theories.
- In business, a company can use statistics to
forecast sales, design products, and produce
goods more efficiently.
Ex. A pharmaceutical company can apply statistical
procedures to find out if the new formula is
indeed more effective than the one being used.
- In engineering, it can be used to test properties
of various materials.
Ex. A quality controller can use statistics to
estimate the average lifetime of the products
produced by their current equipment.
Fields of Statistics
a. Statistical Methods of Applied Statistics:
1. Descriptive statistics comprises those methods concerned
with the collection, description, and analysis of a
set of data without drawing conclusions or
inferences about a larger set.
2. Inferential statistics comprises those methods concerned
with making predictions or inferences about a
larger set of data using only the information
gathered from a subset of this larger set.
b. Statistical theory of mathematical statistics-
deals with the development and exposition of
theories that serve as a basis of statistical
methods
Descriptive VS Inferential
DESCRIPTIVE
• A bowler wants to find his
bowling average for the past
12 months
• A housewife wants to
determine the average weekly
amount she spent on groceries
in the past 3 months
• A politician wants to know the
exact number of votes he
received in the last election
INFERENTIAL
• A bowler wants to estimate his
chance of winning a game based
on his current season averages
and the averages of his opponents.
• A housewife would like to predict,
based on last year’s grocery
bills, the average weekly amount
she will spend on groceries for
this year.
• A politician would like to estimate,
based on opinion polls, his
chance of winning in the
upcoming election.
Population as Differentiated from
Sample
The word population refers to groups or aggregates
of people, animals, objects, materials, happenings
or things of any form. This means that there are
populations of students, teachers, supervisors,
principals, laboratory animals, trees, manufactured
articles, birds and many others. If your interest is in
a few members of the population chosen to represent
its characteristics or traits, these members
constitute a sample. The measures of the
population are called parameters, while those of
the sample are called estimates or statistics.
The Variable
It refers to a characteristic or property whereby the
members of a group or set vary or differ from one
another. A constant, by contrast, refers to a property
whereby the members of the group do not differ from one
another.
Variables can be classified according to functional
relationship as independent or dependent. If
you treat variable y as a function of variable z, then z is
your independent variable and y is your dependent
variable. This means that the value of y, say academic
achievement, depends on the value of z.
Variables according to continuity of values.
1. Continuous variables – these are variables
whose values can fall anywhere within an interval,
including decimal values. Examples are height,
weight, length and width.
2. Discrete variables – these are variables whose
values or levels cannot take the form of a
decimal. An example is the size of a particular
family.
Variables according to scale of measurements:
1. Nominal – this refers to a property of the
members of a group defined by an operation
which allows making statements only of
equality or difference. For example,
individuals can be classified according to their
sex or skin color. Color is an example of
a nominal variable.
2. Ordinal – it is defined by an operation whereby
members of a particular group are ranked. In this
operation, we can state that one member is greater or
less than the others on a criterion, rather than saying
only that it is equal to or different from the others,
as with a nominal variable.
3. Interval – this refers to a property defined by an
operation which permits making statements of equality
of intervals, in addition to statements of sameness or
difference and greater than or less than. An interval
variable does not have a “true” zero point, although
for convenience a zero point may be assigned.
4. Ratio – is defined by the operation which
permits making statements of equality of
ratios in addition to statements of sameness
or difference, greater than or less than and
equality or inequality of differences. This
means that one level or value may be thought
of or said as double, triple or five times
another and so on.
Assignment no. 1
I. Make a list of at least 5 mathematicians or
scientists who contributed to the field of
statistics. State their contributions.
II. With your knowledge of statistics, give a real-life
situation in which statistics is applied. Expand
your answer.
III. When can a variable be considered
independent and dependent? Give an
example for your answer.
IV. Enumerate some uses of statistics. Do you
think that any science will develop without
test of the hypothesis? Why?
Examples of Scales of Measurement
1. Nominal Level
Ex. Sex: M-Male F-Female
Marital Status: 1-single 2-married 3-widowed 4-separated
2. Ordinal Level
Ex. Teaching Ratings: 1-poor 2-fair 3-good 4-excellent
3. Interval Level
Ex. IQ, temperature
4. Ratio Level
Ex. Age, no. of correct answers in exam
Data Collection Methods
1. Survey Method – questions are asked to
obtain information, either through a self-administered
questionnaire or a personal
interview.
2. Observation Method – makes possible the
recording of behavior but only at the time of
occurrence (ex. Traffic count, reactions to a
particular stimulus)
3. Experimental method – a method designed for
collecting data under controlled conditions. An
experiment is an operation where there is actual
human interference with the conditions that can
affect the variable under study.
4. Use of existing studies – for example, census data, health
statistics and weather reports.
5. Registration method – for example, car registration,
student registration, hospital admission and ticket
sales.
Tabular Representation
A frequency distribution is defined as the
arrangement of the gathered data by
categories together with their corresponding
frequencies and class marks or midpoints. Each
class frequency is the number
of observations belonging to a class interval.
Each class interval is a grouping defined by
limits called the lower and the upper limit.
The class boundaries are the exact limits of a class,
halfway between the upper limit of one class and the
lower limit of the next.
Frequency of a Nominal Data
Male and Female College students
Major in Chemistry
SEX FREQUENCY
MALE 23
FEMALE 107
TOTAL 130
Frequency of Ordinal Data
Ex. Frequency distribution of Employee Perception on
the Behavior of their Administrators
Perception Frequency
Strongly favorable 10
favorable 11
Slightly favorable 12
Slightly unfavorable 14
Unfavorable 22
Strongly unfavorable 31
total 100
Frequency Distribution Table
Definition:
1. Raw data – is the set of data in its original
form
2. Array – an arrangement of observations
according to their magnitude, either in
increasing or decreasing order.
Advantages: it is easier to detect the smallest and
largest values and to find the measures
of position.
Grouped Frequency of Interval Data
Given the following raw scores in Algebra
Examination,
47 56 42 28 56 41 56 55 59
78 50 55 57 38 62 52 66 65
72 33 34 37 47 42 68 62 54
62 68 48 56 39 77 80 62 71
57 52 60 70
1. Compute the range, R = H – L, and the number of
classes, K = 1 + 3.322 log n, where n = number of
observations and log is the base-10 logarithm.
2. Divide the range by 10 to 15 to determine the
acceptable size of the interval. Hint: most
frequency distributions have an odd number as the
size of the interval. The advantage is that the
midpoints of the intervals will be whole numbers.
3. Organize the class intervals. See to it that the
lowest interval begins with a number that is
a multiple of the interval size.
4. Tally each score in the class interval it
belongs to.
5. Count the tallies and summarize them under
column (f). Then add the frequencies to get the
total number of cases (N).
6. Determine the class boundaries, UCB and
LCB (upper and lower class boundary).
7. Compute the midpoint of each class interval and put
it in column (M).
M = (LS + HS) / 2
8. Compute the less than and greater than cumulative
frequencies and put them in columns
cf< and cf> (you can now interpret the data).
cf = cumulative frequency
9. Compute the relative frequency distribution.
This can be obtained by
RF% = (CF / TF) x 100%
where CF = class frequency
TF = total frequency
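The nine steps above can be sketched in a few lines of Python using the Algebra scores. This is only an illustration: the interval size of 5 and the starting value of 25 are assumptions chosen to agree with the class boundaries shown in the graphs later in this deck, and the column names in the dictionary are made up for the sketch.

```python
import math

# Raw Algebra examination scores from the slide above (n = 40).
scores = [
    47, 56, 42, 28, 56, 41, 56, 55, 59,
    78, 50, 55, 57, 38, 62, 52, 66, 65,
    72, 33, 34, 37, 47, 42, 68, 62, 54,
    62, 68, 48, 56, 39, 77, 80, 62, 71,
    57, 52, 60, 70,
]

n = len(scores)
data_range = max(scores) - min(scores)      # Step 1: R = H - L
k_sturges = 1 + 3.322 * math.log10(n)       # Step 1: K = 1 + 3.322 log n

# Assumption: interval size 5 (range / 10, rounded to an odd number),
# lowest interval starting at 25, a multiple of the interval size.
size = 5
start = 25

table = []
cf_less = 0
lower = start
while lower <= max(scores):
    upper = lower + size - 1
    f = sum(lower <= x <= upper for x in scores)   # Steps 4-5: tally and count
    cf_less += f
    table.append({
        "interval": (lower, upper),
        "f": f,
        "LCB": lower - 0.5,                        # Step 6: class boundaries
        "UCB": upper + 0.5,
        "M": (lower + upper) / 2,                  # Step 7: midpoint
        "cf<": cf_less,                            # Step 8: less than cf
        "RF%": 100 * f / n,                        # Step 9: relative frequency
    })
    lower += size

# Step 8: greater than cumulative frequency (cases in this class or above).
remaining = n
for row in table:
    row["cf>"] = remaining
    remaining -= row["f"]

for row in table:
    print(row)
```

Sturges' formula suggests about 6 classes here, while dividing the range by 10 gives a size of about 5 and therefore 12 classes; the sketch follows the second rule.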
Graphical Representation
The data can be graphically presented according to their
scale or level of measurement.
1. Pie chart or circle graph. The pie chart shows the
enrollment from elementary to master’s degree of a certain
university. The total population is 4350 students.
[Pie chart: elementary 34%, high school 31%, college 28%, master's degree 7%]
2. Histogram or bar graph – this graphical
representation can be used for nominal, ordinal
or interval data. For a nominal bar graph, the bars
are set apart rather than connected since the
categories are not continuous. For ordinal and
interval data, the bars should be joined to
emphasize the degree of differences.
Given the bar graph of how students rate their library.
A-strongly favorable, 90
B-favorable, 48
C-slightly favorable, 88
D-slightly unfavorable, 48
E-unfavorable, 15
F-strongly unfavorable, 25
The Histogram of Person’s Age with Frequency
of Travel
age freq RF
19-20 20 39.2%
21-22 21 41.2%
23-24 4 7.8%
25-26 4 7.8%
27-28 2 3.9%
total 51 100%
Exercises
From the previous grouped data on algebra scores,
a. Draw its histogram using the frequency in the y axis
and midpoints in the x axis.
b. Draw the line graph or frequency polygon using
frequency in the y axis and midpoints in the x axis.
c. Draw the less than and greater than ogives of the
data. An ogive is a graph of the cumulative frequencies by class
interval. For the greater than ogive, let the y axis be cf> and the x axis be
LCB; for the less than ogive, let the y axis be cf< and the x axis be UCB.
d. Plot the relative frequency using the y axis as
the relative frequency in percent value while
in the x axis the midpoints.
[Figure: histogram and line graph (frequency polygon) of the algebra scores. y axis: f, from 0 to 9; x axis: midpoint. For the first class, LCB = 24.5, midpoint = 27 and UCB = 29.5.]
[Figure: less than ogive. y axis: cf less than, from 0 to 40; x axis: UCB, from 29.5 to 84.5.]
[Figure: greater than ogive. y axis: cf greater than, from 0 to 40; x axis: LCB, from 24.5 to 79.5.]
Assignment No. 2
Given the scores in a statistics examination,
33 38 56 35 70 44 81 44 80
47 45 72 45 50 51 51 52 66
54 54 53 56 84 58 56 57 70
55 56 39 56 59 72 63 89 63
60 69 65 61 62 64 64 69 60
65 53 66 66 67 67 68 68 69
66 66 67 70 59 40 71 73 60
73 73 73 73 73 73 74 73 73
74 79 74 74 70 73 46 74 74
75 74 75 75 76 55 77 78 73
79 48 81 44 84 77 88 63 85
73
1. Construct the class intervals, frequency table,
class midpoints (use whole number
midpoints), less than and greater than
cumulative frequencies, upper and lower
boundaries and relative frequencies.
2. Plot the histogram, frequency polygon, and
ogives
3. Draw the pie chart and bar graph of the plans
of computer science students with respect
to attending a seminar. Compute for the
Relative frequency of each.
A-will not attend=45
B-probably will not attend=30
C-probably will attend=40
D-will attend=25
Measures of Centrality and Location
Mean for Ungrouped Data
X’ = ΣX / N
where X’ = the mean
ΣX = the sum of all scores/data
N = the total number of cases
Mean for Grouped Data
X’ = ΣfM / N
where X’ = the mean
M = the midpoint
fM = the product of the frequency and each
midpoint
N = total number of cases
Ex.
1. Find the mean of 10, 20, 25, 30, 30, 35, 40 and 50.
2. Find the mean of the grades of 50 students in a statistics class.
Class interval f
10-14 4
15-19 3
20-24 12
25-29 10
30-34 6
35-39 6
40-44 6
45-49 3
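As a check on the two formulas, the examples above can be worked in Python. This is a minimal sketch; the variable names are illustrative only.

```python
# Example 1: mean for ungrouped data, X' = ΣX / N
scores = [10, 20, 25, 30, 30, 35, 40, 50]
mean_ungrouped = sum(scores) / len(scores)

# Example 2: mean for grouped data, X' = ΣfM / N
# Each entry is ((lower limit, upper limit), frequency).
classes = [
    ((10, 14), 4), ((15, 19), 3), ((20, 24), 12), ((25, 29), 10),
    ((30, 34), 6), ((35, 39), 6), ((40, 44), 6), ((45, 49), 3),
]
total = sum(f for _, f in classes)                           # N
sum_fm = sum(f * (lo + hi) / 2 for (lo, hi), f in classes)   # ΣfM
mean_grouped = sum_fm / total

print(mean_ungrouped)   # 30.0
print(total, sum_fm, mean_grouped)
```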
The weighted mean. The weighted arithmetic
mean of given groups of data is the average of
the means of all groups, with each mean counted
according to its weight.
WX’ = ΣXw / N
where WX’ = the weighted mean
w = the weight of X
ΣXw = the sum of the products of each X and its weight
N = Σw = the sum of the weights
Ex.
Find the weighted mean of four groups of
means below:
Group, i 1 2 3 4
Xi 60 50 70 75
Wi 10 20 40 50
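A short sketch of the weighted mean for the four groups above, applying WX’ = ΣXw / Σw directly (the list names are illustrative):

```python
means = [60, 50, 70, 75]      # Xi, the group means
weights = [10, 20, 40, 50]    # Wi, the weight of each group

sum_xw = sum(x * w for x, w in zip(means, weights))   # ΣXw
weighted_mean = sum_xw / sum(weights)                 # ΣXw / Σw

print(sum_xw, sum(weights), round(weighted_mean, 2))
```

The unweighted average of the four means would be 63.75; the weighted mean is higher because the two largest groups have the highest means.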
Median for Ungrouped Data
The median of ungrouped data is the
centermost score in a distribution arranged in order of magnitude.
Mdn = [X_(N/2) + X_((N + 2)/2)] / 2 if N is even
Mdn = X_((1 + N)/2) if N is odd
where X_(k) is the kth score in the ordered distribution.
Ex. Find the median of the following sets of
score:
Score A: 12, 15, 19, 21, 6, 4, 2
Score B: 18, 22, 31, 12, 3, 9, 11, 8
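The two cases of the formula can be sketched as follows. The function name is hypothetical, and the scores are sorted first because the formula assumes an ordered distribution.

```python
def median_ungrouped(values):
    """Median by the position formulas above (positions are 1-based)."""
    x = sorted(values)
    n = len(x)
    if n % 2 == 1:
        return x[(n + 1) // 2 - 1]                  # X at position (1 + N)/2
    return (x[n // 2 - 1] + x[(n + 2) // 2 - 1]) / 2  # mean of the two middle scores

score_a = [12, 15, 19, 21, 6, 4, 2]
score_b = [18, 22, 31, 12, 3, 9, 11, 8]

print(median_ungrouped(score_a))   # 12
print(median_ungrouped(score_b))   # 11.5
```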
Median for Grouped Data
Procedure:
1. Compute the cumulative frequency less than.
2. Find N/2
3. Locate the class interval in which the (N/2)th case falls (the median class), and
determine the exact lower limit of this interval.
4. Apply the formula
Mdn = L + [(N/2 – F)i]/fm
where L = exact lower limit of the median class
F = the sum of all frequencies preceding the median class
fm = frequency of the median class
i = size of the class interval
N = total number of cases
Ex.
Find the median of the given frequency table.
class interval f cf<
25-29 3 3
30-34 5 8
35-39 10 18
40-44 15 33
45-49 15 48
50-54 15 63
55-59 21 84
60-64 8 92
65-69 6 98
70-74 2 100
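The procedure can be traced on the table above with a short sketch. It assumes integer class limits, so the exact lower limit is taken as the lower limit minus 0.5; the variable names are illustrative.

```python
# ((lower limit, upper limit), frequency) for each class interval.
classes = [
    ((25, 29), 3), ((30, 34), 5), ((35, 39), 10), ((40, 44), 15),
    ((45, 49), 15), ((50, 54), 15), ((55, 59), 21), ((60, 64), 8),
    ((65, 69), 6), ((70, 74), 2),
]

n = sum(f for _, f in classes)
half = n / 2                     # Step 2: N/2

cum = 0                          # F: frequencies before the current class
for (lo, hi), f in classes:
    if cum + f >= half:          # Step 3: the (N/2)th case falls in this class
        exact_lower = lo - 0.5   # L
        size = hi - lo + 1       # i
        median = exact_lower + (half - cum) * size / f   # Step 4
        break
    cum += f

print(exact_lower, cum, f, round(median, 2))
```

Here the median class is 50-54, with L = 49.5, F = 48 and fm = 15.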
Mode of Ungrouped Data
It is defined as the data value or specific score
which has the highest frequency.
Find the mode of the following data.
Data A : 10, 11, 13, 15, 17, 20
Data B: 2, 3, 4, 4, 5, 7, 8, 10
Data C:
3.5, 4.8, 5.5, 6.2, 6.2, 6.2, 7.3, 7.3, 7.3, 8.8
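A small sketch for the three data sets above. It assumes the common convention that a data set has no mode when every value occurs only once, and it returns all tied values when more than one shares the highest frequency; the function name is hypothetical.

```python
from collections import Counter

def modes(values):
    """Return the list of values with the highest frequency (empty if none repeats)."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []          # assumption: no repeated value means no mode
    return sorted(v for v, c in counts.items() if c == top)

data_a = [10, 11, 13, 15, 17, 20]
data_b = [2, 3, 4, 4, 5, 7, 8, 10]
data_c = [3.5, 4.8, 5.5, 6.2, 6.2, 6.2, 7.3, 7.3, 7.3, 8.8]

print(modes(data_a))   # []
print(modes(data_b))   # [4]
print(modes(data_c))   # [6.2, 7.3]
```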
Mode of Grouped Data
For grouped data, the mode is taken from the interval
containing the largest number of cases (the modal class).
Its midpoint gives a rough value; the formula below
interpolates within the modal class.
Mdo = L + [d1/(d1 + d2)]i
where L = exact lower limit of the modal class
d1 = the difference between the frequency of the modal class and the
frequency of the interval preceding the modal class
d2 = the difference between the frequency of the modal class and the
frequency of the interval after the modal class
i = size of the class interval
Ex.
Find the mode of the given frequency table.
class interval f cf<
25-29 3 3
30-34 5 8
35-39 10 18
40-44 15 33
45-49 15 48
50-54 15 63
55-59 21 84
60-64 8 92
65-69 6 98
70-74 2 100
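The interpolation formula can be applied to this table as follows. The sketch assumes integer class limits (exact lower limit = lower limit minus 0.5) and treats a missing neighbouring class as having frequency 0, which is an assumption not stated in the slides.

```python
classes = [
    ((25, 29), 3), ((30, 34), 5), ((35, 39), 10), ((40, 44), 15),
    ((45, 49), 15), ((50, 54), 15), ((55, 59), 21), ((60, 64), 8),
    ((65, 69), 6), ((70, 74), 2),
]

freqs = [f for _, f in classes]
m = freqs.index(max(freqs))                 # index of the modal class
(lo, hi), f_modal = classes[m]

f_before = freqs[m - 1] if m > 0 else 0     # assumption: 0 if no class before
f_after = freqs[m + 1] if m < len(freqs) - 1 else 0

d1 = f_modal - f_before
d2 = f_modal - f_after
exact_lower = lo - 0.5                      # L
size = hi - lo + 1                          # i

mode = exact_lower + d1 / (d1 + d2) * size
print((lo, hi), d1, d2, round(mode, 2))
```

The modal class is 55-59 with midpoint 57; the interpolated value is pulled slightly below the midpoint because the class before it has more cases than the class after it.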
Exercises
1. Determine the mean, median and mode of
the age of 15 students in a certain class.
15, 18, 17, 16, 19, 18, 23, 24, 18, 16, 17, 20, 21, 19
2. To qualify for a scholarship, a student should
have garnered an average score of 2.25.
Determine if a certain student with the grades
below is qualified for the scholarship.
Subject no. of units grade
A 1 2.0
B 2 3.0
C 3 1.5
D 3 1.25
E 5 2.0
3. Find the mean, median and mode of the
given grouped data.
Classes f
11-22 2
23-34 8
35-46 11
47-58 19
59-70 14
71-82 5
83-94 1
Quartiles refer to the values that divide the
distribution into four equal parts. There are 3
quartiles represented by Q1 , Q2 and Q3. The
value Q1 refers to the value in the distribution
that falls on the first one fourth of the
distribution arranged in magnitude. In the
case of Q2 or the second quartile, this value
corresponds to the median. In the case of
third quartile or Q3, this value corresponds to
three fourths of the distribution.
[Figure: the position of the quartiles in a given set of data, showing Q1 (1st quartile), Q2 (2nd quartile) and Q3 (3rd quartile) between L and H]
For grouped data, the computing formula of the
kth quartile, where k = 1, 2, 3, is given by
Qk = L + [(kn/4 – F)/fm]i
where L = lower class boundary of the kth
quartile class
F = cumulative frequency before the kth
quartile class
fm = frequency of the kth quartile class
i = size of the class interval
Exercises
Compute the value of the first and third quartile of the given
data
class interval f cf<
25-29 3 3
30-34 5 8
35-39 10 18
40-44 15 33
45-49 15 48
50-54 15 63
55-59 21 84
60-64 8 92
65-69 6 98
70-74 2 100
Decile:
If the given data is divided into ten equal
parts, then we have nine points of division
known as deciles. They are denoted by D1, D2,
D3, D4, … and D9.
Dk = L + [(kn/10 – F)/fm]i
where k = 1, 2, 3, 4, … 9
Exercises
Compute the value of the third, fifth and seventh decile of the
given data
class interval f cf<
25-29 3 3
30-34 5 8
35-39 10 18
40-44 15 33
45-49 15 48
50-54 15 63
55-59 21 84
60-64 8 92
65-69 6 98
70-74 2 100
Percentiles refer to those values that divide a
distribution into one hundred equal parts.
There are 99 percentiles represented by
P1, P2, P3, P4, P5, … and P99. When we say 55th
percentile, we are referring to the value at or
below which 55/100 of the data fall.
Pk = L + [(kn/100 – F)/fm]i
where k = 1, 2, 3, 4, 5, … 99
Exercises
Compute the value of the 30th, 55th, 68th and 88th percentile of
the given data
class interval f cf<
25-29 3 3
30-34 5 8
35-39 10 18
40-44 15 33
45-49 15 48
50-54 15 63
55-59 21 84
60-64 8 92
65-69 6 98
70-74 2 100
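The quartile, decile and percentile formulas differ only in the divisor (4, 10 or 100), so one helper can serve all three exercises. This is a sketch under the assumption of integer class limits; the function name and its parameters are hypothetical.

```python
classes = [
    ((25, 29), 3), ((30, 34), 5), ((35, 39), 10), ((40, 44), 15),
    ((45, 49), 15), ((50, 54), 15), ((55, 59), 21), ((60, 64), 8),
    ((65, 69), 6), ((70, 74), 2),
]

def fractile(classes, k, parts):
    """L + [(k*n/parts - F) / fm] * i, with parts = 4, 10 or 100."""
    n = sum(f for _, f in classes)
    target = k * n / parts
    cum = 0                                # F: cumulative frequency before the class
    for (lo, hi), f in classes:
        if cum + f >= target:
            lower_boundary = lo - 0.5      # L
            size = hi - lo + 1             # i
            return lower_boundary + (target - cum) / f * size
        cum += f
    raise ValueError("k is out of range for the given number of parts")

q1, q3 = fractile(classes, 1, 4), fractile(classes, 3, 4)
d3, d5, d7 = (fractile(classes, k, 10) for k in (3, 5, 7))
p30, p55, p68, p88 = (fractile(classes, k, 100) for k in (30, 55, 68, 88))

print(round(q1, 2), round(q3, 2))
print(round(d3, 2), round(d5, 2), round(d7, 2))
print(round(p30, 2), round(p55, 2), round(p68, 2), round(p88, 2))
```

Note that D3 and P30 coincide, and D5 equals the median, since they refer to the same position in the distribution.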
Assignment no. 3
I. The rates per hour in pesos of 12 employees
of a certain company were taken and are
shown below.
44.75, 44.75, 38.15, 39.25, 18.00, 15.75, 44.75,
39.25, 18.50, 65.25, 71.25, 77.50
a. Find the mean, median and mode.
b. If the value 15.75 was incorrectly written as
45.75, what measure of central tendency
will be affected? Support your answer.
II. The final grades of a student in six subjects were
tabulated below.
Subj units final grade
Algebra 3 60
Religion 2 90
English 3 75
Pilipino 3 86
PE 1 98
History 3 70
a. Determine the weighted mean
b. If the subjects were of equal number of units, what
would be his average?
III. The ages of qualified voters in a certain barangay
were taken and are shown below
Class Interval Frequency
18-23 20
24-29 25
30-35 40
36-41 52
42-47 30
48-53 21
54-59 12
60-65 6
66-71 4
72-77 1
a. Find the mean, median and mode
b. Find the 1st and 3rd quartile
c. Find the 4th and 6th decile
d. Find the 25th and 75th percentile
Measure of Variation
The range is considered to be the simplest form
of measure of variation. It is the difference
between the highest and the lowest value in
the distribution.
R = H – L
For grouped data, the range is the difference between the
highest upper class boundary and the lowest
lower class boundary.
Example: find the range of the given grouped
data in slide no. 59
Semi-interquartile Range
This value is obtained by getting one half of the
difference between the third and the first
quartile.
Q = (Q3 – Q1)/2
Example:
Find the semi-interquartile range of the
previous example in slide no. 59
Average Deviation
The average deviation refers to the arithmetic
mean of the absolute deviations of the values
from the mean of the distribution. This
measure is sometimes known as the mean
absolute deviation.
AD = Σ│x – x’│/ n
Where x = the individual values
x’ = mean of the distribution
Steps in solving for AD
1. Arrange the values in column according to
magnitude
2. Compute for the value of the mean x’
3. Determine the deviations (x – x’)
4. Convert the deviations in step 3 into positive
deviations. Use the absolute value sign.
5. Get the sum of the absolute deviations in
step 4
6. Divide the sum in step 5 by n.
Example:
1. Consider the following values:
16, 13, 9, 6, 15, 7, 11, 12
Find the average deviation.
For grouped data:
AD = Σf│x – x’│ / n
Where f = frequency of each class
x = midpoint of each class
x’ = mean of the distribution
n = total number of frequency
Example:
Find the average deviation of the given data
Classes f
11-22 2
23-34 8
35-46 11
47-58 19
59-70 14
71-82 5
83-94 1
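Both forms of the average deviation can be checked with a short sketch that follows the six steps listed earlier. The variable names are illustrative; class midpoints stand in for x in the grouped case, as the formula specifies.

```python
# Ungrouped data: AD = Σ|x - x'| / n
values = [16, 13, 9, 6, 15, 7, 11, 12]
mean = sum(values) / len(values)
ad_ungrouped = sum(abs(x - mean) for x in values) / len(values)

# Grouped data: AD = Σf|x - x'| / n, with x the class midpoint
classes = [
    ((11, 22), 2), ((23, 34), 8), ((35, 46), 11), ((47, 58), 19),
    ((59, 70), 14), ((71, 82), 5), ((83, 94), 1),
]
n = sum(f for _, f in classes)
midpoints = [(lo + hi) / 2 for (lo, hi), _ in classes]
freqs = [f for _, f in classes]
mean_grouped = sum(f * m for f, m in zip(freqs, midpoints)) / n
ad_grouped = sum(f * abs(m - mean_grouped) for f, m in zip(freqs, midpoints)) / n

print(mean, ad_ungrouped)
print(round(mean_grouped, 2), round(ad_grouped, 2))
```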
Variance
For ungrouped data
s² = Σ(x – x’)² / n
Example:
Find the variance of
16, 13, 9, 6, 15, 7, 11, 12
For grouped data
s² = Σf(x – x’)² / n
Where f = frequency of each class
x = midpoint of each class interval
x’ = mean of the distribution
n = total number of frequency
Example:
Find the variance of the given data
Classes f
11-22 2
23-34 8
35-46 11
47-58 19
59-70 14
71-82 5
83-94 1
Coefficient of variation
If you wish to compare the variability between
different sets of scores or data, the coefficient of
variation is a very useful measure for
interval scale data.
CV = s / x’
where s = standard deviation
x’ = the mean
Example:
In a particular university, a researcher wishes to
compare the variation in scores of the urban
students with that of the rural
students in their college entrance test. It is
known that the urban students’ mean score is
384 with a standard deviation of 101, while
among the rural students the mean is
174 with a standard deviation of 53. Which
group shows more variation in scores?
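The comparison in this example reduces to two divisions, sketched below with illustrative variable names:

```python
# CV = s / mean for each group
cv_urban = 101 / 384
cv_rural = 53 / 174

print(round(cv_urban, 4), round(cv_rural, 4))
more_variable = "rural" if cv_rural > cv_urban else "urban"
print(more_variable)
```

Although the urban group has the larger standard deviation, the rural group shows more variation relative to its mean.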
Standard Deviation
s = √(s²)
For ungrouped data
s = √[Σ(x – x’)² / n]
For grouped data
s = √[Σf(x – x’)² / n]
Find the standard deviation of the previous examples
for ungrouped and grouped data.
Find the standard deviation of the given data
Classes f
11-22 2
23-34 8
35-46 11
47-58 19
59-70 14
71-82 5
83-94 1
Find the standard deviation of
16, 13, 9, 6, 15, 7, 11, 12
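A sketch of the variance and standard deviation for both examples, dividing by n as in the formulas above (not by n – 1). Variable names are illustrative.

```python
import math

# Ungrouped data
values = [16, 13, 9, 6, 15, 7, 11, 12]
n_u = len(values)
mean_u = sum(values) / n_u
var_u = sum((x - mean_u) ** 2 for x in values) / n_u     # s² = Σ(x - x')² / n
sd_u = math.sqrt(var_u)

# Grouped data, with x taken as the class midpoint
classes = [
    ((11, 22), 2), ((23, 34), 8), ((35, 46), 11), ((47, 58), 19),
    ((59, 70), 14), ((71, 82), 5), ((83, 94), 1),
]
n_g = sum(f for _, f in classes)
mids = [(lo + hi) / 2 for (lo, hi), _ in classes]
freqs = [f for _, f in classes]
mean_g = sum(f * m for f, m in zip(freqs, mids)) / n_g
var_g = sum(f * (m - mean_g) ** 2 for f, m in zip(freqs, mids)) / n_g   # Σf(x - x')² / n
sd_g = math.sqrt(var_g)

print(var_u, round(sd_u, 4))
print(round(var_g, 2), round(sd_g, 4))
```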
Measure of variation for nominal data
VR = 1 – fm/N
Where VR = the variation ratio
fm = modal class frequency
N = total number of observations
Example:
With the data given by a clinical psychologist on the
type of therapy used, compute the variation ratios.
Type of therapy          No. of patients
                         1980    1985
Logotherapy               20       8
Reality Therapy           60     105
Rational Therapy          42       6
Transactional analysis    39       9
Family therapy            52       5
Others                    41       8
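The variation ratio for each year can be computed as below. This is a sketch; the dictionary layout is just one convenient way to hold the table.

```python
# Number of patients per type of therapy, by year.
patients = {
    1980: [20, 60, 42, 39, 52, 41],
    1985: [8, 105, 6, 9, 5, 8],
}

variation_ratio = {}
for year, counts in patients.items():
    fm = max(counts)          # modal class frequency
    total = sum(counts)       # N
    variation_ratio[year] = 1 - fm / total

for year, vr in variation_ratio.items():
    print(year, round(vr, 4))
```

The ratio drops sharply in 1985 because most patients are concentrated in a single category (Reality Therapy).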
Assignment no. 4
I. Compute the semi-interquartile
range, average deviation, variance and
standard deviation for test III of assignment no. 3.
II. Compute the semi-interquartile
range, average deviation, variance and
standard deviation for test I of assignment no.
3.
SIMPLE LINEAR REGRESSION AND
MEASURES OF CORRELATION
In this topic, you will learn how to predict the
value of one dependent variable from the
corresponding given value of the independent
variable.
The scatter diagram:
In solving problems that concern estimation and
forecasting, a scatter diagram can be used as a
graphical approach. This technique consists of
plotting the points corresponding to the paired
scores of the independent and dependent
variables, which are commonly represented by
X and Y, on the X-Y coordinate system.
Example:
The working experience and income of 8 employees are given
below.
Employee   Years of experience (X)   Income in thousands (Y)
A 2 8
B 8 10
C 4 11
D 11 15
E 5 9
F 13 17
G 4 8
H 15 14
Using the Least Squares Linear Regression
Equation:
Y = a + bX
where b = [nΣxy – ΣxΣy] / [nΣx² – (Σx)²]
a = y’ – bx’
(x’ and y’ are the means of x and y)
Obtain the equation for the given data and
estimate the income of an employee whose
experience is 20 years.
Standard Error of Estimate
Se = √{[ΣYi² – aΣYi – bΣXiYi] / (n – 2)}
The standard error of estimate is interpreted like
a standard deviation: it measures the spread of the
observed Y values around the regression line. For a given
value of X, nearly all observed values of Y are expected to
fall within 3Se of the predicted value.
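The regression coefficients and the standard error of estimate for the employee data can be sketched as follows, using the least squares formulas for b and a and the usual computing formula for Se. Variable names are illustrative.

```python
import math

x = [2, 8, 4, 11, 5, 13, 4, 15]    # years of experience
y = [8, 10, 11, 15, 9, 17, 8, 14]  # income in thousands
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi * xi for xi in x)
sum_y2 = sum(yi * yi for yi in y)

b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)   # slope
a = sum_y / n - b * sum_x / n                                  # intercept: y' - b x'

predicted_20 = a + b * 20          # estimated income at 20 years of experience

# Standard error of estimate
se = math.sqrt((sum_y2 - a * sum_y - b * sum_xy) / (n - 2))

print(round(a, 4), round(b, 4))
print(round(predicted_20, 2), round(se, 4))
```

Since 20 years lies outside the observed range of experience (2 to 15 years), the estimate is an extrapolation and should be read with caution.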
Measures of Correlation
The degree of relationship between variables is
expressed into:
1. Perfect correlation (positive or negative)
2. Some degree of correlation (positive or
negative)
3. No correlation
A perfect correlation is either positive or
negative, represented by +1 and -1. Some degree of
correlation, positive or negative, is
represented by +0.01 to +0.99 and -0.01 to -0.99.
No correlation is represented by 0.
0 to +0.25 very small positive correlation
+0.26 to +0.50 moderately small positive correlation
+0.51 to +0.75 high positive correlation
+0.76 to +0.99 very high positive correlation
+1.00 perfect positive correlation
----------------------------------------------------------
0 to -0.25 very small negative correlation
-0.26 to -0.50 moderately small negative correlation
-0.51 to -0.75 high negative correlation
-0.76 to -0.99 very high negative correlation
-1.00 perfect negative correlation
Anybody who wants to interpret the results of the coefficient of
correlation should be guided by the following reminders:
1. The relationship of two variables does not necessarily mean
that one is the cause or the effect of the other variable. It
does not imply a cause-effect relationship.
2. When the computed Pearson r is high, it does not
necessarily mean that one factor is strongly dependent on
the other. On the other hand, when the computed Pearson
r is small it does not necessarily mean that one factor has
no dependence on the other.
3. If there is a reason to believe that the two variables are
related and the computed Pearson r is high, these two
variables can be regarded as associated. On the other
hand, if the correlation between the variables is low, other factors might
be responsible for such a small association.
4. Lastly, the meaning of correlation coefficient just simply
informs us that when two variables change there may be a
strong or weak relationship taking place.
The formula for finding the Pearson r is
r = [nΣXY – ΣXΣY] / √{[nΣX² – (ΣX)²][nΣY² – (ΣY)²]}
Example: Given two sets of scores. Find the Pearson r and
interpret the result.
X Y
18 10
16 14
14 14
13 12
12 10
10 8
10 5
8 6
6 12
3 0
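The Pearson r for the two sets of scores can be computed from the raw sums, as in this sketch (variable names are illustrative):

```python
import math

x = [18, 16, 14, 13, 12, 10, 10, 8, 6, 3]
y = [10, 14, 14, 12, 10, 8, 5, 6, 12, 0]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi * xi for xi in x)
sum_y2 = sum(yi * yi for yi in y)

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(round(r, 4))
```

On the scale given earlier, a value in the +0.51 to +0.75 band is read as a high positive correlation.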
Correlation between Ordinal Data
This is the Spearman Rank-Order Correlation
Coefficient (Spearman rho). For cases of 30
or less, Spearman ρ is the most widely used of
the rank correlation methods.
ρ = 1 – 6ΣD² / [n(n² – 1)]
where D = (RX – RY), the difference between the ranks of X and Y
n = number of paired observations
Example:
Individual Test X Test Y
1 18 24
2 17 28
3 14 30
4 13 26
5 12 22
6 10 18
7 8 15
8 8 12
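A sketch of the Spearman rho for this example. Two assumptions are made here that the slides do not state: rank 1 is given to the highest score, and tied scores (the two 8s in Test X) share the average of the ranks they occupy. With ties the formula is only approximate.

```python
def average_ranks(values):
    """Rank 1 = highest value; tied values share the average of their ranks."""
    ordered = sorted(values, reverse=True)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1       # first rank occupied by this value
        count = ordered.count(v)           # number of tied values
        ranks.append(first + (count - 1) / 2)
    return ranks

test_x = [18, 17, 14, 13, 12, 10, 8, 8]
test_y = [24, 28, 30, 26, 22, 18, 15, 12]
n = len(test_x)

rx = average_ranks(test_x)
ry = average_ranks(test_y)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))   # ΣD²

rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(rx, ry)
print(sum_d2, round(rho, 4))
```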
Gamma Rank Order
An alternative to the rank order correlation is
the Goodman’s and Kruskal’s Gamma (G).
The value of one variable can be estimated or
predicted from the other variable when you
have the knowledge of their values. The
gamma can also be used when ties are found
in the ranking of the data.
G = (NS – N1) / (NS + N1)
Where NS = the number of pairs ordered in the
parallel direction
N1 = the number of pairs ordered in the
opposite direction
Given a segment of the Filipino electorate according to religion
and political party
              LAKAS   LP   NP   Total
Catholic       50     25   20
INC            34     72   21
Born Again     22     12   10
Total
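A sketch of how NS and N1 can be counted from a cross-tabulation. It treats the rows and columns as ordered in the sequence listed, which is an assumption made only to illustrate the arithmetic, since religion and party have no natural order. For each cell, cells lower and to the right count as pairs in the same direction, and cells lower and to the left count as pairs in the opposite direction.

```python
# Rows: Catholic, INC, Born Again; columns: LAKAS, LP, NP.
table = [
    [50, 25, 20],
    [34, 72, 21],
    [22, 12, 10],
]

n_same = 0       # NS: pairs ordered in the same direction
n_opposite = 0   # N1: pairs ordered in the opposite direction
rows, cols = len(table), len(table[0])

for i in range(rows):
    for j in range(cols):
        for k in range(i + 1, rows):         # every row below row i
            for l in range(cols):
                if l > j:
                    n_same += table[i][j] * table[k][l]
                elif l < j:
                    n_opposite += table[i][j] * table[k][l]

gamma = (n_same - n_opposite) / (n_same + n_opposite)
print(n_same, n_opposite, round(gamma, 4))
```

Pairs tied on either variable (same row or same column) are left out of both counts, which is why gamma can be used when ties are present.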
Correlation between Nominal Data
The Guttman’s Coefficient of predictability is the
proportionate reduction in error measure which
shows the index of how much an error is reduced in
predicting values of one variable from the value of
another.
λc = (ΣFBR – MBC) / (N – MBC)
where FBR = the biggest cell frequency in each row
MBC = the biggest of the column totals
N = total number of observations
λr = (ΣFBC – MBR) / (N – MBR)
where FBC = the biggest cell frequency in each
column
MBR = the biggest of the row totals
N = total number of observations
Compute for the λc and λr for the segment of
Filipino electorate and political parties.
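Both coefficients can be read off the religion by party table with a few sums, as in this sketch (variable names are illustrative):

```python
# Rows: Catholic, INC, Born Again; columns: LAKAS, LP, NP.
table = [
    [50, 25, 20],
    [34, 72, 21],
    [22, 12, 10],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

sum_fbr = sum(max(row) for row in table)          # ΣFBR: biggest cell in each row
sum_fbc = sum(max(col) for col in zip(*table))    # ΣFBC: biggest cell in each column
mbc = max(col_totals)                             # MBC: biggest column total
mbr = max(row_totals)                             # MBR: biggest row total

lambda_c = (sum_fbr - mbc) / (n - mbc)
lambda_r = (sum_fbc - mbr) / (n - mbr)

print(row_totals, col_totals, n)
print(round(lambda_c, 4), round(lambda_r, 4))
```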
Assignment no. 5
1. Given the average yearly cost and sales of company A for a
period of 8 years, find the Pearson r and interpret the
results.
Year Cost Sales
per P10,000 per P10,000
1960 15 38
1961 30 53.3
1962 16 60
1963 39 72
1964 20 40
1965 36 47.5
1966 45 82
1967 10 21.5
2. Given the grades of 10 students in statistics, determine the
Spearman rho and interpret the result.
Student Q1 Q2
A 62 57
B 90 88
C 75 90
D 60 67
E 58 60
F 89 79
G 91 78
H 90 62
I 94 86
J 50 55
3. Compute for the gamma shown and interpret
the result
Socio-economic    EDUCATIONAL STATUS
status            UPPER   MIDDLE   LOWER   TOTAL
UPPER              24      19        5
MIDDLE             12      54       29
LOWER               9      26       25
TOTAL
4. Compute for the λc and λr for the problem no.
3.
Counting Techniques
Consider the numbers 1, 2, 3 and 4. Suppose you want
to determine the total number of 2 digit numbers that can be
formed by combining these digits. First, let us assume
that no digit is to be repeated.
12 21 31 41
13 23 32 42
14 24 34 43
Notice that we were able to list all the possibilities. In
this example, we have 12 possible 2 digit numbers.
Now, what if the digits can be repeated?
11 12 13 14
21 22 23 24
31 32 33 34
41 42 43 44
Hence, we have 16 possible outcomes.
If the first activity can be done in n1 ways and, after it
has been done, the second activity can be done in
n2 ways, then the total number of ways in which
the two activities can be done is equal to n1 n2.
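The counting rule can be checked against a brute-force listing of the two digit numbers formed from 1, 2, 3 and 4. This sketch uses the standard library's `itertools`; the list names are illustrative.

```python
from itertools import permutations, product

digits = [1, 2, 3, 4]

# No repetition: 4 choices for the first digit, then 3 for the second.
no_repeat = [10 * a + b for a, b in permutations(digits, 2)]

# Repetition allowed: 4 choices for each digit.
with_repeat = [10 * a + b for a, b in product(digits, repeat=2)]

print(len(no_repeat), 4 * 3)     # both 12
print(len(with_repeat), 4 * 4)   # both 16
```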
Example:
1. How many two digit numbers can be formed from
the numbers 1,2,3 and 4 if
a. Repetition is not allowed?
b. Repetition is allowed?
2. How many three digit numbers can be formed from
the digits 1,2,3,4 and 5 if any of the digits can be
repeated?
3. The club members are going to elect their officers. If
there are 5 candidates for president, 5 candidates
for vice president and 3 for secretary, then how
many ways can the officers be elected?
4. An office executive plans to buy a laptop for
which there are 5 brands available. Each of
the brands has 3 models and each model has
5 colors to choose from. In how many ways
can the executive choose?
5. Consider the numbers 2, 3, 5 and 7. If
repetition is not allowed, how many three
digit numbers can be formed such that
a. They are all odd?
b. They are all even?
c. They are greater than 500?
6. A pizza place offers 3 choices of salad, 20 kinds of
pizza and 4 different desserts. How many different 3
course meals can one order?
7. The executive board of a certain company consists of 5
males and 2 females. In how many ways can the
president and secretary be chosen if
a. The president must be female and the secretary
must be male?
b. The president and the secretary are of opposite
sex?
c. The president and the secretary should be male?
Permutation
The term permutation refers to the arrangement
of objects with reference to order.
P(n,r) = n! / (n – r)!
Evaluate:
1. P(10,6)
2. P(5,5)
3. P(4,3) + P(4,4)
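The three expressions can be evaluated directly from the formula. In this sketch the helper `P` is a hypothetical name for the formula above; `math.perm` (available in Python 3.8 and later) is used only as a cross-check.

```python
import math

def P(n, r):
    """Number of permutations of n objects taken r at a time: n! / (n - r)!"""
    return math.factorial(n) // math.factorial(n - r)

answer_1 = P(10, 6)
answer_2 = P(5, 5)
answer_3 = P(4, 3) + P(4, 4)

print(answer_1, answer_2, answer_3)   # 151200 120 48
print(answer_1 == math.perm(10, 6))   # True
```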
Examples:
1. In how many ways can a president, a vice
president, a secretary and a treasurer be
elected from a class with 40 students?
2. In how many ways can 7 individuals be
seated in a row of 7 chairs?
3. In how many ways can 9 individuals be
seated in a row of 9 chairs if two individuals
wanted to be seated side by side?
4. Suppose 5 different math books and 7
different physics books shall be arranged in a
shelf. In how many ways can such books be
arranged if the books of the same subject be
placed side by side?
5. Determine the possible permutations of the
word MISSISSIPPI.
6. Find the total 8 digit numbers that can be
formed using all the digits in the following
numerals 55777115
7. In how many ways can 6 persons be seated
around a table with 6 chairs if two
individuals wanted to be seated side by
side?
8. In a local election, there are 7 people
running for 3 positions. In how many ways
can this be done?
Combination
A combination is a selection of objects without
regard to order.
nCr = C(n,r) = n! / [r!(n – r)!]
Evaluate:
1. 8C4
2. 5(5C4 – 5C2)
3. 7C5 / (7C6 – 7C2)
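The three expressions can be evaluated as below. The helper `C` is a hypothetical name for the formula above; `math.comb` (Python 3.8 and later) serves as a cross-check.

```python
import math

def C(n, r):
    """Number of combinations of n objects taken r at a time: n! / [r!(n - r)!]"""
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

answer_1 = C(8, 4)
answer_2 = 5 * (C(5, 4) - C(5, 2))
answer_3 = C(7, 5) / (C(7, 6) - C(7, 2))

print(answer_1, answer_2, answer_3)   # 70 -25 -1.5
print(answer_1 == math.comb(8, 4))    # True
```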
1. A class consists of 12 boys and 10 girls.
a. In how many ways can the class elect the
president, vice president, secretary and a
treasurer?
b. In how many ways can the class elect 4
members of a certain committee?
2. In how many ways can a student answer 6
out of ten questions?
3. In how many ways can a student answer 6
out of 10 questions if he is required to
answer 2 of the first 5 questions?
4. In how many ways can 3 balls be drawn from
a box containing 8 red and 6 green balls?
5. A box contains 8 red and 6 green balls. In how
many ways can 3 balls be drawn such that
a. They are all green?
b. 2 are red and 1 is green?
c. 1 is red and 2 are green?
6. A shipment of 40 computers is unloaded
from a van and tested. Six of them are
defective. In how many ways can we select a
set of 5 computers and get at least one
defective?
7. Five letters a,b,c,d,e are to be chosen. In
how many ways could you choose
a. None of them
b. At least two of them
c. At most three of them
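"At least one" counts, such as the defective-computer problem in item 6, are often easiest by subtracting the unwanted case from the total. The sketch below assumes the 40 computers are distinct and that the order of selection does not matter; it cross-checks the subtraction against a direct case-by-case sum.

```python
from math import comb

total_sets = comb(40, 5)           # any 5 of the 40 computers
no_defective = comb(34, 5)         # all 5 taken from the 34 good ones
at_least_one = total_sets - no_defective

# Cross-check: add the cases with exactly k defective computers, k = 1..5.
by_cases = sum(comb(6, k) * comb(34, 5 - k) for k in range(1, 6))

print(at_least_one)  # 379752
print(by_cases)      # 379752
```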
Assignment no. 6
1. How many possible outcomes are there if
a. A die is rolled?
b. A pair of dice is rolled?
2. In how many ways can 5 math teachers be
assigned to 4 available subjects if each of the
5 teachers has an equal chance of being
assigned to any of the 4 subjects?
3. Consider the digits 1, 2, 3, 5 and 6. How
many 3-digit numbers can be formed from
these digits if
a. Repetition is not allowed and 0 should not
be in the first digit?
b. Repetition is allowed and 0 should not be in
the first digit?
4. A college has 3 entrance gates and 2 exit
gates. In how many ways can a student enter
then leave the building?
5. In how many ways can 9 passengers be
seated in a bus if there are only 5 seats
available?
6. In how many ways can 4 boys and 4 girls be
seated in a row of 8 chairs if
a. They can sit anywhere?
b. The boys and girls are to be seated
alternately?
7. In how many ways can ten participants in a
race be placed first, second and third?
8. Determine the number of distinct
permutations of each of the following:
a. STATISTICS
b. ADRENALIN
c. 44044999404
9. A class consists of 12 boys and 10 girls. In
how many ways can a committee of five be
formed if
a. All members are boys?
b. 2 are boys and 3 are girls?
10. In how many ways can a student answer an
exam if, out of 6 problems, he is required to
answer only 4?
Probability
In the study of probability, we shall consider activities
for which the outcomes cannot be predicted with
certainty. These activities, called experiments,
always result in a single outcome. Although the
single outcome cannot be predicted before the
experiment is performed, the set of all
possible outcomes can be determined. This set of all
possible outcomes is referred to as the sample space.
Each individual element or outcome in a sample
space is known as a sample point.
Definition of terms:
1. Random experiment – any process of
generating a set of data or observations that
can be repeated under basically the same
conditions and that leads to well-defined
outcomes.
2. Sample space – the set of all possible outcomes
of an experiment, usually denoted by S.
3. Sample point – an element of the sample
space, that is, a single outcome.
4. Event – any subset of the sample space, usually
denoted by a capital letter.
5. Null space – a subset of the sample space that
contains no elements, denoted by the symbol
Ø.
6. Simple event – an event which contains only one
element of the sample space.
7. Compound event – an event that can be expressed
as the union of simple events, thus containing
more than one sample point.
8. Mutually exclusive events – two events A and B are
mutually exclusive if A∩B = Ø, that is, if A and B have
no elements in common.
The probability of an event A, denoted by P(A), is
the sum of the probabilities of the mutually
exclusive outcomes that constitute the event.
It must satisfy the following property:
0 ≤ P(A) ≤ 1
Example:
1. Consider the activity of rolling a die. This activity has
6 possible outcomes, namely 1, 2, 3, 4, 5 and 6. Thus,
S = {1,2,3,4,5,6}
Each of the numbers 1 to 6 is a sample point of S, so
there are 6 sample points. If we let A be the
event of getting an even number and B the event of
getting a perfect square, then
A = {2,4,6} and B = {1,4}
Note that the elements of A and B are elements of the
sample space S. The numbers of sample points in the
sample space S and in the events A and B are usually written as
n(S) = 6, n(A) = 3 and n(B) = 2.
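The die example maps naturally onto sets. The sketch below builds S, A and B as Python sets and counts their sample points; the perfect-square test using math.isqrt is one possible way to express the condition.

```python
from math import isqrt

S = {1, 2, 3, 4, 5, 6}                     # sample space for one roll of a die
A = {x for x in S if x % 2 == 0}           # event: even number
B = {x for x in S if isqrt(x) ** 2 == x}   # event: perfect square

print(sorted(A), sorted(B))    # [2, 4, 6] [1, 4]
print(len(S), len(A), len(B))  # 6 3 2
print(A <= S and B <= S)       # True: every event is a subset of S
```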
2. If a pair of dice is rolled, then determine the
number of sample points of the following:
a. Sample space
b. Event of getting a sum of 5.
c. Event of getting a sum of at most 4.
3. A box contains 6 red and 4 green balls. If three balls
are drawn from the box, then determine the
number of sample points of the following:
a. The sample space
b. The event of getting all green balls
c. The event of getting 1 red and 2 green balls.
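Counts like those in items 2 and 3 can be checked by enumeration. The sketch below assumes the two dice are distinguishable (so outcomes are ordered pairs) and that the three balls are drawn together, without regard to order, from distinct balls; the variable names are illustrative.

```python
from itertools import product
from math import comb

# Pair of dice: every ordered pair (first die, second die).
sample_space = list(product(range(1, 7), repeat=2))
sum_is_5 = [roll for roll in sample_space if sum(roll) == 5]
sum_at_most_4 = [roll for roll in sample_space if sum(roll) <= 4]
print(len(sample_space), len(sum_is_5), len(sum_at_most_4))  # 36 4 6

# Box with 6 red and 4 green balls, 3 drawn.
draws = comb(10, 3)                          # all possible sets of 3 balls
all_green = comb(4, 3)                       # 3 of the 4 green balls
one_red_two_green = comb(6, 1) * comb(4, 2)  # 1 red and 2 green
print(draws, all_green, one_red_two_green)   # 120 4 36
```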
Probability is the chance that an event will
happen. The probability of an event A,
denoted by P(A), is a number
between 0 and 1, inclusive.
This number can be expressed as a
fraction, as a decimal or as a percent. When
we assign a probability of 0 to event A, it
means that it is impossible for event A to
occur. When event A is assigned a probability
of 1, then event A is certain to occur.
P(A) + P(A’) = 1
The probability of occurrence plus the
probability of non-occurrence is always equal
to 1.
Example:
A student in a statistics class was able to
compute the probability of passing the subject
to be equal to 0.55. Based on this information,
what is the probability that he is not going to
pass the subject?
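The complement rule in this example is a single subtraction. The sketch below uses exact fractions to avoid floating-point rounding; the function name complement is an illustrative assumption.

```python
from fractions import Fraction


def complement(p: Fraction) -> Fraction:
    """Return the probability of non-occurrence, 1 - P(A)."""
    if not 0 <= p <= 1:
        raise ValueError("a probability must lie between 0 and 1")
    return 1 - p


p_pass = Fraction("0.55")      # probability of passing, from the example
p_fail = complement(p_pass)
print(p_fail, float(p_fail))   # 9/20 0.45
```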
Three approaches to probability:
1. Subjective probability – it is determined by the use of
intuition, personal beliefs and other indirect
information.
2. A posteriori or relative frequency probability
(empirical probability) – it is determined by
repeating the experiment a large number of times
and using the following rule:
P(A) = (no. of times event A occurred) / (no. of times the experiment was repeated)
Example:
1. Records show that 120 out of 500 students who
entered the CS/IT programs left the school
due to financial problems. What is the
probability that a freshman entering this
college will leave the school due to financial
problems?
2. Last year, the efficiency ratings of the employees of a
certain company were taken and presented in the
frequency distribution below:
Efficiency rating    No. of employees
60-65                12
66-71                10
72-77                31
78-83                29
84-89                 8
Based on the data, what can we say about the
proportion of employees for this year who shall have
an efficiency rating from 72-77 and from 84-89?
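Both empirical-probability examples above apply the same relative frequency rule. The sketch below computes them with exact fractions; the function name relative_frequency and the dictionary layout are illustrative assumptions, and last year's proportions are used as the estimate for this year.

```python
from fractions import Fraction


def relative_frequency(times_occurred: int, times_repeated: int) -> Fraction:
    """Empirical probability: occurrences divided by number of trials."""
    return Fraction(times_occurred, times_repeated)


# Example 1: 120 of 500 students left due to financial problems.
p_leave = relative_frequency(120, 500)
print(p_leave, float(p_leave))  # 6/25 0.24

# Example 2: frequency distribution of efficiency ratings.
ratings = {"60-65": 12, "66-71": 10, "72-77": 31, "78-83": 29, "84-89": 8}
total = sum(ratings.values())   # 90 employees in all
p_72_77 = relative_frequency(ratings["72-77"], total)
p_84_89 = relative_frequency(ratings["84-89"], total)
print(p_72_77, p_84_89)         # 31/90 4/45
```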
3. A priori or classical probability – it is
determined even before the experiment is
performed, using the following rule:
P(A) = n(A) / n(S)
where n(A) = no. of sample points in event A
n(S) = no. of sample points in sample
space S.
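The classical rule can be sketched in a few lines, assuming every sample point is equally likely. The function name classical_probability is hypothetical; the events shown are the die and two-coin cases.

```python
from fractions import Fraction
from itertools import product


def classical_probability(event: set, sample_space: set) -> Fraction:
    """Return n(A) / n(S), assuming equally likely sample points."""
    favorable = event & sample_space   # ignore anything outside S
    return Fraction(len(favorable), len(sample_space))


die = {1, 2, 3, 4, 5, 6}
print(classical_probability({1, 3, 5}, die))   # odd number: 1/2
print(classical_probability({1, 4}, die))      # perfect square: 1/3

coins = set(product("HT", repeat=2))           # HH, HT, TH, TT
print(classical_probability({("H", "H")}, coins))  # both heads: 1/4
```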
1. If a coin is tossed , what is the probability of
getting a head?
2. If two coins are tossed, what is the
probability of getting both heads?
3. If a die is rolled, what is the probability of
getting an odd number? An even number? A
perfect square?
4. If a pair of dice is rolled, what is the
probability of getting a sum of 6? A sum of
13?
5. The probability that a college student
without a flu shot will get the flu is 0.42. What
is the probability that a college student
without the flu shot will not get the flu?
6. A box contains 7 red and 6 green balls. If 2
balls are drawn from the box, what is the
probability of getting both green? 1 red and 1
green?
Addition Rule:
In practice, the probabilities of two or more
events are usually considered together. If we let A and
B be events, then these two events can be
combined to form another event. The event
that at least one of the events A or B will
happen is denoted by A∪B. The event that
both events A and B will occur is denoted by
A∩B. The probability of A∪B, denoted by
P(A∪B), is given by
P(A∪B) = P(A) + P(B) – P(A∩B)
Two events A and B are said to be mutually
exclusive if they cannot both occur at the
same time. This implies that the occurrence of
event A excludes the occurrence of event B
and vice versa. Therefore, A∩B has no
sample points, so P(A∩B) = 0. The previous
equation becomes
P(A∪B) = P(A) + P(B)
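The addition rule can be verified on the die example by comparing the formula with a direct count of the union. This is a minimal sketch assuming equally likely outcomes; the helper name prob is illustrative.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
odd, even, square = {1, 3, 5}, {2, 4, 6}, {1, 4}


def prob(event: set) -> Fraction:
    """Classical probability n(A) / n(S) on the die sample space."""
    return Fraction(len(event & S), len(S))


# Not mutually exclusive: 4 is both even and a perfect square.
by_formula = prob(even) + prob(square) - prob(even & square)
by_count = prob(even | square)
print(by_formula, by_count)                    # 2/3 2/3

# Mutually exclusive: no number is both odd and even, so P(A∩B) = 0.
print(prob(odd & even))                        # 0
print(prob(odd) + prob(even), prob(odd | even))  # 1 1
```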
1. Consider rolling a die and the events of
getting an odd number, an even number and
a perfect square. Determine the probability
of getting
a. An odd or an even number.
b. An even number or a perfect square. (These
two events can occur at the same time,
so they are not mutually exclusive.)
2. A card is drawn from an ordinary deck of 52
playing cards. Find the probability of getting
a. An ace or a queen
b. A queen or a face card
c. A black card or a queen
3. You are going to roll a pair of dice. Find the
probability of getting a sum that is even or
a sum that is a multiple of 3.
4. A student goes to the library. The probability
that the student checks out a work of fiction is 40%,
a work of non-fiction is 30%, and both a work of
fiction and a work of non-fiction is 20%. What is the
probability that the student checks out a work of
fiction, a work of non-fiction, or both?
5. The probability that Anita will buy machine A is 7/11
and the probability that she will buy machine B is
5/11. If the probability of buying either machine A
or machine B is 9/11, what is the probability of buying
both machines?
6. A community swim team has 150 members.
Seventy-five of the members are advanced
swimmers. Forty-seven of the members are
intermediate swimmers. The remainder are
novice swimmers. Forty of the advanced
swimmers practice 4 times a week. Thirty of
the intermediate swimmers practice 4 times a
week. Ten of the novice swimmers practice 4
times a week. Suppose one member of the
swim team is randomly chosen. Answer the
following questions and verify your answers:
a. What is the probability that the member is a
novice swimmer?
b. What is the probability that the member
practices 4 times a week?
c. What is the probability that the member is
an advanced swimmer and practices 4 times a
week?
d. What is the probability that the member is an
advanced swimmer and an intermediate
swimmer? Are these events mutually exclusive?
SEATWORK
1. A BOX CONTAINS 7 RED, 3 GREEN AND 2 YELLOW BALLS. IF ONE BALL IS DRAWN
FROM THE BOX, THEN WHAT IS THE PROBABILITY OF GETTING
• A RED?
• A NON-RED?
• A NON-GREEN?
2. SUPPOSE THAT WE ROLL A PAIR OF DICE. WHAT IS THE PROBABILITY OF GETTING A SUM OF
6 OR 8?
3. SUPPOSE WE PICK ONE CARD FROM A DECK OF CARDS. WHAT IS THE PROBABILITY
OF GETTING
• A KING OR A SPADE?
• A KING OR A NUMBER 8?
4. KLAUS IS TRYING TO CHOOSE WHERE TO GO ON VACATION. HIS CHOICES ARE
A=BAGUIO AND B=TAGAYTAY. HE CAN ONLY AFFORD ONE VACATION. THE
PROBABILITY OF CHOOSING A IS 0.36 AND THE PROBABILITY OF CHOOSING B IS
0.44. WHAT IS THE PROBABILITY THAT HE CHOOSES TO GO TO EITHER A OR B? WHAT
IS THE PROBABILITY THAT HE WILL NOT CHOOSE EITHER OF THE TWO DESTINATIONS?
Conditional Probability and
Multiplication Rule
Conditional probability is the probability that an
event will occur given that another event has already happened.
Symbolically, conditional probability is written
as P(A/B) and is read as the probability of
event A given that B has occurred. The
computing formula for the conditional
probability of A given B is
P(A/B) = P(A∩B)/P(B), provided P(B) is not equal to
zero.
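The computing formula can be sketched as a small function. The values used below are P(A) = 0.55, P(B) = 0.35 and P(A∩B) = 0.20; the function name conditional is an illustrative assumption, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def conditional(p_a_and_b: Fraction, p_b: Fraction) -> Fraction:
    """Return P(A/B) = P(A∩B) / P(B); P(B) must not be zero."""
    if p_b == 0:
        raise ZeroDivisionError("P(B) must not be equal to zero")
    return p_a_and_b / p_b


p_a = Fraction("0.55")
p_b = Fraction("0.35")
p_a_and_b = Fraction("0.20")

print(conditional(p_a_and_b, p_b))  # P(A/B) = 4/7
print(conditional(p_a_and_b, p_a))  # P(B/A) = 4/11
```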
1. Let P(A) = 0.55
P(B) = 0.35
P(A∩B) = 0.20
Find P(A/B) and P(B/A)
2. A die is rolled. If the result is an even number,
what is the probability that it is a perfect
square?
3. A card is drawn from a deck of 52 cards. Given
that the card drawn is a face card, then what is
the probability of getting a king? A spade? A red
card?
4. A vendor has 35 balloons on strings. 20 balloons are
yellow, 8 are red and 7 are green. A balloon was
selected at random and sold. Given that the balloon
selected and sold is yellow, what is the probability
that the next balloon selected and sold at random is
also yellow?
5. A certain store has 25 microwaves on display,
2 of which are defective. A customer wishes
to buy 2 microwaves and picks them without
replacement. Find the probability that both are
defective.
6. Should women participate in combat?
          Yes    No
Male       32    18
Female      8    42
a. Find the probability that the respondent
answered YES given that the respondent was
a female.
b. Find the probability that the respondent was
a male given that the respondent answered
NO.
7. A box contains 3 red and 8 black balls. If two
balls are drawn in succession without
replacement, what is the probability that
a. Both are red?
b. The first ball is red and the second ball is
black?
8. A box contains 3 red and 8 black balls. If 2
balls are drawn at random with replacement,
what is the probability that both are red?
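Rearranging the conditional probability formula gives the multiplication rule, P(A∩B) = P(A) · P(B/A), which is stated here as an assumption since the slides name the rule without writing it out. The sketch below applies it to the box with 3 red and 8 black balls, with and without replacement; the variable names are illustrative.

```python
from fractions import Fraction

red, black = 3, 8
total = red + black

# Without replacement: the second draw is made from one fewer ball.
both_red = Fraction(red, total) * Fraction(red - 1, total - 1)
red_then_black = Fraction(red, total) * Fraction(black, total - 1)

# With replacement: the box is restored, so the two draws are independent.
both_red_replaced = Fraction(red, total) ** 2

print(both_red)           # 3/55
print(red_then_black)     # 12/55
print(both_red_replaced)  # 9/121
```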
Assignment no. 7
1. A BOX CONTAINS 7 RED, 3 GREEN AND 2 YELLOW BALLS. IF ONE BALL IS DRAWN
FROM THE BOX, THEN WHAT IS THE PROBABILITY OF GETTING
• A RED?
• A NON-RED?
• A NON-GREEN?
2. SUPPOSE THAT WE ROLL A PAIR OF DICE. WHAT IS THE PROBABILITY OF GETTING A SUM OF
6 OR 8?
3. SUPPOSE WE PICK ONE CARD FROM A DECK OF CARDS. WHAT IS THE PROBABILITY
OF GETTING
• A KING OR A SPADE?
• A KING OR A NUMBER 8?
4. KLAUS IS TRYING TO CHOOSE WHERE TO GO ON VACATION. HIS CHOICES ARE
A=BAGUIO AND B=TAGAYTAY. HE CAN ONLY AFFORD ONE VACATION. THE
PROBABILITY OF CHOOSING A IS 0.36 AND THE PROBABILITY OF CHOOSING B IS
0.44. WHAT IS THE PROBABILITY THAT HE CHOOSES TO GO TO EITHER A OR B? WHAT
IS THE PROBABILITY THAT HE WILL NOT CHOOSE EITHER OF THE TWO DESTINATIONS?
5. The probability that it is Friday and that a
student is absent is 0.03. Since there are 5
school days in a week, the probability that it is
Friday is 0.2. What is the probability that a
student is absent given that today is Friday?
Normal Distribution
The normal probability curve is one of the most
commonly used theoretical distributions in
statistical inference. The mathematical
equation of the normal curve was developed
by De Moivre in 1733. The distribution is
sometimes called the Gaussian distribution in
honor of Gauss, who also derived the
equation in the 19th century.
Many characteristics of the large populations
investigated in education and the behavioral
sciences follow a normal distribution. If we are to
study, for instance, the scholastic mental
capacity of a school population of N = 1500, we
may find that the majority of the student
population will yield average scores, a small
portion will yield above and below average
scores and a few students will yield extremely
high and low scores.
The characteristics of the normal curve are:
1. The curve is symmetrical and bell shaped. It
has its highest point at the center. The curve
falls off on both sides at exactly equal rates
as it moves away from the
center. Therefore, if the curve is folded at the
middle, the two sides are exactly the
same size and shape.
2. The number of cases, N, is infinite. This is the
reason why the curve is asymptotic to the
baseline which means that the curve at both
sides does not touch the baseline or the axis,
and that the curve may extend infinitely to
both directions.
3. The three measures of central tendency,
mean, median and mode coincide at one
point at the center of the distribution.
4. The height of the curve indicates the frequency
of cases, and the area under the curve is expressed
as a probability, proportion
or percentage. Hence, the total area under the
normal curve is 1.0 in terms of probability or
proportion and 100% in terms of percentage.
Thus one half of the area is 50%.
5. The basic unit of measurement is expressed in
sigma units (σ) or standard deviations along
the baseline. These units are also called Z-scores.
6. Two parameters are used to describe the
curve. One is the mean (μ or x’),
which is equal to zero for the standard normal
curve, and the other is the
standard deviation (σ), which is equal to 1.
7. Z-scores to the right of the mean (μ or x’)
are positive, while Z-scores
to the left of the mean are negative.
The normal probability curve
From the curve,
we can say that:
1. Approximately 68% of the values in a given set of
data fall within plus or minus 1 standard
deviation from the mean. In symbols, the
interval is from (x’ – 1σ) to (x’ + 1σ).
2. Approximately 95% of the values in a given set of
data fall within plus or minus 2 standard
deviations from the mean. In symbols, the
interval is from (x’ – 2σ) to (x’ + 2σ), and so on.
To illustrate the significance of the empirical rule,
consider the NCEE scores of students in a certain
college whose mean score x’ or μ = 65 and standard
deviation σ or SD = 6.
1. Approximately 68% of the students in that
college have NCEE scores between 65 plus or
minus 6, that is, between
65 – (1)(6) and 65 + (1)(6),
or from 59 to 71.
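The empirical-rule percentages can be checked numerically with statistics.NormalDist from the Python standard library. The sketch below assumes the NCEE scores are exactly normal with mean 65 and SD 6; the function name share_within is hypothetical.

```python
from statistics import NormalDist

scores = NormalDist(mu=65, sigma=6)   # assumed model for the NCEE scores


def share_within(k: float) -> float:
    """Proportion of values within k standard deviations of the mean."""
    low = scores.mean - k * scores.stdev
    high = scores.mean + k * scores.stdev
    return scores.cdf(high) - scores.cdf(low)


print(scores.mean - scores.stdev, scores.mean + scores.stdev)  # 59.0 71.0
print(round(share_within(1), 4))  # 0.6827
print(round(share_within(2), 4))  # 0.9545
```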
The Standard Score
The standard score Z follows a normal
distribution with mean x’ = 0 and SD = 1. A raw
score x can be transformed into a standard score
by using the formula below.
Z = (x – x’) / SD
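The transformation is a one-line computation. The sketch below applies it to the NCEE figures used earlier (mean 65, SD 6); the function name z_score is an illustrative assumption.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Return Z = (x - mean) / SD, the distance from the mean in SD units."""
    if sd <= 0:
        raise ValueError("the standard deviation must be positive")
    return (x - mean) / sd


print(z_score(71, 65, 6))  # 1.0  (one SD above the mean)
print(z_score(59, 65, 6))  # -1.0 (one SD below the mean)
print(z_score(65, 65, 6))  # 0.0  (exactly at the mean)
```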
Normal Curve Areas
The total area under the normal curve is equal
to 1. Since a normally distributed set of data is
symmetric, the total area from Z = 0 to
the right is equal to 0.5. The area from Z = 0 to
the left is also equal to 0.5.
Example:
Find the area under the curve from
1. 0<Z<1.25
2. -1.25<Z<0
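Areas like these are normally read from a Z table, but they can also be checked with statistics.NormalDist. This is a sketch rather than the lecture's own method; the function name area_between is hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal: mean 0, standard deviation 1


def area_between(a: float, b: float) -> float:
    """Area under the standard normal curve for a < Z < b."""
    return Z.cdf(b) - Z.cdf(a)


print(round(area_between(0, 1.25), 4))    # 0.3944
print(round(area_between(-1.25, 0), 4))   # 0.3944, equal by symmetry
print(Z.cdf(0))                           # 0.5, the area to the left of Z = 0
```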
Normal Probability Distribution
Find the probability value of
1. P(Z>1.45)
2. P(Z<-0.4)
3. P(-0.4<Z<1.45)
4. P(1.15<Z<2.33)
5. P(Z<1.28)
6. P(Z>-1.04)
7. The examination results of a large group of
students in statistics are normally distributed
with a mean of 40 and a standard deviation of
4. If a student is chosen at random, what is
the probability that his score is
a. Below 30?
b. Above 55?
c. Below 42?
d. Between 35 and 45?
e. Between 33 and 50?
8. The efficiency ratings of 400 faculty members of a
certain university were taken and resulted in a
mean rating of 78 with a standard deviation of 6.75.
Assuming that the data are normally
distributed, how many of the faculty members have
an efficiency rating
a. Greater than 78?
b. Less than 78?
c. Greater than 85?
d. Between 75 and 90?
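For "how many" questions such as item 8, the probability of the interval is multiplied by the group size. The sketch below assumes the ratings are exactly normal with mean 78 and SD 6.75; the function name expected_count is hypothetical, and the results are expected counts that would be rounded to whole persons.

```python
from statistics import NormalDist
from typing import Optional

ratings = NormalDist(mu=78, sigma=6.75)   # assumed model for the ratings
n_faculty = 400


def expected_count(low: Optional[float] = None, high: Optional[float] = None) -> float:
    """Expected number of faculty members with low < rating < high."""
    lower = ratings.cdf(low) if low is not None else 0.0
    upper = ratings.cdf(high) if high is not None else 1.0
    return n_faculty * (upper - lower)


print(expected_count(low=78))                    # 200.0 (half lie above the mean)
print(round(expected_count(low=85)))             # greater than 85
print(round(expected_count(low=75, high=90)))    # between 75 and 90
```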
Assignment no. 8
I. Find the area under the normal curve for each of the following conditions:
1. Between -2.02 and 1.01
2. To the right of 1.62
3. To the left of 0.56
4. Between 0.65 and 1.18
5. Between -2.09 and -0.78
II. In a reading ability test, with a sample of 120
cases, the mean score is 50 and the standard
deviation is 5.5.
a. What percentage of the cases falls between
the mean and a score of 55?
b. What is the probability that a score picked at
random will lie above the score of 55?
c. What is the probability that a score will lie
below 40?
d. How many cases fall between 55 and 60?
e. How many cases fall between 40 and 49?
END OF
LECTURE

Brookefield Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Brookefield Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Brookefield Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
infant assessment fdbbdbdddinal ppt.pptx
infant assessment fdbbdbdddinal ppt.pptxinfant assessment fdbbdbdddinal ppt.pptx
infant assessment fdbbdbdddinal ppt.pptx
 
DESIGN THINKING in architecture- Introduction
DESIGN THINKING in architecture- IntroductionDESIGN THINKING in architecture- Introduction
DESIGN THINKING in architecture- Introduction
 
Sector 104, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 104, Noida Call girls :8448380779 Model Escorts | 100% verifiedSector 104, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 104, Noida Call girls :8448380779 Model Escorts | 100% verified
 
Hire 💕 8617697112 Meerut Call Girls Service Call Girls Agency
Hire 💕 8617697112 Meerut Call Girls Service Call Girls AgencyHire 💕 8617697112 Meerut Call Girls Service Call Girls Agency
Hire 💕 8617697112 Meerut Call Girls Service Call Girls Agency
 
Sweety Planet Packaging Design Process Book.pptx
Sweety Planet Packaging Design Process Book.pptxSweety Planet Packaging Design Process Book.pptx
Sweety Planet Packaging Design Process Book.pptx
 
Sector 105, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 105, Noida Call girls :8448380779 Model Escorts | 100% verifiedSector 105, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 105, Noida Call girls :8448380779 Model Escorts | 100% verified
 
Nisha Yadav Escorts Service Ernakulam ❣️ 7014168258 ❣️ High Cost Unlimited Ha...
Nisha Yadav Escorts Service Ernakulam ❣️ 7014168258 ❣️ High Cost Unlimited Ha...Nisha Yadav Escorts Service Ernakulam ❣️ 7014168258 ❣️ High Cost Unlimited Ha...
Nisha Yadav Escorts Service Ernakulam ❣️ 7014168258 ❣️ High Cost Unlimited Ha...
 

Probability in statistics

  • 2. COURSE OUTLINE I. Introduction to Statistics II. Tabular and Graphical Representation of Data III. Measures of Central Tendencies, Locations and Variations IV. Measure of Dispersion and Correlation V. Probability and Combinatorics VI. Discrete and Continuous Distributions VII. Hypothesis Testing
  • 3. Text and References Statistics: a simplified approach by Punsalan and Uriarte, 1998, Rex Texbook Probability and Statistics by Johnson, 2008, Wiley Counterexamples in Probability and Statistics by Romano and Siegel, 1986, Chapman and Hall
  • 4. Introduction to Statistics Definition 1. In its plural sense, statistics is a set of numerical data e.g. Vital statistics, monthly sales, exchange rates, etc. 2. In its singular sense, statistics is a branch of science that deals with the collection, presentation, analysis and interpretation of data.
  • 5. General uses of Statistics a. Aids in decision making by providing comparisons of data, explaining actions that have taken place, justifying a claim or assertion, predicting future outcomes and estimating unknown quantities b. Summarizes data for public use
  • 6. Examples on the role of Statistics - In biological and medical sciences, it helps researchers discover relationships worthy of further attention. Ex. A doctor can use statistics to determine to what extent an increase in blood pressure depends on age. - In social sciences, it guides researchers and helps them support theories and models that cannot stand on rationale alone. Ex. Empirical studies use statistics to obtain the socio-economic profile of the middle class in order to form new socio-political theories.
  • 7. Con’t - In business, a company can use statistics to forecast sales, design products, and produce goods more efficiently. Ex. A pharmaceutical company can apply statistical procedures to find out if the new formula is indeed more effective than the one being used. - In engineering, it can be used to test properties of various materials. Ex. A quality controller can use statistics to estimate the average lifetime of the products produced by their current equipment.
  • 8. Fields of Statistics a. Statistical Methods or Applied Statistics: 1. Descriptive – comprises those methods concerned with the collection, description, and analysis of a set of data without drawing conclusions or inferences about a larger set. 2. Inferential – comprises those methods concerned with making predictions or inferences about a larger set of data using only the information gathered from a subset of this larger set.
  • 9. con’t b. Statistical Theory or Mathematical Statistics – deals with the development and exposition of theories that serve as a basis of statistical methods.
  • 10. Descriptive VS Inferential DESCRIPTIVE • A bowler wants to find his bowling average for the past 12 months. • A housewife wants to determine the average weekly amount she spent on groceries in the past 3 months. • A politician wants to know the exact number of votes he received in the last election. INFERENTIAL • A bowler wants to estimate his chance of winning a game based on his current season average and the averages of his opponents. • A housewife would like to predict, based on last year’s grocery bills, the average weekly amount she will spend on groceries this year. • A politician would like to estimate, based on opinion polls, his chance of winning the upcoming election.
  • 11. Population as Differentiated from Sample The word population refers to groups or aggregates of people, animals, objects, materials, happenings or things of any form. This means that there are populations of students, teachers, supervisors, principals, laboratory animals, trees, manufactured articles, birds and many others. If your interest is in a few members of the population chosen to represent its characteristics or traits, these members constitute a sample. The measures of the population are called parameters, while those of the sample are called estimates or statistics.
  • 12. The Variable It refers to a characteristic or property whereby the members of the group or set vary or differ from one another. A constant, by contrast, refers to a property whereby the members of the group do not differ from one another. Variables can be classified according to functional relationship as independent or dependent. If you treat variable y as a function of variable z, then z is your independent variable and y is your dependent variable. This means that the value of y, say academic achievement, depends on the value of z.
  • 13. Con’t Variables according to continuity of values. 1. Continuous variable – these are variables whose levels can take continuous values. Examples are height, weight, length and width. 2. Discrete variables – these are variables whose values or levels can not take the form of a decimal. An example is the size of a particular family.
  • 14. Con’t Variables according to scale of measurements: 1. Nominal – this refers to a property of the members of a group defined by an operation which allows making of statements only of equality or difference. For example, individuals can be classified according to their sex or skin color. Color is an example of a nominal variable.
  • 15. Con’t 2. Ordinal – it is defined by an operation whereby members of a particular group are ranked. In this operation, we can state that one member is greater or less than the others on a criterion, rather than saying only that it is equal to or different from the others, as with a nominal variable. 3. Interval – this refers to a property defined by an operation which permits making statements of equality of intervals, rather than just statements of sameness or difference and greater than or less than. An interval variable does not have a “true” zero point, although for convenience a zero point may be assigned.
  • 16. Con’t 4. Ratio – is defined by the operation which permits making statements of equality of ratios in addition to statements of sameness or difference, greater than or less than and equality or inequality of differences. This means that one level or value may be thought of or said as double, triple or five times another and so on.
  • 17. Assignment no. 1 I. Make a list of at least 5 mathematicians or scientists who contributed to the field of statistics. State their contributions. II. With your knowledge of statistics, give a real-life situation in which statistics is applied. Expand your answer. III. When can a variable be considered independent and dependent? Give an example for your answer.
  • 18. Con’t IV. Enumerate some uses of statistics. Do you think that any science will develop without test of the hypothesis? Why?
  • 19. Examples of Scales of Measurement 1. Nominal Level Ex. Sex: M-Male F-Female; Marital Status: 1-single 2-married 3-widowed 4-separated 2. Ordinal Level Ex. Teaching Ratings: 1-poor 2-fair 3-good 4-excellent
  • 20. Con’t 3. Interval Level Ex. IQ, temperature 4. Ratio Level Ex. Age, no. of correct answers in exam
  • 21. Data Collection Methods 1. Survey Method – questions are asked to obtain information, either through self administered questionnaire or personal interview. 2. Observation Method – makes possible the recording of behavior but only at the time of occurrence (ex. Traffic count, reactions to a particular stimulus)
  • 22. Con’t 3. Experimental method – a method designed for collecting data under controlled conditions. An experiment is an operation where there is actual human interference with the conditions that can affect the variable under study. 4. Use of existing studies – that is census, health statistics, weather reports. 5. Registration method – that is car registration, student registration, hospital admission and ticket sales.
  • 23. Tabular Representation Frequency Distribution is defined as the arrangement of the gathered data by categories plus their corresponding frequencies and class marks or midpoints. Its class frequency is the number of observations belonging to a class interval. Each class interval is a grouping defined by two limits, called the lower and the upper limit. The class boundaries lie halfway between the upper limit of one class and the lower limit of the next.
  • 24. Frequency of a Nominal Data Male and Female College students Major in Chemistry
      SEX      FREQUENCY
      MALE     23
      FEMALE   107
      TOTAL    130
  • 25. Frequency of Ordinal Data Ex. Frequency distribution of Employee Perception on the Behavior of their Administrators
      Perception             Frequency
      Strongly favorable     10
      Favorable              11
      Slightly favorable     12
      Slightly unfavorable   14
      Unfavorable            22
      Strongly unfavorable   31
      Total                  100
  • 26. Frequency Distribution Table Definition: 1. Raw data – is the set of data in its original form 2. Array – an arrangement of observations according to their magnitude, either in increasing or decreasing order. Advantages: easier to detect the smallest and largest value and easy to find the measures of position
  • 27. Grouped Frequency of Interval Data Given the following raw scores in Algebra Examination, 47 56 42 28 56 41 56 55 59 78 50 55 57 38 62 52 66 65 72 33 34 37 47 42 68 62 54 62 68 48 56 39 77 80 62 71 57 52 60 70
  • 28. Con’t 1. Compute the range: R = H – L and the number of classes by K = 1 + 3.322 log n (base 10), where n = number of observations, H = highest score and L = lowest score. 2. Divide the range by 10 to 15 to determine the acceptable size of the interval. Hint: most frequency distributions have an odd number as the size of the interval. The advantage is that the midpoints of the intervals will be whole numbers. 3. Organize the class intervals. See to it that the lowest interval begins with a number that is a multiple of the interval size.
  • 29. Con’t 4. Tally each score in the class interval it belongs to. 5. Count the tallies and summarize them under column (f). Then add the frequencies to get the total number of cases (N). 6. Determine the class boundaries, LCB and UCB (lower and upper class boundary). 7. Compute the midpoint for each class interval and put it in column (M): M = (LS + HS) / 2, where LS and HS are the lowest and highest scores of the class interval.
  • 30. Con’t 8. Compute the cumulative frequency distributions for less than and greater than and put them in columns cf< and cf> (you can now interpret the data). cf = cumulative frequency 9. Compute the relative frequency distribution. This can be obtained by RF% = CF/TF x 100%, where CF = class frequency (not the cumulative frequency cf) and TF = total frequency.
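The steps on slides 28 to 30 can be sketched in Python for the algebra scores on slide 27. This is a minimal sketch, not the only valid grouping: the interval size of 5 and the starting limit of 25 are assumptions chosen to follow the hints in steps 2 and 3, and the names `scores`, `width` and `rows` are illustrative only.

```python
import math

# Raw algebra scores from slide 27.
scores = [47, 56, 42, 28, 56, 41, 56, 55, 59, 78,
          50, 55, 57, 38, 62, 52, 66, 65, 72, 33,
          34, 37, 47, 42, 68, 62, 54, 62, 68, 48,
          56, 39, 77, 80, 62, 71, 57, 52, 60, 70]

n = len(scores)
data_range = max(scores) - min(scores)   # R = H - L
k = 1 + 3.322 * math.log10(n)            # K = 1 + 3.322 log n

width = 5                                # assumed odd interval size (step 2)
start = (min(scores) // width) * width   # lowest limit, a multiple of width (step 3)

rows = []
cf_less = 0
lower = start
while lower <= max(scores):
    upper = lower + width - 1
    f = sum(lower <= s <= upper for s in scores)   # class frequency (steps 4-5)
    cf_less += f
    rows.append({
        "class": f"{lower}-{upper}",
        "f": f,
        "lcb": lower - 0.5,                # lower class boundary (step 6)
        "ucb": upper + 0.5,                # upper class boundary (step 6)
        "midpoint": (lower + upper) / 2,   # M = (LS + HS) / 2 (step 7)
        "cf_less": cf_less,                # cf< (step 8)
        "cf_greater": n - cf_less + f,     # cf> (step 8)
        "rf_percent": 100 * f / n,         # RF% = CF/TF x 100% (step 9)
    })
    lower += width

for row in rows:
    print(row)
```

With these assumptions the first class is 25-29, with boundaries 24.5 and 29.5 and midpoint 27. A different interval size would give a different but equally acceptable table.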
  • 31. Graphical Representation The data can be graphically presented according to their scale or level of measurement. 1. Pie chart or circle graph. The pie chart shows the enrollment from elementary to master’s degree of a certain university. The total population is 4350 students: elementary 34%, high school 31%, college 28%, master’s degree 7%.
  • 32. Con’t 2. Histogram or bar graph- this graphical representation can be used in nominal, ordinal or interval. For nominal bar graph, the bars are far apart rather than connected since the categories are not continuous. For ordinal and interval data, the bars should be joined to emphasize the degree of differences
  • 33. Given the bar graph of how students rate their library. A-strongly favorable, 90 B-favorable, 48 C-slightly favorable, 88 D-slightly unfavorable, 48 E-unfavorable, 15 F-strongly unfavorable, 25
  • 34. The Histogram of Person’s Age with Frequency of Travel
      age     freq   RF
      19-20   20     39.2%
      21-22   21     41.2%
      23-24   4      7.8%
      25-26   4      7.8%
      27-28   2      3.9%
      total   51     100%
  • 35. Exercises From the previous grouped data on algebra scores, a. Draw its histogram using the frequency on the y axis and midpoints on the x axis. b. Draw the line graph or frequency polygon using frequency on the y axis and midpoints on the x axis. c. Draw the less than and greater than ogives of the data. An ogive is a graph of the cumulative frequencies by class interval. For the greater than ogive, let the y axis be cf> and the x axis be LCB; for the less than ogive, let the y axis be cf< and the x axis be UCB.
  • 36. Con’t d. Plot the relative frequency using the y axis as the relative frequency in percent value while in the x axis the midpoints.
  • 37. Con’t [Figure: histogram and line graph of the algebra scores. x axis: midpoint (first class: LCB 24.5, midpoint 27, UCB 29.5); y axis: f, from 0 to 9.]
  • 38. Con’t [Figure: less than ogive. x axis: UCB, from 29.5 to 84.5; y axis: cf less than, from 0 to 40.]
  • 39. Con’t [Figure: greater than ogive. x axis: LCB, from 24.5 to 79.5; y axis: cf greater than, from 0 to 40.]
  • 40. Assignment No. 2 Given the scores in a statistics examination: 33 38 56 35 70 44 81 44 80 47 45 72 45 50 51 51 52 66 54 54 53 56 84 58 56 57 70 55 56 39 56 59 72 63 89 63 60 69 65 61 62 64 64 69 60 65 53 66 66 67 67 68 68 69 66 66 67 70 59 40 71 73 60 73 73 73 73 73 73 74 73 73 74 79 74 74 70 73 46 74 74 75 74 75 75 76 55 77 78 73 79 48 81 44 84 77 88 63 85 73
  • 41. Con’t 1. Construct the class interval, frequency table, class midpoint(use a whole number midpoint), less than and greater than cumulative frequency, upper and lower boundary and relative frequency. 2. Plot the histogram, frequency polygon, and ogives
  • 42. Con’t 3. Draw the pie chart and bar graph of the plans of computer science students with respect to attending a seminar. Compute for the Relative frequency of each. A-will not attend=45 B-probably will not attend=30 C-probably will attend=40 D-will attend=25
  • 43. Measures of Centrality and Location Mean for Ungrouped Data: X’ = ΣX / N, where X’ = the mean, ΣX = the sum of all scores/data, N = the total number of cases. Mean for Grouped Data: X’ = ΣfM / N, where X’ = the mean, M = the midpoint, fM = the product of the frequency and each midpoint, N = total number of cases.
  • 44. Con’t Ex. 1. Find the mean of 10, 20, 25, 30, 30, 35, 40 and 50. 2. Given the grades of 50 students in a statistics class:
      Class interval   f
      10-14            4
      15-19            3
      20-24            12
      25-29            10
      30-34            6
      35-39            6
      40-44            6
      45-49            3
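Both mean formulas can be checked on the two examples with a short Python sketch. The variable names are illustrative, and each class is entered as (lower limit, upper limit, frequency) so that the midpoint M can be computed from the limits.

```python
# Example 1: ungrouped data, X' = ΣX / N.
values = [10, 20, 25, 30, 30, 35, 40, 50]
mean_ungrouped = sum(values) / len(values)

# Example 2: grouped data, X' = ΣfM / N.
# Each entry is (lower limit, upper limit, frequency).
classes = [(10, 14, 4), (15, 19, 3), (20, 24, 12), (25, 29, 10),
           (30, 34, 6), (35, 39, 6), (40, 44, 6), (45, 49, 3)]
total_f = sum(f for _, _, f in classes)                     # N
sum_fm = sum(f * (lo + hi) / 2 for lo, hi, f in classes)    # ΣfM
mean_grouped = sum_fm / total_f

print(mean_ungrouped)   # 30.0
print(mean_grouped)     # 28.8
```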
  • 45. Con’t The weighted mean. The weighted arithmetic mean of given groups of data is the average of the group means, each counted according to its weight: WX’ = ΣXw / N, where WX’ = the weighted mean, w = the weight of X, ΣXw = the sum of the products of each X and its weight, N = Σw = the sum of the weights.
  • 46. Con’t Ex. Find the weighted mean of four groups of means below:
      Group, i   1    2    3    4
      Xi         60   50   70   75
      Wi         10   20   40   50
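As a quick check of the weighted mean example, the computation ΣXw / Σw can be written out in Python. The list names are illustrative.

```python
group_means = [60, 50, 70, 75]   # Xi
weights = [10, 20, 40, 50]       # Wi

sum_xw = sum(x * w for x, w in zip(group_means, weights))   # ΣXw
weighted_mean = sum_xw / sum(weights)                       # ΣXw / Σw

print(sum_xw, sum(weights))       # 8150 120
print(round(weighted_mean, 2))    # 67.92
```

The result sits closer to 70 and 75 than the plain average of the four means (63.75) because groups 3 and 4 carry most of the weight.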
  • 47. Con’t Median for Ungrouped Data The median of ungrouped data is the centermost score in a distribution arranged in order of magnitude. With X(k) denoting the kth score in the ordered list: Mdn = [X(N/2) + X((N + 2)/2)] / 2 if N is even; Mdn = X((N + 1)/2) if N is odd. Ex. Find the median of the following sets of scores: Score A: 12, 15, 19, 21, 6, 4, 2 Score B: 18, 22, 31, 12, 3, 9, 11, 8
  • 48. Con’t Median for Grouped Data Procedure: 1. Compute the cumulative frequency less than. 2. Find N/2. 3. Locate the class interval in which the N/2th case falls (the median class), and determine the exact limits of this interval. 4. Apply the formula Mdn = L + [(N/2 – F)i]/fm, where L = exact lower limit of the median class, F = the sum of all frequencies preceding the median class, fm = frequency of the median class, i = size of the class interval, N = total number of cases.
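The position formulas for the median can be sketched in Python as follows. The function name `median_ungrouped` is illustrative. The scores are sorted first, since the formulas refer to positions in the ordered list.

```python
def median_ungrouped(scores):
    """Median by the position formulas, applied to the sorted scores."""
    x = sorted(scores)
    n = len(x)
    if n % 2 == 1:
        # N odd: the score at position (N + 1)/2 (1-based)
        return x[(n + 1) // 2 - 1]
    # N even: average of the scores at positions N/2 and (N + 2)/2 (1-based)
    return (x[n // 2 - 1] + x[(n + 2) // 2 - 1]) / 2

score_a = [12, 15, 19, 21, 6, 4, 2]
score_b = [18, 22, 31, 12, 3, 9, 11, 8]

print(median_ungrouped(score_a))   # 12
print(median_ungrouped(score_b))   # 11.5
```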
  • 49. Con’t Ex. Find the median of the given frequency table.
      class interval   f    cf<
      25-29            3    3
      30-34            5    8
      35-39            10   18
      40-44            15   33
      45-49            15   48
      50-54            15   63
      55-59            21   84
      60-64            8    92
      65-69            6    98
      70-74            2    100
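A minimal Python sketch of the grouped-median procedure, applied to this frequency table, is shown below. It assumes integer class limits, so the exact lower limit is taken as the lower limit minus 0.5. The function name is illustrative.

```python
# (lower limit, upper limit, frequency) for each class in the table.
classes = [(25, 29, 3), (30, 34, 5), (35, 39, 10), (40, 44, 15),
           (45, 49, 15), (50, 54, 15), (55, 59, 21), (60, 64, 8),
           (65, 69, 6), (70, 74, 2)]

def grouped_median(classes):
    n = sum(f for _, _, f in classes)
    half = n / 2                        # N/2
    cum_before = 0                      # F: sum of frequencies before the class
    for lo, hi, f in classes:
        if cum_before + f >= half:      # first class whose cf< reaches N/2
            lower_boundary = lo - 0.5   # L: exact lower limit (assumes integer limits)
            width = hi - lo + 1         # i: size of the class interval
            return lower_boundary + (half - cum_before) * width / f
        cum_before += f

print(round(grouped_median(classes), 2))   # 50.17
```

Here N/2 = 50 falls in the class 50-54, so L = 49.5, F = 48, fm = 15 and i = 5.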
  • 50. Con’t Mode of Ungrouped Data It is defined as the data value or specific score which has the highest frequency. Find the mode of the following data. Data A : 10, 11, 13, 15, 17, 20 Data B: 2, 3, 4, 4, 5, 7, 8, 10 Data C: 3.5, 4.8, 5.5, 6.2, 6.2, 6.2, 7.3, 7.3, 7.3, 8.8
  • 51. Mode of Grouped Data For grouped data, the mode is taken as the midpoint of the interval containing the largest number of cases (the modal class). A more refined estimate interpolates within the modal class: Mdo = L + [d1/(d1 + d2)]i, where L = exact lower limit of the modal class, d1 = the difference between the frequency of the modal class and the frequency of the interval preceding it, d2 = the difference between the frequency of the modal class and the frequency of the interval after it, i = size of the class interval.
  • 52. Ex. Find the mode of the given frequency table.
      class interval   f    cf<
      25-29            3    3
      30-34            5    8
      35-39            10   18
      40-44            15   33
      45-49            15   48
      50-54            15   63
      55-59            21   84
      60-64            8    92
      65-69            6    98
      70-74            2    100
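The grouped-mode formula can be sketched in Python for this table. The sketch assumes a single modal class (on ties it simply takes the first) and integer class limits. The function name is illustrative.

```python
# (lower limit, upper limit, frequency) for each class in the table.
classes = [(25, 29, 3), (30, 34, 5), (35, 39, 10), (40, 44, 15),
           (45, 49, 15), (50, 54, 15), (55, 59, 21), (60, 64, 8),
           (65, 69, 6), (70, 74, 2)]

def grouped_mode(classes):
    freqs = [f for _, _, f in classes]
    m = freqs.index(max(freqs))                         # index of the modal class
    lo, hi, f_modal = classes[m]
    f_prev = freqs[m - 1] if m > 0 else 0               # frequency of the class before
    f_next = freqs[m + 1] if m < len(freqs) - 1 else 0  # frequency of the class after
    d1 = f_modal - f_prev
    d2 = f_modal - f_next
    return (lo - 0.5) + d1 / (d1 + d2) * (hi - lo + 1)  # L + [d1/(d1 + d2)] i

print(round(grouped_mode(classes), 2))   # 56.08
```

The modal class is 55-59 (f = 21), so L = 54.5, d1 = 21 – 15 = 6, d2 = 21 – 8 = 13 and i = 5.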
  • 53. Exercises 1. Determine the mean, median and mode of the age of 15 students in a certain class. 15, 18, 17, 16, 19, 18, 23, 24, 18, 16, 17, 20, 21, 19 2. To qualify for a scholarship, a student should have garnered an average score of 2.25. Determine whether a certain student with the following grades is qualified for a scholarship.
  • 54. Subject   no. of units   grade
        A         1              2.0
        B         2              3.0
        C         3              1.5
        D         3              1.25
        E         5              2.0
  • 55. 3. Find the mean, median and mode of the given grouped data.
      Classes   f
      11-22     2
      23-34     8
      35-46     11
      47-58     19
      59-70     14
      71-82     5
      83-94     1
  • 56. Quartiles refer to the values that divide the distribution into four equal parts. There are 3 quartiles represented by Q1 , Q2 and Q3. The value Q1 refers to the value in the distribution that falls on the first one fourth of the distribution arranged in magnitude. In the case of Q2 or the second quartile, this value corresponds to the median. In the case of third quartile or Q3, this value corresponds to three fourths of the distribution.
  • 57. [Figure: the position of the quartiles in a given set of data, from the lowest value L to the highest value H. Q1 = 1st quartile, Q2 = 2nd quartile, Q3 = 3rd quartile.]
  • 58. For grouped data, the computing formula of the kth quartile, where k = 1, 2, 3, is given by Qk = L + [(kn/4 – F)/fm]i, where L = lower class boundary of the kth quartile class, F = cumulative frequency before the kth quartile class, fm = frequency of the kth quartile class, i = size of the class interval, n = total number of cases.
  • 59. Exercises Compute the value of the first and third quartile of the given data
      class interval   f    cf<
      25-29            3    3
      30-34            5    8
      35-39            10   18
      40-44            15   33
      45-49            15   48
      50-54            15   63
      55-59            21   84
      60-64            8    92
      65-69            6    98
      70-74            2    100
  • 60. Decile: If the given data is divided into ten equal parts, then we have nine points of division known as deciles, denoted by D1, D2, D3, D4, … and D9. Dk = L + [(kn/10 – F)/fm]i, where k = 1, 2, 3, 4, …, 9 and L, F, fm and i refer to the kth decile class.
  • 61. Exercises Compute the value of the third, fifth and seventh decile of the given data
      class interval   f    cf<
      25-29            3    3
      30-34            5    8
      35-39            10   18
      40-44            15   33
      45-49            15   48
      50-54            15   63
      55-59            21   84
      60-64            8    92
      65-69            6    98
      70-74            2    100
  • 62. Percentiles refer to those values that divide a distribution into one hundred equal parts. There are 99 percentiles, represented by P1, P2, P3, P4, P5, … and P99. When we say 55th percentile, we are referring to the value at or below which 55/100 of the data fall. Pk = L + [(kn/100 – F)/fm]i, where k = 1, 2, 3, 4, 5, …, 99
  • 63. Exercises Compute the value of the 30th, 55th, 68th and 88th percentile of the given data
      class interval   f    cf<
      25-29            3    3
      30-34            5    8
      35-39            10   18
      40-44            15   33
      45-49            15   48
      50-54            15   63
      55-59            21   84
      60-64            8    92
      65-69            6    98
      70-74            2    100
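Quartiles, deciles and percentiles of grouped data all use the same interpolation, differing only in whether kn is divided by 4, 10 or 100. The sketch below wraps this in one hypothetical helper, `grouped_quantile(classes, k, parts)`, and applies it to the table used in the exercises. Integer class limits are assumed, so the lower class boundary is the lower limit minus 0.5.

```python
# (lower limit, upper limit, frequency) for each class in the table.
classes = [(25, 29, 3), (30, 34, 5), (35, 39, 10), (40, 44, 15),
           (45, 49, 15), (50, 54, 15), (55, 59, 21), (60, 64, 8),
           (65, 69, 6), (70, 74, 2)]

def grouped_quantile(classes, k, parts):
    """kth of `parts` division points: parts=4 quartile, 10 decile, 100 percentile."""
    n = sum(f for _, _, f in classes)
    target = k * n / parts              # kn/4, kn/10 or kn/100
    cum_before = 0                      # F: cumulative frequency before the class
    for lo, hi, f in classes:
        if cum_before + f >= target:    # class in which the target case falls
            return (lo - 0.5) + (target - cum_before) * (hi - lo + 1) / f
        cum_before += f

results = {
    "Q1": grouped_quantile(classes, 1, 4),
    "Q3": grouped_quantile(classes, 3, 4),
    "D3": grouped_quantile(classes, 3, 10),
    "D5": grouped_quantile(classes, 5, 10),
    "D7": grouped_quantile(classes, 7, 10),
    "P30": grouped_quantile(classes, 30, 100),
    "P55": grouped_quantile(classes, 55, 100),
    "P68": grouped_quantile(classes, 68, 100),
    "P88": grouped_quantile(classes, 88, 100),
}
for name, value in results.items():
    print(name, round(value, 2))
```

Under these assumptions Q1 is about 41.83 and Q3 about 57.36, while D3 and P30 coincide at 43.5 and D5 equals the median, as expected.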
  • 64. Assignment no. 3 I. The rates per hour in pesos of 12 employees of a certain company were taken and are shown below. 44.75, 44.75, 38.15, 39.25, 18.00, 15.75, 44.75, 39.25, 18.50, 65.25, 71.25, 77.50 a. Find the mean, median and mode. b. If the value 15.75 was incorrectly written as 45.75, which measures of central tendency will be affected? Support your answer.
  • 65. II. The final grades of a student in six subjects were tabulated below.
      Subj       units   final grade
      Algebra    3       60
      Religion   2       90
      English    3       75
      Pilipino   3       86
      PE         1       98
      History    3       70
      a. Determine the weighted mean b. If the subjects were of equal number of units, what would be his average?
  • 66. III. The ages of qualified voters in a certain barangay were taken and are shown below
      Class Interval   Frequency
      18-23            20
      24-29            25
      30-35            40
      36-41            52
      42-47            30
      48-53            21
      54-59            12
      60-65            6
      66-71            4
      72-77            1
  • 67. a. Find the mean, median and mode b. Find the 1st and 3rd quartile c. Find the 4th and 6th decile d. Find the 25th and 75th percentile
  • 68. Measure of Variation The range is considered to be the simplest form of measure of variation. It is the difference between the highest and the lowest value in the distribution: R = H – L. For grouped data, the range is the difference between the highest upper class boundary and the lowest lower class boundary. Example: find the range of the given grouped data in slide no. 59
  • 69. Semi-interquartile Range This value is obtained by getting one half of the difference between the third and the first quartile. Q = (Q3 – Q1)/2 Example: Find the semi-interquartile range of the previous example in slide no. 59
  • 70. Average Deviation The average deviation refers to the arithmetic mean of the absolute deviations of the values from the mean of the distribution. This measure is sometimes known as the mean absolute deviation. AD = Σ│x – x’│/ n Where x = the individual values x’ = mean of the distribution
  • 71. Steps in solving for AD 1. Arrange the values in column according to magnitude 2. Compute for the value of the mean x’ 3. Determine the deviations (x – x’) 4. Convert the deviations in step 3 into positive deviations. Use the absolute value sign. 5. Get the sum of the absolute deviations in step 4 6. Divide the sum in step 5 by n.
  • 72. Example: 1. Consider the following values: 16, 13, 9, 6, 15, 7, 11, 12 Find the average deviation.
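The six steps for the average deviation can be followed directly in Python for this example. The variable names are illustrative.

```python
values = [16, 13, 9, 6, 15, 7, 11, 12]

ordered = sorted(values)                       # step 1: arrange by magnitude
n = len(ordered)
mean = sum(ordered) / n                        # step 2: the mean x'
deviations = [x - mean for x in ordered]       # step 3: x - x'
abs_devs = [abs(d) for d in deviations]        # step 4: absolute deviations
total_abs_dev = sum(abs_devs)                  # step 5: their sum
average_deviation = total_abs_dev / n          # step 6: divide by n

print(mean)                # 11.125
print(total_abs_dev)       # 23.0
print(average_deviation)   # 2.875
```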
  • 73. For grouped data: AD = Σf│x – x’│ / n Where f = frequency of each class x = midpoint of each class x’ = mean of the distribution n = total number of frequency
  • 74. Example: Find the average deviation of the given data
      Classes   f
      11-22     2
      23-34     8
      35-46     11
      47-58     19
      59-70     14
      71-82     5
      83-94     1
  • 75. Variance For ungrouped data s² = Σ(x – x’)² / n Example: Find the variance of 16, 13, 9, 6, 15, 7, 11, 12
  • 76. For grouped data s² = Σf(x – x’)² / n, where f = frequency of each class, x = midpoint of each class interval, x’ = mean of the distribution, n = total number of frequency
  • 77. Example: Find the variance of the given data
      Classes   f
      11-22     2
      23-34     8
      35-46     11
      47-58     19
      59-70     14
      71-82     5
      83-94     1
  • 78. Coefficient of variation If you wish to compare the variability between different sets of scores or data, the coefficient of variation is a very useful measure for interval scale data. CV = s / x’, where s = standard deviation, x’ = the mean
  • 79. Example: In a particular university, a researcher wishes to compare the variation in scores of the urban students with that of the rural students in their college entrance test. It is known that the urban students’ mean score is 384 with a standard deviation of 101, while among the rural students the mean is 174 with a standard deviation of 53. Which group shows more variation in scores?
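The comparison in this example reduces to two divisions, CV = s / x’ for each group. A small Python check, with illustrative variable names:

```python
urban_mean, urban_sd = 384, 101
rural_mean, rural_sd = 174, 53

urban_cv = urban_sd / urban_mean   # CV = s / x'
rural_cv = rural_sd / rural_mean

print(round(urban_cv, 3))   # 0.263
print(round(rural_cv, 3))   # 0.305
```

Although the urban group has the larger standard deviation, the rural group has the larger coefficient of variation, so its scores vary more relative to their mean.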
  • 80. Standard Deviation s = √(s²) For ungrouped data: s = √[Σ(x – x’)² / n] For grouped data: s = √[Σf(x – x’)² / n]
  • 81. Find the standard deviation of the previous examples for ungrouped and grouped data. Find the standard deviation of the given data
      Classes   f
      11-22     2
      23-34     8
      35-46     11
      47-58     19
      59-70     14
      71-82     5
      83-94     1
  • 82. Find the standard deviation of 16, 13, 9, 6, 15, 7, 11, 12
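The variance and standard deviation formulas, which divide by n, can be sketched in Python for both the ungrouped values and the grouped table used in the examples. The variable names are illustrative, and each class is entered as (lower limit, upper limit, frequency) so that the midpoint x can be computed.

```python
import math

# Ungrouped data: s² = Σ(x - x')² / n
values = [16, 13, 9, 6, 15, 7, 11, 12]
n = len(values)
mean = sum(values) / n
variance = sum((x - mean) ** 2 for x in values) / n
std_dev = math.sqrt(variance)

# Grouped data: s² = Σf(x - x')² / n, with x the class midpoint
classes = [(11, 22, 2), (23, 34, 8), (35, 46, 11), (47, 58, 19),
           (59, 70, 14), (71, 82, 5), (83, 94, 1)]
total_f = sum(f for _, _, f in classes)
g_mean = sum(f * (lo + hi) / 2 for lo, hi, f in classes) / total_f
g_variance = sum(f * ((lo + hi) / 2 - g_mean) ** 2
                 for lo, hi, f in classes) / total_f
g_std_dev = math.sqrt(g_variance)

print(variance, round(std_dev, 2))                 # 11.359375 3.37
print(round(g_variance, 2), round(g_std_dev, 2))   # 248.16 15.75
```

Python's `statistics.pvariance` and `statistics.pstdev` use the same divide-by-n definition for ungrouped data, so they can serve as a cross-check.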
  • 83. Measure of variation for nominal data VR = 1 – fm/N, where VR = the variation ratio, fm = modal class frequency, N = total number of observations
  • 84. Example: With the data given by a clinical psychologist on the type of therapy used, compute the variation ratios.
      Type of therapy          No. of patients
                               YR 1980   YR 1985
      Logotherapy              20        8
      Reality Therapy          60        105
      Rational Therapy         42        6
      Transactional analysis   39        9
      Family therapy           52        5
      Others                   41        8
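The variation ratio for each year can be computed with a short Python sketch. The dictionary layout and function name are illustrative. The frequencies are listed in the same order as the therapies in the table.

```python
# Number of patients per therapy type, in table order.
therapy_counts = {
    1980: [20, 60, 42, 39, 52, 41],
    1985: [8, 105, 6, 9, 5, 8],
}

def variation_ratio(freqs):
    # VR = 1 - fm / N, with fm the modal frequency and N the total count
    return 1 - max(freqs) / sum(freqs)

for year, freqs in therapy_counts.items():
    print(year, sum(freqs), round(variation_ratio(freqs), 3))
```

The ratio drops from about 0.764 in 1980 to about 0.255 in 1985, reflecting how strongly the 1985 cases concentrate in the modal category, Reality Therapy.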
  • 85. Assignment no. 4 I. Compute for the semi-interquartile range, average deviation, variance and standard deviation of test III of assignment no. 3. II. Compute for the semi-interquartile range, average deviation, variance and standard deviation of test I of assignment no. 3.
  • 86. SIMPLE LINEAR REGRESSION AND MEASURES OF CORRELATION In this topic, you will learn how to predict the value of one dependent variable from the corresponding given value of the independent variable.
  • 87. The scatter diagram: In solving problems that concern estimation and forecasting, a scatter diagram can be used as a graphical approach. This technique consists of plotting the points corresponding to the paired scores of the independent and dependent variables, which are commonly represented by X and Y, on the X-Y coordinate system.
  • 88. Example: The working experience and income of 8 employees are given below
      Employee   Years of experience (X)   Income in thousands (Y)
      A          2                         8
      B          8                         10
      C          4                         11
      D          11                        15
      E          5                         9
      F          13                        17
      G          4                         8
      H          15                        14
  • 89. Using the Least Squares Linear Regression Equation: Y = a + bX, where b = [nΣxy – ΣxΣy] / [nΣx² – (Σx)²] and a = y’ – bx’. Obtain the equation of the given data and estimate the income of an employee if the number of years experience is 20 years.
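The least squares computation for the employee data can be sketched in Python as follows. The names `slope` and `intercept` stand for b and a in the formulas and are illustrative.

```python
# Years of experience (X) and income in thousands (Y) for employees A to H.
x = [2, 8, 4, 11, 5, 13, 4, 15]
y = [8, 10, 11, 15, 9, 17, 8, 14]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi * xi for xi in x)

# b = [nΣxy - ΣxΣy] / [nΣx² - (Σx)²]
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
# a = y' - b x'
intercept = sum_y / n - slope * sum_x / n

predicted = intercept + slope * 20   # estimated income at 20 years

print(round(slope, 4), round(intercept, 4))   # 0.627 6.6411
print(round(predicted, 2))                    # 19.18
```

Note that 20 years lies outside the observed range of 2 to 15 years, so the estimate of roughly 19.18 thousand is an extrapolation of the fitted line.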
  • 90. Standard Error of Estimate Se = √{[ΣYi² – aΣYi – bΣXiYi] / (n – 2)} The standard error of estimate is interpreted like a standard deviation: it measures the spread of the observed Y values around the regression line. For a given value of X, the observed Y will almost always fall between the upper and lower 3Se limits around the predicted value.
  • 91. Measures of Correlation The degree of relationship between variables is expressed into: 1. Perfect correlation (positive or negative) 2. Some degree of correlation (positive or negative) 3. No correlation
  • 92. A perfect correlation is either positive or negative, represented by +1 and -1. Some degree of correlation, positive or negative, is represented by +0.01 to +0.99 and -0.01 to -0.99. No correlation is represented by 0.
  • 93. 0 to +0.25        very small positive correlation
        +0.26 to +0.50    moderately small positive correlation
        +0.51 to +0.75    high positive correlation
        +0.76 to +0.99    very high positive correlation
        +1.00             perfect positive correlation
        0 to -0.25        very small negative correlation
        -0.26 to -0.50    moderately small negative correlation
        -0.51 to -0.75    high negative correlation
        -0.76 to -0.99    very high negative correlation
        -1.00             perfect negative correlation
  • 94. Anybody who wants to interpret the results of the coefficient of correlation should be guided by the following reminders: 1. A relationship between two variables does not necessarily mean that one is the cause or the effect of the other. It does not imply a cause-effect relationship. 2. When the computed Pearson r is high, it does not necessarily mean that one factor is strongly dependent on the other. On the other hand, when the computed Pearson r is small, it does not necessarily mean that one factor has no dependence on the other. 3. If there is reason to believe that the two variables are related and the computed Pearson r is high, the two variables can be regarded as really associated. On the other hand, if the correlation between the variables is low, other factors might be responsible for such a small association. 4. Lastly, the correlation coefficient simply informs us that when two variables change there may be a strong or weak relationship taking place.
  • 95. The formula for finding the Pearson r is r = [nΣXY – ΣXΣY] / √{[nΣX² – (ΣX)²] [nΣY² – (ΣY)²]}
  • 96. Example: Given two sets of scores, find the Pearson r and interpret the result.
      X    Y
      18   10
      16   14
      14   14
      13   12
      12   10
      10   8
      10   5
      8    6
      6    12
      3    0
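The Pearson r for these ten pairs can be computed from the raw sums in Python. The variable names are illustrative.

```python
import math

x = [18, 16, 14, 13, 12, 10, 10, 8, 6, 3]
y = [10, 14, 14, 12, 10, 8, 5, 6, 12, 0]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi * xi for xi in x)
sum_y2 = sum(yi * yi for yi in y)

# r = [nΣXY - ΣXΣY] / sqrt([nΣX² - (ΣX)²][nΣY² - (ΣY)²])
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(round(r, 2))   # 0.69
```

On the interpretation scale given earlier in the deck, a value of about +0.69 falls in the +0.51 to +0.75 band, a high positive correlation.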
  • 97. Correlation between Ordinal Data This is the Spearman Rank-Order Correlation Coefficient (Spearman Rho). For cases of 30 or less, Spearman ρ is the most widely used of the rank correlation methods. ρ = 1 – 6ΣD² / [n(n² – 1)], where D = RX – RY is the difference between the ranks of X and Y for each case and n is the number of cases.
  • 98. Example:
      Individual   Test X   Test Y
      1            18       24
      2            17       28
      3            14       30
      4            13       26
      5            12       22
      6            10       18
      7            8        15
      8            8        12
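A minimal Python sketch of the Spearman computation for this example follows. It assumes rank 1 goes to the highest score and that tied scores share the average of their positions (the two 8s on Test X each get rank 7.5), which is a common convention the slides do not spell out. The helper name `average_ranks` is illustrative.

```python
def average_ranks(values):
    """Rank 1 = highest value; tied values share the mean of their positions."""
    order = sorted(values, reverse=True)
    # A value first appearing at 0-based index i and occurring c times
    # occupies positions i+1 .. i+c, whose mean is (2i + c + 1) / 2.
    return [(2 * order.index(v) + order.count(v) + 1) / 2 for v in values]

test_x = [18, 17, 14, 13, 12, 10, 8, 8]
test_y = [24, 28, 30, 26, 22, 18, 15, 12]
n = len(test_x)

rank_x = average_ranks(test_x)
rank_y = average_ranks(test_y)
sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))   # ΣD²
rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(rank_x)
print(rank_y)
print(sum_d2, round(rho, 3))   # 14.5 0.827
```

With ties present the simple formula is only an approximation, but with a single tied pair the effect here is small.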
  • 99. Gamma Rank Order An alternative to the rank order correlation is the Goodman’s and Kruskal’s Gamma (G). The value of one variable can be estimated or predicted from the other variable when you have the knowledge of their values. The gamma can also be used when ties are found in the ranking of the data.
  • 100. G = (NS – N1) / (NS + N1), where NS = the number of pairs ordered in the parallel direction, N1 = the number of pairs ordered in the opposite direction
  • 101. Given a segment of the Filipino Electorate according to religion and political party
                   LAKAS   LP   NP   Total
      Catholic     50      25   20
      INC          34      72   21
      Born Again   22      12   10
      Total
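The pair counts NS and N1 for a cross-tabulation can be obtained mechanically, as in the Python sketch below. One assumption needs stating loudly: gamma presumes both variables are ordered, so the sketch simply treats the rows and columns in the order listed in the table. The function name `gamma_counts` is illustrative.

```python
# Rows: Catholic, INC, Born Again; columns: LAKAS, LP, NP (order as listed).
table = [[50, 25, 20],
         [34, 72, 21],
         [22, 12, 10]]

def gamma_counts(table):
    n_rows, n_cols = len(table), len(table[0])
    ns = n1 = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):       # only cells in lower rows
                for m in range(n_cols):
                    if m > j:                    # below and to the right: parallel
                        ns += table[i][j] * table[k][m]
                    elif m < j:                  # below and to the left: opposite
                        n1 += table[i][j] * table[k][m]
    return ns, n1

ns, n1 = gamma_counts(table)
g = (ns - n1) / (ns + n1)        # G = (NS - N1) / (NS + N1)

print(ns, n1, round(g, 3))       # 7993 6498 0.103
```

Reordering the rows or columns would change NS and N1, which is why the ordering assumption matters for interpreting G.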
  • 102. Correlation between Nominal Data Guttman’s coefficient of predictability (lambda) is a proportionate-reduction-in-error measure: an index of how much the error in predicting the values of one variable is reduced by knowing the value of another. $\lambda_c = \frac{\sum F_{BR} - M_{BC}}{N - M_{BC}}$ where $F_{BR}$ = the biggest cell frequency in each row, $M_{BC}$ = the biggest of the column totals, N = total number of observations.
  • 103. $\lambda_r = \frac{\sum F_{BC} - M_{BR}}{N - M_{BR}}$ where $F_{BC}$ = the biggest cell frequency in each column, $M_{BR}$ = the biggest of the row totals, N = total number of observations. Compute λc and λr for the segment of the Filipino electorate and political parties.
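As a hedged illustration, the two lambda formulas can be applied to the electorate table from slide 101. The sketch follows the slide definitions literally (largest cell per row against the largest column total, and the reverse). The variable names are assumptions.

```python
# Cell frequencies from slide 101; columns are LAKAS, LP, NP.
table = {
    "Catholic":   [50, 25, 20],
    "INC":        [34, 72, 21],
    "Born Again": [22, 12, 10],
}

rows = list(table.values())
columns = list(zip(*rows))                 # transpose rows into columns

n_total = sum(sum(row) for row in rows)    # N
max_row_total = max(sum(row) for row in rows)        # MBR
max_col_total = max(sum(col) for col in columns)     # MBC

sum_row_maxima = sum(max(row) for row in rows)       # sum of FBR
sum_col_maxima = sum(max(col) for col in columns)    # sum of FBC

lambda_c = (sum_row_maxima - max_col_total) / (n_total - max_col_total)
lambda_r = (sum_col_maxima - max_row_total) / (n_total - max_row_total)

print(round(lambda_c, 3), round(lambda_r, 3))
```

Here N = 266, the largest column total is 109 (LP) and the largest row total is 127 (INC), which gives λc = 35/157 and λr = 16/139.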
  • 104. Assignment no. 5 1. Given the average yearly cost and sales of company A for a period of 8 years, find the Pearson r and interpret the results.
      Year   Cost (per P10,000)   Sales (per P10,000)
      1960   15                   38
      1961   30                   53.3
      1962   16                   60
      1963   39                   72
      1964   20                   40
      1965   36                   47.5
      1966   45                   82
      1967   10                   21.5
  • 105. 2. Given the grades of 10 students in statistics, determine the Spearman rho and interpret the result.
      Student   Q1   Q2
      A         62   57
      B         90   88
      C         75   90
      D         60   67
      E         58   60
      F         89   79
      G         91   78
      H         90   62
      I         94   86
      J         50   55
  • 106. 3. Compute the gamma for the table shown and interpret the result.
      Socio-economic     EDUCATIONAL STATUS
      status             UPPER   MIDDLE   LOWER   TOTAL
      UPPER               24      19        5
      MIDDLE              12      54       29
      LOWER                9      26       25
      TOTAL
  • 107. 4. Compute λc and λr for the table in problem no. 3.
  • 108. Counting Techniques Consider the numbers 1, 2, 3 and 4. Suppose you want to determine the total number of 2-digit numbers that can be formed from them. First, let us assume that no digit is to be repeated. 12 13 14 21 23 24 31 32 34 41 42 43 Notice that we were able to list all the possibilities. In this example, we have 12 possible 2-digit numbers.
  • 109. Now, what if the digits can be repeated? 11 12 13 14 21 22 23 24 31 32 33 34 41 42 43 44 Hence, we have 16 possible outcomes. In general, if a first activity can be done in n1 ways and, after it has been done, a second activity can be done in n2 ways, then the total number of ways in which the two activities can be done is n1 × n2.
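The two listings can be reproduced by brute-force enumeration, which is a convenient way to check the n1 × n2 rule on small cases. This is a minimal sketch using the standard `itertools` module; the variable names are illustrative.

```python
from itertools import permutations, product

digits = "1234"

# No repetition: ordered choices of 2 distinct digits (4 x 3 ways).
no_repetition = ["".join(p) for p in permutations(digits, 2)]

# Repetition allowed: every digit may follow every digit (4 x 4 ways).
with_repetition = ["".join(p) for p in product(digits, repeat=2)]

print(len(no_repetition), no_repetition)
print(len(with_repetition), with_repetition)
```

The counts are 12 and 16, matching 4 × 3 and 4 × 4.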
  • 110. Example: 1. How many two digit numbers can be formed from the numbers 1,2,3 and 4 if a. Repetition is not allowed? b. Repetition is allowed? 2. How many three digit numbers can be formed from the digits 1,2,3,4 and 5 if any of the digits can be repeated? 3. The club members are going to elect their officers. If there are 5 candidates for president, 5 candidates for vice president and 3 for secretary, then how many ways can the officers be elected?
  • 111. 4. An office executive plans to buy a laptop for which there are 5 brands available. Each of the brands has 3 models and each model has 5 colors to choose from. In how many ways can the executive choose? 5. Consider the numbers 2, 3, 5 and 7. If repetition is not allowed, how many three-digit numbers can be formed such that a. They are all odd? b. They are all even? c. They are greater than 500?
  • 112. 6. A pizza place offers 3 choices of salad, 20 kinds of pizza and 4 different desserts. How many different 3-course meals can one order? 7. The executive board of a certain company consists of 5 males and 2 females. In how many ways can the president and secretary be chosen if a. The president must be female and the secretary must be male? b. The president and the secretary are of opposite sex? c. The president and the secretary should both be male?
  • 113. Permutation The term permutation refers to the arrangement of objects with reference to order. P(n,r) = n! / (n – r)! Evaluate: 1. P(10,6) 2. P(5,5) 3. P(4,3) + P(4,4)
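The three expressions can be evaluated directly from the formula. The sketch below defines P(n, r) from factorials and cross-checks it against `math.perm`, which is available in Python 3.8 and later.

```python
from math import factorial, perm

def P(n, r):
    """Number of permutations of n objects taken r at a time: n! / (n - r)!."""
    return factorial(n) // factorial(n - r)

answer_1 = P(10, 6)              # 10 * 9 * 8 * 7 * 6 * 5
answer_2 = P(5, 5)               # 5!
answer_3 = P(4, 3) + P(4, 4)     # 24 + 24

print(answer_1, answer_2, answer_3)  # 151200 120 48
```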
  • 114. Examples: 1. In how many ways can a president, a vice president, a secretary and a treasurer be elected from a class of 40 students? 2. In how many ways can 7 individuals be seated in a row of 7 chairs? 3. In how many ways can 9 individuals be seated in a row of 9 chairs if two of them want to be seated side by side?
  • 115. 4. Suppose 5 different math books and 7 different physics books are to be arranged on a shelf. In how many ways can the books be arranged if books of the same subject must be placed side by side? 5. Determine the number of possible permutations of the letters of the word MISSISSIPPI. 6. Find the total number of 8-digit numbers that can be formed using all the digits in the numeral 55777115.
  • 116. 7. In how many ways can 6 persons be seated around a table with 6 chairs if two of them want to be seated side by side? 8. In a local election, there are 7 people running for 3 positions. In how many ways can the positions be filled?
  • 117. Combination A combination is a selection of objects without regard to order. nCr = C(n,r) = n! / [r!(n – r)!] Evaluate: 1. 8C4 2. 5(5C4 – 5C2) 3. 7C5 / (7C6 – 7C2)
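A quick check of the three expressions is sketched below, using `math.comb` (Python 3.8 and later) for nCr. Note that the second and third expressions come out negative because 5C2 and 7C2 are larger than 5C4 and 7C6.

```python
from math import comb

value_1 = comb(8, 4)                               # 8C4
value_2 = 5 * (comb(5, 4) - comb(5, 2))            # 5(5C4 - 5C2) = 5(5 - 10)
value_3 = comb(7, 5) / (comb(7, 6) - comb(7, 2))   # 7C5 / (7C6 - 7C2) = 21 / (7 - 21)

print(value_1, value_2, value_3)  # 70 -25 -1.5
```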
  • 118. 1. A class consists of 12 boys and 10 girls. a. In how many ways can the class elect a president, a vice president, a secretary and a treasurer? b. In how many ways can the class elect 4 members of a certain committee? 2. In how many ways can a student answer 6 out of 10 questions? 3. In how many ways can a student answer 6 out of 10 questions if he is required to answer 2 of the first 5 questions?
  • 119. 4. In how many ways can 3 balls be drawn from a box containing 8 red and 6 green balls? 5. A box contains 8 red and 6 green balls. In how many ways can 3 balls be drawn such that a. They are all green? b. 2 are red and 1 is green? c. 1 is red and 2 are green?
  • 120. 6. A shipment of 40 computers is unloaded from the van and tested. 6 of them are defective. In how many ways can we select a set of 5 computers and get at least one defective? 7. Five letters a, b, c, d, e are available to choose from. In how many ways could you choose a. None of them? b. At least two of them? c. At most three of them?
  • 121. Assignment no. 6 1. How many possible outcomes are there if a. A die is rolled? b. A pair of dice is rolled? 2. In how many ways can 5 math teachers be assigned to 4 available subjects if each of the 5 teachers have equal chance of being assigned to any of the 4 subjects?
  • 122. 3. Consider the numbers 1, 2, 3, 5 and 6. How many 3-digit numbers can be formed from these numbers if a. Repetition is not allowed and 0 should not be in the first digit? b. Repetition is allowed and 0 should not be in the first digit? 4. A college has 3 entrance gates and 2 exit gates. In how many ways can a student enter then leave the building?
  • 123. 5. In how many ways can 9 passengers be seated in a bus if there are only 5 seats available? 6. In how many ways can 4 boys and 4 girls be seated in a row of 8 chairs if a. They can sit anywhere? b. The boys and girls are to be seated alternately? 7. In how many ways can ten participants in a race place first, second and third?
  • 124. 8. Determine the number of distinct permutations of each of the following: a. STATISTICS b. ADRENALIN c. 44044999404 9. A class consists of 12 boys and 10 girls. In how many ways can a committee of five be formed if a. All members are boys? b. 2 are boys and 3 are girls?
  • 125. 10. In how many ways can a student answer an exam if, out of the 6 problems, he is required to answer only 4?
  • 126. Probability In the study of probability, we shall consider activities for which the outcomes cannot be predicted with certainty. These activities, called experiments, always result in a single outcome. Although the single outcome cannot be predicted before the experiment is performed, the set of all possible outcomes can be determined. This set of all possible outcomes is referred to as the sample space. Each individual element or outcome in a sample space is known as a sample point.
  • 127. Definition of terms: 1. Random experiment- any process of generating a set of data or observations that can be repeated under basically the same conditions, which lead to well defined outcomes. 2. Sample space – set of all possible outcomes of an experiment, usually denoted by S. 3. Sample point- an element of the sample space or outcomes.
  • 128. 4. Event – any subset of the sample space, usually denoted by a capital letter. 5. Null space – a subset of the sample space that contains no elements, denoted by the symbol Ø. 6. Simple event – an event which contains only one element of the sample space. 7. Compound event – an event that can be expressed as the union of simple events, thus containing more than one sample point. 8. Mutually exclusive events – two events A and B are mutually exclusive if they have no elements in common, that is, A∩B = Ø.
  • 129. The probability of an event A, denoted by P(A), is the sum of the probabilities of the mutually exclusive outcomes that constitute the event. It must satisfy the following property: 0 ≤ P(A) ≤ 1
  • 130. Example: 1. Consider the activity of rolling a die. This activity has 6 possible outcomes, that is, 1, 2, 3, 4, 5 and 6. Thus, S = {1,2,3,4,5,6}. Any number from 1 to 6 is a sample point of S, so there are 6 sample points. If we let A be the event of getting an even number and B the event of getting a perfect square, then A = {2,4,6} and B = {1,4}. Note that the elements of A and B are elements of the sample space S. The numbers of sample points in the sample space S and in events A and B are usually written as n(S) = 6, n(A) = 3 and n(B) = 2.
  • 131. 2. If a pair of dice is rolled, then determine the number of sample points of the following: a. Sample space b. Event of getting a sum of 5. c. Event of getting a sum of at most 4. 3. A box contains 6 red and 4 green balls. If three balls are drawn from the box, then determine the number of sample points of the following: a. The sample space b. The event of getting all green balls c. The event of getting 1 red and 2 green balls.
  • 132. Probability is the chance that an event will happen. The probability of an event A, denoted by P(A), is a number between 0 and 1, including the values 0 and 1. This number can be expressed as a fraction, as a decimal or as a percent. When we assign a probability of 0 to event A, it means that it is impossible for event A to occur. When event A is assigned a probability of 1, we say that event A is certain to occur.
  • 133. P(A) + P(A’) = 1 The probability of occurrence plus the probability of non-occurrence is always equal to 1. Example: A student in a statistics class computed the probability of passing the subject to be 0.55. Based on this information, what is the probability that he is not going to pass the subject?
  • 134. Three approaches to probability: 1. Subjective probability – it is determined by the use of intuition, personal beliefs and other indirect information. 2. A posteriori or relative frequency probability (empirical probability) – it is determined by repeating the experiment a large number of times and using the following rule: $P(A) = \frac{\text{no. of times event A occurred}}{\text{no. of times the experiment was repeated}}$
  • 135. Example: 1. Records show that 120 out of 500 students who entered a CS/IT program left the school due to financial problems. What is the probability that a freshman entering this college will leave the school due to financial problems?
  • 136. 2. Last year, the efficiency ratings of the employees of a certain company were taken and presented in the frequency distribution below:
      Efficiency rating   No. of employees
      60-65               12
      66-71               10
      72-77               31
      78-83               29
      84-89                8
    Based on the data, what can we say about the proportion of employees this year who will have an efficiency rating from 72-77? From 84-89?
  • 137. 3. A priori or classical probability – it is determined even before the experiment is performed, using the following rule: $P(A) = \frac{n(A)}{n(S)}$ where n(A) = no. of sample points in event A and n(S) = no. of sample points in the sample space S.
  • 138. 1. If a coin is tossed , what is the probability of getting a head? 2. If two coins are tossed, what is the probability of getting both heads? 3. If a die is rolled, what is the probability of getting an odd number? An even number? A perfect square? 4. If a pair of dice is rolled, what is the probability of getting a sum of 6? A sum of 13?
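The classical rule P(A) = n(A) / n(S) can be illustrated by enumerating the sample space for a pair of dice, as in item 4 above. This is a minimal sketch; the helper `classical_probability` is an illustrative name, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

# Sample space for rolling a pair of dice: 36 equally likely ordered pairs.
sample_space = list(product(range(1, 7), repeat=2))

def classical_probability(event):
    """P(A) = n(A) / n(S), where `event` is a true/false test on an outcome."""
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))

p_sum_6 = classical_probability(lambda roll: sum(roll) == 6)
p_sum_13 = classical_probability(lambda roll: sum(roll) == 13)

print(p_sum_6, p_sum_13)  # 5/36 0
```

A sum of 13 is impossible with two dice, so its probability is 0, in line with the interpretation of P(A) = 0 on slide 132.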
  • 139. 5. The probability that a college student without a flu shot will get the flu is 0.42. What is the probability that a college student without the flu shot will not get the flu? 6. A box contains 7 red and 6 green balls. If 2 balls are drawn from the box, what is the probability of getting both green? 1 red and 1 green?
  • 140. Addition Rule: In practice, the probabilities of two or more events are usually considered together. If we let A and B be events, then these two events can be combined to form another event. The event that at least one of the events A or B will happen is denoted by A∪B. The event that both events A and B will occur is denoted by A∩B. The probability of A∪B, denoted by P(A∪B), is given by P(A∪B) = P(A) + P(B) – P(A∩B)
  • 141. Two events A and B are said to be mutually exclusive if they cannot both occur at the same time. This implies that the occurrence of event A excludes the occurrence of event B and vice versa. Therefore A∩B has no sample points, so P(A∩B) = 0, and the previous equation becomes P(A∪B) = P(A) + P(B)
  • 142. 1. Consider rolling a die and the events of getting an odd number, an even number and a perfect square. Determine the probability of getting a. An odd or an even number. b. An even number or a perfect square. (this implies that the two events can occur both at the same time. Therefore the two events are non-mutually exclusive events)
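The die example above can be worked through with sets, which makes the role of the overlap P(A∩B) visible. This is a minimal sketch with illustrative names; it also cross-checks the addition rule against a direct count of the union.

```python
from fractions import Fraction

S = set(range(1, 7))          # sample space for one die
odd = {1, 3, 5}
even = {2, 4, 6}
perfect_square = {1, 4}

def prob(event):
    """Classical probability n(A) / n(S)."""
    return Fraction(len(event), len(S))

def prob_union(a, b):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return prob(a) + prob(b) - prob(a & b)

p_odd_or_even = prob_union(odd, even)                 # mutually exclusive events
p_even_or_square = prob_union(even, perfect_square)   # the events share the outcome 4

print(p_odd_or_even, p_even_or_square)  # 1 2/3
```

For odd and even numbers the intersection is empty, so the rule reduces to P(A) + P(B). For even numbers and perfect squares the shared outcome 4 would be counted twice without the subtraction.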
  • 143. 2. A card is drawn from an ordinary deck of 52 playing cards. Find the probability of getting a. An ace or a queen b. A queen or a face card c. A black card or a queen
  • 144. 3. You are going to roll a pair of dice. Find the probability of getting a sum that is even or a sum that is a multiple of 3. 4. A student goes to the library. The probability that the student checks out a work of fiction is 40%, a work of non-fiction is 30%, and both fiction and non-fiction is 20%. What is the probability that the student checks out a work of fiction, non-fiction or both?
  • 145. 5. The probability that Anita will buy machine A is 7/11 and the probability that she will buy machine B is 5/11. If the probability of buying either machine A or machine B is 9/11, what is the probability of buying both machines?
  • 146. 6. A community swim team has 150 members. Seventy-five of the members are advanced swimmers. Forty-seven of the members are intermediate swimmers. The remainder are novice swimmers. Forty of the advanced swimmers practice 4 times a week. Thirty of the intermediate swimmers practice 4 times a week. Ten of the novice swimmers practice 4 times a week. Suppose one member of the swim team is randomly chosen. Answer the questions (Verify the answers):
  • 147. a. What is the probability that the member is a novice swimmer? b. What is the probability that the member practices 4 times a week? c. What is the probability that the member is an advanced swimmer and practices 4 times a week? d. What is the probability that the member is an advanced swimmer and an intermediate swimmer? Are these events mutually exclusive?
  • 148. SEATWORK 1. A BOX CONTAINS 7 RED, 3 GREEN AND 2 YELLOW BALLS. IF ONE BALL IS DRAWN FROM THE BOX, THEN WHAT IS THE PROBABILITY OF GETTING • A RED? • A NON-RED? • A NON-GREEN? 2. SUPPOSE THAT WE ROLL A PAIR OF DICE. WHAT IS THE PROBABILITY OF GETTING A SUM OF 6 OR 8? 3. SUPPOSE WE PICK ONE CARD FROM A DECK OF CARDS. WHAT IS THE PROBABILITY OF GETTING • A KING OR A SPADE? • A KING OR A NUMBER 8? 4. KLAUS IS TRYING TO CHOOSE WHERE TO GO ON VACATION. HIS CHOICES ARE A = BAGUIO AND B = TAGAYTAY. HE CAN ONLY AFFORD ONE VACATION. THE PROBABILITY OF CHOOSING A IS 0.36 AND THE PROBABILITY OF CHOOSING B IS 0.44. WHAT IS THE PROBABILITY THAT HE CHOOSES TO GO TO EITHER A OR B? WHAT IS THE PROBABILITY THAT HE WILL NOT CHOOSE EITHER OF THE TWO DESTINATIONS?
  • 149. Conditional Probability and Multiplication Rule It is the probability that a second event will occur if the first event already happened. Symbolically, conditional probability is written as P(A/B) and is read as the probability of event A given that B has occurred. The computing formula for the conditional probability of A given B is given by P(A/B) = P(A∩B)/P(B), provided P(B) is not equal to zero.
  • 150. 1. Let P(A) = 0.55 P(B) = 0.35 P(A∩B) = 0.20 Find P(A/B) and P(B/A) 2. A die is rolled. If the result is an even number, what is the probability that it is a perfect square? 3. A card is drawn from a deck of 52 cards. Given that the card drawn is a face card, then what is the probability of getting a king? A spade? A red card?
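Item 1 above is a direct application of P(A/B) = P(A∩B) / P(B). A small sketch with exact fractions follows; the function name is illustrative, and the guard mirrors the condition that P(B) must not be zero.

```python
from fractions import Fraction

def conditional(p_joint, p_given):
    """P(A/B) = P(A and B) / P(B), defined only when P(B) is not zero."""
    if p_given == 0:
        raise ValueError("conditioning event has probability zero")
    return p_joint / p_given

p_a = Fraction(55, 100)
p_b = Fraction(35, 100)
p_a_and_b = Fraction(20, 100)

p_a_given_b = conditional(p_a_and_b, p_b)   # 0.20 / 0.35
p_b_given_a = conditional(p_a_and_b, p_a)   # 0.20 / 0.55

print(p_a_given_b, p_b_given_a)  # 4/7 4/11
```

The two results differ (about 0.571 and 0.364), which shows that P(A/B) and P(B/A) are not interchangeable.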
  • 151. 4. A vendor has 35 balloons on strings. 20 balloons are yellow, 8 are red and 7 are green. A balloon was selected at random and sold. Given that the balloon selected and sold is yellow, what is the probability that the next balloon selected and sold at random is also yellow? 5. Given that 25 microwaves are on display in a certain store but 2 of them are defective. A customer wishes to buy 2 microwaves and pick them up without replacement. Find the probability that the two are defective.
  • 152. 6. Should women participate in combat?
                 Yes   No
      Male       32    18
      Female      8    42
    a. Find the probability that the respondent answered YES given that the respondent was a female. b. Find the probability that the respondent was a male given that the respondent answered NO.
  • 153. 7. A box contains 3 red and 8 black balls. If two balls are drawn in succession without replacement, what is the probability that a. Both are red? b. The first ball is red and the second ball is black? 8. A box contains 3 red and 8 black balls. If 2 balls are drawn at random with replacement, what is the probability that both are red?
  • 154. Assignment no. 7 1. A BOX CONTAINS 7 RED, 3 GREEN AND 2 YELLOW BALLS. IF ONE BALL IS DRAWN FROM THE BOX, THEN WHAT IS THE PROBABILITY OF GETTING • A RED? • A NON-RED? • A NON-GREEN? 2. SUPPOSE THAT WE ROLL A PAIR OF DICE. WHAT IS THE PROBABILITY OF GETTING A SUM OF 6 OR 8? 3. SUPPOSE WE PICK ONE CARD FROM A DECK OF CARDS. WHAT IS THE PROBABILITY OF GETTING • A KING OR A SPADE? • A KING OR A NUMBER 8? 4. KLAUS IS TRYING TO CHOOSE WHERE TO GO ON VACATION. HIS CHOICES ARE A = BAGUIO AND B = TAGAYTAY. HE CAN ONLY AFFORD ONE VACATION. THE PROBABILITY OF CHOOSING A IS 0.36 AND THE PROBABILITY OF CHOOSING B IS 0.44. WHAT IS THE PROBABILITY THAT HE CHOOSES TO GO TO EITHER A OR B? WHAT IS THE PROBABILITY THAT HE WILL NOT CHOOSE EITHER OF THE TWO DESTINATIONS?
  • 155. 5. The probability that it is Friday and that a student is absent is 0.03. Since there are 5 school days in a week, the probability that it is Friday is 0.2. What is the probability that a student is absent given that today is Friday?
  • 156. Normal Distribution The normal probability curve is one of the most commonly used theoretical distributions in statistical inference. The mathematical equation of the normal curve was developed by De Moivre in 1733. The distribution is sometimes called the Gaussian distribution in honor of Gauss, who also derived the equation in the 19th century.
  • 157. Con’t A large population investigated in education and the behavioral sciences has characteristics that follow a normal distribution. If we are to study, for instance, the scholastic mental capacity of a school population N= 1500, we may find that majority of the student population will yield average scores, a small portion will yield above and below average scores and a few students will yield extremely high and low scores.
  • 158. Con’t The characteristics of the normal curve are: 1. The curve is symmetrical and bell shaped. It has its highest point at the center. The curve falls off on both sides at exactly the same rate with distance from the center. Therefore, if the curve is folded at the middle, the two sides are exactly the same size and shape.
  • 159. Con’t 2. The number of cases, N, is infinite. This is the reason why the curve is asymptotic to the baseline which means that the curve at both sides does not touch the baseline or the axis, and that the curve may extend infinitely to both directions. 3. The three measures of central tendency, mean, median and mode coincide at one point at the center of the distribution.
  • 160. Con’t 4. The height of the curve indicates the frequency of cases, expressed as probability, proportion or percentage. Hence, the total area under the normal curve is 1.0 in terms of probability or proportion and 100% in terms of percentage. Thus one half of the area is 50%. 5. The basic unit of measurement along the baseline is expressed in sigma units (σ) or standard deviations. These are also called Z-scores.
  • 161. Con’t 6. Two parameters are used to describe the curve. One is the mean (μ or x’), which is equal to zero, and the other is the standard deviation (σ), which is equal to 1. 7. Z scores departing from the mean (μ or x’) toward the right of the curve are positive, while scores departing from the mean toward the left are negative.
  • 163. From the previous curve we can say that: 1. Approximately 68% of the values in the given set of data fall within plus or minus 1 standard deviation from the mean. In symbols, the interval is from (x’ – 1σ) to (x’ + 1σ). 2. Approximately 95% of the values in the given set of data fall within plus or minus 2 standard deviations from the mean. In symbols, the interval is from (x’ – 2σ) to (x’ + 2σ), and so on.
  • 164. To illustrate the significance of the empirical rule, consider the NCEE scores of students in a certain college whose mean score x’ or μ = 65 and standard deviation σ or SD = 6. 1. Approximately 68% of the students in that college have NCEE scores between 65 plus or minus 6, that is, from 65 – (1)(6) to 65 + (1)(6), or 59 to 71.
  • 165. The Standard Score The standard scores Z follow a normal distribution with mean x’ = 0 and SD = 1. A raw score x is transformed into a standard score by using the formula below. Z = (x – x’) / SD
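The transformation can be tried on the NCEE illustration from slide 164 (mean 65, SD 6). This is a minimal sketch; the function name `z_score` is illustrative.

```python
def z_score(x, mean, sd):
    """Standard score: how many standard deviations x lies from the mean."""
    return (x - mean) / sd

mean, sd = 65, 6                   # NCEE illustration from slide 164

z_low = z_score(59, mean, sd)      # one SD below the mean
z_high = z_score(71, mean, sd)     # one SD above the mean
z_mean = z_score(65, mean, sd)     # the mean itself maps to 0

print(z_low, z_high, z_mean)       # -1.0 1.0 0.0
```

The interval 59 to 71 therefore corresponds to -1 < Z < 1, the band that holds roughly 68% of the scores.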
  • 166. Normal Curve Areas The total area under the normal curve is equal to 1. Since a normally distributed set of data is symmetric, the total area from Z = 0 to the right is equal to 0.5. The area from Z = 0 to the left is also equal to 0.5. Example: Find the area under the curve from 1. 0<Z<1.25 2. -1.25<Z<0
  • 167. Normal Probability Distribution Find the probability value of 1. P(Z>1.45) 2. P(Z<-0.4) 3. P(-0.4<Z<1.45) 4. P(1.15<Z<2.33) 5. P(Z<1.28) 6. P(Z>-1.04)
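These probabilities are normally read from a table of normal curve areas. As a cross-check, the sketch below computes them with `statistics.NormalDist` from the Python standard library (Python 3.8 and later). Using software instead of the printed table is an assumption of this example, and results may differ from table values in the last decimal place.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal: mean 0, standard deviation 1

p1 = 1 - Z.cdf(1.45)                 # P(Z > 1.45)
p2 = Z.cdf(-0.4)                     # P(Z < -0.4)
p3 = Z.cdf(1.45) - Z.cdf(-0.4)       # P(-0.4 < Z < 1.45)
p4 = Z.cdf(2.33) - Z.cdf(1.15)       # P(1.15 < Z < 2.33)
p5 = Z.cdf(1.28)                     # P(Z < 1.28)
p6 = 1 - Z.cdf(-1.04)                # P(Z > -1.04)

for number, p in enumerate([p1, p2, p3, p4, p5, p6], start=1):
    print(number, round(p, 4))
```

Each "greater than" probability is one minus the cumulative area, and each "between" probability is the difference of two cumulative areas.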
  • 168. Con’t 7. The examination results of a large group of students in statistics are normally distributed with a mean of 40 and a standard deviation of 4. If a student is chosen at random, what is the probability that his score is a. Below 30? b. Above 55? c. Below 42? d. Between 35 to 45? e. Between 33 to 50?
  • 169. Con’t 8. The efficiency ratings of 400 faculty members of a certain university were taken and resulted in a mean rating of 78 with a standard deviation of 6.75. Assuming that the data are normally distributed, how many of the faculty members have an efficiency rating a. Greater than 78? b. Less than 78? c. Greater than 85? d. Between 75 and 90?
  • 170. Assignment no. 8 I. Find the area under the normal curve for each of the following conditions: 1. Between -2.02 and 1.01 2. To the right of 1.62 3. To the left of 0.56 4. Between 0.65 and 1.18 5. Between -2.09 and -0.78 II. In a reading ability test with a sample of 120 cases, the mean score is 50 and the standard deviation is 5.5.
  • 171. Con’t a. What percentage of the cases falls between the mean and a score of 55? b. What is the probability that a score picked at random will lie above the score of 55? c. What is the probability that a score will lie below 40? d. How many cases fall between 55 to 60? e. How many cases fall between 40 to 49?