2. Definitions of Statistics
• the science that deals with the body of principles and procedures for
the collection, organization, summarization, presentation, and
analysis of numerical data (Asaad, 2008)
• the totality of methods that are employed in the collection,
processing, and analysis of data (Freund, 2001)
3. Fields of Statistics
• Mathematical statistics – deals with the development and exposition
of theories that serve as bases of statistical methods
• Applied statistics – refers to the application of statistical methods to
solve real problems involving randomly generated data, as well as the
development of new statistical methods motivated by real problems
4. Importance/Use of Statistics
1. Aids in decision making
• Provides comparison
• Explains action that has taken place
• Justifies a claim or assertion
• Predicts future outcome
• Estimates unknown quantities
2. Summarizes data for public use
5. Branches of Statistics
• Descriptive statistics – includes anything done to data that is
designed to summarize or describe, without attempting to infer
anything that goes beyond the data themselves (tables and graphs,
measures of central tendency, measures of variation) – description
❑Comprise those methods concerned with the collection, description, and
analysis of a set of data without drawing conclusions or inferences about a
larger set (population)
❑The main concern is simply to describe the set of data such that otherwise
obscure information is brought out clearly
❑Conclusions apply only to the data on hand
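As a minimal sketch of descriptive statistics, the snippet below summarizes one small data set using Python's standard `statistics` module. The twelve scores are hypothetical (they are not from the lecture), and the summary describes only those twelve games.

```python
import statistics

# Hypothetical scores for a bowler's past 12 games (illustrative data only).
scores = [150, 162, 148, 171, 155, 160, 158, 166, 149, 153, 170, 158]

mean_score = statistics.mean(scores)      # measure of central tendency
median_score = statistics.median(scores)  # middle value of the sorted scores
spread = statistics.pstdev(scores)        # variation, treating the 12 games as the whole data set

print(f"mean={mean_score:.2f}, median={median_score}, sd={spread:.2f}")
```

`pstdev` is used here because the conclusions apply only to the data on hand; no larger population is being estimated.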
6. Branches of Statistics
• Inferential statistics – serves to make generalizations that go beyond
the data (decision-making)
❑Comprise those methods concerned with making predictions or
inferences about a larger set of data using only the information
gathered from a subset of this larger set (sample)
❑The main concern is not merely to describe but to predict and
make inferences based on the information gathered
❑Conclusions are applicable to a larger set of data, of which the data on
hand is only a subset
7. Comparative Examples
• Descriptive: A bowler wants to find his bowling average for the past 12 games.
  Inferential: A bowler wants to estimate his chance of winning a game based on his current season averages and the averages of his opponents.
• Descriptive: A housewife wants to determine the average weekly amount she spent on groceries in the past 3 months.
  Inferential: A housewife would like to predict, based on last year’s grocery bills, the average weekly amount she will spend on groceries for this year.
• Descriptive: A politician wants to know the exact number of votes he received in the last election.
  Inferential: A politician would like to estimate, based on an opinion poll, his chance of winning in the upcoming election.
8. Exercises
Classify each statement as descriptive or inferential.
1. As a result of recent cutbacks by the oil-producing nations, we can
expect the price of gasoline to double in the next year.
2. At least 5% of all fires reported last year in a certain city were
deliberately set by arsonists.
3. Of all patients who have received this particular type of drug at a
local clinic, 60% later developed significant side effects.
4. Assuming that less than 20% of the Colombian coffee beans were
destroyed by frost this past winter, we should expect an increase of
no more than 30 cents for a kilogram of coffee by the end of the
year.
5. As a result of a recent poll, most Americans are in favor of building
additional nuclear power plants.
10. Basic Statistical Terms
Data – the quantities (numbers) or qualities (attributes) measured or
observed that are to be collected and/or analyzed
Examples:
12 years old
40 (test score in Math 8)
1,070 (enrollment for Grade 7 in a school)
male (sex of a student)
Islam (religion of a student)
11. Basic Statistical Terms
Variable – a characteristic or attribute of persons or objects which can
assume different values or labels for different persons or objects under
consideration
Examples:
Undergraduate major in BSED
Faculty ranks
Test score in English test
Grade in ESP 9
12. Basic Statistical Terms
Measurement – the process of determining the value or label of a
particular variable for a particular experimental unit (e.g. using tests,
questionnaires, rating scales, checklists, etc.)
Experimental unit - the individual or object on which a variable is
measured
13. Classification of Variables
Discrete vs. Continuous
Discrete variable - a variable which can assume a finite or, at most,
countably infinite number of values, measured by counting or
enumeration
Examples:
Enrollment per grade level
Number of siblings
Number of out-of-school youth in a barangay
14. Classification of Variables
Discrete vs. Continuous
Continuous variable - a variable which can assume any of the infinitely many
values in an interval of the real line
Examples:
Height (in cm)
BMI
Temperature
15. Classification of Variables
Qualitative vs. Quantitative
Qualitative variable - a variable that yields categorical responses
Examples:
Cause of dropping-out of school
Religion
Nationality
School (public, private)
16. Classification of Variables
Qualitative vs. Quantitative
Quantitative variable - a variable that takes on numerical values
representing an amount or quantity
Examples:
Number of children in a family
Age
General average
17. Sources of Data
• Primary source – refers to data that come from the original sources
and are collected especially for the task at hand. This includes
government agencies, business establishments, organizations, and
individuals who carry original data or who have firsthand information
relevant to a given problem.
• Secondary source – refers to data collected by others for another
purpose. Researchers can retrieve information stored in books,
pamphlets, periodicals, microfilm, and computer files.
18. Presentation of Data
• Textual presentation – uses statements with numerals to describe the
data and to draw attention to significant figures. It consists of
describing data in expository form. It is adequate
for limited amounts of information. However, if there are many facts
involved, this method should not be used alone, because a long list of
facts and figures is difficult to read and assimilate.
19. Example
• Of the 150 respondents interviewed, the following complaints were noted:
27 for lack of books in the library, 25 for a dirty playground, 20 for
lack of laboratory equipment, 17 for poorly maintained university
buildings, and another 17 for an unsanitary cafeteria
because of foul-smelling toilets. Another 13 complained that the food
in the cafeteria is not enough, 10 perceived that the teachers are not
friendly, and 5 complained of the lack of a resting place.
20. Presentation of Data
• Tabular presentation – uses statistical tables. It is a systematic
organization of data in columns and rows. By having the statistical
table, the researcher is able to communicate his work to others easily.
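To illustrate tabular presentation, here is a small sketch that arranges the complaint counts from the textual example (slide 19) into rows and columns. The short row labels and the helper name `format_table` are assumptions made for this illustration.

```python
# Complaint counts taken from the textual example; the labels are shortened paraphrases.
complaints = [
    ("Lack of books in the library", 27),
    ("Dirty playground", 25),
    ("Lack of laboratory equipment", 20),
    ("Poorly maintained university buildings", 17),
    ("Unsanitary cafeteria", 17),
    ("Not enough food in the cafeteria", 13),
    ("Unfriendly teachers", 10),
    ("Lack of resting place", 5),
]

def format_table(rows):
    """Return the rows as a plain-text table with a header line and a total line."""
    width = max(len(label) for label, _ in rows)
    lines = [f"{'Complaint':<{width}}  Frequency"]
    for label, count in rows:
        lines.append(f"{label:<{width}}  {count:>9}")
    total = sum(count for _, count in rows)
    lines.append(f"{'Total':<{width}}  {total:>9}")
    return "\n".join(lines)

table = format_table(complaints)
print(table)
```

The same eight facts that are hard to absorb as a sentence become easy to compare once each has its own row.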
23. Presentation of Data
• Graphical presentation – uses illustrative materials to facilitate the
easy comparison and interpretation of data without having to go
through numerical data
27. Other Statistical Terms
• Population - a collection of all the elements under consideration in a statistical
study
• Sample - a part or subset of the population from which the information is
collected
Example:
A manufacturer of kerosene heaters wants to determine if customers are satisfied with the
performance of their heaters. Toward this goal, 5,000 of his 200,000 customers
are contacted and each is asked, “Are you satisfied with the performance of the
kerosene heater you purchased?”
• Population – 200,000 customers
• Sample – 5,000 customers
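As a quick check on the example above, the share of the population that ends up in the sample can be computed directly. The variable names are chosen only for this sketch.

```python
population_size = 200_000  # all customers of the manufacturer
sample_size = 5_000        # customers actually contacted

# Share of the population that is in the sample.
sampling_fraction = sample_size / population_size
print(f"{sampling_fraction:.1%} of the population was sampled")
```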
28. Sampling
• Measuring a small portion of something and then making a general
statement about the whole thing (Bradfield & Moredock, p. 38)
29. Purposes and Advantages of Sampling
• Enables the investigation of a large population
• Reduces cost of research
• Makes data more relevant and accurate (If there is no sampling, the
collection of data may take a very long period of time. Data gathered
may already be obsolete.)
• Avoids consuming all the sources of data
30. Disadvantages of Sampling
• Sample data require more care in preparing detailed
sub-classifications because of the small number of subjects.
• If the sampling plan is not correctly designed and followed, the results
may be misleading.
• Sampling requires an expert to conduct the study in an area. If this is
lacking, the results could be erroneous.
• The characteristics to be observed may occur rarely in a population,
e.g. teachers with over 30 years of teaching experience.
• Complicated sampling plans are laborious to prepare.
31. Types of Sampling
• Probability sampling - A sampling procedure that gives every element
of the population a nonzero chance of being selected in the sample.
• Non-probability sampling - A sampling procedure that does not give
every element of the population a nonzero chance of being selected
in the sample.
32. Simple Random Sampling
• It is a method of selecting n units out of the N units in the population
in such a way that every distinct sample of size n has an equal chance
of being drawn. The process of selecting the sample must give an
equal chance of selection to any one of the remaining elements in the
population at any one of the n draws.
34. Simple Random Sampling
• It may be with replacement (SRSWR) or without replacement
(SRSWOR). In SRSWR, a chosen element is always replaced before the
next selection is made, so that an element may be chosen more than
once.
• Procedure:
• Make a list of the sampling units and number them from 1 to N.
• Select n (distinct for SRSWOR, not necessarily distinct for SRSWR) numbers
from 1 to N using some random process.
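The procedure above can be sketched with Python's standard `random` module, assuming the sampling units have already been numbered from 1 to N. The function names `srswor` and `srswr` and the values N = 50, n = 5 are illustrative only.

```python
import random

def srswor(N, n, rng):
    """Draw n distinct unit numbers from 1..N (without replacement)."""
    return rng.sample(range(1, N + 1), n)

def srswr(N, n, rng):
    """Draw n unit numbers from 1..N, allowing repeats (with replacement)."""
    return rng.choices(range(1, N + 1), k=n)

rng = random.Random(2024)  # fixed seed only so the sketch is repeatable
without = srswor(N=50, n=5, rng=rng)
with_repl = srswr(N=50, n=5, rng=rng)
print("SRSWOR:", without)
print("SRSWR: ", with_repl)
```

`random.sample` never returns the same unit twice, while `random.choices` may, which mirrors the distinction between SRSWOR and SRSWR.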
35. Simple Random Sampling
• Table of random numbers. This is the most systematic technique
for getting sample units at random. (Read as many digits at a time from the
random number table as there are digits in the population size N.)
• Lottery sampling (fishbowl technique). Assign a number to each
participant in the population and list the participants in a
sampling frame. Then write each participant’s number on a small
piece of paper, one number to a piece. Roll the papers and put them
in a container big enough to allow all rolled papers to move freely
in all directions. Draw one paper at a time, shaking the container
between draws, until you reach the required number for your
sample.
36. Stratified Random Sampling
• In stratified random sampling, the population of N units is first
divided into subpopulations called strata. Then a simple random
sample is drawn from each stratum, the selection being made
independently in different strata.
• Procedure:
• Divide the population into strata. Ideally, each stratum must consist of more
or less homogeneous units.
• After the population has been stratified, a simple random sample is selected
from each stratum.
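A minimal sketch of the two steps, assuming the strata are already known and the same fraction is drawn from every stratum (proportional allocation, which the lecture does not prescribe). The grade-level strata and the name `stratified_sample` are hypothetical.

```python
import random

def stratified_sample(strata, fraction, rng):
    """Draw an independent simple random sample from every stratum.

    strata:   dict mapping a stratum name to the list of its units
    fraction: share of each stratum to select (an assumption of this sketch)
    """
    sample = {}
    for name, units in strata.items():
        n_h = max(1, round(len(units) * fraction))  # at least one unit per stratum
        sample[name] = rng.sample(units, n_h)       # SRSWOR within the stratum
    return sample

# Hypothetical population: students grouped by grade level.
strata = {
    "Grade 7": [f"G7-{i}" for i in range(1, 41)],
    "Grade 8": [f"G8-{i}" for i in range(1, 31)],
    "Grade 9": [f"G9-{i}" for i in range(1, 21)],
}
chosen = stratified_sample(strata, fraction=0.10, rng=random.Random(7))
for name, units in chosen.items():
    print(name, units)
```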
37. Systematic Sampling
• Systematic sampling with a “random start” is a method of selecting a
sample by taking every kth unit from an ordered population, the first
unit being selected at random. Here k is called the sampling interval,
and its reciprocal 1/k is the sampling fraction.
38. Systematic Sampling
Method A
1. Number the units of the population consecutively from 1 to N.
2. Determine k, the sampling interval, using the formula k = N / n.
3. Select the random start r, where 1 ≤ r ≤ k. The unit corresponding
to r is the first unit of the sample. (Determining the r may be done
by lottery.)
4. The other units of the sample correspond to every kth sampling
unit in the sampling frame.
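Method A can be sketched as follows, assuming N is an exact multiple of n so that k = N / n is a whole number. The function name and the values N = 100, n = 10 are illustrative.

```python
import random

def systematic_sample_a(N, n, rng):
    """Method A: a random start r with 1 <= r <= k, then every k-th unit."""
    k = N // n             # sampling interval; assumes N is a multiple of n
    r = rng.randint(1, k)  # random start (both endpoints included)
    return [r + i * k for i in range(n)]

sample_a = systematic_sample_a(N=100, n=10, rng=random.Random(1))
print(sample_a)
```

Once r is fixed, the rest of the sample is fully determined, so only k different samples are possible.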
39. Systematic Sampling
Method B
1. Number the units of the population consecutively from 1 to N.
2. Let k be the nearest integer to N/n.
3. Select the random start r, where 1 ≤ r ≤ N. The unit corresponding to
r is the first unit of the sample.
4. Consider the list of units of the population as a circular list, i.e., the
last unit in the list is followed by the first. The other units of the
sample correspond to every kth sampling unit in the sampling frame.
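Method B differs in that the start may fall anywhere in the list and the count wraps around the end. The sketch below uses modular arithmetic for the circular list; the function name and the values N = 103, n = 10 are assumptions for illustration.

```python
import random

def systematic_sample_b(N, n, rng):
    """Method B: a random start r with 1 <= r <= N, then every k-th unit on a circular list."""
    k = round(N / n)       # nearest integer to N/n (Python rounds exact halves to even)
    r = rng.randint(1, N)  # random start anywhere in the list
    # Shift to 0-based positions, step by k with wrap-around, shift back to 1-based.
    return [(r - 1 + i * k) % N + 1 for i in range(n)]

sample_b = systematic_sample_b(N=103, n=10, rng=random.Random(3))
print(sample_b)
```

This variant is convenient when N is not a multiple of n, since every unit can still be reached from some start.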
40. Cluster Sampling
• It is a method of sampling where a sample of distinct groups, or
clusters, of elements is selected and then a census of every element
in the selected clusters is taken. Similar to strata in stratified
sampling, clusters are non-overlapping sub-populations which
together comprise the entire population. For example, a household is
a cluster of individuals living together or a city block might also be a
cluster. Unlike strata, however, clusters are preferably formed with
heterogeneous, rather than homogeneous elements so that each
cluster will be typical of the population. (Ex: Choosing a number of
schools from a list of schools in which all the students of these
schools are included in the sample.)
41. Cluster Sampling
Procedure:
1. Number the clusters from 1 to N.
2. Select n numbers from 1 to N at random (for example, by lottery). The
clusters corresponding to the selected numbers form the sample of
clusters.
3. Observe all the elements in the sample of clusters.
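The three steps can be sketched as below, following the schools example: whole schools are drawn at random and every student in a drawn school is included. The cluster contents and the name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n, rng):
    """Select n whole clusters at random, then take every element in them."""
    chosen_ids = rng.sample(sorted(clusters), n)                 # step 2: pick cluster numbers
    elements = [e for cid in chosen_ids for e in clusters[cid]]  # step 3: census within clusters
    return chosen_ids, elements

# Step 1: hypothetical clusters numbered 1..6 (schools of different sizes).
clusters = {i: [f"school{i}-student{j}" for j in range(1, 4 + i)] for i in range(1, 7)}

chosen_ids, elements = cluster_sample(clusters, n=2, rng=random.Random(11))
print(chosen_ids, len(elements))
```

Randomness enters only at the cluster level; inside a selected cluster nothing is sampled.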
42. Multistage Sampling
• In multistage sampling, the population is divided into a hierarchy of
sampling units corresponding to the different sampling stages. In the
first stage of sampling, the population is divided into primary stage
units (PSU) then a sample of PSU’s is drawn. In the second stage of
sampling, each selected PSU is subdivided into second-stage units
(SSU) then a sample of SSU’s is drawn. The process of subsampling
can be carried to a third stage, fourth stage and so on, by sampling
the subunits instead of enumerating them completely at each stage.
43. Multistage Sampling
• Suppose all aspects of the educative process in all elementary schools in a region
composed of 8 provinces are to be investigated. Suppose that a 20 percent sample is
decided upon. Twenty percent of 8 equals 1.6, or two
provinces. These two provinces may be chosen by pure random sampling. The
provinces are the primary stage units (PSU’s). The SSU’s are
towns. Suppose one province has 28 towns and the other has 31 towns. Twenty
percent of 28 equals 5.6, or six towns in that province, and twenty percent of 31
equals 6.2, or six towns also in the other province. These towns may be
chosen by either pure random sampling or systematic sampling. The third and
final sampling units are the elementary schools. Take 20 percent of the elementary
schools from each of the six towns in one province and 20 percent of the
elementary schools from each of the six towns in the other province. The
elementary schools may be selected by pure random or systematic random
sampling. In addition, stratified sampling should now be applied in choosing 20 percent
of the teachers, 20 percent of the pupils and 20 percent of the parents.
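The stage-by-stage counts in this example follow from taking 20 percent and rounding to the nearest whole unit. A short sketch of that arithmetic (the helper name `units_to_sample` is an assumption) uses integer arithmetic to avoid floating-point surprises:

```python
def units_to_sample(total, percent=20):
    """Round percent% of total to the nearest whole unit (exact halves round up)."""
    return (total * percent + 50) // 100

provinces = units_to_sample(8)   # 20% of 8  = 1.6 -> 2 provinces
towns_a = units_to_sample(28)    # 20% of 28 = 5.6 -> 6 towns
towns_b = units_to_sample(31)    # 20% of 31 = 6.2 -> 6 towns
print(provinces, towns_a, towns_b)
```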
45. Purposive Sampling
• sets out to make a sample agree with the profile of the population
based on some preselected characteristic
Example:
A researcher is interested in finding out the reaction of
students to the devaluation of the peso. Instead of asking
the opinions of all students in various colleges and universities in
Davao City, she may only ask the student leaders of a particular
college or university.
46. Quota Sampling
• selects a specified number (quota) of sampling units possessing certain
characteristics
Example:
A researcher wants to determine the most favored
soft drinks from a population of televiewers. She
asks televiewers which soft drink they favor most and
continues until she arrives at her quota.
Once the quota she has set is filled, she disregards
other participants’ opinions regarding the soft drinks
they favor most.
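A rough sketch of quota filling, assuming respondents are taken in the order they are met and that quotas are set per group. The age groups, names, and the function `quota_sample` are hypothetical and not part of the lecture's example.

```python
def quota_sample(respondents, quotas):
    """Accept respondents in the order met until every group's quota is filled.

    respondents: iterable of (name, group) pairs
    quotas:      dict mapping a group to the number of respondents wanted
    """
    remaining = dict(quotas)
    selected = []
    for name, group in respondents:
        if remaining.get(group, 0) > 0:
            selected.append((name, group))
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # every quota is filled; later respondents are never asked
    return selected

# Hypothetical televiewers tagged by age group.
respondents = [("A", "teen"), ("B", "adult"), ("C", "teen"),
               ("D", "teen"), ("E", "adult"), ("F", "adult")]
picked = quota_sample(respondents, {"teen": 2, "adult": 2})
print(picked)
```

No random process is involved, which is why quota sampling is a non-probability method: respondent D is skipped and F is never reached simply because of the order of arrival.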
47. Convenience Sampling
• selects sampling units that come to hand or are convenient to get
information from
Example:
A researcher wants to determine people’s reaction to the oil-price
hike. The most convenient and fastest way of reaching people for an
interview is by telephone. Alternatively, the researcher may stand
on a street corner and interview everyone who passes by.