BASIC CONCEPTS in STAT 1 [Autosaved].pptx

BASIC CONCEPTS
Descriptive Statistics versus Inferential Statistics
Descriptive Statistics – the field of statistics that
focuses on the quantitative description of a collection of
data; only a simple summary of the sample, with its
corresponding measures, is stated. In Inferential
Statistics, conclusions are formulated from the
data.
Parameter versus Statistic
The two terms are analogous to one another,
except that a parameter describes a whole
population while a statistic describes a sample of the
population.

CLASSIFICATION OF DATA
Qualitative Data
• Measures of type, which may be
represented by names or
symbols
• Describe individuals or
objects by their categories
or groups
• Ex. Gender, Nationality,
Student type
Quantitative Data
• Measures of values or
counts, expressed in
numbers
• Operations such as addition
and averaging make sense
• Answers the questions how
many, how much
• Ex. Weight in kilograms,
Age, Grades

Raw Data and Array Data
• Data are considered raw when they are in their original
form. Once the collected data have been arranged in a
certain pattern, such as ascending or
descending order, they are no longer raw but
in arrayed form.
Ex. The scores of 7 Pharmacy students during their
first quiz in statistics.
• Raw Data (21, 22, 19, 24, 22, 28, 25)
• Array Data, arranged in ascending order
(19, 21, 22, 22, 24, 25, 28)

Ex. The ages of the kids who attended the
birthday party were listed in the table. (Raw
Data)
Kids      Age
Joe       11
Jake      9
Hannah    7
Marian    10
Roberto   8
Patrick   5
Shane     10

Array Data
Kids      Age
Patrick   5
Hannah    7
Roberto   8
Jake      9
Marian    10
Shane     10
Joe       11
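The step from raw data to arrayed data can be sketched in Python. This is only an illustration using the party list above; the variable names `raw` and `arrayed` are made up for the example.

```python
# Raw data: (name, age) pairs in the order they were listed.
raw = [("Joe", 11), ("Jake", 9), ("Hannah", 7), ("Marian", 10),
       ("Roberto", 8), ("Patrick", 5), ("Shane", 10)]

# Arrayed data: the same records sorted by age in ascending order.
# sorted() is stable, so Marian stays ahead of Shane (both are 10).
arrayed = sorted(raw, key=lambda kid: kid[1])

for name, age in arrayed:
    print(f"{name:<8} {age}")
```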

Classification of Variables
In a study, the individuals or subjects are the people or
objects to be studied. The variables, on the other
hand, are the characteristics of the individual to be
observed or measured.
Ex. A researcher wants to conduct a study on the
performance of male athletes in the university in their
games. Identify the individuals and the variables.
Individuals or Subjects: All male athletes in the University
Variables: Winning and losing records in their games

According to Functional Relationships
• 1. Independent Variable – called the predictor
variable
• 2. Dependent Variable – called the criterion
variable
Ex. The Academic performance of students in
Mathematics depends on their study habits and
their attitudes towards the subject.
Independent variable – Students' study habits
and attitudes
Dependent variable – Academic performance of
students in Mathematics

According to continuity of Values
• 1. Continuous Variables – variables that can be
expressed in decimals. Ex. Price of
commodities, grades, height
• 2. Discrete or Discontinuous Variables –
variables that cannot be expressed in
decimals. Ex. Number of people, number of
floors

Level of Measurements
1. Nominal Scale – Data that consist of names, labels, or categories only. The
data cannot be arranged in an ordering scheme.
Ex. Gender, Nationality
2. Ordinal Scale – Data have the properties of the nominal level and can also
be arranged in an ordering scheme or ranked, but the difference between the
values of the data cannot be determined. The interval is meaningless.
Ex. Ranks in a contest, military ranks, performance ranks
3. Interval Scale – Data have the properties of the ordinal level. Data values can
be ranked, and the differences between the values of the data are of known size.
Ex. Temperature (Celsius, Fahrenheit), IQ.
4. Ratio Scale – Data have the properties of the interval level. Zero
indicates the absence of the characteristic under consideration, so ratios of
data values have meaning.
Ex. Height in meters, weight in kilograms or pounds.

SOURCES OF DATA
• DOCUMENTARY SOURCE- documents, published
or unpublished, that are usually in the form of
reports, letters, magazines, newspapers, or
internet materials.
• FIELD SOURCE- individuals or offices
with the authority to give information and with
proper knowledge and expertise regarding the
concerns of a given study. Ex. DOST, DOH

METHODS OF COLLECTION OF DATA
• 1. Direct Method- often called the interview method. This is done through
direct and personal contact of the researcher with the person from
whom data will be collected.
• 2. Indirect Method- also known as the questionnaire
method. It is carried out using either an online questionnaire or a
paper questionnaire distributed to groups of people who are, most of
the time, randomly chosen.
• 3. Registration Method- This method is done by gathering data
from the concerned offices. Ex. For information about a population, the
appropriate office to visit is the National Statistics Office.
• 4. Observation Method- This method is purely based on the subjective
remarks of the observer. It is applicable to data pertaining to the attitudes,
behavior, and values of individuals.
• 5. Experimentation Method- the method that determines the cause
and effect relationships of a certain parameter or event under controlled
conditions.

SAMPLE SIZE FORMULA
• Most of the time, population is used
interchangeably with sample, but these two are
actually different terms.
• Population- the complete set of individuals or
subjects, while a sample is a representative part of
the whole population.
Slovin's formula:
$n = \frac{N}{1 + Ne^{2}}$
where: n = sample size
N = population size
e = margin of error

Ex. A researcher wants to conduct a study in a
university with 10,000 students. If he wants to achieve
90% precision, how many students must he take as his
sample?
Given: N = 10,000; e = 10% = 0.10
Solution:
$n = \frac{N}{1 + Ne^{2}} = \frac{10000}{1 + 10000(0.10)^{2}} = \frac{10000}{101} \approx 99$ students
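As a quick check of the arithmetic, Slovin's formula can be sketched in Python. The function name `slovin_sample_size` is hypothetical, and rounding to the nearest whole number is an assumption chosen to match the worked example.

```python
def slovin_sample_size(population: int, margin_of_error: float) -> float:
    """Return n = N / (1 + N * e**2) without rounding."""
    return population / (1 + population * margin_of_error ** 2)

# Values from the example: N = 10,000 and e = 10%.
n_exact = slovin_sample_size(10_000, 0.10)

# Assumption: round to the nearest whole student, as the example does.
n = round(n_exact)
print(n_exact, n)
```

Rounding up instead (for instance with `math.ceil`) would give 100, so the rounding rule is worth stating whenever the formula is used.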

SAMPLING TECHNIQUE
• 1. Probability Sampling- This sampling technique is also called Simple Random
Sampling. In this technique, the sample members are picked at random, so
the selection of the sample is free of bias. Each member of the population
has an equal chance of being picked as part of the sample. Ex. Lottery, raffle.
• 2. Restricted Random Sampling- This is often used when the population
to be considered is too large.
• a. Systematic sampling- The sample is selected by picking every kth
element of the population. The interval k is obtained using
the given formula.
$k = \frac{\text{Population Size}}{\text{Sample Size}} = \frac{N}{n}$
Ex. A researcher wants to conduct a study in a university with 10,000 students
with 90% precision.
• $k = \frac{\text{Population Size}}{\text{Sample Size}} = \frac{N}{n} = \frac{10000}{99} \approx 101$
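Picking every kth element can be sketched as follows. The function `systematic_sample`, the numbered student list, and the random starting point within the first k elements are assumptions made for illustration; the slide itself only specifies the interval k = N / n.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Return (k, sample), taking every kth element after a random start."""
    k = round(len(population) / sample_size)  # sampling interval k = N / n
    start = random.Random(seed).randrange(k)  # assumed: random start in the first k
    return k, population[start::k][:sample_size]

# Hypothetical frame: students numbered 1 to 10,000, with n = 99.
students = list(range(1, 10_001))
k, sample = systematic_sample(students, 99, seed=0)
print(k, len(sample), sample[:3])
```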

b. Stratified Sampling- The population is divided
into strata (groups) based on their homogeneity or
commonalities. The steps in doing stratified
sampling are as follows:
1. Determine the distribution of the population in
each stratum;
2. Find the percentage of each stratum from the
population;
3. Multiply the percentage of each stratum by the
sample size (n). (See sampling techniques)

STRATA   DISTRIBUTION OF   PERCENTAGE FROM   SAMPLE UNITS
         POPULATION        THE POPULATION    PER STRATUM
UST      15,000            30%               60
UP       10,000            20%               40
NU       25,000            50%               100
Total    50,000            100%              200
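The three steps of stratified sampling can be reproduced with a short Python sketch. The stratum sizes and the total sample size of 200 come from the table; the variable names are illustrative only.

```python
# Step 1: distribution of the population in each stratum.
strata = {"UST": 15_000, "UP": 10_000, "NU": 25_000}
sample_size = 200
total = sum(strata.values())

allocation = {}
for name, size in strata.items():
    # Steps 2 and 3: the stratum's share of the population times n.
    allocation[name] = round(size * sample_size / total)
    print(f"{name}: {size / total:.0%} -> {allocation[name]} sample units")
```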

3. CLUSTER SAMPLING
This technique is frequently applied on a
geographical basis when the
population from which a sample is to
be selected includes heterogeneous
groups.

4. NON-RANDOM SAMPLING
In this technique, not all members of the population have an equal chance of being selected. The
selection is influenced by the goal of the researcher.
a. Purposive Sampling- The samples are chosen on purpose, based on
certain criteria. Ex. In a population of college students, you are studying
the effect of being sporty on academic performance. You would
likely choose the athletes of the university on purpose.
b. Quota Sampling- In this technique, a certain limit is pre-established to
determine who among the population can be part of the sample. A good
example of this is the determination of the students who can qualify in a
university. For example, the top 5% of the examinees shall be admitted
by the university.
c. Convenience Sampling- The sample is selected based on its accessibility
to the researcher. For example, the researcher
is doing a study about the performance of universities in the Philippines.
If the researcher lives near Manila, he has the option to take the
universities in Manila as samples.

How to Construct a Frequency
Distribution Table
• Ex. 1 The data shown are the scores of 30
students in a Statistics exam. Construct a frequency distribution table (FDT).
47 65 81 65 68 55 56 69 61 75 71 67 61 87 50
74 49 66 49 89 77 75 79 85 68 90 57 63 54 90
• Step 1. Determine the Range (R)
• Definition: Range is the difference between
the highest and the lowest score.
R = Highest score – Lowest score
R = 90 – 47 = 43

» Tally FREQ
• 47-51 //// 4
• 52-56 /// 3
• 57-61 /// 3

• Step 2. Determine the desired number of Class
Intervals (CI)
• Definition: A class interval is a grouping or
category defined by a lower limit and an upper limit.
The ideal number of CI is between 5 and 15.
• Step 3. Determine the Class size (i)
• Definition: Class size is the difference between
two successive lower class limits. To get i,
• i = Range / Desired number of CI
i = 43/9 = 4.78, rounded up to 5
• Step 4. Construct and fill up the FDT

     Class Interval   Class Boundary   Class Frequency   Class Mark (X)
     (5)              (6)              (7)               (8)
1    47-51            46.5-51.5        4                 49
2    52-56            51.5-56.5        3                 54
3    57-61            56.5-61.5        3                 59
4    62-66            61.5-66.5        4                 64
5    67-71            66.5-71.5        5                 69
6    72-76            71.5-76.5        3                 74
7    77-81            76.5-81.5        3                 79
8    82-86            81.5-86.5        1                 84
9    87-91            86.5-91.5        4                 89
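The construction steps can be checked with a minimal Python sketch. It assumes 9 class intervals, starts the first interval at the lowest score, and rounds the class size up with `math.ceil` so that every score is covered; the dictionary keys are names chosen for this example.

```python
import math

scores = [47, 65, 81, 65, 68, 55, 56, 69, 61, 75, 71, 67, 61, 87, 50,
          74, 49, 66, 49, 89, 77, 75, 79, 85, 68, 90, 57, 63, 54, 90]
num_classes = 9

lowest, highest = min(scores), max(scores)
data_range = highest - lowest                     # R = 90 - 47 = 43
class_size = math.ceil(data_range / num_classes)  # 4.78 rounded up to 5

table = []
for c in range(num_classes):
    lower = lowest + c * class_size
    upper = lower + class_size - 1
    table.append({
        "interval": (lower, upper),
        "boundary": (lower - 0.5, upper + 0.5),
        "frequency": sum(lower <= s <= upper for s in scores),
        "class_mark": (lower + upper) / 2,
    })

for row in table:
    print(row)
```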

TYPES OF FREQUENCY DISTRIBUTION
Less than and Greater than Cumulative Frequency Distribution
     CLASS INTERVAL   CLASS FREQUENCY   LESS THAN   GREATER THAN
1    47-51            4                 4           30
2    52-56            3                 7           26
3    57-61            3                 10          23
4    62-66            4                 14          20
5    67-71            5                 19          16
6    72-76            3                 22          11
7    77-81            3                 25          8
8    82-86            1                 26          5
9    87-91            4                 30          4
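Both cumulative columns are running totals of the class frequencies, one taken from the top of the table and one from the bottom. A short sketch with the standard library's `itertools.accumulate` (the variable names are illustrative):

```python
from itertools import accumulate

freqs = [4, 3, 3, 4, 5, 3, 3, 1, 4]  # class frequencies, lowest class first

# Less than: running total from the first class downward.
less_than = list(accumulate(freqs))

# Greater than: running total from the last class upward.
greater_than = list(accumulate(reversed(freqs)))[::-1]

print(less_than)
print(greater_than)
```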

RELATIVE FREQUENCY
     CLASS INTERVAL   CLASS FREQUENCY   RELATIVE FREQUENCY
1    47-51            4                 13.33%
2    52-56            3                 10%
3    57-61            3                 10%
4    62-66            4                 13.33%
5    67-71            5                 16.67%
6    72-76            3                 10%
7    77-81            3                 10%
8    82-86            1                 3.33%
9    87-91            4                 13.33%
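Each relative frequency is the class frequency divided by the total frequency, expressed as a percentage. A minimal sketch, with rounding to two decimal places assumed to match the table:

```python
freqs = [4, 3, 3, 4, 5, 3, 3, 1, 4]
total = sum(freqs)  # N = 30

# Relative frequency (%) = f / N * 100, rounded to two decimals.
relative = [round(100 * f / total, 2) for f in freqs]
print(relative)
```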

• CONSTRUCTION OF FREQ. POLYGON,
HISTOGRAM, BAR GRAPH

MEASURE OF CENTRAL TENDENCY
- a value that describes the point around which a set of data
tends to cluster. The three measures of central tendency
of data are the mean, median, and mode.
MEAN – the average of a set of data, denoted
by the symbol $\bar{x}$. It is the value equal to
the sum of all the values in the data ($\sum x$) divided by
the total number of elements in the data (N), as
summarized by the formula below:
$\bar{x} = \frac{x_1 + x_2 + \cdots + x_{n-1} + x_n}{N} = \frac{\sum x}{N}$ (for ungrouped data)

MEAN FOR GROUPED DATA
CLASS INTERVAL   CLASS FREQ. (f)   CLASS MARK (x)   fx
47-51            4                 49               4(49) = 196
52-56            3                 54               3(54) = 162
                 $\sum f = N =$                     $\sum fx =$

After completing the table, the formula below
can be used to calculate the mean of
grouped data, commonly called the weighted
mean.
$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_{n-1} x_{n-1} + f_n x_n}{N} = \frac{\sum fx}{N}$ (for grouped data)

Ex. Find the mean of all the grades of
James in his Engineering course if his
grades are as follows.
Mathematical Analysis 90
World Literature 78
Discrete Math 92
Calculus-Based Physics 89
Engineering Economy 96
n   1    2    3    4    5    $\sum x$
x   90   78   92   89   96   445

$\bar{x} = \frac{\sum x}{N} = \frac{445}{5} = 89$
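The same computation for James's five grades, sketched with the standard library's `statistics.mean`; the dictionary is just a convenient container for the grades listed above.

```python
from statistics import mean

grades = {
    "Mathematical Analysis": 90,
    "World Literature": 78,
    "Discrete Math": 92,
    "Calculus-Based Physics": 89,
    "Engineering Economy": 96,
}

total = sum(grades.values())           # sum of x = 445
average = mean(list(grades.values()))  # 445 / 5
print(total, average)
```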
Ex. 2. The scores in the Mathematics
test during the teachers' Board Examination are
summarized in the table below. Find the mean
of the scores of all the examinees.

CLASS INTERVAL   CLASS FREQUENCY (f)
10-20            5
21-31            10
32-42            11
43-53            7
54-64            23
65-75            56
76-86            6
87-97            8
98-108           4
N =

CLASS INTERVAL   CLASS FREQUENCY (f)   CLASS MARK (x)   fx
10-20            5                     15               75
21-31            10                    26               260
32-42            11                    37               407
43-53            7                     48               336
54-64            23                    59               1357
65-75            56                    70               3920
76-86            6                     81               486
87-97            8                     92               736
98-108           4                     103              412
                 $\sum f = N = 130$                     $\sum fx = 7989$
$\bar{x} = \frac{\sum fx}{N} = \frac{7989}{130} = 61.45$
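The grouped mean for this table can be verified with a small sketch. The function name `grouped_mean` is hypothetical; each class mark is taken as the midpoint of the class limits, as in the table.

```python
# (lower limit, upper limit) and frequency for each class.
classes = [((10, 20), 5), ((21, 31), 10), ((32, 42), 11), ((43, 53), 7),
           ((54, 64), 23), ((65, 75), 56), ((76, 86), 6), ((87, 97), 8),
           ((98, 108), 4)]

def grouped_mean(classes):
    """Return (mean, sum of fx, N) using class marks as midpoints."""
    n = sum(f for _, f in classes)
    sum_fx = sum(f * (lo + hi) / 2 for (lo, hi), f in classes)
    return sum_fx / n, sum_fx, n

mean_score, sum_fx, n = grouped_mean(classes)
print(n, sum_fx, round(mean_score, 2))
```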

Ex. The data shown are the scores in Statistics
Exam. Find the mean score of the 30 students if
9 class intervals shall be used in grouping the
data.
47 65 81 65 68 55
56 69 61 75 71 67
61 87 50 74 49 66
49 89 77 75 79 85
68 90 57 63 54 90

Step 1. Determine the Range (R).
R = 90 - 47 = 43
Step 2. Determine the desired number of CI.
The problem requires 9 class intervals.
Step 3. Determine the class size (i).
i = 43/9 = 4.78 ≈ 5
Step 4. Construct and fill the FDT.
Step 5. Make the class intervals. Start with the
lowest score until the highest score is reached.
Note: 47 is the lowest value from the data.
Class size (i) = 5

CLASS INTERVAL   (f)   CLASS MARK (x)   fx
47-51
52-56
57-61
62-66
67-71
72-76
77-81
82-86
87-91

Step 6. Determine the class boundary.
(Disregard this step, since the class boundary is not
needed in solving for the mean of the data.)
Step 7. Determine the class frequency (f), the class mark (x), and fx.
CLASS INTERVAL   (f)   CLASS MARK (x)   fx
47-51            4     49               196
52-56            3     54               162
57-61            3     59               177
62-66            4     64               256
67-71            5     69               345
72-76            3     74               222
77-81            3     79               237
82-86            1     84               84
87-91            4     89               356
                 $\sum f = N = 30$      $\sum fx = 2035$
$\bar{x} = \frac{\sum fx}{N} = \frac{2035}{30} \approx 67.83$
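Adding up the nine fx values listed in the table gives 2035, so the grouped mean is 2035 / 30. The sketch below recomputes that total and, as an extra check that is not part of the slide, compares it with the mean of the 30 raw scores.

```python
# (class mark, frequency) pairs from the completed table.
table = [(49, 4), (54, 3), (59, 3), (64, 4), (69, 5),
         (74, 3), (79, 3), (84, 1), (89, 4)]

n = sum(f for _, f in table)            # N = 30
sum_fx = sum(x * f for x, f in table)   # sum of the fx column
grouped_mean = sum_fx / n

# Extra check (assumption: useful for comparison only): the raw-score mean.
scores = [47, 65, 81, 65, 68, 55, 56, 69, 61, 75, 71, 67, 61, 87, 50,
          74, 49, 66, 49, 89, 77, 75, 79, 85, 68, 90, 57, 63, 54, 90]
raw_mean = sum(scores) / len(scores)

print(sum_fx, round(grouped_mean, 2), round(raw_mean, 2))
```

The two means differ slightly because grouping replaces every score with its class mark.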

MEDIAN
Another important measure of central tendency
is the median. It is usually denoted by $\tilde{x}$. By
definition, the median is the middle value
when all the elements in a set of data are
arranged in ascending order.
Ex. 1 2 3 4 5 6 7 8 9 10 11 (the middle value, 6, is the median)

Ex. Find the median of the measured
heights of all athletes of the ACN
University given below.
181 211 189 200 206 195 188 189 182
Soln. Step 1. Identify the location of the median element using
the given formula:
$M = \frac{n+1}{2}$, where n = total number of elements in the
set of data.
$M = \frac{9+1}{2} = 5$; therefore, the 5th element is the
median element of the given data.
Step 2. Arrange the data in ascending order.
181 182 188 189 189 195 200 206 211
$\tilde{x} = 189$
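The two steps for ungrouped data can be sketched in Python. The branch for an even number of elements (averaging the two middle values) is the usual convention and is an assumption here, since the example only covers an odd count.

```python
heights = [181, 211, 189, 200, 206, 195, 188, 189, 182]

ordered = sorted(heights)   # arrange the data in ascending order
n = len(ordered)
position = (n + 1) / 2      # M = (n + 1) / 2, the location of the median

if n % 2 == 1:
    median = ordered[int(position) - 1]  # positions are counted from 1
else:
    # Assumed convention for even n: average the two middle values.
    median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(position, median)
```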

GROUPED DATA
$\tilde{x} = X_{LB} + i\left(\frac{\frac{N}{2} - {<}cf_b}{f_m}\right)$
Where: $X_{LB}$ = lower class boundary of the
median class
i = class size
$<cf_b$ = less than cumulative
freq. before the median class
$f_m$ = class freq. of the median class

How to get the median class
The median class is the class interval that contains
element M, located using the less than cumulative
frequency. For grouped data, M is calculated by
dividing the total frequency N by 2:
$M = \frac{N}{2}$

CLASS INTERVAL   CLASS FREQUENCY (f)   < CUMULATIVE FREQUENCY
10-20            5                     5
21-31            10                    15
32-42            11                    26
43-53            7                     33 (elements 27-33)
54-64            23                    56 (elements 34-56)
65-75            55                    111 (elements 57-111)
76-86            7                     118 (elements 112-118)
87-97            8                     126
98-108           4                     130
$\sum f = N = 130$; $\frac{N}{2} = 65$

Since M is 65, the median class is the class interval 65-75,
since it includes elements 57 to 111 when arranged in ascending order,
unlike class interval 54-64, which includes only elements 34 to 56.
1. Lower class boundary of the median class ($X_{LB}$)
Recall: Lower class boundary = lower class limit - 0.5
Upper class boundary = upper class limit + 0.5
$X_{LB}$ = 65 - 0.5 = 64.5
2. Class size (i) = upper class limit - lower class limit + 1
i = 75 - 65 + 1 = 11
3. Less than cumulative frequency before the median class
= 56
4. Median class frequency (fm) = 55
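Substituting the four quantities into the grouped-data formula gives 64.5 + 11(9/55) = 66.3. The sketch below locates the median class from the frequency table and evaluates the formula; the variable names are illustrative and the loop is one possible way to find the class, not a prescribed procedure.

```python
# (lower limit, upper limit) and frequency, as in the table above.
classes = [((10, 20), 5), ((21, 31), 10), ((32, 42), 11), ((43, 53), 7),
           ((54, 64), 23), ((65, 75), 55), ((76, 86), 7), ((87, 97), 8),
           ((98, 108), 4)]

n = sum(f for _, f in classes)
half = n / 2                    # M = N / 2 = 65

# Walk down the table until the cumulative frequency reaches M.
cf_before = 0
for (lower, upper), f_m in classes:
    if cf_before + f_m >= half:
        break                   # this class is the median class
    cf_before += f_m

x_lb = lower - 0.5              # lower class boundary
class_size = upper - lower + 1  # i
median = x_lb + class_size * ((half - cf_before) / f_m)

print((lower, upper), cf_before, f_m, round(median, 2))
```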
