Statistics in Physical Education
“Every Moment is a Golden One for him who has the
Vision to Recognize it as such!”
Prof. Shatrunjay Mrityunjay Kote, Ph. D.
Assistant Professor,
M. S. M’s. College of Physical Education,
Khadkeshwar, Aurangabad
shatru29570@gmail.com

Introduction to statistical tests
Although calculations may be easily performed using computers, the right
interpretation of the calculated results needs clarity of statistical concepts.

Face of Statistics
• It is a specific branch of mathematics that deals with the analysis of data collected on
various population groups.
• Statistics involves mathematical abilities more than addition, subtraction, division and
multiplication which are repeated many times in a logical fashion.
• For fuller details of statistical tests, readers may refer to Chandha (1992); Vincent (1995);
Hopkin et al. (1996); Sincrich et al. (2002); Triola (2002).

Statistical concepts
1. An understanding of basic statistics is indispensable for dealing with the process of
evaluation in test and measurement.
2. Statistical concepts facilitate proper and effective interpretation of test scores
or measurements taken by the coach or physical educator.
3. While a computer saves the teacher or coach the huge amount of time needed
for enormous calculations, the meaning of the results is made clear only through
an understanding of the relevant statistical concepts.
4. Just as tests act as the seed of measurement, statistical tests act as the seed of the
construction of all other types of tests and are also essential for testing the
validity, reliability and objectivity of all tests.

Functions of Statistical Tests
The information which we can deduce from test and measurement depends on our statistical
ability. Statistical tools enable us to perform the following important functions:
1. Organize and tabulate data (presentation of facts in a definite form)
2. Analyse data
3. Synthesize data (classification / combination of facts)
4. Compare groups of data
5. Simplify unwieldy and complex data
6. Interpret data properly
7. Test hypotheses
8. Understand the relationship and association between different parameters, make predictions
and take decisions
9. Construct physical, psychomotor and written tests
10. Evaluate individual measurements
11. Select sportspersons
12. Monitor training and teaching effects and test the need for individualization of
training and teaching

Statistics: Meaning
Meaning: The word “statistics” is the plural form of ‘statistic’. The term statistic
is so uncommon that many students of statistics may be
unaware of the singular form. The word statistics is taken from the German
word ‘statistik’, meaning a political state. Since facts and figures were required
in olden days mainly by kings for their administration, statistics was known in the
beginning as the ‘Science of Kings’ (Chadha, 1992).
Subsequently, its scope has greatly widened, and statistics now refers to a huge
body of methods, symbols and formulae dealing with phenomena that can be
described numerically, providing quantitative arrays of information.
A statistic is a numerical value which characterizes a group of scores. For example,
the average height characterizes the entire sample of subjects whose heights
were measured to calculate it. A number of such
characterizing values form the plural of the above-mentioned statistic and
thus give rise to the more commonly used term, statistics.
According to Webster’s dictionary, statistics means “The classified facts
representing the condition of people in a state, especially those facts which can
be stated in numbers or in tables or in any diagrammatic or classified
arrangements.”

Statistics: Definition
• Definition: Statistics may be defined as “summarized figures of numerical facts
such as percentages, averages (means, medians, modes), standard deviations etc. and
methods of dealing with numerical facts”
• Tate (1955) described statistics in the following interesting sentence: “You compute
statistics from statistics by statistical methods.”
• Mangal (1987) defines statistics as “A subject or branch of knowledge that helps us
in the scientific collection, presentation, analysis and interpretation of numerical
facts”
• Hastad and Lacy (1994) define statistics as “the science of collecting, classifying,
presenting and interpreting numerical data”
• In general, we can define statistics as “the study of collection, organization,
summarizing, analysis and interpretation of numerical data”

Meaning of Important Statistical Terms
1. Data: Any information collected about facts is called data. The word data is the
plural form of datum (fact).
2. Population: A defined group of all subjects or all objects with at least one
common characteristic, is known as a population or universe.
3. Sample: A subgroup drawn for the study of selected characteristics of the
population is known as a sample. Hence a sample may be defined as a much
smaller part of population selected for representing the whole population.
4. Sampling: It may be defined as the scientific procedure of selecting a sample, i.e.,
only a few items / subjects from the entire population of items or subjects.

• Types of sampling:
• Simple Random Sampling: In simple random sampling each member of the
specified population has an equal chance of being included in the sample. All the
subjects are assigned numbers and then the required number of subjects is selected
with the help of a random number table, as given in standard statistics books and on the
internet (www.graphpad.com).
• Systematic Sampling: After listing all the subjects, sampling is done at
systematic intervals chosen to provide the required number of subjects.
• Stratified Sampling: A random sample containing equal representation of each
stratum (sub-group) of the population. In this type of sampling the population is first
divided into homogeneous groups and then the procedure of systematic sampling
is applied to each group.
• Cluster Sampling: A random sample collected from randomly selected clusters.
Clusters differ from sub-groups because the subjects of a cluster are widely
dissimilar, while the subjects of a sub-group are quite similar.
• Proportional Sampling: It can be used with all the methods mentioned above. It
involves determining the percentage representation of each category of a
population.
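
The sampling procedures above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the roster of 100 numbered subjects, the "boys"/"girls" strata and all function names are hypothetical, and Python's pseudo-random generator stands in for the printed random number table.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Simple random sampling: every member has an equal chance of selection."""
    rng = random.Random(seed)          # seed only makes the draw repeatable
    return rng.sample(population, n)   # sampling without replacement

def systematic_sample(population, n, start=0):
    """Systematic sampling: take every k-th subject from the ordered list."""
    k = len(population) // n           # sampling interval
    return [population[start + j * k] for j in range(n)]

def proportional_allocation(strata_sizes, n):
    """Proportional sampling: sample size per stratum, in proportion to its size.
    Rounding may make the parts differ slightly from n in awkward cases."""
    total = sum(strata_sizes.values())
    return {name: round(n * size / total) for name, size in strata_sizes.items()}

subjects = list(range(1, 101))                      # hypothetical roster, numbered 1..100
srs = simple_random_sample(subjects, 10, seed=1)
sys_sample = systematic_sample(subjects, 10, start=4)
allocation = proportional_allocation({"boys": 60, "girls": 40}, 10)

print(srs)
print(sys_sample)
print(allocation)
```

With a sampling interval of 10 and a start index of 4, the systematic sample picks subjects 5, 15, 25 and so on; the proportional allocation gives 6 boys and 4 girls for a sample of 10.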

5. Statistics: Any characteristic of a population studied from a sample is known as a
statistic or a variable, whereas a number of characteristics of varied population groups
are known as statistics.
6. Parameter: The investigator is interested in studying various characteristics of a
population. Each characteristic, when studied on the entire population, is known as a
parameter. In other words, a parameter is a measurable characteristic of the entire defined
population.
7. Hypothesis: A supposition of an answer to the problem at hand, made by the investigator
in a testable form before conducting the research, which is later tested by
scientific research.
Features of a Hypothesis: (a) The hypothesis provides the basis for planning; (b) It
is a supposition about some phenomenon; (c) A hypothesis is to be tested by gathering
actual facts about the phenomenon.

8. Norms: Norms are naturally occurring normal values, and the normal ranges
cannot be modified by experts. They are statistical tools which evaluate the status of an
individual against recently collected data on a large sample of the population
group to which the individual belongs.
9. Standards: Standards are the limits of physical, physiological or
psychomotor levels set by the investigator for scoring or selecting individuals for a particular
competition, the award of a degree / prize, the selection of items etc.
10. Degree of freedom: The degree of freedom may be defined as the number of individual
values in a distribution which are independent of each other and cannot be
deduced from each other. Generally the degree of freedom is given by the formula:
df = Ns – Np, where df = degree of freedom; Ns = number of scores (observations);
Np = number of parameters being estimated.

Classification of Statistics
Descriptive Statistics: The branch of statistics which describes the statistical constants
of the data, such as averages, measures of variability, standard scores etc., is known as
descriptive statistics. For instance, the frequency distribution, mean, median, mode,
standard deviation, coefficient of variation, standard ‘Z’ score, ‘T’ score, sigma score,
Hull score, percentiles, normal curve etc. come under descriptive statistics. In other
words, descriptive statistics summarize the achievements and the structural and functional
characteristics of champions and groups in numerical figures, usually in tabular and
diagrammatic forms.
Inferential Statistics: The statistical tools used to draw conclusions or inferences about
differences, relationships and predictions are known as inferential statistics
or statistical tests. Examples are the ‘t’ test, ‘F’ test, chi-square test, regression coefficients
etc. In other words, inferential statistics provide mathematical tools to compare groups,
to examine the level of group differences, to predict one variable from another
and to test various types of hypotheses.

Importance of Statistics
1. Statistics, that is, statistical tests, are an essential part of test construction for any type
of physical performance evaluation.
2. They help to find the central tendency (mean value), which represents the average
performance of the group (class, team etc.).
3. They help to compare one group of sportsmen with another.
4. Statistics enables players to be classified into different grades / levels.
5. They help the physical educator to validate his / her methods of teaching and training
and to study the effects of different teaching / training methods.
6. Statistics are essential for understanding research literature, conducting research and reporting
the results of tests and measurements.
7. No scientific evaluation of tests is possible without applying statistical tests.
8. With the help of statistics, a physical educationist is able to interpret the scores and
grades awarded to the students in quantitative and qualitative data.
9. Statistics helps the physical educationist to select sports talent at a young
age and to predict the performance potential of his / her trainees.

Data and its Distribution
Meaning of Data
The term ‘data’ is the plural form of datum, which means ‘fact’. Thus, data refers to a
number of facts collected by the investigator. In other words, data refers to the raw
scores especially the numerical records obtained in a statistical enquiry. In measurement
and evaluation data is used for numerical scores about facts such as measured height,
weight, performance in athletic tests, achievement scores for qualitative tests
(intelligence, attitude, sportsmanship behaviour, interest etc.)
Definition of Data:
Any information collected systematically about facts or measures is called data

Kinds of Data
Data may be divided into various kinds with respect to different bases of its
classification. The bases of classification of data are given below;
(a) Sources of data collection
(b) Type of distribution of data
(c) Organisation of data

(a) Sources of data collection
Kinds of data based on sources of the collection:
(i) Primary Data: The data collected by actual measurement, testing or observation with
the help of scientific equipment or a questionnaire is known as primary data, or
data collected from a primary source. For example, measuring height, weight,
standing broad jump or intelligence quotient directly on a selected sample of
subjects.
(ii) Secondary Data: Data previously collected / recorded by another
investigator or agency and used in a different manner by the present investigator
is known as secondary data. For example, the height, age, weight etc. collected
at the time of recruitment of soldiers, admission of children to schools or
selection of athletes, when used by the investigator for his / her research work, is
secondary data.

(b) Kinds of Data Based on Types of its Distribution
(i) Continuous data: Data is known as continuous when it may be subdivided or
classified in a continuous series without any gap. When the values of raw scores
differ from one another by indefinitely small amounts, the data belongs to the
continuous type. The majority of the variables in which physical education
teachers / coaches are interested have a continuous and usually approximately normal
distribution, and thus measurement of these variables gives rise to continuous data.
Examples of this kind of data are numerous, such as data on height, weight, lengths,
temperatures, intelligence levels, attitudes, performances etc.
(ii) Discrete data: Data is known as discrete or discontinuous when it cannot be
divided into a continuous series and instead falls into a discontinuous, that is,
unconnected series. Variables which are discontinuous, with real gaps
between one value and the next, give rise to
discrete data, which can be expressed only in whole units (numbers) or categories. Examples are
the number of eggs laid by a hen, blood groups, ethnic groups (human races), sex etc.

(c) Kinds of data based on its organisation
Statistical data may be divided into the following two kinds:
(i) Raw data (ungrouped data): Data presented as the raw scores as they were
recorded, without any attempt to arrange them into a more meaningful or convenient
form, is called ungrouped or raw data.
(ii) Organized data (grouped data): Data organized or arranged in some more
meaningful manner, such as from high to low value or grouped into categories or classes
in order to facilitate further computation, is known as organized or grouped data.
Grouped data is usually represented by a frequency distribution.

Frequency Distribution
Definition: A frequency distribution may be defined as a ‘method of grouping the raw
scores of data’. It is usually represented in tabular form, listing the raw scores or
intervals of scores and the frequencies with which the raw scores occur in each class
interval. Such tables are known as frequency tables.
Before the invention of the calculator, calculations based on grouped data were faster
even though the procedures were quite indirect and complicated. However, with the
recent widespread availability of inexpensive pocket calculators, the use of
frequency distributions has greatly declined. The computation of statistical constants
based on ungrouped data is known as the computer method of calculation.
For frequency distributions, the raw scores are arranged in an ascending or descending
series giving the rank or merit position of each individual’s score. While doing so, the
frequency of individual scores also becomes apparent. The frequency of a score
is the number of times the score is repeated in a series of data. In physical education,
the data, being of a continuous nature (normally distributed), contains a large number of
rank orders with a very low frequency for each score, resulting in long series of data. In
order to summarize such data more adequately, the rank orders are divided into a few
classes, which results in grouping of the data with increased frequencies. This
organization of data is known in practice as a frequency distribution with grouped
data. Such organization of data needs the construction of a frequency distribution table.

Construction of a frequency distribution table
The frequency distribution table may be constructed for both discrete and continuous
data and thus, is of two types:
(a) Discrete frequency distribution table: Since discrete data is indivisible into
fractional units and may have large gaps, it has fewer
categories, with a larger frequency in each category.
The construction of a discrete frequency distribution table is very simple and does not
involve a series of steps; it is a simple listing of tallies against each
discrete category / class. An example of a discrete frequency distribution table is
given below for the distribution of subjects belonging to different blood
groups.
Type of Blood Group   Tallies                                                                   Frequency
A                     IIIII IIIII IIIII IIIII III                                               23
B                     IIIII IIIII IIIII II                                                      17
O                     IIIII IIIII IIIII IIIII IIIII IIIII IIIII IIIII IIIII IIIII IIIII IIIII   60
AB                    IIIII IIIII                                                               10
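
The tallying above can be reproduced with a short Python sketch. The raw list of observations is hypothetical: it is simply built to match the frequencies in the blood group table (23, 17, 60 and 10), since the individual records are not given.

```python
from collections import Counter

# Hypothetical raw observations, constructed to match the table's frequencies
blood_groups = ["A"] * 23 + ["B"] * 17 + ["O"] * 60 + ["AB"] * 10

frequency = Counter(blood_groups)     # counts each discrete category
total = sum(frequency.values())       # total number of subjects

print("Type  Frequency")
for group in ["A", "B", "O", "AB"]:
    print(f"{group:<6}{frequency[group]:>9}")
print(f"Total {total:>9}")
```

Counting categories directly replaces the manual tally marks; the frequency column is the same either way.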

(b) Continuous frequency distribution table: As mentioned earlier,
continuous data series are very lengthy and are usually grouped
before the construction of a continuous frequency distribution table. The continuous
frequency table consists of five columns instead of the three columns of the discrete frequency
table. The two additional columns are cumulative frequency and percentage cumulative
frequency. Consequently the process of construction also becomes somewhat lengthy, and
it is therefore divided into the steps given below:
Step 1: Finding the range of scores: To find the total size of the range of the
distribution, the lowest score is subtracted from the highest score plus one, that is,
Range = (Highest score + 1) – Lowest score. For example, consider the
distribution of the heights of 40 male students given in the following table.
Step 2: Deciding the number and size of class intervals: The range is divided into a
number of class intervals (groups) by deciding the size (sub-range) of each group.
While deciding the number of groups, the size of the sample is generally taken into
consideration. If the sample size is less than 50, the number of groups is usually kept
below 5, whereas in samples of 50 to 100 the number of groups is usually kept
between 5 and 10. In samples containing more than 100 items, the number of groups is
taken between 10 and 20 (Tate, 1955).

S. No. Height S. No. Height S. No. Height
1 165.3 16 166.5 31 170.5
2 172.5 17 164.7 32 171.1
3 168.2 18 169.5 33 170.7
4 171.9 19 171.0 34 168.8
5 174.5 20 171.1 35 174.1
6 176.0 21 170.3 36 167.3
7 166.0 22 167.5 37 165.9
8 167.3 23 168.8 38 170.4
9 169.5 24 173.5 39 166.9
10 175.0 25 174.4 40 172.2
11 177.5 26 170.2
12 173.5 27 169.7
13 164.3 28 168.9
14 170.2 29 167.6
15 168.5 30 170.6

Highest value = 177.5; Lowest value = 164.3; Range = (177.5 + 1.0) – 164.3 = 14.2
After deciding the number of groups, the size of the class interval may be calculated by
dividing the total range of the sample by the number of groups:
Desired class interval = Range / No. of groups (Rule 1)
An alternate rule may be used for grouping raw data into class intervals.
When the class interval is decided first instead of the number of groups, a value of 2,
3, 4, 5 or 10 units is used as the class interval. In this case the number of groups is
calculated by dividing the range by the class interval:
No. of groups = Range / Class interval (Rule 2)
For example, if an investigator desires to have 3 cm as the class interval for the
distribution of height of the 40 male students given in the above table, then the number of
groups may be calculated as below:
No. of groups = 14.2 / 3 = 4.7 (rounded off to 5)
The frequency distribution table has the following column headings:
Class   Tallies   f   cf   % cf
where f = frequency, cf = cumulative frequency and % cf = cumulative frequency in
percentage.
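
The range and grouping rules can be checked with a small Python sketch. It uses the highest and lowest heights quoted above; the choice of 7 groups for Rule 1 is only an illustrative assumption, and rounding the number of groups upward with `math.ceil` is a design choice of this sketch so that every score falls inside some class.

```python
import math

highest, lowest = 177.5, 164.3           # from the height data of the 40 students

# Range as defined in Step 1: (highest score + 1) - lowest score
score_range = (highest + 1.0) - lowest   # about 14.2

# Rule 1: number of groups chosen first (7 is an illustrative choice)
desired_interval = score_range / 7       # about 2.03, i.e. roughly 2 cm

# Alternate rule: class interval chosen first (3 cm, as in the example above)
class_interval = 3
n_groups = math.ceil(score_range / class_interval)   # 4.73 rounded up to 5

print(round(score_range, 1), round(desired_interval, 2), n_groups)
```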

Step 3: Setting up the contents of the frequency distribution table:
(i) To set up the frequency distribution table, first the five columns are
labeled as given in the table on the earlier slide.
(ii) The second sub-step is to write the classes of the distribution,
that is, to fill in the first column of the frequency distribution table.
Step 4: Checking the frequency distribution: As a check, the frequencies (column 3)
are summed up. The sum should be equal to the total number of subjects measured. For
instance, the total of column 3 in the above example is 40, which is equal to the total
number of subjects in the sample.
Step 5: Finding the midpoint of each class interval: The midpoint of each class interval
helps to discuss and explain the results by giving a single representative value to each
class (group) instead of the total range of each class. For instance, the midpoints of the
classes (groups) of the above frequency distribution are presented in the tables.
To compute the midpoint, the smallest unit of accuracy is subtracted from the lower
limit of the class (group). Then the difference between
the upper limit and this adjusted lower limit is divided by 2, and the result is added to the
adjusted lower limit. For example, for the class 164.1-166.0 with a unit of accuracy of 0.1,
the adjusted lower limit is 164.0 and the midpoint is 164.0 + (166.0 – 164.0) / 2 = 165.0.
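
Steps 3 to 5 can be sketched in Python for the 40 heights listed earlier. The class limits (2 cm classes starting at 164.1 cm) are taken from the grouped height table used for the mean; the variable names and the trick of working in tenths of a centimetre are assumptions of this sketch, not part of the original procedure.

```python
heights = [
    165.3, 172.5, 168.2, 171.9, 174.5, 176.0, 166.0, 167.3, 169.5, 175.0,
    177.5, 173.5, 164.3, 170.2, 168.5, 166.5, 164.7, 169.5, 171.0, 171.1,
    170.3, 167.5, 168.8, 173.5, 174.4, 170.2, 169.7, 168.9, 167.6, 170.6,
    170.5, 171.1, 170.7, 168.8, 174.1, 167.3, 165.9, 170.4, 166.9, 172.2,
]

# Work in whole tenths of a centimetre to avoid floating-point edge cases
tenths = [round(h * 10) for h in heights]

first_lower, width, n_classes = 1641, 20, 7   # 164.1 cm, 2.0 cm wide, 7 classes

table = []
cf = 0
for k in range(n_classes):
    lower = first_lower + k * width
    upper = lower + width - 1                  # e.g. 164.1 to 166.0
    f = sum(lower <= t <= upper for t in tenths)
    cf += f                                    # cumulative frequency
    # Step 5: subtract the unit of accuracy (0.1 cm) from the lower limit,
    # then add half the distance to the upper limit
    adjusted_lower = lower - 1
    midpoint = (adjusted_lower + (upper - adjusted_lower) / 2) / 10
    table.append({
        "class": f"{lower / 10:.1f}-{upper / 10:.1f}",
        "f": f,
        "cf": cf,
        "pct_cf": 100 * cf / len(heights),
        "midpoint": midpoint,
    })

# Step 4: the frequencies must add up to the number of subjects
assert sum(row["f"] for row in table) == len(heights)

for row in table:
    print(f'{row["class"]}  f={row["f"]:>2}  cf={row["cf"]:>2}  '
          f'%cf={row["pct_cf"]:>5.1f}  midpoint={row["midpoint"]}')
```

The frequencies this produces (5, 6, 8, 11, 4, 5, 1) agree with the grouped height table used in the calculation of the mean.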

Tests of Central Tendency
Meaning and definition: We invariably come across average values in newspapers daily.
These average values are the most characteristic of a given group. In other words, we may
say that the greatest concentration of scores in any group is towards the middle value of
the scores. The average value, or the value having the greatest concentration around it, is
statistically known as the central tendency.
When a group of students is
assembled and arranged in files (rows) according to their height or according to their
achievement score in a fitness test, the bigger rows will be towards the middle, and these
midpoints of continuous variables are referred to as the central tendency. In other
words, the central tendency represents the typical average score of a normal variable.
According to Rothstein (1985), “Central tendency represents a point in the set of
distribution around which scores seem to center”. The three basic measures of
central tendency are the mean, median and mode, which are by far the most widely used
statistics, both by researchers and by the general population. Daily we talk about
average values, for example average income, average rate of inflation, average size of
family, average speed, average run rate etc. The measures of central tendency are used
to condense information.
Tate (1955) defined it thus: “Central tendency is a sort of average or typical value of the
items in the series and its function is to summarize the series in terms of this average
value”.

Mean
It may be defined as the sum of all the individual scores or values of the items in a series
divided by the number of items. The mean is usually designated by the symbol capital
‘M’ or X bar. It is the same as the arithmetic mean.
For teachers of physical education and sports trainers, the mean is the most frequently
used and relevant measure of central tendency because almost all physical,
physiological, psychomotor and achievement scores are normally distributed variables,
and the mean is the best measure of central tendency for such variables.
Calculation of Mean: The mean represents the general average which is most commonly
talked about and is computed by simple arithmetic with the following
formulae:
(a) For ungrouped data: M = ∑X / N
where M stands for the mean, ∑X for the sum of the individual values or scores of the
items and N for the total number of items in a series or group.
For example, to compute the mean of the ungrouped data of height measured
on 40 male students presented in the table, we first add up all the 40 values of
height and then divide the sum by 40 to find the mean (average) height of
the male students of that group. Mathematically, it may be written as
M = (X1 + X2 + X3 + X4 + … + X40) / 40
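
As a quick check of the ungrouped formula, the following Python sketch adds up the 40 heights from the table and divides by 40. Only the variable names are assumptions.

```python
heights = [
    165.3, 172.5, 168.2, 171.9, 174.5, 176.0, 166.0, 167.3, 169.5, 175.0,
    177.5, 173.5, 164.3, 170.2, 168.5, 166.5, 164.7, 169.5, 171.0, 171.1,
    170.3, 167.5, 168.8, 173.5, 174.4, 170.2, 169.7, 168.9, 167.6, 170.6,
    170.5, 171.1, 170.7, 168.8, 174.1, 167.3, 165.9, 170.4, 166.9, 172.2,
]

n = len(heights)               # N = 40
sum_x = sum(heights)           # sum of all individual heights
mean_height = sum_x / n        # M = sum of X / N

print(f"N = {n}, sum of X = {sum_x:.1f}, M = {mean_height:.2f} cm")
```

The ungrouped mean comes out very close to the value obtained from the grouped data, which uses class midpoints instead of the individual scores.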

(b) For grouped data: For grouped data presented in the form of a frequency distribution table,
the mean is calculated by the following formula number 1:
M = ∑fX / N
where ‘X’ represents the midpoint of the sub-group (or class), ‘f’ its respective frequency and ‘N’
the total number of subjects in all the sub-groups, or the total sample being studied.
For example, to calculate the mean height of the above-mentioned 40 male students
presented in the form of a frequency distribution, the steps for computing the mean are
presented in the table below:
Class (height in cm)   X = Midpoint (cm)   f = Frequency   fX
164.1-166.0            165                 5               825
166.1-168.0            167                 6               1002
168.1-170.0            169                 8               1352
170.1-172.0            171                 11              1881
172.1-174.0            173                 4               692
174.1-176.0            175                 5               875
176.1-178.0            177                 1               177
164.1-178.0            171                 N = 40          ∑fX = 6804
M = ∑fX / N = 6804 / 40 = 170.1
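
The grouped calculation in the table can be verified with a few lines of Python. The midpoints and frequencies are copied from the table; the variable names are illustrative.

```python
midpoints = [165, 167, 169, 171, 173, 175, 177]   # X, midpoint of each class (cm)
frequencies = [5, 6, 8, 11, 4, 5, 1]              # f, frequency of each class

n = sum(frequencies)                                        # N
fx = [f * x for f, x in zip(frequencies, midpoints)]        # fX column
sum_fx = sum(fx)                                            # sum of fX
grouped_mean = sum_fx / n                                   # M = sum of fX / N

print(fx)
print(f"N = {n}, sum of fX = {sum_fx}, M = {grouped_mean:.1f} cm")
```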

Another method of computing the mean of grouped data is based on an assumed
mean value, with the help of the following formula:
M = AM + (∑fX’ / N) × i
where
AM = assumed mean
f = frequency of a class
N = total number
i = class interval
X’ = (X – AM) / i
X = midpoint of the class
∑ = sum over all classes
For example, this shortcut method of computing the mean of the data used in the
table is illustrated in the table. The assumed mean is usually the value of the midpoint of
the middle class of the distribution. Hence, let the assumed mean (AM) = 171 and i = 2.
Using the formula:
M = AM + (∑fX’ / N) × i = 171.0 + (–18 / 40) × 2
M = 171 + (–36 / 40) = 171 – 0.9 = 170.1
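
The assumed mean (shortcut) method can be traced step by step in Python. The midpoints, frequencies, assumed mean of 171 and class interval of 2 come from the worked example; the variable names are assumptions of this sketch.

```python
midpoints = [165, 167, 169, 171, 173, 175, 177]   # X
frequencies = [5, 6, 8, 11, 4, 5, 1]              # f
assumed_mean = 171                                # AM, midpoint of the middle class
interval = 2                                      # i, class interval

n = sum(frequencies)
# X' = (X - AM) / i : deviation from the assumed mean in class-interval units
x_prime = [(x - assumed_mean) // interval for x in midpoints]
fx_prime = [f * xp for f, xp in zip(frequencies, x_prime)]
sum_fx_prime = sum(fx_prime)

# M = AM + (sum of fX' / N) * i
shortcut_mean = assumed_mean + (sum_fx_prime / n) * interval

print(x_prime)        # deviations in step units
print(fx_prime)       # fX' column
print(f"sum of fX' = {sum_fx_prime}, M = {shortcut_mean:.1f} cm")
```

Both methods give the same mean because the shortcut only shifts and rescales the midpoints before averaging.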

The mean is the balance point of a distribution, which can be checked by finding the sums of
the differences on the plus and minus sides of the mean. The sum of the differences on the plus side
should equal the sum of the differences on the minus side. In other words, the sum of
deviations from the mean (X – X bar) is always 0. This is illustrated in Table
9.3.
Class (height in cm)   X = Midpoint of class (cm)   f    X’ = (X – AM) / i   fX’
164.1-166.0            165                          5    -3                  -15
166.1-168.0            167                          6    -2                  -12
168.1-170.0            169                          8    -1                  -8
170.1-172.0            171                          11   0                   0
172.1-174.0            173                          4    +1                  +4
174.1-176.0            175                          5    +2                  +10
176.1-178.0            177                          1    +3                  +3
164.1-178.0            171                          40   0                   ∑fX’ = -18

Sum of deviations from the mean is always zero
X     X – X bar
4     -6
5     -5
6     -4
7     -3
8     -2
9     -1
10    0     (X bar = 10)
11    +1
12    +2
13    +3
14    +4
15    +5
16    +6
Sum of minus deviations from the mean = 21
Sum of plus deviations from the mean = 21
(Sum of deviations = zero)
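
The balance-point property can be confirmed for the scores 4 to 16 shown above with a brief Python sketch (variable names are illustrative):

```python
scores = list(range(4, 17))                 # X = 4, 5, ..., 16
mean = sum(scores) / len(scores)            # X bar

deviations = [x - mean for x in scores]     # X - X bar for every score
minus_side = sum(d for d in deviations if d < 0)
plus_side = sum(d for d in deviations if d > 0)

print(f"X bar = {mean}")
print(f"minus side = {minus_side}, plus side = {plus_side}, total = {sum(deviations)}")
```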

Median
The median is the middle value among all the scores. When the available scores or values
are arranged in ascending or descending order of magnitude, the score or value of the central
item in the series is known as the median. According to Mangal
(1987), “the median of a distribution is the point on the score scale below which one half or
50% of the scores fall”. The median divides the ascending or descending series of items into two
equal parts. It should be clearly understood that it is not the serial number of the central
item which is the median, but the value or measure of the central item.
In other words, the value of the 50th percentile is known as the median
value.
The median is obtained by arranging the scores in a sequence and finding the value of
the middle score in this distribution.
Calculation of Median: The procedure for calculating the median for ungrouped and grouped
data is described below:
(a) Ungrouped Data: Mathematically, when the number of items is odd,
Median = ((N + 1) / 2)th value
where N = total number of items in the series.
For example, if we have a series of seven values, the median will be the (7 + 1) / 2 = fourth value
in the series.
For instance, if the body weights of seven boys are arranged in ascending order, the
weight of the 4th boy is the median weight, as illustrated below in the table.

Median of the sample having an odd number of subjects:
S. No. (in ascending order)   Body weight (kg)
1                             30
2                             32
3                             35
4                             38   (Median)
5                             40
6                             45
7                             47
Median for even values: If the number in the sample is even (say 8 boys, so that the median is the
4.5th value, which does not exist), then the average of the middle two scores is
taken as the median. In this case, the average weight of the 4th and 5th boys in a
series of eight boys is the median, as illustrated below in the table.

Median of the sample having an even number of subjects:
S. No. (in ascending order)   Body weight (kg)
1                             30
2                             32
3                             35
4                             38
5                             39
6                             40
7                             45
8                             47
Median position = (N + 1) / 2 = (8 + 1) / 2 = 4.5th value
Median = (38 + 39) / 2 = 38.5 kg
i.e., mathematically, the median of an even number of items may be calculated by the following formula:
Median = Sum of middle two scores / 2
(b) Grouped Data: When the data comes from a larger sample and is arranged in the form of a
frequency distribution, calculating the median requires locating the class (group) in which
the middle value lies. If N = 50, i.e., an even number, then the position of the median is
given by the following:

Median position = [(N/2)th item + (N/2 + 1)th item] / 2
For N = 50, with the frequency distribution given in the table below:
Median position = (50/2 + (50/2 + 1)) / 2 = (25 + 26) / 2 = 25.5th value
Median of a large sample organized in the form of a frequency table:
S. No.   Body height classes   f     cf
1        162.6-165.5           4     4
2        165.6-168.5           11    15
3        168.6-171.5           21    36
4        171.6-174.5           10    46
5        174.6-177.5           4     50
         162.6-177.5           N = 50
The cumulative frequency column indicates that the 25.5th value lies in the class interval 168.6-171.5.
Median = L + ((N/2 – F) / f) × i, when N is an odd number
Median = L + (((N + 1)/2 – F) / f) × i, when N is an even number
where
L = lower limit of the class containing the median
F = cumulative frequency below the median class
f = frequency of the class containing the median
i = class interval
N = total number (sum of all frequencies)

For example, applying the above formula to the data given in the table, we get
L = 168.6; N = 50; F = 15; f = 21; i = 3
Median = L + (((N + 1)/2 – F) / f) × i
= 168.6 + ((25.5 – 15) / 21) × 3 = 168.6 + (10.5 / 21) × 3 = 168.6 + 1.5 = 170.1 cm
The median may also be calculated by the formula for calculating percentiles from
grouped data, because the 50th percentile is the median. The formula for the 50th
percentile or median is given below:
Median = lrl + ((0.5 (N) – ∑fb) / fw) × i
where lrl = lower real limit (value) of the interval containing the middle value of the distribution
0.5 = the fraction 50/100
N = number of subjects
∑fb = cumulative frequency below the interval containing the median value
fw = frequency within the median interval
i = size of class interval.
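
The median calculations for ungrouped and grouped data can be sketched in Python as follows. The function names are hypothetical. The grouped function follows the slide's own formula, Median = L + ((position – F) / f) × i, with the position (here 25.5) passed in explicitly; the lower class limits assume 3 cm classes starting at 162.6 cm.

```python
def median_ungrouped(values):
    """Middle value for odd N; average of the two middle values for even N."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def median_grouped(lower_limits, frequencies, interval, position):
    """Locate the class holding the given position, then interpolate within it."""
    below = 0                                   # F, cumulative frequency below the class
    for lower, f in zip(lower_limits, frequencies):
        if below + f >= position:               # this is the median class
            return lower + (position - below) / f * interval
        below += f
    raise ValueError("position exceeds the total frequency")

odd_median = median_ungrouped([30, 32, 35, 38, 40, 45, 47])        # seven boys
even_median = median_ungrouped([30, 32, 35, 38, 39, 40, 45, 47])   # eight boys

lower_limits = [162.6, 165.6, 168.6, 171.6, 174.6]   # assumed lower class limits (cm)
frequencies = [4, 11, 21, 10, 4]
n = sum(frequencies)                                  # N = 50
grouped_median = median_grouped(lower_limits, frequencies, 3, (n + 1) / 2)

print(odd_median, even_median, round(grouped_median, 1))
```

Passing 0.5 × N as the position and the lower real limit of each class as the lower limit would give the percentile form of the formula instead.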

Mode
“the most frequently occurring value in a distribution is known as mode”. In other words, it is the most
common value of a distribution and is usually found to lie towards the center of the distribution and is known
as a measure of central tendency. Sometimes, it is symbolized as Mo, however mostly it is not symbolized
and is used as the complete word mode. It indicates the centre of concentration of frequency in and around a
given value. Mode is most important in making choices of sizes say, for footwear, ready-made garments
where the manufacturer is interested to produce number of items according to the frequency of sizes / choices
of customers.
Calculation of Mode: The procedure to compute mode is described below for both ungrouped data:
(a) For ungrouped data: in case of ungrouped data, the value of that datum which occurs most frequently
gives the mode value. Hence, mode of ungrouped data is calculated by the construction of frequency
table. The value which has the highest frequency is the Mode.
(b) In Grouped Data: in case of frequency distribution table data, the mode may be defined as the midpoint of
the class interval containing the largest number of items i.e., maximum frequency
Or Mode = L + fm – f (m-1)/ (fm – f (m-1) + (f(m) – f (m+1) x i
Where: L = Lower limit of model class
fm = frequency of model class
f(m-1) = frequency of class preceding model class
f(m+1) = frequency of class succeeding model class
For example, for the data of table
Mode = 168.6 + (21-11) / (21-11) + (21-10) x 3
 168.6 + 10/21 x 3
 168.6 + 10/7
 168.6 + 1.43
 Mode = 170.03 cm
Another formula for computing the mode is Mode = 3 Median – 2 Mean, where Median and Mean are the median
and the mean of the given data. For example, considering the data of the table,
Median = 170 cm and Mean = 169.88 cm, so
Mode = 3 x 170 – 2 x 169.88 = 510 – 339.76 = 170.24 cm.
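The two worked examples above can be checked with a short Python sketch. The function and variable names are illustrative only and do not come from the slides; the numbers are the ones used in the examples (lower limit 168.6, frequencies 21, 11 and 10, class interval 3, median 170 and mean 169.88).

```python
def grouped_mode(lower_limit, f_modal, f_prev, f_next, interval):
    """Interpolated mode of a frequency distribution (grouped data)."""
    diff_prev = f_modal - f_prev   # fm - f(m-1)
    diff_next = f_modal - f_next   # fm - f(m+1)
    return lower_limit + diff_prev / (diff_prev + diff_next) * interval


def empirical_mode(median, mean):
    """Approximate mode from the relation Mode = 3 Median - 2 Mean."""
    return 3 * median - 2 * mean


mode_from_table = grouped_mode(168.6, 21, 11, 10, 3)
mode_from_averages = empirical_mode(170, 169.88)

print(round(mode_from_table, 2))     # 170.03
print(round(mode_from_averages, 2))  # 170.24
```

The two estimates differ slightly because the second formula is only an approximation for moderately skewed data.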

Properties
(A) Properties of Mean: (1) It is the balance point of the distribution, where the sum of the
negative deviations equals that of the positive deviations. (2) It is responsive to each score and
is more sensitive to extreme scores than are the median and mode. (3) It is the measure
which best reflects the total of all the scores. (4) It is widely used, implicitly or
explicitly, in advanced statistical tests. (5) It is least affected by sampling
fluctuations: the mean fluctuates least when based on different samples drawn
randomly from a population.
(B) Properties of Median: (1) It is not affected by extreme scores. (2) It is
responsive only to the number of scores above or below its value. (3) It is subject to
greater sampling fluctuations: the value of the median fluctuates considerably when based
on different samples of the same population. (4) It is the only suitable and stable measure
of central tendency for open-ended distributions. Where there are no specific values at
the beginning and / or at the end of the distribution, the median, unlike the mean, can still
be easily computed.
(C) Properties of Mode: (1) Determined by the most frequently occurring value, the
mode is affected most by sampling fluctuations. (2) It is affected more by the
choice of class interval than the mean and median are. (3) It is the measure most suited to
discrete (categorical) variables.

Merits and Demerits of Mean
Central Tendency: Mean
Merits:
a) It is a common average easily understood by all.
b) It gives weight to each and every value in proportion to its quantity.
c) It is accurately defined.
d) It can be calculated without knowing the details of the data.
e) It is not affected by the position in the distribution.
f) Many further statistical constants may be obtained from the mean value.
Demerits:
a) It is greatly affected by extreme values.
b) It cannot be located with the help of a diagrammatic presentation of the data.
c) It cannot be used in the case of qualitative data.

Merits and Demerits of Median
Central Tendency: Median
Merits:
a) It is not affected by the size of extreme values.
b) It is an important measure of central tendency of qualitative data.
c) It can be located on a graphical presentation of data.
Demerits:
a) It is affected by sampling fluctuations.
b) It is necessary to arrange the data in ascending or descending order to compute the median value.
c) It does not take into consideration every value of the group.

Merits and Demerits of Mode
Central Tendency: Mode
Merits:
a) It is most easily calculated.
b) It gives the best representative value of the sample.
c) It can be determined graphically.
d) It gives the actual value of an important item of the series.
e) It is not affected by extreme values.
Demerits:
a) The choice of class intervals has a great influence on the value of the mode.
b) It does not consider all the values.
c) No further statistical manipulation can be based on it.
d) It is greatly affected by a change in the sample.

Comparative Roles of Mean, Median and Mode
The above comparison of the merits, demerits and properties of the mean, median and mode
gives an idea of the comparative role of each. Calculation of any one of the three
provides a measure of central tendency.
The comparative roles of mean, median and mode in different situations are described
briefly below, one by one:
1. Relative role and use of Mean: (i) It is the most accurate measure of central tendency. Its
role is the most significant among the three forms of central tendency because it is the most
stable form of average, having the least fluctuation in its value when based on
different samples drawn from the same population. (ii) The mean is best suited to elaborate
statistical treatment, for the computation of other statistical constants like standard deviation
and coefficients of correlation.
2. Relative role and use of Median: (i) When the exact midpoint is needed, it is the median
which is helpful; it divides the group into two equal parts. (ii) If a distribution has
extreme scores with great variability around the central tendency, the role of the median is
most important because it is not affected by the extreme values. For example, suppose a physical
education teacher wants to find the central tendency of hammer throw in a group of 20
students and finds that two or three students throw the hammer only a very small distance due
to their inexperience and non-cooperation. Using the median, the teacher will get practically the
same typical value as would be obtained once these two or three students also learn and co-operate
in improving their performance. (iii) The role of the median is also important when the data have an
incompletely defined grouping, such as players drawn from greatly different grades of a high school.
It is impossible to calculate the mean for such data; however, there is neither any difficulty nor
any loss of reliability if the central tendency is computed from the median.

Comparative Roles of Mean, Median and Mode
3. Relative Role of Mode: (i) One significant role of the mode is to provide a quick
measure of central tendency. Just a cursory look over the data is enough to find the most
repeatedly occurring value in a group of sample values, which is the mode. The
value repeated the maximum number of times in ungrouped data, or the class with the
highest frequency in grouped data, gives the mode, and this is evidently the
quickest way of finding a measure of central tendency. (ii) The role of the mode is most
significant in the large-scale manufacture of consumer goods, where the most frequently
occurring size (the mode) is to be manufactured in the maximum quantity. Obviously the
median and mean have only limited value in finding the sizes of shoes and ready-made
garments which will be most in demand for fitting most men and women of the
region, so the manufacturer has to depend upon the central tendency
computed in the form of the mode.
In summary, the comparative roles of the three measures of central tendency and the
preference for using each of them are given in the table on the next slide.

Preferential use of mean, median and mode in different situations
Mean is computed:
(i) To get an accurate average value
(ii) When other statistical constants like standard deviation, coefficient of variation etc. are needed
(iii) When the distribution has no extreme values
(iv) When the data is normally distributed and has more than one typical value interrupted discretely (e.g. bimodal distribution)
Median is computed:
(i) To know the exact middle value
(ii) When the frequency distribution is open ended at the beginning and / or at the end
(iii) When the data has extreme values with a wide range which will greatly affect the mean value
(iv) When a graphical representation of the data is available
Mode is computed:
(i) To know the most frequently occurring value quickly
(ii) When the investigator is interested in the most frequently occurring value, especially for manufacturing consumer goods etc.
(iii) When a graphical representation of the data is available
(iv) When the data is not normally distributed.

Relationship among Mean, Median and Mode
The relationship between mean, median and mode depends on the type of distribution of the
scores of achievement or the values of the measurements of the group of individuals. When the
data are normally distributed or symmetrical, the values of mean, median
and mode are exactly equal. When the data are asymmetrical or skewed, the values of mean,
median and mode differ and their relative positions are as indicated in the figure. It is the
shape of the frequency distribution which enables the investigator to decide on the
appropriateness of computing the mean, median or mode for particular data.
If the investigator does not need any further details of the data, like standard deviation etc., the
calculation of mode and median is easier. When the distribution is skewed, the
median is most suited, and when the investigator is working for consumer
product manufacture etc., the mode is most suitable. For teachers of physical education and
sports sciences, the mean is most frequently needed, while the median is used to
eliminate the influence of extreme scores and the mode is seldom needed. However, students of physical
education and sports must be clear in their minds regarding the relationship between the
three measures of central tendency. In a slightly skewed distribution, the difference
between mean and median is usually one third of the difference between mean and mode, as
represented by the following mathematical formulae:
Mean – Median = 1/3 (Mean – Mode)
Or Mode = 3 Median – 2 Mean
Or Mode = Median + 2 (Median – Mean)
Or Median = Mean + 1/3 (Mode – Mean)
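As an illustration of the relative positions of the three averages in a skewed distribution, the following Python sketch uses a small made-up set of scores (hypothetical values, not data from these slides) with a long tail towards the high values. It is only a sketch of the idea; the empirical relation is approximate and need not hold closely for every sample.

```python
from statistics import mean, median, mode

# Hypothetical, positively skewed scores (long tail towards high values).
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9]

sample_mean = mean(scores)
sample_median = median(scores)
sample_mode = mode(scores)

# Mode estimated from the relation Mode = 3 Median - 2 Mean.
estimated_mode = 3 * sample_median - 2 * sample_mean

print(sample_mean, sample_median, sample_mode)
print(round(estimated_mode, 2))
```

In this sample the mean is pulled furthest towards the tail, the median lies between, and the mode stays at the peak, which is the ordering expected for a positively skewed distribution.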

Perfect normal (unskewed distribution)

Positively Skewed Distribution

Negatively Skewed Distribution

Test of Variability
Meaning and Definition of Variability: A cursory look at the three figures
of normal curves shows that two random samples may
have a common mean and still differ in the scatter of their individual values. Two
groups may have widely different mean values but a comparable spread of
individual scores, and, most frequently, two groups differ in both their
mean values and their scatter.
Two groups with different variability but equal mean values

Meaning and Definition of Variability
The spread of a group’s individual scores from the lowest value to the highest value, or their
dispersion from the mean value, is known as the variability of the group under study.
Just as the ‘mean’ provides a single representative value which describes the level of
performance of the entire group, the measure of variability
summarily describes the spread or dispersion of the performance of individuals in the group.
Different mean value but similar variability

Meaning and Definition of Variability
Different mean value and different variability

For example, let us consider the following values of body height of 10 athletes and 10
basketball players, to understand more clearly the meaning and importance of
variability.
The table below records the height data along with the mean, range and standard deviation of
10 male athletes and 10 male basketball players:
Category | Individual height measured (cm) | Mean (cm) | Range (cm) | S. D. ± (cm)
Athletes (N=10) | 155, 159, 160, 163, 170, 174, 176, 178, 180 and 185 | 170 | 31 | 10.2
Basketball (N=10) | 166, 167, 168, 169, 170, 170, 171, 172, 173 and 174 | 170 | 9 | 2.6
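The figures in this table can be reproduced with a short Python sketch. Two assumptions are made and should be read as such: the seventh basketball height is taken as 171 cm, the value consistent with the stated mean of 170 cm, and the range is taken as highest minus lowest plus one, since that convention matches the tabulated ranges of 31 and 9. The standard deviation uses N – 1 in the denominator, which matches the tabulated 10.2 and 2.6.

```python
from statistics import mean, stdev

athletes = [155, 159, 160, 163, 170, 174, 176, 178, 180, 185]
basketball = [166, 167, 168, 169, 170, 170, 171, 172, 173, 174]


def describe(heights):
    """Return (mean, inclusive range, sample S.D. rounded to 1 decimal)."""
    inclusive_range = max(heights) - min(heights) + 1  # assumed convention
    return mean(heights), inclusive_range, round(stdev(heights), 1)


print(describe(athletes))    # (170, 31, 10.2)
print(describe(basketball))  # (170, 9, 2.6)
```

Both groups share the same mean, yet the athletes' heights are spread about four times as widely as those of the basketball players.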

Variance
It is also a reliable and useful measure of variability and may be defined as “the mean of
the squares of the deviations of scores from their mean”.
In other words, it is equal to the square of the standard deviation. Indeed, it is the last-but-one
step in the calculation of standard deviation and is used in inferential statistics.
Mathematically,
Variance = ∑(X – X bar)² / N or ∑(X – X bar)² / (N – 1), or ∑(X – μ)² / N, or
Variance = [∑X² – (∑X)²/N] / N (for N > 30) or [∑X² – (∑X)²/N] / (N – 1)
Where X = individual score; X bar = sample mean; μ = population mean; N = number of scores
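As a check that the deviation form and the raw-score form of the variance agree, here is a minimal Python sketch. It reuses the ten athletes' heights from the height table earlier in the deck; the variable names are illustrative only.

```python
scores = [155, 159, 160, 163, 170, 174, 176, 178, 180, 185]
n = len(scores)
mean_score = sum(scores) / n

# Deviation form: sum of squared deviations from the mean.
sum_sq_dev = sum((x - mean_score) ** 2 for x in scores)

# Raw-score form: sum of X squared minus (sum of X) squared over N.
sum_sq_raw = sum(x * x for x in scores) - sum(scores) ** 2 / n

variance_n = sum_sq_dev / n                # divisor N
variance_n_minus_1 = sum_sq_dev / (n - 1)  # divisor N - 1

print(sum_sq_dev, sum_sq_raw)          # 936.0 936.0
print(variance_n, variance_n_minus_1)  # 93.6 104.0
```

The square root of 104 is about 10.2, the standard deviation reported for the athletes, which suggests that the table uses the N – 1 divisor.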
Coefficient of Variation: This measure of variability is used to compare the variation
of two or more variables having different units of measurement, because it eliminates
the units by converting the standard deviation into a coefficient.
It may be defined as “the standard deviation expressed as a percentage of the mean value”.
It is computed by the following equation:
Coefficient of Variation (CV) = (S. D. / Mean) x 100
For example, observe the computation of CV for the assumed data given in the table
on the next slide.

Coefficient of Variation
Now observe the comparative values of the coefficient of variation for 100M, SBJ
and triceps skinfold scores presented in the histogram in the next graph. The values
of CV indicate clearly that triceps skinfold thickness is many times more variable than
SBJ or 100M performance, and that SBJ is more variable than 100M
performance. Such a clear picture of the relative variability is not visible if we
examine only the S. D. values of 100M running, standing broad jump and
triceps skinfold thickness (i.e., SD = 0.8 sec; SD = 20 cm; and SD =
3.2 mm).
Histogram of relative variability from the values of CV
Statistics | 100 M (sec) | SBJ (cm) | Triceps skinfold (mm)
Mean | 14.0 sec | 210 cm | 8.4 mm
S. D. | ± 0.8 sec | ± 20 cm | ± 3.2 mm
C. V. (S. D./mean x 100) | 0.8 x 100/14.0 = 40/7 = 5.71 | 20 x 100/210 = 200/21 = 9.52 | 3.2 x 100/8.4 = 800/21 = 38.10
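The three coefficients of variation can be recomputed with a few lines of Python. This is only a sketch: the dictionary layout and names are illustrative, while the means and standard deviations are the assumed values of the table above.

```python
def coefficient_of_variation(sd, mean_value):
    """Standard deviation expressed as a percentage of the mean."""
    return sd / mean_value * 100


# (mean, S.D.) pairs for each variable, in their own units.
measurements = {
    "100 M (sec)": (14.0, 0.8),
    "SBJ (cm)": (210, 20),
    "Triceps skinfold (mm)": (8.4, 3.2),
}

cv_values = {
    name: round(coefficient_of_variation(sd, mean_value), 2)
    for name, (mean_value, sd) in measurements.items()
}

for name, cv in cv_values.items():
    print(name, cv)
```

Because the units cancel in the ratio, the three results can be compared directly even though the variables are measured in seconds, centimetres and millimetres.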

Coefficient of Variation
The fundamental property of the coefficient of variation: It is a measure of relative
variation. In other words, it enables comparison of the variability of different variables
having widely differing mean values (or mean values expressed in different units of
measurement). Thus, CV enables comparison of the variability of different measurements
of a group on the one hand, and of the variability of different groups with respect to one variable
on the other hand.
Absolute and Relative Variability: After studying the various measures of variability, the reader
should now be ready to understand the difference between absolute and relative
variability. Each has its own separate meaning and utility.
(a) Absolute Variability: The range, A. D., Q. D., S. D. and variance represent the
absolute variability of a parameter. In other words, absolute variability is the quantity of
variation expressed directly in the units of measurement.
(b) Relative Variability: The CV represents the relative variability of different variables
irrespective of their units of measurement.

Relationship between average deviation, quartile deviation and standard deviation
There is a definite relationship between the above-mentioned three tests of variability. For
normally distributed variables, the Quartile Deviation is the smallest, the Average
Deviation comes next, and the numerical value of the Standard Deviation is the largest.
Representing the variability of a normal distribution symbolically:
QD < AD < SD
Further, approximately, QD = 2 SD/3 or QD = 5 AD/6
AD = 4 SD/5 or AD = 6 QD/5
SD = 5 AD/4 or SD = 3 QD/2
The percentage of individual scores covered within one unit of
QD / AD / SD on either side of the mean is given below:
1. X bar ± 1.0 QD = 50% of the total number of scores
2. X bar ± 1.0 AD = 57.51% of the total number of scores
3. X bar ± 1.0 SD = 68.27%, that is about 2/3 of the total number of values or scores of a
normal distribution
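These percentages can be checked against the standard normal distribution with Python's statistics.NormalDist. The sketch below assumes the exact normal-curve values of QD and AD in units of SD, namely QD ≈ 0.6745 SD (the 75th percentile) and AD = √(2/π) SD ≈ 0.7979 SD; the fractions 2/3 and 4/5 used in the relations above are convenient roundings of these.

```python
from math import pi, sqrt
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def coverage(half_width_in_sd):
    """Proportion of a normal distribution within mean ± half_width_in_sd."""
    return (standard_normal.cdf(half_width_in_sd)
            - standard_normal.cdf(-half_width_in_sd))


qd_in_sd = standard_normal.inv_cdf(0.75)  # about 0.6745
ad_in_sd = sqrt(2 / pi)                   # about 0.7979

for label, width in (("QD", qd_in_sd), ("AD", ad_in_sd), ("SD", 1.0)):
    print(label, round(width, 4), round(coverage(width) * 100, 2))
```

The printed coverages come out at roughly 50%, 57.5% and 68.27%, in line with the list above.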

Normal Probability Curve
The word normal is commonly used in everyday life to describe the average value of a
variable within a desirable range. In other words, a person having height, weight,
intelligence or motor ability close to the average value is known as normal.
Individuals having lower values than the average are called below average and
those having higher values than the average are called above average. Most
biological, psychological and motor performance variables are distributed in such a way
that the majority of individuals possess average or nearly average values and only a few
individuals possess values which are far from the average. In other
words, the frequency on either side of the average goes on decreasing as the
difference between the average and the individual values goes on increasing. Such a
distribution, when plotted with frequency on the vertical axis (i.e., Y axis) and size
of the variable on the horizontal axis (or X axis), gives rise to a bell shaped curve. A
perfectly bell shaped curve is known as the normal probability curve or simply the ‘normal
curve’, and the distribution of scores or values giving rise to a normal curve is
known as the normal probability distribution or simply the ‘normal distribution’.

A perfect normal curve
Statistically, the areas under a perfect normal curve are constant
and perfectly defined, as percentages of the values of a distribution lying within
given multiples of the standard deviation. The actual distribution based on samples, however, is
seldom perfect and only approaches perfection, and this is where the word probability is
used. Probability values are illustrated in the exact distribution of a perfect normal curve
shown in the above figure. The accuracy of a sample distribution is found by comparing
the probable values with the actual experimental observations.

Meaning and Definition of Probability
In order to understand properly the implication of probable values, it is important to
explain the meaning and definition of probability.
Probability is quite a common term in everyday life. The majority of statistical
inferences and conclusions are based on the comparison of experimentally observed
values with statistical probabilities. Routinely we use the term probability
together with a percentage value, as in the following sentences:
While tossing a coin there is a 50% probability that it will come up heads and a
50% probability that it will come up tails.
There is an 80% probability that this goalkeeper will block all shots on his goal.
From a deck of playing cards there is a 50% probability that the first card drawn
is red. There is a 25% probability that it is a diamond.
Thus, from the above events, it may be concluded that
probability may be defined as “the natural chance of occurrence of an event in a single
trial expressed in terms of its percentage value with respect to total possible outcomes”.
Minium (1978) defines probability in the following alternative ways:
“Given a population of possible outcomes, each of which is equally likely to occur, the
probability of occurrence on a single trial of an outcome characterized by A is equal to
the number of outcomes yielding A, divided by the total number of possible outcomes”

“The probability of the occurrence of any event or any situation(say A) on a single trial
is the proportion of trials characterized by A in an infinite series of trials, when each
trial is conducted in a like manner”
“The probability of occurrence of any one of several particular outcomes is the sum of
their individual probabilities, provided that they are mutually exclusive”
“The probability of several particular outcomes occurring jointly is the product of their
separate probability provided that the events that generate these outcomes are
independent”
For example, in a deck of 52 cards there are 4 Aces. On drawing a card at random there
are only four chances out of 52 that the drawn card will be an Ace. Hence, the
probability of drawing an Ace from a pack of 52 cards is 4/52 x 100 = 7.6923%.
Similarly, there are 13 cards each of clubs, diamonds, spades and hearts. Hence the
probability of drawing a diamond from a pack of cards is 13/52 x 100 = 25%,
and so on. With a clear concept of probability as explained above, it is
pertinent to describe the difference between the above-mentioned theoretical probability
and empirical (practical, i.e., observed or experimental) probability.
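The card examples, together with the addition and multiplication rules quoted from Minium, can be sketched in Python with exact fractions. The helper names are illustrative; the "red card" case reuses the 50% figure mentioned earlier, and the two-coin case is an added illustration rather than an example from the slides.

```python
from fractions import Fraction


def probability(favourable, total):
    """Favourable outcomes divided by equally likely total outcomes."""
    return Fraction(favourable, total)


def as_percent(p):
    return float(p * 100)


p_ace = probability(4, 52)
p_diamond = probability(13, 52)
p_heart = probability(13, 52)

# Addition rule: a card cannot be both a diamond and a heart
# (mutually exclusive), so the probabilities add.
p_red = p_diamond + p_heart

# Multiplication rule: two independent coin tosses both landing heads.
p_two_heads = probability(1, 2) * probability(1, 2)

print(round(as_percent(p_ace), 4))  # 7.6923
print(as_percent(p_diamond))        # 25.0
print(as_percent(p_red))            # 50.0
print(as_percent(p_two_heads))      # 25.0
```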

Theoretical probability is calculated on logical grounds, as explained
in the definition of probability above.
Empirical probability is not calculated in advance but is based on the actual experience of
obtaining observations while experimenting.
Inferences in statistics are mainly based upon the comparison of empirical
probability with theoretical probability.
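The difference between the two kinds of probability can be illustrated with a simulated coin-tossing experiment in Python. This is only a sketch: the function name, the seed and the numbers of tosses are arbitrary choices made for the illustration, and a simulated coin stands in for a real experiment.

```python
import random


def empirical_probability_of_heads(n_tosses, seed=0):
    """Proportion of heads observed in n_tosses simulated fair tosses."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses


theoretical = 0.5
for n_tosses in (10, 100, 10_000):
    observed = empirical_probability_of_heads(n_tosses)
    print(n_tosses, observed, round(abs(observed - theoretical), 4))
```

With few tosses the observed proportion may differ noticeably from 0.5; with many tosses it tends to settle close to the theoretical value.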
It may be mentioned here that due to the normal distribution of common human
attributes like beauty, wealth, health, height, weight, intelligence etc., the probability is
that the majority among human beings possess average beauty, health, wealth, height,
weight, intelligence etc. According to the pattern of normal distribution, only a very few
persons have a probability of possessing extreme (marked) deviations from average
values of these attributes.

As mentioned above, the majority of biological, psycho-motor and physiological
variables are distributed in a fashion characterized by the normal probability curve
(more commonly known as the normal curve). It is therefore important to describe,
illustrate and explain briefly the normal probability curve along with its principles,
applications etc.
The details of variability, along with the meaning and definition of standard deviation,
have already been dealt with. It is sufficient to mention here that standard deviation
is a measure of variability and is used to describe the theoretical probability of the
normal distribution curve.
It may be noted that many of the procedures in inferential statistics are based on the
assumption of normal distribution of the variables concerned. Hence, it is very relevant
to discuss here: What is a normal curve? What are its principles? What are its uses,
especially in answering statistical questions?

Meaning of the Normal Curve: The normal curve is a graphical representation of the
normal distribution, while the normal distribution is a set of scores which, when
represented graphically, gives the normal curve. Thus, the normal distribution
and the normal curve are one and the same. The normal curve is also a mathematical
abstraction defined by the following equation (Minium 1978):
Y = [N / (σ√(2π))] x e^(–x² / (2σ²))
Where Y = frequency of a given x
N = total number of cases
x = X – X bar
X = raw scores (individual values)
X bar = mean of raw scores
σ = standard deviation of the distribution of raw scores
π = 3.1416 (the ratio of the circumference of a circle to its diameter)
e = 2.7183 (base of the Napierian system of logarithms)
Thus, π and e are mathematical constants.
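The equation can be written as a small Python function. This is a minimal sketch: the function name is illustrative, and the mean of 170 cm, standard deviation of 10 cm and 100 cases in the example are hypothetical values chosen only to show the shape of the curve.

```python
from math import exp, pi, sqrt


def normal_curve_ordinate(raw_score, mean_score, sd, n_cases):
    """Height Y of the theoretical normal curve at a given raw score."""
    x = raw_score - mean_score  # deviation from the mean, x = X - X bar
    return n_cases / (sd * sqrt(2 * pi)) * exp(-x ** 2 / (2 * sd ** 2))


# Hypothetical distribution: mean 170 cm, S.D. 10 cm, 100 cases.
for height in (150, 160, 170, 180, 190):
    print(height, round(normal_curve_ordinate(height, 170, 10, 100), 3))
```

The ordinate is largest at the mean and falls away symmetrically on either side, which is what gives the curve its bell shape.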

The variables which determine the value of Y are x, SD (σ) and
N. The above equation gives an ideal theoretical normal curve which is completely bell
shaped, as against the empirical normal curve obtained by plotting actually observed
data. Since the data of coin tossing or dice throwing, which involve chance or probability
values, give rise when plotted on graph paper to a bell shaped curve with all the
properties of a normal curve, this normal curve is known as the Normal Probability Curve.
In the early nineteenth century, Gauss derived the normal curve while working on
experimental error in his astronomy experiments.
The normal probability curve is also popularly known as the Gaussian curve, and the normal
distribution as the Gaussian distribution, after Gauss.
