Tabulation of Data, Frequency Distribution, Contingency table
1.
2. Principles of presentation of data:
1. Increase interest of reader.
2. Concise without losing important details.
3. Presented in simple form.
4. Facilitate further analysis.
5. Define the problem & suggest its solution.
3. Text
Tabulation
Diagrams & Graphs
Tabulation is the first step in the analysis of
data
4. In statistics, a frequency distribution is a table that displays the frequency of various
outcomes in a sample. Each entry in the table contains the frequency or count of the
occurrences of values within a particular group or interval, and in this way, the table
summarizes the distribution of values in the sample.
A frequency distribution shows us a summarized grouping of data divided into
mutually exclusive classes and the number of occurrences in each class. It is a way of
organizing raw (unorganized) data.
5. After collecting data, the first task for a
researcher is to organize and simplify the data
so that it is possible to get a general overview
of the results.
This is the goal of descriptive statistical
techniques.
One method for simplifying and organizing
data is to construct a frequency distribution.
6. A frequency distribution is an organized
tabulation showing exactly how many
individuals are located in each category on the
scale of measurement. A frequency
distribution presents an organized picture of
the entire set of scores, and it shows where
each individual is located relative to others in
the distribution.
7. A frequency distribution table consists of at
least two columns - one listing categories on
the scale of measurement (X) and another for
frequency (f).
In the X column, values are listed from the
highest to lowest, without skipping any.
For the frequency column, tallies are
determined for each value (how often each X
value occurs in the data set). These tallies are
the frequencies for each X value.
The sum of the frequencies should equal N.
8. Frequency distribution: the way in which observations are distributed
into various classes.
Types of frequency distribution:
discrete frequency distribution
continuous frequency distribution
To make counting easier, we use tally marks.
9. When a frequency distribution table lists
all of the individual categories (X values),
i.e. the variable is discrete, it is called a
discrete frequency distribution.
10. Ex: II MBBS students conducted a family health
survey (FHS) and recorded the number of children
in 40 families as below.
Prepare the frequency table for the given data and draw
your conclusion from it.
0 2 1 3 2 1 2 1
2 1 2 2 1 2 1 2
2 2 1 0 2 1 2 1
2 2 1 2 1 2 1 2
2 1 2 3 1 2 1 0
11. Soln: Let us consider the variable
X: number of children in a family
S = smallest value = minimum number of children = 0,
L = largest value = maximum number of children = 3.
No. of children (X)   Tally Marks            Frequency (f)
0                     III                    3
1                     IIII IIII IIII         15
2                     IIII IIII IIII IIII    20
3                     II                     2
Total                                        N = 40
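The tallying above can be cross-checked with a short script. This is a minimal sketch, assuming the 40 values are exactly those listed in the exercise; the names `children`, `freq` and `N` are illustrative and not part of the original slides.

```python
from collections import Counter

# Number of children recorded for the 40 families, in the order listed above.
children = [
    0, 2, 1, 3, 2, 1, 2, 1,
    2, 1, 2, 2, 1, 2, 1, 2,
    2, 2, 1, 0, 2, 1, 2, 1,
    2, 2, 1, 2, 1, 2, 1, 2,
    2, 1, 2, 3, 1, 2, 1, 0,
]

freq = Counter(children)   # maps each X value to how often it occurs
N = sum(freq.values())     # the frequencies should add up to N

print("X   f")
for x in sorted(freq):
    print(f"{x}   {freq[x]}")
print("N =", N)
```

Counting by program plays the same role as tally marks: each observation is visited once and added to the count for its X value.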
12.
No. of family members   Tally bars             Frequency
1                       IIII                   5
2                       IIII IIII              10
3                       IIII IIII IIII         15
4                       IIII IIII IIII IIII    20
5                       IIII IIII              10
6 & more                IIII                   5
Total                                          N = 65
13. Sometimes, however, a set of scores covers a
wide range of values. In these situations, a
list of all the X values would be quite long -
too long to be a “simple” presentation of the
data.
To remedy this situation, a grouped
frequency distribution table is used.
14. In a grouped table, the X column lists groups
of scores, called class intervals, rather than
individual values.
These intervals all have the same width,
usually a simple number such as 2, 5, 10, and
so on.
The interval width must be the same for all classes.
Groups should be neither too broad nor too narrow.
The number of groups should be between 5 and 15.
15. 1. Range (R) – the difference between the highest
score and the lowest score.
2. Class Interval (k) – a grouping or category defined
by a lower limit and an upper limit.
3. Class Boundaries (CB) – these are also known as the
exact limits, and can be obtained by subtracting 0.5
from the lower limit of an interval and adding 0.5 to
the upper limit of the interval.
16. 4. Class Mark (x) – is the middle value or the midpoint of a
class interval. It is obtained by getting the average of the
lower class limit and the upper class limit.
5. Class Size (i) – is the difference between the upper
class boundary and the lower class boundary of a
class interval
7. Class Frequency – it refers to the number of
observations belonging to a class interval, or the
number of items within a category.
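The definitions of class boundaries, class mark and class size can be illustrated with a small helper. This is only a sketch: the function name `describe_class`, the `unit` parameter and the interval 51-55 are assumptions chosen for illustration, and the 0.5 adjustment applies to whole-number scores.

```python
def describe_class(lower, upper, unit=1):
    """Return boundaries, class mark and class size for one class interval.

    `unit` is the measurement precision (1 for whole-number scores), so the
    boundaries lie half a unit outside the stated class limits.
    """
    half = unit / 2
    lower_boundary = lower - half               # lower limit minus 0.5
    upper_boundary = upper + half               # upper limit plus 0.5
    class_mark = (lower + upper) / 2            # midpoint of the class limits
    class_size = upper_boundary - lower_boundary
    return lower_boundary, upper_boundary, class_mark, class_size


# Hypothetical interval 51-55 for whole-number scores.
print(describe_class(51, 55))   # (50.5, 55.5, 53.0, 5.0)
```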
18. Steps in Constructing a Frequency Distribution
1. Find the range, R, using the formula:
R = Highest Score – Lowest Score
2. Compute the number of class intervals, k, using
the formula:
k = 1 + 3.3 log n
where n is the number of observations and log is the base-10 logarithm.
19. Note: The ideal number of class intervals
is 5 to 15. Fewer than 8 intervals are
recommended for data with fewer than 50
observations. For data with 50 to
100 observations, the suggested
number is greater than 8.
20. 3. Compute the class size, i, using the formula:
i = R/k
Note that too few class intervals
crowd the data, while too many
class intervals spread the data out too much.
4. Using the lowest score as the lower limit, add (i – 1) to
it to obtain the upper limit of the first class
interval.
21. 5. The lower limit of the second interval is obtained by
adding the class size to the lower limit of the first interval.
Add (i – 1) to the result to obtain the upper limit of the
second interval.
6. Repeat step 5 to obtain the third class interval, and so
on.
7. When the k class intervals are completed, determine the
frequency for each class interval by counting the
observations that fall in it.
22. Solution:
1. R = Highest Score – Lowest Score
R = 90 – 51
R = 39
2. k = 8 (desired number of intervals)
3. i = R/k
i = 39/8
i = 4.875
i ≈ 5 (rounded up to a whole number)
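The arithmetic of this solution, together with steps 4 to 6 of the construction procedure, can be sketched in a few lines. The variable names are illustrative. The intervals are built strictly by the "lowest score plus (i – 1)" rule from the steps. The last lines evaluate k = 1 + 3.3 log n for comparison, assuming n = 50 observations (the number of students in this example).

```python
import math

highest, lowest = 90, 51   # scores used in the solution above
k = 8                      # desired number of class intervals

R = highest - lowest       # step 1: range
i = math.ceil(R / k)       # step 3: class size, rounded up to a whole number

# Steps 4-6: start at the lowest score; each upper limit is lower + (i - 1),
# and the next lower limit is the previous lower limit plus the class size.
intervals = []
lower = lowest
for _ in range(k):
    intervals.append((lower, lower + i - 1))
    lower += i

# For comparison: k from the formula, assuming n = 50 observations.
k_formula = 1 + 3.3 * math.log10(50)

print("R =", R, " i =", i)
print(intervals)
print("k from formula =", round(k_formula, 2))
```

With these inputs the formula suggests about 7 intervals, so the choice of k = 8 in the solution is a rounded, convenient value rather than the exact output of the formula.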
23. The Frequency Distribution of the Statistics Score of 50
Students
Class Interval (LL - UL)   Tally Marks   f
50 - 55                    IIII          4
55 - 60                    III           3
60 - 65                    IIII          4
65 - 70                    IIII IIII     10
70 - 75                    IIII IIII     9
75 - 80                    IIII II       7
80 - 85                    IIII          5
85 - 90                    IIII III      8
Total                                    N = 50
24. Cumulative Frequency Distribution
The Frequency Distribution of the Statistics Score of 50 Students
Class Interval (LL - UL)   f    <cf   >cf
50 - 55                    4    4     50
55 - 60                    3    7     46
60 - 65                    4    11    43
65 - 70                    10   21    39
70 - 75                    9    30    29
75 - 80                    7    37    20
80 - 85                    5    42    13
85 - 90                    8    50    8
Total                      50
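The two cumulative columns are running totals of f taken from opposite ends of the table. The sketch below reproduces them from the class frequencies; the list names are illustrative.

```python
from itertools import accumulate

classes = ["50-55", "55-60", "60-65", "65-70",
           "70-75", "75-80", "80-85", "85-90"]
f = [4, 3, 4, 10, 9, 7, 5, 8]

# <cf: running total from the first class downwards.
less_than_cf = list(accumulate(f))
# >cf: running total from the last class upwards, flipped back into table order.
greater_than_cf = list(accumulate(reversed(f)))[::-1]

for row in zip(classes, f, less_than_cf, greater_than_cf):
    print(*row)
```

The last <cf value and the first >cf value should both equal N, which is a quick check on the column sums.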
25. Cumulative frequency
Two types: less than cumulative frequency (l.c.f.) and greater than cumulative frequency (g.c.f.).
l.c.f. denotes the number of observations whose values
are less than the upper limit of the class.
g.c.f. denotes the number of observations whose values
are greater than or equal to the lower limit of the class.
26. Cumulative Frequency distribution
Marks      Frequency   l.c.f.       g.c.f.
0 - 50     5           5            60+5=65=N
50 - 60    10          5+10=15      50+10=60
60 - 70    15          15+15=30     35+15=50
70 - 80    20          30+20=50     15+20=35
80 - 90    10          50+10=60     10+5=15
90 - 100   5           60+5=65=N    5
28. A two-way table presents categorical data by
counting the number of observations that fall
into each group for two variables, one divided
into rows and the other divided into columns.
29. There are 40 students in batch D of II
MBBS, of whom 25 are boys. 18
boys and 13 girls live in the hostel. Prepare
a contingency table of current residential status
by gender for batch D.
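One way to approach this exercise is to fill in the given figures and derive the remaining cells by subtraction. The sketch below does that. The column label "Other" for students not living in the hostel is an assumption, as are all variable names.

```python
# Figures given in the exercise.
total_students = 40
boys = 25
boys_hostel = 18
girls_hostel = 13

# Derived cells: everything else follows by subtraction.
girls = total_students - boys
boys_other = boys - boys_hostel
girls_other = girls - girls_hostel

# Rows: gender; columns: hostel, other (not in hostel), row total.
table = {
    "Boys": [boys_hostel, boys_other, boys],
    "Girls": [girls_hostel, girls_other, girls],
}
table["Total"] = [b + g for b, g in zip(table["Boys"], table["Girls"])]

print(f"{'Gender':<8}{'Hostel':>8}{'Other':>8}{'Total':>8}")
for gender, row in table.items():
    print(f"{gender:<8}{row[0]:>8}{row[1]:>8}{row[2]:>8}")
```

The grand total in the bottom-right cell should come back to 40, which confirms that the row and column margins agree.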
31. Ex. In a study of susceptibility to diphtheria, 982
children were studied in Mumbai. Out of 494 boys,
63 were Schick positive, i.e. susceptible. 807
children were Schick negative, and out of these 376
were girls.
Prepare a table for the given information.
32. Classification of children according to sex and
susceptibility to diphtheria
Sex      Schick +ve   Schick -ve   Total
Boys     63           ??           494
Girls    ??           376          ??
Total    ??           807          N = 982
33.
Sex      Schick +ve       Schick -ve      Total
Boys     63               494-63=431      494
Girls    488-376=112      376             982-494=488
Total    63+112=175       807             N = 982