3. Editing
• “Editing is a step whereby researchers eliminate errors
or points of confusion in the raw data”.
• “Editing detects errors, corrects them when possible,
and certifies that minimum data quality standards have
been achieved”.
Dr. Amitabh Mishra
4. Objectives of Editing
• The purpose of editing is to guarantee that the
data are:
1. Accurate
2. Complete
3. Uniformly entered
4. Consistent with the intent of the questions
5. Arranged to simplify coding and tabulation
5. NEED FOR EDITING
Editing is needed because:
1. Parts of the questionnaire may be incomplete
2. The pattern of responses may indicate that the respondent did
not understand or follow the instructions
3. The responses show little variance
4. One or more pages are missing
5. The questionnaire is answered by someone who does not
qualify for participation
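A minimal sketch of how two of the checks above (incomplete answers and little variance) might be automated. The function name `editing_flags`, the record layout, and the variance threshold are illustrative assumptions, not part of the original slides.

```python
from statistics import pvariance


def editing_flags(answers, min_variance=0.1):
    """Return a list of editing problems found in one questionnaire.

    `answers` maps a question id to a numeric rating, or None if unanswered.
    `min_variance` is an assumed cut-off below which ratings count as
    showing "little variance".
    """
    flags = []
    # Check 1: any unanswered question makes the questionnaire incomplete.
    if any(value is None for value in answers.values()):
        flags.append("incomplete")
    # Check 3: nearly identical ratings suggest careless answering.
    ratings = [value for value in answers.values() if value is not None]
    if len(ratings) > 1 and pvariance(ratings) < min_variance:
        flags.append("little variance")
    return flags


print(editing_flags({"q1": 4, "q2": 4, "q3": 4, "q4": None}))
```

Checks such as missing pages or unqualified respondents depend on how the questionnaire is stored, so they are left out of this sketch.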
6. Stages of Editing
• Editing can be done in two stages:
1. Field editing
2. Central editing
• Field editing is the responsibility of the field supervisor. During data
collection, field workers and respondents often use abbreviations
and special symbols. Soon after the data have been gathered, the
interviewer must review the questionnaire.
7. • After the field work is done, trained and experienced
editors check and edit each questionnaire
thoroughly.
• Editors identify inconsistencies between the
answers.
• The editor's task is also to identify fake interviews (fake
interviews can be identified by checking responses to open-ended
questions).
8. Treatment of Unsatisfactory Results
1. Returning to the Field
(Questionnaires with unsatisfactory responses may be returned to the field,
where the interviewers re-contact the respondents.)
2. Assigning Missing Values
(If returning the questionnaires to the field is not feasible, the editor may assign
missing values to unsatisfactory responses.)
3. Discarding Unsatisfactory Respondents
(In this approach, respondents with unsatisfactory responses are simply
discarded.)
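The second and third treatments can be sketched in code as follows. This is only an illustration: the missing-value code `9`, the function names, and the `max_missing` cut-off are hypothetical choices, not values given in the slides.

```python
MISSING = 9  # hypothetical code reserved for "missing value"


def assign_missing(record, valid_codes):
    """Treatment 2: replace every unsatisfactory answer with MISSING."""
    return {q: (a if a in valid_codes else MISSING) for q, a in record.items()}


def discard_unsatisfactory(records, valid_codes, max_missing=1):
    """Treatment 3: keep only respondents with few unsatisfactory answers."""
    kept = []
    for record in records:
        bad = sum(1 for a in record.values() if a not in valid_codes)
        if bad <= max_missing:
            kept.append(record)
    return kept


valid = {1, 2, 3}
records = [
    {"q1": 1, "q2": 3, "q3": 2},        # fully usable
    {"q1": None, "q2": 7, "q3": None},  # three unsatisfactory answers
]
print(assign_missing(records[1], valid))
print(len(discard_unsatisfactory(records, valid)))
```

Which treatment is appropriate depends on how many responses are affected; the threshold here is arbitrary.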
10. Coding
“Coding means assigning a code, usually a number, to each
possible response to each question”.
“Coding involves assigning numbers or other symbols to
answers so that the responses can be grouped into a limited
number of classes or categories”- Cooper & Schindler
11. Example
S.N. Category Code
1 Male 1
2 Female 2
S.N. Category Code
1 Male M
2 Female F
S.N. Category Code
1 Male
2 Female
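As a rough illustration of the numeric scheme in the first table, a codebook can be held as a simple lookup. The names `GENDER_CODES` and `code_responses` are assumptions made for this sketch.

```python
# Codebook taken from the first example table: Male -> 1, Female -> 2.
GENDER_CODES = {"Male": 1, "Female": 2}


def code_responses(responses, codebook):
    """Translate raw answers into their codes using the given codebook."""
    return [codebook[response] for response in responses]


print(code_responses(["Female", "Male", "Female"], GENDER_CODES))  # [2, 1, 2]
```

Swapping the dictionary values for "M" and "F" gives the letter-based scheme of the second table without changing the function.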
12. Rules of Coding
1. Appropriateness: categories should be
appropriate to the research problem and objectives.
2. Exhaustiveness: there should be a class for
every data item. The researcher often uses
an “other” option for this purpose.
13. 3. Mutual exclusivity: each specific answer should be placed in
one and only one category.
Ex.: In an occupation survey, a classification that is not mutually exclusive
(a respondent may fit more than one group) may be:
a) Professional
b) Managerial
c) Sales
d) Clerical
e) Craft
f) Operative
g) Unemployed
14. Coding close-ended questions
• Dichotomous and multiple-choice questions have predefined response
categories.
• When coding such questions, a numerical code is assigned to
each response category.
Response category   Code
Yes                 1
Do not know         2
No                  3

Response category   Code
Male                1
Female              2
15. Coding open-ended questions
• The researcher should review each open-ended question and
establish meaningful categories.
Ex.: How many cups of coffee/tea do you drink in a day?
If respondent answered   Response category   Code
More than 5 cups/day     Heavy consumer      1
Between 2-5 cups/day     Moderate consumer   2
Less than 2 cups/day     Light consumer      3
0 cups/day               Non-consumer        4
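The category scheme above can be sketched as a small function. Two assumptions are made here because the table leaves them open: an answer of 0 is coded as non-consumer rather than light consumer, and "between 2-5" includes both 2 and 5. The function name `code_consumption` is hypothetical.

```python
def code_consumption(cups_per_day):
    """Map a reported number of cups per day to the category code."""
    if cups_per_day == 0:
        return 4  # Non-consumer (assumed to take priority over "less than 2")
    if cups_per_day < 2:
        return 3  # Light consumer
    if cups_per_day <= 5:
        return 2  # Moderate consumer (assumed inclusive of 2 and 5)
    return 1      # Heavy consumer: more than 5 cups/day


for cups in (0, 1, 3, 7):
    print(cups, code_consumption(cups))
```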
17. • “A table is a systematic arrangement of statistical data
in columns and rows”.
• “Tabulation is a process whereby raw data on
completed questionnaires are transformed into the list
of needed information”.
• The purpose of a table is to simplify presentation and
facilitate comparison.
19. Significance of Tabulation
1. It simplifies complex data
2. It facilitates comparison
3. It gives identity to the data
4. It reveals patterns
20. Parts of Table
1. Table number
2. Title of table
3. Caption
4. Stub
5. Body of table
6. Head notes
7. Foot notes
22. Types of Tabulation
1. Uni-variate Tabulation
2. Bi-variate Tabulation or Multivariate tabulation
23. Univariate Tabulation
• “Uni-variate tabulation counts the answers to one question”
• Such a tabulation results in a frequency distribution of
answers, for example:
– No. of people who answered in the first response category
– No. of people who answered in the second response category, etc.
24. • What is your opinion regarding mandatory fitting of airbags,
GPRS systems, and seat belts in all vehicles in the country?
– In favor of
– Indifferent towards
– Opposed to
                      Number   Percent (%)
In favor of           55       34.4
Indifferent towards   31       19.4
Opposed to            74       46.2
Total                 160      100
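A frequency distribution like the one above can be produced with a short counting routine. This is a sketch only; the function name `frequency_table` is an assumption, and the answer list is rebuilt from the counts in the table rather than from real questionnaires.

```python
from collections import Counter


def frequency_table(answers):
    """Return {category: (number, percent)} for one question's answers."""
    counts = Counter(answers)
    total = sum(counts.values())
    return {category: (n, 100 * n / total) for category, n in counts.items()}


# Answer list reconstructed from the counts shown in the table.
answers = (
    ["In favor of"] * 55
    + ["Indifferent towards"] * 31
    + ["Opposed to"] * 74
)
table = frequency_table(answers)
for category, (number, percent) in table.items():
    print(f"{category:<20} {number:>4} {percent:>6.1f}")
```

For example, 55 of 160 respondents works out to 34.375 percent, which the table reports as 34.4.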
25. Approach B
                      Number
In favor of           55
Indifferent towards   31
Opposed to            74
Total                 160
Approach C
                      Percent (%)
In favor of           34.4
Indifferent towards   19.4
Opposed to            46.2
Total                 100
26. Bi-variate Tabulation or Multivariate Tabulation
• In bi-variate or multivariate tabulation, the
researcher simultaneously tabulates the responses to
two or more questions.
27. EXAMPLE
1. What is your gender?
a) Male
b) Female
2. How often do you use credit cards when purchasing pizza at Dominos?
a) Regularly
b) Occasionally
c) Never
28. Use of credit cards for purchase of Dominos Pizza
               Male                    Female
Usage rate     Number   Percent (%)    Number   Percent (%)
Regularly      20       10             100      50
Occasionally   60       30             80       40
Never          120      60             20       10
Total          200      100            200      100
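A bi-variate table of this kind can be sketched by counting pairs of answers and expressing each cell as a percentage of its column total. The function name `bivariate_table` is an assumption, and the answer pairs are rebuilt from the counts in the table.

```python
from collections import Counter


def bivariate_table(pairs):
    """Count (row_category, column_category) pairs.

    Returns {(row, column): (number, percent_of_column_total)}.
    """
    counts = Counter(pairs)
    column_totals = Counter()
    for (_, column), n in counts.items():
        column_totals[column] += n
    return {
        (row, column): (n, 100 * n / column_totals[column])
        for (row, column), n in counts.items()
    }


# (usage rate, gender) pairs reconstructed from the table.
pairs = (
    [("Regularly", "Male")] * 20 + [("Regularly", "Female")] * 100
    + [("Occasionally", "Male")] * 60 + [("Occasionally", "Female")] * 80
    + [("Never", "Male")] * 120 + [("Never", "Female")] * 20
)
result = bivariate_table(pairs)
print(result[("Regularly", "Male")])    # (20, 10.0)
print(result[("Regularly", "Female")])  # (100, 50.0)
```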
30. • There are no hard and fast rules for preparing a statistical table.
• “In collection and tabulation, common sense is the chief requisite
and experience is the chief teacher.” - Prof. Bowley
• However, the following points should be borne in mind while
preparing a table:
1. The table must contain all the essential parts, such as table number,
title, head note, and caption.
2. The table should be simple to understand.
3. It should also be compact, complete, and self-explanatory.
4. The table should be of proper size.
31. 5. Indicate a zero quantity by a zero, and do not use zero to indicate
information that is not available.
6. Where information is not available, write N.A. or
indicate it by a dash (-).
7. Ditto marks (,,) should be avoided in a table. Similarly, the expression
‘etc’ should not be used in a table.
8. The table should not be overloaded with details.
9. Abbreviations should be avoided, particularly in titles and sub-titles.
10. In all tables the captions and stubs should be arranged in some
systematic manner. (The order of presentation may be alphabetical
or chronological, depending upon the requirement.)
32. 11. The unit of measurement should be mentioned in the head
note.
12. The figures should be rounded off to the nearest hundred,
thousand, or lakh.
13. There should be a proper title for each table. It should tell
exactly what the table presents.
34. • “Cross tabulation (or crosstabs) is a statistical process that
summarizes categorical data to create a contingency table”.
• “Cross tabulation is a technique for comparing data from two
or more categorical variables”.
• Cross tabulation provides a basic picture of the interrelation
between two variables and can help find interactions between
them.
35. • While a frequency distribution describes one variable at a
time, a cross-tabulation describes two or more variables
simultaneously.
• It helps us to understand how one variable (such as brand) is
related to another variable (such as gender). For example, answers to the
following questions can be determined by cross tabulation:
1. How many brand-loyal users are male?
2. Is familiarity with a new product related to age and education level?
3. Is product use (heavy users, medium users, light users, and non-users) related
to interest in outdoor activities (high, medium, and low)?
36. Significance of Cross tabulation
1. A cross-tabulation gives you a basic picture of how two
variables inter-relate.
2. It can be easily interpreted and understood by managers who
are not statistically oriented.
3. A series of cross tabulations can provide greater insight into a
complex phenomenon.
37. Example- Gender and Internet Usage
• Suppose we are interested in determining whether internet
usage is related to gender.
• For the purpose of cross-tabulation, respondents can be
classified as:
– Male and female.
– Light users (whose reported use is less than 5 hrs.) and heavy users
(whose reported use is more than 5 hrs.)
38. Gender & Internet usage
                 Gender
Internet usage   Male   Female   Row total
Light (1)        5      10       15
Heavy (2)        10     5        15
Column total     15     15       30
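The row and column totals of a cross-tabulation follow directly from the cell counts. A minimal sketch, assuming the table is stored as nested dictionaries; the name `with_margins` and this layout are illustrative choices.

```python
def with_margins(table):
    """Compute margins for a table stored as {row_label: {column_label: count}}.

    Returns (row_totals, column_totals, grand_total).
    """
    row_totals = {row: sum(cols.values()) for row, cols in table.items()}
    column_totals = {}
    for cols in table.values():
        for column, n in cols.items():
            column_totals[column] = column_totals.get(column, 0) + n
    return row_totals, column_totals, sum(row_totals.values())


# Cell counts from the Gender & Internet usage table.
usage_by_gender = {
    "Light": {"Male": 5, "Female": 10},
    "Heavy": {"Male": 10, "Female": 5},
}
print(with_margins(usage_by_gender))
# ({'Light': 15, 'Heavy': 15}, {'Male': 15, 'Female': 15}, 30)
```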
39. Types of Cross tabulation
• Cross tabulation can be:
1. Cross tabulation with two variables (bi-variate
cross tabulation)
2. Cross tabulation with three variables
40. Two Variables Cross-Tabulation
• Since two variables have been cross-classified, percentages could be
computed either
– Column wise, based on column totals, or
– Row wise, based on row totals.
• The general rule is to compute the percentages in the direction of
the independent variable, across the dependent variable.
41. Internet Usage by Gender: Percentage calculation column wise
                 Gender
Internet usage   Male     Female
Light            33.3%    66.7%
Heavy            66.7%    33.3%
Column total     100%     100%
42. Gender by Internet Usage: Percentage calculation row wise
           Internet usage
Gender     Light    Heavy    Total
Male       33.3%    66.7%    100.0%
Female     66.7%    33.3%    100.0%
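The column-wise and row-wise percentage tables can both be derived from the same cell counts (Light: 5 male, 10 female; Heavy: 10 male, 5 female). The sketch below assumes a nested-dictionary layout; the helper names `column_percentages`, `row_percentages`, and `transpose` are hypothetical.

```python
def row_percentages(table):
    """Express each cell as a percentage of its row total."""
    return {
        row: {col: 100 * n / sum(cols.values()) for col, n in cols.items()}
        for row, cols in table.items()
    }


def transpose(table):
    """Swap rows and columns of a {row: {column: value}} table."""
    flipped = {}
    for row, cols in table.items():
        for col, value in cols.items():
            flipped.setdefault(col, {})[row] = value
    return flipped


def column_percentages(table):
    """Express each cell as a percentage of its column total."""
    return transpose(row_percentages(transpose(table)))


counts = {
    "Light": {"Male": 5, "Female": 10},
    "Heavy": {"Male": 10, "Female": 5},
}
by_column = column_percentages(counts)       # Internet usage by gender
by_row = row_percentages(transpose(counts))  # Gender by internet usage
print(f"{by_column['Light']['Male']:.1f}")   # 33.3
print(f"{by_row['Male']['Heavy']:.1f}")      # 66.7
```

Because gender is treated as the independent variable, percentages are computed within each gender; this is why both tables show the same figures with the axes swapped.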