Steps in Data
Processing
Dr. Kshitija Gandhi
PHD, MPHIL, MCOM, MBA, UGC NET
Vice Principal
Pratibha College of
Commerce and Computer studies
Data
Processing
Data reduction involves winnowing out the
irrelevant from the relevant data, establishing
order from chaos, and giving shape to a mass of data.
The essence of data processing in research is data
reduction.
Data processing is concerned with editing, coding,
classifying, tabulating, charting and
diagramming research data.
Steps in Data
processing
Editing of Data
Coding of Data
Classification of Data
Tabulation of Data
1. Editing of Data
Editing is the first step in data processing.
Editing is the process of examining the data collected in
questionnaires/schedules to detect errors and omissions and to see that
they are corrected and the schedules are ready for tabulation.
When the whole data collection is over, a final and thorough check
is made.
Editing for quality
Are the data forms complete?
Are the data free of bias?
Are the recordings free of errors?
Are the inconsistencies in responses within limits?
Is there any evidence of dishonesty on the part of enumerators or interviewers?
Is there any wanton manipulation of data?
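The quality questions above can be turned into automatic checks on each schedule. The sketch below is only illustrative: the field names, the `household_size` field and the three rules are assumptions, not part of the original deck.

```python
# Hypothetical field names and rules, for illustration only.
REQUIRED_FIELDS = ["occupation", "vehicle", "km_per_day", "marital_status", "family"]


def edit_for_quality(record):
    """Return a list of problems found in one filled-in schedule."""
    problems = []

    # Completeness: every required field must carry a non-empty answer.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"missing: {field}")

    # Recording error: a daily distance cannot be negative.
    km = record.get("km_per_day")
    if isinstance(km, (int, float)) and km < 0:
        problems.append("error: km_per_day is negative")

    # Inconsistency between two answers (assumed rule): a joint family
    # cannot have a household of one person.
    if record.get("family") == "Joint" and record.get("household_size") == 1:
        problems.append("inconsistent: joint family with household_size 1")

    return problems
```

A schedule that returns an empty list passes these checks; anything else goes back to the editor. Bias and enumerator dishonesty still need human judgment and are not covered here.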
Editing for Tabulation
This involves making certain accepted
modifications to the data, or even
rejecting certain pieces of
data, in order to facilitate
tabulation.
For instance, an extremely high
or low data value may
be ignored or bracketed within a
suitable class interval.
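Bracketing extreme values can be sketched as a small function with open-ended top and bottom classes. The class labels follow the travel-distance categories used in the deck's coding example; which class a boundary value such as 30 falls into is an assumption made here.

```python
def km_class(km):
    """Bracket a daily travel distance (in km) into a class interval.

    The open-ended classes absorb extreme values, so a reading of
    400 km does not need a row of its own in the table.
    Boundary handling (30 goes to "30-50 km", 10 to "10-30 km")
    is an assumption, not stated in the source.
    """
    if km > 50:
        return "Above 50"
    if km >= 30:
        return "30-50 km"
    if km >= 10:
        return "10-30 km"
    return "Below 10"


for value in (2, 30, 400):
    print(value, "->", km_class(value))
```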
Field Editing
It is done by the
enumerator.
The schedule filled up by
the enumerator or the
respondent might have
some abbreviated
writings, illegible
writings and the like.
These are rectified by the
enumerator.
This should be done soon
after the enumeration or
interview, while the details
are still fresh in memory.
Field editing should
not extend to filling
omissions with guessed
data.
Central Editing
It is done by the researcher after
getting all schedules or
questionnaires or forms from the
enumerators or respondents.
Obvious errors can be corrected.
For missing data or information,
the editor may substitute data or
information after reviewing the
information provided by other,
similarly placed respondents.
A definitely inappropriate answer
is removed and “no answer” is
entered when reasonable
attempts to get the appropriate
answer fail to produce results.
Checkpoints
Editors should be familiar with the instructions given to the
interviewers and coders as well as with the editing
instructions supplied to them for the purpose.
While crossing out an original entry for one reason or
another, they should draw just a single line through it so that
it remains legible.
They must make entries (if any) on the form in some
distinctive color and in a standardized form.
They should initial all answers which they change or
supply.
The editor’s initials and the date of editing should be
placed on each completed form or schedule.
Coding is the process/operation by which
data/responses are organized into classes/categories
and numerals or other symbols are given to each
item according to the class in which it falls. In other
words, coding involves two important operations:
a) deciding the categories to be used and
b) allocating individual answers to them.
Coding
What is coding? Assigning numerals to the responses given by
the respondent.
Why is coding needed? To interpret the answers, classify them, and record the data in
a spreadsheet.
When is it needed? It is necessary in both cases:
Open-ended category
Closed-ended category
How is it used? For graphical representation or charting of data.
For drawing necessary figures.
To find out the most recommended non-geared two-wheeler
Unit  Occupation         Vehicle            KM travel per day  Marital status  Family
1     Govt. Employees    Activa             Above 50           Married         Joint
2     Private Employees  Vespa              30-50 km           Unmarried       Nuclear
3     Professional       Pleasure           10-30 km
4     Unemployed         Jupiter            Below 10
5     Businessman        Electric Vehicles
6     Student
7     Housewife
To find out the most recommended non-geared two-wheeler
Unit  Occupation  Vehicle  KM travel per day  Marital status  Family
1     4           2        1                  1               1
2     5           2        2                  2               1
3     3           4        3                  1               2
4     5           5        1                  1               2
5     2           1        1                  2               2
6     1           1        2                  1               1
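The two operations of coding (deciding the categories, then allocating answers to them) can be sketched with a dictionary-based codebook. The codes below are taken from the code sheet in the deck; the variable names and the `code_response` helper are hypothetical.

```python
# Codebook built from the deck's code sheet; key names are assumptions.
CODEBOOK = {
    "occupation": {
        "Govt. Employees": 1, "Private Employees": 2, "Professional": 3,
        "Unemployed": 4, "Businessman": 5, "Student": 6, "Housewife": 7,
    },
    "vehicle": {
        "Activa": 1, "Vespa": 2, "Pleasure": 3, "Jupiter": 4,
        "Electric Vehicles": 5,
    },
    "km_per_day": {"Above 50": 1, "30-50 km": 2, "10-30 km": 3, "Below 10": 4},
    "marital_status": {"Married": 1, "Unmarried": 2},
    "family": {"Joint": 1, "Nuclear": 2},
}


def code_response(response):
    """Replace each answer with its numeral, in codebook column order.

    Raises KeyError when an answer is not in the codebook, which flags
    a category the scheme did not anticipate.
    """
    return [CODEBOOK[variable][response[variable]] for variable in CODEBOOK]


unit_1 = {
    "occupation": "Unemployed",
    "vehicle": "Vespa",
    "km_per_day": "Above 50",
    "marital_status": "Married",
    "family": "Joint",
}
print(code_response(unit_1))  # [4, 2, 1, 1, 1], the first row of the coded table
```

Keeping the codebook in one place means each answer maps to exactly one numeral, which is the mutually exclusive property a coding scheme needs.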
Rule Book
Scheme in Advance
Entering records
• Appropriate to Research Objectives
• Comprehensive
• Mutually Exclusive
• Single Variable Entry
2. Coding of
Data
Efficient Analysis
Decisions are taken at designing stage
Helpful for tabulation
Reduction of errors
Objectives of Classification
Complex, scattered
and haphazard data are
organized into a concise,
logical and intelligible
form.
It is possible to make
the characteristics of
similarities and
dissimilarities clear.
Comparative studies
become possible.
Understanding of the
significance is made
easier and thereby a
good deal of human
energy is saved.
Underlying unity
amongst different
items is made clear
and expressed.
Data are so arranged
that analysis and
generalization
become possible.
Classification is of two types
Quantitative classification
• based on variables or quantity
Qualitative classification
• based on attributes
Multiple Classification
• making many (more than two) groups on the basis of some
quality or attribute
• grouping the workers of a factory under various income groups
(class intervals) comes under multiple classification
Dichotomous Classification
• classification into two groups on the basis of the presence or
absence of a certain quality
• dividing workers into two groups, skilled and unskilled,
is a dichotomous classification
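The two worker examples above can be sketched with one grouping helper and two different classification keys. The worker records and the income class limits are made up for illustration.

```python
from collections import defaultdict

# Made-up records; names, incomes and class limits are assumptions.
workers = [
    {"name": "A", "skilled": True, "income": 12000},
    {"name": "B", "skilled": False, "income": 8000},
    {"name": "C", "skilled": True, "income": 25000},
    {"name": "D", "skilled": False, "income": 15000},
]


def classify(items, key):
    """Group item names under the class label returned by key(item)."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item["name"])
    return dict(groups)


def income_class(worker):
    """Assign a worker to one of three income class intervals."""
    income = worker["income"]
    if income < 10000:
        return "Below 10,000"
    if income < 20000:
        return "10,000-19,999"
    return "20,000 and above"


# Dichotomous: presence or absence of one quality gives two groups.
dichotomous = classify(workers, lambda w: "Skilled" if w["skilled"] else "Unskilled")

# Multiple: more than two groups, here income class intervals.
multiple = classify(workers, income_class)

print(dichotomous)
print(multiple)
```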
4. Tabulation of Data
Tabulation is the process of summarizing raw data and displaying it in
compact form for further analysis.
Tabulation may be by hand, mechanical, or electronic. The choice is made
largely on the basis of the size and type of study, alternative costs, time
pressures, and the availability of computers, and computer programmes.
If the number of questionnaires is small, and their length short, hand
tabulation is quite satisfactory.
4. Tabulation of Data
Tables may be divided into: (i) Frequency tables, (ii) Response
tables, (iii) Contingency tables, (iv) Uni-variate tables, (v) Bi-variate
tables, (vi) Statistical tables and (vii) Time series tables.
Generally a research table has the following parts: (a) table
number, (b) title of the table, (c) caption (column heading), (d) stub (row heading),
(e) body, (f) head note, (g) foot note.
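Electronic tabulation of a frequency table and a contingency table can be sketched with the standard library alone. The rows are the six coded units from the deck's coding example; the column names are assumptions.

```python
from collections import Counter

# Coded records from the deck's example; column names are assumptions.
COLUMNS = ["unit", "occupation", "vehicle", "km_per_day", "marital_status", "family"]
ROWS = [
    (1, 4, 2, 1, 1, 1),
    (2, 5, 2, 2, 2, 1),
    (3, 3, 4, 3, 1, 2),
    (4, 5, 5, 1, 1, 2),
    (5, 2, 1, 1, 2, 2),
    (6, 1, 1, 2, 1, 1),
]
records = [dict(zip(COLUMNS, row)) for row in ROWS]

# Frequency (uni-variate) table: how many units chose each vehicle code.
vehicle_freq = Counter(r["vehicle"] for r in records)

# Contingency (bi-variate) table: marital status code by family code.
marital_by_family = Counter((r["marital_status"], r["family"]) for r in records)

# Print the frequency table with a caption row, a body and a total line.
print("Vehicle code  Frequency")
for code in sorted(vehicle_freq):
    print(f"{code:>12}  {vehicle_freq[code]:>9}")
print(f"{'Total':>12}  {sum(vehicle_freq.values()):>9}")
```

With only six records hand tabulation would do just as well; the same counting approach scales to larger studies without changes.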
Steps in the preparation of table
Title of table: The table should first be given a brief, simple and clear title that expresses the basis of classification.
Columns and rows: Each table should be prepared with just an adequate number of columns and rows.
Captions and stubs: The columns and rows should be given simple and clear captions and stubs.
Ruling: Columns and rows should be divided by means of thin or thick rulings.
Arrangement of items: Comparable figures should be arranged side by side.
Deviations: These should be arranged in the column near the original data so that their presence may easily be noted.
Size of columns: This should be according to the requirement.
Steps in the preparation of table
Arrangements of
items: This should be
according to the
problem.
Special emphasis: This
can be done by writing
important data in bold
or special letters.
Unit of
measurement: The unit
should be noted below
the lines.
Approximation: This
should also be noted
below the title.
Footnotes: These may
be given below the
table.
Total: Totals of each
column and grand total
should be in one line.
Source: Source of data
must be given. For
primary data, write
“primary data”.
Steps in the preparation of table
It is not always necessary to present facts in tabular form if they can be presented more simply in
the body of the text.
Tabular presentation enables the reader to follow the data more quickly than textual presentation.
A table should not merely repeat information covered in the text.
The same information should not, of course, be presented in both tabular and graphical form.
Smaller and simpler tables may be presented in the text, while large and complex tables
may be placed at the end of the chapter or report.
Data
Diagrams
Charts: A chart is a diagrammatic form of
data presentation. Bar charts, rectangles,
squares and circles can be used to present
data. Bar charts are uni-dimensional, while
rectangles, squares and circles are
two-dimensional.
Graphs: The method of presenting
numerical data in visual form is called
a graph. A graph shows the relationship between
two variables by means of either a curve or
a straight line.
Graphs may be divided into two categories.
(1) Graphs of Time Series (2) Graphs of
Frequency Distribution. In graphs of time
series one of the factors is time and other or
others is / are the study factors.
THANK YOU

Editor's Notes

  • #16 Coding is necessary for efficient analysis; through it, the many replies can be reduced to a small number of classes that contain the critical information required for analysis. Coding decisions should usually be taken at the designing stage of the questionnaire. This makes it possible to pre-code the questionnaire choices, which in turn helps computer tabulation, as one can key-punch straight from the original questionnaires. In the case of hand coding, some standard method may be used. One such method is to code in the margin with a colored pencil. Another is to transcribe the data from the questionnaire to a coding sheet. Whatever method is adopted, one should see that coding errors are altogether eliminated or reduced to the minimum level.