3. ANALYSIS
means the ordering, manipulating, and summarizing of data
to obtain answers to research questions. Its purpose is to
reduce data to intelligible and interpretable form so that the
relations of research problems can be studied and tested.
INTERPRETATION
gives the results of the analysis, makes inferences pertinent
to the research relations studied, and draws conclusions about
these relations.
5. Steps in Processing of Data
• Questionnaire checking
• Editing
• Coding
• Tabulation
• Data cleaning
• Statistically adjusting the data
• Selecting a data analysis strategy
6. QUESTIONNAIRE CHECKING
The initial step in questionnaire checking involves a check of all
questionnaires for completeness and interviewing quality.
A questionnaire returned from the field may be unacceptable for several
reasons:
Part of the questionnaire may be incomplete.
The pattern of responses may indicate that the respondent did not understand or
follow the instructions.
The responses show little variance.
The questionnaire is answered by someone who does not qualify for
participation.
The returned questionnaire is physically incomplete: one or more pages are
missing.
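Two of the checks above (incomplete answers and little variance) can be sketched in code. This is a minimal illustration, not a standard procedure: the function name `check_questionnaire`, the item names, and the variance threshold of 0.25 are all assumptions chosen for the example.

```python
from statistics import pvariance


def check_questionnaire(answers, expected_items, min_variance=0.25):
    """Return a list of reasons a returned questionnaire may be unacceptable.

    `answers` maps item names to numeric ratings (None = left blank).
    `min_variance` is an assumed cut-off, not a published standard.
    """
    problems = []
    # Incomplete: an expected item is absent or was left blank.
    missing = [q for q in expected_items if answers.get(q) is None]
    if missing:
        problems.append(f"incomplete: {len(missing)} item(s) unanswered")
    # Little variance: nearly the same rating on every answered item.
    ratings = [answers[q] for q in expected_items if answers.get(q) is not None]
    if len(ratings) >= 2 and pvariance(ratings) < min_variance:
        problems.append("responses show little variance")
    return problems


items = ["q1", "q2", "q3", "q4"]
print(check_questionnaire({"q1": 4, "q2": 4, "q3": 4, "q4": 4}, items))
print(check_questionnaire({"q1": 5, "q2": None, "q3": 1}, items))
```

A flagged questionnaire is only a candidate for rejection; the decision still rests with the editor.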
7. EDITING
• Editing means checking the collected data for accuracy, completeness, and utility
• It means removing errors, incompleteness, and inconsistencies from the data
• It imposes minimal quality standards on the raw data
• It can be done in two stages:
Field edit
Central office edit
8. Field Edit
◦ The objective of field editing is to make sure that proper procedure is
followed in selecting the respondent, interviewing them, and recording
their responses.
◦ The main problems faced in field editing are:
Inappropriate respondents – e.g., the tenant is interviewed instead of the
house owner.
Incomplete interviews
Improper understanding
Lack of consistency
Poor legibility
Fictitious interviews – Questionnaires are filled in by the interviewer
without conducting the interview.
◦ Look for Completeness, Legibility, Comprehensibility, Consistency, and
Uniformity
9. Central Office Edit
◦ It is more thorough than field editing.
◦ It is a more complete and exacting edit.
◦ Problems of consistency and rapport with respondents are some of the
issues that get highlighted during office editing.
◦ It is best performed by a number of editors, each looking at
one part of the data.
◦ Decisions on how to handle item non-response and other
omissions need to be made.
◦ List-wise (case-wise) deletion (drop the case from all analyses) vs.
pairwise deletion (drop the case only from analyses that use the missing item)
10. Example of Inconsistency: A respondent indicated that he doesn’t drink coffee, but when
questioned about his favorite brand, he replied ‘BRU’.
Treatment of Unsatisfactory Responses
Returning to the field – Questionnaires with unsatisfactory responses may be returned to the
field, where the interviewers recontact the respondents.
Assigning missing values – The editor may assign missing values to unsatisfactory
responses. This approach may be desirable if 1) the number of respondents with
unsatisfactory responses is small, 2) the proportion of unsatisfactory responses for each of
these respondents is small, or 3) the variables with unsatisfactory responses are not the key
variables.
Discarding unsatisfactory respondents – This is possible only when the proportion of
unsatisfactory respondents is small or the sample size is large.
11. Coding
Coding refers to those activities that help in transforming
edited questionnaires into a form that is ready for analysis.
Coding speeds up the tabulation while editing eliminates
errors.
Characteristics of coding of data:
Exhaustive
Mutually exclusive
Unidimensional
12. Coding involves assigning numbers
or other symbols to answers so that the
responses can be grouped into a
limited number of classes or
categories.
The code includes an indication of
the column and data record it will
occupy.
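As an illustration of assigning numbers to answers, the sketch below uses a small codebook. The variables, the codes, and the missing-value code 9 are hypothetical; a real study would define them in its own codebook.

```python
# Hypothetical codebook: every category of a variable gets exactly one code,
# so the categories are exhaustive and mutually exclusive.
CODEBOOK = {
    "gender": {"male": 1, "female": 2, "other": 3},
    "drinks_coffee": {"yes": 1, "no": 2},
}
MISSING_CODE = 9  # assumed code for blank or unclassifiable answers


def code_response(variable, answer):
    """Translate one edited answer into its numeric code."""
    if answer is None:
        return MISSING_CODE
    return CODEBOOK[variable].get(answer.strip().lower(), MISSING_CODE)


def code_questionnaire(record):
    """Code every variable of one questionnaire, in codebook order."""
    return [code_response(var, record.get(var)) for var in CODEBOOK]


print(code_questionnaire({"gender": "Female", "drinks_coffee": "no"}))  # [2, 2]
print(code_questionnaire({"gender": "male"}))  # [1, 9]
```

The position of each code in the resulting list plays the role of the column it occupies in the data record.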
13. DATA ENTRY
It is the process of transferring coded data into data files
through the keyboard of a computer terminal.
DATA VERIFICATION
It is the process of visually comparing the numbers in the
printout with the codes on the original sources.
Errors found during data verification include missing values, outliers,
and wild codes.
14. CREATING AND DOCUMENTING THE
ANALYSIS
The data cleaning process is followed by developing an
analysis file using a statistical software package
15. Organization of data
The data collected from various sources should be organized into
homogeneous groups so that meaningful relationships can be seen.
Classification of data: the collected raw data must be grouped to
enable easy retrieval of essential data.
Classification according to attributes: data are categorized according to
descriptive characteristics, e.g., gender, type of family.
Classification according to numerical characteristics: data are grouped by
quantitative characteristics that can be measured objectively, e.g., income, height.
16. Presentation of data
Once the data have been classified, researchers need to
summarize, organize, and communicate the information
using tables or diagrams; this is termed presentation of
data.
Tabular presentation
Graphical presentation: an advanced technique in which data
are presented in forms such as bar charts, pie
charts, and line diagrams
17. Tabulation
Refers to counting the number of cases that fall into various categories.
The results are summarized in the form of statistical tables.
The raw data is divided into groups and sub-groups.
The counting and placing of data in a particular group and sub-group are
done.
The tabulation involves:
Sorting and counting.
Summarizing of data.
18. Tabulation
Tabulation may be of two types:
Simple tabulation – In simple tabulation, a single
variable is counted.
Cross tabulation – Includes two or more variables,
which are treated simultaneously.
Tabulation can be done entirely by hand, by machine, or by
both hand and machine.
19. Sorting and counting of data: Sorting can be done as follows:
Gender          Tally marks    Frequency
Male            IIII           4
Female          II             2
Other gender    I              1
Format of a Blank Table
Table No.
TITLE – Number of children per family
Head Note – Unit of measurement
Sub-Heading    Caption    Total
Body
Foot note
The sub-heading indicates the row title or the row headings. The caption
indicates what each column is meant for. The body of the table gives full
information on the frequency.
20. Kinds of Tabulation
Simple or one-way tabulation – Multiple-choice questions that allow only one
answer can be summarized with one-way (univariate) tabulation. The questions are
predetermined, and tabulation consists of counting the number of responses falling
into each category and calculating the percentage.
Example
Table 14.1: Study of number of children in a family
No. of children    No. of families    Percentage
0                  10                 5
1                  30                 15
2                  70                 35
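One-way tabulation (counting the responses in each category and calculating the percentage) can be sketched as follows. The function name `one_way_table` and the sample of ten families are hypothetical and are not the data of Table 14.1.

```python
from collections import Counter


def one_way_table(responses):
    """Count responses per category and compute each category's percentage."""
    counts = Counter(responses)
    total = sum(counts.values())
    return {
        category: (count, round(100 * count / total, 1))
        for category, count in sorted(counts.items())
    }


# Hypothetical sample: number of children reported by ten families.
children = [0, 1, 2, 2, 1, 2, 0, 2, 1, 2]
for category, (count, percent) in one_way_table(children).items():
    print(category, count, percent)
```

Each percentage is the category count divided by the total number of responses, so the percentages of a complete table add up to 100.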
21. Cross Tabulation or Two-way Tabulation
This is known as Bivariate Tabulation. The data may include two or more variables.
E.g., popularity of a health drink among families having different incomes.
Table 14.3: Use of Health Drink
Income        No. of children per family       No. of
per month                  1         2         families
10000         10           5         8         23
1001-2000     5            0         8         13
2001-3000     20           10        12        42
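A cross tabulation counts the cases for every combination of two variables. The sketch below is one simple way to build such a table with the standard library; the function name `cross_tabulate`, the variable names, and the five sample records are assumptions made for illustration.

```python
from collections import Counter


def cross_tabulate(records, row_var, col_var):
    """Count records for every (row, column) combination, with row totals."""
    cells = Counter((r[row_var], r[col_var]) for r in records)
    rows = sorted({row for row, _ in cells})
    cols = sorted({col for _, col in cells})
    table = {}
    for row in rows:
        # A Counter returns 0 for combinations that never occurred.
        counts = {col: cells[(row, col)] for col in cols}
        counts["total"] = sum(counts.values())
        table[row] = counts
    return table


# Hypothetical records: income group and number of children per family.
families = [
    {"income": "low", "children": 1},
    {"income": "low", "children": 2},
    {"income": "low", "children": 2},
    {"income": "high", "children": 1},
    {"income": "high", "children": 1},
]
for income, counts in cross_tabulate(families, "income", "children").items():
    print(income, counts)
```

The row total corresponds to the "No. of families" column of a two-way table.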
22. Data cleaning
Includes checking for missing values, consistency checks, outliers, and
wild codes. Although preliminary consistency checks have been made
during editing, the checks at this stage are more thorough and extensive,
because they are made by computer.
Consistency checks – Identify data that are out of range, logically
inconsistent, or have extreme values. For example, a respondent may indicate
that she charges long-distance calls to a calling card, although she does not
have one.
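Computerized consistency checks of this kind might look like the sketch below. The variable names, the valid ranges, and the function name `consistency_errors` are assumptions; the logical check is modeled on the calling-card example.

```python
# Assumed valid ranges for two hypothetical variables.
VALID_RANGES = {"age": (18, 99), "satisfaction": (1, 5)}


def consistency_errors(record):
    """Return descriptions of out-of-range or logically inconsistent values."""
    errors = []
    # Range check: flag values outside the limits defined for the variable.
    for var, (low, high) in VALID_RANGES.items():
        value = record.get(var)
        if value is not None and not low <= value <= high:
            errors.append(f"{var}={value} is out of range {low}-{high}")
    # Logical check: charging calls to a card the respondent does not have.
    if (record.get("charges_to_card") == "yes"
            and record.get("has_calling_card") == "no"):
        errors.append("charges calls to a calling card but reports having none")
    return errors


record = {"age": 25, "satisfaction": 7,
          "has_calling_card": "no", "charges_to_card": "yes"}
for problem in consistency_errors(record):
    print(problem)
```

Flagged records can then be traced back to the original questionnaire by respondent code for correction.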
23. Treatment of missing responses
Missing responses represent values of a variable that are unknown, either because
respondents provided ambiguous answers or their answers were not properly recorded.
Substitute a Neutral Value – A neutral value, typically the mean response to the
variable, is substituted for the missing responses.
Substitute an Imputed Response – The respondent’s pattern of responses to other
questions is used to impute or calculate a suitable response to the missing questions.
Casewise Deletion – Cases or respondents with any missing responses are discarded
from the analysis.
Pairwise deletion – Instead of discarding all cases with any missing values, the
researcher uses only the cases or respondents with complete responses for each
calculation. As a result, different calculations in an analysis may be based on different
sample sizes.
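Three of these treatments (mean substitution, casewise deletion, and pairwise deletion) can be compared on a small example. This is a sketch under assumptions: the function names and the three cases are hypothetical, and missing responses are represented by `None`.

```python
from statistics import mean


def mean_substitute(values):
    """Replace missing responses (None) with the mean of the observed ones."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def casewise_delete(cases):
    """Keep only the cases that have no missing response at all."""
    return [c for c in cases if all(v is not None for v in c.values())]


def pairwise_complete(cases, var_a, var_b):
    """Keep the cases that are complete for the two variables being analysed."""
    return [(c[var_a], c[var_b]) for c in cases
            if c[var_a] is not None and c[var_b] is not None]


# Hypothetical responses of three cases to three questions.
cases = [
    {"q1": 4, "q2": 2, "q3": 5},
    {"q1": None, "q2": 3, "q3": 4},
    {"q1": 2, "q2": None, "q3": 1},
]
print(mean_substitute([c["q1"] for c in cases]))  # [4, 3, 2]
print(len(casewise_delete(cases)))                # 1
print(pairwise_complete(cases, "q1", "q3"))       # [(4, 5), (2, 1)]
print(pairwise_complete(cases, "q2", "q3"))       # [(2, 5), (3, 4)]
```

Casewise deletion leaves one case here, while each pairwise calculation keeps two, and the two calculations use different cases. This is why pairwise results may rest on different sample sizes.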
24. Statistically Adjusting the Data
If any correction needs to be done for the statistical analysis, the data
is adjusted accordingly.
25. Selecting a Data Analysis Strategy
The selection of a data analysis strategy should be
based on the earlier steps of the marketing research
process, known characteristics of the data, properties
of statistical techniques and the background and
philosophy of the researcher.
26. Types of Quantitative Analysis
Classification based on goals of analysis
• Descriptive statistics: collecting, organizing, summarizing, and presenting
• Inferential statistics: hypothesis testing and making predictions
Classification based on assumptions about data
• Parametric test: specifies certain conditions about the population; e.g., t test, z test, ANOVA test
• Non-parametric test: no assumptions are made about the population; e.g., Chi-square, Mann-Whitney
Classification based on number of variables involved
• Univariate analysis: applies when the study involves a single variable
• Bivariate analysis: applies when the study involves two variables
• Multivariate analysis: applies when the study involves more than two variables