Zagazig University
Faculty of Veterinary Medicine
Session #2:
Statistical Analysis of Questionnaire Data
M. Afifi
M.Sc., Biostatistics (Co-Supervision with ISSR, Cairo University)
Ph.D. Candidate (AVC, UPEI, Canada)
E-mail: M.Afifi@zu.edu.eg, Afifi-stat6@hotmail.com
Tel: +201060658185
 Changing the way you look at questionnaires
 Uses of questionnaires in veterinary research
 Topics
 Questionnaire Data
 Data Entry
 Data Analysis
 Results (Tables + Figures)
 Report
Questionnaire Data
 Questionnaire data consist of a group of major items (constructs), each
assessed by several questions in order to judge the quality of that
construct.
[Diagram labels: Construct; Single Item; Q1]
 Likert scale and Data Coding
 Likert items are used to measure respondents' attitudes to a particular
question or statement.
 Typical familiar five-point Likert scale
[Figures: 5-point Likert scale; 3-point Likert scale]
 Likert scale Data Coding
 Bipolar (symmetric) scaling method, measuring either a positive (+ve) or a
negative (-ve) response to a statement.
 Central (neutral) point: on a 1-2-3-4-5 scale the midpoint is 3
 Sometimes a four-point scale is used, so that the middle option of "Neither
agree nor disagree" is not available and respondents are forced to take a side.
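As a small illustration of the coding described above, the sketch below maps five-point Likert labels to the codes 1-5. The label wording and the names `LIKERT_CODES` and `code_responses` are assumptions made for this example; they are not taken from the questionnaire.

```python
# Hypothetical label wording; adapt it to the actual questionnaire.
LIKERT_CODES = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neither agree nor disagree": 3,
    "Agree": 4,
    "Strongly agree": 5,
}


def code_responses(labels):
    """Translate text labels into numeric codes (unknown labels raise KeyError)."""
    return [LIKERT_CODES[label] for label in labels]


# Hypothetical answers from four respondents to one question.
answers = ["Agree", "Strongly agree", "Neither agree nor disagree", "Disagree"]
codes = code_responses(answers)
print(codes)
```

Raising an error on an unknown label is a deliberate choice here: a misspelt label is caught at coding time instead of silently becoming a missing value.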
Reverse coding
 One common validation technique for survey items
is to rephrase a "positive" item in a "negative"
way. When done properly, this can be used to check
whether respondents are giving consistent answers.
 For example, concerning our SSQ:
"The course relies on memorization" (colloquial aside: "it works, well done")...
Questionnaire Data Entry
Data Entry
 Excel data sheet
 Scanner and OCR
 It is preferable to enter the data into an Excel sheet first and then upload
it to SPSS
 Open an Excel sheet
 Give each questionnaire a student ID (rows = cases)
 Put the question numbers across the top (columns = variables)
 Template for Data Entry
Columns: questionnaire questions
Rows: respondents (students)
 For example, entering a 10-question questionnaire for 40 students looks as
follows:
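The layout described above (one row per student, one column per question) can be sketched as an empty entry template. This is only an illustration: the column names `ID` and `Q1` to `Q10` and the function `build_template` are assumptions, and the sheet is written as CSV text, which Excel can open.

```python
import csv
import io


def build_template(n_students=40, n_questions=10):
    """Return a header row plus one empty row per student (rows = cases)."""
    header = ["ID"] + [f"Q{i}" for i in range(1, n_questions + 1)]
    rows = [[student_id] + [""] * n_questions
            for student_id in range(1, n_students + 1)]
    return [header] + rows


template = build_template()

# Write to an in-memory CSV; to create a real file, replace io.StringIO()
# with open("template.csv", "w", newline="").
buffer = io.StringIO()
csv.writer(buffer).writerows(template)
print(buffer.getvalue().splitlines()[0])
```

The printed line is the header row; the 40 rows below it are left blank for the coded answers.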
Upload data into SPSS
 Open SPSS
 Click Cancel on the opening screen
 File > Open > Data
 After your data opens in SPSS, save it in case you have problems later on
(File > Save As > file name)
Check for what can go wrong in data entry
 Max (should be 5)
 Min (should be 1)
 Count (should equal the No. of questionnaires)
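The three checks above (maximum, minimum, count) can also be run programmatically before the analysis. The following is a minimal sketch; the function name `check_entries` and the small data set are hypothetical.

```python
def check_entries(rows, n_expected, low=1, high=5):
    """Return a list of problems found in the entered codes."""
    problems = []
    if len(rows) != n_expected:
        problems.append(
            f"count: expected {n_expected} questionnaires, found {len(rows)}")
    for case_no, row in enumerate(rows, start=1):
        for q_no, value in enumerate(row, start=1):
            if not (low <= value <= high):
                problems.append(
                    f"case {case_no}, Q{q_no}: value {value} outside {low}-{high}")
    return problems


# Hypothetical entries: three students, four questions; 55 is a typing slip.
data = [
    [4, 5, 3, 4],
    [2, 55, 1, 3],
    [5, 4, 4, 2],
]
for problem in check_entries(data, n_expected=3):
    print(problem)
```

An empty result means that every value lies between the minimum and the maximum code and that the number of rows matches the number of questionnaires.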
Data Analysis
Reliability Analysis, Cronbach's Alpha
Reliability coefficient (Cronbach's Alpha)
 Measure of internal consistency, that is, how closely related
a set of items are as a group.
Reliability coefficient (Cronbach's Alpha)
 Example: to compute Cronbach's alpha using SPSS, use a dataset
that contains four test items: q1, q2, q3 and q4 (questionnaire.sav).
 The alpha coefficient for the four items is 0.839, suggesting that the
items have relatively high internal consistency. (Note: a reliability
coefficient of .70 or higher is considered "acceptable".)
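For readers who want to see what the coefficient measures, the sketch below computes Cronbach's alpha by hand from the usual formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The six respondents are hypothetical data, not the contents of questionnaire.sav, so the result will not be 0.839; sample variances (n-1) are assumed.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (all the same length)."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose: one tuple per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical answers of six respondents to four items (q1..q4).
rows = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
    [4, 4, 5, 4],
]
alpha = cronbach_alpha(rows)
print(round(alpha, 3))
```

When the items rise and fall together across respondents, the variance of the total score is large relative to the sum of the item variances and alpha approaches 1.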
Interpreting Reliability coefficient (Cronbach's Alpha)
 Ranges from zero (no reliability) to 1.00 (perfect reliability).
 High reliability: the questions of a test tended to “pull together.” Students
who answered a given question correctly were more likely to answer other
questions correctly. If a parallel test were developed by using similar items,
the relative scores of students would show little change.
 Low reliability: the questions tended to be unrelated to each other in terms
of who answered them correctly. The resulting test scores reflect
peculiarities of the items or the testing situation more than students’
knowledge of the subject matter.
NB:
If a questionnaire includes positively-keyed and
negatively-keyed items, then the negatively-keyed
items must be “reverse-scored” before
computing total scores and before conducting
reliability analysis.
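Reverse scoring on a 1-5 scale simply mirrors each code around the midpoint: reversed = (min + max) - original, so 1 becomes 5, 2 becomes 4, and 3 stays 3. A minimal sketch follows; the function names and the choice of q2 as the negatively-keyed item are hypothetical.

```python
def reverse_score(value, low=1, high=5):
    """Mirror a code around the scale midpoint."""
    return low + high - value


def reverse_items(row, negative_items, low=1, high=5):
    """row: dict question -> code; negative_items: negatively-keyed questions."""
    return {q: reverse_score(v, low, high) if q in negative_items else v
            for q, v in row.items()}


# Hypothetical respondent; q2 is assumed to be the negatively-keyed item.
row = {"q1": 5, "q2": 1, "q3": 4}
scored = reverse_items(row, negative_items={"q2"})
print(scored)
```

After this step all items point in the same direction, so totals and reliability coefficients can be computed on the `scored` values.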
Data Analysis
I. Simple/Basic Statistical analysis
Descriptive Statistics
I. Simple/Basic Statistical analysis
The choice of analysis for Likert items depends on the objective for which the
questionnaire was developed.
 If you have a series of individual questions with Likert response
options for your participants to answer, use modes and frequencies.
 If you have a series of Likert-type questions that, when combined, describe
a personality trait or attitude, use means and standard deviations to
describe the scale.
[Diagram labels: Construct; Likert Scale; Single Item; Q1]
I. Likert-type question (item), single item:
 Each single question
 Frequencies and distribution of each alternative
 The number and percentage of students who chose each
alternative are reported, i.e. the % that agree, disagree, etc.
 Use the mode (the most frequent response)
 The bar graph on the right shows the percentage choosing each
response
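The counts, percentages and mode for a single item can be sketched as follows. The ten responses are hypothetical, and `frequency_table` and `mode_of` are names chosen for this example.

```python
from collections import Counter


def frequency_table(codes, scale=(1, 2, 3, 4, 5)):
    """Return {code: (count, percentage)} for every point of the scale."""
    counts = Counter(codes)
    n = len(codes)
    return {c: (counts[c], round(100 * counts[c] / n, 1)) for c in scale}


def mode_of(codes):
    """Most frequent code; with ties, the code encountered first wins."""
    return Counter(codes).most_common(1)[0][0]


# Hypothetical answers of ten students to one question.
responses = [4, 4, 5, 3, 4, 2, 5, 4, 1, 3]
table = frequency_table(responses)
for code, (count, percent) in table.items():
    print(code, count, f"{percent}%")
print("mode:", mode_of(responses))
```

Listing every point of the scale, including alternatives that nobody chose, keeps the table and any bar chart built from it complete.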
[Figures: pooled respondents’ opinions on the statements (questions), shown as a clustered bar chart and as a stacked bar chart]
 Medians and interquartile range (IQR)
 Median: the number found exactly in the middle of the ordered distribution
 a measure of central tendency
 roughly speaking, it shows what the ‘average’ respondent might
think, or the ‘likeliest’ response.
 IQR: a measure of dispersion; it shows whether the responses are
clustered together or scattered across the range of possible
responses.
 Example
 A question on a 5-point scale, ranging from “1 = strongly disagree” to
“5 = strongly agree”, was answered by 60 students
 The number of respondents was as follows
 How do I interpret these data?
 Data:
 Calculating the median
 Arrange the responses in order. With an odd number of responses, the
‘middle’ number is the median.
 With an even number of responses there are two middle numbers, and the
median is half-way between them.
 Median = 3
 Calculating the IQR
 Use the same ordered arrangement of responses that we used above. When you
divide this line into four equal parts, the ‘cut-off’ points are called
quartiles (1st, 2nd and 3rd quartile). The IQR is the 3rd quartile minus the
1st quartile (IQR = 4 – 3 = 1).
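The median and IQR calculation can be sketched with the Python standard library. The counts below are hypothetical (they are not the slide's table, which is not reproduced here); they were chosen so that 60 responses give a median of 3 and an IQR of 1. Note that software packages use slightly different quartile conventions, so results can differ a little for other data.

```python
from statistics import median, quantiles

# Hypothetical counts per response code for 60 students.
counts = {1: 5, 2: 8, 3: 22, 4: 17, 5: 8}

# Expand the counts into the ordered line of individual responses.
responses = [code for code, n in sorted(counts.items()) for _ in range(n)]

q1, q2, q3 = quantiles(responses, n=4)   # the three quartile cut-off points
iqr = q3 - q1
print(len(responses), median(responses), q1, q3, iqr)
```

Here the 1st quartile falls among the 3s and the 3rd quartile among the 4s, so the middle half of the respondents sits within one scale point.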
Interpretation: Reporting the data
Consensus and dissonance
 A relatively small IQR (0-1), as was the case above, is an
indication of consensus.
 Larger IQRs suggest that opinion is polarised, i.e., that your
respondents tend to hold strong opinions either for or against this
topic (dissonance).
 For example
 Mdn=4, IQR=0: most respondents indicated agreement with the
statement.
 Mdn=3, IQR=3: if we report that the respondents are,
on average, undecided, that would be a statistical distortion of the data.
 Report more accurately: “Opinion seems to be divided with regard to… .
Many respondents (N=28, 47%) expressed strong disagreement or
disagreement, but a roughly equal number (N=26, 43%) indicated that they
agreed or strongly agreed.”
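The collapsed figures quoted in such a report can be derived as in the sketch below. Only the group totals (28 and 26) come from the text; the total of 60 respondents and the split across the five codes are assumptions made for the example.

```python
def collapse(counts):
    """counts: dict code -> number of respondents on a 1-5 scale.

    Returns {group: (N, rounded percentage)} for the three collapsed groups.
    """
    n = sum(counts.values())
    groups = {
        "disagree": counts.get(1, 0) + counts.get(2, 0),
        "undecided": counts.get(3, 0),
        "agree": counts.get(4, 0) + counts.get(5, 0),
    }
    return {name: (k, round(100 * k / n)) for name, k in groups.items()}


# Hypothetical split of 60 respondents that matches the reported totals.
counts = {1: 16, 2: 12, 3: 6, 4: 13, 5: 13}
summary = collapse(counts)
for name, (k, percent) in summary.items():
    print(f"{name}: N={k}, {percent}%")
```

Reporting the two opposed groups side by side makes the division of opinion visible, which a single "undecided" average would hide.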
Averages (mean)
 Average = 3.3: with 1 = strongly disagree and 5 = strongly agree, something
between ‘undecided’ and ‘agreement’.
 ‘Our study revealed mild agreement regarding this Q.’
 This is statistical nonsense, not an optimal interpretation. Such an
argument relies on the assumption that the psychological distance
between ‘strong agreement’ and ‘agreement’ is the same as that
between ‘agreement’ and ‘no opinion’.
 Don’t use the mean: ordinal data cannot yield meaningful mean values.
Box-plots
II. Composite (summated) scales:
 Composed of a series of four or more Likert-type items that are combined
into a single composite score
 They measure a concept that cannot be measured directly, e.g. a feeling
such as social presence; such a concept is also called a latent variable. To
measure such "soft", implicit variables with questionnaires, several
questions are asked; they can then be combined into a single composite variable.
 Created by adding up all the item values, giving a potential score from min
(no amenities) to max (all amenities).
 Let us look at the central tendency and dispersion of the index
II. Composite (summated) scales:
 Mean: characterizes the center of the data
 Standard deviation: measures the variability of the data around the mean
 Coefficient of variation: the standard deviation expressed as a percentage of the mean
 No. and (%) of respondents below and above the average
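The summary measures listed above can be sketched for a summated scale as follows. The six respondents and the names `composite_scores` and `describe` are hypothetical; the sample standard deviation (n-1) is assumed.

```python
from statistics import mean, stdev


def composite_scores(rows):
    """Add up the item codes of each respondent into one composite score."""
    return [sum(row) for row in rows]


def describe(scores):
    """Mean, SD, coefficient of variation, and N (%) below/above the mean."""
    m = mean(scores)
    sd = stdev(scores)
    n = len(scores)
    below = sum(s < m for s in scores)
    above = sum(s > m for s in scores)
    return {
        "mean": m,
        "sd": sd,
        "cv_percent": 100 * sd / m,
        "below": (below, 100 * below / n),
        "above": (above, 100 * above / n),
    }


# Hypothetical answers of six respondents to four items.
rows = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
    [4, 4, 5, 4],
]
scores = composite_scores(rows)
summary = describe(scores)
print(scores)
print(summary)
```

With four items coded 1-5, each composite score can range from 4 to 20, which gives the mean and standard deviation a scale to be read against.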
 Data Analysis
 II. More elaborate analysis, e.g. comparison between genders
 Factors impacting student satisfaction
 Academic achievement pre-enrolment
 Social factors
 Financial factors
 External factors
 Work commitments
 Institutional factors
 Worked Example
 Assume that we want to assess student satisfaction regarding teaching
 4 questions
 60 students
 Report (Results )
 Tables
 Figures
 Interpretation
 Frequencies and distribution of each alternative
 Using our Questionnaire.sav
 Analyze > Descriptive Statistics > Frequencies
 Frequencies and distribution for each question
 To get the medians and IQR
 Keep your code book in mind
 Report (Results)
