To describe how to use SPSS for conducting basic statistical analysis
and how to interpret the output of the analysis.
By the end of this workshop you will be able to:
• Prepare an SPSS file for data entry
• Present and summarize data
• Present data graphically
• SPSS stands for Statistical Package for the Social Sciences
Variable, Data
• Variables:
These are observations that vary from one person to
another or from one group of members to another, such as age,
weight, blood pressure, sex.
• Data: the value of the variable.
• e.g. body weight = 70 kg
You are conducting research to see whether proper
infection control training for nurses decreases the
prevalence of needle stick injury.
Dependent variable (DV): prevalence
of needle stick injury.
Independent variable (IV): proper infection control training.
1. Quantitative (numerical):
Continuous quantitative: e.g. age, weight, height
Discrete quantitative: e.g. number of patients, number of
children per family.
2. Qualitative (non-numerical)/categorical:
Nominal qualitative: e.g.
(blood group: A, B, AB, O), (sex: male and female)
Ordinal qualitative: e.g. (mild, moderate, severe)
An icon next to each variable provides information about data type
Session 1 cont.:
How to design questionnaire in SPSS
How to enter data in SPSS
• Coding and designing the questionnaire in SPSS
• Data entry
Copy and paste from Excel into SPSS
From SPSS, open an existing Excel file
Create a new file directly in SPSS
Practical training 1: (10 min.)
• Each participant: please design the questionnaire you have
with you in SPSS and enter the data
N.B. How to code questions with more than one answer
Which risks or adverse effects of oral isotretinoin therapy do you know of? (You can
choose more than one answer):
4. Lipid profile disturbance
5. Hepatic side effects
9. I don’t know
Enter each answer option as a separate
variable with a yes/no value
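The same coding idea can be sketched outside SPSS in a few lines of Python. This is only an illustration: option numbers 4, 5 and 9 come from the question above, while the variable names and the two example respondents are hypothetical.

```python
# One yes/no variable per answer option (1 = yes, 0 = no).
# Option numbers 4, 5 and 9 are from the question above;
# the variable names below are hypothetical.
OPTIONS = {
    4: "lipid_disturbance",
    5: "hepatic_side_effects",
    9: "dont_know",
}

def code_multiple_response(selected):
    """Turn the set of ticked option numbers into one yes/no variable per option."""
    return {name: int(code in selected) for code, name in OPTIONS.items()}

# Hypothetical respondents: the first ticked options 4 and 5, the second ticked 9.
respondents = [{4, 5}, {9}]
coded = [code_multiple_response(r) for r in respondents]
for row in coded:
    print(row)
```

Each dictionary corresponds to one row in the SPSS data view, with one column per answer option.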
How to code open-ended questions
• Read through responses
• Create a preliminary code based on
• Put responses into category and code it
• Try not to have more than 10 categories,
with no individual category receiving less
than 5% of responses.
• Also, there is software that can be used to
help you code open-ended responses.
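The guideline of at most 10 categories, none below 5% of responses, can be checked with a short script once the responses have been coded. A minimal sketch, assuming the coded responses are available as a plain list; the category labels and counts are made up for illustration.

```python
from collections import Counter

def check_categories(codes, max_categories=10, min_share=0.05):
    """Report how many categories exist and which ones fall below the minimum share."""
    counts = Counter(codes)
    total = len(codes)
    small = sorted(c for c, n in counts.items() if n / total < min_share)
    return {
        "n_categories": len(counts),
        "too_many": len(counts) > max_categories,
        "below_min_share": small,
    }

# Hypothetical coded responses: 40 answers in three categories.
coded_responses = ["cost"] * 30 + ["side effects"] * 9 + ["taste"] * 1
report = check_categories(coded_responses)
print(report)
```

A category listed under `below_min_share` is a candidate for merging into a broader category.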
How to perform data cleaning/manipulation?
Use data set 1
• Data files are not always organized to meet specific user needs; the user
may need to select a specific group or split the data file into separate groups
• Copy, paste: age, duration, HbA1C
From the Data menu
1. Select cases: first 20 cases, males only, Saudi, age < 45 years
2. Split file: by sex, nationality
3. Sort cases: duration, HbA1C (ascending, descending)
From SPSS dialog box, go to:
Select cases, Sort cases,
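For readers who want to see the logic behind these menu items, here is a rough Python sketch of selecting, splitting and sorting. The four records are hypothetical stand-ins for data set 1, and the selection combines three of the conditions above only for brevity.

```python
# Hypothetical records standing in for data set 1.
cases = [
    {"id": 1, "sex": "male", "nationality": "Saudi", "age": 40, "duration": 6},
    {"id": 2, "sex": "female", "nationality": "Saudi", "age": 52, "duration": 3},
    {"id": 3, "sex": "male", "nationality": "non-Saudi", "age": 33, "duration": 10},
    {"id": 4, "sex": "male", "nationality": "Saudi", "age": 47, "duration": 1},
]

# Select cases: keep Saudi males younger than 45 years.
selected = [
    c for c in cases
    if c["sex"] == "male" and c["nationality"] == "Saudi" and c["age"] < 45
]

# Split file: group the cases by sex so each group can be analyzed separately.
by_sex = {}
for c in cases:
    by_sex.setdefault(c["sex"], []).append(c)

# Sort cases: by duration, descending.
sorted_desc = sorted(cases, key=lambda c: c["duration"], reverse=True)

print([c["id"] for c in selected])
print({sex: len(group) for sex, group in by_sex.items()})
print([c["id"] for c in sorted_desc])
```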
Practical training 2
Use data set 1
From the Transform menu
1. Recode into different variables:
Age: from number to groups (1: <25 y, 2: ≥25 y)
Duration (1: <5, 2: ≥5)
HbA1C (1: <6.5, 2: ≥6.5)
2. Recode into same variables
3. Compute variable: mean, %, BMI (parentheses are essential)
From SPSS dialog box, go to:
Into Same variables
Into different variables
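The recoding rules and the need for parentheses in a computed variable can be illustrated in Python. This is a sketch, not SPSS syntax: the cut-offs are the ones listed above, the BMI formula is the standard weight (kg) divided by height (m) squared, and storing height in centimetres is an assumption.

```python
def recode_age(age_years):
    """Recode into a different variable: 1 = under 25 years, 2 = 25 years or older."""
    return 1 if age_years < 25 else 2

def recode_hba1c(hba1c):
    """Recode HbA1C: 1 = below 6.5, 2 = 6.5 or above."""
    return 1 if hba1c < 6.5 else 2

def bmi(weight_kg, height_cm):
    """Compute BMI; height is assumed to be stored in centimetres."""
    # The inner parentheses matter: without them, height_cm / 100 ** 2
    # would divide the height by 10000 instead of squaring the height in metres.
    return weight_kg / ((height_cm / 100) ** 2)

print(recode_age(24), recode_age(25))
print(recode_hba1c(6.4), recode_hba1c(6.5))
print(round(bmi(70, 175), 1))
```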
Methods of presentation
• Frequency tables
• Cross tabulations
• Range, IQR
Descriptive statistics, Data presentation
• After data entry, the data can be analyzed using descriptive statistics
• To find wrong entries and gain basic knowledge of the sample, use:
Frequency analysis (simple frequency table)
Crosstabs (2 x 2 table and C x R)
1- Simple frequency table
From the menu choose:
Analyze > Descriptive Statistics > Frequencies...
e.g. (sex, nationality, compliance and comments)
So you can do frequency to filter data, detect
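A simple frequency table is just a count and a percentage per value, which is why a mistyped entry stands out as an extra row. A minimal Python sketch with made-up values for sex, including one deliberate typing error:

```python
from collections import Counter

def frequency_table(values):
    """Return (value, count, percent) rows, most frequent first."""
    counts = Counter(values)
    total = len(values)
    return [(value, n, round(100 * n / total, 1)) for value, n in counts.most_common()]

# Hypothetical entries; "femal" is a deliberate typing error.
sex = ["male", "female", "female", "male", "female", "femal"]
table = frequency_table(sex)
for row in table:
    print(row)
```

The unexpected third row ("femal") is the kind of wrong entry a frequency table makes visible.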
Missing Values for a Numeric Variable
• Type 999 in the Value field.
Missing Values for a String Variable
• Type NR (no response) in the Value field.
The reason a value is missing may be important to your analysis. For example, you may
find it useful to distinguish between respondents and non-respondents.
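Declaring 999 and NR as missing matters because an undeclared code is treated as a real value. The Python sketch below shows the effect on a mean; the ages are hypothetical and the helper name `valid_values` is an illustrative choice, not an SPSS function.

```python
NUMERIC_MISSING = 999   # missing-value code for numeric variables (from the slide)
STRING_MISSING = "NR"   # "no response" code for string variables (from the slide)

def valid_values(values, missing_code):
    """Drop entries that carry the declared missing-value code."""
    return [v for v in values if v != missing_code]

# Hypothetical ages; the second respondent did not answer.
ages = [34, 999, 51, 45]
valid_ages = valid_values(ages, NUMERIC_MISSING)

naive_mean = sum(ages) / len(ages)                # treats 999 as a real age
correct_mean = sum(valid_ages) / len(valid_ages)  # excludes the missing code

print(naive_mean, round(correct_mean, 2))
```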
Characteristics of good table:
2. Self-explanatory
3. Abbreviations explained
4. Columns and rows labeled clearly
5. Units of measure stated
6. Title: every table should have a title, above the
table, that is clear and answers, as far as possible,
four questions
(what, who, where and when).
• Crosstabs are used to examine the relationship between two
variables
• Analyze > Descriptive Statistics > Crosstabs
• e.g. 2 x 2 (sex & nationality)
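What a crosstab computes can be shown with a small Python sketch: it counts how often each combination of the two variables occurs. The five paired observations are hypothetical.

```python
from collections import Counter

def crosstab(row_values, col_values):
    """Count each (row, column) combination of two categorical variables."""
    return Counter(zip(row_values, col_values))

# Hypothetical paired observations for five respondents.
sex = ["male", "male", "female", "female", "male"]
nationality = ["Saudi", "non-Saudi", "Saudi", "Saudi", "Saudi"]

cells = crosstab(sex, nationality)
for (s, n), count in sorted(cells.items()):
    print(s, n, count)
```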
Odds Ratio (OR)
• The odds ratio is the odds of outcome occurrence in
one group divided by the odds of outcome
occurrence in the comparison group
• Analysis of case-control studies
• If the OR = 1, there is no difference between the two groups
• If the OR > 1, the exposure is a risk factor for
occurrence of disease
• If the OR < 1, the exposure is a protective factor for
occurrence of disease
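As a worked example of the definition above, the odds ratio can be computed from the four cells of a 2 x 2 table. The cell labels a, b, c, d and the counts are illustrative assumptions, not data from the workshop.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2 x 2 table.

    a = exposed with the outcome,   b = exposed without the outcome
    c = unexposed with the outcome, d = unexposed without the outcome
    """
    odds_exposed = a / b
    odds_unexposed = c / d
    return odds_exposed / odds_unexposed

# Hypothetical counts.
or_value = odds_ratio(a=30, b=70, c=10, d=90)
print(round(or_value, 2))
```

An OR above 1, as here, would point to the exposure being a risk factor.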
Relative risk (RR)
RR indicates how many times as likely those
exposed are to develop the disease
relative to the non-exposed.
• Analysis of cohort studies
RR = 1: the exposure is not associated with the disease
RR > 1: the exposure is a risk factor
RR < 1: the exposure is a protective factor
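The relative risk compares the risk (not the odds) of disease in the exposed and non-exposed groups. A minimal Python sketch with hypothetical cohort counts; the cell labels are illustrative.

```python
def relative_risk(a, b, c, d):
    """Relative risk from a 2 x 2 cohort table.

    a = exposed who developed the disease,   b = exposed who did not
    c = unexposed who developed the disease, d = unexposed who did not
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed

# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed developed the disease.
rr = relative_risk(a=30, b=70, c=10, d=90)
print(round(rr, 2))
```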
Practical training 3:
• Use the data set 1 exercise
• Descriptive statistics: Simple frequency table for
• Crosstabs: 2 x 2 table: relation between gender and
• C x R: Association of compliance to treatment and
Not normally distributed
Methods of data presentation
Practical Training 3
Perform descriptive analysis for the variables:
• Age and HbA1C (mean, median, SD, range, min, max), and write a
comment on the table
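The statistics requested in this exercise can be cross-checked by hand or with a few lines of Python. The sketch below uses the standard `statistics` module and five hypothetical HbA1C values; SD here is the sample standard deviation.

```python
import statistics

def describe(values):
    """Mean, median, sample SD, minimum, maximum and range of a numeric variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }

# Hypothetical HbA1C values.
hba1c = [6.1, 7.2, 8.4, 6.8, 7.5]
summary = describe(hba1c)
for name, value in summary.items():
    print(name, round(value, 2))
```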
A percentile (or centile) is the value below which a certain
percentage of observations fall.
For example, the 10th percentile is the value below which 10
percent of the observations are found.
Percentiles are often used to compare an individual value with a norm, e.g. physical
growth charts for children such as the weight-for-age chart.
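The definition can be verified numerically: compute the 10th percentile, then count how many observations lie below it. A minimal sketch using Python's `statistics.quantiles`; the 100 values are artificial, and SPSS may use a slightly different interpolation rule.

```python
import statistics

def percent_below(values, x):
    """Percentage of observations strictly below x."""
    return 100 * sum(v < x for v in values) / len(values)

# Artificial data: the integers 1 to 100.
values = list(range(1, 101))

# quantiles(..., n=100) returns the 99 percentile cut points;
# index 9 is the 10th percentile.
cut_points = statistics.quantiles(values, n=100, method="inclusive")
p10 = cut_points[9]

print(p10, percent_below(values, p10))
```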
• The standard error of the mean (SEM) measures the variability of the
mean of the sample as an estimate
of the true value of the mean of the
population from which the sample was drawn
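The SEM is conventionally calculated as the sample standard deviation divided by the square root of the sample size. A short Python sketch of that formula, with five made-up observations:

```python
import math
import statistics

def standard_error_of_mean(values):
    """SEM = sample standard deviation / square root of the sample size."""
    return statistics.stdev(values) / math.sqrt(len(values))

# Made-up sample of five observations.
sample = [4, 6, 8, 10, 12]
sem = standard_error_of_mean(sample)
print(round(sem, 3))
```

A larger sample with the same spread gives a smaller SEM, that is, a more precise estimate of the population mean.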
How to present data by
• This type of graph is suitable for representing the two subtypes of
qualitative data and discrete quantitative data
• Analyze > Descriptive Statistics > Frequencies > Charts > Bar charts
• Graphs > Legacy Dialogs > Bar
Types of bar charts:
Simple bar chart
Multiple/grouped bar chart
Segmented/stacked bar chart
Continuous quantitative data
For all the four types of variables
frequency of category or interval
Box plot ( often called box and whisker plot)
• This is a vertical or horizontal rectangle, with the
ends of the rectangle corresponding to the upper
and lower quartiles of the data values.
• A line drawn through the rectangle corresponds
to the median value.
• Whiskers, starting at the ends of the rectangle,
usually indicate minimum and maximum values.
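The five values a box plot draws (minimum, lower quartile, median, upper quartile, maximum) can be computed directly. A sketch using Python's `statistics.quantiles` on nine made-up values; quartile definitions vary between packages, so SPSS output may differ slightly.

```python
import statistics

def five_number_summary(values):
    """Minimum, lower quartile, median, upper quartile and maximum."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": q2, "q3": q3, "max": max(values)}

# Made-up ages.
ages = [1, 2, 3, 4, 5, 6, 7, 8, 9]
box = five_number_summary(ages)
print(box)
```

The box spans `q1` to `q3`, the line inside it sits at `median`, and the whiskers here run to `min` and `max`.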
The dependent variable on the
vertical axis (the y-axis)
The independent variable on the
horizontal axis (the x-axis).
The value of r ranges between -1 and +1
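Assuming r here denotes the Pearson correlation coefficient, the sketch below computes it from paired x and y values and shows the two extremes of its range. The data are artificial.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Artificial data: x is the independent variable, y the dependent variable.
x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2, 4, 6, 8, 10])    # perfect positive relationship
r_down = pearson_r(x, [10, 8, 6, 4, 2])  # perfect negative relationship
print(r_up, r_down)
```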
Pareto chart (Analyze > Quality Control > Pareto)
• Is a vertical bar graph in which values
are plotted in decreasing order of
relative frequency from left to right.
• Is one of the seven basic tools of quality
control. Useful for analyzing what
problems need attention first.
• The Pareto principle (80/20 rule) is a theory
maintaining that 80 percent of the
output from a given situation or system
is determined by 20 percent of the input
• A Pareto chart guides you on how to solve 80%
of the problem.
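The numbers behind a Pareto chart are the causes sorted by frequency plus a running cumulative percentage. A minimal Python sketch; the causes and complaint counts are invented for illustration.

```python
def pareto_table(counts):
    """Sort causes by frequency (descending) and add the cumulative percentage."""
    total = sum(counts.values())
    rows = []
    running = 0
    for cause, n in sorted(counts.items(), key=lambda item: item[1], reverse=True):
        running += n
        rows.append((cause, n, 100 * running / total))
    return rows

# Invented complaint counts.
complaints = {"wrong order": 15, "late delivery": 50, "other": 5, "cold food": 30}
rows = pareto_table(complaints)

# The "vital few": the leading causes needed to reach 80% of all complaints.
vital_few = []
for cause, n, cumulative in rows:
    vital_few.append(cause)
    if cumulative >= 80:
        break

for row in rows:
    print(row)
print(vital_few)
```

In this invented example the first two causes already account for 80% of complaints, so they would be addressed first.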
Quiz: which causes, if addressed, could solve 80% of the problem?
Pareto chart ranking perceived problems of food service providers at
• A way to quickly and easily
visualize how well the students in
your class were doing over the
course of the year.
• A way to show the average exam
scores throughout the course on
an area chart.
• Show a trend over time
• Use the data set 2 exercise to make:
• Pie graph for sex, bar chart for compliance
• Bar graph for compliance with sex
• Bar chart for compliance of females only
• Histogram of age, duration
• Histogram of female height/ male height
• Box plot for age
O Allah, make all our work righteous.