2. Introduction to Applied Statistics and Applied Statistical Methods Practical guidelines
Prof.Dr.Chang Zhu page 2
LECTURE 1
DATA HANDLING
For analysis purposes, you sometimes have to transform your data before it can be analysed. There are
different ways to transform data, depending on the purpose. Two common data
transformation techniques are “Recoding” and “Computing”.
RECODING
e.g. we want to assign participants (cases) to different age groups, with the conditions: group 1 (18-30
years), group 2 (31-40 years), and group 3 (41-50 years).
In SPSS, from the menu, choose Transform > Recode into Different Variables
Choose the variable you want to group, namely age, then click on the arrow to move the variable into
the Input Variable -> Output Variable area. Specify the name for the Output Variable (e.g. age_group) and
click Change to create the new variable.
Click on Old and New Values to define the groups, e.g. group 1 (New Value = 1) will include the cases
with an age in the range 18 through 30. To finish, click Add.
Repeat the steps for groups 2 and 3, then click on Continue and, in the main dialog box, click OK to finish.
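The same recoding logic can be sketched outside SPSS. The short Python example below is only an illustration of the grouping rule, not part of the SPSS procedure; the function name recode_age and the list of ages are hypothetical.

```python
def recode_age(age):
    """Return the age group (1, 2 or 3), or None if the age is outside 18-50."""
    if 18 <= age <= 30:
        return 1
    if 31 <= age <= 40:
        return 2
    if 41 <= age <= 50:
        return 3
    return None  # no group defined for this age


# Hypothetical ages, made up for illustration only
ages = [18, 25, 30, 31, 40, 41, 50, 17]
age_group = [recode_age(a) for a in ages]
print(age_group)  # [1, 1, 1, 2, 2, 3, 3, None]
```

Note that an age that matches none of the ranges (here 17) is left without a group, so it is worth checking that the ranges cover all values in the data.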
COMPUTING
e.g. we need to calculate the average motivation score, made up of Intrinsic_Motivation_learn and
Extrinsic_Motivation_learn, for each of the participants.
In SPSS, from the menu, choose Transform > Compute Variable
In the box labelled Target Variable, type the new name of the variable you want to create, e.g.
Motivation_Average.
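Move the variables used in the computation (Intrinsic_Motivation_learn and Extrinsic_Motivation_learn)
to the Numeric Expression area by clicking on the arrow button. Use the appropriate arithmetic operators
and numbers to build the expression; for the average of the two scores this is
(Intrinsic_Motivation_learn + Extrinsic_Motivation_learn) / 2.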
Click OK to finish.
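As a cross-check of what the computed variable contains, the calculation can be sketched in Python. This is a minimal illustration with made-up scores; only the variable names are taken from the example above.

```python
# Hypothetical cases, one dictionary per participant (scores are made up)
cases = [
    {"Intrinsic_Motivation_learn": 4, "Extrinsic_Motivation_learn": 2},
    {"Intrinsic_Motivation_learn": 5, "Extrinsic_Motivation_learn": 5},
    {"Intrinsic_Motivation_learn": 3, "Extrinsic_Motivation_learn": 4},
]

# Compute the new variable for every case: the mean of the two scores
for case in cases:
    case["Motivation_Average"] = (
        case["Intrinsic_Motivation_learn"] + case["Extrinsic_Motivation_learn"]
    ) / 2

averages = [case["Motivation_Average"] for case in cases]
print(averages)  # [3.0, 5.0, 3.5]
```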
FILE HANDLING
SELECT CASES
Sometimes we just want to perform analysis on certain cases or groups.
e.g. we want to select only males for a given analysis. In SPSS, choose Data > Select Cases
Choose If condition is satisfied, then click If to proceed.
Transfer the gender variable to the command box, then indicate the appropriate condition for our selection
(gender = 2). Click Continue to proceed.
Choose the Output option of preference, then click on OK to execute.
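The effect of the condition gender = 2 can be illustrated with a small Python sketch. The data are made up, and the coding 2 = male is assumed from the example above.

```python
# Hypothetical cases; gender is assumed to be coded 1 = female, 2 = male
cases = [
    {"id": 1, "gender": 1, "score": 3},
    {"id": 2, "gender": 2, "score": 4},
    {"id": 3, "gender": 2, "score": 2},
    {"id": 4, "gender": 1, "score": 5},
]

# Keep only the cases that satisfy the condition gender = 2
selected = [case for case in cases if case["gender"] == 2]
selected_ids = [case["id"] for case in selected]
print(selected_ids)  # [2, 3]
```

Any analysis run afterwards would use only the selected cases; the other cases stay in the data but are ignored.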
SPLIT FILE
e.g. if we want to perform a certain analysis for different groups, we can save the time of repeating the
procedure by splitting the file into groups before we perform any analysis.
In SPSS, choose Data > Split File
Transfer the variable on which the split is based into the Groups Based on box (in this case, it is
gender).
Click on OK to finish. At the bottom of the window (in the status bar), you will notice the information Split by gender.
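What a split file does to a later analysis can be sketched in Python: the same calculation (here the mean score) is repeated once per group. The data and the variable name score are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cases; gender codes and scores are made up
cases = [
    {"gender": 1, "score": 3},
    {"gender": 2, "score": 2},
    {"gender": 1, "score": 5},
    {"gender": 2, "score": 4},
    {"gender": 2, "score": 3},
]

# Split the cases by the grouping variable
groups = defaultdict(list)
for case in cases:
    groups[case["gender"]].append(case["score"])

# Run the same analysis separately for each group
means = {gender: mean(scores) for gender, scores in sorted(groups.items())}
for gender, value in means.items():
    print(f"gender = {gender}: mean score = {value}")
```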
GRAPHS
We can use graphs to explore our data and to see whether they are normally distributed (normal distribution
will be covered in the second lecture). Here we present the two most common graphs for an initial check of
whether there is any problem with the data, namely the histogram and the boxplot.
HISTOGRAM
A histogram plots a variable (x-axis) against the frequency of the scores (y-axis).
e.g. we want to explore the score distribution of the variable Intrinsic_Motivation_learn
In SPSS, choose Graphs > Chart Builder
In the Gallery tab, choose Histogram, then drag Simple Histogram into the Chart preview area.
Then drag the variable Intrinsic_Motivation_learn to the x-axis. In the Element Properties dialog box, we
can create a label for the X and Y-axes by clicking on the axis and typing the new label, then click Apply.
Click on OK to finish.
The resulting histogram shows that one score (score = 0) is very different from the others; such a score is
called an outlier.
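The idea behind a histogram, counting how often each score occurs, can be sketched in Python as a simple text chart. The scores below are made up and only mimic a distribution with one unusual value of 0.

```python
from collections import Counter

# Hypothetical scores; the single 0 plays the role of the outlier
scores = [0, 3, 3, 4, 4, 4, 5, 5]

# Frequency of each score (y-axis) per score value (x-axis)
freq = Counter(scores)
for value in range(min(scores), max(scores) + 1):
    print(f"{value}: {'#' * freq[value]}")
```

Values that never occur (here 1 and 2) get an empty bar, which makes the gap between the score 0 and the rest of the scores visible.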
BOXPLOT
A boxplot is another way to present our data. At the center of the box is the median (the score that stands in
the middle and divides the data into two halves). The top and the bottom of the box are the limits within which
the middle 50% of the scores fall. The whiskers extend to the most extreme scores that are not outliers. Scores
that lie beyond the whiskers (more than 1.5 box lengths from the edge of the box) are considered outliers.
e.g. we want to explore the score distribution of the variable Extrinsic_Motivation_learn
In SPSS, choose Graphs > Chart Builder
In the Gallery tab, choose Boxplot, then drag Simple Boxplot into the Chart preview area.
Then drag the variable Extrinsic_Motivation_learn to the Y-axis. In addition, we can compare 2 groups
(male vs. female) by dragging the gender variable to the X-axis. In the Element Properties dialog box, we can
create a label for the X and Y-axes by clicking on the axis and typing the new label, then click Apply.
Click on OK to finish.
The boxplot shows that one score (score = 0) falls outside the whiskers and is therefore an outlier. In the
data, this score comes from case 58.
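The numbers behind a boxplot can be sketched in Python. The example assumes the common convention that scores more than 1.5 box lengths (interquartile ranges) beyond the edges of the box count as outliers. The scores are made up, and the quartiles come from Python's statistics.quantiles with its default method, which may differ slightly from the values SPSS uses to draw the box.

```python
from statistics import quantiles

# Hypothetical scores; the 0 is intended to show up as an outlier
scores = [0, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5]

# Quartiles: bottom of the box, median, top of the box
q1, median, q3 = quantiles(scores, n=4)
box_length = q3 - q1  # interquartile range

# Assumed outlier rule: more than 1.5 box lengths beyond the box
lower_limit = q1 - 1.5 * box_length
upper_limit = q3 + 1.5 * box_length
outliers = [s for s in scores if s < lower_limit or s > upper_limit]

print(f"box: {q1} to {q3}, median: {median}")
print(f"outliers: {outliers}")
```

With these scores the box runs from 4 to 5 and only the score 0 falls below the lower limit of 2.5, so it is the single outlier; the score 3 lies outside the box but is not an outlier.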
Assignment
- (optional): Try to recode the values of Intrinsic_Motivation_learn such that 1 → 5, 2 → 4, 3 → 3,
4 → 2, and 5 → 1. Name the new variable Intrinsic_reversed.
- (optional): Try to compute the natural logarithm (ln) of Extrinsic_Motivation_learn using the Function
Group (Arithmetic). Name the new variable Ln_Extrinsic_Motivation.
- Create your own sample data: at least 10 variables and at least 50 cases (deadline October
15, 2014)