Introduction to SPSS
SPSS: Pros and Cons
Pros:
• Minimum effort to start using the software
• User-friendly, menu-driven interface
• No need to know programming or command syntax
Cons:
• Not free
• Proprietary software: dependent on IBM for updates and support
• Point-and-click does not encourage reproducibility
• Limited advanced statistical methods (e.g., SEM, multilevel modeling) and graphical
capabilities
Objective and Outline
• Focus on common data management utilities in SPSS
• Emphasis on point-and-click
• Topics:
• SPSS environment
• Entering data
• Modifying data
• Combining data
• Exploring data
• Linear regression example
Software and Materials
• SPSS version 29
• SPSS version 30 (released 9.30.2024)
• CLICC Virtual Desktop
• Workshop materials
SPSS Environment
• Data View
• Variable View
• Overview
• Syntax Editor
• Command Syntax
• Output (Viewer)
SPSS Environment
• Three main windows to interact with your data:
• Data View
• Spreadsheet format
• Each row represents an observation,
each column a variable
• Variable View:
• Manage your variables
• Each row represents a variable, each column
represents various attributes of the variable
• Overview:
• New feature in SPSS 29
• Dashboard style summary of the active dataset
(number of cases and variables, missing values, duplicated cases, etc.)
Syntax Editor
• Complicated analyses involving multiple
choices might be difficult to replicate using
point-and-click
• Make your analysis reproducible, even when
you use point-and-click
• Click the Paste button instead of OK
to generate syntax
• Save the script for future reference
• See the Command Syntax Reference in the
Help menu
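As a rough illustration of what Paste produces, requesting a frequency table for one variable through the menus would write something like the following into the Syntax Editor. The variable name ses is an assumption borrowed from the workshop's example data, and the exact subcommands pasted may differ by version.

```spss
* Pasted from Analyze > Descriptive Statistics > Frequencies (variable name is assumed).
FREQUENCIES VARIABLES=ses
  /ORDER=ANALYSIS.
```

Saving this text as a .sps file and rerunning it later repeats the same analysis without clicking through the dialogs again.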
Entering Data
• Reading in an Excel or CSV file
• Open SPSS Data
• Import Data
• Missing Data
Importing Data into SPSS
• SPSS can read various types of data from
different sources:
• SPSS data (.sav)
• Excel (.xls, .xlsx)
• CSV (.csv)
• Text files (.txt)
• Database files (e.g., SQL)
• Other statistical software: R, Stata, SAS
• Open data: For SPSS data and some non-SPSS formats, typically requires less setup
• Import data: For non-SPSS file formats, this option allows you to customize how the
data is read, such as specifying delimiters, data types, and variable names
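For reference, the two routes correspond roughly to the GET FILE and GET DATA commands. This is a minimal sketch: the file paths and the sheet name are hypothetical, and the options that the Import Data dialog pastes may be more detailed.

```spss
* Open an SPSS data file directly (path is hypothetical).
GET FILE='C:\workshop\hs1.sav'.

* Import an Excel workbook, reading variable names from the first row.
GET DATA
  /TYPE=XLSX
  /FILE='C:\workshop\hs1.xlsx'
  /SHEET=NAME 'Sheet1'
  /CELLRANGE=FULL
  /READNAMES=ON.
```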
Two Types of Missing Data in SPSS
• System Missing
• Automatically assigned when data is imported without a value for a particular variable
• Treated as missing by SPSS analyses
• User Missing
• Values that a researcher explicitly designates as missing
• Specific values (e.g., -99, 999) can represent missing data
• This allows for more flexibility in handling data, particularly when certain codes in a
dataset are not valid responses (like indicating "no response" or "not applicable")
• User missing values can be set up in the Variable View by specifying which values
should be treated as missing
• User Missing supports multiple types of missingness, while System Missing allows only one
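The Variable View setting has a syntax equivalent in the MISSING VALUES command. The sketch below assumes numeric variables named read and write that use -99 and 999 as missing codes; substitute the variables and codes from your own data.

```spss
* Declare -99 and 999 as user-missing for two variables (names and codes are assumed).
MISSING VALUES read write (-99, 999).
```

Each numeric variable can have up to three discrete user-missing values, or a range plus one discrete value.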
Missing data representation in SPSS
• For numeric variables:
• System Missing is represented by a dot
• User Missing by a number
• For string variables:
• SPSS does not automatically recognize
a blank space as a system missing value
• Even an empty string is a valid data point
• Explicitly define missing string variable values
• In SPSS, it is generally preferred to have your data in numeric format rather than as
string variables, especially when performing statistical analyses
Modifying Data
• Reordering
• Variable / value labels
• Renaming
• Transform
• String to numeric
• Recoding
• Compute variable
Modify Data in Variable View Window
• Reordering
• Variable labels
• Value labels
• Renaming
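These Variable View edits can also be written as syntax, which keeps a record of them. In this sketch the variable names (female, gender, write, ses) and the coding of ses are assumptions used for illustration.

```spss
* Rename a variable (old and new names are assumed).
RENAME VARIABLES (female = gender).

* Attach a descriptive label to a variable.
VARIABLE LABELS write 'Writing score'.

* Attach labels to the coded values of a categorical variable (coding is assumed).
VALUE LABELS ses 1 'low' 2 'middle' 3 'high'.
```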
Transform menu
• Transform:
• String to numeric
• Recoding
• Compute variable
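A minimal syntax sketch of the three Transform tasks follows. The variable id_str and the new variable names are hypothetical, and the coding of ses (3 = high) is an assumption.

```spss
* String to numeric: convert a string variable holding digits into a numeric one.
COMPUTE id_num = NUMBER(id_str, F8.0).

* Recoding: collapse ses into a 0/1 indicator stored in a new variable.
RECODE ses (3=1) (1,2=0) INTO high_ses.

* Compute variable: create a total from two existing scores.
COMPUTE total = read + write.
EXECUTE.
```

Recoding INTO a new variable leaves the original values untouched, which makes mistakes easier to undo.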
Document Data Modifications
• Document as much as you can:
• Click Paste instead of OK when
available
• Use Utilities -> Data File Comments to
document the changes
• To display the comments in the Viewer,
select Display comments in output
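Data file comments can also be added and printed from syntax. In this sketch the comment text is only an example.

```spss
* Store a note about a change inside the data file (text is an example).
ADD DOCUMENT 'Recoded ses into high_ses; 3 = high, 1 and 2 = not high.'.

* Print all stored comments in the Viewer.
DISPLAY DOCUMENTS.
```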
Combining Data
• Appending
• Merging
Appending (Add cases)
Combines multiple datasets by adding rows from one dataset to another
Data -> Merge Files -> Add Cases
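In syntax, this menu path corresponds to the ADD FILES command. The sketch assumes the first dataset is already open and that a second file, here given the hypothetical name hs1_new_cases.sav, contains the same variables.

```spss
* Append the rows of a second file to the active dataset (file name is hypothetical).
ADD FILES
  /FILE=*
  /FILE='hs1_new_cases.sav'.
EXECUTE.
```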
Merging
Joins datasets by matching rows based on shared variable(s)
Data -> Merge Files -> Add Variables
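A one-to-one merge can be sketched with MATCH FILES. The file names and the key variable id are assumptions; both datasets are sorted by the key first, which MATCH FILES with BY expects.

```spss
* Open and sort the first file, then give the dataset a name.
GET FILE='demographics.sav'.
SORT CASES BY id.
DATASET NAME demo.

* Open and sort the second file.
GET FILE='scores.sav'.
SORT CASES BY id.
DATASET NAME scores.

* Join the two datasets on the shared key.
DATASET ACTIVATE demo.
MATCH FILES
  /FILE=*
  /FILE='scores'
  /BY id.
EXECUTE.
```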
Exploring Data
• Descriptives
• Stratified
• Conditional
• Crosstabulations
• Correlations
• Scatterplots
Descriptive Statistics
• Descriptive statistics are useful for:
• Getting to know your data (easier to understand through means, medians, SDs)
• Identifying patterns
• Enabling comparison
• Visualizing data
• Overview window
Descriptives
• In SPSS, the Descriptives procedure calculates basic descriptive statistics for one or more
continuous numeric variables
• By default, Descriptives computes mean, standard deviation,
minimum, and maximum
• Click "Options" to customize additional descriptive statistics
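The default request looks roughly like this in syntax, assuming two continuous variables named read and write.

```spss
* Mean, standard deviation, minimum, and maximum for two variables (names are assumed).
DESCRIPTIVES VARIABLES=read write
  /STATISTICS=MEAN STDDEV MIN MAX.
```

Other keywords, such as VARIANCE, RANGE, SKEWNESS, or KURTOSIS, can be added to the STATISTICS subcommand to match the choices in the Options dialog.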
Stratified and Conditional Descriptives
• Descriptive statistics calculated for
subgroups or subsets within a larger
dataset
• Use Data -> Split File option to
summarize subsets of your data based
on one or more grouping variables
• Use Data -> Select Cases option to
summarize data and exclude certain
cases based on specific criteria
• Split File and Select Cases stay in effect
for every later analysis. You must turn
them off to analyze all cases again
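Both options, including the step that turns them off, can be sketched in syntax. The variable names gender, ses, and write, and the coding ses = 3 for high, are assumptions.

```spss
* Stratified: report writing scores separately for each gender group.
SORT CASES BY gender.
SPLIT FILE LAYERED BY gender.
DESCRIPTIVES VARIABLES=write
  /STATISTICS=MEAN STDDEV MIN MAX.
SPLIT FILE OFF.

* Conditional: restrict the analysis to high-ses cases using a filter variable.
COMPUTE high_ses = (ses = 3).
FILTER BY high_ses.
DESCRIPTIVES VARIABLES=write.
* Turn the filter off so later analyses use all cases again.
FILTER OFF.
USE ALL.
```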
Graphs
• Graphs simplify complex data, reveal patterns,
show how data are distributed, and make results
easier to communicate
• SPSS has a variety of graphs and charts (in
previous versions referred to as Legacy
Dialogs) that can be used to visualize data
• Chart Builder offers more flexibility than the
predefined chart dialogs
• Allows you to create charts using predefined gallery
options or individual elements (e.g., axes and bars)
• Build charts by dragging and dropping items onto the
canvas
(Bivariate) Crosstabulations
Useful for examining relationships between two categorical variables
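A two-way table of this kind can be requested with CROSSTABS. The variables gender and ses are assumed, and the chi-square test is an optional extra.

```spss
* Cross-tabulate two categorical variables with counts and row percentages.
CROSSTABS
  /TABLES=gender BY ses
  /CELLS=COUNT ROW
  /STATISTICS=CHISQ.
```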
(Bivariate) Correlations
Useful for examining relationships between two continuous variables
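For two continuous scores, a Pearson correlation could be requested as below. The variable names read and write are assumed.

```spss
* Pearson correlation between two continuous variables, two-tailed test.
CORRELATIONS
  /VARIABLES=read write
  /PRINT=TWOTAIL NOSIG
  /MISSING=PAIRWISE.
```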
(Bivariate) Scatterplot
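A simple scatterplot of the same pair of variables can be drawn with the GRAPH command. This is a sketch with assumed variable names; read goes on the horizontal axis and write on the vertical axis.

```spss
* Scatterplot of writing score against reading score (names are assumed).
GRAPH
  /SCATTERPLOT(BIVAR)=read WITH write
  /MISSING=LISTWISE.
```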
Linear Regression Example
General Linear Model in SPSS
Let’s model writing score as a function of gender, ses and reading score
Use hs1.sav
Univariate GLM Options
• Dependent Variable: Outcome variable
• Fixed factors: Categorical variables. SPSS creates
dummy variables (0/1) for all but one reference
category. The last group becomes the reference
category (e.g., in ses, "high" is the reference
category in SPSS)
• Covariates: Continuous predictors
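• Clicking OK or Paste without opening Model fits the
default Full Factorial model, which includes
factor-by-factor interactions (here, gender by ses)
• To fit main effects only, or to choose which
interactions to include, click the Model button in
the upper right corner.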
Univariate GLM with Main effects
• First, indicate main effects:
1. Select variables.
2. In the middle box, choose Main Effects
from the dropdown menu.
3. Click the arrow below Main Effects to
add the terms to the Model window on the right.
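Pasting a main-effects model should produce syntax along these lines. The variable names write, gender, ses, and read are assumed from the example description, and the PRINT subcommand is added here to request parameter estimates.

```spss
* Writing score modeled on gender and ses (factors) and reading score (covariate).
UNIANOVA write BY gender ses WITH read
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /PRINT=PARAMETER
  /CRITERIA=ALPHA(.05)
  /DESIGN=gender ses read.
```

The DESIGN subcommand lists exactly the terms placed in the Model window, so it is the quickest place to check which effects were fitted.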
Univariate GLM with Interactions
• The options in the Specify Model section:
• Full Factorial: Includes all main effects for
factors and covariates, as well as all
factor-by-factor interactions. Covariate
interactions are not included.
• Build Terms: Specify main effects and
interaction terms manually.
• Build Custom Terms: Offers more flexibility
than Build Terms, allowing for polynomial
terms and other custom constructs.
• To indicate an interaction:
1. Click on Build Terms.
2. Select the variables to interact (e.g., gender
and read).
3. In the middle box, select Interaction from the
dropdown menu.
4. Click the arrow below Interaction to add the
terms to the Model window on the right.
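In syntax, the interaction built in these steps appears as an extra term on the DESIGN subcommand. This sketch assumes the same variable names as the example (write, gender, ses, read).

```spss
* Main effects plus a gender-by-reading-score interaction.
UNIANOVA write BY gender ses WITH read
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /PRINT=PARAMETER
  /CRITERIA=ALPHA(.05)
  /DESIGN=gender ses read gender*read.
```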
To display the regression coefficients, click Options and check “Parameter estimates”
Additional resources
• IBM SPSS resources (demos, tutorials, packages, add-ons)
• UCLA OARC resources (learning modules, data analysis examples, annotated
outputs)
• SPSS user forum
• Search YouTube for SPSS videos
Thank you!
