1. Core System
1.1.1. What's new in version 21?
1.1.2. What's new in version 20?
1.1.3. What's new in version 19?
1.1.4. What's new in version 18?
1.1.5. What's new in version 17.0?
1.1.6. What's new in version 16.0?
1.1.7. What's new in version 15.0?
1.1.8. What's new in version 14.0.1
1.1.9. What's new in version 14.0?
1.1.9.1. Version 14.0 compatibility with previous releases
1.1.10. What's new in version 11.5?
1.1.11. What's new in version 11.0
1.1.12. What's new in version 10.0.5
1.1.13. What's new in version 10.0
1.1.14. What's new in version 9.0
1.1.15. What's new in version 8.0
1.1.16. What's new in version 7.5
1.1.17. What's new in version 7.0
1.1.18.1. Designated window versus active window
1.1.18.1.1. Changing the designated window
1.1.19. Status Bar
1.1.20. Dialog boxes
1.1.21. Variable names and variable labels in dialog box lists
1.1.22. Resizing dialog boxes
1.1.23. Dialog box controls
1.1.24. Selecting variables
1.1.25. Data type, measurement level, and variable list icons
1.1.26. Getting information about variables in dialog boxes
1.1.27. Command line options
1.1.28. Basic steps in data analysis
1.1.29. Statistics Coach
1.2. Getting Help
1.2.1. Getting Help on Output Terms
1.3. Data files
1.3.1. Opening data files
1.3.1.1. To open data files
1.3.1.2. Data file types
1.3.1.3. Opening file options
1.3.1.4. Reading Excel Files
1.3.1.5. Reading Excel 95 or Later Files
1.3.1.5.1. How to Read Excel 95 or Later Files
1.3.1.6. Reading older Excel files and other spreadsheets
1.3.1.7. Reading dBASE files
1.3.1.8. Reading Stata files
1.3.1.9. Reading Database Files
1.3.1.9.1. To Read Database Files
1.3.1.9.2. Selecting a Data Source
1.3.1.9.3. Selecting Data Fields
1.3.1.9.4. Creating a Relationship between Tables
dialogs for generating command syntax. You can create custom dialogs to generate syntax from
multiple commands, including custom extension commands implemented in Python or R. See the
topic Creating and Managing Custom Dialogs for more information.
Multiple language support. In addition to the ability to change the output language available in
previous releases, you can now change the user interface language. See the topic General options
for more information.
Codebook. The Codebook procedure reports the dictionary information -- such as variable names,
variable labels, value labels, missing values -- and summary statistics for all or specified variables
and multiple response sets in the active dataset. For nominal and ordinal variables and multiple
response sets, summary statistics include counts and percents. For scale variables, summary
statistics include mean, standard deviation, and quartiles. See the topic Codebook for more
information.
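As a rough illustration of the two kinds of summaries described above, the following Python sketch computes counts and percents for a nominal variable, and mean, standard deviation, and quartiles for a scale variable. It is not the Codebook procedure itself. The function names are hypothetical, missing values are assumed to be represented by None, and the quartile method is whatever Python's statistics.quantiles uses by default, which may differ from the method used by the procedure.

```python
from collections import Counter
from statistics import mean, stdev, quantiles


def summarize_nominal(values):
    """Counts and percents per category; None is treated as missing (assumption)."""
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    # Percents are based on valid (non-missing) cases only.
    return {category: (n, 100.0 * n / len(valid)) for category, n in counts.items()}


def summarize_scale(values):
    """Mean, sample standard deviation, and quartiles of the non-missing values."""
    valid = [v for v in values if v is not None]
    q1, q2, q3 = quantiles(valid, n=4)  # default 'exclusive' method
    return {"mean": mean(valid), "std_dev": stdev(valid), "quartiles": (q1, q2, q3)}


if __name__ == "__main__":
    print(summarize_nominal(["a", "b", "a", None]))
    print(summarize_scale([1, 2, 3, 4, 5, 6, 7]))
```

Both functions need at least one valid value, and summarize_scale needs at least two.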
Nearest Neighbor analysis. Nearest Neighbor analysis is a method for classifying cases based on
their similarity to other cases. In machine learning, it was developed as a way to recognize patterns
of data without requiring an exact match to any stored patterns, or cases. Similar cases are near
each other and dissimilar cases are distant from each other. Thus, the distance between two cases
is a measure of their dissimilarity. See the topic Nearest Neighbor Analysis for more information.
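The idea that distance measures dissimilarity can be sketched in a few lines of Python. This is a minimal illustration, not the procedure's implementation. It assumes Euclidean distance and a simple majority vote among the k nearest stored cases, neither of which is stated above, and the function names are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Distance between two cases given as equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(case, stored_cases, k=3):
    """Classify a case by majority vote among its k nearest stored cases.

    stored_cases is a list of (features, label) pairs.
    """
    nearest = sorted(stored_cases, key=lambda s: euclidean(case, s[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    stored = [((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"), ((8, 8), "b"), ((9, 8), "b")]
    print(classify((1.5, 1.5), stored))  # the three nearest cases are all labeled "a"
```

The new case does not need to match any stored case exactly. It only needs to be closer to cases of one class than to cases of the others.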
Multiple Imputation. The Multiple Imputation procedure performs multiple imputation of missing
data values. Given a dataset containing missing values, it outputs one or more datasets in which
missing values are replaced with plausible estimates. You can then obtain pooled results when
running other procedures. The procedure also summarizes missing values in the working dataset.
This feature is available in the Missing Values add-on option. See the topic Impute Missing Data
Values (Multiple Imputation) for more information.
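To give a sense of what pooled results involve, the sketch below combines one parameter estimate obtained from each of several imputed datasets. It uses the combining rules commonly known as Rubin's rules. That choice is an assumption made for illustration, because the text above does not state how pooling is computed, and the function name is hypothetical.

```python
from statistics import mean, variance


def pool_estimates(estimates, variances):
    """Pool one estimate across m imputed datasets (requires m >= 2).

    estimates: the parameter estimate from each imputed dataset
    variances: the squared standard error of that estimate in each dataset
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    pooled = mean(estimates)        # average of the per-dataset estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


if __name__ == "__main__":
    print(pool_estimates([10.0, 12.0, 11.0], [1.0, 1.0, 1.0]))
```

The between-imputation term grows when the imputed datasets disagree, so the pooled variance reflects uncertainty about the missing values as well as ordinary sampling error.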
RFM analysis. RFM (recency, frequency, monetary) analysis is a technique used to identify existing
customers who are most likely to respond to a new offer. This technique is commonly used in direct
marketing. This feature is available in the EZ RFM add-on option. See the topic RFM Analysis for
more information.
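One common way to turn recency, frequency, and monetary values into a single score is to rank customers on each measure, bin the ranks, and concatenate the three bin numbers. The Python sketch below illustrates that idea. The field names, the five-bin default, and the tie handling (ties keep their input order) are assumptions for illustration, not the behavior of the add-on option.

```python
def rank_scores(values, bins=5, higher_is_better=True):
    """Assign each value a score from 1 (worst) to bins (best) by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not higher_is_better)
    scores = [0] * len(values)
    for rank, i in enumerate(order):
        # Rank 0 is the worst case; ranks are split into equal-sized bins.
        scores[i] = 1 + rank * bins // len(values)
    return scores


def rfm_scores(customers, bins=5):
    """Combine the three scores into one number, for example 545 for R=5, F=4, M=5."""
    r = rank_scores([c["days_since_purchase"] for c in customers], bins,
                    higher_is_better=False)  # fewer days since purchase is better
    f = rank_scores([c["n_purchases"] for c in customers], bins)
    m = rank_scores([c["total_spent"] for c in customers], bins)
    return [100 * ri + 10 * fi + mi for ri, fi, mi in zip(r, f, m)]


if __name__ == "__main__":
    customers = [
        {"days_since_purchase": 5, "n_purchases": 20, "total_spent": 1000},
        {"days_since_purchase": 400, "n_purchases": 1, "total_spent": 10},
        {"days_since_purchase": 30, "n_purchases": 10, "total_spent": 500},
    ]
    print(rfm_scores(customers, bins=3))
```

Customers with the highest combined scores bought recently, buy often, and spend the most, so they are the ones the technique flags as most likely to respond.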
Categorical Regression enhancements. Categorical Regression has been enhanced to include
regularization and resampling methods to assess and improve prediction accuracy. Together, these
new methods make it possible to create state-of-the-art models, even for high-volume data (where
there are more variables than observations, such as in genomics). This feature is available in the
Categories add-on option. See the topic Categorical Regression (CATREG) for more information.
Graphboard. Graphboard visualizations are graphs, charts, and plots created from a visualization
template. IBM® SPSS® Statistics ships with built-in visualization templates. You can also use a
separate product, IBM® SPSS® Visualization Designer, to create your own visualization templates.
The new visualization templates are effectively custom visualization types. See the topic Creating
and Editing Graphboard Visualizations for more information.
Exporting output. More output export format options and more control over exported content,
including:
• Wrap or shrink wide tables in Word documents. See the topic Word/RTF options for more
information.
• Create new worksheets or append data to existing worksheets in an Excel workbook. See the
topic Excel options for more information.
• Save output export specifications in the form of command syntax with the OUTPUT EXPORT
command. All the features for exporting output in the Export Output dialog are now also available
in command syntax; so you can save and re-run your export specifications and include them in
automated production jobs. See the topic OUTPUT EXPORT for more information.
• The Output Management System (OMS) now supports these additional output formats: Word,
Excel, and PDF. See the topic Output Management System for more information.
See the topic ALTER TYPE for more information.
• Read and write Unicode data and syntax files. See the topic General options for more information.
• Control the default directory location to look for and save files. See the topic File locations
options for more information.
Performance. For computers with multiple processors or processors with multiple cores,
multithreading for faster performance is now available for some procedures. See the topic THREADS
Subcommand (SET command) for more information.
Statistical enhancements. Statistical enhancements include:
• Partial Least Squares (PLS). A predictive technique that is an alternative to ordinary least
squares (OLS) regression, canonical correlation, or structural equation modeling. It is
particularly useful when predictor variables are highly correlated or when the number of
predictors exceeds the number of cases. See the topic Partial Least Squares Regression for more
information.
• Multilayer perceptron (MLP). The MLP procedure fits a particular kind of neural network called a
multilayer perceptron. The multilayer perceptron uses a feed-forward architecture and can have
multiple hidden layers. The multilayer perceptron is very flexible in the types of models it can fit.
It is one of the most commonly used neural network architectures. This procedure is available in
the new Neural Networks option. See the topic Multilayer Perceptron for more information.
• Radial basis function (RBF). A radial basis function (RBF) network is a feed-forward, supervised
learning network with only one hidden layer, called the radial basis function layer. Like the
multilayer perceptron (MLP) network, the RBF network can do both prediction and classification.
It can be much faster than MLP; however, it is not as flexible in the types of models it can fit.
This procedure is available in the new Neural Networks option. See the topic Radial Basis
Function for more information.
• Generalized Linear Models supports numerous new features, including ordinal multinomial and
Tweedie distributions, maximum likelihood estimation of the negative binomial ancillary parameter,
and likelihood-ratio statistics. This procedure is available in the Advanced Statistics option. See
the topic Generalized Linear Models Response for more information.
• Cox Regression now provides the ability to export model information to an XML (PMML) file. This
procedure is available in the Advanced Statistics option. See the topic Cox Regression Save New
Variables for more information.
• Complex Samples Cox Regression. Apply Cox proportional hazards regression to analysis of
survival times (that is, the length of time before the occurrence of an event) for samples drawn
by complex sampling methods. This procedure supports continuous and categorical predictors,
which can be time-dependent. This procedure provides an easy way of considering differences in
subgroups as well as analyzing effects of a set of predictors. The procedure estimates variances
by taking into account the sample design used to select the sample, including equal probability
and probability proportional to size (PPS) methods and with replacement (WR) and without
replacement (WOR) sampling procedures. This procedure is available in the Complex Samples
option.
Programmability extension. Programmability extension enhancements include:
• R-Plugin. Combine the power of IBM® SPSS® Statistics with the ability to write your own
statistical routines with R. This plug-in is available only as a download from
• Nested Begin Program-End Program command structures. See the topic BEGIN PROGRAM-END
PROGRAM for more information.
• Ability to create and manage multiple datasets.
Export results in PDF format. Export output in PDF format, including Viewer outline headings as
bookmarks in the PDF file. See the topic Export output for more information.
Control chart enhancements. You can now define rules for control charts to help you quickly
identify points that are out of control.
More chart types in Chart Builder. The Chart Builder has been expanded to include histograms,
boxplots, scatterplot matrices, overlay scatterplots, population pyramids, error bar charts, high-
low-close charts, difference area charts, range bar charts, dot plots, charts of separate variables,
and paneled charts. You can also create charts that were not previously available, such as charts
with dual, independent y axes.
Chart Editor enhancements. The Chart Editor now offers more control over your charts. Major
features include an updated Variables tab for changing chart types easily, automatic control of
white space, additional distribution curves for histograms, a tool for quickly rescaling axes, and the
ability to use custom equations to create reference lines. See the topic What's New and Different
for more information.
Programmatic control of output documents. You can now create, open, activate, save, and
close Viewer documents with command syntax using OUTPUT NEW, OUTPUT NAME, OUTPUT
ACTIVATE, OUTPUT OPEN, OUTPUT SAVE, and OUTPUT CLOSE.
Ordinal Regression. This procedure, previously available as part of the Advanced Statistics add-on
option, is now available in the Core system. See the topic Ordinal Regression for more
information.
PMML model files with transformations. You can now include transformations in PMML model files
and merge information from model files using the TMS BEGIN-TMS END and TMS MERGE commands.
Generalized Linear Models. The Generalized Linear Models procedure expands the general linear
model so that the dependent variable is linearly related to the factors and covariates via a specified
link function. Moreover, the model allows for the dependent variable to have a non-normal
distribution. This procedure is available in the Advanced Statistics option. See the topic Generalized
Linear Models Response for more information.
Generalized Estimating Equations. The Generalized Estimating Equations procedure extends the
generalized linear model to allow for analysis of repeated measurements. This procedure is available
in the Advanced Statistics option. See the topic Generalized Estimating Equations for more
information.
Complex Samples Ordinal Regression. The Complex Samples Ordinal Regression procedure
performs regression analysis on a binary or ordinal dependent variable for samples drawn by complex
sampling methods. Optionally, you can request analyses for a subpopulation. This procedure is
available in the Complex Samples option. See the topic Complex Samples Ordinal Regression for more
information.
Optimal Binning. The Optimal Binning procedure discretizes one or more scale variables by
distributing the values of each variable into bins. Bin formation is optimal with respect to a
categorical guide variable that "supervises" the binning process. Bins can then be used instead of
the original data values for further analysis. This procedure is available in the Data Preparation
option. See the topic Optimal Binning for more information.
The Programmability Extension now allows you to write to the active dataset and create custom
pivot tables and custom procedures. For more information go to
commas, or other characters that are not allowed in variable names. See the topic RENAME
Subcommand (SAVE TRANSLATE command) for more information.
• Use the new SQL subcommand of the SAVE TRANSLATE command to append new columns to
database tables, modify database table column attributes, join tables, and perform other actions
that are permitted with valid SQL statements. See the topic SQL Subcommand (SAVE
TRANSLATE command) for more information.
• Use the new Chart Builder interface (Graphs menu) to build charts from predefined gallery charts
or from the individual parts (for example, coordinate systems and bars) that make up a chart.
See the topic Building Charts for more information.
• Create custom chart types by using powerful GGRAPH and GPL command syntax. See the topic
GGRAPH for more information.
• New Expert Modeler in the Forecasting option automatically identifies and estimates the best-
fitting model for one or more time series, thus eliminating the need to identify an appropriate
model through trial and error. For more information, see Time Series Modeler and TSMODEL.
• New Data Validation option provides a quick visual snapshot of your data and provides the ability
to apply validation rules that identify invalid data values. You can create rules that flag out-of-
range values, missing values, or blank values. You can also save variables that record individual
rule violations and the total number of rule violations per case. A limited set of predefined rules is
provided that you can copy or modify. For more information, see Introduction to Data Preparation
• New Anomaly Detection procedure in the Data Validation option finds unusual observations that
could adversely affect predictive models. Some of these outlying observations represent truly
unique cases and are thus unsuitable for prediction, while other observations are caused by
data-entry errors in which the values are technically “correct” and thus cannot be caught by the
Validate Data procedure. For more information, see Identify Unusual Cases and DETECTANOMALY.
• New Multidimensional Unfolding procedure (PREFSCAL) in the Categories option attempts to find
the structure in a set of proximity measures between row and column objects. This process is
accomplished by assigning observations to specific locations in a conceptual low-dimensional
space such that the distances between points in the space match the given (dis)similarities as
closely as possible. The result is a least-squares representation of the objects in that low-
dimensional space, which, in many cases, helps you further understand your data. This procedure
is currently available with PREFSCAL command syntax. See the topic PREFSCAL for more
information.
• New Predictor Selection procedure (SELECTPRED) in SPSS Statistics Server sifts through a very
large number of categorical and continuous predictor variables. The procedure selects a smaller
subset for use in predictive modeling procedures that cannot accept so many predictors. This
procedure is currently available with SELECTPRED command syntax. See the topic SELECTPRED
for more information.
• New Naïve Bayes procedure (NAIVEBAYES) in SPSS Statistics Server produces a simple and stable
model for predictor selection and classification. This procedure is currently available with
NAIVEBAYES command syntax. See the topic NAIVEBAYES for more information.
• Improved significance testing capabilities in the Custom Tables option allow you to perform
significance tests on subtotals and multiple response sets. See the topic Custom Tables: Test
Statistics Tab for more information.
• More flexibility is available in defining multiple response sets for multiple dichotomies. See the
select specific autoscript functions.
• Use Create/Modify Autoscripts on the Utilities menu to create new autoscript functions for the
currently selected output object type in the Viewer.
• Use Run Scripts on the Edit menu to run a personal script on the currently selected output
object in the Viewer (a variety of sample personal scripts are supplied with IBM® SPSS®
Statistics).
• Use Open or New on the File menu to modify any personal script or create a new personal script.
HTML and ASCII format for exporting output. You can export output in HTML (HTML 3.0) and
ASCII text format. For HTML format, pivot tables can be exported as HTML tables, and charts can
be exported in JPEG format and automatically embedded by reference in your HTML document. Use
Export on the File menu of the Viewer to export output.
Expanded features for reading databases. The new Database Wizard enables you to specify
multiple joins, including both inner and outer joins. Use Database Capture on the File menu to read
databases into SPSS Statistics.
Customizable toolbars. You can modify toolbars and create your own toolbars to include the
features you use often, including personal scripts and any items available on the menus. Use
Toolbars on the View menu to customize toolbars.
Statistics Coach. For users who are not familiar with SPSS Statistics or with the available
statistical procedures, the Statistics Coach can help you get started with many of the basic
statistical techniques in the Core system.
More statistical procedures in the Core system. Factor analysis, discriminant analysis, cluster
analysis, and proximity and distance measures are now included in the Core system (Analyze menu)
and feature new, flexible, pivot table output.
Variance Components Analysis. A new procedure in the Advanced Statistics option, Variance
Components Analysis extends the analytic capabilities of the General Linear Model procedures.
Statistical enhancements. Many statistical procedures now have additional features:
• Crosstabs. McNemar test and clustered bar charts.
• Frequencies. Pie charts.
• Factor Analysis. Promax rotation method.
• Discriminant Analysis. Leave-one-out classification (similar to jackknifing).
• Logistic Regression (Professional Statistics). Pseudo R Squared measures and Hosmer-
Lemeshow goodness-of-fit statistics.
• General Linear Model (Advanced Statistics). Expanded set of analysis options and techniques.
New tables features. With the Custom Tables option, you can save multiple response set
information, and pivoting features have been enhanced to provide greater flexibility for pivoting
More printing control. Printing features have been expanded to include alignment control of
individual output items, user-specified page and column breaks in large tables, and widow and
orphan control for tables that break across pages.
• Use Align Left, Center, or Align Right on the Format menu in the Viewer to change the alignment
for the selected output item.
• Use Break on the Format menu in an activated pivot table to specify a page or column break at
There are a number of different types of windows in IBM® SPSS® Statistics:
Data Editor. The Data Editor displays the contents of the data file. You can create new data files
or modify existing data files with the Data Editor. If you have more than one data file open, there is
a separate Data Editor window for each data file.
Viewer. All statistical results, tables, and charts are displayed in the Viewer. You can edit the
output and save it for later use. A Viewer window opens automatically the first time you run a
procedure that generates output.
Pivot Table Editor. Output that is displayed in pivot tables can be modified in many ways with the
Pivot Table Editor. You can edit text, swap data in rows and columns, add color, create
multidimensional tables, and selectively hide and show results.
Chart Editor. You can modify high-resolution charts and plots in chart windows. You can change
the colors, select different type fonts or sizes, switch the horizontal and vertical axes, rotate 3-D
scatterplots, and even change the chart type.
Text Output Editor. Text output that is not displayed in pivot tables can be modified with the Text
Output Editor. You can edit the output and change font characteristics (type, style, color, size).
Syntax Editor. You can paste your dialog box choices into a syntax window, where your selections
appear in the form of command syntax. You can then edit the command syntax to use special
features that are not available through dialog boxes. You can save these commands in a file for use
in subsequent sessions.
Data Editor and Viewer
-server, -user, and -password switches. Windows only.
-singleseat. Start application in a single seat mode.
-nologo. Start the application without displaying the splash screen.
-production [prompt|silent]. Start the application in production mode. The prompt and silent
keywords specify whether to display the dialog box that prompts for runtime values if they are
specified in the job. The prompt keyword is the default and shows the dialog box. The silent
keyword suppresses the dialog box. If you use the silent keyword, you can define the runtime
symbols with the -symbol switch. Otherwise, the default value is used. The -switchserver and
-singleseat switches are ignored when using the -production switch.
-symbol <values>. List of symbol-value pairs used in the production job. Each symbol name starts
with @. Values that contain spaces should be enclosed in quotes. Rules for including quotes or
apostrophes in string literals may vary across operating systems, but enclosing a string that
includes single quotes or apostrophes in double quotes usually works (for example, "'a quoted
value'"). The symbols must be defined in the production job using the Runtime Values tab. See the
topic Runtime values for more information.
-background. Run the production job in the background on a remote server. Your local computer
does not have to remain on and does not have to remain connected to the remote server. You can
disconnect and retrieve the results later. You must also include the -production switch and
specify the server using the -server switch.
<filename> .... List of filenames, which can include all application supported file types. Enclose a
file name with double quotes if it contains spaces.
-help|-h. Display the command help.
If the -server, -user, -password, -switchserver, and -singleseat switches are omitted, SPSS
Statistics runs in the default mode.
Note: The following examples assume that you changed directories to the executable location. The
details may vary by operating system and may require path specifications.
Starting in distributed mode using a specific server:
stats -server mystatssvr:3016 -user myuser -password mypassword
Starting in distributed mode using a specific server and a domain name:
stats -server mystatssvr:3016 -user "mydomain\myuser" -password mypassword
Starting in single seat mode:
stats -singleseat
Starting in production mode, letting SPSS Statistics prompt for runtime values:
stats C:\job1.spj -production
Starting in production mode with defined symbol-value pairs:
stats C:\job1.spj -production silent -symbol @sex male @state "North Dakota"
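When production jobs are launched from a script, it can be convenient to assemble the command line programmatically. The Python sketch below builds an argument list from the -production and -symbol switches described above. The function name is hypothetical, and the executable name stats is taken from the examples and may need a full path on your system. The function only builds the list and does not start the application.

```python
def production_command(job_file, symbols=None, silent=False, executable="stats"):
    """Build the argument list for running a production job.

    symbols maps runtime symbol names (each starting with @) to their values.
    """
    args = [executable, job_file, "-production"]
    if silent:
        # silent suppresses the dialog box that prompts for runtime values.
        args.append("silent")
    if symbols:
        args.append("-symbol")
        for name, value in symbols.items():
            if not name.startswith("@"):
                raise ValueError(f"symbol name must start with @: {name!r}")
            args.extend([name, value])
    return args


if __name__ == "__main__":
    cmd = production_command(
        "job1.spj", {"@sex": "male", "@state": "North Dakota"}, silent=True
    )
    print(cmd)
```

Passing the list to subprocess.run(cmd) rather than joining it into a single string leaves quoting of values that contain spaces, such as North Dakota, to the operating system layer.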