Amity Institute of Psychology and Allied Sciences
Introduction to SPSS (STAT626)
Internal Assessment (Test)
Module I - Introduction
Introduction to SPSS
• SPSS means “Statistical Package for the Social Sciences” and was first
launched in 1968. Since SPSS was acquired by IBM in 2009, it's officially
known as IBM SPSS Statistics but most users still just refer to it as “SPSS”.
• SPSS is used by many kinds of researchers for complex statistical data
analysis. The package was created for the management and statistical
analysis of social science data. It was first released in 1968, and SPSS
Inc., the company that sold it, was acquired by IBM in 2009.
• Although the product is officially named IBM SPSS Statistics, most users
still refer to it as SPSS. It is widely used for social-science data analysis,
in part because of its straightforward, English-like command language and
its thorough user manual.
• SPSS is used by market researchers, health researchers, survey
companies, government entities, education researchers, marketing
organizations, data miners, and many others to process and analyze
survey data.
In computer science, garbage in, garbage out (GIGO) is the concept that
flawed, or nonsense (garbage) input data produces nonsense output.
Rubbish in, rubbish out (RIRO) is an alternate wording.
• SPSS - Quick Overview: Main Features
• SPSS is software for editing and analyzing all sorts of data. These data may
come from almost any source: scientific research, a customer database,
Google Analytics, or even the server log files of a website.
• SPSS can open the file formats commonly used for structured data, such as:
• spreadsheets from MS Excel or OpenOffice;
• plain text files (.txt or .csv);
• relational (SQL) databases;
• data files from Stata and SAS.
Data analysis with SPSS: general aspects,
workflow, critical issues
In general:
• Except for graphs, SPSS output should not be presented in reports or presentations.
• Instead, the information from SPSS output should be used to construct proper tables,
or to produce proper conclusions.
• SPSS has two worksheets; both must be set up correctly (you can switch
between them by clicking the tabs at the bottom of the SPSS window):
• The Data View, which contains the data.
• The Variable View, which contains information about each variable.
• In SPSS (as with most statistical software), each row represents one unit of analysis
in the Data View.
• In SPSS (as with most statistical software), each column represents one variable in
the Data View.
• At the top of each column (but not in Row 1!) is the name of the variable.
• When setting up the Variable View, many columns can be filled in. These are the
important columns to get right (the others aren’t important for our purposes):
• Name: A short name for the variable (with no spaces or punctuation!).
• Type: Usually Numeric (apart from text such as names, which can be String).
• Label: A fuller description of the variable. This is what will appear in tables and
graphs to describe the variable. For example: Diastolic blood pressure (in mm Hg).
• Values: For qualitative variables only: tell SPSS what each number represents (for
example, does a 1 represent females, or males?).
• Measure: This is important: you must tell SPSS whether each variable is nominal,
ordinal, or scale (i.e. quantitative) by using the drop-down options.
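The Variable View columns described above can be pictured as one small record per variable. The following is a minimal Python sketch, not SPSS syntax: the `VariableDef` class, its field names, the example variables, and the simplified naming rule are all assumptions made for illustration and do not reproduce SPSS's actual rules.

```python
import re
from dataclasses import dataclass, field

MEASURES = {"nominal", "ordinal", "scale"}

@dataclass
class VariableDef:
    """One row of a Variable View-like table (hypothetical structure)."""
    name: str        # short name, no spaces
    var_type: str    # "Numeric" or "String"
    label: str       # fuller description shown in tables and graphs
    measure: str     # "nominal", "ordinal", or "scale"
    values: dict = field(default_factory=dict)  # code -> meaning (qualitative only)

    def problems(self):
        """Return a list of human-readable problems with this definition."""
        found = []
        # Simplified rule assumed for this sketch: letters, digits, underscores.
        if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", self.name):
            found.append(f"{self.name!r}: use letters, digits, and underscores only")
        if self.measure not in MEASURES:
            found.append(f"{self.name!r}: measure must be one of {sorted(MEASURES)}")
        if self.measure == "scale" and self.values:
            found.append(f"{self.name!r}: value labels are for qualitative variables")
        return found

sex = VariableDef("sex", "Numeric", "Sex of respondent", "nominal",
                  {1: "Female", 2: "Male"})
dbp = VariableDef("dbp", "Numeric", "Diastolic blood pressure (in mm Hg)", "scale")
bad = VariableDef("blood pressure", "Numeric", "Blood pressure", "interval")

for variable in (sex, dbp, bad):
    print(variable.name, "->", variable.problems() or "ok")
```

Keeping the definition separate from the data mirrors the split between the Variable View and the Data View: the checks run on the definitions alone, before any values are entered.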
SPSS: general description, functions, menus,
commands
SPSS WINDOWS
Data Editor Window: It displays the contents of the data file. This is the window that opens
automatically when you start an SPSS session. In this window, you can create new data files or
modify existing ones. When you open more than one data file, each data file has a separate
Data Editor Window. The Data Editor Window provides two views of the data:
Data View: It displays the data values. Each variable is a column. Each row is a case.
Variable View: It displays a table consisting of variable names and their attributes. You can modify the
properties of each variable or add new variables or delete existing variables in the Variable View
Window.
Viewer Window: It displays statistical results, tables, and charts. This window opens automatically the first time you
run a procedure that generates output.
Pivot Table Editor: It displays the results in pivot tables. To open this window, right-click the table, go to Edit Content,
and select “In Separate Window”. Alternatively, left-click the table, open the Edit menu, and select Edit Content and then
In Separate Window. You can then modify the table.
Chart Editor Window: This window is used to edit high-resolution charts and plots.
Text Output Editor Window: This is used to modify text output that is not displayed in pivot tables. To open the window,
right-click the text output, go to Edit Content, and select “In Separate Window”. You can then modify the text
output.
Syntax Editor Window: It displays the choices made in a dialog box in the form of command syntax. These
commands can be edited and run to produce output. You can also paste an existing SPSS program here and run it.
SPSS file management
There are three types of SPSS files that we will use during this class: data files, which
end in .sav; syntax files, which end in .sps; and output files, which end in .spv.
IBM SPSS Statistics Data File Structure
The basic structure of IBM SPSS Statistics data files is similar to a database table:
Rows (records) are cases. Each row represents a case or an observation. For example,
each individual respondent to a questionnaire is a case.
Columns (fields) are variables. Each column represents a variable or characteristic that is
being measured. For example, each item on a questionnaire is a variable.
IBM SPSS Statistics data files also contain metadata that describes and defines the data
contained in the file. This descriptive information is called the dictionary. The information
contained in the dictionary includes:
• Variable names and descriptive variable labels
• Descriptive value labels
• Missing value definitions
• Print and write formats
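To make the split between data and dictionary concrete, here is a small Python sketch (not SPSS syntax). The variable names, labels, the `999` missing-value code, and the `display_value` helper are hypothetical and exist only for this example.

```python
# Cases (rows): one dict per respondent; keys are variable names (columns).
cases = [
    {"id": 1, "sex": 1, "age": 34},
    {"id": 2, "sex": 2, "age": 999},   # 999 is a user-defined missing code here
    {"id": 3, "sex": 1, "age": 51},
]

# Dictionary (metadata): variable labels, value labels, missing-value definitions.
dictionary = {
    "id":  {"label": "Respondent ID", "value_labels": {}, "missing": set()},
    "sex": {"label": "Sex of respondent",
            "value_labels": {1: "Female", 2: "Male"}, "missing": set()},
    "age": {"label": "Age in years", "value_labels": {}, "missing": {999}},
}

def display_value(variable, value):
    """Translate a stored value into what a reader should see."""
    meta = dictionary[variable]
    if value in meta["missing"]:
        return None                      # treated as missing
    return meta["value_labels"].get(value, value)

for case in cases:
    shown = {dictionary[name]["label"]: display_value(name, value)
             for name, value in case.items()}
    print(shown)
```

The stored data stay purely numeric, while the dictionary carries everything needed to interpret them.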
• Setting the working directory
• Generating a codebook
• Defining, recoding, and computing variables
Module II - Input and data cleaning
Defining variable
• Defining a variable includes giving it a name, specifying
its type, the values the variable can take (e.g., 1, 2, 3),
etc.
• Without this information, your data will be much harder to
understand and use.
• Whenever you are working with data, it is important to
make sure the variables in the data are defined so that
you (and anyone else who works with the data) can tell
exactly what was measured, and how.
• You can define information about your variables by accessing
the Variable View tab (at the bottom of the Data Editor window).
The Variable View tab displays information about the variables in
your data. You can get to the Variable View window in two ways:
• In the Data Editor window, click the Variable View tab at the bottom.
• In the Data Editor window, in the Data View tab, double-click a
variable name at the top of the column. This method has the
advantage of taking you to the specific variable you clicked.
Manual Input of Data
• Define Variables
• The "one person, one row" Rule
• When you open the SPSS program, you will see a blank
spreadsheet in Data View. If you already have another dataset open
but want to create a new one, click File > New > Data to open a
blank spreadsheet.
• You will notice that each of the columns is labeled “var.” The column
names will represent the variables that you enter in your dataset.
You will also notice that each row is labeled with a number (“1,” “2,”
and so on). The rows will represent cases that will be a part of your
dataset. When you enter values for your data in the spreadsheet
cells, each value will correspond to a specific variable (column) and
a specific case (row).
Automated input of data and file import
• Excel to SPSS
• Text file to SPSS
• If you already have data that are in an SPSS file format
(file extension “.sav”), you can simply open that file to
begin working with your data in SPSS.
• However, if you have data stored in other types of files,
such as an Excel spreadsheet or a text file, you will need
to instruct SPSS how to read the file and then save it in
the SPSS file format (“.sav”).
• Below, we will cover how to import data from two
common types of files: Excel files and text files.
To open your Excel file in SPSS:
• Choose File > Open > Data from the SPSS menu.
• Select the type of file you want to open: Excel (*.xls, *.xlsx, *.xlsm).
• Select the file name.
• Check 'Read variable names' if the first row of the
spreadsheet contains column headings.
• Click Open.
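The same idea applies to text files: the importer needs to know whether the first row holds variable names. Below is a minimal Python sketch of that decision, not a description of how SPSS reads files. The `read_cases` function, the sample text, and the generic `VAR1`, `VAR2` fallback names are assumptions, and the sketch assumes every data cell is an integer.

```python
import csv
import io

# Stand-in for a text file on disk; the column names and values are made up.
raw_text = "id,sex,age\n1,1,34\n2,2,27\n3,1,51\n"

def read_cases(text, first_row_has_names=True):
    """Read comma-delimited text into a list of dicts, one dict per case."""
    rows = list(csv.reader(io.StringIO(text)))
    if first_row_has_names:
        names, data = rows[0], rows[1:]
    else:
        # No header row: fall back to generic names (an assumption of this sketch).
        names = [f"VAR{i + 1}" for i in range(len(rows[0]))]
        data = rows
    # Assumes every cell holds an integer.
    return [dict(zip(names, (int(cell) for cell in row))) for row in data]

cases = read_cases(raw_text)
print(cases[0])
```

If the header option is set wrongly, the column headings are read as data, which is why the 'Read variable names' choice matters during import.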
Data Cleaning
• Missing Values
• Invalid values
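As an illustration of these two checks, the Python sketch below scans a small dataset for missing and invalid values. It is not SPSS syntax; the variables, the allowed codes, and the `check_cases` helper are hypothetical.

```python
# Allowed values per variable (assumed for this example).
valid_values = {
    "sex": {1, 2},                    # 1 = female, 2 = male
    "satisfaction": {1, 2, 3, 4, 5},  # five-point rating
}

cases = [
    {"id": 1, "sex": 1, "satisfaction": 4},
    {"id": 2, "sex": 3, "satisfaction": 5},     # invalid: 3 is not a defined code
    {"id": 3, "sex": 2, "satisfaction": None},  # missing: no answer recorded
]

def check_cases(cases, valid_values):
    """Return (case id, variable, problem) tuples for missing or invalid values."""
    problems = []
    for case in cases:
        for variable, allowed in valid_values.items():
            value = case.get(variable)
            if value is None:
                problems.append((case["id"], variable, "missing"))
            elif value not in allowed:
                problems.append((case["id"], variable, "invalid"))
    return problems

for problem in check_cases(cases, valid_values):
    print(problem)
```

Listing problems by case ID makes it possible to go back to the source questionnaire before deciding whether to correct a value or mark it as missing.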
Transform
• Recoding variables
• Computing variables
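The difference between the two transformations can be sketched in a few lines of Python (not SPSS syntax). The variable names, the age cut-offs, and the `recode_age` helper are arbitrary choices for illustration.

```python
cases = [
    {"id": 1, "age": 19, "item1": 4, "item2": 5, "item3": 3},
    {"id": 2, "age": 42, "item1": 2, "item2": 1, "item3": 2},
    {"id": 3, "age": 67, "item1": 5, "item2": 4, "item3": 4},
]

def recode_age(age):
    """Recode age in years into three ordered groups (cut-offs are arbitrary)."""
    if age < 30:
        return 1   # under 30
    if age < 60:
        return 2   # 30 to 59
    return 3       # 60 and over

for case in cases:
    # Recoding: map the values of an existing variable onto new categories.
    case["age_group"] = recode_age(case["age"])
    # Computing: build a new variable from an arithmetic expression.
    case["total"] = case["item1"] + case["item2"] + case["item3"]

print([(c["id"], c["age_group"], c["total"]) for c in cases])
```

Both results are stored in new variables, so the original values remain available for checking.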
https://study.com/academy/lesson/what-is-descriptive-statistics-examples-lesson-quiz.html
Module III - Descriptive analysis
of data
Descriptive Statistics
Procedures for depicting the main aspects of sample data, without
necessarily inferring to a larger population.
• Descriptive statistics usually include the mean, median, and mode
to indicate central tendency, as well as the range and standard
deviation, which reveal how widely spread the scores are within the
sample.
• Descriptive statistics could also include charts and graphs such as
a frequency distribution or histogram, among others.
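As a quick illustration of these measures, the sketch below computes them with Python's standard `statistics` module rather than SPSS. The scores are made-up sample data.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up sample data

summary = {
    "mean": statistics.mean(scores),        # central tendency
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "range": max(scores) - min(scores),     # spread
    "std_dev": statistics.stdev(scores),    # sample SD, divides by n - 1
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Note that `statistics.stdev` is the sample standard deviation; `statistics.pstdev` would treat the scores as a whole population instead.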
Frequencies
• When summarizing quantitative (continuous/interval/ratio) variables,
we are typically interested in questions like:
• What is the "center" of the data? (Mean, median)
• How spread out is the data? (Standard deviation/variance)
• What are the extremes of the data? (Minimum, maximum; Outliers)
• What is the "shape" of the distribution? Is it symmetric or
asymmetric? Are the values mostly clustered about the mean, or are
there many values in the "tails" of the distribution? (Skewness,
kurtosis)
Descriptives
When summarizing quantitative (continuous/interval/ratio) variables, we are typically
interested in questions like:
• What is the "center" of the data? (Mean, median)
• How spread out is the data? (Standard deviation/variance)
• What are the extremes of the data? (Minimum, maximum; Outliers)
• What is the "shape" of the distribution? Is it symmetric or asymmetric? Are the values
mostly clustered about the mean, or are there many values in the "tails" of the
distribution? (Skewness, kurtosis)
• In SPSS, the Descriptives procedure computes a select set of basic descriptive
statistics for one or more continuous numeric variables. In all, the statistics it can
produce are:
• N valid responses, Mean, Sum, Standard deviation, Variance, Minimum, Maximum,
Range, Standard error of the mean (or S.E. mean), Skewness, Kurtosis
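For readers who want to see how such statistics are obtained, here is a Python sketch that computes the same list by hand. The `describe` function is hypothetical. The skewness and kurtosis use the common bias-adjusted sample formulas; that these agree with SPSS output in every detail is an assumption here, not something the slides state.

```python
import math

def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    n = len(values)
    if n < 4:
        raise ValueError("the adjusted kurtosis formula needs at least 4 values")
    mean = sum(values) / n
    deviations = [x - mean for x in values]
    variance = sum(d ** 2 for d in deviations) / (n - 1)   # sample variance
    if variance == 0:
        raise ValueError("all values are equal; skewness and kurtosis are undefined")
    sd = math.sqrt(variance)
    sum_cubed = sum(d ** 3 for d in deviations)
    sum_fourth = sum(d ** 4 for d in deviations)
    # Bias-adjusted sample skewness and excess kurtosis.
    skewness = n * sum_cubed / ((n - 1) * (n - 2) * sd ** 3)
    kurtosis = (n * (n + 1) * sum_fourth / ((n - 1) * (n - 2) * (n - 3) * sd ** 4)
                - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return {
        "n": n, "mean": mean, "sum": sum(values),
        "std_dev": sd, "variance": variance,
        "minimum": min(values), "maximum": max(values),
        "range": max(values) - min(values),
        "se_mean": sd / math.sqrt(n),
        "skewness": skewness, "kurtosis": kurtosis,
    }

result = describe([1, 2, 3, 4, 5])
for name, value in result.items():
    print(f"{name}: {value}")
```

For this symmetric example the skewness is zero and the kurtosis is negative, which reflects a distribution that is flatter than a normal curve.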
Explore
• The Explore procedure produces detailed univariate statistics and graphs for numeric
scale variables for an entire sample, or for subsets of a sample. It can also be used to
assess the normality of a numeric scale variable with special inferential statistics and
detailed diagnostic plots.
• To run the Explore procedure, click Analyze > Descriptive Statistics > Explore.
Crosstabs
To describe a single categorical variable, we use frequency tables.
To describe the relationship between two categorical variables, we use a special type of
table called a cross-tabulation (or "crosstab" for short).
In a cross-tabulation, the categories of one variable determine the rows of the table, and
the categories of the other variable determine the columns. The cells of the table contain
the number of times that a particular combination of categories occurred. The "edges" (or
"margins") of the table typically contain the total number of observations for that category.
This type of table is also known as a:
• Crosstab.
• Two-way table.
• Contingency table.
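A cross-tabulation is simply a count of each combination of categories, plus the margins. The Python sketch below builds one from made-up responses; the variable names and data are assumptions, and the layout only loosely resembles SPSS output.

```python
from collections import Counter

# One (sex, smoker) pair per respondent; the data are made up.
responses = [
    ("Female", "Yes"), ("Female", "No"), ("Male", "No"),
    ("Male", "Yes"), ("Female", "No"), ("Male", "No"),
]

cell_counts = Counter(responses)                     # cells of the table
row_totals = Counter(row for row, _ in responses)    # right-hand margin
col_totals = Counter(col for _, col in responses)    # bottom margin

rows = sorted(row_totals)
cols = sorted(col_totals)

# Print the table with one row per category of the first variable.
print(f"{'':8}" + "".join(f"{c:>6}" for c in cols) + f"{'Total':>7}")
for r in rows:
    cells = "".join(f"{cell_counts[(r, c)]:>6}" for c in cols)
    print(f"{r:8}{cells}{row_totals[r]:>7}")
print(f"{'Total':8}" + "".join(f"{col_totals[c]:>6}" for c in cols)
      + f"{len(responses):>7}")
```

The cell counts always add up to the row and column totals, which is a useful check when building such a table by hand.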
Charts
• Using SPSS to create bar graphs, histograms, line graphs, and scatterplots.
• Editing the graphs and printing selected parts of the output.
Module IV - Statistical tests
Means: the numerical average of a set of scores, computed as the sum of all scores
divided by the number of scores.
T-test: A t-test is a type of inferential statistic used to determine if there is a significant
difference between the means of two groups, which may be related in certain features.
One-way ANOVA: The one-way analysis of variance (ANOVA) is used to determine
whether there are any statistically significant differences between the means of three or
more independent (unrelated) groups.
Non-parametric tests: statistical tests that do not assume the data follow a particular
distribution, such as the normal distribution. They are typically based on ranks and are
used when the assumptions of parametric tests like the t-test or ANOVA are not met.
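To show what the t-test defined above actually compares, here is a Python sketch of the test statistic for two independent groups. It uses the classic pooled-variance (Student) form, which assumes equal variances in the two groups; the `independent_t` function and the data are made up, and the sketch stops at the t value and degrees of freedom rather than computing a p-value.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Return (t, degrees of freedom) for Student's independent-samples t-test."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df

treatment = [5, 6, 7, 8, 9]   # made-up scores
control = [1, 2, 3, 4, 5]

t_value, df = independent_t(treatment, control)
print(f"t = {t_value:.3f}, df = {df}")
```

The t value is the difference between the group means expressed in units of its standard error; the p-value is then read from the t distribution with the given degrees of freedom.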
Normality tests: procedures for checking whether data are consistent with a normal
distribution, a theoretical distribution in which values pile up in the center at the
mean and fall off into tails at either end. When plotted, it gives the familiar bell-shaped
curve expected when variation about the mean value is random. The normal distribution
has several primary characteristics: It is symmetrical, its tails approach the horizontal
axis asymptotically at both ends, and its mean, median, and mode are the same value.
Correlation and regression
Correlation: the degree of a relationship (usually linear) between two variables, which
may be quantified as a correlation coefficient.
Regression analysis: any of several statistical techniques that are used to describe, explain, or predict (or all
three) the variance of an outcome or dependent variable using scores on one or more
predictor or independent variables.
For example, a regression analysis could show the extent to which 1st-year grades in
college (outcome) are predicted by such factors as standardized test scores, courses
taken in high school, letters of recommendation, and particular extracurricular activities.
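The two ideas can be sketched for the simplest case, one predictor and one outcome, in plain Python. This illustrates the Pearson correlation coefficient and a least-squares regression line only; the function names and data are hypothetical, and it is not how SPSS is operated.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def simple_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

predictor = [1, 2, 3, 4, 5]   # made-up predictor scores
outcome = [2, 4, 5, 4, 5]     # made-up outcome scores

r = pearson_r(predictor, outcome)
intercept, slope = simple_regression(predictor, outcome)
print(f"r = {r:.3f}")
print(f"predicted outcome = {intercept:.2f} + {slope:.2f} * predictor")
```

The correlation describes how strongly the two variables move together, while the regression line turns that relationship into a prediction rule.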
Module V - Multivariate analysis
Factor Analysis
Factor analysis (FA): a broad family of mathematical procedures for reducing a set of
interrelations among manifest variables to a smaller set of unobserved
latent variables or factors.
For example, scores on a number of tests of mechanical ability might be
intercorrelated, allowing factor analysis to reduce them to a few
factors, such as fine motor coordination, speed, and attention.
Manifest variable: a variable whose values can be directly observed or measured, as
opposed to one whose values must be inferred.
In structural equation modeling and factor analysis, manifest variables
are used to study latent variables. Also called indicator variable.
Latent variable: a theoretical entity or construct that is used to explain one or more
manifest variables. Latent variables cannot be directly observed or
measured but rather are approximated through various measures
presumed to assess part of the given construct.
SPSS Anxiety Questionnaire
1. Statistics makes me cry
2. My friends will think I’m stupid for not being able to cope with SPSS
3. Standard deviations excite me
4. I dream that Pearson is attacking me with correlation coefficients
5. I don’t understand statistics
6. I have little experience of computers
7. All computers hate me
8. I have never been good at mathematics
Cluster Analysis
a method of multivariate data analysis in which individuals or units are
placed into distinct subgroups based on their strong similarity with
regard to specific attributes.
For example, one might use cluster analysis to form groups of
individual children on the basis of their levels of anxiety, aggression,
delinquency, and cognitive difficulties so as to identify useful typologies
that could increase understanding of co-occurring mental disorders and
lead to more appropriate treatments for specific individuals.
There are several different forms of cluster analysis—including
hierarchical clustering and latent class analysis—and each is
appropriate for use with different types of data. Results of a cluster
analysis often are presented in a dendrogram.
Dendrogram: a type of treelike diagram used in hierarchical clustering. It lists all of
the participants at one end and then directs branches out from those
participants who are similar and connects them with a node that
represents a cluster. A dendrogram could be used, for example, to
cluster individuals into various categories of HIV risk, depending on
their number of sexual partners, their frequency of unprotected sex,
and the perceived risk of their partners. Individuals who had few sexual
partners with little or no unprotected sex and who perceived little or no
partner risk of HIV infection would be branched into a cluster that could
be labeled low risk, whereas individuals with high values on these three
variables would branch into a high-risk cluster, with other individuals
presumably clustering into a medium-risk group.
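The branching that a dendrogram draws comes from a sequence of merges. The Python sketch below performs hierarchical (agglomerative) clustering with single linkage on a few made-up two-dimensional scores and records each merge. The `single_linkage` function, the participant labels, and the choice of single linkage with Euclidean distance are assumptions for illustration; SPSS offers several other linkage methods and distance measures.

```python
import math

def single_linkage(points):
    """Agglomerative clustering with single linkage.

    `points` maps a participant label to a tuple of scores. Returns the merge
    history as (cluster_a, cluster_b, distance) tuples, in merge order.
    """
    clusters = [(label,) for label in points]   # start: every participant alone
    merges = []

    def distance(c1, c2):
        # Single linkage: distance between the two closest members.
        return min(math.dist(points[a], points[b]) for a in c1 for b in c2)

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Made-up scores on two attributes for four participants.
points = {"P1": (1, 1), "P2": (1, 2), "P3": (8, 8), "P4": (9, 9)}

merges = single_linkage(points)
for cluster_a, cluster_b, d in merges:
    print(f"merge {cluster_a} + {cluster_b} at distance {d:.2f}")
```

Each recorded merge corresponds to one node of the dendrogram, and the merge distance gives the height at which the two branches join.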
Videos
https://forms.gle/vNpsRtjPH3RoE7MKA
https://docs.google.com/forms/d/e/1FAIpQLSfXnVuKSLI47pOeEvIhX_6YX1M8Fa_cv0Mnt5A7jXHVoOhPfA/viewform?usp=sf_link
https://study.com/academy/lesson/what-is-a-t-test-procedure-interpretation-examples.html
https://study.com/academy/lesson/cluster-analysis-market-segmentation-definition-examples.html
https://www.youtube.com/watch?v=Se28XHI2_xE

More Related Content

Similar to Introduction to Statistical package of social sciences

Spss tutorial 1
Spss tutorial 1Spss tutorial 1
Spss tutorial 1
debataraja
 
Chapter 1An Overview of IBM® SPSS® StatisticsIntroduction An .docx
Chapter 1An Overview of IBM® SPSS® StatisticsIntroduction An .docxChapter 1An Overview of IBM® SPSS® StatisticsIntroduction An .docx
Chapter 1An Overview of IBM® SPSS® StatisticsIntroduction An .docx
cravennichole326
 
RES814 U1 Individual Project
RES814 U1 Individual ProjectRES814 U1 Individual Project
RES814 U1 Individual Project
ThienSi Le
 

Similar to Introduction to Statistical package of social sciences (20)

5116427.ppt
5116427.ppt5116427.ppt
5116427.ppt
 
Spss basics tutorial
Spss basics tutorialSpss basics tutorial
Spss basics tutorial
 
What Is the Use of SPSS in Data Analysis
What Is the Use of SPSS in Data AnalysisWhat Is the Use of SPSS in Data Analysis
What Is the Use of SPSS in Data Analysis
 
6967176.ppt
6967176.ppt6967176.ppt
6967176.ppt
 
spss-anintroduction-150704135929-lva1-app6892.ppt
spss-anintroduction-150704135929-lva1-app6892.pptspss-anintroduction-150704135929-lva1-app6892.ppt
spss-anintroduction-150704135929-lva1-app6892.ppt
 
Lecture 1
Lecture 1Lecture 1
Lecture 1
 
Spss tutorial 1
Spss tutorial 1Spss tutorial 1
Spss tutorial 1
 
Spss tutorial 1
Spss tutorial 1Spss tutorial 1
Spss tutorial 1
 
SPSS :Introduction for beginners
SPSS :Introduction for beginners SPSS :Introduction for beginners
SPSS :Introduction for beginners
 
Statistical Package for Social Science (SPSS)
Statistical Package for Social Science (SPSS)Statistical Package for Social Science (SPSS)
Statistical Package for Social Science (SPSS)
 
SPSS software
SPSS software SPSS software
SPSS software
 
Chapter 1An Overview of IBM® SPSS® StatisticsIntroduction An .docx
Chapter 1An Overview of IBM® SPSS® StatisticsIntroduction An .docxChapter 1An Overview of IBM® SPSS® StatisticsIntroduction An .docx
Chapter 1An Overview of IBM® SPSS® StatisticsIntroduction An .docx
 
Spss chapter1
Spss chapter1Spss chapter1
Spss chapter1
 
RES814 U1 Individual Project
RES814 U1 Individual ProjectRES814 U1 Individual Project
RES814 U1 Individual Project
 
Statrting spss
Statrting spssStatrting spss
Statrting spss
 
Introduction to spss
Introduction to spssIntroduction to spss
Introduction to spss
 
4 Statistical Software.pptx
4 Statistical Software.pptx4 Statistical Software.pptx
4 Statistical Software.pptx
 
Chapter -1.pptx
Chapter -1.pptxChapter -1.pptx
Chapter -1.pptx
 
SPSS vs Stata: The Best Ever Comparison
SPSS vs Stata: The Best Ever ComparisonSPSS vs Stata: The Best Ever Comparison
SPSS vs Stata: The Best Ever Comparison
 
Ibm spss statistics 19 brief guide
Ibm spss statistics 19 brief guideIbm spss statistics 19 brief guide
Ibm spss statistics 19 brief guide
 

Recently uploaded

Recently uploaded (20)

Evolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI EraEvolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI Era
 
CERVED e Neo4j su una nuvola, migrazione ed evoluzione di un grafo mission cr...
CERVED e Neo4j su una nuvola, migrazione ed evoluzione di un grafo mission cr...CERVED e Neo4j su una nuvola, migrazione ed evoluzione di un grafo mission cr...
CERVED e Neo4j su una nuvola, migrazione ed evoluzione di un grafo mission cr...
 
Workshop - Architecting Innovative Graph Applications- GraphSummit Milan
Workshop -  Architecting Innovative Graph Applications- GraphSummit MilanWorkshop -  Architecting Innovative Graph Applications- GraphSummit Milan
Workshop - Architecting Innovative Graph Applications- GraphSummit Milan
 
BusinessGPT - Security and Governance for Generative AI
BusinessGPT  - Security and Governance for Generative AIBusinessGPT  - Security and Governance for Generative AI
BusinessGPT - Security and Governance for Generative AI
 
Prompt Engineering - an Art, a Science, or your next Job Title?
Prompt Engineering - an Art, a Science, or your next Job Title?Prompt Engineering - an Art, a Science, or your next Job Title?
Prompt Engineering - an Art, a Science, or your next Job Title?
 
Abortion Pill Prices Germiston ](+27832195400*)[ 🏥 Women's Abortion Clinic in...
Abortion Pill Prices Germiston ](+27832195400*)[ 🏥 Women's Abortion Clinic in...Abortion Pill Prices Germiston ](+27832195400*)[ 🏥 Women's Abortion Clinic in...
Abortion Pill Prices Germiston ](+27832195400*)[ 🏥 Women's Abortion Clinic in...
 
Workshop: Enabling GenAI Breakthroughs with Knowledge Graphs - GraphSummit Milan
Workshop: Enabling GenAI Breakthroughs with Knowledge Graphs - GraphSummit MilanWorkshop: Enabling GenAI Breakthroughs with Knowledge Graphs - GraphSummit Milan
Workshop: Enabling GenAI Breakthroughs with Knowledge Graphs - GraphSummit Milan
 
Weeding your micro service landscape.pdf
Weeding your micro service landscape.pdfWeeding your micro service landscape.pdf
Weeding your micro service landscape.pdf
 
Abortion Clinic In Pongola ](+27832195400*)[ 🏥 Safe Abortion Pills In Pongola...
Abortion Clinic In Pongola ](+27832195400*)[ 🏥 Safe Abortion Pills In Pongola...Abortion Clinic In Pongola ](+27832195400*)[ 🏥 Safe Abortion Pills In Pongola...
Abortion Clinic In Pongola ](+27832195400*)[ 🏥 Safe Abortion Pills In Pongola...
 
A Deep Dive into Secure Product Development Frameworks.pdf
A Deep Dive into Secure Product Development Frameworks.pdfA Deep Dive into Secure Product Development Frameworks.pdf
A Deep Dive into Secure Product Development Frameworks.pdf
 
Alluxio Monthly Webinar | Simplify Data Access for AI in Multi-Cloud
Alluxio Monthly Webinar | Simplify Data Access for AI in Multi-CloudAlluxio Monthly Webinar | Simplify Data Access for AI in Multi-Cloud
Alluxio Monthly Webinar | Simplify Data Access for AI in Multi-Cloud
 
GraphSummit Milan - Visione e roadmap del prodotto Neo4j
GraphSummit Milan - Visione e roadmap del prodotto Neo4jGraphSummit Milan - Visione e roadmap del prodotto Neo4j
GraphSummit Milan - Visione e roadmap del prodotto Neo4j
 
Test Automation Design Patterns_ A Comprehensive Guide.pdf
Test Automation Design Patterns_ A Comprehensive Guide.pdfTest Automation Design Patterns_ A Comprehensive Guide.pdf
Test Automation Design Patterns_ A Comprehensive Guide.pdf
 
Effective Strategies for Wix's Scaling challenges - GeeCon
Effective Strategies for Wix's Scaling challenges - GeeConEffective Strategies for Wix's Scaling challenges - GeeCon
Effective Strategies for Wix's Scaling challenges - GeeCon
 
Lessons Learned from Building a Serverless Notifications System.pdf
Lessons Learned from Building a Serverless Notifications System.pdfLessons Learned from Building a Serverless Notifications System.pdf
Lessons Learned from Building a Serverless Notifications System.pdf
 
Abortion Pill Prices Mthatha (@](+27832195400*)[ 🏥 Women's Abortion Clinic In...
Abortion Pill Prices Mthatha (@](+27832195400*)[ 🏥 Women's Abortion Clinic In...Abortion Pill Prices Mthatha (@](+27832195400*)[ 🏥 Women's Abortion Clinic In...
Abortion Pill Prices Mthatha (@](+27832195400*)[ 🏥 Women's Abortion Clinic In...
 
Encryption Recap: A Refresher on Key Concepts
Encryption Recap: A Refresher on Key ConceptsEncryption Recap: A Refresher on Key Concepts
Encryption Recap: A Refresher on Key Concepts
 
Abortion Pill Prices Turfloop ](+27832195400*)[ 🏥 Women's Abortion Clinic in ...
Abortion Pill Prices Turfloop ](+27832195400*)[ 🏥 Women's Abortion Clinic in ...Abortion Pill Prices Turfloop ](+27832195400*)[ 🏥 Women's Abortion Clinic in ...
Abortion Pill Prices Turfloop ](+27832195400*)[ 🏥 Women's Abortion Clinic in ...
 
OpenChain Webinar: AboutCode and Beyond - End-to-End SCA
OpenChain Webinar: AboutCode and Beyond - End-to-End SCAOpenChain Webinar: AboutCode and Beyond - End-to-End SCA
OpenChain Webinar: AboutCode and Beyond - End-to-End SCA
 
UNI DI NAPOLI FEDERICO II - Il ruolo dei grafi nell'AI Conversazionale Ibrida
UNI DI NAPOLI FEDERICO II - Il ruolo dei grafi nell'AI Conversazionale IbridaUNI DI NAPOLI FEDERICO II - Il ruolo dei grafi nell'AI Conversazionale Ibrida
UNI DI NAPOLI FEDERICO II - Il ruolo dei grafi nell'AI Conversazionale Ibrida
 

Introduction to Statistical package of social sciences

  • 1. Amity Institute of Psychology and Allied Sciences 1 Amity Institute of Psychology and Allied Sciences Introduction to SPSS (STAT626)
  • 2. Amity Institute of Psychology and Allied Sciences 2 Introduction to SPSS (STAT626) Internal Assessment (Test)
  • 3. Amity Institute of Psychology and Allied Sciences Module 1 Introduction
  • 4. Amity Institute of Psychology and Allied Sciences Introduction to SPSS • SPSS means “Statistical Package for the Social Sciences” and was first launched in 1968. Since SPSS was acquired by IBM in 2009, it's officially known as IBM SPSS Statistics but most users still just refer to it as “SPSS”.
  • 5. Amity Institute of Psychology and Allied Sciences Introduction to SPSS • SPSS is short for Statistical Package for the Social Sciences, and it’s used by various kinds of researchers for complex statistical data analysis. The SPSS software package was created for the management and statistical analysis of social science data. It was originally launched in 1968 by SPSS Inc., and was later acquired by IBM in 2009. • Officially dubbed IBM SPSS Statistics, most users still refer to it as SPSS. As the world standard for social-science data analysis, SPSS is widely coveted due to its straightforward and English-like command language and impressively thorough user manual. • SPSS is used by market researchers, health researchers, survey companies, government entities, education researchers, marketing organizations, data miners, and many more for processing and analyzing survey data,
  • 6. Amity Institute of Psychology and Allied Sciences Introduction to SPSS In computer science, garbage in, garbage out (GIGO) is the concept that flawed, or nonsense (garbage) input data produces nonsense output. Rubbish in, rubbish out (RIRO) is an alternate wording.
  • 7. Amity Institute of Psychology and Allied Sciences Introduction to SPSS • SPSS - Quick Overview Main Features • SPSS is software for editing and analyzing all sorts of data. These data may come from basically any source: scientific research, a customer database, Google Analytics or even the server log files of a website. SPSS can open all file formats that are commonly used for structured data such as • spreadsheets from MS Excel or OpenOffice; • plain text files (.txt or .csv); • relational (SQL) databases; • Stata and SAS.
  • 8. Data analysis with SPSS: general aspects, workflow, critical issues
      In general:
      • Except for graphs, raw SPSS output should not be presented in reports or presentations.
      • Instead, the information in the SPSS output should be used to construct proper tables or to draw proper conclusions.
      • SPSS has two worksheets, and both must be set up correctly (you can switch between them by clicking their tabs at the bottom of the SPSS window):
          • the Data View, which contains the data;
          • the Variable View, which contains information about each variable.
      • In SPSS (as in most statistical software), each row in the Data View represents one unit of analysis.
  • 9. Data analysis with SPSS: general aspects, workflow, critical issues
      • In SPSS (as in most statistical software), each column in the Data View represents one variable.
      • The name of the variable appears at the top of each column (but not in Row 1!).
      • When setting up the Variable View, many columns can be filled in. These are the important ones to get right (the others are not important for our purposes):
          • Name: a short description of the variable (with no spaces or punctuation!).
          • Type: usually Numeric (apart from names, which can be String).
          • Label: a fuller description of the variable. This is what will appear in tables and graphs to describe the variable, for example "Diastolic blood pressure (in mm Hg)".
          • Values: for qualitative variables only, tell SPSS what each number represents (for example, does a 1 represent females or males?).
          • Measure: this is important. You must tell SPSS whether each variable is nominal, ordinal, or scale (i.e. quantitative) by using the drop-down options.
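
To make the Variable View columns concrete, here is a minimal Python sketch of the same information held as a small record. It is an illustration only, not SPSS code: the class name `VariableDefinition`, its field names, and the example variables `sex` and `dbp` are hypothetical, and the name check follows the slide's simplified rule (letters and digits only) rather than SPSS's full naming rules.

```python
from dataclasses import dataclass, field


@dataclass
class VariableDefinition:
    """Hypothetical stand-in for one row of the Variable View."""
    name: str                      # short name, no spaces or punctuation
    var_type: str = "Numeric"      # "Numeric" or "String"
    label: str = ""                # fuller description shown in tables and graphs
    values: dict = field(default_factory=dict)  # value labels for qualitative variables
    measure: str = "scale"         # "nominal", "ordinal", or "scale"

    def __post_init__(self):
        # Simplified rule from the slide: letters and digits only.
        if not self.name.isalnum():
            raise ValueError(f"invalid variable name: {self.name!r}")
        if self.measure not in {"nominal", "ordinal", "scale"}:
            raise ValueError(f"invalid measure: {self.measure!r}")

    def describe(self, value):
        """Return the value label if one is defined, else the value as text."""
        return self.values.get(value, str(value))


sex = VariableDefinition(
    name="sex",
    label="Sex of respondent",
    values={1: "Female", 2: "Male"},
    measure="nominal",
)
dbp = VariableDefinition(name="dbp", label="Diastolic blood pressure (in mm Hg)")

print(sex.describe(1))   # value label for code 1
print(dbp.describe(80))  # no value labels defined, so the number is shown
```

The point of the sketch is that the coded number stays in the data while the label lives in the definition, which is why the Values and Measure columns need to be filled in before analysis.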
  • 10. Workflow
  • 11. SPSS: general description, functions, menus, commands
      SPSS windows
      • Data Editor window: displays the contents of the data file. This is the window that opens automatically when you start an SPSS session. In it you can create new data files or modify existing ones. When you open more than one data file, each data file has its own Data Editor window. The Data Editor window provides two views of the data:
          • Data View: displays the data values. Each variable is a column and each row is a case.
          • Variable View: displays a table of variable names and their attributes. Here you can modify the properties of each variable, add new variables, or delete existing ones.
  • 12. SPSS: general description, functions, menus, commands
      SPSS windows
      • Viewer window: displays statistical results, tables, and charts. This window opens automatically the first time you run a procedure that generates output.
      • Pivot Table Editor: displays the results in pivot tables. To open it, right-click the table, go to Edit Content, and select "In Separate Window". Alternatively, left-click the table, open the Edit menu, and choose Edit Content and then In Separate Window. You can then modify the table.
      • Chart Editor window: used to edit high-resolution charts and plots.
      • Text Output Editor window: used to modify text output that is not displayed in pivot tables. To open it, right-click the text output, go to Edit Content, and select "In Separate Window". You can then modify the text output.
      • Syntax Editor window: displays the choices made in a dialog box in the form of command syntax. These commands can be edited and run to produce output. You can also paste in an old SPSS program and run it.
  • 13. SPSS file management
      There are three types of SPSS files that we will use during this class: data files, which end in .sav; syntax files, which end in .sps; and output files, which end in .spv.
  • 14. SPSS file management
      IBM SPSS Statistics data file structure
      The basic structure of IBM SPSS Statistics data files is similar to a database table:
      • Rows (records) are cases. Each row represents a case or an observation. For example, each individual respondent to a questionnaire is a case.
      • Columns (fields) are variables. Each column represents a variable or characteristic that is being measured. For example, each item on a questionnaire is a variable.
      IBM SPSS Statistics data files also contain metadata that describes and defines the data contained in the file. This descriptive information is called the dictionary. The information contained in the dictionary includes:
      • variable names and descriptive variable labels;
      • descriptive value labels;
      • missing-value definitions;
      • print and write formats.
  • 15. Topics
      • Setting the directory
      • Generating a codebook
      • Defining, recoding, and computing variables
  • 16. Module 2 - Input and data cleaning
  • 17. Defining variables
      • Defining a variable includes giving it a name and specifying its type, the values it can take (e.g., 1, 2, 3), and so on.
      • Without this information, your data will be much harder to understand and use.
      • Whenever you work with data, make sure the variables are defined so that you (and anyone else who works with the data) can tell exactly what was measured, and how.
  • 18. Defining variables
      • You define information about your variables in the Variable View tab (at the bottom of the Data Editor window), which displays information about the variables in your data. You can get to the Variable View in two ways:
          • In the Data Editor window, click the Variable View tab at the bottom.
          • In the Data Editor window, in the Data View tab, double-click a variable name at the top of a column. This method has the advantage of taking you straight to the variable you clicked.
  • 19. Manual input of data
      • Define variables
      • The "one person, one row" rule
  • 20. Manual input of data
      • When you open the SPSS program, you will see a blank spreadsheet in Data View. If you already have another dataset open but want to create a new one, click File > New > Data to open a blank spreadsheet.
      • Each of the columns is labeled “var”. The column names will represent the variables that you enter in your dataset. Each row is labeled with a number (“1”, “2”, and so on). The rows will represent the cases in your dataset. When you enter values in the spreadsheet cells, each value corresponds to a specific variable (column) and a specific case (row).
  • 21. Automated input of data and file import
      • Excel to SPSS
      • Text file to SPSS
  • 22. Automated input of data and file import
      • If you already have data in the SPSS file format (file extension “.sav”), you can simply open that file to begin working with your data in SPSS.
      • However, if your data are stored in another type of file, such as an Excel spreadsheet or a text file, you will need to tell SPSS how to read the file and then save it in the SPSS file format (“.sav”).
      • Below, we cover how to import data from two common types of files: Excel files and text files.
  • 23. Automated input of data and file import
      To open your Excel file in SPSS:
      • Choose File > Open > Data from the SPSS menu.
      • Select the type of file you want to open: Excel (*.xls, *.xlsx, *.xlsm).
      • Select the file name.
      • Tick 'Read variable names' if the first row of the spreadsheet contains column headings.
      • Click Open.
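
The 'Read variable names' option can be illustrated outside SPSS with a short Python sketch that reads a delimited text file and either takes the first row as variable names or generates placeholder names. This is an analogy under stated assumptions, not what SPSS runs internally: the function name `read_table`, the placeholder names `V1`, `V2`, ..., and the sample data are all hypothetical.

```python
import csv
import io


def read_table(text_file, first_row_has_names=True):
    """Read delimited text into a list of cases (one dict per row)."""
    rows = list(csv.reader(text_file))
    if not rows:
        return []
    if first_row_has_names:
        # First row supplies the variable names; the rest are cases.
        names, data = rows[0], rows[1:]
    else:
        # No header row: invent placeholder names (an assumption of this sketch).
        names = [f"V{i + 1}" for i in range(len(rows[0]))]
        data = rows
    return [dict(zip(names, row)) for row in data]


# In-memory stand-in for a small .csv file.
sample = io.StringIO("id,age,score\n1,23,4.5\n2,31,3.0\n")
cases = read_table(sample)
print(cases[0])
```

If the option is left off for a file that does have a header row, the headings are read as if they were the first case, which is the mistake the checkbox is there to prevent.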
  • 24. Data cleaning
      • Missing values
      • Invalid values
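
As a rough illustration of what checking for missing and invalid values involves, the Python sketch below scans one variable and reports which rows are missing and which fall outside an allowed range. The function name `find_problems`, the use of 99 as a missing-value code, and the age limits are assumptions made for the example, not settings taken from the slides.

```python
def find_problems(cases, variable, valid_range, missing_codes=(None, "")):
    """Return (missing_rows, invalid_rows) as 1-based row numbers."""
    low, high = valid_range
    missing, invalid = [], []
    for row_number, case in enumerate(cases, start=1):
        value = case.get(variable)
        if value in missing_codes:
            missing.append(row_number)      # blank or declared missing code
        elif not (low <= value <= high):
            invalid.append(row_number)      # present but outside the allowed range
    return missing, invalid


# Hypothetical data: 99 is treated as a user-defined missing code.
cases = [{"age": 25}, {"age": None}, {"age": 99}, {"age": 130}, {"age": 41}]
missing_rows, invalid_rows = find_problems(
    cases, "age", valid_range=(18, 100), missing_codes=(None, 99)
)
print("missing:", missing_rows)
print("invalid:", invalid_rows)
```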
  • 25. Transform
      • Recoding variables
      • Computing variables
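
Recoding turns the values of an existing variable into new categories, while computing creates a new variable from a formula over existing ones. A minimal Python sketch of both ideas follows. The cut-points (under 30, 30 to 49, 50 and over), the item names `q1` to `q3`, and the function names are hypothetical choices for illustration.

```python
def recode_age(age):
    """Recode age in years into three groups: 1 = under 30, 2 = 30-49, 3 = 50+."""
    if age < 30:
        return 1
    if age < 50:
        return 2
    return 3


def compute_total(case, items):
    """Compute a new variable as the sum of several questionnaire items."""
    return sum(case[item] for item in items)


cases = [
    {"age": 24, "q1": 3, "q2": 4, "q3": 5},
    {"age": 37, "q1": 2, "q2": 2, "q3": 1},
    {"age": 61, "q1": 5, "q2": 4, "q3": 4},
]
for case in cases:
    case["age_group"] = recode_age(case["age"])                # recoded variable
    case["total"] = compute_total(case, ["q1", "q2", "q3"])    # computed variable

print([(c["age_group"], c["total"]) for c in cases])
```

Writing the result to a new variable (here `age_group`) rather than overwriting `age` keeps the original values available for checking.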
  • 26. Descriptive analysis of data
  • 27. https://study.com/academy/lesson/what-is-descriptive-statistics-examples-lesson-quiz.html
  • 28. Module III - Descriptive analysis of data
  • 29. Descriptive statistics
      Procedures for depicting the main aspects of sample data, without necessarily making inferences about a larger population.
      • Descriptive statistics usually include the mean, median, and mode to indicate central tendency, as well as
      • the range and standard deviation, which reveal how widely spread the scores are within the sample.
      • Descriptive statistics can also include charts and graphs such as a frequency distribution or histogram, among others.
  • 30. Frequencies
      When summarizing quantitative (continuous/interval/ratio) variables, we are typically interested in questions like:
      • What is the "center" of the data? (Mean, median)
      • How spread out are the data? (Standard deviation/variance)
      • What are the extremes of the data? (Minimum, maximum; outliers)
      • What is the "shape" of the distribution? Is it symmetric or asymmetric? Are the values mostly clustered about the mean, or are there many values in the "tails" of the distribution? (Skewness, kurtosis)
  • 31. Descriptives
      When summarizing quantitative (continuous/interval/ratio) variables, we are typically interested in questions like:
      • What is the "center" of the data? (Mean, median)
      • How spread out are the data? (Standard deviation/variance)
      • What are the extremes of the data? (Minimum, maximum; outliers)
      • What is the "shape" of the distribution? Is it symmetric or asymmetric? Are the values mostly clustered about the mean, or are there many values in the "tails" of the distribution? (Skewness, kurtosis)
      In SPSS, the Descriptives procedure computes a select set of basic descriptive statistics for one or more continuous numeric variables. In all, the statistics it can produce are:
      • N valid responses, mean, sum, standard deviation, variance, minimum, maximum, range, standard error of the mean (S.E. mean), skewness, and kurtosis.
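
To show what several of these statistics are, here is a small Python sketch that computes them with the standard library. It is a hand calculation for illustration, not the SPSS Descriptives procedure: the function name `descriptives` and the sample scores are hypothetical, `None` stands in for a missing response, and skewness and kurtosis are left out to keep the sketch short.

```python
import math
import statistics


def descriptives(values):
    """Basic descriptive statistics for one numeric variable."""
    valid = [v for v in values if v is not None]   # drop missing responses
    n = len(valid)
    sd = statistics.stdev(valid)                   # sample standard deviation (n - 1)
    return {
        "n": n,
        "mean": statistics.mean(valid),
        "sum": sum(valid),
        "sd": sd,
        "variance": statistics.variance(valid),    # sample variance (n - 1)
        "min": min(valid),
        "max": max(valid),
        "range": max(valid) - min(valid),
        "se_mean": sd / math.sqrt(n),              # standard error of the mean
    }


scores = [2, 4, 4, 4, 5, 5, 7, 9, None]
summary = descriptives(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The standard deviation and variance here use the sample (n - 1) denominator, so the standard error of the mean is the standard deviation divided by the square root of the number of valid responses.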
  • 32. Explore
      • The Explore procedure produces detailed univariate statistics and graphs for numeric scale variables, for an entire sample or for subsets of a sample. It can also be used to assess the normality of a numeric scale variable with special inferential statistics and detailed diagnostic plots.
      • To run the Explore procedure, click Analyze > Descriptive Statistics > Explore.
  • 33. Crosstabs
      To describe a single categorical variable, we use frequency tables. To describe the relationship between two categorical variables, we use a special type of table called a cross-tabulation (or "crosstab" for short). In a cross-tabulation, the categories of one variable determine the rows of the table, and the categories of the other variable determine the columns. The cells of the table contain the number of times that a particular combination of categories occurred. The "edges" (or "margins") of the table typically contain the total number of observations for that category. This type of table is also known as a:
      • Crosstab.
      • Two-way table.
      • Contingency table.
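
The structure of a cross-tabulation (cell counts plus row and column margins) can be sketched in a few lines of Python. This is an illustration of the idea rather than the SPSS Crosstabs procedure: the function name `crosstab` and the variables `sex` and `smoker` with their data are hypothetical.

```python
from collections import Counter


def crosstab(cases, row_var, col_var):
    """Count each combination of categories, plus the row and column totals."""
    cells = Counter((case[row_var], case[col_var]) for case in cases)
    row_totals = Counter(case[row_var] for case in cases)   # right-hand margin
    col_totals = Counter(case[col_var] for case in cases)   # bottom margin
    return cells, row_totals, col_totals


cases = [
    {"sex": "F", "smoker": "no"},
    {"sex": "F", "smoker": "yes"},
    {"sex": "M", "smoker": "no"},
    {"sex": "F", "smoker": "no"},
    {"sex": "M", "smoker": "yes"},
    {"sex": "M", "smoker": "yes"},
]
cells, row_totals, col_totals = crosstab(cases, "sex", "smoker")

# Print the table with margins.
columns = sorted(col_totals)
print("sex", *columns, "total")
for row in sorted(row_totals):
    print(row, *(cells[(row, col)] for col in columns), row_totals[row])
print("total", *(col_totals[col] for col in columns), len(cases))
```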
  • 34. Charts
      Using SPSS to create bar graphs, histograms, line graphs, and scatterplots; editing the graphs; and printing selected parts of the output.
  • 35. Module IV - Statistical tests
  • 36. Module IV - Statistical tests
      • Mean: the numerical average of a set of scores, computed as the sum of all scores divided by the number of scores.
      • T-test: a type of inferential statistic used to determine whether there is a significant difference between the means of two groups, which may be related in certain features.
      • One-way ANOVA: the one-way analysis of variance (ANOVA) is used to determine whether there are any statistically significant differences between the means of three or more independent (unrelated) groups.
      • Nonparametric tests: significance tests that do not assume the data follow a particular distribution (such as the normal distribution). They are typically based on ranks or counts and are used when the assumptions of tests such as the t-test or ANOVA are not met.
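
The test statistics behind the t-test and one-way ANOVA can be worked out by hand, as in the Python sketch below. It assumes an independent-samples t-test with pooled variance (equal variances assumed) and computes only the statistics and degrees of freedom; the significance value that SPSS reports alongside them is not calculated here. The function names and the sample scores are hypothetical.

```python
import math
import statistics


def independent_t(a, b):
    """Independent-samples t statistic with pooled variance, and its df."""
    n1, n2 = len(a), len(b)
    pooled_var = (
        (n1 - 1) * statistics.variance(a) + (n2 - 1) * statistics.variance(b)
    ) / (n1 + n2 - 2)
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))   # SE of the mean difference
    t = (statistics.mean(a) - statistics.mean(b)) / se_diff
    return t, n1 + n2 - 2


def one_way_f(groups):
    """One-way ANOVA F statistic with between- and within-groups df."""
    scores = [x for group in groups for x in group]
    grand_mean = statistics.mean(scores)
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


t, df = independent_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
f, df_b, df_w = one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"t({df}) = {t:.3f}")
print(f"F({df_b}, {df_w}) = {f:.3f}")
```

Both statistics compare a difference between group means with the variability inside the groups, which is why the t-test handles two groups and ANOVA extends the same idea to three or more.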
  • 37. Module IV - Statistical tests
      Normality tests: procedures for checking whether data plausibly come from a normal distribution. The normal distribution is a theoretical distribution in which values pile up in the center at the mean and fall off into tails at either end. When plotted, it gives the familiar bell-shaped curve expected when variation about the mean value is random. The normal distribution has several primary characteristics: it is symmetrical, it has both upper and lower asymptotes, and its mean, median, and mode are the same value.
  • 38. Correlation and regression
      • Correlation: the degree of a relationship (usually linear) between two variables, which may be quantified as a correlation coefficient.
      • Regression analysis: any of several statistical techniques that are used to describe, explain, or predict (or all three) the variance of an outcome or dependent variable using scores on one or more predictor or independent variables. For example, a regression analysis could show the extent to which 1st-year grades in college (outcome) are predicted by such factors as standardized test scores, courses taken in high school, letters of recommendation, and particular extracurricular activities.
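
For a single predictor, both ideas reduce to a few sums, as the Python sketch below shows. It assumes Pearson's correlation coefficient and ordinary least-squares regression with one predictor; the function name `correlation_and_regression` and the data are hypothetical.

```python
import math


def correlation_and_regression(x, y):
    """Pearson r plus the least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    r = sxy / math.sqrt(sxx * syy)        # strength of the linear relationship
    slope = sxy / sxx                     # change in y per one-unit change in x
    intercept = mean_y - slope * mean_x   # predicted y when x is 0
    return r, slope, intercept


hours = [1, 2, 3, 4, 5]      # hypothetical predictor
grade = [2, 4, 5, 4, 5]      # hypothetical outcome
r, slope, intercept = correlation_and_regression(hours, grade)
print(f"r = {r:.3f}")
print(f"predicted grade = {intercept:.2f} + {slope:.2f} * hours")
```

The correlation describes how strongly the two variables move together, while the regression line turns that relationship into a prediction rule for the outcome.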
  • 39. Module V - Multivariate analysis
  • 40. Factor Analysis
      Factor analysis (FA): a broad family of mathematical procedures for reducing a set of interrelations among manifest variables to a smaller set of unobserved latent variables or factors. For example, a number of tests of mechanical ability might be intercorrelated to enable factor analysis to reduce them to a few factors, such as fine motor coordination, speed, and attention.
  • 41. Factor Analysis
      Manifest variable: a variable whose values can be directly observed or measured, as opposed to one whose values must be inferred. In structural equation modeling and factor analysis, manifest variables are used to study latent variables. Also called indicator variable.
  • 42. Factor Analysis
      Latent variable: a theoretical entity or construct that is used to explain one or more manifest variables. Latent variables cannot be directly observed or measured but rather are approximated through various measures presumed to assess part of the given construct.
  • 43. Factor Analysis
  • 44. Factor Analysis
      SPSS Anxiety Questionnaire
      1. Statistics makes me cry
      2. My friends will think I’m stupid for not being able to cope with SPSS
      3. Standard deviations excite me
      4. I dream that Pearson is attacking me with correlation coefficients
      5. I don’t understand statistics
      6. I have little experience of computers
      7. All computers hate me
      8. I have never been good at mathematics
  • 45. Factor Analysis
  • 46. Factor Analysis
  • 47. Factor Analysis
  • 48. Factor Analysis
  • 49. Factor Analysis
  • 50. Factor Analysis
  • 51. Factor Analysis
  • 52. Factor Analysis
  • 53. Factor Analysis
  • 54. Factor Analysis
  • 55. Cluster Analysis
      Cluster analysis: a method of multivariate data analysis in which individuals or units are placed into distinct subgroups based on their strong similarity with regard to specific attributes. For example, one might use cluster analysis to form groups of individual children on the basis of their levels of anxiety, aggression, delinquency, and cognitive difficulties so as to identify useful typologies that could increase understanding of co-occurring mental disorders and lead to more appropriate treatments for specific individuals. There are several different forms of cluster analysis (including hierarchical clustering and latent class analysis), and each is appropriate for use with different types of data. Results of a cluster analysis are often presented in a dendrogram.
  • 56. Cluster Analysis
      Dendrogram: a type of treelike diagram used in hierarchical clustering. It lists all of the participants at one end and then directs branches out from those participants who are similar and connects them with a node that represents a cluster. A dendrogram could be used, for example, to cluster individuals into various categories of HIV risk, depending on their number of sexual partners, their frequency of unprotected sex, and the perceived risk of their partners. Individuals who had few sexual partners with little or no unprotected sex and who perceived little or no partner risk of HIV infection would be branched into a cluster that could be labeled low risk, whereas individuals with high values on these three variables would branch into a high-risk cluster, with other individuals presumably clustering into a medium-risk group.
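
The merging process that a dendrogram draws can be sketched in Python as follows. The sketch assumes single linkage (the distance between two clusters is the distance between their closest members) and Euclidean distance; these are choices made for the example, and other linkage methods and distance measures are equally possible. The function name `single_linkage` and the five points are hypothetical.

```python
from itertools import combinations
from math import dist


def single_linkage(points):
    """Agglomerative clustering; returns the merges in the order they happen.

    Each merge is (cluster_a, cluster_b, distance). Ids 0..n-1 are the
    original cases; every merge creates a new cluster with the next id.
    """
    clusters = {i: [i] for i in range(len(points))}   # cluster id -> case indices
    merges = []
    next_id = len(points)
    while len(clusters) > 1:
        # Find the two clusters whose closest members are nearest to each other.
        best = None
        for a, b in combinations(clusters, 2):
            d = min(
                dist(points[i], points[j])
                for i in clusters[a]
                for j in clusters[b]
            )
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d))
        next_id += 1
    return merges


points = [(0, 0), (0, 1), (5, 5), (5, 6.5), (10, 0)]
merges = single_linkage(points)
for a, b, d in merges:
    print(f"merge {a} and {b} at distance {d:.2f}")
```

Read from first merge to last, the list corresponds to a dendrogram read from the leaves toward the root: the most similar cases join first, and the distance at which clusters join gives the height of each node.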
  • 57. Cluster Analysis
  • 58. Cluster Analysis
  • 59. Cluster Analysis
  • 60. Cluster Analysis
  • 61. Cluster Analysis
  • 62. Cluster Analysis
  • 63. Cluster Analysis
  • 64. Cluster Analysis
  • 65. Videos
      • https://forms.gle/vNpsRtjPH3RoE7MKA
      • https://docs.google.com/forms/d/e/1FAIpQLSfXnVuKSLI47pOeEvIhX_6YX1M8Fa_cv0Mnt5A7jXHVoOhPfA/viewform?usp=sf_link
      • https://study.com/academy/lesson/what-is-a-t-test-procedure-interpretation-examples.html
      • https://study.com/academy/lesson/cluster-analysis-market-segmentation-definition-examples.html
      • https://www.youtube.com/watch?v=Se28XHI2_xE
