[Garbled extraction of worked confidence-interval examples; only the table of two-sided z critical values is recoverable.]

Confidence level   z critical value
99%                2.58
95%                1.96
90%                1.645
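These critical values can be reproduced with Python's standard library. The sketch below is illustrative only: the function names and the sample numbers are invented for the example, and it assumes a normal sampling distribution with a known population standard deviation.

```python
from statistics import NormalDist

def z_critical(confidence: float) -> float:
    """Two-sided z critical value, e.g. 0.95 -> 1.96."""
    return NormalDist().inv_cdf((1 + confidence) / 2)

def mean_ci(sample_mean: float, sigma: float, n: int, confidence: float = 0.95):
    """Confidence interval for a population mean, treating sigma as known."""
    half_width = z_critical(confidence) * sigma / n ** 0.5
    return sample_mean - half_width, sample_mean + half_width

# Reproduce the table of critical values:
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```

For instance, `mean_ci(100, 15, 36)` gives roughly (95.1, 104.9) for a hypothetical sample of 36 with mean 100 and sigma 15.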
4 - PRELIMINARY DATA SCREENING
4.1 Introduction: Problems in Real Data
Real datasets often contain errors, inconsistencies in responses or measurements, outliers, and missing values. Researchers should conduct thorough preliminary data screening to identify and remedy potential problems with their data prior to running the data analyses that are of primary interest. Analyses based on a dataset that contains errors, or data that seriously violate assumptions that are required for the analysis, can yield misleading results.
Some of the potential problems with data are as follows: errors in data coding and data entry, inconsistent responses, missing values, extreme outliers, nonnormal distribution shapes, within-group sample sizes that are too small for the intended analysis, and nonlinear relations between quantitative variables. Problems with data should be identified and remedied (as adequately as possible) prior to analysis. A research report should include a summary of problems detected in the data and any remedies that were employed (such as deletion of outliers or data transformations) to address these problems.
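As a rough illustration of the screening step described above, the sketch below flags missing values, impossible codes, and extreme outliers in a single column of scores. The variable names, the valid range, and the |z| > 3 cutoff are all assumptions made for the example, not a prescription; it also assumes at least two non-missing values.

```python
from statistics import mean, stdev

def screen(values, valid_min=0, valid_max=120):
    """Flag missing entries, out-of-range codes, and extreme outliers (|z| > 3).

    `values` is a hypothetical column of raw scores; None marks a missing value.
    Returns the index positions of each kind of problem for follow-up.
    """
    present = [v for v in values if v is not None]
    missing = [i for i, v in enumerate(values) if v is None]
    out_of_range = [i for i, v in enumerate(values)
                    if v is not None and not (valid_min <= v <= valid_max)]
    m, s = mean(present), stdev(present)
    outliers = [i for i, v in enumerate(values)
                if v is not None and s > 0 and abs(v - m) / s > 3]
    return {"missing": missing, "out_of_range": out_of_range, "outliers": outliers}
```

Flagged cases would then be checked against the original records before deciding on a remedy (correction, deletion, or transformation).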
4.2 Quality Control During Data Collection
There are many different possible methods of data collection. A psychologist may collect data on personality or attitudes by asking participants to answer questions on a questionnaire. A medical researcher may use a computer-controlled blood pressure monitor to assess systolic blood pressure (SBP) or other physiological responses. A researcher may record observations of animal behavior. Physical measurements (such as height or weight) may be taken. Most methods of data collection are susceptible to recording errors or artifacts, and researchers need to know what kinds of errors are likely to occur.
For example, researchers who use self-report data to do research on personality or attitudes need to be aware of common problems with this type of data. Participants may distort their answers because of social desirability bias; they may misunderstand questions; they may not remember the events that they are asked to report about; they may deliberately try to “fake good” or “fake bad”; they may even make random responses without reading the questions. A participant may accidentally skip a question on a survey and, subsequently, use the wrong lines on the answer sheet to enter each response; for example, the response to Question 4 may be filled in as Item 3 on the answer sheet, the response to Question 5 may be filled in as Item 4, and so forth. In addition, research assistants have been known to f ...
Assignment #
Deadline: Day 22/10/2017 @ 23:59
[Total Mark for this Assignment is 25]
System Analysis and Design
IT 243
College of Computing and Informatics
Question One
5 Marks
Learning Outcome(s):
Understand the need of Feasibility analysis in project approval and its types
What is feasibility analysis? List and briefly discuss three kinds of feasibility analysis.
Question Two
5 Marks
Learning Outcome(s):
Understand the various cost incurred in project development
How can you classify costs? Describe each cost classification and provide a typical example of each category.
Question Three
5 Marks
Learning Outcome(s):
System Development Life Cycle methodologies (Waterfall & Prototyping)
There are several development methodologies for the System Development Life Cycle (SDLC). Among these are the Waterfall and System Prototyping models. Compare the two methodologies in detail in terms of the following criteria.
Criteria               | Waterfall | System Prototyping
Description            |           |
Requirements Clarity   |           |
System complexity      |           |
Project Time schedule  |           |
Question Four
5 Marks
Learning Outcome(s):
Understand JAD Session and its procedure
What is a JAD session? Describe the five major steps in conducting JAD sessions.
Question Five
5 Marks
Learning Outcome(s):
Ability to distinguish between functional and non-functional requirements
State what is meant by the functional and non-functional requirements. What are the primary types of nonfunctional requirements? Give two examples of each. What role do nonfunctional requirements play in the project overall?
PUH 6301, Public Health Research 1
Course Learning Outcomes for Unit VI
Upon completion of this unit, students should be able to:
4. Evaluate strategies for data analysis to determine the best statistical tests needed for research methods.
4.1 Determine the four levels of measurement as valid research statistical techniques in the public health research process.
4.2 Explain why proper data and statistical analysis is important.
4.3 Describe the basic types of statistical tests.
Course/Unit Learning Outcomes | Learning Activity
4.1 | Unit Lesson; Chapters 28, 29, 30, 31, and 33; Blog: “Descriptive vs. Inferential Statistics: What’s the Difference?”; Unit VI Essay
4.2 | Unit Lesson; Chapter 28; Unit VI Essay
4.3 | Unit Lesson; Chapter 29; Unit VI Essay
Required Unit Resources
Chapter 28: Data Management
Chapter 29: Descriptive Statistics
Chapter 30: Comparative Statistics
Chapter 31: Regression Analysis
Chapter 33: Additional Analysis Tools
In order to access the following resource, click the link below:
The website below provides a good summary of how the public health researcher can use descriptive and
inferential statistics methods to conduct public health research.
Market Research Guy. (2011, December 1). Descriptive vs. inferential statistics: What’s the difference? [Blog
post]. http://www.mymarketresearchmethods.com/descriptive-inferential-statistics-difference/
UNIT VI STUDY GUIDE
Data Analysis Plan
Unit Lesson
Introduction
This unit covers the statistical procedures used to analyze the data collected from research tools. During this
stage of research, you may begin to draw conclusions and be able to answer the research question(s) and
sub-question(s) you developed in Unit I. Use statistics in this stage of research to manipulate the data and
make it understandable for others to read. Shi (2008) encourages researchers to know and understand basic
statistics and statistical procedures. The data analysis phase of research is important because it makes sense
of the data that can be used for future research studies (Jacobsen, 2021).
Data Management
Data management is the entire process of keeping a record of all the results of clinical assessments
conducted during a research study (Jacobsen, 2021). Record keeping includes listing details on potential
articles, pulling information from patient charts, tracking responses from surveys, or recording assessment
results from cohorts or studies. It is vital that those responsible for collecting and keeping data maintain
confidentiality and the integrity of data sets from all outside sources. Once researchers enter the data into the
spreadsheet or database, the data should be recoded and double-checked prior to beginning statistical
analysis.
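One common way to implement the double-checking of data entry mentioned above is double entry: two people key the same records independently, and every disagreement is resolved against the source documents. A minimal sketch (the record layout below is hypothetical):

```python
def double_entry_mismatches(pass1, pass2):
    """Compare two independent data-entry passes of the same records.

    Returns (row, column, value1, value2) for every cell that disagrees,
    so each conflict can be checked against the original source document.
    """
    mismatches = []
    for row, (r1, r2) in enumerate(zip(pass1, pass2)):
        for col, (v1, v2) in enumerate(zip(r1, r2)):
            if v1 != v2:
                mismatches.append((row, col, v1, v2))
    return mismatches
```

An empty result means the two passes agree cell for cell; any tuple returned points at a keying error in one of the passes.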
Biostatistics involves using statistical methods and techniques to analyze medical data. This includes collecting accurate data from patients regarding clinical characteristics, outcomes, and more. The data must then be prepared, coded, and cleaned before being analyzed. The Statistical Package for the Social Sciences (SPSS) is a commonly used software package that allows importing data files and performing both descriptive and inferential statistical analyses. Descriptive analyses may include graphs, frequency tables, measures of central tendency, and variability to summarize categorical and continuous variables both individually and in relation to each other.
This document provides information on using SPSS for educational research. It discusses descriptive statistics, common statistical issues in research, and procedures for creating an SPSS data file and conducting descriptive analyses. It also explains how to perform t-tests, analysis of variance (ANOVA), frequency analyses, and other statistical tests in SPSS. The document is intended as a guide for researchers on applying various statistical analyses in SPSS.
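The kinds of descriptive analyses these guides describe (frequency tables, measures of central tendency and variability) can also be computed directly with Python's standard library; the scores below are invented for illustration.

```python
from collections import Counter
from statistics import mean, median, mode, stdev

scores = [3, 4, 4, 5, 2, 4, 3, 5, 4, 2]  # hypothetical questionnaire item

freq = Counter(sorted(scores))  # frequency table, ascending by value
print("value freq")
for value, count in freq.items():
    print(f"{value:>5} {count:>4}")

print("mean  :", mean(scores))
print("median:", median(scores))
print("mode  :", mode(scores))
print("sd    :", round(stdev(scores), 2))
```

This mirrors what an SPSS Frequencies or Descriptives run would report for a single variable.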
SPSS Homework PSYC 421
SPSS Assignment Part 1 Instructions
Describing a Normative Sample
When it comes to the use of psychological tests, one approach that both researchers and clinicians take in trying to understand participants’ performance is a norm-referenced approach. With a norm-referenced test, the test is given to a large, representative group of participants known as the “normative sample” (a.k.a. “norm group”). Then, the scores of all subsequent test-takers are compared to the scores of the norm group. In order for the norm group to be a valid comparison group, it has to be representative of the population who will be taking the test.
So how do we know if the normative sample is representative? When summarizing the psychometric properties of a test, test developers and publishers usually describe the norm group in terms of demographic variables. Demographic variables are characteristics of the participants such as gender, age, ethnicity, relationship status, socioeconomic status, and religious affiliation. A description of the normative sample allows examiners to decide if the test of interest can be used with their intended examinees. For example, if the normative sample were 95% male, then you likely could not logically compare their scores to those of female test takers! That is why readers need to know what the normative sample looks like.
The purpose of the current assignment is for you to provide a verbal (and graphical) description of a fictional normative sample of research participants.
In the Assignment Instructions folder, there is an SPSS data file that will be the basis for your analyses. The data provided are fictional and were created solely for the purposes of our SPSS assignments. This data file includes: 1) demographic information for a normative sample of 428 participants, and 2) participants’ scores on a test called the Center for Epidemiologic Studies Depression Scale (CES-D scale).
The CES-D Scale is used to measure symptoms of depression. It is a self-assessment completed by the individual. The CES-D contains 20 items rated on a 4-point scale (0 = Rarely or None of the Time to 3
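Norm-referenced interpretation of a total score, as described above, amounts to locating the examinee within the normative distribution. Below is a sketch that assumes roughly normal norms; the normative mean and SD used are hypothetical values for illustration, not actual CES-D norms.

```python
from statistics import NormalDist

def norm_referenced(score, norm_mean, norm_sd):
    """Locate an examinee's raw score within the normative distribution.

    Returns the z-score and the percentile rank, assuming the normative
    scores are approximately normally distributed.
    """
    z = (score - norm_mean) / norm_sd
    percentile = NormalDist().cdf(z) * 100
    return z, percentile

# Hypothetical normative values, for illustration only:
z, pct = norm_referenced(score=24, norm_mean=14.0, norm_sd=10.0)
print(f"z = {z:.2f}, percentile = {pct:.0f}")
```

A score one SD above the normative mean lands near the 84th percentile, which is exactly the kind of comparison a norm-referenced test report makes.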
The document contains an outline of the table of contents for a textbook on general statistics. It covers topics such as preliminary concepts, data collection and presentation, measures of central tendency, measures of dispersion and skewness, and permutations and combinations. Sample chapters discuss introduction to statistics, variables and data, methods of presenting data through tables, graphs and diagrams, computing the mean, median and mode, and other statistical measures.
This document provides an overview of introducing SPSS and quantifying data for analysis. It discusses the different types of data in SPSS including nominal, ordinal, interval/ratio scales. It covers entering data from questionnaires or other sources into SPSS and constructing a codebook. The document then explains how to conduct basic analyses in SPSS including frequency counts, measures of central tendency and dispersion, charts, contingency tables, and chi-square tests. It emphasizes correctly preparing and working with data in SPSS before conducting analyses.
This document provides guidance on analyzing data using SPSS. It covers topics such as different data types, structuring data for analysis in SPSS, descriptive statistics, graphs, inferential statistics, and specific tests like t-tests, ANOVA, and correlation. The document is intended as a practical guide for researchers who need to analyze their data using SPSS. It defines key terms and provides examples to illustrate different statistical concepts and analysis procedures.
This document provides guidance on analysing data using SPSS. It discusses key considerations for determining the appropriate analysis method, including the type of data (nominal, ordinal, interval, ratio), whether the data is paired, whether it is parametric, and what is being examined (differences, correlations, etc.). It covers descriptive statistics, inferential statistics, and specific tests like t-tests, ANOVA, correlation, and chi-square. Examples are provided to illustrate different analysis techniques for various research study designs.
This document provides an overview of different types of data analysis including univariate analysis, bivariate analysis, and multivariate analysis. It also discusses different types of data structures such as cross-sectional data, time series data, and panel data.
The key points are:
1) Univariate analysis looks at one variable only to describe patterns in the data. Bivariate analysis looks at the relationship between two variables, while multivariate analysis examines three or more variables.
2) Cross-sectional data collects information from different subjects at the same point in time. Time series data observes the same variable over time. Panel data tracks the same subjects over multiple time periods.
3) Different analysis techniques can be used depending on the
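The decision considerations these summaries list (type of data, pairing, differences vs. relationships) can be caricatured as a toy test chooser. This is a deliberate simplification for illustration, not a substitute for the fuller decision charts such guides provide.

```python
def suggest_test(outcome_type, groups=None, paired=False, question="difference"):
    """Very rough test chooser for simple parametric designs.

    outcome_type: "nominal" or "continuous". A simplification of the
    considerations in the text; real choices also depend on sample size,
    distribution shape, and study design.
    """
    if question == "relationship" and outcome_type == "continuous":
        return "Pearson correlation"
    if outcome_type == "nominal":
        return "chi-square test"
    if groups == 2:
        return "paired t-test" if paired else "independent-samples t-test"
    if groups and groups > 2:
        return "one-way ANOVA"
    return "review the design; no simple default"
```

For example, a continuous outcome compared across three independent groups maps to a one-way ANOVA, while two nominal variables map to a chi-square test.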
Need a non-plagiarised paper and a form completed by 10/06/2015 before 7:00 pm. I have attached the documents along with the rubrics that must be followed.
Coyne and Messina Articles, Part 2 Statistical Assessment
Details:
1) Write a paper of 1,000-1,250 words regarding the statistical significance of outcomes as presented in the article by Messina et al., "The Relationship between Patient Satisfaction and Inpatient Admissions Across Teaching and Nonteaching Hospitals."
2) Assess the appropriateness of the statistics used by referring to the chart presented in the Module 4 lecture and the resource "Statistical Assessment."
3) Discuss the value of statistical significance vs. pragmatic usefulness.
4) Prepare this assignment according to the APA guidelines found in the APA Style Guide located in the Student Success Center. An abstract is not required.
5) This assignment uses a grading rubric. Instructors will be using the rubric to grade the assignment; therefore, students should review the rubric prior to beginning the assignment to become familiar with the assignment criteria and expectations for successful completion of the assignment.
Statistics: What you Need to Know
Introduction
Often, when people begin a statistics course, they worry about doing advanced mathematics, or their math phobias kick in. It is important to understand that statistics, as addressed in this course, is not a math course at all. The only math you will do is addition, subtraction, multiplication, and division. In these days of computer capability, you generally don't even have to do that much, since Excel is set up to do basic statistics for you. The key task for the student in this course is to understand the various types of statistics, what their requirements are, what they do, and how you can use and interpret the results. Referring back to the basic components of a valid research study, which statistic a researcher uses depends on several things:
· The research question itself
· The sample size
· The type of data you have collected
· The type of statistic called for by the design
All quantitative studies require a data set. Qualitative studies may use a data set or may use observations with no numerical data at all. For the purposes of the next modules, our focus will be on quantitative studies.
Types of Statistics
There are several types of statistics available to the researcher. Descriptive statistics provide a basic description of the data set. This includes the measures of central tendency: means, medians, and modes, and the measures of dispersion, including variances and standard deviations. Descriptive statistics also include the sample size, or "N", and the frequency with which each data point occurs in the data set.
Inferential statistics allow the researcher to make predictions, estimations, and generalizations about the data set, the sample, and the population from which the sample was drawn. They allow you to draw inferences and generalizations.
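As one concrete instance of the inferential statistics described here, the sketch below computes Welch's t statistic for comparing two independent group means. The data are invented for the example, and a package such as SPSS or Excel would also supply the p-value.

```python
from statistics import mean, stdev

def welch_t(group_a, group_b):
    """Welch's t statistic and approximate degrees of freedom for two
    independent samples (does not assume equal variances)."""
    va = stdev(group_a) ** 2 / len(group_a)
    vb = stdev(group_b) ** 2 / len(group_b)
    t = (mean(group_a) - mean(group_b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(group_a) - 1)
                           + vb ** 2 / (len(group_b) - 1))
    return t, df
```

The resulting t and df would then be compared against a t distribution to decide whether the observed group difference generalizes beyond the sample.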
Print Copy/Export Output Instructions
SPSS output can be selectively copied and pasted into Word by using the Copy command:
1. Click on the SPSS output in the Viewer window.
2. Right-click for options.
3. Click the Copy command.
4. Paste the output into a Microsoft Word document.
The Copy command will preserve the formatting of the SPSS tables and charts when pasting into Microsoft Word.
An alternative method is to use the Export command:
1. Click on the SPSS output in the Viewer window.
2. Right-click for options.
3. Click the Export command.
4. Save the file as Word/RTF (.doc) to your computer.
5. Open the .doc file.
IBM SPSS Step-by-Step Guide: Histograms and Descriptive Statistics
Note: This guide is an example of creating histograms and descriptive statistics in SPSS with the grades.sav file. The variables shown in this guide do not correspond with the actual variables assigned in Assessment 1. Carefully follow the Assessment 1 instructions for a list of assigned variables. Screen shots were created with SPSS 21.0.
Section 1: Histograms and Visual Interpretation
Refer to your Assessment 1 instructions for a list of assigned Section 1 variables. The example variable final is shown below.
Step 1. Open grades.sav in SPSS.
Step 2. On the Graphs menu, point to Legacy Dialogs and click Histogram…
Step 3. In the Histogram dialog box:
First, select your assigned variable for Assessment 1. Move it to the Variable box. The example of final appears below.
Second, select Display normal curve.
Third, move gender to the Rows box.
Fourth, click OK to generate the histogram output.
Step 4. The histogram output appears in SPSS. Copy your SPSS output. To do this, right-click on the image and then click Copy.
Step 5. Open a new Word document and right-click for the paste options. Paste the histogram output into the Word document. Below the histograms, write up your visual interpretations as described in your Assessment 1.
Section 2: Calculate and Interpret Measures of Central Tendency and Dispersion
Refer to the Assessment 1 instructions for a list of assigned Section 2 variables. The example variable final is shown below.
Step 1. On the Analyze menu, point to Descriptive Statistics and click Descriptives…
Step 2. In the Descriptives dialog box:
First, select your Assessment 1 variables. Move them to the Variable(s) box. The example variable final is shown below.
Second, click the Options button.
Third, select Mean, Std. deviation, Kurtosis, and Skewness. For Display Order, select Variable list.
Fourth, click Continue and then click OK. The descriptives output is generated in SPSS for your Assessment 1 variables.
Step 3. Copy the descriptive statistics output and paste it into your Word document. Below the output, write up your analyses as described in the Assessment 1 instructions.
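For readers without SPSS, roughly the same descriptive quantities can be computed by hand. Note that SPSS's Descriptives procedure uses small-sample-adjusted skewness and kurtosis estimates, so the simple moment formulas below will differ slightly from its output.

```python
from statistics import mean

def moments(values):
    """Moment-based skewness and excess kurtosis (population formulas).

    SPSS applies small-sample adjustments, so its reported values will
    differ slightly from these, especially for small n.
    """
    n = len(values)
    m = mean(values)
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    m4 = sum((x - m) ** 4 for x in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skew, excess_kurtosis
```

A perfectly symmetric distribution gives skewness 0, and a flat, short-tailed one gives negative excess kurtosis, which matches the visual interpretation asked for in Section 1.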
Assessment 1 Context: Transitioning from Descriptive Statistics to Inferential Statistics
In this assessment ...
Advice on Statistical Analysis for Circulation Research
This document provides an overview and review of statistical methods for analyzing cardiovascular research data. It discusses common statistical errors in previous decades, such as low statistical power and inadequate analysis of repeated measures studies. It introduces several statistical methods that are useful but not always familiar to cardiologists, including power analysis, methods for analyzing repeated measures, analysis of covariance, multivariate analysis of variance, nonparametric tests, and more. The goal is to help researchers choose the appropriate statistical tests and properly interpret the results.
This document provides an overview of qualitative data analysis. It discusses that qualitative data analysis involves coding, categorizing, comparing and interpreting collected data to find meanings and implications. The researcher's perspective influences the analysis. It also describes techniques for qualitative data analysis like becoming familiar with the data, providing in-depth descriptions, and categorizing data into themes. Ensuring credibility involves considering factors like the researcher's observations and biases. The document also contrasts qualitative data analysis with quantitative analysis.
This document provides an overview of key concepts from Chapter 1 of the textbook "Elementary Statistics". It defines important statistical terms like population, sample, parameter, and statistic. It also distinguishes between different types of data and levels of measurement. Additionally, it discusses the importance of collecting sample data through appropriate random sampling methods. Critical thinking in statistics is emphasized, highlighting factors like the context, source, and sampling method of data when evaluating statistical claims. Different ways of collecting data through studies and experiments are also introduced.
The document summarizes key concepts from Chapter 1 of the textbook "Elementary Statistics" including:
- The difference between a population and a sample, and how statistics uses samples to make inferences about populations.
- The different types of data: quantitative, categorical, discrete vs. continuous data.
- The different levels of measurement for data: nominal, ordinal, interval, and ratio.
- The importance of critical thinking when analyzing data and statistics, including considering context, sources, sampling methods, and avoiding misleading graphs, samples, conclusions, or survey questions.
The document discusses various techniques for quantitative data analysis, including descriptive analysis, exploratory analysis, and statistical analysis. Descriptive analysis involves frequency tables, charts, and summary statistics to describe individual and groups of variables. Exploratory analysis examines relationships between two or more variables using cross-tabulations and correlations. Statistical analysis tests for significant relationships using techniques like chi-squared tests, t-tests, and regression analysis. The remainder of the document provides examples and explanations of these analytical methods.
This document provides an overview of methods for data analysis. It discusses data, descriptive statistics such as measures of central tendency and dispersion, inferential statistics including hypothesis testing and probability, and statistical software packages with a focus on SPSS. SPSS allows users to easily input, manage, and analyze data to obtain summary statistics and perform inferential analyses like t-tests, ANOVA, and regression. Outputs can be copied into reports.
Week 9
Data Analysis
Resources
Readings
Yegidis, B. L., Weinbach, R. W., & Myers, L. L. (2018). Research methods for social workers (8th ed.). New York, NY: Pearson. Chapter 13, "Analyzing Data" (pp. 295–297, "The Data in Perspective").
Bauer, S., Lambert, M. J., & Nielsen, S. L. (2004). Clinical significance methods: A comparison of statistical techniques. Journal of Personality Assessment, 82, 60–70.
Gibson, F. H. (2003). Indigent client perceptions of barriers to marriage and family therapy (Dissertation, University of Louisiana at Monroe).
Plummer, S.-B., Makris, S., & Brocksen, S. M. (Eds.). (2014). Social work case studies: Foundation year. Baltimore, MD: Laureate International Universities Publishing. [Vital Source e-reader]. "Social Work Research: Program Evaluation."
Data Analysis Techniques
In order to make decisions about the value of any research study for practice, it is important to understand the general processes involved in analyzing research data. By now, you have examined enough research studies to be aware that there are some common ways that data are reported and summarized in research studies. For example, the sample is often described by the number of participants and by certain characteristics of those participants that help us determine how representative the sample is of a population. The information about the sample is commonly reported in tables and graphs, making use of frequency distributions, measures of central tendency, and dispersion. Information about the variables (or concepts) of interest, when quantified, is also reported in a similar manner.
Although the actual data analysis takes place after data have been collected, from the initial planning of a research study, the researcher needs to have an awareness of the types of questions that can be answered by particular data analysis techniques.
For this Discussion, review the case study entitled "Social Work Research: Measuring Group Success." Consider the data analysis described in that case. Recall the information presented in the earlier chapters of your text about formulating research questions to inform a hypothesis or an open-ended exploration of an issue.
Discussion 1
Relationship Between Purpose of Study and Data Analysis Techniques (1-page paper)
Post an explanation of the types of descriptive and/or inferential statistics you might use to analyze the data gathered in the case study. Also explain how the statistics you identify can guide you in evaluating the applicability of the study's findings for your own practice as a social worker. Please use the Resources to support your answer.
Discussion 2
Research studies often compare variables, conditions, times, and/or groups of participants to evaluate relationships between variables or differences between groups or times. For example, if researchers are interested in knowing whether an intervention produces change in the ...
Statistics is the study of collecting, organizing, summarizing, and interpreting data. Medical statistics applies statistical methods to medical data and research. Biostatistics specifically applies statistical methods to biological data. Statistics is essential for medical research, updating medical knowledge, data management, describing research findings, and evaluating health programs. It allows comparison of populations, risks, treatments, and more.
This document provides an introduction and definitions related to key concepts in statistics. It discusses what statistics is as the science of collecting, organizing, and interpreting data. It defines important statistical terms like data, variables, statistics, and parameters. It also outlines the two main branches of statistics as descriptive statistics, which focuses on summarizing and presenting data, and inferential statistics, which analyzes samples to make inferences about populations. Finally, it discusses common sources of data like published sources, experiments, and surveys.
Statistics: What you Need to Know
Introduction
Often, when people begin a statistics course, they worry about doing advanced mathematics, or their math phobias kick in. It is important to understand that statistics, as addressed in this course, is not a math course at all. The only math you will do is addition, subtraction, multiplication, and division. In these days of computer capability, you generally don't even have to do that much, since Excel is set up to do basic statistics for you. The key task for the student in this course is to understand the various types of statistics, what their requirements are, what they do, and how you can use and interpret the results. Referring back to the basic components of a valid research study, which statistic a researcher uses depends on several things:
The research question itself
The sample size
The type of data you have collected
The type of statistic called for by the design
All quantitative studies require a data set. Qualitative studies may use a data set or may use observations with no numerical data at all. For the purposes of the next modules, our focus will be on quantitative studies.
Types of Statistics
There are several types of statistics available to the researcher. Descriptive statistics provide a basic description of the data set. This includes the measures of central tendency: means, medians, and modes, and the measures of dispersion, including variances and standard deviations. Descriptive statistics also include the sample size, or "N", and the frequency with which each data point occurs in the data set.
Inferential statistics allow the researcher to make predictions, estimations, and generalizations about the data set, the sample, and the population from which the sample was drawn. They allow you to draw inferences, generalizations, and possibilities regarding the relationship between the independent variable and the dependent variable to indicate how those inferences answer the research question. Researchers can make predictions and estimations about how the results will fit the overall population. Statistics can also be described in terms of the types of data they can analyze. Non-parametric statistics can be used with nominal or ordinal data, while parametric statistics can be used with interval and ratio data types.
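As a concrete illustration of the descriptive/inferential distinction, the sketch below computes one classic parametric inferential statistic in plain Python: the independent-samples (pooled-variance) t statistic used to compare two group means. The data are invented for illustration.

```python
import math

def pooled_t(a, b):
    """Independent-samples t statistic with pooled variance
    (a parametric test; df = len(a) + len(b) - 2)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ssa = sum((x - ma) ** 2 for x in a)   # within-group sums of squares
    ssb = sum((x - mb) ** 2 for x in b)
    sp2 = (ssa + ssb) / (na + nb - 2)     # pooled variance estimate
    se = math.sqrt(sp2 * (1 / na + 1 / nb))
    return (ma - mb) / se

# hypothetical scores for two groups
print(pooled_t([1, 2, 3], [2, 3, 4]))     # about -1.2247
```

Descriptive statistics (the group means and sums of squares) feed directly into the inferential statistic; the t value is then compared against a t distribution to generalize from the sample to the population.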
Types of Data
There are four types of data that a researcher may collect.
Nominal Data Sets
The Nominal data set includes simple classifications of data into categories which are all of equal weight and value. Examples of categories that are equal to each other include gender (male, female), state of birth (Arizona, Wyoming, etc.), membership in a group (yes, no). Each of these categories is equivalent to the other, without value judgments.
Ordinal Data Sets
Ordinal data sets also have data classified into categories, but these categories have some form of order or ranking attached, often reflecting some sort of value.
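A small, hypothetical illustration of the difference: nominal codes are arbitrary labels, while ordinal codes must preserve rank, so the ordering has to be supplied explicitly. The Likert responses below are invented:

```python
import statistics

# hypothetical Likert responses (ordinal data: categories with rank)
likert = ["agree", "disagree", "neutral", "agree", "strongly agree"]

# nominal codes would be arbitrary; ordinal codes must preserve the order
order = {"strongly disagree": 1, "disagree": 2, "neutral": 3,
         "agree": 4, "strongly agree": 5}
codes = [order[r] for r in likert]        # [4, 2, 3, 4, 5]

# the median respects rank, so it is meaningful for ordinal data;
# the mean assumes equal spacing between categories, so use it with care
print(statistics.median(codes))           # 4
```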
MELJUN CORTES, Research Seminar 1: Data Analysis Basics (slides)
This document discusses the basics of descriptive data analysis, including defining different types of variables, coding principles, and univariate data analysis. It describes continuous variables as always numeric and categorical variables as information sorted into categories, with ordinal variables having intrinsic order, nominal variables lacking order, and dichotomous variables having only two levels. The document outlines steps for coding variables and cleaning data, as well as techniques for univariate analysis of continuous and categorical variables to check data quality.
MELJUN CORTES, Research Seminar 1: Data Analysis Basics
This document discusses the basics of descriptive data analysis, including defining different types of variables, coding principles, and univariate data analysis. It describes continuous and categorical variables, including ordinal, nominal, and dichotomous variables. The document outlines steps for coding variables and cleaning data, as well as techniques for univariate analysis of continuous and categorical variables, including examining distributions, frequencies, and comparing observed vs expected values. The goal is to familiarize readers with fundamental concepts for organizing, coding, and performing initial exploratory analysis of data.
This document provides guidelines for exploring data and assumption testing in applied statistics. It discusses descriptive statistics, normal distribution tests, and assigning practice exercises to student groups. Specifically, it explains how to generate descriptive statistics and histograms in SPSS, introduces the Kolmogorov-Smirnov normality test, and provides examples analyzing normality for intrinsic motivation scores and exam scores from different universities. Students are then assigned to groups and asked questions related to outliers, measures of central tendency, variables, distribution characteristics, within-subjects designs, statistical errors, effect sizes, standard error, p-values, z-scores, and degrees of freedom.
Introduction of Statistics and Probability
This document discusses key concepts in statistics including collecting, organizing, and analyzing quantitative and qualitative data. It defines common statistical terminology like nominal, ordinal, interval, and ratio scales of measurement. Descriptive and inferential statistics are compared, where descriptive statistics summarize data and inferential statistics are used to make generalizations from a sample to a population. Common descriptive measures like mean, median, and mode are also defined.
Name
1. The table shows the number of days per week, x, that 100 students use the gym at a local high school.

x    Frequency    Relative frequency    Cumulative frequency
0        3
1       12
2       33
3       28
4       11
5        9
6        4

a. Complete the table.
b. Display the information as either a pie chart, a horizontal bar chart, or a vertical bar chart.
c. Determine the mean, median, minimum frequency, maximum frequency, range, Q1, Q3, and the standard deviation, Sx.
d. Based on the information and the chart, what can you say about the distribution?
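The summary statistics requested in part (c) can be checked with Python's standard statistics module by expanding the frequency table into the 100 individual values. Note that quartile conventions differ across calculators and packages, so Q1 and Q3 may not match a graphing calculator or SPSS exactly:

```python
import statistics

# frequency table from the problem: x = days per week, f = number of students
freq = {0: 3, 1: 12, 2: 33, 3: 28, 4: 11, 5: 9, 6: 4}
data = [x for x, f in freq.items() for _ in range(f)]   # expand to 100 values

mean = statistics.mean(data)                  # 275/100 = 2.75
median = statistics.median(data)              # 3
sx = statistics.stdev(data)                   # sample SD, about 1.37
rng = max(data) - min(data)                   # 6
q1, q2, q3 = statistics.quantiles(data, n=4)  # quartile convention varies
```

With the mean (2.75) slightly below the median (3) and most of the mass at 2–3 days, the distribution is roughly unimodal with a mild skew, which is the kind of observation part (d) is asking for.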
Theme one is to identify the types of cultures, or models of cultures, and how they work or fit within an organization.
Learning Activity #1
Using your reading material, create a chart that describes the type, the characteristics of the culture, the associated values that would be important to keep the culture alive, and the kinds of organizational structures that work best for the culture. Compare and contrast them in your explanation of the chart. For instance, which culture might work for Joe at the new sawmill, and which might work at Purvis' shoe company?
Theme two: How to Create, Change, and Align Culture to the Structure and Vision.
Organizational Structure
Preface:
A leader’s job is to create the direction for the company to move forward. The leader does this in steps. Here are the steps of the process:
First, the leader designs the vision and mission for the company; second, the leader must establish an organizational structure that promotes the vision and mission and empowers the employees to keep the forward movement in the organization.
In creating the structure various factors must be considered.
· First and foremost is the purpose of the company or organization. What type of structure will best accomplish that goal? Certainly a company like UPS needs a somewhat rigid structure that is set up to focus on procedure and time sensitivity. Since UPS has as its goal to get the correct parcels to the right customers in the fastest way possible, variance in procedures or ways of accomplishing the tasks would never work. A tight delineated structure is imperative.
· Along with the purpose the leader must look at the vision of the organization. Where does the leader want the organization to go? How best can the structure provide for the future? Will the vision call for expansion into other countries or simply call for product development changes? Do you plan a struct ...
Name _____________________ Date ________________________
ESL 408 Remembered Event Worksheet
1) What is the most memorable, significant event in your life?
2) What important lesson(s) or applications are there from this event?
3) Complete the chart below. Add at least 5 details to each part of the storyline.
Story Element
Details
Exposition
Rising Action
Climax
Falling Action
Resolution
...
This document provides guidance on analyzing data using SPSS. It covers topics such as different data types, structuring data for analysis in SPSS, descriptive statistics, graphs, inferential statistics, and specific tests like t-tests, ANOVA, and correlation. The document is intended as a practical guide for researchers who need to analyze their data using SPSS. It defines key terms and provides examples to illustrate different statistical concepts and analysis procedures.
This document provides guidance on analysing data using SPSS. It discusses key considerations for determining the appropriate analysis method, including the type of data (nominal, ordinal, interval, ratio), whether the data is paired, whether it is parametric, and what is being examined (differences, correlations, etc.). It covers descriptive statistics, inferential statistics, and specific tests like t-tests, ANOVA, correlation, and chi-square. Examples are provided to illustrate different analysis techniques for various research study designs.
This document provides an overview of different types of data analysis including univariate analysis, bivariate analysis, and multivariate analysis. It also discusses different types of data structures such as cross-sectional data, time series data, and panel data.
The key points are:
1) Univariate analysis looks at one variable only to describe patterns in the data. Bivariate analysis looks at the relationship between two variables, while multivariate analysis examines three or more variables.
2) Cross-sectional data collects information from different subjects at the same point in time. Time series data observes the same variable over time. Panel data tracks the same subjects over multiple time periods.
3) Different analysis techniques can be used depending on the
Need a non-plagiarised paper and a form completed by 10/06/2015 before 7:00 p.m. I have attached the documents along with the rubrics that must be followed.
Coyne and Messina Articles, Part 2 Statistical Assessment
Details:
1) Write a paper of 1,000-1,250 words regarding the statistical significance of outcomes as presented in Messina's, et al. article "The Relationship between Patient Satisfaction and Inpatient Admissions Across Teaching and Nonteaching Hospitals."
2) Assess the appropriateness of the statistics used by referring to the chart presented in the Module 4 lecture and the resource "Statistical Assessment."
3) Discuss the value of statistical significance vs. pragmatic usefulness.
4) Prepare this assignment according to the APA guidelines found in the APA Style Guide located in the Student Success Center. An abstract is not required.
5) This assignment uses a grading rubric. Instructors will be using the rubric to grade the assignment; therefore, students should review the rubric prior to beginning the assignment to become familiar with the assignment criteria and expectations for successful completion of the assignment.
Print Copy/Export Output Instructions
SPSS output can be selectively copied and pasted into Word by using the Copy command:
1. Click on the SPSS output in the Viewer window.
2. Right-click for options.
3. Click the Copy command.
4. Paste the output into a Microsoft Word document.
The Copy command will preserve the formatting of the SPSS tables and charts when pasting into Microsoft Word.
An alternative method is to use the Export command:
1. Click on the SPSS output in the Viewer window.
2. Right-click for options.
3. Click the Export command.
4. Save the file as Word/RTF (.doc) to your computer.
5. Open the .doc file.
IBM SPSS Step-by-Step Guide: Histograms and Descriptive Statistics
Note: This guide is an example of creating histograms and descriptive statistics in SPSS with the grades.sav file. The variables shown in this guide do not correspond with the actual variables assigned in Assessment 1. Carefully follow the Assessment 1 instructions for a list of assigned variables. Screen shots were created with SPSS 21.0.
Section 1: Histograms and Visual Interpretation
Refer to your Assessment 1 instructions for a list of assigned Section 1 variables. The example variable final is shown below.
Step 1. Open grades.sav in SPSS.
Step 2. On the Graphs menu, point to Legacy Dialogs and click Histogram…
Step 3. In the Histogram dialog box:
First, select your assigned variable for Assessment 1. Move it to the Variable box. The exampleof final appears below.
Second, select Display normal curve.
Third, move gender to the Rows box.
Fourth, click OK to generate the histogram output.
Step 4. The histogram output appears in SPSS. Copy your SPSS output. To do this, right-click on the image and then click Copy.
Step 5. Open a new Word document and right-click for the paste options. Paste the histogram output into the Word document. Below the histograms, write up your visual interpretations as described in your Assessment 1.
Section 2: Calculate and Interpret Measures of Central Tendency and Dispersion
Refer to the Assessment 1 instructions for a list of assigned Section 2 variables. The examplevariable final is shown below.
Step 1. On the Analyze menu, point to Descriptive Statistics and click Descriptives…
Step 2. In the Descriptives dialog box:
First, select your Assessment 1 variables. Move them to the Variable(s) box. The examplevariable final is shown below.
Second, click the Options button.
Third, select Mean, Std. deviation, Kurtosis, and Skewness. For Display Order, select Variable list.
Fourth, click Continue and then click OK. The descriptives output is generated in SPSS for your Assessment 1 variables.
Step 3. Copy the descriptive statistics output and paste it into your Word document. Below the output, write up your analyses as described in the Assessment 1 instructions.
1
Remove or Replace: Header Is Not Doc Title
Assessment 1 ContextTransitioning from Descriptive Statistics to Inferential Statistics
In this assessment ...
Advice On Statistical Analysis For Circulation ResearchNancy Ideker
This document provides an overview and review of statistical methods for analyzing cardiovascular research data. It discusses common statistical errors in previous decades, such as low statistical power and inadequate analysis of repeated measures studies. It introduces several statistical methods that are useful but not always familiar to cardiologists, including power analysis, methods for analyzing repeated measures, analysis of covariance, multivariate analysis of variance, nonparametric tests, and more. The goal is to help researchers choose the appropriate statistical tests and properly interpret the results.
This document provides an overview of qualitative data analysis. It discusses that qualitative data analysis involves coding, categorizing, comparing and interpreting collected data to find meanings and implications. The researcher's perspective influences the analysis. It also describes techniques for qualitative data analysis like becoming familiar with the data, providing in-depth descriptions, and categorizing data into themes. Ensuring credibility involves considering factors like the researcher's observations and biases. The document also contrasts qualitative data analysis with quantitative analysis.
This document provides an overview of key concepts from Chapter 1 of the textbook "Elementary Statistics". It defines important statistical terms like population, sample, parameter, and statistic. It also distinguishes between different types of data and levels of measurement. Additionally, it discusses the importance of collecting sample data through appropriate random sampling methods. Critical thinking in statistics is emphasized, highlighting factors like the context, source, and sampling method of data when evaluating statistical claims. Different ways of collecting data through studies and experiments are also introduced.
The document summarizes key concepts from Chapter 1 of the textbook "Elementary Statistics" including:
- The difference between a population and a sample, and how statistics uses samples to make inferences about populations.
- The different types of data: quantitative, categorical, discrete vs. continuous data.
- The different levels of measurement for data: nominal, ordinal, interval, and ratio.
- The importance of critical thinking when analyzing data and statistics, including considering context, sources, sampling methods, and avoiding misleading graphs, samples, conclusions, or survey questions.
The document discusses various techniques for quantitative data analysis, including descriptive analysis, exploratory analysis, and statistical analysis. Descriptive analysis involves frequency tables, charts, and summary statistics to describe individual and groups of variables. Exploratory analysis examines relationships between two or more variables using cross-tabulations and correlations. Statistical analysis tests for significant relationships using techniques like chi-squared tests, t-tests, and regression analysis. The remainder of the document provides examples and explanations of these analytical methods.
This document provides an overview of methods for data analysis. It discusses data, descriptive statistics such as measures of central tendency and dispersion, inferential statistics including hypothesis testing and probability, and statistical software packages with a focus on SPSS. SPSS allows users to easily input, manage, and analyze data to obtain summary statistics and perform inferential analyses like t-tests, ANOVA, and regression. Outputs can be copied into reports.
Week 9
Data Analysis
Resources
Readings
Yegidis, B. L., Weinbach, R. W., & Myers, L. L. (2018).
Research methods for social workers
(8th ed.). New York, NY: Pearson.
o Chapter 13, “Analyzing Data” (pp. 295–297, “The Data in Perspective”)
Bauer, S., Lambert, M. J., & Nielsen, S. L. (2004). Clinical significance methods: A comparison of statistical techniques.
Journal of Personality Assessment, 82
, 60–70.
Gibson, F. H. (2003).
Indigent client perceptions of barriers to marriage and family therapy
(Dissertation, University of Louisiana at Monroe).
Plummer, S.-B., Makris, S., & Brocksen S. M. (Eds.). (2014).
Social work case studies: Foundation year.
Baltimore, MD: Laureate International Universities Publishing. [Vital Source e- reader].
o Social Work Research: Program Evaluation
Data Analysis Techniques
In order to make decisions about the value of any research study for practice, it is important to understand the general processes involved in analyzing research data. By now, you have examined enough research studies to be aware that there are some common ways that data are reported and summarized in research studies. For example, the sample is often described by numbers of participants and by certain characteristics of those participants that help us determine how representative the sample is of a population. The information about the sample is commonly reported in tables and graphs, making use of frequency distributions, measures of central tendency, and dispersion. Information about the variables (or concepts) of interest when quantified are also reported in similar manner.
Although the actual data analysis takes place after data have been collected, from the initial planning of a research study, the researcher needs to have an awareness of the types of questions that can be answered by particular data analysis techniques.
For this Discussion, review the case study entitled "Social Work Research: Measuring Group Success." Consider the data analysis described in that case. Recall the information presented in the earlier chapters of your text about formulating research questions to inform a hypotheses or open-ended exploration of an issue.
Discussion 1
Relationship Between Purpose of Study and Data Analysis Techniques 1 page paper
Post
an explanation of the types of descriptive and/or inferential statistics you might use to analyze the data gathered in the case study. Also explain how the statistics you identify can guide you in evaluating the applicability of the study’s findings for your own practice as a social worker. Please use the Resources to support your answer.
Week 9
Data Analysis
Discussion 2
Research studies often compare variables, conditions, times, and/or groups of participants to evaluate relationships between variables or differences between groups or times. For example, if researchers are interested in knowing whether an intervention produces change in the ...
Statistics is the study of collecting, organizing, summarizing, and interpreting data. Medical statistics applies statistical methods to medical data and research. Biostatistics specifically applies statistical methods to biological data. Statistics is essential for medical research, updating medical knowledge, data management, describing research findings, and evaluating health programs. It allows comparison of populations, risks, treatments, and more.
This document provides an introduction and definitions related to key concepts in statistics. It discusses what statistics is as the science of collecting, organizing, and interpreting data. It defines important statistical terms like data, variables, statistics, and parameters. It also outlines the two main branches of statistics as descriptive statistics, which focuses on summarizing and presenting data, and inferential statistics, which analyzes samples to make inferences about populations. Finally, it discusses common sources of data like published sources, experiments, and surveys.
Statistics: What you Need to Know
Introduction
Often, when people begin a statistics course, they worry about doing advanced mathematics, or their math phobias kick in. It is important to understand that statistics, as addressed in this course, is not a math course at all: the only math you will do is addition, subtraction, multiplication, and division. In these days of computer capability, you generally don't even have to do that much, since Excel is set up to do basic statistics for you. The key for the student in this course is to understand the various types of statistics, what their requirements are, what they do, and how you can use and interpret the results. Referring back to the basic components of a valid research study, which statistic a researcher uses depends on several things:
The research question itself
The sample size
The type of data you have collected
The type of statistic called for by the design
All quantitative studies require a data set. Qualitative studies may use a data set or may use observations with no numerical data at all. For the purposes of the next modules, our focus will be on quantitative studies.
Types of Statistics
There are several types of statistics available to the researcher. Descriptive statistics provide a basic description of the data set. These include the measures of central tendency (mean, median, and mode) and the measures of dispersion (variance and standard deviation). Descriptive statistics also include the sample size, or "N," and the frequency with which each data point occurs in the data set.
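The descriptive statistics just listed can all be computed with Python's standard library. This is a minimal sketch; the sample scores are invented for illustration:

```python
# Descriptive statistics for a small data set, using only the
# Python standard library. The data values are invented.
from statistics import mean, median, mode, variance, stdev
from collections import Counter

scores = [82, 75, 91, 75, 68, 88, 75, 91]

n = len(scores)         # sample size, "N"
freq = Counter(scores)  # frequency of each data point

print("N:", n)                        # 8
print("mean:", mean(scores))          # 80.625
print("median:", median(scores))      # 78.5
print("mode:", mode(scores))          # 75 (most frequent value)
print("variance:", variance(scores))  # sample variance (n - 1 denominator)
print("std dev:", round(stdev(scores), 2))
print("frequencies:", dict(freq))
```

Note that `statistics.variance` and `statistics.stdev` use the sample (n − 1) denominator, which is what inferential work on a sample calls for; `pvariance`/`pstdev` are the population versions.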
Inferential statistics allow the researcher to make predictions, estimations, and generalizations about the data set, the sample, and the population from which the sample was drawn. They let you draw inferences about the relationship between the independent variable and the dependent variable, and show how those inferences answer the research question. Researchers can also estimate how well the results will generalize to the overall population. Statistics can further be described in terms of the types of data they can analyze: non-parametric statistics can be used with nominal or ordinal data, while parametric statistics can be used with interval and ratio data.
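The pairing of measurement level with statistic family stated above can be sketched as a small lookup. The function name and the example tests listed in the comments are illustrative, not any library's API:

```python
# The rule from the text: non-parametric statistics for nominal or
# ordinal data; parametric statistics for interval or ratio data.

PARAMETRIC_LEVELS = {"interval", "ratio"}
NONPARAMETRIC_LEVELS = {"nominal", "ordinal"}

def statistic_family(measurement_level: str) -> str:
    """Return which family of statistics fits a measurement level."""
    level = measurement_level.lower()
    if level in PARAMETRIC_LEVELS:
        return "parametric"          # e.g. t-tests, ANOVA, Pearson r
    if level in NONPARAMETRIC_LEVELS:
        return "non-parametric"      # e.g. chi-square, Mann-Whitney U
    raise ValueError(f"unknown measurement level: {measurement_level}")

print(statistic_family("nominal"))   # -> non-parametric
print(statistic_family("ratio"))     # -> parametric
```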
Types of Data
There are four types of data that a researcher may collect.
Nominal Data Sets
The Nominal data set includes simple classifications of data into categories which are all of equal weight and value. Examples of categories that are equal to each other include gender (male, female), state of birth (Arizona, Wyoming, etc.), membership in a group (yes, no). Each of these categories is equivalent to the other, without value judgments.
Ordinal Data Sets
Ordinal data sets also have data classified into categories, but these categories have some form of order or ranking attached, often of some sort of value / val.
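The difference between nominal and ordinal categories can be illustrated in a few lines. With nominal data (such as state of birth) only frequency counts are meaningful; with ordinal data the labels carry a rank, so the values can also be sorted and compared. The data values and the rank mapping below are invented for illustration:

```python
# Nominal vs. ordinal categories, using only the standard library.
from collections import Counter

# Nominal: categories of equal weight; counting is the only
# meaningful operation.
states = ["Arizona", "Wyoming", "Arizona", "Ohio", "Arizona"]
print(Counter(states))  # Arizona appears 3 times

# Ordinal: the labels encode a ranking, so ordering is meaningful.
RANKS = {"low": 0, "medium": 1, "high": 2}
ratings = ["high", "low", "medium", "high", "low"]
ratings_sorted = sorted(ratings, key=RANKS.get)
print(ratings_sorted)               # ordered from "low" to "high"
print(max(ratings, key=RANKS.get))  # highest rating observed
```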
Star Mass (M )
Star Luminosity (L )
CHZ Inner Boundary (AU)
CHZ Outer Boundary (AU)
Width of CHZ (AU)
0.3
0.7
1.0
2.0
4.0
8.0
15.0
Question 3: Using the table above, what general conclusion can be made regarding the location of the CHZ for different types of stars?
Question 4: Using the table above, what general conclusion can be made regarding the width of the CHZ for different types of stars?
Exploring Other Systems
Begin by selecting the system 51 Pegasi. This was the first planet discovered around a star using the radial velocity technique. This technique detects systematic shifts in the wavelengths of absorption lines in the star’s spectra over time due to the motion of the star around the star-planet center of mass. The planet orbiting 51 Pegasi has a mass of at least half Jupiter’s mass.
Question 5: Zoom out so that you can compare this planet to those in our solar system (you can click-hold-drag to change t ...
Name Class Date SKILL ACTIVITY Giving an Eff.docxgilpinleeanna
Name Class Date
SKILL ACTIVITY
Giving an Effective Presentation
Directions: Read the information about oral presentations. Then
complete an outline for your own presentation.
One kind of oral presentation is a speech in which you explain
a position, or opinion, about an issue. After your speech, the
audience asks questions and you answer them. Preparing is the
first step. Use the following list as a guide to prepare.
• Decide what opinion you will take—for or against—and why.
• Write a short opening statement that gives your opinion.
• Gather facts and examples that support your opinion.
• Write a short conclusion that restates your opinion.
• Brainstorm a list of questions that your audience might ask.
Write down answers to the questions.
• Practice your presentation. Keep track of how long your
speech takes.
When you make the presentation, follow these steps:
• Begin with your opening statement.
• Give facts and examples that support your opinion.
• Conclude by stating your opinion again in different words.
• Answer questions from the audience. Listen carefully to make
sure you understand each question.
• While you are speaking, remember to look at your audience.
• Speak loudly and clearly so they can hear you.
Directions: Prepare and give a presentation on the following
topic: Is the increase in temporary employment a good thing for
American workers? Copy the following outline onto your own
paper to begin organizing your ideas.
I. Your opening statement:
II. Facts and examples that support your opinion:
1–5.
III. Your conclusion:
IV. Questions the audience may ask:
1–5.
V. Answers to these questions:
1–5.
BODY%RITUAL%AMONG%THE%NACIREMA%%
Horace%Miner%
%
From%Horace%Miner,%"Body%Ritual%among%the%Nacirema."%Reproduced%by%permission%of%the%
American%Anthropological%Association%from%The%American%Anthropologist,%vol.%58%(1956),%pp.%
503S507.%
%
Most%cultures%exhibit%a%particular%configuration%or%style.%A%single%value%or%pattern%of%perceiving%
the%world%often%leaves%its%stamp%on%several%institutions%in%the%society.%Examples%are%"machismo"%
in%Spanish>influenced%cultures,%"face"%in%Japanese%culture,%and%"pollution%by%females"%in%some%
highland%New%Guinea%cultures.%Here%Horace%Miner%demonstrates%that%"attitudes%about%the%
body"%have%a%pervasive%influence%on%many%institutions%in%Nacireman%society.%
The%anthropologist%has%become%so%familiar%with%the%diversity%of%ways%in%which%different%peoples%
behave%in%similar%situations%that%he%is%not%apt%to%be%surprised%by%even%the%most%exotic%customs.%
In%fact,%if%all%of%the%logically%possible%combinations%of%behavior%have%not%been%found%somewhere%
in%the%world,%he%is%apt%to%suspect%that%they%must%be%present%in%some%yet%undescribed%tribe.%%This%
point%has,%in%fact,%been%expressed%with%respect%to%clan%organization%by%Murdock.%In%this%light,%
the%magical%beliefs%and%practices%of%the%Nacirema%present%such%unusual%aspect ...
Name Speech Title I. Intro A) Atten.docxgilpinleeanna
Name:
Speech Title
I. Intro:
A) Attention getter --
B) Purpose Statement --
C) Thesis --
II. BODY
A) Main Point Number 1:
a)
b)
c)
transition --
B) Main Point Number 2:
a)
b)
c)
transition --
C) Main Point Number 3:
a)
b)
c)
transition –
III. CONCLUSION:
A) Summary statement --
B) Memorable conclusion --
References
List all references on a separate page with the word “References” centered at the top.
Name: Suepin Nguyen
Hygiene Saves Lives
I. Intro: To give an informational speech about Ignaz Philipp Semmelweis
A) Attention getter – On each square centimeter of your skin, there are about 1,500
bacteria. That’s a lot of germs. According to a study conducted by Michigan State
University researchers, 95% of people do not properly wash their hands long enough to
kill the infection causing germs and bacteria (Jaslow, “95 Percent of People Wash Their
Hands Improperly: Are You One of Them?”).
B) Purpose Statement - That’s gross. While I can’t force you to wash your hands, perhaps
today I can help you realize just how much history and evidence is behind this crucial
bathroom ritual.
C) Thesis – Today, I will inform you all about Ignaz Philipp Semmelweis by discussing first
about his practice and studies, second about his scientific methods that saved a lot of
lives, and third about the germ theory we all take for granted.
II. BODY:
A) Main Point Number 1: To begin, I want to introduce Ignaz Philipp Semmelweis.
a) Ignaz Semmelweis became a physician and earned his doctorate degree in medicine
in 1844. This time period was known as the start of the golden age of the physician
scientist” (NPR.org). This means that doctors were expected to have scientific
training. Doctors were more interested in numbers and collecting data (Justin Lessler,
an assistant professor at Johns Hopkins School of Public Health).
b) In 1846, Dr. Semmelweis showed up for his new job in the maternity clinic at the
General Hospital in Vienna. Due to the time period, Dr. Semmelweis thought like a
physician scientist and wanted to figure out why so many women in maternity wards
were dying from childbed fever (Davis, “The Doctor Who Championed
Hand-Washing and Briefly Saved Lives”).
c) So what did he do? He collected data of his own. He studied two maternity wards in
the hospital. One was staffed by all male doctors and medical students, and the other
by female midwives. He tallied up the number of deaths in each ward and found that
women in the clinic staffed by doctors and medical students died at a rate 5 times ...
n engl j med 352;16www.nejm.org april 21, .docxgilpinleeanna
The document discusses the case of Terri Schiavo, who was in a persistent vegetative state for 15 years. It summarizes the key facts of her medical condition and diagnosis, the disagreement between her husband and parents about continuing life-sustaining treatment, and the multiple legal appeals involved in the case. It concludes that while both sides wanted what was right for Terri, the central issue is determining what the patient herself would have wanted, which the courts found clear evidence for in Terri's case based on prior statements to her husband.
Name:
Class:
Date:
HUMR 211 Spring 2018 - Midterm
Copyright Cengage Learning. Powered by Cognero. Page 1
Indicate the answer choice that best completes the statement or answers the question.
1. Each of the following is considered the business of social welfare except:
a. telling people how to live their lives.
b. ending all types of discrimination and oppression.
c. providing child-care services for parents who work outside the home.
d. rehabilitating people who are addicted to alcohol or drugs.
2. Which of the following statements is consistent with the residual view of social welfare?
a. Recipients are viewed as being entitled to social services and financial help.
b. Social services and financial help should be provided to an individual on a short-term basis, primarily during
emergencies.
c. It is associated with the belief that an individual’s difficulties are due to causes largely beyond his or her
control.
d. There is no stigma attached to receiving funds or services. In this view, when difficulties arise, causes are
sought in the society, and efforts are focused on improving the social institutions within which the individual
functions.
3. Which of the following is consistent with an institutional view of social welfare?
a. Social services and financial aid should be provided only when other measures or efforts have been exhausted.
b. Causes for client’s difficulties are sought in the society.
c. Clients are to blame for their predicaments because of personal inadequacies.
d. Recipients are required to perform certain low-grade work assignments to receive financial aid.
4. The Elizabethan Poor Law of 1601 established three categories of relief recipients:
a. the insane, the poor, and the disabled.
b. the insane, dependent children, and the poor.
c. the able-bodied poor, the impotent poor, and dependent children.
d. the disabled, wives of prisoners, and the poor.
5. Before 1930 social services and financial assistance for people in need were provided primarily by _____.
a. churches and voluntary organizations
b. federal and state institutions
c. richer European countries
d. the military
6. President Clinton and the Republican-controlled Congress abolished Aid to Families with Dependent Children (AFDC)
in 1996 and replaced it with:
a. Welfare Services for Single Mothers.
b. Temporary Assistance to Needy Families.
c. Conditional Aid to Single Parents.
d. Assistance for Poor Families.
Indicate whether the statement is true or false.
Name:
Class:
Date:
HUMR 211 Spring 2018 - Midterm
Copyright Cengage Learning. Powered by Cognero. Page 2
7. One of the businesses of social welfare is to provide adequate housing for the homeless.
a. True
b. False
8. In the past, social welfare has been more of a pure sci ...
NAME ----------------------------------- CLASS -------------- .docxgilpinleeanna
The document discusses foreign companies establishing manufacturing operations in the United States. It notes that while some US jobs have moved overseas, many foreign companies are also creating new jobs in the US for reasons like proximity to consumers, business incentives from local communities, and generally better business conditions. The article provides examples of Mexican, Japanese, and European companies that have expanded manufacturing in the US and employed thousands of American workers.
Name Understanding by Design (UbD) TemplateStage 1—Desir.docxgilpinleeanna
This document summarizes a proposed change by a Little League commission to eliminate scoring in games. The commission believes this will reduce stress in children, but the summary argues that:
1) There is no evidence scoring causes stress, and children face stress from many sources unrelated to baseball.
2) Removing scoring upends decades of tradition and takes away important lessons about effort and reward for children.
3) Parents will likely oppose the change as it diminishes their experiences supporting and bonding with their children over the game.
Name MUS108 Music Cultures of the World .docxgilpinleeanna
Name MUS108 Music Cultures of the World Points /40
Winter 2018 Exam 2
(Take Home, open notes – NOT open book)
Matching – (1 point each, 8 points total)
Match each term with one of the following cultures by writing the corresponding letter in the blank space:
A. India
B. Bali
C. Ireland
1. _______sitar
2._______kilitan telu
3._______kecak
4._______gamelan
5._______Sean-nós
6._______beleganjur
7._______alap
8._______céilí
9. Describe Irish music. Please include information from each of the 3 different “eras” discussed in the book. (4 points)
10. Describe a raga in detail, with much attention paid to form, instruments, and development/barhat. (4 points)
11. What effect did the potato famine have on the culture and music of Ireland? (6 points)
12. What is ombak? Please explain it in detail, including how it is achieved. (4 points)
13. What is the difference between ceili and session? (2 points)
5. Listening Exercise – 12 points ( 4 points each) Sound Files are on Moodle!!!
Listen to the sound clips. See if you can guess what culture/tradition they come from. You may even be able to guess the type/form of music. Please write down your thought process. What are the clues? Why might it be from one particular culture? Listen to instruments, form, texture. The right answer is not the goal. What I need to see is your reasoning. You could get full credit even if you guess the wrong culture, provided your reasoning is sound. Complete sentences are not needed; lists are fine.
Clip 1.
Clip 2.
Clip 3.
...
Level 3 NCEA - NZ: A Nation In the Making 1872 - 1900 SML.pptHenry Hollis
The History of NZ 1870-1900.
Making of a Nation.
From the NZ Wars to Liberals,
Richard Seddon, George Grey,
Social Laboratory, New Zealand,
Confiscations, Kotahitanga, Kingitanga, Parliament, Suffrage, Repudiation, Economic Change, Agriculture, Gold Mining, Timber, Flax, Sheep, Dairying,
This presentation was provided by Rebecca Benner, Ph.D., of the American Society of Anesthesiologists, for the second session of NISO's 2024 Training Series "DEIA in the Scholarly Landscape." Session Two: 'Expanding Pathways to Publishing Careers,' was held June 13, 2024.
Gender and Mental Health - Counselling and Family Therapy Applications and In...PsychoTech Services
A proprietary approach developed by bringing together the best of learning theories from Psychology, design principles from the world of visualization, and pedagogical methods from over a decade of training experience, that enables you to: Learn better, faster!
Chapter wise All Notes of First year Basic Civil Engineering.pptxDenish Jangid
Chapter wise All Notes of First year Basic Civil Engineering
Syllabus
Chapter-1
Introduction to objective, scope and outcome the subject
Chapter 2
Introduction: Scope and Specialization of Civil Engineering, Role of civil Engineer in Society, Impact of infrastructural development on economy of country.
Chapter 3
Surveying: Object Principles & Types of Surveying; Site Plans, Plans & Maps; Scales & Unit of different Measurements.
Linear Measurements: Instruments used. Linear Measurement by Tape, Ranging out Survey Lines and overcoming Obstructions; Measurements on sloping ground; Tape corrections, conventional symbols. Angular Measurements: Instruments used; Introduction to Compass Surveying, Bearings and Longitude & Latitude of a Line, Introduction to total station.
Levelling: Instrument used Object of levelling, Methods of levelling in brief, and Contour maps.
Chapter 4
Buildings: Selection of site for Buildings, Layout of Building Plan, Types of buildings, Plinth area, carpet area, floor space index, Introduction to building byelaws, concept of sun light & ventilation. Components of Buildings & their functions, Basic concept of R.C.C., Introduction to types of foundation
Chapter 5
Transportation: Introduction to Transportation Engineering; Traffic and Road Safety: Types and Characteristics of Various Modes of Transportation; Various Road Traffic Signs, Causes of Accidents and Road Safety Measures.
Chapter 6
Environmental Engineering: Environmental Pollution, Environmental Acts and Regulations, Functional Concepts of Ecology, Basics of Species, Biodiversity, Ecosystem, Hydrological Cycle; Chemical Cycles: Carbon, Nitrogen & Phosphorus; Energy Flow in Ecosystems.
Water Pollution: Water Quality standards, Introduction to Treatment & Disposal of Waste Water. Reuse and Saving of Water, Rain Water Harvesting. Solid Waste Management: Classification of Solid Waste, Collection, Transportation and Disposal of Solid. Recycling of Solid Waste: Energy Recovery, Sanitary Landfill, On-Site Sanitation. Air & Noise Pollution: Primary and Secondary air pollutants, Harmful effects of Air Pollution, Control of Air Pollution. . Noise Pollution Harmful Effects of noise pollution, control of noise pollution, Global warming & Climate Change, Ozone depletion, Greenhouse effect
Text Books:
1. Palancharmy, Basic Civil Engineering, McGraw Hill publishers.
2. Satheesh Gopi, Basic Civil Engineering, Pearson Publishers.
3. Ketki Rangwala Dalal, Essentials of Civil Engineering, Charotar Publishing House.
4. BCP, Surveying volume 1
Andreas Schleicher presents PISA 2022 Volume III - Creative Thinking - 18 Jun...EduSkills OECD
Andreas Schleicher, Director of Education and Skills at the OECD presents at the launch of PISA 2022 Volume III - Creative Minds, Creative Schools on 18 June 2024.
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptxEduSkills OECD
Iván Bornacelly, Policy Analyst at the OECD Centre for Skills, OECD, presents at the webinar 'Tackling job market gaps with a skills-first approach' on 12 June 2024
For example, researchers who use self-report data to do research
on personality or attitudes need to be aware of common
problems with this type of data. Participants may distort their
answers because of social desirability bias; they may
misunderstand questions; they may not remember the events that
they are asked to report about; they may deliberately try to
“fake good” or “fake bad”; they may even make random
responses without reading the questions. A participant may
accidentally skip a question on a survey and, subsequently, use
the wrong lines on the answer sheet to enter each response; for
example, the response to Question 4 may be filled in as Item 3
on the answer sheet, the response to Question 5 may be filled in
as Item 4, and so forth. In addition, research assistants have
been known to fill in answer sheets themselves instead of
having the participants complete them. Good quality control in
the collection of self-report data requires careful consideration
of question wording and response format and close supervision
of the administration of surveys. Converse and Presser (1999);
Robinson, Shaver, and Wrightsman (1991); and Stone, Turkkan,
Kurtzman, Bachrach, and Jobe (1999) provide more detailed
discussion of methodological issues in the collection of self-
report data.
For observer ratings, it is important to consider issues of
reactivity (i.e., the presence of an observer may actually change
the behavior that is being observed). It is important to establish
good interobserver reliability through training of observers and
empirical assessment of interobserver agreement. See Chapter
21 in this book for discussion of reliability, as well as Aspland
and Gardner (2003), Bakeman (2000), Gottman and Notarius
(2002), and Reis and Gable (2000) for further discussion of
methodological issues in the collection of observational data.
For physiological measures, it is necessary to screen for
artifacts (e.g., when electroencephalogram electrodes are
attached near the forehead, they may detect eye blinks as well
as brain activity; these eye blink artifacts must be removed from
the electroencephalogram signal prior to other processing). See
Cacioppo, Tassinary, and Berntson (2000) for methodological
issues in the collection of physiological data.
This discussion does not cover all possible types of
measurement problems, of course; it only mentions a few of the
many possible problems that may arise in data collection.
Researchers need to be aware of potential problems or sources
of artifact associated with any data collection method that they
use, whether they use data from archival sources, experiments
with animals, mass media, social statistics, or other methods not
mentioned here.
4.3 Example of an SPSS Data Worksheet
The dataset used to illustrate data-screening procedures in this
chapter is named bpstudy.sav. The scores appear in Table 4.1,
and an image of the corresponding SPSS worksheet appears in
Figure 4.1.
This file contains selected data from a dissertation that assessed
the effects of social stress on blood pressure (Mooney, 1990).
The most important features in Figure 4.1 are as follows. Each
row in the Data View worksheet corresponds to the data for 1
case or 1 participant. In this example, there are a total of N = 65
participants; therefore, the dataset has 65 rows. Each column in
the Data View worksheet corresponds to a variable; the SPSS
variable names appear along the top of the data worksheet. In
Figure 4.1, scores are given for the following SPSS variables:
idnum, GENDER, SMOKE (smoking status), AGE, SYS1
(systolic blood pressure or SBP at Time 1), DIA1 (diastolic
blood pressure, DBP, at Time 1), HR1 (heart rate at Time 1),
and WEIGHT. The numerical values contained in this data file
were typed into the SPSS Data View worksheet by hand.
Table 4.1 Data for the Blood Pressure/Social Stress Study
SOURCE: Mooney (1990).
NOTES: 1. idnum = arbitrary, unique identification number for
each participant. 2. GENDER was coded 1 = male, 2 = female.
3. SMOKE was coded 1 = nonsmoker, 2 = light smoker, 3 =
moderate smoker, 4 = heavy or regular smoker. 4. AGE = age in
years. 5. SYS1 = systolic blood pressure at Time 1/baseline. 6.
DIA1 = diastolic blood pressure at Time 1/baseline. 7. HR1 =
heart rate at Time 1/baseline. 8. WEIGHT = body weight in
pounds.
The menu bar across the top of the SPSS Data View worksheet
in Figure 4.1 can be used to select menus for different types of
procedures. The pull-down menu for <File> includes options
such as opening and saving data files. The pull-down menus for
<Analyze> and <Graphs> provide access to SPSS procedures for
data analysis and graphics, respectively.
The two tabs near the lower left-hand corner of the Data View
of the SPSS worksheet can be used to toggle back and forth
between the Data View (shown in Figure 4.1) and the Variable
View (shown in Figure 4.2) versions of the SPSS data file.
The Variable View of an SPSS worksheet, shown in Figure 4.2,
provides a place to document and describe the characteristics of
each variable, to supply labels for variables and score values,
and to identify missing values.
For example, examine the row of the Variable View worksheet
that corresponds to the variable named GENDER. The scores on
this variable were numerical; that is, the scores are in the form
of numbers (rather than alphabetic characters). Other possible
variable types include dates or string variables that consist of
alphabetic characters instead of numbers. If the researcher
needs to identify a variable as string or date type, he or she
clicks on the cell for Variable Type and selects the appropriate
variable type from the pull-down menu list. In the datasets used
as examples in this textbook, almost all the variables are
numerical.
Figure 4.1 SPSS Worksheet for the Blood Pressure/Social
Stress Study (Data View) in bpstudy.sav
SOURCE: Mooney (1990).
The Width column indicates how many significant digits the
scores on each variable can have. For this example, the
variables GENDER and SMOKE were each allowed a one-digit
code, the variable AGE was allowed a two-digit code, and the
remaining variables (heart rate, blood pressure, and body
weight) were each allowed three digits. The Decimals column
indicates how many digits are displayed after the decimal point.
All the variables in this dataset (such as age in years and body
weight in pounds) are given to the nearest integer value, and so
all these variables are displayed with 0 digits to the right of the
decimal place. If a researcher has a variable, such as grade point
average (GPA), that is usually reported to two decimal places
(as in GPA = 2.67), then he or she would select 2 as the number
of digits to display after the decimal point.
Figure 4.2 SPSS Worksheet for the Blood Pressure/Social
Stress Study (Variable View)
The next column, Label, provides a place where each variable
name can be associated with a longer descriptive label. This is
particularly helpful when brief SPSS variable names are not
completely self-explanatory. For example, “body weight in
pounds” appears as a label for the variable WEIGHT. The
Values column provides a place where labels can be associated
with the individual score values of each variable; this is
primarily used with nominal or categorical variables. Figure 4.3
shows the dialog window that opens up when the user clicks on
the cell for Values for the variable GENDER. To associate each
score with a verbal label, the user types in the score (such as 1)
and the corresponding verbal label (such as male) and then
clicks the Add button to add this label to the list of value labels.
When all the labels have been specified, clicking on OK returns
to the main Variable View worksheet. In this example, a score
of 1 on GENDER corresponds to male and a score of 2 on
GENDER corresponds to female.
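The role that value labels play in reporting can be sketched outside SPSS as well. The short Python fragment below mimics the GENDER example with a plain dictionary; the scores and counts are made up for illustration:

```python
from collections import Counter

# A minimal sketch of SPSS-style value labels using a dict
# (mapping mirrors the GENDER example above; the scores are hypothetical).
gender_labels = {1: "male", 2: "female"}
gender_scores = [1, 2, 2, 1, 2, 1, 2]

# Report a frequency table with verbal labels rather than raw numeric codes.
counts = Counter(gender_labels[s] for s in gender_scores)
print(counts["male"], counts["female"])  # 3 4
```

Attaching labels once, at the variable-definition stage, means every subsequent table or graph can display "male" and "female" instead of the arbitrary codes 1 and 2.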
The column headed Missing provides a place to identify scores
as codes for missing values. Consider the following example to
illustrate the problem that arises in data analysis when there are
missing values. Suppose that a participant did not answer the
question about body weight. If the data analyst enters a value of
0 for the body weight of this person who did not provide
information about body weight and does not identify 0 as a code
for a missing value, this value of 0 would be included when
SPSS sums the scores on body weight to compute a mean for
weight. The sample mean is not robust to outliers; that is, a
sample mean for body weight will be substantially lower when a
value of 0 is included for a participant than it would be if that
value of 0 was excluded from the computation of the sample
mean. What should the researcher do to make sure that missing
values are not included in the computation of sample statistics?
SPSS provides two different ways to handle missing score
values. The first option is to leave the cell in the SPSS Data
View worksheet that corresponds to the missing score blank. In
Figure 4.1, participant number 12 did not answer the question
about smoking status; therefore, the cell that corresponds to the
response to the variable SMOKE for Participant 12 was left
blank. By default, SPSS treats empty cells as “system missing”
values. If a table of frequencies is set up for scores on SMOKE,
the response for Participant 12 is labeled as a missing value. If
a mean is calculated for scores on smoking, the score for
Participant 12 is not included in the computation as a value 0;
instead, it is omitted from the computation of the sample mean.
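The arithmetic behind this point is easy to verify. In the sketch below (hypothetical weights, with None standing in for a system-missing value), including an undeclared 0 biases the mean downward, while omitting the missing case does not:

```python
from statistics import mean

# Hypothetical body-weight scores in pounds; one participant left the item blank.
# None stands in for SPSS's system-missing value.
reported = [150, 180, 165, 200, None]

# Wrong: coding the missing response as 0, without declaring 0 a missing
# value, pulls the sample mean down.
as_zero = [w if w is not None else 0 for w in reported]
print(mean(as_zero))   # 139.0 -- biased low

# Right: omit missing values before averaging, as SPSS does with empty cells.
valid = [w for w in reported if w is not None]
print(mean(valid))     # 173.75
```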
Figure 4.3 Value Labels for Gender
A second method is available to handle missing values; it is
possible to use different code numbers to represent different
types of missing data. For example, a survey question that is a
follow-up about the amount and frequency of smoking might be
coded 9 if it was not applicable to an individual (because that
individual never smoked), 99 if the question was not asked
because the interviewer ran out of time, and 88 if the
respondent refused to answer the question. For the variable
WEIGHT, body weight in pounds, a score of 999 was identified
as a missing value by clicking on the cell for Missing and then
typing a score of 999 into one of the windows for missing
values; see the Missing Values dialog window in Figure 4.4. A
score of 999 is defined as a missing value code, and therefore,
these scores are not included when statistics are calculated. It is
important to avoid using codes for missing values that
correspond to possible valid responses. Consider the question,
How many children are there in a household? It would not make
sense to use a score of 0 or a score of 9 as a code for missing
values, because either of these could correspond to the number
of children in some households. It would be acceptable to use a
code of 99 to represent a missing value for this variable because
no single-family household could have such a large number of
children.
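The same logic can be expressed in a few lines of code. This sketch (with made-up responses) shows why the sentinel code must be converted to a true missing value before any statistics are computed:

```python
# Hypothetical responses to "How many children are there in a household?"
# 99 is the user-defined missing-value code; 0 and 9 remain legal answers.
MISSING_CODE = 99
children = [2, 0, 99, 3, 9, 99, 1]

# Replace the sentinel with None before any statistics are computed.
cleaned = [x if x != MISSING_CODE else None for x in children]
valid = [x for x in cleaned if x is not None]
print(cleaned.count(None), sum(valid) / len(valid))  # 2 missing; mean = 3.0
```

Had the two scores of 99 been left in, the mean number of children would have been wildly inflated, which is exactly the error that declaring 999 as missing for WEIGHT prevents.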
The next few columns in the SPSS Variable View worksheet
provide control over the way the values are displayed in the
Data View worksheet. The column headed Columns indicates
the display width of each column in the SPSS Data View
worksheet, in number of characters. This was set at eight
characters wide for most variables. The column headed Align
indicates whether scores will be shown left justified, centered,
or (as in this example) right justified in each column of the
worksheet.
Finally, the column in the Variable View worksheet that is
headed Measure indicates the level of measurement for each
variable. SPSS designates each numerical variable as nominal,
ordinal, or scale (scale is equivalent to interval/ratio level of
measurement, as described in Chapter 1 of this textbook). In
this sample dataset, idnum (an arbitrary and unique
identification number for each participant) and GENDER were
identified as categorical or nominal variables. Smoking status
(SPSS variable name SMOKE) was coded on an ordinal scale
from 1 to 4, with 1 = nonsmoker, 2 = light smoker, 3 = moderate
smoker, and 4 = heavy smoker. The other variables (heart rate,
blood pressure, body weight, and age) are quantitative and
interval/ratio, so they were designated as “scale” in level of
measurement.
Figure 4.4 Missing Values for Weight
4.4 Identification of Errors and Inconsistencies
The SPSS data file should be proofread and compared with
original data sources (if these are accessible) to correct errors in
data coding or data entry. For example, if self-report data are
obtained using computer-scorable answer sheets, the
correspondence between scores on these answer sheets and
scores in the SPSS data file should be verified. This may
require proofreading data line by line and comparing the scores
with the data on the original answer sheets. It is helpful to have
a unique code number associated with each case so that each
line in the data file can be matched with the corresponding
original data sheet.
Even when line-by-line proofreading has been done, it is useful
to run simple exploratory analyses as an additional form of data
screening. Rosenthal (cited in D. B. Wright, 2003) called this
process of exploration “making friends with your data.”
Subsequent sections of this chapter show how examination of
the frequency distribution tables and graphs provides an
overview of the characteristics of people in this sample—for
example, how many males and females were included in the
study, how many nonsmokers versus heavy smokers were
included, and the range of scores on physiological responses
such as heart rate.
Examining response consistency across questions or
measurements is also useful. If a person chooses the response “I
have never smoked” to one question and then reports smoking
10 cigarettes on an average day in another question, these
responses are inconsistent. If a participant’s responses include
numerous inconsistencies, the researcher may want to consider
removing that participant’s data from the data file.
On the basis of knowledge of the variables and the range of
possible response alternatives, a researcher can identify some
responses as “impossible” or “unlikely.” For example, if
participants are provided a choice of the following responses to
a question about smoking status: 1 = nonsmoker, 2 = light
smoker, 3 = moderate smoker, and 4 = heavy smoker, and a
participant marks response number “6” for this question, the
value of 6 does not correspond to any of the response
alternatives provided for the question. When impossible,
unlikely, or inconsistent responses are detected, there are
several possible remedies. First, it may be possible to go back
to original data sheets or experiment logbooks to locate the
correct information and use it to replace an incorrect score
value. If that is not possible, the invalid score value can be
deleted and replaced with a blank cell entry or a numerical code
that represents a missing value. It is also possible to select out
(i.e., temporarily or permanently remove) cases that have
impossible or unlikely scores.
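Consistency checks of this kind are easy to automate. The pandas sketch below (hypothetical variable names SMOKE and CIGS) flags participants who report being nonsmokers yet also report a nonzero number of cigarettes per day:

```python
import pandas as pd

# Hypothetical responses: SMOKE (1 = nonsmoker ... 4 = heavy smoker)
# and CIGS (cigarettes smoked on an average day).
df = pd.DataFrame({"idnum": [1, 2, 3],
                   "SMOKE": [1, 2, 1],
                   "CIGS":  [0, 5, 10]})

# A "nonsmoker" reporting daily cigarettes is an inconsistent response.
inconsistent = df[(df["SMOKE"] == 1) & (df["CIGS"] > 0)]
print(inconsistent["idnum"].tolist())  # cases to review or exclude
```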
4.5 Missing Values
Journal editors and funding agencies now expect more
systematic evaluation of missing values than was customary in
the past. SPSS has a Missing Values add-on procedure to assess
the amount and pattern of missing data and replace missing
scores with imputed values. Research proposals should include
a plan for identification and handling of missing data; research
reports should document the amount and pattern of missing data
and imputation procedures for replacement. Within the SPSS
program, an empty or blank cell in the data worksheet is
interpreted as a System Missing value. Alternatively, as
described earlier, the Missing Value column in the Variable
View worksheet in SPSS can be used to identify some specific
numerical codes as missing values and to use different
numerical codes to correspond to different types of missing
data. For example, for a variable such as verbal Scholastic
Aptitude Test (SAT) score, codes such as 888 = student did not
take the SAT and 999 = participant refused to answer could be
used to indicate different reasons for the absence of a valid
score.
Ideally, a dataset should have few missing values. A systematic
pattern of missing observations suggests possible bias in
nonresponse. For example, males might be less willing than
females to answer questions about negative emotions such as
depression; students with very low SAT scores may refuse to
provide information about SAT performance more often than
students with high SAT scores. To assess whether missing
responses on depression are more common among some groups
of respondents or are associated with scores on some other
variable, the researcher can set up a variable that is coded 1
(respondent answered a question about depression) versus 0
(respondent did not answer the question about depression).
Analyses can then be performed to see whether this variable,
which represents missing versus nonmissing data on one
variable, is associated with scores on any other variable. If the
researcher finds, for example, that a higher proportion of men
than women refused to answer a question about depression, it
signals possible problems with generalizability of results; for
example, conclusions about depression in men can be
generalized only to the kinds of men who are willing to answer
such questions.
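This missing-data indicator takes only a few lines to set up. A pandas sketch (GENDER and DEPRESSION are hypothetical variables):

```python
import numpy as np
import pandas as pd

# Hypothetical data: the depression score is missing for one respondent.
df = pd.DataFrame({"GENDER": [1, 1, 2, 2],
                   "DEPRESSION": [np.nan, 12.0, 9.0, 11.0]})

# 1 = respondent answered the depression question, 0 = did not answer.
df["DEP_ANSWERED"] = df["DEPRESSION"].notna().astype(int)

# Compare response rates across groups to look for systematic nonresponse.
rates = df.groupby("GENDER")["DEP_ANSWERED"].mean()
print(rates)
```

A chi-square test or a correlation between the indicator and other variables can then be used to evaluate whether nonresponse is systematic.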
It is useful to assess whether specific individual participants
have large numbers of missing scores; if so, data for these
participants could simply be deleted. Similarly, it may be useful
to see whether certain variables have very high nonresponse
rates; it may be necessary to drop these variables from further
analysis.
When analyses involving several variables (such as
computations of all possible correlations among a set of
variables) are performed in SPSS, it is possible to request either
listwise or pairwise deletion. For example, suppose that the
researcher wants to use the bivariate correlation procedure in
SPSS to run all possible correlations among variables named
V1, V2, V3, and V4. If listwise deletion is chosen, the data for
a participant are completely ignored when all these correlations
are calculated if the participant has a missing score on any one
of the variables included in the list. In pairwise deletion, each
correlation is computed using data from all the participants who
had nonmissing values on that particular pair of variables. For
example, suppose that there is one missing score on the variable
V1. If listwise deletion is chosen, then the data for the
participant who had a missing score on V1 are not used to
compute any of the correlations (between V1 and V2, V2 and
V3, V2 and V4, etc.). On the other hand, if pairwise deletion is
chosen, the data for the participant who is missing a score on
V1 cannot be used to calculate any of the correlations that
involve V1 (e.g., V1 with V2, V1 with V3, V1 with V4), but the
data from this participant will be used when correlations that
don’t require information about V1 are calculated (correlations
between V2 and V3, V2 and V4, and V3 and V4). When using
listwise deletion, the same number of cases and subset of
participants are used to calculate all the correlations for all
pairs of variables. When using pairwise deletion, depending on
the pattern of missing values, each correlation may be based on
a different N and a different subset of participants than those
used for other correlations.
The default for handling missing data in most SPSS procedures
is listwise deletion. The disadvantage of listwise deletion is that
it can result in a rather small N of participants, and the
advantage is that all correlations are calculated using the same
set of participants. Pairwise deletion can be selected by the
user, and it preserves the maximum possible N for the
computation of each correlation; however, both the number of
participants and the composition of the sample may vary across
correlations, and this can introduce inconsistencies in the values
of the correlations (as described in more detail by Tabachnick &
Fidell, 2007).
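The difference between the two deletion rules can be demonstrated directly. In the pandas sketch below (V1 through V3 are hypothetical variables), `.corr()` performs pairwise deletion by default, while calling `.dropna()` first yields the listwise result:

```python
import numpy as np
import pandas as pd

# Hypothetical scores with one missing value on V1.
df = pd.DataFrame({"V1": [1.0, np.nan, 3.0, 4.0, 5.0],
                   "V2": [2.0, 4.0, 6.0, 8.0, 10.0],
                   "V3": [5.0, 4.0, 3.0, 2.0, 2.0]})

# Pairwise deletion: each correlation uses every case with valid scores
# on that particular pair, so r(V2, V3) is based on all 5 cases.
pairwise = df.corr()

# Listwise deletion: drop any case with a missing score first, so every
# correlation is based on the same 4 complete cases.
listwise = df.dropna().corr()

print(pairwise.loc["V2", "V3"], listwise.loc["V2", "V3"])
```

The two values of r(V2, V3) differ slightly because they are based on different Ns (5 vs. 4), which is exactly the inconsistency described above.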
When a research report includes a series of analyses and each
analysis includes a different set of variables, the N of scores
that are included may vary across analyses (because different
people have missing scores on each variable). This can raise a
question in readers’ minds: Why do the Ns change across pages
of the research report? When there are large numbers of missing
scores, quite different subsets of data may be used in each
analysis, and this may make the results not comparable across
analyses. To avoid these potential problems, it may be preferable to remove, ahead of time, all the cases that have missing values on any of the variables that will be used, so that the same subset of participants (and the same N of scores) is used in all the analyses in a paper.
The default in SPSS is that cases with system missing values or
scores that are specifically identified as missing values are
excluded from computations, but this can result in a substantial
reduction in the sample size for some analyses. Another way to
deal with missing data is by substitution of a reasonable
estimated score value to replace each missing response. Missing
value replacement can be done in many different ways; for
example, the mean score on a variable can be substituted for all
missing values on that variable, or estimated values can be
calculated separately for each individual participant using
regression methods to predict that person’s missing score from
her or his scores on other, related variables. This is often called
imputation of missing data. Procedures for missing value
replacement can be rather complex (Schafer, 1997, 1999;
Schafer & Olsen, 1998).
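Mean substitution, the simplest of these replacement methods, can be sketched in one line (pandas; regression-based or multiple imputation requires dedicated procedures and is generally preferable):

```python
import numpy as np
import pandas as pd

# Hypothetical scores with one missing value.
scores = pd.Series([60.0, 70.0, np.nan, 80.0])

# Mean substitution: replace each missing score with the mean of the
# valid scores. Note that this shrinks the variance of the variable.
imputed = scores.fillna(scores.mean())
print(imputed.tolist())  # [60.0, 70.0, 70.0, 80.0]
```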
Tabachnick and Fidell (2007) summarized their discussion of
missing value replacement by saying that the seriousness of the
problem of missing values depends on “the pattern of missing
data, how much is missing, and why it is missing” (p. 62). They
also noted that the decision about how to handle missing data
(e.g., deletion of cases or variables, or estimation of scores to
replace missing values) is “a choice among several bad
alternatives” (p. 63). If some method of imputation or
estimation is employed to replace missing values, it is desirable
to repeat the analysis with the missing values omitted. Results
are more believable, of course, if they are essentially the same
with and without the replacement scores.
4.6 Empirical Example of Data Screening for Individual
Variables
In this textbook, variables are treated as either categorical or
quantitative (see Chapter 1 for a review of this distinction).
Different types of graphs and descriptive statistics are
appropriate for use with categorical versus quantitative
variables, and for that reason, data screening is discussed
separately for categorical and quantitative variables.
4.6.1 Frequency Distribution Tables
For both categorical (nominal) and quantitative (scale)
variables, a table of frequencies can be obtained to assess the
number of persons or cases who had each different score value.
These frequencies can be converted to proportions or
percentages. Examination of a frequency table quickly provides
answers to the following questions about each categorical
variable: How many groups does this variable represent? What
is the number of persons in each group? Are there any groups
with ns that are too small for the group to be used in analyses
that compare groups (e.g., analysis of variance, ANOVA)?
If a group with a very small number of cases (e.g., 10 or fewer) is
detected, the researcher needs to decide what to do with the
cases in that group. The group could be dropped from all
analyses, or if it makes sense to do so, the small n group could
be combined with one or more of the other groups (by recoding
the scores that represent group membership on the categorical
variable).
For both categorical and quantitative variables, a frequency
distribution also makes it possible to see if there are any
“impossible” score values. For instance, if the categorical
variable GENDER on a survey has just two response options, 1
= male and 2 = female, then scores of “3” and higher are not
valid or interpretable responses. Impossible score values should
be detected during proofreading, but examination of frequency
tables provides another opportunity to see if there are any
impossible score values on categorical variables.
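The same screening can be sketched outside SPSS; with pandas, `value_counts()` produces the equivalent of a frequency table and makes an impossible code stand out immediately (the GENDER data below are hypothetical):

```python
import pandas as pd

# Hypothetical GENDER responses; only 1 = male and 2 = female are valid.
gender = pd.Series([1, 2, 2, 1, 3, 2, 1, 2])

freq = gender.value_counts().sort_index()
pct = (freq / len(gender) * 100).round(1)
print(pd.DataFrame({"n": freq, "percent": pct}))
# The single case with code 3 is an impossible score to investigate.
```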
Figure 4.5 SPSS Menu Selections: <Analyze> → <Descriptive
Statistics> → <Frequencies>
Figure 4.6 SPSS Dialog Window for the Frequencies
Procedure
SPSS was used to obtain a frequency distribution table for the
variables GENDER and AGE. Starting from the data worksheet
view (as shown in Figure 4.1), the following menu selections
(as shown in Figure 4.5) were made: <Analyze> → <Descriptive
Statistics> → <Frequencies>.
The SPSS dialog window for the Frequencies procedure appears
in Figure 4.6. To specify which variables are included in the
request for frequency tables, the user points to the names of the
two variables (GENDER and AGE) and clicks the right-pointing
arrow to move these variable names into the right-hand window.
Output from this procedure appears in Figure 4.7.
In the first frequency table in Figure 4.7, there is one
“impossible” response for GENDER. The response alternatives
provided for the question about gender were 1 = male and 2 =
female, but a response of 3 appears in the summary table; this
does not correspond to a valid response option. In the second
frequency table in Figure 4.7, there is an extreme score (88
years) for AGE. This is a possible, but unusual, age for a
college student. As will be discussed later in this chapter,
scores that are extreme or unusual are often identified as
outliers and sometimes removed from the data prior to doing
other analyses.
4.6.2 Removal of Impossible or Extreme Scores
In SPSS, the Select Cases command can be used to remove
cases from a data file prior to other analyses. To select out the
participant with a score of “3” for GENDER and also the
participant with an age of 88, the following SPSS menu
selections (see Figure 4.8) would be used: <Data> → <Select
Cases>.
The initial SPSS dialog window for Select Cases appears in
Figure 4.9.
A logical “If” conditional statement can be used to exclude
specific cases. For example, to exclude the data for the person
who reported a value of “3” for GENDER, click the radio button
for “If condition is satisfied” in the first Select Cases dialog
window. Then, in the “Select Cases: If” window, type in the
logical condition “GENDER ~= 3.” The symbol “~=” represents
the logical comparison “not equal to”; thus, this logical “If”
statement tells SPSS to include the data for all participants
whose scores for GENDER are not equal to “3.” The entire line
of data for the person who reported “3” as a response to
GENDER is (temporarily) filtered out or set aside as a result of
this logical condition.
It is possible to specify more than one logical condition. For
example, to select cases that have valid scores on GENDER and
that do not have extremely high scores on AGE, we could set up
the logical condition “GENDER ~= 3 and AGE < 70,” as shown
in Figures 4.10 and 4.11. SPSS evaluates this logical statement
for each participant. Any participant with a score of “3” on
GENDER and any participant with a score greater than or equal
to 70 on AGE is excluded or selected out by this Select If
statement.
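The same selection rule translates directly into a boolean filter. A pandas sketch (hypothetical data) mirroring the SPSS condition “GENDER ~= 3 and AGE < 70”:

```python
import pandas as pd

# Hypothetical data with one impossible GENDER code and one extreme AGE.
df = pd.DataFrame({"GENDER": [1, 2, 3, 2, 1],
                   "AGE":    [19, 20, 21, 88, 22]})

# Keep only cases that satisfy both criteria; != plays the role of ~=.
selected = df[(df["GENDER"] != 3) & (df["AGE"] < 70)]
print(len(df), "->", len(selected))  # 5 -> 3
```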
When a case has been selected out using the Select Cases
command, a crosshatch mark appears over the case number for
that case (on the far left-hand side of the SPSS data worksheet).
Cases that are selected out can be temporarily filtered or
permanently deleted. In Figure 4.12, the SPSS data worksheet is
shown as it appears after the execution of the Data Select If
commands just described. Case number 11 (a person who had a
score of 3 on GENDER) and case number 15 (a person who had
a score of 88 on AGE) are now shown with a crosshatch mark
through the case number in the left-hand column of the data
worksheet. This crosshatch indicates that unless the Select If
condition is explicitly removed, the data for these 2 participants
will be excluded from all future analyses. Note that the original
N of 65 cases has been reduced to an N of 63 by this Data
Select If statement. If the researcher wants to restore
temporarily filtered cases to the sample, it can be done by
selecting the radio button for All Cases in the Select Cases
dialog window.
Figure 4.7 Output From the SPSS Frequencies Procedure
(Prior to Removal of “Impossible” Score Values)
4.6.3 Bar Chart for a Categorical Variable
For categorical or nominal variables, a bar chart can be used to
represent the distribution of scores graphically. A bar chart for
GENDER was created by making the following SPSS menu
selections (see Figure 4.13): <Graphs> → <Legacy Dialogs> →
<Bar [Chart]>.
Figure 4.8 SPSS Menu Selections for <Data> → <Select
Cases> Procedure
The first dialog window for the bar chart procedure appears in
Figure 4.14. In this example, the upper left box was clicked in
the Figure 4.14 dialog window to select the “Simple” type of
bar chart; the radio button was selected for “Summaries for
groups of cases”; then, the Define button was clicked. This
opened the second SPSS dialog window, which appears in
Figure 4.15.
Figure 4.9 SPSS Dialog Windows for the Select Cases
Command
Figure 4.10 Logical Criteria for Select Cases
NOTE: Include only persons who have a score for GENDER
that is not equal to 3 and who have a score for AGE that is less
than 70.
Figure 4.11 Appearance of the Select Cases Dialog Window
After Specification of the Logical “If” Selection Rule
To specify the form of the bar chart, use the cursor to highlight
the name of the variable that you want to graph, and click on
the arrow that points to the right to move this variable name
into the window under Category Axis. Leave the radio button
selection as the default choice, “Bars represent N of cases.”
This set of menu selections will yield a bar graph with one bar
for each group; for GENDER, this is a bar graph with one bar
for males and one for females. The height of each bar represents
the number of cases in each group. The output from this
procedure appears in Figure 4.16. Note that because the invalid
score of 3 has been selected out by the prior Select If statement,
this score value of 3 is not included in the bar graph in Figure
4.16. A visual examination of a set of bar graphs, one for each
categorical variable, is a useful way to detect impossible values.
The frequency table and bar graphs also provide a quick
indication of group size; in this dataset, there are N = 28 males
(score of 1 on GENDER) and N = 36 females (score of 2 on
GENDER). The bar chart in Figure 4.16, like the frequency
table in Figure 4.7, indicates that the male group had fewer
participants than the female group.
Figure 4.12 Appearance of SPSS Data Worksheet After the
Select Cases Procedure in Figure 4.11
4.6.4 Histogram for a Quantitative Variable
For a quantitative variable, a histogram is a useful way to assess
the shape of the distribution of scores. As described in Chapter
3, many analyses assume that scores on quantitative variables
are at least approximately normally distributed. Visual
examination of the histogram is a way to evaluate whether the
distribution shape is reasonably close to normal or to identify
the shape of a distribution if it is quite different from normal.
In addition, summary statistics can be obtained to provide
information about central tendency and dispersion of scores.
The mean (M), median, or mode can be used to describe central
tendency; the range, standard deviation (s or SD), or variance
(s2) can be used to describe variability or dispersion of scores.
A comparison of means, variances, and other descriptive
statistics provides the information that a researcher needs to
characterize his or her sample and to judge whether the sample
is similar enough to some broader population of interest so that
results might possibly be generalizable to that broader
population (through the principle of “proximal similarity,”
discussed in Chapter 1). If a researcher conducts a political poll
and finds that the range of ages of persons in the sample is from
age 18 to 22, for instance, it would not be reasonable to
generalize any findings from that sample to populations of
persons older than age 22.
Figure 4.13 SPSS Menu Selections for the <Graphs> →
<Legacy Dialogs> → <Bar [Chart]> Procedure
Figure 4.14 SPSS Bar Charts Dialog Window
Figure 4.15 SPSS Define Simple Bar Chart Dialog Window
Figure 4.16 Bar Chart: Frequencies for Each Gender Category
When the distribution shape of a quantitative variable is
nonnormal, it is preferable to assess central tendency and
dispersion of scores using graphic methods that are based on
percentiles (such as a boxplot, also called a box and whiskers
plot). Issues that can be assessed by looking at frequency tables,
histograms, or box and whiskers plots for quantitative scores
include the following:
1. Are there impossible or extreme scores?
2. Is the distribution shape normal or nonnormal?
3. Are there ceiling or floor effects? Consider a set of test
scores. If a test is too easy and most students obtain scores of
90% and higher, the distribution of scores shows a “ceiling
effect”; if the test is much too difficult, most students will
obtain scores of 10% and below, and this would be called a
“floor effect.” Either of these would indicate a problem with the
measurement, in particular, a lack of sensitivity to individual
differences at the upper end of the distribution (when there is a
ceiling effect) or the lower end of the distribution (when there
is a floor effect).
4. Is there a restricted range of scores? For many measures,
researchers know a priori what the minimum and maximum
possible scores are, or they have a rough idea of the range of
scores. For example, suppose that Verbal SAT scores can range
from 250 to 800. If the sample includes scores that range from
550 to 580, the range of Verbal SAT scores in the sample is
extremely restricted compared with the range of possible scores.
Generally, researchers want a fairly wide range of scores on
variables that they want to correlate with other variables. If a
researcher wants to “hold a variable constant”—for example, to
limit the impact of age on the results of a study by including
only persons between 18 and 21 years of age—then a restricted
range would actually be preferred.
The procedures for obtaining a frequency table for a
quantitative variable are the same as those discussed in the
previous section on data screening for categorical variables.
Distribution shape for a quantitative variable can be assessed by
examining a histogram obtained by making these SPSS menu
selections (see Figure 4.17): <Graphs> → <Legacy Dialogs> →
<Histogram>.
These menu selections open the Histogram dialog window
displayed in Figure 4.18. In this example, the variable selected
for the histogram was HR1 (baseline or Time 1 heart rate).
Placing a checkmark in the box next to Display Normal Curve
requests a superimposed smooth normal distribution function on
the histogram plot. To obtain the histogram, after making these
selections, click the OK button. The histogram output appears in
Figure 4.19. The mean, standard deviation, and N for HR1
appear in the legend below the graph.
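The same display can be computed outside SPSS. The sketch below (numpy/scipy, with simulated heart-rate data standing in for HR1) produces the histogram bin counts and the heights of the superimposed normal curve; passing these to a plotting library would render the figure:

```python
import numpy as np
from scipy import stats

# Simulated heart rates standing in for HR1 (hypothetical data).
rng = np.random.default_rng(0)
hr = rng.normal(loc=73.0, scale=8.0, size=65)

# Histogram bin counts, as in the SPSS histogram.
counts, edges = np.histogram(hr, bins=10)

# Height of the superimposed normal curve at each bin center, scaled so
# that its area matches the total count (as SPSS does).
centers = (edges[:-1] + edges[1:]) / 2
width = edges[1] - edges[0]
curve = stats.norm.pdf(centers, hr.mean(), hr.std(ddof=1)) * len(hr) * width

print(counts.sum(), counts, np.round(curve, 1))
```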
Figure 4.17 SPSS Menu Selections: <Graphs> → <Legacy
Dialogs> → <Histogram>
Figure 4.18 SPSS Dialog Window: Histogram Procedure
Figure 4.19 Histogram of Heart Rates With Superimposed
Normal Curve
An assumption common to all the parametric analyses covered
in this book is that scores on quantitative variables should be
(at least approximately) normally distributed. In practice, the
normality of distribution shape is usually assessed visually; a
histogram of scores is examined to see whether it is
approximately “bell shaped” and symmetric. Visual examination
of the histogram in Figure 4.19 suggests that the distribution
shape is not exactly normal; it is slightly asymmetrical.
However, this distribution of sample scores is similar enough to
a normal distribution shape to allow the use of parametric
statistics such as means and correlations. This distribution
shows a reasonably wide range of heart rates, no evidence of
ceiling or floor effects, and no extreme outliers.
There are many ways in which the shape of a distribution can
differ from an ideal normal distribution shape. For example, a
distribution is described as skewed if it is asymmetric, with a
longer tail on one side (see Figure 4.20 for an example of a
distribution with a longer tail on the right). Positively skewed
distributions similar to the one that appears in Figure 4.20 are
quite common; many variables, such as reaction time, have a
minimum possible value of 0 (which means that the lower tail of
the distribution ends at 0) but do not have a fixed limit at the
upper end of the distribution (and therefore the upper tail can be
quite long). (Distributions with many zeros pose special
problems; refer back to comments on Figure 1.4. Also see
discussions by Atkins & Gallop [2007] and G. King & Zeng
[2001]; options include Poisson regression [Cohen, Cohen,
West, & Aiken, 2003, chap. 13], and negative binomial
regression [Hilbe, 2011].)
Figure 4.20 Histogram of Positively Skewed Distribution
NOTE: Skewness index for this variable is +2.00.
A numerical index of skewness for a sample set of X scores denoted by (X1, X2, …, XN) can be calculated using the following formula:

skewness = Σ(Xi − Mx)³ / (N × s³),   (4.1)

where Mx is the sample mean of the X scores, s is the sample standard deviation of the X scores, and N is the number of scores in the sample.
For a perfectly normal and symmetrical distribution, skewness
has a value of 0. If the skewness statistic is positive, it indicates
that there is a longer tail on the right-hand/upper end of the
distribution (as in Figure 4.20); if the skewness statistic is
negative, it indicates that there is a longer tail on the lower end
of the distribution (as in Figure 4.21).
Figure 4.21 Histogram of a Negatively Skewed Distribution
NOTE: Skewness index for this variable is −2.00.
A distribution is described as platykurtic if it is flatter than an
ideal normal distribution and leptokurtic if it has a
sharper/steeper peak in the center than an ideal normal
distribution (see Figure 4.22). A numerical index of kurtosis can be calculated using the following formula:

kurtosis = Σ(Xi − Mx)⁴ / (N × s⁴),   (4.2)

where Mx is the sample mean of the X scores, s is the sample standard deviation of the X scores, and N is the number of scores in the sample.
Using Equation 4.2, the kurtosis for a normal distribution
corresponds to a value of 3; most computer programs actually
report “excess kurtosis”—that is, the degree to which the
kurtosis of the scores in a sample differs from the kurtosis
expected in a normal distribution. This excess kurtosis is given
by the following formula:

excess kurtosis = kurtosis − 3.   (4.3)
Figure 4.22 Leptokurtic and Platykurtic Distributions
SOURCE: Adapted from
http://www.murraystate.edu/polcrjlst/p660kurtosis.htm
A positive score for excess kurtosis indicates that the
distribution of scores in the sample is more sharply peaked than
in a normal distribution (this is shown as leptokurtic in Figure
4.22). A negative score for kurtosis indicates that the
distribution of scores in a sample is flatter than in a normal
distribution (this corresponds to a platykurtic distribution shape
in Figure 4.22). The value that SPSS reports as kurtosis
corresponds to excess kurtosis (as in Equation 4.3).
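The moment formulas in Equations 4.1 through 4.3 can be coded directly. A numpy sketch (note that SPSS itself applies small-sample adjustments to these formulas, so its reported values will differ slightly for small N):

```python
import numpy as np

def skewness(x):
    # Equation 4.1: sum of cubed deviations divided by N * s^3.
    x = np.asarray(x, dtype=float)
    m, s, n = x.mean(), x.std(ddof=1), len(x)
    return np.sum((x - m) ** 3) / (n * s ** 3)

def excess_kurtosis(x):
    # Equations 4.2 and 4.3: fourth-moment kurtosis minus 3.
    x = np.asarray(x, dtype=float)
    m, s, n = x.mean(), x.std(ddof=1), len(x)
    return np.sum((x - m) ** 4) / (n * s ** 4) - 3.0

symmetric = [1, 2, 3, 4, 5]
print(skewness(symmetric))      # 0.0 for a symmetric distribution
print(skewness([1, 1, 1, 10]))  # positive: longer tail on the right
```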
A normal distribution is defined as having skewness and
(excess) kurtosis of 0. A numerical index of skewness and
kurtosis can be obtained for a sample of data to assess the
degree of departure from a normal distribution shape.
Additional summary statistics for a quantitative variable such as
HR1 can be obtained from the SPSS Descriptives procedure by
making the following menu selections: <Analyze> →
<Descriptive Statistics> → <Descriptives>.
The menu selections shown in Figure 4.23 open the Descriptive
Statistics dialog box shown in Figure 4.24. The Options button
opens up a dialog box that has a menu with check boxes that
offer a selection of descriptive statistics, as shown in Figure
4.25. In addition to the default selections, the boxes for
skewness and kurtosis were also checked. The output from this
procedure appears in Figure 4.26. The upper panel in Figure
4.26 shows the descriptive statistics for scores on HR1 that
appeared in Figure 4.19; skewness and kurtosis for the sample
of scores on the variable HR1 were both fairly close to 0. The
lower panel shows the descriptive statistics for the artificially
generated data that appeared in Figures 4.20 (a set of positively
skewed scores) and 4.21 (a set of negatively skewed scores).
Figure 4.23 SPSS Menu Selections: <Analyze> →
<Descriptive Statistics> → <Descriptives>
Figure 4.24 Dialog Window for SPSS Descriptive Statistics
Procedure
Figure 4.25 Options for the Descriptive Statistics Procedure
Figure 4.26 Output From the SPSS Descriptive Statistics
Procedure for Three Types of Distribution Shape
NOTE: Scores for HR (from Figure 4.19) are not skewed; scores
for posskew (from Figure 4.20) are positively skewed; and
scores for negskew (from Figure 4.21) are negatively skewed.
It is possible to set up a statistical significance test (in the form of a z ratio) for skewness because SPSS also reports the standard error (SE) for this statistic:

z = skewness / SEskewness.   (4.4)
When the N of cases is reasonably large, the resulting z ratio
can be evaluated using the standard normal distribution; that is,
skewness is statistically significant at the α = .05 level (two-
tailed) if the z ratio given in Equation 4.4 is greater than 1.96 in
absolute value.
A z test can also be set up to test the significance of (excess) kurtosis:

z = kurtosis / SEkurtosis.   (4.5)
The tests in Equations 4.4 and 4.5 provide a way to evaluate
whether an empirical frequency distribution differs significantly
from a normal distribution in skewness or kurtosis. There are
formal mathematical tests to evaluate the degree to which an
empirical distribution differs from some ideal or theoretical
distribution shape (such as the normal curve). If a researcher
needs to test whether the overall shape of an empirical
frequency distribution differs significantly from normal, it can
be done by using the Kolmogorov-Smirnov or Shapiro-Wilk test
(both are available in SPSS). In most situations, visual
examination of distribution shape is deemed sufficient.
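These tests are available in scipy as well; a sketch on simulated data (scipy's `skewtest` and `kurtosistest` use variance-stabilized z statistics rather than the simple ratios in Equations 4.4 and 4.5, but they serve the same purpose):

```python
import numpy as np
from scipy import stats

# Simulated sample drawn from a normal distribution (hypothetical data).
rng = np.random.default_rng(1)
x = rng.normal(loc=73.0, scale=8.0, size=200)

z_skew, p_skew = stats.skewtest(x)      # z test of skewness
z_kurt, p_kurt = stats.kurtosistest(x)  # z test of (excess) kurtosis
sw_stat, sw_p = stats.shapiro(x)        # Shapiro-Wilk test of overall normality

print(round(p_skew, 3), round(p_kurt, 3), round(sw_p, 3))
```

For data actually drawn from a normal distribution, these p values would usually be nonsignificant.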
In general, empirical distribution shapes are considered
problematic only when they differ dramatically from normal.
Some earlier examples of drastically nonnormal distribution
shapes appeared in Figures 1.2 (a roughly uniform distribution)
and 1.3 (an approximately exponential or J-shaped distribution).
Multimodal distributions or very seriously skewed distributions
(as in Figure 4.20) may also be judged problematic. A
distribution that resembles the one in Figure 4.19 is often
judged close enough to normal shape.
4.7 Identification and Handling of Outliers
An outlier is an extreme score on either the low or the high end
of a frequency distribution of a quantitative variable. Many
different decision rules can be used to decide whether a
particular score is extreme enough to be considered an outlier.
When scores are approximately normally distributed, about
99.7% of the scores should fall within +3 and −3 standard deviations
of the sample mean. Thus, for normally distributed scores, z
scores can be used to decide which scores to treat as outliers.
For example, a researcher might decide to treat scores that
correspond to values of z that are less than −3.30 or greater than
+3.30 as outliers.
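This z-score decision rule is easy to apply directly; the following Python sketch flags outliers in a hypothetical sample using the 3.30 cutoff described above.

```python
import numpy as np

# Hypothetical sample: 99 roughly normal scores plus one extreme value.
rng = np.random.default_rng(0)
scores = np.append(rng.normal(loc=70.0, scale=10.0, size=99), 200.0)

# Convert raw scores to z scores using the sample mean and SD.
z = (scores - scores.mean()) / scores.std(ddof=1)

# Treat any score with |z| > 3.30 as an outlier (an arbitrary standard).
outliers = scores[np.abs(z) > 3.30]
print(outliers)
```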
Another method for the detection of outliers uses a graph called
a boxplot (or a box and whiskers plot). This is a nonparametric
exploratory procedure that uses medians and quartiles as
information about central tendency and dispersion of scores.
The following example uses a boxplot of scores on WEIGHT
separately for each gender group, as a means of identifying
potential outliers on WEIGHT. To set up this boxplot for the
distribution of weight within each gender group, the following
SPSS menu selections were made: <Graphs> → <Legacy
Dialogs> → <Box[plot]>.
This opens up the first SPSS boxplot dialog box, shown in
Figure 4.27. For this example, the box marked Simple was
clicked, and the radio button for “Summaries for groups of
cases” was selected to obtain a boxplot for just one variable
(WEIGHT) separately for each of two groups (male and female).
Clicking on the Define button opened up the second boxplot
dialog window, as shown in Figure 4.28. The name of the
quantitative dependent variable, WEIGHT, was placed in the top
window (as the name of the variable); the categorical or
“grouping” variable (GENDER) was placed in the window for
the Category Axis. Clicking the OK button generated the
boxplot shown in Figure 4.29, with values of WEIGHT shown
on the Y axis and the categories male and female shown on the
X axis.
Rosenthal and Rosnow (1991) noted that there are numerous
variations of the boxplot; the description here is specific to the
boxplots generated by SPSS and may not correspond exactly to
descriptions of boxplots given elsewhere. For each group, a
shaded box corresponds to the middle 50% of the distribution of
scores in that group. The line that bisects this box horizontally
(not necessarily exactly in the middle) represents the 50th
percentile (the median). The lower and upper edges of this
shaded box correspond to the 25th and 75th percentiles of the
weight distribution for the corresponding group (labeled on the
X axis). The 25th and 75th percentiles of each distribution of
scores, which correspond to the bottom and top edges of the
shaded box, respectively, are called the hinges. The distance
between the hinges (i.e., the difference between scores at the
75th and 25th percentiles) is called the H-spread. The vertical
lines that extend above and below the 75th and 25th percentiles
are called “whiskers,” and the horizontal lines at the ends of the
whiskers mark the “adjacent values.” The adjacent values are
the most extreme scores in the sample that lie between the hinge
and the inner fence (not shown on the graph; each inner fence
lies a distance from the corresponding hinge of 1.5 times the H-
spread). Generally, any data points that lie beyond these
adjacent values are considered outliers. In the boxplot, outliers
that lie outside the adjacent values are graphed using small
circles. Observations that are extreme outliers are shown as
asterisks (*).
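The hinges, H-spread, and fences can also be computed numerically. This Python sketch uses hypothetical weight scores (chosen to resemble the female group discussed below, including high values of 170 and 190) and Tukey's 1.5 × H-spread rule.

```python
import numpy as np

# Hypothetical weight scores for one group.
weights = np.array([110, 115, 118, 120, 125, 128, 130,
                    133, 135, 140, 170, 190.0])

# Hinges (25th and 75th percentiles) and the H-spread between them.
q1, median, q3 = np.percentile(weights, [25, 50, 75])
h_spread = q3 - q1

# Inner fences lie 1.5 H-spreads beyond the hinges.
lower_fence = q1 - 1.5 * h_spread
upper_fence = q3 + 1.5 * h_spread

# Scores beyond the fences are flagged as outliers.
outliers = weights[(weights < lower_fence) | (weights > upper_fence)]
print(outliers)
```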
Figure 4.27 SPSS Dialog Window for Boxplot Procedure
Figure 4.28 Define Simple Boxplot: Distribution of Weight
Separately by Gender
Figure 4.29 Boxplot of WEIGHT for Each Gender Group
Figure 4.29 indicates that the middle 50% of the distribution of
body weights for males was between about 160 and 180 lb, and
there was one outlier on WEIGHT (Participant 31 with a weight
of 230 lb) in the male group. For females, the middle 50% of
the distribution of weights was between about 115 and 135 lb,
and there were two outliers on WEIGHT in the female group;
Participant 50 was an outlier (with weight = 170), and
Participant 49 was an extreme outlier (with weight = 190). The
data record numbers that label the outliers in Figure 4.29 can be
used to look up the exact score values for the outliers in the
entire listing of data in the SPSS data worksheet or in Table 4.1.
In this dataset, the value of idnum (a variable that provides a
unique case number for each participant) was the same as the
SPSS line number or record number for all 65 cases. If the
researcher wants to exclude the 3 participants who were
identified as outliers in the boxplot of weight scores for the two
gender groups, it could be done by using the following Select If
statement: idnum ~= 31 and idnum ~= 49 and idnum ~=50.
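Outside SPSS, the same exclusion can be expressed as a filter on case numbers; this pandas sketch uses a small hypothetical data frame standing in for the SPSS worksheet.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of the SPSS data worksheet.
df = pd.DataFrame({
    "idnum":  [30, 31, 49, 50, 51],
    "weight": [165, 230, 190, 170, 120],
})

# Equivalent of the SPSS Select If:
# idnum ~= 31 and idnum ~= 49 and idnum ~= 50.
kept = df[~df["idnum"].isin([31, 49, 50])]
print(kept["idnum"].tolist())
```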
Parametric statistics (such as the mean, variance, and Pearson
correlation) are not particularly robust to outliers; that is, the
value of M for a batch of sample data can be quite different
when it is calculated with an outlier included than when an
outlier is excluded. This raises a problem: Is it preferable to
include outliers (recognizing that a single extreme score may
have a disproportionate impact on the outcome of the analysis)
or to omit outliers (understanding that the removal of scores
may change the outcome of the analysis)? It is not possible to
state a simple rule that can be uniformly applied to all research
situations. Researchers have to make reasonable judgment calls
about how to handle extreme scores or outliers. Researchers
need to rely on both common sense and honesty in making these
judgments.
When the total N of participants in the dataset is relatively
small, and when there are one or more extreme outliers, the
outcomes for statistical analyses that examine the relation
between a pair of variables can be quite different when outliers
are included versus excluded from an analysis. The best way to
find out whether the inclusion of an outlier would make a
difference in the outcome of a statistical analysis is to run the
analysis both including and excluding the outlier score(s).
However, making decisions about how to handle outliers post
hoc (after running the analyses of interest) gives rise to a
temptation: Researchers may wish to make decisions about
outliers based on the way the outliers influence the outcome of
statistical analyses. For example, a researcher might find a
significant positive correlation between variables X and Y when
outliers are included, but the correlation may become
nonsignificant when outliers are removed from the dataset. It
would be dishonest to report a significant correlation without
also explaining that the correlation becomes nonsignificant
when outliers are removed from the data. Conversely, a
researcher might also encounter a situation where there is no
significant correlation between scores on the X and Y variables
when outliers are included, but the correlation between X and Y
becomes significant when the data are reanalyzed with outliers
removed. An honest report of the analysis should explain that
outlier scores were detected and removed as part of the data
analysis process, and there should be a good rationale for
removal of these outliers. The fact that dropping outliers yields
the kind of correlation results that the researcher hopes for is
not, by itself, a satisfactory justification for dropping outliers.
It should be apparent that if researchers arbitrarily drop enough
cases from their samples, they can prune their data to fit just
about any desired outcome. (Recall the myth of King
Procrustes, who cut off the limbs of his guests so that they
would fit his bed; we must beware of doing the same thing to
our data.)
A less problematic way to handle outliers is to state a priori that
the study will be limited to a specific population—that is, to
specific ranges of scores on some of the variables. If the
population of interest in the blood pressure study is healthy
young adults whose blood pressure is within the normal range,
this a priori specification of the population of interest would
provide a justification for the decision to exclude data for
participants older than 30 years and for those with SBP above 140.
Another reasonable approach is to use a standard rule for
exclusion of extreme scores (e.g., a researcher might decide at
an early stage in data screening to drop all values that
correspond to z scores in excess of 3.3 in absolute value; this
value of 3.3 is an arbitrary standard).
Another method of handling extreme scores (trimming) involves
dropping the top and bottom scores (or some percentage of
scores, such as the top and bottom 1% of scores) from each
group. Winsorizing is yet another method of reducing the
impact of outliers: The most extreme score at each end of a
distribution is recoded to have the same value as the next
highest score.
Another way to reduce the impact of outliers is to apply a
nonlinear transformation (such as taking the base 10 logarithm
[log] of the original X scores). This type of data transformation
can bring outlier values at the high end of a distribution closer
to the mean.
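The three approaches just described (trimming, Winsorizing, and a log transformation) can be sketched in Python with hypothetical data containing one high outlier; scipy's winsorize function implements the recoding described above.

```python
import numpy as np
from scipy.stats.mstats import winsorize

# Hypothetical scores with one extreme high value.
scores = np.array([5, 7, 8, 9, 10, 11, 12, 13, 14, 90.0])

# Trimming: drop the top and bottom 10% of scores (one score each end).
trimmed = np.sort(scores)[1:-1]

# Winsorizing: recode the most extreme 10% at each end to the
# next-nearest remaining value.
winsorized = winsorize(scores, limits=(0.10, 0.10))

# Log transformation: pulls high-end outliers toward the rest of
# the distribution.
logged = np.log10(scores)
print(np.max(winsorized), trimmed.max())
```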
Whatever the researcher decides to do with extreme scores
(throw them out, Winsorize them, or modify the entire
distribution by taking the log of scores), it is a good idea to
conduct analyses with the outlier included and with the outlier
excluded to see what effect (if any) the decision about outliers
has on the outcome of the analysis. If the results are essentially
identical no matter what is done to outliers, then either
approach could be reported. If the results are substantially
different when different things are done with outliers, the
researcher needs to make a thoughtful decision about which
version of the analysis provides a more accurate and honest
description of the situation. In some situations, it may make
sense to report both versions of the analysis (with outliers
included and excluded) so that it is clear to the reader how the
extreme individual score values influenced the results. None of
these choices are ideal solutions; any of these procedures may
be questioned by reviewers or editors.
It is preferable to decide on simple exclusion rules for outliers
before data are collected and to remove outliers during the
preliminary screening stages rather than at later stages in the
analysis. It may be preferable to have a consistent rule for
exclusion (e.g., excluding all scores that show up as extreme
outliers in boxplots) rather than to tell a different story to
explain why each individual outlier received the specific
treatment that it did. The final research report should explain
what methods were used to detect outliers, identify the scores
that were identified as outliers, and make it clear how the
outliers were handled (whether extreme scores were removed or
modified).
4.8 Screening Data for Bivariate Analyses
There are three possible combinations of types of variables in
bivariate analysis. Both variables may be categorical, both may
be quantitative, or one may be categorical and the other
quantitative. Separate bivariate data-screening methods are
outlined for each of these situations.
4.8.1 Bivariate Data Screening for Two Categorical Variables
When both variables are categorical, it does not make sense to
compute means (the numbers serve only as labels for group
memberships); instead, it makes sense to look at the numbers of
cases within each group. When two categorical variables are
considered jointly, a cross-tabulation or contingency table
summarizes the number of participants in the groups for all
possible combinations of scores. For example, consider
GENDER (coded 1 = male and 2 = female) and smoking status
(coded 1 = nonsmoker, 2 = occasional smoker, 3 = frequent
smoker, and 4 = heavy smoker). A table of cell frequencies for
these two categorical variables can be obtained using the SPSS
Crosstabs procedure by making the following menu selections:
<Analyze> → <Descriptive Statistics> → <Crosstabs>.
These menu selections open up the Crosstabs dialog window,
which appears in Figure 4.30. The names of the row variable (in
this example, GENDER) and the column variable (in this
example, SMOKE) are entered into the appropriate boxes.
Clicking on the button labeled Cells opens up an additional
dialog window, shown in Figure 4.31, where the user specifies
the information to be presented in each cell of the contingency
table. In this example, both observed (O) and expected (E)
frequency counts are shown in each cell (see Chapter 8 in this
textbook to see how expected cell frequencies are computed
from the total number in each row and column of a contingency
table). Row percentages were also requested.
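Observed and expected cell frequencies can also be computed outside SPSS. This pandas/scipy sketch uses a small hypothetical gender-by-smoking dataset (codes follow the scheme above) and flags cells whose expected frequency falls below 5.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical coded responses (1 = male, 2 = female; smoking codes
# as in the text).
df = pd.DataFrame({
    "GENDER": [1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
    "SMOKE":  [1, 1, 2, 4, 1, 1, 1, 2, 4, 4],
})

# Observed cell frequencies (the contingency table).
observed = pd.crosstab(df["GENDER"], df["SMOKE"])

# Expected cell frequencies under independence, from row and
# column totals.
chi2, p, dof, expected = chi2_contingency(observed)

# Screening check: are any expected cell frequencies below 5?
print(expected.min() < 5)
```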
The observed cell frequencies in Figure 4.32 show that most of
the males and the females were nonsmokers (SMOKE = 1). In
fact, there were very few light smokers and heavy smokers (and
no moderate smokers). As a data-screening result, this has two
implications: If we wanted to do an analysis (such as a chi-
square test of association, as described in Chapter 8) to assess
how gender is related to smoking status, the data do not satisfy
an assumption about the minimum expected cell frequencies
required for the chi-square test of association. (For a 2 × 2
table, none of the expected cell frequencies should be less than
5; for larger tables, various sources recommend different
standards for minimum expected cell frequencies, but a
minimum expected frequency of 5 is recommended here.) In
addition, if we wanted to see how gender and smoking status
together predict some third variable, such as heart rate, the
numbers of participants in most of the groups (such as heavy
smoker/females with only N = 2 cases) are simply too small.
What would we hope to see in preliminary screening for
categorical variables? The marginal frequencies (e.g., number
of males, number of females; number of nonsmokers, light
smokers, and heavy smokers) should all be reasonably large.
That is clearly not the case in this example: There were so few
heavy smokers that we cannot judge whether heavy smoking is
associated with gender.
Figure 4.30 SPSS Crosstabs Dialog Window
Figure 4.31 SPSS Crosstabs: Information to Display in Cells
Figure 4.32 Cross-Tabulation of Gender by Smoking Status
NOTE: Expected cell frequencies less than 10 in three cells.
The 2 × 3 contingency table in Figure 4.32 has four cells with
expected cell frequencies less than 5. There are two ways to
remedy this problem. One possible solution is to remove groups
that have small marginal total Ns. For example, only 3 people
reported that they were “heavy smokers.” If this group of 3
people were excluded from the analysis, the two cells with the
lowest expected cell frequencies would be eliminated from the
table. Another possible remedy is to combine groups (but only
if this makes sense). In this example, the SPSS recode command
can be used to recode scores on the variable SMOKE so that
there are just two values: 1 = nonsmokers and 2 = light or heavy
smokers. The SPSS menu selections <Transform> → <Recode>
→ <Into Different Variables> appear in Figure 4.33; these menu
selections open up the Recode into Different Variables dialog
box, as shown in Figure 4.34.
In the Recode into Different Variables dialog window, the
existing variable SMOKE is identified as the numeric variable
by moving its name into the window headed Numeric Variable
→ Output Variable. The name for the new variable (in this
example, SMOKE2) is typed into the right-hand window under
the heading Output Variable, and if the button marked Change
is clicked, SPSS identifies SMOKE2 as the (new) variable that
will contain the recoded values that are based on scores for the
existing variable SMOKE. Clicking on the button marked Old
and New Values opens up the next SPSS dialog window, which
appears in Figure 4.35.
The Old and New Values dialog window that appears in Figure
4.35 can be used to enter a series of pairs of scores that show
how old scores (on the existing variable SMOKE) are used to
create new recoded scores (on the output variable SMOKE2).
For example, under Old Value, the value 1 is entered; under
New Value, the value 1 is entered; then, we click the Add
button to add this to the list of recode commands. People who
have a score of 1 on SMOKE (i.e., they reported themselves as
nonsmokers) will also have a score of 1 on SMOKE2 (this will
also be interpreted as “nonsmokers”). For the old values 2, 3,
and 4 on the existing variable SMOKE, each of these values
is associated with a score of 2 on the new variable SMOKE2. In
other words, people who chose responses 2, 3, or 4 on the
variable SMOKE (light, moderate, or heavy smokers) will be
coded 2 (smokers) on the new variable SMOKE2. Click
Continue and then OK to make the recode commands take
effect. After the recode command has been executed, a new
variable called SMOKE2 will appear in the far right-hand
column of the SPSS Data View worksheet; this variable will
have scores of 1 (nonsmoker) and 2 (smoker).
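The same recode-into-a-different-variable logic can be expressed as a value mapping; this pandas sketch uses a few hypothetical SMOKE responses.

```python
import pandas as pd

# Hypothetical responses on the original smoking variable.
df = pd.DataFrame({"SMOKE": [1, 2, 1, 3, 4, 1]})

# Recode into a new variable: 1 stays 1 (nonsmoker);
# 2, 3, and 4 all become 2 (smoker).
df["SMOKE2"] = df["SMOKE"].map({1: 1, 2: 2, 3: 2, 4: 2})
print(df["SMOKE2"].tolist())
```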
Figure 4.33 SPSS Menu Selection for the Recode Command
Figure 4.34 SPSS Recode Into Different Variables Dialog
Window
Figure 4.35 Old and New Values for the Recode Command
Figure 4.36 Crosstabs Using the Recoded Smoking Variable
(SMOKE2)
While it is possible to replace the scores on the existing
variable SMOKE with recoded values, it is often preferable to
put recoded scores into a new output variable. It is easy to lose
track of recodes as you continue to work with a data file. It is
helpful to retain the variable in its original form so that
information remains available.
After the recode command has been used to create a new
variable (SMOKE2), with codes for light, moderate, and heavy
smoking combined into a single code for smoking, the Crosstabs
procedure can be run using this new version of the smoking
variable. The contingency table for GENDER by SMOKE2
appears in Figure 4.36. Note that this new table has no cells
with minimum expected cell frequencies less than 5. Sometimes
this type of recoding results in reasonably large marginal
frequencies for all groups. In this example, however, the total
number of smokers in this sample is still small.
4.8.2 Bivariate Data Screening for One Categorical and One
Quantitative Variable
Data analysis methods that compare means of quantitative
variables across groups (such as ANOVA) have all the
assumptions that are required for univariate parametric
statistics:
1. Scores on quantitative variables should be normally
distributed.
2. Observations should be independent.
When means on quantitative variables are compared across
groups, there is one additional assumption: The variances of the
populations (from which the samples are drawn) should be
equal. This can be stated as a formal null hypothesis: H0: σ1² =
σ2² = … = σk².
Assessment of possible violations of Assumptions 1 and 2 was
described in earlier sections of this chapter. Graphic methods,
such as boxplots (as described in an earlier section of this
chapter), provide a way to see whether groups have similar
ranges or variances of scores.
The SPSS t test and ANOVA procedures provide a significance
test for the null assumption that the population variances are
equal (the Levene test). Usually, researchers hope that this
assumption is not violated, and thus, they usually hope that the
F ratio for the Levene test will be nonsignificant. However,
when the Ns in the groups are equal and reasonably large
(approximately N > 30 per group), ANOVA is fairly robust to
violations of the equal variance assumption (Myers & Well,
1991, 1995).
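For analysts working outside SPSS, the Levene test is also available in scipy; this minimal sketch compares two hypothetical groups with equal population variances.

```python
import numpy as np
from scipy.stats import levene

# Two hypothetical groups of 40 scores each (equal population SDs).
rng = np.random.default_rng(7)
group1 = rng.normal(loc=70.0, scale=10.0, size=40)
group2 = rng.normal(loc=75.0, scale=10.0, size=40)

# Levene's test of H0: the population variances are equal.
f_stat, p_value = levene(group1, group2)

# A nonsignificant result (p > .05) is consistent with homogeneity
# of variance.
print(p_value)
```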
Small sample sizes create a paradox with respect to the
assessment of violations of many assumptions. When N is small,
significance tests for possible violations of assumptions have
low statistical power, and violations of assumptions are more
problematic for the analysis. For example, consider a one-way
ANOVA with only 5 participants per group. With such a small
N, the test for heterogeneity of variance may be significant only
when the differences among sample variances are extremely
large; however, with such a small N, small differences among
sample variances might be enough to create problems in the
analysis. Conversely, in a one-way ANOVA with 50 participants
per group, quite small differences in variance across groups
could be judged statistically significant, but with such a large
N, only fairly large differences in group variances would be a
problem. Doing the preliminary test for heterogeneity of
variance when Ns are very large is something like sending out a
rowboat to see if the water is safe for the Queen Mary.
Therefore, it may be reasonable to use very small α levels, such
as α = .001, for significance tests of violations of assumptions
in studies with large sample sizes. On the other hand,
researchers may want to set α values that are large (e.g., α of
.20 or larger) for preliminary tests of assumptions when Ns are
small.
Tabachnick and Fidell (2007) provide extensive examples of
preliminary data screening for comparison of groups. These
generally involve repeating the univariate data-screening
procedures described earlier (to assess normality of distribution
shape and identify outliers) separately for each group and, in
addition, assessing whether the homogeneity of variance
assumption is violated.
It is useful to assess the distribution of quantitative scores
within each group and to look for extreme outliers within each
group. Refer back to Figure 4.29 to see an example of a boxplot
that identified outliers on WEIGHT within the gender groups. It
might be desirable to remove these outliers or, at least, to
consider how strongly they influence the outcome of a t test to
compare male and female mean weights. The presence of these
outlier scores on WEIGHT raises the mean weight for each
group; the presence of these outliers also increases the within-
group variance for WEIGHT in both groups.
4.8.3 Bivariate Data Screening for Two Quantitative Variables
Statistics that are part of the general linear model (GLM), such
as the Pearson correlation, require several assumptions. Suppose
we want to use Pearson’s r to assess the strength of the
relationship between two quantitative variables, X (diastolic
blood pressure [DBP]) and Y (systolic blood pressure [SBP]).
For this analysis, the data should satisfy the following
assumptions:
1. Scores on X and Y should each have a univariate normal
distribution shape.
2. The joint distribution of scores on X and Y should have a
bivariate normal shape (and there should not be any extreme
bivariate outliers).
3. X and Y should be linearly related.
4. The variance of Y scores should be the same at each level of
X (the homogeneity or homoscedasticity of variance
assumption).
The first assumption (univariate normality of X and Y) can be
evaluated by setting up a histogram for scores on X and Y and
by looking at values of skewness as described in Section 4.6.4.
The other two assumptions (a bivariate normal distribution
shape and a linear relation) can be assessed by examining an X,
Y scatter plot.
To obtain an X, Y scatter plot, the following menu selections
are used: <Graph> → <Scatter>.
From the initial Scatter/Dot dialog box (see Figure 4.37), the
Simple Scatter type of scatter plot was selected by clicking on
the icon in the upper left part of the Scatter/Dot dialog window.
The Define button was used to move on to the next dialog
window. In the next dialog window (shown in Figure 4.38), the
name of the predictor variable (DBP at Time 1) was placed in
the window marked X Axis, and the name of the outcome
variable (SBP at Time 1) was placed in the window marked Y
Axis. (Generally, if there is a reason to distinguish between the
two variables, the predictor or “causal” variable is placed on the
X axis in the scatter plot. In this example, either variable could
have been designated as the predictor.) The output for the
scatter plot showing the relation between scores on DBP and
SBP in Figure 4.39 shows a strong positive association between
DBP and SBP. The relation appears to be fairly linear, and there
are no bivariate outliers.
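The same screening can be done outside SPSS; this Python sketch generates hypothetical DBP and SBP values with a strong positive linear relation and computes Pearson's r (the variable names and values are illustrative, not the dataset used in the figures).

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical blood pressure readings for 60 participants.
rng = np.random.default_rng(3)
dbp = rng.normal(loc=80.0, scale=8.0, size=60)
sbp = 40.0 + 1.0 * dbp + rng.normal(0.0, 5.0, size=60)

# Pearson correlation between the two quantitative variables.
r, p = pearsonr(dbp, sbp)
print(round(r, 2))

# For visual screening, a scatter plot could be drawn with
# matplotlib (assumed available):
# import matplotlib.pyplot as plt
# plt.scatter(dbp, sbp); plt.xlabel("DBP"); plt.ylabel("SBP"); plt.show()
```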
The assumption of bivariate normal distribution is more
difficult to evaluate than the assumption of univariate
normality, particularly in relatively small samples. Figure 4.40
represents an ideal theoretical bivariate normal distribution.
Figure 4.41 is a bar chart that shows the frequencies of scores
with specific pairs of X, Y values; it corresponds approximately
to an empirical bivariate normal distribution (note that these
figures were not generated using SPSS). X and Y have a
bivariate normal distribution if Y scores are normally
distributed for each value of X (and vice versa). In either graph,
if you take any specific value of X and look at that cross section
of the distribution, the univariate distribution of Y should be
normal. In practice, even relatively large datasets (N > 200)
often do not have enough data points to evaluate whether the
scores for each pair of variables have a bivariate normal
distribution.
Several problems may be detectable in a bivariate scatter plot.
A bivariate outlier (see Figure 4.42) is a score that falls outside
the region in the X, Y scatter plot where most X, Y values are
located. In Figure 4.42, one individual has a body weight of
about 230 lb and SBP of about 110; this combination of score
values is “unusual” (in general, persons with higher body
weight tended to have higher blood pressure). To be judged a
bivariate outlier, a score does not have to be a univariate outlier
on either X or Y (although it may be). A bivariate outlier can
have a disproportionate impact on the value of Pearson’s r
compared with other scores, depending on its location in the
scatter plot. Like univariate outliers, bivariate outliers should
be identified and examined carefully. It may make sense in
some cases to remove bivariate outliers, but it is preferable to
do this early in the data analysis process, with a well-thought-
out justification, rather than late in the data analysis process,
because the data point does not conform to the preferred linear
model.
Figure 4.37 SPSS Dialog Window for the Scatter Plot
Procedure
Figure 4.38 Scatter Plot: Identification of Variables on X and
Y Axes
Heteroscedasticity or heterogeneity of variance refers to a
situation where the variance in Y scores is greater for some
values of X than for others. In Figure 4.43, the variance of Y
scores is much higher for X scores near 50 than for X values
less than 30.
This unequal variance in Y across levels of X violates the
assumption of homoscedasticity of variance; it also indicates
that prediction errors for high values of X will be systematically
larger than prediction errors for low values of X. Sometimes a
log transformation on a Y variable that shows heteroscedasticity
across levels of X can reduce the problem of unequal variance
to some degree. However, if this problem cannot be corrected,
then the graph that shows the unequal variances should be part
of the story that is reported, so that readers understand: It is not
just that Y tends to increase as X increases, as in Figure 4.43;
the variance of Y also tends to increase as X increases. Ideally,
researchers hope to see reasonably uniform variance in Y scores
across levels of X. In practice, the number of scores at each
level of X is often too small to evaluate the shape and variance
of Y values separately for each level of X.
Figure 4.39 Bivariate Scatter Plot for Diastolic Blood
Pressure (DIA1) and Systolic Blood Pressure (SYS1)
(Moderately Strong, Positive, Linear Relationship)
Figure 4.40 Three-Dimensional Representation of an Ideal
Bivariate Normal Distribution
SOURCE: Reprinted with permission from Hartlaub, B., Jones,
B. D., & Karian, Z. A., downloaded from
www2.kenyon.edu/People/hartlaub/MellonProject/images/bivari
ate17.gif, supported by the Andrew W. Mellon Foundation.
Figure 4.41 Three-Dimensional Histogram of an Empirical
Bivariate Distribution (Approximately Bivariate Normal)
SOURCE: Reprinted with permission from Dr. P. D. M.
MacDonald.
NOTE: Z1 and Z2 represent scores on the two variables, while
the vertical heights of the bars along the “frequency” axis
represent the number of cases that have each combination of
scores on Z1 and Z2. A clear bivariate normal distribution is
likely to appear only for datasets with large numbers of
observations; this example only approximates bivariate normal.
4.9 Nonlinear Relations
Students should be careful to distinguish between these two
situations: no relationship between X and Y versus a nonlinear
relationship between X and Y (i.e., a relationship between X
and Y that is not linear). An example of a scatter plot that
shows no relationship of any kind (either linear or curvilinear)
between X and Y appears in Figure 4.44. Note that as the value
of X increases, the value of Y does not either increase or
decrease.
In contrast, an example of a curvilinear relationship between X
and Y is shown in Figure 4.45. This shows a strong relationship
between X and Y, but it is not linear; as scores on X increase
from 0 to 30, scores on Y tend to increase, but as scores on X
increase between 30 and 50, scores on Y tend to decrease. An
example of a real-world research situation that yields results
similar to those shown in Figure 4.45 is a study that examines
arousal or level of stimulation (on the X axis) as a predictor of
task performance (on the Y axis). For example, suppose that the
score on the X axis is a measure of anxiety and the score on the
Y axis is a score on an examination. At low levels of anxiety,
exam performance is not very good: Students may be sleepy or
not motivated enough to study. At moderate levels of anxiety,
exam performance is very good: Students are alert and
motivated. At the highest levels of anxiety, exam performance
is not good: Students may be distracted, upset, and unable to
focus on the task. Thus, there is an optimum (moderate) level of
anxiety; students perform best at moderate levels of anxiety.
Figure 4.42 Bivariate Scatter Plot for Weight and Systolic
Blood Pressure (SYS1)
NOTE: Bivariate outlier can be seen in the lower right corner of
the graph.
Figure 4.43 Illustration of Heteroscedasticity of Variance
NOTE: Variance in Y is larger for values of X near 50 than for
values of X near 0.
Figure 4.44 No Relationship Between X and Y
Figure 4.45 Bivariate Scatter Plot: Inverse U-Shaped
Curvilinear Relation Between X and Y
If a scatter plot reveals this kind of curvilinear relation between
X and Y, Pearson’s r (or other analyses that assume a linear
relationship) will not do a good job of describing the strength of
the relationship and will not reveal the true nature of the
relationship. Other analyses may do a better job in this
situation. For example, Y can be predicted from both X and X²
(a function that includes an X² term describes a curve rather
than a straight line). (For details, see Aiken & West, 1991, chap. 5.)
Alternatively, students can be separated into high-, medium-,
and low-anxiety groups based on their scores on X, and a one-
way ANOVA can be performed to assess how mean Y test
scores differ across these three groups. However, recoding
scores on a quantitative variable into categories can result in
substantial loss of information, as pointed out by Fitzsimons
(2008).
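The first strategy (predicting Y from both X and X²) can be sketched numerically. This Python example uses hypothetical inverted-U data like the anxiety example above and compares the fit of a quadratic model with a purely linear one.

```python
import numpy as np

# Hypothetical inverted-U data: performance peaks at moderate
# levels of the predictor (e.g., anxiety near 30 on a 0-50 scale).
x = np.linspace(0.0, 50.0, 51)
y = -0.04 * (x - 30.0) ** 2 + 80.0 \
    + np.random.default_rng(5).normal(0.0, 1.0, size=51)

# Fit Y from X and X^2 (quadratic) versus from X alone (linear).
quad_coefs = np.polyfit(x, y, deg=2)
lin_coefs = np.polyfit(x, y, deg=1)

quad_resid = y - np.polyval(quad_coefs, x)
lin_resid = y - np.polyval(lin_coefs, x)

# For a curvilinear relation, the quadratic model leaves much
# smaller residual error than the straight line.
print(np.sum(quad_resid**2) < np.sum(lin_resid**2))
```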
Another possible type of curvilinear function appears in Figure
4.46. This describes a situation where responses on Y reach an
asymptote as X increases. After a certain point, further
increases in X scores begin to result in diminishing returns on
Y. For example, some studies of social support suggest that
most of the improvements in physical health outcomes occur
between no social support and low social support and that there
is little additional improvement in physical health outcomes
between low social support and higher levels of social support.
Here also, Pearson’s r or another statistic that assumes a linear
relation between X and Y may understate the strength of the
association and fail to reveal its true nature.
Figure 4.46 Bivariate Scatter Plot: Curvilinear Relation
Between X1 and Y
If a bivariate scatter plot of scores on two quantitative variables
reveals a nonlinear or curvilinear relationship, this nonlinearity
must be taken into account in the data analysis. Some nonlinear
relations can be turned into linear relations by applying
appropriate data transformations; for example, in
psychophysical studies, the log of the physical intensity of a
stimulus may be linearly related to the log of the perceived
magnitude of the stimulus.
4.10 Data Transformations
A linear transformation is one that changes the original X score
by applying only simple arithmetic operations (addition,
subtraction, multiplication, or division) using constants. If we
let b and c represent any two values that are constants within a
study, then the arithmetic function (X – b)/c is an example of a
linear transformation. The linear transformation that is most
often used in statistics is the one that involves the use of M as
the constant b and the sample standard deviation s as the
constant c: z = (X – M)/s. This transformation changes the mean
of the scores to 0 and the standard deviation of the scores to 1,
but it leaves the shape of the distribution of X scores
unchanged.
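The z-score transformation described above can be sketched in a few lines of Python (the scores here are hypothetical):

```python
import statistics

# A minimal sketch of the z-score linear transformation z = (X - M) / s,
# applied to a small set of hypothetical scores.
scores = [82, 90, 75, 88, 95, 70, 80]

m = statistics.mean(scores)   # sample mean M
s = statistics.stdev(scores)  # sample standard deviation s
z = [(x - m) / s for x in scores]

# The transformed scores have mean 0 and standard deviation 1, but the
# shape of the distribution (and each score's rank order) is unchanged.
print(statistics.mean(z), statistics.stdev(z))  # ~0.0 and ~1.0
```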
Sometimes, we want a data transformation that will change the
shape of a distribution of scores (or alter the nature of the
relationship between a pair of quantitative variables in a scatter
plot). Some data transformations for a set of raw X scores (such
as the square root of X or the log of X) tend to reduce positive
skewness and also to bring extreme outliers at the high end of
the distribution closer to the body of the distribution (see
Tabachnick & Fidell, 2007, chap. 4, for further discussion).
Thus, if a distribution is skewed, taking the log or square root
of scores sometimes makes the shape of the distribution more
nearly normal. For some variables (such as reaction time), it is
conventional to do this; log of reaction time is very commonly
reported (because reaction times tend to be positively skewed).
However, note that changing the scale of a variable (from heart
rate to log of heart rate) changes the meaning of the variable
and can make interpretation and presentation of results
somewhat difficult.
Figure 4.47 Illustration of the Effect of the Base 10 Log
Transformation
NOTE: In Figure 4.47, raw scores for body weight are plotted
on the X axis; raw scores for metabolic rate are plotted on the Y
axis. In Figure 4.48, both variables have been transformed using
base 10 log. Note that the log plot has more equal spaces among
cases (there is more information about the differences among
low-body-weight animals, and the outliers have been moved
closer to the rest of the scores). Also, when logs are taken for
both variables, the relation between them becomes linear. Log
transformations do not always create linear relations, of course,
but there are some situations where they do.
Sometimes a nonlinear transformation of scores on X and Y can
change a nonlinear relation between X and Y to a linear
relation. This is extremely useful, because the analyses included
in the family of methods called general linear models usually
require linear relations between variables.
A common and useful nonlinear transformation of X is the base
10 log of X, denoted by log10(X). When we find the base 10 log
of X, we find a number p such that 10ᵖ = X. For example, the
base 10 log of 1,000 is 3, because 10³ = 1,000. The exponent p
indicates order of magnitude.
Consider the graph shown in Figure 4.47. This is a graph of
body weight (in kilograms) on the X axis with mean metabolic
rate on the Y axis; each data point represents a mean body
weight and a mean metabolic rate for one species. There are
some ways in which this graph is difficult to read; for example,
all the data points for physically smaller animals are crowded
together in the lower left-hand corner of the scatter plot. In
addition, if you wanted to fit a function to these points, you
would need to fit a curve (rather than a straight line).
Figure 4.48 Graph Illustrating That the Relation Between
Base 10 Log of Body Weight and Base 10 Log of Metabolic
Rate Across Species Is Almost Perfectly Linear
SOURCE: Reprinted with permission from Dr. Tatsuo Motokawa.
Figure 4.48 shows the base 10 log of body weight and the base
10 log of metabolic rate for the same set of species as in Figure
4.47. Note that now, it is easy to see the differences among
species at the lower end of the body-size scale, and the relation
between the logs of these two variables is almost perfectly
linear.
In Figure 4.47, the tick marks on the X axis represented equal
differences in terms of kilograms. In Figure 4.48, the equally
spaced points on the X axis now correspond to equal spacing
between orders of magnitude (e.g., 10¹, 10², 10³, …); a one-
tick-mark change on the X axis in Figure 4.48 represents a
change from 10 to 100 kg or 100 to 1,000 kg or 1,000 to 10,000
kg. A cat weighs something like 10 times as much as a dove, a
human being weighs something like 10 times as much as a cat, a
horse about 10 times as much as a human, and an elephant about
10 times as much as a horse. If we take the log of body weight,
these log values (p = 1, 2, 3, etc.) represent these orders of
magnitude, 10ᵖ (10¹ for a dove, 10² for a cat, 10³ for a human,
and so on). If we graphed weights in kilograms using raw
scores, we would find a much larger difference between
elephants and humans than between humans and cats. The
log10(X) transformation yields a new way of scaling weight in
terms of p, the relative orders of magnitude.
When the raw X scores have a range that spans several orders of
magnitude (as in the sizes of animals, which vary from < 1 g up
to 10,000 kg), applying a log transformation reduces the
distance between scores on the high end of the distribution
much more than it reduces distances between scores on the low
end of the distribution. Depending on the original distribution
of X, outliers at the high end of the distribution of X are
brought “closer” by the log(X) transformation. Sometimes when
raw X scores have a distribution that is skewed to the right,
log(X) is nearly normal.
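A short Python sketch (with made-up weights loosely following the dove-to-elephant example) shows how the base 10 log compresses the high end of the scale far more than the low end:

```python
import math

# Hypothetical body weights (kg) spanning several orders of magnitude,
# loosely following the dove / cat / human / horse / elephant example.
weights = [0.1, 5, 70, 500, 5000]

logs = [math.log10(w) for w in weights]
# logs is roughly [-1.0, 0.70, 1.85, 2.70, 3.70]: equal steps in log
# units correspond to equal *ratios* (orders of magnitude) in raw kg.

# The raw gap between the two heaviest species dwarfs the gap between
# the two lightest; after the log transformation the gaps are comparable.
print(weights[-1] - weights[-2], weights[1] - weights[0])  # 4500 vs 4.9
print(round(logs[-1] - logs[-2], 2), round(logs[1] - logs[0], 2))  # 1.0 vs 1.7
```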
Some relations between variables (such as the physical
magnitude of a stimulus, e.g., a weight or a light source) and
subjective judgments (of heaviness or brightness) become linear
when log or power transformations are applied to the scores on
both variables.
Note that when a log transformation is applied to a set of scores
with a limited range of possible values (e.g., Likert ratings of 1,
2, 3, 4, 5), this transformation has little effect on the shape of
the distribution. However, when a log transformation is applied
to scores that vary across orders of magnitude (e.g., the highest
score is 10,000 times as large as the lowest score), the log
transformation may change the distribution shape substantially.
Log transformations tend to be much more useful for variables
where the highest score is orders of magnitude larger than the
smallest score; for example, maximum X is 100 or 1,000 or
10,000 times minimum X.
Other transformations that are commonly used involve power
functions—that is, replacing X with X², or more generally Xᶜ
(where c is some exponent, not necessarily an integer value), or
√X. For specific types of data (such as scores that represent
proportions, percentages, or correlations), other types of
nonlinear transformations are needed.
Usually, the goals that a researcher hopes to achieve through
data transformations include one or more of the following: to
make a nonnormal distribution shape more nearly normal, to
minimize the impact of outliers by bringing those values closer
to other values in the distribution, or to make a nonlinear
relationship between variables linear.
One argument against the use of nonlinear transformations has
to do with interpretability of the transformed scores. If we take
the square root of “number of times a person cries per week,”
how do we talk about the transformed variable? For some
variables, certain transformations are so common that they are
expected (e.g., psychophysical data are usually modeled using
power functions; measurements of reaction time usually have a
log transformation applied to them).
4.11 Verifying That Remedies Had the Desired Effects
Researchers should not assume that the remedies they use to try
to correct problems with their data (such as removal of outliers,
or log transformations) are successful in achieving the desired
results. For example, after one really extreme outlier is
removed, when the frequency distribution is graphed again,
other scores may still appear to be relatively extreme outliers.
After the scores on an X variable are transformed by taking the
natural log of X, the distribution of the natural log of X may
still be nonnormal. It is important to repeat data screening using
the transformed scores to make certain that the data
transformation had the desired effect. Ideally, the transformed
scores will have a nearly normal distribution without extreme
outliers, and relations between pairs of transformed variables
will be approximately linear.
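This re-screening step can be sketched as follows (hypothetical right-skewed scores; the skewness index used here is the simple mean-of-cubed-z-scores version, one of several common definitions):

```python
import math
import statistics

def skewness(xs):
    # Simple screening index: mean of cubed z-scores
    # (population-SD version; one of several common definitions).
    m = statistics.mean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

# Hypothetical right-skewed scores, e.g., reaction times with a long tail.
raw = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 8, 12, 20, 40, 100]
logged = [math.log(x) for x in raw]

# Re-screen AFTER transforming: check that the log actually reduced
# the positive skew instead of assuming that it did.
print(round(skewness(raw), 2), round(skewness(logged), 2))
```

If the transformed distribution is still badly skewed or still contains outliers, a different transformation (or a more robust analysis) may be needed.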
4.12 Multivariate Data Screening
Data screening for multivariate analyses (such as multiple
regression and multivariate analysis of variance) begins with
screening for each individual variable and bivariate data
screening for all possible pairs of variables as described in
earlier sections of this chapter. When multiple predictor or
multiple outcome variables are included in an analysis,
correlations among these variables are reported as part of
preliminary screening. More complex assumptions about data
structure will be reviewed as they arise in later chapters.
Complete data screening in multivariate studies requires careful
examination not just of the distributions of scores for each
individual variable but also of the relationships between pairs
of variables and among subsets of variables. It is possible to
obtain numerical indexes (such as Mahalanobis d) that provide
information about the degree to which individual scores are
multivariate outliers. Excellent examples of multivariate data
screening are presented in Tabachnick and Fidell (2007).
4.13 Reporting Preliminary Data Screening
Many journals in psychology and related fields use the style
guidelines published by the American Psychological
Association (APA, 2009). This section covers some of the basic
guidelines. All APA-style research reports should be double-
spaced and single-sided with at least 1-in. margins on each
page.
A Results section should report data screening and the data
analyses that were performed (including results that run counter
to predictions). Interpretations and discussion of implications of
the results are generally placed in the Discussion section of the
paper (except in very brief papers with combined
Results/Discussion sections).
Although null hypothesis significance tests are generally
reported, the updated fifth and sixth editions of the APA (2001,
2009) Publication Manual also call for the inclusion of effect-
size information and confidence intervals (CIs), wherever
possible, for all major outcomes. Include the basic descriptive
statistics that are needed to understand the nature of the results;
for example, a report of a one-way ANOVA should include
group means and standard deviations as well as F values,
degrees of freedom, effect-size information, and CIs.
Standard abbreviations are used for most statistics—for
example, M for mean and SD for standard deviation (APA,
2001, pp. 140–144). These should be in italic font (APA, 2001,
p. 101). Parentheses are often used when these are reported in
the context of a sentence, as in, “The average verbal SAT for
the sample was 551 (SD = 135).”
The sample size (N) or the degrees of freedom (df) should
always be included when reporting statistics. Often the df
values appear in parentheses immediately following the
statistic, as in this example: “There was a significant gender
difference in mean score on the Anger In scale, t(61) = 2.438, p
= .018, two-tailed, with women scoring higher on average than
men.” Generally, results are rounded to two decimal places,
except that p values are sometimes given to three decimal
places. It is more informative to report exact p values than to
make directional statements such as p < .05. If the printout
shows a p of .000, it is preferable to report p < .001 (the risk of
Type I error indicated by p is not really zero). When it is
possible for p values to be either one-tailed or two-tailed (for
the independent samples t test, for example), this should be
stated explicitly.
Tables and figures are often useful ways of summarizing a large
amount of information—for example, a list of t tests with
several dependent variables, a table of correlations among
several variables, or the results from multivariate analyses such
as multiple regression. See APA (2001, pp. 147–201) for
detailed instructions about the preparation of tables and figures.
(Tufte, 1983, presents wonderful examples of excellence and
awfulness in graphic representations of data.) All tables and
figures should be discussed in the text; however, the text should
not repeat all the information in a table; it should point out only
the highlights. Table and figure headings should be informative
enough to be understood on their own. It is common to denote
statistical significance using asterisks (e.g., * for p < .05, ** for
p < .01, and *** for p < .001), but these should be described by
footnotes to the table. Each column and row of the table should
have a clear heading. When there is not sufficient space to type
out the entire names for variables within the table, numbers or
abbreviations may be used in place of variable names, and this
should also be explained fully in footnotes to the table.
Horizontal rules or lines should be used sparingly within tables
(i.e., not between each row but only in the headings and at the
bottom). Vertical lines are not used in tables. Spacing should be
sufficient so that the table is readable.
In general, Results sections should include the following
information. For specific analyses, additional information may
be useful or necessary.
1. The opening sentence of each Results section should state
what analysis was done, with what variables, and to answer
what question. This sounds obvious, but sometimes this
information is difficult to find in published articles. An example
of this type of opening sentence is, “In order to assess whether
there was a significant difference between the mean Anger In
scores of men and women, an independent samples t test was
performed using the Anger In score as the dependent variable.”
2. Next, describe the data screening that was done to decide
whether assumptions were violated, and report any steps that
were taken to correct the problems that were detected. For
example, this would include examination of distribution shapes
using graphs such as histograms, detection of outliers using
boxplots, and tests for violations of homogeneity of variance.
Remedies might include deletion of outliers, data
transformations such as the log, or choice of a statistical test
that is more robust to violations of the assumption.
3. The next sentence should report the test statistic and the
associated exact p value; also, a statement whether or not it
achieved statistical significance, according to the predetermined
alpha level, should be included: “There was a significant gender
difference in mean score on the Anger In scale, t(61) = 2.438, p
= .018, two-tailed, with women scoring higher on average than
men.” The significance level can be given as a range (p < .05)
or as a specific obtained value (p = .018). For nonsignificant
results, any of the following methods of reporting may be used:
p > .05 (i.e., a statement that the p value on the printout was
larger than a preselected α level of .05), p = .38 (i.e., an exact
obtained p value), or just ns (an abbreviation for
nonsignificant). Recall that the p value is an estimate of the risk
of Type I error; in theory, this risk is never zero, although it
may be very small. Therefore, when the printout reports a
significance or p value of .000, it is more accurate to report it
as “p < .001” than as “p = .000.”
4. Information about the strength of the relationship should be
reported. Most statistics have an accompanying effect-size
measure. For example, for the independent samples t test,
Cohen’s d and η² are common effect-size indexes. For this
example, η² = t²/(t² + df) = (2.438)²/((2.438)² + 61) = .09.
Verbal labels may be used to characterize an effect-size
estimate as small, medium, or large. Reference books such as
Cohen’s (1988) Statistical Power Analysis for the Behavioral
Sciences suggest guidelines for the description of effect size.
5. Where possible, CIs should be reported for estimates. In this
example, the 95% CI for the difference between the sample
means was from .048 to .484.
6. It is important to make a clear statement about the nature of
relationships (e.g., the direction of the difference between group
means or the sign of a correlation). In this example, the mean
Anger In score for females (M = 2.36, SD = .484) was higher
than the mean Anger In score for males (M = 2.10, SD = .353).
Descriptive statistics should be included to provide the reader
with the most important information. Also, note whether the
outcome was consistent with or contrary to predictions; detailed
interpretation/discussion should be provided in the Discussion
section.
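The effect-size arithmetic in item 4 can be checked directly; the Cohen's d line uses the common approximation d = 2t/√df, which assumes roughly equal group sizes:

```python
import math

# Effect sizes for the reported t test: t(61) = 2.438.
t = 2.438
df = 61

# eta-squared = t^2 / (t^2 + df)
eta_squared = t ** 2 / (t ** 2 + df)
print(round(eta_squared, 2))  # 0.09, matching the value reported above

# Approximate Cohen's d from t (assumes roughly equal group sizes):
d = 2 * t / math.sqrt(df)
print(round(d, 2))  # about 0.62, a medium effect by Cohen's guidelines
```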
Many published studies report multiple analyses. In these
situations, it is important to think about the sequence.
Sometimes, basic demographic information is reported in the
section about participants in the Methods/Participants section of
the paper. However, it is also common for the first table in the
Results section to provide means and standard deviations for all
the quantitative variables and group sizes for all the categorical
variables. Preliminary analyses that examine the reliabilities of
variables are reported prior to analyses that use those variables.
It is helpful to organize the results so that analyses that examine
closely related questions are grouped together. It is also helpful
to maintain a parallel structure throughout the research paper.
That is, questions are outlined in the Introduction, the Methods
section describes the variables that are manipulated and/or
measured to answer those questions, the Results section reports
the statistical analyses that were employed to try to answer each
question, and the Discussion section interprets and evaluates the
findings relevant to each question. It is helpful to keep the
questions in the same order in each section of the paper.
Sometimes, a study has both confirmatory and exploratory
components (as discussed in Chapter 1). For example, a study
might include an experiment that tests the hypotheses derived
from earlier research (confirmatory), but it might also examine
the relations among variables to look for patterns that were not
predicted (exploratory). It is helpful to make a clear distinction
between these two types of results. The confirmatory Results
section usually includes a limited number of analyses that
directly address questions that were stated in the Introduction;
when a limited number of significance tests are presented, there
should not be a problem with inflated risk of Type I error. On
the other hand, it may also be useful to present the results of
other exploratory analyses; however, when many significance
tests are performed and no a priori predictions were made, the
results should be labeled as exploratory, and the author should
state clearly that any p values that are reported in this context
are likely to underestimate the true risk of Type I error.
Usually, data screening that leads to a reduced sample size and
assessment of measurement reliability are reported in the
Methods section prior to the Results section. In some cases, it
may be possible to make a general statement such as, “All
variables were normally distributed, with no extreme outliers”
or “Group variances were not significantly heterogeneous,” as a
way of indicating that assumptions for the analysis are
reasonably well satisfied.
An example of a Results section that illustrates some of these
points follows. The SPSS printout that yielded these numerical
results is not included here; it is provided in the Instructor
Supplement materials for this textbook.
Results
An independent samples t test was performed to assess whether
there was a gender difference in mean Anger In scores.
Histograms and boxplots indicated that scores on the dependent
variable were approximately normally distributed within each
group with only one outlier in each group. Because these
outliers were not extreme, these scores were retained in the
analysis. The Levene test showed a nonsignificant difference