# PART 1: THEORY

### Transcript

• 1. Exercise 1

Launch SPSS and open the data file Employees.sav.

Q1. Calculate frequency tables for the variables Employment Category, Minority Classification and Months since hire. Write down the answers to the following questions: How many managers are there in the sample? ______ What percentage of staff are female? ______ What percentage of staff were hired 72 months ago or less? (Hint: think carefully about which column of the table to use.) ______

Q2. Obtain descriptive statistics for the variables Current salary and Previous experience (months). In addition to the statistics supplied by default, request the Skewness statistic (a measure of the asymmetry of the distribution curve). Hint: use the Options... button. Write down the answers to the following questions: What mean value did you obtain for Previous experience? ______ Which of the two variables has the larger value for the skewness statistic? ______ ______

Q3. Perform a crosstabulation of Gender against Minority, requesting that total percentages are included in the cells. Write down the answers to the following questions: How many people are female and belong to a minority? ______ What percentage of people are male and belong to a minority? ______

Q4. Compute a new variable for the Current salary minus Beginning salary (call the new variable saldiff). Compare mean values of saldiff for each Gender. Which Gender has the highest mean saldiff? ______ What is the mean value of saldiff for that group? ______

Q5. Graph Previous experience (months) as a histogram. Before you click on OK, (1) select Display normal curve and (2) use the Titles button and enter your name as the title of the graph (it is not sufficient to add these using Microsoft Word later). Paste the figure below and write at least a three-sentence interpretation of the graph.
• 2. Q1. Recode Educational Level (years) (educ) into a new variable educ18 distinguishing '18 years or more' (Range: 18 through Highest) from the rest. When recoding, any missing data should be recoded as system-missing. Provide suitable variable and value labels. Now select (filter) cases to include only employees with a Current Salary (salary) of more than \$39999. Find out and write down the answers to the following questions: What percentage of these better paid workers are not "well educated"? ______% What is the maximum Beginning Salary (salbegin) in each of the two educational groups defined by educ18 for these better paid workers? (Hints: use Options in Compare Means > Means. If necessary, edit the pivot table to widen the columns.) 18+ years: \$_____ others: \$_____

Q2. IMPORTANT: Remember to select all cases before continuing, via Data => Select Cases. Use Crosstabs to obtain counts (only) for educ18 by gender by minority. Pivot the table so that you see a two-dimensional table showing only the male employees, with educ18 defining the columns and minority the rows. (Hint: if you don't obtain a pivoting tray at first, use the Pivot menu to obtain one.) Paste the resulting table here:

Q3. Obtain an interactive scatterplot of Current Salary (salary) on the y-axis, by Previous Experience (prevexp) on the x-axis, and distinguishing different values for minority using the Style box. Paste the figure below and write at least a three-sentence interpretation of the graph.
• 3. Exercise 2

PART 1: THEORY

Q1. Using examples from your own research or field of study, identify a situation in which you might use each of the following tests: a. Independent one-way ANOVA: b. Repeated measures one-way ANOVA:

Q2. Mauchly's test assesses whether: a. Data are normally distributed b. The variances in different groups are equal c. The assumption of sphericity has been met d. Group means differ

Q3. Why is it necessary to follow up significant F-tests with planned or post hoc comparisons? What is the advantage of using these specially designed tests instead of several normal t-tests?
• 4. PART 2: PRACTICE

Launch SPSS and open the data file Telemarketing.sav.

Assume that in an attempt to maximize profits, a telemarketing company is conducting an experiment to determine which of four scripted sales pitches generates the best revenue. 1500 different telemarketing calls are randomly assigned to one of the four scripts, and the resulting revenue for each call is recorded.

Q4. Run the appropriate F-test for this research design, with sales_pitch and revenue as your variables of interest. In the Options menu, request Descriptive statistics and a Homogeneity of variance test. In the Post Hoc menu, select Bonferroni.

a. Which pitch generated the greatest revenue on average? Paste the Descriptive statistics table from your SPSS output here or attach as a separate sheet.

b. Is the homogeneity of variance assumption satisfied for this test?

c. Use appropriate academic style to report the findings of the main ANOVA analysis (see PowerPoint slides to refresh your memory). Include the F-test statistic, appropriate df values, and significance level.

d. Examine the results of your post hoc analysis. Of all the paired comparisons shown, which was associated with the smallest difference in revenue? Was this comparison significant?
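As a cross-check on what the F-test in Q4 computes, the following is a minimal sketch of a one-way between-groups ANOVA in plain Python. The function name `one_way_anova` and the toy revenue figures are assumptions made for illustration; they are not taken from Telemarketing.sav, and the sketch does not compute the p-value, the homogeneity test, or the Bonferroni comparisons that SPSS reports.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]

    # Between-groups sum of squares: group size times squared distance
    # of each group mean from the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-groups sum of squares: squared distance of each score
    # from its own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical revenue per call for three made-up scripts.
    toy = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
    f_stat, df_b, df_w = one_way_anova(toy)
    print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The two df values returned here are the ones that go in parentheses when reporting the result, for example F(df_between, df_within).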
• 5. PART 1: THEORY

Q1. In two sentences, state the purpose of a chi-square test.

Q2. List three assumptions that should be met in order to use Pearson's χ². 1. 2. 3.

Q3. Using an example from your own research or field of study, identify two categorical variables that you suspect might be associated and provide a null and alternative (non-directional) hypothesis. Assume both variables meet the requirements identified in Question 2. Variable 1= Variable 2= H0= H1=

Q4. Why do we calculate effect size measures in addition to χ² values? Name two effect size measures that can be used to supplement chi-square analysis and indicate the circumstances under which you would use each one.
• 6. Exercise 3

PART 2: PRACTICE

Launch SPSS and open the data file 1991 U.S. General Social Survey.sav.

Assume that you are a psychologist with a research interest in the factors associated with self-reported happiness amongst Americans.

Q5. Run a chi-square analysis with region (row variable) and happy_level (column variable) as your variables of interest (Analyze => Descriptives => Crosstabs). In the Cells menu, ask for Row, Column, and Total percentages. In the Statistics menu request Chi-Square, Phi/V, and Risk.

Q5a. Paste the resulting contingency table from your SPSS output here or attach as a separate sheet. According to this data, which region of the U.S. had the highest proportion of respondents who were 'very happy' in 1991?

Q5b. Was the overall association between region and happiness significant? Report the values of χ², df, the significance level of p, and the correct effect size measure here: χ²= df= p< effect size=

Q6. Use the recode function (Transform => Recode => Into Different Variable) to change the happy_level variable into a dichotomy with two categories. Name the new variable happy and recode 'very happy' and 'pretty happy' to a value of 1, and 'not very happy' to a value of 2. Give these values labels in 'Data View'. Run a frequency analysis for your new variable (Analyze => Descriptives => Frequencies). Paste a frequency table here or attach as a separate sheet.

Q7. Run a chi-square analysis with usintl (row variable) and your new variable, happy (column variable) (Analyze => Descriptives => Crosstabs). In the Cells menu, ask for Row, Column, and Total percentages. In the Statistics menu request Chi-Square, Phi/V, and Risk.

Q7a. Use appropriate academic style to report your findings below (see PowerPoint slides to refresh your memory). Write in full sentences, and include the χ² test statistic, df, and significance level. Interpret your findings using the best measure of effect size for this analysis. Also use row percentages for interpreting your results.
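To make the quantities requested in Q5b and Q7a concrete, here is a minimal sketch of how Pearson's chi-square, its degrees of freedom, and Cramér's V are computed from a table of observed counts. The function name `chi_square` and the 2×2 counts are hypothetical and are not drawn from the General Social Survey file; no continuity correction is applied, and the p-value is left to SPSS.

```python
from math import sqrt


def chi_square(table):
    """Return (chi2, df, cramers_v) for a table of observed counts.

    `table` is a list of rows, each row a list of cell counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    n_rows, n_cols = len(row_totals), len(col_totals)
    df = (n_rows - 1) * (n_cols - 1)
    # Cramér's V; for a 2x2 table its magnitude equals phi.
    cramers_v = sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))
    return chi2, df, cramers_v


if __name__ == "__main__":
    # Hypothetical counts: rows are two groups, columns are happy / not happy.
    chi2, df, v = chi_square([[10, 20], [20, 10]])
    print(f"chi2({df}) = {chi2:.2f}, Cramér's V = {v:.2f}")
```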
• 8. Exercise 4

PART 1: THEORY

Q1. In two sentences, state the purpose of a correlation analysis.

Q2. List three assumptions that should be met in order to use Pearson's r. 1. 2. 3.

Q3. Using an example from your own research or field of study, identify two variables that you suspect might be correlated and provide a null and alternative (directional) hypothesis. Assume both variables meet the requirements identified in Question 2. Variable 1: Variable 2: Null Hypothesis: Alternative Hypothesis:

Q4. If a correlation analysis yields non-significant results (i.e., p > .05), can you conclude that there is no relationship between the variables under examination? Provide 2 reasons to support your answer.
• 9. PART 2: PRACTICE

Launch SPSS and open the data file Band.sav.

Assume that you are a record executive, and that you are interested in determining whether the number of flyers and free web downloads you offer to promote your bands are related to the number of CD sales.

Q5a. Produce two separate scatterplot graphs (Graph => Scatterplot => Simple Scatter) to illustrate the relationship between flyers (X-axis) and sales (Y-axis) and the relationship between web (X-axis) and sales (Y-axis). Give each graph a title. Paste both graphs here or attach as a separate sheet.

Q5b. In one or two sentences, describe the pattern of points shown in each graph. Indicate whether you think the relationship represented appears to be positive, negative, or null. Relationship between flyers and sales: Relationship between web and sales:

Q6a. Run a bivariate correlation analysis (Analyze => Correlate => Bivariate). Include the variables flyers, web, and sales in the correlation matrix. Paste the matrix here or attach as a separate sheet.

Q6b. Which variable is most strongly correlated with CD sales? According to Cohen's (1988) rule of thumb, what is the magnitude of this correlation (Hint: small, moderate, or large)?

Q7. Report results for both of the correlations you plotted in Question 5a. Use appropriate academic style (see PowerPoint slides to refresh your memory). Remember to identify both variables and to provide the r-value and significance level associated with each test.
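The correlation coefficient requested in Q6a and the rule of thumb mentioned in Q6b can be sketched as follows. The function names `pearson_r` and `cohen_label` and the toy numbers are assumptions for illustration and do not come from Band.sav. The cut-offs of .10, .30, and .50 are Cohen's commonly cited benchmarks; the "negligible" label for values below .10 is an added convenience, not part of the exercise.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cohen_label(r):
    """Label the magnitude of r with the .10 / .30 / .50 benchmarks."""
    size = abs(r)  # The sign gives direction, not strength.
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "moderate"
    if size >= 0.1:
        return "small"
    return "negligible"  # Assumed label for values under .10.


if __name__ == "__main__":
    # Hypothetical flyers handed out and CDs sold for five made-up bands.
    flyers = [10, 20, 30, 40, 50]
    sales = [12, 25, 31, 38, 55]
    r = pearson_r(flyers, sales)
    print(f"r = {r:.2f} ({cohen_label(r)})")
```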
• 10. Exercise 5

PART 1: THEORY

Q1. In two or three sentences describe the difference between systematic and unsystematic variation.

Q2. Using examples from your own research or field of study, identify a situation in which you might use each of the following tests: a. One sample t-test: b. Paired t-test: c. Independent t-test:

Q4. What do we mean when we conclude that a t-test is 'significant'?
• 11. PART 2: PRACTICE

Launch SPSS and open the data file Pollution.sav.

Assume that you are an environmental lobbyist interested in assessing whether the introduction of a new waste removal system at a large chemical factory has had an impact on groundwater contamination levels at 12 test sites in the area. You compare contamination levels from the same 12 locations based on samples collected one month before, and one month after, the system was introduced.

Q5a. Run the appropriate t-test for this research design, with pollute_before and pollute_after as your variables of interest (Analyze => Compare Means). Report the means and standard deviations for each variable. Pollute_before Mean= SD= Pollute_after Mean= SD=

Q5b. Was there a significant difference between the average groundwater contamination levels measured before and after the waste removal system was installed? Use appropriate academic style to report your findings below (see PowerPoint slides to refresh your memory). Indicate what type of test you used, and report the t-test statistic, df, and significance level. Interpret your findings using the means and standard deviations reported above. (1 bonus point if you provide a measure of effect size.)

Launch SPSS and open the data file Rubbish.sav.

Assume that you are a social anthropologist who is interested in examining the gender division of household tasks. You ask 60 different participants (30 married women and 30 married men) to record the number of times they take out the rubbish in a week.

Q6. Generate a bar graph to illustrate the mean number of times that husbands and wives take out the rubbish (Graph => Bar => Simple => Define). Use spouse as the Category axis and rubbish to represent the bars (Hint: instead of using N to represent the bars, select 'other statistic' to represent the mean). Paste the resulting bar graph here or attach as a separate sheet.

Q6b. Run the appropriate t-test for this research design, with spouse and rubbish as your variables of interest (Analyze => Compare Means). Use appropriate academic style to report your findings below (see PowerPoint slides to refresh your memory). Indicate the type of test you used, the t-test statistic, df, and significance level. Interpret your findings using means and standard deviations.
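For a before/after design like the one in Q5, where the same sites are measured twice, the test statistic is built from the per-site differences. The sketch below shows that arithmetic; the function name `paired_t` and the four made-up readings are assumptions, not values from Pollution.sav. The effect size returned is the mean difference divided by the standard deviation of the differences, which is one common choice among several and is included here only as an assumption about what the bonus point might accept.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t, df, d) for paired measurements.

    t  = mean difference / standard error of the differences
    df = number of pairs - 1
    d  = mean difference / SD of the differences (one possible effect size)
    """
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # Sample SD, denominator n - 1.
    t_stat = mean_diff / (sd_diff / sqrt(n))
    return t_stat, n - 1, mean_diff / sd_diff


if __name__ == "__main__":
    # Hypothetical contamination readings at four made-up sites.
    before = [5.0, 6.0, 7.0, 8.0]
    after = [4.0, 4.0, 6.0, 6.0]
    t_stat, df, d = paired_t(before, after)
    print(f"t({df}) = {t_stat:.2f}, d = {d:.2f}")
```

The Rubbish.sav design in Q6 compares two separate groups of people, so it calls for a different test than the one sketched here.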
• 12. Exercise 6

Question 1: What is the main purpose of regression? What types of dependent and independent variables are required for regression? What is the difference between bivariate and multiple regression?

Question 2: The equation used for bivariate regression is y = a + bx + e. Explain each component of this equation (what it is and what it means).

Question 3: We want to determine whether fear of crime impacts the level of protection behaviour. Using the Fear of Crime dataset (Fear-of-Crime.sav), carry out three separate bivariate regressions. Use protection behaviour (protect) as your DV for all three regression tests. Your three IVs are fear at night (fearnigr), fear during the day (feardayr), and total fear of crime (feartotr, which is the sum of all 24 items, day and night). Be careful to choose the right variables; they have been recoded so that a high level of fear = a high value. You would then have the following regression equations:

protect = a + b (fearnigr)

protect = a + b (feardayr)

protect = a + b (feartotr)

Before you proceed with the regression, analyse the univariate distributions and compute a Correlation Matrix between all variables. Also, make some scatterplots to explore the data. For each of the three regression tests, interpret and explain the results obtained for the significance tests, b, and R squared. Fill in the blanks for the above three equations. Remember that a is the value of y when x = 0 (it is the value of the constant in the regression table results produced by SPSS).

Question 4: Open the Fear-of-Crime.sav data set and carry out the following analyses. You want to explain the total amount of protection behaviour. Your dependent variable is PROTECT. Assume that you have the following predictor variables:

1. Total extent of problems in neighbourhood: PROBLEM; sum of all V03xx.

2. Fear of Crime at Night: FEARNIGR: 0 (lowest) thru 36 (highest).

3. Political Orientation: V35POLIT, 1 = left.

4. Sex: V29SEX, 0 = male, 1 = female (recode first).

5. Age: V30Age.

• 13. Question 1: Specify a hypothesis for each predictor variable and write a short explanation.

Question 2: Carry out a multiple regression analysis. Interpret and explain the results obtained for the significance tests, b, and R squared. State which variable(s) are most influential in the prediction of the variation of the dependent variable.

Dataset: employees2.sav. Data for a discrimination case against women and minorities in a US Midwestern bank.

Exercise: You want to analyze the impact of various factors on employees' beginning salary. Use Gender, Minority Classification, Educational Level, Employment Category, and Previous Experience as predictors of Beginning Salary. (Provide some tables and write 3-5 sentences for points a, b, c, and d.)

a) Examine the assumptions on IV and DV

b) Establish which variable(s) have the most impact on Beginning Salary

c) Develop a regression model for predicting Beginning Salaries

d) Write a short report on your model
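For the bivariate equation y = a + bx + e from Exercise 6, the least-squares values of a and b and the resulting R squared can be computed by hand. The sketch below is one way to do that, assuming a hypothetical helper named `fit_line` and made-up scores rather than the Fear-of-Crime data. It does not produce the significance tests that SPSS reports, and it covers only the one-predictor case, not the multiple regression models.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares fit of y = a + b*x. Returns (a, b, r_squared)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)

    b = sxy / sxx              # Slope: change in y per one-unit change in x.
    a = my - b * mx            # Intercept: predicted y when x = 0.
    r_squared = sxy ** 2 / (sxx * syy)  # Share of variance in y explained.
    return a, b, r_squared


if __name__ == "__main__":
    # Hypothetical fear scores (x) and protection scores (y).
    fear = [1.0, 2.0, 3.0, 4.0]
    protect = [2.0, 4.0, 5.0, 4.0]
    a, b, r2 = fit_line(fear, protect)
    print(f"protect = {a:.2f} + {b:.2f} * fear, R squared = {r2:.3f}")
    # The residual e for each case is the observed y minus a + b*x.
    residuals = [yi - (a + b * xi) for xi, yi in zip(fear, protect)]
    print(residuals)
```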