Employee data


Analysis of Employee data of SPSS demo file.

Published in: Business

Vivek Kumar
Enrollment no – 09BS0002756

Lakshami Through Sarasawati

Summary: This study is an interpretation of employee data. It reveals that education level and job category are gender biased. Among all the employees, females are found to be less educated. It also reveals that a higher education level is needed for better jobs. People work to satisfy their different needs by earning money (Lakshami). As per the analysis, more educated (Sarasawati) people are getting better jobs. Hence we can conclude that it is not Lakshami vs. Sarasawati; instead it is "Lakshami Through Sarasawati".
________________________________________________________________________________________

1. Introduction of Data

The employee data has been interpreted and the results are explained in this report. The data has nine attributes: gender, birth date, education level (years), job category, current salary, beginning salary, months since hire, previous experience and minority classification. Two new attributes are derived from these nine: male (a binary value for gender, 0 or 1) and age (derived from date of birth).

2. Analysis

2.1. We need to test whether females are less educated than males, i.e. is education level gender biased? (Refer: Appendix I)

a. Hypothesis:
H0: μMale = μFemale (Null hypothesis: the mean education level is the same for males and females)
Ha: μMale ≠ μFemale (Alternative hypothesis: the mean education level is not the same for males and females)
Significance level α = 0.05 (i.e. rejection region: reject the null hypothesis if p-value ≤ 0.05)

b. Nature of data and appropriate statistical tool:
In this case, the attributes "education level" and "gender" need to be interpreted from the employee data. Here "education level" is a continuous variable and "gender" is a categorical variable.

From the Q-Q plot (Figure 1, Appendix I), the continuous variable "education level" is approximately normal, since the observed values in the Q-Q plot lie approximately on the expected values. The skewness and kurtosis of "education level" are -0.114 and -0.265, which is within the range generally accepted for treating the data as approximately normal and proceeding with an independent sample t-test. We also need to identify outliers for education level. A box plot was drawn for this purpose, but we did not identify any outlier for education level (number of years of education).

c. Independent sample t-test: Discussion of result
To proceed with the independent sample t-test (Appendix I), it is mandatory to check the variance of "education level" for males and females. In Levene's test for equality of variances (Table 2, Appendix I), the significance level is less than 0.05, i.e. the variances of education level for the two groups cannot be treated as equal. Since the variances are not equal, we need to read the significance level of the t-test under "Equal variances not assumed". That significance level is .000 (less than .05), and hence the null hypothesis is rejected. We can conclude that the mean education level for males and females is not equal.

The mean education levels for males and females are 14.43 and 12.37 years respectively (Table 1, Appendix I). Since the mean education level for females is lower than that for males, we can say that females are less educated than males.

2.2. We need to test whether more education gives a better job (Refer: Appendix II)

a. Hypothesis:
H0: μClerical = μCustodial = μManager (Null hypothesis: the mean education level is equal for all job categories)
Ha: Not all the means are equal (Alternative hypothesis)
Significance level α = 0.05 (i.e. rejection region: reject the null hypothesis if p-value ≤ 0.05)

b. Nature of data and appropriate statistical tool:
In this case, "education level" is a continuous variable and "job category" is a categorical variable with more than two categories (i.e. Custodial, Clerical and Manager).

The normality check for "education level" is explained in the previous section of this report, where education level was found to be approximately normal. Since the variable "job category" has more than two groups, we need to apply ANOVA instead of an independent sample t-test.

c. ANOVA: Discussion of result
The null hypothesis is rejected, since the ANOVA test gives F = 68.49 and p = .000 (which is less than .05). Rejecting the null hypothesis means that the mean education level is not equal across all job categories. We then calculated the mean education level for all three categories. The means for manager, clerical and custodial are 17.25, 12.87 and 10.19 years respectively (Table 4, Appendix II). This shows that the highest education level is required for managers and the lowest for custodial staff.

2.3. We need to test whether job category is gender biased (Refer: Appendix III)

a. Hypothesis:
H0: Job category is independent of gender
Ha: Job category is NOT independent of gender
Significance level α = 0.05 (i.e. rejection region: reject the null hypothesis if p-value ≤ 0.05)

b. Nature of data and appropriate statistical tool:
Here we need to test the relationship between two categorical variables, job category and gender. To test the relationship between two categorical variables we should use a chi-square test. In SPSS, the chi-square test can be run through Crosstabs. One more requirement must be checked before proceeding with the chi-square test: no expected frequency in the contingency table should be less than five.

c. Chi-square: Discussion of result
The result indicates a statistically significant relationship between job category and gender at the 0.05 significance level (chi-square with two degrees of freedom = 79.277, p = 0.000), so the null hypothesis of independence is rejected. No cell has an expected count below five (Table 6, Appendix III), so the requirement for the test is met.

3. Conclusion:

This study is an interpretation of employee data. It reveals that education level and job category are gender biased. Among all the employees, females are found to be less educated. It also reveals that a higher education level is needed for a better job.
Appendix I : Independent Sample t-test

Group Statistics
                            Gender    N     Mean    Std. Deviation   Std. Error Mean
Educational Level (years)   Male      258   14.43   2.979            .185
                            Female    216   12.37   2.319            .158
Table 1 : Group Statistics, from independent sample t-test

Table 2 : Independent sample t-test

Figure 1 : Q-Q plot for "Education Level" to check the normality
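The "Equal variances not assumed" comparison from section 2.1 can be approximately reproduced from the summary figures in Table 1 alone. The following is a minimal sketch, not the SPSS procedure itself: the function name `welch_t` is hypothetical, the inputs are the rounded values printed in Table 1, and the p-value uses a standard normal approximation to the t distribution (an assumption that is reasonable only because the degrees of freedom are in the hundreds).

```python
import math

# Summary statistics copied from Table 1 (rounded values as printed there).
male = {"n": 258, "mean": 14.43, "sd": 2.979}
female = {"n": 216, "mean": 12.37, "sd": 2.319}


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test from summary stats."""
    va = a["sd"] ** 2 / a["n"]  # squared standard error of group a's mean
    vb = b["sd"] ** 2 / b["n"]  # squared standard error of group b's mean
    t = (a["mean"] - b["mean"]) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (a["n"] - 1) + vb ** 2 / (b["n"] - 1))
    return t, df


t_stat, df = welch_t(male, female)

# Assumption: for df this large, the two-sided p-value is approximated with
# the standard normal distribution instead of the exact t distribution.
p_approx = math.erfc(abs(t_stat) / math.sqrt(2))

print(f"t = {t_stat:.3f}, df = {df:.1f}, approx. two-sided p = {p_approx:.2e}")
print("Reject H0" if p_approx <= 0.05 else "Fail to reject H0")
```

Because the inputs are rounded to two or three decimals, the statistic may differ slightly in the last digits from what SPSS reports on the raw data.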
Appendix II : ANOVA for education level and Job Category

ANOVA
Educational Level (years)
                  Sum of Squares   df    Mean Square   F        Sig.
Between Groups    498.852          1     498.852       68.495   .000
Within Groups     3437.615         472   7.283
Total             3936.466         473
Table 3 : ANOVA for Education level and Job Category

Educational Level (years)
Employment Category   Mean    N     Std. Deviation
Clerical              12.87   363   2.333
Custodial             10.19   27    2.219
Manager               17.25   84    1.612
Total                 13.49   474   2.885
Table 4 : Mean for Job category, from ANOVA

Appendix III : Cross Tab

Gender * Employment Category Crosstabulation
                             Employment Category
Gender                       Clerical   Custodial   Manager   Total
Female   Count               206        0           10        216
         Expected Count      165.4      12.3        38.3      216.0
         % within Gender     95.4%      .0%         4.6%      100.0%
Male     Count               157        27          74        258
         Expected Count      197.6      14.7        45.7      258.0
         % within Gender     60.9%      10.5%       28.7%     100.0%
Total    Count               363        27          84        474
         Expected Count      363.0      27.0        84.0      474.0
         % within Gender     76.6%      5.7%        17.7%     100.0%
Table 5 : Contingency table, from Cross Tab
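As a cross-check on section 2.2, a one-way ANOVA of education level across the three job categories can be rebuilt from the group sizes, means and standard deviations in Table 4. This is a sketch under stated assumptions: the function name `anova_from_summary` is hypothetical, the inputs are the rounded values from Table 4, and the p-value uses the closed form of the F distribution that holds only when the numerator has exactly 2 degrees of freedom (three groups).

```python
# (N, mean, std. deviation) per employment category, copied from Table 4.
groups = {
    "Clerical": (363, 12.87, 2.333),
    "Custodial": (27, 10.19, 2.219),
    "Manager": (84, 17.25, 1.612),
}


def anova_from_summary(groups):
    """One-way ANOVA F statistic computed from per-group summary statistics."""
    n_total = sum(n for n, _, _ in groups.values())
    grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total
    # Between-groups sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
    # Within-groups sum of squares, recovered from each group's variance.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


f_stat, df_b, df_w = anova_from_summary(groups)

# Closed-form upper tail of F(2, df_w); valid only because df_b == 2.
p_value = (1 + 2 * f_stat / df_w) ** (-df_w / 2)

print(f"F({df_b}, {df_w}) = {f_stat:.2f}, p = {p_value:.2e}")
print("Reject H0" if p_value <= 0.05 else "Fail to reject H0")
```

With three groups this gives 2 and 471 degrees of freedom and an F of roughly 165, which is larger than the 68.495 shown in Table 3. Table 3 lists a single between-groups degree of freedom, which is what a two-group comparison would produce, so the two tables may not describe the same grouping. The decision to reject the null hypothesis is the same either way.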
Chi-Square Tests
                     Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square   79.277(a)   2    .000
Likelihood Ratio     95.463      2    .000
N of Valid Cases     474
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 12.30.
Table 6 : Chi-square Test, from Cross tab
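The Pearson chi-square value and the minimum expected count can be recomputed from the observed counts in the Gender * Employment Category crosstabulation (Table 5). The sketch below is illustrative only: the variable names are hypothetical, and the p-value uses the fact that the chi-square upper tail with exactly 2 degrees of freedom equals exp(-x/2), so it would not carry over to tables of another size.

```python
import math

# Observed counts from Table 5; columns are Clerical, Custodial, Manager.
observed = {
    "Female": [206, 0, 10],
    "Male": [157, 27, 74],
}

row_totals = {gender: sum(counts) for gender, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
n_total = sum(row_totals.values())

# Expected count under independence: row total * column total / N.
expected = {
    gender: [row_totals[gender] * c / n_total for c in col_totals]
    for gender in observed
}

chi_square = sum(
    (o - e) ** 2 / e
    for gender in observed
    for o, e in zip(observed[gender], expected[gender])
)
dof = (len(observed) - 1) * (len(col_totals) - 1)
min_expected = min(e for row in expected.values() for e in row)

# Upper-tail probability of chi-square; this closed form holds only for dof == 2.
p_value = math.exp(-chi_square / 2)

print(f"chi-square = {chi_square:.3f}, df = {dof}, p = {p_value:.2e}")
print(f"minimum expected count = {min_expected:.2f}")
print("Reject H0" if p_value <= 0.05 else "Fail to reject H0")
```

The minimum expected count is checked alongside the statistic because the test in section 2.3 requires every expected frequency to be at least five.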
