
# Data analysis

### Data analysis

1. 1. DATA ANALYSIS, Dr. Ajay Pandit, September 14, 2010
2. 2. Data Analysis <ul><li>Data Analysis involves three stages: </li></ul><ul><li>Testing association between variables </li></ul><ul><li>Determining the degree of association between the variables </li></ul><ul><li>Estimating the values of the variables </li></ul>
3. 3. Identifying the technique <ul><li>The choice of technique depends largely on the scale of measurement of the variables, i.e. nominal, ordinal, interval or ratio. </li></ul>
4. 4. Bi-variate Analysis

   | Dependent variable | Independent: Nominal | Independent: Interval/Ratio |
   | --- | --- | --- |
   | Nominal | Chi-square (χ²) | Discriminant analysis |
   | Interval/Ratio | ANOVA | Regression / correlation |
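The selection grid above can be read as a simple lookup from the two measurement scales to a technique. The following is only a minimal sketch of that reading; the names `TECHNIQUES` and `pick_technique` and the scale labels are illustrative assumptions, not part of the slides.

```python
# Keys are (dependent scale, independent scale), following the slide's grid.
# The labels "nominal" and "interval/ratio" are assumed spellings.
TECHNIQUES = {
    ("nominal", "nominal"): "chi-square test",
    ("nominal", "interval/ratio"): "discriminant analysis",
    ("interval/ratio", "nominal"): "ANOVA",
    ("interval/ratio", "interval/ratio"): "regression / correlation",
}


def pick_technique(dependent_scale, independent_scale):
    """Return the bivariate technique for a (dependent, independent) scale pair."""
    key = (dependent_scale.lower(), independent_scale.lower())
    if key not in TECHNIQUES:
        raise ValueError(f"no technique listed for scales {key}")
    return TECHNIQUES[key]


# MBA major (nominal) explained by undergraduate degree (nominal).
print(pick_technique("nominal", "nominal"))
```

The grid only covers the four combinations shown on the slide, so any other pair raises an error rather than guessing.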
5. 5. CHI-SQUARE (χ²) <ul><li>The technique uses data arranged in a contingency table to determine whether two classifications of a population of nominal data are statistically independent. </li></ul><ul><li>This test can also be interpreted as a comparison of two or more populations. </li></ul>
6. 6. Example <ul><li>The demand for an MBA program’s optional courses and majors is quite variable year over year. </li></ul><ul><li>The research hypothesis is that the academic background of the students (i.e. their undergrad degrees) affects their choice of major. </li></ul><ul><li>A random sample of data on last year’s MBA students was collected and summarized in a contingency table … </li></ul>
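7. 7. Example: The Data

   | UG Degree \ MBA Major | Accounting | Finance | Marketing | Total |
   | --- | --- | --- | --- | --- |
   | BA | 31 | 13 | 16 | 60 |
   | BEng | 8 | 16 | 7 | 31 |
   | BBA | 12 | 10 | 17 | 39 |
   | Other | 10 | 5 | 7 | 22 |
   | Total | 61 | 44 | 47 | 152 |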
8. 8. Example <ul><li>We are interested in determining whether or not the academic background of the students affects their choice of MBA major. Thus our research hypothesis is: </li></ul><ul><li>H₁: The two variables are dependent. </li></ul><ul><li>Our null hypothesis, then, is: </li></ul><ul><li>H₀: The two variables are independent. </li></ul>
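9. 9. Example <ul><li>In this case, our test statistic is: </li></ul>

   $$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$$

   <ul><li>(where k is the number of cells in the contingency table, i.e. rows × columns, and f_i and e_i are the observed and expected frequencies of cell i) </li></ul><ul><li>Our rejection region is: </li></ul>

   $$\chi^2 > \chi^2_{\alpha,\nu}$$

   <ul><li>where the number of degrees of freedom is ν = (r–1)(c–1) </li></ul>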
10. 10. Example <ul><li>In order to calculate our χ² test statistic, we need to calculate the expected frequencies for each cell… </li></ul><ul><li>The expected frequency of the cell in row i and column j is: </li></ul>

    $$e_{ij} = \frac{\text{Row } i \text{ total} \times \text{Column } j \text{ total}}{\text{Sample size}}$$
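The expected-frequency formula above can be applied to every cell at once. The sketch below uses the observed counts from the MBA example; the function name `expected_frequencies` and the list-of-rows layout are illustrative assumptions.

```python
# Observed counts from the MBA example (columns: Accounting, Finance, Marketing).
observed = [
    [31, 13, 16],  # BA
    [8, 16, 7],    # BEng
    [12, 10, 17],  # BBA
    [10, 5, 7],    # Other
]


def expected_frequencies(table):
    """Expected count per cell: row total x column total / sample size."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)
# BEng x Marketing: row 2, column 3 in the slides' numbering.
print(round(expected[1][2], 2))  # 9.59
```

Python indexes from zero, so the slides' cell (2, 3) is `expected[1][2]` here.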
11. 11. Contingency Table Set-up…
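12. 12. Example <ul><li>Compute expected frequencies… </li></ul>

    $$e_{ij} = \frac{\text{Row } i \text{ total} \times \text{Column } j \text{ total}}{\text{Sample size}}$$

    | Undergrad Degree \ MBA Major | Accounting | Finance | Marketing | Total |
    | --- | --- | --- | --- | --- |
    | BA | 31 | 13 | 16 | 60 |
    | BEng | 8 | 16 | 31 × 47 / 152 | 31 |
    | BBA | 12 | 10 | 17 | 39 |
    | Other | 10 | 5 | 7 | 22 |
    | Total | 61 | 44 | 47 | 152 |

    <ul><li>e₂₃ = (31)(47)/152 = 9.59; compare this to f₂₃ = 7. </li></ul>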
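13. 13. Example <ul><li>We can now compare observed with expected frequencies… </li></ul>

    Each cell shows the observed frequency with the expected frequency in parentheses.

    | Undergrad Degree \ MBA Major | Accounting | Finance | Marketing |
    | --- | --- | --- | --- |
    | BA | 31 (24.08) | 13 (17.37) | 16 (18.55) |
    | BEng | 8 (12.44) | 16 (8.97) | 7 (9.59) |
    | BBA | 12 (15.65) | 10 (11.29) | 17 (12.06) |
    | Other | 10 (8.83) | 5 (6.37) | 7 (6.80) |

    <ul><li>and calculate our test statistic: </li></ul>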
14. 14. Example <ul><li>We compare χ² = 14.70 with: </li></ul>

    $$\chi^2_{\alpha,\nu} = \chi^2_{.05,(4-1)(3-1)} = \chi^2_{.05,6} = 12.5916$$

    <ul><li>Since our test statistic falls into the rejection region, we reject </li></ul><ul><li>H₀: The two variables are independent. </li></ul><ul><li>in favor of </li></ul><ul><li>H₁: The two variables are dependent. </li></ul><ul><li>That is, there is evidence of a relationship between undergrad degree and MBA major. </li></ul>
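The whole test on the MBA data can be reproduced in a few lines. This is a minimal sketch in plain Python: the critical value 12.5916 is taken directly from the slide rather than computed, and the function names are illustrative assumptions.

```python
# Observed counts from the MBA example (columns: Accounting, Finance, Marketing).
observed = [
    [31, 13, 16],  # BA
    [8, 16, 7],    # BEng
    [12, 10, 17],  # BBA
    [10, 5, 7],    # Other
]

# Critical value for alpha = .05 and 6 degrees of freedom, as quoted on the slide.
CRITICAL_VALUE = 12.5916


def chi_square_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, f in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            stat += (f - e) ** 2 / e
    return stat


def degrees_of_freedom(table):
    """(rows - 1) x (columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)


stat = chi_square_statistic(observed)
df = degrees_of_freedom(observed)
reject_h0 = stat > CRITICAL_VALUE
print(f"chi-square = {stat:.2f}, df = {df}, reject H0: {reject_h0}")
# chi-square = 14.70, df = 6, reject H0: True
```

A statistics library could supply the critical value or a p-value instead of the hard-coded constant; the constant is used here only to mirror the slide.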
15. 15. Required Condition – Rule of Five… <ul><li>In a contingency table where one or more cells have expected values of less than 5, we need to combine rows or columns to satisfy the rule of five. </li></ul><ul><li>Note: combining rows or columns changes the number of degrees of freedom, which must be recalculated. </li></ul>
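The rule of five can be checked mechanically before running the test. The sketch below uses a hypothetical table (not from the slides) in which two small categories are merged; the helper names `violates_rule_of_five` and `merge_rows` are illustrative assumptions.

```python
def expected_frequencies(table):
    """Expected count per cell: row total x column total / sample size."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def violates_rule_of_five(table):
    """True if any cell has an expected frequency below 5."""
    return any(e < 5 for row in expected_frequencies(table) for e in row)


def merge_rows(table, i, j):
    """Return a new table with row j added into row i and row j removed."""
    merged = [a + b for a, b in zip(table[i], table[j])]
    return [merged if idx == i else row
            for idx, row in enumerate(table) if idx != j]


def degrees_of_freedom(table):
    """(rows - 1) x (columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Hypothetical counts: the last two categories are too small on their own.
sparse = [
    [20, 15, 25],
    [18, 22, 20],
    [3, 2, 4],
    [4, 3, 2],
]

combined = merge_rows(sparse, 2, 3)
print(violates_rule_of_five(sparse), degrees_of_freedom(sparse))
print(violates_rule_of_five(combined), degrees_of_freedom(combined))
```

Merging the two small rows removes the violation and drops the degrees of freedom from 6 to 4, which is the adjustment the note on the slide refers to. Which categories are sensible to combine is a judgment about the data, not something the code decides.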
16. 16. ANOVA

    | Type of Measurement | Differences between three or more independent groups |
    | --- | --- |
    | Interval or ratio | One-way ANOVA |
17. 17. SAMPLE RESULTS OF PACKAGE SALES
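18. 18. ONE WAY ANOVA

    $$
    \begin{aligned}
    SST &= \sum_{i=1}^{k} \sum_{j=1}^{n} \left( X_{ij} - \bar{X}_{GM} \right)^2, & df &= nk - 1 \\
    SSB &= n \sum_{i=1}^{k} \left( \bar{X}_{i} - \bar{X}_{GM} \right)^2, & df &= k - 1 \\
    SSW &= \sum_{i=1}^{k} \sum_{j=1}^{n} \left( X_{ij} - \bar{X}_{i} \right)^2, & df &= k(n - 1)
    \end{aligned}
    $$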
19. 19.

    $$
    \begin{aligned}
    SST &= (3 - 4)^2 + (5 - 4)^2 + \dots + (2 - 4)^2 = 97 \\
    SSB &= 7 \left[ (3.6 - 4)^2 + (6 - 4)^2 + (3 - 4)^2 + (3.3 - 4)^2 \right] \approx 40 \\
    SSW &= SST - SSB = 97 - 40 = 57
    \end{aligned}
    $$
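The sums of squares can be computed directly from raw group data. The package sales figures are not recoverable from this transcript, so the sketch below uses small hypothetical groups. The function name `one_way_anova` is an illustrative assumption, and the F ratio is the usual mean-square ratio, which the slides do not spell out.

```python
def one_way_anova(groups):
    """Return sums of squares, degrees of freedom and F for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Total: every observation against the grand mean.
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # Between: group means against the grand mean, weighted by group size.
    ssb = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))
    # Within: every observation against its own group mean.
    ssw = sum((x - m) ** 2
              for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ssb / df_between) / (ssw / df_within)
    return {
        "SST": sst, "SSB": ssb, "SSW": ssw,
        "df_between": df_between, "df_within": df_within, "F": f_ratio,
    }


# Hypothetical data: k = 3 groups with n = 4 observations each.
groups = [
    [3, 5, 4, 4],
    [6, 7, 5, 6],
    [2, 3, 1, 2],
]
result = one_way_anova(groups)
print(f"SST = {result['SST']:.1f}, SSB = {result['SSB']:.1f}, "
      f"SSW = {result['SSW']:.1f}, F = {result['F']:.1f}")
```

For these hypothetical numbers SST = 38, SSB = 32 and SSW = 6, so SST = SSB + SSW holds, matching the SSW = SST - SSB step above. With equal group sizes the weighted between-groups sum reduces to the slide's form with a single factor n.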
20. 20. PACKAGE SALES DESCRIPTIVE STATISTICS
21. 21. PACKAGE SALES ANOVA SUMMARY TABLE
22. 22. <ul><li>THANKS </li></ul>