1. 1. Reliability Testing for Item Analysis. Dr. Debdulal Dutta Roy, Ph.D., Psychology Research Unit, Indian Statistical Institute, 203, B.T. Road, Kolkata - 700108. E-mail: ddroy@isical.ac.in. Fax: 91-33-25776680. Tel (o): 91-33-2575 3454. Presentation at the Department of Clinical Psychology, Ram Chandra University, Chennai, 21.2.09
2. 2. Reliability Analysis <ul><li>Reliability refers to the consistency of scores obtained by the same persons when reexamined with the same test on different occasions, or with different sets of equivalent items, or under other variable examining conditions (Anastasi, 1990). It indicates the extent to which individual differences in test scores are attributable to “true” differences in the characteristics under consideration and the extent to which they are attributable to chance errors. The reliability of a test is the proportion of total score variance that is true variance, which arises from the specific characteristic under consideration; the remainder is error variance, which arises from factors irrelevant to the present situation. </li></ul>
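The variance definition above can be written compactly. This is a sketch in classical test theory notation; the symbols $\sigma_X^2$ (observed score variance), $\sigma_T^2$ (true variance) and $\sigma_E^2$ (error variance) are assumptions introduced here and do not appear on the slide.

```latex
% Observed variance splits into true and error variance (errors assumed uncorrelated with true scores)
\sigma_X^2 = \sigma_T^2 + \sigma_E^2
% Reliability is the share of observed variance that is true variance
r_{tt} = \frac{\sigma_T^2}{\sigma_X^2} = 1 - \frac{\sigma_E^2}{\sigma_X^2}
```

Under this reading, a coefficient of 0.80 would mean that 80% of the observed score variance is attributed to true differences and 20% to chance errors.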
3. 3. Reliability Analysis
4. 4. Test-Retest Reliability <ul><li>Johnny, Johnny, / Yes, Papa / Eating sugar? / No, Papa / Telling lies? / No, Papa / Open your mouth / O Ha! Ha! Ha! </li></ul>
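5. 5. Test retest reliability (8 months interval) Test-retest reliability of Reading Motivation Questionnaire (n=72) using both t-test and correlation coefficients
<table>
<tr><th>Subscale</th><th>Item</th><th>First session Mean</th><th>First session SD</th><th>Second session Mean</th><th>Second session SD</th><th>t-ratio (df=71)</th><th>Correlation</th></tr>
<tr><td>Application</td><td>1A</td><td>0.75</td><td>0.44</td><td>0.65</td><td>0.48</td><td>1.41</td><td></td></tr>
<tr><td></td><td>6B</td><td>0.78</td><td>0.42</td><td>0.72</td><td>0.45</td><td>0.81</td><td></td></tr>
<tr><td></td><td>7A</td><td>0.96</td><td>0.2</td><td>0.92</td><td>0.28</td><td>1</td><td></td></tr>
<tr><td></td><td>9B</td><td>0.85</td><td>0.36</td><td>0.94</td><td>0.23</td><td>-1.84</td><td></td></tr>
<tr><td></td><td>17A</td><td>0.71</td><td>0.46</td><td>0.69</td><td>0.46</td><td>0.19</td><td></td></tr>
<tr><td></td><td>21A</td><td>0.78</td><td>0.42</td><td>0.93</td><td>0.26</td><td>-2.99</td><td></td></tr>
<tr><td></td><td>Total</td><td>4.82</td><td>1.14</td><td>4.86</td><td>1.01</td><td>-0.23</td><td>0.706</td></tr>
<tr><td>Knowledge</td><td>1B</td><td>0.25</td><td>0.44</td><td>0.35</td><td>0.44</td><td>-1.41</td><td></td></tr>
<tr><td></td><td>2A</td><td>0.94</td><td>0.23</td><td>0.94</td><td>0.23</td><td>0</td><td></td></tr>
<tr><td></td><td>4A</td><td>0.85</td><td>0.36</td><td>0.89</td><td>0.32</td><td>-0.83</td><td></td></tr>
<tr><td></td><td>14B</td><td>0.89</td><td>0.32</td><td>0.86</td><td>0.35</td><td>0.53</td><td></td></tr>
<tr><td></td><td>16B</td><td>0.79</td><td>0.41</td><td>0.63</td><td>0.49</td><td>2.43</td><td></td></tr>
<tr><td></td><td>19A</td><td>0.74</td><td>0.44</td><td>0.83</td><td>0.38</td><td>-1.41</td><td></td></tr>
<tr><td></td><td>Total</td><td>4.46</td><td>1.05</td><td>4.5</td><td>1.01</td><td>-0.27</td><td>0.93**</td></tr>
</table>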
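The two statistics reported on this slide, a paired t-ratio and a product-moment correlation between sessions, can be sketched as follows. This is a minimal illustration in plain Python; the function names and the score lists are hypothetical and are not the study data. The t-ratio is computed as first session minus second session, which is an assumption about the sign convention.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def paired_t(first, second):
    """Paired t-ratio (first minus second) and its degrees of freedom."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Sample SD of the differences (n - 1 in the denominator).
    sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    return mean_d / (sd_d / math.sqrt(n)), n - 1

# Hypothetical subscale totals for 8 respondents (not the study data).
first_session = [4, 5, 3, 6, 5, 4, 2, 6]
second_session = [5, 5, 3, 6, 4, 4, 3, 6]

t_ratio, df = paired_t(first_session, second_session)
r = pearson_r(first_session, second_session)
print(f"t({df}) = {t_ratio:.2f}, r = {r:.2f}")
```

Read together, a small t-ratio suggests that the mean level did not shift between sessions, while a high correlation suggests that respondents kept their relative positions.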
6. 6. Test-Retest Multi-item response <ul><li>Not all items behave the same way every time. </li></ul><ul><li>Identify the items in the set that are inconsistent across periods. </li></ul>Last supper: Leonardo Da Vinci
7. 7. Test-Retest Multi-item response Consistency (8 months interval)
8. 8. Test-Retest Multi-Trait Consistency (8 months interval) After 8 months Tool: Reading motivation questionnaire (Dutta Roy, 2002); N=72 students of the same school
9. 9. Alternate form <ul><li>Correlating scores on two parallel forms of a single test </li></ul><ul><ul><li>The number of items in both forms should be the same. </li></ul></ul><ul><ul><li>Both forms should have uniform content and the same range of agreement and disagreement. </li></ul></ul><ul><ul><li>The means and standard deviations of both forms should be equal. </li></ul></ul><ul><ul><li>The mode of administration and scoring of both forms should be uniform. </li></ul></ul>
10. 10. Split-half <ul><li>The upper and lower parts of a questionnaire sometimes differ in item content. </li></ul><ul><li>Not all items reflect the same content. </li></ul>
11. 11. Split-half reliability <ul><li>Correlating equal halves of the test </li></ul><ul><li>Correlating odd- and even-numbered items </li></ul><ul><li>$$r_{tt} = \frac{2\,r_{hh}}{1 + r_{hh}}$$ </li></ul><ul><li>where $r_{hh}$, the reliability of the half test, is the product-moment correlation of the two halves </li></ul><ul><li>Advantages: On-the-spot reliability </li></ul><ul><li>Disadvantages: Failure to assess temporal stability </li></ul>
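The odd-even procedure and the step-up formula on this slide can be sketched as below. This is a minimal plain-Python illustration; the function names and the small response matrix are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def split_half_reliability(item_scores):
    """Return (half-test correlation, stepped-up whole-test reliability).

    item_scores holds one list of item scores per respondent.
    """
    # Odd-numbered items are positions 1, 3, ... (indices 0, 2, ...).
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(odd_half, even_half)
    # Step-up from the slide: 2 * r_half / (1 + r_half).
    return r_half, 2 * r_half / (1 + r_half)

# Hypothetical yes/no (1/0) responses: 6 respondents x 4 items.
responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]

r_half, r_tt = split_half_reliability(responses)
print(f"half-test r = {r_half:.3f}, whole-test r_tt = {r_tt:.3f}")
```

The whole-test estimate is always at least as large as the half-test correlation when that correlation is positive, because each half is only half as long as the full test.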
12. 12. Split-half Canonical correlation <ul><li>Split-half canonical correlation indicates the percentage of variance in one set of items that is explained by the other set along a given dimension. </li></ul>
13. 13. Study: Split-half Canonical correlation <ul><li>A 12-item, 5-point Likert-type scale assessing attitude towards workers' education was administered to 1600 rural workers of WB. </li></ul><ul><li>Split-half rtt = 0.85; Cronbach’s alpha = 0.87 </li></ul><ul><li>Canonical correlation coefficient between the two sets (first 6 and last 6 items) = 0.78, χ²(36) = 1558.3, p < 0.0001. </li></ul>
14. 14. Internal Consistency <ul><li>It measures whether several items that purport to measure the same general construct produce similar scores. For example, if a respondent agreed with the statements &quot;I like to ride bicycles&quot; and &quot;I've enjoyed riding bicycles in the past&quot;, and disagreed with the statement &quot;I hate bicycles&quot;, this would indicate good internal consistency of the test. </li></ul>
15. 15. Coefficient alpha <ul><li>Cronbach’s coefficient alpha is a useful index for assessing the internal consistency of a scale. It is equivalent to Hoyt’s ANOVA procedure. The formula is: </li></ul><ul><li>$$\alpha = \frac{N}{N-1}\left(1 - \frac{\sum_{i=1}^{N}\sigma_i^2}{\sigma_t^2}\right)$$ </li></ul><ul><li>where $N$ is the number of items, $\sigma_i^2$ is the variance of item $i$, and $\sigma_t^2$ is the variance of the total composite </li></ul><ul><li>EXAMPLE </li></ul>
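The alpha formula can be worked through on a small data set. The following is a minimal sketch using only the Python standard library; the function name and the Likert responses are hypothetical, and sample variances (n - 1 denominator) are used for both the items and the total, which is one common convention.

```python
from statistics import variance  # sample variance, n - 1 denominator

def cronbach_alpha(item_scores):
    """Coefficient alpha for a list of respondents, each a list of item scores."""
    n_items = len(item_scores[0])
    # Variance of each item across respondents.
    item_variances = [
        variance([row[j] for row in item_scores]) for j in range(n_items)
    ]
    # Variance of the total composite (sum of items per respondent).
    total_variance = variance([sum(row) for row in item_scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical 5-point Likert responses: 5 respondents x 3 items.
likert = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]

alpha = cronbach_alpha(likert)
print(f"alpha = {alpha:.3f}")
```

Here the item variances sum to 5.3 and the total variance is 14.8, so alpha = 1.5 × (1 − 5.3/14.8). The value is high because the three hypothetical items rise and fall together across respondents.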
16. 16. Item-Item correspondence: Internal consistency among 42 items of Reading Motivation questionnaire. The correspondence map shows that the intrinsic reading motivation items cluster together, whereas the extrinsic motivation items are scattered widely.
17. 17. Correspondence map of traits
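18. 18. Rational Equivalence <ul><li>The formula is given below: </li></ul><ul><li>$$r_{tt} = \frac{n}{n-1}\cdot\frac{\sigma_t^2 - \sum pq}{\sigma_t^2}$$ </li></ul><ul><li>in which, </li></ul><ul><li>$r_{tt}$ = reliability coefficient of the whole test </li></ul><ul><li>$n$ = number of items </li></ul><ul><li>$\sigma_t$ = the SD of the total scores </li></ul><ul><li>$p$ = proportion of the group giving ‘yes’ responses to an item </li></ul><ul><li>$q = (1-p)$ = the proportion of the group giving ‘no’ responses to that item; $\sum pq$ is summed over the $n$ items </li></ul>Reliability Coefficients of Attitude towards School Infrastructure Questionnaire (N=175)
<table>
<tr><th>Attitudes</th><th>No. of Items</th><th>Kuder Richardson’s Reliability coefficients</th></tr>
<tr><td>Cleanliness</td><td>5</td><td>0.58</td></tr>
<tr><td>Safety</td><td>7</td><td>0.68</td></tr>
<tr><td>Comfort</td><td>5</td><td>0.42</td></tr>
<tr><td>Adequacy</td><td>12</td><td>0.58</td></tr>
<tr><td>Exploring</td><td>12</td><td>0.5</td></tr>
<tr><td>Reliability</td><td>5</td><td>0.5</td></tr>
<tr><td>Easiness</td><td>7</td><td>0.68</td></tr>
<tr><td>Equal Opportunity</td><td>5</td><td>0.63</td></tr>
<tr><td>Willingness to Participate</td><td>10</td><td>0.5</td></tr>
</table>
Ref: Dutta Roy, D. (2008). Attitude towards school infrastructure in rural areas. Unpublished project report submitted to Indian Statistical Institute, p. 45.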
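The rational equivalence (Kuder-Richardson) formula above can be illustrated on yes/no data. This is a minimal standard-library sketch; the function name and the response matrix are hypothetical, and the population variance of the total scores is used so that it is on the same footing as the item term pq.

```python
from statistics import pvariance  # population variance, n denominator

def kuder_richardson(responses):
    """Rational-equivalence reliability for 1/0 (yes/no) item responses."""
    n_people = len(responses)
    n_items = len(responses[0])
    # p: proportion answering 'yes' to each item; q = 1 - p.
    p = [sum(row[j] for row in responses) / n_people for j in range(n_items)]
    sum_pq = sum(p_j * (1 - p_j) for p_j in p)
    # Squared SD of the total scores.
    total_variance = pvariance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (total_variance - sum_pq) / total_variance

# Hypothetical yes/no (1/0) responses: 6 respondents x 4 items.
yes_no = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]

r_tt = kuder_richardson(yes_no)
print(f"r_tt = {r_tt:.3f}")
```

In this example the total-score variance is 17/9 and the sum of pq is 17/18, so the second factor is exactly 0.5 and the coefficient is (4/3) × 0.5.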
19. 19. Factors influencing test scores <ul><li>Extrinsic factors </li></ul><ul><ul><li>Group variability </li></ul></ul><ul><ul><li>Guessing </li></ul></ul><ul><ul><li>Environmental conditions </li></ul></ul><ul><li>Intrinsic factors </li></ul><ul><ul><li>Length of test </li></ul></ul><ul><ul><li>Homogeneity of items </li></ul></ul><ul><ul><li>Discrimination value </li></ul></ul><ul><ul><li>Scorer reliability </li></ul></ul>
20. 20. Maximize your efficiency <ul><li>Groups should be heterogeneous </li></ul><ul><li>Items should be homogeneous </li></ul><ul><li>The scale should preferably be a longer one </li></ul><ul><li>Items should be discriminating </li></ul>
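The advice to prefer a longer scale can be illustrated with the general Spearman-Brown formula, of which the split-half step-up (length factor 2) is a special case. The slides do not state this general form; it is brought in here as an assumption. It holds only if the added items are parallel to the existing ones, and the function name and the starting reliability of 0.50 are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when a test is lengthened by length_factor.

    Assumes the added items are parallel to the existing ones.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Hypothetical scale with reliability 0.50, lengthened 2x and 3x.
current = 0.50
for factor in (1, 2, 3):
    projected = spearman_brown(current, factor)
    print(f"length x{factor}: projected reliability = {projected:.3f}")
```

Under these assumptions the projected reliability rises with length but with diminishing returns, so lengthening a scale is not a substitute for homogeneous, discriminating items.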