Validity and reliability of questionnaires


A presentation on validity and reliability assessment of questionnaire in research. Also includes types of validity and reliability and steps in achieving the same.


  1. VALIDITY AND RELIABILITY OF QUESTIONNAIRES Dr. R. VENKITACHALAM
  2. CONTENTS  Introduction  Steps in questionnaire designing  Validity  Concept of validity  Types of validity  Steps in questionnaire validation  Reliability  Types and measurement of reliability  Conclusion  References
  3. INTRODUCTION  Questionnaire: an important method of data collection, used extensively  Advantages of questionnaires  Less expensive  Offer greater anonymity  Disadvantages  Application is limited  Response rates are low  Opportunities to clarify issues are lacking
  4.  Ideal requisites of a questionnaire:  Should be clear and easy to understand  Layout is easy to read and pleasant to the eye  Sequence of questions is easy to follow  Should be developed in an interactive style  Sensitive questions must be worded carefully  NOTE: The terms research instrument, measuring instrument, scale and test in various parts of this seminar represent the questionnaire in this context, and item represents each question in a questionnaire
  5. Steps in questionnaire designing
  6. Validity
  7. The concept of validity  Validity is the ability of an instrument to measure what it is intended to measure.  Degree to which the researcher has measured what he has set out to measure (Smith, 1991)  Are we measuring what we think we are measuring? (Kerlinger, 1973)  Extent to which an empirical measure adequately reflects the real meaning of the concept under consideration (Babbie, 1989)
  8. Why validity?  Validation is done mainly to answer the following questions:  Is the research investigation providing answers to the research questions for which it was undertaken?  If so, is it providing these answers using appropriate methods and procedures?
  9. Questions to ponder: Who judges validity? The investigator, readers of the report, and experts in the field. How? Through logical reasoning and statistical tests.
  10. Logical thinking  Justification of each question in relation to the objectives of the study  Easy if questions relate to tangible matters  Difficult where we are measuring attitudes, effectiveness of a program, satisfaction, etc.  Not everyone's logic matches, and there is no statistical backing
  11. Statistical procedures  By calculating correlation coefficients between questions and outcome variables
  12. Types of validity  Content validity (including face validity)  Criterion-related validity (concurrent and predictive)  Construct validity
  13. CONTENT VALIDITY  Uses logical reasoning and hence easy to apply  Extent to which a measuring instrument covers a representative sample of the domain of the aspects measured  Whether items and questions cover the full range of the issues or problem being measured
  14. FACE VALIDITY  The extent to which a measuring instrument appears valid on its surface  Each question or item on the research instrument must have a logical link with the objective
  15. Face validity is not content validity. Why?  Face validity  Simply addresses whether a measuring instrument looks valid  Not validity in the technical sense, because it refers not to what is actually being measured but to what the instrument appears to measure  It has more to do with rapport and public relations than with actual validity
  16. Other aspects of content validity  Coverage of the issue should be balanced  Each aspect should have similar and adequate representation in the questions
  17. Problems associated with content validity  Based on subjective logic; no definitive conclusion can be drawn or consensus reached  The extent to which questions reflect the objectives of the study may differ: if the wording is changed or a question is substituted, the magnitude of the link changes
  18. CRITERION VALIDITY  The extent to which a measuring instrument accurately predicts behaviour or ability in a given area  The external measure against which the instrument is compared is called the 'criterion'  It is of two types:  Predictive validity  Concurrent validity
  19. Predictive validity  If the test is used to predict future performance  Eg: entrance exams, where performance correlates with later performance in professional college  Eg: written driving test  Eg: measurement of sugar exposure for caries development
  20. Concurrent validity  If the test is used to estimate present performance or a person's ability at the present time, not attempting to predict future outcomes  Eg: professional college exam  Eg: driving test, pilot test  Eg: measurement of DMFT for caries experience
  21. Problems in criterion validity  Cannot be used in all circumstances  Especially in the social sciences, where some constructs have no relevant criterion  Eg: for measuring self-esteem, no criterion can be applied
  22. CONSTRUCT VALIDITY  Most important type of validity  Assesses the extent to which a measuring instrument accurately measures a theoretical construct it is designed to measure  Measured by correlating performance on the test with performance on a test for which construct validity has been determined  Eg: a new index for measuring caries can be validated by comparing its values with a standard index (like DMFT)
  23.  Another method is to show that scores on the new test differ across people with different levels of the outcome being measured  Eg: establishing the validity of a new caries index by applying it to different stages of dental caries and calculating its accuracy
  24. Summary of validity
  CONTENT: What it measures: whether the test covers a representative sample of the domains to be measured. How it is accomplished: ask experts to assess the test to establish that the items are representative of the outcome.
  CRITERION (CONCURRENT): What it measures: the ability of the test to estimate present performance. How it is accomplished: correlate performance on the test with a concurrent behaviour.
  CRITERION (PREDICTIVE): What it measures: the ability of the test to predict future performance. How it is accomplished: correlate performance on the test with a behaviour in the future.
  CONSTRUCT: What it measures: the extent to which the instrument measures a theoretical construct. How it is accomplished: correlate performance on the instrument with performance on an established instrument.
  25. Steps in questionnaire validation
  26. FACE VALIDITY  Evaluate in terms of:  Readability  Layout and style  Clarity of wording  Feasibility
  27. CONTENT VALIDITY  Two phases:  Phase 1 (researcher): conceptualization and domain analysis  Specify the full domain of content that is relevant to the issue  Sample specific areas from this domain  Put items/questions in a form that is testable  Phase 2 (experts): enhancement of the content of the questionnaire (seven or more experts)
  28. How do experts evaluate validity?  Method 1: Average Congruency Percentage (ACP) [Popham, 1978]  Each expert computes the percentage of questions they deem relevant  Take the average across all experts  If the value is > 90%, the questionnaire is considered valid  Eg: 2 experts (Expert 1: 100%, Expert 2: 80%)  Then ACP = 90%
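
A minimal sketch of the ACP calculation described above; the function name and input format are illustrative, not from Popham's original:

```python
def average_congruency_percentage(expert_percentages):
    """Average Congruency Percentage (Popham, 1978).

    expert_percentages: for each expert, the percentage (0-100) of
    items that expert judged relevant to the study objectives.
    """
    return sum(expert_percentages) / len(expert_percentages)

# Example from the slide: Expert 1 rates 100% of items relevant,
# Expert 2 rates 80% -> ACP = 90%, the conventional validity cut-off.
acp = average_congruency_percentage([100, 80])
print(acp)        # 90.0
print(acp >= 90)  # True -> considered valid
```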
  29.  Method 2: Content Validity Index [Martuza, 1977]  Content Validity Index for individual items (I-CVI)  Content Validity Index for the scale (S-CVI)
  30. I-CVI  A panel of content experts (minimum 3, maximum 10) is asked to rate the relevance of each question on a 4-point Likert scale  1 = not relevant  2 = somewhat relevant  3 = relevant  4 = very relevant  Then, for each question, the number of experts giving a score of 3 or 4 is counted (3, 4 = relevant; 1, 2 = not relevant)  The proportion is calculated  Eg: if 4/5 experts give a score of 3 or 4: I-CVI = 0.80
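
The I-CVI computation above translates directly into code; a sketch, assuming ratings are integers on the 4-point scale (names are illustrative):

```python
def item_cvi(ratings):
    """I-CVI for a single item: the proportion of experts who rate it
    3 (relevant) or 4 (very relevant) on the 4-point Likert scale."""
    relevant = sum(1 for r in ratings if r >= 3)
    return relevant / len(ratings)

# Example from the slide: 4 of 5 experts give a score of 3 or 4.
print(item_cvi([4, 3, 3, 4, 2]))  # 0.8
```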
  31. Criticisms of I-CVI  Collapses the experts' multipoint assessment into two categories (relevant and non-relevant)  Gives no inference about the comprehensiveness of the whole questionnaire  Problem of chance agreement; to overcome this, Lynn proposed:  Five or fewer experts: all must agree (I-CVI = 1.0)  Six or more: I-CVI should not be less than 0.78
  32. S-CVI  The proportion of items on an instrument that achieve a rating of 3 or 4 from all the content experts  Two approaches:  S-CVI/UA (universal agreement)  S-CVI/Ave (average)
  33.  Which would be the more effective measure here?  S-CVI/UA or S-CVI/Ave?
  34.  Which to follow?  Report both the I-CVI and S-CVI values rather than using CVI as an acronym  Report the range of I-CVI values  S-CVI/UA is the best method for stringent validity, but it is difficult to satisfy when many experts are validating; in such situations S-CVI/Ave is used (see the sketch below)
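
A sketch of the two S-CVI variants under the definitions above; the 3-item, 4-expert rating matrix is hypothetical:

```python
def scvi_ua(item_ratings):
    """S-CVI/UA: proportion of items rated 3 or 4 by ALL experts."""
    universal = sum(1 for ratings in item_ratings
                    if all(r >= 3 for r in ratings))
    return universal / len(item_ratings)

def scvi_ave(item_ratings):
    """S-CVI/Ave: the mean of the I-CVIs across all items."""
    icvis = [sum(1 for r in ratings if r >= 3) / len(ratings)
             for ratings in item_ratings]
    return sum(icvis) / len(icvis)

# Hypothetical questionnaire: 3 items, each rated by 4 experts.
ratings = [
    [4, 3, 4, 4],  # item 1: all experts agree it is relevant
    [3, 4, 2, 4],  # item 2: one expert disagrees -> fails UA
    [4, 4, 3, 3],  # item 3: all experts agree it is relevant
]
print(scvi_ua(ratings))   # ~0.667 (2 of 3 items)
print(scvi_ave(ratings))  # ~0.917 (mean of 1.0, 0.75, 1.0)
```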
  35. CONSTRUCT VALIDITY  Method: factor analysis  Examines empirically the interrelationships among items and identifies clusters of items that share sufficient variation to justify their existence as a factor or construct measured by the instrument  Various items are gathered into common factors  Common factors are synthesized into fewer factors, and the relation between each item and factor is measured  Unrelated items are eliminated
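
As a rough illustration of the factor-analysis step, the sketch below uses scikit-learn's FactorAnalysis on simulated item responses; the data, the two-factor choice, and all names are assumptions for demonstration only:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
# Simulate 100 respondents x 6 items, where items 0-2 and items 3-5
# are driven by two distinct underlying constructs plus noise.
latent = rng.normal(size=(100, 2))
true_loadings = np.array([[1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
                          [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]])
X = latent @ true_loadings + 0.3 * rng.normal(size=(100, 6))

fa = FactorAnalysis(n_components=2, random_state=0)
fa.fit(X)
# Rows = factors, columns = items. Items that load weakly on every
# factor are the "unrelated items" a researcher would eliminate.
print(np.round(fa.components_, 2))
```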
  36. Reliability
  37. RELIABILITY  Definition: the ability of an instrument to produce reproducible results  Each time it is used, similar scores should be obtained  A questionnaire is said to be reliable if we get the same or similar answers repeatedly  Though it cannot be calculated exactly, it can be estimated using correlation coefficients
  38. Reliability is measured in three aspects:
  STABILITY: ensures that the same results are obtained when the instrument is used consecutively two or more times; assessed with the test-retest method
  INTERNAL CONSISTENCY: ensures all subparts of an instrument measure the same characteristic (homogeneity); assessed with the split-half method
  EQUIVALENCE: used when two observers study a single phenomenon simultaneously; assessed with inter-rater reliability
  39. Test-retest reliability (for stability)  The test is administered twice to the same participants at different times  Used for traits that are stable over time  Easy and straightforward approach  Useful for questionnaires, checklists, rating scales, etc.  Disadvantages  Practice effect (mainly for tests)  Too short an interval in between (effect of memory)  Some traits may change with time
  40. Statistical calculation  Administer the instrument to a sample on two different occasions  Scores are compared using the correlation coefficient formula (Pearson)
  41. Correlation coefficient  Measures the degree of relationship between two sets of scores  Can range from -1 to +1  0 indicates the absence of any relationship
  Interpretation: +/- 0.7 to 1.0 = strong; +/- 0.3 to 0.69 = moderate; +/- 0.0 to 0.29 = none to weak
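
A minimal test-retest sketch using the Pearson correlation and the interpretation bands above; the five paired scores are hypothetical:

```python
import numpy as np

# Scores from the same five respondents on two occasions.
time1 = np.array([12, 18, 9, 15, 20])
time2 = np.array([11, 17, 10, 14, 19])

r = np.corrcoef(time1, time2)[0, 1]
print(round(r, 3))  # ~0.99: in the 0.7-1.0 band, i.e. strong stability
```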
  42. Split-halves reliability (homogeneity)  Split the contents of the questionnaire into two equivalent halves, either odd/even items or first/second half  Correlate the scores of one half with the scores of the other  Formula: r = Σ(x - x̄)(y - ȳ) / √[Σ(x - x̄)² · Σ(y - ȳ)²]  But this r is only for the half-test, so to estimate the reliability of the entire test, use the following formula
  43.  R' = 2r / (1 + r)  (r = split-half coefficient, R' = coefficient of the entire test; the Spearman-Brown correction)  Cronbach's alpha:  Another method of calculation, using the formula: α = [k / (k - 1)] × (1 - Σσi² / σy²)  k = total number of items in the test  Σσi² = sum of the variances of the individual items  σy² = variance of the total test scores
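
A sketch of both internal-consistency estimates above: split-half reliability with the R' = 2r / (1 + r) correction, and Cronbach's alpha computed directly from its formula. The 5-respondent x 4-item score matrix is hypothetical:

```python
import numpy as np

# Hypothetical item scores: 5 respondents (rows) x 4 items (columns).
scores = np.array([[3, 4, 3, 4],
                   [2, 2, 3, 2],
                   [4, 5, 4, 5],
                   [1, 2, 1, 2],
                   [3, 3, 4, 3]], dtype=float)

# Split-half: correlate odd-numbered and even-numbered item totals,
# then step the half-test r up to the full-test estimate R'.
odd_half = scores[:, 0::2].sum(axis=1)
even_half = scores[:, 1::2].sum(axis=1)
r = np.corrcoef(odd_half, even_half)[0, 1]
full_test = 2 * r / (1 + r)
print(round(full_test, 3))

# Cronbach's alpha: [k/(k-1)] * (1 - sum of item variances / total variance).
k = scores.shape[1]
sum_item_vars = scores.var(axis=0, ddof=1).sum()
total_var = scores.sum(axis=1).var(ddof=1)
alpha = (k / (k - 1)) * (1 - sum_item_vars / total_var)
print(round(alpha, 3))
```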
  44. Inter-rater reliability (equivalence)  Used when a single event is measured simultaneously and independently by two or more trained observers  R = number of agreements / (number of agreements + number of disagreements)
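
The agreement formula above translates directly into code; a sketch with two hypothetical raters classifying the same six observations:

```python
def percent_agreement(rater_a, rater_b):
    """R = agreements / (agreements + disagreements) for two raters
    judging the same set of events."""
    agreements = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return agreements / len(rater_a)

rater_a = ["yes", "no", "yes", "yes", "no", "yes"]
rater_b = ["yes", "no", "no", "yes", "no", "yes"]
print(round(percent_agreement(rater_a, rater_b), 3))  # 0.833
```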
  45. Summary of reliability
  TEST-RETEST: What it measures: stability over time. How it is accomplished: administer the same test to the same people at two different times.
  SPLIT-HALF: What it measures: equivalency of items. How it is accomplished: correlate performance for a group of people on two equivalent halves of the same test.
  INTER-RATER: What it measures: agreement between raters. How it is accomplished: have multiple researchers apply the same instrument and determine the percentage of agreement between them.
  46. Conclusion  A validated questionnaire is one which has undergone a validation procedure showing that it accurately measures what it aims to measure, regardless of who responds, when they respond, and to whom they respond or whether it is self-administered, and whose reliability has also been examined, thereby:  Reducing bias and ambiguities  Producing better-quality data and credible information
  47. In a nutshell: a questionnaire can be reliable but invalid, but a valid questionnaire is always reliable.
  48. Acknowledgements  Dr. Joe Joseph  Dr. Chandrashekar
  49. References  Del Greco L, Walop W, McCarthy RH. Questionnaire development: 2. Validity and reliability. CMAJ. 1987;136:699-700.  Sushil S, Verma N. Questionnaire validation made easy. Eur J Sci Res. 2010;46(2):172-8.  Polit DF, Beck CT. The Content Validity Index: are you sure you know what's being reported? Critique and recommendations. Res Nurs Health. 2006;29:489-97.  Reliability and Validity, Module 6. Cengage Learning; 2010.
  50.  Radhakrishna RB. Tips for developing and testing questionnaires/instruments. J Ext. 2007;35(1):710-4.  06Article04.pdf [Internet]. [cited 2015 Apr 7]. Available from: http://www.uk.sagepub.com/salkind2study/articles/06Article04.pdf  pta_6871_6791004_64131.pdf [Internet]. [cited 2015 Apr 7]. Available from: http://cfd.ntunhs.edu.tw/ezfiles/6/1006/attach/33/pta_6871_6791004_64131.pdf  Questionnaire designing and validation [Internet]. [cited 2015 Apr 7]. Available from: http://www.jpma.org.pk/full_article_text.php?article_id=3414
  51.  Sharma SK. Nursing Research and Statistics. 1st ed. New Delhi: Elsevier Saunders.  Carmines EG, Zeller RA. Reliability and Validity Assessment. New Delhi: SAGE Publications; 1979.  Kumar R. Research Methodology: A Step-by-Step Guide for Beginners. 3rd ed. New Delhi: SAGE Publications; 2012.  Articles from Dr. Joe
