TESTING Validity: Internal Validity of Test Items and Item Analysis


Course VALIDITY & ASSESSMENT
Learner/Practitioner Assessment Project

Purpose of the Assessment: The purpose of the test was to assess the knowledge of nurses completing an inservice training on “Patient Safety”.

Persons being assessed: The learners who took the “Patient Safety” test are nursing staff on the 8th Floor of Vanderbilt University Medical Center, with varying total years of nursing experience gained both inside and outside Vanderbilt. The learners attended the inservice, which counts toward their 4 hours of annual required inservice time. This requirement works as a motivator to get nurses to attend inservices.

Framework – content: The content for the inservice was derived from current findings published by the Institute for Healthcare Improvement safety initiative called Transforming Care at the Bedside (TCAB) (Viney et al. 2006). The concepts were presented to staff to help explain key quality and safety concepts about inpatient acute hospital falls, hospital medication errors, adverse events, and nosocomial pressure ulcers. One arm of the recommendations stemming from TCAB is that nurses and teams benefit from current knowledge and awareness of evidence-based research on patient safety and hospital quality improvement.

Framework – measurement and outcome level: The assessment for this inservice used a criterion-referenced framework. The level of learning outcome being assessed is 3A, Learning: Declarative Knowledge, measured by posttest (Moore et al. 2009). The passing score for this test was 70%. Learners who did not achieve a score of 70% or greater did not receive a full hour of inservice time. Of the 37 learners taking the assessment, 33 scored above the 70% mark.

Data collection tool: A 12-item True or False online, web-based posttest. The link was emailed to each attendee the Friday following the 4 separate nightshift and dayshift inservice events. The test was not proctored, there was no discussion of using other resources, and attendees were told that it would be based on the PowerPoint lecture. They were told that 70% would be passing.

Person(s) completing the data collection tool: Participants in the inservice completed the test.

Frequency of data collection and the sample: The test was assigned once after the inservices and had to be taken online within two weeks of the inservice for full inservice time. It is a one-time test with no remediation. 100% of inservice attendees took the test.

Descriptive Results from the data set:
Two people are missing from some of this data: one learner did not answer every question (one item was left blank), and another learner was not a nurse but an ancillary staff member. Their data were removed from reliability testing and item analysis. The first bar chart describes all test takers and their percent of items correct, with a mean of 91% and a standard deviation of 16.5.

TABLE 1. Percent Correct (bar chart; axes: Number of Learners, Percent Correct)
TABLE 2. All Learners: Percent Correct

Percent Correct   Frequency   Percent   Valid Percent   Cumulative Percent
33.33             1           2.7       2.7             2.7
41.67             1           2.7       2.7             5.4
58.33             1           2.7       2.7             8.1
66.67             1           2.7       2.7             10.8
75.00             2           5.4       5.4             16.2
83.33             1           2.7       2.7             18.9
91.67             7           18.9      18.9            37.8
100.00            23          62.2      62.2            100.0
Total             37          100.0     100.0

TABLE 3. Reliability Statistics

Cronbach's Alpha   Cronbach's Alpha Based on Standardized Items   N of Items
.855               .866                                           11

In Table 3, the number of items for which we could perform reliability testing is 11. One item is not included in the reliability measure because not all learners answered the question.
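The summary figures quoted for all 37 learners (mean, standard deviation, and the number passing) can be reproduced from the Table 2 frequencies. The following is a minimal Python sketch, not the author's SPSS procedure. The variable names are illustrative, and it assumes the reported standard deviation is the sample standard deviation (n − 1 denominator) and that "passing" means a score of at least 70%.

```python
import math

# Frequency distribution of percent-correct scores, copied from Table 2.
score_counts = {
    33.33: 1, 41.67: 1, 58.33: 1, 66.67: 1,
    75.00: 2, 83.33: 1, 91.67: 7, 100.00: 23,
}
PASSING_SCORE = 70.0  # passing threshold stated in the report

n = sum(score_counts.values())
mean = sum(score * count for score, count in score_counts.items()) / n

# Assumption: sample variance with an (n - 1) denominator.
variance = sum(
    count * (score - mean) ** 2 for score, count in score_counts.items()
) / (n - 1)
sd = math.sqrt(variance)

# Count learners at or above the passing threshold.
passed = sum(
    count for score, count in score_counts.items() if score >= PASSING_SCORE
)

print(f"n = {n}, mean = {mean:.1f}, SD = {sd:.2f}, passed = {passed}")
```

Under these assumptions the sketch agrees with the reported values: 37 learners, a mean of about 91.2, a standard deviation of about 16.54, and 33 learners at or above 70%.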
TABLE 4. Item Statistics

Item                    Mean    Std. Deviation   N    Statement
gait_belts_scored       .7429   .44344           35   Gait belts are used to prevent falls.
device_pu_scored        .8857   .32280           35   Device related pressure ulcers may be unpreventable when a patient is compromised nutritionally, has poor perfusion and must have device secured in place for life support.
toiletting_scored       .9429   .23550           35   Per Vanderbilt policy, if you assist a patient to the toilet, you must stay with them.
reimbursed_scored       .9143   .28403           35   As of 2012, hospitals are reimbursed related to their patient safety scores.
rrt_scored              .9714   .16903           35   Rapid Response Systems were designed to prevent failure to rescue. Calling Rapid Response for first recognition of trigger is the reliable way to ensure Rapid Response Systems remain reliable.
reliability_scored      .9714   .16903           35   Hospital reliability and nursing communication related to patient safety must include checklists, standardized communication formats and information system checks.
transfusion_scored      .9714   .16903           35   Transfusion errors begin at the point of collecting the specimen.
ebp_fall_scored         .9714   .16903           35   Some hospitals are using hip protectors and helmets on patients who are known for falling.
stop_pu_scored          .8571   .35504           35   Pressure ulcers are prevented by appropriate surface selection, regular repositioning and turning, optimizing temperature control, and preventing moisture/providing moisture barrier products.
fall_liability_scored   .8857   .32280           35   Patients who fall who have stated a high desire for independence, who have stated they do not have to use the call bell, can not hold us liable if they fall and are hurt.
zero_scored             .8286   .38239           35   Falls are preventable and achieving zero falls has been attained in other hospitals.

Table 4 presents the item statistics. The column labeled Mean gives the proportion of learners who answered each item correctly. Two people are missing from this data: one learner did not answer every question, and another learner was not a nurse but an ancillary staff member. Their data were removed from reliability testing and item analysis.

The first item, “Gait belts are used to prevent falls,” is a false statement; a true statement would be “Gait belts are used to prevent injury during falls.” It had the lowest proportion correct (.7429), and I suspect learners missed it because the wording is a little bit tricky.

TABLE 5. Item-Total Statistics (item wording as in Table 4)

Item                    Scale Mean if   Scale Variance    Corrected Item-     Squared Multiple   Cronbach's Alpha
                        Item Deleted    if Item Deleted   Total Correlation   Correlation        if Item Deleted
gait_belts_scored       9.2000          2.929             .690                .                  .834
device_pu_scored        9.0571          3.291             .664                .                  .833
toiletting_scored       9.0000          3.471             .737                .                  .832
reimbursed_scored       9.0286          3.499             .558                .                  .842
rrt_scored              8.9714          3.793             .533                .                  .848
reliability_scored      8.9714          3.793             .533                .                  .848
transfusion_scored      8.9714          3.852             .441                .                  .852
ebp_fall_scored         8.9714          3.970             .260                .                  .859
stop_pu_scored          9.0857          3.081             .775                .                  .822
fall_liability_scored   9.0571          3.114             .838                .                  .817
zero_scored             9.1143          3.692             .228                .                  .876

Table 5 presents the item-total statistics. Each Cronbach’s Alpha if Item Deleted value is very good (.817 to .876) and estimates the internal consistency of the remaining items. It can serve as an index of consistency and an approximation to test-retest reliability.

Measurement Characteristics:

Reliability

We are able to produce measures of internal consistency, such as calculating the test item intercorrelations, and to accept or reject questions according to whether they have the highest or lowest reliability coefficients. We were able to accept all items; the last item (zero_scored) contributes little, since removing it would change alpha only from .855 to .876. The index of consistency I used was Cronbach’s Alpha. It was 0.86 for 11 test items, which is a very good level of internal consistency. The standard deviation for this test is 16.54. The
mean score is 91.2; that is, the average test score of all test participants was 91.2%.

The standard error of measurement (SEM) is an estimate of error to use in interpreting an individual’s test score. A test score is an estimate of a person’s “true” test performance. Using a reliability coefficient $r$ and the test’s standard deviation $SD$, we can calculate this value: $SEM = SD\sqrt{1 - r}$. The SEM of the test scores of the test participants was 6.40. Applied to the mean score of 91.2, with 99% confidence the true test score lies between 74.69 and 100 (± 16.51, capped at the maximum score of 100). With 95% confidence the true test score lies between 78.66 and 100 (± 12.54).

Validity

This assessment was a measure of how much was understood about the concepts and ideas presented in a staff inservice about safety and quality. Nurses who do not have a general understanding of key ideas about safety and quality may have less motivation to implement new processes and strategies to improve quality and safety.

Decisions: Those who score 70% or greater on this assessment will be given a full inservice hour towards the total of 4 hours required by the department. If they score less than 70%, they receive only a half hour. This assessment would be formative in that it would give learners feedback about where they have weaknesses or where they could do further study.

Content validity was assured because each question on the test was quoted exactly from the inservice and from the PowerPoint slides shown at the inservice. The content was related to the learning objectives given at the beginning of the class.

Construct validity of the inservice content is related to the importance of understanding key points about patient safety and quality in the hospital setting. These ideas are also key points reflected in The Joint Commission’s National Patient Safety Goals.
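The SEM calculation reported under Reliability can be checked with a short Python sketch. It applies SEM = SD × sqrt(1 − r) and then rebuilds the bands around the mean score. The reliability value of 0.85 and the z multipliers of 2.58 and 1.96 are assumptions inferred from the reported figures (6.40, ± 16.51, ± 12.54); the report does not state them, and the function names are illustrative.

```python
import math

def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Return SEM = SD * sqrt(1 - r) for a reliability r in [0, 1]."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)

def score_band(observed: float, sem: float, z: float,
               lower: float = 0.0, upper: float = 100.0) -> tuple:
    """Return (low, high) = observed +/- z * SEM, capped to the score range."""
    half_width = z * sem
    return max(lower, observed - half_width), min(upper, observed + half_width)

SD = 16.54    # test standard deviation from the report
MEAN = 91.2   # mean percent correct from the report

# Assumption: r = 0.85 comes closest to the reported SEM of 6.40.
sem_assumed = standard_error_of_measurement(SD, 0.85)
# For comparison: the alpha of .855 listed in Table 3.
sem_table3 = standard_error_of_measurement(SD, 0.855)

# Bands rebuilt from the reported SEM of 6.40 (z values are assumptions).
REPORTED_SEM = 6.40
band_99 = score_band(MEAN, REPORTED_SEM, z=2.58)
band_95 = score_band(MEAN, REPORTED_SEM, z=1.96)

print(f"SEM with r = 0.85:  {sem_assumed:.2f}")
print(f"SEM with r = 0.855: {sem_table3:.2f}")
print(f"99% band: {band_99[0]:.2f} to {band_99[1]:.2f}")
print(f"95% band: {band_95[0]:.2f} to {band_95[1]:.2f}")
```

With r = 0.85 the formula gives an SEM of about 6.4, while the alpha of .855 from Table 3 gives about 6.30, so the reported 6.40 appears to rest on a rounded reliability value. The upper limits are capped at 100 because a percent-correct score cannot exceed the maximum.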
Vanderbilt University Medical Center also has 5 Pillar Goals for 2012 that relate to patient safety and quality, including preventing falls and pressure ulcers. The questions came directly from the lecture, and the content of the assessment is the content of the inservice materials.

Taking Kane’s “argument-based approach to validity” and applying “Criterion 1: Clarity of the Argument,” the inservice lecture and test are based directly on the newest evidence-based points that make up the Transforming Care at the Bedside initiative and the National Patient Safety Goals set by The Joint Commission. These points of evidence lay the foundation for understanding the patient safety and quality improvement initiatives that are occurring in American hospitals. The inservice was conducted as a way to spread the latest evidence-based information and increase the nurses’ base knowledge.

According to Criterion 2: Coherence of the Argument, because the inservice transmits evidence-based information that is relevant to a nurse’s work, the test is a way to measure the transmission of that information.

According to Criterion 3: Plausibility of the Assumptions, it is very
plausible that the test is valid because, when a test question is true, it is quoted exactly from the lecture and PowerPoint slides. When a test question is false, the statement is changed in a simple way to make it false.

Other sources of “error” and of unwanted variance might undermine the measurement characteristics of this assessment. I’ve listed nine examples of possible sources of error.

1. A test taker who was not present during the inservice would undermine the results of the test.
2. The questions must be phrased in a clear, non-confusing way; unclear wording would introduce error.
3. One attendee was not a nurse but an ancillary staff member who wanted to attend and take the test. I did not include their data in the reliability and item analysis calculations.
4. Other factors, such as learning or reading disabilities that any of the participants may have, may interfere with test-taking ability.
5. There could have been a distraction that caused a test participant to accidentally mark an answer they did not intend.
6. The test was given through REDCap; scoring was precise and was completed using SPSS.
7. Some of the nurses may have already known the information to the degree that the inservice was unnecessary.
8. While this patient safety inservice is not given to improve patient safety directly, it is given to improve the nurses’ motivation and involvement in unit and patient safety awareness.
9. It is possible that those who scored poorly had already met their inservice time requirement and did not take the test seriously.

There are numerous other possible sources of error (Kane 1992).

Improvement Plan

1. The first way to ensure that knowledge is being gained is to use this test as a pre-test. I could assign this test before giving the class to assess baseline knowledge.
2. One aspect that could be improved upon is content validity. I could approach this by having a few nurse colleagues assess the test for content as well as question writing (Miller & Linn 2000).
3. Next I could repeat this test and measure test item intercorrelations. I could re-conduct this inservice on another floor with a separate, new cohort and see whether my data are different and in what way.
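If the test is repeated with a new cohort, as item 3 proposes, the reliability statistics could be recomputed outside SPSS. The sketch below is one possible approach in Python, not the procedure used for this report. It computes Cronbach's alpha and the corrected item-total correlations for a learners-by-items matrix of 0/1 scores. The small data set is hypothetical and only illustrates the input format; the function names are also illustrative. It assumes every item and the total score show some variation, since otherwise the divisions are undefined.

```python
from statistics import mean, variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a learners-by-items matrix of item scores."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def pearson(xs, ys):
    """Pearson correlation; assumes both lists have nonzero variance."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def corrected_item_total(rows):
    """Correlate each item with the total of the remaining items."""
    k = len(rows[0])
    correlations = []
    for i in range(k):
        item = [row[i] for row in rows]
        rest = [sum(row) - row[i] for row in rows]  # total without item i
        correlations.append(pearson(item, rest))
    return correlations

# Hypothetical scores: 6 learners (rows) by 4 true/false items (columns),
# 1 = correct, 0 = incorrect. These are NOT the data from this report.
hypothetical_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

alpha = cronbach_alpha(hypothetical_scores)
item_total = corrected_item_total(hypothetical_scores)

print(f"Cronbach's alpha: {alpha:.3f}")
for index, r in enumerate(item_total, start=1):
    print(f"item {index}: corrected item-total r = {r:.3f}")
```

Running the same functions on the scored responses from each cohort would allow the alpha values and item-total correlations to be compared side by side.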
References

1. Viney MM, Batcheller JM, Houston S, Belcik KB. Transforming Care at the Bedside: Designing New Care Systems in an Age of Complexity. Journal of Nursing Care Quality. April 2006;21(2):143–150.
2. Rutherford P, Moen R, Taylor J. TCAB: The “How” and the “What.” AJN, American Journal of Nursing. 2009;109:5–17.
3. Kane MT. An argument-based approach to validity. Psychological Bulletin. 1992;112(3):527–535.
4. Moore Jr DE. Achieving desired results and improved outcomes: integrating planning and assessment throughout learning activities. J Contin Educ Health Prof. 2009;29(1):1.
5. Miller DM, Linn RL. Validation of performance-based assessments. Appl Psych Meas. 2000;24(4):367.
