Developing A Measure Of Cognitive
Ability And Personality
By:
Paula Brown
Alicia Glatt
Submitted to Dr. Ashita Goswami, Ph.D.
Psy 790 - Psychometrics
SALEM STATE UNIVERSITY
9 May 2016
Executive Summary
This report documents the preliminary development of the PAL* Cognitive and Spirituality
Scales. The scales are composed of three dichotomous cognitive subscales measuring
quantitative reasoning, verbal reasoning, and psychology knowledge, as well as a non-
dichotomous personality subscale measuring an individual’s level of spirituality.
The spirituality scale showed very high internal consistency. The combined cognitive composite and the verbal and quantitative reasoning subscales also showed acceptable internal consistency, indicating that each of these scales is a good measure of its respective construct. The internal consistency of the psychology subscale fell outside the acceptable range; it is therefore not a good measure of psychology knowledge.
The spirituality scale and all facets of cognitive ability were further evaluated with item analysis. Items that fell outside the acceptable difficulty range and that discriminated poorly between high and low performers were removed; however, after doing so, the estimated internal
consistency improved only slightly for the quantitative subscale and combined cognitive test.
No items from the verbal subscale qualified for removal. Because a large number of test questions were removed, the internal consistency of the psychology subscale decreased further, remaining far outside the acceptable range. Finally, various items were evaluated with
distractor analysis, and recommendations for improvement to individual questions are provided.
In conclusion, the PAL Cognitive and Spirituality Scales show promise as psychometric tools for
measuring cognitive reasoning, psychology knowledge and personality. Further analysis should
be conducted to ensure the validity of both the cognitive and spirituality scales. Implementation
of the recommendations made in the distractor analysis section should improve internal
consistency.
Introduction
As a field of study, psychometrics attempts to conceptualize human behavior and measure the
differences between individuals in terms of aptitudes, personality, values, skills, intelligence and
attitudes. In addition to the development and refinement of theoretical approaches to
measurement, a major component of psychometrics is the construction of instruments and
procedures for measuring such constructs. This project provided an opportunity to apply the
theoretical components of classical test theory in the development of a series of scales measuring
cognitive ability, as well as a separate measure of personality. Through this exercise we gained
practical experience in the construction, evaluation, and interpretation of psychological tests.
*PAL is an acronym of the authors' names, Paula and Alicia.
Construct Development
The PAL cognitive scale included a 20-item measure of quantitative reasoning, a 20-item
measure of verbal reasoning, and a 24-item measure of psychology knowledge. In terms of both
format and difficulty, questions were modeled after those used by the Educational Testing
Service for assessing readiness for graduate education (i.e., the GRE general test and psychology
subject test).
Quantitative reasoning, verbal reasoning and psychology knowledge are all multifaceted
constructs; therefore, in order to ensure content validity, survey questions were designed to cover
a large spectrum of facets. Math questions included algebra, arithmetic, data interpretation and
geometry. Verbal questions included analogies, antonyms, reading comprehension and sentence
completion. Likewise, questions in the psychology knowledge section were representative of a
dozen different subject areas that included neuroscience, clinical/abnormal psychology, and
measurement and methodology. In addition to representing a wide variety of constructs,
questions were designed to span a wide range of abilities, with easier questions presented earlier
and increasing in difficulty over the course of the survey.
Unrelated to cognitive reasoning, the spirituality scale is a 20-item personality scale developed to
assess an individual’s personal spirituality. It was developed using a 5-point Likert scale for
participants to respond to statements such as, “Although lacking in material possessions, it is
possible to feel fulfilled,” and “I believe my life has a purpose.” The scale was developed in
response to the growing trend of spirituality in the workplace. Spirituality in the
workplace/spiritual leadership are growing trends reflected in the steady increase in the number
of books, publications and conferences on this topic over the past 20 years (Imel 1998). Experts
have pointed to a number of factors behind this trend, including the rise in corporate layoffs and
downsizing, the decline of traditional support networks, efforts of individuals to find personal
fulfillment on the job, the need to reconcile personal values with those of the corporation, and the
rise of innovative organizational trends, such as learning organizations and the quality movement
and corporate desires to help workers achieve more balanced lives (Fry 2016, Laabs 1995; Leigh
1997; McLaughlin 1998). Anticipating that many survey participants may not be currently
employed, researchers focused on spirituality as a general construct as opposed to spirituality in
the workplace.
Test Development
Two researchers were involved in the creation of the survey, which consisted of a total of 100
items. A measure of self-reported performance (Appendix C) had 7 questions; the spirituality
scale (Appendix D) was composed of 20 questions; the psychology subscale (Appendix E) had
24 questions; the quantitative reasoning subscale (Appendix F) had 20 questions; and the verbal
subscale (Appendix G) had 20 questions. The remainder of the questions solicited demographic
information. Cognitive items had 4 or 5 multiple-choice options (i.e., one correct
answer and 3 or 4 distractors, respectively). Spirituality items were structured on a 5-point
Likert scale, with 1 point for "strongly disagree" and 5 points for "strongly agree" with
the respective statement. Since the spirituality scale had 20 items, the maximum possible total was
100 points. Total points were then divided by the number of items answered to obtain an average score
between 1 and 5.
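This scoring rule can be sketched in a few lines of Python; the response vector below is hypothetical, not an actual respondent's data:

```python
def spirituality_score(responses):
    """Average a list of 1-5 Likert responses, skipping unanswered (None) items."""
    answered = [r for r in responses if r is not None]
    return sum(answered) / len(answered)

# Hypothetical respondent who skipped one of the 20 statements
responses = [5, 4, 4, 3, 5, 5, 4, 2, 4, 5, 3, 4, 5, 4, None, 5, 4, 3, 4, 5]
print(round(spirituality_score(responses), 2))   # 4.11
```

Dividing by the number of items answered, rather than by 20, keeps a skipped item from dragging down the average.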
Procedure and Sample
Procedure
Researchers solicited participation from their personal social networks of
contacts. Survey questions were entered into Survey Monkey, and the corresponding electronic
link to the survey was posted to the researchers’ Facebook pages. In addition, electronic links
were directly emailed to friends/acquaintances/coworkers who the researchers felt would be
amenable to participating. Participants were permitted to use calculators for the quantitative
section, as suggesting otherwise would be unenforceable. Likewise, no time limits were placed
on participants as enforcement was impossible to achieve.
Sample
Among the 45 respondents, ages ranged from 23 to 70 (M = 49, SD = 15).
Sixty-seven percent were female, 96% were white, 42% had a Bachelor’s degree, 40% had a
Master’s degree, and 13% had a PhD or equivalent. The respondents reported grade point
averages (GPAs) ranging from 2.8 to 4.0, with a mean of 3.62. Sixteen percent of respondents
were in graduate school; the remainder were not students.
Descriptive Statistics Of Cognitive Tests
Table 1 provides descriptive statistics for the three subscales of the PAL cognitive test (verbal,
quantitative and psychology), as well as a combined total of all three subscales (combined
cognitive). Scoring of the cognitive portions of the test consisted of adding a point for each
correct answer. Therefore, the range of scores for each subscale was 0 - 20 for verbal and
quantitative reasoning, 0-24 for psychology knowledge, and 0 - 64 for the combined cognitive
score. Of the three cognitive subscales, quantitative reasoning had the lowest mean
(mean = 10.38, SD = 4.27), while verbal reasoning had the highest (mean = 13.56, SD = 3.63).
Cronbach's alpha is a statistic that measures the internal consistency of a scale (i.e., how
reliable the test is); values greater than 0.70 are considered acceptable. Notably, the
psychology subscale did not meet this threshold and therefore was not considered
reliable. The combined cognitive score, verbal reasoning, and quantitative reasoning scales all
had acceptable to high internal consistency.
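The alpha statistic can be sketched in Python; the dichotomous response matrix below is hypothetical, not the study's data:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of respondents' total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 0/1 (incorrect/correct) responses: 6 people, 4 items
scores = np.array([
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
])
alpha = cronbach_alpha(scores)
print(round(alpha, 2))   # 0.74, above the 0.70 threshold
```

Alpha rises when item variances are small relative to the variance of the total scores, i.e., when the items move together.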
Table 1 - Descriptive Statistics For Cognitive Subscales And Combined Cognitive Score

                         Number    Minimum  Maximum           Standard   Internal
                         of Items  Score    Score    Mean     Deviation  Consistency (α)
Verbal Reasoning         20        4        20       13.56    3.63       0.75
Psychology Knowledge     24        5        21       12.67    3.63       0.64
Quantitative Reasoning   20        2        18       10.38    4.27       0.81
Combined Cognitive       64        15       53       36.60    8.01       0.81

N = 45. The combined cognitive subtotal was computed by adding the verbal, quantitative, and psychology subscores. Acceptable scores of internal consistency are bolded.
Table 2 shows the descriptive statistics and internal consistency (α or alpha) for psychology
knowledge, quantitative reasoning, verbal reasoning and combined cognitive with the
respondents separated by gender. With the exception of psychology knowledge amongst
females, all scales had acceptable internal consistency. Cohen's d is a statistic that
expresses the standardized difference between two group means. A positive d in
Table 2 represents stronger female performance, while a negative d represents stronger
male performance. There is a small effect of gender on the psychology scale, and the
moderate effects of gender on the quantitative (males scored higher) and verbal (females scored
higher) reasoning subscales cancel each other out, so that there was virtually
no gender effect on the combined cognitive scale.
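As a sketch of the computation, Cohen's d via the pooled standard deviation, applied to the verbal reasoning statistics reported in Table 2 below, recovers the table's effect of about 0.5 (small rounding in the reported means and SDs can shift the second decimal):

```python
import math

def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)

# Verbal reasoning statistics from Table 2 (female group first,
# so a positive d indicates stronger female performance)
d = cohens_d(14.17, 3.52, 30, 12.33, 3.64, 14)
print(round(d, 2))
```

By convention, |d| around 0.2 is a small effect, 0.5 moderate, and 0.8 large.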
Table 2 - Descriptive Statistics For Cognitive Subscales And Combined Cognitive By Gender

                         Female (N = 30)                     Male (N = 14)
                         Mean   SD     Min    Max    α       Mean   SD     Min    Max    α      Cohen's d
Psychology Knowledge     12.87  3.33   5.00   21.00  0.59    12.27  4.27   5.00   20.00  0.75    0.16
Quantitative Reasoning    9.60  4.05   3.00   18.00  0.79    11.93  4.40   2.00   18.00  0.83   -0.55
Verbal Reasoning         14.17  3.52   6.00   20.00  0.74    12.33  3.64   4.00   17.00  0.75    0.51
Combined Cognitive       36.63  6.82   21.00  49.00  0.73    36.53  10.27  15.00  53.00  0.89    0.01

N = 45. The combined cognitive subtotal was computed by adding the verbal, quantitative, and psychology subscores. Verbal and quantitative subscales had 20 items each; psychology knowledge had 24 items. The combined cognitive had 64 items. Acceptable scores of internal consistency are bolded.
Descriptive Statistics For Quantitative, Verbal And Psychology
Subscales And Combined Cognitive By Ethnicity And Academic Degree
Due to the homogeneity of respondents, the statistical software was unable to provide descriptive
statistics by race. Out of 45 test takers, only one was Asian, one was Hispanic, and the
remainder (96%) were white.
Table 3 displays descriptive statistics organized by academic degree. The psychology scale has
inconsistent reliability, with a score of only 0.29 for PhD-level respondents but an acceptable
score (α = 0.71) at the Master's level. In addition, the verbal reasoning subscale at the Master's
level (α = 0.67) and the psychology subscale at the Bachelor's level (α = 0.68) fell short of the
threshold. All other alpha scores are at or well above the 0.70 threshold of reliability.
Regarding the cognitive scales, one might expect respondents with more education to perform
better on the survey, but that occurred only within the quantitative subscale. Respondents with
Bachelor’s degrees scored lowest (mean = 10.26, SD = 3.77), followed by those with Master’s
degrees (mean = 10.5; SD = 4.66), and those with PhD’s scored highest (mean = 11.33; SD =
5.47). Regarding the other cognitive subscales (verbal and psychology) and the combined
cognitive test, PhD-level respondents actually had the lowest scores of the three groups. It is
important to note that the sample size of PhD respondents was markedly lower than the other
groups and that different trends might emerge with a larger sample.
Table 3 - Descriptive Statistics For Quantitative, Verbal And Psychology Subscales And Combined Cognitive By Academic Degree

Education Level   Facet                    Minimum  Maximum  Mean   Standard   Alpha
                                                                   Deviation
Bachelors         Verbal Reasoning         4.00     20.00    13.42  3.76       0.77
(N = 19)          Quantitative Reasoning   2.00     18.00    10.26  3.77       0.74
                  Psychology Knowledge     5.00     21.00    12.53  3.72       0.68
                  Combined Cognitive       15.00    46.00    36.21  7.61       0.79
Masters           Verbal Reasoning         8.00     18.00    14.00  3.12       0.67
(N = 18)          Quantitative Reasoning   3.00     18.00    10.50  4.66       0.85
                  Psychology Knowledge     5.00     20.00    13.06  4.04       0.71
                  Combined Cognitive       23.00    53.00    37.56  7.46       0.78
PhD               Verbal Reasoning         6.00     18.00    12.50  5.09       0.87
(N = 6)           Quantitative Reasoning   5.00     18.00    11.33  5.47       0.90
                  Psychology Knowledge     10.00    15.00    12.17  2.48       0.29
                  Combined Cognitive       21.00    49.00    36.00  12.59      0.93

Total N = 45. The bolded internal consistency indicates an acceptable reliability measure > 0.70. Verbal and quantitative reasoning subscales had 20 items each; psychology knowledge had 24 items. The combined cognitive, which is calculated by adding together scores of the three subscales, had 64 items.
Correlation and Validity Analysis Of Cognitive Scales
A test is said to have “construct validity” if it accurately measures a theoretical, non-observable
construct or trait -- in our case, an individual's aptitude with regard to the field of psychology, as
well as verbal and quantitative reasoning. One method of establishing a test’s construct validity
is called convergent/divergent validation. A test has convergent validity if it has a high
correlation with another measure of a similar construct or a construct you would expect to
mirror. In our case, we compared the subscales and the combined cognitive test with self-
reported GPAs. By contrast, a test’s divergent validity is demonstrated through a low correlation
with a test that we would expect to measure a different construct. In our case, we compared the
subscales and the combined cognitive test with individuals’ scores on the spirituality
scale. Evidence of high divergent validity is established by a low correlation coefficient.
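The logic reduces to comparing two correlation coefficients. A minimal sketch with hypothetical score vectors (not the study data) illustrates the expected pattern:

```python
import numpy as np

# Hypothetical scores for six respondents (illustration only, not the study data)
cognitive = np.array([36.0, 41.0, 28.0, 45.0, 33.0, 39.0])
gpa = np.array([3.1, 3.6, 2.9, 3.9, 3.0, 3.5])            # similar construct
spirituality = np.array([3.8, 3.9, 3.7, 3.8, 4.0, 3.6])   # unrelated construct

# Convergent validity: expect a high correlation with the similar measure
convergent_r = np.corrcoef(cognitive, gpa)[0, 1]
# Divergent validity: expect a near-zero correlation with the unrelated measure
divergent_r = np.corrcoef(cognitive, spirituality)[0, 1]
print(round(convergent_r, 2), round(divergent_r, 2))   # 0.96 0.02
```

With real data, of course, the pattern may fail to appear, as the next paragraph shows for this study.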
Table 4 shows the correlation coefficients of each of the subscales, the combined cognitive test,
GPAs and scores on the spirituality scale. To establish convergent validity, we are looking for a
high correlation between the cognitive scales and GPA. Unfortunately, the data do not support
this. The correlation between verbal reasoning and GPA (r = -0.01) is almost non-
existent. Looking at cognitive scores, GPA correlates most highly with the psychology subscale
(r = 0.12), but the effect size is still considered small. As for the test's divergent validity,
we are looking for a low correlation between the cognitive scales and the spirituality
scale. While we do observe a very low correlation between spirituality and the combined
cognitive score (r = 0.04), the correlations between each of the subscales and the spirituality
scale are higher than their correlations with GPA.
Table 4 - Correlations Between The Spirituality Scale, The Total Cognitive Test, All Three Subscales And College G.P.A.

                            1      2      3      4      5
1 - Spirituality            ---
2 - Verbal Reasoning        0.15   ---
3 - Quantitative Reasoning  -0.26  0.28   ---
4 - Psychology Knowledge    0.24   0.28   0.12   ---
5 - Combined Cognitive      0.04   0.73   0.71   0.64   ---
6 - GPA                     0.16   -0.01  -0.07  0.12   0.02

N = 45. Bolded correlations are significant at the 0.001 level (2-tailed).
Table 5 examines the correlations between the cognitive subscales and the transformed scores for
each subscale. Because the scores have undergone a linear transformation, each of the three
subscales correlates perfectly with its own Z-score, IQ score and T-score. This occurs
because a linear transformation does not change the values in relation to each other; only the
mean and standard deviation of the scale change.
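A short Python sketch (with hypothetical raw scores) shows why linear transformations leave the correlations unchanged:

```python
import numpy as np

raw = np.array([4.0, 8.0, 13.0, 15.0, 17.0, 20.0])   # hypothetical subscale scores

z = (raw - raw.mean()) / raw.std(ddof=1)   # Z-scores: mean 0, SD 1
t = 50 + 10 * z                            # T-scores: mean 50, SD 10
iq = 100 + 15 * z                          # IQ-metric scores: mean 100, SD 15

# Linear transformations preserve rank and relative spacing, so each
# transformed variable correlates perfectly with the raw scores.
for transformed in (z, t, iq):
    print(round(np.corrcoef(raw, transformed)[0, 1], 2))   # 1.0 each time
```

Only the scale changes: the T-scores have mean 50 and SD 10, but every respondent keeps the same relative standing.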
Table 5 - Correlations Between Verbal, Math and Psychology Subscales And The Transformed T-scores, Z-scores And IQ Scores

Variables 1-4: Verbal Reasoning (raw, Z-score, IQ score, T-score); 5-8: Quantitative Reasoning (raw, Z, IQ, T); 9-12: Psychology Knowledge (raw, Z, IQ, T); 13-16: Combined Cognitive (raw, Z, IQ, T).

      1     2     3     4     5     6     7     8     9     10    11    12    13    14
1     ---
2     1.00  ---
3     1.00  1.00  ---
4     1.00  1.00  1.00  ---
5     0.28  0.28  0.28  0.28  ---
6     0.28  0.28  0.28  0.28  1.00  ---
7     0.28  0.28  0.28  0.28  1.00  1.00  ---
8     0.28  0.28  0.28  0.28  1.00  1.00  1.00  ---
9     0.28  0.28  0.28  0.28  0.12  0.12  0.12  0.12  ---
10    0.28  0.28  0.28  0.28  0.12  0.12  0.12  0.12  1.00  ---
11    0.28  0.28  0.28  0.28  0.12  0.12  0.12  0.12  1.00  1.00  ---
12    0.28  0.28  0.28  0.28  0.12  0.12  0.12  0.12  1.00  1.00  1.00  ---
13    0.73  0.73  0.73  0.73  0.71  0.71  0.71  0.71  0.64  0.64  0.64  0.64  ---
14    0.73  0.73  0.73  0.73  0.71  0.71  0.71  0.71  0.64  0.64  0.64  0.64  1.00  ---
15    0.73  0.73  0.73  0.73  0.71  0.71  0.71  0.71  0.64  0.64  0.64  0.64  1.00  1.00
16    0.73  0.73  0.73  0.73  0.71  0.71  0.71  0.71  0.64  0.64  0.64  0.64  1.00  1.00

N = 45. Bolded values are significant at the 0.01 level.
Item Analysis Of Cognitive Test And Its Facets
Despite the cognitive subscales' promising internal consistency, it is nevertheless
important to perform an item analysis to ensure the quality of each item. Two statistics that help
us discern this are p-values and CITC. P-values range from 0 - 1 and represent the difficulty of a
test item: scores < .30 are considered too hard (i.e., less than 30% of respondents answered
correctly) and scores > .80 are considered too easy (i.e., less than 20% answered incorrectly).
CITC stands for “corrected item total correlation” and is a measure of how well an item
discriminates between respondents who are knowledgeable in the content area and those who are
not. Scores can range from -1 to 1, and test items with a CITC < .20 are considered poor
differentiators and should be considered for removal.
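Both statistics can be sketched in Python; the 0/1 response data below are hypothetical, not the study's:

```python
import numpy as np

def item_stats(responses):
    """Difficulty (p) and corrected item-total correlation (CITC) for each
    column of a 0/1 (incorrect/correct) response matrix."""
    p_values = responses.mean(axis=0)
    citcs = []
    for j in range(responses.shape[1]):
        rest = np.delete(responses, j, axis=1).sum(axis=1)  # total score minus this item
        citcs.append(np.corrcoef(responses[:, j], rest)[0, 1])
    return p_values, np.array(citcs)

# Hypothetical responses: 6 people x 4 items
data = np.array([
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
])
p, citc = item_stats(data)
# Flag items that are too easy/hard or that discriminate poorly
flagged = [j + 1 for j in range(len(p))
           if not (0.30 <= p[j] <= 0.80) or citc[j] < 0.20]
print(flagged)   # [4] -- item 4 is too hard (p ≈ .17)
```

The "corrected" in CITC refers to excluding the item itself from the total, so an item cannot inflate its own correlation.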
Table 6 - Difficulty and Discrimination of Quantitative Reasoning, Verbal Reasoning And Psychology Knowledge

         Math                  Verbal                Psychology
Item     Difficulty   CITC     Difficulty   CITC     Difficulty   CITC
Number   (p)                   (p)                   (p)
1        0.53         0.27     0.49         -0.01    0.89         0.17
2        0.82         0.38     0.47         0.47     0.38         0.35
3        0.42         0.39     0.91         0.25     0.27         0.05
4        0.29         0.37     0.44         0.29     0.20         0.04
5        0.71         0.52     0.96         0.27     0.38         0.08
6        0.40         0.27     0.78         0.53     0.71         -0.01
7        0.44         0.36     0.67         0.35     0.87         0.11
8        0.38         0.37     0.91         0.31     0.42         0.18
9        0.24         0.14     0.76         -0.17    0.22         0.22
10       0.60         -0.04    0.69         0.31     0.73         0.26
11       0.69         0.36     0.78         0.44     0.69         0.38
12       0.38         0.42     0.87         0.51     0.56         0.20
13       0.60         0.31     0.73         0.16     0.62         0.09
14       0.07         -0.21    0.73         0.27     0.87         0.05
15       0.38         0.17     0.36         0.10     0.64         0.10
16       0.84         0.20     0.51         0.30     0.64         -0.04
17       0.69         0.38     0.31         0.03     0.53         0.38
18       0.87         0.04     0.78         0.23     0.42         0.09
19       0.44         0.44     0.71         0.32     0.29         0.14
20       0.58         0.33     0.71         0.36     0.60         0.23

N = 45. Difficulty (p) refers to the proportion of correct responses. Items that are too easy (p > .80) or too hard (p < .30) are bolded. CITC refers to Corrected Item Total Correlations; items with a CITC < .20 are problematic and are bolded. Blue-shaded test items indicate that both statistics are outside the recommended range.
Each item from the cognitive scales is represented in Table 6, along with the corresponding p-
values and CITC scores. Math question 14 was the most difficult (p = .07), verbal
question 5 was the least difficult (p = .96), and both were bolded for being outside
the threshold values. Math item 14 also scored -.21 in corrected item total correlation, meaning
that participants who performed poorly on the entire test had a higher rate of success on this
question. Conversely, item 2 on the verbal subscale has favorable values: its CITC of .47
assures us it is a good discriminator, and its p-value of .47 tells us that nearly as many
individuals in our sample answered the item correctly as answered incorrectly.
The bolded values in Table 6 represent those that failed to meet one of the thresholds described
above. Blue-shaded test items indicate that both statistics are outside the recommended range,
and those questions will undergo further scrutiny through a “distractor analysis” and possibly be
removed from the survey. The verbal reasoning section of the test did not have any flagged
questions, while the psychology subscale had the greatest number of flagged questions (6).
Quality test items will correlate with their own composite scores to a greater degree than they do
with the scores of another subscale. Failure to meet this qualification is an indication that the
question may be a better predictor of a construct other than the one it is meant to measure.
Tables 7a, b, and c represent correlations between each test question and the subscale composite
scores.
Table 7a - Correlations Between Psychology Items And Composite Scores

Psychology Item   Psychology (CITC)   Verbal   Quantitative
1                 0.17                0.17     -0.07
2                 0.35                0.24     0.17
3                 0.05                0.14     -0.27
4                 0.04                -0.09    -0.19
5                 0.08                -0.02    -0.16
6                 -0.01               -0.09    0.02
7                 0.11                0.19     -0.14
8                 0.18                0.11     0.06
9                 0.22                0.11     -0.16
10                0.26                0.26     0.03
11                0.38                0.31     0.12
12                0.20                0.00     0.25
13                0.09                -0.02    0.04
14                0.05                -0.05    0.13
15                0.10                0.15     -0.02
16                -0.04               -0.17    0.15
17                0.38                0.29     0.34
18                0.09                -0.02    0.17
19                0.14                0.11     0.04
20                0.23                0.24     0.11

N = 45. Bolded items represent the highest correlation for each item. Blue-shaded items are problematic because they correlate more highly with subscales other than the psychology subscale.
Table 7a indicates psychology items 3, 6, 7, 12, 14, 15, 16, 18, and 20 are problematic because
they correlate more highly with the quantitative or verbal composite than with their own. All
other items correlate most strongly with the psychology composite.
Table 7b indicates verbal items 1, 5, 8, 9, 15, and 18 are problematic because they correlate
more highly with the quantitative or psychology composite than with their own. All other
items correlate most strongly with the verbal composite.
Table 7b - Correlations Between Verbal Reasoning Items And Composite Scores

Verbal Item   Verbal Reasoning (CITC)   Psychology   Quantitative
1             -0.01                     -0.35        0.28
2             0.47                      0.28         0.29
3             0.25                      0.19         0.04
4             0.29                      0.04         0.03
5             0.27                      0.30         0.13
6             0.53                      0.30         0.32
7             0.35                      0.32         0.07
8             0.31                      0.32         -0.03
9             -0.17                     -0.12        -0.24
10            0.31                      0.13         0.10
11            0.44                      0.16         0.34
12            0.51                      0.19         0.38
13            0.16                      0.01         0.15
14            0.27                      0.15         0.00
15            0.10                      0.27         -0.23
16            0.30                      0.09         0.23
17            0.03                      0.00         -0.07
18            0.23                      0.02         0.26
19            0.32                      0.12         0.17
20            0.36                      0.20         0.17

N = 45. Bolded items represent the highest correlation for each item. Blue-shaded items are problematic because they correlate more highly with subscales other than the verbal subscale.
Table 7c indicates quantitative items 9, 10, and 14 are problematic because they correlate more
highly with the verbal or psychology composite than with their own. All other items
correlate most strongly with the quantitative composite.
Table 7c - Correlations Between Quantitative Reasoning Items And Composite Scores

Quantitative Item   Quantitative Reasoning (CITC)   Verbal   Psychology
1                   0.27                            0.17     -0.06
2                   0.38                            0.28     0.07
3                   0.39                            0.17     -0.01
4                   0.37                            0.26     0.11
5                   0.52                            0.37     0.25
6                   0.27                            0.14     -0.03
7                   0.36                            0.19     0.12
8                   0.37                            0.05     0.05
9                   0.14                            0.19     -0.08
10                  -0.04                           0.01     0.08
11                  0.36                            0.05     0.08
12                  0.42                            0.25     0.11
13                  0.31                            -0.01    0.06
14                  -0.21                           -0.29    0.17
15                  0.17                            -0.15    0.11
16                  0.20                            0.15     0.03
17                  0.38                            0.18     -0.01
18                  0.04                            0.01     -0.20
19                  0.44                            0.26     0.18
20                  0.33                            0.16     0.02

N = 45. Bolded items represent the highest correlation for each item. Blue-shaded items are problematic because they correlate more highly with subscales other than the quantitative subscale.
Summary of Problematic Cognitive Test Items
A summary of the problematic items, including why each item was considered for removal or
alteration, is presented in Table 8. Test items that fell outside both the acceptable difficulty range
(.30 < p < .80) and the acceptable discrimination range (CITC ≥ .20) were automatically deleted. Falling into
this category were items 1, 3, 4, 7, 14, and 19 from the psychology subscale and items 9, 14,
and 18 from the math subscale. None of the items in the verbal subscale failed both criteria.
Items that met only one of those criteria and/or correlated more highly with a subscale
other than their own were also considered for distractor analysis, the next step in evaluating item
integrity. Falling into this category were items 2, 15, and 16 from the math subscale; items 6, 15,
and 16 from the psychology subscale; and items 3, 5, and 8 from the verbal subscale.
Table 8 – Summary Table of Problematic Cognitive Test Items

Test Item        p-value   CITC    Reasons For Concern                          Was Item Removed
                                                                                From Survey?
Psychology 1     .89       .17     p, CITC                                      Yes
Psychology 3     .27       .05     p, high correlation with verbal              Yes
Psychology 4     .20       .04     p, CITC                                      Yes
Psychology 5     .38       .08     CITC                                         No
Psychology 6     .71       -.01    CITC, high correlation with math             No
Psychology 7     .87       .11     p, CITC, high correlation with verbal        Yes
Psychology 8     .42       .18     CITC                                         No
Psychology 9     .22       .22     p                                            No
Psychology 12    .56       .20     High correlation with math                   No
Psychology 13    .62       .09     CITC                                         No
Psychology 14    .87       .05     p, CITC, high correlation with math          Yes
Psychology 15    .64       .10     CITC, high correlation with verbal           No
Psychology 16    .64       -.04    CITC, high correlation with math             No
Psychology 18    .42       .09     CITC, high correlation with math             No
Psychology 19    .29       .14     p, CITC                                      Yes
Psychology 20    .60       .23     High correlation with verbal                 No
Math 2           .82       .38     p                                            No
Math 4           .29       .37     p                                            No
Math 9           .24       .14     p, CITC, high correlation with verbal        Yes
Math 10          .60       -.04    CITC, high correlation with psychology       No
Math 14          .07       -.21    p, CITC, high correlation with psychology    Yes
Math 15          .38       .17     CITC                                         No
Math 16          .84       .20     p                                            No
Math 18          .87       .04     p, CITC                                      Yes
Verbal 1         .49       -.01    CITC, high correlation with math             No
Verbal 3         .91       .25     p                                            No
Verbal 5         .96       .27     p, high correlation with psychology          No
Verbal 8         .91       .32     p, high correlation with psychology          No
Verbal 9         .76       -.17    CITC, high correlation with psychology       No
Verbal 12        .87       .51     p                                            No
Verbal 13        .73       .16     CITC                                         No
Verbal 15        .36       .10     CITC, high correlation with psychology       No
Verbal 17        .31       .03     CITC                                         No
Verbal 18        .78       .23     High correlation with math                   No

The psychology subscale is composed of 24 items; verbal and quantitative subscales each have 20 items.
Distractor Analysis Of Cognitive Scale
As required by the assignment, distractor analyses were conducted on three problematic items
from each subscale to determine if the item should be removed or altered. In a distractor analysis,
the average composite score of individuals who select a particular response for an item is
calculated to determine two properties of a good item:
1. An equal distribution of participants across the incorrect responses (distractors).
2. A higher mean subscale score for participants who select the correct response,
indicating that the question discriminates between high and low performers
on that particular subscale.
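The tabulation used in the tables that follow can be sketched as below (the responses and scores are hypothetical):

```python
from collections import defaultdict

def distractor_table(choices, subscale_scores):
    """Mean subscale score, count, and percentage for each response option."""
    groups = defaultdict(list)
    for choice, score in zip(choices, subscale_scores):
        groups[choice].append(score)
    n = len(choices)
    return {option: (sum(s) / len(s), len(s), 100 * len(s) / n)
            for option, s in sorted(groups.items())}

# Hypothetical item whose correct answer is "C"
choices = ["C", "C", "A", "C", "B", "C", "D", "C", "C", "A"]
scores = [15, 17, 8, 14, 9, 16, 7, 13, 18, 10]
table = distractor_table(choices, scores)
for option, (mean, n, pct) in table.items():
    print(option, round(mean, 2), n, f"{pct:.1f}%")
```

Here the correct option C draws both the plurality of responses and the highest mean subscale score, satisfying the two properties listed above.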
Psychology Knowledge Item Analysis
Psychology knowledge items 1, 3 and 4 were selected for analysis and removal.
Psychology Item 1
The distractor analysis of psychology Item 1 is represented in Table 9a. Table 8 indicates the
item had a low difficulty level, with a p value of .89, and also a low CITC score of .17. The low
CITC score indicates that those with a strong performance on the psychology knowledge test
were not statistically more successful on this item.
Table 9a - Psychology Score Of Respondents Selecting Each Option For Item 1

Response Choice   Mean Psychology Score   N    %
A                 5.00                    1    2.22%
B                 9.50                    2    4.44%
C                 11.00                   1    2.22%
D                 9.00                    1    2.22%
E                 13.15                   40   88.89%

N = 45. Correct response is E.
Table 9a indicates a very high percentage of participants selected the correct answer, E (88.89%).
The combination of information in Table 8 and Table 9a provides a valid reason to eliminate and
replace Item 1. The replacement should be moderately easy due to its placement in the test but
more difficult than the original question.
Psychology Item 3
The distractor analysis of Item 3 is represented in Table 9b. Table 8 indicates the item had a high
difficulty with a p value of .27 and also had a low CITC score of .05. The low CITC score
indicates that participants who performed well on the psychology knowledge test did not
statistically perform well on this item.
Table 9b - Psychology Score Of Respondents Selecting Each Option For Item 3

Response Choice   Mean Psychology Score   N    %
A                 9.88                    8    17.78%
B                 15.42                   12   26.67%
C                 10.86                   7    15.56%
D                 11.88                   8    17.78%
E                 13.50                   10   22.22%

N = 45. Correct response is B.
As Table 9b demonstrates, the highest percentage of participants selected the correct answer B
(26.67%), which is ideal. Furthermore, the distribution of incorrect answers amongst the
distractors is relatively even, which is also ideal. But because the CITC score is so
low, it warrants removal and replacement with an item that is better at discriminating between
high and low performers.
Psychology Item 4
The distractor analysis of Item 4 is represented in Table 9c. Table 8 indicates the item
had a high difficulty with a p value of .20 and also had a low CITC score of .04. The low
CITC score indicates that participants who performed well on the psychology knowledge test did
not statistically perform well on this item.
Table 9c - Psychology Score Of Respondents Selecting Each Option For Item 4

Response Choice   Mean Psychology Score   N    %
A                 11.00                   7    15.56%
B                 9.75                    8    17.78%
C                 11.88                   8    17.78%
D                 16.56                   9    20.00%
E                 13.15                   13   28.89%

N = 45. Correct response is D.
Table 9c displays a lower percentage of participants selecting the correct answer D (20%) and a
large percentage of participants selecting answer E (28.89%). The highest percentage should be
the correct answer and the responses should have a more even distribution.
The combination of information in Table 9c and Table 8 provides a valid reason to eliminate and
replace Item 4. The replacement should be of moderate difficulty due to its placement in the test.
Verbal Reasoning Item Analysis
Verbal reasoning items 5, 8 and 9 were selected for analysis.
Verbal Item 5
The distractor analysis of Item 5 is represented in Table 9d. Table 8 indicates the item had a low
difficulty with a p value of .96 and also had an acceptable CITC score of .27. The acceptable
CITC score indicates that those who did well on this item also did well on the rest of the verbal
reasoning section.
Table 9d – Verbal Reasoning Score Of Respondents Selecting Each Option For Item 5

Response Choice   Mean Verbal Reasoning Score   N    %
A                 10.50                         2    4.44%
C                 13.70                         43   95.56%

N = 45. Correct response is C.
Table 9d displays a very high percentage of participants selecting the correct answer of C
(95.56%). No one answered B, D or E, and only two participants answered A. The
responses should have a much better distribution.
The combination of information in Table 9d and Table 8 provides a valid reason to revise Item 5.
The question is very easy; therefore, the revision should include more appealing distractors. Item
5 has been rewritten:
5. Love and beauty are best described as __________ concepts of the mind.
A) Physical
B) Concrete
C) Psychological
D) Factual
Verbal Item 8
The distractor analysis of Item 8 is represented in Table 9e. Table 8 indicates the item had a low
difficulty with a p value of .91 and an acceptable CITC score of .32. The acceptable CITC score
indicates that those who did well on this item also did well on the rest of the verbal reasoning
section.
Table 9e - Verbal Reasoning Score Of Respondents Selecting Each Option For Item 8

Response Choice   Mean Verbal Reasoning Score   N    %
A                 9.33                          3    6.67%
C                 14.00                         41   91.11%
D                 8.00                          1    2.22%

N = 45. Correct response is C.
Table 9e displays a very high percentage of participants selecting the correct answer, C
(91.11%). None of the participants answered B or E, and only one participant answered D. The
responses should be more evenly distributed.
The combination of information in Table 9e and Table 8 provides a valid reason to revise Item 8,
which has been rewritten below:
8. It is our view that as a result of intercession from foreign stakeholders our company’s financial
difficulties have lamentably been ___________.
A) Curtailed
B) Ameliorated
C) Exacerbated
D) Mitigated
Verbal Item 9
The distractor analysis of Item 9 is represented in Table 9f. Table 8 indicates the item had an
acceptable p value of .76 but a negative CITC score of -.17. The negative CITC score indicates
that those who performed well on the rest of the verbal reasoning test were not more successful
on this item.
Table 9f - Verbal Reasoning Score Of Respondents Selecting Each Option For Item 9

Response Choice   Verbal Reasoning Score    N     %
A                 14.00                     4     8.89%
B                 14.67                     3     6.67%
C                 11.75                     4     8.89%
D                 13.62                    34    75.56%

N = 45. Correct response is D.
Table 9f displays a high percentage of participants selecting the correct answer, D (75.56%).
The correct answer should attract the highest percentage, but the responses should be more
evenly distributed. The mean of the correct response (13.62) is lower than that of responses A
(14.00) and B (14.67). This indicates the question does not discriminate between high and low
performers on the verbal reasoning subscale, which is why the CITC score is negative (-.17).
The item may be too easy, with high performers overthinking the answer.
The combination of information in Table 9f and Table 8 provides a valid reason to rewrite Item 9.
Item 9 has been rewritten below to make the other options more appealing.
9. Finish the analogy:
Grain : Rice ::
A. Cube : Ice
B. Rain : Precipitation
C. Cow : Milk
D. Flake : Snow
Quantitative Reasoning Item Analysis
Quantitative reasoning items 9, 14 and 18 were selected for analysis and removal.
Quantitative Item 9
The distractor analysis of Item 9 is represented in Table 9g. Table 8 indicates the item had high
difficulty, with a p value of .24, and a low CITC score of .14. The low CITC score indicates that
those who performed well on the quantitative test as a whole were not more successful on this
item.
Table 9g - Quantitative Reasoning Score Of Respondents Selecting Each Option For Item 9

Response Choice   Quantitative Score    N     %
A                  7.75                 4     9.52%
B                  7.86                 7    16.67%
C                 12.09                11    26.19%
D                 10.77                13    30.95%
E                 12.29                 7    16.67%

N = 42. Correct response is C.
As Table 9g displays, a higher percentage of participants selected answer D (30.95%) than the
correct answer, C (26.19%). The correct answer should have the highest percentage of
respondents. The mean of the correct response (12.09) is also slightly lower than that of
response E (12.29). This indicates the question does not discriminate between high and low
performers on the quantitative reasoning subscale, which is why the CITC score is low (.14).
Given the item's high difficulty, it may be ambiguous or its distractors misleading, since even
high performers were drawn to incorrect options.
The combination of information in Table 8 and Table 9g provides a valid reason to eliminate and
replace Item 9.
Quantitative Item 14
The distractor analysis of Item 14 is represented in Table 9h. Table 8 indicates the item had
high difficulty, with a p value of .07, and a negative CITC score of -.21. The negative CITC
score indicates that those who performed well on the quantitative test as a whole were not more
successful on this item.
Table 9h - Quantitative Reasoning Score Of Respondents Selecting Each Option For Item 14

Response Choice   Quantitative Score    N     %
A                  3.00                 1     2.39%
B                  2.00                 1     2.39%
C                  7.75                 4     9.52%
D                  7.00                 3     7.14%
E                 11.85                33    78.57%

N = 42. Correct response is D.
Table 9h displays a low percentage of participants selecting the correct answer, D (7.14%), and
a very large percentage selecting answer E (78.57%). The correct answer should attract the
highest percentage, and responses should be more evenly distributed across the incorrect
choices. The mean of the correct answer (7.00) should be higher than that of the other
responses; the lower mean indicates the question does not discriminate between high and low
performers on the quantitative subscale.
The combination of information in Table 8 and Table 9h provides a valid reason to eliminate and
replace Item 14. The replacement question should be of moderate difficulty because of the
placement of the item.
Quantitative Item 18
The distractor analysis of Item 18 is represented in Table 9i. Table 8 indicates the item had low
difficulty, with a p value of .87, and a low CITC score of .04. The low CITC score indicates that
those who performed well on the quantitative test as a whole were not more successful on this
item.
Table 9i - Quantitative Reasoning Score Of Respondents Selecting Each Option For Item 18

Response Choice   Quantitative Score    N     %
A                 10.85                39    86.67%
B                  3.00                 1     2.22%
C                 12.00                 2     4.44%
D                  5.50                 2     4.44%
E                  6.00                 1     2.22%

N = 45. Correct response is A.
Table 9i displays a high percentage of participants selecting the correct answer, A (86.67%).
The correct answer should attract the highest percentage, but responses should be more evenly
distributed across the incorrect choices. The mean of the correct answer (10.85) should be the
highest, yet it is lower than the mean of answer C (12.00); this indicates the question does not
discriminate between high and low performers on the quantitative subscale. The combination of
information in Table 8 and Table 9i provides a valid reason to eliminate and replace Item 18.
Revised Cognitive Scale
As a result of our item and distractor analyses, the following items were eliminated: Psychology
items 1, 3, 4, 7, 14 and 19, and Quantitative items 9, 14 and 18. No verbal reasoning items were
removed. Because the combined score is a composite of all three subscales, its number of items
decreased by nine. Removing the problematic items can alter a scale's reliability, so the analysis
was repeated. The results shown in Table 10 indicate that removing the problematic items
further decreased the reliability of the psychology scale (from .64 to .54). Quantitative reasoning
and the combined cognitive scale improved slightly beyond their already acceptable levels,
meaning the removals improved the internal consistency of these two measures.
Table 10 - Descriptive Statistics For Cognitive Scale Corrected Totals

                              Number    Minimum  Maximum          Standard   Internal
                              of Items  Score    Score    Mean    Deviation  Consistency (α)
Verbal Reasoning        Old   20        4        20       13.56   3.63       .75
                        New   20        4        20       13.56   3.63       .75
Quantitative Reasoning  Old   20        2        18       10.38   4.27       .81
                        New   17        1        17        9.20   4.15       .82
Psychology Knowledge    Old   24        5        21       12.67   3.63       .64
                        New   18        3        16        9.29   2.94       .54
Combined Cognitive      Old   64        15       53       36.60   8.01       .81
                        New   55        10       47       32.04   7.70       .82

N = 45. New acceptable internal consistencies (> .70) are bolded. "New" indicates the problem items have been removed: Psychology items 1, 3, 4, 7, 14 and 19, and Quantitative items 9, 14 and 18. The new combined cognitive subtotal was computed by adding the corrected verbal, psychology and quantitative subscales and correcting for guessing.
Descriptive Statistics For Spirituality Scale
Scoring of the spirituality portion of the test consisted of adding 1 point for each statement to
which individuals responded with “strongly disagree”, 2 points for “slightly disagree”, 3 points
for “neither agree nor disagree”, 4 points for “slightly agree”, and 5 points for “strongly agree.”
The total was then divided by the number of questions on the scale to obtain an individual
average. Therefore an individual’s score for the spirituality scale could range from 1 - 5, with
higher scores indicating higher degrees of spirituality.
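The scoring rule above can be sketched as follows (the response labels mirror the scale described; the sample responses are hypothetical):

```python
# Point values for each Likert response, as described in the scoring rule.
LIKERT_POINTS = {
    "strongly disagree": 1,
    "slightly disagree": 2,
    "neither agree nor disagree": 3,
    "slightly agree": 4,
    "strongly agree": 5,
}

def spirituality_score(responses):
    """Average the point values across all answered items, yielding 1-5."""
    points = [LIKERT_POINTS[r] for r in responses]
    return sum(points) / len(points)

# Hypothetical respondent answering 4 of the 20 items.
demo = ["slightly agree", "strongly agree",
        "neither agree nor disagree", "slightly agree"]
print(spirituality_score(demo))  # (4 + 5 + 3 + 4) / 4 = 4.0
```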
Table 11 provides descriptive statistics for the overall spirituality scale, as well as broken down
by gender. The scores obtained by our sample ranged from 1.95 to 4.90, with a mean of 3.60
and a standard deviation of .57. Women reported higher levels of spirituality (Mean = 3.79; SD
= .49) than men (Mean = 3.23; SD = .55). Cohen's d is a standardized measure of the
difference between two group means. A positive d in Table 11 represents a higher female score,
while a negative d represents a higher male score. There is a statistically large effect of gender
on the spirituality scale.
Table 11 - Descriptive Statistics For Spirituality Scale By Gender

                         Overall     Female      Male
                         (N = 45)    (N = 30)    (N = 15)
Mean                     3.60        3.79        3.23
Standard Deviation       .57         .49         .55
Minimum Score            1.95        2.85        1.95
Maximum Score            4.90        4.90        3.95
Internal Consistency     .90         .86         .91
Cohen's d                1.08 (large effect)

Total N = 45. The bolded internal consistency indicates an acceptable reliability measure > .70. The spirituality scale was composed of 20 items.
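The Cohen's d in Table 11 can be approximately reproduced from the reported group statistics (a sketch using the pooled-standard-deviation formula; the exact variant used in the study is an assumption):

```python
from math import sqrt

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d using a pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / sqrt(pooled_var)

# Group statistics from Table 11 (female vs. male spirituality scores).
d = cohens_d(3.79, 0.49, 30, 3.23, 0.55, 15)
print(round(d, 2))  # ~1.10; the table's 1.08 reflects rounding in the inputs
```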
Crohnbach’s alpha is a statistic that measures the internal consistency of a scale (i.e., how
reliable the test is) and values greater than .70 are considered to be acceptable. At .90 overall,
the spirituality scale has a remarkable score of internal consistency. Even though the reliability
of the spirituality scale for men (α = .91) is higher than for women (α = .86), the female score is
still well above the .70 threshold of acceptability.
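Cronbach's alpha can be computed directly from an item-by-respondent score matrix (a generic sketch with made-up data, not the study's responses):

```python
def cronbach_alpha(items):
    """items: one inner list of scores per item, all the same length.
    alpha = k/(k-1) * (1 - sum(item variances) / variance of total scores)."""
    k = len(items)
    n = len(items[0])

    def var(xs):  # population variance; the choice cancels if used consistently
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    totals = [sum(item[i] for item in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(var(it) for it in items) / var(totals))

# Hypothetical 3-item, 4-respondent example.
demo = [[1, 2, 4, 5], [2, 2, 5, 4], [1, 3, 4, 5]]
print(round(cronbach_alpha(demo), 2))  # 0.95 for this toy sample
```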
Table 12 displays descriptive statistics organized by academic degree. The spirituality scale
showed high internal consistency at all academic levels. There were no discernible trends
relating scores on the spirituality scale to the academic degree of the respondents.
Table 12 - Descriptive Statistics For Spirituality Scale By Academic Degree

Education Level             Minimum  Maximum          Standard   Alpha
                            Score    Score    Mean    Deviation  (α)
Bachelors (N = 19)          1.95     4.90     3.63    0.71       .93
Masters (N = 18)            2.63     4.25     3.50    0.42       .80
PhD or equivalent (N = 6)   3.20     4.35     3.66    0.52       .88

Total N = 45. The bolded internal consistency indicates an acceptable reliability measure > .70. The spirituality scale had 20 items.
Descriptive Statistics For Spirituality Scale By Ethnicity
Due to the homogeneity of respondents, the statistical software was unable to provide
descriptive statistics by race. Of the 45 test takers, only one was Asian, one was Hispanic, and
the remainder (96%) were white.
Validity And Correlation Analysis For Spirituality Scale
The spirituality score was correlated with the combined cognitive test and all three of its
subscales and is represented in Table 13. There is a negative correlation between spirituality and
quantitative reasoning (r = -0.26), which represents a small to medium effect. Conversely, there
is a positive correlation between spirituality and the remainder of the scales. There is a small
correlation between spirituality and verbal reasoning (r = 0.15), a small to medium correlation
between spirituality and psychology knowledge (r = 0.24), and a very small correlation between
spirituality and the combined cognitive scale (r = 0.04).
Table 13 - Correlations Between The Spirituality Scale, The Total Cognitive Test And All Three Subscales

                        Verbal     Quantitative  Psychology    Combined
                        Reasoning  Reasoning     Knowledge     Cognitive
Spirituality            0.15       -0.26         0.24          0.04
Nature Of Effect Size   Small      Small/Medium  Small/Medium  Very Small

N = 45. None of the correlations was statistically significant.
Overall Analysis Of The Spirituality Scale
To test whether the spirituality scale was reliably measuring spirituality, a reliability analysis was
conducted. Again, the threshold for acceptable internal consistency is .70, which indicates that
the items on the scale consistently measure the same underlying construct. Reliability analysis
of the spirituality scale yielded an internal consistency of α = .90, indicating that the scale is a
good measure of the spirituality construct. Even though the scale had high internal consistency,
further analyses were conducted at the item level, as specified by the class assignment.
Item Analysis Of The Spirituality Scale
Table 14 shows that all of the items in the spirituality scale had acceptable CITC scores,
meaning respondents who scored high on an item were also likely to score high on the
spirituality survey as a whole. Item 1 (CITC = .22) and Item 4 (CITC = .21) had the lowest CITC
scores. No items fell below the CITC threshold of .20, so there were no items on which
respondents scored high while scoring low on the rest of the spirituality subscale.
Table 14 - Descriptive Statistics And Discrimination Index (CITC) For The Spirituality Scale

Item Number   Mean Score   Standard Deviation   CITC
1 4.26 0.54 0.22
2 4.36 0.53 0.36
3 4.17 0.73 0.47
4 3.88 0.77 0.21
5 4.31 0.78 0.57
6 3.29 0.86 0.65
7 3.74 0.83 0.54
8 4.12 1.09 0.61
9 3.33 1.30 0.79
10 3.14 1.05 0.37
11 3.81 1.07 0.68
12 3.21 1.09 0.84
13 4.02 0.78 0.32
14 3.62 1.04 0.61
15 4.07 0.84 0.60
16 3.88 0.71 0.34
17 2.31 1.16 0.35
18 2.45 1.27 0.71
19 3.60 1.13 0.23
20 2.60 1.29 0.76
N = 45. Bolded CITCs indicate items with the lowest (but still acceptable) CITC scores.
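The CITC values in Table 14 are corrected item-total correlations: each item is correlated with the total computed from the remaining items, so an item is not correlated with itself. A sketch with hypothetical data:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def citc(item_scores, all_items):
    """Corrected item-total correlation: correlate an item with the
    total score computed from all *other* items (the 'corrected' total)."""
    rest_totals = [
        sum(other[i] for other in all_items if other is not item_scores)
        for i in range(len(item_scores))
    ]
    return pearson(item_scores, rest_totals)

# Hypothetical 3-item, 5-respondent data.
items = [[1, 2, 3, 4, 5], [2, 2, 3, 5, 5], [1, 3, 3, 4, 4]]
print(round(citc(items[0], items), 2))  # 0.97 for this toy sample
```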
To satisfy the requirements of the assignment, further analysis was conducted to determine the
relationship between the individual items on the spirituality scale and the other facets of the
PAL survey. A good item correlates more highly with its own facet than with any other facet; an
item that correlates more highly with another facet is a poor item. Table 15 presents the results
of this analysis. Only Items 1 and 4 correlate more highly with a facet other than their own.
Table 15 - Correlations Between Spirituality And Cognitive Scores
Item Number Spirituality CITC Verbal Quantitative Psychology
1 0.22 0.34 0.13 0.05
2 0.36 -0.05 -0.16 -0.13
3 0.47 0.00 -0.09 0.03
4 0.21 0.28 0.04 0.26
5 0.57 0.13 -0.05 0.04
6 0.65 -0.04 -0.11 0.27
7 0.54 0.22 -0.11 0.15
8 0.61 0.09 -0.05 0.17
9 0.79 -0.04 -0.34 0.08
10 0.37 -0.12 -0.17 0.14
11 0.68 0.18 -0.32 0.18
12 0.84 0.29 -0.14 0.20
13 0.32 0.03 -0.25 0.11
14 0.61 0.11 -0.20 0.18
15 0.60 0.23 -0.19 0.04
16 0.34 0.02 -0.27 0.16
17 0.35 -0.09 -0.15 0.22
18 0.71 0.80 -0.20 0.16
19 0.23 0.02 -0.09 0.03
20 0.76 0.26 -0.20 0.29
N = 45. CITC refers to Corrected Item-Total Correlations. Bolded correlations represent the highest correlation for the item among the subscales.
Distractor Analysis Of Spirituality Scale Items
The spirituality scale differs from the other subscales in that its answers are not right or wrong.
Items 1 and 4 have acceptable CITC scores, but they sit at the low end of the acceptable range,
indicating a comparatively weak relationship between these items and the rest of the spirituality
scale. Although these items are within the acceptable range, they are the weakest of the 20
questions. Tables 16a and 16b display the distractor analysis for the two items. None of the
questions in the spirituality scale will be removed.
Table 16a - Spirituality Score Of Respondents Selecting Each Option For Item 1

Response Choice         Spirituality Score    N     %
Slightly disagree (3)   3.78                  2     4.44%
Slightly agree (4)      3.50                 29    64.44%
Agree (5)               3.80                 14    31.11%

N = 45. The spirituality scale included 20 questions.
Table 16b - Spirituality Score Of Respondents Selecting Each Option For Item 4

Response Choice         Spirituality Score    N     %
Disagree (2)            3.21                  4     8.89%
Slightly Disagree (3)   3.31                  4     8.89%
Slightly Agree (4)      3.64                 31    68.89%
Agree (5)               3.87                  6    13.33%

N = 45. The spirituality scale included 20 questions.
Biases Of Spirituality Scale Items
Individual responses were further investigated by looking for response biases among
participants completing the spirituality personality measure. A response bias can occur on an
attitude measure when a person systematically chooses responses based on something other
than their true stance on a question. The tables below show three separate participants flagged
for three types of response bias. These respondents were reviewed and ultimately removed.
Central Tendency
Central tendency is the type of error in which a respondent tends to choose the answers closest
to the middle of the scale. In these cases respondents are reluctant to choose strongly agree or
strongly disagree, the extreme positive and negative choices. Case 45 is an example of central
tendency among the spirituality respondents. Table 17a displays the respondent's answers
clustered in the middle of the rating continuum: all twenty answers fell within the three central
response options, with half (10 responses) being 'neither agree nor disagree'. The participant
may not have taken the time to read each question properly; therefore, this case will be deleted.
Table 17a - Spirituality Response Of Respondent 45: Central Tendency Error
Frequency Percent
Disagree 3 15%
Neither Agree Nor Disagree 10 50%
Agree 7 35%
Severity Bias
By definition, severity bias is an error that occurs as the result of a rater's tendency to be overly
critical; however, that definition falsely suggests that there is a preferred way to answer the
questions on the spirituality scale. In this study, we assigned severity bias to cases in which
most of an individual's responses were at the negative end of the Likert scale (i.e., disagreeing
to some degree with almost all of the items on the survey). Case 9, displayed in Table 17b,
shows a severity-bias frequency distribution and will be removed.
Table 17b - Spirituality Response Of Respondent 9: Severity Bias Error
Frequency Percent
Strongly Disagree 8 40%
Disagree 8 40%
Neither Agree Nor Disagree 1 5%
Agree 3 15%
Leniency Error
By definition, leniency error occurs as the result of a rater's tendency to be too forgiving and
insufficiently critical; however, that definition falsely suggests that there is a preferred way to
answer the questions on the spirituality scale. In this study, we assigned leniency error to cases
in which most of an individual's responses were at the positive end of the Likert scale (i.e.,
agreeing to some degree with almost all of the items on the survey). Case 14, shown in Table
17c, displays a strong leniency error: the respondent gave the same reply for the vast majority
of items. This case will be removed.
Table 17c - Spirituality Response Of Respondent 14: Leniency Error
Frequency Percent
Agree 2 10%
Strongly Agree 18 90%
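The three response-bias checks above can be sketched as simple frequency heuristics over a respondent's 1-5 Likert answers (the 80% cutoff and the exact rules below are illustrative assumptions, not the criteria used in the study):

```python
from collections import Counter

def flag_response_bias(responses, cutoff=0.8):
    """Classify a respondent's 1-5 Likert answers.
    Illustrative rules: central tendency if every answer avoids the extremes;
    severity/leniency if >= `cutoff` of answers fall at one end of the scale."""
    n = len(responses)
    counts = Counter(responses)
    middle = sum(counts[v] for v in (2, 3, 4)) / n   # never uses 1 or 5
    negative = sum(counts[v] for v in (1, 2)) / n    # disagrees with most items
    positive = sum(counts[v] for v in (4, 5)) / n    # agrees with most items
    if middle >= 1.0:
        return "central tendency"
    if negative >= cutoff:
        return "severity bias"
    if positive >= cutoff:
        return "leniency error"
    return "no flag"

print(flag_response_bias([2] * 3 + [3] * 10 + [4] * 7))      # like Case 45
print(flag_response_bias([1] * 8 + [2] * 8 + [3] + [4] * 3))  # like Case 9
print(flag_response_bias([4] * 2 + [5] * 18))                 # like Case 14
```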
Revised Spirituality Scale
As a result of our response-bias analysis, the following respondent cases were eliminated: 9, 14
and 45. All questions met the requirements to be included in the final results, so no questions
were eliminated. The statistical results of eliminating the three cases are delineated in Table 18.
The new estimate of internal consistency decreased from .90 to .85; however, the corrected
spirituality scale remains a highly reliable instrument.
Table 18 - Descriptive Statistics For Spirituality Scale Corrected Totals

                             Original   Corrected
Minimum Score                1.95       2.63
Maximum Score                4.90       4.55
Mean                         3.60       3.62
Standard Deviation           0.57       0.49
Internal Consistency (α)     0.90       0.85

Original N = 45, Corrected N = 42. Acceptable internal consistencies (> .70) are bolded. "Corrected" indicates the problematic cases (9, 14 and 45) have been removed.
Correction For Guessing On Cognitive Test
In order to obtain a better representation of a participant's true score, a correction for guessing
was performed on the combined cognitive scores. The spirituality measure was instead
corrected by removing respondents with severe response bias. Table 19 shows that after
correcting for guessing, the correlation between the combined cognitive score and GPA
decreased. The spirituality correlation with GPA also decreased, from .16 to .11, after the
biased cases were removed. These corrections made only a small difference in the criterion
validity of the spirituality scale.
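A correction for guessing is commonly done with the standard formula score, R − W/(k − 1); whether this exact formula was applied in the study is an assumption. A sketch:

```python
def correct_for_guessing(num_right, num_wrong, num_options):
    """Standard formula score: R - W / (k - 1).
    Penalizes each wrong answer by the expected gain from blind guessing
    among k options; omitted items are neither rewarded nor penalized."""
    return num_right - num_wrong / (num_options - 1)

# Hypothetical: 36 right and 19 wrong on 5-option multiple-choice items.
print(correct_for_guessing(36, 19, 5))  # 36 - 19/4 = 31.25
```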
Variance Explained
Table 19 indicates that the corrected combined cognitive test accounted for none of the
variance in GPA, as did the uncorrected combined cognitive test. The spirituality scale
accounted for 2.46% of the variance in GPA, and the corrected spirituality scale for 1.10%; the
nature of this effect is considered small. The amount of variance explained in GPA was not
affected by the correction for guessing on the cognitive test.
Table 19 - Variance In GPA Explained, Effect Size And GPA Correlation Of Corrected Items

                        Original    Partially-  Corrected   Original      Corrected
                        Combined    Corrected   Combined    Spirituality  Spirituality
                        Cognitive   Combined    Cognitive
                                    Cognitive
GPA Correlation (r)     0.02        .00         .00         .16           .11
GPA Explained Variance
(r² × 100)              0.00%       0.00%       0.00%       2.46%         1.10%
Nature of Effect Size   No Effect   No Effect   No Effect   Small Effect  Small Effect

Proportion of the variance explained: .01 = small effect, .09 = medium effect, .25 = large effect.
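The variance-explained figures are simple arithmetic on the correlations. Note that .16² gives 2.56% rather than the reported 2.46%, which suggests the table's percentages were computed from the unrounded correlations:

```python
def variance_explained_pct(r):
    """Percentage of variance explained by a correlation: r**2 * 100."""
    return r * r * 100

# Correlations as rounded in Table 19.
print(round(variance_explained_pct(0.16), 2))  # 2.56 (table reports 2.46)
print(round(variance_explained_pct(0.11), 2))  # 1.21 (table reports 1.10)
```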
Correction For Attenuation
The relationship between two variables can be weakened by measurement error, and the
correlation between two measures is more accurately reflected once a correction for attenuation
has been performed. The correlations of the three subscales, the spirituality scale and the
combined cognitive score with the self-reported performance measure are reported in Table 20.
The correction improved the correlation coefficient of every measure except quantitative
reasoning, which had almost no original correlation. The psychology subscale had the largest
increase in effect size, from .29 to .39.
Table 20 - Corrected For Attenuation With Self-Performance Reports And Test Performance

                   Verbal     Quantitative  Psychology  Combined   Spirituality
                   Reasoning  Reasoning     Knowledge   Cognitive
Attenuation        .21        .00           .39         .21        .43
Self-Performance   .18        .00           .29         .19        .40
Reliability        .75        .82           .54         .82        .85
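The corrected values in Table 20 are consistent with a correction for attenuation in one variable only, dividing each observed correlation by the square root of the test's reliability (an inference from the numbers, e.g. .29/√.54 ≈ .39; the self-report measure's reliability appears to be treated as unknown). A sketch:

```python
from math import sqrt

def disattenuate(observed_r, test_reliability):
    """Correction for attenuation in one variable: r / sqrt(r_xx).
    Used when only one measure's reliability is known."""
    return observed_r / sqrt(test_reliability)

# Psychology subscale from Table 20: observed r = .29, alpha = .54.
print(round(disattenuate(0.29, 0.54), 2))  # 0.39, matching the table
```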
The transformed scores still correlate perfectly with the untransformed scores, and the other
corrected correlations vary only slightly from the uncorrected scores. Because the scores have
undergone a linear transformation, each of the three subscales correlates perfectly (r = 1.00)
with its own Z-score, IQ score and T-score. This occurs because the values did not change in
relation to each other; the transformation preserves each score's relative position.
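The linear transformations discussed above can be sketched as follows; because z-scores, T-scores (50 + 10z) and IQ-metric scores (100 + 15z) are linear functions of the raw scores, correlations with them are unchanged:

```python
from math import sqrt

def standardize(scores):
    """Convert raw scores to z-scores, then to T (50 + 10z) and
    IQ-metric (100 + 15z) scores. All three are linear transformations,
    so each correlates perfectly (r = 1.00) with the raw scores."""
    n = len(scores)
    mean = sum(scores) / n
    sd = sqrt(sum((x - mean) ** 2 for x in scores) / n)
    z = [(x - mean) / sd for x in scores]
    t = [50 + 10 * zi for zi in z]
    iq = [100 + 15 * zi for zi in z]
    return z, t, iq

z, t, iq = standardize([10, 20, 30, 40, 50])
print([round(v, 2) for v in z])  # [-1.41, -0.71, 0.0, 0.71, 1.41]
```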
Table 21 - Correlation Between Corrected Raw And Standardized Scores Of The Combined Cognitive Test And Its Subscales

                            1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16
 1. Corrected Combined
    Cognitive              ---
 2. Combined Cognitive
    Z-Score               1.00  ---
 3. Combined Cognitive
    T-Score               1.00 1.00  ---
 4. Combined Cognitive
    IQ                    1.00 1.00 1.00  ---
 5. Psychology             .64  .64  .64  .64  ---
 6. Psychology Z-Score     .64  .64  .64  .64 1.00  ---
 7. Psychology T-Score     .64  .64  .64  .64 1.00 1.00  ---
 8. Psychology IQ          .64  .64  .64  .64 1.00 1.00 1.00  ---
 9. Quantitative           .76  .76  .76  .76  .23  .23  .23  .23  ---
10. Quantitative Z-Score   .76  .76  .76  .76  .23  .23  .23  .23 1.00  ---
11. Quantitative T-Score   .76  .76  .76  .76  .23  .23  .23  .23 1.00 1.00  ---
12. Quantitative IQ        .76  .76  .76  .76  .23  .23  .23  .23 1.00 1.00 1.00  ---
13. Verbal                 .73  .73  .73  .73  .29  .29  .29  .29  .28  .28  .28  .28  ---
14. Verbal Z-Score         .73  .73  .73  .73  .29  .29  .29  .29  .28  .28  .28  .28 1.00  ---
15. Verbal T-Score         .73  .73  .73  .73  .29  .29  .29  .29  .28  .28  .28  .28 1.00 1.00  ---
16. Verbal IQ              .73  .73  .73  .73  .29  .29  .29  .29  .28  .28  .28  .28 1.00 1.00 1.00  ---

N = 45. Bolded correlations are significant at the 0.01 level (2-tailed).
Overall Conclusions And Recommendations
The PAL Cognitive and Spirituality Scales show promise as psychometric tools for measuring
cognitive reasoning, psychology knowledge and personality. Overall, the cognitive scale was
found to have high internal consistency, as did the verbal and quantitative subscales. The
psychology subscale, however, did not show good internal consistency, and the mandatory
removal of items not meeting reliability standards decreased it further, an unfortunate
consequence when a scale has only a small number of items. The spirituality scale showed
extremely high internal consistency; all of its items met the required reliability thresholds and
were therefore retained.
This report has presented a number of recommendations to further improve the scales' validity
and reliability. Implementing the recommendations made in the distractor analysis sections
should improve internal consistency; for example, the development of several new questions for
the psychology subscale is recommended. Finally, further analysis should be conducted to
confirm the validity of both the cognitive and spirituality scales.
Appendices
Appendix A – Survey Instructions
Appendix B – Background Information
Appendix C – Self-Reported Performance
Appendix D – Spirituality Scale
Appendix E – Psychology Subscale
Appendix F – Quantitative Reasoning Subscale
Appendix G – Verbal Reasoning Subscale
 
Social Psychology Research Project Grading Rubric W18CATEGORY5.docx
Social Psychology Research Project Grading Rubric W18CATEGORY5.docxSocial Psychology Research Project Grading Rubric W18CATEGORY5.docx
Social Psychology Research Project Grading Rubric W18CATEGORY5.docxrosemariebrayshaw
 
1RUNNING HEAD METHODS AND RESULTS1RUNNING HEAD METHODS.docx
1RUNNING HEAD METHODS AND RESULTS1RUNNING HEAD METHODS.docx1RUNNING HEAD METHODS AND RESULTS1RUNNING HEAD METHODS.docx
1RUNNING HEAD METHODS AND RESULTS1RUNNING HEAD METHODS.docxdrennanmicah
 
The Relationship between Psychological Health, Self-Confidence and Locus of C...
The Relationship between Psychological Health, Self-Confidence and Locus of C...The Relationship between Psychological Health, Self-Confidence and Locus of C...
The Relationship between Psychological Health, Self-Confidence and Locus of C...IJERA Editor
 
Annotated BibliographyLeierer, S. J., Blackwell, T. L., Strohmer.docx
Annotated BibliographyLeierer, S. J., Blackwell, T. L., Strohmer.docxAnnotated BibliographyLeierer, S. J., Blackwell, T. L., Strohmer.docx
Annotated BibliographyLeierer, S. J., Blackwell, T. L., Strohmer.docxrossskuddershamus
 
International Journal of Education (IJE)
International Journal of Education (IJE)International Journal of Education (IJE)
International Journal of Education (IJE)ijejournal
 
Development of a sociopathy scale (psychometrics paper)
Development of a sociopathy scale (psychometrics paper)Development of a sociopathy scale (psychometrics paper)
Development of a sociopathy scale (psychometrics paper)sedunham
 
Basic Question Evaluation
Basic Question EvaluationBasic Question Evaluation
Basic Question EvaluationJessica Speer
 
Research design and methods, Dr. William Kritsonis
Research design and methods, Dr. William KritsonisResearch design and methods, Dr. William Kritsonis
Research design and methods, Dr. William KritsonisWilliam Kritsonis
 
Research Design and Methodology, Dr. W.A. Kritsonis
Research Design and Methodology, Dr. W.A. KritsonisResearch Design and Methodology, Dr. W.A. Kritsonis
Research Design and Methodology, Dr. W.A. Kritsonisguestcc1ebaf
 
Being mindful predicts experiencing less emotional problems in school staff: ...
Being mindful predicts experiencing less emotional problems in school staff: ...Being mindful predicts experiencing less emotional problems in school staff: ...
Being mindful predicts experiencing less emotional problems in school staff: ...Manja Veldin
 
PSY 550 Annotated Bibliography Assignment Overview A.docx
 PSY 550 Annotated Bibliography Assignment Overview A.docx PSY 550 Annotated Bibliography Assignment Overview A.docx
PSY 550 Annotated Bibliography Assignment Overview A.docxaryan532920
 

Similar to Brown Glatt Final Project (20)

Use the Capella library to locate two psychology research articles.docx
Use the Capella library to locate two psychology research articles.docxUse the Capella library to locate two psychology research articles.docx
Use the Capella library to locate two psychology research articles.docx
 
A Theory Of Careers And Vocational Choice Based Upon...
A Theory Of Careers And Vocational Choice Based Upon...A Theory Of Careers And Vocational Choice Based Upon...
A Theory Of Careers And Vocational Choice Based Upon...
 
Psy assesment 5.docx
Psy assesment 5.docxPsy assesment 5.docx
Psy assesment 5.docx
 
Running head EMOTIONAL INTELLEGENCE 1 Re.docx
Running head EMOTIONAL INTELLEGENCE  1 Re.docxRunning head EMOTIONAL INTELLEGENCE  1 Re.docx
Running head EMOTIONAL INTELLEGENCE 1 Re.docx
 
JFVA_Vol15No2_WhitmerGrimley_PsychometricEvaluationCompetency
JFVA_Vol15No2_WhitmerGrimley_PsychometricEvaluationCompetencyJFVA_Vol15No2_WhitmerGrimley_PsychometricEvaluationCompetency
JFVA_Vol15No2_WhitmerGrimley_PsychometricEvaluationCompetency
 
JFVA_Vol15No2_WhitmerGrimley_PsychometricEvaluationCompetency
JFVA_Vol15No2_WhitmerGrimley_PsychometricEvaluationCompetencyJFVA_Vol15No2_WhitmerGrimley_PsychometricEvaluationCompetency
JFVA_Vol15No2_WhitmerGrimley_PsychometricEvaluationCompetency
 
Social Psychology Research Project Grading Rubric W18CATEGORY5.docx
Social Psychology Research Project Grading Rubric W18CATEGORY5.docxSocial Psychology Research Project Grading Rubric W18CATEGORY5.docx
Social Psychology Research Project Grading Rubric W18CATEGORY5.docx
 
Social Psychology Research Project Grading Rubric W18CATEGORY5.docx
Social Psychology Research Project Grading Rubric W18CATEGORY5.docxSocial Psychology Research Project Grading Rubric W18CATEGORY5.docx
Social Psychology Research Project Grading Rubric W18CATEGORY5.docx
 
1RUNNING HEAD METHODS AND RESULTS1RUNNING HEAD METHODS.docx
1RUNNING HEAD METHODS AND RESULTS1RUNNING HEAD METHODS.docx1RUNNING HEAD METHODS AND RESULTS1RUNNING HEAD METHODS.docx
1RUNNING HEAD METHODS AND RESULTS1RUNNING HEAD METHODS.docx
 
The Relationship between Psychological Health, Self-Confidence and Locus of C...
The Relationship between Psychological Health, Self-Confidence and Locus of C...The Relationship between Psychological Health, Self-Confidence and Locus of C...
The Relationship between Psychological Health, Self-Confidence and Locus of C...
 
Psychological tests; Introduction and Classifications
Psychological tests; Introduction and ClassificationsPsychological tests; Introduction and Classifications
Psychological tests; Introduction and Classifications
 
RobinsonCV2019
RobinsonCV2019RobinsonCV2019
RobinsonCV2019
 
Annotated BibliographyLeierer, S. J., Blackwell, T. L., Strohmer.docx
Annotated BibliographyLeierer, S. J., Blackwell, T. L., Strohmer.docxAnnotated BibliographyLeierer, S. J., Blackwell, T. L., Strohmer.docx
Annotated BibliographyLeierer, S. J., Blackwell, T. L., Strohmer.docx
 
International Journal of Education (IJE)
International Journal of Education (IJE)International Journal of Education (IJE)
International Journal of Education (IJE)
 
Development of a sociopathy scale (psychometrics paper)
Development of a sociopathy scale (psychometrics paper)Development of a sociopathy scale (psychometrics paper)
Development of a sociopathy scale (psychometrics paper)
 
Basic Question Evaluation
Basic Question EvaluationBasic Question Evaluation
Basic Question Evaluation
 
Research design and methods, Dr. William Kritsonis
Research design and methods, Dr. William KritsonisResearch design and methods, Dr. William Kritsonis
Research design and methods, Dr. William Kritsonis
 
Research Design and Methodology, Dr. W.A. Kritsonis
Research Design and Methodology, Dr. W.A. KritsonisResearch Design and Methodology, Dr. W.A. Kritsonis
Research Design and Methodology, Dr. W.A. Kritsonis
 
Being mindful predicts experiencing less emotional problems in school staff: ...
Being mindful predicts experiencing less emotional problems in school staff: ...Being mindful predicts experiencing less emotional problems in school staff: ...
Being mindful predicts experiencing less emotional problems in school staff: ...
 
PSY 550 Annotated Bibliography Assignment Overview A.docx
 PSY 550 Annotated Bibliography Assignment Overview A.docx PSY 550 Annotated Bibliography Assignment Overview A.docx
PSY 550 Annotated Bibliography Assignment Overview A.docx
 

Brown Glatt Final Project

measuring cognitive reasoning, psychology knowledge and personality. Further analysis should be conducted to ensure the validity of both the cognitive and spirituality scales.
Implementation of the recommendations made in the distractor analysis section should improve internal consistency.

Introduction

As a field of study, psychometrics attempts to conceptualize human behavior and measure the differences between individuals in terms of aptitudes, personality, values, skills, intelligence and attitudes. In addition to the development and refinement of theoretical approaches to measurement, a major component of psychometrics is the construction of instruments and procedures for measuring such constructs. This project provided an opportunity to apply the theoretical components of classical test theory in the development of a series of scales measuring cognitive ability, as well as a separate measure of personality. Through this exercise we gained practical experience in the construction, evaluation and interpretation of psychological tests.

*PAL is an acronym of the authors' names, Paula and Alicia.
Construct Development

The PAL cognitive scale included a 20-item measure of quantitative reasoning, a 20-item measure of verbal reasoning, and a 24-item measure of psychology knowledge. In terms of both format and difficulty, questions were modeled after those used by the Educational Testing Service for assessing readiness for graduate education (i.e., the GRE general test and psychology subject test). Quantitative reasoning, verbal reasoning and psychology knowledge are all multifaceted constructs; therefore, in order to ensure content validity, survey questions were designed to cover a large spectrum of facets. Math questions included algebra, arithmetic, data interpretation and geometry. Verbal questions included analogies, antonyms, reading comprehension and sentence completion. Likewise, questions in the psychology knowledge section were representative of a dozen different subject areas, including neuroscience, clinical/abnormal psychology, and measurement and methodology. In addition to representing a wide variety of constructs, questions were designed to span a wide range of abilities, with easier questions presented earlier and difficulty increasing over the course of the survey.

Unrelated to cognitive reasoning, the spirituality scale is a 20-item personality scale developed to assess an individual's personal spirituality. It used a 5-point Likert scale on which participants responded to statements such as, "Although lacking in material possessions, it is possible to feel fulfilled," and "I believe my life has a purpose." The scale was developed in response to the growing trend of spirituality in the workplace. Spirituality in the workplace and spiritual leadership are growing trends, reflected in the steady increase in the number of books, publications and conferences on the topic over the past 20 years (Imel, 1998).
Experts have pointed to a number of factors behind this trend: the rise in corporate layoffs and downsizing, the decline of traditional support networks, efforts of individuals to find personal fulfillment on the job, the need to reconcile personal values with those of the corporation, the rise of innovative organizational trends such as learning organizations and the quality movement, and corporate desires to help workers achieve more balanced lives (Fry, 2016; Laabs, 1995; Leigh, 1997; McLaughlin, 1998). Anticipating that many survey participants might not be currently employed, the researchers focused on spirituality as a general construct as opposed to spirituality in the workplace.
Test Development

Two researchers were involved in the creation of the survey, which consisted of a total of 100 items. A measure of self-reported performance (Appendix C) had 7 questions; the spirituality scale (Appendix D) had 20 questions; the psychology subscale (Appendix E) had 24 questions; the quantitative reasoning subscale (Appendix F) had 20 questions; and the verbal subscale (Appendix G) had 20 questions. The remaining questions solicited demographic information. Cognitive items had 4 or 5 multiple-choice options (i.e., one correct answer and 3 or 4 distractors, respectively). Spirituality items were structured on a 5-point Likert scale, with 1 point for "strongly disagree" and 5 points for "strongly agree" with the respective statement. Since the scale had 20 items, the maximum possible total was 100 points. Total points were then divided by the number of items answered to obtain an average score between 1 and 5.

Procedure and Sample

Procedure

The researchers solicited participation from their personal social networks of contacts. Survey questions were entered into SurveyMonkey, and the corresponding electronic link was posted to the researchers' Facebook pages. In addition, links were emailed directly to friends, acquaintances and coworkers whom the researchers felt would be amenable to participating. Participants were permitted to use calculators for the quantitative section, as a prohibition would have been unenforceable. Likewise, no time limits were placed on participants, as enforcement was impossible to achieve.

Sample

Among the 45 respondents, ages ranged from 23 to 70, with an average age of 49 (SD = 15). Sixty-seven percent were female, 96% were white, 42% had a Bachelor's degree, 40% had a Master's degree, and 13% had a PhD or equivalent. Respondents reported grade point averages (GPAs) ranging from 2.8 to 4.0, with a mean of 3.62.
Sixteen percent of respondents were in graduate school; the remainder were not students.
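The spirituality scoring rule described under Test Development (sum the 1-5 Likert ratings, then divide by the number of items answered) can be sketched as follows. This is a minimal illustration; the function name and the ratings shown are hypothetical, not actual survey responses:

```python
def spirituality_score(responses):
    """Average a respondent's 5-point Likert ratings (1 = strongly disagree
    ... 5 = strongly agree), skipping unanswered items (None)."""
    answered = [r for r in responses if r is not None]
    return sum(answered) / len(answered)

# A hypothetical respondent who skipped one statement:
ratings = [5, 4, 4, None, 3]
print(spirituality_score(ratings))  # 16 points over 4 answered items -> 4.0
```

Dividing by the number of items answered (rather than a fixed 20) keeps skipped items from dragging a respondent's average toward the low end of the scale.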
Descriptive Statistics of Cognitive Tests

Table 1 provides descriptive statistics for the three subscales of the PAL cognitive test (verbal, quantitative and psychology), as well as a combined total of all three subscales (combined cognitive). Scoring of the cognitive portions of the test consisted of adding a point for each correct answer. Therefore, the range of possible scores was 0-20 for the verbal and quantitative reasoning subscales, 0-24 for psychology knowledge, and 0-64 for the combined cognitive score. Of the three cognitive subscales, quantitative reasoning had the lowest mean (mean = 10.38, SD = 4.27), while verbal reasoning had the highest (mean = 13.56, SD = 3.63).

Cronbach's alpha is a statistic that measures the internal consistency of a scale (i.e., how reliable the test is); values greater than 0.70 are considered acceptable. Notably, the psychology subscale did not meet this threshold and therefore was not considered reliable. The combined cognitive score, verbal reasoning and quantitative reasoning scales all showed acceptable levels of internal consistency.

Table 1 - Descriptive Statistics for Cognitive Subscales and Combined Cognitive Score

Scale                     Items   Min   Max   Mean    SD      α
Verbal Reasoning            20      4    20   13.56   3.63   0.75
Psychology Knowledge        24      5    21   12.67   3.63   0.64
Quantitative Reasoning      20      2    18   10.38   4.27   0.81
Combined Cognitive          64     15    53   36.60   8.01   0.81

N = 45. The combined cognitive subtotal was computed by adding the verbal, quantitative and psychology subscores. Alpha values of 0.70 or higher are considered acceptable.

Table 2 shows the descriptive statistics and internal consistency (α) for psychology knowledge, quantitative reasoning, verbal reasoning and the combined cognitive score, with respondents separated by gender. With the exception of psychology knowledge among females, all scales had acceptable levels of internal consistency.
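Cronbach's alpha can be estimated directly from an item-response matrix as k/(k-1) × (1 - sum of item variances / variance of total scores). A minimal sketch; the respondent data below are hypothetical, not the survey's:

```python
def cronbach_alpha(rows):
    """Cronbach's alpha for internal consistency.
    rows: one list of item scores per respondent."""
    k = len(rows[0])  # number of items

    def var(xs):  # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = sum(var([row[j] for row in rows]) for j in range(k))
    total_var = var([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_vars / total_var)

# Hypothetical 4-respondent, 3-item Likert data with consistent responding:
print(round(cronbach_alpha([[5, 4, 5], [4, 4, 4], [2, 3, 2], [1, 2, 1]]), 2))  # 0.96
```

The same formula applies to the dichotomous (0/1) cognitive items and the 1-5 spirituality items; only the score matrix changes.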
Cohen's d is a statistic that expresses the difference between two group means in standard deviation units. In Table 2, a positive d represents stronger female performance, while a negative d represents stronger male performance. Gender had a small effect on the psychology scale, while the moderate effects of gender on the quantitative (males scored higher) and verbal (females scored higher) reasoning subscales canceled each other out, leaving virtually no gender effect on the combined cognitive scale.
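Cohen's d can be computed from group summary statistics using a pooled standard deviation. The sketch below is an assumed implementation (the report does not state its exact formula); plugging in the verbal reasoning summary statistics from Table 2 reproduces the reported effect up to rounding:

```python
import math

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    return (mean1 - mean2) / pooled_sd

# Female vs. male verbal reasoning, from the Table 2 summary statistics:
d = cohens_d(14.17, 3.52, 30, 12.33, 3.64, 14)
print(round(d, 2))  # ~0.52, matching the reported 0.51 up to rounding
```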
Table 2 - Descriptive Statistics for Cognitive Subscales and Combined Cognitive by Gender

                          Female (N = 30)                   Male (N = 14)
Scale                    Mean    SD    Min  Max    α      Mean     SD    Min  Max    α    Cohen's d
Psychology Knowledge    12.87   3.33    5    21   0.59   12.27   4.27     5    20   0.75     0.16
Quantitative Reasoning   9.60   4.05    3    18   0.79   11.93   4.40     2    18   0.83    -0.55
Verbal Reasoning        14.17   3.52    6    20   0.74   12.33   3.64     4    17   0.75     0.51
Combined Cognitive      36.63   6.82   21    49   0.73   36.53  10.27    15    53   0.89     0.01

N = 45. The combined cognitive subtotal was computed by adding the verbal, quantitative and psychology subscores. Verbal and quantitative subscales had 20 items each; psychology knowledge had 24 items; the combined cognitive score had 64 items. Alpha values of 0.70 or higher are considered acceptable.

Descriptive Statistics for Spirituality, Quantitative, Verbal and Psychology Subscales and Combined Cognitive by Ethnicity

Due to the homogeneity of respondents, the statistical software was unable to provide descriptive statistics by race. Of the 45 test takers, only one was Asian, one was Hispanic, and the remainder (96%) were white.

Table 3 displays the descriptive analysis organized by academic degree. The psychology scale showed inconsistent reliability, with an alpha of only 0.29 for PhD-level respondents but an acceptable value (α = 0.71) at the Master's level. In addition, the verbal reasoning subscale at the Master's level fell below the acceptable threshold. All other alpha values are at or well above the 0.70 threshold of reliability. Regarding the cognitive scales, one might expect respondents with more education to perform better on the survey, but that occurred only within the quantitative subscale: respondents with Bachelor's degrees scored lowest (mean = 10.26, SD = 3.77), followed by those with Master's degrees (mean = 10.50, SD = 4.66), and those with PhDs scored highest (mean = 11.33, SD = 5.47).
For the other cognitive subscales (verbal and psychology) and the combined cognitive test, PhD-level respondents actually had the lowest scores of the three groups. It is important to note that the sample of PhD respondents was markedly smaller than the other groups, and different trends might emerge with a larger sample.
Table 3 - Descriptive Statistics for Quantitative, Verbal and Psychology Subscales and Combined Cognitive by Academic Degree

Bachelors (N = 19):
Facet                      Min     Max     Mean     SD      α
Verbal Reasoning          4.00   20.00    13.42    3.76   0.77
Quantitative Reasoning    2.00   18.00    10.26    3.77   0.74
Psychology Knowledge      5.00   21.00    12.53    3.72   0.68
Combined Cognitive       15.00   46.00    36.21    7.61   0.79

Masters (N = 18):
Facet                      Min     Max     Mean     SD      α
Verbal Reasoning          8.00   18.00    14.00    3.12   0.67
Quantitative Reasoning    3.00   18.00    10.50    4.66   0.85
Psychology Knowledge      5.00   20.00    13.06    4.04   0.71
Combined Cognitive       23.00   53.00    37.56    7.46   0.78

PhD (N = 6):
Facet                      Min     Max     Mean     SD      α
Verbal Reasoning          6.00   18.00    12.50    5.09   0.87
Quantitative Reasoning    5.00   18.00    11.33    5.47   0.90
Psychology Knowledge     10.00   15.00    12.17    2.48   0.29
Combined Cognitive       21.00   49.00    36.00   12.59   0.93

Total N = 45. Internal consistency values of 0.70 or higher indicate acceptable reliability. Verbal and quantitative reasoning subscales had 20 items each; psychology knowledge had 24 items. The combined cognitive score, calculated by adding the scores of the three subscales, had 64 items.

Correlative and Validity Analysis of Cognitive Scales

A test is said to have "construct validity" if it accurately measures a theoretical, non-observable construct or trait: in our case, an individual's aptitude in the field of psychology, as well as verbal and quantitative reasoning. One method of establishing a test's construct validity is convergent/divergent validation. A test has convergent validity if it correlates highly with another measure of a similar construct, or with a construct it would be expected to mirror; in our case, we compared the subscales and the combined cognitive test with self-reported GPAs. By contrast, a test's divergent validity is demonstrated through a low correlation with a test that we would expect to measure a different construct. In our case, we compared the
subscales and the combined cognitive test with individuals' scores on the spirituality scale. Evidence of divergent validity is established by a low correlation coefficient.

Table 4 shows the correlations among the subscales, the combined cognitive test, GPA and scores on the spirituality scale. To establish convergent validity, we are looking for a high correlation between the cognitive scales and GPA. Unfortunately, the data do not support this: the correlation between verbal reasoning and GPA (r = -0.01) is almost nonexistent. Among the cognitive scores, GPA correlates most highly with the psychology subscale (r = 0.12), but the effect size is still considered small. As for the test's divergent validity, we are looking for a low correlation between the cognitive scales and the spirituality scale. While we do observe a very low correlation between spirituality and the combined cognitive score (r = 0.04), the correlations between each of the subscales and the spirituality scale are higher than their correlations with GPA.

Table 4 - Correlations Between the Spirituality Scale, the Total Cognitive Test, All Three Subscales and College GPA

                               1      2      3      4      5
1 - Spirituality              ---
2 - Verbal Reasoning          0.15   ---
3 - Quantitative Reasoning   -0.26   0.28   ---
4 - Psychology Knowledge      0.24   0.28   0.12   ---
5 - Combined Cognitive        0.04   0.73*  0.71*  0.64*  ---
6 - GPA                       0.16  -0.01  -0.07   0.12   0.02

N = 45. * Correlation is significant at the 0.001 level (2-tailed).

Table 5 examines the correlations between the cognitive subscales and the transformed scores for each subscale. Because the transformed scores are linear transformations of the raw scores, each of the three subscales correlates perfectly with its own Z-score, IQ score and T-score: the transformation changes the scale of the scores but not their standing relative to one another.
Table 5 - Correlations Between Verbal, Math and Psychology Subscales and the Transformed T-scores, Z-scores and IQ Scores

Each raw subscale score correlated 1.00 with its own Z-score, IQ score and T-score, and every transformed score reproduced its raw score's correlations with the other scales exactly. The full 16-variable matrix therefore reduces to the correlations among the four raw scores:

                               1      2      3      4
1 - Verbal Reasoning          ---
2 - Quantitative Reasoning    0.28   ---
3 - Psychology Knowledge      0.28   0.12   ---
4 - Combined Cognitive        0.73*  0.71*  0.64*  ---

N = 45. * Correlation is significant at the 0.01 level. Each raw score and its Z-, IQ- and T-score transforms correlate 1.00 with one another.
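The standard score transformations are linear: Z = (x - mean)/SD, T = 50 + 10Z, and the deviation-IQ metric is 100 + 15Z. A sketch with hypothetical raw scores, verifying that such transformations leave correlations unchanged:

```python
import math

def z_scores(xs):
    """Standardize a list of scores to mean 0, SD 1 (sample SD)."""
    mean = sum(xs) / len(xs)
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (len(xs) - 1))
    return [(x - mean) / sd for x in xs]

def pearson(xs, ys):
    """Pearson product-moment correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

raw = [4, 9, 13, 16, 20]        # hypothetical verbal raw scores
z = z_scores(raw)               # mean 0, SD 1
t = [50 + 10 * v for v in z]    # T-scores: mean 50, SD 10
iq = [100 + 15 * v for v in z]  # deviation-IQ metric: mean 100, SD 15

# A raw score correlates 1.00 with any increasing linear transform of itself:
print(round(pearson(raw, t), 6), round(pearson(raw, iq), 6))  # 1.0 1.0
```

This is why Table 5 is fully redundant: the transforms re-express each score's position without changing its relative standing.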
Item Analysis of Cognitive Test and Its Facets

Despite the cognitive subscales' promising internal consistency values, it nevertheless is important to perform an item analysis to ensure the quality of each item. Two statistics that help us do this are p-values and CITCs. P-values range from 0 to 1 and represent the difficulty of a test item: values < .30 are considered too hard (i.e., less than 30% of respondents answered correctly) and values > .80 are considered too easy (i.e., less than 20% answered incorrectly). CITC stands for "corrected item-total correlation" and measures how well an item discriminates between respondents who are knowledgeable in the content area and those who are not. CITCs can range from -1 to 1, and test items with a CITC < .20 are considered poor differentiators and should be considered for removal.

Table 6 - Difficulty and Discrimination of Quantitative Reasoning, Verbal Reasoning and Psychology Knowledge

            Math               Verbal             Psychology
Item      p      CITC        p      CITC        p      CITC
1        0.53    0.27       0.49   -0.01       0.89    0.17
2        0.82    0.38       0.47    0.47       0.38    0.35
3        0.42    0.39       0.91    0.25       0.27    0.05
4        0.29    0.37       0.44    0.29       0.20    0.04
5        0.71    0.52       0.96    0.27       0.38    0.08
6        0.40    0.27       0.78    0.53       0.71   -0.01
7        0.44    0.36       0.67    0.35       0.87    0.11
8        0.38    0.37       0.91    0.31       0.42    0.18
9        0.24    0.14       0.76   -0.17       0.22    0.22
10       0.60   -0.04       0.69    0.31       0.73    0.26
11       0.69    0.36       0.78    0.44       0.69    0.38
12       0.38    0.42       0.87    0.51       0.56    0.20
13       0.60    0.31       0.73    0.16       0.62    0.09
14       0.07   -0.21       0.73    0.27       0.87    0.05
15       0.38    0.17       0.36    0.10       0.64    0.10
16       0.84    0.20       0.51    0.30       0.64   -0.04
17       0.69    0.38       0.31    0.03       0.53    0.38
18       0.87    0.04       0.78    0.23       0.42    0.09
19       0.44    0.44       0.71    0.32       0.29    0.14
20       0.58    0.33       0.71    0.36       0.60    0.23

N = 45. Difficulty (p) is the proportion of correct responses. Items that are too easy (p > .80) or too hard (p < .30) are flagged, as are items with a CITC < .20. Items for which both statistics fall outside the recommended ranges received further scrutiny.

Each item from the cognitive scales is represented in Table 6, along with its p-value and CITC. Math item 14 was the most difficult (p = .07) and verbal item 5 the least difficult (p = .96); both fall outside the threshold values. Math item 14 also had a corrected item-total correlation of -.21, meaning that participants who performed poorly on the test as a whole had a higher rate of success on this question. Conversely, item 2 on the verbal subscale has favorable values: a CITC of .47 assures us it is a good discriminator, and a p-value of .47 tells us that nearly as many individuals in our sample answered the item correctly as answered it incorrectly. Items for which both statistics fall outside the recommended ranges will undergo further scrutiny through a distractor analysis and may be removed from the survey. The verbal reasoning section of the test did not have any questions flagged on both criteria, while the psychology subscale had the greatest number of flagged questions (6).

Quality test items will correlate with their own composite score to a greater degree than with the score of another subscale. Failure to meet this qualification is an indication that the question may be a better predictor of a construct other than the one it is meant to measure.
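Both statistics in Table 6 can be computed directly from a matrix of scored responses (1 = correct, 0 = incorrect). A sketch with hypothetical data; the "corrected" in CITC means the item is excluded from the total it is correlated against:

```python
import math

def pearson(xs, ys):
    """Pearson product-moment correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def item_stats(scored, j):
    """Difficulty (p) and corrected item-total correlation (CITC) for item j.
    scored: one row per respondent, 1 = correct, 0 = incorrect."""
    item = [row[j] for row in scored]
    p = sum(item) / len(item)                     # proportion answering correctly
    rest = [sum(row) - row[j] for row in scored]  # total score excluding item j
    return p, pearson(item, rest)

# Hypothetical 4-respondent, 3-item scored matrix:
scored = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
p, citc = item_stats(scored, 0)
print(round(p, 2), round(citc, 2))  # 0.75 0.52
```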
Tables 7a, 7b and 7c present the correlations between each test question and the subscale composite scores.

Table 7a - Correlations Between Psychology Items and Composite Scores

Item    Psychology (CITC)    Verbal    Quantitative
1             0.17            0.17        -0.07
2             0.35            0.24         0.17
3             0.05            0.14        -0.27
4             0.04           -0.09        -0.19
5             0.08           -0.02        -0.16
6            -0.01           -0.09         0.02
7             0.11            0.19        -0.14
8             0.18            0.11         0.06
9             0.22            0.11        -0.16
10            0.26            0.26         0.03
11            0.38            0.31         0.12
12            0.20            0.00         0.25
13            0.09           -0.02         0.04
14            0.05           -0.05         0.13
15            0.10            0.15        -0.02
16           -0.04           -0.17         0.15
17            0.38            0.29         0.34
18            0.09           -0.02         0.17
19            0.14            0.11         0.04
20            0.23            0.24         0.11

N = 45. Items that correlate more highly with a subscale other than the psychology subscale are problematic.

Table 7a indicates psychology items 3, 6, 7, 12, 14, 15, 16, 18 and 20 are problematic because they correlate more highly with the quantitative or verbal composite than with their own composite. All other items correlate most strongly with the psychology composite. Table 7b indicates verbal items 1, 5, 8, 9, 15 and 18 are problematic because they correlate more highly with the quantitative or psychology composite than with their own composite. All other items correlate most strongly with the verbal composite.

Table 7b - Correlations Between Verbal Reasoning Items and Composite Scores

Item    Verbal (CITC)    Psychology    Quantitative
1           -0.01           -0.35           0.28
2            0.47            0.28           0.29
3            0.25            0.19           0.04
4            0.29            0.04           0.03
5            0.27            0.30           0.13
6            0.53            0.30           0.32
7            0.35            0.32           0.07
8            0.31            0.32          -0.03
9           -0.17           -0.12          -0.24
10           0.31            0.13           0.10
11           0.44            0.16           0.34
12           0.51            0.19           0.38
13           0.16            0.01           0.15
14           0.27            0.15           0.00
15           0.10            0.27          -0.23
16           0.30            0.09           0.23
17           0.03            0.00          -0.07
18           0.23            0.02           0.26
19           0.32            0.12           0.17
20           0.36            0.20           0.17

N = 45. Items that correlate more highly with a subscale other than the verbal subscale are problematic.

Table 7c indicates quantitative items 9, 10 and 14 are problematic because they correlate more highly with the verbal or psychology composite than with their own composite. All other items correlate most strongly with the quantitative composite.
Table 7c - Correlations Between Quantitative Reasoning Items And Composite Scores
Quantitative Item Number   Quantitative Reasoning (CITCs)   Verbal   Psychology
1     0.27    0.17   -0.06
2     0.38    0.28    0.07
3     0.39    0.17   -0.01
4     0.37    0.26    0.11
5     0.52    0.37    0.25
6     0.27    0.14   -0.03
7     0.36    0.19    0.12
8     0.37    0.05    0.05
9     0.14    0.19   -0.08
10   -0.04    0.01    0.08
11    0.36    0.05    0.08
12    0.42    0.25    0.11
13    0.31   -0.01    0.06
14   -0.21   -0.29    0.17
15    0.17   -0.15    0.11
16    0.20    0.15    0.03
17    0.38    0.18   -0.01
18    0.04    0.01   -0.20
19    0.44    0.26    0.18
20    0.33    0.16    0.02
N = 45. Bolded items represent the highest correlation for each item. Blue-shaded items are problematic because they correlate more highly with subscales other than the quantitative subscale.

Summary Of Problematic Cognitive Test Items

A summary of the problematic items, including why each item was considered for removal or alteration, is presented in Table 8. Test items that fell outside the acceptable difficulty range (.30 < p < .80) and also failed the discrimination criterion (CITC < .20) were automatically deleted. Falling into this category were items 1, 3, 4, 7, 14, and 19 from the psychology subscale and items 9, 14, and 18 from the math subscale. None of the items in the verbal subscale failed both criteria. Items that failed only one of those criteria and/or displayed a higher correlation with a subscale other than their own were also considered for distractor analysis, the next step in evaluating item integrity. Falling into this category were items 2, 10, 15, and 16 from the math subscale, items 6, 15, and 16 from the psychology subscale, and items 3, 5, and 8 from the verbal subscale.
Table 8 - Summary Table Of Problematic Cognitive Test Items
Test Item       p-value   CITC   Reasons For Concern                        Was Item Removed From Survey?
Psychology 1    .89        .17   p, CITC                                    Yes
Psychology 3    .27        .05   p, high correlation with verbal            Yes
Psychology 4    .20        .04   p, CITC                                    Yes
Psychology 5    .38        .08   CITC                                       No
Psychology 6    .71       -.01   CITC, high correlation with math           No
Psychology 7    .87        .11   p, CITC, high correlation with verbal      Yes
Psychology 8    .42        .18   CITC                                       No
Psychology 9    .22        .22   p                                          No
Psychology 12   .56        .20   High correlation with math                 No
Psychology 13   .62        .09   CITC                                       No
Psychology 14   .87        .05   p, CITC, high correlation with math        Yes
Psychology 15   .64        .10   CITC, high correlation with verbal         No
Psychology 16   .64       -.04   CITC, high correlation with math           No
Psychology 18   .42        .09   CITC, high correlation with math           No
Psychology 19   .29        .14   p, CITC                                    Yes
Psychology 20   .60        .23   High correlation with verbal               No
Math 2          .82        .38   p                                          No
Math 4          .29        .37   p                                          No
Math 9          .24        .14   p, CITC, high correlation with verbal      Yes
Math 10         .60       -.04   CITC, high correlation with psychology     No
Math 14         .07       -.21   p, CITC, high correlation with psychology  Yes
Math 15         .38        .17   CITC                                       No
Math 16         .84        .20   p                                          No
Math 18         .87        .04   p, CITC                                    Yes
Verbal 1        .49       -.01   CITC, high correlation with math           No
Verbal 3        .91        .25   p                                          No
Verbal 5        .96        .27   p, high correlation with psychology        No
Verbal 8        .91        .32   p, high correlation with psychology        No
Verbal 9        .76       -.17   CITC, high correlation with psychology     No
Verbal 12       .87        .51   p                                          No
Verbal 13       .73        .16   CITC                                       No
Verbal 15       .36        .10   CITC, high correlation with psychology     No
Verbal 17       .31        .03   CITC                                       No
Verbal 18       .78        .23   High correlation with math                 No
The psychology subscale is composed of 24 items; the verbal and quantitative subscales each have 20 items.
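The removal rule summarized in Table 8 can be expressed as a small decision function. The sketch below assumes the report's thresholds (.30 ≤ p ≤ .80, CITC ≥ .20); the function name is ours, not part of the survey materials.

```python
def classify(p, citc, cross_loads_higher=False):
    """Apply the report's removal rule to one cognitive item.

    'remove'  - fails both the difficulty and the discrimination criteria
    'review'  - fails one criterion, or loads more highly on another
                subscale (candidate for distractor analysis)
    'keep'    - passes everything
    """
    p_fail = not (0.30 <= p <= 0.80)
    citc_fail = citc < 0.20
    if p_fail and citc_fail:
        return "remove"
    if p_fail or citc_fail or cross_loads_higher:
        return "review"
    return "keep"

# Examples using values reported in Table 8:
print(classify(.89, .17))              # psychology item 1 -> removed
print(classify(.38, .08))              # psychology item 5 -> review only
print(classify(.60, -.04, True))       # math item 10 -> review only
print(classify(.47, .47))              # verbal item 2 -> retained
```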
Distractor Analysis Of Cognitive Scale

As required by the assignment, distractor analyses were conducted on three problematic items from each subscale to determine whether each item should be removed or altered. In a distractor analysis, the average composite score of the individuals who select a particular response to an item is calculated to examine two properties of a good item:
1. An even distribution of participants across the incorrect responses (distractors).
2. A higher mean subscale score for those selecting the correct response, indicating that the question correctly discriminates between high and low performers on that particular subscale.

Psychology Knowledge Item Analysis

Psychology knowledge items 1, 3 and 4 were selected for analysis and removal.

Psychology Item 1

The distractor analysis of psychology Item 1 is represented in Table 9a. Table 8 indicates the item had a low difficulty level, with a p-value of .89, and also a low CITC score of .17. The low CITC score indicates that those with a strong performance on the psychology knowledge test were not statistically more successful at this item.

Table 9a - Psychology Score Of Respondents Selecting Each Option For Item 1
Response Choice   Psychology Reasoning Score   N    %
A                 5                            1    2.22%
B                 9.5                          2    4.44%
C                 11                           1    2.22%
D                 9                            1    2.22%
E                 13.15                        40   88.89%
N = 45. Correct response is E.

Table 9a indicates a very high percentage of participants selected the correct answer, E (88.89%). The combination of information in Table 8 and Table 9a provides a valid reason to eliminate and replace Item 1. The replacement should be moderately easy due to its placement in the test, but more difficult than the original question.
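The distractor analysis described above reduces to grouping composite scores by the response each participant selected. A minimal sketch with hypothetical responses and scores:

```python
from collections import defaultdict

def distractor_table(choices, scores):
    """Mean composite score, count, and percentage per response option,
    i.e. the quantities shown in Tables 9a-9i."""
    by_option = defaultdict(list)
    for choice, score in zip(choices, scores):
        by_option[choice].append(score)
    n = len(choices)
    return {opt: (sum(vals) / len(vals), len(vals), 100 * len(vals) / n)
            for opt, vals in sorted(by_option.items())}

# Hypothetical: four participants' answers to one item, with their
# subscale composite scores.
choices = ["E", "E", "A", "E"]
scores = [13, 15, 5, 14]
for opt, (mean, count, pct) in distractor_table(choices, scores).items():
    print(f"{opt}: mean={mean:.2f} n={count} ({pct:.1f}%)")
```

A good item would show the highest mean for the keyed response and a roughly even spread of respondents across the distractors.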
Psychology Item 3

The distractor analysis of Item 3 is represented in Table 9b. Table 8 indicates the item had a high difficulty, with a p-value of .27, and also had a low CITC score of .05. The low CITC score indicates that participants who performed well on the psychology knowledge test did not statistically perform well on this item.

Table 9b - Psychology Score Of Respondents Selecting Each Option For Item 3
Response Choice   Psychology Reasoning Score   N    %
A                 9.88                         8    17.78%
B                 115.42                       12   26.67%
C                 10.86                        7    15.56%
D                 11.88                        8    17.78%
E                 13.50                        10   22.22%
N = 45. Correct response is B.

As Table 9b demonstrates, the highest percentage of participants selected the correct answer, B (26.67%), which is ideal. Furthermore, the incorrect answers are relatively evenly distributed among the distractors, which is also ideal. But because the CITC score is so low, the item warrants removal and replacement with an item that better discriminates between high and low performers.

Psychology Item 4

The distractor analysis of Item 4 is represented in Table 9c. Table 8 indicates the item had a high difficulty, with a p-value of .20, and also had a very low CITC score of .04. The low CITC score indicates that participants who performed well on the psychology knowledge test did not statistically perform well on this item.

Table 9c - Psychology Score Of Respondents Selecting Each Option For Item 4
Response Choice   Psychology Reasoning Score   N    %
A                 11                           7    15.56%
B                 9.75                         8    17.78%
C                 11.88                        8    17.78%
D                 16.56                        9    20%
E                 13.15                        13   28.89%
N = 45. Correct response is D.
Table 9c displays a lower percentage of participants selecting the correct answer, D (20%), and a larger percentage selecting answer E (28.89%). The highest percentage should be the correct answer, and the responses should have a more even distribution. The combination of information in Table 9c and Table 8 provides a valid reason to eliminate and replace Item 4. The replacement should be of moderate difficulty due to its placement in the test.

Verbal Reasoning Item Analysis

Verbal reasoning items 5, 8 and 9 were selected for analysis.

Verbal Item 5

The distractor analysis of Item 5 is represented in Table 9d. Table 8 indicates the item had a low difficulty, with a p-value of .96, and an acceptable CITC score of .27. The acceptable CITC score indicates that those who did well on this item also did well on the rest of the verbal reasoning section.

Table 9d - Verbal Reasoning Score Of Respondents Selecting Each Option For Item 5
Response Choice   Verbal Reasoning Score   N    %
A                 10.5                     2    4.44%
C                 13.70                    43   95.56%
N = 45. Correct response is C.

Table 9d displays a very high percentage of participants selecting the correct answer, C (95.56%). No one answered B, D or E, and only two participants answered A. The responses should have a much better distribution. The combination of information in Table 9d and Table 8 provides a valid reason to revise Item 5. The question is very easy; therefore, the revision should include more appealing distractors. Item 5 has been rewritten:

5. Love and beauty are best described as __________ concepts of the mind.
A) Physical
B) Concrete
C) Psychology
D) Factual
Verbal Item 8

The distractor analysis of Item 8 is represented in Table 9e. Table 8 indicates the item had a low difficulty, with a p-value of .91, and an acceptable CITC score of .32. The acceptable CITC score indicates that those who did well on this item also did well on the rest of the verbal reasoning section.

Table 9e - Verbal Reasoning Score Of Respondents Selecting Each Option For Item 8
Response Choice   Verbal Reasoning Score   N    %
A                 9.33                     3    6.67%
C                 14                       41   91.11%
D                 8                        1    2.22%
N = 45. Correct response is C.

Table 9e displays a very high percentage of participants selecting the correct answer, C (91.11%). None of the participants answered B or E, and only one participant answered D. The responses should have a much better distribution. The combination of information in Table 9e and Table 8 provides a valid reason to revise Item 8. Item 8 has been rewritten below:

8. It is our view that as a result of intercession from foreign stakeholders our company’s financial difficulties have lamentably been ___________.
A) Curtailed
B) Ameliorated
C) Exacerbated
D) Mitigated

Verbal Item 9

The distractor analysis of Item 9 is represented in Table 9f. Table 8 indicates the item had an acceptable p-value of .76 but a negative CITC score of -.17. The negative CITC score indicates that those with a strong performance on the verbal reasoning test were not statistically more successful at this item.
Table 9f - Verbal Reasoning Score Of Respondents Selecting Each Option For Item 9
Response Choice   Verbal Reasoning Score   N    %
A                 14                       4    8.89%
B                 14.67                    3    6.67%
C                 11.75                    4    8.89%
D                 13.62                    34   75.56%
N = 45. Correct response is D.

Table 9f displays a high percentage of participants selecting the correct answer, D (75.56%). The highest percentage should be the correct answer, but the responses should have a much better distribution. The mean of the correct response (13.62) is lower than that of responses A (14) and B (14.67). This indicates the question does not discriminate between high and low performers on the verbal reasoning subscale, which is why the CITC score is negative (-.17). The item may be too easy, with high performers overthinking the answer. The combination of information in Table 9f and Table 8 provides a valid reason to rewrite Item 9. Item 9 has been rewritten below to make the other options more appealing.

9. Finish the analogy: Grain : Rice ::
A. Cube : Ice
B. Rain : Precipitation
C. Cow : Milk
D. Flake : Snow

Quantitative Reasoning Item Analysis

Quantitative reasoning items 9, 14 and 18 were selected for analysis and removal.

Quantitative Item 9

The distractor analysis of Item 9 is represented in Table 9g. Table 8 indicates the item had a high difficulty, with a p-value of .24, and a low CITC score of .14. The low CITC score indicates that those with a strong performance on the quantitative test were not statistically more successful at this item.
Table 9g - Quantitative Reasoning Score Of Respondents Selecting Each Option For Item 9
Response Choice   Quantitative Score   N    %
A                 7.75                 4    9.52%
B                 7.86                 7    16.67%
C                 12.09                11   26.19%
D                 10.77                13   30.95%
E                 12.29                7    16.67%
N = 42. Correct response is C.

As Table 9g displays, a higher percentage of participants selected answer D (30.95%) than the correct answer, C (26.19%). The correct answer should have the highest percentage of respondents. The mean of the correct response (12.09) is also slightly lower than that of response E (12.29). This indicates the question does not discriminate between high and low performers on the quantitative reasoning subscale, which is why the CITC score is low (.14). The item may be ambiguous, with high performers overthinking the answer. The combination of information in Table 8 and Table 9g provides a valid reason to eliminate and replace Item 9.

Quantitative Item 14

The distractor analysis of Item 14 is represented in Table 9h. Table 8 indicates the item had a high difficulty, with a p-value of .07, and also had a negative CITC score of -.21. The negative CITC score indicates that those with a strong performance on the quantitative test were not statistically more successful at this item.

Table 9h - Quantitative Reasoning Score Of Respondents Selecting Each Option For Item 14
Response Choice   Quantitative Score   N    %
A                 3                    1    2.38%
B                 2                    1    2.38%
C                 7.75                 4    9.52%
D                 7                    3    7.14%
E                 11.85                33   78.57%
N = 42. Correct response is D.

Table 9h displays a low percentage of participants selecting the correct answer, D (7.14%), and a very large percentage selecting answer E (78.57%). The highest percentage should be the correct answer. The distribution is also unequal across the incorrect choices. The mean of the correct answer (7) should be higher than the other responses. The lower mean
indicates the question does not discriminate between high and low performers on the quantitative subscale. The combination of information in Table 8 and Table 9h provides a valid reason to eliminate or replace Item 14. The replacement question should be of moderate difficulty because of the placement of the item.

Quantitative Item 18

The distractor analysis of Item 18 is represented in Table 9i. Table 8 indicates the item had a low difficulty, with a p-value of .87, and also had a low CITC score of .04. The low CITC score indicates that those with a strong performance on the quantitative test were not statistically more successful at this item.

Table 9i - Quantitative Reasoning Score Of Respondents Selecting Each Option For Item 18
Response Choice   Quantitative Score   N    %
A                 10.85                39   86.67%
B                 3                    1    2.22%
C                 12                   2    4.44%
D                 5.5                  2    4.44%
E                 6                    1    2.22%
N = 45. Correct response is A.

Table 9i displays a high percentage of participants selecting the correct answer, A (86.67%). The highest percentage should be the correct answer, but the responses should be more evenly distributed across the incorrect choices. The mean of the correct answer (10.85) should be higher than the mean of answer C (12). The lower mean indicates the question does not discriminate between high and low performers on the quantitative subscale.

Revised Cognitive Scale

As a result of our item and distractor analyses, the following items were eliminated: psychology items 1, 3, 4, 7, 14 and 19 and quantitative items 9, 14 and 18. No verbal reasoning items were removed. The combined score is a composite of all three subscales, so its number of items decreased by 9. Removing problematic items can alter a scale's reliability; therefore, another analysis was required. The results shown in Table 10 indicate that the removal of the problematic items decreased the reliability of the psychology scale and left the verbal scale unchanged.
The quantitative reasoning and combined cognitive scales improved slightly beyond their already acceptable levels, meaning the internal consistency of these two measures improved.
Table 10 - Descriptive Statistics For Cognitive Scale Corrected Totals
Scale                        Number of Items   Minimum Score   Maximum Score   Mean    Standard Deviation   Internal Consistency (α)
Verbal Reasoning (Old)       20                4               20              13.56   3.63                 .75
Verbal Reasoning (New)       20                4               20              13.56   3.63                 .75
Quantitative Reasoning (Old) 20                2               18              10.38   4.27                 .81
Quantitative Reasoning (New) 17                1               17              9.20    4.15                 .82
Psychology Knowledge (Old)   24                5               21              12.67   3.63                 .64
Psychology Knowledge (New)   18                3               16              9.29    2.94                 .54
Combined Cognitive (Old)     64                15              53              36.60   8.01                 .81
Combined Cognitive (New)     55                10              47              32.04   7.70                 .82
N = 45. New acceptable internal consistencies (> .70) are bolded. "New" indicates the problem items have been removed: psychology items 1, 3, 4, 7, 14, 19 and quantitative items 9, 14, 18. The new combined cognitive subtotal was computed by adding the corrected verbal, psychology and math subscales and correcting for guessing.

Descriptive Statistics For Spirituality Scale

Scoring of the spirituality portion of the test consisted of adding 1 point for each statement to which an individual responded "strongly disagree", 2 points for "slightly disagree", 3 points for "neither agree nor disagree", 4 points for "slightly agree", and 5 points for "strongly agree." The total was then divided by the number of questions on the scale to obtain an individual average. An individual's score on the spirituality scale could therefore range from 1 to 5, with higher scores indicating higher degrees of spirituality.

Table 11 provides descriptive statistics for the overall spirituality scale, as well as broken down by gender. The scores obtained by our sample ranged from 1.95 to 4.90, with a mean of 3.60 and a standard deviation of .57. Women reported higher levels of spirituality (Mean = 3.79; SD = .49) than men (Mean = 3.23; SD = .55). Cohen's d is a statistic that expresses the difference between two group means in standard deviation units. A positive d in Table 11 represents a stronger female performance, while a negative d represents a stronger male performance. There is a statistically large effect of gender on the spirituality scale.
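Cohen's d can be computed directly from the group statistics reported above. The sketch below uses the pooled-standard-deviation variant, which is an assumption on our part since the report does not state which formula it used; small differences from the reported 1.08 arise from rounding of the inputs.

```python
import math

def cohens_d(m1, s1, n1, m2, s2, n2):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)

# Female (n = 30) vs. male (n = 15) spirituality statistics from the report.
d = cohens_d(3.79, 0.49, 30, 3.23, 0.55, 15)
print(round(d, 2))  # close to the reported 1.08 (a large effect)
```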
Table 11 - Descriptive Statistics For Spirituality Scale By Gender
                       Overall (N = 45)   Female (N = 30)   Male (N = 15)
Mean                   3.60               3.79              3.23
Standard Deviation     .57                .49               .55
Minimum Score          1.95               2.85              1.95
Maximum Score          4.90               4.90              3.95
Internal Consistency   .90                .86               .91
Cohen's d              1.08 (large effect)
Total N = 45. The bolded internal consistency indicates an acceptable reliability measure > .70. The spirituality scale was composed of 20 items.

Cronbach's alpha is a statistic that measures the internal consistency of a scale (i.e., how reliable the test is); values greater than .70 are considered acceptable. At .90 overall, the spirituality scale has a remarkably high level of internal consistency. Although the reliability of the spirituality scale is higher for men (α = .91) than for women (α = .86), the female score is still well above the .70 threshold of acceptability.

Table 12 displays descriptive statistics organized by academic degree. The spirituality scale has impressive internal consistency at all academic levels. There were no discernible trends relating scores on the spirituality scale to the academic degree of the respondents.

Table 12 - Descriptive Statistics For Spirituality Scale By Academic Degree
Education Level             Minimum Score   Maximum Score   Mean   Standard Deviation   Alpha (α)
Bachelors (N = 19)          1.95            4.90            3.63   0.71                 .93
Masters (N = 18)            2.63            4.25            3.50   0.42                 .80
PhD or equivalent (N = 6)   3.20            4.35            3.66   0.52                 .88
Total N = 45. The bolded internal consistency indicates an acceptable reliability measure > .70. The spirituality scale had 20 items.

Descriptive Statistics For Spirituality Scale By Ethnicity

Due to the homogeneity of respondents, the statistical software was unable to provide descriptive statistics by race. Out of 45 test takers, only one was Asian, one was Hispanic, and the remainder (96%) were white.
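The Cronbach's alpha values reported above compare the sum of the individual item variances with the variance of the total score. A minimal sketch of the computation, using synthetic data for illustration:

```python
def cronbach_alpha(items):
    """items: list of per-item response lists (respondents in the same order).
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))."""
    k = len(items)
    n = len(items[0])

    def var(xs):
        # Population variance; sample variance works too since the
        # n/(n-1) factors cancel in the ratio.
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    totals = [sum(item[i] for item in items) for i in range(n)]
    return (k / (k - 1)) * (1 - sum(var(it) for it in items) / var(totals))

# Three perfectly parallel items: alpha is exactly 1.0.
item = [1, 2, 3, 4, 5]
print(cronbach_alpha([item, item, item]))
```

On real Likert data the items would differ, pulling alpha below 1; values above the .70 threshold indicate acceptable internal consistency.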
Validity And Correlation Analysis For Spirituality Scale

The spirituality score was correlated with the combined cognitive test and all three of its subscales; the results are presented in Table 13. There is a negative correlation between spirituality and quantitative reasoning (r = -0.26), which represents a small to medium effect. Conversely, there
is a positive correlation between spirituality and the remainder of the scales: a small correlation between spirituality and verbal reasoning (r = 0.15), a small to medium correlation between spirituality and psychology knowledge (r = 0.24), and a very small correlation between spirituality and the combined cognitive scale (r = 0.04).

Table 13 - Correlations Between The Spirituality Scale, The Total Cognitive Test And All Three Subscales
                        Verbal Reasoning   Quantitative Reasoning   Psychology Knowledge   Combined Cognitive
Spirituality            0.15               -0.26                    0.24                   0.04
Nature Of Effect Size   Small              Small/Medium             Small/Medium           Very Small
N = 45. None of the correlations was statistically significant.

Overall Analysis Of The Spirituality Scale

To test whether the spirituality scale was consistently measuring a single construct, a reliability analysis was conducted. Again, the acceptable value of internal consistency is above .70, which indicates that the items of a scale are measuring a common construct. Reliability analysis of the spirituality scale resulted in an internal consistency of α = .90. This indicates that the spirituality scale is a reliable measure of the spirituality construct. Even though it had a high internal consistency, further analyses were conducted at the item level of the spirituality scale, as specified by the class assignment.

Item Analysis Of The Spirituality Scale

Table 14 shows that all of the items in the spirituality scale had acceptable CITC scores, meaning respondents who scored high on an item were likely to score highly on the spirituality survey as a whole. Item 1 (CITC = .22) and Item 4 (CITC = .21) had the lowest CITC scores. No items fell below the CITC threshold of .20, so no item was endorsed highly by respondents who otherwise scored low on the spirituality scale.
Table 14 - Descriptive Statistics And Discrimination Index (CITC) For The Spirituality Scale
Item Number   Mean Score   Standard Deviation   CITC
1             4.26         0.54                 0.22
2             4.36         0.53                 0.36
3             4.17         0.73                 0.47
4             3.88         0.77                 0.21
5             4.31         0.78                 0.57
6             3.29         0.86                 0.65
7             3.74         0.83                 0.54
8             4.12         1.09                 0.61
9             3.33         1.30                 0.79
10            3.14         1.05                 0.37
11            3.81         1.07                 0.68
12            3.21         1.09                 0.84
13            4.02         0.78                 0.32
14            3.62         1.04                 0.61
15            4.07         0.84                 0.60
16            3.88         0.71                 0.34
17            2.31         1.16                 0.35
18            2.45         1.27                 0.71
19            3.60         1.13                 0.23
20            2.60         1.29                 0.76
N = 45. Bolded CITCs indicate items with the lowest (but still acceptable) CITC scores.

To satisfy the requirements of the assignment, further analysis was conducted to determine the relationship between the individual items on the spirituality scale and the other facets of the PAL survey. A good item correlates more highly with its own facet than with any other facet; an item that correlates more highly with another facet is a poor item. Table 15 presents the results of this analysis. Only Items 1 and 4 correlate more highly with facets other than their own.
Table 15 - Correlations Between Spirituality And Cognitive Scores
Item Number   Spirituality CITC   Verbal   Quantitative   Psychology
1             0.22                0.34     0.13           0.05
2             0.36               -0.05    -0.16          -0.13
3             0.47                0.00    -0.09           0.03
4             0.21                0.28     0.04           0.26
5             0.57                0.13    -0.05           0.04
6             0.65               -0.04    -0.11           0.27
7             0.54                0.22    -0.11           0.15
8             0.61                0.09    -0.05           0.17
9             0.79               -0.04    -0.34           0.08
10            0.37               -0.12    -0.17           0.14
11            0.68                0.18    -0.32           0.18
12            0.84                0.29    -0.14           0.20
13            0.32                0.03    -0.25           0.11
14            0.61                0.11    -0.20           0.18
15            0.60                0.23    -0.19           0.04
16            0.34                0.02    -0.27           0.16
17            0.35               -0.09    -0.15           0.22
18            0.71                0.80    -0.20           0.16
19            0.23                0.02    -0.09           0.03
20            0.76                0.26    -0.20           0.29
N = 45. CITC refers to Corrected Item-Total Correlations. Bolded correlations represent the highest correlation for the item among the subscales.

Distractor Analysis Of Spirituality Scale Items

The spirituality scale differs from the other subscales in that its answers are not right or wrong. Items 1 and 4 have acceptable CITC scores, but they are at the lower end of the acceptable range, indicating that they discriminate between high and low scorers less effectively than the other items. Although these items are within the acceptable range, they are the weakest of the 20 questions. Tables 16a and 16b display the distractor analysis for the two items. None of the questions in the spirituality scale will be removed.

Table 16a - Spirituality Score Of Respondents Selecting Each Option For Item 1
Response Choice         Spirituality Score   N    %
Slightly disagree (3)   3.78                 2    4.44%
Slightly agree (4)      3.50                 29   64.44%
Agree (5)               3.80                 14   31.11%
N = 45. The spirituality scale included 20 questions.
Table 16b - Spirituality Score Of Respondents Selecting Each Option For Item 4
Response Choice         Spirituality Score   N    %
Disagree (2)            3.21                 4    8.89%
Slightly Disagree (3)   3.31                 4    8.89%
Slightly Agree (4)      3.64                 31   68.89%
Agree (5)               3.87                 6    13.33%
N = 45. The spirituality scale included 20 questions.

Biases Of Spirituality Scale Items

Additional investigation into individual responses was undertaken by looking for response biases among participants completing the spirituality personality measure. A response bias can occur on an attitude measure when a person systematically chooses responses based on something besides their true stance on a question. The tables below show three separate participants flagged for three types of response biases. These respondents were reviewed and ultimately removed.

Central Tendency

Central tendency is the type of error in which a respondent tends to choose the answers closest to the middle of the scale. In these cases respondents are reluctant to choose strongly agree or strongly disagree, the extreme positive and negative choices. Case 45 is an example of central tendency among the spirituality respondents. Table 17a displays the respondent's answers clustered in the middle of the rating continuum: all twenty answers fell within the three central responses, with a majority (10) being "neither." The participant may not have taken the time to read each question properly; therefore, this case will be deleted.

Table 17a - Spirituality Response Of Respondent 45: Central Tendency Error
                             Frequency   Percent
Disagree                     3           15%
Neither Agree Nor Disagree   10          50%
Agree                        7           35%
Severity Bias

By definition, severity bias is an error that occurs as the result of a rater's tendency to be overly critical; however, that definition falsely suggests that there is a preferred way to answer the questions on the spirituality scale. In this study, we arbitrarily assigned severity bias to the cases in which most of an individual's responses were at the negative end of the Likert scale (i.e., disagreeing to some degree with almost all of the items on the survey). Case 9, displayed in Table 17b, has a severity-bias frequency distribution and will be removed.

Table 17b - Spirituality Response Of Respondent 9: Severity Bias Error
                             Frequency   Percent
Strongly Disagree            8           40%
Disagree                     8           40%
Neither Agree Nor Disagree   1           5%
Agree                        3           15%

Leniency Error

By definition, leniency error occurs as the result of a rater's tendency to be too forgiving and insufficiently critical; again, that definition falsely suggests that there is a preferred way to answer the questions on the spirituality scale. In this study, we arbitrarily assigned leniency error to the cases in which most of an individual's responses were at the positive end of the Likert scale (i.e., agreeing to some degree with almost all of the items on the survey). Case 14 shows a strong leniency error, seen in Table 17c: the vast majority of the responses fell under the same reply; therefore, the case will be removed.

Table 17c - Spirituality Response Of Respondent 14: Leniency Error
                 Frequency   Percent
Agree            2           10%
Strongly Agree   18          90%
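The three response biases above can be screened for automatically once answers are coded 1 (strongly disagree) through 5 (strongly agree). The 75% cutoffs below are our own assumption, chosen so the rule reproduces the three flagged cases; they are not a published standard.

```python
def flag_bias(responses):
    """Return a response-bias label for one respondent's Likert answers,
    or None if no bias pattern is detected."""
    n = len(responses)
    if sum(r >= 4 for r in responses) / n >= 0.75:
        return "leniency"
    if sum(r <= 2 for r in responses) / n >= 0.75:
        return "severity"
    if all(2 <= r <= 4 for r in responses):
        return "central tendency"
    return None

# Frequency patterns of the three flagged respondents (Tables 17a-c).
case45 = [2] * 3 + [3] * 10 + [4] * 7          # clustered in the middle
case9  = [1] * 8 + [2] * 8 + [3] * 1 + [4] * 3 # mostly disagreement
case14 = [4] * 2 + [5] * 18                    # mostly strong agreement
print(flag_bias(case45), flag_bias(case9), flag_bias(case14))
```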
Revised Spirituality Scale

As a result of our item and distractor analyses, respondent cases 9, 14 and 45 were eliminated. All questions met the requirements to be included in the final results; no questions were eliminated. The statistical results of eliminating the three cases are delineated in Table 18. The new estimate of internal consistency decreased from 0.90 to 0.85; however, the corrected spirituality scale remains a highly reliable instrument.

Table 18 - Descriptive Statistics For Spirituality Scale Corrected Totals
                           Original   Corrected
Minimum Score              1.95       2.63
Maximum Score              4.90       4.55
Mean                       3.60       3.62
Standard Deviation         0.57       0.49
Internal Consistency (α)   0.90       0.85
Original N = 45, Corrected N = 42. New acceptable internal consistencies (> .70) are bolded. "Corrected" indicates the problematic cases (9, 14 and 45) have been removed.
Correction For Guessing On Cognitive Test

In order to obtain a better representation of a participant's true score, a correction for guessing was performed on the combined cognitive scores. The spirituality measure was corrected by removing the respondents with severe response bias. Table 19 shows that after correcting for guessing, the correlation between the combined cognitive score and GPA decreased. The spirituality correlation with GPA also decreased, from .16 to .11. The corrections therefore made a small difference in the criterion validity of the spirituality scale.

Variance Explained

Table 19 indicates that the corrected combined cognitive test accounted for none of the variance in GPA; likewise, the uncorrected combined cognitive test accounted for none of the variance in GPA. The spirituality scale accounted for 2.46% of the variance in GPA, and the corrected spirituality scale for 1.10%. The nature of the effect size for spirituality is considered small. The amount of variance explained in GPA is not affected by the correction for guessing in the cognitive test.

Table 19 - Variance In GPA Explained, Effect Size, And GPA Correlation Of Corrected Items
                                    Original Combined Cognitive   Partially-Corrected Combined Cognitive   Corrected Combined Cognitive   Original Spirituality   Corrected Spirituality
GPA Correlation (r)                 0.02                          .00                                      .00                            .16                     .11
GPA Explained Variance (r² * 100)   0.00%                         0.00%                                    0.00%                          2.46%                   1.10%
Nature Of Effect Size               No Effect                     No Effect                                No Effect                      Small Effect            Small Effect
Proportion of the variance explained: .01 = small effect, .09 = medium effect, .25 = large effect.

Correction For Attenuation

The relationship between two variables can be weakened by measurement error. The correlation between two measures is more accurately reflected when a correction for attenuation has been performed.
The correlations of the three subscales, spirituality, and the combined cognitive score with the self-reported performance measure are reported in Table 20. The correction improved the correlation coefficient of every measure except quantitative reasoning, which had essentially no original correlation. The psychology subscale had the largest increase in effect size, from .29 to .39.
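The correction for attenuation divides the observed correlation by the square root of the product of the two measures' reliabilities. The sketch below reproduces the reported psychology figure if the self-report measure is treated as perfectly reliable, which is our assumption; the report does not state the reliability it used for the self-report side.

```python
import math

def disattenuate(r_observed, rel_x, rel_y=1.0):
    """Classical correction for attenuation:
    r_corrected = r_observed / sqrt(rel_x * rel_y)."""
    return r_observed / math.sqrt(rel_x * rel_y)

# Psychology subscale: observed r = .29 with self-reported performance
# and alpha = .54 gives a corrected r of about .39, as in Table 20.
print(round(disattenuate(0.29, 0.54), 2))  # 0.39
```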
Table 20 - Corrected For Attenuation With Self-Performance Reports And Test Performance
                   Verbal Reasoning   Quantitative Reasoning   Psychology Knowledge   Combined Cognitive   Spirituality
Attenuation        .21                .00                      .39                    .21                  .43
Self-Performance   .18                .00                      .29                    .19                  .40
Reliability        .75                .82                      .54                    .82                  .85

The transformed scores still correlate perfectly with the untransformed scores, while the other corrected correlations vary only slightly from the uncorrected scores. Because the scores have undergone a linear transformation, each of the three subscales correlates perfectly with the subscale's Z-score, IQ score and T-score. This occurs because the values did not change in relation to each other; the transformations were linear.

Table 21 - Correlation Between Corrected Raw And Standardized Scores Of The Combined Cognitive Test And Its Subscales
                                  1     2     3     4     5     6     7     8     9     10    11    12    13    14    15    16
1. Corrected Combined Cognitive   ---
2. Combined Cognitive Z-Score     1.00  ---
3. Combined Cognitive T-Score     1.00  1.00  ---
4. Combined Cognitive IQ          1.00  1.00  1.00  ---
5. Psychology                     .64   .64   .64   .64   ---
6. Psychology Z-Score             .64   .64   .64   .64   1.00  ---
7. Psychology T-Score             .64   .64   .64   .64   1.00  1.00  ---
8. Psychology IQ                  .64   .64   .64   .64   1.00  1.00  1.00  ---
9. Quantitative                   .76   .76   .76   .76   .23   .23   .23   .23   ---
10. Quantitative Z-Score          .76   .76   .76   .76   .23   .23   .23   .23   1.00  ---
11. Quantitative T-Score          .76   .76   .76   .76   .23   .23   .23   .23   1.00  1.00  ---
12. Quantitative IQ               .76   .76   .76   .76   .23   .23   .23   .23   1.00  1.00  1.00  ---
13. Verbal                        .73   .73   .73   .73   .29   .29   .29   .29   .28   .28   .28   .28   ---
14. Verbal Z-Score                .73   .73   .73   .73   .29   .29   .29   .29   .28   .28   .28   .28   1.00  ---
15. Verbal T-Score                .73   .73   .73   .73   .29   .29   .29   .29   .28   .28   .28   .28   1.00  1.00  ---
16. Verbal IQ                     .73   .73   .73   .73   .29   .29   .29   .29   .28   .28   .28   .28   1.00  1.00  1.00  ---
N = 45. Bolded correlations are significant at the 0.01 level (2-tailed).
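Because Z-, T-, and deviation-IQ scores are linear transformations of the raw score (conventionally T = 10Z + 50 and IQ = 15Z + 100), they correlate perfectly with the raw scores, which is the pattern of 1.00s in Table 21. A quick sketch with hypothetical raw scores:

```python
def zscores(xs):
    """Standardize a list of scores to mean 0, SD 1."""
    m = sum(xs) / len(xs)
    sd = (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5
    return [(x - m) / sd for x in xs]

def pearson(x, y):
    """Pearson r via the product of standardized scores."""
    zx, zy = zscores(x), zscores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)

raw = [15, 22, 30, 36, 47]        # hypothetical combined cognitive scores
z = zscores(raw)
t = [10 * v + 50 for v in z]      # T-score metric
iq = [15 * v + 100 for v in z]    # deviation-IQ metric
print(round(pearson(raw, t), 4))  # linear transforms preserve r exactly
```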
Overall Conclusions And Recommendations

The PAL Cognitive and Spirituality Scales show promise as psychometric tools for measuring cognitive reasoning, psychology knowledge and personality. Overall, the cognitive scale was found to have high internal consistency, as did the verbal and quantitative subscales. The psychology subscale, however, was not found to have good internal consistency, and the mandatory removal of items not meeting reliability standards further decreased it, an unfortunate consequence when a scale has only a small number of items. The spirituality subscale showed extremely high levels of internal consistency; all of its items met the required reliability thresholds and were therefore retained.

This report has presented a number of recommendations to further improve the scales' validity and reliability. Implementation of the recommendations made in the distractor analysis section should improve internal consistency; for example, the development of several new questions for the psychology subscale is recommended. Finally, further analysis should be conducted to ensure the validity of both the cognitive and spirituality scales.
Appendices

Appendix A – Survey Instructions
Appendix B – Background Information
Appendix C – Self-Reported Performance
Appendix D – Spirituality Scale
Appendix E – Psychology Subscale
Appendix F – Quantitative Reasoning Subscale
Appendix G – Verbal Reasoning Subscale