Student ID: 12912540
Student name: Alan Raftery
Improving the Validity of Structured Interviews
with the Implementation of Personality and
General Cognitive Ability Testing.
Supervisor: Dr Chris Dewberry
Occupational Psychology MSc
Submission Date: 30th September 2014
Contents

Declaration
Abstract
Introduction
Literature Review
Methodology
   Organisation
   Research Design
   Sample
   Data Collection
   Personality Test
   General Cognitive Ability (GCA) test
   Supervisor Ratings
   Procedure
   Data Analysis
   Ethical Considerations
Results
   Descriptive Statistics
   Statistical analysis
Discussion and Conclusion
   Findings
   Pros and cons
   Future Improvements
   Practical Application
   Contribution to Occupational Psychology
Critical Self-review
   Appraisal of practise
   Personal approach
   Assumptions
   Researcher bias
   What have I learnt?
Acknowledgements
References
APPENDIX A – The ethics form
APPENDIX B – Personality Questionnaire Scales Booklet
APPENDIX C – Student Caller Selection Day
APPENDIX D – Job Description for the post of University of Warwick Student Caller 2013/14
APPENDIX E – Phone Interview Questions
APPENDIX F – Interview Questions
APPENDIX G – Supervisor Performance Rating Form
Declaration
I certify that the work submitted herewith is my own and that I
have duly acknowledged any quotation from the published or
unpublished work of other persons.
Signed:
Date: 26th September 2014
Abstract
In selection and assessment literature, the high validity of general cognitive
ability (GCA) in predicting job performance has been debated against its controversial
practical application within organisations. It has been stated that ‘it would be better if alternative selection methods, such as personality, were as predictive as GCA due to their minimal adverse impact’, Schmidt (2002). Therefore, the study's aim is to fill a gap in the knowledge and to investigate the predictive validities of no test (control), GCA, personality, and GCA with personality within a structured interview process.
Student caller applicants (N=288) completed the GCA and personality tests and were
assigned a group. Following the interview process, informed by the test data, the
successful candidates (N=38) were given performance ratings after 1 month. The
findings showed a significant difference between supervisor performance ratings across
the four groups. Significant differences in mean supervisor performance ratings were also found between the control group and the other 3 groups, between the GCA-only and personality-only groups, and between the GCA and GCA with personality groups. The effect
size of the group with the highest mean supervisor ratings (GCA with personality)
compared with the control group was large. This contributes to the gap in the
knowledge as it demonstrates not only the compatibility of GCA and personality with
structured interviews, but also the higher predictive validity that can be achieved by
combining GCA with personality.
Introduction
Academic literature assessing the validity of selection methods according to
their ability to predict performance has a long history dating back to the early 20th century. It was not until the 1920s that researchers noticed contrasting findings being reported when analysing the same selection method. In the 1930s and 1940s it became apparent that this may be due to differing settings for the same target job, which alter the validity of the selection method. This was labelled situational specificity and continued to feature in the research until Schmidt & Hunter (1977) demonstrated that most of the variations in validity were the result of statistical and measurement inaccuracies, such as sample sizes being too small. This led to the introduction of new quantitative techniques which combine past results and correct the statistical errors, Hunter & Schmidt (1990).
This showed that in contrast to previous perceptions, the variability of selection method
validity was near zero across a wide range of jobs, Schmidt, Hunter & Pearlman (1980).
This was a significant step within the research, as it gave an indication of the most valid
selection method for any job and allowed the comparison of selection methods.
Organisations were then able to alter their selection methods according to the research,
therefore changing how we view and experience selection and assessment.
Schmidt and Hunter (1998) conducted a review of the practical and theoretical
implications of 85 years of research in personnel selection. The article studied the
validity of various selection procedures in terms of the prediction of job performance.
The validity of the combination of one of the selection procedures with general
cognitive ability (GCA) was also studied. Within the literature, predictive validity, the
ability to predict subsequent performance, is deemed to be an integral part of personnel
selection methods. The predictive validity coefficient is directly proportional to
practical, economic value, Muldrow (1979). Organisations which choose to implement the research and improve the predictive validity of their selection methods will achieve better overall employee performance, which subsequently leads to greater learning of job-related skills and improved economic output, Hunter, Schmidt, & Judiesch (1990). A significant conclusion for this paper is the agreement within the
research that when selecting individuals without previous experience, the most valid
predictor of performance is GCA, Hunter & Hunter (1984).
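The proportionality between validity and economic value noted above is often formalised with the Brogden-Cronbach-Gleser utility model; the expression below is a standard textbook form of that model rather than one quoted in the cited studies:

$$\Delta U = T \, N_s \, r_{xy} \, SD_y \, \bar{z}_x - N_a \, C$$

Here $\Delta U$ is the gain in output value from using the selection method, $T$ the average tenure of those hired, $N_s$ the number of candidates selected, $r_{xy}$ the predictive validity coefficient, $SD_y$ the standard deviation of job performance in monetary terms, $\bar{z}_x$ the mean standardised predictor score of those selected, $N_a$ the number of applicants assessed, and $C$ the cost of assessing each applicant. Because $\Delta U$ is linear in $r_{xy}$, any gain in validity translates directly into economic value.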
Further research on predictive value has shown that the validity of a selection
method is a key factor which determines practical value, but not the only factor.
Another factor to be aware of is variability of performance. If variability was low, there
would be little difference between the performances of successful candidates, thus
significantly decreasing the importance of selection. In contrast, if variability is high,
the need for a valid selection method will also be high. The latter is strongly supported
within the research. In particular, this can be seen in the research over the past 15 years,
which indicates that the variability of performance and output among workers is very
high and that it would be even larger if all candidates were selected or if candidates
were selected at random. This variability is what determines practical value, Hunter
(1990).
In recent years, research on GCA has increased in popularity. This is because of
its high validity and versatility. It is versatile in the sense that it can be implemented
successfully for all jobs, whether it be a factory role or an investment banking role, for
example. In addition, GCA has the lowest application cost, in contrast to expensive work sample tests, despite being equally valid. Evidence for the validity of GCA in
predicting job performance is stronger than for any other selection method, Hunter &
Schmidt (1996). Thousands of studies have been conducted over the last nine decades.
By contrast, only 89 validity studies of the structured interview have been conducted,
McDaniel et al. (1994). Structured interviews are also more expensive and usually contain job knowledge questions, which limit their application, as inexperienced
candidates will be at a disadvantage. Despite this step in the research, and significant
support for the use of GCA, organisations prefer the traditional format of structured
interviews, which this research paper aims to address.
Personality remains relevant within the selection and assessment literature. Use of the Myers-Briggs Type Indicator and the Five Factor Model has shown that personality traits are useful in establishing a person-environment fit, although they do not score highly in selection method validity, Zhang, Li & Liu (2010).
As a result, the British Psychological Society (BPS 2013) discourages use of personality
tests within personnel selection as a stand-alone method.
However, research has been gathering evidence for the use of conscientiousness in causal models of job performance. This research has found that when mental ability is the same across the sample, employees scoring highly in conscientiousness develop greater job knowledge. This may be because highly conscientious individuals work harder and spend more time on a task than the average employee, which in turn causes higher job performance. From a theoretical point
of view, the latest research suggests that the key determining variables in job
performance may be GCA, job experience, and the personality trait of conscientiousness.
The most popular format of curriculum vitae applications followed by structured interviews measures a combination of previous experience, mental ability, and a number of personality traits, such as conscientiousness, as well as job-related skills and behaviour patterns. The average correlation between interview scores and scores on GCA tests is .32 (Huffcutt et al., 1996); therefore, interview scores reflect mental ability, but not to the level of GCA on its own. Subsequently, the research has moved towards studies which test the validity of selection methods that can be practically incorporated within more traditional formats of selection (e.g. structured interview) and/or used to ‘complement’ each other, for example GCA and personality.
Literature Review
Structured Interviews in personnel selection.
A structured interview is a selection method designed to predict future job performance
through the observation of an individual’s oral responses to questioning from a panel.
The use of competency-based assessment in selection has grown in popularity,
as it is perceived to be the most effective way to aid personnel selection, Golec &
Kahya (2007). In particular, structured interviews are one of the most common selection
methods used by organisations around the globe, McCarthy, Van Iddekinge & Campion
(2010). Despite there being academically more efficient personnel selection methods
available, both large and small organisations prefer to use structured interviews which
have a weaker validity coefficient of .38, compared to a GCA test which scores .51,
Schmidt & Hunter (1990). Prior to Schmidt and Hunter’s research, it was widely
believed that GCA is significant for academic performance, but cannot then be applied
to real-life situations, and is therefore not applicable to work performance, Jencks (1972). It was stated that although GCA can predict job performance, its validity varied depending on the type of work and sector. However, following meta-analytic studies, this theoretical perspective was overturned, because it was shown that the variability of validity was mostly due to inaccuracies in the statistics, such as sampling error variance, job performance rating error and GCA range restriction, Schmidt & Hunter (1990).
Organisations spend millions on recruiting the right people. The productivity,
well-being and subsequently the profitability of the business strongly depends on the
workforce. The better alternatives (e.g. general cognitive ability) have issues and
complications (e.g. equality), thus organisations tend to prefer interviews which have
the added benefit of meeting candidates face to face and the ability to ask custom questions for
specific job roles. There is seemingly an intuitive appeal towards interviews. When 852
organisations were surveyed, 99% used interviews as a selection tool and all levels of
staff had the perception that interviewing is a valid method to predict job performance,
Ulrich & Trumbo (1965). Perhaps the popularity of interviews comes down to the
perception that the best people to assess potential candidates are people who are already
a part of the organisation’s culture, or even within the same role, something that a
uniform test cannot emulate. In addition, an interview gives a sense of control over the
process to the employer. The research supports this suggestion by concluding that
interviewers perceived personal relations and motivation as being the two attributes
which are the most valid predictors of job performance, Ulrich & Trumbo (1965). As a
result, other valid factors are neglected. Furthermore, there is strong evidence that selection decisions in structured interviews are commonly made according to
behavioural and verbal cues, as well as the social relationship between the panel and the
interviewee, Wright (1969). This implies that many structured interviews are unfairly
rewarding the perception of person-environment fit rather than attributes which predict
future performance. Schmidt (1976) suggested that there is an abundance of factors that influence decisions made through interview, which goes some way to explaining its weak validity. Many studies within the literature have addressed interviewer bias, such as the effects of appearance: more attractive candidates tend to receive higher ratings, Carlson (1967). Physical or behavioural cues can also be misinterpreted; for example, sweating or nail biting may simply reflect that a candidate really wants the job, but an interviewer observing these cues may perceive them as psychological weakness, Yukl et al. (1972). Also, more controversially, racial bias is
still an issue within selection and assessment. When a sample of managers, selected for
their high scores in ‘openness’, were asked to assess candidates for a variety of roles,
they were given fake resumes (varying in quality) and fake candidate photographs
(varying in race). It was found that white candidates were selected when the resume was
average or good, Asian candidates tended to be selected for higher grade roles and black
candidates were not selected regardless of resume quality, Ledvinka (1973). This presents a need for research into managing the large amount of unconscious bias that occurs in this selection method. In particular, common biases such as the reading of non-verbal cues change the nature of the structured interview, which, despite its resulting low validity, will continue to be used by organisations, Arvey & Campion (1982).
Structured interviews are classified as job-related due to their references to past
experiences, but with the many studies identifying various types of interviewer bias,
they are becoming more like behavioural/situational interviews, Janz (1989). Another
study collected data on interviewer judgments, gender, and age for job applicants
interviewing for seasonal retail sales clerk positions in two separate years. It was seen that females and older applicants received higher average interview evaluations in both years. As a result, it was suggested that there is a need to manage interviewer bias through the use of selected psychological trait tests, e.g. influential or adaptable, which inform the interviewer about the candidate's traits based on evidence rather than perception, Arvey et al. (1987).
Personality testing in personnel selection.
Using personality testing as a pre-employment tool is useful in matching
appropriate traits of the candidate to the criteria of the job role. The literature has recently recorded a large increase in the popularity of personality assessment, Hough &
Oswald (2008). However, this has slowed due to the weak correlations of personality
scores with job performance, Morgenson (2007), in particular through analysis of
criterion validity. Poor validity scores for personality testing were first recorded upon
the emergence of quantitative review techniques in the 1980’s, Schmitt et al. (1984).
Despite the support for personality testing being used only to ascertain behavioural outcomes, many studies continued to use the tests within selection research, Weiss & Adler (1984).
Following a 12-year review, it was concluded that: “It is difficult in the face of
this summary to advocate, with a clear conscience, the use of personality measures in
most situations as a basis for making employment decisions about people”, Guion &
Gottier (1965). Other studies have suggested the reason for this weakness is the ability
candidates have to fake their responses, Goffin & Christiansen, (2003); Putka &
McCloy (2004). This is when a candidate gives ‘fake’ responses, according to their perception of what the employer wants, in order to gain an advantage in the selection process. It is therefore possible to elevate scores and unrealistically present oneself as
being a good match for the role. In contrast, GCA does not have this issue, as an
individual cannot simply decide to choose the correct answers.
Some research suggests that although faking may occur, it does not affect the
validity of the test. This is because less than a third of the differences between truthful and desirable responses were statistically significant, Hough (1990). Studies using
impression management and self-deceptive enhancement, Barrick & Mount, (1996),
have suggested that faking has an insignificant effect. Contrary to this stance, other
studies have concluded that faking is a key issue, as selection ratios decrease, especially if ranking of candidates is used in selection, resulting in different candidates being
selected, Rothstein & Goffin (2006).
It is possible, however, to introduce faking detection scales to counter this confounding variable, Zhang, Li & Liu (2010). Such scales have had some positive outcomes in identifying both faking ‘good’ and faking ‘bad’ (attempting to avoid recruitment, e.g. in the military), though they can lead to further complications, such as a lack of clarity between a faking candidate and a nervous yet determined one, and the unnecessary cost of implementation. More modern personality testing tools
account for any social desirability bias through repetition and rewording of the
questions, thus indicating if a candidate is answering consistently throughout the test.
Good personnel selection is focused upon identifying candidates who have the
ability, as well as an adequate job-fit in order to predict future job performance. The use
of personality scores is an easy and efficient way of matching these traits to the test
scores. In more recent research, from the 1990s onwards, similarly low levels of validity were recorded; however, it was concluded that ‘meta-analytically corrected estimates’ of validity were meaningful, and that personality measures should once again be used in selection contexts, Day (1993). Furthermore, the most recent literature reviews indicate that there is strong support for personality scores predicting job performance
(Ones, Dilchert, Viswesvaran & Judge, 2007). This suggests a changing of opinions on
the usage of personality testing in personnel selection.
Successful application of personality testing in occupational psychology
involves the matching of traits to specific roles which reflect the nature of the job. For
example, in broad terms, one would tend to select an energetic, outgoing extrovert for
face-to-face sales roles. However, it should be noted that some personality traits have
been found to be more predictive of performance than others, and not all traits are seen
as relevant for certain jobs. The trait that has the greatest validity in predicting
performance is conscientiousness. Conscientiousness has been found to show a consistent relationship with all job performance criteria across all work sectors, Murray, Barrick & Mount (2006). Other personality traits have been seen to be valid, however not across all sectors, and their level of validity was smaller. In addition, when more attention is
given to trait-matching, it results in considerably higher criterion validity, Tett et al.
(1999). This implies that the prediction of performance becomes more valid when
attention is given to the type of traits that are relevant to particular job roles. Modern personality tests, like the one chosen for this research project, allow for the customisation of scales in order to accurately reflect the key traits in the person specification.
A popular and publicly familiar personality framework is the Five-Factor Model (FFM). The
model has broad personality traits of agreeableness, conscientiousness, extraversion,
neuroticism and openness. The precise measurement of this multi-faceted domain of
personality holds value in aiding the accurate matching of traits with jobs. FFM is
criticised for its broad traits, as it struggles to accommodate more narrowly defined traits, such as dominance and affiliation, which are lost in the label of extraversion. Use
of narrow scales significantly improves the accuracy of trait matching, Rothstein &
Goffin (2006).
Using a modern, custom-made personality test which accounts for social desirability and uses narrow scales is likely to result in much improved validity compared with a standard broad personality test. However, it needs to be acknowledged that there
are still alternative pre-employment testing methods which have better and more
consistent validity than personality testing. The literature does support the notion that
personality variables are valuable predictors of job performance when carefully
matched with the appropriate occupation and organization. Current research is moving
towards investigating whether pre-employment testing approaches tend to assess
individuals’ capacity to perform well on the job, e.g. accounts of previous good
performance, but are lacking in their potential to predict whether individuals necessarily
will choose to perform well on the job, e.g. accounts of good work ethic, Marcus,
Goffin, Johnston, & Rothstein (2007); Murphy (1989). This has opened up a gap in the research, which concedes that personality tests should not be used in isolation and instead asks whether personality tests can add to and improve other selection methods.
Traditional selection methods, such as structured interviews, tend to assess the good
work ethic accounts more so than the good performance accounts, Marcus et al. (2007).
Therefore, within personnel selection situations, personality testing has great potential
for complementing and adding to the predictive validity of existing selection tools by
predicting unique parts of the job performance domain. Furthermore, not including
personality testing within selection increases the likelihood that new employees will
become less productive after they master the challenges of their jobs and enter
“maintenance” stages, at which point the good work ethic accounts for increased
variance in job performance, Murphy (1989).
This research paper aims to implement personality scores within the interview process. The personality test, developed by ‘Criterion’, is custom made according to the key traits in the person specification. These are a collection of narrow traits which will give the interviewers an opportunity to rank the candidates as part of their selection process. Furthermore, the interviewers are subsequently able to adjust their questions and perceptions according to the test results. This will minimise interviewer bias because it
will give them an evidence-based understanding of what sort of traits the candidate has
and it will enable them to explore the candidate’s traits with appropriate questions,
rather than relying too heavily on their perception of the candidate.
GCA testing in personnel selection
Historically, testing an individual's cognitive ability as a selection method was commonplace. The earliest records of this date back approximately three thousand years to China, where the individual differences of potential leaders were measured by giving them puzzles to solve. GCA is defined as a psychometric and psychological construct which describes the phenomena of human mental functioning (Reeve & Hakel, 2002).
It is over a century since Spearman (1904) defined the construct of GCA and proposed its central role in human cognition. During the middle part of the 20th century,
interest in the construct of GCA declined in some areas of psychology, but in the last 20
to 25 years there has been a resurgence of interest in GCA and its role in various life
areas. Modern studies are showing that it can predict both occupational level and the
individual performance within the role, as well as many aspects of general life and
career success, including salary, divorce rate and life satisfaction, Schmidt & Hunter
(2004). These correlations are consistently over .50, which is said to be rare in
psychological research, and are considered large, Cohen & Cohen (1988).
Large-scale literature reviews encompassing all types of selection methods find GCA on
average as the most efficient method with a validity coefficient as high as .57, (Hunter
& Hunter, 1984).
Subsequently it was claimed that “GCA can be considered the primary personnel measure for hiring decisions and one can consider the remaining personnel measures as supplements to GCA measures”. The evidence of GCA’s predictive validity is significant. Compared with other selection methods, GCA also has significant time and cost saving advantages. For example, structured interviews require planning, contact and evaluation time, extensive candidate communication and feedback, and can be logistically problematic. Similarly, assessment centres are significantly more expensive and have poor predictive validity, Schmidt and Hunter (1986). The literature suggests that the implementation of GCA testing within all selection and assessment procedures would not only help organisations select better personnel, but would also have great economic benefits. Despite this, organisations are not changing their methods. This is due to the controversies surrounding its use.
GCA tests are criticised for the lack of equality, which has become an increasingly
important topic for organisations over the past decade. The Equality Act (2010) states
that organisations must put procedures in place within their selection processes which
do not discriminate against the list of protected characteristics, such as age, disability,
gender, race, religion and sexual orientation. GCA can be seen as a tool which
discriminates against ethnic minorities due to the research into average scores across
different racial groups.
However, the literature strongly suggests that racial bias exists in other more widely
accepted selection methods. When investigating the competitive selection process for
trainee lawyers, a small but significant relationship was present between ethnicity and
performance. The study compared the performance of trainees in blind-marked tests
with the performance in non-blind marked tests. No racial discrimination was found in
the tests where ethnicity was hidden, Dewberry (2001). This implies that the variation in
performance between whites and ethnic minorities can occur in a highly educated and
intellectually able group of people. Thus, even when GCA is equally high across candidates, racial bias can occur.
In addition, the majority of the literature on race and selection methods has stated that GCA has a substantial adverse impact on race. For example, on average
Blacks have been found to score one full standard deviation lower than whites, Schmidt
& Hunter (1986). It has been countered that this may be due to the ‘western’ design of GCA tests, and that because of cultural differences these tests are not valid outside of western groups. However, predictive validity was found when end of year exam marks were
predicted in Kenyan 12-15 year olds, Sternberg (2001). Also, correlations have been
found between GCA scores in African engineering students and their final grades,
Rushton (2003). Critics also suggest that the items have different meanings for Africans than they do for Whites, Nell (2000). However, this was seen not to be the case, as similar findings were found when using Raven's matrices, which are non-culturally specific tasks, Raven (2000).
The most commonly used GCA tests within personnel selection are split up into
two parts, verbal and numerical. The verbal part has been found to have a slight bias
against individuals whose first language is not English, Skuy (2001).
Further to this, studies have shown that even if ethnic minority candidates are of a
higher standard than white candidates, racial bias still occurs. A group of managers,
scoring high in openness, were asked to assess a group of candidates. High ‘openness’
is associated with a greater acceptance and understanding of other cultures. Despite this,
it was found that the black candidates were not selected even if the quality of
application was high, Ledvinka (1973). Therefore, it can be argued that for highly
capable candidates from ethnic minorities who score highly in GCA, a GCA selection
method would be beneficial. This is because it will clearly, beyond any reasonable
doubt, outline the level of GCA to the selector. If the selectors are burdened with the job
of differentiating between the candidates, it opens up the selection to bias, personal
perceptions, unconscious discrimination and social desirability bias. If, for example, a candidate from an ethnic minority in fact has the highest GCA amongst the pool of candidates, this will be made clear, perhaps minimising the level of racial bias he/she may have otherwise received.
Compared to alternative selection methods, GCA tests produce racial differences that
are 3 to 5 times larger. Examples of methods with less bias are biodata, personality tests, and structured interviews. Although they have lower predictive validity, they are still valid predictors of job performance. GCA tests can be combined with other selection methods in order to minimise adverse impact while maintaining or even increasing overall predictive validity, Outtz (2002). This adds to the benefits of this study's combination of selection methods.
Moreover, Schmidt and Hunter concluded that there is considerable evidence that GCA
is predictive of performance in a wide variety of jobs. This is especially positive as
many alternative methods are not easily applied across all jobs, as they require adjusting,
e.g. personality tests.
The adaptability of GCA was identified through a meta-analysis of over 32,000
employees in 515 widely diverse civilian jobs. The predictive validity of GCA was .58
for professional-managerial jobs, .56 for high level complex technical jobs, .51 for
medium complexity jobs, .40 for semi-skilled jobs and .23 for completely unskilled jobs.
The validity for the middle complexity level of jobs (.51) is particularly notable, as this category includes 62% of all the jobs in the U.S. economy, Schmidt & Hunter (1998). Therefore, the same unchanged GCA test maintains its validity across the majority of jobs.
The theoretical foundation within the research for GCA is stronger than for any other
personnel measure. Theories of intelligence have been developed and tested by
psychologists for over 90 years, Jensen, (1998). Due to the vast amounts of research
into this subject area, the meaning of the construct of intelligence is much clearer than,
for example, the meaning of what is measured by interviews or assessment centres,
Hunter (1986); Jensen (1998).
Both traditionally accepted methods like structured interviews and academically supported but controversial methods like GCA tests have mirror-image strengths and weaknesses. Structured interviews lack high validity and are prone to interviewer bias,
while GCA tests have high validity and are standardised to prevent interviewer bias, yet
may create additional barriers for ethnic minorities. Furthermore, GCA tests are
impersonal and do not provide a custom job-match, while structured interviews are face
to face and can be specific to job role. New research suggests that a combination of the
popular structured interviews with alternative selection methods, such as GCA, can
provide an improved selection process which minimises inequality and bias, while
maintaining high performance prediction and relevance to job role.
Supervisor Performance Ratings
The evaluation of human performance can take place in a range of different contexts
including academic examinations, promotions and personnel selection. Performance in
all jobs and activities can be analytically broken down into a number of elemental
dimensions. It has been suggested that these performance dimensions are consistent
across a range of different jobs, Campbell et al. (1992). By using these dimensions to reflect the specific job specification, ratings that correlate positively with performance can be achieved. However, it has also been stated that it is possible to create a composite index
of job performance which records overall job performance, Schmidt & Kaplin (1971).
In order for the performance ratings to be effective and fair, the rating system
needs to be accurate and avoid supervisor bias. One such bias is supervisors giving preferential scores to their own racial, gender or perceived social groups, Schmitt &
Lapin (1980). However, within the same study, it was seen to not substantially affect the
validity of the ratings. Also, the similarity effect bias can have an effect on findings, but this varies across professions. However, the current target profession is within the higher education sector, which is stated to have weak to no similarity effects, Frank & Huckman (1975). Within the literature, certain biases found in supervisor performance ratings can have a significant effect on results, Kane et al. (1995). An example of such bias is the tendency for certain supervisors to be naturally more lenient or severe than others, which subsequently skews the results. It has been suggested that this variation in leniency or severity is due to differences in supervisor personality, Villanova (2009). For
example, supervisors scoring high in agreeableness were more likely to rate more
leniently, Yun (2005). Contrary to previous studies, no evidence was found to support
face to face feedback increasing or decreasing supervisors' ratings, Dewberry, Davies-Muir & Newell (2013). The current study aims to minimise the bias of supervisor
variation by having one supervisor conduct the performance ratings for all of the
successful candidates. However, these biases are predicted to have a weak effect on
findings. A study has demonstrated this by developing a statistical method to remove
the halo effect, which inflates the correlations between supervisor ratings. It was found that when this bias was removed, there was still a large relationship between ratings and both GCA and personality, Viswesvaran, Schmidt & Ones (2002).
Meta-analytic research on the reliability of supervisor performance ratings found
that employees from another team showed the lowest mean reliability (.30), fellow
employees in between (.37), and supervisors showed the highest (.50), Conway &
Huffcutt (1997). In addition, it is suggested that greater validity of ratings can be
achieved when the ratings structure includes a statistical key performance indicator
(KPI), such as sales per month, but is focused towards the ratings of key criteria in the
person specification, Ones, Viswesvaran, & Schmidt (1993). The current study will use
this structure to include 1 KPI score and 4 agreed key criteria from the person
specification.
Combination of GCA and personality
Throughout personnel selection literature, tests of GCA are regarded as the most
accurate performance predictors of all selection methods, Schmidt & Hunter (1998).
Thus, it is particularly noteworthy that the addition of personality testing to a selection
testing program based on GCA has been shown to significantly increase the overall
level of performance validity (Day & Silverman, 1989; Schmidt & Hunter, 1998). My
study aims to test this claim by having separate groups for GCA, personality, GCA with personality and no test. Subsequently, based on supervisor ratings, the variation in predictive validity between these groups will be compared.
Whereas the validity of GCA tends to decrease as the complexity level of the job
decreases (Schmidt & Hunter, 1998), the validity of personality tests tends not to
decrease (Ones, Viswesvaran, & Schmidt, 1993). Thus, personality testing may provide
the largest benefit beyond GCA in the case of low-complexity jobs, although its
increment in validity is still likely to be valuable in high-complexity jobs, such as
student callers.
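The incremental validity of adding a second predictor can be expressed with the standard formula for the multiple correlation of two predictors; the figures used below are illustrative values broadly in line with Schmidt and Hunter (1998), not results from the present study:

$$R_{y.12} = \sqrt{\frac{r_{y1}^2 + r_{y2}^2 - 2\,r_{y1}\,r_{y2}\,r_{12}}{1 - r_{12}^2}}$$

For example, taking $r_{y1} = .51$ for GCA, $r_{y2} = .31$ for conscientiousness, and assuming the two predictors are roughly uncorrelated ($r_{12} \approx 0$), the combined validity is $\sqrt{.51^2 + .31^2} \approx .60$, a noticeable gain over either predictor alone.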
The aim of the research is to investigate and test whether personality tests or GCA scores (or a combination of both) improve the validity of structured interviews, based on supervisor ratings. It is hypothesised that the candidates within the test data groups will perform better than the control group. In particular, it is expected that the GCA with
personality group will record the highest supervisor ratings.
Methodology
Organisation
The target organisation is the department of Development & Alumni Relations Office,
University of Warwick. The department run seasonal campaigns, recruiting current
students into ‘student caller’ roles. A student caller is involved in contacting Warwick
alumni, parents and other associates by telephone in order to build a relationship, inform
about new projects and inspire the individuals to give a donation. The structure of the
selection and assessment process begins with the application; if successful, a phone interview follows (questions and rating system – appendix E). If successful again, candidates are invited to a selection day (for full schedule – appendix). On the day,
structured interviews (appendix F) are conducted and provide the final assessment of
the candidate. My research is part of a review of the recruitment processes within the
department.
Research Design
The study is a quantitative study using 2 pre-designed selection tools (GCA and
personality online tests) and one self-designed supervisor performance rating
questionnaire. The research design will be an independent samples method. The one
independent variable will be the type of selection data provided within the structured
interview. The four levels of the independent variable are personality test, GCA test, personality & GCA test, and no test at all. The dependent variable will be the supervisor performance ratings. This design will enable one to see, through an independent samples one-way ANOVA, whether there is a significant difference in mean supervisor ratings across the four groups, and also whether there is a significant difference between specific pairs of groups, for example between GCA only and GCA with personality.
Sample
The participants will be student caller applicants, aged between 18 and 22 years. This sample was selected because the recruitment scale is large enough for a comprehensive study. Also, the participants will all be from a similar social group and age range – University of Warwick students. Despite this, the range of nationality and culture will be large,
which is beneficial for cross-cultural observations. In addition, the student caller job
specification contains strong and distinct traits, such as positivity or confident
negotiation skills, which are easily matched with personality measures. 307 applicants
will be tested. Following this, 91 candidates will successfully reach the selection day.
The data collected from the GCA or personality tools will be used within the structured
interview to select the final 38 student callers.
Data Collection
Both GCA and personality tests used in the current study were developed by Criterion
Partnership.
Criterion are a consultancy of occupational psychologists. They work with progressive
employers to improve performance in the workplace. Criterion help organizations to
succeed by applying psychology to the recruitment process, development and the
retaining of talented people. Criterion develop and publish online psychometrics,
personality questionnaires, ability tests and situational judgment tests. Established in
1991, Criterion have developed a professional reputation and have become known for
the rigorous testing of their tools which are grounded within the latest research.
Working together with Criterion, it was possible to select the most appropriate tools for
the target candidate pool – student callers.
Personality Test
The personality assessment is delivered through Criterion’s Coast platform, which
enables the researcher to customize the personality measures in order to tailor the test
around the culture and requirements of the target job. Criterion has an attribute library
containing 46 measures, each measuring a different dimension of personality at work.
Through consultation with both consultants at Criterion and the head recruiter for
student callers the following 12 measures were chosen based upon the job specification
(appendix A).
1. Adaptable
2. Approachable
3. Influential
4. Decisive
5. Risky
6. Optimistic
7. Stress Management
8. Money
9. Striving
10. Competition
11. Profit
12. Social Desirability
The scores are all out of 10, but these are normed scores (not raw scores), so they take into account the comparison group associated with the tests. The interview panel will receive
an average score out of 10, indicating the candidate’s compatibility with the key
specifications of the job.
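Criterion's exact norming procedure is proprietary and not reproduced here; the sketch below simply illustrates one common way a raw scale score can be converted to a 1-10 normed (sten-style) score against a comparison group, so the figures and function name are hypothetical.

```python
def to_normed_score(raw, norm_mean, norm_sd):
    """Convert a raw scale score to a 1-10 sten-style normed score.

    norm_mean and norm_sd describe the comparison (norm) group;
    the values used below are illustrative only.
    """
    z = (raw - norm_mean) / norm_sd      # standardise against the norm group
    sten = 5.5 + 2 * z                   # stens have mean 5.5 and SD 2
    return max(1, min(10, round(sten)))  # clip to the 1-10 reporting range


# Example: a raw score of 34 against a norm group with mean 28, SD 6
print(to_normed_score(34, norm_mean=28, norm_sd=6))  # -> 8
```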
General Cognitive Ability (GCA) test
The test, called ‘Utopia’, consists of two parts - a verbal and numeracy test. It is a high-
level aptitude test designed to be appropriate for university students. The verbal test
measures verbal critical reasoning, requiring candidates to comprehend and evaluate
meaning using precise logical thinking. The numerical test measures high-level numerical reasoning, requiring candidates to analyse and manipulate numerical data. Both tests have a generous time limit to avoid confounding variables such as reading time having an impact on performance, especially if a candidate's first language is not English. The contents of the tests were designed to be easily accessible to all –
numerical tests contain minimal verbal information so the measure is not contaminated
by verbal ability. Also, the tests are designed to be challenging enough to rank the top
performers as well as the average performers. The total length of time it will take a
candidate to complete all of the tests is no more than 45 minutes. This was agreed with the head student recruiter to be an appropriate length of time to spend as part of the process.
The tests are likely to take:
Verbal Test – 11 mins
Numerical Test – 20 mins
Personality – 12 mins
Supervisor Ratings
The supervisor providing the ratings for each successful candidate will be the same
individual - head recruitment officer for student callers. Having one supervisor rating all
successful candidates is greatly beneficial because it minimises supervisor bias and
negates variation between supervisor leniencies. The rating will occur 1 month
following the start of their role. Due to time constraints of the student caller campaign
and term times, this was the maximum period of time allowed before ratings. The rating system is based on the following 5 criteria, which are deemed ‘essential’ within the job specification.
 Good telephone manner
 Ability to work both independently and within a team
 Reliability
 Negotiation Skills
 Fundraising record
Each criterion is given a rating out of 5 on a Likert scale. The ratings are 1 = very poor, 2 = poor, 3 = satisfactory, 4 = good, 5 = very good. The total is then multiplied by 4 to give a final score out of 100. For example, ratings of 3 out of 5 on each criterion would give a total of 15, which multiplied by 4 gives a total score of 60/100 (appendix F).
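As a minimal illustration of this scoring rule (criterion names taken from the list above, ratings invented for the example):

```python
# Likert ratings (1-5) for the five 'essential' criteria; values are illustrative
ratings = {
    "Good telephone manner": 4,
    "Ability to work both independently and within a team": 3,
    "Reliability": 5,
    "Negotiation Skills": 3,
    "Fundraising record": 4,
}

total = sum(ratings.values())   # 19 out of a possible 25
final_score = total * 4         # scale to a mark out of 100 -> 76

print(f"Supervisor performance rating: {final_score}/100")
```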
Procedure
The below model in Fig.1 is a visual representation of the research procedure. This is
the general step by step timeline of the research.
Fig.1. A project model outlining the structure of the research
The 288 (out of 307) applicants who complete the tests will be split into 4 groups at random (a brief sketch of this random assignment follows the list), to be assigned:
 GCA test
 Personality test
 Both GCA and personality test
 No tests – control group
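A minimal sketch of the random assignment is shown below; the applicant identifiers and group labels are placeholders rather than the actual study records.

```python
import random

# Placeholder applicant codes standing in for the 288 anonymised candidates
applicants = [f"C{i:03d}" for i in range(1, 289)]

groups = ["GCA", "Personality", "GCA & Personality", "Control"]

random.shuffle(applicants)                # randomise the order first
assignment = {
    group: applicants[i::len(groups)]     # deal applicants out evenly
    for i, group in enumerate(groups)
}

for group, members in assignment.items():
    print(group, len(members))            # 72 candidates per group
```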
Candidates will be contacted via e-mail and sent the link to the Criterion tools. The
email will contain information about the research and candidates will be given the
opportunity to decline participation and withdraw at any stage, with no consequences for their chances within the recruitment process. The use of e-mail was seen to be better
than pencil and paper tests considering the time constraints of the selection day and that
many of the candidates will not have arrived onto campus yet.
Following the phone interview stage, 91 candidates will be invited to the selection day.
The candidates will be re-grouped according to the assigned groups above. The scores
and/or results for each attending candidate will be given to the interview panel who will
use the information in order to customize their interview questions and inform their
decision regarding the candidate. The control group with no additional scores or results
will undergo the standard procedure of the selection day tasks and interview.
The 30-40 successful candidates will be given 1 month to settle into their role before
supervisor ratings are recorded. This data is collected over a 3 day period. All
participants are thanked and informed according to the ethical guidelines.
Data Analysis
The data collected will be input into and analysed within SPSS. Firstly, partial correlations will be screened to identify any biases. Descriptive statistics will be explored to gain an understanding of the means and standard deviations of the sample. The correlation trends will then be investigated to determine the validity of the four groups, which will subsequently be compared against each other to see which group leads to the best performers.
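The analysis itself will be carried out in SPSS; purely as an illustration of the planned tests, an equivalent check could be sketched in Python as follows (the file name and column names are hypothetical):

```python
import pandas as pd
from scipy import stats

# Hypothetical data file: one row per successful candidate
df = pd.read_csv("supervisor_ratings.csv")   # columns: group, rating

# Descriptive statistics (means and standard deviations) per group
print(df.groupby("group")["rating"].agg(["mean", "std"]))

# Independent samples one-way ANOVA across the four groups
samples = [g["rating"].values for _, g in df.groupby("group")]
f_stat, p_value = stats.f_oneway(*samples)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```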
Ethical Considerations
Upon initial contact with the candidates via e-mail, it will be made clear that they are
under no obligation to participate and are able to withdraw their data at any stage. They
are provided with a participant information sheet and consent form (appendix C).
Candidates are also reassured by the head recruitment officer within the e-mail, that not
participating will not affect their chances during the recruitment process.
The data collected via the Criterion tools (personality and GCA) are stored securely on Criterion's system under the consultancy's strict confidentiality guidelines. The data is then sent to me and stored on one computer within a password-protected folder.
The data at this stage is fully anonymous as each candidate is given a unique code.
Upon request, the candidates may receive their own results/scores from Criterion via e-
mail. The supervisor ratings will be added to the data using the anonymous candidate
codes. Following analysis of the data, the data will be destroyed.
Finally, all candidates will receive a full debrief and summary of the study. In
return for the kindness of the Development & Alumni Relations Office, University of
Warwick in allowing me to conduct my research with them, I will present my findings
and provide a conclusion on the future of their selection and assessment methods for
student callers.
Results
Descriptive Statistics
Out of a total of 307 student applicants, 288 completed the tests. From this dataset, 2 extreme scores were removed from the GCA scores, as the variation between their verbal and numeracy components was too large. The central tendency (mean) and standard deviations of the test scores are presented in Table 1. As can be seen, GCA, and particularly the numeracy part of GCA, has a high standard deviation, meaning that there is much variation between the data points and the mean. In contrast, personality scores showed a low standard deviation.
Table 1. Mean and standard deviation scores for personality, numerical, verbal and
overall GCA in all student caller applicants.
                     Personality   Numeracy   Verbal   Overall GCA
Mean score (/10)         5.61         4.36      5.94       5.40
Standard deviation       2.13         6.84      3.75       5.29
Following the selection process, the 38 successful candidates’ scores were analysed.
Table 2 presents the central tendency (mean) and standard deviations of the personality
and GCA scores. The candidates are organised according to their assigned group. The
italicised number in brackets indicates the number of candidates within each group. The
scores highlighted in green are to indicate which parts of the data were revealed to the
interviewers for each candidate, depending on their group.
As can be seen, both personality and GCA scores are above the average of the
complete applicant pool seen in Table 1 across all groups. Also, the general standard
deviation across all groups has decreased, although the personality score standard deviation remained lower than the GCA score standard deviation across all groups (0.97 lower). The highest
mean GCA score (8.54) was found in the GCA group, and the highest mean personality
score (7.25) was found in the personality group.
Table 2. Mean and standard deviation scores for personality and GCA in successful
student caller applicants.
Group                          Personality (/10)      GCA (/10)
                               Mean      SD           Mean      SD
Personality group (7)          7.25      2.03         7.22      3.35
GCA group (9)                  6.89      2.11         8.54      2.26
GCA & personality group (15)   6.98      2.47         8.31      3.85
Control group (7)              6.88      2.14         7.16      3.14
After a period of 1 month, the successful candidates were assessed and assigned a
supervisor performance rating (/100). Fig. 2 presents the mean supervisor ratings for
each group. As can be seen, the highest mean rating of 97.73 was found in the group who were selected using the GCA and personality scores, followed by the GCA-only group and then the personality-only group. The lowest mean rating of 74.3 was recorded by the control group. The GCA-only group scored a mean rating of 91.29, which is nearly 10 marks higher than the personality-only group's mean rating of 81.65. However, the combination of the two tests produced the highest mean ratings.
All successful candidates completed their first month within the role and there
was no dropout. The standard deviation across supervisor ratings was low. The mean
standard deviation for all groups’ supervisor ratings was 1.77. Table 3 presents the 5-
point summary for supervisor performance ratings for all successful candidates to
provide a better understanding of the data.
Fig. 2. Mean supervisor performance rating (/100) for each successful student caller group: control group 74.3, personality group 81.65, GCA group 91.29, GCA & personality group 97.73.

Table 3. 5-point summary for supervisor performance ratings for all successful candidates
36
Student ID: 12912540
Supervisor ratings (/100)
Highest extreme 100
Upper fourth 95
Median 88
Lower fourth 79
Lowest extreme 52
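A five-number summary of this kind can be derived directly from the raw ratings. The sketch below uses hypothetical rating values and treats the upper and lower fourths as the 75th and 25th percentiles, a close approximation to Tukey's fourths.

```python
# Illustrative only: the ratings below are hypothetical, not the study's data.
import numpy as np

ratings = np.array([52, 79, 84, 88, 91, 95, 100], dtype=float)

# The 0th/25th/50th/75th/100th percentiles give the five-number summary.
lo, lower_fourth, median, upper_fourth, hi = np.percentile(ratings, [0, 25, 50, 75, 100])
print(f"Lowest {lo}, lower fourth {lower_fourth}, median {median}, "
      f"upper fourth {upper_fourth}, highest {hi}")
```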
Statistical analysis
To examine the hypothesis, an independent-samples one-way analysis of variance was conducted. It showed a significant difference in supervisor performance ratings across the four groups, F(2, 36) = 27.55, p < .001. Significant differences in mean supervisor performance ratings were also found between the control group and each of the other three groups (p < .001), between the GCA-only and personality-only groups (p < .001), and between the GCA-only and the GCA with personality groups (p = .013).
As the F value was significant, the effect size for the group with the highest mean supervisor ratings (GCA with personality), and therefore the best predictor of performance, was calculated against the control group and found to be d = 1.11. The data therefore show not only that the difference between the GCA with personality group and the control group is statistically significant, but also that the effect size (d = 1.11) is large. This confirms the hypothesis that all test groups (personality, GCA, and GCA with personality) improved the validity of the structured interviews, as judged by supervisor ratings, and that all test groups predicted performance better than the control group. In particular, as expected, the GCA with personality group recorded the highest overall supervisor ratings and was thus the most valid predictor of performance.
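To make the form of this analysis concrete, the sketch below shows how a one-way ANOVA across four groups, followed by Cohen's d for the strongest group against the control group, could be computed with SciPy and NumPy. The group arrays are hypothetical placeholders and do not reproduce the study's supervisor ratings.

```python
# Hedged sketch of the reported analysis: one-way ANOVA across four groups,
# then Cohen's d for GCA-with-personality versus control. Values are made up.
import numpy as np
from scipy import stats

control     = np.array([72.0, 74.0, 75.0, 76.0, 73.0, 74.5, 75.5])
personality = np.array([80.0, 82.0, 81.0, 83.0, 81.5, 82.5, 80.5])
gca         = np.array([90.0, 91.0, 92.0, 91.5, 90.5, 92.5, 91.0, 90.0, 92.0])
gca_pers    = np.array([97.0, 98.0, 97.5, 98.5, 96.5, 98.0, 97.5, 99.0,
                        97.0, 98.5, 96.0, 98.0, 97.5, 99.5, 98.0])

# Omnibus test for any difference between the group means.
f_stat, p_value = stats.f_oneway(control, personality, gca, gca_pers)

def cohens_d(a, b):
    """Cohen's d using the pooled standard deviation of the two groups."""
    pooled_var = (((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1))
                  / (len(a) + len(b) - 2))
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
print(f"Cohen's d (GCA & personality vs control) = {cohens_d(gca_pers, control):.2f}")
```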
Discussion and Conclusion
Findings
The purpose of the research was to investigate whether personality tests or GCA scores (or a combination of both) improve the validity of structured interviews, based on supervisor ratings. The hypothesis that supervisor ratings would be higher in the GCA, personality and GCA with personality groups than in the control group was confirmed. In addition, the secondary hypothesis that the GCA with personality group would produce the highest supervisor ratings, followed by GCA, then personality, was also confirmed. The control group, selected using the traditional structured interview, scored an average supervisor performance rating of 74.3, while the group selected using the structured interview with GCA and personality tests scored an average supervisor performance rating of 97.73. This shows that the implementation of these tests significantly improves the validity of structured interviews. With the addition of a second predictor, the overall predictive validity of GCA and personality within the structured interview increased, with a large effect size (d = 1.11). This strongly supports the use of selection methods in combination to further improve predictive validity, which is greatly beneficial to organisations that actively strive to select the best candidates for their roles.
These findings suggest that using personnel selection methods with high predictive validity gives interviewers greater certainty and confidence when making the final decision. The use of GCA and personality testing within the more traditional format of a structured interview also minimises the drawbacks associated with each method when used separately. For example, the British Psychological Society (BPS, 2013) recommends against the use of personality tests as a stand-alone selection tool, partly because of the expensive mandatory training required for the correct application of such tests. GCA tests used on their own in selection are likewise criticised for their adverse impact on certain ethnic groups. When used in combination with structured interviews, the personality test acts as a helpful supplement that provides evidence of person-environment fit rather than relying on an interviewer's potentially biased opinion. GCA, meanwhile, offers a helpful way to rank candidates by cognitive ability without being completely decisive in the final selection: a candidate who will fit the organisational culture (based on previous experience or the personality test) may be picked over a candidate with superior cognitive ability, which reduces the negative perception of GCA. As a result, the findings suggest that, at present, the selection approach with the highest predictive validity, allowing organisations to pick the best candidates, is GCA and personality testing within a standard structured interview.
Pros and cons
Taking into consideration factors that have undermined the legitimacy of previous studies, the current research project aimed to minimise these effects. One such factor is supervisor bias: if performance ratings are biased and not consistent across the sample, the results will be misleading. Studies have shown that supervisors often give preferential scores to their own racial, gender or perceived social groups (Schmitt & Lapin, 1980). To avoid this significant confounding variable, one supervisor provided all performance ratings, which eliminates variation between different supervisors' ratings. In addition, to minimise the bias of the individual supervisor, a structured rating model was used to limit the influence of personal perception, with ratings anchored to criteria such as the amount of donations received and good telephone manner.
In contrast, the timing of the recruitment drive relative to the University term dates significantly reduced the time available between selection of the final student callers and the collection of supervisor performance ratings. Initially, the target period was three months, within which two separate ratings would be collected (at 45 and then 90 days) to record any variation between them and thereby give an indication of the consistency and reliability of the supervisor's ratings. Due to the time constraints, however, only one collection of supervisor ratings was possible, just one month after the final selection. This means that, although using a single supervisor benefited reliability, overall reliability may have been weakened: with such limited observation time, there is no way of establishing whether the supervisor ratings were accurate, or whether they would have remained consistent over time.
A limitation in gathering the GCA and personality test results was that the sample was not completely random. All applicants were invited to take the tests in the knowledge that a proportion of the sample would not have time to complete them, would forget, or would decline; these candidates became the control group. It can therefore be argued that the control group may have consisted of individuals who were less motivated to succeed in the application, while the individuals who took the opportunity to complete the tests were more motivated to get the job and so were more likely to perform better in the first place. If this is the case, the results would have been skewed towards low supervisor ratings in the control group and high supervisor ratings in the test groups, even without the influence of the testing tools.
A major positive for this research project was the access to high-quality and rigorously tested GCA and personality tools from Criterion. In addition, it was possible to tailor the tests to fit the target job role of student caller, further increasing the predictive validity. Working with Criterion also benefited the professionalism of the study and helped ensure that ethical considerations were met. This gave students confidence and trust in the tools, encouraging them to complete the tests to the best of their ability; if the tests had been poorly organised and presented, candidates would have been more likely to lose interest. Candidates were also interested in their test results, and working with Criterion made it possible to send test outcomes to students for them to use and learn from in the future.
Future Improvements
The target job role was student caller at the University of Warwick. The student caller person specification contained very clear key attributes, such as high confidence, adaptability, strong negotiation skills and being target-driven. Such attributes tend to be associated with extroverted individuals, so it would be useful to know whether the same results can be found in other sectors and job types. For example, would there be the same predictive value for a creative, project-based role?
In addition, all applicants for the role were students at the University of Warwick. The vast majority of candidates were therefore aged between 18 and 22 years old, at the very early stages of their careers and with limited previous work experience. The student caller recruitment campaign acknowledged this and consequently placed no emphasis on prior experience. This can be viewed as a positive, because it gives more weight to the data from the selection tools. However, it raises the future question of whether predictive validity increases or decreases when greater emphasis is placed on prior experience, perhaps also when tested with higher-level jobs. Furthermore, according to data from the department of Development & Alumni Relations, the sample of applicants reflected the University-wide demographic, with 36% of applicants coming from outside the European Union (EU). This multicultural sample has the potential to contribute to the cross-cultural literature; it would be a step forward if further analysis could be carried out on differences between races, cultures and nationalities, in order to further test these new combinations of selection methods.
As mentioned in the pros and cons, a limitation of this study was the short assessment period. Despite this, associations can be made between GCA and personality and job performance. Building on this point, there is minimal literature on whether GCA and personality are associated with longer-term career success; testing this would require a longitudinal study.
Also, it was mentioned that a positive of this study was the custom design of the personality test, which was tailored to the student caller job specification. Although the findings showed a positive relationship, it would be valuable to explore whether a similar outcome can be achieved with more universal, broad personality tests. The literature suggests that the trait of conscientiousness consistently predicts job performance across job sectors (Barrick & Mount, 1991). It would be useful to test whether incorporating conscientiousness into the current format maintains or improves the level of overall predictive validity and therefore removes the need for customisation for each job role. This would potentially strengthen its practical application and thus its appeal to organisations.
Practical Application
There are great financial benefits to these findings for organisations. Research suggests that a high-performing employee in the 98th percentile, compared with an average employee in the 50th percentile, produces on average $32,000 (approx. £20,000) more per annum (Schmidt & Hunter, 1998). In organisational terms, during a recruitment drive, employing 10 or more high-performing employees can therefore increase financial productivity by over £200,000 per annum.
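As a rough illustration of the utility arithmetic behind this claim (assuming approximately £20,000 of additional output per high performer per annum, following Schmidt & Hunter, 1998):

```python
# Back-of-the-envelope sketch; the £20,000 figure is the approximate value
# quoted above, and the headcount of 10 is illustrative.
extra_output_per_high_performer = 20_000   # approx. £ per annum
high_performers_hired = 10

added_productivity = extra_output_per_high_performer * high_performers_hired
print(f"Estimated added productivity: £{added_productivity:,} per annum")  # £200,000
```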
In addition, the cost-effectiveness of using GCA and personality testing is significant. Assessment centres and structured interviews can be very costly and time-consuming. With the implementation of these testing tools, which according to this research improve overall predictive validity, external costs can be minimised and the length of interviews can be shortened, as much of the useful data about the candidate can be gathered via online testing. With the tools available online, data collection is also very efficient and removes the need for paper tests, as candidates can complete the tests at their leisure. This does, however, introduce additional external variables that are difficult to control and may affect test performance, such as noise, guessing, internet connection failure or cheating. The extreme outliers were removed from the results because the gap between the verbal and numeracy scores was artificially large; candidates complete the verbal test before the numeracy test, so these two outliers may have been due to fatigue and loss of motivation. This was allowed for by clearly stating the approximate duration and designing the tests to not be too long. In addition, studies have shown that students' moral judgement is significantly lower online, and that cheating in online employment psychometrics and GCA tests, via search engines or group collaboration, is rising (Young & Calabrese, 2007). It can therefore be argued that, while applicants may find methods to artificially inflate their GCA scores, treating GCA as a supplementary tool, in combination with personality testing and the structured interview, is important to minimise the effect of cheating.
Furthermore, the use of GCA tests in selection has been criticised for producing ethnic group differences in scores. Another positive practical application of using personality tests alongside GCA is therefore that personality measures typically do not show these ethnic group differences. Together, GCA and personality can increase overall validity while reducing group differences, compared with either method used separately.
Contribution to Occupational Psychology
The research findings fill a gap in the literature and add to the limited number of studies in this area. They also contribute to the growing evidence supporting the use of GCA tests in selection, while suggesting a more supplementary role for GCA. For the practical application of GCA within organisational selection and assessment, it should not matter so much why and how GCA predicts performance, but that the evidence supports that it does. The consideration given in this research to bias, through the combination with personality, is therefore positive: not only does the research provide a way to minimise the controversial perceptions of GCA, it also provides a theoretical basis on which to gain acceptance of the empirical findings.
The research also demonstrates a new use for psychological testing: personality tests used not at the pre-screening stage, but as a significant tool to aid the final hiring decision. Personality testing has been shown to be a successful complementary tool to GCA and to increase overall validity. The research has opened a new path for future work to investigate different variations of this combination, with the aim of increasing both predictive validity and acceptance within Occupational Psychology.
The research strengthens the empirical findings which demonstrate that GCA is highly predictive of job performance, even when used within more traditional selection methods such as the structured interview. This should go some way towards making GCA more positively perceived, and more widely used, within selection and assessment.
Word count: 10,188
Critical Self-review
Appraisal of practise
General practice in terms of the study's timeline was efficient and only small adjustments were necessary. The confirmation of the target organisation was achieved much later than planned, which put increased time constraints on certain parts of the study; the search for potential organisations should have commenced earlier in the year. The method was carried out successfully and all ethical considerations were addressed. The e-mail feedback from Criterion consultants, the head recruiter at Warwick and student applicants indicates that professionalism was maintained throughout the study.
Personal approach
Following initial face-to-face meetings, the majority of my organisation and communication throughout the study was conducted via e-mail. I attempted to keep it very personable and to provide all necessary information along the way. A trial run of the tests was shown to the department prior to the study, to give a better understanding of the tools and data they would be using. This also included a demonstration of my own custom-designed supervisor performance rating sheet, which was based around the job specification.
Assumptions
A key assumption made during the study was that the interview panel would use and take into account the test data and accept the change to the traditional format. Observations and notes taken during the selection day suggest that this was the case. This is the key distinguishing factor between a 'test' group and the control group, and therefore vitally important in terms of its influence on the results. It was also assumed that one month, although not ideal, would be sufficient time for the supervisor to give an accurate account of each candidate's performance.
Researcher bias
The choice of topic was made during a selection and assessment lecture given by my subsequent project supervisor. I was inspired by how, in the space of 10 minutes, my perceptions of certain selection methods changed. I had previously assumed that GCA or IQ tests had little validity in predicting performance, so learning about the significant contrary evidence appealed to me, particularly as I was disillusioned and frustrated with current, more traditional selection methods, e.g. stressful assessment centres. As a result, researcher bias may have arisen through my personal determination to support the evidence behind the alternative methods of GCA and personality, leading to a blinkered view of patterns in the statistics.
What have I learnt?
Through the reading for the literature review I have learnt that there is a great challenge for academic research to be applied practically in everyday life; at times, adaptation is required to make advances in the research, e.g. the use of GCA and personality tests with structured interviews. I have also learnt a great amount in terms of statistics, reading journals, efficient time management, and liaising with various stakeholders during the research: Criterion Partnership consultants about the tools, the University HR team about recruitment, and the participants about completing the tests. This has taught me much about academic writing and has developed many skills that I will utilise throughout my future career.
Acknowledgements
First and foremost, I would like to thank my supervisor Dr Chris Dewberry who provided the
inspiration behind the research and supported me throughout the year.
I am also very grateful to Samuel Burdock, manager of the student caller recruitment at
the University of Warwick, who was open-minded to new ideas and made my project possible by
providing a target organisation. Also, I thank Maria Gardener and Juliane Sternemann, consultants at
Criterion Partnership, who worked closely with me and kindly offered their testing tools for free, in
the name of research.
References
Arvey, R. D., Miller, H. E., Gould, R., & Burch, P. (1987). Interview validity for selecting sales
clerks. Personnel Psychology, 40(1), 1-12.
Arvey, R. D., & Campion, J. E. (1982). The employment interview: A summary and review of
recent research. Personnel Psychology, 35(2), 281-322.
Barrick, M. R., & Mount, M. K. (1996). Effects of impression management and self-deception
on the predictive validity of personality constructs. Journal of applied
psychology, 81(3), 261.
British Psychological Society (BPS): Psychological testing – a new approach. (2013). Retrieved
March 7, 2014 http://www.bps.org.uk/psychological-testing-new-approach
Campbell, R. L (1992). Types of constraints on development: An interactivist
approach. Developmental Review, 12(3), 311-338.
Carlsen, G. R. (1967). Literature: Dead or Alive?. California English Journal,3(3), 24-29.
Conway, J. M., & Huffcutt, A. I. (1997). Psychometric properties of multisource performance
ratings: A meta-analysis of subordinate, supervisor, peer, and self-ratings. Human
Performance, 10(4), 331-360.
Day, D. V. (1993). Brief psychotherapy in two-plus-one sessions with a young offender
population. Behavioural and Cognitive Psychotherapy, 21(04), 357-369.
Day, D. V., & Silverman, S. B. (1989). Personality and job performance: Evidence of
incremental validity. Personnel Psychology, 42(1), 25-36.
Dewberry, C. (2001). Performance disparities between whites and ethnic minorities: real
differences or assessment bias?. Journal of Occupational and Organizational
Psychology, 74(5), 659- 673.
Dewberry, C., Davies‐Muir, A., & Newell, S. (2013). Impact and causes of rater
severity/leniency in appraisals without postevaluation communication between raters
and ratees. International Journal of Selection and Assessment, 21(3), 286-293.
Frank, L. L., & Hackman, J. R. (1975). Effects of interviewer-interviewee similarity on
interviewer objectivity in college admissions interviews. Journal of Applied
Psychology, 60(3), 356.
Goffin, R. D., & Christiansen, N. D. (2003). Correcting personality tests for faking: A review of
popular personality tests and an initial survey of researchers. International Journal of
Selection and Assessment, 11(4), 340-344.
Guion, R. M., & Gottier, R. F. (1965). Validity of personality measures in personnel
selection. Personnel psychology, 18(2), 135-164.
Hough, L. M., Eaton, N. K., Dunnette, M. D., Kamp, J. D., & McCloy, R. A. (1990). Criterion-
related validities of personality constructs and the effect of response distortion on those
validities. Journal of applied psychology, 75(5), 581.
Hough, L. M., & Oswald, F. L. (2008). Personality testing and industrial–organizational
psychology: Reflections, progress, and prospects. Industrial and Organizational
Psychology, 1(3), 272- 290.
Huffcutt, A. I., Roth, P. L., & McDaniel, M. A. (1996). A meta-analytic investigation of
cognitive ability in employment interview evaluations: Moderating characteristics and
implications for incremental validity. Journal of Applied Psychology, 81(5), 459.
Hunter, J. E. (1990). An analysis of the diversification benefit from international equity
investment. The Journal of Portfolio Management,17(1), 33-36.
Hunter, J. E., & Hunter, R. F. (1984). Validity and utility of alternative predictors of job
performance. Psychological bulletin, 96(1), 72.
Hunter, J. E., Schmidt, F. L., & Judiesch, M. K. (1990). Individual differences in output
variability as a function of job complexity. Journal of Applied Psychology,75(1), 28.
Janz, T. (1989). Case study on Utility: Utility to the rescue, a case of staffing program decision
support'. Advances in Selection and Assessment. Chichester: John Wiley, 269-272.
Jensen, B. (1998). Communication or knowledge management. It’s time to wake up and smell
the koffee. Communication World.
Kane, J. S., Bernardin, H. J., Villanova, P., & Peyrefitte, J. (1995). Stability of rater leniency:
Three studies. Academy of Management Journal, 38(4), 1036-1051.
Ledvinka, J. (1973). Race of employment interviewer and reasons given by black job seekers
for leaving their jobs. Journal of Applied Psychology, 58(3), 362.
McCarthy, J. M., Van Iddekinge, C. H., & Campion, M. A. (2010). Are highly structured job
interviews resistant to demographic similarity effects?. Personnel Psychology, 63(2),
325-359.
McDaniel, M. A., Whetzel, D. L., Schmidt, F. L., & Maurer, S. D. (1994). The validity of
employment interviews: A comprehensive review and meta-analysis. Journal of Applied
Psychology, 79(4), 599.
Muldrow, T. W. (1979). Impact of valid selection procedures on work-force
productivity. Journal of Applied Psychology, 64(6), 609.
Murphy, K. R. (1989). Is the relationship between cognitive ability and job performance stable
over time?. Human Performance, 2(3), 183-200.
Nell, V. (2000). An evolutionary perspective on the prevention of youthful risk-taking: The case
for classical conditioning. In Transportation, Traffic Safety and Health—Human
Behavior (pp.163-179). Springer Berlin Heidelberg.
Ones, D. S., Dilchert, S., Viswesvaran, C., & Judge, T. A. (2007). In support of personality
assessment in organizational settings. Personnel psychology, 60(4), 995-1027.
Ones, D. S., Viswesvaran, C., & Schmidt, F. L. (1993). Comprehensive meta-analysis of
integrity test validities: Findings and implications for personnel selection and theories
of job performance. Journal of applied psychology, 78(4), 679.
Outtz, J. L. (2002). The role of cognitive ability tests in employment selection. Human
Performance, 15(1-2), 161-171.
Raven, J. (2000). The Raven's progressive matrices: change and stability over culture and
time. Cognitive psychology, 41(1), 1-48.
Reeve, C. L., & Hakel, M. D. (2002). Asking the right questions about g. Human
Performance, 15(1-2), 47-74.
Rothstein, M. G., & Goffin, R. D. (2006). The use of personality measures in personnel
selection: What does current research support?. Human Resource Management
Review, 16(2), 155-180.
Rushton, J. P., Skuy, M., & Fridjhon, P. (2003). Performance on Raven's advanced progressive
matrices by African, East Indian, and White engineering students in South
Africa. Intelligence, 31(2), 123-137.
Schmidt, F. L., & Kaplan, L. B. (1971). Composite vs. multiple criteria: A review and
resolution of the controversy. Personnel Psychology, 24(3), 419-434.
Schmidt, F. L., & Hunter, J. E. (1977). Development of a general solution to the problem of
validity generalization. Journal of Applied Psychology, 62(5), 529.
Schmidt, F. L. & Hunter, J. E. (1990). Dichotomization of continuous variables: The
implications for meta-analysis. Journal of Applied Psychology, 75(3), 334.
Schmidt, F. L. & Hunter, J. E. (1996). Intelligence and job performance: Economic and social
implications. Psychology, Public Policy, and Law, 2(3-4), 447.
Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel
psychology: Practical and theoretical implications of 85 years of research
findings. Psychological bulletin, 124(2), 262.
Schmidt, F. L., & Hunter, J. (2004). General mental ability in the world of work: occupational
attainment and job performance. Journal of personality and social
psychology, 86(1), 162.
Schmidt, F. L., & Ones, D. S. (2002). The moderating influence of job performance dimensions
on convergence of supervisory and peer ratings of job performance: Unconfounding
construct-level convergence and rating difficulty. Journal of Applied Psychology, 87(2),
345.
Schmidt, F. L., Hunter, J. E. & Pearlman, K. (1980). Validity generalization results for tests
used to predict job proficiency and training success in clerical occupations. Journal of
Applied Psychology, 65(4), 373.
Schmitt, N., Noe, R. A., Meritt, R., & Fitzgerald, M. P. (1984). Validity of assessment center
ratings for the prediction of performance ratings and school climate of school
administrators. Journal of Applied psychology, 69(2), 207.
Skuy, M. (2001). Performance on Raven's Matrices by African and White university students in
South Africa. Intelligence, 28(4), 251-265.
Spearman, C. (1904). “General Intelligence," Objectively Determined and Measured. The
American Journal of Psychology, 15(2), 201-292.
Sternberg, R. J. (2001). The predictive value of IQ. Merrill-Palmer Quarterly, 47(1), 1-41.
Tett, R. P. (1999). Meta-analysis of bidirectional relations in personality-job performance
research. Human Performance, 12(1), 1-29.
The Equality Act (2010). Retrieved August 7, 2014 https://www.gov.uk/equality-act-2010-
guidance
Ulrich, L., & Trumbo, D. (1965). The selection interview since 1949. Psychological
Bulletin, 63(2), 100.
Villanova, P. (2009). Rating level and Accuracy as a Function of Rater
Personality. International Journal of Selection and Assessment, 17(3), 300-310.
Weiss, D. J. (1984). Application of computerized adaptive testing to educational
problems. Journal of Educational Measurement, 21(4), 361-375.
Wright, O. R. (1969). Summary of research on the selection interview since 1964. Personnel
Psychology, 22(4), 391-413.
Yukl, G. A., Kovacs, S. Z., & Sanders, R. E. (1972). Importance of contrast effects in
employment interviews. Journal of Applied Psychology,56(1), 45.
Yun, G. J., Donahue, L. M., Dudley, N. M., & McFarland, L. A. (2005). Rater personality,
rating format, and social context: Implications for performance appraisal
ratings. International Journal of Selection and Assessment, 13(2), 97-107.
Zhang, Y. Luo, F. & Liu, H. Y., (2010). The Developing of Faking Detection Scale in
Application Situation [J]. Acta Psychologica Sinica, 7, 006.
Zhou, J. B., & Miao, Q. (2004). Decision Making Based Upon the Attribute of Subjective
Value: A Quasi Experimental Research. Chinese Journal of Management, 10(4), 20-21.
APPENDIX A
The ethics form
Organizational Psychology, BIRKBECK UNIVERSITY OF LONDON
PROPOSAL TO CONDUCT RESEARCH INVOLVING HUMAN PARTICIPANTS
SUBMISSION TO SCHOOL ETHICS COMMITTEE
Please type or write clearly in BLACK ink
Name of investigator_ Alan
Raftery____________________________________________________
Status (e.g. PhD student, postgraduate) _postgraduate_____________________
Name of supervisor (if known) _Chris Dewberry___________________________________
Course/Programme:_Occupational Psychology MSc_________________________
Title of investigation (15 words maximum):
Improving the Validity of Structured Interviews with the Implementation of Personality
Profiling and General Cognitive Ability Testing.
Contact address _8 Erithway Rd___________________________________
for investigator _ Coventry___________________________________
_West Midlands__________________________________
Telephone number _024 76 410 135_______________ Mobile
07732068937__________________
Email alanraftery@live.com__________________
Date of Application:_16/09/2014*_ Proposed starting
date:__16/09/2014__________________
Source of funding if relevant: ____n/a________________________
Is any other Ethical Committee involved: YES/NO
If YES, give details of committee and its decision:
Brief description of aims/objectives of the study
To test new research which suggests that a combination of the popular structured
interviews with alternative selection methods, such as GCA, can provide an improved
selection process which minimises inequality and bias, while maintaining high
performance prediction and relevance to job role.
How will participants be selected? Will the selection process have implication in terms of data
protection etc?
Where will the study be conducted?
The target organisation is the department of Development & Alumni Relations Office,
University of Warwick. The department run seasonal campaigns, recruiting current students into
‘student caller’ roles. My research is part of a review of the recruitment processes within the
department.
Briefly describe what participating in the study will involve:
The 300-400 applicants will be split into 4 groups at random, to be asked to complete
either:
 GCA test
 Personality test
 Both GCA and personality test
 No tests – control group
Candidates will be contacted via e-mail and sent the link to the Criterion tools. The
email will contain information about the research and candidates will be given the
opportunity to decline participation and withdraw at any stage, with no consequences on
their chances within the recruitment process.
The 30-40 successful candidates will be given 1 month to settle
into their role before supervisor ratings are recorded. This data is collected over a 3 day
period. All participants are thanked and informed according to the ethical guidelines.
Does the study involve the deliberate use of:
(i) Unpleasant stimuli or unpleasant situations?
(YES/NO)
(ii) Invasive procedures?
(YES/NO)
(iii) Deprivation or restriction (e.g., food, water, sleep)?
(YES/NO)
(iv) Drug administration?
(YES/NO)
(v) Actively misleading or deceiving the subjects?
(YES/NO)
(vi) Withholding information about the nature or outcome of the experiment?
(YES/NO)
(vii) Any inducement or payment to take part in the experiment
(YES/NO)
Does the study have any procedure that might cause distress to the subject?
(YES/NO)
Give details of any item marked YES:
The tests, and in particular the general cognitive ability (GCA) tests, can be quite stressful for the
candidate. They are challenging and timed, which may cause anxiety or distress. The target
sample should be comfortable with the format of the tests, as it is drawn from students who have
applied through a process that involves tasks, interviews and assessment centres. All candidates
are reassured that they do not have to take the tests and that the results of the tests will not affect
their application.
What arrangements are to be made to obtain the free and informed consent of the subjects?
Upon initial contact with the candidates via e-mail, it will be made clear that they are under no
obligation to participate and are able to withdraw their data at any stage. Candidates are also
reassured by the head recruitment officer, within the e-mail, that not participating will not affect
their chances during the recruitment process. Upon clicking on the link provided, participants
will be asked to read an information sheet and, if willing, sign a consent form before proceeding
to the tests.
How will you maintain the participants’ confidentiality?
The data collected via the Criterion tools (personality and GCA) are stored securely on their
system under the consultancy's strict confidentiality guidelines. The data are then sent to me and
stored on one computer within a password-locked folder. The data at this stage are fully
anonymous, as each candidate is given a unique code. Upon request, candidates may receive
their own results/scores from Criterion via e-mail. The supervisor ratings will be added to the
data using the anonymous candidate codes. Following analysis, the data will be destroyed.
Will the subjects be minors or suffer from learning disabilities? (YES/NO)
If yes, outline how you will address the ethical issues raised.
If you feel that the proposed investigation raises ethical issues please outline them below:
Will the research involve any conflict between your role at work and your role as a research
student?
(i.e. will you want to use data/colleagues that you have access/contact with in your job
but as a researcher this data/colleagues would not normally be available to you)
Classification of proposal (please underline) ROUTINE NON-ROUTINE
When you are ready to start data collection you and your supervisor should check the ethics
form has been satisfactorily completed before signing the form and sending a copy to George
Michaelides.
I consider that my study conforms with the ethical expectations of management and
psychological research
SIGNATURE of investigator and supervisor (Signed in Research Proposal)
ALAN RAFTERY
Date:16/09/2014
Word count of this document: 857
APPENDIX B
Coast
Personality Questionnaire Scales Booklet
© 2013 Criterion Partnership Ltd
01273 734000
coast@criterionpartnership.co.uk
http://www.surftocoast.co.uk
www.criterionpartnership.co.uk
Interpersonal style

Adaptable
Rarely alters behaviour to create an impression in different circumstances. Personality consistent across situations.
Adapts style of behaviour to suit different individuals. Changes personality in different situations.

Approachable
Reserved. Takes time to get to know people. Can appear guarded. Dislikes small talk.
Friendly. Easy to get on with. Quickly builds rapport with others.
Assertive
Dislikes being bossy. Tends to play supporting roles
rather than directive ones.
Dominant. Makes presence felt. Sometimes
overbearing with others.
Direct
Diplomatic and tactful. Cautious in expressing
opinions. Tends to avoid confrontations.
Candid. Speaks out without worrying too much about
upsetting people. Direct in expressing opinions.
Gregarious
Reticent and quiet in many social situations. May
appear shy in some circumstances.
At ease with other people. Confident and relaxed on
social occasions.
Enjoys own company. Happy to work alone. Inclined to
be less sociable than others.
Likes the company of other people. Sociable. Works
well with others. May dislike working alone.
Accepts other people's views. Prefers to 'agree to
disagree' rather than try to influence or persuade.
Persuasive. Persists in trying to influence other people.
Aims to win people over.
Independent
Happy to fit in with others. Prefers to be considered
normal rather than different. Content to compromise.
Non-conforming. Goes own way. Likes to be different.
Dislikes compromising to suit others.
Listening
Likes to be the one who does the talking. Sometimes
doesn't listen to others or forgets what they say.
Prepared to take time to listen to people. Considers
others' opinions. Easy to talk to.
Poised
Gets work done as quickly as possible. Looks for ways
to cut corners. More expedient than careful.
Thorough and conscientious. Likes to do things
properly. Takes time and avoids short cuts.
Cautious. More inclined to 'play safe' than take chance
decisions. Avoids substantial risks.
Prepared to take chances. Sometimes enjoys throwing
caution to the wind.
Thinking style
Happy to stick with clearly defined systems which work.
Prefers following procedures to creating new methods.
More interested in the main task than the intricate
details. Prepared to leave others to spot minor errors.
Looks for new approaches. Enjoys trying new ideas.
Prefers inventing new methods to applying old ones.
Takes a perfectionist approach. Enjoys attending to
detail. Notices points that others overlook.
Takes time to consider all options before taking a course of action. Dislikes making snap decisions.
Quick to take decisions. Prefers to reach decisions rapidly rather than leave issues open.
Intuitive. Likes to rely on feeling rather than gathering
too much data. Guided by experience more than
rational analysis.
Scientific and analytical when dealing with problems.
Logical by nature. Prefers to rely on data.
Distractible. Finds it difficult to stick with routine tasks.
Becomes bored quickly with tedious jobs.
Follows things through. Persists with a task even if it is
boring.
Concerned with concrete practicalities. Less interested
in the theory or the wider implications. Applies thinking
to operational considerations.
Approaches issues from a theoretical perspective.
Concerned with underlying principles. Prefers strategy
to operational specifics.
More inclined to be spontaneous than structured. Finds
working to a plan restrictive.
Methodical, orderly and systematic. Plans things out
before starting. Uncomfortable working in a chaotic
manner.
Not usually upset by criticism. Tough, rather than
emotional.
Sensitive to criticism. Can become emotional and suffer
from the feeling of being hurt.
Emotional style
Often feels anxious. May worry and feel tense. Finds it difficult to relax.
Disclosure
Private. Keeps emotions to self. Prefers not to let true
feelings be obvious to others.
Expresses emotions easily. Lets feelings show. Open
with people about emotions.
Emotional analysis
Avoids thinking up explanations for feelings. Trusts own
emotions without needing to explain them.
Likes to make sense of own emotions. Tries to find
explanations for feelings.
Internal control
Resilient
Feels that events are outside own control. Sometimes
leaves things to fate. Attributes outcomes to good and
bad luck.
Feels a strong sense of personal control. Responsible
for own destiny. Tries to influence events towards
preferred outcomes.
Optimistic
Often anticipates the negative. Sometimes pessimistic
about the future. Sees problems more than benefits.
Expects things to turn out for the best. Confident about
the future. Accentuates the positive.
Calm
Unlikely to become tense or flustered. Cool and calm
even in difficult situations.
Prefers to take time over work. Works better when
feeling calm and relaxed. Dislikes being rushed.
Not flustered by high pressure work. Happy coping with
tight deadlines. Works better under stress.
Stress management
Self esteem
May experience self-doubt. May sometimes need encouragement to build up self-confidence.
Has inner confidence in own abilities. Feels self-assured and values own worth.

Self sufficiency
Seeks emotional support from other people. Enjoys being able to have someone to turn to.
Prefers to rely on self without emotional support from others. Feels emotionally self-sufficient.
12912540 RAFTERY Alan
12912540 RAFTERY Alan
12912540 RAFTERY Alan
12912540 RAFTERY Alan
12912540 RAFTERY Alan
12912540 RAFTERY Alan
12912540 RAFTERY Alan
12912540 RAFTERY Alan

More Related Content

What's hot

3 chapter3 methodology_andrea_gorra
3 chapter3 methodology_andrea_gorra3 chapter3 methodology_andrea_gorra
3 chapter3 methodology_andrea_gorra
Rahmawan Rahmawan
 
Quantitative Research Methods by Cheryl Vierheilig
Quantitative Research Methods by Cheryl VierheiligQuantitative Research Methods by Cheryl Vierheilig
Quantitative Research Methods by Cheryl VierheiligCheryl Vierheilig, MBA, MHR
 
Qualitative research design
Qualitative research designQualitative research design
Qualitative research design
Thangamani Ramalingam
 
Criteria in social research
Criteria in social researchCriteria in social research
Criteria in social research
jyothinairsaseendran
 
Chapter 6 validity & reliability
Chapter 6 validity & reliabilityChapter 6 validity & reliability
Chapter 6 validity & reliabilityBean Malicse
 
Characteristics of a good test
Characteristics  of a good testCharacteristics  of a good test
Characteristics of a good test
Lalima Tripathi
 
Reliability and validity
Reliability and validityReliability and validity
Reliability and validity
Carlos Tian Chow Correos
 
01 validity and its type
01 validity and its type01 validity and its type
01 validity and its type
Noorulhadi Qureshi
 
Quantitative data analysis
Quantitative data analysisQuantitative data analysis
Quantitative data analysis
Tahmina Ferdous Tanny
 
Reliability and its types: Split half method and test retest methods
Reliability and its types: Split half method and test retest methodsReliability and its types: Split half method and test retest methods
Reliability and its types: Split half method and test retest methods
Aamir Hussain
 
Concepts, Operationalization and Measurement
Concepts, Operationalization and MeasurementConcepts, Operationalization and Measurement
Concepts, Operationalization and Measurementchristineshearer
 
Reliability
ReliabilityReliability
Validity and reliability
Validity and reliabilityValidity and reliability
Validity and reliability
Sefa Soner Bayraktar
 
Validity in Research
Validity in ResearchValidity in Research
Validity in Research
Ecem Ekinci
 
Research Part II
Research Part IIResearch Part II

What's hot (16)

3 chapter3 methodology_andrea_gorra
3 chapter3 methodology_andrea_gorra3 chapter3 methodology_andrea_gorra
3 chapter3 methodology_andrea_gorra
 
Quantitative Research Methods by Cheryl Vierheilig
Quantitative Research Methods by Cheryl VierheiligQuantitative Research Methods by Cheryl Vierheilig
Quantitative Research Methods by Cheryl Vierheilig
 
Qualitative research design
Qualitative research designQualitative research design
Qualitative research design
 
Criteria in social research
Criteria in social researchCriteria in social research
Criteria in social research
 
Chapter 6 validity & reliability
Chapter 6 validity & reliabilityChapter 6 validity & reliability
Chapter 6 validity & reliability
 
Characteristics of a good test
Characteristics  of a good testCharacteristics  of a good test
Characteristics of a good test
 
Reliability and validity
Reliability and validityReliability and validity
Reliability and validity
 
SOC2002 Lecture 4
SOC2002 Lecture 4SOC2002 Lecture 4
SOC2002 Lecture 4
 
01 validity and its type
01 validity and its type01 validity and its type
01 validity and its type
 
Quantitative data analysis
Quantitative data analysisQuantitative data analysis
Quantitative data analysis
 
Reliability and its types: Split half method and test retest methods
Reliability and its types: Split half method and test retest methodsReliability and its types: Split half method and test retest methods
Reliability and its types: Split half method and test retest methods
 
Concepts, Operationalization and Measurement
Concepts, Operationalization and MeasurementConcepts, Operationalization and Measurement
Concepts, Operationalization and Measurement
 
Reliability
ReliabilityReliability
Reliability
 
Validity and reliability
Validity and reliabilityValidity and reliability
Validity and reliability
 
Validity in Research
Validity in ResearchValidity in Research
Validity in Research
 
Research Part II
Research Part IIResearch Part II
Research Part II
 

Similar to 12912540 RAFTERY Alan

Pre Assessment Quantitative And Qualitative Data Essay
Pre Assessment Quantitative And Qualitative Data EssayPre Assessment Quantitative And Qualitative Data Essay
Pre Assessment Quantitative And Qualitative Data Essay
Tiffany Sandoval
 
1_Q2-PRACTICAL-RESEARCH.pptx
1_Q2-PRACTICAL-RESEARCH.pptx1_Q2-PRACTICAL-RESEARCH.pptx
1_Q2-PRACTICAL-RESEARCH.pptx
GeraldRefil3
 
HCP Survey _ Results _ Final.
HCP Survey _ Results _ Final.HCP Survey _ Results _ Final.
HCP Survey _ Results _ Final.Delly Dickson
 
Mixed methods latest
Mixed methods latestMixed methods latest
Mixed methods latest
JohanEddyLuaran
 
Mixed research methodology.pptx
Mixed research methodology.pptxMixed research methodology.pptx
Mixed research methodology.pptx
PuneethKumarGB
 
Sample Research Essay
Sample Research EssaySample Research Essay
Sample Research Essay
Paper Writing Service Reviews
 
Research methodologies increasing understanding of the world
Research methodologies increasing understanding of the worldResearch methodologies increasing understanding of the world
Research methodologies increasing understanding of the world
Dr. Mary Jane Coy, PhD
 
Marketing research
Marketing researchMarketing research
Marketing research
Nasir Uddin
 
Personality Psychometric Testing
Personality Psychometric TestingPersonality Psychometric Testing
Personality Psychometric TestingJacob Eccles
 
Applying A Mixed Methods For Choosing Text And Data...
Applying A Mixed Methods For Choosing Text And Data...Applying A Mixed Methods For Choosing Text And Data...
Applying A Mixed Methods For Choosing Text And Data...
Jennifer Reither
 
Quality issues in mixed methods research (with an emphasis on teaching) - Ala...
Quality issues in mixed methods research (with an emphasis on teaching) - Ala...Quality issues in mixed methods research (with an emphasis on teaching) - Ala...
Quality issues in mixed methods research (with an emphasis on teaching) - Ala...
The Higher Education Academy
 
B06501Proposalandethicalform M2.doc
B06501Proposalandethicalform M2.docB06501Proposalandethicalform M2.doc
B06501Proposalandethicalform M2.doc
Shilpi jadon
 
A sample report on research skills
A sample report on research skillsA sample report on research skills
A sample report on research skills
www.assignmentdesk.co.uk
 
The Pros Of Construct Validity
The Pros Of Construct ValidityThe Pros Of Construct Validity
The Pros Of Construct Validity
Jennifer Wood
 
Examples Of Research Essays
Examples Of Research EssaysExamples Of Research Essays
Examples Of Research Essays
What Are The Best Paper Writing Services
 
Research Essay Sample
Research Essay SampleResearch Essay Sample
The Importance Of Quantitative Research Designs
The Importance Of Quantitative Research DesignsThe Importance Of Quantitative Research Designs
The Importance Of Quantitative Research Designs
Nicole Savoie
 
HOSP3075 Brand Analysis Paper 1This is the first of three assignme.docx
HOSP3075 Brand Analysis Paper 1This is the first of three assignme.docxHOSP3075 Brand Analysis Paper 1This is the first of three assignme.docx
HOSP3075 Brand Analysis Paper 1This is the first of three assignme.docx
simonithomas47935
 
The gist of Research in Business Research
The gist of Research in Business ResearchThe gist of Research in Business Research
The gist of Research in Business Research
Kuria3
 
Business research(Rubrics)
Business research(Rubrics)Business research(Rubrics)
Business research(Rubrics)
DrAmitPurushottam
 

Similar to 12912540 RAFTERY Alan (20)

Pre Assessment Quantitative And Qualitative Data Essay
Pre Assessment Quantitative And Qualitative Data EssayPre Assessment Quantitative And Qualitative Data Essay
Pre Assessment Quantitative And Qualitative Data Essay
 
1_Q2-PRACTICAL-RESEARCH.pptx
1_Q2-PRACTICAL-RESEARCH.pptx1_Q2-PRACTICAL-RESEARCH.pptx
1_Q2-PRACTICAL-RESEARCH.pptx
 
HCP Survey _ Results _ Final.
HCP Survey _ Results _ Final.HCP Survey _ Results _ Final.
HCP Survey _ Results _ Final.
 
Mixed methods latest
Mixed methods latestMixed methods latest
Mixed methods latest
 
Mixed research methodology.pptx
Mixed research methodology.pptxMixed research methodology.pptx
Mixed research methodology.pptx
 
Sample Research Essay
Sample Research EssaySample Research Essay
Sample Research Essay
 
Research methodologies increasing understanding of the world
Research methodologies increasing understanding of the worldResearch methodologies increasing understanding of the world
Research methodologies increasing understanding of the world
 
Marketing research
Marketing researchMarketing research
Marketing research
 
Personality Psychometric Testing
Personality Psychometric TestingPersonality Psychometric Testing
Personality Psychometric Testing
 
Applying A Mixed Methods For Choosing Text And Data...
Applying A Mixed Methods For Choosing Text And Data...Applying A Mixed Methods For Choosing Text And Data...
Applying A Mixed Methods For Choosing Text And Data...
 
Quality issues in mixed methods research (with an emphasis on teaching) - Ala...
Quality issues in mixed methods research (with an emphasis on teaching) - Ala...Quality issues in mixed methods research (with an emphasis on teaching) - Ala...
Quality issues in mixed methods research (with an emphasis on teaching) - Ala...
 
B06501Proposalandethicalform M2.doc
B06501Proposalandethicalform M2.docB06501Proposalandethicalform M2.doc
B06501Proposalandethicalform M2.doc
 
A sample report on research skills
A sample report on research skillsA sample report on research skills
A sample report on research skills
 
The Pros Of Construct Validity
The Pros Of Construct ValidityThe Pros Of Construct Validity
The Pros Of Construct Validity
 
Examples Of Research Essays
Examples Of Research EssaysExamples Of Research Essays
Examples Of Research Essays
 
Research Essay Sample
Research Essay SampleResearch Essay Sample
Research Essay Sample
 
The Importance Of Quantitative Research Designs
The Importance Of Quantitative Research DesignsThe Importance Of Quantitative Research Designs
The Importance Of Quantitative Research Designs
 
HOSP3075 Brand Analysis Paper 1This is the first of three assignme.docx
HOSP3075 Brand Analysis Paper 1This is the first of three assignme.docxHOSP3075 Brand Analysis Paper 1This is the first of three assignme.docx
HOSP3075 Brand Analysis Paper 1This is the first of three assignme.docx
 
The gist of Research in Business Research
The gist of Research in Business ResearchThe gist of Research in Business Research
The gist of Research in Business Research
 
Business research(Rubrics)
Business research(Rubrics)Business research(Rubrics)
Business research(Rubrics)
 

12912540 RAFTERY Alan

  • 1. 1 Student ID: 12912540 Student name: Alan Raftery Improving the Validity of Structured Interviews with the Implementation of Personality and General Cognitive Ability Testing. Supervisor: Dr Chris Dewberry Occupational Psychology MSc Submission Date: 30th September 2014
  • 2. 2 Student ID: 12912540 Contents Declaration.................................................................................................................................... 4 Abstract......................................................................................................................................... 5 Introduction .................................................................................................................................. 6 Literature Review........................................................................................................................ 10 Methodology............................................................................................................................... 25 Organisation............................................................................................................................ 25 Research Design ...................................................................................................................... 25 Sample..................................................................................................................................... 26 Data Collection........................................................................................................................ 26 Personality Test....................................................................................................................... 27 General Cognitive Ability (GCA) test ....................................................................................... 28 Supervisor Ratings................................................................................................................... 29 Procedure................................................................................................................................ 30 Data Analysis........................................................................................................................... 32 Ethical Considerations............................................................................................................. 32 Results......................................................................................................................................... 33 Descriptive Statistics ............................................................................................................... 33 Statistical analysis ................................................................................................................... 36 Discussion and Conclusion .......................................................................................................... 38 Findings ................................................................................................................................... 38 Pros and cons .......................................................................................................................... 39 Future Improvements ............................................................................................................. 41 Practical Application................................................................................................................ 42 Contribution to Occupational Psychology............................................................................... 44 Critical Self-review....................................................................................................................... 
46 Appraisal of practise................................................................................................................ 46 Personal approach................................................................................................................... 46 Assumptions............................................................................................................................ 47 Researcher bias ....................................................................................................................... 47 What have I learnt?................................................................................................................. 48 Acknowledgements..................................................................................................................... 49 References................................................................................................................................... 50
  • 3. 3 Student ID: 12912540 APPENDIX A................................................................................................................................. 57 The ethics form ........................................................................................................................... APPENDIX B ................................................................................................................................. 61 Personality Questionnaire Scales Booklet................................................................................. APPENDIX C ................................................................................................................................. 68 Student Caller Selection Day....................................................................................................... APPENDIX D................................................................................................................................. 68 Job Description for the post of University of Warwick Student Caller 2013/14......................... APPENDIX E .................................................................................................................................. 70 Phone InterviewQuestions............................................................................................................ APPENDIX F ................................................................................................................................. 71 Interview Questions.................................................................................................................... APPENDIX G................................................................................................................................. 72 Supervisor Performance Rating Form.........................................................................................
  • 4. 4 Student ID: 12912540 Declaration I certify that the work submitted herewith is my own and that I have duly acknowledged any quotation from the published or unpublished work of other persons. Signed: Date: 26th September 2014
  • 5. 5 Student ID: 12912540 Abstract In the selection and assessment literature, the high validity of general cognitive ability (GCA) in predicting job performance has been weighed against the controversy surrounding its practical application within organisations. It has been stated that 'it would be better if alternative selection methods, such as personality, were as predictive as GCA due to their minimal adverse impact', Schmidt (2002). The study's aim is therefore to fill a gap in the knowledge by investigating the predictive validity of four conditions within a structured interview process: no test (control), GCA only, personality only, and GCA with personality. Student caller applicants (N=288) completed the GCA and personality tests and were assigned to a group. Following the interview process, which was informed by the test data, the successful candidates (N=38) were given performance ratings after 1 month. The findings showed a significant difference in supervisor performance ratings across the four groups. Significant differences in mean supervisor performance ratings were also found between the control group and the other three groups, between the GCA-only and personality-only groups, and between the GCA and GCA with personality groups. The effect size for the group with the highest mean supervisor ratings (GCA with personality) compared with the control group was large. This contributes to the gap in the knowledge, as it demonstrates not only the compatibility of GCA and personality testing with structured interviews, but also the higher predictive validity that can be achieved by combining GCA with personality.
  • 6. 6 Student ID: 12912540 Introduction Academic literature assessing the validity of selection methods according to their ability to predict performance has a long history dating back to the early 20th century. It was not until the 1920s that researchers noticed contrasting findings emerging from analyses of the same selection method. In the 1930s and 1940s it became apparent that this might be due to differing settings for the same target job altering the validity of the selection method. This was labelled situational specificity and continued to feature in the research until Schmidt & Hunter (1977) demonstrated that most of the variation in validity was the result of statistical and measurement inaccuracies, such as sample sizes being too small. This led to the introduction of new quantitative techniques which combine past results and correct the statistical errors, Hunter & Schmidt (1990). This showed that, in contrast to previous perceptions, the variability of selection method validity was near zero across a wide range of jobs, Schmidt, Hunter & Pearlman (1980). This was a significant step within the research, as it gave an indication of the most valid selection method for any job and allowed the comparison of selection methods. Organisations were then able to alter their selection methods according to the research, therefore changing how we view and experience selection and assessment. Schmidt and Hunter (1998) conducted a review of the practical and theoretical implications of 85 years of research in personnel selection. The article studied the validity of various selection procedures in terms of the prediction of job performance. The validity of the combination of each selection procedure with general cognitive ability (GCA) was also studied. Within the literature, predictive validity, the ability to predict subsequent performance, is deemed to be an integral part of personnel
  • 7. 7 Student ID: 12912540 selection methods. The predictive validity coefficient is directly proportional to practical, economic value, Muldrow (1979). Organisations which choose to implement the research on improving the predictive validity of their selection methods will see better overall employee performance, which subsequently leads to greater learning of job-related skills and improved economic output, Hunter, Schmidt, & Judiesch (1990). A significant conclusion for this paper is the agreement within the research that when selecting individuals without previous experience, the most valid predictor of performance is GCA, Hunter & Hunter (1984). Further research on predictive value has shown that the validity of a selection method is a key factor in determining practical value, but not the only factor. Another factor to be aware of is the variability of performance. If variability were low, there would be little difference between the performances of successful candidates, significantly decreasing the importance of selection. In contrast, if variability is high, the need for a valid selection method will also be high. The latter is strongly supported within the research. In particular, research over the past 15 years indicates that the variability of performance and output among workers is very high, and that it would be even larger if all candidates were selected or if candidates were selected at random. This variability is what determines practical value, Hunter (1990). In recent years, research on GCA has increased in popularity because of its high validity and versatility. It is versatile in the sense that it can be implemented successfully for all jobs, whether a factory role or an investment banking role, for example. In addition, GCA has the lowest application cost, in contrast to expensive work sample tests, despite the two being approximately equally valid. Evidence for the validity of GCA in
  • 8. 8 Student ID: 12912540 predicting job performance is more extensive than for any other selection method, Hunter & Schmidt (1996). Thousands of studies have been conducted over the last nine decades. By contrast, only 89 validity studies of the structured interview have been conducted, McDaniel et al. (1994). Structured interviews are also more expensive and usually contain job knowledge questions, which limits their application, as inexperienced candidates will be at a disadvantage. Despite this step in the research, and significant support for the use of GCA, organisations prefer the traditional format of structured interviews, which this research paper aims to address. Personality remains relevant within selection and assessment research. Use of the Myers-Briggs Type Indicator and the Five Factor model has shown that personality traits are useful in establishing person-environment fit, although they do not score highly in selection method validity, Zhang, Li & Liu (2010). As a result, the British Psychological Society (BPS 2013) discourages the use of personality tests within personnel selection as a stand-alone method. However, research has gathered evidence for the role of conscientiousness in causal models of job performance. This research has found that when mental ability is the same across the sample, employees scoring highly in conscientiousness develop greater job knowledge. This may be because highly conscientious individuals work harder and spend more time on a task than the average employee, which in turn produces higher job performance. From a theoretical point of view, the latest research suggests that the key determining variables in job performance may be GCA, job experience, and the personality trait of conscientiousness. The most popular format of curriculum vitae applications followed by structured interviews measures a combination of previous experience, mental ability, and a number
  • 9. 9 Student ID: 12912540 of personality traits, such as conscientiousness, as well as job-related skills and behaviour patterns. The average correlation between interview scores and scores on GCA tests is .32 (Huffcutt et al., 1996); therefore, interview scores reflect mental ability, but not to the level of a GCA test on its own. Subsequently, the research has moved towards studies which test the validity of selection methods that can be practically incorporated within more traditional formats of selection (e.g. the structured interview) and/or used to 'complement' each other, for example GCA and personality.
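The economic argument sketched in this introduction is usually made concrete with the Brogden-Cronbach-Gleser utility model that underpins the Schmidt and Hunter line of research. The formulation below is a generic statement of that model, included here only as an illustration; none of the quantities are estimates from the present study.

\[
\Delta U = N_s \, T \, r_{xy} \, SD_y \, \bar{Z}_x - N \, C
\]

where \(N_s\) is the number of candidates selected, \(T\) the average tenure of those selected, \(r_{xy}\) the predictive validity of the selection method, \(SD_y\) the standard deviation of job performance in monetary terms, \(\bar{Z}_x\) the mean standardised predictor score of those selected, \(N\) the number of applicants tested and \(C\) the cost of testing per applicant. Because \(\Delta U\) rises in direct proportion to \(r_{xy}\), even modest gains in validity translate into proportional economic gains, which is the sense in which validity is 'directly proportional to practical, economic value'.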
  • 10. 10 Student ID: 12912540 Literature Review Structured Interviews in personnel selection. A structured interview is a selection method designed to predict future job performance through the observation of an individual’s oral responses to questioning from a panel. The use of competency-based assessment in selection has grown in popularity, as it is perceived to be the most effective way to aid personnel selection, Golec & Kahya (2007). In particular, structured interviews are one of the most common selection methods used by organisations around the globe, McCarthy, Van Iddekinge & Campion (2010). Despite there being academically more efficient personnel selection methods available, both large and small organisations prefer to use structured interviews which have a weaker validity coefficient of .38, compared to a GCA test which scores .51, Schmidt & Hunter (1990). Prior to Schmidt and Hunter’s research, it was widely believed that GCA is significant for academic performance, but cannot then be applied to real-life situations, therefore is not applicable to work performance, Jencks (1972). It was stated that although GCA can predict job performance, it varied depending on the type of work and sector. However, following meta-analytic studies, this theoretical perspective was overturned, because it was shown that the variability of validity was mostly due inaccuracies in the statistics, such as sampling error variance, job performance rating error and GCA range restriction, Schmidt & Hunter (1990). Organisations spend millions on recruiting the right people. The productivity, well-being and subsequently the profitability of the business strongly depends on the workforce. The better alternatives (e.g. general cognitive ability) have issues and complications (e.g. equality), thus organisations tend to prefer interviews which have
  • 11. 11 Student ID: 12912540 the added positive of meeting face to face and the ability to ask custom questions for specific job roles. There is seemingly an intuitive appeal towards interviews. When 852 organisations were surveyed, 99% used interviews as a selection tool and all levels of staff had the perception that interviewing is a valid method to predict job performance, Ulrich & Trumbo (1965). Perhaps the popularity of interviews comes down to the perception that the best people to assess potential candidates are people who are already apart of the organisation’s culture and even within the same role, something that a uniform test cannot emulate. In addition, an interview gives a sense of control over the process to the employer. The research supports this suggestion by concluding that interviewers perceived personal relations and motivation as being the two attributes which are the most valid predictors of job performance, Ulrich & Trumbo (1965). As a result, other valid factors are neglected. Furthermore, there is strong evidence stating that selection through structured interviews are commonly made according to behavioural and verbal cues, as well as the social relationship between the panel and the interviewee, Wright (1969). This implies that many structured interviews are unfairly rewarding the perception of person-environment fit rather than attributes which predict future performance. Schmidt, (1976) suggested that there is an abundance of factors that influence decisions through interview, which goes some way to explain its weak validity. Many studies within the literature have addressed interviewer bias, such as, the effects of appearance, more attractive candidates tend to receive higher ratings, Carlson (1967). The misinterpretation of physical or behavioural cues, for example, sweating or nail biting may be internally perceived as really wanting the job by the interviewer. But when viewed externally with a candidate, the interviewer may perceive this observation as psychological weakness, Yukl et al. (1972). Also, more controversially, racial bias is
  • 12. 12 Student ID: 12912540 still an issue with selection and assessment. When a sample of manager’s, selected for their high scores in ‘openness’, were asked to assess candidates for a variety of roles, they were given fake resumes (varying in quality) and fake candidate photographs (varying in race). It was found that white candidates were selected when the resume was average or good, Asian candidates tended to be selected for higher grade roles and black candidates were not selected regardless of resume quality, Ledvinka, (1973). This presents a need for research surrounding the managing of large amount of unconscious bias that occurs in this selection method. In particular, common biases such as, reading of non-verbal cues change the nature of the structured interview, but despite its resulting low validity, will continue to be used by organisations, Arvey & Campion (1982). Structured interviews are classified as job-related due to their references to past experiences, but with the many studies identifying various types of interviewer bias, they are becoming more like behavioural/situational interviews, Janz (1989). Another study collected data on interviewer judgments, gender, and age for job applicants interviewing for seasonal retail sales clerk positions in two separate years. It was seen that females and older applicants received higher average interview evaluations between the two years As a result, it was suggested that there is a need to manage the interviewer bias through the use of select psychological trait tests, e.g. influential or adaptable, which informs the interviewer about the candidates traits based on evidence and not perception, Arvey et al. (1987). Personality testing in personnel selection. Using personality testing as a pre-employment tool is useful in matching appropriate traits of the candidate to the criteria of the job role. The literature recently has recorded a large increase in the popularity of personality assessment, Hough &
  • 13. 13 Student ID: 12912540 Oswald (2008). However, this has slowed due to the weak correlations of personality scores with job performance, Morgenson (2007), in particular through analysis of criterion validity. Poor validity scores for personality testing were first recorded upon the emergence of quantitative review techniques in the 1980’s, Schmitt et al. (1984). Despite the support for personality testing to be only used to ascertain behavioural outcomes, many studies continued to use the tests within selection research, Weiss & Adler, (1984). Following a 12 year review, it was concluded that; “It is difficult in the face of this summary to advocate, with a clear conscience, the use of personality measures in most situations as a basis for making employment decisions about people”, Guion & Gottier (1965). Other studies have suggested the reason for this weakness is the ability candidates have to fake their responses, Goffin & Christiansen, (2003); Putka & McCloy, (2004). This is when a candidate gives ‘fake’ responses in order to, according to their perception of what the employers wants, gain an advantage in the selection process. It is therefore possible to elevate scores and unrealistically present oneself as being a good match for the role. In contrast, GCA does not have this issue, as an individual cannot simply decide to choose the correct answers. Some research suggests that although faking may occur, it does not affect the validity of the test. This is because, less than a third of differences between truthful and desirable responses were statistically significant, Hough (1990). Studies using impression management and self-deceptive enhancement, Barrick & Mount, (1996), have suggested that faking has an insignificant effect. Contrary to this stance, other studies have concluded that faking is a key issue as selection ratios decrease especially if ranking of candidates is used in selection, resulting in different candidates being
  • 14. 14 Student ID: 12912540 selected, Rothstein & Goffin (2006). It is possible however to introduce faking detection scales to counter this confounding variable, Zhang, Li & Liu (2010). The scale has had some positive outcomes with being able to identify both faking ‘good’ and faking ‘bad’ (attempting to avoid recruitment e.g. military), though it can lead to further complications, such as the lack of clarity between a faking candidate and a nervous, yet determined candidate and unnecessary costs to implement this scale. More modern personality testing tools account for any social desirability bias through repetition and rewording of the questions, thus indicating if a candidate is answering consistently throughout the test. Good personnel selection is focused upon identifying candidates who have the ability, as well as an adequate job-fit in order to predict future job performance. The use of personality scores is an easy and efficient way of matching these traits to the test scores. In more recent research within the 1990’s, similar low levels of validity as previous were recorded, however, it was concluded that ‘meta-analytically corrected estimates’ of validity were meaningful, and personality measures should once again be used in selection contexts, Day (1993). Furthermore, most recent, literature reviews indicate that there is strong support for personality scores predicting job performance (Ones, Dilchert, Viswesvaran & Judge, 2007). This suggests a changing of opinions on the usage of personality testing in personnel selection. Successful application of personality testing in occupational psychology involves the matching of traits to specific roles which reflect the nature of the job. For example, in broad terms, one would tend to select an energetic, outgoing extrovert for face-to-face sales roles. However, it should be noted that some personality traits have been found to be more predictive of performance than others, and not all traits are seen
  • 15. 15 Student ID: 12912540 as relevant for certain jobs. The trait that has the greatest validity in predicting performance is conscientiousness. Conscientiousness has been found to show a constant relationship with all job performance criteria for all work sectors, Murray, Barrick & Mount (2006). Other personality traits have been seen to be valid, however not across all sectors and the level of validity was smaller. In addition, when more attention is given to trait-matching, it results in considerably higher criterion validity, Tett et al. (1999). This implies that the prediction of performance becomes more valid when attention is given to the type of traits that are relevant to particular job roles. Modern personality tests, like the one chosen for this research project, allows for the customisation of scales, in order to accurately reflect the key traits with the person specification. A popular and publically familiar personality test is the Five-factor model (FFM). The model has broad personality traits of agreeableness, conscientiousness, extraversion, neuroticism and openness. The precise measurement of this multi-faceted domain of personality holds value in aiding the accurate matching of traits with jobs. FFM is criticised for its broad traits, as it struggles to accommodate more narrowly defined traits, such as, dominance and affiliation, which are lost in the label of extraversion. Use of narrow scales significantly improves the accuracy of trait matching, Rothstein & Goffin (2006). Using a modern, custom made personality tests which accounts for social desirability test and uses narrow scales, is likely to result in a much improved validity compared with a standard broad personality test. However, it needs to be acknowledged that there are still alternative pre-employment testing methods which have better and more consistent validity than personality testing. The literature does support the notion that
  • 16. 16 Student ID: 12912540 personality variables are valuable predictors of job performance when carefully matched with the appropriate occupation and organization. Current research is moving towards investigating whether pre-employment testing approaches tend to assess individuals’ capacity to perform well on the job, e.g. accounts of previous good performance, but are lacking in their potential to predict whether individuals necessarily will choose to perform well on the job, e.g. accounts of good work ethic, Marcus, Goffin, Johnston, & Rothstein (2007); Murphy (1989). This has opened up a gap in the research which concedes that personality tests should not be used in isolation, and would focus on whether personality tests can add and improve other selection methods? Traditional selection methods, such as structured interviews, tend to assess the good work ethic accounts more so than the good performance accounts, Marcus et al. (2007). Therefore, within personnel selection situations, personality testing has great potential for complementing and adding to the predictive validity of existing selection tools by predicting unique parts of the job performance domain. Furthermore, not including personality testing within selection increases the likelihood that new employees will become less productive after they master the challenges of their jobs and enter “maintenance” stages, at which point the good work ethic accounts for increased variance in job performance, Murphy (1989). This research paper aims to implement personality scores with the interview process. The personality test, developed by ‘Criterion’, is custom made according to the key traits in the person specification. These are a collection of narrow traits which will give the interviewer’s an opportunity to order the candidates as part of their selection process. Furthermore, the interviewers are subsequently able to adjust their questions and perceptions according the test results. This will minimise interviewer bias because it
  • 17. 17 Student ID: 12912540 will give them an evidence-based understanding of what sort of traits the candidate has and it will enable them to explore the candidate’s traits with appropriate questions, rather than relying too much and their perception of the candidate. GCA testing in personnel selection Historically, testing an individual’s cognitive ability as a selection method was common place. Earliest records of this are approximately three thousand years ago in China, where the individual differences of potential leaders were measured by giving them puzzles to solve. To define GCA – it is seen as a psychometric and psychological construct which describes the phenomena of human mental functioning, (Reeve & Hakel, 2002). It is exactly 100 years since Spearman (1904) defined the construct of GCA and proposed its central role in human cognition. During the middle part of the 20th century, interest in the construct of GCA declined in some areas of psychology, but in the last 20 to 25 years there has been a resurgence of interest in GCA and its role in various life areas. Modern studies are showing that it can predict both occupational level and the individual performance within the role, as well as many aspects of general life and career success, including salary, divorce rate and life satisfaction, Schmidt & Hunter (2004). These correlations are consistently over .50, which is said to be rare in psychological research, and are considered large, Cohen & Cohen (1988). Large scale literature reviews encompassing all types selection methods find GCA on average as the most efficient method with a validity coefficient as high as .57, (Hunter & Hunter, 1984).
  • 18. 18 Student ID: 12912540 Subsequently it was claimed that, “GCA can be considered the primary personnel measure for hiring decisions and one can consider the remaining personnel measures as supplements to GCA measures”. The evidence of GCA’s predictive validity is significant. Also, compared with other selection methods, GCA also has significant time and cost saving advantages. For example, structured interviews require planning, contact and evaluation time, extensive candidate communication and feedback and can be logistically problematic. Similarly, assessment centres are significantly more expensive and have poor predictive validity, Schmidt and Hunter, (1986). The literature suggests the implementation of GCA testing within all selection and assessment procedures would not only help organisations select better personnel, but it will have great economic benefits too. Despite this, organisations are not changing their methods. This due to the controversies surrounding its use. GCA tests are criticised for the lack of equality, which has become an increasingly important topic for organisations over the past decade. The Equality Act (2010) states that organisations must put procedures in place within their selection processes which do not discriminate against the list of protected characteristics, such as age, disability, gender, race, religion and sexual orientation. GCA can be seen as a tool which discriminates against ethnic minorities due to the research into average scores across different racial groups. However, the literature strongly suggests that racial bias exists in other more widely accepted selection methods. When investigating the competitive selection process for trainee lawyers, a small but significant relationship was present between ethnicity and performance. The study compared the performance of trainees in blind-marked tests with the performance in non-blind marked tests. No racial discrimination was found in
  • 19. 19 Student ID: 12912540 the tests where ethnicity was hidden, Dewberry (2001). This implies that the variation in performance between whites and ethnic minorities can occur in a highly educated and intellectually able group of people. Thus meaning that even when GCA is equally high in candidates, racial bias occurs. In addition, the majority of the literature on race and selection methods have stated that GCA has a substantial negative impact on race. For example, on average Blacks have been found to score one full standard deviation lower than whites, Schmidt & Hunter (1986). It has been countered that this may be due to the ‘western’ design of GCA tests and due to the cultural difference, these tests are not valid outside of western groups. However, predictive validity was found when end of year exam marks were predicted in Kenyan 12-15 year olds, Sternberg (2001). Also, correlations have been found between GCA scores in African engineering students and their final grades, Rushton, (2003). Critics also suggest that the items have different meanings for Africans than they do for Whites, Nell, (2000). Although it this was seen not to be the case as similar findings were found when using Raven’s matrices which are non- culturally specific tasks, Raven (2000). The most commonly used GCA tests within personnel selection are split up into two parts, verbal and numerical. The verbal part has been found to have a slight bias against individuals whose first language is not English, Skuy (2001). Further to this, studies have shown that even if ethnic minority candidates are of a higher standard than white candidates, racial bias still occurs. A group of managers, scoring high in openness, were asked to assess a group of candidates. High ‘openness’ is associated with a greater acceptance and understanding of other cultures. Despite this, it was found that the black candidates were not selected even if the quality of
  • 20. 20 Student ID: 12912540 application was high, Ledvinka (1973). Therefore, it can be argued that for highly capable candidates from ethnic minorities who score highly in GCA, a GCA selection method would be beneficial. This is because it will clearly, beyond any reasonable doubt, outline the level of GCA to the selector. If the selectors are burdened with the job of differentiating between the candidates, it opens up the selection to bias, personal perceptions, unconscious discrimination and social desirability bias. If for example, a candidate from an ethnic minority in fact as the highest GCA amongst the pool of candidates, this will be made clear and perhaps minimise the level of racial bias he/she may have otherwise received. Compared to alternative selection methods, GCA tests produce racial differences that are 3 to 5 times larger. Examples of methods with less bias are bio data, personality tests, and structured interview. Albeit they have lower predictive validity, they are still nevertheless valid predictors of job performance. GCA tests can be combined with other selection methods in order to minimise the adverse impact while maintaining or even increasing the overall predictive validity. Outtz (2002). This adds to the benefits of this studies’ combination of selection methods. Moreover, Schmidt and Hunter concluded that there is considerable evidence that GCA is predictive of performance in a wide variety of jobs. This is especially positive as many alternative methods are not easily applied across all jobs, as they require adjusting, e.g. personality tests. The adaptability of GCA was identified through a meta-analysis of over 32,000 employees in 515 widely diverse civilian jobs. The predictive validity of GCA was .58 for professional-managerial jobs, .56 for high level complex technical jobs, .51 for
  • 21. 21 Student ID: 12912540 medium complexity jobs, .40 for semi-skilled jobs and .23 for completely unskilled jobs. The validity for the middle complexity level of jobs (.51) —which includes 62% of all the jobs in the U.S. economy, Schmidt & Hunter (1998). Therefore the same, unchanged GCA test maintains its validity across the majority of jobs. The theoretical foundation within the research for GCA is stronger than for any other personnel measure. Theories of intelligence have been developed and tested by psychologists for over 90 years, Jensen, (1998). Due to the vast amounts of research into this subject area, the meaning of the construct of intelligence is much clearer than, for example, the meaning of what is measured by interviews or assessment centres, Hunter (1986); Jensen (1998). Both traditionally accepted methods like structured interviews and academically supported, but controversial methods like GCA tests have inverted positives and negatives. Structured interviews lack high validity and are prone to interviewer bias, while GCA tests have high validity and are standardised to prevent interviewer bias, yet may create additional barriers for ethnic minorities. Furthermore, GCA tests are impersonal and do not provide a custom job-match, while structured interviews are face to face and can be specific to job role. New research suggests that a combination of the popular structured interviews with alternative selection methods, such as GCA, can provide an improved selection process which minimises inequality and bias, while maintaining high performance prediction and relevance to job role.
  • 22. 22 Student ID: 12912540 Supervisor Performance Ratings The evaluation of human performance can take place in a range of different contexts including academic examinations, promotions and personnel selection. Performance in all jobs and activities can be analytically broken down into a number of elemental dimensions. It has been suggested that these performance dimensions are consistent across a range of different jobs, Campbell et al. (1992). Using these dimensions to reflect the specific job specification, positive correlation of performance can be achieved. However, it has also been stated that it is possible to create a composite index of job performance which records overall job performance, Schmidt & Kaplin (1971). In order for the performance ratings to be effective and fair, the rating system needs to be accurate and avoid supervisor bias. One such bias is supervisor’s giving preferential scores to their own racial, gender or perceived social groups, Schmitt & Lapin (1980). However, within the same study, it was seen to not substantially affect the validity of the ratings. Also, the similarity effect bias can have an effect on findings, but varies across professions. Although, the current target profession is within the higher education sector, which is stated to have weak to no similarity effects, Frank & Huckman (1975).Within the literature, certain bias found in supervisor performance rating can have a significant effect on results, Kane et al. (1995). An example of such bias is the tendency for certain supervisors to be naturally more lenient or severe than others, which subsequently skews the results. It has been suggested that this variation in leniency or severity is due differences in supervisor personality, Villanova (2009). For example, supervisors scoring high in agreeableness were more likely to rate more leniently, Yun (2005). Contrary to previous studies, no evidence was found to support face to face feedback increasing or decreasing supervisor’s ratings, Dewberry, Davies-
  • 23. 23 Student ID: 12912540 Muir & Newell (2013). The current study aims to minimise the bias of supervisor variation by having one supervisor conduct the performance ratings for all of the successful candidates. However, these biases are predicted to have a weak effect on findings. A study has demonstrated this by developing a statistical method to remove the halo effect which inflates the correlations between supervisor ratings. It was found that when this bias was removed, there was still a large relationship with GCA and personality, Viswesvaran, Schmidt & Ones (2002). Meta-analytic research on the reliability of supervisor performance ratings found that employees from another team showed the lowest mean reliability (.30), fellow employees in between (.37), and supervisors showed the highest (.50), Conway & Huffcutt (1997). In addition, it is suggested that greater validity of ratings can be achieved when the ratings structure includes a statistical key performance indicator (KPI), such as sales per month, but is focused towards the ratings of key criteria in the person specification, Ones, Viswesvaran, & Schmidt (1993). The current study will use this structure to include 1 KPI score and 4 agreed key criteria from the person specification. Combination of GCA and personality Throughout personnel selection literature, tests of GCA are regarded as the most accurate performance predictors of all selection methods, Schmidt & Hunter (1998). Thus, it is particularly noteworthy that the addition of personality testing to a selection testing program based on GCA has been shown to significantly increase the overall level of performance validity (Day & Silverman, 1989; Schmidt & Hunter, 1998). My study aims to test this claim and by having separate groups for GCA, personality, GCA
  • 24. 24 Student ID: 12912540 with personality and no test. Subsequently, based on supervisor ratings, the variation in predictive validity between these groups will be compared. Whereas the validity of GCA tends to decrease as the complexity level of the job decreases (Schmidt & Hunter, 1998), the validity of personality tests tends not to decrease (Ones, Viswesvaran, & Schmidt, 1993). Thus, personality testing may provide the largest benefit beyond GCA in the case of low-complexity jobs, although its increment in validity is still likely to be valuable in roles such as the student caller position studied here. The aim of the research is to investigate and test whether personality tests or GCA scores (or a combination of both) improve the validity of structured interviews, as assessed through supervisor ratings. The hypothesis is that candidates in the test data groups will perform better than the control group, and, in particular, that the GCA with personality group will record the highest supervisor ratings.
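The logic of combining the two predictors can be expressed with the standard formula for the multiple correlation of two predictors. The symbols are generic rather than estimates from this study: \(r_{1y}\) and \(r_{2y}\) are the criterion validities of GCA and personality against job performance, and \(r_{12}\) is the correlation between the two predictors.

\[
R_{y \cdot 12} = \sqrt{\frac{r_{1y}^{2} + r_{2y}^{2} - 2\, r_{1y}\, r_{2y}\, r_{12}}{1 - r_{12}^{2}}}
\]

Because personality scores typically correlate only weakly with GCA, \(r_{12}\) is small and the combined \(R\) can exceed either validity on its own, which is the rationale for including a GCA with personality group alongside the single-test groups.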
  • 25. 25 Student ID: 12912540 Methodology Organisation The target organisation is the Development & Alumni Relations Office, University of Warwick. The department runs seasonal campaigns, recruiting current students into 'student caller' roles. A student caller contacts Warwick alumni, parents and other associates by telephone in order to build a relationship, inform them about new projects and inspire them to give a donation. The selection and assessment process begins with the application; if successful, a phone interview follows (questions and rating system – appendix E). If successful again, candidates are invited to a selection day (for full schedule – appendix). On the day, structured interviews (appendix F) are conducted and provide the final assessment of the candidate. My research is part of a review of the recruitment processes within the department. Research Design The study is a quantitative study using 2 pre-designed selection tools (GCA and personality online tests) and one self-designed supervisor performance rating questionnaire. The research design will be an independent samples method. The single independent variable will be the type of selection data provided within the structured interview. The four levels of the independent variable are personality test, GCA test, personality & GCA test, and no test at all. The dependent variable will be the supervisor performance ratings. This design will enable one to see, through an independent-samples one-way ANOVA, whether there is a significant difference in the dependent variable (supervisor ratings) between the means of the four groups; an illustrative sketch of this comparison is given after the Procedure section. It will also show whether there is a
  • 26. 26 Student ID: 12912540 significant difference between groups, for example, between only GCA and GCA with personality. Sample The participants will be student caller applicants, aged between 18-22 years old. This sample was selected because the recruitment scale is large enough for a comprehensive study. Also, the participants will be all from a similar social group and age – University of Warwick students. Despite this, the range of nationality and culture will be large, which is beneficial for cross-cultural observations. In addition, the student caller job specification contains strong and distinct traits, such as positivity or confident negotiation skills, which are easily matched with personality measures. 307 applicants will be tested. Following this, 91 candidates will successfully reach the selection day. The data collected from the GCA or personality tools will be used within the structured interview to select the final 38 student callers. Data Collection Both GCA and personality tests used in the current study were developed by Criterion Partnership. Criterion are a consultancy of occupational psychologists. They work with progressive employers to improve performance in the workplace. Criterion help organizations to succeed by applying psychology to the recruitment process, development and the retaining of talented people. Criterion develop and publish online psychometrics, personality questionnaires, ability tests and situational judgment tests. Established in 1991, Criterion have developed a professional reputation and have become known for the rigorous testing of their tools which are grounded within the latest research.
  • 27. 27 Student ID: 12912540 Working together with Criterion, it was possible to select the most appropriate tools for the target candidate pool – student callers. Personality Test The personality assessment is delivered through Criterion’s Coast platform, which enables the researcher to customize the personality measures in order to tailor the test around the culture and requirements of the target job. Criterion has an attribute library containing 46 measures, each measuring a different dimension of personality at work. Through consultation with both consultants at Criterion and the head recruiter for student callers the following 12 measures were chosen based upon the job specification (appendix A). 1. Adaptable 2. Approachable 3. Influential 4. Decisive 5. Risky 6. Optimistic 7. Stress Management 8. Money 9. Striving 10. Competition 11. Profit 12. Social Desirability
  • 28. 28 Student ID: 12912540 The scores are all out of 10 but this is a normed score (not a raw score) so it takes into account the comparison group associated with the tests. The interview panel will receive an average score out of 10, indicating the candidate’s compatibility with the key specifications of the job. General Cognitive Ability (GCA) test The test, called ‘Utopia’, consists of two parts - a verbal and numeracy test. It is a high- level aptitude test designed to be appropriate for university students. The verbal test measures verbal critical reasoning, requiring candidates to comprehend and evaluate meaning using precise logical thinking. While the numerical test measures high-level numerical reasoning, requiring candidates to analyse and manipulate numerical data. Both tests have a generous time limit to avoid confounding variables such as reading time having an impact on performance, especially if candidate’s first language is not English. The contents of the tests were designed to be easily accessible to all – numerical tests contain minimal verbal information so the measure is not contaminated by verbal ability. Also, the tests are designed to be challenging enough to rank the top performers as well as the average performers. The total length of time it will take a candidate to complete all of the tests is no more than 45 minutes. This was agreed to be an appropriate length of time to be spending as part of the process by the head student recruiter.
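Returning briefly to the personality measure: the sketch below is a small illustration of how the single figure reported to the interview panel could be derived, assuming an unweighted mean of the twelve normed scale scores described above. The scale scores are invented, and whether the social desirability scale contributes to the composite or serves only as a consistency check is an assumption made here purely for illustration.

```python
# Illustrative sketch: unweighted mean of the twelve normed scale scores
# (each reported out of 10). All values below are invented, not candidate data.
scale_scores = {
    "Adaptable": 7, "Approachable": 6, "Influential": 8, "Decisive": 5,
    "Risky": 4, "Optimistic": 7, "Stress Management": 6, "Money": 5,
    "Striving": 8, "Competition": 7, "Profit": 6, "Social Desirability": 5,
}

panel_score = sum(scale_scores.values()) / len(scale_scores)
print(f"Average normed score reported to the panel: {panel_score:.1f} / 10")
```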
  • 29. 29 Student ID: 12912540 The tests are likely to take: Verbal Test – 11 mins; Numerical Test – 20 mins; Personality – 12 mins. Supervisor Ratings The supervisor providing the ratings for each successful candidate will be the same individual - the head recruitment officer for student callers. Having one supervisor rate all successful candidates is greatly beneficial because it minimises supervisor bias and negates variation between supervisor leniencies. The rating will occur 1 month after the start of the role. Due to the time constraints of the student caller campaign and term times, this was the maximum period of time allowed before ratings. The rating system is based on the following 5 criteria, which are deemed 'essential' within the job specification:
• Good telephone manner
• Ability to work both independently and within a team
• Reliability
• Negotiation Skills
• Fundraising record
  • 30. 30 Student ID: 12912540 Each criterion is given a rating out of 5 on a Likert scale: 1 - very poor, 2 - poor, 3 - satisfactory, 4 - good, 5 - very good. The total is then multiplied by 4 to give a final score out of 100. For example, ratings of 3 out of 5 on each criterion would give a total of 15, which multiplied by 4 gives a total score of 60/100 (appendix F). A short calculation sketch follows Fig. 1 below. Procedure The model in Fig. 1 below is a visual representation of the research procedure and its general step-by-step timeline. Fig.1. A project model outlining the structure of the research
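A minimal sketch of the supervisor rating calculation described above, assuming nothing beyond what is stated: five criteria rated on the 1-5 Likert scale, summed and multiplied by 4 to give a mark out of 100. The example ratings are invented.

```python
# Sketch of the supervisor performance rating: five 1-5 Likert ratings,
# summed and multiplied by 4 to give a score out of 100.
CRITERIA = [
    "Good telephone manner",
    "Ability to work both independently and within a team",
    "Reliability",
    "Negotiation skills",
    "Fundraising record",
]

def overall_rating(ratings: dict) -> int:
    """Sum the five 1-5 ratings and scale the total to a mark out of 100."""
    if set(ratings) != set(CRITERIA):
        raise ValueError("Ratings must cover exactly the five criteria.")
    if not all(1 <= value <= 5 for value in ratings.values()):
        raise ValueError("Each rating must be on the 1-5 Likert scale.")
    return sum(ratings.values()) * 4

example = {criterion: 3 for criterion in CRITERIA}  # 'satisfactory' throughout
print(overall_rating(example))                      # 15 x 4 = 60 out of 100
```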
  • 31. 31 Student ID: 12912540 The 288 (out of 307) applicants who complete the tests will be split into 4 groups at random, each assigned to one of the following conditions:
• GCA test
• Personality test
• Both GCA and personality test
• No tests – control group
Candidates will be contacted via e-mail and sent the link to the Criterion tools. The e-mail will contain information about the research, and candidates will be given the opportunity to decline participation and withdraw at any stage, with no consequences for their chances within the recruitment process. The use of e-mail was seen to be better than pencil and paper tests, considering the time constraints of the selection day and the fact that many of the candidates will not yet have arrived on campus. Following the phone interview stage, 91 candidates will be invited to the selection day. The candidates will be re-grouped according to the assigned groups above. The scores and/or results for each attending candidate will be given to the interview panel, who will use the information to customise their interview questions and inform their decision regarding the candidate. The control group, with no additional scores or results, will undergo the standard procedure of the selection day tasks and interview. The 30-40 successful candidates will be given 1 month to settle into their role before supervisor ratings are recorded. This data is collected over a 3-day period. All participants are thanked and informed according to the ethical guidelines.
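As an illustration of the four-group comparison set out in the Research Design section, the sketch below runs an independent-samples one-way ANOVA on supervisor ratings outside SPSS. The group sizes mirror the study (7, 7, 9 and 15), but every rating value is an invented placeholder rather than study data.

```python
# Illustrative one-way ANOVA across the four selection-information groups.
# Group sizes match the study; the ratings themselves are invented.
from scipy import stats

ratings = {
    "control":          [70, 72, 75, 74, 76, 73, 78],
    "personality":      [80, 82, 79, 83, 81, 84, 82],
    "gca":              [90, 92, 91, 89, 93, 92, 90, 91, 92],
    "gca_personality":  [96, 97, 98, 99, 97, 98, 96, 99, 98, 97, 98, 97, 99, 98, 97],
}

f_stat, p_value = stats.f_oneway(*ratings.values())
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")

for group, values in ratings.items():
    print(f"{group}: mean rating = {sum(values) / len(values):.2f}")
```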
  • 32. 32 Student ID: 12912540 Data Analysis The data collected will be input and analysed within SPSS. Firstly, partial correlations will be screened to identify any biases. Descriptive statistics will be explored to gain an understanding of the means and standard deviations of the sample. The correlation trends will then be investigated to determine the validity of each of the four groups, and the groups will be compared against each other to see which leads to the best performers. Ethical Considerations Upon initial contact with the candidates via e-mail, it will be made clear that they are under no obligation to participate and are able to withdraw their data at any stage. They are provided with a participant information sheet and consent form (appendix C). Candidates are also reassured by the head recruitment officer within the e-mail that not participating will not affect their chances during the recruitment process. The data collected via the Criterion tools (personality and GCA) are stored securely on Criterion's system under the consultancy's strict confidentiality guidelines. The data are then sent to me and stored on one computer within a password-locked folder. The data at this stage are fully anonymous, as each candidate is given a unique code. Upon request, the candidates may receive their own results/scores from Criterion via e-mail. The supervisor ratings will be added to the data using the anonymous candidate codes. Following analysis of the data, the data will be destroyed. Finally, all candidates will receive a full debrief and summary of the study. In return for the kindness of the Development & Alumni Relations Office, University of Warwick in allowing me to conduct my research with them, I will present my findings
  • 33. 33 Student ID: 12912540 and provide a conclusion on the future of their selection and assessment methods for student callers. Results Descriptive Statistics Out of a total of 307 student applicants, 288 completed the tests. From this dataset, 2 extreme scores were removed from the GCA scores, as the variation between the verbal and numeracy components was too large. The central tendency (mean) and standard deviations of the test scores are presented in Table 1. As can be seen, GCA, and particularly the numeracy part of GCA, has a high standard deviation, meaning that there is much variation between the data points and the mean. In contrast, personality scores showed a low standard deviation.

Table 1. Mean and standard deviation scores for personality, numerical, verbal and overall GCA in all student caller applicants.

                      Personality   Numeracy   Verbal   Overall GCA
Mean score (/10)      5.61          4.36       5.94     5.40
Standard deviation    2.13          6.84       3.75     5.29
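The descriptive statistics in Table 1 are of the kind that can be reproduced in a few lines; the sketch below shows the calculation pattern with pandas using a handful of invented rows, not the applicant dataset.

```python
# Mean and standard deviation per score type, as in Table 1.
# The four rows below are invented, not applicant records.
import pandas as pd

scores = pd.DataFrame({
    "Personality": [5.2, 6.1, 4.8, 7.0],
    "Numeracy":    [3.9, 5.5, 2.8, 6.0],
    "Verbal":      [6.2, 5.8, 5.1, 6.6],
    "Overall GCA": [5.0, 5.6, 4.0, 6.3],
})

print(scores.agg(["mean", "std"]).round(2))
```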
  • 34. 34 Student ID: 12912540 Following the selection process, the 38 successful candidates' scores were analysed. Table 2 presents the central tendency (mean) and standard deviations of the personality and GCA scores, with the candidates organised according to their assigned group. The number in brackets indicates the number of candidates within each group. The scores highlighted in green indicate which parts of the data were revealed to the interviewers for each candidate, depending on their group. As can be seen, both personality and GCA scores are above the averages of the complete applicant pool in Table 1 across all groups. The general standard deviation across all groups has also decreased, although the personality score standard deviation remained lower than that of the GCA score across all groups (0.97 lower). The highest mean GCA score (8.54) was found in the GCA group, and the highest mean personality score (7.25) was found in the personality group.

Table 2. Mean and standard deviation scores for personality and GCA in successful student caller applicants.

Group (n)                        Personality (/10), mean (SD)    GCA (/10), mean (SD)
Personality group (7)            7.25 (2.03)                     7.22 (3.35)
GCA group (9)                    6.89 (2.11)                     8.54 (2.26)
GCA & personality group (15)     6.98 (2.47)                     8.31 (3.85)
Control group (7)                6.88 (2.14)                     7.16 (3.14)

After a period of 1 month, the successful candidates were assessed and assigned a supervisor performance rating (/100). Fig. 2 presents the mean supervisor ratings for
  • 35. 35 Student ID: 12912540 each group. As can be seen, the highest mean rating of 97.73 was found in the group selected using both the GCA and personality scores, followed by the GCA-only group and then the personality-only group. The lowest mean rating, 74.3, was recorded by the control group. The GCA-only group scored a mean rating of 91.29, which is nearly 10 marks higher than the personality-only group's mean rating of 81.65; however, the combination of these two tests produced the highest mean rating of all. All successful candidates completed their first month in the role and there was no dropout. The standard deviation across supervisor ratings was low: the mean standard deviation across all groups' supervisor ratings was 1.77. Table 3 presents the five-point summary of supervisor performance ratings for all successful candidates to provide a better understanding of the data.

Fig. 2. Mean supervisor performance rating for each successful student caller group.
[Figure 2: bar chart of mean supervisor performance rating (/100) by group - control 74.3, personality 81.65, GCA 91.29, GCA & personality 97.73.]

Table 3. 5-point summary for supervisor performance ratings for all successful candidates
  • 36. 36 Student ID: 12912540
                         Supervisor ratings (/100)
Highest extreme          100
Upper fourth              95
Median                    88
Lower fourth              79
Lowest extreme            52

Statistical analysis To examine the hypothesis, an independent-samples one-way analysis of variance was conducted. It showed a significant difference in supervisor performance ratings across the four groups, F(2, 36) = 27.55, p < .001. Significant differences in mean supervisor performance ratings were also found between the control group and the other 3 groups (p < .001), between the GCA-only and personality-only groups (p < .001), and between the GCA and GCA with personality groups (p = .013). As the F value was significant, the effect size for the group with the highest mean supervisor ratings (GCA with personality) compared with the control group, and therefore the best predictor of performance, was analysed and recorded as d = 1.11. From the data, not only is the difference between the mean supervisor ratings of the GCA with personality group and the control group statistically significant, but the effect size (d = 1.11) is also large. This confirms the hypothesis that all test groups (personality, GCA, and GCA with personality) improved the validity of structured interviews, based on supervisor ratings, and that all test groups predicted performance better than the control group. In particular, as
  • 37. 37 Student ID: 12912540 expected, the GCA with personality group recorded the highest overall supervisor ratings, making it the most valid predictor of performance.
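For completeness, the sketch below shows the mechanics of a Cohen's d calculation with a pooled standard deviation, which is one common way an effect size such as the d = 1.11 reported above could be obtained. The means, standard deviations and group sizes used here are hypothetical illustrations rather than the study's figures, since the group standard deviations are not reported alongside the group means.

```python
# Cohen's d with a pooled standard deviation. All numbers are hypothetical,
# chosen only to show the mechanics of the calculation.
from math import sqrt

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Standardised mean difference using the pooled standard deviation."""
    pooled_sd = sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    return (mean1 - mean2) / pooled_sd

# Hypothetical example: a 15-point difference against a pooled SD of roughly 13
# gives d of about 1.1, a large effect by conventional benchmarks.
print(round(cohens_d(85.0, 13.0, 15, 70.0, 14.0, 7), 2))
```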
  • 38. 38 Student ID: 12912540 Discussion and Conclusion Findings The purpose of the research was to investigate whether personality tests or GCA scores (or a combination of both) improve the validity of structured interviews, based on supervisor ratings. The hypothesis that supervisor ratings would be higher in the GCA, personality and GCA with personality groups compared to the control group was confirmed. In addition, the secondary hypothesis that the GCA with personality group would produce the highest supervisor ratings, followed by GCA, then personality, was also confirmed. The control group, using the traditional structured interview, scored an average supervisor performance rating of 74.3, while the group using the structured interview with GCA and personality tests scored an average supervisor performance rating of 97.73. This shows that the implementation of these tests significantly improves the validity of structured interviews. With the addition of a second predictor, the overall predictive validity of GCA and personality within the structured interview increased, with a large effect size (d = 1.11). This strongly supports the use of selection methods in combination to further improve predictive validity, which is greatly beneficial to organisations which actively strive to select the best candidates for their roles. Such findings suggest that using personnel selection methods with high predictive validity provides greater certainty and confidence to the interviewers when making the final decision. In addition, the use of GCA and personality within the more traditional format of a structured interview minimises the negatives associated with each when used separately. For example, the British Psychological Society (BPS 2013) recommends against the use of personality tests as a stand-alone selection tool, which is partly due to
  • 39. 39 Student ID: 12912540 the expensive mandatory training involved in the correct application of such tests. Also, GCA tests used on their own in selection are criticised for having a negative impact on race. When used in combination with structured interviews, personality acts as a helpful supplement which can provide evidence rather than an interviewer's potentially biased opinion on person-environment fit. GCA, meanwhile, acts as a helpful way to rank candidates in order of cognitive ability but is not completely definitive in the final selection: a candidate who will fit the organisational culture (on the basis of previous experience or the personality test) may be picked over a candidate with superior cognitive ability, which softens the negative perception of GCA testing. As a result, the findings suggest that currently the best selection approach, with the highest predictive validity, allowing organisations to pick the best candidates, is GCA and personality testing within a standard structured interview. Pros and cons Taking into consideration certain factors that act against the legitimacy of previous studies, the current research project aimed to minimise these effects. One such factor is supervisor bias. If the performance ratings are biased and not consistent across the sample, then the results will be misleading. Studies have shown that supervisors often give preferential scores to their own racial, gender or perceived social groups, Schmitt & Lapin (1980). In order to avoid this significant confounding variable, one supervisor was used for all performance ratings, which completely nullifies the issue of variation between supervisor ratings. In addition, to minimise the bias of the individual supervisor, a structured rating model was used in order to limit the scope for personal perception, for example, the amount of donations received and
  • 40. 40 Student ID: 12912540 good telephone manner. In contrast, due to the timing of the recruitment drive against the University term dates, there was a significant impact on the time available between selection of the final student callers and the collection of supervisor performance ratings. Initially, 3 months was the target period of time, within which 2 separate ratings would be collected (at 45 days and then 90 days) to record any variation between the ratings. This would give an indication of the consistency and reliability of the supervisor’s rating. However, due to the time constraints, only 1 collection of supervisor ratings was possible, and only 1 month following the final selection. This has meant that although it was of great benefit for reliability having just one supervisor, the overall reliability may have been weakened as there is no clarity as to whether the supervisor ratings are accurate, as limited observation time was given, or consistent over time. A limitation in gathering GCA and personality test results was that the sample was not completely random. All applicants were invited to take the tests with the knowledge that a proportion of the sample will not have time to complete the test, will forget or decline and opt out, these candidates became the control group. Therefore, it can be argued that the control group may have consisted of a certain type of individual who was less motivated to succeed in the application, while the individuals who took the opportunity to take the tests were more motivated to get the job and subsequently were more likely to perform better in the first place. If this is the case, then the results would have been skewed towards low supervisor ratings in the control group and high supervisor ratings in the ‘test’ group, even without the influence of the testing tools. A major positive for this research project was the access to high quality and rigorously tested GCA and personality tools from Criterion. In addition, it was possible
• 41. 41 Student ID: 12912540 to tailor the tests to fit the target job role of student caller, thus further increasing the predictive validity. Working with Criterion also benefited the professionalism of the study, as well as ensuring that ethical considerations were met. This gave students confidence and trust in the tools, encouraging them to complete the tests to the best of their ability; if the tests had been poorly organised and presented, candidates would have been more likely to lose interest. Candidates were also interested in their test results, and working with Criterion made it possible to send test outcomes to students to use and learn from in the future. Future Improvements The target job role was student caller with the University of Warwick. The student caller person specification contained very clear key attributes, such as high confidence, adaptability, strong negotiation skills and a target-driven approach. Such attributes tend to be associated with extroverted individuals, so it would be useful to know whether the same results can be found in other sectors and job types. For example, would there be the same predictive value within a creative, project-based role? In addition, all applicants for the role were students from the University of Warwick. The vast majority of candidates were therefore aged between 18 and 22, at the very early stages of their careers, with limited previous work experience. The student caller recruitment campaign acknowledged this and consequently placed no emphasis on prior experience. This can be viewed as a positive, because it gives more weight to the data from the selection tools. However, it poses a future question of whether the predictive validity increases or decreases when greater emphasis is placed on prior experience, perhaps also when tested with higher-level jobs. Furthermore,
• 42. 42 Student ID: 12912540 according to data from the department of Development & Alumni Relations, the sample of applicants reflected the University-wide demographic, with 36% of applicants coming from outside the European Union (EU). This multi-cultural sample has the potential to contribute to the cross-cultural literature. It would be a step forward if further analysis could be carried out on differences between races, cultures and nationalities, in order to further test the new combinations of selection methods. As mentioned in the pros and cons, a limitation of this study was the short assessment period. Despite this, associations can still be drawn between GCA and personality and job performance. Building on this point, there is minimal literature on whether GCA and personality are associated with longer-term career success; testing this would require a longitudinal study. It was also noted that a strength of this study was the custom design of the personality test, which was tailored to the student caller job specification. Although the findings showed a positive relationship, it would be valuable to explore whether a similar outcome can be achieved with broader, more universal personality tests. The literature suggests that the trait of conscientiousness consistently predicts job performance across job sectors, Barrick & Mount (1991). It would be useful to test whether incorporating conscientiousness into the current format would maintain or improve the overall predictive validity and therefore remove the need for customisation for each job role. This would potentially strengthen its practical application and thus its appeal to organisations.
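One way this question could be examined, once supervisor ratings are available, is a simple hierarchical regression: a baseline model containing GCA and the tailored personality score is compared with a model that also includes a broad conscientiousness scale, and the change in R-squared is inspected. The sketch below is illustrative only; the data are simulated and the variable names (gca, tailored_personality, conscientiousness, supervisor_rating) are hypothetical rather than taken from this study.

    # Illustrative sketch: incremental validity of a broad conscientiousness scale
    # over GCA plus a tailored personality score, using simulated data.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 120  # illustrative sample size
    df = pd.DataFrame({
        "gca": rng.normal(size=n),
        "tailored_personality": rng.normal(size=n),
        "conscientiousness": rng.normal(size=n),
    })
    # Simulated supervisor ratings, loosely related to the predictors
    df["supervisor_rating"] = (0.4 * df["gca"]
                               + 0.3 * df["tailored_personality"]
                               + 0.2 * df["conscientiousness"]
                               + rng.normal(size=n))

    # Step 1: baseline model (GCA + tailored personality)
    base = sm.OLS(df["supervisor_rating"],
                  sm.add_constant(df[["gca", "tailored_personality"]])).fit()
    # Step 2: add conscientiousness and inspect the change in R-squared
    full = sm.OLS(df["supervisor_rating"],
                  sm.add_constant(df[["gca", "tailored_personality",
                                      "conscientiousness"]])).fit()

    print(f"R2 baseline: {base.rsquared:.3f}")
    print(f"R2 with conscientiousness: {full.rsquared:.3f}")
    print(f"Incremental R2: {full.rsquared - base.rsquared:.3f}")
    print(full.compare_f_test(base))  # F-test for the added predictor

A meaningful incremental R-squared, replicated across different job roles, would support replacing role-specific customisation with a broad conscientiousness measure.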
• 43. 43 Student ID: 12912540 Practical Application There are significant financial benefits to these findings for organisations. Research suggests that a high-performing employee in the 98th percentile, compared with an average employee in the 50th percentile, produces on average $32,000 (approx. £20,000) more per annum, Schmidt & Hunter (1998). In organisational terms, during a recruitment drive, employing 10 or more high-performing employees can therefore increase financial productivity by over £200,000 per annum. In addition, the cost-effectiveness of using GCA and personality testing is significant. Assessment centres and structured interviews can be very costly and time-consuming. With the implementation of these testing tools, which according to the research improve overall predictive validity, external costs can be minimised and the time needed for interviews shortened, as much of the useful data about a candidate can be gathered via online testing. Also, with the testing tools available online, data collection is very efficient and removes the need for paper tests, as candidates can complete the tests at their leisure. However, this does raise the issue of additional external variables that may affect test performance and are difficult to control, such as noise, guessing, internet connection failure or cheating. Two extreme outliers were removed from the results because the discrepancies between their verbal and numeracy test scores were artificially large (a simple way of flagging such discrepancies is sketched below). Candidates complete the verbal test before the numeracy test, so these two outliers may have been down to fatigue and loss of motivation. Consideration was made for this by clearly stating the approximate duration of the tests and designing them not to be too long. In addition, studies have shown that students' moral judgement is significantly lower online, and that cheating in online employment psychometrics or GCA tests, via search engines or group collaboration, is rising, Young & Calabrese (2007).
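To illustrate how such verbal-numerical discrepancies could be flagged for review in unproctored online testing, the minimal sketch below uses made-up scores and a hypothetical cut-off of 25 points; it is not the procedure that was applied in this study.

    # Illustrative sketch: flag unusually large verbal-numerical score gaps for
    # manual review (made-up data; the cut-off of 25 points is hypothetical).
    import pandas as pd

    scores = pd.DataFrame({
        "candidate": ["A01", "A02", "A03", "A04"],
        "verbal":    [62, 55, 70, 48],
        "numerical": [58, 21, 66, 47],
    })
    scores["gap"] = (scores["verbal"] - scores["numerical"]).abs()
    # In practice the cut-off could be derived from the distribution of gaps
    # across the whole applicant pool rather than fixed in advance.
    scores["review"] = scores["gap"] > 25
    print(scores)

Flagged candidates would then be reviewed alongside contextual information (test order, timing data, reported connection problems) before any decision to exclude their scores.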
• 44. 44 Student ID: 12912540 Therefore, it can be argued that while applicants may find methods to artificially inflate their GCA scores, treating GCA as a supplementary tool, used in combination with personality testing and a structured interview, is important in minimising the effect of cheating. Furthermore, the use of GCA tests in selection has been criticised because of the ethnic group differences in the scores it produces. Another positive practical application of using personality tests alongside GCA is therefore that personality measures typically do not show these ethnic group differences. Together, GCA and personality can increase overall validity while reducing group differences, compared with either used separately. Contribution to Occupational Psychology The research findings fill a gap in the literature and add to the limited number of studies in this area. They also contribute to the gathering evidence supporting the use of GCA tests in selection, whilst suggesting a more supplementary role for GCA. For the practical application of GCA within organisational selection and assessment, it should not matter so much why and how GCA predicts performance, but that the evidence supports that it does. Thus, this research's consideration of bias, through the addition of personality testing, is a positive step. Not only does the research provide a way to minimise the controversial perceptions of GCA, it also provides a theoretical basis for gaining acceptance of the empirical findings. It also demonstrates a new use for psychological testing: personality tests need not be used at the pre-screening stage, but can instead serve as a significant tool to aid the final hiring decision. Personality testing has been shown to be a successful complementary tool to GCA and to increase overall validity.
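As an illustration of why such a combination can raise validity, a standard psychometric result (not a quantity estimated in this study) gives the validity of a unit-weighted composite of two standardised predictors, with criterion validities r_{1y} and r_{2y} and intercorrelation r_{12}, as:

    r_{cy} = \frac{r_{1y} + r_{2y}}{\sqrt{2\,(1 + r_{12})}}

Because GCA and personality scores are typically only weakly correlated, the denominator stays close to the square root of 2 and the composite validity can exceed either validity on its own; applying the same expression to standardised group differences (with d in place of r) also illustrates why the composite tends to show a smaller group difference than GCA used alone.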
• 45. 45 Student ID: 12912540 The research has cleared a new path for future work to investigate different variations of this combination, in an attempt to increase both predictive validity and acceptance within Occupational Psychology. It also strengthens the empirical findings which demonstrate that GCA is highly predictive of job performance, even when used within more traditional selection methods such as the structured interview. This should go some way towards making the perception of GCA within selection and assessment more positive and encouraging its wider use. Word count: 10,188
• 46. 46 Student ID: 12912540 Critical Self-review Appraisal of practice General practice in terms of the study's timeline was efficient, and only small adjustments were necessary. The confirmation of the target organisation was achieved much later than planned, which put increased time constraints on certain parts of the study; the search for potential organisations should have commenced earlier in the year. The method was carried out successfully and all ethical considerations were addressed. The e-mail feedback from Criterion consultants, the head recruiter at Warwick and student applicants indicates that professionalism was maintained throughout the study. Personal approach Following initial face-to-face meetings, the majority of my organisation and communication throughout the study was conducted via e-mail. I attempted to keep it very personable and to provide all necessary information along the way. A trial run of the tests was shown to the department prior to the study, to give a better understanding of the tools and data they would be using. This also included a demonstration of my own custom-designed supervisor performance rating sheet, which was based around the job specification.
• 47. 47 Student ID: 12912540 Assumptions Key assumptions made during the study were that the interview panel would use and take into account the test data and would accept the change to the traditional format. Observations and notes were taken during the selection day, and this seemingly was the case. This is the key distinguishing factor between a 'test' group and the control group, and is therefore vitally important in terms of its influence on the results. It was also assumed that 1 month, although not ideal, would be sufficient time for the supervisor to give an accurate account of each candidate's performance. Researcher bias The choice of topic was inspired by a selection and assessment lecture given by my subsequent project supervisor. I was struck by how, in the space of 10 minutes, my perceptions of certain selection methods changed. I had previously assumed that GCA or IQ tests had little validity in predicting performance, so learning about the significant contrary evidence appealed to me, particularly as I was disillusioned and frustrated with current, more traditional selection methods, e.g. stressful assessment centres. As a result, researcher bias may have arisen through my personal determination to support the evidence behind the alternative methods of GCA and personality testing, leading to a one-sided view of patterns in the statistics.
• 48. 48 Student ID: 12912540 What have I learnt? Through the reading for the literature review I have learnt that applying academic research practically in everyday life is a great challenge. At times, therefore, adaptation is required to make advances in the research, e.g. the use of GCA and personality testing with structured interviews. I have also learnt a great amount in terms of statistics, reading journals and efficient time management, as well as liaising and dealing with various stakeholders during the research: the Criterion Partnership consultants about the tools, the University HR team about recruitment, and the participants about completing the tests. This has taught me much about academic writing and has developed many skills that I will utilise throughout my future career.
  • 49. 49 Student ID: 12912540 Acknowledgements First and foremost, I would like to thank my supervisor Dr Chris Dewberry who provided the inspiration behind the research and supported me throughout the year. I am also very grateful to Samuel Burdock, manager of the student caller recruitment at the University of Warwick, who was open-minded to new ideas and made my project possible by providing a target organisation. Also, I thank Maria Gardener and Juliane Sternemann, consultants at Criterion Partnership, who worked closely with me and kindly offered their testing tools for free, in the name of research.
  • 50. 50 Student ID: 12912540 References Arvey, R. D., Miller, H. E., Gould, R., & Burch, P. (1987). Interview validity for selecting sales clerks. Personnel Psychology, 40(1), 1-12. Arvey, R. D., & Campion, J. E. (1982). The employment interview: A summary and review of recent research 1. Personnel Psychology,35(2), 281-322. Barrick, M. R., & Mount, M. K. (1996). Effects of impression management and self-deception on the predictive validity of personality constructs. Journal of applied psychology, 81(3), 261. British Psychological Society (BPS): Psychological testing – a new approach. (2013). Retrieved March 7, 2014 http://www.bps.org.uk/psychological-testing-new-approach Campbell, R. L (1992). Types of constraints on development: An interactivist approach. Developmental Review, 12(3), 311-338. Carlsen, G. R. (1967). Literature: Dead or Alive?. California English Journal,3(3), 24-29. Conway, J. M., & Huffcutt, A. I. (1997). Psychometric properties of multisource performance ratings: A meta-analysis of subordinate, supervisor, peer, and self-ratings. Human Performance, 10(4), 331-360. Day, D. V. (1993). Brief psychotherapy in two-plus-one sessions with a young offender population. Behavioural and Cognitive Psychotherapy, 21(04), 357-369. Day, D. V., & Silverman, S. B. (1989). Personality and job performance: Evidence of incremental validity. Personnel Psychology, 42(1), 25-36. Dewberry, C. (2001). Performance disparities between whites and ethnic minorities: real differences or assessment bias?. Journal of Occupational and Organizational Psychology, 74(5), 659- 673.
  • 51. 51 Student ID: 12912540 Dewberry, C., Davies‐Muir, A., & Newell, S. (2013). Impact and causes of rater severity/leniency in appraisals without postevaluation communication between raters and ratees. International Journal of Selection and Assessment, 21(3), 286-293. Frank, L. L., & Hackman, J. R. (1975). Effects of interviewer-interviewee similarity on interviewer objectivity in college admissions interviews. Journal of Applied Psychology, 60(3), 356. Goffin, R. D., & Christiansen, N. D. (2003). Correcting personality tests for faking: A review of popular personality tests and an initial survey of researchers.International Journal of Selection and assessment, 11(4), 340-344. Guion, R. M., & Gottier, R. F. (1965). Validity of personality measures in personnel selection. Personnel psychology, 18(2), 135-164. Hough, L. M., Eaton, N. K., Dunnette, M. D., Kamp, J. D., & McCloy, R. A. (1990). Criterion- related validities of personality constructs and the effect of response distortion on those validities. Journal of applied psychology, 75(5), 581. Hough, L. M., & Oswald, F. L. (2008). Personality testing and industrial–organizational psychology: Reflections, progress, and prospects. Industrial and Organizational Psychology, 1(3), 272- 290. Huffcutt, A. I., Roth, P. L., & McDaniel, M. A. (1996). A meta-analytic investigation of cognitive ability in employment interview evaluations: Moderating characteristics and implications for incremental validity. Journal of Applied Psychology, 81(5), 459. Hunter, J. E. (1990). An analysis of the diversification benefit from international equity investment. The Journal of Portfolio Management,17(1), 33-36.
  • 52. 52 Student ID: 12912540 Hunter, J. E., & Hunter, R. F. (1984). Validity and utility of alternative predictors of job performance. Psychological bulletin, 96(1), 72. Hunter, J. E., Schmidt, F. L., & Judiesch, M. K. (1990). Individual differences in output variability as a function of job complexity. Journal of Applied Psychology,75(1), 28. Janz, T. (1989). Case study on Utility: Utility to the rescue, a case of staffing program decision support'. Advances in Selection and Assessment. Chichester: John Wiley, 269-272. Jensen, B. (1998). Communication or knowledge management. It’s time to wake up and smell the koffee. Communication World. Kane, J. S., Bernardin, H. J., Villanova, P., & Peyrefitte, J. (1995). Stability of rater leniency: Three studies. Academy of Management Journal, 38(4), 1036-1051. Ledvinka, J. (1973). Race of employment interviewer and reasons given by black job seekers for leaving their jobs. Journal of Applied Psychology, 58(3), 362. McCarthy, J. M., Van Iddekinge, C. H., & Campion, M. A. (2010). Are highly structured job interviews resistant to demographic similarity effects?. Personnel Psychology, 63(2), 325-359. McDaniel, M. A., Whetzel, D. L., Schmidt, F. L., & Maurer, S. D. (1994). The validity of employment interviews: A comprehensive review and meta-analysis.Journal of applied psychology, 79(4), 599. Muldrow, T. W. (1979). Impact of valid selection procedures on work-force productivity. Journal of Applied Psychology, 64(6), 609. Murphy, K. R. (1989). Is the relationship between cognitive ability and job performance stable over time?. Human Performance, 2(3), 183-200.
  • 53. 53 Student ID: 12912540 Nell, V. (2000). An evolutionary perspective on the prevention of youthful risk-taking: The case for classical conditioning. In Transportation, Traffic Safety and Health—Human Behavior (pp.163-179). Springer Berlin Heidelberg. Ones, D. S., Dilchert, S., Viswesvaran, C., & Judge, T. A. (2007). In support of personality assessment in organizational settings. Personnel psychology, 60(4), 995-1027. Ones, D. S., Viswesvaran, C., & Schmidt, F. L. (1993). Comprehensive meta-analysis of integrity test validities: Findings and implications for personnel selection and theories of job performance. Journal of applied psychology, 78(4), 679. Outtz, J. L. (2002). The role of cognitive ability tests in employment selection. Human Performance, 15(1-2), 161-171. Raven, J. (2000). The Raven's progressive matrices: change and stability over culture and time. Cognitive psychology, 41(1), 1-48. Reeve, C. L., & Hakel, M. D. (2002). Asking the right questions about g. Human Performance, 15(1-2), 47-74. Rothstein, M. G., & Goffin, R. D. (2006). The use of personality measures in personnel selection: What does current research support?. Human Resource Management Review, 16(2), 155-180. Rushton, J. P., Skuy, M., & Fridjhon, P. (2003). Performance on Raven's advanced progressive matrices by African, East Indian, and White engineering students in South Africa. Intelligence, 31(2), 123-137. Schmidt, F. L., & Kaplan, L. B. (1971). Composite vs. multiple criteria: A review and resolution of the controversy. Personnel Psychology, 24(3), 419-434.
  • 54. 54 Student ID: 12912540 Schmidt, F. L., & Hunter, J. E. (1977). Development of a general solution to the problem of validity generalization. Journal of Applied Psychology, 62(5), 529. Schmidt, F. L. & Hunter, J. E. (1990). Dichotomization of continuous variables: The implications for meta-analysis. Journal of Applied Psychology, 75(3), 334. Schmidt, F. L. & Hunter, J. E. (1996). Intelligence and job performance: Economic and social implications. Psychology, Public Policy, and Law, 2(3-4), 447. Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological bulletin, 124(2), 262. Schmidt, F. L., & Hunter, J. (2004). General mental ability in the world of work: occupational attainment and job performance. Journal of personality and social psychology, 86(1), 162. Schmidt, F. L., & Ones, D. S. (2002). The moderating influence of job performance dimensions on convergence of supervisory and peer ratings of job performance: Unconfounding construct-level convergence and rating difficulty. Journal of Applied Psychology, 87(2), 345. Schmidt, F. L., Hunter, J. E. & Pearlman, K. (1980). Validity generalization results for tests used to predict job proficiency and training success in clerical occupations. Journal of Applied Psychology, 65(4), 373. Schmitt, N., Noe, R. A., Meritt, R., & Fitzgerald, M. P. (1984). Validity of assessment center ratings for the prediction of performance ratings and school climate of school administrators. Journal of Applied psychology, 69(2), 207.
  • 55. 55 Student ID: 12912540 Skuy, M. (2001). Performance on Raven's Matrices by African and White university students in South Africa. Intelligence, 28(4), 251-265. Spearman, C. (1904). “General Intelligence," Objectively Determined and Measured. The American Journal of Psychology, 15(2), 201-292. Sternberg, R. J. (2001). The predictive value of IQ. Merrill-Palmer Quarterly, 47(1), 1-41. Tett, R. P. (1999). Meta-analysis of bidirectional relations in personality-job performance research. Human Performance, 12(1), 1-29. The Equality Act (2010). Retrieved August 7, 2014 https://www.gov.uk/equality-act-2010- guidance Ulrich, L., & Trumbo, D. (1965). The selection interview since 1949.Psychological bulletin, 63(2), 100. Villanova, P. (2009). Rating level and Accuracy as a Function of Rater Personality. International Journal of Selection and Assessment, 17(3), 300-310. Weiss, D. J. (1984). Application of computerized adaptive testing to educational problems. Journal of Educational Measurement, 21(4), 361-375. Wright, O. R. (1969). Summary of research on the selection interview since 1964. Personnel Psychology, 22(4), 391-413. Yukl, G. A., Kovacs, S. Z., & Sanders, R. E. (1972). Importance of contrast effects in employment interviews. Journal of Applied Psychology,56(1), 45. Yun, G. J., Donahue, L. M., Dudley, N. M., & McFarland, L. A. (2005). Rater personality, rating format, and social context: Implications for performance appraisal ratings. International Journal of Selection and Assessment, 13(2), 97-107.
  • 56. 56 Student ID: 12912540 Zhang, Y. Luo, F. & Liu, H. Y., (2010). The Developing of Faking Detection Scale in Application Situation [J]. Acta Psychologica Sinica, 7, 006. Zhou, J. B., & Miao, Q. (2004). Decision Making Based Upon the Attribute of Subjective Value: A Quasi Experimental Research. Chinese Journal of Management, 10(4), 20-21.
  • 57. 57 Student ID: 12912540 APPENDIX A The ethics form Organizational Psychology, BIRKBECK UNIVERSITY OF LONDON PROPOSAL TO CONDUCT RESEARCH INVOLVING HUMAN PARTICIPANTS SUBMISSION TO SCHOOL ETHICS COMMITTEE Please type or write clearly in BLACK ink Name of investigator_ Alan Raftery____________________________________________________ Status (e.g. PhD student, postgraduate) _postgraduate_____________________ Name of supervisor (if known) _Chris Dewberry___________________________________ Course/Programme:_Occupational Psychology MSc_________________________ Title of investigation (15 words maximum): Improving the Validity of Structured Interviews with the Implementation of Personality Profiling and General Cognitive Ability Testing. Contact address _8 Erithway Rd___________________________________ for investigator _ Coventry___________________________________ _West Midlands__________________________________ Telephone number _024 76 410 135_______________ Mobile 07732068937__________________ Email alanraftery@live.com__________________ Date of Application:_16/09/2014*_ Proposed starting date:__16/09/2014__________________ Source of funding if relevant: ____n/a________________________ Is any other Ethical Committee involved: YES/NO If YES, give details of committee and its decision: Brief description of aims/objectives of the study To test new research which suggests that a combination of the popular structured interviews with alternative selection methods, such as GCA, can provide an improved
• 58. 58 Student ID: 12912540 selection process which minimises inequality and bias, while maintaining high performance prediction and relevance to the job role. How will participants be selected? Will the selection process have implications in terms of data protection etc? Where will the study be conducted? The target organisation is the department of Development & Alumni Relations Office, University of Warwick. The department runs seasonal campaigns, recruiting current students into 'student caller' roles. My research is part of a review of the recruitment processes within the department. Briefly describe what participating in the study will involve: The 300-400 applicants will be split into 4 groups at random, to be asked to complete either: - GCA test - Personality test - Both GCA and personality test - No tests – control group Candidates will be contacted via e-mail and sent the link to the Criterion tools. The email will contain information about the research and candidates will be given the opportunity to decline participation and withdraw at any stage, with no consequences for their chances within the recruitment process. The 30-40 successful candidates will be given 1 month to settle into their role before supervisor ratings are recorded. This data is collected over a 3-day period. All participants are thanked and informed according to the ethical guidelines. Does the study involve the deliberate use of: (i) Unpleasant stimuli or unpleasant situations? (YES/NO) (ii) Invasive procedures? (YES/NO) (iii) Deprivation or restriction (e.g., food, water, sleep)? (YES/NO) (iv) Drug administration? (YES/NO) (v) Actively misleading or deceiving the subjects? (YES/NO)
• 59. 59 Student ID: 12912540 (vi) Withholding information about the nature or outcome of the experiment? (YES/NO) (vii) Any inducement or payment to take part in the experiment (YES/NO) Does the study have any procedure that might cause distress to the subject? (YES/NO) Give details of any item marked YES: The tests, in particular the general cognitive ability (GCA) tests, can be quite stressful for the candidate. They are challenging and are timed, which may cause anxiety or distress. The target sample should be comfortable with the format of the tests, and the sample is drawn from students who have entered an application process involving tasks, interviews and assessment centres. All candidates are reassured that they do not have to take the tests and that the results of the tests will not affect their application. What arrangements are to be made to obtain the free and informed consent of the subjects? Upon initial contact with the candidates via e-mail, it will be made clear that they are under no obligation to participate and are able to withdraw their data at any stage. Candidates are also reassured by the head recruitment officer within the e-mail that not participating will not affect their chances during the recruitment process. Upon clicking on the link provided, participants will be asked to read an information sheet and, if willing, sign a consent form before proceeding to the tests. How will you maintain the participants' confidentiality? The data collected via the Criterion tools (personality and GCA) are stored securely on their system under the consultancy's strict confidentiality guidelines. The data is then sent to me and stored on one computer within a password-locked folder. The data at this stage is fully anonymous as each candidate is given a unique code. Upon request, the candidates may receive their own results/scores from Criterion via e-mail. The supervisor ratings will be added to the data using the anonymous candidate codes. Following analysis of the data, the data will be destroyed. Will the subjects be minors or suffer from learning disabilities? (YES/NO) If yes, outline how you will address the ethical issues raised. If you feel that the proposed investigation raises ethical issues please outline them below: Will the research involve any conflict between your role at work and your role as a research student?
  • 60. 60 Student ID: 12912540 (i.e. will you want to use data/colleagues that you have access/contact with in your job but as a researcher this data/colleagues would not normally be available to you) Classification of proposal (please underline) ROUTINE NON-ROUTINE When you are ready to start data collection you and your supervisor should check the ethics form has been satisfactorily completed before signing the form and sending a copy to George Michaelides. I consider that my study conforms with the ethical expectations of management and psychological research SIGNATURE of investigator and supervisor (Signed in Research Proposal) ALAN RAFTERY Date:16/09/2014 Word count of this document: 857
• 61. 61 Student ID: 12912540 APPENDIX B Coast Personality Questionnaire Scales Booklet © 2013 Criterion Partnership Ltd 01273 734000 coast@criterionpartnership.co.uk http://www.surftocoast.co.uk www.criterionpartnership.co.uk
• 62. 62 Student ID: 12912540 Approachable
Interpersonal style
Adaptable
Rarely alters behaviour to create an impression in different circumstances. Personality consistent across situations.
Adapts style of behaviour to suit different individuals. Changes personality in different situations.
Reserved. Takes time to get to know people. Can appear guarded. Dislikes small talk.
Friendly. Easy to get on with. Quickly builds rapport with others.
Assertive
Dislikes being bossy. Tends to play supporting roles rather than directive ones.
Dominant. Makes presence felt. Sometimes overbearing with others.
Direct
Diplomatic and tactful. Cautious in expressing opinions. Tends to avoid confrontations.
Candid. Speaks out without worrying too much about upsetting people. Direct in expressing opinions.
Gregarious
Reticent and quiet in many social situations. May appear shy in some circumstances.
At ease with other people. Confident and relaxed on social occasions.
Enjoys own company. Happy to work alone. Inclined to be less sociable than others.
Likes the company of other people. Sociable. Works well with others. May dislike working alone.
Accepts other people's views. Prefers to 'agree to disagree' rather than try to influence or persuade.
Persuasive. Persists in trying to influence other people. Aims to win people over.
Independent
Happy to fit in with others. Prefers to be considered normal rather than different. Content to compromise.
Non-conforming. Goes own way. Likes to be different. Dislikes compromising to suit others.
Listening
Likes to be the one who does the talking. Sometimes doesn't listen to others or forgets what they say.
Prepared to take time to listen to people. Considers others' opinions. Easy to talk to.
Poised
• 63. 63 Student ID: 12912540 Gets work done as quickly as possible. Looks for ways to cut corners. More expedient than careful.
Thorough and conscientious. Likes to do things properly. Takes time and avoids short cuts.
Cautious. More inclined to 'play safe' than take chance decisions. Avoids substantial risks.
Prepared to take chances. Sometimes enjoys throwing caution to the wind.
Thinking style
Happy to stick with clearly defined systems which work. Prefers following procedures to creating new methods.
More interested in the main task than the intricate details. Prepared to leave others to spot minor errors.
Looks for new approaches. Enjoys trying new ideas. Prefers inventing new methods to applying old ones.
Takes a perfectionist approach. Enjoys attending to detail. Notices points that others overlook.
Takes time to consider all options before taking a course of action. Dislikes making snap decisions.
Quick to take decisions. Prefers to reach decisions rapidly rather than leave issues open.
Intuitive. Likes to rely on feeling rather than gathering too much data. Guided by experience more than rational analysis.
Scientific and analytical when dealing with problems. Logical by nature. Prefers to rely on data.
Distractible. Finds it difficult to stick with routine tasks. Becomes bored quickly with tedious jobs.
Follows things through. Persists with a task even if it is boring.
Concerned with concrete practicalities. Less interested in the theory or the wider implications. Applies thinking to operational considerations.
Approaches issues from a theoretical perspective. Concerned with underlying principles. Prefers strategy to operational specifics.
More inclined to be spontaneous than structured. Finds working to a plan restrictive.
Methodical, orderly and systematic. Plans things out before starting. Uncomfortable working in a chaotic manner.
• 64. 64 Student ID: 12912540 Not usually upset by criticism. Tough, rather than emotional.
Sensitive to criticism. Can become emotional and suffer from the feeling of being hurt.
Emotional style
Often feels anxious. May worry and feel tense. Finds it difficult to relax.
Disclosure
Private. Keeps emotions to self. Prefers not to let true feelings be obvious to others.
Expresses emotions easily. Lets feelings show. Open with people about emotions.
Emotional analysis
Avoids thinking up explanations for feelings. Trusts own emotions without needing to explain them.
Likes to make sense of own emotions. Tries to find explanations for feelings.
Internal control
Resilient
Feels that events are outside own control. Sometimes leaves things to fate. Attributes outcomes to good and bad luck.
Feels a strong sense of personal control. Responsible for own destiny. Tries to influence events towards preferred outcomes.
Optimistic
Often anticipates the negative. Sometimes pessimistic about the future. Sees problems more than benefits.
Expects things to turn out for the best. Confident about the future. Accentuates the positive.
Calm
Unlikely to become tense or flustered. Cool and calm even in difficult situations.
Prefers to take time over work. Works better when feeling calm and relaxed. Dislikes being rushed.
Not flustered by high pressure work. Happy coping with tight deadlines. Works better under stress.
Stress management
Self esteem
May experience self-doubt. May sometimes need encouragement to build up self-confidence.
Has inner confidence in own abilities. Feels self-assured and values own worth.
Self sufficiency
Seeks emotional support from other people. Enjoys being able to have someone to turn to.
Prefers to rely on self without emotional support from others. Feels emotionally self-sufficient.