PSYCHOLOGICAL ASSESSMENT AND PSYCHOLOGICAL TEST CONSTRUCTION
Ethics for Psychologists: 7 Essentials
1) Ethical awareness is a continuous, active process that involves constant questioning and personal responsibility.
2) Awareness of ethical codes and legal standards is important, but formal codes and standards cannot take the place of an active, thoughtful, creative approach to our ethical responsibilities.
3) Awareness of the evolving research and theory in the scientific and professional literature is another important aspect of ethical competence, but the claims and conclusions emerging in the literature can never be passively accepted or reflexively applied, no matter how popular, authoritative, or seemingly obvious.
4) We believe that the overwhelming majority of therapists and counselors are conscientious, dedicated, caring individuals, committed to ethical behavior. But none of us is infallible.
5) Many of us find it easier to question the ethics of others than to question our own beliefs, assumptions, and actions. It is worth noticing if we find ourselves preoccupied with how wrong others are in some area of ethics and certain that we are the ones to set them right, or at least to point out repeatedly how wrong they are.
6) Many of us find it easier and more natural to question ourselves in areas where we are uncertain. It tends to be much harder -- but often much more productive -- to question ourselves about what we are most sure of, what seems beyond doubt or question. Nothing can be placed off-limits for this questioning.
7) As psychologists, we often encounter ethical dilemmas without clear and easy answers. There is no legitimate way to avoid these ethical struggles. They are part of our work.
PAP Code of Ethics
A. Bases for Assessment
1. The expert opinions that we provide through our recommendations, reports, and diagnostic or evaluative statements are based on substantial information and appropriate assessment techniques.
2. We provide expert opinions regarding the psychological characteristics of a person only after employing adequate assessment procedures and examination to support our conclusions and recommendations.
3. In instances where we are asked to provide opinions about an individual without conducting an examination, on the basis of a review of existing test results and reports, we discuss the limitations of our opinions and the basis of our conclusions and recommendations.
B. Informed Consent in Assessment
1. We gather informed consent prior to the assessment of our clients, except in the following instances:
a. when it is mandated by law;
b. when it is implied, such as in routine educational, institutional, and organizational activity;
c. when the purpose of the assessment is to determine the individual's decisional capacity.
2. We educate our clients about the nature of our services, financial arrangements, potential risks, and limits of confidentiality. In instances where our clients are not competent to provide informed consent on assessment, we discuss these matters with immediate family members or legal guardians. (See also III-J, Informed Consent in Human Relations)
3. In instances where a third-party interpreter is needed, the confidentiality of test results and the security of the tests must be ensured. The limitations of the obtained data are discussed in our results, conclusions, and recommendations.
C. Assessment Tools
1. We judiciously select and administer only those tests which are pertinent to the reasons for referral and purpose of the assessment.
2. We use data collection methods and procedures that are consistent with current scientific and professional developments.
3. We use tests that are standardized, valid, reliable, and have normative data directly referable to the population of our clients.
4. We administer assessment tools that are appropriate to the language, competence, and other relevant characteristics of our client.
D. Obsolete and Outdated Test Results
1. We do not base our interpretations, conclusions, and recommendations on outdated test results.
2. We do not provide interpretations, conclusions, and recommendations on the basis of obsolete tests.
E. Interpreting Assessment Results
1. In fairness to our clients, under no circumstances should we report the test results without taking into consideration the validity, reliability, and appropriateness of the test. We should therefore indicate our reservations regarding the interpretations.
2. We interpret assessment results while considering the purpose of the assessment and other factors such as the client's test-taking abilities, characteristics, and situational, personal, and cultural differences.
F. Release of Test Data
1. It is our responsibility to ensure that test results and interpretations are not used by persons other than those explicitly agreed upon by the referral sources prior to the assessment procedure.
2. We do not release test data in the form of raw and scaled scores, the client's responses to test questions or stimuli, and notes regarding the client's statements and behaviors during the examination, unless required by the court.
G. Explaining Assessment Results
1. We release test results only to the sources of referral, and with written permission from the client if it is a self-referral.
2. Where test results have to be communicated to relatives, parents, or teachers, we explain them in non-technical language.
3. We explain findings and test results to our clients or designated representatives, except when the relationship precludes the provision of an explanation of results and this is explained in advance to the client.
4. When test results need to be shared with schools, social agencies, the courts, or industry, we supervise such releases.
H. Test Security
All test materials (manuals, keys, answer sheets, reusable booklets, etc.) shall be handled only by qualified users or personnel.
I. Assessment by Unqualified Persons
1. We do not promote the use of assessment tools and methods by unqualified persons except for training purposes with adequate supervision.
2. We ensure that test protocols, their interpretations, and all other records are kept secured from unqualified persons.
9.09 Test Scoring and Interpretation Services
(a) Psychologists who offer assessment or scoring services to other professionals accurately describe the purpose, norms, validity, reliability, and applications of the procedures and any special qualifications applicable to their use.
(b) Psychologists select scoring and interpretation services (including automated services) on the basis of evidence of the validity of the program and procedures as well as on other appropriate considerations. (See also Standard 2.01b and c, Boundaries of Competence.)
(c) Psychologists retain responsibility for the appropriate application, interpretation, and use of assessment instruments, whether they score and interpret such tests themselves or use automated or other services.
POSSIBLE GUIDELINES TO CONSIDER IN CLINICAL ASSESSMENT
Assessment is more than testing: "Whereas tests deliver scores, assessment provides a meaningful way to describe an individual's strengths and weaknesses" (Van Ornum, Dunlap, & Shore, 2008, p. 17). Among other things, the purpose of psychological assessment is to enhance decision-making. Gregory (1998) emphasizes the usefulness of assessment for decision-making: "Assessment is problem solving . . . to answer questions about persons referred to a psychologist . . . . Assessment is a process in which the clinician integrates three components: (1) the reason for assessment, (2) a preferred theoretical orientation, and (3) relevant sources of information" (p. 27).
Compas and Gotlib (2002) highlight the use of assessment for the formulation of goals: "Goals may include diagnostic classification, determination of the severity of a problem, risk screening for future problems, evaluation of the effects of treatment, and predictions about the likelihood of certain types of future behavior."
First, in clinical settings, assessment requires (with limited exceptions) that the psychologist meet face-to-face with the service user and maintain an active role in the evaluation and assessment. Second, supervised use of an assistant should be acknowledged in communications (e.g., a report). Third, a psychologist cannot "sign off" for another practitioner, unless the psychologist has, in fact, been actively involved in the evaluation or assessment of the service user. Fourth, there should be awareness of research support for interpretive statements, with reconciliation of norms with independent judgments by the psychologist.
Fifth, any test used should be appropriate for the needs (goals and objectives) of the particular service user. Sixth, since some assessment procedures are highly subjective, the term "test instrument" should be limited to standardized procedures that purport to provide objective measurements. Seventh, the release of psychological data or information that could compromise the integrity of the measures should be restricted to psychologists (or school psychologists) as much as is legally justified. Clearly the foregoing principles require scholarly thought, and when there is doubt, such as how to integrate legal and professional considerations, consultation may be advisable.
Table 1 - Categories of 703 Ethically Troubling Incidents

Category                                                             n    %
Confidentiality                                                    128   18
Blurred, dual, or conflictual relationships                        116   17
Payment sources, plans, settings, and methods                       97   14
Academic settings, teaching dilemmas, and concerns about training   57    8
Forensic psychology                                                 35    5
Research                                                            29    4
Conduct of colleagues                                               29    4
Sexual issues                                                       28    4
Assessment                                                          25    4
Questionable or harmful interventions                               20    3
Competence                                                          20    3
Ethics (and related) codes and committees                           17    2
School psychology                                                   15    2
Publishing                                                          14    2
Helping the financially stricken                                    13    2
Supervision                                                         13    2
Advertising and (mis)representation                                 13    2
Industrial-organizational psychology                                 9    1
Medical issues                                                       5    1
Termination                                                          5    1
Ethnicity                                                            4    1
Treatment records                                                    4    1
Miscellaneous                                                        7    1
Assessment
As the following examples illustrate, the most typical dilemmas focusing on assessment tended to involve one of two themes: (a) the availability of tests (or computerized interpretations) to those who may not be adequately trained in testing, and (b) basing conclusions on inadequate data or ignoring important sources of data (e.g., observation, interview, or other contact with the client) or expertise.
"Only one person was sent for training.... often asked to add new tests without appropriate supervision. Test publishers aren't motivated to slow down sales by requiring training to purchase tests."
"Asked by social workers to 'interpret' psychological tests which they administer without allowing me to see the client. I refuse to render an opinion without client contact."
"Colleagues in the medical profession have the right to order psychological tests from computer companies that give computer-generated interpretations."
"Problematic testing issues such as full and part-time psychologists... making neuropsychological diagnoses without expertise, at the same time ignoring the resident neuropsychologist."
"Other practitioners, especially internists, wanting to base important decisions on just an MMPI [Minnesota Multiphasic Personality Inventory] result."
"Some psychologists will omit subtests from the Wechsler and report verbal and performance IQ scores without indicating that they have omitted subtests. This practice was so common we had to require a copy of the summary sheet to ensure that all tests were administered."
"Psychologists use computer-generated test reports as the only report of an evaluation, without integrating the test results with other data."
ETHICS IN PSYCHOLOGICAL TEST CONSTRUCTION
J. Test Construction
We develop tests and other assessment tools using current scientific findings and knowledge, appropriate psychometric properties, and validation and standardization procedures.
Critiquing a test
Purpose
What does the test measure overall? Who is it for? What assumptions are these purposes based on?
Design
What are the individual constructs that it measures? What logical or theoretical assumptions underpin these constructs? What empirical basis is there for the constructs, i.e., how were they developed, and what evidence is there of this?
Bias, validity, and reliability
Examine the test for item, wording, and ordering biases. How would you test the test for content, construct, and predictive validity? Make a plan for one of these.
Overall -- what are the main flaws and strengths of this test?
Ethics
• Purpose -- Be ethical about why and how the test is given
• Authenticity -- The test must reflect 'real life' psychology
• Generalisability -- When reporting results, be realistic about who these can be extended to
• Subjectivity -- Be honest about how much personal judgement is included in the design of your test and results analysis
Mismatched Validity (1) Selecting assessment instruments involves similarly complex questions, such as: "Has research established sufficient reliability and validity (as well as sensitivity, specificity, and other relevant features) for this test, with an individual from this population, for this task (i.e., the purpose of the assessment), in this set of circumstances?" It is important to note that as the population, task, or circumstances change, the measures of validity, reliability, sensitivity, etc., will also tend to change. To determine whether tests are well-matched to the task, individual, and situation at hand, it is crucial that the psychologist ask a basic question at the outset: Why--exactly--am I conducting this assessment?
Confirmation Bias Often we tend to seek, recognize, and value information that is consistent with our attitudes, beliefs, and expectations. If we form an initial impression, we may favor findings that support that impression, and discount, ignore, or misconstrue data that don't fit. This premature cognitive commitment to an initial impression--which can form a strong cognitive set through which we sift all subsequent findings--is similar to the logical fallacy of hasty generalization. To help protect ourselves against confirmation bias (in which we give preference to information that confirms our expectations), it is useful to search actively for data that disconfirm our expectations, and to try out alternate interpretations of the available data.
Confusing Retrospective & Predictive Accuracy (Switching Conditional Probabilities) (3) Predictive accuracy begins with the individual's test results and asks: What is the likelihood, expressed as a conditional probability, that a person with these results has condition (or ability, aptitude, quality, etc.) X? Retrospective accuracy begins with the condition (or ability, aptitude, quality) X and asks: What is the likelihood, expressed as a conditional probability, that a person who has X will show these test results? Confusing the "directionality'' of the inference (e.g., the likelihood that those who score positive on a hypothetical predictor variable will fall into a specific group versus the likelihood that those in a specific group will score positive on the predictor variable) causes many errors. This mistake of confusing retrospective with predictive accuracy often resembles the affirming the consequent logical fallacy: People with condition X are overwhelmingly likely to have these specific test results. Person Y has these specific test results. Therefore: Person Y is overwhelmingly likely to have condition X.
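The directional confusion described above can be made concrete with a toy 2x2 table. The counts below are invented purely for illustration; they show that a high retrospective accuracy, P(positive result | condition X), can coexist with a much lower predictive accuracy, P(condition X | positive result):

```python
# Invented 2x2 counts: 100 people have condition X, 900 do not.
has_x_pos, has_x_neg = 90, 10    # have X: test positive / test negative
no_x_pos, no_x_neg = 90, 810     # lack X: test positive / test negative

# Retrospective accuracy: P(positive result | condition X)
retrospective = has_x_pos / (has_x_pos + has_x_neg)   # 90 / 100 = 0.90

# Predictive accuracy: P(condition X | positive result)
predictive = has_x_pos / (has_x_pos + no_x_pos)       # 90 / 180 = 0.50

print(f"P(+|X) = {retrospective:.2f}  vs  P(X|+) = {predictive:.2f}")
```

Here 90% of people with condition X show the positive result, yet only half of those with the positive result actually have the condition -- treating the first number as if it were the second is exactly the affirming-the-consequent error described in the text.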
Unstandardizing Standardized Tests (4) Standardized tests gain their power from their standardization. Norms, validity, reliability, specificity, sensitivity, and similar measures emerge from an actuarial base: a well-selected sample of people providing data (through answering questions, performing tasks, etc.) in response to a uniform procedure in (reasonably) uniform conditions. When we change the instructions, or the test items themselves, or the way items are administered or scored, we depart from that standardization and our attempts to draw on the actuarial base become questionable. There are other ways in which standardization can be defeated. People may show up for an assessment session without adequate reading glasses, or having taken cold medication that affects their alertness, or having experienced a family emergency or loss that leaves them unable to concentrate, or having stayed up all night with a loved one and now can barely keep their eyes open. The professional conducting the assessment must be alert to these situational factors, how they can threaten the assessment's validity, and how to address them effectively.
Ignoring the Effects of Low Base Rates(5) Ignoring base rates can play a role in many testing problems but very low base rates seem particularly troublesome. Imagine you've been commissioned to develop an assessment procedure that will identify crooked judges so that candidates for judicial appointment can be screened. It's a difficult challenge, in part because only 1 out of 500 judges is (hypothetically speaking) dishonest.
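A quick calculation shows why. Only the 1-in-500 base rate comes from the example above; the 90% sensitivity and specificity are assumed figures for a hypothetically quite good screening instrument. Even so, almost every flagged candidate would be honest:

```python
# Hypothetical screening numbers. Only the 1-in-500 base rate is from the
# text; sensitivity and specificity are assumed for illustration.
base_rate = 1 / 500       # P(dishonest judge)
sensitivity = 0.90        # P(test flags candidate | dishonest) -- assumed
specificity = 0.90        # P(test clears candidate | honest)   -- assumed

# Overall probability of being flagged (true positives + false positives)
p_flag = sensitivity * base_rate + (1 - specificity) * (1 - base_rate)

# Positive predictive value: P(dishonest | flagged)
ppv = sensitivity * base_rate / p_flag

print(f"P(dishonest | flagged) = {ppv:.3f}")
```

Under these assumptions, roughly 98% of flagged candidates would be false positives, which is why very low base rates make even accurate-sounding screening instruments so treacherous.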
Misinterpreting Dual High Base Rates (6) As part of a disaster response team, you are flown in to work at a community mental health center in a city devastated by a severe earthquake. Taking a quick look at the records the center has compiled, you note that of the 200 people who have come for services since the earthquake, there are 162 who are of a particular religious faith and are diagnosed with PTSD related to the earthquake, and 18 of that faith who came for services unrelated to the earthquake. Of those who are not of that faith, 18 have been diagnosed with PTSD related to the earthquake, and 2 have come for services unrelated to the earthquake. It seems almost self-evident that there is a strong association between that particular religious faith and developing PTSD related to the earthquake: 81% of the people who came for services were of that religious faith and had developed PTSD. Perhaps this faith makes people vulnerable to PTSD. Or perhaps it is a more subtle association: this faith might make it easier for people with PTSD to seek mental health services. But the inference of an association is a fallacy: religious faith and the development of PTSD in this community are independent factors.
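Checking the independence claim takes only a few lines. The counts below are the ones given in the example:

```python
# Counts from the earthquake example in the text.
faith_ptsd, faith_other = 162, 18
nonfaith_ptsd, nonfaith_other = 18, 2
total = faith_ptsd + faith_other + nonfaith_ptsd + nonfaith_other  # 200

p_ptsd_given_faith = faith_ptsd / (faith_ptsd + faith_other)              # 162/180
p_ptsd_given_nonfaith = nonfaith_ptsd / (nonfaith_ptsd + nonfaith_other)  # 18/20
p_ptsd_overall = (faith_ptsd + nonfaith_ptsd) / total                     # 180/200

# All three rates are 0.90: knowing a client's faith adds nothing to the
# prediction of PTSD status, so the two factors are independent here.
print(p_ptsd_given_faith, p_ptsd_given_nonfaith, p_ptsd_overall)
```

The PTSD rate is 90% whether or not the client is of that faith, and 90% overall; the striking 81% joint figure is simply 0.90 x 0.90, the product of two independently high base rates.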
The Perfect Conditions Fallacy (7) Especially when we're hurried, we like to assume that "all is well," that in fact "conditions are perfect." If we don't check, we may not discover that the person we're assessing for a job, a custody hearing, a disability claim, a criminal case, asylum status, or a competency hearing took standardized psychological tests and completed other phases of formal assessment under conditions that significantly distorted the results. For example, the person may have forgotten the glasses they need for reading, be suffering from a severe headache or illness, be using a hearing aid that is not functioning well, be taking medication that impairs cognition or perception, have forgotten to take needed psychotropic medication, have experienced a crisis that makes it difficult to concentrate, be in physical pain, or have trouble understanding the language in which the assessment is conducted.
Financial Bias (8) It is a very human error to assume that we are immune to the effects of financial bias. But a financial conflict of interest can subtly -- and sometimes not so subtly -- affect the ways in which we gather, interpret, and present even the most routine data. This principle is reflected in well-established forensic texts and formal guidelines prohibiting liens and any other form of fee that is contingent on the outcome of a case. The Specialty Guidelines for Forensic Psychologists, for example, state: "Forensic psychologists do not provide professional services to parties to a legal proceeding on the basis of 'contingent fees,' when those services involve the offering of expert testimony to a court or administrative body, or when they call upon the psychologist to make affirmations or representations intended to be relied upon by third parties."
Ignoring Effects of Audio-recording, Video-recording, or the Presence of Third-party Observers (9) Empirical research has identified ways in which audio-recording, video-recording, or the presence of third parties can affect the responses (e.g., various aspects of cognitive performance) of people during psychological and neuropsychological assessment. Ignoring these potential effects can create an extremely misleading assessment. Part of adequate preparation for an assessment that will involve recording or the presence of third parties is reviewing the relevant research and professional guidelines.
Uncertain Gatekeeping (10) Psychologists who conduct assessments are gatekeepers of sensitive information that may have profound and lasting effects on the life of the person who was assessed. The gatekeeping responsibilities exist within a complex framework of federal (e.g., HIPAA) and state legislation and case law as well as other relevant regulations, codes, and contexts.
How to Construct a Psychological Test
Here are the basic steps to constructing a useful psychological test:
1) Determine the trait, ability, emotional state, disorder, interest, or attitude that you want to assess. Psychological tests can be created that measure:
- Abilities, such as musical skill, writing skill, intelligence, or reading comprehension
- Personality traits, such as extroversion, creativity, or deviousness
- Disorders, such as anxiety, depression, or psychotic thought disorder
- Emotions, such as happiness and anger
- Attitudes, such as authoritarianism or prejudice
- Interests, such as career-related interests
2) Decide how you want to measure the construct you selected. In general, the best measures sample the behavior of interest. For instance, if you want to determine how aggressive a person is, the best measure would be to provide a frustrating situation and see whether the person reacts aggressively. It's not always practical or ethical to directly measure constructs, so instead, tests rely on a person's self-report of their behavior. A number of other factors need to be considered. Should the test be written, or should it be administered orally? Should the responses be discrete (a rating scale, or Yes/No answers), or should the test allow open-ended answers that can be reliably rated? Should the responses be oral, written, or nonverbal?
3) Does the construct that you want to measure have only one dimension, or can it be broken down into several dimensions? For instance, intelligence is usually considered multi-dimensional, consisting of several different verbal abilities and nonverbal abilities.
4) Once you've made decisions about the factors above, you can begin creating your test items. If the items are measuring a particular area of knowledge, then you will review textbooks or consult subject-matter experts in that area. If you are measuring a personality trait or emotional state, then the items should be consistent with a theory or agreed-upon description of what you are measuring. It's generally best for several experts to generate items.
5) After generating items , it often makes sense to have experts rate the quality of the items, and to retain only the items with the highest ratings. The experts can also suggest revisions. If your items measure depression, the experts should be mental health professionals. If your items measure business skill, your experts should be business executives and managers.
6) Your test is then ready to be tested on a sample of people. Your sample should be a good cross-section of the people that you will want to compare test-takers to. After you administer your test to a sample of people:
- Determine the correlation between each item and the sum of the other items. If your test has subscales, do this separately for each subscale. Eliminate items that do not correlate well with the rest of their scale.
- Eliminate items that are too easy or too hard. If almost everyone agrees with an item or gets the correct answer, it is not a useful item.
- This procedure will maximize the test's internal consistency, one measure of reliability. You should calculate coefficient alpha. This statistic measures the degree to which a test scale measures a single construct, i.e., the degree to which the test items are all measuring the same ability or trait. Alpha has a theoretical maximum of +1.00. A good test alpha is greater than .70.
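The item-analysis steps in step 6 (item-total correlations, then coefficient alpha) can be sketched in plain Python. This is a minimal illustration, not a substitute for a psychometrics package; cronbach_alpha implements the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores):

```python
from statistics import mean, variance

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def item_total_correlations(items):
    """items: one list of scores per item, all over the same respondents.
    Returns each item's correlation with the sum of the OTHER items."""
    out = []
    for i, item in enumerate(items):
        rest_totals = [sum(scores)
                       for scores in zip(*(it for j, it in enumerate(items) if j != i))]
        out.append(pearson(item, rest_totals))
    return out

def cronbach_alpha(items):
    """Coefficient alpha: k/(k-1) * (1 - sum(item variances) / variance(totals))."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - sum(variance(it) for it in items) / variance(totals))
```

Items with low item-total correlations are the candidates for elimination; recomputing alpha after dropping them shows whether internal consistency actually improved.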
7) The final test should be cross-validated on a new sample. During cross-validation, you can demonstrate test validity:
- You should be able to show that your test scores correlate with what they are supposed to correlate with. For instance, a test of math skill should yield higher scores for students with higher math grades. A test of depression should yield higher scores for people who have been diagnosed with Major Depression.
- Factor analysis can be used to demonstrate that the test subscales group together (inter-correlate) in the way that theory would predict.
8) When the test is cross-validated, you can also calculate normative data. You can calculate the mean (average) score for test-takers, and calculate the standard deviation to determine how spread out the scores are around the mean. These statistics are extremely useful, because now any individual's score can be compared to the scores of people in general. If your test has subscales, you will find the mean and standard deviation for each subscale. It is also often useful to find separate normative data for different groups of potential test takers. Many tests have norms according to gender, ethnic group, and age.
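Step 8 amounts to computing a mean and standard deviation from the normative sample and then expressing any new raw score in standard-deviation units (a z-score). The normative scores below are fabricated solely to illustrate the arithmetic:

```python
from statistics import mean, stdev

# Hypothetical normative sample of total test scores (fabricated numbers).
norm_sample = [42, 48, 50, 51, 53, 55, 47, 49, 52, 53]

norm_mean = mean(norm_sample)   # 50.0 for this sample
norm_sd = stdev(norm_sample)    # sample standard deviation

def z_score(raw, m, sd):
    """Locate a raw score in standard-deviation units from the norm mean."""
    return (raw - m) / sd

# Any new test-taker's raw score can now be compared with the norm group:
print(f"mean = {norm_mean:.1f}, sd = {norm_sd:.2f}, "
      f"z(60) = {z_score(60, norm_mean, norm_sd):.2f}")
```

With separate norms per subscale or per group (gender, ethnic group, age), the same calculation is simply repeated against the appropriate reference sample.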