2. Study instruments
• A tool that you use to measure your data
Weighing scales
Sphygmomanometer
Glucometer
Image credits: https://www.flickr.com/photos/johncarney/, https://www.flickr.com/photos/imcomkorea/, https://www.flickr.com/photos/alishav/
3. How do we measure psychological domains?
Intelligence
Happiness
Stress
www.pdclipart.org
4. A questionnaire can be your study instrument
Image credits: https://www.flickr.com/photos/joegratz/, https://www.flickr.com/photos/mdid/
5. What is reliability?
• The ability to give a consistent measurement
Image credit: https://www.flickr.com/photos/johncarney/
Reliable:
Reading | Weight (kg)
1 | 99.0
2 | 98.7
3 | 99.1
4 | 98.8
Not reliable:
Reading | Weight (kg)
1 | 99.0
2 | 95.4
3 | 100.4
4 | 103.1
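One way to put a number on "consistent" is to look at the spread of repeated readings. The sketch below uses the two sets of readings from the tables above; the choice of the sample standard deviation as the measure of spread, and the function name `spread`, are assumptions made for illustration, not something the slides prescribe.

```python
from statistics import mean, stdev

# Readings copied from the two tables above (kg).
reliable_readings = [99.0, 98.7, 99.1, 98.8]
unreliable_readings = [99.0, 95.4, 100.4, 103.1]


def spread(readings):
    """Return (mean, sample standard deviation) of repeated readings."""
    return mean(readings), stdev(readings)


for label, readings in [("Reliable", reliable_readings),
                        ("Not reliable", unreliable_readings)]:
    average, sd = spread(readings)
    print(f"{label}: mean = {average:.2f} kg, SD = {sd:.2f} kg")
```

Both scales give a similar mean, but the second set of readings varies far more from one weighing to the next, which is what makes it unreliable.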
6. Types of reliability
Reliability | Explanation
Test-retest reliability | The ability of the tool to give the same reading for the same subject at two points in time.
Inter-rater agreement | The ability of the tool to give the same reading for the same subject when used by two raters (the people doing the measurement) at the same point in time.
Split-half reliability | The results for a single domain in a questionnaire are split into two halves, which are compared with each other. If they are similar, the items in the test are measuring the same thing. E.g. a domain measuring happiness has 10 questions. The results for 5 questions are compared with the results of the other 5 questions.
Internal consistency | How well the questions in a domain measure the psychological concept they are supposed to measure. E.g. 10 questions to measure happiness, where each question asks about happiness and not other emotions such as excitement, anxiety or anger.
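The split-half idea can be sketched in a few lines. This is a minimal illustration, assuming the 10 happiness items are split into odd-numbered and even-numbered questions and that the two half-scores are compared with a Pearson correlation; the slides do not say how the halves are chosen or compared, and the response data below are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = mean(x), mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / (var_x * var_y) ** 0.5


def split_half_correlation(responses):
    """Correlate the summed scores of two halves of a domain.

    responses: one list of item scores per respondent.
    Assumption: halves are the odd-numbered and even-numbered items.
    """
    first_half = [sum(person[0::2]) for person in responses]
    second_half = [sum(person[1::2]) for person in responses]
    return pearson(first_half, second_half)


# Hypothetical data: 6 respondents x 10 happiness items, each scored 1-5.
happiness = [
    [5, 4, 5, 5, 4, 5, 4, 5, 5, 4],
    [2, 2, 1, 2, 3, 2, 2, 1, 2, 2],
    [3, 3, 4, 3, 3, 4, 3, 3, 3, 4],
    [4, 5, 4, 4, 5, 4, 4, 4, 5, 4],
    [1, 2, 1, 1, 2, 1, 2, 1, 1, 1],
    [3, 2, 3, 3, 2, 3, 3, 2, 3, 3],
]

r = split_half_correlation(happiness)
print(f"Split-half correlation: {r:.2f}")
```

A correlation close to 1 suggests the two halves are measuring the same thing.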
7. Identifying instrument reliability from research articles
Cronbach alpha value | Interpretation
>0.95 | Extremely good reliability, but the scale may have redundant items
>0.7 | Good internal consistency reliability
>0.6 to 0.7 | Fair internal consistency reliability
<0.6 | Poor internal consistency reliability
8. Internal consistency reliability
• Note that the Cronbach alpha should be determined for each domain / subscale in the instrument.
• A poor Cronbach alpha can be due to an insufficient number of questions in the domain OR to the questions in the domain not measuring the same theoretical construct.
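As a rough illustration of computing Cronbach's alpha per domain, the sketch below uses the standard formula alpha = k/(k-1) x (1 - sum of item variances / variance of total scores) and then applies the interpretation bands from the table on the previous slide. The domain names and response data are hypothetical, and treating a value of exactly 0.6 as "fair" is an assumption, since the table leaves that boundary unspecified.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one domain.

    responses: one list of item scores per respondent (same items each).
    """
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # scores grouped by item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(person) for person in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def interpret_alpha(alpha):
    """Map an alpha value to the bands in the slide's table."""
    if alpha > 0.95:
        return "Extremely good reliability, but may have redundant items"
    if alpha > 0.7:
        return "Good internal consistency reliability"
    if alpha >= 0.6:  # assumption: exactly 0.6 is treated as fair
        return "Fair internal consistency reliability"
    return "Poor internal consistency reliability"


# Hypothetical instrument with two domains, 5 respondents x 4 items each.
domains = {
    "stress": [[4, 5, 4, 5], [2, 2, 3, 2], [3, 3, 3, 4],
               [5, 4, 5, 5], [1, 2, 1, 2]],
    "mixed_items": [[3, 1, 4, 2], [2, 4, 1, 3], [4, 2, 3, 1],
                    [1, 3, 2, 4], [3, 3, 3, 3]],
}

for name, responses in domains.items():
    alpha = cronbach_alpha(responses)
    print(f"{name}: alpha = {alpha:.2f} ({interpret_alpha(alpha)})")
```

The second domain is deliberately built from items that do not move together, so its alpha comes out poor, mirroring the second bullet above.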
9. Identifying instrument reliability from research articles
(Balogun, 2010)
Test-retest reliability can be determined via Cohen's kappa (which measures agreement), correlation, or a non-significant change in the mean.
Good or acceptable reliability should have kappa or correlation coefficients of 0.7 and above.
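For categorical results, Cohen's kappa compares the observed agreement between test and retest with the agreement expected by chance: kappa = (observed - expected) / (1 - expected). The following is a minimal sketch with hypothetical data (ten respondents classified as "high" or "low" on two occasions); the function and variable names are illustrative.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Cohen's kappa for two sets of categorical ratings of the same subjects."""
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    counts_first, counts_second = Counter(first), Counter(second)
    categories = counts_first.keys() | counts_second.keys()
    # Chance agreement: product of the marginal proportions per category.
    expected = sum(counts_first[c] * counts_second[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical test and retest classifications for the same 10 respondents.
test_1 = ["high", "high", "low", "low", "high", "low", "high", "low", "low", "high"]
test_2 = ["high", "high", "low", "low", "high", "low", "low", "low", "low", "high"]

kappa = cohens_kappa(test_1, test_2)
print(f"Cohen's kappa = {kappa:.2f}")
print("Acceptable (0.7 and above)" if kappa >= 0.7 else "Below 0.7")
```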
10. What is validity?
• An instrument that is valid measures what it is supposed to measure.
Weight
Blood pressure
Capillary blood glucose
Image credits: https://www.flickr.com/photos/johncarney/, https://www.flickr.com/photos/imcomkorea/, https://www.flickr.com/photos/alishav/
11. What is validity?
• Using an instrument that is not valid gives you results that do not reflect your intended objective!
A weighing scale cannot tell you the height of the people in your sample.
Image credit: https://www.flickr.com/photos/johncarney/
12. What is validity?
• Similarly, a questionnaire that is not valid cannot give you a measure of a psychological construct!
Valid to measure QoL
Invalid to measure QoL
Quality of Life
Image credit: https://www.flickr.com/photos/carlosmateus/
13. How is reliability different from validity?
• Reliable but not valid
• Reliable and valid
14. How is reliability different from validity?
• Valid but not reliable?
When a tool is not reliable, it is not advisable to use it as your study instrument because the results would be inconsistent.
16. Types of validity
• Content validity: The degree to which the content of an instrument is an adequate reflection of the construct to be measured. [1]
• Determined via focus group discussion by a panel of experts, to ensure that all the aspects have been included and are appropriately grouped. [2]
1. Mokkink, 2010
2. Patrick DL, Burke LB, Gwaltney CJ, et al. Content validity - establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO. Value in Health 2011, 14:967-977
17. Example of content validation
Yesavage JA, Rose TL, Lum O, et al. Development and validation of a geriatric depression screening scale: a preliminary report. J Psychiatr Res 1983, 17:37-49
18. Types of validity
• Face validity: The degree to which an instrument looks as though it is an adequate reflection of the construct to be measured. [1]
• Can be determined during pre-testing / cognitive debriefing.
1. Mokkink LB, Terwee CB, Patrick DL, et al. The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. J Clin Epidemiol 2010, 63:737-745
19. Example of face validity
• A questionnaire to test knowledge regarding URTI is given to the sample population. The respondents' understanding of the questionnaire is tested and clarified, and they feel that the questions do test their knowledge about URTI.
20. Types of validity
• Construct validity: The degree to which the scores of an instrument are consistent with hypotheses.
• Consists of:
  - Structural validity
  - Hypothesis testing
  - Cross-cultural validity
Mokkink 2010.
21. Construct validity
• Structural validity: how well the instrument reflects the dimensions of the construct. May use factor analysis.
• Hypotheses testing: testing hypotheses derived from theory. May include known-groups or contrasting-groups testing.
• Cross-cultural validity: how well the translated or culturally adapted instrument reflects the original version.
Mokkink 2010.
22. Structural validity
Items that belong to the same domain should theoretically correlate well with one another (convergent validity).
Items that do not belong to the same domain should not correlate well (divergent validity).
Factor analysis can be used to group the items into their respective domains.
23. Hypotheses testing
The tool should be consistent with known theories.
e.g. It is expected that quality of life will be poorer in those with
back pain versus those without. Therefore, a significant
difference in the scores for a tool measuring quality of life shows
that it is consistent with the theory.
Nagammai T, Mohazmi M, Liew SM, et al. Validation of the Malay version of the Quality of Life Questionnaire of the European Foundation for Osteoporosis (QUALEFFO-41) in Malaysia. Qual Life Res 2015; DOI: 10.1007/s11136-015-0933-7
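A known-groups comparison like the back-pain example can be sketched as follows. This is only an illustration: the scores are hypothetical (on an assumed scale where higher means better quality of life), and a permutation test on the difference in means is swapped in here simply because it needs nothing beyond the standard library; a study would more commonly report a t-test or similar from a statistics package.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns an approximate p-value: the share of random regroupings whose
    mean difference is at least as large as the one observed.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    size_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        difference = abs(mean(pooled[:size_a]) - mean(pooled[size_a:]))
        if difference >= observed:
            at_least_as_extreme += 1
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical quality-of-life scores (higher = better quality of life).
with_back_pain = [52, 48, 55, 60, 45, 50, 58, 47]
without_back_pain = [70, 75, 68, 80, 72, 66, 78, 74]

p_value = permutation_test(with_back_pain, without_back_pain)
print(f"Mean with back pain: {mean(with_back_pain):.1f}")
print(f"Mean without back pain: {mean(without_back_pain):.1f}")
print(f"Approximate p-value: {p_value:.4f}")
```

A small p-value together with a lower mean in the back-pain group would be consistent with the theory, which supports the tool's construct validity.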
24. Cross-cultural validity
• Questionnaires from overseas should undergo linguistic validation and cross-cultural adaptation prior to use in the local population.
• Certain terms may be unfamiliar to the local population or interpreted differently by them.
• E.g. "I am feeling sick." → "Saya berasa sakit." (literally "I feel ill / in pain")
• (Intended meaning: "Saya berasa loya", i.e. "I feel nauseous")
• This leads to incorrect results when the tool is used.
Sousa VD, Rojjanasrirat W. Translation, adaptation and validation of instruments or scales for use in cross-cultural healthcare research: a clear and user-friendly guideline. J Eval Clin Pract 2011, 17:268-74
25. Types of validity
• Criterion validity: The degree to which the scores of an instrument adequately reflect a gold standard.
• When no gold standard exists, a similar tool can be used as the comparison (concurrent validity).
• The tool is tested for its ability to predict the results of the "gold standard".
26. Criterion validity
• Example: Comparing the FRAX scores with actual BMD results.
• Use of the area under the receiver operating characteristic curve (AUC-ROC) to determine the best cut-off score
• Calculating the sensitivity, specificity, positive predictive value, and negative predictive value
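The four accuracy measures in the last bullet all come from a 2x2 table of tool result against gold standard. The sketch below is a minimal illustration with hypothetical scores and gold-standard results; the cut-off of 15 and the rule "score at or above the cut-off counts as positive" are assumptions for the example, not FRAX guidance.

```python
def confusion_counts(scores, has_condition, cutoff):
    """Count true/false positives and negatives at a given cut-off.

    Assumption: a score at or above the cut-off is a positive result.
    """
    tp = fp = fn = tn = 0
    for score, truth in zip(scores, has_condition):
        positive = score >= cutoff
        if positive and truth:
            tp += 1
        elif positive and not truth:
            fp += 1
        elif not positive and truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def diagnostic_accuracy(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),   # positives found among true cases
        "specificity": tn / (tn + fp),   # negatives found among non-cases
        "ppv": tp / (tp + fp),           # true cases among positive results
        "npv": tn / (tn + fn),           # non-cases among negative results
    }


# Hypothetical tool scores and gold-standard results for 8 people.
scores = [5, 8, 12, 15, 20, 22, 25, 30]
gold_standard = [False, False, False, True, False, True, True, True]

counts = confusion_counts(scores, gold_standard, cutoff=15)
for name, value in diagnostic_accuracy(*counts).items():
    print(f"{name}: {value:.2f}")
```

Repeating the calculation over a range of cut-offs gives the points of the ROC curve, from which a cut-off can be chosen.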
27. How to validate a questionnaire
Dr Tan Chai Eng
Image from www.expertbriefings.com
28. Objectives
• What type of validation do I need to do?
• The process of content validation
• The process of linguistic validation
• The process of using exploratory factor analysis for determining construct validity
29. Types of validation
• Content validation?
  - Newly developed questionnaire / survey
• Linguistic validation?
  - Questionnaire in non-local language
• Construct validation by exploratory factor analysis?
  - Checking whether the tool measures factors consistent with theoretical concepts
• Criterion validation?
  - Can it correctly predict / discriminate conditions?
30. Content Validation
• Identify topic of interest
• Gather a panel of relevant experts
• Brainstorm items / conduct a qualitative study to provide items related to the topic of interest
Vet the items proposed in detail:
• Are the contents relevant?
• Are the items representative of the many aspects of the construct to be measured?
31. Why content validation?
• Before you proceed to construct validation, you need to make sure that you have covered all aspects that could potentially be related to the topic of interest!
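34. Linguistic validation
• Saya berasa susah untuk menurunkan angin (literally: "I find it hard to bring down the wind")
• Saya sedar tentang kekeringan mulut (literally: "I am aware of dryness of the mouth")
• Saya nampaknya tidak dapat mengalami apa-apa perasaan yang positif langsung (literally: "I seem unable to experience any positive feelings at all")
DASS 21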
35. Linguistic Validation
• Kesal atau marah di atas benda-benda kecil ("Upset or angry over small things")
• Mulut terasa kering ("Mouth feels dry")
• Tidak dapat mengalami perasaan positif ("Unable to experience positive feelings")
DASS 21
Malay version of DASS 21 by Dr Ramli Musa
36. Linguistic Validation
• If linguistic validation is not done, respondents may interpret the items differently and give inaccurate responses.
• Even English versions need to be adapted and validated locally to ensure conceptual equivalence and acceptability.
37. Process of linguistic validation
• If any items are inappropriately worded, the process needs to be repeated until an acceptable version is achieved.
• Desired outcome: a version that is conceptually equivalent and suitable for the local setting.
38. Construct validation
• Exploratory factor analysis
• Confirmatory factor analysis
• Rasch model analysis
• Structural equation modelling
• Contrast analysis
Will not be discussed in detail during this lecture
39. Exploratory factor analysis
• Data reduction, can be done using SPSS
• Explores the underlying factor structure
• Determines the correlation between each item in the questionnaire
• Items that belong to the same factor should be highly correlated with each other
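The starting point for factor analysis is the matrix of correlations between items. The sketch below builds that matrix by hand for a hypothetical 4-item questionnaire in which items 1-2 are meant to belong to one factor and items 3-4 to another; the data and function names are made up for illustration, and in practice SPSS or another statistics package produces this matrix for you.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = mean(x), mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / (var_x * var_y) ** 0.5


def correlation_matrix(responses):
    """Inter-item correlation matrix; responses is one row per respondent."""
    items = list(zip(*responses))
    return [[pearson(a, b) for b in items] for a in items]


# Hypothetical data: 6 respondents x 4 items, each scored 1-5.
responses = [
    [5, 4, 4, 5],
    [4, 5, 1, 1],
    [2, 1, 1, 2],
    [1, 2, 4, 4],
    [4, 4, 4, 4],
    [2, 2, 4, 5],
]

for row in correlation_matrix(responses):
    print("  ".join(f"{value:6.2f}" for value in row))
```

In this made-up data, items 1 and 2 correlate strongly with each other, as do items 3 and 4, while correlations across the two pairs are weak: the pattern expected when there are two underlying factors.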
40. Exploratory factor analysis
Administer tool to a wide variety of samples
• Heterogeneous sample is desirable.
• Not generalisable.
• Purposive sampling
Test for suitability for principal components analysis
• Kaiser-Meyer-Olkin > 0.6
• Bartlett's test of sphericity: significant
Determine number of factors to extract
• Monte Carlo parallel analysis
• Kaiser's criterion
• Scree plot
41. Exploratory factor analysis
Run principal components analysis based on the pre-determined number of factors
• Choice of rotation depends on whether the factors are theoretically correlated or not.
Interpret the factor structure
• A high factor loading means the item is correlated with the factor.
Name the factors, test internal consistency of each factor
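Kaiser's criterion, one of the options listed for deciding how many factors to extract, keeps only the components whose eigenvalue of the item correlation matrix is greater than 1. The sketch below illustrates this on a hypothetical 4-item correlation matrix. To stay within the standard library it finds the eigenvalues with a small Jacobi rotation routine written for this example; that routine is an assumption of the sketch, and statistical software such as SPSS reports the eigenvalues directly.

```python
import math


def jacobi_eigenvalues(matrix, max_sweeps=50, tolerance=1e-12):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off_diagonal = sum(a[i][j] ** 2
                           for i in range(n) for j in range(n) if i != j)
        if off_diagonal < tolerance:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) element.
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):      # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):      # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def kaiser_criterion(correlations):
    """Number of components with eigenvalue greater than 1."""
    return sum(1 for value in jacobi_eigenvalues(correlations) if value > 1)


# Hypothetical correlation matrix: items 1-2 and items 3-4 form two clusters.
correlations = [
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.6],
    [0.1, 0.1, 0.6, 1.0],
]

eigenvalues = jacobi_eigenvalues(correlations)
print("Eigenvalues:", [round(value, 3) for value in eigenvalues])
print("Factors to retain (Kaiser's criterion):", kaiser_criterion(correlations))
```

Plotting the same eigenvalues in descending order gives the scree plot.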
43. Exploratory factor analysis
• May reveal that the constructs measured by the questionnaire are different from those of the original questionnaire
• Confirmatory factor analysis can be used to test how well the items fit into the proposed structure
• Ideally the constructs should be similar to the original to allow cross-cultural comparison
44. Exploratory factor analysis
Study | Factors
Parker et al (original MPP, US) | 3 factors: Content; Facilitation; Support
Chiu et al (Singapore) | 2 factors: Content and facilitation; Support
Fujimori et al (Japan) | 5 factors: Emotional support; Medical information; Clear explanation; Encouraging question asking; Setting
Mauri et al (Italy) | 3 factors: Information; Support; Care
Tan et al (Malaysia) | 3 factors: Content and facilitation; Emotional support; Structural and informational support
45. Criterion validation
• Predictive validity: the ability of the scores from the new measure to predict performance on a criterion measure administered at a later time.
  - MoCA vs neuropsychological testing (gold standard) in frontotemporal dementia
Freitas et al, 2012
Mislevy, J (2010). doi: http://dx.doi.org/10.4135/9781412961288.n67
46. Criterion validation
• Concurrent validity: the extent to which scores on a new measure are related to scores from a criterion measure administered at the same time.
  - QoL in obesity vs SF-36
Moorehead et al, 2003
Mislevy, J (2010). doi: http://dx.doi.org/10.4135/9781412961288.n67
48. Can I use English questionnaires in my study?
• Can your respondents understand English?
• Are the questionnaire items worded in a way that is suitable to the local culture/language?
  - E.g. toilet vs loo vs restroom vs washroom
• Pre-test among the local population and check their understanding of the wording in the questionnaire.
49. Can I translate the questionnaires to Malay myself?
• You need to translate in a way that retains semantic and content equivalence
• Back-to-back translation
• Pilot-test, check reliability of the scales within the questionnaire
• Best practice: validate the questionnaire locally
50. Can I verbally translate the questionnaires to Malay myself?
• It is difficult to present standardized questions to respondents: you may get different responses depending on what you say at that point in time
• Affects the validity of your study results
• Not suitable for self-administered questionnaires, especially where sensitive information is requested
51. Permissions
• Development of a good questionnaire for research takes a lot of work from a team of researchers.
• Ethically, you need permission from the corresponding author who developed the questionnaire.
• Public domain questionnaires do not require permission, e.g. DASS21
52. Permissions
• For validated Malay-language versions, you need permission from the researcher who validated the questionnaire locally, e.g. Malay version of DASS21: Prof Ramli Musa
• The copyright holder may be a company, e.g. QualityMetric, and you may need to pay for a licence to use the questionnaire. Include this in your proposal budget for funding.
53. Any further questions?
• You may want to consult your supervisor.
• You may also ask me via email at tce@ppukm.ukm.edu.my