Validating Wordscores

Bastiaan Bruinsma, Kostas Gemenis
Universiteit Twente
5th EPSA General Conference, Vienna, 25-27 June 2015

Computer assisted methods for text analysis
Fig. 1: An overview of text as data methods.
Source: Justin Grimmer and Brandon M. Stewart

Wordscores
Originally proposed by Laver, Benoit & Garry (2003)
Popular tool (869 citations on Google Scholar)
Developed for political manifestos, but also used to study
party mergers, electoral coalitions, policy preferences,
speeches, reports from US state lotteries, Chinese newspaper
articles, public statements by US Senators, open-ended
questions ...
Attempts at validation are rather limited

How Wordscores Works
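As a rough illustration of the procedure proposed by Laver, Benoit & Garry (2003), the sketch below scores each word by the reference texts it occurs in, then scores a virgin text as the frequency-weighted mean of its word scores. The toy texts, the positions, and the function names are assumptions made for illustration; they are not the authors' data or code.

```python
from collections import Counter


def relative_freqs(text):
    """Relative frequency of each word in a whitespace-tokenised text."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def word_scores(reference_texts, reference_positions):
    """Score each word as the expected position of the reference text it came from."""
    freqs = {name: relative_freqs(t) for name, t in reference_texts.items()}
    vocab = set().union(*freqs.values())
    scores = {}
    for w in vocab:
        total = sum(f.get(w, 0.0) for f in freqs.values())
        # freqs[r][w] / total is P(r | w): the probability of reading
        # reference text r, given that we are reading word w.
        scores[w] = sum(
            (freqs[r].get(w, 0.0) / total) * reference_positions[r]
            for r in freqs
        )
    return scores


def score_virgin_text(text, scores):
    """Raw text score: mean word score, weighted by frequency of scored words."""
    counts = Counter(w for w in text.lower().split() if w in scores)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("virgin text shares no words with the reference texts")
    return sum((c / total) * scores[w] for w, c in counts.items())


# Hypothetical toy data: two reference texts with assumed positions 0 and 20.
reference_texts = {
    "left": "tax tax welfare welfare union",
    "right": "market market tax freedom freedom",
}
reference_positions = {"left": 0.0, "right": 20.0}

scores = word_scores(reference_texts, reference_positions)
raw = score_virgin_text("tax welfare market market", scores)
```

Words shared by both reference texts, such as "tax" here, receive scores between the two reference positions, so raw virgin-text scores tend to be pulled toward the middle of the scale. This is why the raw scores are usually rescaled by a transformation before being compared with the reference scores.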

Previous attempts at validation
Mostly against CMP data, though Benoit & Laver (2007)
advise against this
Only assess criterion validity
Only assess ordinal placement (Hjorth et al. 2015)
Only use Spearman’s ρ or Pearson’s r (and thus no
assessment of systematic measurement error)

Replication of the original Laver et al. article
Table 1: Replication of the original scores
[Table residue. Layout: Stata version (0.36; 23-Jun-2009) by number of parties (5 parties; 7 parties). Each cell plots party positions (DL, Labour, FF, FG, PD; plus SF and Greens with 7 parties) on the EC and SO dimensions, on a 0 to 20 scale. Sources: Laver et al. (2003); Laver et al. (2003) Replication Material.]

Hjorth et al. validation
[Figure residue: one panel per election year (1945, 1950, 1953, 1957, 1960, 1964, 1966, 1968, 1971, 1973, 1977, 1979, 1981, 1984, 1987, 1988, 1990, 1994, 1998, 2001, 2005, 2007); labels ws_rank and exp, each running from low to high.]

Study Design
Documents
Using 2004 Euromanifestos to score 2009 Euromanifestos
Euromanifestos obtained from the Manifesto Project Database
Reference scores
Chapel Hill Expert Study (2002), Benoit & Laver Expert
Survey (2003-2004), Euromanifestos Project (2004)
Comparison
Chapel Hill Expert Study (2010), EU Profiler (2009),
Euromanifestos Project (2009)
Analysis
Use Lin’s Concordance Correlation Coefficient instead of
Spearman’s ρ or Pearson’s r
25 countries/territories × 4 dimensions × 3 reference scores × 2
transformations = 600 analyses
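To illustrate why the choice of coefficient matters, the sketch below compares Pearson's r with Lin's concordance correlation coefficient on made-up scores in which one measure is shifted by a constant. The data and function names are hypothetical; the formula is the standard one, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), using population (1/n) moments.

```python
from math import sqrt
from statistics import fmean


def _moments(x, y):
    """Means, population variances, and population covariance of two samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = fmean(x), fmean(y)
    vx = fmean([(a - mx) ** 2 for a in x])
    vy = fmean([(b - my) ** 2 for b in y])
    cov = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return mx, my, vx, vy, cov


def pearson_r(x, y):
    """Pearson's r: linear association, blind to shifts and rescaling."""
    _, _, vx, vy, cov = _moments(x, y)
    return cov / sqrt(vx * vy)


def lins_ccc(x, y):
    """Lin's CCC: agreement with the 45-degree line, penalising bias."""
    mx, my, vx, vy, cov = _moments(x, y)
    return 2 * cov / (vx + vy + (mx - my) ** 2)


# Hypothetical scores: the second measure is the first plus a constant bias.
estimated = [1.0, 2.0, 3.0, 4.0, 5.0]
benchmark = [a + 5.0 for a in estimated]

r = pearson_r(estimated, benchmark)
ccc = lins_ccc(estimated, benchmark)
```

In this toy case Pearson's r is 1 because the two series are perfectly linearly related, while the concordance coefficient is 4/29 (about 0.14) because every estimate is off by the same amount. A systematic measurement error of this kind is invisible to Spearman's ρ and Pearson's r.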

Types of validity
Following Carmines & Zeller (1979):
Content Validity
Does the method represent all facets of a construct?
Construct Validity
Does the method correlate with other measures reflecting the
same concept?
Criterion Validity
Does the method behave as expected within a given theoretical
context?

Content validity for EU Integration
[Figure residue: density of mean word relevance (0 to 1), one panel per party: BNP, Conservatives, Greens, Labour, LibDem, PC, SNP, UKIP, and Total.]

Construct validity
[Figure residue: McFadden's R squared and count R squared (0 to 1) by transformation (LBG, MV), for reference scores from BL, CHES, and EMP.]

Criterion validity
[Figure residue: EU Integration dimension. Concordance correlation coefficient (0 to 1) compared to CHES, EUP, and EMP, for reference scores from BL, CHES, and EMP, in four panels: LBG transformation with per-country rescaling; LBG transformation with whole-dimension rescaling; MV transformation with per-country rescaling; MV transformation with whole-dimension rescaling.]

Conclusion
No serious validation of Wordscores until now
This validation finds it lacking in content, construct and
criterion validity
Wordscores should not be used to estimate parties’ policy
positions when electoral manifestos serve as both reference
and virgin texts

Outlook
Wordscores might still be useful in other applications where
the assumptions of ideal point estimation for words might be
approximated
However, a case-by-case validation should be applied
