WPA Opinion Research
Predictive Analytics Capabilities
CEO & Partner
We are making predictions about the behavior of
In particular, we are predicting behavior on four key
•Initial Likelihood to Vote
•Informed Likelihood to Vote
We build each of these predictions using
demographic, behavioral, and consumer information.
How are we doing this?
We use regression models, specifically an Ordered Stepwise
Probit model, to make these predictions.
•Starts with all available information
•Narrows down the available variables to only those that are useful
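For reference, the simplest (binary) form of a probit model can be written as below. The notation is an assumption introduced here for illustration and does not appear in the slides: x_i stands for one voter's variables, beta for the fitted coefficients, and Phi for the standard normal cumulative distribution function. Because Phi always returns a value between 0 and 1, the model output can be read as a probability.

```latex
P(y_i = 1 \mid x_i) = \Phi\!\left(x_i^{\top}\beta\right),
\qquad
\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-t^{2}/2}\, dt
```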
We start with 200+ variables covering demographics,
consumer information, and behavioral information.
•Demographic: County, Age, Gender, Veteran Status, Head of Household,
Presence of Children, Religion, etc.
•Behavior: Art Collectors, Read Science Fiction books, Interest in camping
or hiking, Follow current affairs/politics, Golfers, Investors, Interest in
•Consumer: Purchased craft products, Have American Express card, Buy
electronics like computers, Purchased clothing (women’s, men’s,
Types of Variables
These are the types of fields available to us
as variables for running predictive analytics.
How the Variables
All of the variables we use are necessary.
•They give us a very detailed picture of the voters.
•They allow us to accurately predict the behavior of other voters who
share similar characteristics.
•Other data from the campaign can be used if there is sufficient
data (e.g., donors, volunteers, etc.)
Steps in the Model:
•Use every variable we have to build an initial model based on the
campaign’s observed data to predict the behavior under study.
•Use an Ordered Stepwise Probit model to narrow down all the
variables by running a series of regressions to determine which
variables are the most effective at predicting the behavior.
•The finalized model is applied back to the full list of voters to
calculate a prediction for each voter.
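The three steps above can be sketched roughly as follows. This is an illustrative stand-in, not the production procedure: it assumes a plain binary probit fitted by gradient ascent, backward elimination scored by AIC, and synthetic data. The variable names (age_over_50, veteran, golfer, has_amex), the sample sizes, and the selection criterion are all hypothetical, and the "ordered" aspect of the model is not reproduced.

```python
import math
import random

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def design(row, cols):
    # Intercept plus the chosen variables for one voter.
    return [1.0] + [row[c] for c in cols]

def fit_probit(rows, y, cols, iters=300, lr=0.2):
    """Fit a binary probit by gradient ascent; return (beta, log-likelihood)."""
    X = [design(r, cols) for r in rows]
    beta = [0.0] * (len(cols) + 1)
    for _ in range(iters):
        grad = [0.0] * len(beta)
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            p = min(max(norm_cdf(eta), 1e-9), 1.0 - 1e-9)
            w = (yi - p) * norm_pdf(eta) / (p * (1.0 - p))
            for j, v in enumerate(xi):
                grad[j] += w * v
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    loglik = 0.0
    for xi, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, xi))
        p = min(max(norm_cdf(eta), 1e-9), 1.0 - 1e-9)
        loglik += math.log(p) if yi else math.log(1.0 - p)
    return beta, loglik

def aic(loglik, n_params):
    return 2.0 * n_params - 2.0 * loglik

def backward_stepwise(rows, y, cols):
    """Step 1: start with every variable. Step 2: drop variables while AIC improves."""
    selected = list(cols)
    beta, ll = fit_probit(rows, y, selected)
    best = aic(ll, len(beta))
    while selected:
        trials = []
        for c in selected:
            reduced = [s for s in selected if s != c]
            b, l = fit_probit(rows, y, reduced)
            trials.append((aic(l, len(b)), reduced, b))
        trial_aic, reduced, b = min(trials, key=lambda t: t[0])
        if trial_aic >= best:
            break  # no single removal improves the model any further
        selected, beta, best = reduced, b, trial_aic
    return selected, beta

# Synthetic voter file: only age_over_50 truly drives the behavior here.
random.seed(0)
VARIABLES = ["age_over_50", "veteran", "golfer", "has_amex"]
voters = [{v: float(random.random() < 0.5) for v in VARIABLES} for _ in range(500)]
observed = voters[:200]  # voters for whom the campaign has observed data
y = [int(random.random() < norm_cdf(-0.8 + 1.6 * r["age_over_50"])) for r in observed]

selected, beta = backward_stepwise(observed, y, VARIABLES)

# Step 3: apply the final model back to the full list of voters.
scores = [norm_cdf(sum(b * v for b, v in zip(beta, design(r, selected)))) for r in voters]
print(selected, [round(s, 2) for s in scores[:5]])
```

In practice a statistics package would replace the hand-written fitting loop; it is written out here only so the sketch runs without dependencies.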
The data going back to the campaign
is a prediction of each voter’s
•Every voter on the file has a number from 0
(zero) to 1 (one).
•That number is their prediction score for that
•The higher the value, the more likely that
individual voter is to perform that particular
Using the Data:
Likelihood to Vote for the Candidate
Voter #1 0.8
Voter #2 0.2
Here, Voter #1 is much more
likely to vote for the candidate
than Voter #2 is.
Change on Vote
Voter #3 0.8 to 0.85
Voter #4 0.6 to 0.75
Here, Voter #3 is already very likely to vote for the
candidate and probably does not need additional
contact. Voter #4, though, can be moved from
“persuadable” to “likely” and should be contacted.
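One way to turn this comparison into a contact list is sketched below. The 0.7 cutoff separating "persuadable" from "likely" is a hypothetical value chosen only so that it is consistent with the two example voters; the slides do not state a threshold.

```python
LIKELY_CUTOFF = 0.7  # hypothetical boundary between "persuadable" and "likely"

def needs_contact(initial, informed, cutoff=LIKELY_CUTOFF):
    """True when the voter starts below the cutoff and ends at or above it."""
    return initial < cutoff <= informed

# (initial score, informed score) taken from the example above
voters = {"Voter #3": (0.80, 0.85), "Voter #4": (0.60, 0.75)}

contact_list = [name for name, (initial, informed) in voters.items()
                if needs_contact(initial, informed)]
print(contact_list)
```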
Vote for the
Voter #5 0.8 to 0.85 0.2
Voter #6 0.8 to 0.85 0.8
Here, both voters are very likely to vote
for the candidate…if they vote. The
campaign should focus on convincing
Voter #5 of the importance of
voting, and Voter #6 should be
targeted in GOTV efforts.
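Combining the two scores can be sketched as a simple rule. The cutoffs (0.7 for support, 0.5 for turnout) and the action labels are assumptions made for illustration, not values from the slides.

```python
def next_step(support, turnout, support_cut=0.7, turnout_cut=0.5):
    """Map a (support, turnout) score pair to a suggested action."""
    if support < support_cut:
        return "persuasion"         # not yet a likely supporter
    if turnout < turnout_cut:
        return "turnout messaging"  # supporter who may stay home
    return "GOTV"                   # supporter who is likely to vote

# (informed support score, likelihood-to-vote score) from the example above
voters = {"Voter #5": (0.85, 0.2), "Voter #6": (0.85, 0.8)}

actions = {name: next_step(s, t) for name, (s, t) in voters.items()}
print(actions)
```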
The data can be used across the campaign for a
variety of purposes:
•Door to Door Walkers: May be more concerned with convincing
people to vote for the candidate, so they will want to
focus on “persuadable” voters identified by the initial model who
had larger movements toward the candidate on the informed
•Fundraisers: May be more interested in voters who are already
likely to support the candidate. They won’t have to convince them
why he is the right candidate – they can focus on why and how
their donations will help re-elect the candidate.
•Ad Buyers: May be interested in the demographic makeup of
“persuadable” voters so they can place the right buys. The
individual level data can be aggregated back to the population to
help identify the best “groups” of voters to target.
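Aggregating individual scores back to groups, as described for ad buyers, might look like the sketch below. The voters, the age_group field, and the 0.4 to 0.7 "persuadable" band are all made up for illustration.

```python
from collections import defaultdict

PERSUADABLE = (0.4, 0.7)  # hypothetical score band

voters = [
    {"age_group": "18-34", "score": 0.55},
    {"age_group": "18-34", "score": 0.45},
    {"age_group": "35-64", "score": 0.85},
    {"age_group": "35-64", "score": 0.50},
    {"age_group": "65+", "score": 0.15},
    {"age_group": "65+", "score": 0.90},
]

counts = defaultdict(lambda: [0, 0])  # group -> [persuadable count, total count]
for v in voters:
    tally = counts[v["age_group"]]
    tally[0] += PERSUADABLE[0] <= v["score"] < PERSUADABLE[1]
    tally[1] += 1

# Share of each group that falls in the persuadable band, highest first.
share = {group: p / n for group, (p, n) in counts.items()}
ranked = sorted(share, key=share.get, reverse=True)
print(ranked, share)
```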
The data can even be analyzed across models.
•Did more voters move toward the candidate on model #2
which was run a month after model #1?
•Who are these voters and what theories do we have about
why they are moving toward or away from the candidate?
•How can we continue the momentum or turn back the tide?
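A comparison across two model runs can be sketched by matching voters on an ID and looking at the change in score. The voter IDs and scores below are invented for illustration.

```python
# Scores keyed by a hypothetical voter ID; model #2 was run after model #1.
model_1 = {"V001": 0.40, "V002": 0.70, "V003": 0.55, "V004": 0.30}
model_2 = {"V001": 0.55, "V002": 0.65, "V003": 0.55, "V005": 0.80}

# Only voters scored in both runs can be compared.
shared = sorted(model_1.keys() & model_2.keys())
movement = {v: round(model_2[v] - model_1[v], 2) for v in shared}

toward = [v for v in shared if movement[v] > 0]  # moved toward the candidate
away = [v for v in shared if movement[v] < 0]    # moved away from the candidate
print(movement, toward, away)
```

The toward and away lists are the starting point for asking who these voters are and why they moved.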
All of this data can be uploaded to the
•This gives every department access to the data so they can
pull lists of voters they care about rather than trying to wade
through the voters on the file.
Additional Uses for
This type of analysis is not limited to vote behavior. It
can be used to predict any behavior the campaign is
•So long as there is enough data from the campaign about a behavior
they are interested in, we can model it.
•Donations: Based on who has already donated to the campaign,
who else is likely to donate?
•Volunteers: Based on who is already volunteering, who else
could be recruited to volunteer?
•Yard Signs: Based on who has requested a yard sign, who else
might be interested in displaying a sign if asked?
•Events: Based on who has attended previous campaign events,
which voters might be likely to attend a campaign event? Which of
these voters are also likely donors or volunteers?
How we will interact
with the Data
How will we interact with the Campaign?
•The campaign will send us the information they have
for the behavior they are interested in modeling.
•We will clean the data and add all of the
demographic, behavior, and consumer information
obtained from Aristotle.
•We will run our models to predict the behavior.
•After the predictions are complete, we will send the
predictions back to the campaign along with each
voter's Voter ID number to upload back to their