Transcript of a BriefingsDirect podcast on how a political campaign used big data to better understand and predict voter behavior and what was going on on the ground during the 2012 national elections.
Democratic National Committee Leverages Big Data to Turn Politics into Political Science
Listen to the podcast. Find it on iTunes. Sponsor: HP
Dana Gardner: Hello, and welcome to the next edition of the HP Discover Performance
Podcast Series. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your moderator for
this ongoing discussion of IT innovation and how it’s making an impact on people’s lives.
Once again, we’re focusing on how IT leaders are improving their business performance for
better access, use and analysis of their data and information. This time, we're
coming to you directly from the HP Vertica Big Data Conference in Boston.
[Disclosure: HP is a sponsor of BriefingsDirect podcasts.]
Our next innovation case study interview focuses on the big-data problem in the
realm of political science. We'll learn how the Democratic National Committee
(DNC) leveraged big data to better understand and predict voter behavior and
alliances in the 2012 elections.
To learn more about how the DNC pulled vast amounts of data together to predict and
understand a much bigger pie in terms of people's preferences and understanding of politics and
issues, please join me in welcoming Chris Wegrzyn. He is the Director of Data Architecture at
the DNC, based in Washington. Welcome, Chris.
Chris Wegrzyn: Hello. Thank you for having me.
Gardner: We're glad you're here. Like a lot of organizations, you had different silos of data and
information and you perhaps weren't able to do the analysis properly because of the distributed
nature of the data and information. What did you do that allowed you to bring data together and
then also start to get towards that goal of getting all the data involved to bring out a better
Wegrzyn: In 2008, we got a lot of recognition at that time for being a data-driven campaign and
making some great leaps in how we improved efficiency by understanding our organization.
Coming out of that, those of us on the inside were saying this was great, but we have only really
skimmed the surface of what we can do. We focused on some sets of data, but they're not
connected to what people were doing on our website, what people were doing on social media, or
what our donors were doing. There were all of these different things, and we weren’t looking at
Really, we couldn’t look at them. We didn't have the staff structure, but we also didn't have the
technology platform. It’s hard to integrate data and do it in a way that is going to give people
reasonable performance. That wasn't available to us in 2008.
So, fast forward to where we were preparing for 2012. We knew that we wanted to be able to
look across the organization, rather than at individual isolated things, because we
knew that we could be smarter. It's pretty obvious to anybody. It isn’t a
competitive secret that, if somebody donates to the campaign, they're probably a
good supporter. But unless you have those things brought together, you're not
necessarily pushing that information out to people, so that they can understand.
We were looking for a way that we could bring data together quickly and put it
directly into the hands of our analysts, and Vertica was exactly that kind of
solution for us. The speed and the scalability meant that we didn't have to worry about making
sure that everything was properly transformed and didn't have to spend all of this time
structuring data for performance. We could bring it together and then let our analysts figure it out
using SQL, which is very powerful, but pretty simple to learn.
Better analytic platform
Gardner: Until the fairly recent past, it wasn't practical, both from a cost and technology
perspective, to try to get at all the data, but it has gotten to that point now. So when you are
looking at all of the different data that you can bring to bear on a national election, in a big
country of hundreds of millions of people, what were some of the issues that you did face when
you were looking at that goal of getting to all the data and therefore getting a better analytic
Wegrzyn: We hadn’t done it before. We had to figure it out as we were going along. The most
important realization that we made was that it wasn't going to be a huge
technology effort that was going to make this happen. It was going to be about
analysts. That’s a really generic term. Maybe it's data scientists or something,
but it's about people who were going to understand the political challenges,
understand something about the data, and go in and find answers.
We structured our organization around being analyst-centric. We needed to build
tools and platforms, so that they could start working immediately and not wait on us
on the technology side to build the best system. It wasn’t about building the best system, but it
was about getting something where we could prototype rapidly.
Nothing that we did was worth doing if we couldn't get something into somebody's hands in a
week and then start refining it. But we had to be able to move very, very quickly, because we
were just under a constant time crunch.
Gardner: I would imagine that in the final two months and weeks of an election, things are
happening very rapidly, and to have a better sense of what the true situation on the ground is
gives you an opportunity to react to it.
It seems that in the past, it was a gut instinct. People were very talented and were paid very good
money to be able to try to distill this from a perspective of knowledge and experience. What
changed when you were able to bring the Vertica platform, big data, and real-time analysis to the
function of an election?
Wegrzyn: Just about everything. There isn't a part of the campaign that was untouched by us,
and in a lot of those places where gut ruled, we were able to bring in some numbers. This came
down from the top campaign manager, Jim Messina. Out of the gate, he was saying that we have
to put analytics in every part of the organization and we want to measure everything. That gave
us the mission and the freedom to go in and start thinking how we could change how this
But the campaign was driven. We tested emails relentlessly. A lot of our program was driven by
trying to figure out what works and then quantify that and go out and do more. One of our big
successes was in the most traditional area of campaigns nowadays: media buying.
There have been a bunch of articles that have come up recently talking about what the
campaign did. So I'm not giving anything away. We were able to take what we understood about
the electorate and who we wanted to communicate with. Rather than taking the traditional TV
buying approach, which was we're going to buy this broad demographic band, buy a lot of TV
news, and we are going to buy a lot of the stuff that's expensive and has high ratings amongst the
big demographics. That’s a lot of wasted money.
We were able to know more precisely who the people are that we want to target, which was the
biggest insight. Then, we were able to take that and figure out -- not the super creepy "we know
exactly what you are watching" level -- but at an aggregate level, what the people we want to
target are watching. So we could buy that, rather than buying the traditional stuff. That's like an
arbitrage opportunity. It’s cheaper for us, but it's way more valuable.
So we were able to buy the right stuff, because we had this insight into what our electorate was
like, and I think it made a big difference in how we bought TV.
Gardner: The results of your big data activities are apparent. As I recall, Governor Romney's
campaign, at one point, had a larger budget for media, and spent a lot of that. You had a more
effective budget with media, and it showed.
Another indication was that on election night, right up until the exit polls were announced, the
Republican side didn't seem to know very clearly or accurately what the outcome was going to
be. You seemed to have a better sense. So the stakes here are extremely high. What’s going to be
the next chapter for the coming elections, in two, and then four years along the cycle?
Wegrzyn: That’s a really interesting question, and obviously it's one that I have had to spend a
lot of time thinking about. The way that I think about the campaign in 2012 is as one giant fancy
office tower. We call it the Obama Campaign. When you have problems or decisions that have to
be made, that goes up to the top and then back down. It’s all a very controlled process.
We are tipping that tower on its side now for 2014. Instead of having one big organization, we
have to try to do this to 50, 100, maybe hundreds of smaller organizations that are going to have
conflicting priorities. But the one thing that they have in common now is they saw what we did
on the last campaign and they know that that's the future.
So what we have to do is take that and figure out how we can take this thing that worked very
well for this one big organization, one centralized organization, and spread it out to all of these
other organizations so that we can empower them.
They're going to have smaller staffs. They're going to have different programs. How do we
empower them to use the tools that we used and the innovations that we created to improve their
activity? It’s going to be a challenge.
Gardner: It’s interesting, there are parallels between what you're facing as a political
organization, with federation, local districts for Congress, races in the state level, and then of
course to the national offices as well. This is a parallel to businesses. Many businesses have a
large centralized organization and they also have distributed and federated business units,
perhaps in other countries for global companies.
Is there a feedback loop here, whereby one level of success, like you well demonstrated in
2012, leads to more of the federated, on-the-ground, distributed gathering and utilization of data
that also then feeds back to the larger organization, so that there's a virtuous adoption pattern that
will benefit across the ecosystem? Is that something you are expecting?
Wegrzyn: Absolutely. Even within the campaign, once people knew that this tool was available,
that they could go into Vertica and just answer any question about the campaign's operation, it
transformed the way that people were thinking about it. It increased people's interest in applying
that to new areas. They were constantly coming at us with questions like, "Hey, can we do this?"
We didn't know. We didn’t have enough staff to do that yet.
One of our big advantages is that we've already had a lot of adoption throughout campaigns of
some of the data gathering. They understand that we have to gather this data. We don't know
what we are going to do with it, but we have them understanding that we have to gather it. It's
really great, because now we can start doing smart things with it.
And then they're going to have that immediate reaction like, "Wow, I can go in there now and I
can figure out something smart about all of the stuff that I put in and all of the stuff that I have
been collecting. Now I want more." So I think we're expecting that it will grow. Sometimes I lose
sleep about how that’s going to just grow and grow and grow.
Gardner: We think about that virtuous adoption cycle, more-and-more types of data, all the data,
if possible, being brought to bear. We saw this morning here at the Big Data Conference some
examples and use cases for the HAVEn approach for HP, which includes Vertica, Hadoop,
Autonomy IDOL, Security, and ArcSight types of products and services. Does that strike a chord
with you that you need to get at the data, but now that definition of the data is exploding and you
need to somehow come to grips with that?
Wegrzyn: That's something that we only started to dabble in: things like text analysis, like what
Autonomy can do with that unstructured data. That's stuff that we only started to touch on during the
campaign, because it’s hard. We make some use of Hadoop in various parts of our setup.
We're looking to a future where we bring in more of that unstructured intelligence, that
information from social media, from how people are interacting with our staff and with the
campaign, and trying to do something intelligent with that. Our future is bringing all of those
systems, all of those ideas together, and exposing them to that fleet of analysts and everybody
who wants it.
Gardner: Well, great. I'm afraid we'll have to leave it there. We've been learning about how big
data problems were handled in a handy fashion in the realm of political science. In fact, making
it more scientific.
We've seen how the Democratic National Committee leveraged big data to better understand and
predict voter behavior and what was going on on the ground during the 2012 national elections.
We have seen how they've deployed the HP Vertica analytics platform to better provide analytics and
insights for their various analysts and the participants in the campaign.
So a big thank you to our guest, Chris Wegrzyn, the Director of Data Architecture for the DNC in
Washington. Thanks so much, Chris.
Wegrzyn: Thank you.
Gardner: And thanks also to our audience for joining this special HP Discover Performance
Podcast coming to you from the HP Vertica Big Data Conference in Boston.
I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of
HP sponsored discussions. Thanks again for joining, and come back next time.
Copyright Interarbor Solutions, LLC, 2005-2013. All rights reserved.
You may also be interested in:
MZI Healthcare Identifies Big Data Patient Productivity Gems Using HP Vertica
Thought Leader Interview: HP's Global CISO Brett Wahlin on the future of Security and
Panel explains how CSC creates a tough cybersecurity posture against global threats
Risk and complexity: Businesses need to get a grip
Advanced IT monitoring Delivers Predictive Diagnostics Focus to United Airlines
CSC and HP team up to define the new state needed for comprehensive enterprise
BYOD brings new security challenges for IT: Allowing greater access while protecting
HP Vertica Architecture Gives Massive Performance Boost to Toughest BI Queries for