Forecasting and Analysing the 2012 French
Presidential Election with Google Trends
Abstract
In the last ten years there has been a strong surge in the use of the Internet during election
campaigns. Google Trends gives political scientists unprecedented access to non-intrusive data
on voters’ web queries. Although this data has been used extensively in medical and economic
research, surprisingly few political scientists have used it. Hyunyoung Choi and Hal Varian have
developed a ‘forecasting the present’ method to predict the next value of an economic indicator
by combining Google Trends with past economic data. This paper applies this method to predict the
results of the first round of the 2012 French presidential election. The predictions of the election
results, obtained by combining daily polls with the Google queries for candidates, slightly
outperform the polls’ predictions. In the future, pollsters could therefore use the 100,000 Google
queries per day for candidates in France to improve their estimates and increase poll sensitivity to
opinion changes.
In addition to improving election predictions, Google Trends is used to analyse the turning points
in the campaign and the influence of French institutional rules on candidates’ strategies. The
analysis identifies three distinct phases. First, the incumbent effect dominates other
communication strategies. Then, from the beginning of January to mid-March, major candidates
compete to control the campaign agenda. François Hollande started his campaign earlier and took
the lead in the issue ownership competition during this crucial phase. Finally, between mid-March
and the election (22nd April), the major candidates are constrained by the institutional rules
which impose equal media coverage for all candidates. Although Google Trends shows a surge in
queries for minor candidates during this phase, voters’ interest does not materialise in poll
increases. The constraints on airtime nullified Sarkozy’s efforts to introduce new issues during
this final period, as all major candidates had stable Google queries. Given the budget constraints
of the French presidential election, this analysis suggests that major candidates should use their
limited resources as soon as possible, in order to push their preferred issues on the agenda before
the introduction of the media airtime limits.
Table of Contents
Abstract
I. Introduction
II. Literature review and background
  A. The use of the Internet in political campaigns: the US as forerunner
  B. Google Trends and political science
  C. Literature on forecasting
    1. Public opinion and election forecasting with Twitter
    2. Forecasts based on Google Trends: the examples of medical research and then economics
  D. Background to the 2012 French presidential election
    1. French presidential election
    2. The Internet in French Politics
III. Forecasting the election results with Google Trends
  A. Data collection
    1. Google Trends
    2. Polls
  B. First findings from the comparison of polls and Google Trends
    1. Information surge for small candidates in the last 6 weeks
    2. Big variance in Google searches
    3. Correlation between Google Trends and polls
    4. Comparison of weekly Google Trends with final election results
  C. Methodology
    1. “Nowcasting” the results of the election with Google Trends based on daily polls. Method developed by Hyunyoung Choi and Hal Varian (2011)
    2. Models
  D. Results
    1. Predictions with model 1
    2. Comparing predictions of election results
    3. Testing the accuracy of the predictions using a one-sample z-test
    4. Statistical distance (χ² sum) between model predictions and election results
IV. Analysis of the French presidential campaign in the light of Google Trends: issue ownership theory and candidate strategy given institutional constraints
  A. Theoretical and institutional background for the analysis
    1. Institutional rules on media coverage during the presidential campaign
    2. Issue ownership theory
  B. Political strategies through the multiple phases of the campaign observed on Google Trends
    1. 1st Phase: The Incumbent Advantage
    2. 2nd Phase: Competition and Issue Ownership
    3. Third Phase: Information on minor candidates
V. Conclusion
  A. Google Trends provides slightly better predictions but has its limits
  B. Timing as the key to issue ownership
Bibliography
Table of illustrations
Tables
Appendices
  A. Fixed effect panel regression: inconclusive method
  B. Predicting the turnout with Google Trends
  C. Comparing websites now and in 2007: bridging the gap?
  D. Weekly Google Trends and timeline of the main events during the campaign
I. Introduction
As the Internet’s role is increasing in people’s lives, politicians have understood the importance of
campaigning on the web. Political scientists have studied how parties and leaders use the Internet
as a new means of communication to mobilize, raise funds or communicate. However, less
academic work has used the Internet as a way to predict the outcome of an election or analyse a
campaign.
In Western democracies, the Internet has become the second medium of information during
political campaigns. Nevertheless, the knowledge of voters’ political preferences as well as their
reaction to political proposals or debates comes essentially from real world data studies such as
polls. Parties and media have an ever growing appetite for information on public opinion but the
intrusiveness of these real world studies makes them costly and often biased.
Since the web search engine is the main gateway to the Internet, analysing web queries gives
access to significant information. This paper aims to explore how Google Trends data
can be used to increase the accuracy of election predictions as well as to analyse the key steps of a
campaign. It will focus on the case study of the French presidential election in 2012, but the
methods used, which are inspired by economics and medical research, could be applied to most
elections. The only existing political science literature using Google Trends is focused on the U.S.
and has not used any predictive econometric model.
The French election offers the advantage of staging ten candidates competing nationally which
provides a vast amount of data. The first round of the French presidential election is particularly
interesting for a Google Trends analysis because the universal suffrage ensures the candidates are
well known among the population and that therefore there are enough Google queries for each
candidate to have good Google Trends outputs. In addition, the multiplicity of candidacies means
there are more polls and Google Trends to compare which increases the likelihood of obtaining
significant results. Finally, the demand for election predictions is twofold. In France, it is
forbidden to publish polls after the end of the official political campaign, several days before the
elections, to prevent polls from influencing voters. Poll institutes are looking for less intrusive and
less costly information on voters. Google Trends could allow them to conduct fewer interviews
and rely on internet queries to give a revised estimate of their last poll over several days instead of
producing a costly new poll every day.[1]
France’s very stringent institutional campaign rules offer a unique setting to analyze how
candidates’ strategies are constrained and how voters react to equal airtime and speaking time for
each candidate. Following on from the vast literature on the incumbent advantage and on
campaign issue ownership, this paper analyses the Google Trends queries to identify the different
steps of the campaign and the impact of candidates’ strategies on voters. Given the institutional
cap on campaign budgets in France, choosing the proper moment to allocate a candidate’s resources
is key to controlling the campaign agenda and performing well in the election.
After a literature review, this paper applies Choi and Varian’s ‘forecasting the present’ method
(2011) to predict the election results from daily Google Trends and daily IFOP polls. Then Google
Trends are used to identify the turning points of the campaign, highlight the impact of institutional
constraints of the campaign and analyse the political strategies and communication in terms of
issue ownership and negative narrative.
[1] The cost of polling has been a big issue in the recent political debate because the right-wing
party was accused of benefiting from its incumbent position to use public money to pay for the
polls it needed for its campaign.
II. Literature review and background
A. The use of the Internet in political campaigns: the US as forerunner
The Internet has been widely used in election campaigns in the last ten years. As often in
information and technology, the US has been a leader. The 2004 US presidential campaign was
the first to demonstrate the potential power of the Internet in influencing campaign processes, if
not election outcomes (Vaccari, 2008). Endres and Warnick (2004) report that “by early summer
2003, some ten presidential campaigns had already established an active Web presence for the
2004 presidential race”.
Many academics have analyzed the 2008 US presidential campaign as a turning point. Barack
Obama had thousands of volunteers to maintain his website, mobilize voters and organize
fund-raising. However, the new use of the Internet as a political tool started with the Democratic
primaries. The Democratic candidates first used their websites to “create ideological unity,
involvement, and commitment among their supporters” (Pollarda, Chesebro, & Studinski, 2009).
In other words, a candidate’s website performs the core functions of the campaign: to motivate
supporters for concrete action and to “convert undecided into supporters” (Bimber and Davis, 2003).
In addition to classical political objectives, websites were also used for fundraising. In that sense,
the US presidential election invented wide scale crowd-funding, a new source of financing based
on many very small donors (or investors). The amounts of money raised in the 2008 and 2012 US
presidential elections are unprecedented. According to BusinessWeek, Barack Obama raised 690
million dollars online in 2012 out of the 720 million dollars of individual campaign spending
(Green, November 29, 2012).
The Internet has not increased activism massively but it has made it easier for small organizations
to contribute to a campaign and mobilize their members. One of the main debates left unanswered
is whether the lowering of the entrance cost has created a level playing field or widened the gap
between rich and poor candidates. The answer to this debate certainly depends on the country.
Indeed, until 2003 most of the research had failed to show any impact of new technologies on
political participation in other Western democracies (Ward and Vedel, 2006). After the 2008
presidential campaign, there has been much debate among academics about the impact of the
Internet use on civic and political engagement. Shelley Boulianne (2009) conducted a meta-analysis of
research in which she found that the Internet had an impact on engagement which decreases with
time.
Margolis and Resnick (2000) famously asserted “that ‘politics as usual’ would prevail online
and that a process of ‘normalization’ would empty the internet of most of its innovative potential”
(Vaccari, 2008). The situation is very different in two-party systems such as the US compared to
multi-party systems. In the USA, the Republican and Democratic parties and candidates regularly
outperform their minor counterparts (Schweitzer, 2005). In France, “a quantitative study on
French parties’ websites conducted in 2000 (Sauger, 2002) found that relatively small forces, such
as the Parti Communiste Français (PCF), the Greens and the Union pour la Démocratie Française
(UDF) outperformed the three largest parties: the Gaullist Rassemblement pour la République
(RPR), the Parti Socialiste (PS), and the Front National (FN).” (Vaccari, 2008)
More generally, the adoption of information and communication technologies (ICTs) by political
parties has been found to depend on three broad factors: the technological development, the socio-
political environment (including electoral laws, types of election, and party system structure), and
internal variables (such as party resources, incentives, and philosophical orientation) (Nixon et al.,
2003). Thus, the factors that influence political actors’ adoption of ICTs seem to be more nuanced
than the mere availability of resources.
More surprisingly, the internet was widely used during the 2008 Democratic primaries to provide “a
foundation for tracking, if not predicting, the success of specific candidates at different stages in
the campaign process” (Pollarda, Chesebro, & Studinski, 2009). Indeed, forecasts of election
outcomes were inaccurate (Jurkowitz, 2008; Kohut, 2008). Pollarda et al. describe digital
technologies as “unobtrusive measures of political success”. For example, they argue that the
number of followers on MySpace.com or on Facebook and the number of hits on the candidates’
websites are better predictors of political success than polls.
B. Google Trends and political science
Google Trends has not been used extensively to analyze election results and all the research has
been done on US politics.
Catherine Lui, Panagiotis T. Metaxas and Eni Mustafaraj analyzed the 2008 and the 2010 US
Congressional elections and compared the Google searches in each state to the voting results.
They only use the raw Google Trends data as a predictor: the candidate with the most Google
queries is predicted as the winner. Using this very basic model, they do not find that Google
Trends is a good predictor except in very specific cases. However the analysis of incumbency
effects and comparison with the polls of the New York Times give more interesting results (Lui,
Metaxas, & Musta, 2011).
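The 'most queries wins' rule described above is simple enough to write down directly. The following is a minimal sketch in Python; the function names and the figures in the example are hypothetical and are not taken from Lui et al.

```python
def predict_winner(search_index):
    """Naive rule: the candidate with the highest query index is the predicted winner."""
    return max(search_index, key=search_index.get)


def hit_rate(races, actual_winners):
    """Share of races in which the naive rule picks the actual winner."""
    hits = sum(predict_winner(race) == winner
               for race, winner in zip(races, actual_winners))
    return hits / len(races)


# Hypothetical query indices for two races (illustrative values only).
races = [
    {"candidate_a": 62, "candidate_b": 38},
    {"candidate_c": 45, "candidate_d": 55},
]
actual_winners = ["candidate_a", "candidate_c"]
print(predict_winner(races[0]))
print(hit_rate(races, actual_winners))
```

Such a rule ignores polls, incumbency and the reasons behind a search, which is consistent with its weak performance in the study cited above.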
Reilly et al. show that Google searches for state ballots one week before the 2008 US Presidential
election are well correlated with the actual participation in these ballots (Reilly, Richey, & Taylor,
2012). The authors test two ways of searching—by ballot name or by ballot topic— to correct
possible limitations of Google Trends. Most notably, search volumes depend a lot on the search
term used to generate the query shares as well as the “temporal coding decision” (monthly,
weekly or daily data).
These studies both underline the potential of Google Trends in predicting election results but the
mere calculation of correlations between Google queries and election results isn’t sufficient. It is
therefore necessary to look at how other disciplines have made the most of Google queries and
apply their methods to election forecasting.
C. Literature on forecasting
1. Public opinion and election forecasting with Twitter
Academics have studied Twitter as a proxy for public opinion. O’Connor, Balasubramanyan,
Routledge, & Smith (2010) find a high correlation between surveys and Twitter messages on
political opinion in the US in 2008 and 2009. Given that Twitter “replicates consumer confidence
and presidential job approval polls”, they believe Twitter analysis could be a supplement to
polling. However, simple correlations are not conclusive enough, and tweets alone cannot give
better results than polling because of higher uncertainty. Tumasjan, Sprenger, Sandner, & Welpe
(2010) made a more advanced Twitter analysis of the German elections. Not only do they find
that the number of messages mentioning a candidate or a party reflects the election results, but
even ‘joint mentions’ of two parties reflect the coalitions that form after the elections. These
results are very robust because they take into account the difference between original tweets and
retweets. Although Sunstein (2007) had described the blogosphere as lacking a “pricing
system”, Tumasjan et al. (2010) find that “The size of the followership and the rate of retweets may
represent the Twittersphere's “currency” and provide it with its own kind of a pricing
mechanism.”
Previous studies (Albrecht, Lübcke, & Hartig-Perschke, 2007; Jansen, Zhang, Sobel, &
Chowdury, 2009) find contradicting results, but this can be explained by the increasing number of
Twitter users through the years. Indeed, in 2005 small parties were always found to be
over-represented on Twitter compared to the real political world. The increasing number of users
makes it a better indicator of political opinion.
A more recent unpublished study used the “forecasting the present” method (Choi and Varian,
2011) to predict the 2012 US presidential election using Twitter data (Choy, Cheong, Nang
Laik, & Ping Shung, 2012).
2. Forecasts based on Google Trends: the examples of medical research and then economics
Google Trends was first famously used to detect influenza epidemics in the United States
(Ginsberg, Mohebbi, Patel, Brammer, Smolinski, & Brilliant, 2009). Although
these epidemics have been studied through real world questionnaires, the use of Google Trends
gave very accurate results with a lag of only one day and at almost no cost. The method established a
relation between the number of Google queries for influenza symptoms and the number of cases
by looking at the past five years, and then successfully applied the equation to forecast the present.
In 2009, Choi and Varian developed a new methodology to use web queries data as additional
information to predict the current value of an economic indicator which is usually published
monthly or quarterly:
“Use data you already have and add the data trends as incremental … to help in predicting the
present. For example, the volume of queries on automobile sales during the second week in June
may be helpful in predicting the June auto sales report which is released several weeks later in
July.”
Hal R. Varian, Google Chief Economist
They used this method, which they call ‘predicting the present’, to forecast consumer behaviour
(retail and automotive sales, travel destinations) (Choi and Varian, 2011). This method
was then used to estimate the initial claims for unemployment benefits in the US (Askitas and
Zimmermann, 2009) and private consumption (Vosen & Schmidt, 2011).
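The idea of adding query data 'as incremental' information can be illustrated by comparing a baseline regression on the previous value of an indicator with an augmented regression that also includes the current query index. The sketch below is only in the spirit of the method: the specification y_t = b0 + b1*y_(t-1) + b2*x_t, the synthetic data and all function names are assumptions made for illustration, not the models estimated by Choi and Varian or in this paper.

```python
import random


def solve(A, b):
    """Solve the linear system A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) from the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    Xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def mean_abs_error(coef, rows, y):
    return sum(abs(predict(coef, r) - yi) for r, yi in zip(rows, y)) / len(y)


# Synthetic data for illustration only: x is a query index, y the indicator.
random.seed(0)
n = 200
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.0]
for t in range(1, n):
    y.append(0.5 * y[t - 1] + 1.0 * x[t] + random.gauss(0, 0.1))

target = y[1:]
base_rows = [[y[t - 1]] for t in range(1, n)]       # past value only
aug_rows = [[y[t - 1], x[t]] for t in range(1, n)]  # past value + current queries

base = fit_ols(base_rows, target)
aug = fit_ols(aug_rows, target)
mae_base = mean_abs_error(base, base_rows, target)
mae_aug = mean_abs_error(aug, aug_rows, target)
print(f"in-sample MAE, baseline: {mae_base:.3f}, with query index: {mae_aug:.3f}")
```

The errors compared here are in-sample; a real evaluation would refit the model at each date and compare one-step-ahead predictions.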
Central banks are starting to study the possibility of using Google Trends in addition to more
standard indicators of economic activity (the Reserve Bank of Australia (Troy, Perera, & Sunner,
2012) and the Bank of England (McLaren & Shanbhorge, 2011)). In recent months, Google
Trends has been used with this method in new fields such as birth forecasting (D’Amuri,
Marcucci, & Billari, 2013) and trading behavior in
financial markets (Preis, Moat, & Stanley, 2013).
D. Background to the 2012 French presidential election
1. French presidential election
The French presidential election is a two-round election, also called run-off voting, in which a
candidate can only be elected by obtaining an absolute majority of votes. If no candidate gets a
majority of votes in the first round, the two leading candidates qualify for a second round.
In order to become candidates, citizens must obtain 500 signed nominations from elected officials.
There are 45,543 elected officials, each of whom can support only one candidate. Among the 500
nominations, no more than 10% can come from the same département, and a candidate must have
nominations from at least 30 different départements.
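The nomination thresholds can be expressed as a simple check. The sketch below is one reading of the rule, under the assumption that nominations beyond the 10% cap in a département simply do not count towards the 500; the function and parameter names are hypothetical.

```python
from collections import Counter


def nominations_valid(departements, required=500, min_departements=30):
    """Check a list holding one département code per signed nomination.

    Assumed reading of the rule: at most 10% of the required 500
    nominations (i.e. 50) may be counted from any single département.
    """
    counts = Counter(departements)
    cap = required // 10
    countable = sum(min(c, cap) for c in counts.values())
    return countable >= required and len(counts) >= min_departements


# 50 départements with 10 nominations each: 500 nominations, well spread.
spread = [f"{d:02d}" for d in range(1, 51) for _ in range(10)]
# 548 nominations, but 200 of them come from a single département.
concentrated = ["01"] * 200 + [f"{d:02d}" for d in range(2, 31) for _ in range(12)]
print(nominations_valid(spread), nominations_valid(concentrated))
```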
In 2012, the first round of the presidential election was held on 22nd April 2012. There were 10
candidates:
N°   Candidate                    Number of votes   % registered   % votes
1    Mme JOLY Eva                         828,345           1.8       2.31
2    Mme LE PEN Marine                  6,421,426          13.95     17.9
3    M. SARKOZY Nicolas                 9,753,629          21.19     27.18
4    M. MÉLENCHON Jean-Luc              3,984,822           8.66     11.1
5    M. POUTOU Philippe                   411,160           0.89      1.15
6    Mme ARTHAUD Nathalie                 202,548           0.44      0.56
7    M. CHEMINADE Jacques                  89,545           0.19      0.25
8    M. BAYROU François                 3,275,122           7.12      9.13
9    M. DUPONT-AIGNAN Nicolas             643,907           1.4       1.79
10   M. HOLLANDE François              10,272,705          22.32     28.63
TABLE 1: RESULTS OF THE FIRST ROUND OF THE ELECTION (22ND APRIL 2012)
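As a check on Table 1, the vote shares and the application of the run-off rule can be recomputed from the vote counts. This is a minimal sketch; it assumes that the sum of the ten candidates' votes is the total of valid votes, and the function name is hypothetical.

```python
# Vote counts taken from Table 1.
votes = {
    "Joly": 828345, "Le Pen": 6421426, "Sarkozy": 9753629,
    "Mélenchon": 3984822, "Poutou": 411160, "Arthaud": 202548,
    "Cheminade": 89545, "Bayrou": 3275122, "Dupont-Aignan": 643907,
    "Hollande": 10272705,
}


def first_round_outcome(votes):
    """Return the candidates who win outright or qualify, plus the vote shares in %."""
    total = sum(votes.values())
    shares = {name: 100 * v / total for name, v in votes.items()}
    ranked = sorted(shares, key=shares.get, reverse=True)
    if shares[ranked[0]] > 50:
        return [ranked[0]], shares   # absolute majority: elected in the first round
    return ranked[:2], shares        # otherwise the two leading candidates qualify


qualified, shares = first_round_outcome(votes)
print(qualified)
print({name: round(share, 2) for name, share in shares.items()})
```

No candidate exceeds 50% of the votes, so the two leading candidates, Hollande and Sarkozy, qualify for the second round.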
2. The Internet in French Politics
One of the main difficulties of using data on internet users is the risk of biases coming from the
non-representativeness of Google users. Indeed most studies show that users of political websites
are more interested in politics and more partisan than the average population (Vaccari, 2008).
However, most research dates back to almost ten years ago when the internet was less used. This
part looks at statistics on the use of the Internet in French households, the use of Google in the
French population and the use of the internet as a source of information during the 2012
campaign.
a) Internet use in households and by individuals in 2012
Eurostat provides very detailed information on the use of the Internet by European households.
The 2012 report shows that 80% of the French population had access to the internet at home (Seybert,
2012).
          Internet connection (households, %)   Broadband connection (households, %)
Year      2008    2010    2012                  2008    2010    2012
France      62      74      80                    57      66      77
TABLE 2: INDIVIDUALS WHO USED THE INTERNET, 2012
SOURCE: EUROSTAT
FIGURE 1: HISTOGRAM OF INTERNET USAGE (% OF INDIVIDUALS), 2012 (EUROSTAT)
b) Google in France: a quasi-monopoly as a search engine
In March 2013 in France, 91.7% of visits to websites initiated from search engines came from
Google.
TABLE 3: RELATIVE WEIGHT OF INTERNET SEARCH ENGINES IN FRANCE IN MARCH 2013.
SOURCE: AT INTERNET
The situation has not changed since the 2012 presidential election. Indeed, a study by
Médiamétrie-eStat conducted in September 2011 finds similar results, with 90.5% of searches done on Google.
This very large market share gives Google a quasi-monopoly. These figures support the
idea of using internet searches as a proxy for public opinion.
c) Internet and the 2012 French elections
Sylvain Brouard and Simona Zimmermann (2012) have studied French media practices during
the 2012 presidential campaign compared to those of 2007. It appears that the use of the Internet
is on average more frequent (a 9-point increase), whereas the press declined: in
2012, the use of the national press lost 3 points and the use of the regional press lost 5 points. The
use of television and radio remained stable.
The use of the Internet as a primary source of information stays low but increased
continuously between 2007 and 2012, from 5% to 13%. Meanwhile, the use of the national press
declined steadily from 11% to 5% and that of the regional press from 9% to 3%. Thus, while the
use of the Internet has doubled, that of the printed media has halved.
        Television   Radio   Internet   National printed   Regional printed   Free printed   Neither
2007            58      16          5                 10                  8              1         0
2012            57      16         14                  7                  4              1         1
TABLE 4: THE PRIMARY SOURCE OF POLITICAL INFORMATION DURING ELECTION CAMPAIGNS
IN 2007 AND 2012.
Sources: Baromètre Politique Français: September 2006-February 2007,
TNS Sofres - TriÉlec: September 2011-March 2012
Radio, television and the Internet are almost equal as a second source of political information. During
the 2012 campaign, 23% of respondents cited television as a second source of political
information, 23% the radio and 22% the Internet. Then come the regional print media (15%), national
newspapers (12%) and the free press (3%).
Compared to the 2007 election campaign, the use of the internet as a second source of information
is much higher (12 points) in 2012.
        Television   Radio   Internet   National printed   Regional printed   Free printed   Neither
2007            24      23         10                 15                 23              4         2
2012            23      23         22                 12                 15              3         2
TABLE 5: SECOND SOURCE OF POLITICAL INFORMATION DURING THE 2007 AND 2012 ELECTION
Sources: Baromètre Politique Français: September 2006-February 2007,
TNS Sofres - TriÉlec: September 2011-March 2012
d) Frequency of internet use and type of sites browsed.
Among people who use the Internet as a source of political information, the majority of
respondents consult it daily (54%), while two-thirds (66%) look up information on the Internet at
least 5 days a week. These figures are considerably lower than those declaring a daily use of
television or radio but are still high enough to consider that daily Google search queries are
representative of the overall population.
FIGURE 2: FREQUENCY OF MONITORING INFORMATION ON THE INTERNET
SOURCE: TNS SOFRES – TRIÉLEC, SEPTEMBER 2011 - MARCH 2012
During the 2012 campaign, the websites most often used by internet users were general-interest
portals such as Google, Yahoo, Orange, etc. (53%). National media websites are used by a quarter of
Internet users whereas online newspapers are used by 13%.
FIGURE 3: TYPES OF WEBSITES USED FOR INFORMATION ON THE ELECTION CAMPAIGN
SOURCE: TNS SOFRES – TRIÉLEC, SEPTEMBER 2011 - MARCH 2012
e) Possible bias by political activists
In addition to the usual biases related to the possible lack of representativeness, Google Trends
also suffers from an experiment bias. Indeed, Mustafaraj and Metaxas (2009) underline the fact
that political activists have tried to influence web search results, “using link-bombing techniques
to raise negative web pages with contents close to their agendas to the top-10 search results”.
Google even admitted that this happened in the 2006 US elections. However, the authors do not
find such effects of “gaming the search engines” in the 2008 US Congressional elections. This
experiment effect is well known in behavioural economics: people act unnaturally
when they know they are in an experiment. Here, the fact that the internet is becoming a focus of
attention makes it vulnerable to individuals wanting to rig the number of searches for a candidate.
However, the large increase in usage in the past five years decreases the likelihood of
malicious individuals being able to introduce a bias.
III. Forecasting the election results with Google Trends
A. Data collection
The variables for poll results are formatted as firstname_lastname whereas the variables for
Google Trends are formatted as lastname.
1. Google Trends
a) Google Trends raw data
Google Trends is a free and public service provided by Google Inc. that shows how often a term is
searched on the search engine relative to the total search volume across a geographical region, for
a given language and time period. The data is given relative to the total number of searches,
corrected for the general use of Google. It is easily downloadable in CSV format, but the number
of keywords is limited to 5 at a time.
FIGURE 4: GOOGLE TRENDS DASHBOARD - WEEKLY GOOGLE QUERIES FOR THE FIVE MAIN
CANDIDATES BETWEEN OCTOBER 2011 AND MARCH 2012
For confidentiality reasons, Google does not disclose the absolute number of queries and the
results given by Google Trends are averaged, normalized and scaled. More specifically, Google
performs the following adjustments.
Firstly, Google only analyses a portion of the total number of web queries over the selected period
of time and geographic area chosen by the user in order to display results quickly. As a result,
queries made by only a few users over a short period of time are eliminated from the results.
Secondly, Google normalizes the results, which means that “the sets of data are divided by a
common variable to cancel out the variable's effect on the data” (Google, 2011). That way, it
eliminates the general “trend” resulting from the increase in Google’s usage as well as the
difference related to varying use of Google in different regions.
Finally, once the data has been normalized, it is scaled. Google divides all query numbers for the
period of time considered by the highest number of queries for that particular keyword during that
period. The results are then displayed as a percentage of the maximum for the period.
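The normalization and scaling steps described above can be sketched as follows. Google does not disclose its exact procedure, so this is only an illustration under stated assumptions: the query counts are hypothetical, the function name is invented, and rounding to whole numbers is assumed.

```python
def trends_index(keyword_queries, total_queries):
    """Normalise keyword counts by total search volume, then scale the peak to 100."""
    # Normalisation: share of all queries in each period, which removes the
    # general growth in search-engine usage.
    shares = [k / t for k, t in zip(keyword_queries, total_queries)]
    # Scaling: express every period as a percentage of the highest share.
    peak = max(shares)
    return [round(100 * s / peak) for s in shares]


# Hypothetical weekly counts for one keyword and for all queries.
keyword_queries = [120, 300, 150]
total_queries = [10000, 12000, 10000]
print(trends_index(keyword_queries, total_queries))
```

Under this scheme the index only conveys relative interest: two keywords with very different absolute volumes can both peak at 100 when queried separately.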
Google Trends data has been available since January 2004. Depending on the time span requested
and the popularity of the query, results are shown daily, weekly or monthly. However, Google
does not provide any information on keywords that generate very few queries on the web.
Unfortunately, Google does not disclose the threshold for the minimum number of queries
required to appear in Google Trends results.
In conclusion, we will consider that Google Trends provides the likelihood of a random user
searching for a particular keyword in a given location during a specified period of time.
b) Google Trends data for the 2012 presidential election
The data used in this paper consists of daily and weekly queries starting in October 2011, after the
socialist primaries (16th October 2011) and the birth of Sarkozy and Carla Bruni’s child Giulia
(19th October 2011), and running until the first round of the election (22nd April 2012).
Google Trends only allows the comparison of 5 keywords at a time. However, since the results are
displayed relative to the highest point of the sample, it is possible to compare as many keywords as
necessary as long as the candidate with the maximum number of queries is kept in every comparison
as the baseline. For example, looking at Figure 4, Sarkozy has the highest peak (11-17 March 2012²,
marked 100 by Google Trends) between October 2011 and March 2012 and can be taken as the baseline.
Comparing each of the 9 other candidates with Sarkozy over the same time span puts all candidates
on the same scale.
It is important to exclude April 2012 from the time span because on Election Day the queries are
so numerous that they change the scale and flatten all previous months (see Figure 5). It is however
possible to solve the problem by taking data from two different time spans and rescaling to build
an extension by continuity (also called concatenation).
² For the first time a poll predicted Sarkozy ahead of Hollande in the first round.
FIGURE 5: GOOGLE TRENDS DASHBOARD - WEEKLY GOOGLE QUERIES FOR THE FIVE MAIN
CANDIDATES BETWEEN OCTOBER 2011 AND MARCH 2012
One of the main drawbacks of Google Trends results is that small values are rounded. This is
highly problematic for small candidates, who are always around 1%. To counter this difficulty, it
is worth comparing the 4 smallest candidates with each other (Figure 6, which is much more
precise for small candidates) and rescaling the results by continuity with the results for the main
candidates. This method gives much more precise results for small candidates.
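A minimal Python sketch of rescaling by continuity follows. It assumes that one candidate appears in both the main comparison and the small-candidate comparison and can serve as a link between the two scales; the paper does not say which series plays this role, so the link series, the function name and all numbers below are hypothetical.

```python
def rescale_by_continuity(link_in_main, link_in_small, small_series):
    """Rescale `small_series` from the small-candidate extract to the main scale.

    link_in_main and link_in_small are the SAME candidate's values over the same
    dates, read from the main extract and from the small-candidate extract
    (a hypothetical linking series; the paper does not name one).
    """
    if sum(link_in_small) == 0:
        raise ValueError("link series must have non-zero volume in the small extract")
    factor = sum(link_in_main) / sum(link_in_small)
    return [factor * value for value in small_series]


# Hypothetical weekly indices: the link candidate peaks at 4 on the main scale
# (where the baseline candidate = 100) but at 100 in the small-candidate comparison.
link_main = [2, 4, 2]
link_small = [50, 100, 50]
other_small = [25, 40, 10]
on_main_scale = rescale_by_continuity(link_main, link_small, other_small)
```

The rescaled series keeps fractional values (here about 1.0, 1.6 and 0.4) that would otherwise be lost to rounding on the main scale.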
FIGURE 6: GOOGLE TRENDS DASHBOARD - WEEKLY GOOGLE QUERIES FOR THE FOUR
SMALLEST CANDIDATES BETWEEN OCTOBER 2011 AND MARCH 2012
Once the Google Trends series for all candidates have been rescaled and concatenated, the queries are
expressed as a percentage of the sum of queries for all candidates (Figure 7). This facilitates the
comparison with polls and election results, which are also expressed as a share of the total. In
addition, it removes the increase in the number of queries linked to the rise in political interest
over the course of the campaign.
FIGURE 7: GOOGLE TRENDS FOR ALL CANDIDATES DURING THE SIX MONTHS PRECEDING THE
ELECTION, RESCALED BY THE AUTHOR (% OF ALL QUERIES FOR CANDIDATES)
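The conversion of rescaled indices into shares can be sketched as follows. The candidate labels and values are invented; the example is built so that overall interest doubles on the second date, which illustrates how expressing queries as a share of the total removes the general rise in political interest.

```python
def to_shares(series_by_candidate):
    """Convert indices on a common scale into shares of the total at each date."""
    names = list(series_by_candidate)
    length = len(series_by_candidate[names[0]])
    totals = [sum(series_by_candidate[n][t] for n in names) for t in range(length)]
    return {
        n: [series_by_candidate[n][t] / totals[t] for t in range(length)]
        for n in names
    }


# Hypothetical indices for two candidates; total interest doubles on the second date.
rescaled = {"A": [40.0, 80.0], "B": [10.0, 20.0]}
shares = to_shares(rescaled)
```

Both dates give the same shares (0.8 and 0.2) even though the raw indices doubled.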
c) Estimation of the average number of searches per month (with AdWords)
AdWords is a service provided by Google for businesses to increase the visibility of their website.
Although the service is paid, the website provides free estimates of the absolute number of queries
a site would receive for a given keyword.
Candidate     Monthly queries (thousands)
SARKOZY        550
HOLLANDE      1200
LE PEN        1000
MELENCHON      240
JOLY           256
TABLE 6: ESTIMATE OF THE NUMBER OF MONTHLY GOOGLE QUERIES (IN THOUSANDS) IN 2013.
SOURCE: GOOGLE ADWORDS
[Figure 7 plot: share of queries (0% to 60%) from 06/11/2011 to 06/04/2012. Series: sarkozy, bayrou, hollande, melenchon, joly, le pen, cheminade, dupont-aignan, arthaud, poutou.]
Taking these figures into account, we can estimate the number of searches for all the candidates
at about 3 million per month, that is, roughly 100,000 per day. This very large number of searches
highlights the potential of Google Trends compared with polls, which are usually based on 1000
interviews and conducted every week. Google Trends gives free access to the behaviour behind about
one hundred thousand queries every day. By way of comparison, in France, polls cost 1€ per person
and per question (IFOP website), so a poll costs several thousand Euros. Learning how to use the
results of Google Trends could give more precise information (based on 100,000 queries), more
frequently and at a lower price.
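The order of magnitude can be checked with a few lines of Python using the Table 6 figures. The 30-day month is an assumption made here for the conversion to a daily figure, and the total only covers the five candidates listed in the table.

```python
# Monthly query estimates from Table 6, in thousands.
monthly_thousands = {
    "SARKOZY": 550,
    "HOLLANDE": 1200,
    "LE PEN": 1000,
    "MELENCHON": 240,
    "JOLY": 256,
}

monthly_total = 1000 * sum(monthly_thousands.values())  # queries per month
daily_total = monthly_total / 30                        # assuming a 30-day month

# Cost of one poll question at 1 euro per person with 1000 interviews.
cost_per_question_eur = 1000 * 1
```

The five candidates listed add up to about 3.2 million queries per month, or a little over 100,000 per day, which is consistent with the rounded figures quoted in the text.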
2. Polls
a) Gathering all French polls
Most French poll institutes provide the results of their polls on their websites. Gathering the data
of the seven biggest poll institutes³ for the 2012 French Presidential campaign gives the
following graph.
³ TNS, Harris, OpinionWay, IFOP, CSA, BVA and LH3
FIGURE 8: AGGREGATION OF ALL THE POLLS DURING THE SIX MONTHS PRECEDING THE
ELECTION (TNS, HARRIS, OPINIONWAY, IFOP, CSA, BVA, LH3)
b) Restriction to daily IFOP polls only for regressions
Although it is useful to gather data from different poll institutes, the regressions and predictions
that follow are based on IFOP polls only. A single daily time series cannot be built directly
when several polls are published on the same day. Moreover, different poll institutes
use different methods to correct the biases in their samples, so using different polls in the same time
series would have added undesired discontinuities.
The IFOP (Institut Français d’Opinion Publique, or French Public Opinion Institute) published
very frequent polls during the last six months before the election held on 22nd April 2012.
Between 3rd November and 20th April, 129 polls were conducted in 171 days.
[Figure 8 plot: poll results (vertical axis 0 to 40) from 20/10/2011 to 20/04/2012. Series: Nathalie_Arthaud, Philippe_Poutou, Jean_Luc_Mélenchon, François_Hollande, Eva_Joly, François_Bayrou, Nicolas_Sarkozy, Nicolas_Dupont_Aignan, Marine_Le_Pen, Jacques_Cheminade.]
More specifically, between January and April 2012, the IFOP institute carried out a poll on every
working day. However, in 2011 the polls were a bit less frequent. In order to perform the
regression daily and to use all the data provided by Google Trends, the daily time series of polls
was completed using the value of the last existing poll for any missing value (when no poll was
conducted).
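The gap-filling rule described above (carry the last available poll forward) can be sketched in Python as follows. The function name and the sample values are illustrative; `None` marks a day without a poll.

```python
def fill_forward(daily_polls):
    """Replace missing days (None) with the value of the last existing poll."""
    filled = []
    last_value = None
    for value in daily_polls:
        if value is not None:
            last_value = value
        filled.append(last_value)
    return filled


# Hypothetical week: polls on two days only, none before the first poll.
week = [None, 27.0, None, None, 26.5]
completed = fill_forward(week)
```

Days before the first poll stay missing, since there is no earlier value to carry forward.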
It is important to note that IFOP runs a “rolling poll”, which means that the poll published every
day represents the accumulated results of the last three days. About 350 people are interviewed
every day, and each daily publication is based on the latest 1000 interviews. This method tends to
smooth the variations and slightly increases the uncertainty (variance) of day-to-day changes, because
the evolution between two successive polls is based on only 350 new interviews. This survey
methodology does not call into question the nowcasting method used, but it is important to take it
into account when interpreting the results.
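A small sketch makes the rolling mechanism concrete. It assumes equal daily samples, so that the published figure is the plain average of the last three daily results; the daily values are invented and the function name is hypothetical.

```python
def rolling_poll(daily_results, window=3):
    """Average the last `window` daily results, assuming equal daily sample sizes."""
    published = []
    for t in range(window - 1, len(daily_results)):
        published.append(sum(daily_results[t - window + 1 : t + 1]) / window)
    return published


# Hypothetical daily results for one candidate (share of about 350 interviews each).
daily = [0.26, 0.28, 0.27, 0.31]
published = rolling_poll(daily)
change = published[1] - published[0]
```

The change between two successive publications equals the newest day minus the dropped day, divided by three, which is why the published series moves slowly and why each change rests on only one new day of interviews.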
B. First findings from the comparison of polls and Google trends
1. Information surge for small candidates in the last 6 weeks
Figure 9 shows a comparison between daily Google Trends and daily polls for all candidates,
ranked by order of election results (black number). Although the candidates are not all displayed on
the same scale, it is noteworthy that the 3 smallest candidates show a relatively large
surge in the number of Google queries starting in mid-March, which does not correlate with an
increase in polls or in votes on Election Day. These surges can be explained by the fact that those
candidates are not very well known among voters, and many people want to learn more about them
before voting even if they will not vote for them. This information surge is also caused by the
institutional rules in France, which impose equal time in the media during the final weeks
before the election.
FIGURE 9: COMPARISON OF WEEKLY GOOGLE TRENDS FOR ALL CANDIDATES WITH POLLS
FROM ALL INSTITUTES (TNS, HARRIS, OPINIONWAY, IFOP, CSA, BVA, LH3)
2. Big variance in Google searches
FIGURE 10: VARIANCES OF GOOGLE TRENDS AND POLLS⁴
Figure 9 clearly shows that Google Trends vary more and are more uncertain than
polls, because Google queries are much more influenced by the media and current events. Figure
10 confirms that the variances of Google Trends are much higher than those of polls. More
interestingly, the variances of Google Trends for the different candidates mirror the results of the
election. This can be interpreted in terms of communication: efficient or expensive
campaigns arouse the interest of a lot of people at the same time but for a short period,
which increases variance. Although campaign budgets in France are capped at 21 million Euros
and media coverage is regulated, minor candidates have less money, and candidates usually
perform according to their ability to arouse interest when they control the political agenda.
3. Correlation between Google Trends and polls.
Correlation is the most basic statistical tool to compare the evolution of two time series without
taking into account their averages. The correlation measures the linear dependence between the two
time series, and it can also be seen geometrically as the cosine of the angle between the vectors
representing the two centred time series.
⁴ TNS, Harris, OpinionWay, IFOP, CSA, BVA, LH3
[Figure 10 plot: variances of polls and Google Trends by candidate, vertical axis from 0 to 0.008. Series: polls, google trend.]
The correlation between the daily Google Trends for all candidates (1548 observations) and daily
IFOP polls is 61%, with a p-value reported as 0 (see Appendices). This correlation means that the two
time series are relatively close and that they partly describe the same phenomenon or two effects that
are related.
Correlations between polls and Google Trends for each candidate vary considerably (because of the
variance differences observed previously). While Jean-Luc Mélenchon’s Google Trends and polls
are correlated at 75.33% (p-value = 0), Nicolas Sarkozy’s polls and queries only correlate at 21.6%
(p-value = 4.5%). For all candidates the correlation is significantly different from zero at the 5%
level.
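For readers who want to reproduce this kind of figure, the Pearson correlation can be computed as the cosine of the angle between the two centred series. The sketch below uses invented numbers rather than the paper's data.

```python
import math


def pearson(x, y):
    """Pearson correlation: cosine of the angle between the centred vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    dot = sum(a * b for a, b in zip(dx, dy))
    return dot / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# Hypothetical series: shifting one series by a constant leaves the correlation unchanged.
polls = [0.26, 0.27, 0.27, 0.28, 0.27]
trends = [0.30, 0.35, 0.31, 0.40, 0.33]
r = pearson(polls, trends)
r_shifted = pearson(polls, [v + 0.10 for v in trends])
```

Because both series are centred first, the result ignores their average levels, which is what allows polls and query shares of different magnitudes to be compared.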
Intuitively, if polls and Google Trends both described public opinion exactly, they should vary
together and be perfectly correlated. Although the correlation is not perfect, especially because of
the strong variance of Google Trends, which react more to current affairs, a 61% correlation is
convincing by political science standards. By way of comparison, O’Connor, Balasubramanyan,
Routledge, & Smith (2010) compare polls with sentiment measured
from Twitter and find most correlations around 70%. Reilly, Richey, & Taylor (2012), for their
part, seek to establish that Google data taken a week before the election significantly
correlates with actual electoral participation on ballot measures. They find that Google queries
“data on ballot question names has a negative correlation of −.191 (p value = .02), and topic
searches are correlated at −.150 (p value = .06)” and conclude that the correlations are significant.
4. Comparison of weekly Google Trends with final election results.
In order to quantify how well Google Trends data approximates public opinion (here polls),
this paper proposes to use a statistical distance taken from the χ² goodness-of-fit test. The test is
originally designed to check whether two distributions come
from the same random phenomenon (same probability law). The test calculates a χ² sum, or
distance, between the two distributions, which should follow a χ² law. In our case the
Google Trends are not derived from real draws, so we cannot perform the test completely, but we
can calculate the statistical distance between the two time series and analyse its evolution.
The two terms compared in the sum are the share of Google Trends queries for candidate j at time t
and the poll result obtained by candidate j at time t.
FIGURE 11: Χ² SUM = STATISTICAL DISTANCE BETWEEN GOOGLE TRENDS AND POLLS
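A χ² sum of this kind can be sketched in Python as below. The sketch assumes the usual Pearson form, in which each squared difference is divided by the reference share; the exact formula and any scaling by sample size used in the paper are not reproduced here, so the function and the numbers are illustrative only.

```python
def chi2_distance(observed_shares, reference_shares):
    """Pearson-type distance: sum over candidates of (observed - reference)^2 / reference.

    Assumed form for illustration; reference shares must be strictly positive.
    """
    return sum(
        (obs - ref) ** 2 / ref
        for obs, ref in zip(observed_shares, reference_shares)
    )


# Hypothetical two-candidate example: query shares versus reference shares.
identical = chi2_distance([0.5, 0.5], [0.5, 0.5])
different = chi2_distance([0.6, 0.4], [0.5, 0.5])
```

The distance is zero when the two sets of shares coincide and grows as they diverge, with deviations on small candidates weighing more because they are divided by a small reference share.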
We could have compared the weekly searches (Google Trends) to the polls of the same week, but
since the variation of the polls is much smaller than that of the search trends, it would not have
improved the measure significantly.
Figure 11 shows that Google queries are closely related to votes (small χ² sum)
between December 2011 and mid-March 2012. Using Google Trends only, the election results
would therefore be best forecast by the average share of queries during that period. However,
these averages would be very poor predictors because they would only reflect very long-term
trends and disregard the effect of the campaign. For this reason, this paper will only use Google
Trends combined with poll results to make predictions.
After the week of 13th March, searches for small candidates start to rise without any link to the
polls. People start being interested in small candidates they have not heard of before, but this does
not change their vote. This increase in distance after mid-March confirms the findings of section B.1
on the information surge in Google Trends during that period without any changes in polls.
[Figure 11 plot: χ² sum over time, vertical axis from 0 to 2.5.]
C. Methodology
1. “Nowcasting”⁵ the results of the election with Google Trends
based on daily polls. Method developed by Hyunyoung Choi and Hal
Varian (2011)
a) Model
Hyunyoung Choi and Hal Varian have used Google Trends to predict present values of many
economic indicators which are published with a delay of one to six months. Their idea is to forecast
a time series using its own lagged values and to add Google Trends data as a predictor. From an
econometric point of view, the method requires 1) identifying the structure of the time series of
both the polls and the Google Trends, 2) regressing the time series of polls on the Google Trends
and their lags⁶, and finally 3) predicting the election with the fitted values of the regression.
b) Time series’ structure
In order to create a model that fits the data, we start by trying to fit a linear autoregressive model
to the poll results for each candidate. The partial autocorrelograph is typical of a time series
structured as an AR (1). This model provides a good fit for all candidates.
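As an illustration of how such a diagnostic is read, the sketch below computes a sample autocorrelation in plain Python and checks the AR(1) signature on the values reported in Figure 12: for an AR(1) process the autocorrelation at lag 2 should be close to the square of the autocorrelation at lag 1. The function is a textbook estimator and may differ slightly from Stata's internal computation.

```python
def autocorr(series, lag):
    """Sample autocorrelation of `series` at the given positive lag."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((v - mean) ** 2 for v in series)
    numerator = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numerator / denominator


# Values reported in Figure 12 for Nicolas Sarkozy's polls.
ac_lag1 = 0.9595
ac_lag2 = 0.9193
ar1_implied_lag2 = ac_lag1 ** 2  # an AR(1) process implies AC(2) close to AC(1)^2
```

Here the implied value is about 0.921, close to the reported 0.9193, which agrees with the AR(1) reading of the partial autocorrelations.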
FIGURE 12: AUTOCORRELOGRAPH OF POLLS (IFOP) FOR NICOLAS SARKOZY BETWEEN OCTOBER
2011 AND APRIL 2012
. corrgram nicolas_sarkozy
 LAG       AC        PAC        Q      Prob>Q
   1     0.9595    0.9625    159.28    0.0000
   2     0.9193    0.0090    306.38    0.0000
   3     0.8763   -0.0774    440.82    0.0000
   4     0.8422    0.1015    565.76    0.0000
   5     0.8190    0.1252    684.63    0.0000
   6     0.7985    0.0188    798.3     0.0000
   7     0.7802    0.0579    907.5     0.0000
   8     0.7518   -0.1311   1009.5     0.0000
   9     0.7200   -0.0347   1103.6     0.0000
  10     0.6926    0.0735   1191.3     0.0000
  11     0.6573   -0.1367   1270.8     0.0000
  12     0.6303    0.0864   1344.3     0.0000
  13     0.6048    0.0225   1412.4     0.0000
⁵ Predicting the present
⁶ The lags are limited to those identified as important in the structures of the time series.
The structure of the time series of Google Trends is less clear than that of the time series of polls
and varies among candidates. For some candidates, such as François Hollande (Figure 13), the
PACs are not negligible for lags 2, 3 and 4, whereas they are close to zero for Marine Le Pen
(Figure 14). Since the first partial autocorrelation is always very significantly positive, the basic
model used for this paper will only use the first lag of the Google Trends time series.
FIGURE 13: AUTOCORRELOGRAPH OF GOOGLE TRENDS FOR FRANCOIS HOLLANDE BETWEEN
OCTOBER 2011 AND APRIL 2012
. corrgram hollande
 LAG       AC        PAC        Q      Prob>Q
   1     0.6651    0.6690    77.412    0.0000
   2     0.3494   -0.1892    98.909    0.0000
   3     0.2196    0.1330   107.45     0.0000
   4     0.2483    0.1800   118.44     0.0000
   5     0.2624    0.0088   130.77     0.0000
   6     0.2416    0.0538   141.29     0.0000
FIGURE 14: AUTOCORRELOGRAPH OF GOOGLE TRENDS FOR MARINE LE PEN BETWEEN
OCTOBER 2011 AND APRIL 2012
. corrgram lepen
 LAG       AC        PAC        Q      Prob>Q
   1     0.7401    0.7403    95.86     0.0000
   2     0.5181   -0.0675   143.11     0.0000
   3     0.3739    0.0331   167.87     0.0000
   4     0.2989    0.0632   183.78     0.0000
   5     0.2557    0.0348   195.5      0.0000
   6     0.1992   -0.0326   202.66     0.0000
2. Models
a) Model 1
\[ Y_{i,t} = c_i + \alpha_i Y_{i,t-1} + \beta_{i,0} X_{i,t} + \beta_{i,1} X_{i,t-1} + \varepsilon_{i,t} \]
Yi,t: poll result at day t for candidate i
Xi,t: Google Trends share of searches at day t for candidate i
ci: constant term; εi,t: error term
For example, for Nicolas Sarkozy, the fitted equation is as follows:
\[ Y_{Sarkozy,t} = 0.038 + 0.84\, Y_{Sarkozy,t-1} - 0.030\, X_{Sarkozy,t} + 0.040\, X_{Sarkozy,t-1} \]
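To make the regression step concrete, here is a minimal pure-Python sketch of an ordinary least squares fit of a poll series on its own first lag and on the current and lagged Google Trends share. It is not the Stata routine used in the paper: the data are synthetic and noise-free, generated from known coefficients chosen only for illustration, and the function names are hypothetical.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_lagged_model(polls, trends):
    """OLS of polls[t] on [1, polls[t-1], trends[t], trends[t-1]] via normal equations."""
    rows = [[1.0, polls[t - 1], trends[t], trends[t - 1]] for t in range(1, len(polls))]
    targets = polls[1:]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)  # [constant, lagged poll, trends at t, trends at t-1]


# Synthetic, noise-free data generated from known coefficients (illustrative only).
TRUE = [0.04, 0.8, -0.03, 0.04]
trends = [0.30 + 0.05 * math.sin(0.7 * t) + 0.03 * math.cos(1.9 * t) for t in range(60)]
polls = [0.27]
for t in range(1, 60):
    polls.append(TRUE[0] + TRUE[1] * polls[t - 1] + TRUE[2] * trends[t] + TRUE[3] * trends[t - 1])

estimate = fit_lagged_model(polls, trends)
```

On noise-free data the fit recovers the generating coefficients; on real polls, a statistics package would also be needed for standard errors and p-values such as those in the Stata output.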
FIGURE 15: REGRESSION OF THE MODEL FOR NICOLAS SARKOZY (STATA)
. reg nicolas_sarkozy L.nicolas_sarkozy sarkozy L.sarkozy

      Source |       SS       df       MS            Number of obs =     170
             |                                       F(  3,   166) =  223.33
       Model |  .037710314     3  .012570105         Prob > F      =  0.0000
    Residual |  .009343365   166  .000056285         R-squared     =  0.8014
             |                                       Adj R-squared =  0.7978
       Total |  .047053679   169  .000278424         Root MSE      =   .0075

nicolas_sa~y |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
nicolas_sa~y |
         L1. |   .8357339   .0326496    25.60   0.000     .7712719     .900196
     sarkozy |
         --. |  -.0295735   .0085533    -3.46   0.001    -.0464608   -.0126862
         L1. |   .0397088   .0084197     4.72   0.000     .0230852    .0563323
       _cons |    .038776   .0091262     4.25   0.000     .0207577    .0567943

The coefficient for the lag of polls (αi) is dominant and very significantly positive. This is
consistent with the fact that Google Trends data is only a correction of the polls to take into account
the number of searches on the internet.
Although the two coefficients measuring the influence of Google Trends (βi,0 and βi,1) are small, they
are both significantly different from zero (βi,0 is negative and βi,1 is positive), which confirms the
relationship between Google Trends and the polls and supports the model.
For most other candidates, however, βi,0 is not significantly different from 0 while βi,1 is
significantly positive (see Philippe Poutou below). This is consistent with the fact that the polls are
published with a one-day lag. Although the polls are “rolling”, the difference between Yi,t
and Yi,t-1 only captures the variation linked to day t-1; thus the coefficients βi,0, βi,2 and βi,3,
which capture the effects of Google Trends at t, t-2 and t-3, are not significantly positive.
FIGURE 16: REGRESSION OF MODEL 1 FOR PHILIPPE POUTOU (STATA)
. reg philippe_poutou L.philippe_poutou poutou L.poutou

      Source |       SS       df       MS            Number of obs =     169
             |                                       F(  3,   165) =  120.53
       Model |  .000970679     3   .00032356         Prob > F      =  0.0000
    Residual |  .000442931   165  2.6844e-06         R-squared     =  0.6867
             |                                       Adj R-squared =  0.6810
       Total |  .001413609   168  8.4143e-06         Root MSE      =  .00164

philippe_p~u |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
philippe_p~u |
         L1. |   .7925791   .0451287    17.56   0.000     .7034748    .8816833
      poutou |
         --. |  -.0197268   .0115815    -1.70   0.090    -.0425938    .0031401
         L1. |   .0389948   .0117764     3.31   0.001      .015743    .0622466
       _cons |   .0004887    .000219     2.23   0.027     .0000563    .0009211
b) Model 1’: adding extra lags to the model is not conclusive
In order to confirm the structure of our data and the model chosen, it is interesting to look at the
impact of polls and Google Trends two days before, using the following model:
\[ Y_{i,t} = c_i + \alpha_{i,1} Y_{i,t-1} + \alpha_{i,2} Y_{i,t-2} + \beta_{i,0} X_{i,t} + \beta_{i,1} X_{i,t-1} + \beta_{i,2} X_{i,t-2} + \varepsilon_{i,t} \]
FIGURE 17: REGRESSION OF MODEL 1’ WITH EXTRA LAGS FOR PHILIPPE POUTOU (STATA)
. reg philippe_poutou L.philippe_poutou L.L.philippe_poutou poutou L.poutou L.L.poutou

      Source |       SS       df       MS            Number of obs =     168
             |                                       F(  5,   162) =   71.91
       Model |   .00097333     5  .000194666         Prob > F      =  0.0000
    Residual |  .000438575   162  2.7073e-06         R-squared     =  0.6894
             |                                       Adj R-squared =  0.6798
       Total |  .001411905   167  8.4545e-06         Root MSE      =  .00165

philippe_p~u |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
philippe_p~u |
         L1. |   .7181832   .0783327     9.17   0.000     .5634983     .872868
         L2. |   .0822185   .0772793     1.06   0.289    -.0703861    .2348231
      poutou |
         --. |  -.0199154   .0118362    -1.68   0.094    -.0432884    .0034577
         L1. |   .0312457   .0162509     1.92   0.056    -.0008452    .0633366
         L2. |   .0100565   .0123057     0.82   0.415    -.0142437    .0343567
       _cons |   .0004313   .0002264     1.91   0.059    -.0000158    .0008785

For all candidates, neither Google searches from two days before nor polls from two days before
have a significant effect⁷. This confirms that the polls are autoregressive AR(1) time series
and are correlated with queries from the day before (t-1).
⁷ The coefficients on the two-day lags are not significantly different from 0.
c) Model 2: taking into account the variations in Google
Trends instead of the share of searches
Yi,t = Poll result at day t for candidate i
Xi,t = Google Trend Search at day t for candidate i
FIGURE 18: REGRESSION OF MODEL 2 FOR FRANCOIS HOLLANDE (STATA)
. reg franois_hollande L.franois_hollande dhollande d2hollande d3hollande

      Source |       SS       df       MS            Number of obs =     167
             |                                       F(  4,   162) =  593.61
       Model |  .055552598     4   .01388815         Prob > F      =  0.0000
    Residual |  .003790186   162  .000023396         R-squared     =  0.9361
             |                                       Adj R-squared =  0.9346
       Total |  .059342784   166  .000357487         Root MSE      =  .00484

franois_ho~e |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
franois_ho~e |
         L1. |   .9572066   .0196541    48.70   0.000     .9183953    .9960179
   dhollande |  -.0000486   .0014114    -0.03   0.973    -.0028357    .0027386
  d2hollande |   .0001732   .0013953     0.12   0.901    -.0025822    .0029286
  d3hollande |  -.0016746   .0014069    -1.19   0.236    -.0044529    .0011037
       _cons |   .0120656   .0056694     2.13   0.035     .0008702     .023261

The results are much less convincing for model 2 because the p-values of the three Google Trends
coefficients are high (0.973, 0.901 and 0.236): we cannot reject the null hypothesis that these
coefficients are zero.
This suggests that the level of searches (model 1) gives more information than the mere variations
of Google searches over the past 3 days (model 2).
D. Results
1. Predictions with model 1
The fitted values from both models are then used to predict the results on election day.
FIGURE 19 FITTED VALUES FOR NICOLAS SARKOZY BASED ON THE REGRESSION OF MODEL 1
[Figure 19 plot: polls and fitted values (left axis, 20% to 30%) and Google Trend (right axis, 0% to 80%). Labelled values: 27,18% and 26,71%.]
The fitted values drawn on this graph give a predicted result for Sarkozy on election day of
26.71%, while the election result was in fact 27.18%.
Quite intuitively, when the Google Trends share is relatively low, the fitted values (or predicted
values) are below the polls because the model takes into account the smaller number of internet
searches. Although Google Trends have a high variance (a different scale is used on the graph)
because candidates can attract attention to themselves without attracting votes, the significance of
βi,1 shows that Google Trends is a useful explanatory variable.
2. Comparing predictions of election results
FIGURE 20 PREDICTIONS OF MODEL 1 WITH DAILY IFOP POLLS AND GOOGLE TRENDS
[Figure 20 plot: last poll, predicted values and results for each candidate, horizontal axis from 0% to 30%.]
Candidate                    Results   Last poll   Fitted values model 1   Fitted values model 2
Mme JOLY Eva                  2,31%      2,50%            2,55%                   2,53%
Mme LE PEN Marine            17,90%     16,50%           16,52%                  16,54%
M. SARKOZY Nicolas           27,18%     27,00%           26,71%                  27,52%
M. MÉLENCHON Jean-Luc        11,10%     13,50%           13,39%                  13,53%
M. POUTOU Philippe            1,15%      1,00%            0,98%                   0,87%
Mme ARTHAUD Nathalie          0,56%      0,50%            0,47%                   0,47%
M. CHEMINADE Jacques          0,25%      0,00%            0,04%                   0,01%
M. BAYROU François            9,13%     10,00%           10,09%                  10,09%
M. DUPONT-AIGNAN Nicolas      1,79%      1,50%            1,41%                   1,42%
M. HOLLANDE François         28,63%     27,50%           27,55%                  26,98%
TABLE 7: PREDICTIONS OF ELECTION RESULTS WITH THE TWO MODELS
Looking at Table 7, the fitted values of model 1 are predictors of similar quality to the last polls.
Depending on the candidate, either the last poll or the predicted value is closer to the final election
result. Although model 1 has not considerably improved the forecast, a statistical test can tell us
whether this model has significantly improved the prediction. Table 7 also shows the predicted values
from model 2, which are quite similar in magnitude but, by the distance measure of section D.4, worse
predictors overall.
3. Testing the accuracy of the predictions using a one-sample z-test
The one-sample z-test is used to test whether a proportion observed in a population sample is
significantly different from the theoretical value in the total population, given the size of the
sample. In France most polls are based on 1000 interviews. A z-test on a poll indicates whether
the poll result can be considered significantly different from the election result for a sample of
1000 people. In order to compare our predictions with those of comparable polls, we shall apply
the z-test as if our estimates had been obtained by a poll with 1000 interviews.
The test statistic is to be compared to a standard normal distribution (n = 1000).
n°   Candidate                    Last polls   Model 1   Model 2
1    Mme JOLY Eva                   32,77%     34,53%    34,08%
2    Mme LE PEN Marine              43,80%     43,60%    43,45%
3    M. SARKOZY Nicolas             27,55%     31,58%    29,79%
4    M. MÉLENCHON Jean-Luc          49,61%     49,48%    49,64%
5    M. POUTOU Philippe             33,59%     34,91%    40,00%
6    Mme ARTHAUD Nathalie           30,02%     32,34%    32,81%
7    M. CHEMINADE Jacques           47,17%     45,50%    46,91%
8    M. BAYROU François             41,51%     42,75%    42,70%
9    M. DUPONT-AIGNAN Nicolas       37,77%     40,96%    40,41%
10   M. HOLLANDE François           39,27%     38,80%    43,80%
TABLE 8: P-VALUES OF Z-TEST FOR SAMPLES OF 1000 PEOPLE.
According to the Z test, the two models seem to perform similarly to the last polls. However all of
these perform quite poorly because the average p-value of 40% means that there was only a 60%
chance of the candidates getting a result as far from the prediction as they got.
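For reference, the textbook one-sample z-test for a proportion can be sketched in Python as follows. The paper's exact formula is not shown here, so this is an assumed standard form (election result as the theoretical proportion, two-sided p-value) and its output is not expected to reproduce the figures of Table 8.

```python
import math


def z_test_proportion(p_hat, p_true, n=1000):
    """One-sample z-test of an estimated share against a reference share.

    Returns the z statistic and a two-sided p-value from the standard normal law.
    """
    standard_error = math.sqrt(p_true * (1.0 - p_true) / n)
    z = (p_hat - p_true) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Table 7 values for Nicolas Sarkozy: last poll 27.00%, election result 27.18%.
z_sarkozy, p_sarkozy = z_test_proportion(0.2700, 0.2718)
```

With a sample of 1000, a gap of 0.18 points is far smaller than one standard error (about 1.4 points), so under this standard form the poll cannot be distinguished from the result.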
4. Statistical distance (χ² sum) between model predictions and
election results
Applying the same method as in section B.4, we use the χ² sum as a measure of the
distance between the model’s predictions and the results.
The terms compared in the sum are, for each candidate j, the predicted result from the model or the
poll (treated as obtained with 1000 interviews) and the real result obtained by the candidate.
           Polls    Model 1   Model 2
χ² sum     12,41    12,02     14,17
TABLE 9: DISTANCE BETWEEN THE PREDICTION OR POLLS AND THE ELECTION RESULTS
Predictions made by model 1 have the smallest χ² sum and are therefore the closest to the real
results of the election.
In conclusion, the predictions made by model 1 are slightly better than those of the polls, but
model 2 is, as expected, noticeably worse.
IV. Analysis of the French presidential Campaign in the light of
Google Trends: issue ownership theory and candidate strategy given
institutional constraints.
In addition to improving predictions of election results, Google Trends can also be used to analyse
the political campaign. After a brief presentation of the institutional and theoretical
background, this part will use Google Trends to identify the turning points of the campaign,
highlight the impact of the institutional constraints on the campaign and analyse the political
strategies and communication in terms of issue ownership and negative narrative.
A. Theoretical and institutional background for the analysis
1. Institutional rules on media coverage during the presidential
campaign
The French Constitutional Council established political pluralism as a constitutional principle. It is on
this basis that the French High Council of Audiovisual (CSA) sets the rules that govern
radio stations and television channels (national and local) during the presidential campaign. These
rules concern the speaking time and airtime (speeches, reports, analyses, etc.) of declared
candidates (people who have publicly expressed their willingness to participate in the election) or
potential candidates (those who have received important public support in favour of their
candidacy) as well as their supporters (anyone calling to vote in favour of a candidate).
The CSA rules published in the Official Journal on 6 December 2011 distinguish three
periods:
- From January 1st until the day of the publication of the list of official candidates (mid-March 2012), radio and television media must respect the principle of equity. Equity is based primarily on the popularity of the candidate, which is derived from past election results and opinion polls.
- From the date of publication of the official list of candidates (mid-March 2012) until April 8th 2012 at midnight, speaking time should be equal but airtime must only respect the principle of equity.
- From April 9th to May 4th 2012 at midnight (official campaign), candidates must have the same speaking time and the same airtime.
2. Issue ownership theory
Issue ownership is a theory developed to understand how presidential candidates compete for the
agenda in order to underline their strengths. The theory, which was first developed by Petrocik
(1991, 1996), states that “candidates campaign on issues that confer an advantage in order to prime
their salience in the decisional calculus of the voters” (Benoit, Petrocik, & Hansen, 2004). The
rationale for this theory is that parties have a reputation for being more capable of dealing with
some issues. Candidates use this reputation to increase their credibility by increasing the salience
of these issues or of those that show their opponents’ weaknesses - issue trespassing (Damore, The
Dynamics of Issue Ownership in Presidential Campaigns, 2004).
As this paper has shown that Google Trends can be used as a good indicator of public opinion, it
will now use Google Trends to analyse how agenda competition is shaped by the
institutional constraints of the French Presidential campaign.
B. Political strategies through the multiple phases of the campaign
observed on Google Trends
Looking at the daily Google queries for the main candidates during the campaign (Figure 21), it is
straightforward to identify different phases of the campaign. The following three phases can be
distinguished:
- first phase: incumbent advantage until the end of December 2011;
- second phase: competition for voters between 1st January and mid-March;
- third phase: information seeking between mid-March and the election.
FIGURE 21: THREE PHASES OF THE CAMPAIGN
DAILY GOOGLE SEARCHES FOR FIVE MAIN CANDIDATES NOVEMBER 2011 - 22ND APRIL
1. 1st Phase: The Incumbent Advantage
During the first phase, which lasted until the beginning of January 2012 (four months before the
election), Nicolas Sarkozy had a clear advantage in terms of media coverage because of his
incumbent position. This advantage is very visible in terms of Google queries, since he has an
average of 38% of queries. Hollande only got decent coverage on 17th November, when he
revealed his campaign team (the only day when he got more queries than Sarkozy, with 37%).
The main surges in Google searches for Nicolas Sarkozy are linked to international events: the G20
summit in Cannes (3rd and 4th November 2011), the Greek crisis, the new government in Italy, the
Toulon speech and a meeting with Angela Merkel (3rd December 2011).
Sarkozy   Bayrou   Hollande   Melenchon   Le Pen
37,9%     5,3%     19,7%      4,0%        17,7%
TABLE 10: AVERAGE DAILY GOOGLE QUERIES DURING THE 1ST PHASE
[Figure 21 plot: daily share of Google searches (0% to 70%) for sarkozy, bayrou, hollande, melenchon and lepen, divided into three phases: 1st phase (Incumbent Advantage), 2nd phase (Competition), 3rd phase (Information).]
FIGURE 22: DAILY GOOGLE SEARCHES FOR MAIN CANDIDATES AT THE END OF YEAR 2011
Figure 22 highlights that the incumbent advantage lies mainly in the fact that during this first
phase the president keeps most of the agenda-setting powers, especially over the media agenda. In line
with issue ownership theory, Nicolas Sarkozy used this power to increase the salience of
international issues in order to reinforce his image as a statesman, in contrast with his main
opponent, François Hollande, who had never held ministerial responsibilities.
These issues are typical of incumbents. In their study of broadcasts in national elections in
Germany (1990), the United States (1988), and France (1988), Holtz‐Bacha, Lee Kaid, &
Johnston (1994) show that incumbents “were more likely … to show themselves consulting with
world leaders, and to try to represent the presidency (or chancellorship in the case of Germany) as
the standard bearer of governmental legitimacy”.
Nicolas Sarkozy used his incumbent advantage and devoted more time to the media than any of
his predecessors because he believed in the power of the media to set the agenda and influence
public opinion (Kuhn, 12/2010). Moreover, during his term in office, Sarkozy developed formal
and informal structures and institutions of executive media management similar to those of the
US, UK and Germany. Sarkozy copied “agenda building and issue framing techniques practised
by political executives in other leading Western democracies” (Pfetsch, 2008).
2. 2nd Phase: Competition and Issue Ownership
The second phase runs from the beginning of 2012, after the official New Year greetings, until
mid March. According to the Google searches, this is the key phase of the campaign
because issue salience and the campaign agenda shift very easily: voters are very reactive to the
candidates’ proposals, and Google Trends show their greatest variance. During this phase candidates
compete fully and must choose how and when to use their limited resources. The institutional
constraints on candidates are not very important because the media must only respect the principle of
equity (coverage proportional to popularity).
FIGURE 23: DAILY GOOGLE SEARCHES FOR FIVE MAIN CANDIDATES FROM JANUARY TO 15TH MARCH 2012
[Chart annotations: Bourget meeting; Hollande on TF1; Sarkozy: VAT increase; Sarkozy candidate]
During this period the timing of proposals is crucial. On the one hand, because TV and radio
airtime for the major candidates becomes more limited after mid March (the shift from equity to
equality), candidates must use the media extensively while they still can. On the other hand, resources
are limited and candidates might not have any money left at the end of the campaign if they start too early.
Hollande started his campaign first and imposed his agenda (Figure 23). Between 24th January and
2nd February, Google searches for him peaked sharply three times, each peak related to a big meeting
or TV show. During that period he proposed the creation of a new income tax band for high earners,
criticized the world of finance and then proposed a new 75% tax for people earning over 1 million
euros. This choice is consistent with the issue ownership theory because these issues were Sarkozy's
weaknesses.
Sarkozy announced important reforms on 29th January (especially a VAT increase) but was still
acting as president and only announced his candidacy on 15th February. Google Trends show that
from then on he succeeded in attracting media attention but the main themes of the campaign were
those imposed by Hollande who had made his proposals earlier.
Marine Le Pen, however, does not follow these patterns. Unlike the two main candidates, she
was able to arouse voters’ interest throughout the campaign, but only for short periods of time.
Surprisingly, this is the case even during the first phase (incumbent advantage), when Le Pen often
arouses more interest than Hollande and occasionally more than Sarkozy.
The campaign strategies revealed by the analysis of Google Trends are in line with the issue
ownership theory for the two main candidates, but what is rather unusual is that François Hollande
used a negative narrative right from the beginning of his campaign. Damore (Candidate Strategy
and the Decision to Go Negative, 2002) performs a logit analysis of US presidential campaigns
between 1976 and 1996 and finds that the likelihood of a candidate going negative increases over
the course of the campaign and if the candidate is doing badly in the polls. The 2012 French
presidential election is therefore very unusual because François Hollande attacked his main
opponent right from the start, even though he had a clear lead in the polls. Downs (1957)
argues that to appeal to the largest segment of voters, candidates in a two-party system should cast
"some policies into the other's territory in order to convince voters that their net position is near
them." This might better explain Hollande’s strategy and would mean that even though there are
ten candidates in the first round of the French presidential election, this second phase is almost
like a two-party contest.
3. Third Phase: Information on minor candidates
a) Major candidates
The third and last phase runs from mid March until the election on 22nd April. For five weeks,
the Google queries for major candidates stayed constant whereas those for minor candidates
increased (Figure 24), without polls changing accordingly.
FIGURE 24: DAILY GOOGLE SEARCHES FOR FIVE MAIN CANDIDATES FROM 15TH MARCH 2012 -
22ND APRIL (ELECTION DAY)
The sudden stabilization of Google queries for major candidates between phase 2 (competition)
and phase 3 explains why it is so important for candidates to control the agenda in the second phase.
After mid March, institutional rules impose strict equality of speaking time among all
candidates, so the agenda is difficult to control and minor candidates get relatively more attention.
Although the variations in queries for major candidates fall sharply after mid March, the share of
queries stays proportional to their popularity and to the final vote. Even after 9th April, when the
official campaign starts and candidates get exactly the same airtime, major candidates still get queries
in proportion to the poll results.
Labbé & Monière (2012) examined the diversity of the candidates’ vocabulary, using the method of
Labbé, Labbé and Hubert (2004), to assess the variation in each candidate's themes. During this third
phase, and especially in the second half of March, Nicolas Sarkozy attempted to catch up with Hollande
in opinion polls by bringing new proposals into the campaign (Figure 25), but the institutional
constraints prevented him from arousing more interest (no increase in Google queries: Figure 24).
This attempt to innovate is very common, as candidates who are trailing in polls tend to “alter their
messages compared to issues traditionally championed by their party” (Damore, 2004).
FIGURE 25: SARKOZY AND UMP VOCABULARY GROWTH IN PRESS SINCE JANUARY 1, 2012.
NUMBER OF NEW WORDS (IN THOUSANDS), STANDARDISED VARIABLE
(LABBÉ & MONIÈRE, 2012)
b) Minor candidates
Minor candidates benefit from the institutional rules and have an immediate surge in Google
queries when TV and radio stations give them the same speaking time as the main candidates (Figure
26).
FIGURE 26: DAILY GOOGLE SEARCHES FOR FOUR OF THE MINOR CANDIDATES FROM NOVEMBER
2011 TO 22ND APRIL (ELECTION DAY)
The beginning of the official campaign (9th April) and the beginning of airtime equality have no
effect on major candidates but are quite important for the three smallest candidates (Figure 27).
FIGURE 27: DAILY GOOGLE SEARCHES FOR SMALL CANDIDATES (LESS THAN 2% OF VOTE)
FROM 15TH MARCH 2012 - 22ND APRIL (ELECTION DAY)
[Chart annotations: airtime proportional to popularity; equal speaking time; official campaign: equal speaking time and equal airtime. Series: cheminade, dupontaignan, arthaud, poutou]
V. Conclusion
A. Google Trends provides slightly better predictions but has its limits.
The predictions of the French Presidential election in 2012 obtained by a combination of the polls
and Google Trends using the ‘predicting the present’ method are slightly better than those
obtained using the polls alone. This is promising for pollsters who could benefit from this method
to lower the costs of polls by reducing the frequency of interviews.
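The combination of polls and Google Trends described above can be sketched as a small regression. The following is a minimal illustration only, assuming a simplified specification in which today's poll score depends on yesterday's poll score and today's share of Google queries; the models actually estimated in this paper may include further lags. All numbers are synthetic, and the function names (`fit_nowcast`, `nowcast`) are hypothetical.

```python
# Sketch of a 'predicting the present' regression on SYNTHETIC data.
# Assumed model (a simplification): poll[t] = a + b * poll[t-1] + c * trend[t]

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(rows, y):
    """Ordinary least squares through the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)

def fit_nowcast(poll, trend):
    """Estimate (a, b, c) from paired daily poll scores and query shares."""
    rows = [[1.0, poll[t - 1], trend[t]] for t in range(1, len(poll))]
    return ols(rows, poll[1:])

def nowcast(coefs, last_poll, todays_trend):
    """Estimate today's poll score before any new poll is published."""
    a, b, c = coefs
    return a + b * last_poll + c * todays_trend

# Synthetic query shares and a poll series built from known coefficients.
trend = [0.30, 0.35, 0.28, 0.40, 0.33, 0.31, 0.45, 0.38]
poll = [0.27]
for t in range(1, len(trend)):
    poll.append(0.05 + 0.6 * poll[t - 1] + 0.3 * trend[t])

coefs = fit_nowcast(poll, trend)
estimate = nowcast(coefs, poll[-1], 0.36)
```

Because the synthetic series is generated without noise, the fit recovers the coefficients used to build it; with real polls and queries the coefficients would of course be estimated with error.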
In addition to predictions, the very significant correlations found between Google Trends and
polls (about 60%, with p-values below 0.001) validate the use of Google Trends as a means of obtaining an
estimate of public interest or even public opinion. Indeed, the power of Google Trends lies in the
number of queries (about 100,000 per day on the candidates’ names alone in France) and the
frequency of the data. Polls are often published two days after the interviews begin and cannot be
published at the end of the French presidential campaign, whereas Google Trends provides data
every day. Moreover, this study only used Google queries for the candidates’ names, but there are
endless research opportunities in the use of other keywords.
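The correlation between query shares and poll scores, including a lagged version, can be computed as in the following sketch. It is an illustration rather than the procedure used for this paper (which relied on Stata); the function names are hypothetical, and the sample size used in the last line (1,526, the number of observations reported in Appendix A) is taken only as an approximate order of magnitude.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def lagged_pearson(poll, trend, lag=1):
    """Correlation between poll[t] and trend[t - lag]."""
    return pearson(poll[lag:], trend[:len(trend) - lag])

def t_statistic(r, n):
    """t statistic for the null of zero correlation (n - 2 degrees of freedom)."""
    return r * sqrt((n - 2) / (1 - r * r))

# A correlation of about 0.61 over roughly 1,526 observations gives a very
# large t statistic, which is why the reported p-values are so small.
t_value = t_statistic(0.61, 1526)
```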
However, there are also several limitations to the use of Google Trends. Although the number of
internet users has increased a lot in recent years (80% of households in 2012) and Google has a
92% market share in France, it is debatable whether the people who use the Internet are
representative of the whole population. Indeed, only 36% of French people say they use the
Internet as a source of information during political campaigns. Another drawback of Google
Trends is that it is very difficult to distinguish between positive and negative queries for a
candidate. This might be a source of strong bias in this study because the 2012 French presidential
campaign was a particularly negative one, in which many people voted against the
incumbent, Nicolas Sarkozy, rather than in favour of a candidate (Jeambar, 2012). Research
on the use of Twitter in German elections (Tumasjan, Sprenger, Sandner, & Welpe, 2010)
distinguishes negative tweets but suffers much more from the representativeness bias. Further
research using additional functions of Google Trends such as “related terms”, which gives the
main words associated with the keyword, could minimize this bias and obtain an even better
estimate of public opinion. Nevertheless, it is very likely that negative Google queries for
competing candidates cancel each other out overall. A final limitation might be the experiment bias
outlined by Mustafaraj and Metaxas (2009) (see II.D.2.e). This bias occurs if political activists
influence Google Trends by making thousands of queries. This is possible on a given website but
unlikely on Google, given the vast number of queries per day and because Google Trends
eliminates multiple queries from the same person on the same day.
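To illustrate why such filtering limits the influence of activists, the sketch below counts each (person, day, keyword) combination only once. It is a toy model of the idea, not a description of how Google actually processes its logs; the log format and the `user_id` field are assumptions made for the example.

```python
def count_unique_daily_queries(log):
    """Count queries per (day, term), keeping one query per user, day and term.

    `log` is an iterable of (user_id, day, term) tuples (hypothetical format).
    """
    seen = set()
    counts = {}
    for user_id, day, term in log:
        key = (user_id, day, term)
        if key in seen:
            continue  # repeated query by the same person on the same day
        seen.add(key)
        counts[(day, term)] = counts.get((day, term), 0) + 1
    return counts

# One activist repeating a query 1,000 times weighs no more than one voter.
log = [("activist", "2012-03-20", "poutou")] * 1000
log += [("voter_a", "2012-03-20", "poutou"), ("voter_b", "2012-03-20", "hollande")]
counts = count_unique_daily_queries(log)
```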
B. Timing as the key to issue ownership
The analysis of the campaign timeline highlights the fact that the institutional rules abruptly
end the issue ownership competition between the main candidates on 15th March by imposing strict
equality of speaking time. The strict equality of airtime, which is then introduced on 9th April, does
not however have any impact on major candidates. It is noteworthy that even when major
candidates lose their campaign agenda-setting powers after mid March, the level of queries stays
proportional to the importance of the candidates. The institutional rules create perfect equality on TV
and radio, but the Internet still reveals differences among candidates. This confirms that although
Google queries might be influenced by media coverage (through media websites), they also
reflect public opinion.
In terms of candidate strategy, the analysis highlights the fact that Sarkozy did not succeed in
controlling the agenda. He started his campaign very late (mid February), which cost him the
benefit of his incumbent advantage. Hollande imposed his themes in January and the first part of
February. Then, after mid March, Sarkozy was constrained by the institutional rules, which
prevented him from introducing new issues into the campaign agenda. Even though he spent more
time than any of his predecessors dealing with communication, Sarkozy failed to maintain his
incumbent advantage in terms of agenda construction and issue framing because it has become
“impossible to exert effective management of the media, especially on social networks and
internet” (Kuhn, 12/2010).
Bibliography
Albrecht, S., Lübcke, M., & Hartig-Perschke, R. (2007). Weblog Campaigning in the German
Bundestag Election in 2005. Social Science Computer Review 25(4), 504-520.
Armstrong, J. S., & Graefe, A. (2011). Predicting elections from biographical information about
candidates: A test of the index method. Journal of Business Research, Volume 64, Issue 7, 699–706.
Askitas, N., & Zimmermann, K. F. (2009). Google Econometrics and Unemployment
Forecasting. Applied Economics Quarterly 55, 107–20.
Benoit, W., Petrocik, J., & Hansen, G. (2004). Issue ownership and presidential campaigning,
1952-2000. Political Science Quarterly, 599.
Blais, A. (2004). How Many Voters Change Their Minds in the Month Preceding an Election?
American Political Science Association.
Boulianne, S. (2009). Does Internet Use Affect Engagement? A Meta-Analysis of Research.
Political Communication Volume 26, Issue 2, 193-211.
Bourdeau, T. (19 April 2012). Les candidats passés à la moulinette d’internet. Retrieved 5
August 2013 from http://www.rfi.fr/:
http://www.rfi.fr/france/20120419-presidentielle-2012-sarkozy-melenchon-hollande-le-pen-cheminade-internet-compuware
Bruter, Y. a. (2007). Electoral Behaviour. Encyclopedia of European elections, Basingstoke:
Palgrave Macmillan, pp. 88-95.
Choe, S. H. (1980). Time of Decision and Media Use During the Ford-Carter Campaign. Public
Opinion Quarterly.
Choi, H., & Varian, H. (2011). Predicting the Present with Google Trends. Google Research Blog.
Choy, M., Cheong, M., Nang Laik, M., & Ping Shung, K. (2012). US Presidential Election 2012
Prediction using Census Corrected Twitter Model.
D’Amuri, F., Marcucci, J., & Billari, F. (April 6, 2013). Forecasting Births Using Google.
Presentation at PAA Annual Meeting, 2013, New Orleans, Session 155: Methods and Models in
Fertility Research.
Damore, D. F. (2002). Candidate Strategy and the Decision to Go Negative. Political Research
Quarterly, Vol. 55, No. 3, 669-685.
Damore, D. F. (2004). The Dynamics of Issue Ownership in Presidential Campaigns. Political
Research Quarterly, Vol. 57, No. 3, 391-397.
Elmelund-Præstekær, C. (2011). Issue ownership as a determinant of negative campaigning.
International Political Science Review, 209–221.
Ginsberg, J., Mohebbi, M., Patel, R., Brammer, L., Smolinski, M., & Brilliant, L. (2009). Detecting
influenza epidemics using search engine query data. Nature, Vol. 457, 19.
Google. (2011). Google trend help. Retrieved 13 August 2013 from How does Google Trends
work?: https://support.google.com/trends/answer/87276?hl=en
Graefe, A., & Armstrong, J. S. (2013). Forecasting Elections from Voters' Perceptions of
Candidates' Ability to Handle Issues. Journal of Behavioral Decision Making, Volume 26, Issue 3,
295–303.
Green, J. (November 29, 2012). The Science Behind Those Obama Campaign E-Mails.
BusinessWeek.
Green, J., & Hobolt, S. B. (2008). Owning the issue agenda: Party strategies and vote choices in
British elections. Electoral Studies.
Harrison, S., & Bruter, M. (2011). Mapping Extreme Right Ideology: An Empirical Geography of the
European Extreme Right. Palgrave Macmillan.
Holtz‐Bacha, C., Lee Kaid, L., & Johnston, A. (1994). Political Television Advertising in Western
Democracies: A Comparison of Campaign Broadcasts in the United States, Germany, and France.
Political Communication, 11:1, 67-80.
Ivaldi, G. (September 2007). Presidential Strategies, Models of Leadership and the Development of
Parties in a Candidate-Centred Polity: The 2007 UMP and PS Presidential Nomination
Campaigns. French Politics, 253-277.
Jansen, B. J., Zhang, M., Sobel, K., & Chowdury, A. (2009). Twitter power: Tweets as electronic
word of mouth. Journal of the American Society for Information Science and Technology,
60:120.
Jeambar, D. (2012). Explication de vote pour François Hollande (Explanation of the Vote for
François Hollande). Le Débat, Gallimard.
Kuhn, R. (12/2010). 'Les médias, c'est moi.' President Sarkozy and news media management.
French Politics Volume 8, Issue 4, 355-376.
Labbé, C., Labbé, D., & Hubert, P. (December 2004). Automatic Segmentation of
Texts and Corpora. Journal of Quantitative Linguistics, 11-3, 193-213.
Labbé, D., & Monière, D. (2012). Radioscopies de la campagne présidentielle 2012: La course de
fond des candidats à l’élection présidentielle. www.trielec2012.fr (research notes).
Lui, C., Metaxas, P. T., & Mustafaraj, E. (2011). On the predictability of US elections through search
volume activity. Department of Computer Science, Wellesley College, Wellesley, MA 02481.
McLaren, N., & Shanbhogue, R. (2011). Using Internet Search Data as Economic Indicators. Bank
of England Quarterly Bulletin, Second Quarter.
O’Connor, B., Balasubramanyan, R., Routledge, B. R., & Smith, N. A. (2010). From Tweets to
Polls: Linking Text Sentiment to Public Opinion Time Series. Proceedings of the International
AAAI Conference on Weblogs and Social Media, Washington.
Pollard, T. D., Chesebro, J. W., & Studinski, D. P. (2009). The Role of the Internet in
Presidential Campaigns. Communication Studies Volume 60, Issue 5, 574-588.
Preis, T., Moat, H. S., & Stanley, H. E. (25th April 2013). Quantifying Trading Behavior in
Financial Markets Using Google Trends. Nature.
Reilly, S., Richey, S., & Taylor, J. B. (2012). Using Google Search Data for State Politics
Research: An Empirical Validity Test Using Roll-Off Data. State Politics & Policy Quarterly.
Sano, Y., Yamada, K., Watanabe, H., Takayasu, H., & Takayasu, M. (January 2013). Empirical
analysis of collective human behavior for extraordinary events in the blogosphere. Physical
Review, 87.
Serfaty, V. (May 2010). Web Campaigns: Popular Culture and Politics in the U.S. and French
Presidential Elections. Culture, Language, Representations, 115-129.
Seybert, H. (2012). Internet use in households and by individuals in 2012. Eurostat statistics in focus.
Sunstein. (2007). The blogosphere: Neither Hayek nor Habermas. Public Choice, 87-95.
Brouard, S., & Zimmermann, S. (21 April 2012). Les pratiques médiatiques des
Français pendant la campagne présidentielle 2012. Centre E. Durkheim, Sciences Po Bordeaux.
Troy, G., Perera, D., & Sunner, D. (2012). Electronic Indicators of Economic Activity. Australian
Reserve Bank - Economic Bulletin June, 1-12.
Tumasjan, A., Sprenger, T. O., Sandner, P. G., & Welpe, I. M. (2010). Predicting Elections with
Twitter: What 140 Characters Reveal about Political Sentiment. 4th International AAAI
Conference on Weblogs and Social Media.
Vaccari, C. (2008). Surfing to the Élysée: The Internet in the 2007 French Elections. French
Politics 6, 1–22.
Vosen, S., & Schmidt, T. (2011). Forecasting Private Consumption: Survey-Based indicators vs.
Google Trends. Journal of Forecasting, 30, 565–578.
Table of illustrations
Figure 1: Histogram of internet usage (% of individuals), 2012 (Eurostat)
Figure 2: Frequency of monitoring information on the Internet
Figure 3: Types of websites used for information on the election campaign. Source: TNS Sofres – TriÉlec, September 2012 - March 2012
Figure 4: Google Trends Dashboard - weekly Google queries for the five main candidates between October 2011 and March 2012
Figure 5: Google Trends Dashboard - weekly Google queries for the five main candidates between October 2011 and March 2012
Figure 6: Google Trends Dashboard - weekly Google queries for the four smallest candidates between October 2011 and March 2012
Figure 7: Google Trends for all candidates during the six months preceding the election, rescaled by the author (% of all queries for candidates)
Figure 8: Aggregation of all the polls during the six months preceding the election (TNS, Harris, OpinionWay, Ifop, CSA, BVA, LH3)
Figure 9: Comparison of weekly Google Trends for all candidates with polls from all institutes (TNS, Harris, OpinionWay, Ifop, CSA, BVA, LH3)
Figure 10: Variances of Google Trends and polls
Figure 11: χ² sum = statistical distance between Google Trends and polls
Figure 12: Autocorrelograph of polls (IFOP) for Nicolas Sarkozy between October 2011 and April 2012
Figure 13: Autocorrelograph of Google Trends for François Hollande between October 2011 and April 2012
Figure 14: Autocorrelograph of Google Trends for Marine Le Pen between October 2011 and April 2012
Figure 15: Regression of the model for Nicolas Sarkozy (Stata)
Figure 16: Regression of model 1 for Philippe Poutou (Stata)
Figure 17: Regression of model 1’ with extra lags for Philippe Poutou (Stata)
Figure 18: Regression of model 2 for François Hollande (Stata)
Figure 19: Fitted values for Nicolas Sarkozy based on the regression of model 1
Figure 20: Predictions of model 1 with daily IFOP polls and Google Trends
Figure 21: Three phases of the campaign
Figure 22: Daily Google searches for main candidates at the end of year 2011
Figure 23: Daily Google searches for five main candidates from January - 15th March 2012
Figure 24: Daily Google searches for five main candidates from 15th March 2012 - 22nd April (election day)
Figure 25: Sarkozy and UMP vocabulary growth in press since January 1, 2012. Number of new words (in thousands), variable centered and reduced
Figure 26: Daily Google searches for four of the minor candidates from November 2011 to 22nd April (election day)
Figure 27: Daily Google searches for small candidates (less than 2% of vote) from 15th March 2012 - 22nd April (election day)
Figure 28: Fixed effect panel data regression output
Figure 29: Correlation between polls and Google Trends and their lags
Figure 30: Interest for elections and campaigns in France from Oct 2011 to May 2012
Figure 31: Interest in Politics in France from Oct 2011 to May 2012
Tables
Table 1: Results of the first round of the election (22nd April 2012)
Table 2: Individuals who used the internet, 2012
Table 3: Relative weight of internet search engines in France in March 2013. Source: AT Internet
Table 4: The primary source of political information during election campaigns in 2007 and 2012
Table 5: Second source of political information during the 2007 and 2012 election
Table 6: Estimate of the number of monthly Google queries (in thousands) in 2013. Source: Google AdWords
Table 7: Predictions of election results with the two models
Table 8: P-values of z-test for samples of 1000 people
Table 9: Distance between the prediction or polls and the election results
Table 10: Average daily Google queries during the 1st phase
Appendices
A. Fixed effect panel regression: inconclusive method
After reshaping the polls and Google Trends data for each candidate (numbered from 1 to 10 as in
Table 1 of this paper) into long format with Stata, it is interesting to perform a fixed-effects panel
data regression.
FIGURE 28: FIXED EFFECT PANEL DATA REGRESSION OUTPUT
These results are difficult to interpret politically without the fixed effects for each candidate.
However, we can already note that the coefficient for Google Trends is significantly positive
(t-statistic of 5.42). This result is quite satisfactory because it suggests that Google Trends have
explanatory power over the polls. The correlation table below confirms the link
between Google Trends and polls (overall correlation of 0.62 for all candidates).
. xtset candidat date2
       panel variable:  candidat (strongly balanced)
        time variable:  date2, 03nov2011 to 22apr2012
                delta:  1 day

. xtreg poll L.poll trend L.trend, fe

Fixed-effects (within) regression            Number of obs      =      1526
Group variable: candidat                     Number of groups   =         9

R-sq:  within  = 0.2086                      Obs per group: min =       168
       between = 0.9834                                     avg =     169.6
       overall = 0.9190                                     max =       170

                                             F(3,1514)          =    132.99
corr(u_i, Xb)  = 0.9440                      Prob > F           =    0.0000

        poll |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
        poll |
         L1. |   .2064679    .011519    17.92   0.000    .1838731    .2290628
       trend |
         --. |   .0438295   .0080849     5.42   0.000    .0279708    .0596883
         L1. |  -.0085312   .0080348    -1.06   0.289   -.0242917    .0072294
       _cons |   .0562141   .0010687    52.60   0.000    .0541178    .0583105
     sigma_u |  .07077958
     sigma_e |  .01220537
         rho |  .97112246   (fraction of variance due to u_i)

F test that all u_i=0:   F(8, 1514) = 523.68          Prob > F = 0.0000
FIGURE 29: CORRELATION BETWEEN POLLS AND GOOGLE TRENDS AND THEIR LAGS
The panel data regression was not used in the rest of this paper because Google Trends data is
standardized over time and both the polls and the Google Trends for each candidate are given as a
percentage of the total, which erases any time or cross-sectional effects. We have therefore studied
the data for each candidate as autonomous time series and then compared the predictions.
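The effect of expressing each series as a percentage of the total can be seen in a small sketch. The figures and the function name `to_shares` are purely illustrative: when every candidate's raw volume doubles on a given day, the shares do not move, so a shock common to all candidates on that day disappears from the data.

```python
def to_shares(raw_by_candidate):
    """Convert raw daily values per candidate into shares of the daily total."""
    names = list(raw_by_candidate)
    n_days = len(raw_by_candidate[names[0]])
    totals = [sum(raw_by_candidate[c][d] for c in names) for d in range(n_days)]
    return {
        c: [raw_by_candidate[c][d] / totals[d] for d in range(n_days)]
        for c in names
    }

# Hypothetical raw volumes: on day 2 overall interest doubles for everyone.
raw = {"candidate_a": [10, 20], "candidate_b": [30, 60]}
shares = to_shares(raw)
daily_sums = [sum(shares[c][d] for c in shares) for d in range(2)]
```

Since the shares add up to one on every day, a day-specific effect common to all candidates cannot be identified from them, which is the difficulty described above.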
. pwcorr poll trend L.poll L.trend, sig star(.001)

             |     poll    trend   L.poll  L.trend
        poll |   1.0000
       trend |   0.6109*  1.0000
             |   0.0000
      L.poll |   0.9719*  0.5918*  1.0000
             |   0.0000   0.0000
     L.trend |   0.6174*  0.9357*  0.6109*  1.0000
             |   0.0000   0.0000   0.0000
B. Predicting the turnout with Google Trends
The Category filter of Google Trends shows the change over time of a category of queries as a
percentage of growth with respect to the first date on the graph (or the first date that has data).
Instead of a 0-100 label on the y-axis of the category comparison graph, the range is -100% to
+100%, with a starting point at 0.
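The growth measure described above can be written out as follows. This is a sketch of the arithmetic only, under the assumption that growth is computed as a simple percentage change from the first available value; it does not reproduce whatever scaling or capping Google applies to keep the graph between -100% and +100%, and the input values are invented.

```python
def growth_since_first(values):
    """Percentage growth relative to the first date that has data.

    Missing observations are represented by None and stay None.
    The first available value must be non-zero.
    """
    base = next(v for v in values if v is not None)
    return [None if v is None else 100.0 * (v - base) / base for v in values]

# Hypothetical category volumes; the first date has no data.
growth = growth_since_first([None, 50, 75, 25])
```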
The study of elections requires at least weekly results which limits Google Trends to 3 years at a
time (for a longer period, Google Trends only gives monthly results). Periods of two years can
very easily be linked together by continuity.
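One possible reading of linking periods "by continuity" is to rescale the later window so that it agrees with the earlier one over the dates they share. The sketch below illustrates that reading with invented numbers; the function name and the choice of matching the sums over the overlap are assumptions, not the procedure documented in this paper.

```python
def link_by_continuity(first, second, overlap):
    """Join two index series that share `overlap` dates.

    The last `overlap` points of `first` and the first `overlap` points of
    `second` cover the same dates. `second` is rescaled so that the two
    windows have the same total over the overlap, then appended.
    """
    tail = first[-overlap:]
    head = second[:overlap]
    factor = sum(tail) / sum(head)
    return first + [v * factor for v in second[overlap:]]

# Hypothetical weekly indices: the windows share one week (40 vs 80).
linked = link_by_continuity([10, 20, 40], [80, 100, 50], overlap=1)
```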
Looking at interest in Politics in France, it would be interesting to predict the turn-out of the
elections.
FIGURE 30: INTEREST FOR ELECTIONS AND CAMPAIGNS IN FRANCE FROM OCT 2011 TO MAY
2012
FIGURE 31: INTEREST IN POLITICS IN FRANCE FROM OCT 2011 TO MAY 2012
C. Comparing candidates’ websites now and in 2007: bridging the gap?
The data reveals that, despite the media hype, online electioneering in France is still at an
intermediary stage, especially in terms of participation tools. Significant differences were found
among candidates and, especially, parties. The gap between large and small parties is found to be
greater than in most of similar country studies, thus providing new evidence against the internet's
ability to level the political playing field. Distinctive patterns of online electioneering emerge
between conservative and progressive parties and candidates (Vaccari, 2008).
                      PS   Royal   UMP   Sarkozy   UDF   Bayrou   FN   Le Pen
Information (%)       72   60      81    63        41    50       63   60
Participation (%)     87   70      67    60        37    53       23   3
Professionalism (%)   70   65      61    74        39    61       61   43
Overall quality (%)   76   65      70    66        39    55       49   36
MAJOR PRESIDENTIAL CANDIDATES’ SITES AND THEIR PARTIES’ SITES, APRIL 2007 (VACCARI, 2008) [8]
[8] “The information macro-section accounts for ‘pull’ (user-initiated) and ‘push’ (party-initiated)
information supply, and targeting of different groups of voters via dedicated tools. Participation
entails online interactivity, resource mobilization, and decentralization of communication. Finally,
professionalism is measured with respect to design and multimedia features, site accessibility,
navigability, and frequency of updates.”
PERFORMANCE OF CANDIDATES’ AND PARTIES’ WEBSITES DURING THE 2007 PRESIDENTIAL
CAMPAIGN.
[Bar chart: Information (%), Participation (%), Professionalism (%) and Overall quality (%), on a 0 to 100 scale, for PS, Royal, UMP, Sarkozy, UDF, Bayrou, FN and Le Pen]
AVAILABILITY OF CANDIDATES’ WEBSITES (BOURDEAU, 2012)
RESPONSE TIME (IN SECONDS) OF THE CANDIDATES’ WEBSITES (BOURDEAU, 2012)
[Bar chart: 0 to 6 scale, for Bayrou, Joly, Sarkozy, Mélenc…, Hollande and Le Pen]
Source: Compuware 2012
D. Weekly Google Trends and timeline of the main events during the campaign.
31st December: the New Year greetings of the president.
29th January 2012: big television interview as president, in which he announced a VAT increase.
15th February: Nicolas Sarkozy announces his candidacy.
2nd March: Sarkozy at the opening of the Olympic games and changes his government team.
7th March: TV interview and debate with Laurent Fabius on France 2.
12th March: TV show “Paroles de candidat” on TF1 with a panel of voters.
[Chart: Google Trends for candidates during the six months before the election, 06/11/2011 to 06/04/2012, 0% to 100%. Series: sarkozy, bayrou, hollande, melenchon, joly, le pen, cheminade, dupont-aignan, arthaud, poutou]
[Chart: Google Trends for 5 main candidates during the six months before the election (shares of weekly queries), 06/11/2011 to 06/04/2012, 0% to 60%. Series: sarkozy, bayrou, hollande, melenchon, le pen]

  • 3. 3 Table Of Content Abstract ............................................................................................................................................ 2 I. Introduction............................................................................................................................... 5 II. Literature review and background ............................................................................................ 7 A. The use of the Internet in political campaigns: the US as forerunner ............................... 7 B. Google Trends and political science ..................................................................................... 8 C. Literature on forecasting....................................................................................................... 9 1. Public opinion and election forecasting with Twitter ....................................................... 9 2. Forecasts based on Google Trends: the examples of medical research and then economics. .............................................................................................................................. 10 D. Background to the 2012 French presidential election..................................................... 11 1. French presidential election ............................................................................................ 11 2. The Internet in French Politics........................................................................................ 12 III. Forecasting the election results with Google Trends .......................................................... 17 A. Data collection ................................................................................................................ 17 1. Google Trends................................................................................................................. 17 2. 
Polls................................................................................................................................. 21 B. First findings from the comparison of polls and Google trends ......................................... 23 1. Information surge for small candidates in the last 6 weeks ............................................ 23 2. Big variance in Google searches ..................................................................................... 25 3. Correlation between Google Trends and polls................................................................ 25 4. Comparison of weekly Google Trends with final election results. ................................. 26 C. Methodology....................................................................................................................... 28 1. “Nowcasting” the results of the election with Google Trends based on daily polls. Method developed by Hyunyoung Choi and Hal Varian (2011)............................................ 28 2. Models............................................................................................................................. 29 D. Results............................................................................................................................. 32
  • 4. 4 1. Predictions with model 1................................................................................................. 32 2. Comparing predictions of election results....................................................................... 33 3. Testing the accuracy of the predictions using a one-sample z-test ................................. 34 4. Statistical distance (Χ2 sum) between model predictions and election results................ 35 IV. Analysis of the French presidential Campaign in the light of Google Trends: issue ownership theory and candidate strategy given institutional constraints. ...................................... 36 A. Theoretical and institutional background for the analysis .............................................. 36 1. Institutional rules on media coverage during the presidential campaign ........................ 36 2. Issue ownership theory.................................................................................................... 37 B. Political strategies through the multiple phases of the campaign observed on Google Trends ......................................................................................................................................... 37 1. 1st Phase: The Incumbent Advantage.............................................................................. 38 2. 2nd Phase: Competition and Issue Ownership ................................................................. 39 3. Third Phase: Information on minor candidates............................................................... 41 V. Conclusion .......................................................................................................................... 45 A. Google Trends provides slightly better predictions but has its limits. ............................ 45 B. Timing as the key to issue ownership ................................................................................. 
46 Bibliography................................................................................................................................... 47 Table of illustrations....................................................................................................................... 51 Tables ............................................................................................................................................. 53 Appendices ..................................................................................................................................... 54 A. Fixed effect panel regression: inconclusive method ....................................................... 54 B. Predicting the turnout with Google Trends......................................................................... 56 C. Comparing websites now and in 2007: bridging the gap ?................................................. 57 D. Weekly Google trends and Timeline of the main events during the campaign. ............. 59
  • 5. 5 I. Introduction As the Internet’s role is increasing in people’s lives, politicians have understood the importance of campaigning on the web. Political scientists have studied how parties and leaders use the Internet as a new means of communication to mobilize, raise funds or communicate. However, less academic work has used the Internet as a way to predict the outcome of an election or analyse a campaign. In Western democracies, the Internet has become the second medium of information during political campaigns. Nevertheless, the knowledge of voters’ political preferences as well as their reaction to political proposals or debates comes essentially from real world data studies such as polls. Parties and media have an ever growing appetite for information on public opinion but the intrusiveness of these real world studies makes them costly and often biased. The first platform of access to the Internet being the web search engine, significant information is accessible through analysing web queries. This paper aims to explore how Google Trends data can be used to increase the accuracy of election predictions as well as to analyse the key steps of a campaign. It will focus on the case study of the French Presidential election in 2012 but the methods used, which are inspired by economics and medical research, could be applied to most elections. The only existing political science literature using Google Trends is focused on the U.S. and hasn’t used any predictive econometric model. The French election offers the advantage of staging ten candidates competing nationally which provides a vast amount of data. The first round of the French presidential election is particularly interesting for a Google Trends analysis because the universal suffrage ensures the candidates are well known among the population and that therefore there are enough Google queries for each candidate to have good Google Trends outputs. 
In addition, the multiplicity of candidacies means there are more polls and Google Trends to compare which increases the likelihood of obtaining significant results. Finally, the demand for election predictions is twofold. In France, it is forbidden to publish polls after the end of the official political campaign, several days before the elections, to prevent polls from influencing voters. Poll institutes are looking for less intrusive and less costly information on voters. Google Trends could allow them to conduct fewer interviews
  • 6. 6 and rely on internet queries to give a revised estimate of their last poll over several days instead of producing a costly new poll every day1 . France’s very stringent institutional campaign rules offer a unique setting to analyze how candidates’ strategies are constrained and how voters react to equal airtime and speaking time for each candidate. Following on from the vast literature on the incumbent advantage and on campaign issue ownership, this paper analyses the Google Trends queries to identify the different steps of the campaign and the impact of candidates’ strategies on voters. Given the institutional cap on campaign budgets in France, choosing the proper moment to allocate candidate’s resources is a key to controlling the campaign agenda and performing well in the election. After a literature review, this paper applies Choi and Varian’s ‘forecasting the present’ method (2011) to predict the election results from daily Google Trends and daily IFOP polls. Then Google Trends are used to identify the turning points of the campaign, highlight the impact of institutional constraints of the campaign and analyse the political strategies and communication in terms of issue ownership and negative narrative. 1 The cost of polling has been a big issue in the recent political debate because the right wing party was accused of benefiting from its incumbent position to use public money to pay for the polls it needed for its campaign.
  • 7. 7 II. Literature review and background A. The use of the Internet in political campaigns: the US as forerunner The Internet has been widely used in election campaigns in the last ten years. As often in information and technology, the US has been a leader. The 2004 US presidential campaign was the first to demonstrate the potential power of the Internet in influencing campaign processes, if not election outcomes (Vaccari, 2008). Endres and Warnick (2004) report that “by early summer 2003, some ten presidential campaigns had already established an active Web presence for the 2004 presidential race”. Many academics have analyzed the 2008 US presidential campaign as a turning point. Barack Obama had thousands of volunteers to maintain his website, mobilize voters and to organizing fund-raising. However, the new use of the Internet as a political tool started with the Democrat primaries. The Democrat candidates first used their websites to “create ideological unity, involvement, and commitment among their supporters” (Pollarda, Chesebro, & Studinski, 2009). In other words, a candidate’ website ensures the core functions of the campaign: motivate supporters for concrete action and “convert undecided into supporters” (Bimber and Davis, 2003). In addition to classical political objectives, websites were also used for fundraising. In that sense, the US presidential election invented wide scale crowd-funding, a new source of financing based on many very small donors (or investors). The amounts of money raised in the 2008 and 2012 US presidential elections are unprecedented. According to BuisnessWeek, Barack Obama raised 690 million dollars online in 2012 out of the 720 million dollars of individual campaign spending (Green, November 29, 2012). The Internet has not increased activism massively but it has made it easier for small organizations to contribute to a campaign and mobilize their members. 
One of the main debates left unanswered is whether the lowering of the entrance cost has created a level playing field or widened the gap between rich and poor candidates. The answer to this debate certainly depends on the country. Indeed, until 2003 most of the research had failed to show any impact of new technologies on political participation in other Western democracies (Ward and Vedel, 2006). After the 2008 presidential campaign, there has been much debate among academics about the impact of the Internet use on civic and political engagement. Shelley Boulianne (2009) did a meta-analyses of
  • 8. 8 research in which she found that the Internet had an impact on engagement which decreases with time. Margolis and Resnick’s (2000) famously asserted “that ‘politics as usual’ would prevail online and that a process of ‘normalization’ would empty the internet of most of its innovative potential” (Vaccari, 2008). The situation is very different in two-party systems such as the US compared to multi-party systems. In the USA, Republican and Democrat parties and candidates regularly outperform their minor counterparts (Schweitzer, 2005). In France, “a quantitative study on French parties’ websites conducted in 2000 (Sauger, 2002) found that relatively small forces, such as the Parti Communiste Français (PCF), the Greens and the Union pour la Démocratie Française (UDF) outperformed the three largest parties: the Gaullist Rassemblement pour la République (RPR), the Parti Socialiste (PS), and the Front National (FN).” (Vaccari, 2008) More generally, the adoption of information and communication technologies (ICTs) by political parties has been found to depend on three broad factors: the technological development, the socio- political environment (including electoral laws, types of election, and party system structure), and internal variables (such as party resources, incentives, and philosophical orientation) (Nixon et al., 2003). Thus, the factors that influence political actors’ adoption of ICTs seem to be more nuanced than the mere availability of resources. More surprisingly, the internet was widely used during the 2008 Democrat primaries to provide “a foundation for tracking, if not predicting, the success of specific candidates at different stages in the campaign process”(Pollarda, Chesebro, & Studinski, 2009). Indeed, forecasts of election outcome were inaccurate (Jurkowitz, 2008; Kohut, 2008). Pollarda & al. describe digital technologies as “unobtrusive measures of political success”. 
For example they argue that the number of followers on MySpace.com or on Facebook and the number of hits on the candidates’ websites are better predictors than polls of political success. B. Google Trends and political science Google Trends has not been used extensively to analyze election results and all the research has been done on US politics. Catherine Lui, Panagiotis T. Metaxas and Eni Mustafaraj analyzed the 2008 and the 2010 US Congressional elections and compared the Google searches in each state to the voting results. They only use the raw Google trend data as a predictor: the candidate with the most Google
  • 9. 9 queries is predicted as the winner. Using this very basic model, they do not find that Google Trends is a good predictor except in very specific cases. However the analysis of incumbency effects and comparison with the polls of the New York Times give more interesting results (Lui, Metaxas, & Musta, 2011). Reilly & al. show that Google searches for state ballots one week before the 2008 US Presidential election are well correlated with the actual participation to these ballots (Reilly, Richey, & Taylor, 2012). The authors test two ways of searching—by ballot name or by ballot topic— to correct possible limitations of Google Trends. Most notably, search volumes depend a lot on the search term used to generate the query shares as well as the “temporal coding decision” (monthly, weekly or daily data). These studies both underline the potential of Google Trends in predicting election results but the mere calculation of correlations between Google queries and election results isn’t sufficient. It is therefore necessary to look at how other disciplines have made the most of Google queries and apply their methods to election forecasting. C. Literature on forecasting 1. Public opinion and election forecasting with Twitter Academics have studied twitter as a proxy for public opinion. O’Connor, Balasubramanyan, Routledge, & Smith (2010) find a high correlation between surveys and Twitter messages on political opinion in the US in 2008 and 2009. Given that twitter “replicates consumer confidence and presidential job approval polls”, they believe Twitter analysis could be a supplement to polling. However simple correlations are not conclusive enough and tweets alone cannot give better results than polling because of higher uncertainty. Tumasjan, Sprenger, Sandner, & Welpe (2010), made a more advanced Twitter analysis of the German elections. 
Not only do they find that the number of messages mentionning a candidate or a party reflects the election results, but even ‘joint mentions’ of two parties reflect the coalitions that form after the elections. These results are very robust because they take into accout the difference between original tweets and those retwitted. Although Sunstein (2007) had described the blogosphere as lacking a “pricing system”, Welpe and al. (2010) find that “The size of the followership and the rate of retweets may represent the Twittersphere's “currency” and provide it with its own kind of a pricing mechanism.”
  • 10. 10 Previous studies ((Albrecht, Lübcke, & Hartig-Perschke, 2007); (Jansen, Zhang, Sobel, & Chowdury, 2009)) find contradicting results but this can be explained by the increasing number of twitter users through the years. Indeed in 2005 small parties were always found to be over represented on twitter compared to the real political world. The increasing number of users makes it a better indicator of political opinion. A more recent unpublished study used the “forecasting the present” method (Choi, H.and Varian, H., 2011) to predict the US Presidential Election 2012 using twitter data (Choy, Cheong, Nang Laik, & Ping Shung, 2012). 2. Forecasts based on Google Trends: the examples of medical research and then economics. Google Trends was first famously used to detect influenza epidemics in the United States (Ginsberg, J, Mohebbi, M., Patel, R, Brammer, L., Smolinski, M., Brilliant, L., 2009 ). Although these epidemics have been studied through real world questionnaires, the use of Google Trends gave very accurate results with only one day lag and at almost no cost. The method established a relation between the number of Google queries for influenza symptoms and the number of cases by looking at the past five years and then applied successfully the equation to forecast the present. In 2009, Choi and Varian developed a new methodology to use web queries data as additional information to predict the current value of an economic indicator which is usually published monthly or quarterly: “Use data you already have and add the data trends as incremental … to help in predicting the present. For example, the volume of queries on automobile sales during the second week in June may be helpful in predicting the June auto sales report which is released several weeks later in July.” Hal R. 
Varian, Google Chief Economist They used this method that they call ‘predicting the present’ to forecast consumer behaviour (Retail and Automotive sales, Travel destinations) (Choi, H.and Varian, H., 2011). This method was then used to estimate the initial claims for unemployment benefits in the US (Askitas, Nikolaos, and Klaus F. Zimmermann, 2009) and private consumption (Vosen & Schmidt, 2011).
  • 11. 11 Central Banks are starting to study the possibility of using Google Trends in addition to more standard indicators of economic activity (the Australian Reserve Bank (Troy, Perera, & Sunner, 2012) and the Bank of England (McLaren & Shanbhorge, 2011)). In recent months, Google Trends has been used with this method in new fields such as birth forecasting (D’Amuri, Marcucci, & Billari, Forecasting Births Using Google., April 6, 2013) and trading behavior in financial markets (Preis, Moat, & Stanley, 25th April 2013). D. Background to the 2012 French presidential election 1. French presidential election The French presidential election is a two round election, also called run-off voting, in which a candidate can only be elected if he obtains the majority of votes. If no candidate gets the majority of votes in the first round, the two best candidates are qualified for a second round. In order to become candidates, citizens must obtain 500 signed nominations from elected officials. There are 45,543 elected officials who can only support one candidate. Among the 500 nominations, no more than 10% can come from the same départment and a candidate must have nominations from at least 30 different departments. In 2012, the first round of the presidential election was held on 22nd April 2012. There were 10 candidates: N° Candidate Number of votes % registered % votes 1 Mme JOLY Eva 828 345 1,8 2,31 2 Mme LE PEN Marine 6 421 426 13,95 17,9 3 M. SARKOZY Nicolas 9 753 629 21,19 27,18 4 M. MÉLENCHON Jean-Luc 3 984 822 8,66 11,1 5 M. POUTOU Philippe 411 160 0,89 1,15 6 Mme ARTHAUD Nathalie 202 548 0,44 0,56 7 M. CHEMINADE Jacques 89 545 0,19 0,25 8 M. BAYROU François 3 275 122 7,12 9,13 9 M. DUPONT-AIGNAN Nicolas 643 907 1,4 1,79 10 M. HOLLANDE François 10 272 705 22,32 28,63 TABLE 1: RESULTS OF THE FIRST ROUND OF THE ELECTION (22ND APRIL 2012)
  • 12. 12 2. The Internet in French Politics
One of the main difficulties of using data on internet users is the risk of bias arising from the non-representativeness of Google users. Indeed, most studies show that users of political websites are more interested in politics and more partisan than the average population (Vaccari, 2008). However, most of this research dates back almost ten years, when the internet was less widely used. This part looks at statistics on the use of the Internet in French households, the use of Google in the French population and the use of the internet as a source of information during the 2012 campaign.
a) Internet use in households and by individuals in 2012
Eurostat provides very detailed information on the use of the Internet by European households. The 2012 report shows that 80% of the French population had access to the internet at home (Seybert, 2012).
France (%) | 2008 | 2010 | 2012
Households with an Internet connection | 62 | 74 | 80
Households with a broadband connection | 57 | 66 | 77
TABLE 2: INDIVIDUALS WHO USED THE INTERNET, 2012. SOURCE: EUROSTAT
FIGURE 1: HISTOGRAM OF INTERNET USAGE (% OF INDIVIDUALS), 2012 (EUROSTAT)
  • 13. 13 b) Google in France: a quasi-monopoly as search engine
In March 2013 in France, 91.7% of visits to websites initiated from search engines came from Google.
TABLE 3: RELATIVE WEIGHT OF INTERNET SEARCH ENGINES IN FRANCE IN MARCH 2013. SOURCE: AT INTERNET
The situation has not changed since the 2012 presidential election. Indeed, a study by Médiamétrie-eStat carried out in September 2011 finds similar results, with 90.5% of searches done on Google. This very large market share gives Google a quasi-monopoly. These figures support the idea of using internet searches as a proxy for public opinion.
c) Internet and the 2012 French elections
Sylvain Brouard and Simona Zimmermann (2012) have studied French media practices during the 2012 presidential campaign compared with those of 2007. The use of the Internet is on average more frequent (a 9-point increase), whereas the press declined: in 2012, the use of the national press lost 3 points and the use of the regional press lost 5 points. The use of television and radio remained stable. The use of the Internet as a primary source of information remains low but increased continuously between 2007 and 2012, from 5% to 13%. Meanwhile, the use of the national press declined steadily from 11% to 5% and that of the regional press from 9% to 3%. Thus, while the use of the Internet has doubled, that of the printed media has halved.
Year | Television | Radio | Internet | National printed | Regional printed | Free printed | Neither
2007 | 58 | 16 | 5 | 10 | 8 | 1 | 0
2012 | 57 | 16 | 14 | 7 | 4 | 1 | 1
  • 14. 14 TABLE 4: THE PRIMARY SOURCE OF POLITICAL INFORMATION DURING ELECTION CAMPAIGNS IN 2007 AND 2012. Sources: Baromètre Politique Français: September 2006 - February 2007; TNS Sofres - TriÉlec: September 2011 - March 2012
Radio, television and the Internet are almost equal as a second source of political information. During the 2012 campaign, 23% of respondents cited television as a second source of political information, 23% radio and 22% the Internet. Then come regional print media (15%), national newspapers (12%) and the free press (3%). Compared with the 2007 election campaign, the use of the internet as a second source of information is much higher in 2012 (up 12 points).
Year | Television | Radio | Internet | National printed | Regional printed | Free printed | Neither
2007 | 24 | 23 | 10 | 15 | 23 | 4 | 2
2012 | 23 | 23 | 22 | 12 | 15 | 3 | 2
TABLE 5: SECOND SOURCE OF POLITICAL INFORMATION DURING THE 2007 AND 2012 ELECTION. Sources: Baromètre Politique Français: September 2006 - February 2007; TNS Sofres - TriÉlec: September 2011 - March 2012
d) Frequency of internet use and type of sites browsed
Among people who use the Internet as a source of political information, the majority of respondents consult it daily (54%), while two-thirds (66%) look up information on the Internet at least 5 days a week. These figures are considerably lower than those for daily use of television or radio but are still high enough to consider that daily Google search queries are representative of the overall population.
  • 15. 15 FIGURE 2: FREQUENCY OF MONITORING INFORMATION ON THE INTERNET. SOURCE: TNS SOFRES – TRIÉLEC, SEPTEMBER 2011 - MARCH 2012
During the 2012 campaign, the websites most often used by internet users were generalist portals such as Google, Yahoo, Orange, etc. (53%). National media websites are used by a quarter of Internet users, whereas online newspapers are used by 13%.
FIGURE 3: TYPES OF WEBSITES USED FOR INFORMATION ON THE ELECTION CAMPAIGN. SOURCE: TNS SOFRES – TRIÉLEC, SEPTEMBER 2011 - MARCH 2012
  • 16. 16 e) Possible bias by political activists
In addition to the usual biases related to a possible lack of representativeness, Google Trends also suffers from an experiment bias. Indeed, Mustafaraj and Metaxas (2009) underline the fact that political activists have tried to influence web search results, “using link-bombing techniques to raise negative web pages with contents close to their agendas to the top-10 search results”. Google even admitted that this happened in the 2006 US elections. However, the authors do not find such effects of “gaming the search engines” in the 2008 US Congressional elections. This experiment effect is very common in behavioural economics, where people act unnaturally when they know they are part of an experiment. Here, the fact that the internet is becoming a focus of attention makes it vulnerable to individuals wanting to rig the number of searches for a candidate. However, the large increase in usage over the past five years decreases the likelihood that malicious individuals can introduce a bias.
  • 17. 17 III. Forecasting the election results with Google Trends
A. Data collection
The variables for poll results are formatted as firstname_lastname whereas the variables for Google Trends are formatted as lastname.
1. Google Trends
a) Google Trends raw data
Google Trends is a free and public service provided by Google Inc. that shows how often a term is searched on the search engine relative to the total search volume across a geographical region, for a given language and time period. The data is given relative to the total number of searches, corrected for the general use of Google. It is easily downloadable in CSV format, but the number of keywords is limited to 5 at a time.
FIGURE 4: GOOGLE TRENDS DASHBOARD - WEEKLY GOOGLE QUERIES FOR THE FIVE MAIN CANDIDATES BETWEEN OCTOBER 2011 AND MARCH 2012
For confidentiality reasons, Google does not disclose the absolute number of queries, and the results given by Google Trends are averaged, normalized and scaled. More specifically, Google performs the following adjustments. Firstly, in order to display results quickly, Google only analyses a portion of the total number of web queries over the period of time and geographic area chosen by the user. As a result, queries made by only a few users over a short period of time are eliminated from the results.
  • 18. 18 Secondly, Google normalizes the results, which means that “the sets of data are divided by a common variable to cancel out the variable's effect on the data” (Google, 2011). This eliminates the general “trend” resulting from the increase in Google’s usage as well as differences due to varying use of Google across regions. Finally, once the data has been normalized, it is scaled: Google divides all query numbers for the period considered by the highest number of queries for that particular keyword during that period. The results are then displayed as a percentage of the maximum for the period. Google Trends data has been available since January 2004. Depending on the time span requested and the popularity of the query, results are shown daily, weekly or monthly. However, Google provides no information on keywords that generate very few queries, and it does not disclose the minimum number of queries required to appear in Google Trends results. In conclusion, we will consider that Google Trends provides the likelihood of a random user searching for a particular keyword in a given location during a specified period of time.
b) Google Trends data for the 2012 presidential election
The data used in this paper consists of daily and weekly queries starting in October 2011, after the socialist primaries (16th October 2011) and the birth of Sarkozy and Carla Bruni’s child Giulia (19th October 2011), and running until the first round of the election (22nd April 2012). Google Trends only allows the comparison of 5 names at a time but, given that the results are displayed relative to the highest point of the sample, it is possible to compare as many keywords as necessary as long as the candidate with the maximum queries is kept as baseline in every comparison. For example, looking at Figure 4, Sarkozy has the highest peak (11-17 March 2012 [2]; marked 100 by Google Trends) between October 2011 and March 2012 and can be taken as baseline. Comparing each of the 9 other candidates with Sarkozy over the same time span puts all candidates on a common scale. It is important to exclude April 2012 from the time span because on election day the queries are so numerous that they change the scale and dwarf all previous months (see Figure 5).
[2] For the first time a poll predicted Sarkozy ahead of Hollande in the first round.
  • 19. 19 It is however possible to solve this problem by taking data from two different time spans and rescaling them to build an extension by continuity (also called concatenation).
FIGURE 5: GOOGLE TRENDS DASHBOARD - WEEKLY GOOGLE QUERIES FOR THE FIVE MAIN CANDIDATES BETWEEN OCTOBER 2011 AND MARCH 2012
One of the main drawbacks of Google Trends results is that small values are rounded. This is highly problematic for small candidates, who are always around 1%. To counter this difficulty, it is worth comparing the 4 smallest candidates with each other (Figure 6, which is much more precise for small candidates) and rescaling the results by continuity with the results for the main candidates. This method gives very precise results for small candidates.
FIGURE 6: GOOGLE TRENDS DASHBOARD - WEEKLY GOOGLE QUERIES FOR THE FOUR SMALLEST CANDIDATES BETWEEN OCTOBER 2011 AND MARCH 2012
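The scaling step and the rescaling "by continuity" described above can be illustrated with a minimal Python sketch. It rests on assumptions that are not in the paper: the function names `scale_to_peak` and `chain` are hypothetical, the raw counts are invented (Google never discloses them), and the rescaling factor is taken as the ratio of a shared keyword's totals in the two downloads, which is one possible way of linking two batches.

```python
def scale_to_peak(batch):
    """Express every value of a batch as a percentage of the batch's largest
    value, as the final scaling step described above does."""
    peak = max(v for series in batch.values() for v in series)
    return {name: [100.0 * v / peak for v in series]
            for name, series in batch.items()}


def chain(reference, other, link):
    """Rescale batch `other` onto the scale of batch `reference`.

    Both batches must contain the keyword `link` over the same dates; the
    factor is the ratio of the link keyword's totals in the two batches.
    """
    factor = sum(reference[link]) / sum(other[link])
    return {name: [factor * v for v in series]
            for name, series in other.items()}


# Hypothetical raw weekly query counts (assumption: invented numbers).
raw_main = {"sarkozy": [50, 200, 100], "joly": [4, 10, 6]}
raw_small = {"joly": [4, 10, 6], "poutou": [1, 2, 3]}

main = scale_to_peak(raw_main)      # sarkozy peaks at 100
small = scale_to_peak(raw_small)    # joly peaks at 100 in this batch
small_on_main_scale = chain(main, small, link="joly")
```

In this toy example the small batch keeps more resolution for minor candidates, and after chaining, `small_on_main_scale["poutou"]` is expressed on the same scale as the main batch. Real Trends exports are rounded, so the link would only be approximate.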
  • 20. 20 Once the Google Trends series for all candidates have been rescaled and concatenated, the queries are expressed as a percentage of the sum of queries for all candidates (Figure 7). This facilitates the comparison with polls and election results, which are also expressed as shares of the total. In addition, it removes the increase in the number of queries linked to growing political interest over the campaign.
FIGURE 7: GOOGLE TRENDS FOR ALL CANDIDATES DURING THE SIX MONTHS PRECEDING THE ELECTION, RESCALED BY THE AUTHOR (% OF ALL QUERIES FOR CANDIDATES)
[Line chart; vertical axis 0% to 60%; horizontal axis 06/11/2011 to 06/04/2012; series: sarkozy, bayrou, hollande, melenchon, joly, le pen, cheminade, dupont-aignan, arthaud, poutou]
c) Estimation of the average number of searches per month (with AdWords)
AdWords is a service provided by Google for businesses to increase the visibility of their websites. Although the service is paid, the website gives free estimates of the absolute number of queries for a given keyword.
Keyword | Monthly queries (thousands)
SARKOZY | 550
HOLLANDE | 1200
LE PEN | 1000
MELENCHON | 240
JOLY | 256
TABLE 6: ESTIMATE OF THE NUMBER OF MONTHLY GOOGLE QUERIES (IN THOUSANDS) IN 2013. SOURCE: GOOGLE ADWORDS
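The conversion of rescaled Trends values into shares of all candidate queries, as plotted in Figure 7, might look as follows. This is a sketch under assumptions: the function name `query_shares` and the sample values are hypothetical.

```python
def query_shares(trends):
    """Convert rescaled Trends values into each candidate's share of all
    candidate queries, date by date."""
    names = list(trends)
    n_dates = len(trends[names[0]])
    shares = {name: [] for name in names}
    for t in range(n_dates):
        total = sum(trends[name][t] for name in names)
        for name in names:
            # Guard against a date on which no query at all was recorded.
            shares[name].append(trends[name][t] / total if total else 0.0)
    return shares


# Hypothetical rescaled values for two candidates over two dates.
example = query_shares({"a": [10.0, 30.0], "b": [30.0, 30.0]})
```

Because each date is divided by its own total, a general rise in political interest raises all series together and leaves the shares unchanged, which is the property the text relies on.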
  • 21. 21 Taking these figures into account, we can estimate the number of searches for all the candidates at 3 million per month, that is 100,000 per day (the five keywords of Table 6 alone total about 3.2 million per month). This enormous number of searches highlights the potential of Google Trends compared with polls, which are usually based on 1000 interviews and conducted every week. Google Trends gives access to the behaviour of one hundred thousand people every day for free. By way of comparison, in France, polls cost 1€ per person and per question (IFOP website), so a poll costs several thousand euros. Learning how to use the results of Google Trends would give more precise information (based on 100,000 queries), more frequently and at a lower price.
2. Polls
a) Gathering all French polls
Most French poll institutes provide the results of their polls on their websites. Gathering the data of the seven biggest poll institutes [3] for the 2012 French presidential campaign gives the following graph.
[3] TNS, Harris, OpinionWay, IFOP, CSA, BVA and LH3
  • 22. 22 FIGURE 8: AGGREGATION OF ALL THE POLLS DURING THE SIX MONTHS PRECEDING THE ELECTION (TNS, HARRIS, OPINIONWAY, IFOP, CSA, BVA, LH3)
[Line chart; vertical axis 0 to 40; horizontal axis 20/10/2011 to 20/04/2012; series: Nathalie_Arthaud, Philippe_Poutou, Jean_Luc_Mélenchon, François_Hollande, Eva_Joly, François_Bayrou, Nicolas_Sarkozy, Nicolas_Dupont_Aignan, Marine_Le_Pen, Jacques_Cheminade]
b) Restriction to daily IFOP polls only for regressions
Although it is useful to gather data from different poll institutes, the regressions and predictions that follow are based on IFOP polls only. Indeed, it is impossible to perform time series regressions when several polls are published on the same day. Moreover, different poll institutes use different methods to correct the biases in their samples, so using different polls in the same time series would have added undesired discontinuities. The IFOP (Institut Français d’Opinion Publique, or French Public Opinion Institute) published very frequent polls during the last six months before the election held on 22nd April 2012. Between 3rd November and 20th April, 129 polls were conducted in 171 days. More specifically, between January and April 2012, IFOP carried out a poll on every working day. In 2011, however, the polls were a little less frequent. In order to perform the regression daily and to use all the data provided by Google Trends, the daily time series of polls
  • 23. 23 was completed using the value of the last existing poll for any missing value (when no poll was conducted). It is important to note that IFOP runs a “rolling poll”, which means that the poll published every day represents the accumulated results of the last three days. About 350 people are interviewed every day, but each daily publication is the result of the latest 1000 interviews. This method tends to smooth the variations and slightly increases the uncertainty (variance), because the change between two successive polls is only based on 350 interviews. This survey methodology does not call into question the nowcasting method used, but it is important to take it into account when interpreting the results.
B. First findings from the comparison of polls and Google Trends
1. Information surge for small candidates in the last 6 weeks
Figure 9 shows a comparison between daily Google Trends and daily polls for all candidates, ranked by order of election results (black number). Although the candidates are not all displayed on the same scale, it is noteworthy that the 3 smallest candidates show a relatively large surge in the number of queries on Google starting in mid March, which does not correlate with an increase in polls or in votes on election day. These surges can be explained by the fact that those candidates are not very well known among voters, and many people want to learn more about them before voting even if they will not vote for them. This information surge is also caused by the institutional rules in France, which impose equal media time during the final weeks before the election.
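The two data-preparation steps described above, filling days without a poll with the last published value and the smoothing produced by a three-day rolling poll, can be sketched in Python. The function names and sample values are hypothetical, and the rolling average assumes equal daily sample sizes, which is only roughly what the text states (about 350 interviews per day).

```python
def forward_fill(series):
    """Replace missing daily values (None) with the last published value."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def rolling_poll(daily_samples, window=3):
    """Publish, for each day, the mean of the last `window` daily samples
    (assumption: every day carries the same number of interviews)."""
    published = []
    for t in range(len(daily_samples)):
        chunk = daily_samples[max(0, t - window + 1): t + 1]
        published.append(sum(chunk) / len(chunk))
    return published


# Days without a poll inherit the previous published figure.
filled = forward_fill([None, 27.0, None, None, 26.5])

# A 3-point jump in one day's sample moves the published figure by 1 point.
published = rolling_poll([27.0, 27.0, 27.0, 30.0])
```

The second example shows why a rolling poll reacts slowly: a change observed in a single day's sample is diluted by the two previous days.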
  • 24. 24 FIGURE 9: COMPARISON OF WEEKLY GOOGLE TRENDS FOR ALL CANDIDATES WITH POLLS FROM ALL INSTITUTES (TNS, HARRIS, OPINIONWAY, IFOP, CSA, BVA, LH3)
  • 25. 25 2. Big variance in Google searches
FIGURE 10: VARIANCES OF GOOGLE TRENDS AND POLLS [4]
[Bar chart; vertical axis 0 to 0.008; series: polls, google trend]
Figure 9 clearly highlights that Google Trends tend to vary more and be more uncertain than polls. Indeed, Google queries are much more influenced by the media and current events. Figure 10 shows that the variances of Google Trends are much higher than those of polls. More interestingly, the variances of Google Trends for the different candidates match the results of the election. This can be interpreted in terms of communication: efficient or expensive campaigns arouse the interest of many people at the same time but for a short period, which increases variance. Although campaign budgets in France are capped at 21 million euros and media coverage is regulated, minor candidates have less money, and candidates usually perform according to their ability to arouse interest when they master the political agenda.
3. Correlation between Google Trends and polls
Correlation is the most basic statistical tool to compare the evolution of two time series without taking their averages into account. The correlation measures the linear dependence between the two
[4] TNS, Harris, OpinionWay, IFOP, CSA, BVA, LH3
  • 26. 26 time series, and can also be seen geometrically as the cosine of the angle between the vectors representing the two centred time series. The correlation between the daily Google Trends for all candidates (1548 observations) and daily IFOP polls is 61% with a p-value of 0 (see Appendices). This correlation means that the two time series are relatively close and that they partly describe the same phenomenon, or two related effects. Correlations between polls and Google Trends for each candidate vary quite a lot (because of the differences in variance observed previously). While Jean-Luc Mélenchon’s Google Trends and polls are correlated at 75.33% (p-value = 0), Nicolas Sarkozy’s polls and queries only correlate at 21.6% (p-value = 4.5%). For all candidates the correlation is significantly different from zero at the 5% level. Intuitively, if polls and Google Trends both described public opinion exactly, they should vary together and be perfectly correlated. Although the correlation is not perfect, especially because of the strong variance of Google Trends, which react more to current affairs, a 61% correlation is convincing by political science standards. By way of comparison, O’Connor, Balasubramanyan, Routledge, & Smith (2010) compare polls with sentiment measured from Twitter and find most correlations around 70%. Reilly, Richey, & Taylor (2012), for their part, want to establish that Google data taken a week before the election significantly correlates with actual electoral participation on ballot measures. They find that Google queries “data on ballot question names has a negative correlation of −.191 (p value = .02), and topic searches are correlated at −.150 (p value = .06)” and conclude that the correlations are significant.
4. Comparison of weekly Google Trends with final election results.
In order to quantify how well Google Trends data approximates public opinion (here polls), this paper proposes to use a statistical distance taken from the χ² goodness-of-fit test. The test is originally designed to check whether two distribution functions come from the same random phenomenon (same probability law). The test calculates a χ² sum, or distance, between the two distribution functions, which should follow a χ² law. In our case the Google Trends are not derived from real random draws, so we cannot perform the test completely, but we can calculate the statistical distance between the two time series and analyse its evolution.
  • 27. 27 The two quantities compared in the χ² sum are the share of Google Trends for candidate j at time t and the poll result obtained by candidate j at time t.
FIGURE 11: Χ² SUM = STATISTICAL DISTANCE BETWEEN GOOGLE TRENDS AND POLLS
[Chart of the χ² sum over time; vertical axis 0.00 to 2.50]
We could have compared the weekly searches (Google Trends) to the polls of the same week, but since the variation of the polls is much smaller than that of the search trends, this would not have improved the measure significantly. Figure 11 shows that Google queries are closely related to votes (small χ² sum) between December 2011 and mid March 2012. Using Google Trends only, the election results would therefore be best forecast by the average share of queries during that period. However, these averages would be very bad predictors because they would only reflect very long-term trends and disregard the effect of the campaign. For this reason, this paper will only use Google Trends combined with poll results to make predictions. After the week of 13th March, searches for small candidates start to rise without any link to the polls. People start being interested in small candidates they have not heard of before but do not change their vote. This increase in distance after mid March confirms the findings of paragraph B.1 on the information surge in Google Trends during that period without any change in the polls.
  • 28. 28 C. Methodology
1. “Nowcasting” [5] the results of the election with Google Trends based on daily polls. Method developed by Hyunyoung Choi and Hal Varian (2011)
a) Model
Hyunyoung Choi and Hal Varian have used Google Trends to predict present values of many economic indicators which are published with a delay of one to six months. Their idea is to forecast a time series using its own lagged values and to add Google Trends data as a predictor. From an econometric point of view, the method requires 1) identifying the structure of the time series of both the polls and the Google Trends, 2) regressing the time series of polls on the Google Trends and their lags [6] and finally 3) predicting the election with the fitted values of the regression.
b) Time series’ structure
In order to create a model that fits the data, we start by trying to fit a linear autoregressive model to the poll results for each candidate. The partial autocorrelogram is typical of a time series structured as an AR(1). This model provides a good fit for all candidates.

. corrgram nicolas_sarkozy

 LAG       AC       PAC         Q     Prob>Q
--------------------------------------------
   1     0.9595    0.9625    159.28   0.0000
   2     0.9193    0.0090    306.38   0.0000
   3     0.8763   -0.0774    440.82   0.0000
   4     0.8422    0.1015    565.76   0.0000
   5     0.8190    0.1252    684.63   0.0000
   6     0.7985    0.0188    798.3    0.0000
   7     0.7802    0.0579    907.5    0.0000
   8     0.7518   -0.1311   1009.5    0.0000
   9     0.7200   -0.0347   1103.6    0.0000
  10     0.6926    0.0735   1191.3    0.0000
  11     0.6573   -0.1367   1270.8    0.0000
  12     0.6303    0.0864   1344.3    0.0000
  13     0.6048    0.0225   1412.4    0.0000

FIGURE 12: AUTOCORRELOGRAPH OF POLLS (IFOP) FOR NICOLAS SARKOZY BETWEEN OCTOBER 2011 AND APRIL 2012
[5] Predicting the present
[6] The lags are limited to those identified as important in the structures of the time series.
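The AR(1) signature mentioned above (a large first partial autocorrelation followed by values near zero) can be reproduced on simulated data. The sketch below is not the paper's procedure: it computes partial autocorrelations with the Durbin-Levinson recursion, whereas Stata's corrgram, as far as we understand, estimates them by regression unless its Yule-Walker option is requested, so the figures may differ slightly from a Stata output. The function names and the simulated series are assumptions.

```python
import random


def acf(x, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def pacf(x, max_lag):
    """Partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(x, max_lag)
    phi_prev, out = [], [1.0]
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the AR(k) coefficients from the AR(k-1) ones.
        phi_new = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                   for j in range(k - 1)]
        phi_new.append(phi_kk)
        phi_prev = phi_new
        out.append(phi_kk)
    return out


# Simulated AR(1) series with coefficient 0.9 (assumption: toy data).
random.seed(0)
series = [0.0]
for _ in range(1999):
    series.append(0.9 * series[-1] + random.gauss(0.0, 1.0))

partial = pacf(series, 3)
```

On such a series `partial[1]` should be close to the simulated coefficient while `partial[2]` and `partial[3]` stay near zero, which is the pattern read off Figure 12 for the polls.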
  • 29. 29 The structure of the time series of Google Trends is less clear than that of the time series of polls and varies among candidates. For some candidates, such as François Hollande (Figure 13), the PACs are not null for lags 2, 3 and 4, whereas they are null for Marine Le Pen (Figure 14). The first partial autocorrelation being always very significantly positive, the basic model used for this paper will only use the first lag of the Google Trends time series.

. corrgram hollande

 LAG       AC       PAC         Q     Prob>Q
--------------------------------------------
   1     0.6651    0.6690    77.412   0.0000
   2     0.3494   -0.1892    98.909   0.0000
   3     0.2196    0.1330   107.45    0.0000
   4     0.2483    0.1800   118.44    0.0000
   5     0.2624    0.0088   130.77    0.0000
   6     0.2416    0.0538   141.29    0.0000

FIGURE 13: AUTOCORRELOGRAPH OF GOOGLE TRENDS FOR FRANCOIS HOLLANDE BETWEEN OCTOBER 2011 AND APRIL 2012

. corrgram lepen

 LAG       AC       PAC         Q     Prob>Q
--------------------------------------------
   1     0.7401    0.7403    95.86    0.0000
   2     0.5181   -0.0675   143.11    0.0000
   3     0.3739    0.0331   167.87    0.0000
   4     0.2989    0.0632   183.78    0.0000
   5     0.2557    0.0348   195.5     0.0000
   6     0.1992   -0.0326   202.66    0.0000

FIGURE 14: AUTOCORRELOGRAPH OF GOOGLE TRENDS FOR MARINE LE PEN BETWEEN OCTOBER 2011 AND APRIL 2012
2. Models
a) Model 1
$Y_{i,t} = c_i + \alpha_i Y_{i,t-1} + \beta_{i,0} X_{i,t} + \beta_{i,1} X_{i,t-1} + \varepsilon_{i,t}$
Yi,t : Poll result at day t for candidate i
Xi,t : Google Trend search at day t for candidate i
For example, for Nicolas Sarkozy, the fitted equation is as follows:
$Y_{\mathrm{Sarkozy},t} = 0.038 + 0.84\,Y_{\mathrm{Sarkozy},t-1} - 0.030\,X_{\mathrm{Sarkozy},t} + 0.040\,X_{\mathrm{Sarkozy},t-1}$
  • 30. 30
. reg nicolas_sarkozy L.nicolas_sarkozy sarkozy L.sarkozy

      Source |       SS       df       MS              Number of obs =     170
-------------+------------------------------           F(  3,   166) =  223.33
       Model |  .037710314     3  .012570105           Prob > F      =  0.0000
    Residual |  .009343365   166  .000056285           R-squared     =  0.8014
-------------+------------------------------           Adj R-squared =  0.7978
       Total |  .047053679   169  .000278424           Root MSE      =   .0075

------------------------------------------------------------------------------
nicolas_sa~y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
nicolas_sa~y |
         L1. |   .8357339   .0326496    25.60   0.000     .7712719     .900196
             |
     sarkozy |
         --. |  -.0295735   .0085533    -3.46   0.001    -.0464608   -.0126862
         L1. |   .0397088   .0084197     4.72   0.000     .0230852    .0563323
             |
       _cons |    .038776   .0091262     4.25   0.000     .0207577    .0567943
------------------------------------------------------------------------------

FIGURE 15: REGRESSION OF THE MODEL FOR NICOLAS SARKOZY (STATA)
The coefficient on the lagged poll (αi) is preponderant and very significantly positive. This is consistent with the fact that Google Trends data only acts as a correction to the polls that takes into account the number of searches on the internet. Although the two coefficients measuring the influence of Google Trends (βi,0 and βi,1) are small, both are significantly different from zero (βi,0 is negative and βi,1 positive), which confirms the relationship between Google Trends and the polls and supports the model. For most other candidates, however, βi,0 is not significantly different from 0 while βi,1 is significantly positive (see Philippe Poutou below). This is consistent with the fact that the polls are published with a one-day lag. Although the polls are “rolling”, the difference between Yi,t and Yi,t-1 only captures the variation linked to day t-1; thus the coefficients βi,0, βi,2 and βi,3, which capture the effects of Google Trends at t, t-2 and t-3, are not significantly positive.

. reg philippe_poutou L.philippe_poutou poutou L.poutou

      Source |       SS       df       MS              Number of obs =     169
-------------+------------------------------           F(  3,   165) =  120.53
       Model |  .000970679     3   .00032356           Prob > F      =  0.0000
    Residual |  .000442931   165  2.6844e-06           R-squared     =  0.6867
-------------+------------------------------           Adj R-squared =  0.6810
       Total |  .001413609   168  8.4143e-06           Root MSE      =  .00164

------------------------------------------------------------------------------
philippe_p~u |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
philippe_p~u |
         L1. |   .7925791   .0451287    17.56   0.000     .7034748    .8816833
             |
      poutou |
         --. |  -.0197268   .0115815    -1.70   0.090    -.0425938    .0031401
         L1. |   .0389948   .0117764     3.31   0.001      .015743    .0622466
             |
       _cons |   .0004887    .000219     2.23   0.027     .0000563    .0009211
------------------------------------------------------------------------------

FIGURE 16: REGRESSION OF MODEL 1 FOR PHILIPPE POUTOU (STATA)
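The structure of the Model 1 regression (polls regressed on a constant, their own first lag, and the current and lagged Google Trends share) can be sketched without Stata. The code below is only an illustration under stated assumptions: it uses a hand-written least-squares solver, the function names `ols` and `fit_model1` are hypothetical, and the data are synthetic, generated from the coefficients reported above for Nicolas Sarkozy with an arbitrary noise level.

```python
import random


def ols(rows, y):
    """Least-squares coefficients of y on the columns of `rows`, obtained by
    solving the normal equations with Gauss-Jordan elimination."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda i: abs(a[i][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for i in range(k):
            if i != col:
                f = a[i][col] / a[col][col]
                a[i] = [u - f * v for u, v in zip(a[i], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def fit_model1(polls, trends):
    """Regress Y_t on a constant, Y_{t-1}, X_t and X_{t-1}."""
    rows = [[1.0, polls[t - 1], trends[t], trends[t - 1]]
            for t in range(1, len(polls))]
    return ols(rows, polls[1:])


# Synthetic series (assumption): generated from the Sarkozy coefficients.
random.seed(1)
trends = [random.uniform(0.2, 0.6) for _ in range(1000)]
polls = [0.26]
for t in range(1, len(trends)):
    polls.append(0.038 + 0.84 * polls[-1] - 0.030 * trends[t]
                 + 0.040 * trends[t - 1] + random.gauss(0.0, 0.001))

const, alpha, beta0, beta1 = fit_model1(polls, trends)
# Fitted value for the last day of the sample, as used for the prediction.
fitted_last = (const + alpha * polls[-2]
               + beta0 * trends[-1] + beta1 * trends[-2])
```

On this synthetic sample the estimated coefficients should come back close to the values used to generate it, and `fitted_last` plays the role of the election-day fitted value discussed in the results section. The sketch does not compute standard errors or p-values, which the Stata output provides.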
  • 31. 31 b) Model 1’: Adding extra lags to the model is not conclusive
In order to confirm the structure of our data and the model chosen, it is interesting to look at the impact of polls and Google Trends from two days before, using the following model:
$Y_{i,t} = c_i + \alpha_{i,1} Y_{i,t-1} + \alpha_{i,2} Y_{i,t-2} + \beta_{i,0} X_{i,t} + \beta_{i,1} X_{i,t-1} + \beta_{i,2} X_{i,t-2} + \varepsilon_{i,t}$

. reg philippe_poutou L.philippe_poutou L.L.philippe_poutou poutou L.poutou L.L.poutou

      Source |       SS       df       MS              Number of obs =     168
-------------+------------------------------           F(  5,   162) =   71.91
       Model |   .00097333     5  .000194666           Prob > F      =  0.0000
    Residual |  .000438575   162  2.7073e-06           R-squared     =  0.6894
-------------+------------------------------           Adj R-squared =  0.6798
       Total |  .001411905   167  8.4545e-06           Root MSE      =  .00165

------------------------------------------------------------------------------
philippe_p~u |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
philippe_p~u |
         L1. |   .7181832   .0783327     9.17   0.000     .5634983     .872868
         L2. |   .0822185   .0772793     1.06   0.289    -.0703861    .2348231
             |
      poutou |
         --. |  -.0199154   .0118362    -1.68   0.094    -.0432884    .0034577
         L1. |   .0312457   .0162509     1.92   0.056    -.0008452    .0633366
         L2. |   .0100565   .0123057     0.82   0.415    -.0142437    .0343567
             |
       _cons |   .0004313   .0002264     1.91   0.059    -.0000158    .0008785
------------------------------------------------------------------------------

FIGURE 17: REGRESSION OF MODEL 1’ WITH EXTRA LAGS FOR PHILIPPE POUTOU (STATA)
For all candidates, neither Google searches from two days before nor polls from two days before have a significant effect [7]. This confirms that the polls are autoregressive (AR(1)) time series and are correlated with queries from the day before (t-1).
[7] $\alpha_{i,2}$ and $\beta_{i,2}$ are not significantly different from 0.
c) Model 2: taking into account the variations in Google Trends instead of the share of searches
Yi,t = Poll result at day t for candidate i
Xi,t = Google Trend search at day t for candidate i
Model 2 regresses Yi,t on Yi,t-1 and on three variables measuring the variations of Google searches over the past three days (dhollande, d2hollande and d3hollande in the Stata output for François Hollande).
  • 32. 32
. reg franois_hollande L.franois_hollande dhollande d2hollande d3hollande

      Source |       SS       df       MS              Number of obs =     167
-------------+------------------------------           F(  4,   162) =  593.61
       Model |  .055552598     4   .01388815           Prob > F      =  0.0000
    Residual |  .003790186   162  .000023396           R-squared     =  0.9361
-------------+------------------------------           Adj R-squared =  0.9346
       Total |  .059342784   166  .000357487           Root MSE      =  .00484

------------------------------------------------------------------------------
franois_ho~e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
franois_ho~e |
         L1. |   .9572066   .0196541    48.70   0.000     .9183953    .9960179
             |
   dhollande |  -.0000486   .0014114    -0.03   0.973    -.0028357    .0027386
  d2hollande |   .0001732   .0013953     0.12   0.901    -.0025822    .0029286
  d3hollande |  -.0016746   .0014069    -1.19   0.236    -.0044529    .0011037
       _cons |   .0120656   .0056694     2.13   0.035     .0008702     .023261
------------------------------------------------------------------------------

FIGURE 18: REGRESSION OF MODEL 2 FOR FRANCOIS HOLLANDE (STATA)
The results are much less convincing for model 2 because the p-values for β and β1 are over 90%: we cannot reject the null hypothesis that these coefficients are zero. This suggests that the level of searches (model 1) gives more information than the mere variations of Google searches over the past 3 days (model 2).
D. Results
1. Predictions with model 1
The fitted values from both models are then used to predict the results on election day.
FIGURE 19: FITTED VALUES FOR NICOLAS SARKOZY BASED ON THE REGRESSION OF MODEL 1
[Line chart; left axis 20% to 30% (polls, fitted values); right axis 0% to 80% (Google Trend); labelled end points 27.18% (election result) and 26.71% (fitted value)]
  • 33. 33 The fitted values drawn on this graph give a predicted result for Sarkozy on election day of 26.71%, while the election result was in fact 27.18%. Quite intuitively, when the Google Trend is relatively low, the fitted values (or predicted values) are below the polls because the model takes into account the smaller number of internet searches. Although Google Trends have a high variance (a different scale is used on the graph) because candidates can attract attention to themselves without attracting votes, the significance of βi,1 shows that the Google Trend is a good explanatory variable.
2. Comparing predictions of election results
FIGURE 20: PREDICTIONS OF MODEL 1 WITH DAILY IFOP POLLS AND GOOGLE TRENDS
[Bar chart comparing last poll, predicted values and results for each candidate; vertical axis 0% to 30%]
Candidate | Results | Last poll | Fitted values model 1 | Fitted values model 2
Mme JOLY Eva | 2.31% | 2.50% | 2.55% | 2.53%
Mme LE PEN Marine | 17.90% | 16.50% | 16.52% | 16.54%
M. SARKOZY Nicolas | 27.18% | 27.00% | 26.71% | 27.52%
M. MÉLENCHON Jean-Luc | 11.10% | 13.50% | 13.39% | 13.53%
M. POUTOU Philippe | 1.15% | 1.00% | 0.98% | 0.87%
Mme ARTHAUD Nathalie | 0.56% | 0.50% | 0.47% | 0.47%
M. CHEMINADE Jacques | 0.25% | 0.00% | 0.04% | 0.01%
M. BAYROU François | 9.13% | 10.00% | 10.09% | 10.09%
M. DUPONT-AIGNAN Nicolas | 1.79% | 1.50% | 1.41% | 1.42%
M. HOLLANDE François | 28.63% | 27.50% | 27.55% | 26.98%
TABLE 7: PREDICTIONS OF ELECTION RESULTS WITH THE TWO MODELS
  • 34. 34 Looking at Table 7, the fitted values of model 1 are predictors of similar quality to the last polls. Depending on the candidate, either the last poll or the predicted value is closer to the final election result. Although model 1 has not considerably improved the forecast, a statistical tool will tell us whether this model has significantly improved the prediction. Table 7 also shows the predicted values from model 2, which are quite similar in magnitude but almost always worse predictors.
3. Testing the accuracy of the predictions using a one-sample z-test
The one-sample z-test is used to test whether a proportion observed in a population sample is significantly different from the theoretical value in the total population, given the size of the sample. In France most polls are done with 1000 interviews. A z-test on a poll would say whether the poll result can be considered significantly different from the election result for a sample of 1000 people. In order to compare our predictions with those of comparable polls, we apply the z-test as if our estimate had been obtained by a poll with 1000 interviews. The resulting statistic is compared to a standard normal distribution (n = 1000).
N° | Candidate | Last polls | Model 1 | Model 2
1 | Mme JOLY Eva | 32.77% | 34.53% | 34.08%
2 | Mme LE PEN Marine | 43.80% | 43.60% | 43.45%
3 | M. SARKOZY Nicolas | 27.55% | 31.58% | 29.79%
4 | M. MÉLENCHON Jean-Luc | 49.61% | 49.48% | 49.64%
5 | M. POUTOU Philippe | 33.59% | 34.91% | 40.00%
6 | Mme ARTHAUD Nathalie | 30.02% | 32.34% | 32.81%
7 | M. CHEMINADE Jacques | 47.17% | 45.50% | 46.91%
8 | M. BAYROU François | 41.51% | 42.75% | 42.70%
9 | M. DUPONT-AIGNAN Nicolas | 37.77% | 40.96% | 40.41%
10 | M. HOLLANDE François | 39.27% | 38.80% | 43.80%
TABLE 8: P-VALUES OF Z-TEST FOR SAMPLES OF 1000 PEOPLE.
According to the z-test, the two models seem to perform similarly to the last polls. However, all of these perform quite poorly, because the average p-value of 40% means that there was only a 60% chance of the candidates getting a result as far from the prediction as they did.
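For readers who want to experiment with the test described above, a textbook one-sample z-test for a proportion can be sketched as follows. This is an assumption-laden illustration, not the computation behind Table 8: the paper does not state whether its p-values are one-sided or two-sided, nor which variance was used, so the numbers produced here are not expected to match the table. The function name is hypothetical, and the standard error is taken from the election result with n = 1000.

```python
import math


def z_test_proportion(estimate, true_share, n=1000):
    """One-sample z-test of an estimated share against a known share.

    Returns the z statistic and a two-sided p-value (assumption: the
    standard error uses the known share and a sample of n interviews).
    """
    se = math.sqrt(true_share * (1.0 - true_share) / n)
    z = (estimate - true_share) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Last poll versus election result (shares taken from Table 7).
z_sarkozy, p_sarkozy = z_test_proportion(0.2700, 0.2718)
z_melenchon, p_melenchon = z_test_proportion(0.1350, 0.1110)
```

Under these assumptions the last poll for Sarkozy is well within sampling error of his result, whereas the last poll for Mélenchon lies more than two standard errors above his result.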
  • 35. 4. Statistical distance (χ² sum) between model predictions and election results

Applying the same method as in paragraph B.4, we use the χ² sum as a measure of the distance between the model's predictions and the results. The sum runs over the candidates j and compares, for each candidate, the predicted result (from the model, or from the poll obtained with 1000 interviews) with the real result obtained by that candidate.

           Polls   Model 1   Model 2
χ² sum     12.41     12.02     14.17
TABLE 9: DISTANCE BETWEEN THE PREDICTION OR POLLS AND THE ELECTION RESULTS

The predictions made by model 1 have the smallest χ² sum and are therefore the closest to the real election results. In conclusion, the predictions made by model 1 are slightly better than those of the polls, but model 2 is, as expected, much worse.
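As an illustration of such a distance, the sketch below computes a Pearson-type χ² sum from the shares in Table 7. The exact formula of paragraph B.4 is not reproduced in this section, so the form used here, n(p_j − r_j)²/r_j summed over candidates with n = 1000, is an assumption, and the values it returns need not equal those in Table 9. The function name is hypothetical.

```python
N_INTERVIEWS = 1000

# Shares in percent, in the candidate order of Table 7.
results   = [2.31, 17.90, 27.18, 11.10, 1.15, 0.56, 0.25,  9.13, 1.79, 28.63]
last_poll = [2.50, 16.50, 27.00, 13.50, 1.00, 0.50, 0.00, 10.00, 1.50, 27.50]
model_1   = [2.55, 16.52, 26.71, 13.39, 0.98, 0.47, 0.04, 10.09, 1.41, 27.55]
model_2   = [2.53, 16.54, 27.52, 13.53, 0.87, 0.47, 0.01, 10.09, 1.42, 26.98]


def chi2_distance(predicted, actual, n=N_INTERVIEWS):
    """Pearson-type chi-square sum between predicted and actual shares (in percent).

    Assumed form: sum over candidates of n * (p - r)^2 / r,
    where p is the predicted share and r the actual share.
    """
    total = 0.0
    for p_pct, r_pct in zip(predicted, actual):
        p, r = p_pct / 100.0, r_pct / 100.0
        total += n * (p - r) ** 2 / r
    return total


for label, prediction in [("Polls", last_poll), ("Model 1", model_1), ("Model 2", model_2)]:
    print(label, round(chi2_distance(prediction, results), 2))
```

Dividing by the actual share gives more weight to errors on small candidates, which is why a miss of a quarter of a point on the smallest candidate can count as much as a miss of over a point on a major one.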
  • 36. IV. Analysis of the French presidential campaign in the light of Google Trends: issue ownership theory and candidate strategy given institutional constraints

In addition to improving predictions of election results, Google Trends can also be used to analyse the political campaign. After a brief presentation of the institutional and theoretical background, this part uses Google Trends to identify the turning points of the campaign, highlight the impact of the institutional constraints on the campaign, and analyse political strategies and communication in terms of issue ownership and negative narrative.

A. Theoretical and institutional background for the analysis

1. Institutional rules on media coverage during the presidential campaign

The French Constitutional Council established political pluralism as a constitutional principle. It is on this basis that the French High Council of Audiovisual (CSA) sets the rules that govern radio stations and television channels (national and local) during the presidential campaign. These rules concern the speaking time and airtime (speeches, reports, analyses, etc.) of declared candidates (people who have publicly expressed their willingness to participate in the election) or potential candidates (those who have received significant public support in favour of their candidacy), as well as their supporters (anyone calling for a vote in favour of a candidate). The CSA rules published in the Official Journal on 6 December 2011 can be divided into three periods:

From January 1st until the day of publication of the list of official candidates (mid-March 2012): radio and television media must respect the principle of equity. Equity is based primarily on the popularity of the candidate, which is derived from past election results and opinion polls.
From the date of publication of the official list of candidates (mid-March 2012) until midnight on April 8th, 2012: speaking times must be equal, but airtime need only respect the principle of equity.

From April 9th until midnight on May 4th, 2012 (official campaign): candidates must have the same speaking time and the same airtime.
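The three periods can be written down as a small lookup, which is convenient when daily Google Trends data have to be labelled by regulatory regime. This is a minimal sketch: the function name is hypothetical, and the date used for the publication of the official list (15 March 2012) is an assumption standing in for "mid-March". The treatment of the publication day itself as part of the second period is also an assumption.

```python
from datetime import date

# Assumption: the text only says "mid-March 2012" for the publication of the
# official list of candidates; 15 March is used here as a placeholder.
LIST_PUBLICATION = date(2012, 3, 15)
RULES_START = date(2012, 1, 1)
OFFICIAL_CAMPAIGN_START = date(2012, 4, 9)
RULES_END = date(2012, 5, 4)


def csa_regime(day: date) -> dict:
    """Return the rule applying to speaking time and airtime on a given day."""
    if RULES_START <= day < LIST_PUBLICATION:
        return {"speaking_time": "equity", "airtime": "equity"}
    if LIST_PUBLICATION <= day < OFFICIAL_CAMPAIGN_START:
        return {"speaking_time": "equality", "airtime": "equity"}
    if OFFICIAL_CAMPAIGN_START <= day <= RULES_END:
        return {"speaking_time": "equality", "airtime": "equality"}
    raise ValueError(f"{day} is outside the periods covered by the CSA rules")


print(csa_regime(date(2012, 2, 1)))
print(csa_regime(date(2012, 4, 1)))
print(csa_regime(date(2012, 4, 22)))
```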
  • 37. 2. Issue ownership theory

Issue ownership is a theory developed to understand how presidential candidates compete for the agenda in order to underline their strengths. The theory, which was first developed by Petrocik (1991, 1996), states that “candidates campaign on issues that confer an advantage in order to prime their salience in the decisional calculus of the voters” (Benoit, Petrocik, & Hansen, 2004). The rationale for this theory is that parties have a reputation for being more capable of dealing with some issues. Candidates use this reputation to increase their credibility by increasing the salience of these issues, or of those that expose their opponents' weaknesses, which is known as issue trespassing (Damore, The Dynamics of Issue Ownership in Presidential Campaigns, 2004). As this paper has shown that Google Trends can be used as a good indicator of public opinion, it now uses Google Trends to analyse how agenda competition is constrained by the institutional rules of the French presidential campaign.

B. Political strategies through the multiple phases of the campaign observed on Google Trends

Looking at the daily Google queries for the main candidates during the campaign (Figure 21), it is straightforward to identify different phases of the campaign. The following three phases can be distinguished:

first phase: incumbent advantage until the end of December 2011;
second phase: competition for voters between 1st January and mid-March;
third phase: information seeking between mid-March and the election.
  • 38. FIGURE 21: THREE PHASES OF THE CAMPAIGN. DAILY GOOGLE SEARCHES FOR FIVE MAIN CANDIDATES, NOVEMBER 2011 - 22ND APRIL
[Line chart: daily share of Google queries (0% to 70%) for Sarkozy, Bayrou, Hollande, Mélenchon and Le Pen, with the three phases marked: 1st phase: Incumbent Advantage; 2nd phase: Competition; 3rd phase: Information.]

1. 1st Phase: The Incumbent Advantage

During the first phase, which lasted until the beginning of January 2012 (four months before the election), Nicolas Sarkozy had a clear advantage in terms of media coverage because of his position as incumbent. This advantage is very visible in the Google queries, of which he received an average of 38%. Hollande only got substantial coverage on 17th November, when he revealed his campaign team (the only day on which he received more queries than Sarkozy, with 37%). The main surges in Google searches for Nicolas Sarkozy correspond to international events: the G20 summit in Cannes (3rd and 4th November 2011), the Greek crisis, the new government in Italy, the Toulon speech and a meeting with Angela Merkel (3rd December 2011).

Sarkozy   Bayrou   Hollande   Melenchon   Le Pen
  37.9%     5.3%      19.7%        4.0%    17.7%
TABLE 10: AVERAGE DAILY GOOGLE QUERIES DURING THE 1ST PHASE
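Averages such as those in Table 10 can be obtained by assigning each day to a phase and averaging each candidate's daily share within the phase. The sketch below shows one way to do this. The function names are hypothetical, 15 March 2012 is an assumed stand-in for "mid-March", and the three days of data are illustrative values only, not the paper's data.

```python
from datetime import date
from statistics import mean

# Phase boundaries taken from the text; 15 March stands in for "mid-March" (assumption).
PHASE_2_START = date(2012, 1, 1)
PHASE_3_START = date(2012, 3, 15)


def phase_of(day: date) -> int:
    """Return 1, 2 or 3 for the campaign phase containing `day`."""
    if day < PHASE_2_START:
        return 1
    if day < PHASE_3_START:
        return 2
    return 3


def average_share_by_phase(daily_shares):
    """Average each candidate's daily share of queries within each phase.

    `daily_shares` maps a date to {candidate: share of that day's queries}.
    Returns {phase: {candidate: mean daily share}}.
    """
    collected = {}
    for day, shares in daily_shares.items():
        bucket = collected.setdefault(phase_of(day), {})
        for candidate, share in shares.items():
            bucket.setdefault(candidate, []).append(share)
    return {
        phase: {candidate: mean(values) for candidate, values in bucket.items()}
        for phase, bucket in collected.items()
    }


# Illustrative values only, not the paper's data.
toy = {
    date(2011, 11, 3): {"sarkozy": 0.50, "hollande": 0.25},
    date(2011, 11, 17): {"sarkozy": 0.25, "hollande": 0.50},
    date(2012, 1, 26): {"sarkozy": 0.25, "hollande": 0.50},
}
averages = average_share_by_phase(toy)
print(averages)
```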
  • 39. FIGURE 22: DAILY GOOGLE SEARCHES FOR MAIN CANDIDATES AT THE END OF YEAR 2011
[Line chart, 1st phase (Incumbent Advantage): daily share of Google queries (0% to 70%) for Sarkozy, Bayrou, Hollande, Mélenchon and Le Pen from 3 November 2011 to 31 December 2011, annotated with the G20, Hollande's team, the Toulon speech and meeting with Merkel, and the New Year greetings.]

Figure 22 highlights that the incumbent advantage lies mainly in the fact that during this first phase the president keeps most of the agenda-setting powers, especially over the media agenda. In line with issue ownership theory, Nicolas Sarkozy used this power to increase the salience of international issues in order to reinforce his image as a statesman, in contrast with his main opponent, Francois Hollande, who had never held ministerial responsibilities. These issues are typical of incumbents. In their study of broadcasts in national elections in Germany (1990), the United States (1988), and France (1988), Holtz‐Bacha, Lee Kaid, & Johnston (1994) show that incumbents “were more likely … to show themselves consulting with world leaders, and to try to represent the presidency (or chancellorship in the case of Germany) as the standard bearer of governmental legitimacy”. Nicolas Sarkozy used his incumbent advantage and devoted more time to the media than any of his predecessors because he believed in the power of the media to set the agenda and influence public opinion (Kuhn, 12/2010). Moreover, during his term in office, Sarkozy developed formal and informal structures and institutions of executive media management similar to those of the US, UK and Germany. Sarkozy copied “agenda building and issue framing techniques practised by political executives in other leading Western democracies” (Pfetsch, 2008).

2. 2nd Phase: Competition and Issue Ownership

The second phase runs from the beginning of 2012, after the official New Year greetings, until mid-March. According to the Google searches, this phase is the key phase of the campaign
  • 40. because issue salience and the campaign agenda shift very easily, as voters are very reactive to the candidates' proposals: this is when Google Trends have the greatest variance. During this phase candidates compete fully and must choose how and when to use their limited resources. The institutional constraints on candidates are not very significant because the media need only respect the principle of equity (coverage proportional to popularity).

FIGURE 23: DAILY GOOGLE SEARCHES FOR FIVE MAIN CANDIDATES FROM JANUARY - 15TH MARCH 2012
[Line chart, 2nd phase (Competition): daily share of Google queries (0% to 70%) for Sarkozy, Bayrou, Hollande, Mélenchon and Le Pen from 5 January 2012 to 15 March 2012, annotated with the Bourget meeting, Hollande on TF1, Sarkozy's VAT increase and Sarkozy's candidacy.]

During this period the timing of proposals is crucial. On the one hand, the fact that TV and radio airtime for big candidates is more limited after mid-March (shift from equity to equality) means that candidates must use the media extensively beforehand. On the other hand, resources are limited and candidates might not have any money left at the end of the campaign if they start too early. Hollande started his campaign first and imposed his agenda (Figure 23). Between 24th January and 2nd February he had three very large Google search peaks related to big meetings and TV shows. During that period he proposed the creation of a new income tax band for high earners, criticized the world of finance and then proposed a new 75% tax for people earning over 1 million euros. This choice is consistent with issue ownership theory because these issues were Sarkozy's weaknesses. Sarkozy announced important reforms on 29th January (especially a VAT increase) but was still acting as president and only announced his candidacy on 15th February. Google Trends show that
  • 41. from then on he succeeded in attracting media attention, but the main themes of the campaign were those imposed by Hollande, who had made his proposals earlier. Marine Le Pen, however, does not follow these patterns. Unlike the two main candidates, she was able to arouse voters' interest throughout the campaign, but only for short periods of time. Surprisingly, this is even the case during the first phase (incumbent advantage), when Le Pen often arouses more interest than Hollande and occasionally more than Sarkozy.

The campaign strategies revealed by the analysis of Google Trends are in line with issue ownership theory for the two main candidates, but what is rather unusual is that Francois Hollande used negative narrative right from the beginning of his campaign. Damore (Candidate Strategy and the Decision to Go Negative, 2002) performs a logit analysis of the US presidential elections between 1976 and 1996 and finds that the likelihood of a candidate going negative increases over the course of the campaign and if the candidate is doing badly in the polls. The 2012 French presidential election is therefore very unusual because Francois Hollande attacked his main opponent right from the start, even though he had a clear lead in the polls. Downs (1957) argues that to appeal to the largest segment of voters, candidates in a two-party system should cast "some policies into the other's territory in order to convince voters that their net position is near them." This might better explain Hollande's strategy and would mean that even though there are ten candidates in the first round of the French presidential election, this second phase is almost like a two-party contest.

3. Third Phase: Information on minor candidates

a) Major candidates

The third and last phase runs from mid-March until the election on 22nd April.
For five weeks, the Google queries for major candidates stayed constant whereas those for minor candidates increased (Figure 24), without the polls changing accordingly.
  • 42. FIGURE 24: DAILY GOOGLE SEARCHES FOR FIVE MAIN CANDIDATES FROM 15TH MARCH 2012 - 22ND APRIL (ELECTION DAY)
[Line chart: daily share of Google queries (0% to 60%) for Sarkozy, Bayrou, Hollande, Mélenchon and Le Pen from 15 March 2012 to 20 April 2012.]

The sudden stabilization of Google queries for major candidates between phase 2 (competition) and phase 3 explains why it is so important for candidates to control the agenda in the second phase. After mid-March, institutional rules impose strict equality of speaking time among all candidates, so the agenda is difficult to control and minor candidates get relatively more attention. Although the variations in queries for big candidates fall sharply after mid-March, the share of queries stays proportional to popularity and to the final vote. Even after 9th April, when the official campaign starts and candidates get exactly the same airtime, big candidates still get queries in proportion to the poll results.

Labbé & Monière (2012) examined the diversity of candidates' vocabulary (using the method of Labbé, Labbé and Hubert, 2004) to assess the variation in the candidates' themes. During this third phase, and especially in the second half of March, Nicolas Sarkozy attempted to catch up with Hollande in opinion polls by bringing new proposals into the campaign (Figure 25), but the institutional constraints prevented him from arousing more interest (no increase in Google queries: Figure 24). This attempt to innovate is very common, as candidates who are trailing in polls tend to “alter their messages compared to issues traditionally championed by their party” (Damore, 2004).
  • 43. FIGURE 25: SARKOZY AND UMP VOCABULARY GROWTH IN PRESS SINCE JANUARY 1, 2012. NUMBER OF NEW WORDS (IN THOUSANDS), VARIABLE CENTERED AND REDUCED (LABBÉ & MONIÈRE, 2012)

b) Minor candidates

Minor candidates benefit from the institutional rules and show an immediate surge in Google queries when television and radio stations give them the same speaking time as the main candidates (Figure 26).

FIGURE 26: DAILY GOOGLE SEARCHES FOR FOUR OF THE MINOR CANDIDATES FROM NOVEMBER 2011 TO 22ND APRIL (ELECTION DAY)
[Line chart: daily share of Google queries (0% to 20%) for Cheminade, Dupont-Aignan, Arthaud and Poutou, with the three phases of the campaign marked.]

The beginning of the official campaign (9th April), which is also the beginning of airtime equality, has no effect on major candidates but is quite important for the three smallest candidates (Figure 27).
  • 44. FIGURE 27: DAILY GOOGLE SEARCHES FOR SMALL CANDIDATES (LESS THAN 2% OF VOTE) FROM 15TH MARCH 2012 - 22ND APRIL (ELECTION DAY)
[Line chart: daily share of Google queries (0% to 20%) for Cheminade, Dupont-Aignan, Arthaud and Poutou from 15 March to 19 April, with two periods marked: "equal speaking time, airtime proportional to popularity" and "Official Campaign: equal speaking time, equal airtime".]
  • 45. V. Conclusion

A. Google Trends provides slightly better predictions but has its limits.

The predictions of the 2012 French presidential election obtained by combining the polls and Google Trends using the ‘predicting the present’ method are slightly better than those obtained using the polls alone. This is promising for pollsters, who could use this method to lower the costs of polls by reducing the frequency of interviews. In addition to predictions, the highly significant correlations found between Google Trends and polls (about 0.6, with a p-value below 0.001) validate the use of Google Trends as a means of obtaining an estimate of public interest or even public opinion. Indeed, the power of Google Trends lies in the number of queries (about 100,000 per day on the candidates' names alone in France) and the frequency of the data. Polls are often published two days after the interviews begin and cannot be published at the end of the French presidential campaign, whereas Google Trends provides data every day. Moreover, this study only used Google queries for the candidates' names, but there are endless research opportunities in the use of other keywords.

However, there are also several limitations to the use of Google Trends. Although the number of internet users has increased a lot in recent years (80% of households in 2012) and Google has a 92% market share in France, it is debatable whether the people who use the Internet are representative of the whole population. Indeed, only 36% of French people say they use the Internet as a source of information during political campaigns. Another drawback of Google Trends is that it is very difficult to distinguish between positive and negative queries for a candidate.
This might be a source of strong bias in this study because the 2012 French presidential campaign was a particularly negative one, in which many people voted against the incumbent, Nicolas Sarkozy, rather than in favour of a candidate (Jeambar, 2012). Research on the use of Twitter in German elections (Tumasjan, Sprenger, Sandner, & Welpe, 2010) distinguishes negative tweets but suffers much more from the representativeness bias. Further research using additional functions of Google Trends such as “related terms” (which gives the main words associated with the keyword) could minimize this bias and yield an even better estimate of public opinion. Nevertheless, it is very likely that negative Google queries for competing candidates cancel each other out overall. A final limitation might be the experiment bias outlined by Mustafaraj and Metaxas (2009) (see II.D.2.e). This bias occurs if political activists influence Google Trends by making thousands of queries. This is possible on a given website but
  • 46. unlikely on Google given the vast number of queries per day and because Google Trends eliminates multiple queries from the same person on the same day.

B. Timing as the key to issue ownership

The analysis of the campaign timeline highlights the fact that the institutional rules abruptly end the issue ownership competition between the main candidates on 15th March by imposing strict equality of speaking time. The strict equality of airtime which is then introduced on 9th April does not, however, have any impact on major candidates. It is noteworthy that even when major candidates lose their agenda-setting powers after mid-March, the level of queries stays proportional to the importance of the candidates. The institutional rules create perfect equality on TV and radio, but the Internet still reveals differences among candidates. This confirms that although Google queries might be influenced by media coverage (through media websites), they also reflect public opinion.

In terms of candidate strategy, the analysis highlights the fact that Sarkozy did not succeed in controlling the agenda. He started his campaign very late (mid-February), which cost him the benefit of his incumbent advantage. Hollande imposed his themes in January and the first part of February. Then, after mid-March, Sarkozy was constrained by the institutional rules, which prevented him from introducing new issues into the campaign agenda. Even though he spent more time than any of his predecessors dealing with communication, Sarkozy failed to maintain his incumbent advantage in terms of agenda construction and issue framing because it has become “impossible to exert effective management of the media, especially on social networks and internet” (Kuhn, 12/2010).
  • 47. Bibliography

Albrecht, S., Lübcke, M., & Hartig-Perschke, R. (2007). Weblog Campaigning in the German Bundestag Election in 2005. Social Science Computer Review 25(4), 504-520.
Armstrong, J. S., & Graefe, A. (2011). Predicting elections from biographical information about candidates: A test of the index method. Journal of Business Research, Volume 64, Issue 7, 699-706.
Askitas, N., & Zimmermann, K. F. (2009). Google Econometrics and Unemployment Forecasting. Applied Economics Quarterly 55, 107-20.
Benoit, W., Petrocik, J., & Hansen, G. (2004). Issue ownership and presidential campaigning, 1952-2000. Political Science Quarterly, 599.
Blais, A. (2004). How Many Voters Change Their Minds in the Month Preceding an Election? American Political Science Association.
Boulianne, S. (2009). Does Internet Use Affect Engagement? A Meta-Analysis of Research. Political Communication, Volume 26, Issue 2, 193-211.
Bourdeau, T. (19 April 2012). Les candidats passés à la moulinette d'internet. Retrieved 5 August 2013 from http://www.rfi.fr/: http://www.rfi.fr/france/20120419-presidentielle-2012-sarkozy-melenchon-hollande-le-pen-cheminade-internet-compuware
Bruter, Y. a. (2007). Electoral Behaviour. Encyclopedia of European Elections, Basingstoke: Palgrave Macmillan, pp. 88-95.
Choe, S. H. (1980). Time of Decision and Media Use During the Ford-Carter Campaign. Public Opinion Quarterly.
Choi, H., & Varian, H. (2011). Predicting the Present with Google Trends. Google Research Blog.
Choy, M., Cheong, M., Nang Laik, M., & Ping Shung, K. (2012). US Presidential Election 2012 Prediction using Census Corrected Twitter Model.
D'Amuri, F., Marcucci, J., & Billari, F. (April 6, 2013). Forecasting Births Using Google. Presentation at PAA Annual Meeting, 2013, New Orleans, Session 155: Methods and Models in Fertility Research.
  • 48. Damore, D. F. (2002). Candidate Strategy and the Decision to Go Negative. Political Research Quarterly, Vol. 55, No. 3, 669-685.
Damore, D. F. (2004). The Dynamics of Issue Ownership in Presidential Campaigns. Political Research Quarterly, Vol. 57, No. 3, 391-397.
Elmelund-Præstekær, C. (2011). Issue ownership as a determinant of negative campaigning. International Political Science Review, 209-221.
Ginsberg, J., Mohebbi, M., Patel, R., Brammer, L., Smolinski, M., & Brilliant, L. (2009). Detecting influenza epidemics using search engine query data. Nature, Vol. 457, 19.
Google. (2011). Google Trends help: How does Google Trends work? Retrieved 13 August 2013 from https://support.google.com/trends/answer/87276?hl=en
Graefe, A., & Armstrong, J. S. (2013). Forecasting Elections from Voters' Perceptions of Candidates' Ability to Handle Issues. Journal of Behavioral Decision Making, Volume 26, Issue 3, 295-303.
Green, J. (November 29, 2012). The Science Behind Those Obama Campaign E-Mails. BusinessWeek.
Green, J., & Hobolt, S. B. (2008). Owning the issue agenda: Party strategies and vote choices in British elections. Electoral Studies.
Harrison, S., & Bruter, M. (2011). Ideology An Empirical Geography of the European Extreme Right Research Officer. Palgrave Macmillan.
Holtz‐Bacha, C., Lee Kaid, L., & Johnston, A. (1994). Political Television Advertising in Western Democracies: A Comparison of Campaign Broadcasts in the United States, Germany, and France. Political Communication, 11:1, 67-80.
Ivaldi, G. (September 2007). Presidential Strategies, Models of Leadership and the Development of Parties in a Candidate-Centred Polity: The 2007 UMP and PS Presidential Nomination Campaigns. French Politics, 253-277.
  • 49. Jansen, B. J., Zhang, M., Sobel, K., & Chowdury. (2009). Twitter power: Tweets as electronic word of mouth. Journal of the American Society for Information Science and Technology, 60:120.
Jeambar, D. (2012). Explication de vote pour Francois Hollande (Explanation of the Vote for Francois Hollande). Le Débat, Gallimard.
Kuhn, R. (12/2010). 'Les médias, c'est moi.' President Sarkozy and news media management. French Politics, Volume 8, Issue 4, 355-376.
Labbé, C., Labbé, D., & Hubert, P. (December 2004). Automatic Segmentation of Texts and Corpora. Journal of Quantitative Linguistics, 11-3, 193-213.
Labbé, D., & Monière, D. (2012). Radioscopies de la campagne présidentielle 2012: La course de fond des candidats à l'élection présidentielle. www.trielec2012.fr (research notes).
Lui, C., Metaxas, P. T., & Musta, E. (2011). On the predictability of US elections through search volume activity. Department of Computer Science, Wellesley College, Wellesley, MA 02481.
McLaren, N., & Shanbhorge, R. (2011). Using Internet Search Data as Economic Indicators. Bank of England Quarterly Bulletin, Second Quarter.
O'Connor, B., Balasubramanyan, R., Routledge, B. R., & Smith, N. A. (2010). From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series. Proceedings of the International AAAI Conference on Weblogs and Social Media.
Pollard, T. D., Chesebro, J. W., & Studinski, D. P. (2009). The Role of the Internet in Presidential Campaigns. Communication Studies, Volume 60, Issue 5, 574-588.
Preis, T., Moat, H. S., & Stanley, H. E. (25 April 2013). Quantifying Trading Behavior in Financial Markets Using Google Trends. Nature.
Reilly, S., Richey, S., & Taylor, J. B. (2012). Using Google Search Data for State Politics Research: An Empirical Validity Test Using Roll-Off Data. State Politics & Policy Quarterly.
  • 50. Sano, Y., Yamada, K., Watanabe, H., Takayasu, H., & Takayasu, M. (January 2013). Empirical analysis of collective human behavior for extraordinary events in the blogosphere. Physical Review, 87.
Serfaty, V. (May 2010). Web Campaigns: Popular Culture and Politics in the U.S. and French Presidential Elections. Culture, Language, Representations, 115-129.
Seybert, H. (2012). Internet use in households and by individuals in 2012. Eurostat Statistics in Focus.
Sunstein. (2007). The blogosphere: Neither Hayek nor Habermas. Public Choice, 87-95.
Brouard, S., & Zimmermann, S. (21 April 2012). Les pratiques médiatiques des Français pendant la campagne présidentielle 2012. Centre E. Durkheim, Sciences Po Bordeaux.
Troy, G., Perera, D., & Sunner, D. (2012). Electronic Indicators of Economic Activity. Australian Reserve Bank - Economic Bulletin, June, 1-12.
Tumasjan, A., Sprenger, T. O., Sandner, P. G., & Welpe, I. M. (2010). Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment. 4th International AAAI Conference on Weblogs and Social Media.
Vaccari, C. (2008). Surfing to the Élysée: The Internet in the 2007 French Elections. French Politics 6, 1-22.
Vosen, S., & Schmidt, T. (2011). Forecasting Private Consumption: Survey-Based Indicators vs. Google Trends. Journal of Forecasting 30, 565-578.
  • 51. Table of illustrations

Figure 1: Histogram of internet usage (% of individuals), 2012 (Eurostat) (p. 12)
Figure 2: Frequency of monitoring information on the Internet (p. 15)
Figure 3: Types of websites used for information on the election campaign. Source: TNS Sofres - TriÉlec, September 2012 - March 2012 (p. 15)
Figure 4: Google Trends Dashboard - weekly Google queries for the five main candidates between October 2011 and March 2012 (p. 17)
Figure 5: Google Trends Dashboard - weekly Google queries for the five main candidates between October 2011 and March 2012 (p. 19)
Figure 6: Google Trends Dashboard - weekly Google queries for the four smallest candidates between October 2011 and March 2012 (p. 19)
Figure 7: Google Trends for all candidates during the six months preceding the election, rescaled by the author (% of all queries for candidates) (p. 20)
Figure 8: Aggregation of all the polls during the six months preceding the election (TNS, Harris, OpinionWay, Ifop, CSA, BVA, LH3) (p. 22)
Figure 9: Comparison of weekly Google Trends for all candidates with polls from all institutes (TNS, Harris, OpinionWay, Ifop, CSA, BVA, LH3) (p. 24)
Figure 10: Variances of Google Trends and polls (p. 25)
Figure 11: χ² sum = statistical distance between Google Trends and polls (p. 27)
Figure 12: Autocorrelograph of polls (IFOP) for Nicolas Sarkozy between October 2011 and April 2012 (p. 28)
Figure 13: Autocorrelograph of Google Trends for Francois Hollande between October 2011 and April 2012 (p. 29)
Figure 14: Autocorrelograph of Google Trends for Marine Le Pen between October 2011 and April 2012 (p. 29)
Figure 15: Regression of the model for Nicolas Sarkozy (Stata) (p. 30)
Figure 16: Regression of model 1 for Phillipe Poutou (Stata) (p. 30)
Figure 17: Regression of model 1' with extra lags for Phillipe Poutou (Stata) (p. 31)
Figure 18: Regression of model 2 for Francois Hollande (Stata) (p. 32)
Figure 19: Fitted values for Nicolas Sarkozy based on the regression of model 1 (p. 32)
Figure 20: Predictions of model 1 with daily IFOP polls and Google Trends (p. 33)
Figure 21: Three phases of the campaign (p. 38)
Figure 22: Daily Google searches for main candidates at the end of year 2011 (p. 39)
Figure 23: Daily Google searches for five main candidates from January - 15th March 2012 (p. 40)
  • 52. Figure 24: Daily Google searches for five main candidates from 15th March 2012 - 22nd April (election day) (p. 42)
Figure 25: Sarkozy and UMP vocabulary growth in press since January 1, 2012. Number of new words (in thousands), variable centered and reduced (p. 43)
Figure 26: Daily Google searches for four of the minor candidates from November 2011 to 22nd April (election day) (p. 43)
Figure 27: Daily Google searches for small candidates (less than 2% of vote) from 15th March 2012 - 22nd April (election day) (p. 44)
Figure 28: Fixed effect panel data regression output (p. 54)
Figure 29: Correlation between polls and Google Trends and their lags (p. 55)
Figure 30: Interest for elections and campaigns in France from October 2011 to May 2012 (p. 56)
Figure 31: Interest in politics in France from October 2011 to May 2012 (p. 56)
  • 53. Tables

Table 1: Results of the first round of the election (22nd April 2012) (p. 11)
Table 2: Individuals who used the internet, 2012 (p. 12)
Table 3: Relative weight of internet search engines in France in March 2013. Source: AT Internet (p. 13)
Table 4: The primary source of political information during election campaigns in 2007 and 2012 (p. 14)
Table 5: Second source of political information during the 2007 and 2012 election (p. 14)
Table 6: Estimate of the number of monthly Google queries (in thousands) in 2013. Source: Google AdWords (p. 20)
Table 7: Predictions of election results with the two models (p. 33)
Table 8: P-values of z-test for samples of 1000 people (p. 34)
Table 9: Distance between the prediction or polls and the election results (p. 35)
Table 10: Average daily Google queries during the 1st phase (p. 38)
  • 54. 54 Appendices
    A. Fixed effect panel regression: inconclusive method
    After reshaping the poll and Google Trends data for each candidate (numbered from 1 to 10 as in Table 1 of this paper) into long format with Stata, it is interesting to perform a fixed-effects panel data regression.
    FIGURE 28: FIXED EFFECT PANEL DATA REGRESSION OUTPUT
    . xtset candidat date2
           panel variable:  candidat (strongly balanced)
            time variable:  date2, 03nov2011 to 22apr2012
                    delta:  1 day
    . xtreg poll L.poll trend L.trend, fe
    Fixed-effects (within) regression               Number of obs      =      1526
    Group variable: candidat                        Number of groups   =         9
    R-sq:  within  = 0.2086                         Obs per group: min =       168
           between = 0.9834                                        avg =     169.6
           overall = 0.9190                                        max =       170
                                                    F(3,1514)          =    132.99
    corr(u_i, Xb)  = 0.9440                         Prob > F           =    0.0000
    ------------------------------------------------------------------------------
            poll |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            poll |
             L1. |   .2064679    .011519    17.92   0.000     .1838731    .2290628
           trend |
             --. |   .0438295   .0080849     5.42   0.000     .0279708    .0596883
             L1. |  -.0085312   .0080348    -1.06   0.289    -.0242917    .0072294
           _cons |   .0562141   .0010687    52.60   0.000     .0541178    .0583105
    -------------+----------------------------------------------------------------
         sigma_u |  .07077958
         sigma_e |  .01220537
             rho |  .97112246   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0:     F(8, 1514) =   523.68             Prob > F = 0.0000
    These results are difficult to interpret politically without the fixed effects for each candidate. However, we can already note that the coefficient for Google Trends is significantly positive (t-statistic of 5.42), whereas the coefficient for its one-day lag is not significant (p = 0.289). This result is quite satisfactory because it suggests that Google Trends has explanatory power over the polls. The correlation table below confirms the link between Google Trends and the polls (overall correlation of 0.62 for all candidates).
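To make the "within" estimator behind the fixed-effects regression above more concrete, here is a minimal sketch in Python. It is not the author's Stata procedure: it handles a single regressor only (no lagged terms), and the function name `within_slope` and the toy numbers are hypothetical, chosen purely for illustration. The idea is that each candidate's own mean is subtracted from both series before the slope is estimated, so only variation over time within a candidate is used.

```python
from collections import defaultdict


def within_slope(groups, x, y):
    """Slope of y on x after subtracting each group's mean from x and y.

    This is the fixed-effects ("within") estimator for one regressor.
    """
    # Accumulate sum of x, sum of y and the count for every group.
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for g, xi, yi in zip(groups, x, y):
        acc = sums[g]
        acc[0] += xi
        acc[1] += yi
        acc[2] += 1

    num = 0.0
    den = 0.0
    for g, xi, yi in zip(groups, x, y):
        sx, sy, n = sums[g]
        dx = xi - sx / n  # deviation from the group mean of x
        dy = yi - sy / n  # deviation from the group mean of y
        num += dx * dy
        den += dx * dx
    return num / den


# Hypothetical toy data: two candidates, three days each (shares, not real values).
groups = ["a", "a", "a", "b", "b", "b"]
trend = [0.10, 0.20, 0.30, 0.50, 0.60, 0.70]
poll = [0.25, 0.27, 0.29, 0.05, 0.07, 0.09]

fe_slope = within_slope(groups, trend, poll)
# Treating all observations as one group gives the pooled slope for comparison.
pooled_slope = within_slope(["all"] * len(trend), trend, poll)
print(fe_slope, pooled_slope)
```

In this toy example the pooled slope is negative while the within slope is positive, which shows why candidate-level effects matter when reading such a regression.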
  • 55. 55 FIGURE 29: CORRELATION BETWEEN POLLS AND GOOGLE TRENDS AND THEIR LAGS
    . pwcorr poll trend L.poll L.trend, sig star(.001)
                 |     poll    trend   L.poll  L.trend
    -------------+------------------------------------
            poll |   1.0000
           trend |   0.6109*  1.0000
                 |   0.0000
          L.poll |   0.9719*  0.5918*  1.0000
                 |   0.0000   0.0000
         L.trend |   0.6174*  0.9357*  0.6109*  1.0000
                 |   0.0000   0.0000   0.0000
    The panel data regression was not used in the rest of this paper because Google Trends data is standardized over time, and both the polls and the Google Trends series for each candidate are given as a percentage of the total, which erases any time or cross-sectional effects. We have therefore studied the data for each candidate as an autonomous time series and then compared the predictions.
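As an illustration of what a correlation between polls and lagged Google Trends measures for a single candidate's time series, the following Python sketch computes a Pearson correlation between a series and a lagged copy of another. The helper names `pearson` and `lagged_corr` and the toy series are assumptions made for this example; the paper's own figures come from Stata's `pwcorr` on the pooled data.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def lagged_corr(y, x, lag=1):
    """Correlation between y[t] and x[t - lag] over the overlapping dates."""
    if lag == 0:
        return pearson(y, x)
    return pearson(y[lag:], x[:-lag])


# Hypothetical series: the poll simply repeats yesterday's trend value.
trend = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
poll = [0.0, 1.0, 3.0, 2.0, 5.0, 4.0]

same_day = lagged_corr(poll, trend, lag=0)
one_day_lag = lagged_corr(poll, trend, lag=1)
print(same_day, one_day_lag)
```

Here the one-day-lag correlation is higher than the same-day correlation by construction, because the toy poll series is a shifted copy of the trend series.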
  • 56. 56 B. Predicting the turnout with Google Trends
    The Category filter of Google Trends shows the change over time of a category of queries as a percentage of growth with respect to the first date on the graph (or the first date that has data). Instead of a 0-100 label on the y-axis, the category comparison graph ranges from -100% to +100%, with a starting point at 0. The study of elections requires at least weekly results, which limits Google Trends to 3 years at a time (for a longer period, Google Trends only gives monthly results). Periods of two years can very easily be linked together by continuity. Looking at interest in politics in France, it would be interesting to predict the turnout of the elections.
    FIGURE 30: INTEREST FOR ELECTIONS AND CAMPAIGNS IN FRANCE FROM OCT 2011 TO MAY 2012
    FIGURE 31: INTEREST IN POLITICS IN FRANCE FROM OCT 2011 TO MAY 2012
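The two operations described above, expressing a series as percentage growth from its first date and linking two periods together by continuity, can be sketched as follows in Python. This is one possible reading, not a documented Google Trends procedure: it assumes the two periods share at least one overlapping date and rescales the second period so that it agrees with the first on the overlap. The function names and the numbers are hypothetical.

```python
def chain_windows(first, second, overlap):
    """Append `second` to `first`, rescaled to match on the overlapping dates.

    The last `overlap` points of `first` and the first `overlap` points of
    `second` are assumed to cover the same dates (an assumption of this sketch).
    """
    tail = first[-overlap:]
    head = second[:overlap]
    factor = sum(tail) / sum(head)  # fails if the overlap in `second` is all zeros
    return list(first) + [value * factor for value in second[overlap:]]


def growth_from_start(series):
    """Percentage change of every point relative to the first point."""
    base = series[0]
    return [100.0 * (value - base) / base for value in series]


# Hypothetical weekly indices for two periods sharing one date.
period_1 = [40.0, 50.0, 40.0]
period_2 = [80.0, 100.0, 60.0]  # same interest, but on a different scale

linked = chain_windows(period_1, period_2, overlap=1)
growth = growth_from_start(linked)
print(linked)
print(growth)
```

With a longer overlap, the ratio of sums averages out week-to-week noise in the rescaling factor, at the cost of requiring more shared dates.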
  • 57. 57 C. Comparing candidates’ websites now and in 2007: bridging the gap?
    The data reveals that, despite the media hype, online electioneering in France is still at an intermediary stage, especially in terms of participation tools. Significant differences were found among candidates and, especially, parties. The gap between large and small parties is found to be greater than in most similar country studies, thus providing new evidence against the internet's ability to level the political playing field. Distinctive patterns of online electioneering emerge between conservative and progressive parties and candidates (Vaccari, 2008).
                           PS   Royal   UMP   Sarkozy   UDF   Bayrou   FN   Le Pen
    Information (%)        72    60     81     63       41     50     63     60
    Participation (%)      87    70     67     60       37     53     23      3
    Professionalism (%)    70    65     61     74       39     61     61     43
    Overall quality (%)    76    65     70     66       39     55     49     36
    MAJOR PRESIDENTIAL CANDIDATES’ SITES AND THEIR PARTIES’ SITES, APRIL 2007 (VACCARI, 2008) (8)
    (8) “The information macro-section accounts for ‘pull’ (user-initiated) and ‘push’ (party-initiated) information supply, and targeting of different groups of voters via dedicated tools. Participation entails online interactivity, resource mobilization, and decentralization of communication. Finally, professionalism is measured with respect to design and multimedia features, site accessibility, navigability, and frequency of updates.”
  • 58. 58
    [Figure: bar chart, scale 0 to 100, of Information (%), Participation (%), Professionalism (%) and Overall quality (%) for PS, Royal, UMP, Sarkozy, UDF, Bayrou, FN and Le Pen]
    PERFORMANCES OF CANDIDATES’ AND PARTIES’ WEBSITES DURING THE 2007 PRESIDENTIAL CAMPAIGN
    [Figure] AVAILABILITY OF CANDIDATES’ WEBSITES (BOURDEAU, 2012)
    [Figure: bar chart, scale 0 to 6 seconds, for Bayrou, Joly, Sarkozy, Mélenc…, Hollande and Le Pen. Source: Compuware 2012]
    RESPONSE TIME (IN SECONDS) OF THE CANDIDATES’ WEBSITES (BOURDEAU, 2012)
  • 59. 59 D. Weekly Google Trends and timeline of the main events during the campaign
    31st December: the President's New Year greetings.
    29th January 2012: major television interview as President, in which he announced a VAT increase.
    15th February: Nicolas Sarkozy announces his candidacy.
    2nd March: Sarkozy at the opening of the Olympic games and changes his government team.
    7th March: TV interview and debate with Laurent Fabius on France 2.
    12th March: TV show “Paroles de candidat” on TF1 with a panel of voters.
    [Figure: Google Trends for candidates during the six months before the election; y-axis 0% to 100%, x-axis 06/11/2011 to 06/04/2012; series: sarkozy, bayrou, hollande, melenchon, joly, le pen, cheminade, dupont-aignan, arthaud, poutou]
    [Figure: Google Trends for 5 main candidates during the six months before the election (shares of weekly queries); y-axis 0% to 60%, x-axis 06/11/2011 to 06/04/2012; series: sarkozy, bayrou, hollande, melenchon, le pen]