1.
Statistical mechanics of SCOTUS
Edward D. Lee (1,2), Chase P. Broedersz (1,2), William Bialek (1,2)
(1) Department of Physics, Princeton U. (2) Lewis-Sigler Institute for Integrative Genomics, Princeton U.
2.
From complex to simple
Group behavior of social organisms can manifest complex patterns at the group level even though it might reduce to simple rules for individuals [1-3]. For example, starlings seem to interact only with local neighbors, and these local interactions produce global patterns like the one on the right. We call these emergent collective phenomena because such behavior is not explicitly encoded in the behavior of individuals, but arises from interactions.
Is it the case that decisions by groups of people, despite apparent higher order behaviors, are likewise reducible to simpler rules?
Figure: Flocks of starlings form complex patterns (http://webodysseum.com/videos/spectacular-starling-flocks-video-murmuration/).
3.
US Supreme Court (SCOTUS)
We investigate the voting behavior of the SCOTUS. We explore the structure of decisions to gain insight into how these decisions are reached. We show results for the second Rehnquist Court (1994-2005, 895 cases) and discuss other "natural courts" (periods of time when the membership remains constant) where relevant.
SCOTUS facts:
• highest court in the US government
• nine Justices appointed for life
• vote on constitutionality of legislative and executive actions
• usually hears appeals of lower courts' decisions, which Justices affirm or reverse
• sometimes Justices are recused for conflict of interest, sickness, etc.
• decisions by majority vote
• write a majority and a minority opinion, legally clarifying their decisions, which can be supplemented with separate opinions
• Justices must ultimately render a binary decision
• the second Rehnquist Court is typically considered as 4 liberals and 5 conservatives
4.
Previous approaches to SCOTUS voting
Although there is a long history of scholarly study of SCOTUS, nearly all approaches rely on important assumptions, even though these may be justified. Some assume Justices vote independently given ideological preferences [5]. Others, if including interactions, do not include them in a general model of voting or posit their structure instead of deriving it from data [6]. Most importantly, many draw on complex, underlying cognitive frameworks like rationality or expression of internal beliefs, which are validated by predictions made of the data [5-8]. However, it is impossible to validate all aspects of these complex models. Is there a way to construct accurate models while abstaining from introducing complexities?
Characteristics of previous approaches (figure labels: Attitudinal, Game theoretical):
• Decision-making framework [7-8]
• Independent voters given ideological preferences [5]
• Ideological liberal vs. conservative axis [5,7-8]
• Externally posited interaction [6]
• Subset of votes deemed relevant [5]
• Not generally predictive
How do we approach the system with minimal assumptions?
5.
Back to the data
The data, published by [4], immediately reveal strong structure in Court behavior. For example, we see that in the second Rehnquist Court, 44% of votes were unanimous. Overall, when considering the natural courts shown on the right, 36% of votes are unanimous on average. Only 10% of votes fall along the liberal vs. conservative divide. Does an independent model support this observation?
Figure: Distribution of voting data for natural courts starting in given year. Blue, 0 dissenting votes; red, 1; yellow, 2; green, 3; black, 4. Terms are the number of years that the same members remained on the Court. The number of votes on record for each set of years is in gray.
6.
The simplest model: Independent voters
Each Justice $i$ has probability $p_i$ of voting to affirm. For a common probability $p$, the probability of $k$ votes in the majority out of $N$ Justices is
$$P(k) = \binom{N}{k}\left[p^{k}(1-p)^{N-k} + p^{N-k}(1-p)^{k}\right]$$
An independent model fails to explain the distribution of votes in the majority. This is not so surprising because we've assumed that all higher order behaviors are encoded within the first moments. For example, for a unanimous vote ($k = 9$) to occur in the independent model, all Justices would have to happen to vote the same way, but this happens much more rarely than observed, yielding 0.5% of the observed value. Indeed, interactions are crucial to SCOTUS voting behavior.
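As a rough illustration of the independent model, the sketch below enumerates all vote configurations and accumulates the probability of each majority size. It is a minimal sketch, not the authors' code: the function name `majority_size_distribution` and the example probabilities (0.6 for every voter) are assumptions chosen only for illustration.

```python
from itertools import product

def majority_size_distribution(p):
    """Return {k: probability} that exactly k of the N independent voters
    end up in the majority, where p[i] is voter i's probability of affirming."""
    n = len(p)
    dist = {}
    for votes in product((0, 1), repeat=n):      # 1 = affirm, 0 = reverse
        prob = 1.0
        for v, p_i in zip(votes, p):
            prob *= p_i if v else 1.0 - p_i      # independence: probabilities multiply
        affirms = sum(votes)
        k = max(affirms, n - affirms)            # size of the larger side
        dist[k] = dist.get(k, 0.0) + prob
    return dist

# Hypothetical example: nine voters who each affirm with probability 0.6.
dist = majority_size_distribution([0.6] * 9)
```

With these made-up inputs, `dist[9]` is the probability of a unanimous vote under independence, which can then be compared against an observed unanimity rate.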
7.
More evidence for interaction
Justices take at least two votes for a case: an initial secret vote and a final decision. Justices may attempt to persuade each other in between, but it is difficult to measure such interactions partly because the first vote is secret. According to data available on the Waite Court (1874-1887), 9% of final votes had at least one dissenting vote while 40% had at least one in the initial vote [5]. In general, Justices more often switch to the majority than the reverse, suggestive of consensus-promoting interaction. Maltzmann et al. show from memos that Justices strategically manipulate their communication to attempt to influence the vote and written opinions of the Court [6].
Nonetheless, most models either treat the Justices as independent or do not explicitly include interactions in a predictive framework. Indeed, it is difficult to devise the right structure for interactions!
How do we account for interactions in a principled fashion?
8.
Including interactions...
Given that Justice $i$ can vote in two ways, we represent his or her vote as $\sigma_i \in \{-1, 1\}$. Then, the independent model is the simplest model that fits the average voting records $\{\langle\sigma_i\rangle\}$, and all higher order correlations $\{\langle\sigma_i\sigma_j\rangle, \langle\sigma_i\sigma_j\sigma_k\rangle, \ldots, \langle\sigma_i\sigma_j\cdots\sigma_N\rangle\}$ are reducible to $\{\langle\sigma_i\rangle\}$. In fact, all higher order statistics are as random as possible given the individual means, so there is no reason any higher order correlations should match those of the data. We can generalize this idea to a $k$th order model that fits all correlations up to order $k$ yet generates all correlations of order greater than $k$ randomly. Since these distributions are as random as possible given what is fit, we make no assumptions beyond the fitted correlations and none about how these distributions are generated.
With SCOTUS, we might expect that we need to account for the bloc behavior (5 vs. 4) and unanimous behavior by including terms of the 4th, 5th and 9th orders explicitly. However, let us take only the next step of fitting both $\langle\sigma_i\rangle$ and $\langle\sigma_i\sigma_j\rangle$.
9.
...as maximizing entropy
The formalization for generating these distributions is called the principle of maximum entropy [9]. Entropy is a measure of the randomness of a distribution. The entropy of a probability distribution $p(\sigma)$ of the votes of a set of $N$ voters $\sigma = \{\sigma_1, \ldots, \sigma_N\}$ is
$$S[p] = -\sum_{\sigma} p(\sigma) \log p(\sigma)$$
which we maximize while constraining $\langle\sigma_i\rangle$ and $\langle\sigma_i\sigma_j\rangle$:
$$\mathcal{L}[p; h_i, J_{ij}] = S[p] + \sum_{i=1}^{N} h_i \langle\sigma_i\rangle + \frac{1}{2}\sum_{i,j=1}^{N} J_{ij} \langle\sigma_i\sigma_j\rangle$$
with Lagrange multipliers $h_i$, $J_{ij}$. The resulting model is known as the Ising model
$$p(\sigma) = \frac{1}{Z} e^{-H(\sigma)}, \qquad H(\sigma) = -\sum_{i=1}^{N} h_i \sigma_i - \frac{1}{2}\sum_{i,j=1}^{N} J_{ij} \sigma_i \sigma_j$$
with a normalizing constant, the partition function $Z$, and Hamiltonian $H(\sigma)$.
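For nine voters there are only 2^9 = 512 states, so the pairwise (Ising) distribution can be evaluated by brute-force enumeration. The sketch below does this for given fields and couplings; the function name `ising_distribution` and the two-voter toy couplings are assumptions for illustration, and the diagonal of `J` is assumed to be zero.

```python
from itertools import product
from math import exp, tanh

def ising_distribution(h, J):
    """Return {state: p(state)} with p = exp(-H) / Z and
    H(s) = -sum_i h[i] s_i - (1/2) sum_ij J[i][j] s_i s_j."""
    n = len(h)
    states = list(product((-1, 1), repeat=n))
    weights = []
    for s in states:
        field_term = sum(h[i] * s[i] for i in range(n))
        pair_term = 0.5 * sum(J[i][j] * s[i] * s[j]
                              for i in range(n) for j in range(n))
        weights.append(exp(field_term + pair_term))   # this is exp(-H)
    z = sum(weights)                                  # partition function
    return {s: w / z for s, w in zip(states, weights)}

# Hypothetical toy: two voters, no bias, symmetric coupling of strength 1.
probs = ising_distribution([0.0, 0.0], [[0.0, 1.0], [1.0, 0.0]])
corr = sum(p * s[0] * s[1] for s, p in probs.items())
```

For this toy case the pairwise correlation `corr` equals `tanh(1)`, the textbook result for two coupled spins, which gives a quick sanity check on the sign and factor-of-1/2 conventions.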
11.
Mapping spins
We have yet to define how the values of $\sigma_i$ correspond to actual votes. It is not as simple as calling one value affirm and the other reverse: the outcome of affirming or reversing depends on how the case is posed. It is entirely possible that affirming is a liberal decision in one case and a conservative decision in another. What is the right dimension along which to orient the $\sigma_i$? We abstain from making a choice, and from introducing external bias, by symmetrizing the up and down votes such that $-1$ and $1$ are equivalent.
This keeps $\langle\sigma_i\sigma_j\rangle$ the same and fixes $\langle\sigma_i\rangle = 0$. Correspondingly, $h_i = 0$. We find that absence of a bias is a reasonable assumption because bias is not the dominant term for judicial voting behavior.
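One simple way to realize this symmetrization, sketched here as an assumption rather than the authors' procedure, is to add a sign-flipped copy of every recorded vote vector. The toy record and the helper names are hypothetical.

```python
def symmetrize(votes):
    """Return the votes plus a sign-flipped copy of every vote vector."""
    return list(votes) + [tuple(-s for s in v) for v in votes]

def mean_votes(votes):
    """Average vote of each voter over all cases."""
    n = len(votes[0])
    return [sum(v[i] for v in votes) / len(votes) for i in range(n)]

def pair_correlations(votes):
    """Matrix of averages of s_i * s_j over all cases."""
    n = len(votes[0])
    return [[sum(v[i] * v[j] for v in votes) / len(votes) for j in range(n)]
            for i in range(n)]

# Hypothetical toy record: three voters, four cases, votes coded as +1 / -1.
toy_votes = [(1, 1, 1), (1, 1, -1), (-1, -1, 1), (1, 1, 1)]
sym_votes = symmetrize(toy_votes)
```

Flipping every vector leaves each product of two votes unchanged while cancelling the individual means, so the pairwise averages of `sym_votes` match those of `toy_votes` and every mean is zero.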
12.
Model fit
Remarkably, the Ising model fits the data well. One measure of the fit is to consider the difference in entropy of the $k$th order model with the data, $I_k = S_k - S_{\mathrm{data}}$ [2]. As we increase $k$, we capture more correlation and the entropy of our models monotonically decreases to that of the data, where $S_N = S_{\mathrm{data}}$. The furthest distance $I_1$ is called the multi-information. Our model captures 90% of the multi-information (right).
Thus, it captures nearly all the structure in the data. It also follows $-\log p(\sigma) \sim H(\sigma)$ for the most frequent states. The least fit states only appear once or twice on average in a bootstrap sample of the data.
Figure: The model captures 90% of the multi-information.
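The fraction of multi-information captured by a model can be written in terms of three entropies: that of the independent model, that of the fitted model, and that of the data. The sketch below computes it for a made-up two-voter example; the helper names, the use of bits as the unit, and the toy probabilities are all assumptions for illustration.

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * log2(p) for p in probs if p > 0)

def fraction_captured(s_independent, s_model, s_data):
    """Fraction of the multi-information I_1 = S_1 - S_data that a model
    with entropy s_model accounts for, i.e. 1 - I_k / I_1."""
    multi_information = s_independent - s_data
    remaining = s_model - s_data
    return 1.0 - remaining / multi_information

# Hypothetical toy: two unbiased voters who always agree.
s_data = entropy([0.5, 0.5])        # only (+1,+1) and (-1,-1) occur
s_indep = entropy([0.25] * 4)       # independent model with zero means
perfect = fraction_captured(s_indep, s_data, s_data)
nothing = fraction_captured(s_indep, s_indep, s_data)
```

A model whose entropy equals that of the data captures all of the multi-information (`perfect` is 1.0), while the independent model itself captures none (`nothing` is 0.0).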
13.
Implications of Ising model fit
The fit by the Ising model shows that higher order behaviors like ideological blocs and unanimity can emerge from lower order behaviors at the level of pairwise interactions between individuals. Including higher order terms will result in only a marginal improvement in the fit.
This result is surprising because it suggests that higher level coordination is not the dominant explanation of voting behavior. Previously, scholars have pointed to the high level of consensus in the Court as evidence for a "norm of consensus," which seems analogous to an effective ninth order term for behavior [6]. Our results point to a different sort of decision-making structure.
14.
Found coupling network
Figure: $C_{ij} = \langle\sigma_i\sigma_j\rangle - \langle\sigma_i\rangle\langle\sigma_j\rangle$ and $J_{ij}$ graphs. Justices with a liberal voting record are colored blue whereas those with a conservative record are colored red. Positive edges are red and negative edges blue. Widths are proportional to magnitude. All $C_{ij}$ are positive whereas some $J_{ij}$ are negative. Justices are initialed: John Stevens (JS), Ruth Ginsburg (RG), David Souter (DS), Stephen Breyer (SB), Sandra O'Connor (SO), Anthony Kennedy (AK), William Rehnquist (WR), Antonin Scalia (AS), Clarence Thomas (CT).
15.
Understanding couplings
As a simple check, we see that the average $J_{ij}$ within ideological blocs (blue to blue or red to red) is positive while the average between blocs (blue to red) is negative (previous slide). The corresponding averages of $C_{ij}$ also show this relative change although all $C_{ij}$ are positive, obscuring the antagonistic tendency.
To better understand the distribution of $J_{ij}$, we consider the effective field on $\sigma_i$ from its neighbors,
$$h_i^{\mathrm{eff}} = \frac{1}{2}\sum_{j=1}^{N} J_{ij}\sigma_j$$
Note that it depends on the state of the neighbors $\sigma_j$. Since this distribution over all $\sigma$ is symmetric around 0, we only show the positive half (right).
We fix $\sigma_i = 1$ and compare the shifts in the distributions of $h_j^{\mathrm{eff}}$ of its neighbors $j$, which we measure by taking the mean over standard deviation $\mu/\Sigma_{ij}$ of $p(h_j^{\mathrm{eff}})$. In the absence of such fixing, $\mu/\Sigma_{ij} = 0$ (next slide).
Figure: Distributions $p(h_i^{\mathrm{eff}})$. Red histogram is the distribution of fields from only conservative Justices. Ordered from most liberal to most conservative voting record from left to right, top to bottom. The more conservatively a Justice votes, the more the mean field due to conservatives marches to the right.
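The shift measure can be sketched by enumerating the pairwise model with zero fields, keeping only the states where one voter is held at +1, and taking the mean over standard deviation of a neighbor's effective field. This is one possible reading of "fixing" (conditioning within the model), not necessarily the authors' procedure; the function names and the four-voter, two-bloc coupling matrix are hypothetical.

```python
from itertools import product
from math import exp, sqrt

def pairwise_probs(J):
    """p(s) = exp(-H(s)) / Z with H(s) = -(1/2) sum_ij J[i][j] s_i s_j (all h_i = 0)."""
    n = len(J)
    states = list(product((-1, 1), repeat=n))
    weights = [exp(0.5 * sum(J[i][j] * s[i] * s[j]
                             for i in range(n) for j in range(n)))
               for s in states]
    z = sum(weights)
    return {s: w / z for s, w in zip(states, weights)}

def field_shift(J, j, fixed=None):
    """Mean over standard deviation of h_j^eff = (1/2) sum_k J[j][k] s_k.
    If fixed is an index i, only states with s_i = +1 are used (assumption)."""
    n = len(J)
    kept = {s: p for s, p in pairwise_probs(J).items()
            if fixed is None or s[fixed] == 1}
    norm = sum(kept.values())
    fields = [(0.5 * sum(J[j][k] * s[k] for k in range(n)), p / norm)
              for s, p in kept.items()]
    mean = sum(f * p for f, p in fields)
    var = sum((f - mean) ** 2 * p for f, p in fields)
    return mean / sqrt(var)

# Hypothetical couplings: blocs {0, 1} and {2, 3}, strong within, weak between.
J_toy = [[0.0, 1.0, 0.1, 0.1],
         [1.0, 0.0, 0.1, 0.1],
         [0.1, 0.1, 0.0, 1.0],
         [0.1, 0.1, 1.0, 0.0]]
within = field_shift(J_toy, 1, fixed=0)    # bloc-mate of the fixed voter
between = field_shift(J_toy, 2, fixed=0)   # member of the other bloc
unfixed = field_shift(J_toy, 1)            # no voter held fixed
```

In this toy setup the shift is zero without fixing, and the bloc-mate of the fixed voter shifts more than a member of the other bloc, which mirrors the qualitative pattern described on the slide.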
16.
Shifts in $h^{\mathrm{eff}}$
Figure: Average shifts $\mu/\Sigma_{ij}$ in the distributions $p(h_j^{\mathrm{eff}})$ over ideological blocs (liberals, conservatives) when holding one member of a bloc, $i$, at 1 at a time. Within-bloc shifts: $\mu/\Sigma_{ij} = 4.7$ and $4.3$. Cross-bloc shifts, e.g. the average shift in liberals ($j$) when holding conservatives fixed ($i$): $\mu/\Sigma_{ij} = 0.8$ in both directions.
As expected, ideological neighbors are much more affected by fixing $\sigma_i = 1$, by a factor of 5-6, showing that ideological blocs are a natural division of the Court. Overall, the Court always shifts in the same direction as the perturbation. Thus, we find that higher order behaviors such as ideological blocs and general unanimity are reflected in the couplings.
O'Connor and Kennedy shift conservatives (liberals) to $\mu/\Sigma_{ij} = 2.6$ (1.4) and $\mu/\Sigma_{ij} = 3.1$ (1.1), reaffirming their moderate credentials. Stevens, however, has the weakest connections overall with $\mu/\Sigma_{ij} = 0.36$ (2.72), as if more isolated.
17.
Caveats with couplings
We must be careful not to interpret the $J_{ij}$ literally as corresponding to behavioral interaction on the Court. The distinction that we cannot make, which is indeed impossible with this data set, is to explain the underlying mechanism for correlations. We may find two Justices that vote together too much for chance, but it could be the case that either they collaborate to a large extent or that their perspectives have been shaped by a similar background. The latter involves a hidden third actor, but it is indistinguishable from the former with only the voting record. In many ways, possible confounding factors that contribute to $J_{ij}$ reflect fundamental limitations of the data.
Our guiding principle is that we refrain from assuming anything beyond what is already given by the data; other models do not have the same claim to minimal assumptions. Furthermore, we know from anecdotal evidence that Justices persuade each other, so there are certainly interactions captured by the $J_{ij}$.
18.
Probing influence
With this model of voting behavior, we can probe the behavior of the system under perturbations.
The quantity of interest here is the majority outcome of the Court,
$$\gamma = \frac{\sum_i \sigma_i}{\left|\sum_i \sigma_i\right|}$$
because this is the decision rendered.
How sensitive is the average decision to small changes in the average behavior $\langle\sigma_i\rangle$ of a Justice?
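One way to make this question concrete, offered only as an assumed reading and not as the authors' method, is to nudge the field on one Justice and compare the resulting change in the average majority outcome with the change in that Justice's average vote. The sketch uses central finite differences on an enumerated pairwise model; all function names, the step size, and the three-voter toy couplings are hypothetical, and an odd number of voters is assumed so that the majority is always defined.

```python
from itertools import product
from math import exp

def averages(h, J):
    """Return (average majority outcome, [average vote of each voter])
    under p(s) ~ exp(sum_i h_i s_i + (1/2) sum_ij J_ij s_i s_j)."""
    n = len(h)
    z, gamma_sum, s_sums = 0.0, 0.0, [0.0] * n
    for s in product((-1, 1), repeat=n):
        w = exp(sum(h[i] * s[i] for i in range(n))
                + 0.5 * sum(J[i][j] * s[i] * s[j]
                            for i in range(n) for j in range(n)))
        total = sum(s)
        gamma = total / abs(total)     # +1 or -1; needs odd n so total != 0
        z += w
        gamma_sum += w * gamma
        for i in range(n):
            s_sums[i] += w * s[i]
    return gamma_sum / z, [x / z for x in s_sums]

def sensitivity(J, i, eps=1e-3):
    """Change in average majority outcome per unit change in voter i's
    average vote, estimated by perturbing h_i around zero (assumption)."""
    n = len(J)
    up, down = [0.0] * n, [0.0] * n
    up[i], down[i] = eps, -eps
    g_up, s_up = averages(up, J)
    g_down, s_down = averages(down, J)
    return (g_up - g_down) / (s_up[i] - s_down[i])

# Hypothetical toy: three uncoupled voters.
J_zero = [[0.0] * 3 for _ in range(3)]
sens = sensitivity(J_zero, 0)
```

For three uncoupled, unbiased voters the result is 0.5: a voter decides the outcome only when the other two split, which happens half the time. Nonzero couplings change this value differently for different voters.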
20.
Are ideological medians influential?
The typical wisdom in the political science literature is that the ideological medians are the most influential for Court decisions. Basically, the argument is that voters who sit in the middle of a unidimensional, symmetric preference space will be predictive of the majority [10]. The relevant space is liberal vs. conservative, as we have confirmed with $h_i^{\mathrm{eff}}$. The Justices to whom the outcome is most sensitive here are the ideological medians SO and AK, in agreement with the claim.
However, real systems may be complicated by interactions that constrain how such a voter may cast her vote or how a majority forms initially and persists. We do not find that it is the ideological medians to whom the outcome of the Court is most sensitive in general.
Importantly, our results are derived under minimal assumptions. We do not assume ideological behavior (which, though generally visible, still has to be imposed by the observer) and we account for interactions.
25.
Works cited
1. W. Bialek, A. Cavagna, et al., PNAS 109, 4786 (2012).
2. E. Schneidman, M. Berry, et al., Nature 440, 1007 (2006).
3. I. Couzin, J. Krause, et al., Nature 433, 7025 (2005).
4. H. J. Spaeth, L. Epstein, et al., Supreme Court Database (2011).
5. A. D. Martin & K. M. Quinn, Pol. Anal. 10, 134 (2002).
6. L. Epstein, J. A. Segal, et al., Am. J. of Pol. Sci. 83, 557 (2001).
7. S. Brenner & R. H. Dorff, J. of Th. Pol. 4, 2 (1992).
8. F. Maltzmann & J. F. Spriggs II, et al., Crafting law on the Supreme Court (2000).
9. E. T. Jaynes, Phy. Rev. 106, 620 (1957).
10. D. Black, J. of Pol. Econ. 56, 23 (1948).