1. Mathematical Modelling and Analysis of
Legislation Networks
Neda Sakhaee
University of Auckland
nsak206@aucklanduni.ac.nz
February 25, 2016
Neda Sakhaee (UOA) Legislation Network February 25, 2016 1 / 27
2. Overview
1 Introduction
2 Research Scope
Questions
Objectives and Methodologies
3 Progress To-Date and Initial Results
Building the Network
Network General Measures
Centrality, Important Nodes
Initial Community Detection
Modelling Example
Academic outcomes
4 Future Studies
Overview
Generative Models, Pattern Prediction
Community Detection Method
5 Requirements and Limitations
Data
Software
6 Time-table for my PhD project
3. Introduction
This research looks at Legislation Networks within the wider class of citation
networks.
The main case study is the New Zealand Legislation Network, but comparison
studies with other countries are included.
A Legislation Network is a multi-layer graph with some novel features
that make it an excellent test case for new network science tools.
Legislation Networks involve legal documents, but differ substantially from
citation networks built on case law, Supreme Court opinions, etc.
4. Introduction
Legislation Network
The concept of a Legislation Network was introduced in 2015 as a novel
approach to show the interdependence of European Union laws.
In a Legislation Network:
Nodes:
Laws (Acts, Regulations, etc.)
Edges:
An edge exists whenever one law references another (definitions, amendments, etc.).
We classify each edge as an amendment edge or a citation edge.
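The node and edge definitions above can be sketched as a small data structure. This is only an illustration: the `Reference` record and the `build_layers` helper are hypothetical names, not part of the project's code, and the two layers are keyed by the edge kinds named above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """One reference from a source law to a target law (hypothetical record)."""
    source: str  # the law that contains the reference
    target: str  # the law being referenced
    kind: str    # "citation" or "amendment"


def build_layers(references):
    """Group directed edges by kind, giving one adjacency map per layer."""
    layers = defaultdict(lambda: defaultdict(set))
    for ref in references:
        layers[ref.kind][ref.source].add(ref.target)
    return layers


# Toy input with placeholder Act names.
refs = [
    Reference("Act A", "Act B", "citation"),
    Reference("Act C", "Act A", "amendment"),
]
layers = build_layers(refs)
```

Keeping one adjacency map per edge kind makes it easy to analyse the citation layer, the amendment layer, or their union separately.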
5. Research Scope Questions
How to build a Legislation Network? Is the Legislation Network a
meaningful network, or just a set of random relationships?
What are the differences and similarities between Legislation Network
and other citation networks?
Can we measure the importance of legal documents using network
science tools? How does this relate to human expert opinion?
Is there any meaningful relationship between Legislation Network
measures and political or social processes?
6. Research Scope Questions
Do legal documents tend to cluster?
Can we find a good generative model for the Legislation Network?
How can we study the time evolution of the Legislation Network?
Is it possible to predict the attributes of the legal documents in the
Legislation Network?
Is it possible to predict the historical missing edges in the Legislation
Network?
7. Research Scope Objectives and Methodologies
Stage I
Investigate how a Legislation Network can be built and compared with
other networks.
Stage II
Implement appropriate centrality measures to determine and compare the
importance of legal documents.
Stage III
Develop a generative model of the Legislation Network.
8. Research Scope Objectives and Methodologies
Stage IV
Contribute to community detection algorithms for directed networks.
Stage V
Investigate relationships between network science properties and
political or social processes.
Stage VI
Propose link prediction and attribute prediction models for Legislation
Networks.
9. Progress To-Date and Initial Results Building the Network
We downloaded 8900 XML files of laws, dated from 1267 to October 2015,
from Legislation.govt.nz. Substantial manual cross-checking and data
cleaning is required.
A C# program extracts all types of references from the XML files.
This gives six different Acts networks, based on the node and edge type:
Binary Whole Network (BWN)
Weighted Whole Network (WWN)
Binary Citation Network (BCN)
Weighted Citation Network (WCN)
Binary Amendment Network (BAN)
Weighted Amendment Network (WAN)
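The six networks listed above can be derived from one list of extracted references. The sketch below is an illustration under stated assumptions: the input is a list of hypothetical `(source, target, kind)` tuples, and an edge weight is taken to be the number of times one Act references another, which the slides do not specify.

```python
from collections import Counter


def build_networks(references):
    """Return the six edge collections keyed by the abbreviations above.

    `references` is an iterable of (source, target, kind) tuples where kind is
    "citation" or "amendment"; repeated tuples mean repeated references.
    Assumption: weight = number of references between the same pair of Acts.
    """
    whole = Counter()
    by_kind = {"citation": Counter(), "amendment": Counter()}
    for source, target, kind in references:
        whole[(source, target)] += 1
        by_kind[kind][(source, target)] += 1
    return {
        "BWN": set(whole), "WWN": dict(whole),
        "BCN": set(by_kind["citation"]), "WCN": dict(by_kind["citation"]),
        "BAN": set(by_kind["amendment"]), "WAN": dict(by_kind["amendment"]),
    }


# Toy input: Act A cites Act B twice and amends it once.
nets = build_networks([
    ("A", "B", "citation"),
    ("A", "B", "citation"),
    ("A", "B", "amendment"),
])
```

Binary networks keep only the presence of an edge, while weighted networks keep the reference counts, so both views come from the same pass over the data.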
10. Progress To-Date and Initial Results Building the Network
Figure: Dataset Building Process (data extraction from Act records)
a) Citation Link
Act
Name: Marriage Act 1955
Type: Public
Date: 27/11/1955
Terminated: 0
Year: 1955
Reprint: 1
Date reprinted: 19/08/2013
Cites: Child Welfare Amendment Act 1948
b) Amendment Link
Act
Name: Marriage (Definition of Marriage) Amendment Act 2013
Type: Public
Date: 19/04/2013
Terminated: 0
Year: 2013
Reprint: 0
Date reprinted: -
Amends: Marriage Act 1955
11. Progress To-Date and Initial Results Building the Network
Data sets and Visualizations
Data set and visualisation are available at:
https://dataverse.harvard.edu/dataverse/LN
12. Progress To-Date and Initial Results Network General Measures
Measure                RN***   BWN     WWN     RN***   BCN     WCN     RN***   BAN     WAN
Nodes                  3856    3856    3856    2142    2142    2142    3856    3856    3856
Edges                  33884   33884   33884   20124   20124   20124   9030    9030    9030
Average Degree         9.712   8.878   13.233  10.112  9.395   17.257  3.207   2.342   3.648
Diameter               15      15      15      15      15      15      15      15      15
CCcyc                  0.003   0.223   0.223   0.004   0.492   0.492   0.001   0.031   0.031
CCmid                  0.003   0.305   0.305   0.004   0.655   0.655   0.001   0.066   0.066
CCin                   0.003   0.528   0.528   0.004   0.414   0.414   0.001   0.03    0.030
CCout                  0.003   0.506   0.506   0.004   0.374   0.374   0.001   0.033   0.033
Average CC*            0.003   0.446   0.446   0.004   0.484   0.484   0.001   0.004   0.004
Average Path length**  6.124   3.569   3.569   7.254   3.346   3.346   1.817   4.43    4.43
Small world            No      Yes     Yes     No      Yes     Yes     No      No      No
*Based on the directed clustering coefficient as proposed by B. M. Tabak et al. (2014).
**Based on the average path length as used by Watts and Strogatz (1998).
***A Random Network (RN) is a graph with a given number of vertices n and connection probability p. The
measures are calculated from a sample of 100 graphs.
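The random-network baseline described in the footnote can be sketched as follows. This is a minimal illustration, not the procedure actually used: it assumes a directed G(n, p) graph with p chosen so that the expected number of edges is m, and it averages path lengths only over ordered pairs that are reachable, which is an assumption the slides do not confirm. All function names are hypothetical.

```python
import random
from collections import deque


def random_digraph(n, p, rng):
    """Directed G(n, p): each ordered pair (u, v), u != v, is an edge with probability p."""
    adj = {u: [] for u in range(n)}
    for u in range(n):
        for v in range(n):
            if u != v and rng.random() < p:
                adj[u].append(v)
    return adj


def average_path_length(adj):
    """Mean shortest-path length over ordered pairs (u, v) with v reachable from u."""
    total, pairs = 0, 0
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:  # breadth-first search from `start`
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("nan")


def baseline(n, m, samples=100, seed=0):
    """Average path length over `samples` random graphs with m expected edges."""
    rng = random.Random(seed)
    p = m / (n * (n - 1))
    values = [average_path_length(random_digraph(n, p, rng)) for _ in range(samples)]
    return sum(values) / len(values)


# Toy sizes so the sketch runs quickly; the real networks are far larger.
print(round(baseline(n=60, m=300, samples=10), 3))
```

Comparing a measure on the real network with the same measure averaged over such random graphs is what the RN columns of the table are for.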
13. Progress To-Date and Initial Results Network General Measures
In-degree Out-degree correlation
Regular citation networks show no meaningful correlation between
out-degree and in-degree, but here the two are strongly correlated:
R result for Pearson's product-moment correlation (X = In-Degree, Y = Out-Degree):
t = 57.332, df = 2140, p-value < 2.2e-16
Alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval: [0.761, 0.794]
Sample estimate of correlation: 0.778
Figure: plot of Y (Out-Degree) against X (In-Degree); both axes run from 0 to 300.
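The reported test statistic can be checked from the correlation itself, since for Pearson's test t = r * sqrt(df / (1 - r^2)). The sketch below is illustrative only; the function names are hypothetical and the inputs are the rounded values reported above.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, df):
    """t statistic for testing whether a correlation r differs from zero."""
    return r * math.sqrt(df / (1.0 - r * r))


# Plug in the rounded values reported above: r = 0.778, df = 2140.
t_check = t_statistic(0.778, 2140)
```

With these rounded inputs `t_check` comes out at roughly 57.3, in line with the reported t = 57.332; the small gap is presumably due to r being rounded to three decimals.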
14. Progress To-Date and Initial Results Centrality, Important Nodes
Centrality measures describe the importance of nodes. As Borgatti (2005)
presents intuitively, each measure depends on an implicit model
of how traffic flows in the network.
We chose a standard measure, Eigenvector Centrality (similar to
PageRank), as our key measure; it was initially proposed by Bonacich in
1972.
Later, in the community detection part, we use Betweenness centrality,
as discussed by Borgatti (2005), to label the communities.
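Eigenvector centrality can be sketched with power iteration. The version below is a minimal illustration, not the implementation used for the results: it assumes a law gains importance from the laws that reference it, adds the identity to the adjacency matrix to help convergence, and rescales so that the top score is 1. The function name and toy edges are hypothetical.

```python
def eigenvector_centrality(edges, iterations=200):
    """Power iteration on (A^T + I); score flows from referencing to referenced node."""
    nodes = sorted({node for edge in edges for node in edge})
    incoming = {v: [] for v in nodes}
    for source, target in edges:
        incoming[target].append(source)
    score = {v: 1.0 for v in nodes}
    for _ in range(iterations):
        # Each node keeps its own score (the +I shift) and adds its referrers' scores.
        new = {v: score[v] + sum(score[u] for u in incoming[v]) for v in nodes}
        top = max(new.values())
        score = {v: value / top for v, value in new.items()}  # rescale so max is 1
    return score


# Toy graph: B and C reference A, and A references B.
edges = [("B", "A"), ("C", "A"), ("A", "B")]
ranks = eigenvector_centrality(edges)
```

In this toy graph A and B reinforce each other and end up near the top, while C, which nothing references, drifts towards zero.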
15. Progress To-Date and Initial Results Centrality, Important Nodes
Eigenvector Centrality Result
Table: Top ten Acts
BWN & WWN                                 | BCN & WCN                          | BAN & WAN
Criminal Procedure Act 2011          1    | Public Finance Act 1989       1    | Public Finance Act 1989        1
Public Finance Act 1989              0.86 | Criminal Procedure Act 2011   0.94 | State Sector Act 1988          0.95
Summary Proceedings Act 1957         0.84 | Summary Proceedings Act 1957  0.93 | Companies Act 1993             0.87
State Sector Act 1988                0.77 | State Sector Act 1988         0.85 | Summary Proceedings Act 1957   0.83
Companies Act 1993                   0.67 | District Courts Act 1947      0.82 | Local Government Act 2002      0.83
Local Government Act 2002            0.62 | Judicature Act 1908           0.74 | Criminal Procedure Act 2011    0.79
Privacy Act 1993                     0.62 | Crimes Act 1961               0.72 | District Courts Act 1947       0.69
Crimes Act 1961                      0.61 | Privacy Act 1993              0.69 | Official Information Act 1982  0.68
Regulations (Disallowance) Act 1989  0.55 | Companies Act 1993            0.67 | Education Act 1989             0.63
Official Information Act 1982        0.54 | Local Government Act 1974     0.65 | Land Transfer Act 1952         0.6
Average EC                           0.05 | Average EC                    0.05 | Average EC                     0.05
Eigenvector centrality values are normalised.
16. Progress To-Date and Initial Results Initial Community Detection
Community detection algorithms provide a clearer picture of the
network.
Network clustering is a specific type of data clustering problem which
includes network measures as variables in the objective functions. The
best-known network clustering models are spectral clustering and
modularity clustering.
We use the modularity method, solved with the Louvain algorithm.
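The objective that the Louvain algorithm tries to maximise can be sketched directly. The code below computes Newman's modularity for an undirected graph and a given partition; it is an illustration only, it does not implement the Louvain heuristic itself, and treating the graph as undirected is an assumption made for the example. The function name and toy graph are hypothetical.

```python
def modularity(edges, community):
    """Newman modularity Q = sum_c [ e_c / m - (d_c / (2 m))^2 ], undirected graph."""
    m = len(edges)
    internal = {}  # e_c: number of edges with both ends in community c
    degree = {}    # d_c: total degree of the nodes in community c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items())


# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q_split = modularity(edges, split)
q_single = modularity(edges, {v: "all" for v in range(6)})
```

Splitting the toy graph into its two triangles scores higher than putting every node in one community, which is the kind of improvement a modularity-based method searches for.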
17. Progress To-Date and Initial Results Initial Community Detection
BCN clusters based on the modularity method and the Louvain algorithm
Criminal Procedure Act 2011
Resource Management Act 1991
Public Finance Act 1989
Local Government Act 1974
Companies Act 1993
Judicature Act 1908
Decimal Currency Act 1964
Patents Act 1953
Remainder
18. Progress To-Date and Initial Results Initial Community Detection
WAN clusters based on the modularity method and the Louvain algorithm
Summary Proceedings Act 1957
Decimal Currency Act 1964
Public Finance Act 1989
Reserve Bank of New Zealand Act 1989
Income Tax Act 2004
Official Information Act 1982
Local Government Act 1974
Criminal Procedure Act 2011
Employment Relations Act 2000
Public Works Act 1981
Government Superannuation Fund Act 1956
Remainder
19. Progress To-Date and Initial Results Modelling Example
Hypothesis: The major party in government affects the creation of
important laws.
H_0: C_{National} = C_{Labour}
H_1: C_{National} \neq C_{Labour}
t-test result from R:
Parameter                 Value
t statistic               −3.0250
df                        885
p-value                   0.0026
95% confidence interval   [−0.0308, −0.0007]
\mu_{C_{National}}        0.0430
\mu_{C_{Labour}}          0.0680
Result: H_0 is rejected.
On average, Labour governments produce more important
legislation.
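The comparison above is a two-sample t-test on the mean of C for the two groups. The sketch below shows one common variant, Welch's test, which does not assume equal variances; the slides do not say which variant was run in R, so this choice is an assumption, and the sample values are made up for illustration.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for a two-sample t-test without assuming equal variances."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up scores for illustration only; these are not the study's data.
national = [0.01, 0.02, 0.03]
labour = [0.02, 0.03, 0.04]
t, df = welch_t(national, labour)
```

A negative t here means the first group's mean is lower than the second's, matching the sign convention of the result reported above.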
20. Progress To-Date and Initial Results Academic outcomes
Presentation at INFORMS 2015, Philadelphia: Modelling of New
Zealand Acts Network.
Poster presentation at INFORMS 2015, Philadelphia: Legislation
Network.
Presentation at CMSS 2016, Auckland: Mathematical analysis of New
Zealand Legislation Network.
Presentation at Pitch on the Plains 2015, Christchurch: Law Sense.
Journal paper in preparation for Cambridge Network Science, 85
percent complete.
21. Future Studies Overview
Generative model, degree distribution, pattern studies.
Contribute to community detection.
Relation of network properties with political and legal processes.
Improving historical data and studying time evolution.
Comparative studies (Canada, Australia, etc.).
Studying death of nodes (repeals, replacements).
Link and attribute prediction.
Legislative drafting tool, identifying dependency between Acts.
22. Future Studies Generative Models, Pattern Prediction
According to Clauset (2014), network generative models:
Distinguish network structure from noise and random graphs.
Help to describe the network succinctly and capture its most relevant
patterns.
Help us to generalise from one part of the network to another, from
one network to others of the same type, from small scale to large scale, or
from past to future.
Consider a graph G; a generative model can then be expressed as a
probability distribution P(G | θ) with parameter θ.
The aim of the research is to find this probability distribution for the
Legislation Network using Bayesian inference.
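As a worked illustration of P(G | θ) and Bayesian inference, consider the simplest possible case: a directed random graph in which every ordered pair of nodes is linked independently with probability p. This is only a toy instance chosen for clarity, not the model proposed for the Legislation Network; the symbols n (nodes), m (observed edges), and the prior parameters a and b are introduced here for the example.

```latex
% Likelihood: m of the n(n-1) possible directed edges are present
P(G \mid p) = p^{m} (1 - p)^{n(n-1) - m}

% With a Beta(a, b) prior on p, Bayes' rule gives a Beta posterior
P(p \mid G) \propto P(G \mid p)\, P(p)
            \propto p^{a + m - 1} (1 - p)^{b + n(n-1) - m - 1}
\quad\Rightarrow\quad
p \mid G \sim \mathrm{Beta}\bigl(a + m,\; b + n(n-1) - m\bigr)
```

Richer generative models replace the single parameter p with a structured θ, but the inference step has the same form: a likelihood for the observed graph combined with a prior on the parameters.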
23. Future Studies Community Detection Method
Community detection is an optimisation problem with two parts:
An objective function captures the notion of community structure as
groups of nodes with better internal connectivity than external. A
Quality Measure needs to be defined.
An algorithmic technique assigns the nodes of the network to
specific communities, optimising the objective function.
It is a computationally hard problem and needs heuristic methods.
It is well studied for undirected graphs, but a gap exists in the literature
for directed networks. This research aims to contribute to the latest
methods and algorithms for the directed graph clustering problem. The
results can then be examined using the context of the documents.
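One existing quality measure for directed graphs is the directed generalisation of modularity due to Leicht and Newman, which compares each community's internal edges with the number expected from its total out-degree and in-degree. The sketch below is an illustration of that known measure, not the method this research intends to develop; the function name and toy graph are hypothetical.

```python
def directed_modularity(edges, community):
    """Q = sum_c [ e_c / m - (out_c * in_c) / m^2 ] for a directed graph."""
    m = len(edges)
    internal, out_deg, in_deg = {}, {}, {}
    for source, target in edges:
        cs, ct = community[source], community[target]
        out_deg[cs] = out_deg.get(cs, 0) + 1  # out-degree mass of source's community
        in_deg[ct] = in_deg.get(ct, 0) + 1    # in-degree mass of target's community
        if cs == ct:
            internal[cs] = internal.get(cs, 0) + 1
    labels = set(out_deg) | set(in_deg)
    return sum(
        internal.get(c, 0) / m - out_deg.get(c, 0) * in_deg.get(c, 0) / m ** 2
        for c in labels
    )


# Two directed 3-cycles joined by one edge from node 2 to node 3.
edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q_directed = directed_modularity(edges, split)
```

Unlike the undirected measure, this one distinguishes a community that mostly sends references from one that mostly receives them, which matters for citation-style networks.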
24. Requirements and Limitations
Data
In the dataset, some historical documents are cited by other documents
but have no file of their own in the XML file set.
This missing historical data is required to complete the time evolution and
generative model studies.
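Finding the gaps described above amounts to a set difference between the documents that are referenced and the documents that are available. A minimal sketch, assuming hypothetical inputs: a set of available document titles and a list of `(source, target)` reference pairs.

```python
def missing_documents(available, references):
    """Titles that are referenced but have no file of their own."""
    referenced = {target for _, target in references}
    return sorted(referenced - set(available))


# Toy data with placeholder titles.
available = {"Act A", "Act C"}
references = [("Act A", "Act B"), ("Act C", "Act A")]
gaps = missing_documents(available, references)
```

The resulting list is the set of historical documents that would need to be sourced elsewhere before the time evolution studies can be completed.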
Software
The software required for this research is:
R
MATLAB
Gephi
LaTeX
25. Time-table for my PhD project
The time-table of PhD project completion
i Name/Title Start Date End Date Percent Complete
1 Mathematical Modelling and Analysis of Legislation Network 01/02/15 31/01/18 34.7
1.1 Provisional Year 01/02/15 25/02/16 100
1.1.1 Data Preparation 01/02/15 14/05/15 100
1.1.2 Build the Network 14/05/15 15/06/15 100
1.1.3 Network 29/05/15 29/05/15 100
1.1.4 Literature of general measures 15/07/15 15/11/15 100
1.1.5 Initial results 16/07/15 31/07/15 100
1.1.6 First talk (INFORMS 2015) 01/08/15 04/11/15 100
1.1.7 INFORMS 04/11/15 04/11/15 100
1.1.8 Thesis Proposal 01/09/15 10/02/16 100
1.1.9 CMSS Workshop preparation 01/01/16 20/02/16 100
1.1.10 Departmental Seminar 01/01/16 25/02/16 100
1.2 First Paper: network structure, general measures, initial modelling 01/08/15 15/03/16 85
1.2.1 Outline 01/09/15 30/09/15 100
1.2.2 Network Building 01/08/15 31/08/15 100
1.2.3 General Measures 01/08/15 30/09/15 100
1.2.4 Centrality 01/08/15 29/02/16 80
1.2.5 Community Detection 30/9/15 29/02/16 80
1.2.6 Initial Modelling 15/10/15 29/02/16 100
1.2.7 Writing 22/11/15 15/03/16 65
1.2.8 Submit the paper 15/03/16 15/03/16 0
1.3 Second Paper: develop community detection method for directed networks 15/03/16 20/10/16 11
1.3.1 Outline 15/03/16 31/03/16 0
1.3.2 Analysis and results 15/03/16 15/08/16 25
1.3.3 Writing 20/04/16 20/10/16 0
1.3.4 Submit the paper to the Cambridge Network Science Journal 20/10/16 20/10/16 0
1.4 Third Paper: generative models and prediction studies 01/08/16 01/03/17 5
1.4.1 Outline 01/08/16 31/08/16 0
1.4.2 Analysis and Result 01/09/16 28/02/17 10
1.4.3 Writing 15/09/16 01/03/17 0
1.4.4 Submit the paper to the Cambridge Network Science Journal 01/03/17 01/03/17 0
1.5 Fourth Paper: modelling and comparison studies 01/01/17 30/06/17 7
1.5.1 Outline 01/03/17 31/03/17 0
1.5.2 Analysis and Results 01/01/17 01/06/17 15
1.5.3 Writing 01/02/17 30/06/17 0
1.5.4 Submit the paper to the World Politics Journal 30/06/17 30/06/17 0
1.6 Thesis 01/02/17 31/01/18 5
1.6.1 Write the thesis 01/02/17 31/01/18 5
Conference talks: SUNBELT, JURIX, etc.
26. Time-table for my PhD project
References
[1] A. Barabási. Emergence of Scaling in Random Networks. Science 286.5439 (Oct. 15, 1999), 509-512.
[2] Michael J. Bommarito, Daniel Katz, and Jon Zelner. Law as a seamless web: comparison of various network
representations of the United States Supreme Court corpus (1791-2005). In: ACM Press, 2009, p. 234.
[3] Stephen P. Borgatti. Centrality and network flow. Social networks 27.1 (2005), 55-71.
[4] U. Brandes, D. Delling, M. Gaertler, R. Gorke, M. Hoefer, Z. Nikoloski, and D. Wagner. On Modularity
Clustering. IEEE Transactions on Knowledge and Data Engineering 20.2 (Feb. 2008), 172-188.
[5] Aaron Clauset, Cristopher Moore, and Mark EJ Newman. Hierarchical structure and the prediction of missing
links in networks. Nature 453.7191 (2008), 98-101.
[6] Duncan J. Watts and Steven H. Strogatz. Collective dynamics of 'small-world' networks (1998).
[7] P. Erdős and A. Rényi. On the strength of connectedness of a random graph. Acta Mathematica Academiae
Scientiarum Hungarica 12.1 (1961), 261-267.
[8] Giorgio Fagiolo. Clustering in complex directed networks. Physical Review E 76.2 (2007), 026107.
[9] James H. Fowler, Timothy R. Johnson, James F. Spriggs, Sangick Jeon, and Paul J. Wahlbeck.
Network Analysis and the Law: Measuring the Legal Importance of Precedents at the U.S. Supreme Court.
Political Analysis (2007).
[10] Michel Grabisch. Social networks: Prestige, centrality, and influence (Invited paper). 2011.
[11] Marios Koniaris, Ioannis Anagnostopoulos, and Yannis Vassiliou. Network Analysis in the Legal Domain: A
complex model for European Union legal sources. (2015).
[12] Elizabeth A. Leicht, Gavin Clarkson, Kerby Shedden, and Mark EJ Newman. Large-scale structure of time
evolving citation networks. The European Physical Journal B 59.1 (2007), 75-83.
[13] Linyuan Lü and Tao Zhou. Link prediction in complex networks: A survey. Physica A: Statistical Mechanics
and its Applications 390.6 (2011), 1150-1170.
[14] Pierre Mazzega, Danièle Bourcier, and Romain Boulet. The network of French legal codes. In: Proceedings of
the 12th international conference on artificial intelligence and law. ACM, 2009, pp. 236-237.
[15] Mark EJ Newman. Modularity and community structure in networks. Proceedings of the National Academy of
Sciences 103.23 (2006), 8577-8582.
[16] Mark EJ Newman. The structure and function of complex networks. SIAM review 45.2 (2003), 167-256.
[17] Benjamin M. Tabak, Marcelo Takami, Jadson M. C. Rocha, Daniel O. Cajueiro, and Sergio R. S. Souza.
Directed clustering coefficient as a measure of systemic risk in complex banking networks. Physica A: Statistical
Mechanics and its Applications 394 (Jan. 15, 2014), 211-216.
[18] Duncan J. Watts and Steven H. Strogatz. Collective dynamics of 'small-world' networks. Nature 393.6684
(June 4, 1998), 440.
[19] Paul Zhang and Lavanya Koppaka. Semantics-based Legal Citation Network. In: Proceedings of the 11th
27. Time-table for my PhD project
Questions?