Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work
Mining parliamentary data and news articles
to find patterns of collaboration between
politicians and third party actors.
Francisco Rodríguez Drumond
DAMA & LARCA - UPC
July 7, 2014
Social Networks: a natural tool for political analysis.
Nodes: families in the
political landscape of
15th-century Florence.
Links: marriages between
families (alliances).
Analyzing parliaments through SNs.
Why?
Main challenge: source of information (nodes and
relationships)
Co-sponsorship. [Fow06]
Speeches. [TPL06]
Strong and weak ties. [Kir11]
Can we discover relationships involving third-party actors?
Third party discovery
Defining meaningful relationships.
An overview of our task
SOPA: A motivating example.
Policy Networks (PN): Social networks for political analysis.
An overview of the literature.
Co-occurrence. [EESGGHAC14], [PSIO06].
Enriching links with the strength and semantics of relations.
[Tan07], [PSB07], [ZAR03].
Beyond document co-occurrence. [NCSS06], [Bra06].
A (very) related paper. [MID+13]
A (very) related paper.
Moschopoulos et al. (2013), Toward the automatic extraction of policy
networks using web links and documents. [MID+13]
Two pre-computed PNs: Ireland and Greece.
Ground truth used for measuring correlations with similarity
measures.
Web based.
Three types of similarity metrics:
Co-occurrence metrics (Set comparisons).
Text-based metrics.
Link-based metrics.
Generating bill based Policy Networks: the architecture.
Finding news articles that talk about a bill.
Topic modeling:
TF-IDF for keyword
extraction.
One bill - one document.
Whole set of bills as the
corpus.
1-, 2- and 3-grams.
Top 1000 keywords for each
bill.
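A minimal sketch of the keyword-extraction step, assuming whitespace tokenization and the plain idf = log(N / df) weighting; the talk does not say which TF-IDF variant or tokenizer was used. The function names and the toy bills are hypothetical.

```python
import math
from collections import Counter

def ngrams(tokens, n_max=3):
    """Return all 1..n_max-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def top_keywords(bills, k=1000):
    """Rank each bill's n-grams by TF-IDF, using the set of bills as the corpus."""
    counts = {name: Counter(ngrams(text.lower().split()))
              for name, text in bills.items()}
    # Document frequency: in how many bills does each n-gram occur?
    df = Counter(term for c in counts.values() for term in c)
    n_docs = len(bills)
    keywords = {}
    for name, c in counts.items():
        scores = {t: tf * math.log(n_docs / df[t]) for t, tf in c.items()}
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        keywords[name] = ranked[:k]
    return keywords

# Toy corpus, invented for illustration (one bill = one document).
bills = {
    "casino": "tax rules for casino resorts near tarragona",
    "consults": "rules for popular consults and citizen votes",
}
kw = top_keywords(bills, k=5)
print(kw["casino"])
```

N-grams that occur in every bill get an idf of zero, so they fall to the bottom of the ranking and are cut off by k.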
Querying news articles:
Bills and news articles
modeled as vectors
Cosine similarity for
comparison.
Rocchio’s rule for improving
queries.
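The querying step can be sketched as follows, assuming sparse vectors stored as dicts and the textbook form of Rocchio's rule. The weights alpha, beta and gamma are common textbook defaults, not values reported in the talk, and the toy vectors are invented.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def centroid(vectors):
    """Mean of a list of sparse vectors (empty list gives an empty vector)."""
    total = Counter()
    for vec in vectors:
        for t, w in vec.items():
            total[t] += w / len(vectors)
    return total

def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query toward relevant articles and away from the others."""
    rel, non = centroid(relevant), centroid(non_relevant)
    terms = set(query) | set(rel) | set(non)
    updated = {t: alpha * query.get(t, 0.0) + beta * rel.get(t, 0.0)
                  - gamma * non.get(t, 0.0) for t in terms}
    # Assumption: negative weights are clipped, as is commonly done.
    return {t: w for t, w in updated.items() if w > 0}

# Toy vectors, invented for illustration.
bill = {"casino": 1.0, "tax": 1.0}
articles = {
    "a1": {"casino": 1.0, "veremonte": 1.0},
    "a2": {"football": 1.0, "tax": 0.2},
}
before = cosine(bill, articles["a1"])
better = rocchio(bill, [articles["a1"]], [articles["a2"]])
after = cosine(better, articles["a1"])
print(before, after)
```

How the relevant and non-relevant sets were chosen in the original system is not stated on the slide; here they are simply passed in by the caller.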
Selecting relevant news articles.
Threshold: point that maximizes:
$$\text{threshold} = \arg\max_{p} \left\lVert p - (p \cdot \hat{b})\,\hat{b} \right\rVert$$
Intuition: point at which there is
no significant gain in score.
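One way to read the threshold rule is as an elbow search on the sorted score curve. The sketch below assumes that p is a point of the curve measured from the first point and that b is the unit vector from the first to the last point; the slide does not define b, so this is an interpretation. Rescaling both axes to [0, 1] is an added assumption.

```python
import math

def elbow_index(scores):
    """Index of the point farthest from the line joining the curve's ends."""
    n = len(scores)
    if n < 3:
        return 0
    # Assumed normalisation: rescale both axes to [0, 1] so neither dominates.
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0
    pts = [(i / (n - 1), (s - lo) / span) for i, s in enumerate(scores)]
    x0, y0 = pts[0]
    bx, by = pts[-1][0] - x0, pts[-1][1] - y0
    norm = math.hypot(bx, by)
    bx, by = bx / norm, by / norm          # unit vector along the baseline
    best, best_dist = 0, -1.0
    for i, (x, y) in enumerate(pts):
        px, py = x - x0, y - y0            # p, relative to the first point
        proj = px * bx + py * by           # scalar projection of p on b
        dist = math.hypot(px - proj * bx, py - proj * by)
        if dist > best_dist:
            best, best_dist = i, dist
    return best

# Toy similarity scores sorted in decreasing order (invented).
scores = [0.92, 0.90, 0.88, 0.85, 0.30, 0.28, 0.27, 0.25, 0.24, 0.22]
cut = elbow_index(scores)
threshold = scores[cut]
print(cut, threshold)
```

Articles scoring above `threshold` would be kept; whether the elbow point itself is included is a choice the slide does not specify.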
Entity extraction and preprocessing.
MITIE for entity extraction +
1 Entity Normalization
‘The Univ. lumiere Lyon 2’ → ‘Univ Lumiere Lyon 2’
2 Mapping organization initials to the whole name
‘The World Life Fund (WLF) has...’
→ ‘World Life Fund’ = ‘WLF’
3 Mapping partial names with full names
‘George Harrison preferred .... Harrison also...’
→ ‘George Harrison’ = ‘Harrison’
4 Expanding names based on the news corpus
‘Politècnica de Catalunya’
→ ‘Universitat Politècnica de Catalunya’
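The first three preprocessing steps could look roughly like this. The regular expressions, the "exactly one matching full name" rule and all function names are assumptions made for illustration; MITIE itself is not called here, and step 4 (expanding names from the corpus) is not covered.

```python
import re

def normalize(name):
    """Drop a leading 'the' and punctuation, then capitalize each word."""
    name = re.sub(r"^the\s+", "", name.strip(), flags=re.IGNORECASE)
    name = re.sub(r"[^\w\s-]", "", name)
    return " ".join(w[:1].upper() + w[1:] for w in name.split())

def acronym_aliases(text):
    """Find 'Full Name (ABC)' patterns whose initials spell the acronym."""
    aliases = {}
    for match in re.finditer(r"((?:[A-Z]\w+\s+)+)\(([A-Z]{2,})\)", text):
        words, acronym = match.group(1).split(), match.group(2)
        # Keep only the trailing words whose initials match the acronym.
        tail = words[-len(acronym):]
        if "".join(w[0] for w in tail) == acronym:
            aliases[acronym] = " ".join(tail)
    return aliases

def partial_name_aliases(names):
    """Map a one-word name to a full name when exactly one full name ends with it."""
    aliases = {}
    full = [n for n in names if " " in n]
    for short in (n for n in names if " " not in n):
        owners = [f for f in full if f.split()[-1] == short]
        if len(owners) == 1:  # assumed rule: skip ambiguous surnames
            aliases[short] = owners[0]
    return aliases

print(normalize("The Univ. lumiere Lyon 2"))
print(acronym_aliases("The World Life Fund (WLF) has..."))
print(partial_name_aliases(["George Harrison", "Harrison"]))
```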
Filtering relevant entities.
Problem: more than 3000 entities per bill
Noise.
Expensive comparisons.
Solution:
Document co-occurrence + Latent Semantic Indexing (LSI)
for fast similarity computation.
Hierarchical Agglomerative Clustering (HAC) for grouping
entities based on their similarity.
Politicians → seed entities.
Silhouette score for detecting the best cluster containing seed
entities.
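A rough sketch of the filtering idea, under several simplifications: the LSI projection is skipped, entities are compared directly by the sets of articles that mention them, clustering is average-linkage HAC, and the cut is chosen by the mean silhouette over all entities. How the original system combines silhouette with the seed entities is not specified on the slide, so `seed_cluster` is only one possible reading. All names and data are hypothetical.

```python
import math
from itertools import combinations

def distance(a, b):
    """Cosine distance between two non-empty sets of article ids."""
    return 1.0 - len(a & b) / math.sqrt(len(a) * len(b))

def hac(docs, k):
    """Average-linkage agglomerative clustering down to k clusters."""
    clusters = [[name] for name in docs]

    def linkage(c1, c2):
        return sum(distance(docs[x], docs[y])
                   for x in c1 for y in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] += clusters.pop(j)   # i < j, so index i stays valid
    return clusters

def silhouette(docs, clusters):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    total = 0.0
    for c in clusters:
        if len(c) == 1:
            continue
        for x in c:
            a = sum(distance(docs[x], docs[y]) for y in c if y != x) / (len(c) - 1)
            b = min(sum(distance(docs[x], docs[y]) for y in o) / len(o)
                    for o in clusters if o is not c)
            if max(a, b) > 0:
                total += (b - a) / max(a, b)
    return total / len(docs)

def seed_cluster(docs, seeds):
    """Pick the cut with the best silhouette, then the cluster with most seeds."""
    best = max((hac(docs, k) for k in range(2, len(docs))),
               key=lambda cl: silhouette(docs, cl))
    return max(best, key=lambda c: len(set(c) & seeds))

# Toy data, invented: entity -> ids of the articles mentioning it.
docs = {
    "politician_a": {1, 2, 3, 4},
    "company_x": {1, 2, 3},
    "bank_y": {2, 3, 4},
    "club_p": {7, 8},
    "club_q": {7, 8, 9},
}
kept = seed_cluster(docs, seeds={"politician_a"})
print(sorted(kept))
```

In this toy case the two sports clubs never co-occur with the seed politician, so they end up in a separate cluster and are filtered out.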
Computing and thresholding entity similarities.
Entities represented as vectors of the 1- to 3-grams occurring in
the paragraphs that mention them.
TF-IDF with sublinear TF scaling (tf = 1 + log(frequency))
Cosine similarity for comparing the vectors.
Elbow detection for selecting the relevant entities of each entity.
Two entities e1 and e2 are related iff each is in the other's
relevant entities list.
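The weighting and the mutual-relevance rule can be illustrated as below. The idf form log(N / df) is an assumption, and a fixed top-k list stands in for the elbow-based relevant list purely to keep the example short. The entity names and context terms are invented.

```python
import math
from collections import Counter

def sublinear_tfidf(contexts):
    """One TF-IDF vector per entity, with tf = 1 + log(frequency)."""
    counts = {e: Counter(terms) for e, terms in contexts.items()}
    df = Counter(t for c in counts.values() for t in c)
    n = len(counts)
    return {e: {t: (1 + math.log(f)) * math.log(n / df[t]) for t, f in c.items()}
            for e, c in counts.items()}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def related_pairs(vectors, k=1):
    """Link two entities only if each is in the other's relevant list."""
    relevant = {}
    for e, v in vectors.items():
        sims = {o: cosine(v, w) for o, w in vectors.items() if o != e}
        # Stand-in for the elbow rule: keep the k most similar entities.
        relevant[e] = set(sorted(sims, key=sims.get, reverse=True)[:k])
    return {frozenset((a, b)) for a in relevant for b in relevant[a]
            if a in relevant[b]}

# Toy contexts, invented: entity -> n-grams from paragraphs mentioning it.
contexts = {
    "org_a": ["casino", "casino", "licence", "tax"],
    "org_b": ["casino", "licence", "hotel"],
    "org_c": ["referendum", "vote", "tax"],
    "org_d": ["referendum", "vote", "vote"],
}
vectors = sublinear_tfidf(contexts)
edges = related_pairs(vectors)
print(sorted(sorted(pair) for pair in edges))
```

Requiring membership in both lists makes the relation symmetric, so a popular entity that appears in many other entities' lists is not linked to all of them automatically.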
Results.
Two bills:
BCN-World.
Law on Non-Referendum Popular Consultations.
Look at:
Communities → colors.
Influencers → node size.
BCN-World - Organizations.
[Figure: network graph. Node labels:]
Acencas, CUP, Tripartito, Canales Y Puertos De Tarragona, Puerto De Tarragona,
Diputacion De Tarragona, Camara De Comercio De Tarragona, Pimec, Govern, PSC,
Parlament, Ciu, Veremonte, ERC, Icv-euia, PP, La Caixa, Melco, Hard Rock, Cepta,
PPC, Sociedad Centre Medics Selva Maresme, Ciutadans, Grup Peralada,
Camara Catalana, AECE, URV, Value Retail, Ferrari, Melia, Hard Rock Cafe.
BCN-World - Persons-Organizations.
[Figure: network graph. Node labels:]
Felip Puig, Francesc Xavier Mena, Pimec, Diputacion De Tarragona, Govern,
Veremonte, Josep Gonzalez, Josep Poblet, Pere Granados, Xavier Adsera,
Salvador Guillermo, Isidre Faine, Xavier Pallares, Francesc Homs,
Antoni Belmonte, Cepta, Dolors Llobet, Caixabank, Macia Alavedra,
Javier De La Rosa, Daniel De Alfonso, Hortensia Grau, Joan Herrera, Santi Vila,
Jordi Vilajoana, Lluis Rullan, Melco, Xavier Sabate, Josep Felix Ballesteros,
Puerto De Tarragona, Enrique Bañuelos, Josep Andreu, URV, Isabel Vallet,
Joan Pons, Icv-euia, Pere Aragones, Parlament, Hard Rock, Grup Peralada, PSC,
PP, Jordi Turull, Marta Rovira, Miquel Salazar, Jordi Pons, Jaume Amat,
Sociedad Centre Medics Selva Maresme, Pere Navarro, Andreu Mas-colell, Caixa,
Damia Calvet, Oriol Junqueras, Artur Mas, Camara Catalana, Ciu, ERC, PPC,
Albert Batet, Alicia Romero, Melia, Value Retail, Alejandro Fernandez,
Enric Millo, Alicia Sanchez-camacho, CUP, Enric Genesca, Tripartito, Ciutadans,
Agusti Colom, Camara De Comercio De Tarragona, Ferrari, La Roca,
Ernest Maragall, Josep Mayoral, Carles Pellicer, Acencas, Francesc Perendreu,
AECE, Jordi Sierra.
Law on Non-Referendum Popular Consultations - Organizations.
[Figure: network graph. Node labels:]
Manresadecideix, UDC, Comunidad Valenciana, Senado, Upyd, Omnium,
Tribunal Constitucional, Moviment Arenyenc, PP, Juzgado De Lo Contencioso,
Parlament, Greenpeace, Solidaritat, ERC, PSC, Podemos, PSOE, Icv-euia,
Ciudadanos, Congreso, Ciu, Compromis, CDC, CUP, BNG, Barcelona Decideix,
Reagrupament, Bildu, ANC, Els Verds.
Law on Non-Referendum Popular Consultations - Persons.
[Figure: network graph. Node labels:]
Lehendakari Ibarretxe, Manuel Fraga, Carles Mora, Joan Ridao, Uriel Bertran,
Felip Puig, Jordi Fabrega, Joan Saura, Miquel Calcada, Joan Puigcercos,
Lluis Corominas, Jaume De Frontanya, Jose Blanco, Jose Manuel Durao Barroso,
Jose Luis Rodriguez Zapatero, Mariano Rajoy, Josep Maria Pelegri,
Josep Lluis Carod-rovira, Josep Antoni Duran, Jose Zaragoza,
Maria Emilia Casas, Maria Mas, Jose Montilla, Leon, Josep Pique,
Josep Camprubi, Lauren Uria, Ramon Torramade, Jordi Pujol, Josep Rull,
Pedro Sanchez, Rodrigo Rato, Javier Arenas, Jose Maria Aznar,
Maria Dolores De Cospedal, Carles Marti, Jordi Hereu, Joan Clos, Xavier Trias,
Jordi Portabella, Joan Ferran, Ricard Goma, Ferran Mascarell,
Pasqual Maragall, Franco, Miquel Iceta, Soraya Saenz De Santamaria,
Patxi Lopez, Angel Acebes, Pere Jover, Marc Carrillo, Eduardo Zaplana,
Oriol Pujol, Dolors Camats, Jordi Molto, Laia Bonet, Joan Botella,
Joana Ortega, Joan Tarda, Santiago Rodriguez, Joan Herrera, Ferran Requejo,
Abogado Del Estado, Andreu Mumbru, Artur Mas, Assumpta Escarp,
Antoni Castells, Jone Goyricelaya, Ernest Benach, Angel Ros, Francesc Homs,
Jordi Ausas, Nuria De Gispert, Oriol Junqueras, Albert Rivera, Jordi Turull,
Marta Rovira, Alicia Sanchez-camacho, Ramon Espadaler, Pere Navarro,
David Fernandez, Maurici Lucena, Alfredo Perez Rubalcaba, Carme Forcadell,
Anna Simo, Joan Ignasi Elena, Joan Rigol, Esperanza Aguirre,
Jose Manuel Soria, Laia Ortiz, Carles Viver Pi-sunyer, Rosa Diez,
Carme Garcia, Josep Duran Lleida, Alfred Morales, Josep Maria Alvarez,
Muriel Casals, Alfred Bosch, Cayo Lara, Pedro Arriola, Jose Maria Mena,
Josep Maria Terricabras, Dolors Batalla, Arnaldo Otegi, Joan Carles Gallego,
Alicia Sanchez Camacho, Jose Domingo, Angel Colom, Carme Capdevila,
Joan Carretero, Francesc Ribera, Ahora Marti, Ernest Maragall,
Jorge Fernandez Diaz, Alfons Lopez Tena, Alfonso Alonso, Marina Llansana,
Marc Sanglas, Alberto Ruiz Gallardon, David Cameron, Lluis Companys,
Carme Chacon, Santi Vila, Jose Antonio Perez Tapias, Jordi Guillot,
Josep Felix Ballesteros, Isabel Vallet, Alberto Fernandez Diaz,
Antonio Hernando, Cristobal Montoro, Jaume Collboni, Ban Ki-moon, Ada Colau.
Conclusions.
1 An unbiased, low-cost, automated tool to aid the process of
Policy Network generation and analysis.
2 The system automatically:
1 Detects entities related to a bill.
2 Computes and thresholds similarity measures for SN
generation.
3 The method works better for finding relationships between
organizations than for persons, particularly politicians.
Contributions.
1 The use of bills as a cornerstone relating political actors,
which makes it possible to:
Better understand the discovered relations.
Find fine-grained relationships which would otherwise be
missed.
2 A method for combining parliamentary open data and
newspapers for PN generation.
3 An unsupervised method for automatically detecting relevant
entities of a given topic from a corpus of documents given a
set of seed entities.
Future work.
1 A more rigorous evaluation and problem definition.
2 Improving the PN generation phase.
3 Generative models.
4 Use-case driven PN generation.
5 Time component.
6 Signed Social Network Analysis
The end.
Merci beaucoup!
Gràcies!
Grazie!
Mulțumesc!
Questions?
Understanding the representation of entities and
documents.
References
[Bra06] Roger B Bradford, Application of latent semantic indexing in
generating graphs of terrorist networks, Intelligence and
Security Informatics, Springer, 2006, pp. 674–675.
[EESGGHAC14] Jesús Espinal-Enríquez, J Mario Siqueiros-García, Rodrigo
García-Herrera, and Sergio Antonio Alcalá-Corona, A
literature-based approach to a narco-network, Social
Informatics, Springer, 2014, pp. 97–101.
[Fow06] James H Fowler, Connecting the congress: A study of
cosponsorship networks, Political Analysis 14 (2006), no. 4,
456–487.
[Kir11] Justin H Kirkland, The relational determinants of legislative
outcomes: Strong and weak ties between legislators, The
Journal of Politics 73 (2011), no. 03, 887–898.
[MID+13] Theodosis Moschopoulos, Elias Iosif, Leeda Demetropoulou,
Alexandros Potamianos, and Shrikanth Shri Narayanan,
Toward the automatic extraction of policy networks using web
links and documents, IEEE Transactions on Knowledge and Data
Engineering 25 (2013), no. 10, 2404–2417.
[NCSS06] David Newman, Chaitanya Chemudugunta, Padhraic Smyth,
and Mark Steyvers, Analyzing entities and topics in news
articles using statistical topic models, Intelligence and Security
Informatics, Springer, 2006, pp. 93–104.
[PSB07] Bruno Pouliquen, Ralf Steinberger, and Clive Best, Automatic
detection of quotations in multilingual news, Proceedings of
Recent Advances in Natural Language Processing, 2007,
pp. 487–492.
[PSIO06] Bruno Pouliquen, Ralf Steinberger, Camelia Ignat, and Tamara
Oellinger, arXiv preprint cs/0609066 (2006).
[Tan07] Hristo Tanev, Unsupervised learning of social networks from a
multiple-source news corpus, Multi-source, Multilingual
Information Extraction and Summarization (2007), 33.
[TPL06] Matt Thomas, Bo Pang, and Lillian Lee, Get out the vote:
Determining support or opposition from congressional
floor-debate transcripts, Proceedings of the 2006 Conference
on Empirical Methods in Natural Language Processing,
Association for Computational Linguistics, 2006, pp. 327–335.
[ZAR03] Dmitry Zelenko, Chinatsu Aone, and Anthony Richardella,
Kernel methods for relation extraction, The Journal of
Machine Learning Research 3 (2003), 1083–1106.
DAMA - Final Presentation

  • 1. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work Mining parliamentary data and news articles to find patterns of collaboration between politicians and third party actors. Francisco Rodr´ıguez Drumond DAMA & LARCA - UPC July 7,2014 1 / 30
  • 2. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work Social Networks: a natural tool for political analysis. Nodes: Families of the political landscape of XV century Florence. Links: marriages between families (alliances). 2 / 30
  • 3. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work Analizing parliaments through SNs. Why? Main challenge: source of information (nodes and relationships) Co-sponsorship. [Fow06] Speeches. [TPL06] Strong and weak ties. [Kir11] Can we discover relationships involving third-party actors? Third party discovery Defining meaningful relationships. 3 / 30
  • 4. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work An overview of our task 4 / 30
  • 5. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work An overview of our task 5 / 30
  • 6. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work An overview of our task 6 / 30
  • 7. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work An overview of our task 7 / 30
  • 8. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work SOPA: A motivating example. Policy Networks (PN): Social networks for political analysis. 8 / 30
  • 9. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work An overview of the literature. Co-occurrence. [EESGGHAC14], [PSIO06]. Enriching links with the strength and semantics of relations. [Tan07],[PSB07],[ZAR03]. Beyond document co-occurrence. [NCSS06],[Bra06]. A (very) related paper. [MID+13] 9 / 30
  • 10. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work A (very) related paper. Moschopoulos (2013) Toward the automatic extraction of policy networks using web links and documents Two pre-computed PNs: Ireland and Greece. Ground truth used for measuring correlations with similarity measures. Web based. Three types of similarity metrics: Co-occurrence metrics (Set comparisons). Text-based metrics. Link-based metrics. 10 / 30
  • 11. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work Generating bill based Policy Networks: the architecture. 11 / 30
  • 12. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work Finding news articles that talk about a bill. Topic modeling: TF-IDF for keyword extraction. One bill - one document. Whole set of bills as the corpus. 1,2,3-ngrams. Top 1000 keywords for each bill. Querying news articles: Bills and news articles modeled as vectors Cosine similarity for comparison. Rocchio’s rule for improving queries. 12 / 30
  • 13. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work Selecting relevant news articles. Threshold: point that maximizes: threshold = argmax p |p − (p.b )b | Intuition: point at which there is no significant gain in score. 13 / 30
  • 14. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work Entity extraction and preprocessing. MITIE for entity extraction + 1 Entity Normalization ‘The Univ. lumiere Lyon 2’ → ‘Univ Lumiere Lyon 2’ 2 Mapping organization initials to the whole name ‘The World Life Fund (WLF) has...’ → ‘World Life Fund’ = ‘WLF’ 3 Mapping partial names with full names ‘George Harrison preferred .... Harrison also...’ → ‘George Harrison’ = ‘Harrison’ 4 Expanding names based on the news corpus ‘Polit`ecnica de Catalunya’ → ‘Universitat Polit`ecnica de Catalunya’ 14 / 30
  • 15. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work Filtering relevant entities. Problem: +3000 entities per bill Noise. Expensive comparisons. Solution: Document co-occurrence + Latent Semantic Indexing (LSI) for fast similarity computation. Hierarchical Agglomerative Clustering (HAC) for grouping entities based on their similarity. Politicians → seed entities. Silhoutte for detecting the best cluster containing seed entities. 15 / 30
  • 16. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work Computing and thresholding entity similarities. Entities represented as vectors of 1...3-grams occurring in paragraphs they are mentioned in. TF-IDF with sublinear TF scaling (tf = 1 + log(frequency)) Cosine similarity for comparing the vectors. Elbows for detecting relevant entities for each entity. Two entities e1 and e2 are related iff they are in each others relevant entities list. 16 / 30
  • 17. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work Results. Two bills: BCN-World. Law of Popular Non-referendary Consults. Look at: Communities → colors. Influencers → node size. 17 / 30
  • 18. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work BCN-World - Organizations. Acencas CUP Tripartito Canales Y Puertos De Tarragona Puerto De Tarragona Diputacion De Tarragona Camara De Comercio De Tarragona Pimec Govern PSC Parlament Ciu Veremonte ERC Icv-euia PP La Caixa Melco Hard Rock Cepta PPC Sociedad Centre Medics Selva Maresme Ciutadans Grup Peralada Camara Catalana AECE URV Value RetailFerrari Melia Hard Rock Cafe 18 / 30
  • 19. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work BCN-World - Persons-Organizations. Felip Puig Francesc Xavier Mena Pimec Diputacion De Tarragona Govern Veremonte Josep Gonzalez Josep Poblet Pere Granados Xavier Adsera Salvador Guillermo Isidre Faine Xavier Pallares Francesc Homs Antoni Belmonte Cepta Dolors Llobet Caixabank Macia Alavedra Javier De La Rosa Daniel De Alfonso Hortensia GrauJoan Herrera Santi Vila Jordi VilajoanaLluis Rullan Melco Xavier Sabate Josep Felix Ballesteros Puerto De Tarragona Enrique Bañuelos Josep Andreu URV Isabel Vallet Joan Pons Icv-euia Pere Aragones Parlament Hard Rock Grup Peralada PSC PP Jordi Turull Marta Rovira Miquel Salazar Jordi Pons Jaume Amat Sociedad Centre Medics Selva Maresme Pere Navarro Andreu Mas-colell Caixa Damia Calvet Oriol Junqueras Artur Mas Camara Catalana Ciu ERC PPC Albert Batet Alicia Romero Melia Value Retail Alejandro Fernandez Enric Millo Alicia Sanchez-camacho CUP Enric Genesca Tripartito Ciutadans Agusti Colom Camara De Comercio De Tarragona Ferrari La Roca Ernest Maragall Josep Mayoral Carles Pellicer Acencas Francesc Perendreu AECE Jordi Sierra 19 / 30
  • 20. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work Law of Popular Non-referendary Consults. - Organizations. Manresadecideix UDC Comunidad Valenciana Senado Upyd Omnium Tribunal Constitucional Moviment Arenyenc PP Juzgado De Lo Contencioso Parlament Greenpeace Solidaritat ERC PSC Podemos PSOE Icv-euia Ciudadanos Congreso Ciu Compromis CDC CUP BNG Barcelona DecideixReagrupament Bildu ANC Els Verds 20 / 30
  • 21. Motivation. Related works. Generating bill based Policy Networks. Results Conclusions Future work Law of Popular Non-referendary Consults. - Persons. Lehendakari Ibarretxe Manuel Fraga Carles Mora Joan Ridao Uriel Bertran Felip Puig Jordi Fabrega Joan Saura Miquel Calcada Joan Puigcercos Lluis Corominas Jaume De Frontanya Jose Blanco Jose Manuel Durao Barroso Jose Luis Rodriguez Zapatero Mariano Rajoy Josep Maria Pelegri Josep Lluis Carod-roviraJosep Antoni Duran Jose Zaragoza Maria Emilia Casas Maria Mas Jose Montilla Leon Josep Pique Josep Camprubi Lauren Uria Ramon Torramade Jordi Pujol Josep Rull Pedro Sanchez Rodrigo Rato Javier Arenas Jose Maria Aznar Maria Dolores De Cospedal Carles MartiJordi Hereu Joan Clos Xavier Trias Jordi Portabella Joan Ferran Ricard Goma Ferran Mascarell Pasqual Maragall Franco Miquel Iceta Soraya Saenz De Santamaria Patxi Lopez Angel Acebes Pere Jover Marc Carrillo Eduardo Zaplana Oriol Pujol Dolors Camats Jordi Molto Laia Bonet Joan Botella Joana Ortega Joan Tarda Santiago Rodriguez Joan Herrera Ferran Requejo Abogado Del Estado Andreu Mumbru Artur Mas Assumpta Escarp Antoni Castells Jone Goyricelaya Ernest Benach Angel Ros Francesc Homs Jordi Ausas Nuria De Gispert Oriol Junqueras Albert Rivera Jordi Turull Marta Rovira Alicia Sanchez-camacho Ramon Espadaler Pere Navarro David Fernandez Maurici Lucena Alfredo Perez Rubalcaba Carme Forcadell Anna Simo Joan Ignasi ElenaJoan Rigol Esperanza Aguirre Jose Manuel Soria Laia Ortiz Carles Viver Pi-sunyer Rosa Diez Carme Garcia Josep Duran Lleida Alfred Morales Josep Maria Alvarez Muriel Casals Alfred Bosch Cayo Lara Pedro Arriola Jose Maria Mena Josep Maria Terricabras Dolors Batalla Arnaldo Otegi Joan Carles Gallego Alicia Sanchez Camacho Jose Domingo Angel Colom Carme Capdevila Joan Carretero Francesc Ribera Ahora Marti Ernest Maragall Jorge Fernandez Diaz Alfons Lopez Tena Alfonso Alonso Marina Llansana Marc Sanglas Alberto Ruiz Gallardon David Cameron Lluis Companys Carme 
Chacon Santi Vila Jose Antonio Perez Tapias Jordi Guillot Josep Felix Ballesteros Isabel Vallet Alberto Fernandez Diaz Antonio Hernando Cristobal Montoro Jaume Collboni Ban Ki-moon Ada Colau 21 / 30
  • 22. Conclusions.
      1. An unbiased, low-cost, automated tool to aid Policy Network generation and analysis.
      2. The system automatically:
         a. Detects entities related to a bill.
         b. Computes and thresholds similarity measures for SN generation.
      3. The method works better for finding relationships between organizations than between persons, particularly politicians.
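The "computes and thresholds similarity measures" step can be illustrated with a minimal sketch. It assumes each entity is represented as a sparse vector of document counts and uses cosine similarity; the slides do not state which representation or measure the system uses, so the measure, the `build_network` helper, the toy entities and the threshold value are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_network(entity_vectors, threshold):
    """Return (entity_a, entity_b, similarity) edges at or above the threshold.

    Hypothetical helper: cosine similarity is an assumption, not
    necessarily the measure used in the presented system.
    """
    edges = []
    for a, b in combinations(sorted(entity_vectors), 2):
        sim = cosine(entity_vectors[a], entity_vectors[b])
        if sim >= threshold:
            edges.append((a, b, sim))
    return edges


# Toy data (made up): how often each entity appears in each document.
vectors = {
    "Entity A": {"doc1": 2, "doc2": 1},
    "Entity B": {"doc1": 1, "doc2": 1},
    "Entity C": {"doc3": 3},
}
edges = build_network(vectors, threshold=0.5)
for a, b, sim in edges:
    print(f"{a} -- {b}: {sim:.3f}")
```

Raising the threshold yields a sparser network with only the strongest links; lowering it admits weaker and possibly spurious ones.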
  • 23. Contributions.
      1. The use of bills as a cornerstone relating political actors, which makes it possible to:
         - Understand the discovered relations better.
         - Find fine-grained relationships that would otherwise be missed.
      2. A method for combining parliamentary open data and newspapers for PN generation.
      3. An unsupervised method for automatically detecting the entities relevant to a given topic from a corpus of documents, given a set of seed entities.
  • 24. Future work.
      1. A more rigorous evaluation and problem definition.
      2. Improving the PN generation phase.
      3. Generative models.
      4. Use-case driven PN generation.
      5. Time component.
      6. Signed Social Network Analysis.
  • 25. The end. Merci beaucoup! Gràcies! Grazie! Mulțumesc! Questions?
  • 26. Understanding the representation of entities and documents.
  • 27. References I
      Roger B. Bradford, Application of latent semantic indexing in generating graphs of terrorist networks, Intelligence and Security Informatics, Springer, 2006, pp. 674–675.
      Jesús Espinal-Enríquez, J. Mario Siqueiros-García, Rodrigo García-Herrera, and Sergio Antonio Alcalá-Corona, A literature-based approach to a narco-network, Social Informatics, Springer, 2014, pp. 97–101.
      James H. Fowler, Connecting the congress: A study of cosponsorship networks, Political Analysis 14 (2006), no. 4, 456–487.
      Justin H. Kirkland, The relational determinants of legislative outcomes: Strong and weak ties between legislators, The Journal of Politics 73 (2011), no. 3, 887–898.
  • 28. References II
      Theodosis Moschopoulos, Elias Iosif, Leeda Demetropoulou, Alexandros Potamianos, and Shrikanth Shri Narayanan, Toward the automatic extraction of policy networks using web links and documents, IEEE Transactions on Knowledge and Data Engineering 25 (2013), no. 10, 2404–2417.
      David Newman, Chaitanya Chemudugunta, Padhraic Smyth, and Mark Steyvers, Analyzing entities and topics in news articles using statistical topic models, Intelligence and Security Informatics, Springer, 2006, pp. 93–104.
      Bruno Pouliquen, Ralf Steinberger, and Clive Best, Automatic detection of quotations in multilingual news, Proceedings of Recent Advances in Natural Language Processing, 2007, pp. 487–492.
  • 29. References III
      Bruno Pouliquen, Ralf Steinberger, Camelia Ignat, and Tamara Oellinger, arXiv preprint cs/0609066 (2006).
      Hristo Tanev, Unsupervised learning of social networks from a multiple-source news corpus, Multi-source, Multilingual Information Extraction and Summarization (2007), 33.
      Matt Thomas, Bo Pang, and Lillian Lee, Get out the vote: Determining support or opposition from congressional floor-debate transcripts, Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 2006, pp. 327–335.
  • 30. References IV
      Dmitry Zelenko, Chinatsu Aone, and Anthony Richardella, Kernel methods for relation extraction, The Journal of Machine Learning Research 3 (2003), 1083–1106.