Exploring the Networks in Open Public Data

  • Raw data is not always immediately useful to the wider public - using open data - discovering patterns - making sense of it
  • It’s worthwhile to explore the networks that emerge from the data you’re looking at. Various kinds of networks: - people in companies (who communicates with whom) - MPs, based on co-voting patterns - companies (networks of companies)
  • Open data is data that can be freely used, reused and redistributed by anyone - subject only, at most, to the requirement to attribute and share-alike. - http://opendefinition.org/ - http://opendatahandbook.org/en/what-is-open-data/index.html
  • - scrape the data - make it open - clean up the data - transform the data - make it usable [for the purpose]. How do we define an edge?
  • We want to choose those parts of the data from which we can deduce something - simple procedural decisions are out. Chose voting instances where there were notable differences of opinion. Noise = MPs who voted only a few times (throws off the percentages). --- Some votes are more important than others.
  • Harmony Centre. Greens / Farmers - choice: (a) join one of the two clusters; (b) isolation; (c) bridge between them
  • Strong voting discipline in the Harmony Centre. The majority of the rest do not vote the same (at this value of n%).
  • far opposition / near opposition / coalition. Looks pretty, but does not give much useful information - almost a complete graph.
  • Does it look right at first sight? (the “sniff test”) Show it to domain experts. People can make pretty graphs - but what do they mean? - what can we explain or show via them?
  • The Greens / Farmers party is bridging between the strong opposition party Harmony Centre and the ruling coalition - they sometimes agree with the opposition, sometimes with the coalition. See slides 21, 22 re “live animation” showing what happens if you take them off the graph.
  • Learned from experts: not everything appears as a vote; some votes are more important than others - more insights -> better visualisations (more truthful, etc.). Some advanced visualisations will need more information - e.g., to define which laws are on which topics. Bringing in more data - annotate nodes & edges with additional data / explanations of why this edge appears here - profiles for members of parliament (e.g., the TheyWorkForYou site in the UK) - linked data
  • another example of an open data graph visualisation
  • Another view of this data: http://www.slideshare.net/DERIGalway/valdis-krebs-social-network-analysis-19872007/15 The central red cluster corresponds to the company headquarters. Each vertex in the network represents an employee, colored according to the location they work at. Graph edges denote frequent, confirmed, work-related communications between employees. Cluster overlaps reveal which employees frequently interact with other locations, serving as boundary-spanners. This visualization helps to identify key connectors in the company [0].
  • What do we do with these visualisations next? = how do we use them (to have impact, explain data, …)
  • Social network visualisation & analysis allow us to see what was previously invisible. See the “Social Network Analysis” talk by Valdis Krebs for more info re SNA and network visualization.
  • Demo of how the Greens / Farmers party is bridging between the strong opposition Harmony Centre and the ruling coalition - they sometimes agree with the opposition, sometimes with the coalition - (edge connection criteria n = 25%)
  • Demo of how the Greens / Farmers party is bridging between the strong opposition Harmony Centre and the ruling coalition. When the Greens / Farmers party nodes are hidden from the graph, there is no connection - the coalition and the Harmony Centre do not vote the same.
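
The bridging demo in the last two notes can be approximated in code: hide one party's nodes and check whether the remaining graph falls apart. The following is only a minimal sketch, not the tool used in the talk. The node names, party assignments and edges are a hypothetical toy graph (not Saeima data), and `connected_components` and `without_party` are helper names invented for this illustration.

```python
from collections import deque


def connected_components(nodes, edges):
    """Return a list of sets, one per connected component (breadth-first search)."""
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        if a in adjacency and b in adjacency:
            adjacency[a].add(b)
            adjacency[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            current = queue.popleft()
            for neighbour in adjacency[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


def without_party(party_of, edges, party):
    """Drop every node belonging to `party` and every edge touching such a node."""
    kept = [n for n in party_of if party_of[n] != party]
    kept_set = set(kept)
    kept_edges = [(a, b) for a, b in edges if a in kept_set and b in kept_set]
    return kept, kept_edges


# Hypothetical toy graph: two opposition MPs, one bridging MP, two coalition MPs.
party_of = {
    "hc1": "Harmony Centre", "hc2": "Harmony Centre",
    "gf1": "Greens/Farmers",
    "co1": "Coalition", "co2": "Coalition",
}
edges = [("hc1", "hc2"), ("hc2", "gf1"), ("gf1", "co1"), ("co1", "co2")]

full = connected_components(list(party_of), edges)
nodes, kept_edges = without_party(party_of, edges, "Greens/Farmers")
split = connected_components(nodes, kept_edges)
print(len(full), len(split))  # 1 2
```

In this toy graph the single component splits in two once the bridging node is hidden, which is the effect the demo shows for the real voting graph at n = 25%.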

Transcript

  • 1. Exploring the Networks in Open Public Data. Uldis Bojārs, Institute of Mathematics and Computer Science, University of Latvia. Using Open Data Workshop, Brussels, 20-Jun-2012
  • 2. About us • Institute of Mathematics and Computer Science, University of Latvia – http://www.lumii.lv/resource/show/170 – Uldis Bojārs @CaptSolo – Valdis Krebs http://orgnet.com – Pēteris Ručevskis
  • 3. Network visualisation and analysis. Applications: • discover interesting patterns • explore data in [more] detail. Work from the Open Data Hackathon in Riga: • analysis of Saeima voting patterns • http://opendata.lv
  • 4. Overview • Data needs to be Open • Pre-processing and filtering the data – selecting what to show • Data visualization – iterative process (visualize, refine, repeat) • What’s next?
  • 5. Open Data needed first (!) “Open data is data that can be freely used, reused and redistributed by anyone …” http://opendefinition.org/ Data needs to be: • open • easy to use. Still a problem in Latvia: • only a few datasets are open in an easy-to-consume form (PDF does not count :)
  • 6. http://titania.saeima.lv/LIVS11/SaeimaLIVS2_DK.nsf/0/9DEA96450E79B7E5C2257944007E589D?OpenDocument
  • 7. Pre-processing • Input: – raw vote data (scraped from the website) published at http://data.opendata.lv/ • Output: – nodes (MPs) – edges (connections between them) • What is a connection?
  • 8. Defining graph connections • Connect MPs if they have voted similarly – disagreed on at most n% of decisions • Filter out cases where almost all MPs voted the same • Filter out trivial decisions • Filter out noise
  • 9. Node colour legend • Ruling coalition: – Zatlers’ Reform Party – Unity – the National Alliance • Opposition: – Harmony Centre – Greens / Farmers Party • a few non-party MPs
  • 10. MPs who always vote the same (n = 0%): connection criteria too narrow
  • 11. MPs who disagree in less than 35% of cases: connection criteria too broad (everyone agrees, really?)
  • 12. Refining the visualisation • Need to find the right cut-off values (n%) – where patterns [start to] appear – and the visualisation makes sense • Show the results to domain experts – MPs, journalists, political researchers, … • Experts: – help improve visualisations – can discover new things for themselves
  • 13. MPs who disagree in less than 11% of cases: opposition parties [sometimes] vote the same
  • 14. MPs who disagree in less than 25% of cases: bridges appear between coalition and opposition parties (see slides 21, 22 re the bridging role of yellow nodes)
  • 15. What next? • Improve our understanding of data • Enhance visualisations – add clusters, etc. • Create multiple visualisations – different topics, changes in time, etc. • Bring in more data – explain nodes & edges
  • 16. Network visualisation example #1: Donations to political parties http://www.thenetworkthinkers.com/2011/12/innovation-happens-at-intersections.html
  • 17. Network visualisation example #2: Intra-company communication patterns
  • 18. Conclusion • Need more useful Open Data • Discovering patterns, making sense of data – helping make sense = purpose of visualisations • Looking forward to collaboration re: – Using Open Data – Data Visualisation and Analysis
  • 19. More info • Uldis Bojārs uldis.bojars@gmail.com • Social Network Analysis talk / Valdis Krebs http://www.slideshare.net/DERIGalway/valdis-krebs-social-network-analysis-19872007 • Smart Network Analyzer tool http://sna.lumii.lv/ in development at IMCS, University of Latvia
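
The connection rule from slide 8 and the cut-off experiments on slides 10 to 14 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used at the hackathon: the ballot format (one dictionary per ballot, mapping an MP to a choice), the function names `filter_ballots` and `build_edges`, the parameters `max_majority` and `min_votes`, and all the toy data are hypothetical. An edge is drawn when two MPs disagree on at most a fraction `n` of the ballots in which both voted.

```python
from itertools import combinations


def filter_ballots(ballots, max_majority=0.9):
    """Drop ballots where almost all MPs voted the same way."""
    kept = []
    for ballot in ballots:
        choices = list(ballot.values())
        if not choices:
            continue
        largest_bloc = max(choices.count(c) for c in set(choices))
        if largest_bloc / len(choices) <= max_majority:
            kept.append(ballot)
    return kept


def build_edges(ballots, n, min_votes=2):
    """Connect MPs who disagree on at most a fraction n of their shared ballots.

    MPs with fewer than min_votes recorded votes are treated as noise and skipped.
    """
    counts = {}
    for ballot in ballots:
        for mp in ballot:
            counts[mp] = counts.get(mp, 0) + 1
    active = sorted(mp for mp, c in counts.items() if c >= min_votes)
    edges = []
    for a, b in combinations(active, 2):
        shared = [ballot for ballot in ballots if a in ballot and b in ballot]
        if not shared:
            continue
        disagreements = sum(1 for ballot in shared if ballot[a] != ballot[b])
        if disagreements / len(shared) <= n:
            edges.append((a, b))
    return active, edges


# Hypothetical toy ballots: MP "E" votes rarely, the last ballot is unanimous.
ballots = [
    {"A": "for", "B": "for", "C": "against", "D": "against", "E": "for"},
    {"A": "for", "B": "for", "C": "for", "D": "against"},
    {"A": "for", "B": "against", "C": "against", "D": "against"},
    {"A": "for", "B": "for", "C": "for", "D": "for"},
]
kept = filter_ballots(ballots)

# Sweep the cut-off to see where structure appears.
for n in (0.0, 0.35, 1.0):
    active, edges = build_edges(kept, n)
    print(n, len(edges))  # prints 0, 3 and 6 edges respectively
```

With this toy data, n = 0 yields no edges (criteria too narrow), n = 1.0 yields a complete graph (criteria too broad), and n = 0.35 leaves a chain A-B, B-C, C-D. Sweeping the cut-off in this way is one simple means of looking for the value at which patterns start to appear.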