Exploring the Networks in Open Public Data
  • raw data is not always immediately useful to the wider public - using open data - discovering patterns - making sense of it
  • It’s worthwhile to explore the networks that emerge from the data you’re looking at. Various kinds of networks: - people in companies (who communicates with whom) - MPs, based on co-voting patterns - networks of companies
  • Open data is data that can be freely used, reused and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike. - http://opendefinition.org/ - http://opendatahandbook.org/en/what-is-open-data/index.html
  • - scrape the data - make it open - clean up the data - transform the data - make it usable [for the purpose]. How do we define an edge?
  • We want to choose those parts of the data from which we can deduce something - simple procedural decisions are out. Chose voting instances where there were notable differences of opinion. Noise = MPs who voted only a few times (throws off the %s). Some votes are more important than others.
  • Harmony Centre. Greens / Farmers - choice: (a) join one of the two clusters; (b) isolation; (c) bridge between them
  • strong voting discipline in the Harmony Centre. The majority of the rest do not vote the same (at this value of n%)
  • far opposition / near opposition / coalition. Looks pretty. Does not give much useful information - almost a complete graph
  • does it look right at first sight? (the “sniff test”). Show to domain experts. People can make pretty graphs - but what do they mean? - what can we explain or show via them?
  • the Greens / Farmers party is bridging between the strong opposition party Harmony Centre and the ruling coalition - sometimes they agree with the opposition, sometimes with the coalition. See slides 21, 22 re “live animation” showing what happens if you take them off the graph
  • learned from experts: not everything appears as a vote; some votes are more important than others - more insights -> better visualisations (more truthful, etc.). Some advanced visualisations will need more information - e.g., to define which laws cover which topics. Bringing in more data - annotate nodes & edges with additional data / explanations of why this edge appears here - profiles for members of parliament (e.g., the TheyWorkForYou site in the UK) - linked data
  • another example of an open data graph visualisation
  • another view of this data: http://www.slideshare.net/DERIGalway/valdis-krebs-social-network-analysis-19872007/15 The central red cluster corresponds to the company headquarters. Each vertex in the network represents an employee, colored according to the location they work at. Graph edges denote frequent, confirmed, work-related communications between employees. Cluster overlaps reveal which employees frequently interact with other locations, serving as boundary-spanners. This visualization helps to identify key connectors in the company [0].
  • what do we do with these visualisations next? = how do we use them (to have impact, explain data, …)
  • social network visualisation & analysis allow us to see what was previously invisible. See the “Social Network Analysis” talk by Valdis Krebs for more info re SNA and network visualization
  • demo how the Greens / Farmers party is bridging between the strong opposition Harmony Centre and the ruling coalition - sometimes they agree with the opposition, sometimes with the coalition - (edge connection criteria n = 25%)
  • demo how the Greens / Farmers party is bridging between the strong opposition Harmony Centre and the ruling coalition. When the Greens / Farmers party nodes are hidden from the graph, there is no connection - the coalition and the Harmony Centre do not vote the same
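
The "hide the bridging nodes" demo in the last two notes can be approximated with a small reachability check. This is only a sketch on made-up data: the node names, the edge list and the `connected` helper are hypothetical and are not taken from the Saeima dataset or from the tool used in the talk.

```python
from collections import defaultdict, deque

def connected(edges, source_group, target_group, hidden=frozenset()):
    """Return True if some node in source_group can reach some node in
    target_group after the nodes in `hidden` are removed from the graph."""
    adjacency = defaultdict(set)
    for a, b in edges:
        if a in hidden or b in hidden:
            continue  # drop every edge that touches a hidden node
        adjacency[a].add(b)
        adjacency[b].add(a)

    targets = set(target_group) - set(hidden)
    queue = deque(n for n in source_group if n not in hidden)
    seen = set(queue)
    while queue:  # plain breadth-first search
        node = queue.popleft()
        if node in targets:
            return True
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False

# Hypothetical toy graph: two Harmony Centre MPs, two Greens / Farmers MPs,
# two coalition MPs, linked in a chain by "voted similarly" edges.
edges = [("hc1", "hc2"), ("hc2", "gf1"), ("gf1", "gf2"),
         ("gf2", "co1"), ("co1", "co2")]
harmony = {"hc1", "hc2"}
greens_farmers = {"gf1", "gf2"}
coalition = {"co1", "co2"}

with_bridge = connected(edges, harmony, coalition)
without_bridge = connected(edges, harmony, coalition, hidden=greens_farmers)
print(with_bridge, without_bridge)
```

In this toy graph the two blocs are linked only through the Greens / Farmers nodes, so hiding them disconnects the opposition from the coalition, which is the pattern the notes describe for n = 25%.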

Transcript

  • 1. Exploring the Networks in Open Public Data. Uldis Bojārs, Institute of Mathematics and Computer Science, University of Latvia. Using Open Data Workshop, Brussels, 20-Jun-2012
  • 2. About us • Institute of Mathematics and Computer Science, University of Latvia – http://www.lumii.lv/resource/show/170 – Uldis Bojārs @CaptSolo – Valdis Krebs http://orgnet.com – Pēteris Ručevskis
  • 3. Network visualisation and analysis. Applications: • discover interesting patterns • explore data in [more] detail. Work from the Open Data Hackathon in Riga: • analysis of Saeima voting patterns • http://opendata.lv
  • 4. Overview • Data needs to be Open • Pre-processing and filtering the data – selecting what to show • Data visualization – iterative process (visualize, refine, repeat) • What’s next?
  • 5. Open Data needed first (!) “Open data is data that can be freely used, reused and redistributed by anyone …” http://opendefinition.org/ Data needs to be: • open • easy to use. Still a problem in Latvia: • only a few datasets are open in an easy-to-consume form (PDF does not count :)
  • 6. http://titania.saeima.lv/LIVS11/SaeimaLIVS2_DK.nsf/0/9DEA96450E79B7E5C2257944007E589D?OpenDocument
  • 7. Pre-processing • Input: – raw vote data (scraped from the website) published at http://data.opendata.lv/ • Output: – nodes (MPs) – edges (connections between them) • What is a connection?
  • 8. Defining graph connections • Connect MPs if they have voted similarly – disagreed on at most n% of decisions • Filter out cases where almost all MPs voted the same • Filter out trivial decisions • Filter out noise
  • 9. Node colour legend • Ruling coalition: – Zatler’s Reform Party – Unity – the National Alliance • Opposition: – Harmony Centre – Greens / Farmers Party • a few non-party MPs
  • 10. MPs who always vote the same (n = 0%). Connection criteria too narrow
  • 11. MPs who disagree in less than 35% of cases. Connection criteria too broad (everyone agrees, really?)
  • 12. Refining the visualisation • Need to find the right cut-off values (n%) – where patterns [start to] appear – and the visualisation makes sense • Show the results to domain experts – MPs, journalists, political researchers, … • Experts: – help improve visualisations – can discover new things for themselves
  • 13. MPs who disagree in less than 11% of cases. Opposition parties [sometimes] vote the same
  • 14. MPs who disagree in less than 25% of cases. Bridges appear between the position (ruling coalition) and opposition parties (see slides 21, 22 re the bridging role of yellow nodes)
  • 15. What next? • Improve our understanding of data • Enhance visualisations – add clusters, etc. • Create multiple visualisations – different topics, changes in time, etc. • Bring in more data – explain nodes & edges
  • 16. Network visualisation example #1: Donations to political parties http://www.thenetworkthinkers.com/2011/12/innovation-happens-at-intersections.html
  • 17. Network visualisation example #2: Intra-company communication patterns
  • 18. Conclusion • Need more, useful Open Data • Discovering patterns, making sense of data – helping make sense = purpose of visualisations • Looking forward to collaboration re: – Using Open Data – Data Visualisation and Analysis
  • 19. More info • Uldis Bojārs uldis.bojars@gmail.com • Social Network Analysis talk / Valdis Krebs http://www.slideshare.net/DERIGalway/valdis-krebs-social-network-analysis-19872007 • Smart Network Analyzer tool http://sna.lumii.lv/ in development at IMCS, University of Latvia
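
The edge definition from slide 8 and the cut-off search from slides 10 to 14 can be sketched as follows. The slides do not say how the filters were implemented, so everything concrete here is an assumption: the vote format (one dict per vote, mapping MP to choice), the `max_majority` and `min_shared` parameters and their defaults, and the toy data.

```python
from itertools import combinations

def is_contested(ballots, max_majority=0.9):
    """Keep a vote only if no single choice was cast by more than
    max_majority of the MPs who took part (assumed filter)."""
    choices = list(ballots.values())
    largest_bloc = max(choices.count(c) for c in set(choices))
    return largest_bloc / len(choices) <= max_majority

def build_edges(votes, n_percent, min_shared=3, max_majority=0.9):
    """Connect two MPs if they disagreed on at most n_percent of the
    contested votes in which both took part."""
    contested = [v for v in votes if v and is_contested(v, max_majority)]
    mps = sorted({mp for v in contested for mp in v})
    edges = []
    for a, b in combinations(mps, 2):
        shared = [v for v in contested if a in v and b in v]
        if len(shared) < min_shared:
            continue  # too few common votes: treated as noise
        disagreed = sum(1 for v in shared if v[a] != v[b])
        if 100.0 * disagreed / len(shared) <= n_percent:
            edges.append((a, b))
    return edges

# Hypothetical votes for four MPs; the last one is unanimous and gets filtered.
votes = [
    {"A": "for", "B": "for", "C": "against", "D": "against"},
    {"A": "for", "B": "for", "C": "for", "D": "against"},
    {"A": "against", "B": "against", "C": "against", "D": "for"},
    {"A": "for", "B": "against", "C": "against", "D": "against"},
    {"A": "for", "B": "for", "C": "for", "D": "for"},
]

# Sweep the cut-off to see where structure starts to appear.
edge_counts = {n: len(build_edges(votes, n)) for n in (0, 25, 50, 100)}
print(edge_counts)
print(build_edges(votes, 25))
```

On this toy data n = 0% yields no edges and n = 100% yields the complete graph, mirroring the "too narrow" and "too broad" cases on slides 10 and 11. A useful cut-off lies somewhere in between and still has to be judged by eye and by domain experts.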