This document discusses the challenges of using open government data for decision making and proposes visualizations as a way to help more people consume and communicate data. It notes that it is difficult both to create visualizations from distributed government data and to reuse existing ones. It suggests using personas to understand the different stakeholder types, and explores how to ease the creation and reuse of visualizations built on open data so that more people can make informed decisions.
Improving decision-making based on government data and visualizations
2. Agenda
• Background
• Open Government Data
• Problem
• How to use this data?
• Proposed Solution
• Personas
• (Re)use of visualizations
• Future Work
4. Open Government Data
• Governments are releasing huge amounts of data (geographical, budget, transit, etc.)
• Goal: improve transparency and the economy, help people make informed decisions, etc.

“Open data is the electricity of the 21st century!” - M. Hausenblas
5. The government data landscape
• Independent
• Different goals
• No coordination
• Highly decoupled
• Asynchronous

[Diagram: many independent data flows between a Data Producer, Data Consumers (in govt, data journalist) and a Civil Hacker]
6. Scenario
Problem: some stakeholders can’t use most of this government data in their decision-making process, since they don’t have the skills or training needed to consume it.*

* Based on interviews
7. Objectives
• Our goal: allow more people to use and understand government data to make more informed decisions
• A solution: improve the creation, sharing and reuse of data-based visualizations, so they can consume and communicate data
8. Challenges
• Who are the stakeholders?
  • Govt. data producers and consumers, data journalists/activists, civil hackers, citizens
• How do we help people to (re)use all this data?
  • Use visualizations as a medium for communication [1]
  • ... but this is hard [2]
• How can we ease these processes?

[1] Crapo, A.W., et al. Visualization and the process of modeling: a cognitive-theoretic view, 2000
[2] Viégas, F.B., et al. ManyEyes: a site for visualization at internet scale, 2007
10. Stakeholders
• Government Data Provider
• Government Data Consumer
• Data Journalist / Activist
• Civil Hacker
  • Already use the data, have the skills
• Common Citizen
  • Not interested [3][4] in being part of this ecosystem (directly)

[3] DiFranzo, D. and Graves, A. A Farm in Every Window: A Study into the Incentives for Participation in the Windowfarm Virtual Community, 2010
[4] Preece, J. and Shneiderman, B. The reader-to-leader framework: Motivating technology-mediated social participation, 2009
11. Profile modelling using Personas
• Personas [5] is a technique common in HCI and human factors to understand user types
• Based on interviews, create a “persona” that represents a set of users with common characteristics
• Add as many details as possible to understand their environment, …

[5] Blomkvist, S. Personas - An overview, 2004
12. Persona: Government data provider*
• Phillip Mancini, 35, married, one daughter.
• He is a data analyst working for the agency for Electronic Government
• His work consists of promoting the government’s data portal
• This means coordinating and requesting data from other agencies and
publishing it in the government portal
• Promoting the data and making it easier for others to use
• He knows some programming, but he is not an expert (though he knows
several datasets well)
• He occasionally creates mashups for his boss or other government employees
to show the benefits of Open Data (but he doesn’t have much time/expertise
for this)
* Based on interviews with government employees
13. How can we help
people to (re)use all
these data?
14. Visualizations as a way to consume and share data
• Visualizations are a simple way for humans to communicate
data and quantitative information[6]
• A visualization can be
• A graph (e.g. a pie chart or a scatterplot)
• A table or a list
• A map
[Example pie chart and scatterplot shown on slide]
[6] Few, S. Data Visualization for Human Perception, Encyclopedia of Human-Computer Interaction, 2010
15. Problems for the creator*
• Creating visualizations is hard
• Creator needs to understand underlying data
• Creator needs to choose a visualization strategy
• Visualizations of Open Government Data
• Different formats
• Distributed data
• Focus on how to tie everything up
* Based on preliminary interviews (Govt. data provider & consumer)
16. Problems for the observer*
• Accountability questions
• Visualization’s provenance
• Where does the data come from?
• When was it collected?
• How was it processed?
[Example chart shown on slide]
*Based on preliminary interviews (Data journalist)
17. Problems for the reuser*
• “I wonder how this data looks in a map”
• “What if we use the data from previous year?”
• “What if we take the median instead of the
average?”
[Example scatterplot shown on slide]
* Based on interviews (Govt. data consumer & Data journalist)
18. How can we ease the
process of creating and
reusing a visualization?
19. Visualizations as
declarative components
• Instead of forcing users to interact with code, use formal components
that mediate between the user and the computer
• These components will reduce the effort, training and skills necessary
to create visualizations
[Example chart shown on slide]
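The declarative-component idea can be sketched in a few lines: the visualization is described as plain data that a tool interprets, instead of code the user must write. Everything below (the field names, the `describe` helper, the dataset URL) is an invented illustration, not the authors' actual formalization.

```python
# Hypothetical declarative description of a visualization: plain data,
# no plotting code. Field names are illustrative, not the authors' schema.
chart_spec = {
    "type": "bar",
    "title": "Full Chart Title Goes Here",
    "x_axis": "Category",
    "y_axis": "Count",
    "data_source": "http://data.example.gov/dataset.csv",  # assumed URL
}

def describe(spec):
    """Summarize a declarative spec so a non-programmer can read it back."""
    return (f"A {spec['type']} chart titled '{spec['title']}' plotting "
            f"{spec['y_axis']} by {spec['x_axis']} from {spec['data_source']}")

print(describe(chart_spec))
```

Because the spec is data rather than code, a mediating tool can render it, explain it, or let a user edit it through a form.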
20. Step 1: Encode this
knowledge
• Use of semantics to represent visualizations
• High-level representation of the different components of a visualization
[Ontology diagram shown on slide: :Component with subclasses :VisualizationComponent,
:DataComponent and :ProcessComponent (including :UrlDereferencer and
:SparqlEndpointRetriever), related to :Input and :Parameter via :usedInput and
:usedParameter, and linked to opmv:Process, opmv:Artifact, opmv:Agent,
skos:Concept and cnt:ContentAsText]
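One minimal way to encode the subclass hierarchy from the diagram is as plain (subject, predicate, object) triples. The sketch below uses Python tuples rather than a real RDF store (rdflib or a triplestore would be the natural choice in practice) so it stays self-contained; the class names are taken from the slide's diagram.

```python
# Minimal triple encoding of part of the slide's ontology.
# Plain tuples stand in for a real RDF store to keep the sketch runnable.
triples = [
    (":VisualizationComponent",  "rdfs:subClassOf", ":Component"),
    (":DataComponent",           "rdfs:subClassOf", ":Component"),
    (":ProcessComponent",        "rdfs:subClassOf", ":Component"),
    (":UrlDereferencer",         "rdfs:subClassOf", ":ProcessComponent"),
    (":SparqlEndpointRetriever", "rdfs:subClassOf", ":ProcessComponent"),
]

def subclasses_of(cls, triples):
    """Direct and transitive subclasses of `cls` via rdfs:subClassOf."""
    direct = {s for s, p, o in triples if p == "rdfs:subClassOf" and o == cls}
    return direct | {sub for d in direct for sub in subclasses_of(d, triples)}

print(sorted(subclasses_of(":Component", triples)))
```

Once the knowledge is encoded this way, tools can answer questions like "which process components exist?" without ever showing the user code.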
21. Step 2: Explore
Visualization
• Allow users to obtain the formalization of it
• High-level components
• The relations among them
• Display it in graphical terms (workflow, forms, etc.)
[Example chart and ontology diagram shown on slide]
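The exploration step can be sketched the same way: given one concrete visualization formalized as triples, a tool walks the :usedInput / :usedParameter relations and renders them as an outline (a workflow view). The property names follow the slide's diagram; the individuals (:viz1, :retriever1, :endpointUrl) are invented for illustration.

```python
# Hypothetical formalization of one concrete visualization, as triples.
# Individuals are invented; property names come from the slide's diagram.
viz = [
    (":viz1",       "rdf:type",       ":VisualizationComponent"),
    (":viz1",       ":usedInput",     ":retriever1"),
    (":retriever1", "rdf:type",       ":SparqlEndpointRetriever"),
    (":retriever1", ":usedParameter", ":endpointUrl"),
]

def outline(subject, triples, depth=0):
    """List the components a visualization depends on, as an indented tree."""
    lines = []
    for s, p, o in triples:
        if s == subject and p in (":usedInput", ":usedParameter"):
            lines.append("  " * depth + f"{p} -> {o}")
            lines.extend(outline(o, triples, depth + 1))
    return lines

for line in outline(":viz1", viz):
    print(line)
```

The same traversal could feed a graphical workflow editor or an auto-generated form, which is what "display it in graphical terms" suggests.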
22. Step 3: Reuse of a
visualization
• Modify a new copy of a visualization
• Represented to the user as a formalization, not as code
[Two example charts with their formalizations shown on slide]
• Backlinking
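Reuse then amounts to copying a formalization and changing one field, e.g. the earlier "what if we take the median instead of the average?" question. The spec layout, its `aggregation` field, and the sample numbers below are invented illustrations of that idea.

```python
import copy
import statistics

# Hypothetical formalization of an existing visualization;
# the "aggregation" field and sample data are invented.
original = {
    "title": "Average income by region",
    "aggregation": "mean",
    "data": [20, 30, 30, 100],
}

# Reuse: modify a *copy*, leaving the original visualization intact.
variant = copy.deepcopy(original)
variant["aggregation"] = "median"
variant["title"] = "Median income by region"

AGGREGATES = {"mean": statistics.mean, "median": statistics.median}

def compute(spec):
    """Apply the aggregation the spec declares to its data."""
    return AGGREGATES[spec["aggregation"]](spec["data"])

print(compute(original))  # mean of the sample data: 45
print(compute(variant))   # median of the sample data: 30
```

Keeping the original intact while deriving variants is also what makes backlinking possible: each copy can point to the formalization it was derived from.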
23. What should we measure?
• Time required to complete tasks
• Create visualization from scratch vs. using
formalization
• Reuse visualization from scratch vs. using
formalization
• Self report
• Can you do a task you weren’t able to do before?
• Can you perform better (time, # errors) using this
approach?
24. Future work
• Develop the personas more completely
• Work with more Data Producers and Data Journalists
• Build tools based on our formalization
• Several components already created
• Test them with real users
• Design the experiments in detail
• A dozen volunteers available so far
25. References
• [1] Crapo, A.W., et al. Visualization and the process of modeling: a cognitive-
theoretic view, 2000
• [2] Viegas, F.B., et al. ManyEyes: a site for visualization at internet scale, 2007
• [3] DiFranzo, D. and Graves, A. A Farm in Every Window: A Study into the
Incentives for Participation in the Windowfarm Virtual Community, 2010
• [4] Preece, J. and Shneiderman, B. The reader-to-leader framework: Motivating
technology-mediated social participation, 2009
• [5] Blomkvist, S. Personas - An overview, 2004
• [6] Few, S. Data Visualization for Human Perception, Encyclopedia of Human-Computer Interaction, 2010