This document discusses the challenges of using open government data for decision making and proposes visualizations as a way to help more people consume and communicate data. It notes that it is difficult both to create visualizations from distributed government data and to reuse existing ones. It suggests using personas to understand the different stakeholder types, and explores how to ease the creation and reuse of visualizations built on open data so that more people can make informed decisions.
Improving decision-making based on government data and visualizations
2. Agenda
• Background
• Open Government Data
• Problem
• How to use this data?
• Proposed Solution
• Personas
• (Re)use of visualizations
• Future Work
4. Open Government Data
• Governments are releasing huge amounts of data (geographical, budget, transit, etc.)
• Goal: improve transparency and the economy, help people make informed decisions, etc.

“Open data is the electricity of the 21st century!” - M. Hausenblas
5. The government data landscape
• Independent
• Different goals
• No coordination
• Highly decoupled
• Asynchronous

[Diagram: many independent data flows between a Data Producer, Data Consumers (in govt, data journalist) and a Civil Hacker]
6. Scenario
Problem: some stakeholders can’t use most of this government data in their decision-making process, since they don’t have the skills or training needed to consume it.*

* Based on interviews
7. Objectives
• Our goal: allow more people to use and understand government data to make more informed decisions
• A solution: improve the creation, sharing and reuse of data-based visualizations, so they can consume and communicate data
8. Challenges
• Who are the stakeholders?
  • Govt. data producers and consumers, data journalists/activists, civil hackers, citizens
• How do we help people to (re)use all this data?
  • Use visualizations as a medium for communication [1]
  • ... but this is hard [2]
• How can we ease these processes?

[1] Crapo, A.W., et al. Visualization and the process of modeling: a cognitive-theoretic view, 2000
[2] Viégas, F.B., et al. ManyEyes: a site for visualization at internet scale, 2007
10. Stakeholders
• Government Data Provider
• Government Data Consumer
• Data Journalist / Activist
• Civil Hacker
  • Already use the data, have the skills
• Common Citizen
  • Not interested [3][4] in being part of this ecosystem (directly)

[3] DiFranzo, D. and Graves, A. A Farm in Every Window: A Study into the Incentives for Participation in the Windowfarm Virtual Community, 2010
[4] Preece, J. and Shneiderman, B. The reader-to-leader framework: Motivating technology-mediated social participation, 2009
11. Profile modelling using Personas
• Personas [5] is a technique common in HCI and human factors to understand user types
• Based on interviews, create a “persona” that represents a set of users with common characteristics
• Add as many details as possible to understand their environment, …

[5] Blomkvist, S. Personas - An overview, 2004
12. Persona: Government data provider*
• Phillip Mancini, 35, married, one daughter.
• He is a data analyst working for the agency for Electronic Government
• His work consists of promoting the government’s data portal
• This means coordinating and requesting data from other agencies and
publishing it in the government portal
• Promoting the data and making it easier for others to use
• He knows some programming, but he is not an expert (though he knows
several datasets well)
• He occasionally creates mashups for his boss or other government employees
to show the benefits of Open Data (but he doesn’t have much time/expertise
for this)
* Based on interviews with government employees
13. How can we help
people to (re)use all
these data?
14. Visualizations as a way to consume and share data
• Visualizations are a simple way for humans to communicate
data and quantitative information[6]
• A visualization can be
• A graph (e.g. a pie chart or a scatterplot)
• A table or a list
• A map
[Example pie chart and scatterplot shown on slide]
[6] Few, S. Data Visualization for Human Perception, Encyclopedia of Human-Computer Interaction, 2010
15. Problems for the creator*
• Creating visualizations is hard
• Creator needs to understand underlying data
• Creator needs to choose a visualization strategy
• Visualizations of Open Government Data
• Different formats
• Distributed data
• Focus on how to tie everything up
* Based on preliminary interviews (Govt. data provider & consumer)
16. Problems for the observer*
• Accountability questions
• Visualization’s provenance
• Where does the data come from?
• When was it collected?
• How was it processed?
[Example chart shown on slide]
*Based on preliminary interviews (Data journalist)
17. Problems for the reuser*
• “I wonder how this data looks in a map”
• “What if we use the data from previous year?”
• “What if we take the median instead of the
average?”
[Example scatterplot shown on slide]
* Based on interviews (Govt. data consumer & Data journalist)
18. How can we ease the
process of creating and
reusing a visualization?
19. Visualizations as
declarative components
• Instead of forcing users to interact with code, use formal components
that mediate between the user and the computer
• These components will reduce the effort, training and skills necessary
to create visualizations
[Example chart shown on slide]
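The declarative-component idea can be sketched in a few lines: the visualization is described as plain data that a tool interprets, instead of code the user must write. Everything below (the field names, the `describe` helper, the dataset URL) is an invented illustration, not the authors' actual formalization.

```python
# Hypothetical declarative description of a visualization: plain data,
# no plotting code. Field names are illustrative, not the authors' schema.
chart_spec = {
    "type": "bar",
    "title": "Full Chart Title Goes Here",
    "x_axis": "Category",
    "y_axis": "Count",
    "data_source": "http://data.example.gov/dataset.csv",  # assumed URL
}

def describe(spec):
    """Summarize a declarative spec so a non-programmer can read it back."""
    return (f"A {spec['type']} chart titled '{spec['title']}' plotting "
            f"{spec['y_axis']} by {spec['x_axis']} from {spec['data_source']}")

print(describe(chart_spec))
```

Because the spec is data rather than code, a mediating tool can render it, explain it, or let a user edit it through a form.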
20. Step 1: Encode this
knowledge
• Use of semantics to represent visualizations
• High-level representation of the different components of a visualization
[Ontology diagram shown on slide: :Component with subclasses :VisualizationComponent,
:DataComponent and :ProcessComponent (including :UrlDereferencer and
:SparqlEndpointRetriever), related to :Input and :Parameter via :usedInput and
:usedParameter, and linked to opmv:Process, opmv:Artifact, opmv:Agent,
skos:Concept and cnt:ContentAsText]
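One minimal way to encode the subclass hierarchy from the diagram is as plain (subject, predicate, object) triples. The sketch below uses Python tuples rather than a real RDF store (rdflib or a triplestore would be the natural choice in practice) so it stays self-contained; the class names are taken from the slide's diagram.

```python
# Minimal triple encoding of part of the slide's ontology.
# Plain tuples stand in for a real RDF store to keep the sketch runnable.
triples = [
    (":VisualizationComponent",  "rdfs:subClassOf", ":Component"),
    (":DataComponent",           "rdfs:subClassOf", ":Component"),
    (":ProcessComponent",        "rdfs:subClassOf", ":Component"),
    (":UrlDereferencer",         "rdfs:subClassOf", ":ProcessComponent"),
    (":SparqlEndpointRetriever", "rdfs:subClassOf", ":ProcessComponent"),
]

def subclasses_of(cls, triples):
    """Direct and transitive subclasses of `cls` via rdfs:subClassOf."""
    direct = {s for s, p, o in triples if p == "rdfs:subClassOf" and o == cls}
    return direct | {sub for d in direct for sub in subclasses_of(d, triples)}

print(sorted(subclasses_of(":Component", triples)))
```

Once the knowledge is encoded this way, tools can answer questions like "which process components exist?" without ever showing the user code.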
21. Step 2: Explore
Visualization
• Allow users to obtain the formalization of it
• High-level components
• The relations among them
• Display it in graphical terms (workflow, forms, etc.)
[Example chart and ontology diagram shown on slide]
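The exploration step can be sketched the same way: given one concrete visualization formalized as triples, a tool walks the :usedInput / :usedParameter relations and renders them as an outline (a workflow view). The property names follow the slide's diagram; the individuals (:viz1, :retriever1, :endpointUrl) are invented for illustration.

```python
# Hypothetical formalization of one concrete visualization, as triples.
# Individuals are invented; property names come from the slide's diagram.
viz = [
    (":viz1",       "rdf:type",       ":VisualizationComponent"),
    (":viz1",       ":usedInput",     ":retriever1"),
    (":retriever1", "rdf:type",       ":SparqlEndpointRetriever"),
    (":retriever1", ":usedParameter", ":endpointUrl"),
]

def outline(subject, triples, depth=0):
    """List the components a visualization depends on, as an indented tree."""
    lines = []
    for s, p, o in triples:
        if s == subject and p in (":usedInput", ":usedParameter"):
            lines.append("  " * depth + f"{p} -> {o}")
            lines.extend(outline(o, triples, depth + 1))
    return lines

for line in outline(":viz1", viz):
    print(line)
```

The same traversal could feed a graphical workflow editor or an auto-generated form, which is what "display it in graphical terms" suggests.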
22. Step 3: Reuse of a
visualization
• Modify a new copy of a visualization
• Represented to the user as a formalization, not as code
[Two example charts with their formalizations shown on slide]
• Backlinking
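Reuse then amounts to copying a formalization and changing one field, e.g. the earlier "what if we take the median instead of the average?" question. The spec layout, its `aggregation` field, and the sample numbers below are invented illustrations of that idea.

```python
import copy
import statistics

# Hypothetical formalization of an existing visualization;
# the "aggregation" field and sample data are invented.
original = {
    "title": "Average income by region",
    "aggregation": "mean",
    "data": [20, 30, 30, 100],
}

# Reuse: modify a *copy*, leaving the original visualization intact.
variant = copy.deepcopy(original)
variant["aggregation"] = "median"
variant["title"] = "Median income by region"

AGGREGATES = {"mean": statistics.mean, "median": statistics.median}

def compute(spec):
    """Apply the aggregation the spec declares to its data."""
    return AGGREGATES[spec["aggregation"]](spec["data"])

print(compute(original))  # mean of the sample data: 45
print(compute(variant))   # median of the sample data: 30
```

Keeping the original intact while deriving variants is also what makes backlinking possible: each copy can point to the formalization it was derived from.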
23. What should we measure?
• Time required to complete tasks
• Create visualization from scratch vs. using
formalization
• Reuse visualization from scratch vs. using
formalization
• Self report
• Can you do a task you weren’t able to do before?
• Can you perform better (time, # errors) using this
approach?
24. Future work
• Develop the personas more completely
• Work with more Data Producers and Data Journalists
• Build tools based on our formalization
• Several components already created
• Test them with real users
• Design the experiments in detail
• A dozen volunteers available so far
25. References
• [1] Crapo, A.W., et al. Visualization and the process of modeling: a cognitive-
theoretic view, 2000
• [2] Viegas, F.B., et al. ManyEyes: a site for visualization at internet scale, 2007
• [3] DiFranzo, D. and Graves, A. A Farm in Every Window: A Study into the
Incentives for Participation in the Windowfarm Virtual Community, 2010
• [4] Preece, J. and Shneiderman, B. The reader-to-leader framework: Motivating
technology-mediated social participation, 2009
• [5] Blomkvist, S. Personas - An overview, 2004
• [6] Few, S. Data Visualization for Human Perception, Encyclopedia of Human-Computer Interaction, 2010