Voyagers and Voyeurs
Supporting Social Data Analysis

Jeffrey Heer
Computer Science Department
Stanford University

CIDR 2009 – Monterey, CA
5 January 2009
A Tale of Two Visualizations
vizster
Observations
Groups spent more time in front of the
visualization than individuals.

Friends encouraged each other to unearth
relationships, probe community boundaries, and
challenge reported information.

Social play resulted in informal analysis, often
driven by story-telling of group histories.
NameVoyager
The Baby Name Voyager
Social Data Analysis
Visual sensemaking can be social as
well as cognitive.
Analysis of data coupled with social
interpretation and deliberation.

How can user interfaces catalyze and
support collaborative visual analysis?
sense.us
A Web Application for Collaborative
Visualization of Demographic Data
Voyagers and Voyeurs
Complementary faces of analysis
Voyager – focus on visualized data
Active engagement with the data
Serendipitous comment discovery

Voyeur – focus on comment listings
Investigate others’ explorations
Find people and topics of interest
Catalyze new explorations
Out of the Lab,
 Into the Wild
Wikimapia.org
Spotfire Decision Site Posters
Tableau Server
Many-Eyes
Social Data Analysis In Action
1. Discussion and Debate
2. Text is Data, Too
3. Data Integrity and Cleaning
4. Integrating Data in Context
5. Pointing and Naming

For each, some thoughts on future directions.
I asked my colleagues: if you could give database
researchers a wish list, what would it be?
Discussion and Debate
Tableau X-Box / Quest Diag?

              “Valley of Death”
Content Analysis of Comments
[Figure: paired bar charts, one panel per service (Sense.us, Many-Eyes), showing the percentage of comments (axis 0 to 80) in each category: Observation, Question, Hypothesis, Data Integrity, Linking, Socializing, System Design, Testing, Tips, To-Do, Affirmation.]

 Feature prevalence from content analysis (min Cohen’s κ = .74)
 High co-occurrence of Observations, Questions, and Hypotheses
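
The agreement figure quoted for the content analysis is Cohen's kappa, which corrects raw agreement between two coders for the agreement expected by chance. A minimal sketch of the computation follows; the function name and the example labels are illustrative assumptions, not data from the study.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders labelling the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(coder_a)
    # Observed agreement: fraction of items given the same label.
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of each coder's marginal label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_chance = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (p_observed - p_chance) / (1.0 - p_chance)

# Hypothetical labels: does each comment contain a Question?
coder_a = ["yes", "yes", "no", "no", "yes", "no", "no", "no"]
coder_b = ["yes", "no", "no", "no", "yes", "no", "no", "yes"]
kappa = cohens_kappa(coder_a, coder_b)
```

Here the coders agree on 6 of 8 items, but kappa is lower than 0.75 because some of that agreement would occur by chance alone.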
WANTED: Structured Conversation

Reduce the cost of synthesizing contributions




Wikipedia: Shared Revisions   NASA ClickWorkers: Statistics

Can we represent data, visualizations, and social
activity in a unified data model?
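
One way to picture the question of a unified model is a single store in which comments point at views and views point at data, so that discussion can be queried by the data it concerns. The sketch below is only an illustration under that assumption; every class, field, and method name is hypothetical and not taken from sense.us or any other system mentioned in the talk.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Dataset:
    dataset_id: str
    columns: tuple          # column names in the data set

@dataclass(frozen=True)
class View:
    view_id: str
    dataset_id: str         # the data the view draws on
    chart_type: str
    columns: tuple          # columns shown in this view

@dataclass(frozen=True)
class Comment:
    comment_id: str
    view_id: str            # the view the comment was made on
    author: str
    text: str

@dataclass
class Workspace:
    datasets: dict = field(default_factory=dict)
    views: dict = field(default_factory=dict)
    comments: list = field(default_factory=list)

    def add_dataset(self, dataset):
        self.datasets[dataset.dataset_id] = dataset

    def add_view(self, view):
        if view.dataset_id not in self.datasets:
            raise KeyError(f"unknown dataset: {view.dataset_id}")
        self.views[view.view_id] = view

    def add_comment(self, comment):
        if comment.view_id not in self.views:
            raise KeyError(f"unknown view: {comment.view_id}")
        self.comments.append(comment)

    def comments_about(self, dataset_id, column):
        """Comments on any view of this dataset that shows the column."""
        return [
            c for c in self.comments
            if self.views[c.view_id].dataset_id == dataset_id
            and column in self.views[c.view_id].columns
        ]

# Hypothetical contents, for illustration only.
ws = Workspace()
ws.add_dataset(Dataset("jobs", ("year", "occupation", "count")))
ws.add_view(View("v1", "jobs", "stacked graph", ("year", "occupation", "count")))
ws.add_view(View("v2", "jobs", "bar chart", ("year", "count")))
ws.add_comment(Comment("c1", "v1", "ann", "Why does this occupation vanish?"))
ws.add_comment(Comment("c2", "v2", "bob", "Totals look plausible."))
about_occupation = ws.comments_about("jobs", "occupation")
```

Because comments resolve through views to columns, the same query works no matter which chart type a comment was originally left on.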
Text is Data, Too
Visualization Popularity
[Figure: paired bar charts, one panel per service (Many-Eyes, Swivel), showing the share of visualizations (axis 0.0 to 0.5, labeled "Percentage") of each type: Tag Cloud, Bubble Graph, Word Tree, Bar Chart, Maps, Network Diagram, Treemap, Matrix Chart, Line Graph, Scatterplot, Stacked Graph, Pie Chart, Histogram.]

Over 1/3 of Many-Eyes visualizations use free text
Alberto Gonzales
WANTED: Better Tools for Text

Statistical Analysis of text (with ties to source!)
Entity Extraction
Aggregation and Comparison of texts
  Get a “global” view of documents

We can do better than Tag Clouds (!?)
Use text analysis tools to enable analysis of
structured conversation by the community.
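
The phrase "with ties to source" can be illustrated with a term index that keeps a pointer back to every occurrence, so an aggregate count can always be traced to the text behind it. This is a minimal sketch assuming simple lowercase word tokens; the function names and document ids are hypothetical, and the two sample texts are borrowed from the comment quoted later in the talk.

```python
import re
from collections import defaultdict

def build_index(documents):
    """Map each term to the (doc_id, token_position) pairs where it occurs."""
    index = defaultdict(list)
    for doc_id, text in documents.items():
        tokens = re.findall(r"[a-z']+", text.lower())
        for position, token in enumerate(tokens):
            index[token].append((doc_id, position))
    return index

def term_counts(index):
    """Aggregate view: how often each term occurs across all documents."""
    return {term: len(hits) for term, hits in index.items()}

# Document ids are hypothetical labels for two comments.
documents = {
    "c1": "No cooks in 1910?",
    "c2": "There may have been cooks then.",
}
index = build_index(documents)
counts = term_counts(index)
cooks_sources = index["cooks"]   # every occurrence, traceable to its comment
```

A tag cloud would show only `counts`; keeping `index` alongside it is what lets a reader jump from a frequent term to the passages that produced it.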
Data Integrity and Cleaning
No cooks in 1910? … There may have
been cooks then. But maybe not.
The great postmaster
scourge of 1910?
      Or just a bug
      in the data?
Content Analysis of Comments
[Figure: paired bar charts, one panel per service (Sense.us, Many-Eyes), showing the percentage of comments (axis 0 to 80) in each category: Observation, Question, Hypothesis, Data Integrity, Linking, Socializing, System Design, Testing, Tips, To-Do, Affirmation.]

 16% of sense.us comments and 10% of Many-Eyes comments
 reference data quality or integrity.
WANTED: Data Cleaning Tools

Reshape data, reformat rows & columns
Handle missing data: label, repair, interpolate
Entity resolution and de-duplication
Group related values into aggregates
Assist table lookups & data transforms
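
As one illustration of "label, repair, interpolate", the sketch below fills interior gaps in a numeric series by linear interpolation and returns a parallel list of flags so repaired values stay labeled. The function name and the sample series are assumptions made for this example; leading and trailing gaps are deliberately left unfilled.

```python
def interpolate_missing(values):
    """Fill interior None gaps linearly; return (filled, was_missing flags)."""
    filled = list(values)
    # Label: remember which positions were missing in the original data.
    was_missing = [v is None for v in values]
    known = [i for i, v in enumerate(values) if v is not None]
    # Repair: interpolate between each pair of neighbouring known values.
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            step = (values[right] - values[left]) * (i - left) / span
            filled[i] = values[left] + step
    return filled, was_missing

# Hypothetical series with two interior gaps and one trailing gap.
series = [10, None, None, 40, None]
filled, was_missing = interpolate_missing(series)
```

Returning the flags separately keeps the repair transparent: a visualization can draw interpolated points differently from observed ones.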

Provide tools in situ to leverage collective
Transparency requires provenance
Integrating Data in Context
College Drug Use
Harry Potter is Freaking Popular
WANTED: In-Situ Data Integration

Search for and suggest related data or views
User input for types, schema matching, or data
Apply in context of the current task
 But record mappings for future use
Record provenance: chain of data sources

Examples: Google Web Tables, Pay-As-You-Go,
  Stanford Vispedia, Utah VisTrails
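
Recording provenance as a chain of data sources can be sketched as a linked record in which each derived table names the operation that produced it and the source it came from. All names below are hypothetical and are not drawn from any of the systems listed above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Source:
    name: str
    operation: str                      # how this table was produced
    parent: Optional["Source"] = None   # the table it was derived from

def lineage(source):
    """Walk back from a derived table to its original source."""
    chain = []
    while source is not None:
        chain.append(f"{source.name} ({source.operation})")
        source = source.parent
    return chain

# Hypothetical integration steps, for illustration only.
raw = Source("survey.csv", "uploaded")
cleaned = Source("survey_clean", "dropped empty rows", parent=raw)
joined = Source("survey_by_state", "joined with state table", parent=cleaned)
chain = lineage(joined)
```

Keeping the chain with the data means a mapping made for the current task is still available, and auditable, when the data is reused later.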
Pointing and Naming
“Look at that spike.”
“Look at the spike for Turkey.”
“Look at the spike in the middle.”
Free-form   Data-aware
Visual Queries
Model selections as declarative queries over
interface elements or underlying data




  (-118.371 ≤ lon AND lon ≤ -118.164) AND (33.915 ≤ lat AND lat ≤ 34.089)

Applicable to dynamic, time-varying data
Retarget selection across visual encodings
Support social navigation and data mining
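
The idea of modeling a selection as a declarative query can be sketched by storing a brushed region as a predicate over data fields instead of a list of selected marks. The class and function names here are assumptions for illustration, as are the sample rows; only the longitude and latitude bounds come from the example query above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Range:
    column: str
    low: float
    high: float

    def matches(self, row):
        return self.low <= row[self.column] <= self.high

@dataclass(frozen=True)
class And:
    clauses: tuple

    def matches(self, row):
        return all(clause.matches(row) for clause in self.clauses)

def select(query, rows):
    """Evaluate a stored selection against whatever data is current."""
    return [row for row in rows if query.matches(row)]

# The brushed map region, stored as a query over the underlying data.
brush = And((
    Range("lon", -118.371, -118.164),
    Range("lat", 33.915, 34.089),
))

# Hypothetical rows; the names are placeholders.
rows = [
    {"name": "A", "lon": -118.25, "lat": 34.05},
    {"name": "B", "lon": -117.90, "lat": 33.80},
]
before = [r["name"] for r in select(brush, rows)]

# New data arrives later: the same query picks it up without re-selecting.
rows.append({"name": "C", "lon": -118.30, "lat": 33.95})
after = [r["name"] for r in select(brush, rows)]
```

Because the selection refers to data values and not to pixels or mark ids, it can be re-evaluated on time-varying data and applied in a different chart of the same fields.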
WANTED: Data-Aware Annotation

Meta-queries linking annotations to views
Visually specifying notification triggers
Annotating data aggregates (use lineage?)
Unified model (again!) to facilitate reference
How to make it work at scale?

How else to use machine-readable annotations?
Can annotations be used to steer data mining?
Conclusion
Social Data Analysis
Collective analysis of data supported
by social interaction.
1. Discussion and Debate
2. Text is Data, Too
3. Data Integrity and Cleaning
4. Integrating Data in Context
5. Pointing and Naming
Summary
As visualization becomes common on the web,
opportunities for collaborative analysis abound.
Weave visualizations into the web: data access,
visualization creation, view sharing and pointing.
Support discovery, discussion, and integration
of contributions to leverage the collective.
Improve both processes and technologies for
communication and dissemination.
Parting Thoughts
Visualizations may have a catalytic effect
on social interaction around data.

Encourage participation by minimizing or
offsetting interaction costs.

Provide incentives by fostering the
personal relevance of the data.
Acknowledgements

@ Berkeley: Maneesh Agrawala, Wes Willett,
  danah boyd, Marti Hearst, Joe Hellerstein
@ IBM: Martin Wattenberg, Fernanda Viégas
@ PARC: Stu Card
@ Tableau: Jock Mackinlay, Chris Stolte,
  Christian Chabot
Voyagers and Voyeurs
Supporting Social Data Analysis

Jeffrey Heer Stanford University
jheer@stanford.edu
http://jheer.org
With a collaborative spirit, with a collaborative platform
where people can upload data, explore data, compare
solutions, discuss the results, build consensus, we can
engage passionate people, local communities, media and
this will raise - incredibly - the amount of people who can
understand what is going on.

And this would have fantastic outcomes: the engagement of
people, especially new generations; it would increase
knowledge, unlock statistics, improve transparency and
accountability of public policies, change culture, increase
numeracy, and in the end, improve democracy and welfare.

       Enrico Giovannini, Chief Statistician, OECD. June 2007.

“I’m still / I’m still / Chaining from the Block”
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 

CIDR 2009: Jeff Heer Keynote

  • 1. Voyagers and Voyeurs: Supporting Social Data Analysis. Jeffrey Heer, Computer Science Department, Stanford University. CIDR 2009, Monterey, CA, 5 January 2009.
  • 2. A Tale of Two Visualizations
  • 4. Observations: Groups spent more time in front of the visualization than individuals. Friends encouraged each other to unearth relationships, probe community boundaries, and challenge reported information. Social play resulted in informal analysis, often driven by story-telling of group histories.
  • 10. Social Data Analysis: Visual sensemaking can be social as well as cognitive. Analysis of data coupled with social interpretation and deliberation. How can user interfaces catalyze and support collaborative visual analysis?
  • 11. sense.us: A Web Application for Collaborative Visualization of Demographic Data
  • 13. Voyagers and Voyeurs: complementary faces of analysis. Voyager (focus on visualized data): active engagement with the data; serendipitous comment discovery. Voyeur (focus on comment listings): investigate others’ explorations; find people and topics of interest; catalyze new explorations.
  • 14. Out of the Lab, Into the Wild
  • 22. Social Data Analysis In Action: (1) Discussion and Debate; (2) Text is Data, Too; (3) Data Integrity and Cleaning; (4) Integrating Data in Context; (5) Pointing and Naming. For each, some thoughts on future directions. I asked my colleagues: if you could give database researchers a wish list, what would it be?
  • 27. Tableau X-Box / Quest Diag? “Valley of Death”
  • 31. Content Analysis of Comments. [Figure: bar charts of comment feature prevalence (percentage, 0 to 80) for Sense.us and Many-Eyes, by category: Observation, Question, Hypothesis, Data Integrity, Linking, Socializing, System Design, Testing, Tips, To-Do, Affirmation.] Feature prevalence from content analysis (min. Cohen’s κ = .74). High co-occurrence of Observations, Questions, and Hypotheses.
  • 32. WANTED: Structured Conversation. Reduce the cost of synthesizing contributions. Wikipedia: Shared Revisions. NASA ClickWorkers: Statistics.
  • 33. WANTED: Structured Conversation. Reduce the cost of synthesizing contributions. Can we represent data, visualizations, and social activity in a unified data model?
  • 35. Visualization Popularity. [Figure: bar charts of the share of visualizations by type on Many-Eyes and Swivel (axis labeled Percentage, 0.0 to 0.5); types shown: Tag Cloud, Bubble Graph, Word Tree, Bar Chart, Maps, Network Diagram, Treemap, Matrix Chart, Line Graph, Scatterplot, Stacked Graph, Pie Chart, Histogram.] Over 1/3 of Many-Eyes visualizations use free text.
  • 38. WANTED: Better Tools for Text. Statistical analysis of text (with ties to source!). Entity extraction. Aggregation and comparison of texts. Get a “global” view of documents. We can do better than Tag Clouds (!?). Use text analysis tools to enable analysis of structured conversation by the community.
  • 39. Data Integrity and Cleaning
  • 40. No cooks in 1910? … There may have been cooks then. But maybe not.
  • 41. The great postmaster scourge of 1910? Or just a bug in the data?
  • 44. Content Analysis of Comments. [Figure: the same bar charts of comment feature prevalence (percentage, 0 to 80) for Sense.us and Many-Eyes, by category.] 16% of sense.us comments and 10% of Many-Eyes comments reference data quality or integrity.
  • 45. WANTED: Data Cleaning Tools. Reshape data, reformat rows & columns. Handle missing data: label, repair, interpolate. Entity resolution and de-duplication. Group related values into aggregates. Assist table lookups & data transforms. Provide tools in situ to leverage collective. Transparency requires provenance.
  • 51. Harry Potter is Freaking Popular
  • 53. WANTED: In-Situ Data Integration. Search for and suggest related data or views. User input for types, schema matching, or data. Apply in context of the current task, but record mappings for future use. Record provenance: chain of data sources. Examples: Google Web Tables, Pay-As-You-Go, Stanford Vispedia, Utah VisTrails.
  • 55. “Look at that spike.”
  • 56. “Look at the spike for Turkey.”
  • 57. “Look at the spike in the middle.”
  • 58. Free-form Data-aware
  • 59. Visual Queries. Model selections as declarative queries over interface elements or underlying data, e.g. (-118.371 ≤ lon AND lon ≤ -118.164) AND (33.915 ≤ lat AND lat ≤ 34.089).
  • 60. Visual Queries. Model selections as declarative queries over interface elements or underlying data. Applicable to dynamic, time-varying data. Retarget selection across visual encodings. Support social navigation and data mining.
  • 61. WANTED: Data-Aware Annotation. Meta-queries linking annotations to views. Visually specifying notification triggers. Annotating data aggregates (use lineage?). Unified model (again!) to facilitate reference. How to make it work at scale? How else to use machine-readable annotations? Can annotations be used to steer data mining?
  • 63. Social Data Analysis: collective analysis of data supported by social interaction. (1) Discussion and Debate; (2) Text is Data, Too; (3) Data Integrity and Cleaning; (4) Integrating Data in Context; (5) Pointing and Naming.
  • 64. Summary: As visualization becomes common on the web, opportunities for collaborative analysis abound. Weave visualizations into the web: data access, visualization creation, view sharing and pointing. Support discovery, discussion, and integration of contributions to leverage the collective. Improve both processes and technologies for communication and dissemination.
  • 65. Parting Thoughts: Visualizations may have a catalytic effect on social interaction around data. Encourage participation by minimizing or offsetting interaction costs. Provide incentives by fostering the personal relevance of the data.
  • 66. Acknowledgements. @ Berkeley: Maneesh Agrawala, Wes Willett, danah boyd, Marti Hearst, Joe Hellerstein; @ IBM: Martin Wattenberg, Fernanda Viégas; @ PARC: Stu Card; @ Tableau: Jock Mackinlay, Chris Stolte, Christian Chabot.
  • 67. Voyagers and Voyeurs: Supporting Social Data Analysis. Jeffrey Heer, Stanford University. jheer@stanford.edu, http://jheer.org
  • 68. “With a collaborative spirit, with a collaborative platform where people can upload data, explore data, compare solutions, discuss the results, build consensus, we can engage passionate people, local communities, media and this will raise - incredibly - the amount of people who can understand what is going on. And this would have fantastic outcomes: the engagement of people, especially new generations; it would increase knowledge, unlock statistics, improve transparency and accountability of public policies, change culture, increase numeracy, and in the end, improve democracy and welfare.” (Enrico Giovannini, Chief Statistician, OECD, June 2007)
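
The Visual Queries slides (59 and 60) describe storing a selection as a declarative query over the underlying data rather than as screen geometry. A minimal Python sketch of that idea follows, assuming records are plain dictionaries with `lon` and `lat` fields. The `RangeQuery` class, its method names, and the sample records are hypothetical illustrations, not part of sense.us or any other system named in the talk; only the bounds are taken from slide 59.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class RangeQuery:
    """A rectangular selection stored as bounds over data fields, not pixels.

    Hypothetical illustration: the class and field names are assumptions.
    """
    lon_min: float
    lon_max: float
    lat_min: float
    lat_max: float

    def matches(self, record: Dict) -> bool:
        # A record is selected when both coordinates fall inside the bounds.
        return (self.lon_min <= record["lon"] <= self.lon_max
                and self.lat_min <= record["lat"] <= self.lat_max)

    def select(self, records: Iterable[Dict]) -> List[Dict]:
        # Re-evaluating the query against new data re-derives the selection.
        return [r for r in records if self.matches(r)]

    def to_predicate(self) -> str:
        # Render the selection in the textual form shown on the slide.
        return (f"({self.lon_min} <= lon AND lon <= {self.lon_max}) AND "
                f"({self.lat_min} <= lat AND lat <= {self.lat_max})")


# Bounds from slide 59; the records below are made-up sample data.
query = RangeQuery(lon_min=-118.371, lon_max=-118.164,
                   lat_min=33.915, lat_max=34.089)

records = [
    {"name": "inside", "lon": -118.25, "lat": 34.05},
    {"name": "outside", "lon": -117.90, "lat": 33.80},
]

selected = query.select(records)

# The data changes later; the stored query still applies unchanged.
updated = records + [{"name": "new-inside", "lon": -118.30, "lat": 34.00}]
selected_after_update = query.select(updated)
```

Because the annotation holds a predicate instead of a list of marked points, the same selection can be re-applied when the data changes, which is one way to read the "applicable to dynamic, time-varying data" point on slide 60.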