Voyagers and Voyeurs
Supporting Social Data Analysis

Jeffrey Heer
Computer Science Department
Stanford University

CIDR 2...
CIDR 2009: Jeff Heer Keynote
This is a CIDR 2009 presentation. See http://infoblog.stanford.edu/ for more information and http://www-db.cs.wisc.edu/cidr/cidr2009/program.html for downloads.

  1. Voyagers and Voyeurs: Supporting Social Data Analysis. Jeffrey Heer, Computer Science Department, Stanford University. CIDR 2009, Monterey, CA, 5 January 2009.
  2. A Tale of Two Visualizations
  3. vizster
  4. Observations: Groups spent more time in front of the visualization than individuals. Friends encouraged each other to unearth relationships, probe community boundaries, and challenge reported information. Social play resulted in informal analysis, often driven by story-telling of group histories.
  5. NameVoyager: The Baby Name Voyager
  6. Social Data Analysis: Visual sensemaking can be social as well as cognitive. Analysis of data coupled with social interpretation and deliberation. How can user interfaces catalyze and support collaborative visual analysis?
  7. sense.us: A Web Application for Collaborative Visualization of Demographic Data
  8. Voyagers and Voyeurs: Complementary faces of analysis. Voyager (focus on visualized data): active engagement with the data; serendipitous comment discovery. Voyeur (focus on comment listings): investigate others’ explorations; find people and topics of interest; catalyze new explorations.
  9. Out of the Lab, Into the Wild
  10. Wikimapia.org
  11. DecisionSite posters: Spotfire Decision Site Posters
  12. Tableau Server
  13. Many-Eyes
  14. Social Data Analysis In Action: (1) Discussion and Debate; (2) Text is Data, Too; (3) Data Integrity and Cleaning; (4) Integrating Data in Context; (5) Pointing and Naming. For each, some thoughts on future directions. I asked my colleagues: if you could give database researchers a wish list, what would it be?
  15. Discussion and Debate
  16. Tableau X-Box / Quest Diag? “Valley of Death”
  17. Content Analysis of Comments. [Figure: bar charts of comment feature prevalence (percentage) for two services, Sense.us and Many-Eyes. Categories: Observation, Question, Hypothesis, Data Integrity, Linking, Socializing, System Design, Testing, Tips, To-Do, Affirmation.] Feature prevalence from content analysis (min Cohen’s κ = .74). High co-occurrence of Observations, Questions, and Hypotheses.
  18. WANTED: Structured Conversation. Reduce the cost of synthesizing contributions. Wikipedia: Shared Revisions. NASA ClickWorkers: Statistics.
  19. WANTED: Structured Conversation. Reduce the cost of synthesizing contributions. Can we represent data, visualizations, and social activity in a unified data model?
  20. Text is Data, Too
  21. Visualization Popularity. [Figure: bar charts of visualization type popularity for two services, Many-Eyes and Swivel. Types: Tag Cloud, Bubble Graph, Word Tree, Bar Chart, Maps, Network Diagram, Treemap, Matrix Chart, Line Graph, Scatterplot, Stacked Graph, Pie Chart, Histogram.] Over 1/3 of Many-Eyes visualizations use free text.
  22. Alberto Gonzales
  23. WANTED: Better Tools for Text. Statistical analysis of text (with ties to source!). Entity extraction. Aggregation and comparison of texts. Get a “global” view of documents. We can do better than Tag Clouds (!?). Use text analysis tools to enable analysis of structured conversation by the community.
  24. Data Integrity and Cleaning
  25. No cooks in 1910? … There may have been cooks then. But maybe not.
  26. The great postmaster scourge of 1910? Or just a bug in the data?
  27. Content Analysis of Comments. [Figure: bar charts of comment feature prevalence (percentage) for two services, Sense.us and Many-Eyes. Categories: Observation, Question, Hypothesis, Data Integrity, Linking, Socializing, System Design, Testing, Tips, To-Do, Affirmation.] 16% of sense.us comments and 10% of Many-Eyes comments reference data quality or integrity.
  28. WANTED: Data Cleaning Tools. Reshape data, reformat rows & columns. Handle missing data: label, repair, interpolate. Entity resolution and de-duplication. Group related values into aggregates. Assist table lookups & data transforms. Provide tools in situ to leverage collective. Transparency requires provenance.
  29. Integrating Data in Context
  30. College Drug Use
  31. College Drug Use
  32. Harry Potter is Freaking Popular
  33. WANTED: In-Situ Data Integration. Search for and suggest related data or views. User input for types, schema matching, or data. Apply in context of the current task, but record mappings for future use. Record provenance: chain of data sources. Examples: Google Web Tables, Pay-As-You-Go, Stanford Vispedia, Utah VisTrails.
  34. Pointing and Naming
  35. “Look at that spike.”
  36. “Look at the spike for Turkey.”
  37. “Look at the spike in the middle.”
  38. Free-form Data-aware
  39. Visual Queries: Model selections as declarative queries over interface elements or underlying data. Example: (-118.371 ≤ lon AND lon ≤ -118.164) AND (33.915 ≤ lat AND lat ≤ 34.089)
  40. Visual Queries: Model selections as declarative queries over interface elements or underlying data. Applicable to dynamic, time-varying data. Retarget selection across visual encodings. Support social navigation and data mining.
  41. WANTED: Data-Aware Annotation. Meta-queries linking annotations to views. Visually specifying notification triggers. Annotating data aggregates (use lineage?). Unified model (again!) to facilitate reference. How to make it work at scale? How else to use machine-readable annotations? Can annotations be used to steer data mining?
  42. Conclusion
  43. Social Data Analysis: Collective analysis of data supported by social interaction. (1) Discussion and Debate; (2) Text is Data, Too; (3) Data Integrity and Cleaning; (4) Integrating Data in Context; (5) Pointing and Naming.
  44. Summary: As visualization becomes common on the web, opportunities for collaborative analysis abound. Weave visualizations into the web: data access, visualization creation, view sharing and pointing. Support discovery, discussion, and integration of contributions to leverage the collective. Improve both processes and technologies for communication and dissemination.
  45. Parting Thoughts: Visualizations may have a catalytic effect on social interaction around data. Encourage participation by minimizing or offsetting interaction costs. Provide incentives by fostering the personal relevance of the data.
  46. Acknowledgements. @ Berkeley: Maneesh Agrawala, Wes Willett, danah boyd, Marti Hearst, Joe Hellerstein. @ IBM: Martin Wattenberg, Fernanda Viégas. @ PARC: Stu Card. @ Tableau: Jock Mackinlay, Chris Stolte, Christian Chabot.
  47. Voyagers and Voyeurs: Supporting Social Data Analysis. Jeffrey Heer, Stanford University. jheer@stanford.edu, http://jheer.org
  48. “With a collaborative spirit, with a collaborative platform where people can upload data, explore data, compare solutions, discuss the results, build consensus, we can engage passionate people, local communities, media and this will raise - incredibly - the amount of people who can understand what is going on. And this would have fantastic outcomes: the engagement of people, especially new generations; it would increase knowledge, unlock statistics, improve transparency and accountability of public policies, change culture, increase numeracy, and in the end, improve democracy and welfare.” (Enrico Giovannini, Chief Statistician, OECD, June 2007)
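
The "Visual Queries" slides (39 and 40) describe storing a selection as a declarative query over the underlying data rather than as a set of pixels. The Python sketch below is one way to illustrate that idea; it is not the sense.us implementation. The `RangeQuery` class, its method names, and the sample points are hypothetical. Only the longitude and latitude bounds are taken from slide 39.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Record = Dict[str, object]


@dataclass
class RangeQuery:
    """A rectangular brush stored as closed intervals per data field (hypothetical model)."""
    bounds: Dict[str, Tuple[float, float]]

    def matches(self, record: Record) -> bool:
        # A record is selected when every constrained field lies inside its interval.
        return all(lo <= record[field] <= hi
                   for field, (lo, hi) in self.bounds.items())

    def select(self, records: List[Record]) -> List[Record]:
        # Re-evaluate the query against whatever data is current.
        return [r for r in records if self.matches(r)]

    def to_text(self) -> str:
        # Render the query in the notation used on slide 39.
        return " AND ".join(f"({lo} ≤ {field} AND {field} ≤ {hi})"
                            for field, (lo, hi) in self.bounds.items())


# Bounds copied from slide 39; the points themselves are made-up sample data.
brush = RangeQuery({"lon": (-118.371, -118.164), "lat": (33.915, 34.089)})

points = [
    {"name": "A", "lon": -118.25, "lat": 34.05},  # inside both intervals
    {"name": "B", "lon": -118.40, "lat": 34.00},  # longitude out of range
    {"name": "C", "lon": -118.20, "lat": 33.80},  # latitude out of range
]

# A later snapshot of the same (hypothetical) data set: B has moved into the region.
later = [
    {"name": "B", "lon": -118.30, "lat": 34.00},
    {"name": "C", "lon": -118.20, "lat": 33.80},
]

print(brush.to_text())
print([r["name"] for r in brush.select(points)])
print([r["name"] for r in brush.select(later)])
```

Because the selection is stored as a predicate instead of a list of marks, the same `brush` can be re-applied to a later snapshot of the data, which is the property slide 40 points to with "applicable to dynamic, time-varying data".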
