• Share
  • Email
  • Embed
  • Like
  • Save
  • Private Content

Loading…

Flash Player 9 (or above) is needed to view presentations.
We have detected that you do not have it on your computer. To install it, go here.

Like this presentation? Why not share!

Graphical inference

on

  • 1,625 views

 

Statistics

Views

Total Views
1,625
Views on SlideShare
1,625
Embed Views
0

Actions

Likes
1
Downloads
17
Comments
0

0 Embeds 0

No embeds

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

CC Attribution-ShareAlike LicenseCC Attribution-ShareAlike License

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

    Graphical inference Graphical inference Presentation Transcript

    • “The human understanding, on account of its own nature, readily supposes a greater order and uniformity in things than it finds. And ... it devises parallels and correspondences and relations which are not there.” —Francis Bacon, 1620 Wednesday, 10 November 2010
    • Is wh “The human understanding, on account of its at we own nature, readily supposes a greater order see r eally and uniformity in things than it finds. And ... there ? it devises parallels and correspondences and relations which are not there.” —Francis Bacon, 1620 Wednesday, 10 November 2010
    • Graphical inference for infovis Hadley Wickham, Dianne Cook, Heike Hofmann, Andreas Buja October 2010 Wednesday, 10 November 2010
    • Which one of these plots is not like the others? Which of these plots just doesn’t belong? Wednesday, 10 November 2010
    • 7 of those plots were plots of random (null) data. 1 plot was the real data. If you correctly picked the true plot from the null plots then we have evidence that it really is different. In fact, we have rigorous statistical evidence that there is a difference, just using Sesame Street skills! Wednesday, 10 November 2010
    • 1. The statistical justice system 2. Line up protocol 3. Rorschach protocol 4. Future work Wednesday, 10 November 2010
    • Hypothesis testing? http://www.flickr.com/photos/joegratz/117048243 Wednesday, 10 November 2010
    • Hypothesis testing? The statistical justice system http://www.flickr.com/photos/joegratz/117048243 Wednesday, 10 November 2010
    • Ho: null hypothesis Defence Ha: alternative hypothesis Prosecution Wednesday, 10 November 2010
    • Ho: null hypothesis Defence Ha: alternative hypothesis Prosecution Null distribution Innocents Wednesday, 10 November 2010
    • Ho: null hypothesis Defence Ha: alternative hypothesis Prosecution Null distribution Innocents Reject the null Guilty Fail to reject the null Not guilty Wednesday, 10 November 2010
    • Ho: null hypothesis Defence Ha: alternative hypothesis Prosecution Null distribution Innocents Reject the null Guilty Fail to reject the null Not guilty p-value Probability that a truly innocent dataset would look as guilty as the suspect Wednesday, 10 November 2010
    • Line up Wednesday, 10 November 2010
    • believe believe believe believe believe believe believe believe believe believe case case case case case case closely closely descendants case closely case closely closely closely descendants closely descendants case closely descendants case closely closely descendants descendants few few descendants few few descendants few few descendants few few descendants few few long long modified long long modified long long modified long long modified long long modified modified variations modified variations modified variations modified variations modified variations variations very variations very variations very variations very variations very very view view very view view very view view very view view very view view Five tag clouds of selected words from the 1st (red) and 6th (blue) editions of Darwin’s “Origin of Species”. Four of the tag clouds were generated under the null hypothesis of no difference between editions, and one is the true data. Can you spot it? Wednesday, 10 November 2010
    • believe believe believe believe believe believe believe believe believe believe case case case case case case closely closely descendants case closely case closely closely closely descendants closely descendants case closely descendants case closely closely descendants descendants few few descendants few few descendants few few descendants few few descendants few few long long modified long long modified long long modified long long modified long long modified modified variations modified variations modified variations modified variations modified variations variations very variations very variations very variations very variations very very view view very view view very view view very view view very view view Five tag clouds of selected words from the 1st (red) and 6th (blue) editions of Darwin’s “Origin of Species”. Four of the tag clouds were generated under the null hypothesis of no difference between editions, and one is the true data. Can you spot it? Wednesday, 10 November 2010
    • Protocol Generate n-1 decoys (null datasets) Plot the decoys + the real data (randomly positioned) Show to an impartial observer. Can they spot the real data? If so, you have evidence for true difference (p-value = 1/n) Wednesday, 10 November 2010
    • E. L. Scott, C. D. Shane, and M. D. Swanson. Comparison of the synthetic and actual distribution of galaxies on a photographic plate. Astrophysical Journal, 119:91–112, Jan. 1954. Wednesday, 10 November 2010
    • A. M. Noll. Human or machine: A subjective comparison of Piet Mondrian’s “composition with lines” (1917) and a computer- generated picture. The Psychological Record, 16:1–10, 1966. Wednesday, 10 November 2010
    • vs. classical tests Of course, if we know what we’re looking for, we can always develop an algorithm or numerical test. The advantage of visual inference is that works for very general tasks, including when you don’t know exactly what you’re looking for. Wednesday, 10 November 2010
    • ower of the test Recent work shows that power only a little worse than classical test sigma = 12 sigma = 5 1.0 sample size = 100 0.8 0.6 0.4 power_curve 0.2 Theoretical test Power 0.0 Visual test 1.0 lower_CL sample size = 300 0.8 upper_CL 0.6 0.4 0.2 0.0 !15 !10 !5 0 5 10 15 !15 !10 !5 0 5 10 15 ! Wednesday, 10 November 2010
    • Plot Task Choropleth Is there a spatial trend? map Is the distribution in higher Treemap level categories the same? Are the two variables Scatterplot independent? Is there a trend in mean or Time series variability? Wednesday, 10 November 2010
    • Wednesday, 10 November 2010
    • Wednesday, 10 November 2010
    • Wednesday, 10 November 2010
    • Once we’ve seen the plot, we’re no longer impartial Wednesday, 10 November 2010
    • Code # Support package written in R # http://github.com/ggobi/nullabor # Provides reference implementation of ideas library(nullabor) library(ggplot2) qplot(angle * 180 / pi, r, data = threept) %+% lineup(null_model(r ~ poly(angle, 2)), n = 10) + facet_wrap(~ .sample, ncol = 5) Wednesday, 10 November 2010
    • Rorschach Wednesday, 10 November 2010
    • Rorschach We’re surprisingly bad at appreciating the amount of variation in random data. Showing only null plots is a good way to calibrate our intuition. We also plan on using these plots as an empirical tool to understand what features people pick up on. Anecdotally, undergrads focus too much on outliers Wednesday, 10 November 2010
    • 1 2 3 100 80 60 40 20 0 4 5 6 100 80 count 60 40 20 0 7 8 9 100 80 60 40 20 0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 result Wednesday, 10 November 2010
    • Future work Wednesday, 10 November 2010
    • Future work How can visual inference be integrated into visualisation software at a fundamental level? How does training impact results? How do novices vs. experts differ? What patterns do people pick up on? What are the alternatives that people respond to? Wednesday, 10 November 2010
    • Questions? Wednesday, 10 November 2010
    • Wednesday, 10 November 2010
    • This work is licensed under the Creative Commons Attribution-Noncommercial 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/ 3.0/us/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA. Wednesday, 10 November 2010