A brief 15-minute overview of what does and doesn't work in information visualization, plus a short discussion of how to address issues of scale (collaborative analysis, crowdsourcing, machine learning).
This was a 15-minute talk at the C3E workshop on navigating cyberspace. I give a brief overview of what works and what doesn’t for visualization. I also talk a bit at the end about ways of scaling things up (in particular, collaboration, crowdsourcing, and machine learning). Some slides in this talk are borrowed from Chris Harrison and Jeff Heer.
Qmee infographic on the amount of information
Yeah, we’re more like this
Wonderful book, with a wonderful title that really summarizes the essence of infoviz: using vision to think
Here’s an example of infoviz. You can have text instructions. You can also have a map. Note that this map is good in that it shows relationships, distances, etc. However, it also has a lot of clutter: too many unimportant streets, text labels running into each other, and colors that make it hard to tell what’s important from what’s not.
Compare to Google Maps, which de-emphasizes certain roads, emphasizes others, and does a better job of laying out text labels.
Another case study. If you squint, the entire map looks red.
Compare to this one, which shows that America is actually more purple than red. Same data, different representation.
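The difference between the red map and the purple map can be sketched as two color encodings of the same vote counts. This is a minimal illustration, not how either map was actually produced: the function names, the linear red/blue blend, and the gray used for regions with no votes are all assumptions made for this example.

```python
def winner_take_all_color(rep_votes, dem_votes):
    """Solid red or blue for whoever got more votes.

    Assumption for this sketch: an exact tie is arbitrarily shown as blue.
    """
    return (255, 0, 0) if rep_votes > dem_votes else (0, 0, 255)


def blended_color(rep_votes, dem_votes):
    """Mix red and blue in proportion to each side's vote share.

    Assumption for this sketch: a region with no votes is shown as gray.
    """
    total = rep_votes + dem_votes
    if total == 0:
        return (128, 128, 128)
    rep_share = rep_votes / total
    red = int(255 * rep_share + 0.5)         # round to nearest integer
    blue = int(255 * (1 - rep_share) + 0.5)
    return (red, 0, blue)


# A narrow 51-49 result looks fully red in one encoding
# and close to an even purple in the other.
narrow_win_solid = winner_take_all_color(51, 49)
narrow_win_blend = blended_color(51, 49)
```

The first encoding throws away the margin, so a narrow win and a landslide look identical. The second keeps it, which is why the blended map reads as mostly purple.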
Divide things up by county. Can immediately see missing data, as well as distribution of who votes for whom.
Distortion view: state sizes are scaled by electoral votes.
This one is scaled by population size. You can see that major population areas tend to vote blue.
All visualizations have biases. You need fast ways to look at alternative views of the same data (so you don’t fool yourself), and you need to keep this in mind whenever you deal with data.
One of the most beautiful visualizations. Note that it’s roughly geographic, but also relational, showing stops relative to each other. Note that the real river Thames does not turn at 90 degrees as it does here, and the map doesn’t show exact distances. A person riding the Tube doesn’t need exact distances, just relative distances and relative positions.
Some data sets don’t have a natural visualization though. This is an art piece by Ben Fry of Processing fame, and while it’s very cool, note that it doesn’t really use “vision to think”: things don’t pop out here.
And if there’s too much data, sometimes all you get is a big fat blob
This notion of navigating cyberspace probably won’t be successful, because cyberspace doesn’t have the characteristics we normally associate with a space.
But just because there might not be a good natural metaphor doesn’t preclude us from trying to build good conceptual models. If you physically open a computer, you won’t find icons, folders, windows, etc., but together they are still a fantastic conceptual model for helping us make use of the power of a computer (despite the fact that the model is 40 years old).
Slide from Jeff Heer
Slide from Jeff Heer
Slide from Jeff Heer. Data is messy (missing in this case), a common problem.
Slide also from Jeff Heer. Also see Licklider’s quote in Man-Computer Symbiosis, where he says something similar.
Slide also from Jeff Heer. Toolchain of work (sort of similar to the Clang and LLVM toolchain).
Maybe crowdsourcing can help too. This is based on our work on analyzing smartphone apps to find unusual behaviors.
Polo Chau, Christos Faloutsos, Jason Hong, Niki Kittur. A bottom-up approach for understanding graphs with hundreds of thousands of nodes and edges. You start with exemplars, and then machine learning algorithms expand and cluster the graph.
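The bottom-up idea of starting from a few exemplars rather than drawing the whole graph can be sketched roughly as follows. The notes do not say which machine learning algorithms are used, so this example swaps in a plain breadth-first expansion as a stand-in; the function name, the `max_hops` parameter, and the toy graph are all hypothetical.

```python
from collections import deque


def expand_from_exemplars(graph, exemplars, max_hops=1):
    """Return {node: hop distance} for nodes within max_hops of any exemplar.

    `graph` is an adjacency dict mapping each node to a list of neighbors.
    This is a breadth-first stand-in, not the method used in the actual work.
    """
    distance = {node: 0 for node in exemplars if node in graph}
    queue = deque(distance)
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


# Toy graph: a small chain with one branch (illustrative data only).
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a"],
    "d": ["b", "e"],
    "e": ["d"],
}
visible = expand_from_exemplars(toy_graph, ["a"], max_hops=2)
```

Only the nodes in `visible` would be drawn, so the view stays small even when the full graph is far too large to show at once.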