
MDST 3703 F10 Seminar 11



Published in: Education

  1. Seminar 11: Maps, Timelines, Big Data, and Visualization. Introduction to the Digital Liberal Arts, MDST 3703 / 7703, Fall 2010
  2. Business • Quiz 2 available in Collab after this class • Same protocol as Quiz 1 and Midterm • Due by start of class Thursday
  3. Review • The Blogosphere, Wikiverse, and other “regions” of the web have produced massive, aggregated sources of information – Big Data • An unintended consequence of this is that these sources are now being mined for patterns – Freebase, dbPedia, Facebook, etc. • As a result, a new level of information is emerging on the web – the datasphere
  4. Overview • The datasphere raises two big questions – What can we do with it? – What will it do with us? • Today, we look at both questions
  5. What can we do with Big Data?
  6. Different Approaches • Traditional approaches – Geographical data (Elliott and Gillies) – Historical data (Robertson) • Radical approaches – Distant Reading (Moretti) – Cultural Analytics (Manovich)
  7. Geographical Data (Places) • Geographical data are low-hanging fruit – Names can be extracted from a variety of sources and then “meshed” with gazetteers – e.g. GeoNames • Maps can help visualize that data • Maps can also serve as an interface to the data • Elliott and Gillies exemplify this approach in Classics
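The "meshing" of extracted names with a gazetteer can be sketched roughly as follows. This is a minimal illustration, not how GeoNames or any particular project works: the `GAZETTEER` dictionary, the `find_places` function, and the sample sentence are all hypothetical, and the coordinates are approximate.

```python
import re

# Hypothetical mini-gazetteer: place name -> (latitude, longitude).
# Coordinates are approximate and for illustration only; a real project
# would look names up in a full gazetteer such as GeoNames.
GAZETTEER = {
    "Monticello": (38.01, -78.45),
    "Paris": (48.86, 2.35),
    "Philadelphia": (39.95, -75.17),
}


def find_places(text, gazetteer=GAZETTEER):
    """Return (name, lat, lon) for each gazetteer name found in the text,
    in order of first appearance and without duplicates."""
    found = []
    seen = set()
    # Treat capitalized words as candidate place names.
    for token in re.findall(r"\b[A-Z][a-z]+\b", text):
        if token in gazetteer and token not in seen:
            seen.add(token)
            lat, lon = gazetteer[token]
            found.append((token, lat, lon))
    return found


# Invented sample sentence, used only to exercise the function.
sample = "I left Monticello for Philadelphia, and later sailed for Paris."
for name, lat, lon in find_places(sample):
    print(f"{name}: {lat}, {lon}")
```

The resulting list of names with coordinates is the kind of data a map can then plot or use as an interface. Single-word matching is a deliberate simplification; multi-word names and ambiguous names (there are many places called Paris) need more care.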
  8. The writings of Thomas Jefferson (Google Books)
  9. Historical Data (Events) • HEML (Historical Event Markup Language) provides a model for defining events – Written in RDF • Can be used to extract events from texts or convert from other formats – CIDOC-CRM – Semantic MediaWiki • These can be aggregated and visualized
  10. HEML Sample
  11. Top-level events in an RDF dataset accumulated with Semantic MediaWiki for a first-year Ancient History course
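As a rough illustration of how event data can be aggregated for a timeline, the sketch below uses plain Python tuples as stand-in triples. It is not HEML or real RDF syntax: the `ev:` identifiers, the `label` and `year` predicates, and the function names are hypothetical, and the three sample events were chosen only because their dates are well known.

```python
from collections import defaultdict

# Stand-in triples: (subject, predicate, object).
# Predicate names are hypothetical, not HEML vocabulary.
# Negative years stand for BCE dates (a simplification).
TRIPLES = [
    ("ev:alexander_death", "label", "Death of Alexander the Great"),
    ("ev:alexander_death", "year", -323),
    ("ev:marathon", "label", "Battle of Marathon"),
    ("ev:marathon", "year", -490),
    ("ev:salamis", "label", "Battle of Salamis"),
    ("ev:salamis", "year", -480),
]


def build_timeline(triples):
    """Group triples by subject, then return (year, label) pairs
    sorted chronologically."""
    events = defaultdict(dict)
    for subject, predicate, obj in triples:
        events[subject][predicate] = obj
    pairs = [(e["year"], e["label"]) for e in events.values()]
    return sorted(pairs, key=lambda pair: pair[0])


def format_year(year):
    """Render a signed year as a BCE/CE string."""
    return f"{-year} BCE" if year < 0 else f"{year} CE"


for year, label in build_timeline(TRIPLES):
    print(f"{format_year(year)}: {label}")
```

Because every event is reduced to the same subject, predicate, object shape, events collected from different sources can be merged into one list before sorting, which is the point of expressing them in a shared model.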
  12. Timeline Software • Dipity – • SIMILE –
  13. Time Maps • Google Timemap – – 1858.html • TimeMap – • VisualEyes – • HyperCities –
  14. Cultural Analytics • Lev Manovich • Applies interactive visualization to Big Data • ral-analytics.html
  15. Distant Reading • Franco Moretti • Part of a long tradition of “statistical criticism” • Influenced by the French historian Fernand Braudel
  16. One of Moretti’s graphs shows the emergence of the market for novels in Britain, Japan, Italy, Spain, and Nigeria between about 1700 and 2000. In each case, the number of new novels produced per year grows -- not at the smooth, gradual pace one might expect, but with the wild upward surge one might expect of a lab rat’s increasing interest in a liquid cocaine drip. “Five countries, three continents, over two centuries apart,” writes Moretti, “and it’s the same pattern ... in twenty years or so, the graph leaps from five [to] ten new titles per year, which means one new novel every month or so, to one new novel per week. And at that point, the horizon of novel-reading changes. As long as only a handful of new titles are published each year, I mean, novels remain unreliable products, that disappear for long stretches of time, and cannot really command the loyalty of the reading public; they are commodities, yes, but commodities still waiting for a fully developed market.” But as that market emerges and consolidates itself -- with at least one new title per week becoming available -- the novel becomes “the great capitalist oxymoron of the regular novelty: the unexpected that is produced with such efficiency and punctuality that readers become unable to do without it.”
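The arithmetic behind "one new novel every month or so" versus "one new novel per week" can be checked with a short sketch. The rates of 5 and 10 titles per year come from the quotation; 52 titles per year is an assumed stand-in for "one per week", and the function name is hypothetical.

```python
WEEKS_PER_YEAR = 52  # rounded; assumed for illustration


def weeks_between_titles(titles_per_year):
    """Average gap, in weeks, between new titles at a given yearly rate."""
    return WEEKS_PER_YEAR / titles_per_year


# 5 and 10 are the rates named in the quotation; 52 is the assumed weekly rate.
for rate in (5, 10, 52):
    gap = weeks_between_titles(rate)
    print(f"{rate} titles/year: one new title every {gap:.1f} weeks")
```

At ten titles a year the average gap is a little over five weeks, roughly a month; at 52 it is one week, which is the threshold the passage associates with a consolidated market.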
  17. What are some similarities and differences between the traditional and radical approaches?
  18. Digital Traditionalists and Radicals • Similarities – Visualization – Pattern recognition – Desire to express data in terms of RDF, etc. • Allows programs to aggregate, mash up, and analyze data • Differences – Traditionalists favor metadata and ontologies, whereas the radicals believe the data will “speak for themselves”
  19. What will Big Data do to us?
  20. Anderson • End of theory, models • Compare to Shirky • A different relationship to data • Is the difference quantitative?