Mdst3703 culturomics-2012-11-01

  • Is this correct?
  • Was he schooled?
  • Dürer, Melencolia I, 1514, engraving

    1. 1. Lecture/Studio: Culturomics. Prof. Alvarado, MDST 3703/7703, 1 November 2012
    2. 2. Business
        • Everyone’s families and friends OK?
    3. 3. Review
        • The New Epistemology
            – Rise of Big Data: massive, available, social
            – Shifts our relationship to primary sources
            – From reading to quantitative methods and visualizations
            – Example of media determinism
        • Manovich
            – Consistent with database logic
            – Applies spirit of Big Data methods to art
    4. 4. Review
        • Rationalization Effects
            – What are we looking at?
            – What is theory?
            – What are models?
            – What is culture?
            – What are the humanities?
    5. 5. Overview
        • Combined Studio and Lecture
        • Lecture
            – Google’s NGram Viewer
            – Culturomics
        • Studio
            – Collaborative Topic Index
    6. 6. Google Does the Humanities
    7. 7. Google NGrams
        • Google Books comprises 11% of the corpus of published books, about 2 trillion words
        • NGrams uses 5.2 million books (4% of the corpus)
        • 500 billion words
        • Published between 1500 and 2008
        • In English, French, Spanish, German, Chinese and Russian (Hebrew too)
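A quick way to sanity-check the figures on this slide is to work out what they imply. The sketch below uses only the slide's own numbers and assumes that "the corpus" means all published books; the variable names are illustrative, and the results are rough estimates rather than official counts.

```python
# Figures taken from the slide (all approximate).
ngram_books = 5_200_000          # books used by NGrams
ngram_share = 0.04               # stated as 4% of the corpus
google_share = 0.11              # Google Books stated as 11% of the corpus
ngram_words = 500_000_000_000    # 500 billion words

# Implied size of the whole corpus of published books.
implied_total_books = round(ngram_books / ngram_share)

# Implied number of books in Google Books, given that total.
implied_google_books = round(google_share * implied_total_books)

# Implied average length of a book in the NGrams set.
words_per_book = round(ngram_words / ngram_books)

print(implied_total_books, implied_google_books, words_per_book)
```

On these assumptions the slide implies roughly 130 million published books in total, about 14 million of them in Google Books, and an average of a little under 100,000 words per book.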
    8. 8. Erez Lieberman Aiden and Jean-Baptiste Michel
    9. 9. What’s an NGram?
    10. 10. What’s an NGram? (continued)
        • A space-delimited string
        • N = number of strings
        • Case sensitive
        • Purely syntactic
        • Very hard to index
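The definition above can be illustrated with a minimal sketch: split a text on whitespace and slide a window of N strings across it. The function name `ngrams` and the sample sentences are made up for illustration; this is not how Google builds its index, only a small instance of the same idea.

```python
from collections import Counter


def ngrams(text, n):
    """Return every run of n consecutive space-delimited strings in text."""
    tokens = text.split()  # purely syntactic: no stemming, no lowercasing
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# 2-grams (bigrams) of a short sentence.
bigrams = ngrams("the quick brown fox", 2)

# Counting is case sensitive, so "The" and "the" are different 1-grams.
unigram_counts = Counter(ngrams("The cat saw the other cat", 1))

print(bigrams)
print(unigram_counts["The"], unigram_counts["the"], unigram_counts["cat"])
```

A text of T strings yields T - N + 1 n-grams for each N, and the number of distinct n-grams grows quickly with N, which is one reason a corpus of this size is hard to index.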
    11. 11. Culturomics
        • A method more than a model (as Anderson argues)
        • Analogy is to genomics
            – Does this make sense?
            – What is the analog to the gene?
    12. 12. Parallel, Crossing, Convergent/Divergent
    13. 13. American / British
    14. 14. “There’s not even a historian of the book connected to the project,” Mr. Menand noted.
    15. 15. Anthony Grafton, History, Princeton
    16. 16. Studio
        • We are now at the point where we have all the pieces in place
            – HTML markup, CSS, JavaScript
            – Structured data (table in Google Docs)
            – Visualization tools
        • Create Character Index
            – We will use everything we have done so far: notes, network visualizations, etc.
            – Today we begin to collaboratively create the Character Index (a subset of a full topic index)
