Mdst3703 culturomics-2012-11-01
  • Is this correct?
  • Was he schooled?
  • Dürer, Melencolia I, 1514, engraving
  • Transcript

    • 1. Lecture/Studio: Culturomics. Prof. Alvarado, MDST 3703/7703, 1 November 2012
    • 2. Business
        • Everyone’s families and friends OK?
    • 3. Review
        • The New Epistemology
            – Rise of Big Data: massive, available, social
            – Shifts our relationship to primary sources
            – From reading to quantitative methods and visualizations
            – Example of media determinism
        • Manovich
            – Consistent with database logic
            – Applies spirit of Big Data methods to art
    • 4. Review
        • Rationalization Effects
            – What are we looking at?
            – What is theory?
            – What are models?
            – What is culture?
            – What are the humanities?
    • 5. Overview
        • Combined Studio and Lecture
        • Lecture
            – Google’s NGram Viewer
            – Culturomics
        • Studio
            – Collaborative Topic Index
    • 6. Google Does the Humanities
    • 7. Google NGrams
        • Google Books comprises 11% of the corpus of published books, about 2 trillion words
        • NGrams uses 5.2 million books (4% of the corpus)
        • 500 billion words
        • Published between 1500 and 2008
        • In English, French, Spanish, German, Chinese and Russian (Hebrew too)
    • 8. Erez Lieberman Aiden and Jean-Baptiste Michel
    • 9. What’s an NGram?
    • 10. A space-delimited string
        • N = number of strings
        • Case sensitive
        • Purely syntactic
        • Very hard to index
    • 11. Culturomics
        • A method more than a model (like Anderson argues)
        • Analogy is to genomics
            – Does this make sense?
            – What is the analog to the gene?
    • 12. Parallel; Crossing; Convergent/Divergent
    • 13. American / British
    • 14. “There’s not even a historian of the book connected to the project,” Mr. Menand noted.
    • 15. Anthony Grafton, History, Princeton
    • 16. Studio
        • We are now at the point where we have all the pieces in place
            – HTML markup, CSS, JavaScript
            – Structured data (table in Google Docs)
            – Visualization tools
        • Create Character Index
            – We will use everything we have done so far: notes, network visualizations, etc.
            – Today we begin to collaboratively create the Character Index (a subset of a full topic index)
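
Slide 10's definition of an n-gram (a run of N space-delimited strings, case sensitive, purely syntactic) can be sketched in a few lines of Python. This is only a minimal illustration, assuming plain whitespace tokenization with no handling of punctuation; the function names `ngrams` and `ngram_counts` are hypothetical and do not describe how Google builds its dataset.

```python
from collections import Counter


def ngrams(text, n):
    """Return every run of n consecutive space-delimited strings in text."""
    tokens = text.split()  # split on whitespace only; case is preserved
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, n):
    """Count how often each n-gram occurs (hypothetical helper)."""
    return Counter(ngrams(text, n))


sample = "The cat saw the cat"

# 2-grams (bigrams): each overlapping pair of adjacent strings
bigrams = ngrams(sample, 2)

# 1-gram counts: "The" and "the" are counted as different strings,
# because matching is case sensitive and purely syntactic
unigram_counts = ngram_counts(sample, 1)
```

Because nothing is normalized, "The" and "the" end up as separate entries, which is what the slide means by "case sensitive" and "purely syntactic". The number of distinct n-grams also grows quickly with N, which is one likely reason the slide calls them very hard to index.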