
Historical Diaries on the Web: A Practical Guide


Publishing historic diaries on the web: a practical guide. Presented at the Mass. History Conference "Imagining Lives", June 7, 2010

  • These can be done in stages
  • Image only: just a picture of a page. This can be useful for research, but you cannot search for particular words or phrases; it is simply a digital, safe, and widely accessible way of paging through the manuscript.
  • Not ideal to have only transcribed text, with no page image, but sometimes necessary. Sometimes the originals have disappeared and you’re left with someone’s typescript transcription, or it may not be possible to digitize the images, or not yet. It is also harder to put the text into a structured file like a spreadsheet (which I’m going to be recommending for many of these diaries) when the transcription carries layout notes like “written in opposite direction”.
  • They photographed the transcription and page image side-by-side. The full text of the transcription is also available for searching.
  • No page images, but enormous depth and richness. BUT it requires knowledge of relational databases, PHP, and MySQL, a lot of attention to programming, and harvesting from Wikipedia and Google Maps. It is a tour de force, a real labor of love by a non-professional. This is also true of the Martha Ballard diary presentation on dohistory.org.
    ================================
    The site is built around Movable Type with some additional custom code. There are five Movable Type weblogs:
    • The Diary: All the entries of the diary are entries in a weblog, with a little tweaking to work better with the diary’s old dates. The “Also on this day” items are stored separately in a MySQL database and displayed with PHP.
    • The Encyclopedia: Every topic in the Encyclopedia is a separate entry in a weblog, organised into a large number of categories. Some entries include content from Wikipedia, which is fetched and cached locally using PHP. Some entries include Google Maps, which are displayed using Mapstraction. The latitude, longitude and zoom levels are stored in Movable Type using Custom Fields. The larger maps are also from Google, with additional PdMarker code, and the pages are generated using PHP to extract the location data from the Movable Type database.
    • In-Depth Articles, Site News, and The story so far: These are all straightforward Movable Type weblogs.
    The Recent Activity page is an aggregation of the latest activity from all the Movable Type weblogs.
  • Short entries – 140 characters – daybook, pocket diaries.
  • Pros win. Of the cons, the date problem is the worst. But even so, a blog is an excellent way to PRESENT (not to store, or preserve!) an historic diary.
  • Criteria for this suggested process:
  • A good prospect is legally publishable and engaging: not necessarily wordy, but with evocative text. My own interest and experience is in short-entry diaries: daybooks and pocket diaries that can be treated as snippets, little windows into the past.
    Copyright terms:
    • Known diarist, unpublished, unregistered manuscript: life of the author + 70 years
    • Anonymous diarist, unpublished, unregistered manuscript: creation date + 120 years
    Fair use: a Fair Use checklist is at http://www.copyright.com/Services/copyrightoncampus/basics/fairuse_list.html. A link to the Four Factors for Determining Fair Use is at the top of that page.
  • Can also make thumbnails, and name them accordingly. Naming conventions should be logical and extensible, and should relate all versions of one image to each other.
    Master files:
    • High-resolution scans allow for multiple uses (print, zoom, etc.)
    • Large file size
    • Often stored on CDs, DVDs, external drives, etc.
    • Maintain over time: refresh/migrate
    Working files:
    • For printing or detailed viewing on the web
    • 300 DPI
  • Most people will choose to upload to the global network called the World Wide Web. You can upload to your own org’s server, or to a collaborating university or org, or to a public site. Remember this is NOT your repository for this stuff, just a storage place for the project.
  • Here’s where your Google account comes in handy
  • The master data file is like the master image file: if you do it right, you can repurpose that file and move it to other formats. Storing diary data in a structured master file is:
    • Flexible
    • Re-purposable
    • Stable
    • Safe
    Alas, OCR won’t help with manuscripts, unless you are working from an old typescript transcription. ABBYY FineReader is excellent OCR software.
  • You can add the diary entries in any order you want because you can sort them by date! This helps when workgroups collaborate on transcribing a diary.
  • Categories can have unique names. Tags need to be known names. Categories don’t help search engines find information. Tags help search engines and tag directories catalog your site. Categories help visitors find related information on your site. Tags help visitors find related information on your site and on other sites.
  • As of June 2010 – may change!
  • There are many ways to approach this challenge of sharing short entry diaries online – I’ll be very interested to hear of your experiences and thoughts on all this. I guess I just want to leave you with the encouragement to put your data into a structured master format to save yourselves trouble and complications in the long run. Once you do that, many variations on format and style in presentation and functionality are possible. These are all different presentations of the very same diary data, entered once into a database.

    1. Diaries on the Web: A Practical Guide
       Joanne Riley, Associate University Librarian, Joseph P. Healey Library, University of Massachusetts Boston
       [email_address]
       Mass. History Conference, June 7, 2010
    2. “Let’s digitize a diary and put it on the Web!”
       • Means creating:
         • Scanned page images, or
         • Transcribed text, or
         • Both
       • Plus, perhaps:
         • Annotations
         • Cross-referencing
         • Social networking options
    3. Scanned Page Images
       • Elizabeth Cowperthwaite, 1857 - 1858
       Source: Special Collections, University of Pennsylvania Library
    4. Transcribed Text
       • Archibald Thompson (1736 - ?), Montgomery County, VA
       • Transcribed by descendants in plain text form
       Source: Doug Moore, Arizona State University
    5. Scanned AND Transcribed
       • Diary of Nancy Holmes Corse (1840 - ?), Enosburgh, VT
       • Transcription and page image photographed side-by-side
       • Full text also available
       Source: Dept of Spec. Collection, University of Missouri-Kansas City
    6. Annotated Text
       • Camping with the Sioux: Fieldwork Diary of Alice Cunningham Fletcher (1838 – 1923)
       • 1881 Diary
       • Rich web context
       Source: National Anthropological Archives, Smithsonian Institution
    7. Cross-Referenced Text (and more…)
       • Diary of Samuel Pepys
       • Multiple inter-related weblogs:
         • The Diary
         • “Also On This Day” items
         • Encyclopedia
         • In-Depth Articles
         • Site News
         • The story so far
         • Recent Activity
       • Uses Movable Type, relational database theory, MySQL, PHP, Mapstraction, PdMarker code, etc.
    8. Twittered Diaries (i.e. “microblogging”)
    9. Blogged Diary (and annotated)
       Source: American Antiquarian Society’s Past is Present
    10. Blogged Diaries
       • PROS
         • Built for daily entries
         • Easy to populate
         • World-wide access
         • Visitor interactivity
         • Ability to add controlled vocabularies
         • Annotation and image options
         • Web 2.0 interoperability: connect to Facebook, Twitter, RSS feeds, etc.
       • CONS
         • Posts are attached to the current date; historic dates must be added as text
         • No complex searches
         • Can only export as an XML file; backend data is not (yet) as flexible or accessible as a traditional database
    11. Getting Started: a Suggested Process
       • Low barrier to entry
       • No $$ charges (beyond labor)
       • Secure local storage
       • World-wide web access
       • Social networking options
       • Flexibility to move to other platforms and re-purpose content as needed
    12. Suggested Process for Publishing Historic Diaries on the Web:
       1. Identify a good diary prospect
       2. Sign up for a WordPress account
       3. Sign up for a Google account
       4. Scan the diary pages
       5. Upload copies of images to a web server
       6. Transcribe content into a structured file
       7. Copy-and-paste the transcribed text and image links into a blog post
       8. Tag, Categorize, Comment, Annotate…
    13. Step 1: Identify a good diary prospect
       • Is it in the Public Domain? In 2010:
         • Any unpublished manuscript by a known diarist who died before 1939 may be freely published (i.e. author’s life + 70)
         • Any unpublished manuscript by an anonymous diarist whose diary is dated before 1889 may be freely published (i.e. creation + 120)
       • Is the Content Useful and/or Engaging?
         • Historically significant
         • Interesting, engaging
         • Invites further research
       Source: LLRX.COM and COPYRIGHT.COM
    14. Step 2: Sign up for a WordPress Account
       • www.wordpress.com/signup/
    15. Step 3: Sign up for a Google Account
       • www.google.com
    16. Step 4: Scan the Diary Pages
       • Digitization Specs for Image Projects
         • “Master Files” aka “Archival Images”
           • 600+ DPI
           • TIFF file format (non-lossy)
           • Bit depth: 16 bit grayscale, 48 bit color
         • “Working Files” aka “Access Images”
           • Create from the master file
           • 300 DPI
           • File format: JPEG
           • Bit depth: 8 bit grayscale, 24 bit color
         • Thumbnails
           • Create from the master file
           • 150 DPI, JPEG, 8 bit grayscale, 24 bit color
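As a rough illustration of these specs, the Python sketch below works out the pixel dimensions each version of a page would have and proposes a related file name for each, so that all versions of one image stay linked by name. The resolutions and formats come from the slide. The base name `diary1857_p001`, the `_master`/`_access`/`_thumb` suffixes, and the 3 x 5 inch page size are hypothetical choices made for the example, not part of the original guide.

```python
# Resolutions and file formats are taken from the slide.
# The suffixes are a hypothetical naming convention for illustration only.
DERIVATIVES = {
    "master": {"dpi": 600, "ext": ".tif", "suffix": "_master"},
    "access": {"dpi": 300, "ext": ".jpg", "suffix": "_access"},
    "thumb": {"dpi": 150, "ext": ".jpg", "suffix": "_thumb"},
}


def derivative_plan(base_name, width_in, height_in):
    """Return {kind: (filename, width_px, height_px)} for one scanned page.

    width_in and height_in are the physical page dimensions in inches.
    """
    plan = {}
    for kind, spec in DERIVATIVES.items():
        filename = f"{base_name}{spec['suffix']}{spec['ext']}"
        width_px = round(width_in * spec["dpi"])
        height_px = round(height_in * spec["dpi"])
        plan[kind] = (filename, width_px, height_px)
    return plan


# Hypothetical pocket-diary page measuring 3 x 5 inches.
plan = derivative_plan("diary1857_p001", 3.0, 5.0)
for kind, (filename, width_px, height_px) in plan.items():
    print(f"{kind:6} {filename:28} {width_px} x {height_px} px")
```

Sharing one base name across the three files is only one way to satisfy the "relate all versions of one image to each other" advice; a folder per version would work equally well.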
    17. Step 5: Upload Image Files to the Web
       • Upload COPIES of the files
       • Use a web server that gives a stable URL
         • Your organization’s webhost
         • Consortial repository
         • Digital Commonwealth
         • Free image hosting site
           • Flickr – not recommended for this due to TOS restrictions
           • Google Picasa – good, free, stable
    18. Step 5 (continued): Upload Image Files to the Web
       Flickr Restrictions on Linking Images
       • Flickr won’t allow you to link to a full-screen image
    19. Google Picasa – free image storage
       • Copies of your working files
       • Easier to store just one size and resize on the web page, although that can slow down page loads…
    20. Step 6: Transcribe Text into a Structured File
       • Separate fields for month, day, year, diarist, transcription text
       • Use one of:
         • Excel spreadsheet
         • MySQL tables
         • MS Access database
         • Quickbase table
         • Zoho table
         • Google Docs Spreadsheet (docs.google.com)
       • This becomes your “Master Data File”
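To make the idea of a structured master file concrete, here is a minimal Python sketch that writes and reads diary entries as CSV, a format that every tool in the list above can import or export. The five column names follow the fields named on the slide; the sample row, the diarist name, and the function names are placeholders invented for the example.

```python
import csv
import io

# Field names follow the slide: month, day, year, diarist, transcription text.
FIELDS = ["month", "day", "year", "diarist", "transcription"]


def write_master(rows, handle):
    """Write diary entries (a list of dicts) to an open text handle as CSV."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def read_master(handle):
    """Read diary entries back as a list of dicts (all values are strings)."""
    return list(csv.DictReader(handle))


# Placeholder entry, not taken from any real diary.
rows = [
    {
        "month": 1,
        "day": 1,
        "year": 1857,
        "diarist": "Example Diarist",
        "transcription": "Placeholder entry text, with a comma.",
    },
]

# An in-memory buffer stands in for a file on disk.
buffer = io.StringIO(newline="")
write_master(rows, buffer)
buffer.seek(0)
loaded = read_master(buffer)
print(loaded[0]["year"], loaded[0]["transcription"])
```

Keeping month, day, and year in separate columns, rather than one free-text date, is what later makes sorting and re-purposing straightforward.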
    21. Sample Transcription Workspace
       • Browser Window #1: Google Spreadsheet “Pocket Diary Data Entry Template”
       • Browser Window #2: Picasa Image Gallery
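Because the data entry template keeps month, day, and year in separate fields, entries typed in any order (for example by several transcribers working on different pages) can be put back into chronological order afterwards. A spreadsheet's own sort feature does this directly; the Python sketch below shows the same idea in code. The three sample rows are invented placeholders.

```python
def entry_date_key(row):
    """Sort key: compare entries by year, then month, then day, as numbers."""
    return (int(row["year"]), int(row["month"]), int(row["day"]))


# Placeholder entries, deliberately out of order.
entries = [
    {"month": "3", "day": "2", "year": "1858", "transcription": "third"},
    {"month": "12", "day": "25", "year": "1857", "transcription": "second"},
    {"month": "2", "day": "14", "year": "1857", "transcription": "first"},
]

ordered = sorted(entries, key=entry_date_key)
for row in ordered:
    print(row["year"], row["month"], row["day"], row["transcription"])
```

Converting the fields to integers matters: sorting the raw strings would place month "12" before month "2".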
    22. Step 7: Paste the transcribed text and add page image links into a blog post
       • “Pocket Diary Data Entry Template” file on Docs.Google.com
       • WordPress blog – Add New Post screen
    23. Step 7 (continued): Paste the transcribed text and image links into a blog post – a shortcut!
       • “Pocket Diary Data Entry Template” file on Docs.Google.com
       • WordPress blog – Add New Post screen
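The slides do not show how the shortcut is built, so the following is only a sketch of one possible approach: generating the text to paste from a row of the master data file. It puts the historic date into the post title as text, since blog posts are otherwise stamped with the current date, and wraps the transcription and a page image link in simple HTML. The function name `build_post`, the sample row, and the `example.org` image URL are hypothetical.

```python
import calendar
from html import escape


def build_post(row, image_url):
    """Return (title, body_html) for one diary entry.

    The historic date goes into the title as plain text.
    """
    month_name = calendar.month_name[int(row["month"])]
    title = f"{month_name} {int(row['day'])}, {row['year']}"
    body = (
        f"<p>{escape(row['transcription'])}</p>\n"
        f'<p><a href="{escape(image_url, quote=True)}">View the page image</a></p>'
    )
    return title, body


# Placeholder entry and a hypothetical image location.
row = {
    "month": "1",
    "day": "1",
    "year": "1857",
    "diarist": "Example Diarist",
    "transcription": "Snow & wind all day.",
}
title, body = build_post(row, "https://example.org/images/diary1857_p001_access.jpg")
print(title)
print(body)
```

Escaping the transcription keeps characters such as `&` or `<` in the diary text from being read as HTML once the body is pasted into the blog editor's HTML view.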
    25. Step 8: Tag, Categorize, Comment, Annotate…
    26. Suggested Process for Publishing Historic Diaries on the Web:
       1. Identify a good diary prospect
       2. Sign up for a WordPress account
       3. Sign up for a Google account
       4. Scan the diary pages
       5. Upload copies of images to Google Picasa
       6. Transcribe content into a Google Spreadsheet
       7. Copy-and-paste the transcribed text and image links into a WordPress blog post
       8. Tag, Categorize, Comment, Annotate…
