    Diacritics presentation20101109 jstrass Document Transcript

    • How to import diacritics into CONTENTdm from a library catalog using Excel and MarcEdit. Jill Strass, St. Olaf College. Upper Midwest Online CONTENTdm Conference, November 8-9, 2010. This talk was inspired by our struggles to digitize some Nordic Solo Songs as collected by Dan Dressen and bravely cataloged and uploaded by Kathy Blough.
    • The Challenge
        • Shortcut to metadata: obtain MARC records containing diacritics from a library catalog as a tab-delimited file for easy import into CONTENTdm.
    • The Method
        • Export our records from the library catalog as a delimited file.
        • Use the tab-delimited file to generate metadata for CONTENTdm.
    • The Method (continued)
        • Upload as a compound object into CONTENTdm.
    • The Challenge
        • Uh oh, we have an export bug that won't allow us to cleanly export fields with repeating values from the catalog to a delimited file.
    • The Workaround – Catalog to MarcEdit
        • No worries, we'll use MarcEdit.
        • Convert the tab-delimited file (.out) from the catalog into an .mrc format file using MarcEdit.
    • The Workaround – Catalog to MarcEdit (continued)
        • Take the .mrc file and export it using MarcEdit's tool for tab-delimited files.
        • In MarcEdit, we choose which MARC fields we want for our metadata in digital collections.
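The export step above can be pictured with a small sketch. It is only an illustration of the idea, not what MarcEdit does internally: the in-memory record layout, the sample values, the `to_tab_delimited` helper, and the "; " separator for repeating fields are all assumptions made for this example.

```python
import csv
import io

# Hypothetical in-memory records: MARC tag -> list of values.
# A list with more than one entry stands for a repeating field.
# The values are invented sample data, not taken from the slides.
records = [
    {
        "245": ["Våren"],
        "100": ["Grieg, Edvard"],
        "650": ["Songs, Norwegian", "Songs with piano"],
    },
]


def to_tab_delimited(records, tags, separator="; "):
    """Build tab-delimited text with one row per record and one column per tag.

    Repeating fields are joined into a single cell with `separator`
    (an assumed convention for this sketch).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(tags)  # header row: the chosen MARC tags
    for record in records:
        writer.writerow([separator.join(record.get(tag, [])) for tag in tags])
    return buffer.getvalue()


output = to_tab_delimited(records, ["245", "100", "650"])
print(output)
```

Keeping every repeating value inside one cell is what lets each record stay on a single row, which is the property the catalog's own delimited export could not provide cleanly.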
    • The Trick to know in MarcEdit for diacritics
        • Use the MarcEdit Characterset Translation tool, and while breaking the record, select UTF-8 as the format, so Excel can recognize diacritic characters.
        • Note that the box for Translate to UTF-8 is checked.
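A short sketch of why the encoding choice matters: the same bytes read under two different encodings give different text. The sample name is invented, and `cp1252` is used here only as a stand-in for a reader that does not expect UTF-8.

```python
def decode_as(data: bytes, encoding: str) -> str:
    """Decode raw bytes under the given encoding."""
    return data.decode(encoding)


name = "Bjørnson"                  # sample value, not from the slides
utf8_bytes = name.encode("utf-8")  # the bytes a UTF-8 export stores on disk

as_utf8 = decode_as(utf8_bytes, "utf-8")     # reader told the file is UTF-8
as_cp1252 = decode_as(utf8_bytes, "cp1252")  # reader assuming Windows-1252

print(as_utf8)    # Bjørnson
print(as_cp1252)  # BjÃ¸rnson
```

The garbled second result is the kind of output that appears when a program opens a UTF-8 file without being told its encoding.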
    • The Trick to know in MarcEdit for diacritics: Yippee! If you look real close, you can see diacritics are showing up in the text editor in MarcEdit.
    • Trick for Diacritics in Excel
        • Now we have our diacritics within a tab-delimited file, courtesy of MarcEdit.
        • There is a trick you'll need to use when you first open Excel.
    • Trick for Diacritics in Excel: When you first open your tab-delimited file from MarcEdit and Excel takes you through its wizard for importing the file, select 65001 Unicode (UTF-8) from the File Origin pull-down menu. This will allow Excel to "see" the diacritics.
    • Generating Metadata from tab-delimited files
        • We use a tricked-out spreadsheet that allows us to take a row from a tab-delimited file, copy and paste it into Excel, and then Excel generates a compound object template for easy upload into CONTENTdm.
    • Generating Metadata from tab-delimited files (continued)
        • We do this to avoid manual data entry as much as possible.
        • If you'd like a spreadsheet file and documentation on how to use it contact
    • Generating Metadata from tab-delimited files
        • To convert the .xls file to .txt, we select, copy and paste from Excel into Notepad++.
        • We do this so we can see exactly what characters are showing up in our text files.
        • Note that Notepad++ is so cool, we don't need any tricks to use it!
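The same "see exactly what characters are showing up" check can be approximated in a script. This is a minimal sketch using Python's standard `unicodedata` module; the `describe_non_ascii` helper and the sample strings are hypothetical, introduced only for illustration.

```python
import unicodedata


def describe_non_ascii(text):
    """Return (character, code point, Unicode name) for each distinct
    non-ASCII character, in order of first appearance."""
    seen = []
    for ch in text:
        if ord(ch) > 127 and ch not in seen:
            seen.append(ch)
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in seen]


# Invented sample row; the second title uses a decomposed form
# (plain "a" followed by a combining ring) to show that the report
# distinguishes it from the precomposed "å".
sample = "V\u00e5ren\tVa\u030aren\tEn dr\u00f8m"

for ch, code_point, name in describe_non_ascii(sample):
    print(code_point, name)
```

A report like this makes it easy to spot unexpected characters, including combining marks that look identical on screen to precomposed letters.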
    • Uploading into CONTENTdm with Diacritics (CDM 5.3)
        • From Project Client, select Add Multiple Compound Objects, then select the Map Fields tab.
        • Click the Encoding button.
    • Uploading into CONTENTdm with Diacritics (CDM 5.3)
        • If only it were this simple… We had to select ANSI for this to work, although according to the documentation, UTF-8 as the encoding is supposed to work.
        • We may never know why this is so for us. Please share your experiences.
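One way to reason about the ANSI versus UTF-8 choice is to compare the bytes each encoding produces for the same text. The sketch below assumes that "ANSI" means Windows code page 1252, which is the usual case on Western European and US Windows systems but may differ elsewhere; the `encode_variants` helper and the sample title are invented for illustration.

```python
def encode_variants(text, encodings=("utf-8", "cp1252")):
    """Encode the same text under each encoding and return the raw bytes."""
    return {encoding: text.encode(encoding) for encoding in encodings}


title = "Sæterjentens søndag"  # sample value, not from the slides
variants = encode_variants(title)

for encoding, data in variants.items():
    # Each diacritic takes two bytes in UTF-8 and one byte in cp1252.
    print(encoding, len(data), data)
```

If the file's actual bytes match one encoding while the upload dialog is set to the other, the diacritics will not survive, so checking which variant a text file contains may help narrow down which setting to try.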
    • A Sample of Diacritics on CONTENTdm: And here we are, at journey's end…
    • Summary of Diacritics on CONTENTdm
        • Export MARC records from your catalog or source for text with diacritics.
        • If you need to use MarcEdit in this process, select the UTF-8 box in the Characterset Translation Tool.
        • When first opening a tab-delimited file in Excel, select 65001 Unicode (UTF-8) from the File Origin pull-down menu.
        • When uploading to CONTENTdm, experiment with the UTF-8 vs ANSI setting in the Add Compound Object, File Mapping, Encoding box.
    • How to import Diacritics from a Library Catalog into CONTENTdm Using Excel and MarcEdit. Jill Strass, Digital Initiatives and Metadata Librarian, St. Olaf College, strass@stolaf.edu