Presentation by Jenny O Neill 'Librarian as databrarian' delivered at #asl2015 'The inside out library: collaboration, inspiration, transformation' Feb 26th 2015
3. Mission
DRI is a trusted digital repository for Humanities and Social Sciences Data
DRI links and preserves the rich data held by Irish institutions, providing a central access point and multimedia tools
♯ASL2015
5. Data Curator
“This position will involve assembling and curating diverse data sets from the humanities and social sciences for ingestion into the Digital Repository of Ireland”
But also…
o Metadata Taskforce
o Workflows
o Organisational Liaison
o Linked data factsheet
o Metadata guidelines for MARC
o Feature champion for Ingestion
6. 1641 Depositions
The 1641 Depositions are witness testimonies
from all social backgrounds, concerning their
experiences of the 1641 Irish rebellion.
o Digitised images of the Depositions (TIFF)
o Transcriptions encoded in TEI (XML)
7. Step 1 – Explore the database
https://class.stanford.edu/courses/Home/Databases/Engineering/about
14. Step 3 – Map to QDC
Mandatory in DRI
o Title
o Creator
o Created
o Description
o Rights
Optional in DRI
o Identifier
o Modified
Recommended in DRI
o Contributor
o Language
o Source
o Spatial
o Temporal
o Type
o Relation
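The field lists above lend themselves to a simple completeness check before ingestion. The following is a minimal sketch in Python, assuming each record is held as a plain dictionary keyed by lower-case field name; the function name `missing_fields` and the sample values are invented for illustration and are not part of DRI's tooling.

```python
# Field lists as given on the slide. Holding a record as a plain dict
# is an assumption made for this sketch.
MANDATORY = ["title", "creator", "created", "description", "rights"]
RECOMMENDED = ["contributor", "language", "source", "spatial",
               "temporal", "type", "relation"]


def missing_fields(record, required):
    """Return the names in `required` that are absent or blank in `record`."""
    missing = []
    for name in required:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Invented sample record: description is blank and rights is absent.
record = {
    "title": "Sample deposition",
    "creator": "Example, Deponent",
    "created": "1642-01-03",
    "description": "",
    "source": "Sample manuscript reference",
}

print(missing_fields(record, MANDATORY))
print(missing_fields(record, RECOMMENDED))
```

Running the same check against `RECOMMENDED` gives a list of fields worth filling in where the source data allows, without blocking ingestion.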
15. Step 3
QDC field: database columns
o Title: title
o Creator: surname, forename, patronymics, age, person_type_desc
o Created: day, month, year
o Source: manuscript_number, folio_start, folio_end, page
o Subject: gender_desc, nationality_desc, religion_desc
o Identifier: deposition_id
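The mapping on this slide can be sketched as a small Python function that turns one database row into a set of QDC fields. The column names come from the slide; how the parts are combined (comma-separated strings, an ISO-style date, a list of subjects) is an assumption, as are the helper names `join_nonempty` and `row_to_qdc` and the sample row.

```python
CREATOR_COLS = ("surname", "forename", "patronymics", "age", "person_type_desc")
SOURCE_COLS = ("manuscript_number", "folio_start", "folio_end", "page")
SUBJECT_COLS = ("gender_desc", "nationality_desc", "religion_desc")


def join_nonempty(values, sep=", "):
    """Join the non-blank values into one string (the separator is an assumption)."""
    parts = [str(v).strip() for v in values if v is not None]
    return sep.join(p for p in parts if p)


def row_to_qdc(row):
    """Map one database row (a dict keyed by column name) to QDC fields."""
    return {
        "title": row["title"],
        "creator": join_nonempty(row.get(c) for c in CREATOR_COLS),
        # Assumed target format: YYYY-MM-DD.
        "created": "{:04d}-{:02d}-{:02d}".format(
            int(row["year"]), int(row["month"]), int(row["day"])),
        "source": join_nonempty(row.get(c) for c in SOURCE_COLS),
        # One subject value per non-blank column.
        "subject": [str(row[c]).strip() for c in SUBJECT_COLS if row.get(c)],
        "identifier": str(row["deposition_id"]),
    }


# Invented sample row, for illustration only.
row = {
    "title": "Sample deposition",
    "surname": "Doe", "forename": "Jane", "patronymics": "", "age": None,
    "person_type_desc": "Deponent",
    "day": 3, "month": 1, "year": 1642,
    "manuscript_number": "MS 000", "folio_start": "1r", "folio_end": "2v", "page": "",
    "gender_desc": "Female", "nationality_desc": "", "religion_desc": "Protestant",
    "deposition_id": 42,
}

qdc = row_to_qdc(row)
for field, value in qdc.items():
    print(field, "=", value)
```

Keeping the column lists as named constants means the mapping can be adjusted in one place if a field turns out to need different source columns.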
16. Step 4 – Clean the metadata
http://openrefine.org/
http://www.dri.ie/publications#guidelines
20. Next steps
o Create relationships between objects
o Create QDC XML
o https://www.utsc.utoronto.ca/digitalscholarship/content/blogs/converting-spreadsheets-modsxml-using-open-refine
o Ingest into DRI
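As an illustration of the "Create QDC XML" step, here is a minimal Python sketch that serialises a dictionary of QDC fields using the standard library. The two namespace URIs are the standard Dublin Core ones; the root element name `qualifieddc`, the function name `qdc_xml`, and the sample values are assumptions, so the exact document shape DRI expects should be taken from its metadata guidelines rather than from this sketch.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
ET.register_namespace("dc", DC)
ET.register_namespace("dcterms", DCTERMS)

# Namespace for each field handled by this sketch; extend as needed.
FIELD_NS = {
    "title": DC, "creator": DC, "description": DC, "rights": DC,
    "source": DC, "subject": DC, "identifier": DC,
    "created": DCTERMS,
}


def qdc_xml(record, root_tag="qualifieddc"):
    """Serialise a dict of QDC fields to an XML string.

    `root_tag` is a placeholder name, not a confirmed DRI requirement.
    A list value produces one element per item.
    """
    root = ET.Element(root_tag)
    for field, value in record.items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            child = ET.SubElement(root, "{%s}%s" % (FIELD_NS[field], field))
            child.text = str(item)
    return ET.tostring(root, encoding="unicode")


# Invented sample record, for illustration only.
sample = {
    "title": "Sample deposition",
    "created": "1642-01-03",
    "subject": ["Female", "Protestant"],
}
print(qdc_xml(sample))
```

A field that is not listed in `FIELD_NS` raises a `KeyError`, which is deliberate here: it surfaces unmapped fields early instead of silently dropping them.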
First I would like to give a brief introduction to the DRI. The DRI is a trusted digital repository for Humanities and Social Sciences data…
The DRI is funded under the HEA PRTLI 5 program and is built by a research consortium of six academic partners working together to deliver the repository, policies, guidelines and training. The partners are the RIA, TCD, NUIM, DIT, NUIG, and NCAD. The DRI now consists of 36 people across 6 sites and has partners in the cultural and academic sectors, as well as in industry.
I started working in the Trinity Centre for High Performance Computing in June of this year. Don’t ask me what High Performance Computing is; despite working there for four months, I still have no idea. The sentence above is taken from the ad for my job. When I went back to the ad the day before I started, I realised this was the only line describing what I would actually be doing.
There are lots of other areas of DRI to get involved in. My advice to any new librarians is to get stuck in: if you see a gap that your skills can fill, jump in. Mention each of these briefly but don’t go into detail.
My first dataset: images and text, 9,000+ of each, already digitised and transcribed. The project is finished. The images are sitting on a server in TCHPC, along with the TEI/XML files, which have some metadata in the TEI header. There is also a MySQL database with the rest of the metadata. My job is to create Qualified Dublin Core records for each digital object (18,000+ objects).
First I removed any white space.
Next I added leading zeros to the cells that needed two digits.
Then I merged three columns so that the date is now in the correct format.
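The three cleaning steps described here (trim white space, pad to two digits, merge the columns) can be sketched in a few lines of Python. The talk does not name the target date format, so year-month-day (YYYY-MM-DD) is an assumption, and `clean_date` is a hypothetical helper name rather than anything used in the original workflow.

```python
def clean_date(day, month, year):
    """Trim white space, zero-pad day and month, and merge into one date string."""
    # Step 1: remove any surrounding white space.
    day, month, year = (str(part).strip() for part in (day, month, year))
    # Step 2: add a leading zero where day or month has a single digit.
    day, month = day.zfill(2), month.zfill(2)
    # Step 3: merge the three columns (assumed order: year-month-day).
    return "{}-{}-{}".format(year, month, day)


print(clean_date(" 3", "1 ", "1642"))
```

The same three operations are available interactively in OpenRefine; the function only makes each step explicit.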