Finding, searching and sharing qualitative data: the uses of XML

Libby Bishop
Producer Relations and
Research Ethics
Data Management in Practice
LSHTM, London, 14 November 2013

UK Data Service seeking to improve
• We have one of the largest qualitative data collections – over 300 data collections in the social sciences
• Currently users find and download these from our website – generally good, we would like to improve:

• No searching within collections
• Hard to display complex relationships among related files within a collection (transcript, audio, image, memo)
• Cannot reliably cite parts of data

What researchers want from data centres

• Search - find data regardless of location
• Use – ways to use data flexibly
• Examine interview extract in context, online
• Decide before download
• Support analysis led by research questions (not technology)

• Cite – get and give credit appropriately
• Preserve – for own or others’ use later
XML is not a miracle cure,
just a (key) part of the solution

XML – eXtensible Mark-up Language
• Language – system for communication
• Mark-up – encoding descriptive features of text
• Tags, e.g. <u>words spoken in an interview</u>

• Extensible – set of tags is not fixed
• Text Encoding Initiative (TEI) has 100s
• Independent of specific hard/software
• Open
XML allows qual data (rich, deep, but messy,
unstructured) to benefit from computing power
typically applied to structured, numeric data.
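
A minimal sketch of what this means in practice, using only Python's standard library. Only the <u> tag comes from the slide above; the <interview> wrapper, the "who" attribute and the sample sentences are assumptions invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical marked-up interview extract. Everything except the <u>
# tag (wrapper element, attribute names, wording) is an assumption.
SAMPLE = """<interview id="int001">
  <u who="interviewer">When did you leave school?</u>
  <u who="respondent">In 1978, when I was sixteen.</u>
  <u who="interviewer">What did you do next?</u>
  <u who="respondent">I started an apprenticeship in a garage.</u>
</interview>"""


def utterances_by(xml_text, speaker):
    """Return the text of every <u> element spoken by `speaker`."""
    root = ET.fromstring(xml_text)
    return [u.text for u in root.iter("u") if u.get("who") == speaker]


def search(xml_text, term):
    """Return (speaker, text) pairs whose text contains `term`,
    ignoring case."""
    root = ET.fromstring(xml_text)
    return [
        (u.get("who"), u.text)
        for u in root.iter("u")
        if term.lower() in (u.text or "").lower()
    ]


if __name__ == "__main__":
    print(utterances_by(SAMPLE, "respondent"))
    print(search(SAMPLE, "garage"))
```

Because each utterance is tagged, a query can be limited to one speaker or return the surrounding element rather than a bare string match, which is the kind of structured access usually reserved for numeric data.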

Search: all types of resource available
Data collections
• studies
• variables

Case studies
• research
• teaching

ESRC outputs
• conference paper
• article
• report
• research summary

Support/‘how to’ guides
• dataset
• theme
• methods/statistics

What makes all this possible? XML…..

Data Documentation Initiative (DDI)
DDI: A metadata specification
for the social sciences

Use and Cite: Digital Futures project
• Build a user-friendly system for publishing and exploring qualitative data online
• Project includes large-scale digitisation of precious and undigitized materials
• Browse search results in context
• Improve display of complex data
• Offer a mechanism for reliably citing data located in the system

Search results – displayed in context

Many formats for different research questions

School Leaver Essay 53 – My Past
aaa In 1978 I left school, I was sixteen years old. I came straight out of school into an
apprenticeship heavy meter machanics. I served my four year apprenticeship in a garage for
another year and the left and started my own garage. At the age of twenty three I got married.
The garage was doing well so I didn’t have Much prodlems setting up a home. One year
After I had/been married my wife had her first child. When I had some spare time I made up
a car for rally cross racing but In the time I was racing I only won a few. When I was twenty
five our second child was born. Once when rally driving I had a smash and was in hospital
for five months when I was twenty nine we had our third child. I would get up at six o clock
and drive to the garage and open it at Saturdays. On some Sundays when I wasn’t rally
driving the family would go horse riding or for a picnic whilst I went fishing. In the garage I
took an apprenticship from people who had just left school. When I was thirty six we had our
fourth child. My first child would come and help in the garage at least when he left school he
would get a job. When I was forty I had an extension built on to the garage. I also bought 4
acres of land and built a racetrack and made go-karts for my second and third eldest sons
when my last child was eight I brought her a pony and taught her to ride. From when I was
forty four My mother died and my father had died when I was twenty nine.

Corrected spelling – for accurate searches

<sic>apprenticship</sic><corr>apprenticeship</corr>
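
A rough sketch of how paired <sic> and <corr> tags could support both faithful display and accurate search. It assumes the pair is wrapped in a <choice> element, as is common TEI practice; the <p> wrapper and the render function are hypothetical and do not describe the project's actual pipeline.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment of the essay. The <choice> wrapper follows
# common TEI practice; the <p> wrapper is an assumption.
FRAGMENT = (
    "<p>In the garage I took an "
    "<choice><sic>apprenticship</sic><corr>apprenticeship</corr></choice>"
    " from people who had just left school.</p>"
)


def render(elem, prefer):
    """Flatten `elem` to plain text, keeping only the `prefer` child
    ('sic' or 'corr') of each <choice> element."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "choice":
            chosen = child.find(prefer)
            if chosen is not None:
                parts.append(render(chosen, prefer))
        else:
            parts.append(render(child, prefer))
        # Text that follows the child element belongs to the parent.
        parts.append(child.tail or "")
    return "".join(parts)


if __name__ == "__main__":
    root = ET.fromstring(FRAGMENT)
    print(render(root, "sic"))   # text as the pupil wrote it
    print(render(root, "corr"))  # normalised text for a search index
```

Indexing the "corr" rendering lets a search for "apprenticeship" find the essay, while the "sic" rendering keeps the original spelling available for display.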

Status quo - RTF transcript for download

DF - Target page for an interview

Objects in collection metadata

Richer metadata = richer discovery
• Use of DDI 2.5, QuDEx and TEI schema
• QuDEx allows identification of data objects:
• Interview transcript or audio recording etc.
• Relationship to another data object or part of data
• Descriptive categories at the object level, e.g. mime type, interview characteristics, interview setting
• Capacity to capture rich annotation of parts of data

• QuDEx model in use (Schema at: www.data-archive.ac.uk/create-manage/projects/qudex/)
• Object-level description = a lot of manual work!
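
To make the idea of object-level description concrete, here is a small sketch of data objects and the relationships between them. It is an illustration only: the class names, field names, identifiers and relation labels are hypothetical and are not taken from the QuDEx schema.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """One file in a collection. Field names are illustrative,
    not QuDEx element names."""
    object_id: str
    mime_type: str
    description: str = ""


@dataclass
class Relation:
    source: str
    target: str
    kind: str


@dataclass
class Collection:
    objects: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)

    def add(self, obj):
        self.objects[obj.object_id] = obj

    def relate(self, source, target, kind):
        if source not in self.objects or target not in self.objects:
            raise KeyError("both objects must be added before relating them")
        self.relations.append(Relation(source, target, kind))

    def related_to(self, object_id):
        """Return (kind, object) pairs linked to `object_id`
        in either direction."""
        found = []
        for r in self.relations:
            if r.source == object_id:
                found.append((r.kind, self.objects[r.target]))
            elif r.target == object_id:
                found.append((r.kind, self.objects[r.source]))
        return found


def build_example():
    """Build a made-up collection with one interview and its files."""
    c = Collection()
    c.add(DataObject("int01-transcript", "application/rtf", "Interview 1 transcript"))
    c.add(DataObject("int01-audio", "audio/mpeg", "Interview 1 recording"))
    c.add(DataObject("int01-memo", "text/plain", "Field memo"))
    c.relate("int01-transcript", "int01-audio", "transcribes")
    c.relate("int01-memo", "int01-transcript", "comments on")
    return c


if __name__ == "__main__":
    for kind, obj in build_example().related_to("int01-transcript"):
        print(kind, obj.object_id, obj.mime_type)
```

Every object and every link has to be entered by someone, which is why description at this level involves so much manual work.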

Citation – of collection, and utterance
World Health Organization and International Collaborative Study
of Medical Care Utilization, WHO/ICS Medical Care Utilization
Study Data, 1968-1969 [computer file]. Colchester, Essex: UK Data
Archive [distributor], January 1981. SN: 1427,
http://dx.doi.org/10.5255/UKDA-SN-1427-1

Preservation – benefits of XML
• Open standard
• Widely adopted as the basis for interchange of
documents and data over the Web
• Human readable
• Best for metadata; some challenges for preserving data
itself

How can researchers help?
• Produce and share high-quality metadata and
documentation… and
• Using XML is not that different from text processing and
spreadsheets

Questions

Libby Bishop
ebishop@essex.ac.uk
