Finding, searching and sharing
qualitative data: the uses of XML

Libby Bishop
Producer Relations and
Research Ethics
Data Management in Practice
LSHTM, London, 14 November 2013
UK Data Service seeking to improve
• We have one of the largest qualitative data collections – over 300 data collections in the social sciences
• Currently users find and download these from our website – generally good, we would like to improve:
• No searching within collections
• Hard to display complex relationships among related files within a collection (transcript, audio, image, memo)
• Cannot reliably cite parts of data
What researchers want from data centres

• Search - find data regardless of location
• Use – ways to use data flexibly
• Examine interview extract in context, online
• Decide before download
• Support analysis led by research questions (not technology)

• Cite – get and give credit appropriately
• Preserve – for own or others’ use later
XML is not a miracle cure,
just a (key) part of the solution
XML – eXtensible Mark-up Language
• Language – system for communication
• Mark-up – encoding descriptive features of text
• Tags, e.g. <u>words spoken in an interview</u>

• Extensible – set of tags is not fixed
• Text Encoding Initiative (TEI) defines hundreds of tags
• Independent of specific hardware and software
• Open
XML allows qualitative data (rich, deep, but messy and
unstructured) to benefit from the computing power
typically applied to structured, numeric data.
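
As a minimal sketch of what that computing power can look like, the Python below reads a small tagged transcript and pulls out utterances by speaker. The `<u>` tag follows the example above; the `<interview>` wrapper, the `who` values and the sample text are invented for illustration and are not taken from any UK Data Service collection.

```python
import xml.etree.ElementTree as ET

# Illustrative transcript: the <interview> wrapper, the who values and the
# wording are assumptions made up for this sketch.
SAMPLE = """
<interview>
  <u who="interviewer">When did you leave school?</u>
  <u who="respondent">In 1978, when I was sixteen.</u>
  <u who="interviewer">What did you do next?</u>
  <u who="respondent">I started an apprenticeship in a garage.</u>
</interview>
"""

def utterances_by_speaker(xml_text, speaker):
    """Return the text of every <u> element spoken by the given speaker."""
    root = ET.fromstring(xml_text)
    return [u.text for u in root.iter("u") if u.get("who") == speaker]

def count_by_speaker(xml_text):
    """Count how many utterances each speaker contributed."""
    root = ET.fromstring(xml_text)
    counts = {}
    for u in root.iter("u"):
        who = u.get("who")
        counts[who] = counts.get(who, 0) + 1
    return counts

if __name__ == "__main__":
    print(utterances_by_speaker(SAMPLE, "respondent"))
    print(count_by_speaker(SAMPLE))
```

Once the speaker is marked up, questions such as "show only what the respondent said" become a simple filter rather than a manual read-through.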
Search: all types of resource available
Data collections
• studies
• variables
Case studies
• research
• teaching
ESRC outputs
• conference paper
• article
• report
• research summary
Support/‘how to’ guides
• dataset
• theme
• methods/statistics
Search
What makes all this possible? XML…..
Data Documentation Initiative (DDI)
DDI: A metadata specification
for the social sciences
Use and Cite: Digital Futures project
• Build a user-friendly system for publishing and exploring qualitative data online
• Project includes large-scale digitisation of precious and undigitised materials
• Browse search results in context
• Improve display of complex data
• Offer a mechanism for reliably citing data located in the system
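
To illustrate the idea of browsing search results in context, here is a rough keyword-in-context sketch in Python. It is not the Digital Futures implementation; the function name `keyword_in_context` and the `width` parameter are hypothetical, and it works on plain text only.

```python
import re

def keyword_in_context(text, keyword, width=30):
    """Return each match of keyword with up to `width` characters either side."""
    hits = []
    # re.escape treats the keyword as literal text, not as a pattern.
    for m in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
        start = max(0, m.start() - width)
        end = min(len(text), m.end() + width)
        hits.append(text[start:end])
    return hits

if __name__ == "__main__":
    sample = "I left school. I opened a garage. The garage was doing well."
    for hit in keyword_in_context(sample, "garage", width=10):
        print("..." + hit + "...")
```

Showing a few words either side of each hit lets a user judge relevance before opening or downloading the full file.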
Search results – displayed in context
Many formats for different research questions
School Leaver Essay 53 – My Past
aaa In 1978 I left school, I was sixteen years old. I came straight out of school into an
apprenticeship heavy meter machanics. I served my four year apprenticeship in a garage for
another year and the left and started my own garage. At the age of twenty three I got married.
The garage was doing well so I didn’t have Much prodlems setting up a home. One year
After I had/been married my wife had her first child. When I had some spare time I made up
a car for rally cross racing but In the time I was racing I only won a few. When I was twenty
five our second child was born. Once when rally driving I had a smash and was in hospital
for five months when I was twenty nine we had our third child. I would get up at six o clock
and drive to the garage and open it at Saturdays. On some Sundays when I wasn’t rally
driving the family would go horse riding or for a picnic whilst I went fishing. In the garage I
took an apprenticship from people who had just left school. When I was thirty six we had our
fourth child. My first child would come and help in the garage at least when he left school he
would get a job. When I was forty I had an extension built on to the garage. I also bought 4
acres of land and built a racetrack and made go-karts for my second and third eldest sons
when my last child was eight I brought her a pony and taught her to ride. From when I was
forty four My mother died and my father had died when I was twenty nine.
Corrected spelling – for accurate searches

<sic>apprenticship</sic><corr>apprenticeship</corr>
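
A small Python sketch of why this markup helps: the same encoded sentence can be flattened either to the writer's original spelling or to the corrected reading used for searching. The `<p>` wrapper and the `reading` helper are assumptions for illustration, and the helper only handles tags that sit directly inside the wrapper.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment: the <p> wrapper is an assumption; the <sic>/<corr>
# pair follows the example above.
FRAGMENT = (
    "<p>I took an <sic>apprenticship</sic><corr>apprenticeship</corr> "
    "from people who had just left school.</p>"
)

def reading(xml_text, drop):
    """Flatten the fragment to plain text, skipping elements named `drop`."""
    root = ET.fromstring(xml_text)
    parts = [root.text or ""]
    for child in root:
        if child.tag != drop:
            parts.append(child.text or "")
        # Text that follows an element is kept whichever reading is chosen.
        parts.append(child.tail or "")
    return "".join(parts)

original = reading(FRAGMENT, drop="corr")   # spelling as written
corrected = reading(FRAGMENT, drop="sic")   # spelling used for searching

if __name__ == "__main__":
    print(original)
    print(corrected)
```

A search index built from the corrected reading would find this essay for "apprenticeship", while the display can still show the spelling as written.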
Status quo: RTF transcript for download
Digital Futures (DF): target page for an interview
Objects in collection metadata
Richer metadata = richer discovery
• Use of DDI 2.5, QuDEx and TEI schema
• QuDEx allows identification of data objects:
• Interview transcript or audio recording etc.
• Relationship to another data object or part of data
• Descriptive categories at the object level, e.g. mime type, interview characteristics, interview setting
• Capacity to capture rich annotation of parts of data
• QuDEx model in use (Schema at: www.data-archive.ac.uk/create-manage/projects/qudex/)
• Object-level description = a lot of manual work!
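
The sketch below shows, in Python, the kind of object-level description and linking listed above. It does not use the QuDEx schema: the `DataObject` class, its field names and the sample identifiers are all hypothetical, chosen only to show how related files in one collection can be found from each other.

```python
from dataclasses import dataclass, field

@dataclass
class DataObject:
    """One file in a collection (hypothetical fields, not QuDEx element names)."""
    object_id: str
    kind: str                      # e.g. "transcript", "audio", "memo"
    mime_type: str
    related_to: list = field(default_factory=list)  # ids of related objects

def related_objects(objects, object_id):
    """Return the objects linked to object_id, in either direction."""
    by_id = {o.object_id: o for o in objects}
    linked = set(by_id[object_id].related_to)
    linked.update(o.object_id for o in objects if object_id in o.related_to)
    return [by_id[i] for i in sorted(linked)]

# Invented sample collection: one interview with three related files.
COLLECTION = [
    DataObject("int01-txt", "transcript", "application/rtf"),
    DataObject("int01-audio", "audio", "audio/mpeg", related_to=["int01-txt"]),
    DataObject("int01-memo", "memo", "text/plain", related_to=["int01-txt"]),
]

if __name__ == "__main__":
    for obj in related_objects(COLLECTION, "int01-txt"):
        print(obj.object_id, obj.kind, obj.mime_type)
```

Every link here has to be entered by someone, which is where the manual work comes from.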
Citation – of collection, and utterance
World Health Organization and International Collaborative Study
of Medical Care Utilization, WHO/ICS Medical Care Utilization
Study Data, 1968-1969 [computer file]. Colchester, Essex: UK Data
Archive [distributor], January 1981. SN: 1427,
http://dx.doi.org/10.5255/UKDA-SN-1427-1
Preservation – benefits of XML
• Open standard
• Widely adopted as the basis for interchange of
documents and data over the Web
• Human readable
• Best for metadata; some challenges for preserving data
itself
How can researchers help?
• Produce and share high-quality metadata and
documentation … and,
• Use XML: it is not that different from text processing and
spreadsheets
Questions

Libby Bishop
ebishop@essex.ac.uk

IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 

Finding, searching and sharing qualitative data: the uses of XML

  • 1. Finding, searching and sharing qualitative data: the uses of XML Libby Bishop Producer Relations and Research Ethics Data Management in Practice LSHTM, London, 14 November 2013
  • 2. UK Data Service seeking to improve • We have one of the largest qualitative data collections – over 300 data collections in the social sciences • Currently users find and download these from our website – generally good, we would like to improve: • No searching within collections • Hard to display complex relationships among related files within a collection (transcript, audio, image, memo) • Cannot reliably cite parts of data
  • 3. What researchers want from data centres • Search - find data regardless of location • Use – ways to use data flexibly • Examine interview extract in context, online • Decide before download • Support analysis led by research questions (not technology) • Cite – get and give credit appropriately • Preserve – for own or others’ use later XML is not a miracle cure, just a (key) part of the solution
  • 4. XML – eXtensible Mark-up Language • Language – system for communication • Mark-up – encoding descriptive features of text • Tags, e.g. <u>words spoken in an interview</u> • Extensible – set of tags is not fixed • Text Encoding Initiative (TEI) has 100s • Independent of specific hard/software • Open XML allows qual data (rich, deep, but messy, unstructured) to benefit from computing power typically applied to structured, numeric data.
  • 5. Search: all types of resource available • Data collections: studies, variables • Case studies: research, teaching • ESRC outputs: conference paper, article, report, research summary • Support/‘how to’ guides: dataset, theme, methods/statistics
  • 7.
  • 8.
  • 9. What makes all this possible? XML…..
  • 10. Data Documentation Initiative (DDI) DDI: A metadata specification for the social sciences
  • 11.
  • 12.
  • 13.
  • 14. Use and Cite: Digital Futures project • Build a user-friendly system for publishing and exploring qualitative data online • Project includes large-scale digitisation of precious and undigitized materials • Browse search results in context • Improve display of complex data • Offer a mechanism for reliably citing data located in the system
  • 15. Search results – displayed in context
  • 16. Many formats for different research questions
  • 17. School Leaver Essay 53 – My Past aaa In 1978 I left school, I was sixteen years old. I came straight out of school into an apprenticeship heavy meter machanics. I served my four year apprenticeship in a garage for another year and the left and started my own garage. At the age of twenty three I got married. The garage was doing well so I didn’t have Much prodlems setting up a home. One year After I had/been married my wife had her first child. When I had some spare time I made up a car for rally cross racing but In the time I was racing I only won a few. When I was twenty five our second child was born. Once when rally driving I had a smash and was in hospital for five months when I was twenty nine we had our third child. I would get up at six o clock and drive to the garage and open it at Saturdays. On some Sundays when I wasn’t rally driving the family would go horse riding or for a picnic whilst I went fishing. In the garage I took an apprenticship from people who had just left school. When I was thirty six we had our fourth child. My first child would come and help in the garage at least when he left school he would get a job. When I was forty I had an extension built on to the garage. I also bought 4 acres of land and built a racetrack and made go-karts for my second and third eldest sons when my last child was eight I brought her a pony and taught her to ride. From when I was forty four My mother died and my father had died when I was twenty nine.
  • 18. Corrected spelling – for accurate searches <sic>apprenticship</sic><corr>apprenticeship</corr>
  • 19. Status quo – RTF transcript for download
  • 20. DF - Target page for an interview
  • 22. Richer metadata = richer discovery • Use of DDI 2.5, QuDEx and TEI schema • QuDEx allows identification of data objects: • Interview transcript or audio recording etc. • Relationship to another data object or part of data • Descriptive categories at the object level, e.g. mime type, interview characteristics, interview setting • Capacity to capture rich annotation of parts of data • QuDEx model in use (Schema at: www.data-archive.ac.uk/create-manage/projects/qudex/) • Object-level description = a lot of manual work!
  • 23. Citation – of collection, and utterance World Health Organization and International Collaborative Study of Medical Care Utilization, WHO/ICS Medical Care Utilization Study Data, 1968-1969 [computer file]. Colchester, Essex: UK Data Archive [distributor], January 1981. SN: 1427, http://dx.doi.org/10.5255/UKDA-SN-1427-1
  • 24. Preservation – benefits of XML • Open standard • Widely adopted as the basis for interchange of documents and data over the Web • Human readable • Best for metadata; some challenges for preserving data itself
  • 25. How can researchers help? • Produce and share high quality metadata and documentation… and, • Using XML is not that different from text processing and spreadsheets
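The corrected-spelling markup on slide 18 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fragment is made up from the essay sentence on slide 17, and the `<choice>` wrapper around `<sic>` and `<corr>` follows TEI practice but does not appear on the slide. The `render` helper is a hypothetical name, not part of any UK Data Service tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment based on the slide 17 essay; the <choice> wrapper
# is an assumption borrowed from TEI, the slide shows only <sic> and <corr>.
FRAGMENT = (
    "<p>In the garage I took an "
    "<choice><sic>apprenticship</sic><corr>apprenticeship</corr></choice>"
    " from people who had just left school.</p>"
)


def render(elem, keep):
    """Flatten elem to plain text, keeping only the `keep` child of each <choice>."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "choice":
            chosen = child.find(keep)
            parts.append(chosen.text if chosen is not None and chosen.text else "")
        else:
            parts.append(render(child, keep))
        # Text that follows the child element inside its parent
        parts.append(child.tail or "")
    return "".join(parts)


paragraph = ET.fromstring(FRAGMENT)
original = render(paragraph, "sic")    # spelling as written by the school leaver
corrected = render(paragraph, "corr")  # normalised spelling, better for searching

print(original)
print(corrected)
```

One file therefore serves both needs: the display can show the original handwriting as written, while a search index can be built from the corrected reading.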

Editor's Notes

  1. Main points – not to teach XML. Researchers need to locate, explore and use data. XML, behind the scenes, makes that possible. Even if you are not technical, it is useful to understand.
  2. We have a lot to be proud of, but technologies are advancing, and we want to improve ways we provide access and disseminate data to our users.
  3. We try to listen and learn from researchers – and respond to their/your needs. And those expectations are rising quickly.
  4. A bit like HTML, but for marking up structural features (paragraphs), not format (bold). Hard (for me) to get in the abstract, so going to turn now to examples, cases of uses of XML, for search, use, citation and sharing.
  5. And here is a list of the data types we are going to talk about today. We’re not going to go into a huge amount of detail – but this talk will give you a taste of the data we host.
  6. 391 hits from search on health survey
  7. Similar search – but limited to keyword = tropical and data only. We find a LSHTM data collection!
  8. This is just a small part of the catalogue record for this collection. Note fields like title, and depositor. Pretty standard kinds of metadata (data about data) needed for any data collection. Note the upper right – get DDI XML record….
  9. No need to be scared – this is exactly the same info….but you can see its XML structure – using tags.
  10. The use of XML for metadata, and doing it in a standardised way, enables exponential increase in power of searching for data. DDI – standardised set of tags for handling social science data.
  11. Now, a quick look at some other capabilities made possible by this XML-structured metadata.
  12. The ability to find and locate variables across surveys and decide if they are “close enough” for your purposes.
  13. And the ability to search for health data across archives – for example, this portal for European data archives.
  14. DF – intended to address a couple of those areas where UKDS wants to improve
  15. Search results show all interviews with search term. With context – surrounding sentence. And with metadata about the interviewee – age, gender, region, SES.
  16. Here is an example of some of the paper being digitised – essays about future life written by 16-year-old school leavers, Sheppey. For this collection – there is value in retaining an image file – the handwriting itself is data.
  17. We are scanning, then OCR – to fully digitise text. This version shows original spelling. What to do – show correct or original?
  18. XML allows us to keep both in the same document.
  19. Typical transcript user could download. Good, but cannot be browsed online, and can’t modify display.
  20. Available online. Can use formatting to display turn-taking and can modify speaker tags with multiple versions of metadata.
  21. This is QuDEx Schema (a bit like DDI) – Qualitative Data Exchange – family of tags created specifically for qualitative data.
  22. Made possible by GUID – Globally Unique ID – for every utterance
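Notes 15 and 22 describe two things that identified utterances make possible: search results shown with their surrounding sentence and interviewee metadata, and citation down to a single utterance. A rough Python sketch of the idea follows. Everything in it is invented for illustration: the transcript, the attribute names (`id`, `who`, `age`, `gender`, `region`) and the `search` function are assumptions, not the Digital Futures or QuDEx schema, and the short `id` values stand in for real GUIDs.

```python
import xml.etree.ElementTree as ET

# Made-up transcript; element and attribute names are hypothetical,
# and the id values are placeholders for globally unique identifiers.
TRANSCRIPT = """
<interview id="int-001" age="52" gender="female" region="North West">
  <u id="u-0001" who="interviewer">How would you describe your health?</u>
  <u id="u-0002" who="respondent">My health has been good since I retired.</u>
  <u id="u-0003" who="respondent">I walk to the shops every day.</u>
</interview>
"""


def search(xml_text, term):
    """Return one hit per utterance containing `term`, with context and metadata."""
    root = ET.fromstring(xml_text.strip())
    hits = []
    for utterance in root.iter("u"):
        text = (utterance.text or "").strip()
        if term.lower() in text.lower():
            hits.append({
                "interview": root.get("id"),
                "utterance": utterance.get("id"),  # the citable identifier
                "speaker": utterance.get("who"),
                "context": text,                   # the surrounding sentence
                "age": root.get("age"),
                "gender": root.get("gender"),
                "region": root.get("region"),
            })
    return hits


results = search(TRANSCRIPT, "health")
for hit in results:
    print(hit["interview"], hit["utterance"], hit["speaker"], hit["context"])
```

Because each hit carries its own identifier, a citation could point at one utterance rather than at the whole collection.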