Improved Discoverability of Digital
Objects in Institutional Repositories
Using Controlled Vocabularies
University of Zambia
Lusaka, ZAMBIA
Bertha Chipangila · Eric Liswaniso · Andrew Mawila
Philomena Mwanza · Daisy Nawila · Robert M’sendo
Mayumbo Nyirenda · Lighton Phiri
ACM/IEEE Joint Conference on Digital Libraries (JCDL 2021)
September 27–30, 2021
September 27–30, 2021 ACM/IEEE Joint Conference on Digital Libraries (JCDL 2021)
About Us (1/2)
About Us (2/2)
● The DataLab research group at
The University of Zambia is
composed of faculty staff and
students—undergraduate and
postgraduate—working in three
main areas
○ Data Mining
○ Digital Libraries
○ Technology-Enhanced Learning
http://datalab.unza.zm
Outline
● Motivation
● Problem Statement
● Methodology
● Results and Discussion
● Conclusion and Future Work
There is an Ever-Increasing Amount of
Scholarly Research Generated
https://scholar.google.com
https://academic.microsoft.com
Discoverability Services Facilitate Findability
of Scholarly Research in IRs
http://open.uct.ac.za
http://dspace.unza.zm
Problem Statement
● There are numerous
inconsistencies with
digital object metadata
elements used to
describe subjects
○ Lack of use of controlled
vocabularies and subject
headings compromises
effective searching and
browsing of scholarly
research
Methodology
● Situational analysis to determine the implications of not using
controlled vocabularies
● Identification of subject-specific controlled vocabularies for
various disciplines
● Usability study comparing IRs integrated with controlled
vocabularies against IRs without them
● Implementation of a multi-label subject classification model
for classifying ACM CCS concepts and arXiv subjects
Methodology: Situational Analysis
● Dublin Core encoded
metadata harvested
from three repositories
○ NDLTD Union Catalog
○ University of Cape
Town Computer
Science Document
Archive
○ University of Zambia
Institutional
Repository
Methodology: Identification of Subjects and
Usability Study
● 7 faculty interviewed to
determine appropriate
controlled vocabularies
● DSpace-powered IRs
set up to conduct a
controlled comparative
study
○ IR #1: LCSH subjects
○ IR #2: No subjects
○ System Usability Scale
used to assess usability
● Within-subjects SUS study
conducted with 50
participants
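The System Usability Scale mentioned above has a standard scoring rule: ten items rated 1 to 5, where odd-numbered items contribute (rating - 1), even-numbered items contribute (5 - rating), and the sum is multiplied by 2.5 to give a score from 0 to 100. The sketch below applies that rule and averages scores per condition. The participant ratings and the function names are made up for illustration and are not data from the study.

```python
from statistics import mean


def sus_score(ratings):
    """Return the SUS score (0-100) for one participant's ten ratings (1-5)."""
    if len(ratings) != 10:
        raise ValueError("SUS needs exactly ten item ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("each rating must be between 1 and 5")
    total = 0
    for item_number, rating in enumerate(ratings, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (rating - 1) if item_number % 2 == 1 else (5 - rating)
    return total * 2.5


def mean_sus(all_ratings):
    """Average SUS score over a list of participants' ratings."""
    return mean(sus_score(r) for r in all_ratings)


# Hypothetical ratings for two participants in one condition.
example_condition = [
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
]
example_mean = mean_sus(example_condition)
```

In a within-subjects design each participant would yield one such score per repository, so the two condition means are computed over the same people.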
Methodology: Multi-label Subject Classifier
Input features: Title + Abstract
Estimator       Features   F1 Score   Hamming Loss   Jaccard Score
SGDClassifier   TF-IDF     0.540      0.005          0.431
[...]           [...]      [...]      [...]          [...]
● Multi-label subject classification model implemented using the
arXiv CoRR dataset and validated using the UCT CS
Document Archive
○ Titles and Abstracts used as input features
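The three measures reported in the table can be illustrated with a small, dependency-free sketch. The slides do not state which averaging was used, so the micro-averaged F1 and the per-sample Jaccard average below are assumptions, as are the function names and the toy label sets.

```python
def hamming_loss(true_sets, pred_sets, n_labels):
    """Fraction of (document, label) slots that are predicted wrongly."""
    wrong = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return wrong / (len(true_sets) * n_labels)


def jaccard_samples(true_sets, pred_sets):
    """Mean over documents of |true & pred| / |true | pred|."""
    scores = []
    for t, p in zip(true_sets, pred_sets):
        union = t | p
        # Convention chosen here: two empty label sets count as a perfect match.
        scores.append(len(t & p) / len(union) if union else 1.0)
    return sum(scores) / len(scores)


def micro_f1(true_sets, pred_sets):
    """F1 computed from true/false positives pooled over all labels."""
    tp = sum(len(t & p) for t, p in zip(true_sets, pred_sets))
    fp = sum(len(p - t) for t, p in zip(true_sets, pred_sets))
    fn = sum(len(t - p) for t, p in zip(true_sets, pred_sets))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy example with four possible labels A-D and three documents.
true_labels = [{"A", "B"}, {"C"}, {"A", "D"}]
pred_labels = [{"A"}, {"C", "D"}, {"A", "D"}]

loss = hamming_loss(true_labels, pred_labels, n_labels=4)
jaccard = jaccard_samples(true_labels, pred_labels)
f1 = micro_f1(true_labels, pred_labels)
```

A low Hamming loss alongside a moderate F1, as in the table, is typical when the label space is large and most labels are absent for most documents.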
Results and Discussion: Situational Analysis
(2/2)
● Analysis 1. Metadata preparation
and ingestion workflow is based on
internal policy
● Analysis 2. Subject headings are
used sparingly: 92.1% of tags are
associated with only one publication
● Analysis 3. Domain-specific
subject headings are not used.
Internally devised LCSH are used
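A figure like the share of tags associated with only one publication can be derived from harvested Dublin Core records roughly as follows. This is a minimal sketch, assuming each record is available as an XML string with `dc:subject` elements. The sample records and helper names are hypothetical, and the actual analysis pipeline is not described in the slides.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC = "{http://purl.org/dc/elements/1.1/}"


def subjects_of(record_xml):
    """Return the set of non-empty dc:subject values in one record."""
    root = ET.fromstring(record_xml)
    return {
        el.text.strip()
        for el in root.iter(DC + "subject")
        if el.text and el.text.strip()
    }


def singleton_share(records):
    """Share of distinct subject tags that occur in exactly one record."""
    counts = Counter(tag for rec in records for tag in subjects_of(rec))
    if not counts:
        return 0.0
    singles = sum(1 for n in counts.values() if n == 1)
    return singles / len(counts)


def make_record(*subjects):
    """Build a tiny hypothetical record for the example below."""
    items = "".join(f"<dc:subject>{s}</dc:subject>" for s in subjects)
    return f'<record xmlns:dc="http://purl.org/dc/elements/1.1/">{items}</record>'


records = [
    make_record("Digital libraries", "Metadata"),
    make_record("Digital libraries", "metadata quality"),
    make_record("HIV/AIDS"),
]
share = singleton_share(records)
```

Tags are compared as exact strings here, so spelling and case variants count as different tags, which is the kind of inconsistency a controlled vocabulary is meant to remove.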
Results and Discussion: Situational Analysis
(1/2)
● Incidentally, the
problem manifests
itself in other
repositories and
downstream services
Results and Discussion: Comparative
Analysis (1/2)
● Average SUS scores
○ Baseline: 66.2
○ Intervention: 68.9
Results and Discussion: Comparative
Analysis (2/2)
Results and Discussion: Multi-label Subject
Classification Model—Implementation
Input features: Title + Abstract
Estimator       Features   F1 Score   Hamming Loss   Jaccard Score
SGDClassifier   TF-IDF     0.540      0.005          0.431
[...]           [...]      [...]      [...]          [...]
● Approaches used: Binary Relevance, Classifier Chains and
One-Versus-Rest
● Estimators: MultinomialNB vs SGDClassifier vs RandomForest
● Features: TF vs TF-IDF; Title vs Abstract vs Title + Abstract
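One of the combinations listed above (One-Versus-Rest, SGDClassifier, TF-IDF, Title + Abstract) could be wired together in scikit-learn roughly as below. This is a sketch under stated assumptions, not the authors' implementation: the six records are invented stand-ins for arXiv CoRR entries, the hyperparameters are library defaults, and the variable names are illustrative.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

AI = "Computer Science - Artificial Intelligence"
CL = "Computer Science - Computation and Language"
DL = "Computer Science - Digital Libraries"
HC = "Computer Science - Human-Computer Interaction"

# Hypothetical (title, abstract, subjects) records, for illustration only.
records = [
    ("Neural machine translation for Bantu languages",
     "We train a sequence model to translate low-resource languages.", [AI, CL]),
    ("Usability of repository search interfaces",
     "A user study of browsing and searching in a digital library.", [DL, HC]),
    ("Heuristic planning for autonomous agents",
     "Agents learn heuristics that guide search during planning.", [AI]),
    ("Subject metadata quality in repositories",
     "We analyse subject metadata in institutional repositories.", [DL]),
    ("Part-of-speech tagging with small corpora",
     "A tagger for languages with limited annotated text.", [CL]),
    ("Touch interaction on mobile devices",
     "An evaluation of touch gestures with novice users.", [HC]),
]

# Title + Abstract as a single text feature per document.
texts = [f"{title} {abstract}" for title, abstract, _subjects in records]

# One binary indicator column per subject label.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform([subjects for _title, _abstract, subjects in records])

# TF-IDF features feeding one linear SGD classifier per label.
model = make_pipeline(
    TfidfVectorizer(),
    OneVsRestClassifier(SGDClassifier(random_state=0)),
)
model.fit(texts, Y)

predicted = model.predict(["Search interfaces for institutional repositories"])
labels = mlb.inverse_transform(predicted)
```

Swapping `SGDClassifier` for `MultinomialNB` or `RandomForestClassifier`, or `TfidfVectorizer` for `CountVectorizer`, would give the other estimator and feature variants named on the slide.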
Results and Discussion: Multi-label Subject
Classification Model—Validation
● Model evaluated using a CS
subject repository that
implements self-archiving
Results and Discussion: Multi-label Subject
Classification Model—Demonstration
ACM CCS: C.2.4 · D.2.11 · F.1.1 · H.3.4 · H.3.5 · H.5.2
arXiv: Computer Science - Artificial Intelligence · Computer Science - Computation and Language · Computer Science - General Literature · Computer Science - Human-Computer Interaction
● Six (6) ACM CCS and four (4) arXiv
subjects predicted by the model
Conclusions and Future Work
● Integrating IRs with subject controlled vocabularies can
potentially complement self-archiving and, additionally, has
the benefit of ensuring that IRs are usable and effective
● Potential future work and directions
○ Metadata cleaning, enhancement and augmentation of existing
descriptive metadata
○ Implementation of subject classification models for other
subject-specific controlled vocabularies
○ Automatic generation of subject classes for large-scale
repositories such as the NDLTD Union Catalog
Q & A Session
● Comments, concerns and complaints?
[1] Phiri, L. (2018). Research Visibility in the Global South: Towards
Increased Online Visibility of Scholarly Research Output in
Zambia. IEEE International Conference in Information and
Communication Technologies.
[2] Phiri, L. (2020). A Multi-Faceted Multi-Stakeholder Approach for
Increased Visibility of ETDs in Zambia. Cadernos BAD, (1).
DOI: 10.1017/S0269888910000032
[3] Phiri, L. (2020). Automatic classification of digital objects for
improved metadata quality of electronic theses and dissertations
in institutional repositories. International Journal of Metadata,
Semantics and Ontologies, 14(3), 234-248.
DOI: 10.1504/IJMSO.2020.112804
Bibliography
lighton.phiri@unza.zm
http://datalab.unza.zm
http://lis.unza.zm/~lightonphiri
Improved Discoverability of Digital
Objects in Institutional Repositories
Using Controlled Vocabularies
University of Zambia
Lusaka, ZAMBIA
Bertha Chipangila · Eric Liswaniso · Andrew Mawila
Philomena Mwanza · Daisy Nawila · Robert M’sendo
Mayumbo Nyirenda · Lighton Phiri
ACM/IEEE Joint Conference on Digital Libraries (JCDL 2021)
September 27–30, 2021

More Related Content

Similar to Improved Discoverability of Digital Objects in Institutional Repositories Using Controlled Vocabularies

Ppt tale kn_intro_final
Ppt tale kn_intro_finalPpt tale kn_intro_final
Ppt tale kn_intro_finalManuel Castro
 
Smithies bodleian 2017_v.2.0
Smithies bodleian 2017_v.2.0Smithies bodleian 2017_v.2.0
Smithies bodleian 2017_v.2.0jamessmithies
 
Growing the Knowledge Tree: Core concepts, methods, outcomes, and tools
Growing the Knowledge Tree: Core concepts, methods, outcomes, and toolsGrowing the Knowledge Tree: Core concepts, methods, outcomes, and tools
Growing the Knowledge Tree: Core concepts, methods, outcomes, and toolsLeonel Morgado
 
Factors Influencing Co-Creation of Open Education Resources Using Learning Ob...
Factors Influencing Co-Creation of Open Education Resources Using Learning Ob...Factors Influencing Co-Creation of Open Education Resources Using Learning Ob...
Factors Influencing Co-Creation of Open Education Resources Using Learning Ob...Lighton Phiri
 
Scottish UPA Meeting 20/04/10
Scottish UPA Meeting 20/04/10Scottish UPA Meeting 20/04/10
Scottish UPA Meeting 20/04/10Lorraine Paterson
 
MULTI-LEARNING SPECIAL SESSION / EDUCON 2018 / EMADRID TEAM
MULTI-LEARNING SPECIAL SESSION / EDUCON 2018 / EMADRID TEAMMULTI-LEARNING SPECIAL SESSION / EDUCON 2018 / EMADRID TEAM
MULTI-LEARNING SPECIAL SESSION / EDUCON 2018 / EMADRID TEAMeMadrid network
 
EADTU 2018 conference MECA project
EADTU 2018 conference MECA project EADTU 2018 conference MECA project
EADTU 2018 conference MECA project Manuel Castro
 
20200408_210832.pptx
20200408_210832.pptx20200408_210832.pptx
20200408_210832.pptxeceschmidt
 
Managing and Testing Ensembles of IoT, Network functions, and Clouds
Managing and Testing Ensembles of IoT, Network functions, and CloudsManaging and Testing Ensembles of IoT, Network functions, and Clouds
Managing and Testing Ensembles of IoT, Network functions, and CloudsHong-Linh Truong
 
ENVRIPLUS Data for Science Theme
ENVRIPLUS Data for Science ThemeENVRIPLUS Data for Science Theme
ENVRIPLUS Data for Science ThemeEUDAT
 
Session 3: Vocabulary enrichment, Gerda Koch
Session 3: Vocabulary enrichment, Gerda KochSession 3: Vocabulary enrichment, Gerda Koch
Session 3: Vocabulary enrichment, Gerda Kochlocloud
 
[MM2023] Ducho: A Unified Framework for the Extraction of Multimodal Features...
[MM2023] Ducho: A Unified Framework for the Extraction of Multimodal Features...[MM2023] Ducho: A Unified Framework for the Extraction of Multimodal Features...
[MM2023] Ducho: A Unified Framework for the Extraction of Multimodal Features...Daniele Malitesta
 
Towards a Knowledge Graph for a Research Group with Focus on Qualitative Anal...
Towards a Knowledge Graph for a Research Group with Focus on Qualitative Anal...Towards a Knowledge Graph for a Research Group with Focus on Qualitative Anal...
Towards a Knowledge Graph for a Research Group with Focus on Qualitative Anal...Vera G. Meister
 
A history of clu
A history of cluA history of clu
A history of clusugeladi
 
Technical Appraisal of Complex Digital Objects in Evolving Environments - IDC...
Technical Appraisal of Complex Digital Objects in Evolving Environments - IDC...Technical Appraisal of Complex Digital Objects in Evolving Environments - IDC...
Technical Appraisal of Complex Digital Objects in Evolving Environments - IDC...PERICLES_FP7
 
Needs and Training in Microelectronics Courses in the MicroElectronics Cloud ...
Needs and Training in Microelectronics Courses in the MicroElectronics Cloud ...Needs and Training in Microelectronics Courses in the MicroElectronics Cloud ...
Needs and Training in Microelectronics Courses in the MicroElectronics Cloud ...Manuel Castro
 

Similar to Improved Discoverability of Digital Objects in Institutional Repositories Using Controlled Vocabularies (20)

Students guide mec mat_2021_2022
Students guide mec mat_2021_2022Students guide mec mat_2021_2022
Students guide mec mat_2021_2022
 
Ppt tale kn_intro_final
Ppt tale kn_intro_finalPpt tale kn_intro_final
Ppt tale kn_intro_final
 
Smithies bodleian 2017_v.2.0
Smithies bodleian 2017_v.2.0Smithies bodleian 2017_v.2.0
Smithies bodleian 2017_v.2.0
 
Growing the Knowledge Tree: Core concepts, methods, outcomes, and tools
Growing the Knowledge Tree: Core concepts, methods, outcomes, and toolsGrowing the Knowledge Tree: Core concepts, methods, outcomes, and tools
Growing the Knowledge Tree: Core concepts, methods, outcomes, and tools
 
Factors Influencing Co-Creation of Open Education Resources Using Learning Ob...
Factors Influencing Co-Creation of Open Education Resources Using Learning Ob...Factors Influencing Co-Creation of Open Education Resources Using Learning Ob...
Factors Influencing Co-Creation of Open Education Resources Using Learning Ob...
 
Scottish UPA Meeting 20/04/10
Scottish UPA Meeting 20/04/10Scottish UPA Meeting 20/04/10
Scottish UPA Meeting 20/04/10
 
MULTI-LEARNING SPECIAL SESSION / EDUCON 2018 / EMADRID TEAM
MULTI-LEARNING SPECIAL SESSION / EDUCON 2018 / EMADRID TEAMMULTI-LEARNING SPECIAL SESSION / EDUCON 2018 / EMADRID TEAM
MULTI-LEARNING SPECIAL SESSION / EDUCON 2018 / EMADRID TEAM
 
EADTU 2018 conference MECA project
EADTU 2018 conference MECA project EADTU 2018 conference MECA project
EADTU 2018 conference MECA project
 
drc 3-2.pptx
drc 3-2.pptxdrc 3-2.pptx
drc 3-2.pptx
 
20200408_210832.pptx
20200408_210832.pptx20200408_210832.pptx
20200408_210832.pptx
 
Enase20.ppt
Enase20.pptEnase20.ppt
Enase20.ppt
 
Managing and Testing Ensembles of IoT, Network functions, and Clouds
Managing and Testing Ensembles of IoT, Network functions, and CloudsManaging and Testing Ensembles of IoT, Network functions, and Clouds
Managing and Testing Ensembles of IoT, Network functions, and Clouds
 
Shamane-PhD-Defence-Final.pptx
Shamane-PhD-Defence-Final.pptxShamane-PhD-Defence-Final.pptx
Shamane-PhD-Defence-Final.pptx
 
ENVRIPLUS Data for Science Theme
ENVRIPLUS Data for Science ThemeENVRIPLUS Data for Science Theme
ENVRIPLUS Data for Science Theme
 
Session 3: Vocabulary enrichment, Gerda Koch
Session 3: Vocabulary enrichment, Gerda KochSession 3: Vocabulary enrichment, Gerda Koch
Session 3: Vocabulary enrichment, Gerda Koch
 
[MM2023] Ducho: A Unified Framework for the Extraction of Multimodal Features...
[MM2023] Ducho: A Unified Framework for the Extraction of Multimodal Features...[MM2023] Ducho: A Unified Framework for the Extraction of Multimodal Features...
[MM2023] Ducho: A Unified Framework for the Extraction of Multimodal Features...
 
Towards a Knowledge Graph for a Research Group with Focus on Qualitative Anal...
Towards a Knowledge Graph for a Research Group with Focus on Qualitative Anal...Towards a Knowledge Graph for a Research Group with Focus on Qualitative Anal...
Towards a Knowledge Graph for a Research Group with Focus on Qualitative Anal...
 
A history of clu
A history of cluA history of clu
A history of clu
 
Technical Appraisal of Complex Digital Objects in Evolving Environments - IDC...
Technical Appraisal of Complex Digital Objects in Evolving Environments - IDC...Technical Appraisal of Complex Digital Objects in Evolving Environments - IDC...
Technical Appraisal of Complex Digital Objects in Evolving Environments - IDC...
 
Needs and Training in Microelectronics Courses in the MicroElectronics Cloud ...
Needs and Training in Microelectronics Courses in the MicroElectronics Cloud ...Needs and Training in Microelectronics Courses in the MicroElectronics Cloud ...
Needs and Training in Microelectronics Courses in the MicroElectronics Cloud ...
 

More from Lighton Phiri

Enterprise Medical Imaging for Streamlined Radiological Diagnosis in Zambian...
Enterprise Medical Imaging for Streamlined Radiological Diagnosis  in Zambian...Enterprise Medical Imaging for Streamlined Radiological Diagnosis  in Zambian...
Enterprise Medical Imaging for Streamlined Radiological Diagnosis in Zambian...Lighton Phiri
 
User Centred Design and Implementation of Useful Picture Archiving and Commun...
User Centred Design and Implementation of Useful Picture Archiving and Commun...User Centred Design and Implementation of Useful Picture Archiving and Commun...
User Centred Design and Implementation of Useful Picture Archiving and Commun...Lighton Phiri
 
Enterprise Medical Imaging for Improved Radiological Workflows in Zambian Pub...
Enterprise Medical Imaging for Improved Radiological Workflows in Zambian Pub...Enterprise Medical Imaging for Improved Radiological Workflows in Zambian Pub...
Enterprise Medical Imaging for Improved Radiological Workflows in Zambian Pub...Lighton Phiri
 
Enterprise Medical Imaging in Public Health Facilities in Zambia: Towards a U...
Enterprise Medical Imaging in Public Health Facilities in Zambia: Towards a U...Enterprise Medical Imaging in Public Health Facilities in Zambia: Towards a U...
Enterprise Medical Imaging in Public Health Facilities in Zambia: Towards a U...Lighton Phiri
 
Enterprise Medical Imaging in the Global South: Challenges and Opportunities
Enterprise Medical Imaging in the Global South: Challenges and OpportunitiesEnterprise Medical Imaging in the Global South: Challenges and Opportunities
Enterprise Medical Imaging in the Global South: Challenges and OpportunitiesLighton Phiri
 
DRGS OJS Training: Electronic Publishing Using Open Journal Systems
DRGS OJS Training: Electronic Publishing Using Open Journal SystemsDRGS OJS Training: Electronic Publishing Using Open Journal Systems
DRGS OJS Training: Electronic Publishing Using Open Journal SystemsLighton Phiri
 
OJS Training: Users and User Roles
OJS Training: Users and User RolesOJS Training: Users and User Roles
OJS Training: Users and User RolesLighton Phiri
 
OJS Training: Journal Settings and Configuration
OJS Training: Journal Settings and ConfigurationOJS Training: Journal Settings and Configuration
OJS Training: Journal Settings and ConfigurationLighton Phiri
 
OJS Training: Managing The Submission Process
OJS Training: Managing The Submission ProcessOJS Training: Managing The Submission Process
OJS Training: Managing The Submission ProcessLighton Phiri
 
OJS Training: Creating and Managing Journal Issues
OJS Training: Creating and Managing Journal IssuesOJS Training: Creating and Managing Journal Issues
OJS Training: Creating and Managing Journal IssuesLighton Phiri
 
Institutional Repository Single Sources of Truth
Institutional Repository Single Sources of TruthInstitutional Repository Single Sources of Truth
Institutional Repository Single Sources of TruthLighton Phiri
 
Improved Scholarly Communication Using Machine Learning
Improved Scholarly Communication Using Machine LearningImproved Scholarly Communication Using Machine Learning
Improved Scholarly Communication Using Machine LearningLighton Phiri
 
Open Access Electronic Publishing for Increased Online Visibility: Tooling Ch...
Open Access Electronic Publishing for Increased Online Visibility: Tooling Ch...Open Access Electronic Publishing for Increased Online Visibility: Tooling Ch...
Open Access Electronic Publishing for Increased Online Visibility: Tooling Ch...Lighton Phiri
 
A Multi-Faceted Multi-Stakeholder Approach for Increased Visibility of ETDs i...
A Multi-Faceted Multi-Stakeholder Approach for Increased Visibility of ETDs i...A Multi-Faceted Multi-Stakeholder Approach for Increased Visibility of ETDs i...
A Multi-Faceted Multi-Stakeholder Approach for Increased Visibility of ETDs i...Lighton Phiri
 
A Multi-Faceted Multi-Stakeholder Approach for Increased Visibility of ETDs i...
A Multi-Faceted Multi-Stakeholder Approach for Increased Visibility of ETDs i...A Multi-Faceted Multi-Stakeholder Approach for Increased Visibility of ETDs i...
A Multi-Faceted Multi-Stakeholder Approach for Increased Visibility of ETDs i...Lighton Phiri
 
Post PhD Transition Experience: Successes and Challenges
Post PhD Transition Experience: Successes and ChallengesPost PhD Transition Experience: Successes and Challenges
Post PhD Transition Experience: Successes and ChallengesLighton Phiri
 
Technology-Enhanced Learning for Improved Quality of Teaching and Learning
Technology-Enhanced Learning for Improved Quality of Teaching and LearningTechnology-Enhanced Learning for Improved Quality of Teaching and Learning
Technology-Enhanced Learning for Improved Quality of Teaching and LearningLighton Phiri
 
Research Visibility in the Global South: Towards Increased Online Visibility...
Research Visibility  in the Global South: Towards Increased Online Visibility...Research Visibility  in the Global South: Towards Increased Online Visibility...
Research Visibility in the Global South: Towards Increased Online Visibility...Lighton Phiri
 
Ph.D Research Proposal: Software Tools for Orchestration
Ph.D Research Proposal: Software Tools for OrchestrationPh.D Research Proposal: Software Tools for Orchestration
Ph.D Research Proposal: Software Tools for OrchestrationLighton Phiri
 
Research Visibility in the Global South: Towards Increased Online Visibility ...
Research Visibility in the Global South: Towards Increased Online Visibility ...Research Visibility in the Global South: Towards Increased Online Visibility ...
Research Visibility in the Global South: Towards Increased Online Visibility ...Lighton Phiri
 

More from Lighton Phiri (20)

Enterprise Medical Imaging for Streamlined Radiological Diagnosis in Zambian...
Enterprise Medical Imaging for Streamlined Radiological Diagnosis  in Zambian...Enterprise Medical Imaging for Streamlined Radiological Diagnosis  in Zambian...
Enterprise Medical Imaging for Streamlined Radiological Diagnosis in Zambian...
 
User Centred Design and Implementation of Useful Picture Archiving and Commun...
User Centred Design and Implementation of Useful Picture Archiving and Commun...User Centred Design and Implementation of Useful Picture Archiving and Commun...
User Centred Design and Implementation of Useful Picture Archiving and Commun...
 
Enterprise Medical Imaging for Improved Radiological Workflows in Zambian Pub...
Enterprise Medical Imaging for Improved Radiological Workflows in Zambian Pub...Enterprise Medical Imaging for Improved Radiological Workflows in Zambian Pub...
Enterprise Medical Imaging for Improved Radiological Workflows in Zambian Pub...
 
Enterprise Medical Imaging in Public Health Facilities in Zambia: Towards a U...
Enterprise Medical Imaging in Public Health Facilities in Zambia: Towards a U...Enterprise Medical Imaging in Public Health Facilities in Zambia: Towards a U...
Enterprise Medical Imaging in Public Health Facilities in Zambia: Towards a U...
 
Enterprise Medical Imaging in the Global South: Challenges and Opportunities
Enterprise Medical Imaging in the Global South: Challenges and OpportunitiesEnterprise Medical Imaging in the Global South: Challenges and Opportunities
Enterprise Medical Imaging in the Global South: Challenges and Opportunities
 
DRGS OJS Training: Electronic Publishing Using Open Journal Systems
DRGS OJS Training: Electronic Publishing Using Open Journal SystemsDRGS OJS Training: Electronic Publishing Using Open Journal Systems
DRGS OJS Training: Electronic Publishing Using Open Journal Systems
 
OJS Training: Users and User Roles
OJS Training: Users and User RolesOJS Training: Users and User Roles
OJS Training: Users and User Roles
 
OJS Training: Journal Settings and Configuration
OJS Training: Journal Settings and ConfigurationOJS Training: Journal Settings and Configuration
OJS Training: Journal Settings and Configuration
 
OJS Training: Managing The Submission Process
OJS Training: Managing The Submission ProcessOJS Training: Managing The Submission Process
OJS Training: Managing The Submission Process
 
OJS Training: Creating and Managing Journal Issues
OJS Training: Creating and Managing Journal IssuesOJS Training: Creating and Managing Journal Issues
OJS Training: Creating and Managing Journal Issues
 
Institutional Repository Single Sources of Truth
Institutional Repository Single Sources of TruthInstitutional Repository Single Sources of Truth
Institutional Repository Single Sources of Truth
 
Improved Scholarly Communication Using Machine Learning
Improved Scholarly Communication Using Machine LearningImproved Scholarly Communication Using Machine Learning
Improved Scholarly Communication Using Machine Learning
 
Open Access Electronic Publishing for Increased Online Visibility: Tooling Ch...
Open Access Electronic Publishing for Increased Online Visibility: Tooling Ch...Open Access Electronic Publishing for Increased Online Visibility: Tooling Ch...
Open Access Electronic Publishing for Increased Online Visibility: Tooling Ch...
 
A Multi-Faceted Multi-Stakeholder Approach for Increased Visibility of ETDs i...
A Multi-Faceted Multi-Stakeholder Approach for Increased Visibility of ETDs i...A Multi-Faceted Multi-Stakeholder Approach for Increased Visibility of ETDs i...
A Multi-Faceted Multi-Stakeholder Approach for Increased Visibility of ETDs i...
 
A Multi-Faceted Multi-Stakeholder Approach for Increased Visibility of ETDs i...
A Multi-Faceted Multi-Stakeholder Approach for Increased Visibility of ETDs i...A Multi-Faceted Multi-Stakeholder Approach for Increased Visibility of ETDs i...
A Multi-Faceted Multi-Stakeholder Approach for Increased Visibility of ETDs i...
 
Post PhD Transition Experience: Successes and Challenges
Post PhD Transition Experience: Successes and ChallengesPost PhD Transition Experience: Successes and Challenges
Post PhD Transition Experience: Successes and Challenges
 
Technology-Enhanced Learning for Improved Quality of Teaching and Learning
Technology-Enhanced Learning for Improved Quality of Teaching and LearningTechnology-Enhanced Learning for Improved Quality of Teaching and Learning
Technology-Enhanced Learning for Improved Quality of Teaching and Learning
 
Research Visibility in the Global South: Towards Increased Online Visibility...
Research Visibility  in the Global South: Towards Increased Online Visibility...Research Visibility  in the Global South: Towards Increased Online Visibility...
Research Visibility in the Global South: Towards Increased Online Visibility...
 
Ph.D Research Proposal: Software Tools for Orchestration
Ph.D Research Proposal: Software Tools for OrchestrationPh.D Research Proposal: Software Tools for Orchestration
Ph.D Research Proposal: Software Tools for Orchestration
 
Research Visibility in the Global South: Towards Increased Online Visibility ...
Research Visibility in the Global South: Towards Increased Online Visibility ...Research Visibility in the Global South: Towards Increased Online Visibility ...
Research Visibility in the Global South: Towards Increased Online Visibility ...
 

Recently uploaded

Vinícius Portella In Media Res Media Component
Vinícius Portella In Media Res Media ComponentVinícius Portella In Media Res Media Component
Vinícius Portella In Media Res Media ComponentInMediaRes1
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfPrerana Jadhav
 
Shark introduction Morphology and its behaviour characteristics
Shark introduction Morphology and its behaviour characteristicsShark introduction Morphology and its behaviour characteristics
Shark introduction Morphology and its behaviour characteristicsArubSultan
 
16. Discovery, function and commercial uses of different PGRS.pptx
16. Discovery, function and commercial uses of different PGRS.pptx16. Discovery, function and commercial uses of different PGRS.pptx
16. Discovery, function and commercial uses of different PGRS.pptxUmeshTimilsina1
 
Jason Potel In Media Res Media Component
Jason Potel In Media Res Media ComponentJason Potel In Media Res Media Component
Jason Potel In Media Res Media ComponentInMediaRes1
 
4.9.24 Social Capital and Social Exclusion.pptx
4.9.24 Social Capital and Social Exclusion.pptx4.9.24 Social Capital and Social Exclusion.pptx
4.9.24 Social Capital and Social Exclusion.pptxmary850239
 
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxkarenfajardo43
 
Farrington HS Streamlines Guest Entrance
Farrington HS Streamlines Guest EntranceFarrington HS Streamlines Guest Entrance
Farrington HS Streamlines Guest Entrancejulius27264
 
How to create _name_search function in odoo 17
How to create _name_search function in odoo 17How to create _name_search function in odoo 17
How to create _name_search function in odoo 17Celine George
 
Transdisciplinary Pathways for Urban Resilience [Work in Progress].pptx
Transdisciplinary Pathways for Urban Resilience [Work in Progress].pptxTransdisciplinary Pathways for Urban Resilience [Work in Progress].pptx
Transdisciplinary Pathways for Urban Resilience [Work in Progress].pptxinfo924062
 
How to Uninstall a Module in Odoo 17 Using Command Line
How to Uninstall a Module in Odoo 17 Using Command LineHow to Uninstall a Module in Odoo 17 Using Command Line
How to Uninstall a Module in Odoo 17 Using Command LineCeline George
 
Indexing Structures in Database Management system.pdf
Indexing Structures in Database Management system.pdfIndexing Structures in Database Management system.pdf
Indexing Structures in Database Management system.pdfChristalin Nelson
 
The role of Geography in climate education: science and active citizenship
The role of Geography in climate education: science and active citizenshipThe role of Geography in climate education: science and active citizenship
The role of Geography in climate education: science and active citizenshipKarl Donert
 
Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17Celine George
 
Healthy Minds, Flourishing Lives: A Philosophical Approach to Mental Health a...
Healthy Minds, Flourishing Lives: A Philosophical Approach to Mental Health a...Healthy Minds, Flourishing Lives: A Philosophical Approach to Mental Health a...
Healthy Minds, Flourishing Lives: A Philosophical Approach to Mental Health a...Osopher
 
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - I-LEARN SMART WORLD - CẢ NĂM - CÓ FILE NGHE (BẢN...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - I-LEARN SMART WORLD - CẢ NĂM - CÓ FILE NGHE (BẢN...BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - I-LEARN SMART WORLD - CẢ NĂM - CÓ FILE NGHE (BẢN...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - I-LEARN SMART WORLD - CẢ NĂM - CÓ FILE NGHE (BẢN...Nguyen Thanh Tu Collection
 

Improved Discoverability of Digital Objects in Institutional Repositories Using Controlled Vocabularies

  • 1. Improved Discoverability of Digital Objects in Institutional Repositories Using Controlled Vocabularies
      University of Zambia, Lusaka, ZAMBIA
      Bertha Chipangila · Eric Liswaniso · Andrew Mawila · Philomena Mwanza · Daisy Nawila · Robert M’sendo · Mayumbo Nyirenda · Lighton Phiri
      ACM/IEEE Joint Conference on Digital Libraries (JCDL 2021), September 27–30, 2021
  • 2. About Us (1/2)
  • 3. About Us (2/2)
      ● The DataLab research group at The University of Zambia is composed of faculty staff and students (undergraduate and postgraduate) working in three main areas
        ○ Data Mining
        ○ Digital Libraries
        ○ Technology-Enhanced Learning
      http://datalab.unza.zm
  • 4. Outline
      ● Motivation
      ● Problem Statement
      ● Methodology
      ● Results and Discussion
      ● Conclusion and Future Work
  • 5. There is an Ever Increasing Amount of Scholarly Research Generated
      https://scholar.google.com
  • 6. There is an Ever Increasing Amount of Scholarly Research Generated
      https://academic.microsoft.com
  • 7. Discoverability Services Facilitate Findability of Scholarly Research in IRs
      http://open.uct.ac.za
      http://dspace.unza.zm
  • 8–10. Problem Statement
      ● There are numerous inconsistencies in the digital object metadata elements used to describe subjects
        ○ Lack of use of controlled vocabularies and subject headings compromises effective searching and browsing of scholarly research
  • 11. Methodology
      ● Situational analysis to determine the implications of non-use of controlled vocabularies
      ● Identification of subject-specific controlled vocabularies for various disciplines
      ● Usability study comparing IRs integrated with controlled vocabularies against IRs without controlled vocabularies
      ● Implementation of a multi-label subject classification model for classifying ACM CCS concepts and arXiv subjects
  • 12. Methodology: Situational Analysis
      ● Dublin Core encoded metadata harvested from three repositories
        ○ NDLTD Union Catalog
        ○ University of Cape Town Computer Science Document Archive
        ○ University of Zambia Institutional Repository
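To make the harvesting step concrete, the sketch below pulls the subject values out of a single Dublin Core record. It assumes the record is serialised in the common oai_dc XML form; the slides do not say which serialisation or harvesting protocol was used. `SAMPLE_RECORD` and `extract_subjects` are hypothetical names invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespaces of the oai_dc serialisation of Dublin Core (an assumption about
# how the harvested records are encoded).
NS = {
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical record, invented for illustration only.
SAMPLE_RECORD = """
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example thesis title</dc:title>
  <dc:subject>Digital libraries</dc:subject>
  <dc:subject>  Metadata </dc:subject>
  <dc:subject></dc:subject>
</oai_dc:dc>
"""


def extract_subjects(record_xml: str) -> list[str]:
    """Return the non-empty dc:subject values of one Dublin Core record."""
    root = ET.fromstring(record_xml)
    subjects = []
    for element in root.findall("dc:subject", NS):
        text = (element.text or "").strip()
        if text:  # skip empty subject elements
            subjects.append(text)
    return subjects


print(extract_subjects(SAMPLE_RECORD))
```

Collecting these lists across all harvested records gives the raw material for analysing how consistently subjects are assigned.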
  • 13–14. Methodology: Identification of Subjects and Usability Study
      ● 7 faculty interviewed to determine appropriate controlled vocabularies
      ● DSpace-powered IRs set up to conduct a controlled comparative study
        ○ IR #1: LCSH subjects
        ○ IR #2: No subjects
        ○ System Usability Scale used to assess usability
      ● Within-subject SUS study conducted with 50 participants
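For readers unfamiliar with the System Usability Scale, the sketch below applies the standard SUS scoring rule to one participant's ten answers (each on a 1 to 5 scale). The function names and the sample answers are hypothetical; the slides report only the averaged scores, not the per-participant data.

```python
def sus_score(responses: list[int]) -> float:
    """Score one SUS questionnaire (10 items, each rated 1 to 5) on a 0-100 scale."""
    if len(responses) != 10 or any(r < 1 or r > 5 for r in responses):
        raise ValueError("SUS needs exactly 10 ratings between 1 and 5")
    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += rating - 1   # odd (positively worded) items
        else:
            total += 5 - rating   # even (negatively worded) items
    return total * 2.5


def mean_sus(all_responses: list[list[int]]) -> float:
    """Average SUS score over several participants."""
    scores = [sus_score(r) for r in all_responses]
    return sum(scores) / len(scores)


# Hypothetical answers from two participants, for illustration only.
participants = [
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
]
print(mean_sus(participants))
```

In a within-subject design each participant would fill in one questionnaire per repository, so `mean_sus` would be called once for the baseline IR and once for the IR with subjects.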
  • 15. Methodology: Multi-label Subject Classifier
      Input: Title + Abstract
      Estimator | Features | F1 Score | Hamming Loss | Jaccard Score
      SGDClassifier | TF-IDF | 0.540 | 0.005 | 0.431
      [...] | [...] | [...] | [...] | [...]
      ● Multi-label subject classification model implemented using the arXiv CoRR dataset and validated using the UCT CS Document Archive
        ○ Titles and Abstracts used as input features
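The three metrics in the table can be illustrated with a small pure-Python sketch over label sets. The averaging modes are assumptions (micro-averaged F1, sample-averaged Jaccard), since the slide does not state which variants were reported, and the label sets are invented for illustration.

```python
def hamming_loss(y_true: list[set], y_pred: list[set], n_labels: int) -> float:
    """Fraction of (document, label) decisions that are wrong."""
    wrong = sum(len(t ^ p) for t, p in zip(y_true, y_pred))
    return wrong / (len(y_true) * n_labels)


def jaccard_samples(y_true: list[set], y_pred: list[set]) -> float:
    """Mean per-document overlap |T & P| / |T | P| (sample averaging assumed)."""
    scores = []
    for t, p in zip(y_true, y_pred):
        union = t | p
        # Design choice: two empty label sets count as a perfect match.
        scores.append(len(t & p) / len(union) if union else 1.0)
    return sum(scores) / len(scores)


def micro_f1(y_true: list[set], y_pred: list[set]) -> float:
    """F1 computed from label decisions pooled over all documents."""
    tp = sum(len(t & p) for t, p in zip(y_true, y_pred))
    fp = sum(len(p - t) for t, p in zip(y_true, y_pred))
    fn = sum(len(t - p) for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical true and predicted arXiv-style labels for three documents.
y_true = [{"cs.AI", "cs.CL"}, {"cs.HC"}, {"cs.DL", "cs.HC"}]
y_pred = [{"cs.AI"}, {"cs.HC", "cs.DL"}, {"cs.DL", "cs.HC"}]

print(micro_f1(y_true, y_pred))
print(hamming_loss(y_true, y_pred, n_labels=4))
print(jaccard_samples(y_true, y_pred))
```

Hamming loss tends to look very small when the label space is large and each document carries only a few labels, which is worth keeping in mind when reading it next to F1 and Jaccard.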
  • 16. Results and Discussion: Situational Analysis (2/2)
      ● Analysis 1. Metadata preparation and ingestion workflow is based on internal policy
      ● Analysis 2. Subject headings are used sparingly: 92.1% of tags are associated with only one publication
      ● Analysis 3. Domain-specific subject headings are not used. Internally devised LCSH used
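A figure such as the share of tags attached to a single publication could be computed along the following lines. This is a minimal sketch: the normalisation (trimming and lower-casing) is an assumption, the slides do not describe how tags were compared, and the records are invented.

```python
from collections import Counter


def singleton_tag_share(records: list[list[str]]) -> float:
    """Fraction of distinct subject tags attached to exactly one record."""
    counts: Counter = Counter()
    for subjects in records:
        # Count each tag once per record, after light normalisation (assumed).
        counts.update({s.strip().lower() for s in subjects if s.strip()})
    if not counts:
        return 0.0
    singletons = sum(1 for n in counts.values() if n == 1)
    return singletons / len(counts)


# Hypothetical subject lists, one per repository record.
records = [
    ["Digital libraries", "Metadata"],
    ["digital libraries", "HIV/AIDS"],
    ["Maize farming"],
    ["Metadata quality"],
]
share = singleton_tag_share(records)
print(f"{share:.1%} of tags appear on only one record")
```

A high share indicates that subject tags rarely link related items, which limits their value for browsing.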
  • 17–18. Results and Discussion: Situational Analysis (1/2)
      ● Incidentally, the problem manifests itself in other repositories and downstream services
  • 19–20. Results and Discussion: Comparative Analysis (1/2)
      ● SUS average scores
        ○ [66.2] Baseline
        ○ [68.9] Intervention
  • 21–22. Results and Discussion: Comparative Analysis (2/2)
  • 23. Results and Discussion: Multi-label Subject Classification Model (Implementation)
      Input: Title + Abstract
      Estimator | Features | F1 Score | Hamming Loss | Jaccard Score
      SGDClassifier | TF-IDF | 0.540 | 0.005 | 0.431
      [...] | [...] | [...] | [...] | [...]
      ● Approaches used: Binary Relevance, Classifier Chains and One-Versus-Rest
      ● Estimators: MultinomialNB vs SGDClassifier vs RandomForest
      ● Features: TF vs TF-IDF; Title vs Abstract vs Title + Abstract
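The TF versus TF-IDF comparison can be illustrated with a small pure-Python sketch. It uses whitespace tokenisation and one common smoothed IDF variant, both of which are assumptions; the slides do not specify the exact weighting or preprocessing, and the documents below are invented.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Naive lower-casing whitespace tokeniser (an assumption)."""
    return text.lower().split()


def term_frequencies(docs: list[str]) -> list[Counter]:
    """TF features: raw term counts per document."""
    return [Counter(tokenize(d)) for d in docs]


def tf_idf(docs: list[str]) -> list[dict[str, float]]:
    """TF-IDF features: counts scaled by a smoothed inverse document frequency."""
    tfs = term_frequencies(docs)
    n_docs = len(docs)
    doc_freq: Counter = Counter()
    for tf in tfs:
        doc_freq.update(tf.keys())  # each term counted once per document
    idf = {
        term: math.log((1 + n_docs) / (1 + df)) + 1.0
        for term, df in doc_freq.items()
    }
    return [{term: freq * idf[term] for term, freq in tf.items()} for tf in tfs]


# Hypothetical "title + abstract" strings, for illustration only.
docs = [
    "digital libraries metadata",
    "metadata quality metadata",
    "usability of digital libraries",
]
tfs = term_frequencies(docs)
weights = tf_idf(docs)
print(tfs[1]["metadata"], round(weights[1]["quality"], 4))
```

With TF alone, a frequent but widespread term dominates a document's features; TF-IDF boosts terms that occur in few documents, such as "quality" here.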
  • 24. Results and Discussion: Multi-label Subject Classification Model (Validation)
      ● Model evaluated using a CS subject repository with self-archiving implemented
  • 25. Results and Discussion: Multi-label Subject Classification Model (Demonstration)
      ACM CCS: C.2.4 · D.2.11 · F.1.1 · H.3.4 · H.3.5 · H.5.2
      arXiv: Computer Science - Artificial Intelligence · Computer Science - Computation and Language · Computer Science - General Literature · Computer Science - Human-Computer Interaction
      ● Six (6) ACM CCS and four (4) arXiv subjects predicted by the model
  • 26. Conclusions and Future Work
      ● Integrating IRs with subject controlled vocabularies can potentially complement self-archiving and, additionally, has the benefit of ensuring that IRs are usable and effective
      ● Potential future work and directions
        ○ Metadata cleaning, enhancement and augmentation of existing descriptive metadata
        ○ Implementation of subject classification models for other subject-specific controlled vocabularies
        ○ Automatic generation of subject classes for large-scale repositories such as the NDLTD Union Catalog
  • 27. Q & A Session
      ● Comments, concerns and complaints?