AN IN-DEPTH ANALYSIS OF TAGS AND CONTROLLED
METADATA FOR BOOK SEARCH
TOINE BOGERS
VIVIEN PETRAS
MARCH 23, 2017
iCONFERENCE 2017
OUTLINE
▸ Introduction
▸ Methodology & Experimental Setup
▸ Analysis
– Tags vs. Controlled Vocabularies
– Book Search Requests
– Failure Analysis
▸ Conclusions & Future Work
INTRODUCTION
MOTIVATION
▸ Readers often struggle with existing systems (e.g., library
catalogs, Amazon, eBook sellers) to discover new books
– Information needs are contextual, personal & complex
– Book metadata does not contain the necessary information
EARLIER WORK
▸ iConference 2015
– Tags outperform controlled vocabularies for search, but
sometimes controlled vocabularies are better.
– Controlled vocabularies contain more unique terms; tags show
more repetition of terms.
▸ Why?
– Terminology
– Popularity / frequency
– Type of request
STUDY OBJECTIVES
▸ Why are tags better than controlled vocabularies for book
search?
– Which types of book search requests are better addressed
using tags and which using CV?
– Which book search requests fail completely and what
characterizes such requests?
METHODOLOGY &
EXPERIMENTAL SETUP
EXPERIMENTAL SETUP
▸ Controlled Vocabulary content (CV)
– DDC class labels
– Subjects
– Geographic names
– Category labels
– LCSH terms
▸ Tags
– Each tag occurs as many times as it has been assigned by
the users
▸ Unique tags
– Each tag occurs only once
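The difference between the two tag representations can be sketched as follows. This is a minimal illustration, not the indexing code used in the study; the function names and the sample tag counts are hypothetical.

```python
from collections import Counter


def tags_field(tag_counts: Counter) -> str:
    """'Tags' representation: repeat each tag once per user assignment."""
    return " ".join(
        tag for tag, n in sorted(tag_counts.items()) for _ in range(n)
    )


def unique_tags_field(tag_counts: Counter) -> str:
    """'Unique tags' representation: each tag appears exactly once."""
    return " ".join(sorted(tag_counts))


# Hypothetical record: three users assigned "fantasy", one assigned "dragons".
example = Counter({"fantasy": 3, "dragons": 1})
print(tags_field(example))         # dragons fantasy fantasy fantasy
print(unique_tags_field(example))  # dragons fantasy
```

In the first representation, tag frequency feeds into term weighting at retrieval time; in the second, that frequency signal is discarded.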
AMAZON/LIBRARYTHING COLLECTION
[Figure: sample book record with labeled fields: Tags; Controlled Vocabulary content (CV): DDC class labels, subjects, geographic names, category labels, LCSH terms; Unique tags per record]
ANNOTATED LT TOPIC
[Figure: LibraryThing forum topic with labeled parts: topic title, narrative, recommended books]
EXPERIMENTAL SETUP
▸ Amazon / LibraryThing collection of book records
– 2 million records
▸ LibraryThing forum topics for search requests
– 334 search requests for testing
▸ Relevance judgements
– Recommendations from LT members with graded relevance scoring
(highest relevance if book is added by searcher)
▸ Evaluation metric
– Normalized Discounted Cumulated Gain (NDCG@10)
▸ IR system
– Indri 5.4 toolkit
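For readers unfamiliar with the metric, NDCG@10 can be sketched as below. The slides do not state the exact gain and discount formulation used, so this sketch assumes the common variant (graded gain divided by log2 of rank + 1); the function names are hypothetical.

```python
import math
from typing import Sequence


def dcg_at_k(gains: Sequence[float], k: int = 10) -> float:
    """Discounted cumulated gain over the first k ranks (log2 discount)."""
    return sum(
        gain / math.log2(rank + 1)
        for rank, gain in enumerate(gains[:k], start=1)
    )


def ndcg_at_k(
    ranked_gains: Sequence[float],
    judged_gains: Sequence[float],
    k: int = 10,
) -> float:
    """Normalize DCG by the DCG of an ideal ordering of all judged gains.

    ranked_gains: graded relevance of each retrieved book, in rank order.
    judged_gains: graded relevance of every book judged for the request.
    """
    ideal = dcg_at_k(sorted(judged_gains, reverse=True), k)
    return dcg_at_k(ranked_gains, k) / ideal if ideal > 0 else 0.0
```

A request scores 0.0 when none of its recommended books appears in the top 10, which is the sense in which a request "fails completely" in the later analysis.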
ANALYSIS
TAGS vs. CONTROLLED VOCABULARIES
▸ Question 1: Is there a difference in performance between
CV and Tags in retrieval?
▸ Answer
– Tags perform significantly better than CV
– Combining both performs even better than tags alone, but not
significantly so
– Losing tag frequency information helps rather than hurts
performance (also not significantly)
TAGS vs. CONTROLLED VOCABULARIES
▸ Question 2: Do tags outperform CV because of the so-
called popularity effect?
▸ Answer
– No, there does not seem to be a popularity effect
– Types = unique words in a record
– Tokens = all instances of words in a record
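The type/token distinction can be made concrete with a short sketch. It assumes simple lowercased whitespace tokenization, which may differ from the tokenization used in the study; the function name is hypothetical.

```python
from typing import Tuple


def types_and_tokens(record_text: str) -> Tuple[int, int]:
    """Return (types, tokens) for one record.

    Types are the distinct words in the record; tokens are all word
    instances, so repeated words count once per occurrence.
    """
    words = record_text.lower().split()
    return len(set(words)), len(words)


print(types_and_tokens("fantasy fantasy dragons"))  # (2, 3)
```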
TAGS vs. CONTROLLED VOCABULARIES
▸ Question 3: Do Tags and
CV complement or cancel
each other out?
▸ Answer
– Tags and CV
complement each
other: they are
successful on different
sets of requests
– But most zero-difference
requests (74.0%)
actually fail completely!
When and why?
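A per-request comparison of this kind can be sketched as follows. The function, the request IDs, and the scores are hypothetical; the sketch only assumes that each run yields one NDCG@10 score per request.

```python
from typing import Dict


def compare_runs(tags: Dict[str, float], cv: Dict[str, float]) -> Dict[str, int]:
    """Count which run wins per request, and how many ties are total failures."""
    summary = {"tags_better": 0, "cv_better": 0, "tied": 0, "tied_and_failed": 0}
    for request_id in tags.keys() & cv.keys():
        diff = tags[request_id] - cv[request_id]
        if diff > 0:
            summary["tags_better"] += 1
        elif diff < 0:
            summary["cv_better"] += 1
        else:
            summary["tied"] += 1
            # A tie at 0.0 means neither run retrieved any relevant book.
            if tags[request_id] == 0.0:
                summary["tied_and_failed"] += 1
    return summary


# Hypothetical NDCG@10 scores for four requests.
tag_scores = {"r1": 0.4, "r2": 0.0, "r3": 0.0, "r4": 0.2}
cv_scores = {"r1": 0.1, "r2": 0.3, "r3": 0.0, "r4": 0.2}
print(compare_runs(tag_scores, cv_scores))
```

Separating ties at zero from ties at a non-zero score is what exposes the failed requests that the remainder of the analysis examines.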
REQUESTS – RELEVANCE ASPECTS
▸ What makes a suggested book relevant to the user?
– Distinguish between eight relevance aspects (Reuter, 2007;
Koolen et al., 2015)
REQUESTS – RELEVANCE ASPECTS
Aspect | Description | % of requests (N = 87)
Accessibility | Language, length, or level of difficulty of a book | 9.2 %
Content | Topic, plot, genre, style, or comprehensiveness | 79.3 %
Engagement | Fit a certain mood or interest, are considered high quality, or provide a certain reading experience | 25.3 %
Familiarity | Similar to known books or related to a previous experience | 47.1 %
Known-item | The user is trying to identify a known book, but cannot remember the metadata that would locate it | 12.6 %
Metadata | With a certain title or by a certain author or publisher, in a particular format, or certain year | 23.0 %
Novelty | Unusual or quirky, or containing novel content | 3.4 %
Socio-cultural | Related to the user's socio-cultural background or values; popular or obscure | 13.8 %
REQUESTS – RELEVANCE ASPECTS
▸ Question 4: What types of book requests are best served
by the Unique tags and CV collections?
▸ Answer
– CV terms show a tendency to work best for requests that
touch upon aspects of engagement
– Other requests are best served by Unique tags
REQUESTS – RELEVANCE ASPECTS
Performance grouped by relevance aspect (NDCG@10)
Aspect | N | Unique tags | CV
Socio-cultural | 10 | 0.1127 | 0.0428
Novelty | 2 | 0.5304 | 0.0000
Metadata | 17 | 0.2454 | 0.1259
Known-item | 11 | 0.3593 | 0.1818
Familiarity | 36 | 0.1833 | 0.0701
Engagement | 21 | 0.1121 | 0.1425
Content | 63 | 0.1965 | 0.0821
Accessibility | 7 | 0.1235 | 0.0749
REQUESTS – TYPE OF BOOK
▸ Question 5: What types of book requests (fiction or non-
fiction) are best served by Unique tags or CV?
▸ Answer
– Unique tags work significantly better for fiction
– CV work better for non-fiction (but not significantly so)
FAILURE ANALYSIS
▸ Question 6: Do failed book search requests fail because of
data sparsity, a lower recall base, or a lack of examples?
▸ Answer
– Neither sparsity nor the size of the recall base is the
reason for retrieval failure
– The number of examples provided by the requester has a
significant positive influence on performance
[Figure: results shown for three request groups (N = 247, N = 87, N = 334)]
FAILURE ANALYSIS
▸ Question 7: Do book search requests fail because of their
relevance aspects?
▸ Answer
– No, relevance
aspects are
distributed equally
for successful &
failed requests
– Only Accessibility-
and Metadata-
related search
requests seem to
fail more often
FAILURE ANALYSIS
▸ Question 8: Does the type of book that is being requested
(fiction vs. non-fiction) have an influence on whether
requests succeed or fail?
▸ Answer
– Requests for works of fiction fail significantly more often
CONCLUSIONS &
FUTURE WORK
FINDINGS
▸ Tags outperform CV...
– ...probably because their terminology is closer to the user's
language (not because of the popularity effect)
▸ Sometimes CV are better, for example, for non-fiction books...
– ...whereas tags are better for fiction and for content-related,
familiarity or known-item searches
▸ We believe that tags are simply better able to match the user's
language when looking for books
– Although they are still not that great at it!
– Book search is still hard, especially for fiction books
OPEN QUESTIONS
▸ How can book metadata be adapted to be closer to the
vocabulary used in real-world book search requests?
▸ What other aspects (besides type of requested book or
relevance aspect of search request) contribute to request
difficulty?
▸ Our question to you:
– What other questions can we ask of this data?
QUESTIONS?
Paper URL: http://bit.ly/iconf2017

More Related Content

Viewers also liked

Semantic Web Applications in Libraries: The Road to BIBFRAME
Semantic Web Applications in Libraries: The Road to BIBFRAMESemantic Web Applications in Libraries: The Road to BIBFRAME
Semantic Web Applications in Libraries: The Road to BIBFRAME
National Information Standards Organization (NISO)
 
Subject Headings & Classification, or, Why librarians don't seem to think lik...
Subject Headings & Classification, or, Why librarians don't seem to think lik...Subject Headings & Classification, or, Why librarians don't seem to think lik...
Subject Headings & Classification, or, Why librarians don't seem to think lik...
Naomi Young
 
RDA, MARC and BIBFRAME: transition and interaction
RDA, MARC and BIBFRAME: transition and interactionRDA, MARC and BIBFRAME: transition and interaction
RDA, MARC and BIBFRAME: transition and interaction
Gordon Dunsire
 
Beyond MARC: MARC, linked data, and Bibframe
Beyond MARC: MARC, linked data, and BibframeBeyond MARC: MARC, linked data, and Bibframe
Beyond MARC: MARC, linked data, and Bibframe
Thomas Meehan
 
BIBFRAME and Moving Away From MARC
BIBFRAME and Moving Away From MARCBIBFRAME and Moving Away From MARC
BIBFRAME and Moving Away From MARC
Thomas Meehan
 
MARC and BIBFRAME
MARC and BIBFRAMEMARC and BIBFRAME
MARC and BIBFRAME
Thomas Meehan
 
Tools of our Trade (RDA, MARC21) 2010-03-15
Tools of our Trade (RDA, MARC21) 2010-03-15Tools of our Trade (RDA, MARC21) 2010-03-15
Tools of our Trade (RDA, MARC21) 2010-03-15
Ann Chapman
 
RDA and the semantic Web
RDA and the semantic WebRDA and the semantic Web
RDA and the semantic Web
Gordon Dunsire
 
BIBFRAME : the future of cataloguing?
BIBFRAME : the future of cataloguing?BIBFRAME : the future of cataloguing?
BIBFRAME : the future of cataloguing?
Thomas Meehan
 
Cataloging with RDA: An Overview
Cataloging with RDA: An OverviewCataloging with RDA: An Overview
Cataloging with RDA: An Overview
Emily Nimsakont
 

Viewers also liked (10)

Semantic Web Applications in Libraries: The Road to BIBFRAME
Semantic Web Applications in Libraries: The Road to BIBFRAMESemantic Web Applications in Libraries: The Road to BIBFRAME
Semantic Web Applications in Libraries: The Road to BIBFRAME
 
Subject Headings & Classification, or, Why librarians don't seem to think lik...
Subject Headings & Classification, or, Why librarians don't seem to think lik...Subject Headings & Classification, or, Why librarians don't seem to think lik...
Subject Headings & Classification, or, Why librarians don't seem to think lik...
 
RDA, MARC and BIBFRAME: transition and interaction
RDA, MARC and BIBFRAME: transition and interactionRDA, MARC and BIBFRAME: transition and interaction
RDA, MARC and BIBFRAME: transition and interaction
 
Beyond MARC: MARC, linked data, and Bibframe
Beyond MARC: MARC, linked data, and BibframeBeyond MARC: MARC, linked data, and Bibframe
Beyond MARC: MARC, linked data, and Bibframe
 
BIBFRAME and Moving Away From MARC
BIBFRAME and Moving Away From MARCBIBFRAME and Moving Away From MARC
BIBFRAME and Moving Away From MARC
 
MARC and BIBFRAME
MARC and BIBFRAMEMARC and BIBFRAME
MARC and BIBFRAME
 
Tools of our Trade (RDA, MARC21) 2010-03-15
Tools of our Trade (RDA, MARC21) 2010-03-15Tools of our Trade (RDA, MARC21) 2010-03-15
Tools of our Trade (RDA, MARC21) 2010-03-15
 
RDA and the semantic Web
RDA and the semantic WebRDA and the semantic Web
RDA and the semantic Web
 
BIBFRAME : the future of cataloguing?
BIBFRAME : the future of cataloguing?BIBFRAME : the future of cataloguing?
BIBFRAME : the future of cataloguing?
 
Cataloging with RDA: An Overview
Cataloging with RDA: An OverviewCataloging with RDA: An Overview
Cataloging with RDA: An Overview
 

Similar to An In-depth Analysis of Tags and Controlled Metadata for Book Search

Nature of inquiry and research
Nature of inquiry and researchNature of inquiry and research
Nature of inquiry and research
Department of Education
 
natureofinquiryandresearch-191011224537.pdf
natureofinquiryandresearch-191011224537.pdfnatureofinquiryandresearch-191011224537.pdf
natureofinquiryandresearch-191011224537.pdf
JARYLPILLAZAR1
 
Marketing Research Ch04
Marketing Research Ch04Marketing Research Ch04
Marketing Research Ch04
guestf8364c
 
natureofinquiryandresearch-191011224537.pptx
natureofinquiryandresearch-191011224537.pptxnatureofinquiryandresearch-191011224537.pptx
natureofinquiryandresearch-191011224537.pptx
JubilinAlbania
 
Questioning Practices And Strategies
Questioning Practices And  StrategiesQuestioning Practices And  Strategies
Questioning Practices And Strategies
robbi makely
 
Research questions and hypotheses_Hang_Vietnam
Research questions and hypotheses_Hang_VietnamResearch questions and hypotheses_Hang_Vietnam
Research questions and hypotheses_Hang_Vietnam
HangNguyenPhuocDieu
 
Identifying and formulating a research question: Ayurveda Perspective
Identifying and formulating a research question: Ayurveda Perspective Identifying and formulating a research question: Ayurveda Perspective
Identifying and formulating a research question: Ayurveda Perspective
Kishor Patwardhan
 
Classroom Assessment Techniques
Classroom Assessment TechniquesClassroom Assessment Techniques
PPT-Final.pptx
PPT-Final.pptxPPT-Final.pptx
PPT-Final.pptx
JohnKingjohnkingmond
 
2. practical research ii nature of inquiry & research
2. practical research ii nature of inquiry & research2. practical research ii nature of inquiry & research
2. practical research ii nature of inquiry & research
Live Angga
 
2-171124011016.pdf
2-171124011016.pdf2-171124011016.pdf
2-171124011016.pdf
JovManalili1
 
Arte387 Ch3
Arte387 Ch3Arte387 Ch3
Arte387 Ch3
SCWARTED
 
TITLE OF PAPER HERE1TITLE OF PAPER HERE3Full Title
TITLE OF PAPER HERE1TITLE OF PAPER HERE3Full Title TITLE OF PAPER HERE1TITLE OF PAPER HERE3Full Title
TITLE OF PAPER HERE1TITLE OF PAPER HERE3Full Title
TakishaPeck109
 
QUALITATIVE DATA ANALYSIS.ppt
QUALITATIVE DATA ANALYSIS.pptQUALITATIVE DATA ANALYSIS.ppt
QUALITATIVE DATA ANALYSIS.ppt
MuhammadSulaiman291223
 
Summary+of+comments+based+on+scoring+on+feb++29+2012
Summary+of+comments+based+on+scoring+on+feb++29+2012Summary+of+comments+based+on+scoring+on+feb++29+2012
Summary+of+comments+based+on+scoring+on+feb++29+2012
Advancing a Massachusetts Culture of Assessment
 
Search vs Text Classification
Search vs Text ClassificationSearch vs Text Classification
Search vs Text Classification
Networked Insights
 
Essential questions
Essential questionsEssential questions
Essential questions
Carla Piper
 
Searching Databases.docx
Searching Databases.docxSearching Databases.docx
Searching Databases.docx
studywriters
 
Searching Databases.docx
Searching Databases.docxSearching Databases.docx
Searching Databases.docx
write4
 
Questionnaire design dr. s l gupta
Questionnaire design dr. s l guptaQuestionnaire design dr. s l gupta
Questionnaire design dr. s l gupta
Ravindra Sharma
 

Similar to An In-depth Analysis of Tags and Controlled Metadata for Book Search (20)

Nature of inquiry and research
Nature of inquiry and researchNature of inquiry and research
Nature of inquiry and research
 
natureofinquiryandresearch-191011224537.pdf
natureofinquiryandresearch-191011224537.pdfnatureofinquiryandresearch-191011224537.pdf
natureofinquiryandresearch-191011224537.pdf
 
Marketing Research Ch04
Marketing Research Ch04Marketing Research Ch04
Marketing Research Ch04
 
natureofinquiryandresearch-191011224537.pptx
natureofinquiryandresearch-191011224537.pptxnatureofinquiryandresearch-191011224537.pptx
natureofinquiryandresearch-191011224537.pptx
 
Questioning Practices And Strategies
Questioning Practices And  StrategiesQuestioning Practices And  Strategies
Questioning Practices And Strategies
 
Research questions and hypotheses_Hang_Vietnam
Research questions and hypotheses_Hang_VietnamResearch questions and hypotheses_Hang_Vietnam
Research questions and hypotheses_Hang_Vietnam
 
Identifying and formulating a research question: Ayurveda Perspective
Identifying and formulating a research question: Ayurveda Perspective Identifying and formulating a research question: Ayurveda Perspective
Identifying and formulating a research question: Ayurveda Perspective
 
Classroom Assessment Techniques
Classroom Assessment TechniquesClassroom Assessment Techniques
Classroom Assessment Techniques
 
PPT-Final.pptx
PPT-Final.pptxPPT-Final.pptx
PPT-Final.pptx
 
2. practical research ii nature of inquiry & research
2. practical research ii nature of inquiry & research2. practical research ii nature of inquiry & research
2. practical research ii nature of inquiry & research
 
2-171124011016.pdf
2-171124011016.pdf2-171124011016.pdf
2-171124011016.pdf
 
Arte387 Ch3
Arte387 Ch3Arte387 Ch3
Arte387 Ch3
 
TITLE OF PAPER HERE1TITLE OF PAPER HERE3Full Title
TITLE OF PAPER HERE1TITLE OF PAPER HERE3Full Title TITLE OF PAPER HERE1TITLE OF PAPER HERE3Full Title
TITLE OF PAPER HERE1TITLE OF PAPER HERE3Full Title
 
QUALITATIVE DATA ANALYSIS.ppt
QUALITATIVE DATA ANALYSIS.pptQUALITATIVE DATA ANALYSIS.ppt
QUALITATIVE DATA ANALYSIS.ppt
 
Summary+of+comments+based+on+scoring+on+feb++29+2012
Summary+of+comments+based+on+scoring+on+feb++29+2012Summary+of+comments+based+on+scoring+on+feb++29+2012
Summary+of+comments+based+on+scoring+on+feb++29+2012
 
Search vs Text Classification
Search vs Text ClassificationSearch vs Text Classification
Search vs Text Classification
 
Essential questions
Essential questionsEssential questions
Essential questions
 
Searching Databases.docx
Searching Databases.docxSearching Databases.docx
Searching Databases.docx
 
Searching Databases.docx
Searching Databases.docxSearching Databases.docx
Searching Databases.docx
 
Questionnaire design dr. s l gupta
Questionnaire design dr. s l guptaQuestionnaire design dr. s l gupta
Questionnaire design dr. s l gupta
 

More from Toine Bogers

"If I like BLANK, what else will I like?": Analyzing a Human Recommendation C...
"If I like BLANK, what else will I like?": Analyzing a Human Recommendation C..."If I like BLANK, what else will I like?": Analyzing a Human Recommendation C...
"If I like BLANK, what else will I like?": Analyzing a Human Recommendation C...
Toine Bogers
 
Hands-free but not Eyes-free: A Usability Evaluation of Siri while Driving
Hands-free but not Eyes-free: A Usability Evaluation of Siri while DrivingHands-free but not Eyes-free: A Usability Evaluation of Siri while Driving
Hands-free but not Eyes-free: A Usability Evaluation of Siri while Driving
Toine Bogers
 
“Looking for an Amazing Game I Can Relax and Sink Hours into...”: A Study of ...
“Looking for an Amazing Game I Can Relax and Sink Hours into...”: A Study of ...“Looking for an Amazing Game I Can Relax and Sink Hours into...”: A Study of ...
“Looking for an Amazing Game I Can Relax and Sink Hours into...”: A Study of ...
Toine Bogers
 
A Study of Usage and Usability of Intelligent Personal Assistants in Denmark
A Study of Usage and Usability of Intelligent Personal Assistants in DenmarkA Study of Usage and Usability of Intelligent Personal Assistants in Denmark
A Study of Usage and Usability of Intelligent Personal Assistants in Denmark
Toine Bogers
 
“What was this movie about this chick?”: A Comparative Study of Relevance Asp...
“What was this movie about this chick?”: A Comparative Study of Relevance Asp...“What was this movie about this chick?”: A Comparative Study of Relevance Asp...
“What was this movie about this chick?”: A Comparative Study of Relevance Asp...
Toine Bogers
 
"I just scroll through my stuff until I find it or give up": A Contextual Inq...
"I just scroll through my stuff until I find it or give up": A Contextual Inq..."I just scroll through my stuff until I find it or give up": A Contextual Inq...
"I just scroll through my stuff until I find it or give up": A Contextual Inq...
Toine Bogers
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
Toine Bogers
 
Defining and Supporting Narrative-driven Recommendation
Defining and Supporting Narrative-driven RecommendationDefining and Supporting Narrative-driven Recommendation
Defining and Supporting Narrative-driven Recommendation
Toine Bogers
 
Personalized search
Personalized searchPersonalized search
Personalized search
Toine Bogers
 
A Longitudinal Analysis of Search Engine Index Size
A Longitudinal Analysis of Search Engine Index SizeA Longitudinal Analysis of Search Engine Index Size
A Longitudinal Analysis of Search Engine Index Size
Toine Bogers
 
Measuring System Performance in Cultural Heritage Systems
Measuring System Performance in Cultural Heritage SystemsMeasuring System Performance in Cultural Heritage Systems
Measuring System Performance in Cultural Heritage Systems
Toine Bogers
 
How 'Social' are Social News Sites? Exploring the Motivations for Using Reddi...
How 'Social' are Social News Sites? Exploring the Motivations for Using Reddi...How 'Social' are Social News Sites? Exploring the Motivations for Using Reddi...
How 'Social' are Social News Sites? Exploring the Motivations for Using Reddi...
Toine Bogers
 
Search & Recommendation: Birds of a Feather?
Search & Recommendation: Birds of a Feather?Search & Recommendation: Birds of a Feather?
Search & Recommendation: Birds of a Feather?
Toine Bogers
 
Micro-Serendipity: Meaningful Coincidences in Everyday Life Shared on Twitter
Micro-Serendipity: Meaningful Coincidences in Everyday Life Shared on TwitterMicro-Serendipity: Meaningful Coincidences in Everyday Life Shared on Twitter
Micro-Serendipity: Meaningful Coincidences in Everyday Life Shared on Twitter
Toine Bogers
 

More from Toine Bogers (14)

"If I like BLANK, what else will I like?": Analyzing a Human Recommendation C...
"If I like BLANK, what else will I like?": Analyzing a Human Recommendation C..."If I like BLANK, what else will I like?": Analyzing a Human Recommendation C...
"If I like BLANK, what else will I like?": Analyzing a Human Recommendation C...
 
Hands-free but not Eyes-free: A Usability Evaluation of Siri while Driving
Hands-free but not Eyes-free: A Usability Evaluation of Siri while DrivingHands-free but not Eyes-free: A Usability Evaluation of Siri while Driving
Hands-free but not Eyes-free: A Usability Evaluation of Siri while Driving
 
“Looking for an Amazing Game I Can Relax and Sink Hours into...”: A Study of ...
“Looking for an Amazing Game I Can Relax and Sink Hours into...”: A Study of ...“Looking for an Amazing Game I Can Relax and Sink Hours into...”: A Study of ...
“Looking for an Amazing Game I Can Relax and Sink Hours into...”: A Study of ...
 
A Study of Usage and Usability of Intelligent Personal Assistants in Denmark
A Study of Usage and Usability of Intelligent Personal Assistants in DenmarkA Study of Usage and Usability of Intelligent Personal Assistants in Denmark
A Study of Usage and Usability of Intelligent Personal Assistants in Denmark
 
“What was this movie about this chick?”: A Comparative Study of Relevance Asp...
“What was this movie about this chick?”: A Comparative Study of Relevance Asp...“What was this movie about this chick?”: A Comparative Study of Relevance Asp...
“What was this movie about this chick?”: A Comparative Study of Relevance Asp...
 
"I just scroll through my stuff until I find it or give up": A Contextual Inq...
"I just scroll through my stuff until I find it or give up": A Contextual Inq..."I just scroll through my stuff until I find it or give up": A Contextual Inq...
"I just scroll through my stuff until I find it or give up": A Contextual Inq...
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Defining and Supporting Narrative-driven Recommendation
Defining and Supporting Narrative-driven RecommendationDefining and Supporting Narrative-driven Recommendation
Defining and Supporting Narrative-driven Recommendation
 
Personalized search
Personalized searchPersonalized search
Personalized search
 
A Longitudinal Analysis of Search Engine Index Size
A Longitudinal Analysis of Search Engine Index SizeA Longitudinal Analysis of Search Engine Index Size
A Longitudinal Analysis of Search Engine Index Size
 
Measuring System Performance in Cultural Heritage Systems
Measuring System Performance in Cultural Heritage SystemsMeasuring System Performance in Cultural Heritage Systems
Measuring System Performance in Cultural Heritage Systems
 
How 'Social' are Social News Sites? Exploring the Motivations for Using Reddi...
How 'Social' are Social News Sites? Exploring the Motivations for Using Reddi...How 'Social' are Social News Sites? Exploring the Motivations for Using Reddi...
How 'Social' are Social News Sites? Exploring the Motivations for Using Reddi...
 
Search & Recommendation: Birds of a Feather?
Search & Recommendation: Birds of a Feather?Search & Recommendation: Birds of a Feather?
Search & Recommendation: Birds of a Feather?
 
Micro-Serendipity: Meaningful Coincidences in Everyday Life Shared on Twitter
Micro-Serendipity: Meaningful Coincidences in Everyday Life Shared on TwitterMicro-Serendipity: Meaningful Coincidences in Everyday Life Shared on Twitter
Micro-Serendipity: Meaningful Coincidences in Everyday Life Shared on Twitter
 

Recently uploaded

23PH301 - Optics - Unit 1 - Optical Lenses
23PH301 - Optics  -  Unit 1 - Optical Lenses23PH301 - Optics  -  Unit 1 - Optical Lenses
23PH301 - Optics - Unit 1 - Optical Lenses
RDhivya6
 
AJAY KUMAR NIET GreNo Guava Project File.pdf
AJAY KUMAR NIET GreNo Guava Project File.pdfAJAY KUMAR NIET GreNo Guava Project File.pdf
AJAY KUMAR NIET GreNo Guava Project File.pdf
AJAY KUMAR
 
Summary Of transcription and Translation.pdf
Summary Of transcription and Translation.pdfSummary Of transcription and Translation.pdf
Summary Of transcription and Translation.pdf
vadgavevedant86
 
Lattice Defects in ionic solid compound.pptx
Lattice Defects in ionic solid compound.pptxLattice Defects in ionic solid compound.pptx
Lattice Defects in ionic solid compound.pptx
DrRajeshDas
 
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
Scintica Instrumentation
 
Injection: Risks and challenges - Injection of CO2 into geological rock forma...
Injection: Risks and challenges - Injection of CO2 into geological rock forma...Injection: Risks and challenges - Injection of CO2 into geological rock forma...
Injection: Risks and challenges - Injection of CO2 into geological rock forma...
Oeko-Institut
 
Compositions of iron-meteorite parent bodies constrainthe structure of the pr...
Compositions of iron-meteorite parent bodies constrainthe structure of the pr...Compositions of iron-meteorite parent bodies constrainthe structure of the pr...
Compositions of iron-meteorite parent bodies constrainthe structure of the pr...
Sérgio Sacani
 
Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...
Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...
Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...
Sérgio Sacani
 
JAMES WEBB STUDY THE MASSIVE BLACK HOLE SEEDS
JAMES WEBB STUDY THE MASSIVE BLACK HOLE SEEDSJAMES WEBB STUDY THE MASSIVE BLACK HOLE SEEDS
JAMES WEBB STUDY THE MASSIVE BLACK HOLE SEEDS
Sérgio Sacani
 
2001_Book_HumanChromosomes - Genéticapdf
2001_Book_HumanChromosomes - Genéticapdf2001_Book_HumanChromosomes - Genéticapdf
2001_Book_HumanChromosomes - Genéticapdf
lucianamillenium
 
fermented food science of sauerkraut.pptx
fermented food science of sauerkraut.pptxfermented food science of sauerkraut.pptx
fermented food science of sauerkraut.pptx
ananya23nair
 
Mechanisms and Applications of Antiviral Neutralizing Antibodies - Creative B...
Mechanisms and Applications of Antiviral Neutralizing Antibodies - Creative B...Mechanisms and Applications of Antiviral Neutralizing Antibodies - Creative B...
Mechanisms and Applications of Antiviral Neutralizing Antibodies - Creative B...
Creative-Biolabs
 
IMPORTANCE OF ALGAE AND ITS BENIFITS.pptx
IMPORTANCE OF ALGAE  AND ITS BENIFITS.pptxIMPORTANCE OF ALGAE  AND ITS BENIFITS.pptx
IMPORTANCE OF ALGAE AND ITS BENIFITS.pptx
OmAle5
 
Post translation modification by Suyash Garg
Post translation modification by Suyash GargPost translation modification by Suyash Garg
Post translation modification by Suyash Garg
suyashempire
 
Introduction_Ch_01_Biotech Biotechnology course .pptx
Introduction_Ch_01_Biotech Biotechnology course .pptxIntroduction_Ch_01_Biotech Biotechnology course .pptx
Introduction_Ch_01_Biotech Biotechnology course .pptx
QusayMaghayerh
 
MICROBIAL INTERACTION PPT/ MICROBIAL INTERACTION AND THEIR TYPES // PLANT MIC...
MICROBIAL INTERACTION PPT/ MICROBIAL INTERACTION AND THEIR TYPES // PLANT MIC...MICROBIAL INTERACTION PPT/ MICROBIAL INTERACTION AND THEIR TYPES // PLANT MIC...
MICROBIAL INTERACTION PPT/ MICROBIAL INTERACTION AND THEIR TYPES // PLANT MIC...
ABHISHEK SONI NIMT INSTITUTE OF MEDICAL AND PARAMEDCIAL SCIENCES , GOVT PG COLLEGE NOIDA
 
Sustainable Land Management - Climate Smart Agriculture
Sustainable Land Management - Climate Smart AgricultureSustainable Land Management - Climate Smart Agriculture
Sustainable Land Management - Climate Smart Agriculture
International Food Policy Research Institute- South Asia Office
 
Methods of grain storage Structures in India.pdf
Methods of grain storage Structures in India.pdfMethods of grain storage Structures in India.pdf
Methods of grain storage Structures in India.pdf
PirithiRaju
 
LEARNING TO LIVE WITH LAWS OF MOTION .pptx
LEARNING TO LIVE WITH LAWS OF MOTION .pptxLEARNING TO LIVE WITH LAWS OF MOTION .pptx
LEARNING TO LIVE WITH LAWS OF MOTION .pptx
yourprojectpartner05
 
BIRDS DIVERSITY OF SOOTEA BISWANATH ASSAM.ppt.pptx
BIRDS  DIVERSITY OF SOOTEA BISWANATH ASSAM.ppt.pptxBIRDS  DIVERSITY OF SOOTEA BISWANATH ASSAM.ppt.pptx
BIRDS DIVERSITY OF SOOTEA BISWANATH ASSAM.ppt.pptx
goluk9330
 

Recently uploaded (20)

23PH301 - Optics - Unit 1 - Optical Lenses
23PH301 - Optics  -  Unit 1 - Optical Lenses23PH301 - Optics  -  Unit 1 - Optical Lenses
23PH301 - Optics - Unit 1 - Optical Lenses
 
AJAY KUMAR NIET GreNo Guava Project File.pdf
AJAY KUMAR NIET GreNo Guava Project File.pdfAJAY KUMAR NIET GreNo Guava Project File.pdf
AJAY KUMAR NIET GreNo Guava Project File.pdf
 
Summary Of transcription and Translation.pdf
Summary Of transcription and Translation.pdfSummary Of transcription and Translation.pdf
Summary Of transcription and Translation.pdf
 
Lattice Defects in ionic solid compound.pptx
Lattice Defects in ionic solid compound.pptxLattice Defects in ionic solid compound.pptx
Lattice Defects in ionic solid compound.pptx
 
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...

An In-depth Analysis of Tags and Controlled Metadata for Book Search

  • 1. AN IN-DEPTH ANALYSIS OF TAGS AND CONTROLLED METADATA FOR BOOK SEARCH
      TOINE BOGERS, VIVIEN PETRAS
      MARCH 23, 2017, iCONFERENCE 2017
  • 2. OUTLINE
      ▸ Introduction
      ▸ Methodology & Experimental Setup
      ▸ Analysis
        – Tags vs. Controlled Vocabularies
        – Book Search Requests
        – Failure Analysis
      ▸ Conclusions & Future Work
  • 4. MOTIVATION
      ▸ Readers often struggle with existing systems (e.g., library catalogs, Amazon, eBook sellers) to discover new books
        – Information needs are contextual, personal & complex
        – Book metadata does not contain the necessary information
  • 5. EARLIER WORK
      ▸ iConference 2015
        – Tags outperform controlled vocabularies for search, but sometimes controlled vocabularies are better.
        – Controlled vocabularies contain more unique terms; tags show more repetition of terms.
      ▸ Why?
        – Terminology
        – Popularity / frequency
        – Type of request
  • 6. STUDY OBJECTIVES
      ▸ Why are tags better than controlled vocabularies for book search?
        – Which types of book search requests are better addressed using tags and which using CV?
        – Which book search requests fail completely and what characterizes such requests?
  • 8. EXPERIMENTAL SETUP
      ▸ Controlled Vocabulary content (CV)
        – DDC class labels
        – Subjects
        – Geographic names
        – Category labels
        – LCSH terms
      ▸ Tags
        – Each tag occurs as many times as it has been assigned by the users
      ▸ Unique tags
        – Each tag occurs only once
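The difference between the Tags and Unique tags representations can be illustrated with a minimal Python sketch. It assumes a record's tags are available as a mapping from tag to the number of users who assigned it; the function names and the sample record are hypothetical and are not taken from the slides.

```python
def tags_field(tag_counts):
    """Tags: repeat each tag once per user assignment (frequency is kept)."""
    return [tag for tag, n in tag_counts.items() for _ in range(n)]


def unique_tags_field(tag_counts):
    """Unique tags: each tag appears exactly once (frequency is dropped)."""
    return list(tag_counts)


# Hypothetical record: "fantasy" was assigned by three users, "dragons" by one.
record = {"fantasy": 3, "dragons": 1}
tags = tags_field(record)
unique_tags = unique_tags_field(record)
```

Indexing the first list lets term frequency reflect tag popularity, while the second list removes that signal.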
  • 9. AMAZON/LIBRARYTHING COLLECTION
      [Figure: example book record with three annotated parts: Tags; Controlled Vocabulary Content (CV), consisting of DDC class labels, subjects, geographic names, category labels, and LCSH terms; Unique Tags per record]
  • 11. EXPERIMENTAL SETUP
      ▸ Amazon / LibraryThing collection of book records
        – 2 million records
      ▸ LibraryThing forum topics for search requests
        – 334 search requests for testing
      ▸ Relevance judgements
        – Recommendations from LT members with graded relevance scoring (highest relevance if book is added by searcher)
      ▸ Evaluation metric
        – Normalized Discounted Cumulated Gain (NDCG@10)
      ▸ IR system
        – Indri 5.4 toolkit
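As a rough illustration of the evaluation metric, the sketch below computes NDCG@10 for one request. It assumes linear gains and a log2 rank discount, which is one common formulation; the slides do not state which variant or which gain values were used, and the book IDs and relevance grades here are made up.

```python
import math


def dcg(gains):
    """Discounted cumulated gain with a log2(rank + 1) discount (assumed variant)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg_at_k(ranked_ids, judgments, k=10):
    """NDCG@k: DCG of the top-k results divided by the DCG of an ideal ranking."""
    gains = [judgments.get(book_id, 0) for book_id in ranked_ids[:k]]
    ideal_gains = sorted(judgments.values(), reverse=True)[:k]
    ideal = dcg(ideal_gains)
    return dcg(gains) / ideal if ideal > 0 else 0.0


# Hypothetical graded judgments for one forum request.
judgments = {"book_a": 3, "book_b": 1}
perfect = ndcg_at_k(["book_a", "book_b", "book_c"], judgments)
partial = ndcg_at_k(["book_c", "book_a"], judgments)
failed = ndcg_at_k(["book_c", "book_d"], judgments)
```

A request with a score of zero, as in the last call, is what the later slides refer to as failing completely.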
  • 13. TAGS vs. CONTROLLED VOCABULARIES
      ▸ Question 1: Is there a difference in retrieval performance between CV and Tags?
      ▸ Answer
        – Tags perform significantly better than CV
        – Combining both performs even better than tags alone, but not significantly so
        – Losing tag frequency information helps rather than hurts performance (also not significantly)
  • 14. TAGS vs. CONTROLLED VOCABULARIES
      ▸ Question 2: Do tags outperform CV because of the so-called popularity effect?
      ▸ Answer
        – No, there does not seem to be a popularity effect
        – Types = unique words in a record
        – Tokens = all instances of words in a record
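The type/token distinction can be made concrete with a small sketch. It assumes simple lowercasing and whitespace tokenization, which may differ from the preprocessing used in the study; the function name and sample text are hypothetical.

```python
def type_token_counts(record_text):
    """Return (types, tokens): unique words vs. all word instances in a record."""
    tokens = record_text.lower().split()
    return len(set(tokens)), len(tokens)


# Hypothetical tag field in which "fantasy" was assigned twice.
types, tokens = type_token_counts("fantasy Fantasy dragons")
```

A record with many more tokens than types contains a lot of repetition, which is the kind of frequency information a popularity effect would rely on.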
  • 15. TAGS vs. CONTROLLED VOCABULARIES
      ▸ Question 3: Do Tags and CV complement or cancel each other out?
      ▸ Answer
        – Tags and CV complement each other: they are successful on different sets of requests
        – But most zero-difference requests (74.0%) actually fail completely! When and why?
  • 16. REQUESTS – RELEVANCE ASPECTS
      ▸ What makes a suggested book relevant to the user?
        – Distinguish between eight relevance aspects (Reuter, 2007; Koolen et al., 2015)
  • 17. REQUESTS – RELEVANCE ASPECTS
      Aspect | Description | % of requests (N = 87)
      Accessibility | Language, length, or level of difficulty of a book | 9.2 %
      Content | Topic, plot, genre, style, or comprehensiveness | 79.3 %
      Engagement | Fit a certain mood or interest, are considered high quality, or provide a certain reading experience | 25.3 %
      Familiarity | Similar to known books or related to a previous experience | 47.1 %
      Known-item | The user is trying to identify a known book, but cannot remember the metadata that would locate it | 12.6 %
      Metadata | With a certain title or by a certain author or publisher, in a particular format, or certain year | 23.0 %
      Novelty | Unusual or quirky, or containing novel content | 3.4 %
      Socio-cultural | Related to the user's socio-cultural background or values; popular or obscure | 13.8 %
  • 18. REQUESTS – RELEVANCE ASPECTS
      ▸ Question 4: What types of book requests are best served by the Unique tags and CV collections?
      ▸ Answer
        – CV terms show a tendency to work best for requests that touch upon aspects of engagement
        – Other requests are best served by Unique tags
  • 19. REQUESTS – RELEVANCE ASPECTS
      [Figure: Performance grouped by relevance aspect (NDCG@10)]
      Aspect | Unique tags | CV
      Socio-cultural (N = 10) | 0.1127 | 0.0428
      Novelty (N = 2) | 0.5304 | 0.0000
      Metadata (N = 17) | 0.2454 | 0.1259
      Known-item (N = 11) | 0.3593 | 0.1818
      Familiarity (N = 36) | 0.1833 | 0.0701
      Engagement (N = 21) | 0.1121 | 0.1425
      Content (N = 63) | 0.1965 | 0.0821
      Accessibility (N = 7) | 0.1235 | 0.0749
  • 20. REQUESTS – TYPE OF BOOK
      ▸ Question 5: What types of book requests (fiction or non-fiction) are best served by Unique tags or CV?
      ▸ Answer
        – Unique tags work significantly better for fiction
        – CV work better for non-fiction (but not significantly so)
  • 21. FAILURE ANALYSIS
      ▸ Question 6: Do failed book search requests fail because of data sparsity, a lower recall base, or a lack of examples?
      ▸ Answer
        – Neither sparsity nor the size of the recall base is the reason for retrieval failure
        – The number of examples provided by the requester has a significant positive influence on performance
      (N = 247) (N = 87) (N = 334)
  • 22. FAILURE ANALYSIS
      ▸ Question 7: Do book search requests fail because of their relevance aspects?
      ▸ Answer
        – No, relevance aspects are distributed equally for successful & failed requests
        – Only Accessibility- and Metadata-related search requests seem to fail more often
  • 23. FAILURE ANALYSIS
      ▸ Question 8: Does the type of book that is being requested (fiction vs. non-fiction) have an influence on whether requests succeed or fail?
      ▸ Answer
        – Requests for works of fiction fail significantly more often
  • 25. FINDINGS
      ▸ Tags outperform CV...
        – ...probably because their terminology is closer to the user's language (not because of the popularity effect)
      ▸ Sometimes CV are better, for example, for non-fiction books...
        – ...whereas tags are better for fiction and for content-related, familiarity or known-item searches
      ▸ We believe that tags are simply better able to match the user's language when looking for books
        – Although they are still not that great at it!
        – Book search is still hard, especially for fiction books
  • 26. OPEN QUESTIONS
      ▸ How can book metadata be adapted to be closer to the vocabulary used in real-world book search requests?
      ▸ What other aspects (besides type of requested book or relevance aspect of search request) contribute to request difficulty?
      ▸ Our question to you:
        – What other questions can we ask of this data?