Managing Taxonomy Tagging
Taxonomy Boot Camp
Washington, DC
November 4, 2019
Heather Hedden & Terry Casey
About Us
Terry Casey
▪ Taxonomist
– Independent consultant, Casey Indexing and Information Service
– Currently contract staff taxonomist
▪ Back-of-the-book indexer
– Textbook, scholarly, trade book, and periodical indexer
– Embedded indexes for digital publications
Heather Hedden
▪ Taxonomist
– Independent consultant, Hedden Information Management
– Previously employed and contract consultant and staff taxonomist
▪ Former indexer
– Periodical article indexer at library vendor IAC (acquired by Gale)
– Freelance back-of-the-book indexer
▪ Author of The Accidental Taxonomist (2010, 2016, Information Today, Inc.)
Outline
▪ Introduction: Tagging, Indexing, Categorizing
▪ Taxonomy Design and Display for Indexing
▪ Indexing Policy, Documentation, and Training
▪ Automated Indexing Methods
▪ Manual Indexing
▪ Finding Indexers
▪ Examples of Indexing Projects
Tagging vs. indexing vs. categorizing/classifying
Tagging – assigning metadata labels (“tags”)
▪ By identifying topics and names within a document or content item
▪ By content creators or editors (minimally trained in tagging),
not as their primary job responsibility
▪ For metadata both with and without controlled vocabularies
▪ To support search
▪ Can also be semi-automated
Indexing – assigning index terms (subject metadata and related elements)
▪ By identifying topics and names within a document or content item
▪ By trained indexers, often as their primary job responsibility
▪ By selecting terms from a large controlled vocabulary/thesaurus/taxonomy
▪ To create a browsable index (and now also to support search)
▪ Can also be semi-automated
Introduction
Tagging vs. indexing vs. categorizing/classifying
Categorizing/classifying – organizing & assigning content into named categories
▪ By identifying which category a document or content item belongs in
▪ A feature of most content management systems, in addition to tagging.
▪ Often represented as virtual folders and subfolders.
▪ May be appropriate for Subjects or for Document Types.
▪ Content items can usually go into only one category, like classification.
▪ Categories are multi-level hierarchical.
▪ Category hierarchy is designed as a hierarchical taxonomy.
▪ Categories may or may not be metadata.
▪ Can also be automated or semi-automated.
Categories vs. Tags
Examples of both categories and tags within the same applications
Categories vs. Tags vs. Index terms
Categories
• What “buckets” the content goes into
• Like a table of contents
• Relatively broad
• Limited in number
• Mutually exclusive
• Sometimes hierarchical
• More controlled
• Pre-planned
• Supports browsing & filtering
Tags
• What topics the content contains
• Like an index
• More specific
• More numerous
• Overlapping
• Unstructured
• Less controlled
• Ad hoc
• Supports searching & filtering
Index terms
• What topics the content contains
• For an index
• More specific
• More numerous
• Overlapping
• Structured
• More controlled
• Pre-planned
• Supports browsing, searching & filtering
Indexing or tagging with a controlled vocabulary or not
Controlled vocabulary
• Using only pre-approved terms
• Used by indexers and content managers
• Ensures consistent indexing
• Slower to change and update
Keywords
• Assigning any terms desired
• Used by authors and editors
• Tends toward inconsistent terms and indexing
• Dynamic and responsive to trends
• May supplement a controlled vocabulary
Folksonomy
• Assigning any terms and reusing terms
• Used by authors, editors, content managers, users
• Tends toward inconsistent terms and indexing
• Dynamic and responsive to trends
• May supplement a controlled vocabulary
• More collaborative, as “social tagging”
Taxonomy design for manual indexing
▪ Use of alternative labels/nonpreferred terms
(considering also search or browse UI, from start of term)
▪ Use of associative (related term – RT) relationships in addition to hierarchical
▪ Scope notes, dedicated Indexer notes, occasional definitions of terms
▪ Grouped distinct term sets, hierarchies, or facets for comprehensive indexing
(even if distinct term sets or facets are not supported in the end-user interface)
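The design elements above can be pictured as fields on a term record. The following is a minimal Python sketch, not the data model of any particular taxonomy tool: the class name `Term`, its field names, and the sample terms are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One taxonomy concept as an indexer might see it (hypothetical model)."""
    preferred_label: str
    alt_labels: list[str] = field(default_factory=list)  # nonpreferred terms
    broader: list[str] = field(default_factory=list)     # hierarchical parents
    related: list[str] = field(default_factory=list)     # associative (RT) links
    scope_note: str = ""                                  # usage guidance for indexers
    facet: str = "Subjects"                               # distinct term set or facet


# Sample data, invented for illustration only.
taxonomy = [
    Term("Automobiles", alt_labels=["Cars", "Motor cars"],
         broader=["Vehicles"], related=["Automobile industry"],
         scope_note="Use for passenger vehicles; for trucks use Trucks."),
    Term("Vehicles"),
    Term("Automobile industry", related=["Automobiles"], facet="Industries"),
]


def entry_points(terms: list[Term]) -> list[str]:
    """All labels, preferred and alternative, sorted for an alphabetical browse."""
    labels = []
    for term in terms:
        labels.append(term.preferred_label)
        labels.extend(f"{alt} USE {term.preferred_label}" for alt in term.alt_labels)
    return sorted(labels, key=str.casefold)
```

Calling `entry_points(taxonomy)` lists alternative labels alongside preferred ones, which is what gives an indexer several ways into the same concept.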
Taxonomy Design and Display for Indexing
Indexing user interface and experience (UI/UX) with taxonomy
Tagging interfaces of commercial CMSs are generally not user-friendly.
For large-volume manual tagging, consider developing your own.
Desirable features
▪ Both alphabetical and hierarchical browse options
▪ Alphabetical browse with alternative labels/nonpreferred terms
▪ Various search options: Begins with, Word/phrase within, Exact, Smart
▪ Exact term matches are validated and don’t require searching/browsing
▪ Shortcuts (abbreviations) for commonly indexed terms
▪ Auto-conversion of selected alternative labels/nonpreferred terms to preferred
▪ Indexing steps with keyboard shortcuts, and not just mouse, for speed
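Three of the features above (exact-match validation, "Begins with" search, and auto-conversion of alternative labels to preferred terms) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `LABELS` sample data, and the case-insensitive matching are hypothetical choices, not a description of any existing indexing tool.

```python
from typing import Optional

# Hypothetical sample: preferred term -> its alternative (nonpreferred) labels.
LABELS = {
    "Automobiles": ["Cars", "Motor cars"],
    "Career development": ["Professional development"],
}


def build_lookup(preferred_to_alts: dict[str, list[str]]) -> dict[str, str]:
    """Map every label, case-folded, to its preferred term."""
    lookup = {}
    for preferred, alts in preferred_to_alts.items():
        lookup[preferred.casefold()] = preferred
        for alt in alts:
            lookup[alt.casefold()] = preferred
    return lookup


def resolve(entry: str, lookup: dict[str, str]) -> Optional[str]:
    """Exact match: validate the entry and convert it to the preferred term.

    Returns None when the entry is not a known label.
    """
    return lookup.get(entry.strip().casefold())


def begins_with(prefix: str, lookup: dict[str, str]) -> list[str]:
    """'Begins with' search across preferred and alternative labels."""
    p = prefix.strip().casefold()
    return sorted({pref for label, pref in lookup.items() if label.startswith(p)})


LOOKUP = build_lookup(LABELS)
```

With this sample data, typing "cars" resolves directly to Automobiles without any browsing, while an unknown entry returns `None` and could prompt a search or a new-term nomination.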
Indexing UI display
Screenshot example
(Gale/Cengage internal)
Indexing policy, rules, and documentation should cover:
▪ Criteria for determining topic or name relevancy for indexing
▪ Depth, level of detail
▪ Comprehensiveness of aspects (what, who, where, when, how, why, etc.)
▪ Required term types/facets (and any dependencies)
▪ Number of terms (of each type)
▪ Indexing of certain terms in combination
e.g.: a parent/broader term in addition to its narrower/child term
▪ Other required metadata to enter
➢ Recommendations/guidelines and rules/requirements
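Some of the rules/requirements above lend themselves to automatic checks in an indexing tool. The sketch below assumes a hypothetical `IndexingPolicy` with three settings (required facets, a maximum number of terms per facet, and the parent-plus-child combination rule); the names, thresholds, and sample record are invented for illustration, and relevancy or depth judgments would still need a human reviewer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexingPolicy:
    required_facets: frozenset           # facets that must have at least one term
    max_terms_per_facet: int             # upper bound on terms of each type
    require_broader_with_narrower: bool  # index the parent along with its child


def check_record(assigned, broader_of, policy):
    """Return a list of policy problems for one indexed record.

    assigned:   dict of facet name -> list of assigned terms
    broader_of: dict of narrower term -> its broader (parent) term
    """
    problems = []
    for facet in sorted(policy.required_facets):
        if not assigned.get(facet):
            problems.append(f"missing required facet: {facet}")
    for facet, terms in assigned.items():
        if len(terms) > policy.max_terms_per_facet:
            problems.append(f"too many terms in facet: {facet}")
    if policy.require_broader_with_narrower:
        all_terms = {t for terms in assigned.values() for t in terms}
        for term in sorted(all_terms):
            parent = broader_of.get(term)
            if parent and parent not in all_terms:
                problems.append(f"add broader term {parent} for {term}")
    return problems


# Hypothetical policy and record.
policy = IndexingPolicy(frozenset({"Subjects", "Locations"}), 3, True)
broader = {"Electric vehicles": "Automobiles"}
record = {"Subjects": ["Electric vehicles"]}
```

Here `check_record(record, broader, policy)` reports a missing Locations term and a missing broader term, which a tool could show as warnings (guidelines) or block on (requirements).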
Indexing Policy, Documentation and Training
Indexer training
▪ Presenting the indexing policy/guidelines in a live or web session
▪ Training with examples on indexing that captures the “aboutness” of a
document rather than matching words in the text to taxonomy terms.
▪ Reviewing sample indexing and providing feedback.
Feedback from indexing to improve the taxonomy
Often based on statistics on term usage in indexing
▪ Underused terms may need added alternative labels or relationships.
▪ Overused terms may need to be split into more specific terms.
▪ Misused terms may need rewording, scope note, and/or alternative labels.
▪ Correctly used low-use terms can be dropped.
Also based on indexers’ individual requests and queries
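Term usage statistics of the kind described above can be gathered with a simple count. The following Python sketch is illustrative only: the function name `usage_report`, the thresholds `low` and `high_share`, and the sample records are assumptions, and suitable cutoffs would depend on the size of the collection and the taxonomy.

```python
from collections import Counter


def usage_report(indexed_records, all_terms, low=1, high_share=0.75):
    """Flag terms whose usage suggests a taxonomy review.

    indexed_records: list of term lists, one per indexed document
    all_terms:       every term in the taxonomy, including unused ones
    low:             counts at or below this are 'underused' (assumed threshold)
    high_share:      share of records at or above this is 'overused' (assumed)
    """
    # Count each term once per document, even if entered twice.
    counts = Counter(term for record in indexed_records for term in set(record))
    total = len(indexed_records)
    underused = sorted(t for t in all_terms if counts[t] <= low)
    overused = sorted(t for t in all_terms
                      if total and counts[t] / total >= high_share)
    return {"underused": underused, "overused": overused}


# Invented sample data.
records = [
    ["Automobiles", "Safety"],
    ["Automobiles"],
    ["Automobiles", "Recalls"],
    ["Safety", "Automobiles"],
]
terms = ["Automobiles", "Safety", "Recalls", "Motorcycles"]
```

Counts alone can surface underused and overused terms; spotting misused terms still requires reviewing samples of the indexing itself.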
Indexer-taxonomist communication for new terms
▪ Taxonomist informs indexers of new and changed terms, and indexing tips
(combinations of terms) for indexing new or recurring topics
▪ Indexers ask the taxonomist to clarify terms or create new terms
Methods:
▪ Email distribution lists
▪ Intranet bulletin posts
▪ Collaboration workspace posts
▪ Indexing software feature for new term nomination
Automated indexing/Auto-categorization/Auto-classification
Two primary methods: machine-learning-based and rules-based
Machine-learning based
Automatically categorizes/tags based on previous examples.
▪ System has complex mathematical algorithms.
▪ Content managers must provide multiple (tens or more) representative sample
documents for each taxonomy term to “train” the system. Results are reviewed
and training sets are “tuned.”
▪ Matches are to terms and alternative labels, which can be individually weighted.
▪ System may also “suggest” additional terms to add to taxonomy.
▪ Best if large body of pre-indexed records already exists
(such as migrating from manual to automated indexing)
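To make the idea of training from examples concrete, here is a deliberately tiny Python sketch. It is far simpler than the algorithms in commercial auto-classification systems: it assumes a bag-of-words model with cosine similarity, uses no stopword handling or label weighting, and the training sentences, term names, and `threshold` value are all invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokens(text):
    """Lowercase word tokens; a stand-in for real text processing."""
    return re.findall(r"[a-z]+", text.lower())


def train(samples):
    """samples: list of (document text, taxonomy term) pairs.

    Returns a dict of term -> combined word counts of its sample documents.
    """
    profiles = defaultdict(Counter)
    for text, term in samples:
        profiles[term].update(tokens(text))
    return dict(profiles)


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def suggest(text, profiles, threshold=0.2):
    """Return (term, score) pairs at or above the threshold, best first."""
    doc = Counter(tokens(text))
    scored = [(term, cosine(doc, profile)) for term, profile in profiles.items()]
    return sorted([s for s in scored if s[1] >= threshold], key=lambda s: -s[1])


# Invented training set: a real one would need tens of documents per term.
TRAINING = [
    ("The automaker recalled sedans over faulty brakes", "Automobiles"),
    ("New electric cars and sedans reach dealers", "Automobiles"),
    ("The central bank raised interest rates again", "Monetary policy"),
    ("Inflation and interest rates worry the bank", "Monetary policy"),
]
model = train(TRAINING)
```

"Tuning" in this picture would mean adding or removing sample documents and adjusting the threshold after reviewing the suggested terms.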
Automated Indexing Methods
Rules-based auto-indexing
Rules are created for each taxonomy term.
▪ Rules are based on synonyms plus additional conditions.
▪ Some systems feature weighting of synonyms.
▪ Some systems feature more sophisticated rule-writing, like advanced Boolean
searching (in reverse) and proximity operators or regular expressions.
▪ Some systems feature auto-generated suggested rules for each term/synonym
which can be manually edited in addition to writing rules from scratch.
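A rule set of this kind might look roughly like the Python sketch below, which combines weighted synonym patterns, a Boolean-style exclusion, and a proximity condition written as a regular expression. The rule format, the two sample terms, the weights, and the thresholds are all assumptions for illustration; each auto-indexing product has its own rule syntax.

```python
import re

# Hypothetical rules: each taxonomy term has weighted patterns plus an optional
# pattern that must NOT appear (a crude stand-in for Boolean NOT).
RULES = {
    "Mercury (planet)": {
        "patterns": [(r"\bmercury\b", 1.0), (r"\bplanets?\b", 0.5)],
        "must_not": r"\b(thermometers?|poisoning|toxic)\b",
        "threshold": 1.5,
    },
    "Mercury (element)": {
        "patterns": [
            (r"\bmercury\b", 1.0),
            (r"\b(toxic|poisoning)\b", 0.5),
            # Proximity: "mercury" followed by "levels" within a few words.
            (r"\bmercury\W+(?:\w+\W+){0,3}?levels\b", 1.0),
        ],
        "must_not": None,
        "threshold": 1.5,
    },
}


def auto_index(text, rules=RULES):
    """Return the terms whose summed pattern weights reach the term's threshold."""
    text = text.lower()
    assigned = []
    for term, rule in rules.items():
        if rule["must_not"] and re.search(rule["must_not"], text):
            continue  # exclusion condition met: never assign this term
        score = sum(weight for pattern, weight in rule["patterns"]
                    if re.search(pattern, text))
        if score >= rule["threshold"]:
            assigned.append(term)
    return assigned
```

The synonym alone never reaches the threshold here, so the ambiguous word "mercury" is only indexed when a supporting condition confirms which sense is meant.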
Manual tasks for automated indexing
Continual update work is needed for each new term created.
▪ New training documents need to be added and taxonomy terms tuned.
▪ New rules need to be created or edited.
➢ Identifying and tuning training documents is more appropriate for subject
matter experts, editors, indexers.
➢ Writing rules is more appropriate for information professionals, taxonomists,
knowledge engineers.
Benefits of manual indexing
▪ Can audit and check indexers’ work immediately and make corrections or give
instructions
▪ Can respond quickly to indexers’ requests for new tags
▪ Can make and/or use compound headings
▪ Can handle terms that could go under multiple headings and make an educated
and nuanced decision about where to index the information correctly
▪ Human interpretation of complex subjects is hard to automate with rules.
Auto-indexing can lead to inconsistent and uncertain results for complex
subjects
▪ Higher levels of precision and recall: indexers’ inconsistencies are minor,
compared with potential automated indexing errors.
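Precision and recall for indexing are commonly measured by comparing the assigned terms with a reviewed reference set for the same document. The Python sketch below shows the arithmetic only; the function name and the sample term sets are invented, and the numbers are not results from any real comparison of manual and automated indexing.

```python
def precision_recall(assigned, relevant):
    """Compare assigned index terms with a reviewer's reference set.

    precision = correct assigned terms / all assigned terms
    recall    = correct assigned terms / all relevant terms
    """
    assigned, relevant = set(assigned), set(relevant)
    correct = assigned & relevant
    precision = len(correct) / len(assigned) if assigned else 0.0
    recall = len(correct) / len(relevant) if relevant else 0.0
    return precision, recall


# Invented example for one document.
reference = {"Automobiles", "Recalls", "Safety"}
indexer_a = ["Automobiles", "Recalls"]
indexer_b = ["Automobiles", "Safety", "Banking", "Mercury (planet)"]
```

In this made-up case the first set has perfect precision but misses one relevant term, while the second finds the same number of relevant terms but half of what it assigned is wrong.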
Manual Indexing
Benefits of manual indexing:
Process advantages
▪ Handles complex documents that require human interpretation to analyze
terms
▪ Can figure out indexing parameters as you go. Not everything needs to be
decided ahead of time, so the process is very responsive to change
▪ Can make major structural changes right away because the decision maker is
right there
▪ Can clean out old, out-of-date, unusable tags while tagging
▪ Can create new tags immediately when needed. Able to adapt and change the
taxonomy as it evolves with new information, instead of coming back later
to find information to tag
Considerations in choosing an indexing method
Manual versus Automated Indexing
Manual methods
➢ Manageable number of documents
➢ Higher accuracy in indexing
➢ May include non-text files
➢ Investing in people
➢ Low-tech: can build your own indexing tool/user interface
➢ Internal control
Automated methods
➢ Very large number of documents
➢ Greater speed in indexing
➢ Text files only
➢ Investing in technology
➢ High-tech: must purchase auto-indexing/classification software
➢ Software vendor relationship
Who are indexers and where to find them
▪ Full-time staff (for ongoing indexing)
‒ Editorial or subject specialization background + thorough indexing training
▪ Freelance or contractors (for temporary or part-time projects)
‒ Look for those with periodical article/database indexing experience
‒ Many have back-of-the-book indexing experience only
(so would require some additional training)
▪ Subject matter experts or not
‒ Scholarly or highly technical documents require subject expertise
‒ General enterprise or public content does not require subject expertise
▪ Rates can be hourly or per indexed record/document
▪ American Society for Indexing -- for finding freelance/contract indexers
Finding Indexers
American Society for Indexing
▪ ASI website screenshot: www.asindexing.org
Examples of Indexing Projects
▪ Educational institution: unique scholarly article for public research access
– Professional indexer for first phase; will train others on subsequent phases
– Taxonomy consultant remained available throughout first phase indexing
▪ International organization: SharePoint intranet taxonomy
– Request for not just guidelines but also training for tagging
▪ Fortune 500 firm: enterprise taxonomy for tagging articles
Questions/Contact
Terry Casey
Taxonomy Consultant
Casey Indexing and Information
Service
Saint Paul, MN
651-278-2023
terry@caseyindex.com
www.caseyindex.com
www.linkedin.com/in/terry-casey
Heather Hedden
Taxonomy Consultant
Hedden Information Management
Carlisle, MA
978-467-5195
heather@hedden.net
www.hedden-information.com
accidental-taxonomist.blogspot.com
www.linkedin.com/in/hedden
Twitter: @hhedden

More Related Content

What's hot

A Brief Introduction to SKOS
A Brief Introduction to SKOSA Brief Introduction to SKOS
A Brief Introduction to SKOS
Heather Hedden
 
Taxonomy Design for SharePoint
Taxonomy Design for SharePointTaxonomy Design for SharePoint
Taxonomy Design for SharePoint
Heather Hedden
 
Tools for Taxonomies
Tools for TaxonomiesTools for Taxonomies
Tools for Taxonomies
Earley Information Science
 
Successful Content Management Through Taxonomy And Metadata Design
Successful Content Management Through Taxonomy And Metadata DesignSuccessful Content Management Through Taxonomy And Metadata Design
Successful Content Management Through Taxonomy And Metadata Design
sarakirsten
 
Taxonomy Fundamentals Workshop 2013
Taxonomy Fundamentals Workshop 2013Taxonomy Fundamentals Workshop 2013
Taxonomy Fundamentals Workshop 2013
Access Innovations, Inc.
 
Organizing Knowledge: A Knowledge Manager’s Primer to Taxonomy Development
Organizing Knowledge: A Knowledge Manager’s Primer to Taxonomy DevelopmentOrganizing Knowledge: A Knowledge Manager’s Primer to Taxonomy Development
Organizing Knowledge: A Knowledge Manager’s Primer to Taxonomy Development
Art Schlussel
 
Taxonomies for E-commerce
Taxonomies for E-commerceTaxonomies for E-commerce
Taxonomies for E-commerce
Heather Hedden
 
Taxonomies for Text Analytics and Auto-indexing
Taxonomies for Text Analytics and Auto-indexingTaxonomies for Text Analytics and Auto-indexing
Taxonomies for Text Analytics and Auto-indexing
Heather Hedden
 
Taxonomies, Categories, and Tags in WordPress
Taxonomies, Categories, and Tags in WordPressTaxonomies, Categories, and Tags in WordPress
Taxonomies, Categories, and Tags in WordPress
Heather Hedden
 
Taxonomies And Search Aiim Mn
Taxonomies And Search Aiim MnTaxonomies And Search Aiim Mn
Taxonomies And Search Aiim Mn
AIIM Minnesota
 
Why Are Taxonomies Necessary?
Why Are Taxonomies Necessary?Why Are Taxonomies Necessary?
Why Are Taxonomies Necessary?
Fred Leise
 
Customer-Focused Thesauri
Customer-Focused ThesauriCustomer-Focused Thesauri
Customer-Focused Thesauri
Heather Hedden
 
Synonyms, Alternative Labels, and Nonpreferred Terms
Synonyms, Alternative Labels, and Nonpreferred TermsSynonyms, Alternative Labels, and Nonpreferred Terms
Synonyms, Alternative Labels, and Nonpreferred Terms
Heather Hedden
 
Taxonomies for Human vs Auto-Indexing
Taxonomies for Human vs Auto-IndexingTaxonomies for Human vs Auto-Indexing
Taxonomies for Human vs Auto-Indexing
Heather Hedden
 
Exploring Web Pages' Semantics and a Subject Hierarchy for Supporting Persona...
Exploring Web Pages' Semantics and a Subject Hierarchy for Supporting Persona...Exploring Web Pages' Semantics and a Subject Hierarchy for Supporting Persona...
Exploring Web Pages' Semantics and a Subject Hierarchy for Supporting Persona...
inscit2006
 
Analyse and present week 1 - topic 1.1
Analyse and present   week 1 - topic 1.1Analyse and present   week 1 - topic 1.1
Analyse and present week 1 - topic 1.1
10shilcl
 
Evaluating Taxonomies
Evaluating TaxonomiesEvaluating Taxonomies
Evaluating Taxonomies
Joseph Busch
 
NISO/NFAIS Supplemental Journal Article Materials Working Group (2011 CrossRe...
NISO/NFAIS Supplemental Journal Article Materials Working Group (2011 CrossRe...NISO/NFAIS Supplemental Journal Article Materials Working Group (2011 CrossRe...
NISO/NFAIS Supplemental Journal Article Materials Working Group (2011 CrossRe...
Crossref
 
Taxonomy Development and Digital Projects
Taxonomy Development and Digital ProjectsTaxonomy Development and Digital Projects
Taxonomy Development and Digital Projects
daniela barbosa
 
Practice point resource guide feb 2016
Practice point resource guide feb 2016Practice point resource guide feb 2016
Practice point resource guide feb 2016
Jean-Paul de Laureal
 

What's hot (20)

A Brief Introduction to SKOS
A Brief Introduction to SKOSA Brief Introduction to SKOS
A Brief Introduction to SKOS
 
Taxonomy Design for SharePoint
Taxonomy Design for SharePointTaxonomy Design for SharePoint
Taxonomy Design for SharePoint
 
Tools for Taxonomies
Tools for TaxonomiesTools for Taxonomies
Tools for Taxonomies
 
Successful Content Management Through Taxonomy And Metadata Design
Successful Content Management Through Taxonomy And Metadata DesignSuccessful Content Management Through Taxonomy And Metadata Design
Successful Content Management Through Taxonomy And Metadata Design
 
Taxonomy Fundamentals Workshop 2013
Taxonomy Fundamentals Workshop 2013Taxonomy Fundamentals Workshop 2013
Taxonomy Fundamentals Workshop 2013
 
Organizing Knowledge: A Knowledge Manager’s Primer to Taxonomy Development
Organizing Knowledge: A Knowledge Manager’s Primer to Taxonomy DevelopmentOrganizing Knowledge: A Knowledge Manager’s Primer to Taxonomy Development
Organizing Knowledge: A Knowledge Manager’s Primer to Taxonomy Development
 
Taxonomies for E-commerce
Taxonomies for E-commerceTaxonomies for E-commerce
Taxonomies for E-commerce
 
Taxonomies for Text Analytics and Auto-indexing
Taxonomies for Text Analytics and Auto-indexingTaxonomies for Text Analytics and Auto-indexing
Taxonomies for Text Analytics and Auto-indexing
 
Taxonomies, Categories, and Tags in WordPress
Taxonomies, Categories, and Tags in WordPressTaxonomies, Categories, and Tags in WordPress
Taxonomies, Categories, and Tags in WordPress
 
Taxonomies And Search Aiim Mn
Taxonomies And Search Aiim MnTaxonomies And Search Aiim Mn
Taxonomies And Search Aiim Mn
 
Why Are Taxonomies Necessary?
Why Are Taxonomies Necessary?Why Are Taxonomies Necessary?
Why Are Taxonomies Necessary?
 
Customer-Focused Thesauri
Customer-Focused ThesauriCustomer-Focused Thesauri
Customer-Focused Thesauri
 
Synonyms, Alternative Labels, and Nonpreferred Terms
Synonyms, Alternative Labels, and Nonpreferred TermsSynonyms, Alternative Labels, and Nonpreferred Terms
Synonyms, Alternative Labels, and Nonpreferred Terms
 
Taxonomies for Human vs Auto-Indexing
Taxonomies for Human vs Auto-IndexingTaxonomies for Human vs Auto-Indexing
Taxonomies for Human vs Auto-Indexing
 
Exploring Web Pages' Semantics and a Subject Hierarchy for Supporting Persona...
Exploring Web Pages' Semantics and a Subject Hierarchy for Supporting Persona...Exploring Web Pages' Semantics and a Subject Hierarchy for Supporting Persona...
Exploring Web Pages' Semantics and a Subject Hierarchy for Supporting Persona...
 
Analyse and present week 1 - topic 1.1
Analyse and present   week 1 - topic 1.1Analyse and present   week 1 - topic 1.1
Analyse and present week 1 - topic 1.1
 
Evaluating Taxonomies
Evaluating TaxonomiesEvaluating Taxonomies
Evaluating Taxonomies
 
NISO/NFAIS Supplemental Journal Article Materials Working Group (2011 CrossRe...
NISO/NFAIS Supplemental Journal Article Materials Working Group (2011 CrossRe...NISO/NFAIS Supplemental Journal Article Materials Working Group (2011 CrossRe...
NISO/NFAIS Supplemental Journal Article Materials Working Group (2011 CrossRe...
 
Taxonomy Development and Digital Projects
Taxonomy Development and Digital ProjectsTaxonomy Development and Digital Projects
Taxonomy Development and Digital Projects
 
Practice point resource guide feb 2016
Practice point resource guide feb 2016Practice point resource guide feb 2016
Practice point resource guide feb 2016
 

Similar to Managing Taxonomy Tagging

[AIIM17] Data Categorization You Can Live With - Monica Crocker
[AIIM17]  Data Categorization You Can Live With - Monica Crocker [AIIM17]  Data Categorization You Can Live With - Monica Crocker
[AIIM17] Data Categorization You Can Live With - Monica Crocker
AIIM International
 
Why You Need Intelligent Metadata and Auto-classification in Records Management
Why You Need Intelligent Metadata and Auto-classification in Records ManagementWhy You Need Intelligent Metadata and Auto-classification in Records Management
Why You Need Intelligent Metadata and Auto-classification in Records Management
Concept Searching, Inc
 
Introduction to Intranet Planning
Introduction to Intranet PlanningIntroduction to Intranet Planning
Introduction to Intranet Planning
Haaron Gonzalez
 
Information Architecture Exposing the Secret Sauce for Success
Information Architecture Exposing the Secret Sauce for Success Information Architecture Exposing the Secret Sauce for Success
Information Architecture Exposing the Secret Sauce for Success
Baltimore SharePoint (BSPUG)
 
SharePoint Saturday London - The Nuts and Bolts of Metadata Tagging and Taxon...
SharePoint Saturday London - The Nuts and Bolts of Metadata Tagging and Taxon...SharePoint Saturday London - The Nuts and Bolts of Metadata Tagging and Taxon...
SharePoint Saturday London - The Nuts and Bolts of Metadata Tagging and Taxon...
Concept Searching, Inc
 
Groundbreaking and Game-changing Enterprise Search Webinar
Groundbreaking and Game-changing Enterprise Search WebinarGroundbreaking and Game-changing Enterprise Search Webinar
Groundbreaking and Game-changing Enterprise Search Webinar
Concept Searching, Inc
 
The Nuts and Bolts of Metadata Tagging and Taxonomies Made Easy Webinar
The Nuts and Bolts of Metadata Tagging and Taxonomies Made Easy WebinarThe Nuts and Bolts of Metadata Tagging and Taxonomies Made Easy Webinar
The Nuts and Bolts of Metadata Tagging and Taxonomies Made Easy Webinar
Concept Searching, Inc
 
FEDSPUG Meeting: Intelligent Metadata and Auto-classification in Records Mana...
FEDSPUG Meeting: Intelligent Metadata and Auto-classification in Records Mana...FEDSPUG Meeting: Intelligent Metadata and Auto-classification in Records Mana...
FEDSPUG Meeting: Intelligent Metadata and Auto-classification in Records Mana...
Concept Searching, Inc
 
SPTechCon Austin 2015 - Perfecting Information Architecture
SPTechCon Austin 2015 - Perfecting Information ArchitectureSPTechCon Austin 2015 - Perfecting Information Architecture
SPTechCon Austin 2015 - Perfecting Information Architecture
Jill Hannemann
 
SharePoint Information Architecture Best Practices
SharePoint Information Architecture Best PracticesSharePoint Information Architecture Best Practices
SharePoint Information Architecture Best Practices
Stephanie Lemieux
 
DC SPUG Feb 2015 The Secret Sauce to Information Architecture
DC SPUG Feb 2015 The Secret Sauce to Information ArchitectureDC SPUG Feb 2015 The Secret Sauce to Information Architecture
DC SPUG Feb 2015 The Secret Sauce to Information Architecture
Jill Hannemann
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
Enterprise Knowledge
 
Testing Taxonomies
Testing TaxonomiesTesting Taxonomies
Testing Taxonomies
Heather Hedden
 
Optimizing DITA Content for Search Engine Optimization tekom tcworld 2016
Optimizing DITA Content for Search Engine Optimization tekom tcworld 2016Optimizing DITA Content for Search Engine Optimization tekom tcworld 2016
Optimizing DITA Content for Search Engine Optimization tekom tcworld 2016
IXIASOFT
 
Optimizing Your Content for Search
Optimizing Your Content for SearchOptimizing Your Content for Search
Optimizing Your Content for Search
Sharon Weaver
 
How did you find that?! Optimizing your SharePoint content for search
How did you find that?! Optimizing your SharePoint content for search How did you find that?! Optimizing your SharePoint content for search
How did you find that?! Optimizing your SharePoint content for search
Sharon Weaver
 
Developing retention rules that work
Developing retention rules that workDeveloping retention rules that work
Developing retention rules that work
Becky Bertram
 
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.comEnhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
Simon Hughes
 
Going Meta – How to use Metadata in SharePoint
Going Meta – How to use Metadata in SharePointGoing Meta – How to use Metadata in SharePoint
Going Meta – How to use Metadata in SharePoint
Concept Searching, Inc
 
Jason lax sap - from seo practitioner to seo evangelist
Jason lax   sap - from seo practitioner to seo evangelistJason lax   sap - from seo practitioner to seo evangelist
Jason lax sap - from seo practitioner to seo evangelist
Barry Schwartz
 

Similar to Managing Taxonomy Tagging (20)

[AIIM17] Data Categorization You Can Live With - Monica Crocker
[AIIM17]  Data Categorization You Can Live With - Monica Crocker [AIIM17]  Data Categorization You Can Live With - Monica Crocker
[AIIM17] Data Categorization You Can Live With - Monica Crocker
 
Why You Need Intelligent Metadata and Auto-classification in Records Management
Why You Need Intelligent Metadata and Auto-classification in Records ManagementWhy You Need Intelligent Metadata and Auto-classification in Records Management
Why You Need Intelligent Metadata and Auto-classification in Records Management
 
Introduction to Intranet Planning
Introduction to Intranet PlanningIntroduction to Intranet Planning
Introduction to Intranet Planning
 
Information Architecture Exposing the Secret Sauce for Success
Information Architecture Exposing the Secret Sauce for Success Information Architecture Exposing the Secret Sauce for Success
Information Architecture Exposing the Secret Sauce for Success
 
SharePoint Saturday London - The Nuts and Bolts of Metadata Tagging and Taxon...
SharePoint Saturday London - The Nuts and Bolts of Metadata Tagging and Taxon...SharePoint Saturday London - The Nuts and Bolts of Metadata Tagging and Taxon...
SharePoint Saturday London - The Nuts and Bolts of Metadata Tagging and Taxon...
 
Groundbreaking and Game-changing Enterprise Search Webinar
Groundbreaking and Game-changing Enterprise Search WebinarGroundbreaking and Game-changing Enterprise Search Webinar
Groundbreaking and Game-changing Enterprise Search Webinar
 
The Nuts and Bolts of Metadata Tagging and Taxonomies Made Easy Webinar
The Nuts and Bolts of Metadata Tagging and Taxonomies Made Easy WebinarThe Nuts and Bolts of Metadata Tagging and Taxonomies Made Easy Webinar
The Nuts and Bolts of Metadata Tagging and Taxonomies Made Easy Webinar
 
FEDSPUG Meeting: Intelligent Metadata and Auto-classification in Records Mana...
FEDSPUG Meeting: Intelligent Metadata and Auto-classification in Records Mana...FEDSPUG Meeting: Intelligent Metadata and Auto-classification in Records Mana...
FEDSPUG Meeting: Intelligent Metadata and Auto-classification in Records Mana...
 
SPTechCon Austin 2015 - Perfecting Information Architecture
SPTechCon Austin 2015 - Perfecting Information ArchitectureSPTechCon Austin 2015 - Perfecting Information Architecture
SPTechCon Austin 2015 - Perfecting Information Architecture
 
SharePoint Information Architecture Best Practices
SharePoint Information Architecture Best PracticesSharePoint Information Architecture Best Practices
SharePoint Information Architecture Best Practices
 
DC SPUG Feb 2015 The Secret Sauce to Information Architecture
DC SPUG Feb 2015 The Secret Sauce to Information ArchitectureDC SPUG Feb 2015 The Secret Sauce to Information Architecture
DC SPUG Feb 2015 The Secret Sauce to Information Architecture
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Testing Taxonomies
Testing TaxonomiesTesting Taxonomies
Testing Taxonomies
 
Optimizing DITA Content for Search Engine Optimization tekom tcworld 2016
Optimizing DITA Content for Search Engine Optimization tekom tcworld 2016Optimizing DITA Content for Search Engine Optimization tekom tcworld 2016
Optimizing DITA Content for Search Engine Optimization tekom tcworld 2016
 
Optimizing Your Content for Search
Optimizing Your Content for SearchOptimizing Your Content for Search
Optimizing Your Content for Search
 
How did you find that?! Optimizing your SharePoint content for search
How did you find that?! Optimizing your SharePoint content for search How did you find that?! Optimizing your SharePoint content for search
How did you find that?! Optimizing your SharePoint content for search
 
Developing retention rules that work
Developing retention rules that workDeveloping retention rules that work
Developing retention rules that work
 
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.comEnhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
 
Going Meta – How to use Metadata in SharePoint
Going Meta – How to use Metadata in SharePointGoing Meta – How to use Metadata in SharePoint
Going Meta – How to use Metadata in SharePoint
 
Jason lax sap - from seo practitioner to seo evangelist
Jason lax   sap - from seo practitioner to seo evangelistJason lax   sap - from seo practitioner to seo evangelist
Jason lax sap - from seo practitioner to seo evangelist
 

More from Heather Hedden

Introduction to Knowledge Graphs for Information Architects.pdf
Introduction to Knowledge Graphs for Information Architects.pdfIntroduction to Knowledge Graphs for Information Architects.pdf
Introduction to Knowledge Graphs for Information Architects.pdf
Heather Hedden
 
Thesauri for Indexing Support / Thesauri zur Unterstützung der Registererstel...
Thesauri for Indexing Support / Thesauri zur Unterstützung der Registererstel...Thesauri for Indexing Support / Thesauri zur Unterstützung der Registererstel...
Thesauri for Indexing Support / Thesauri zur Unterstützung der Registererstel...
Heather Hedden
 
Managing Mature Taxonomies: Resolving Orphan Terms
Managing Mature Taxonomies: Resolving Orphan TermsManaging Mature Taxonomies: Resolving Orphan Terms
Managing Mature Taxonomies: Resolving Orphan Terms
Heather Hedden
 
Taxonomy Displays: Bridging UX & Taxonomy Design
Taxonomy Displays: Bridging UX & Taxonomy DesignTaxonomy Displays: Bridging UX & Taxonomy Design
Taxonomy Displays: Bridging UX & Taxonomy Design
Heather Hedden
 
Mapping, Merging, and Multilingual Taxonomies
Mapping, Merging, and Multilingual TaxonomiesMapping, Merging, and Multilingual Taxonomies
Mapping, Merging, and Multilingual Taxonomies
Heather Hedden
 
Making Decisions in Creating Taxonomies
Making Decisions in Creating TaxonomiesMaking Decisions in Creating Taxonomies
Making Decisions in Creating Taxonomies
Heather Hedden
 

Managing Taxonomy Tagging

  • 1. Managing Taxonomy Tagging Taxonomy Boot Camp Washington, DC November 4, 2019 Heather Hedden & Terry Casey
  • 2. Terry Casey ▪ Taxonomist – Independent consultant, Casey Indexing and Information Service – Currently contract staff taxonomist ▪ Back-of-the-book indexer – Textbook, scholarly, trade book and periodical indexer – Embedded indexes for digital publications About Us 2 Heather Hedden ▪ Taxonomist – Independent consultant, Hedden Information Management – Previously employed and contract consultant and staff taxonomist ▪ Former indexer – Periodical article indexer at library vendor IAC (acquired by Gale) – Freelance back-of-the-book indexer ▪ Author of The Accidental Taxonomist (2010, 2016, Information Today, Inc.)
  • 3. ▪ Introduction: Tagging, Indexing, Categorizing ▪ Taxonomy Design and Display for Indexing ▪ Indexing Policy, Documentation, and Training ▪ Automated Indexing Methods ▪ Manual Indexing ▪ Finding Indexers ▪ Examples of Indexing Projects Outline 3
  • 4. Tagging vs. indexing vs. categorizing/classifying Tagging – assigning metadata labels (“tags”) ▪ By identifying topics and names within a document or content item ▪ By content creators or editors (minimally trained in tagging), not as their primary job responsibility ▪ For metadata both with and without controlled vocabularies ▪ To support search ▪ Can also be semi-automated Indexing – assigning index terms (subject metadata and related elements) ▪ By identifying topics and names within a document or content item ▪ By trained indexers, often as their primary job responsibility ▪ By selecting terms from a large controlled vocabulary/thesaurus/taxonomy ▪ To create a browsable index (and now also to support search) ▪ Can also be semi-automated Introduction 4
  • 5. Tagging vs. indexing vs. categorizing/classifying Categorizing/classifying – organizing & assigning content into named categories ▪ By identifying which category a document or content item belongs within ▪ A feature of most content management systems, in addition to tagging. ▪ Often represented as virtual folders and subfolders. ▪ May be appropriate for Subjects or for Document Types. ▪ Content items can usually go into only one category, like classification. ▪ Categories are multi-level hierarchical. ▪ Category hierarchy is designed as a hierarchical taxonomy. ▪ Categories may or may not be metadata. ▪ Can also be automated or semi-automated. Introduction 5
  • 6. Introduction 6 Categories vs. Tags Examples of both categories and tags within the same applications
  • 7. Introduction 7 Categories vs. Tags vs. Index terms Categories: • What “buckets” the content goes into • Like a table of contents • Relatively broad • Limited in number • Mutually exclusive • Sometimes hierarchical • More controlled • Pre-planned • Supports browsing & filtering Tags: • What topics the content contains • Like an index • More specific • More numerous • Overlapping • Unstructured • Less controlled • Ad hoc • Supports searching & filtering Index terms: • What topics the content contains • For an index • More specific • More numerous • Overlapping • Structured • More controlled • Pre-planned • Supports browsing, searching & filtering
  • 8. Introduction 8 Indexing or tagging with a controlled vocabulary or not Controlled vocabulary: • Using only pre-approved terms • Used by indexers and content managers • Ensures consistent indexing • Slower to change and update Keywords: • Assigning any terms desired • Used by authors and editors • Tends toward inconsistent terms and indexing • Responsive to trends and dynamic • May supplement a controlled vocabulary Folksonomy: • Assigning any terms and reusing terms • Used by authors, editors, content managers, users • Tends toward inconsistent terms and indexing • Responsive to trends and dynamic • May supplement a controlled vocabulary • More collaborative as “social tagging”
  • 9. Taxonomy design for manual indexing ▪ Use of alternative labels/nonpreferred terms (considering also search or browse UI, from start of term) ▪ Use of associative (related term – RT) relationships in addition to hierarchical ▪ Scope notes, dedicated Indexer notes, occasional definitions of terms ▪ Grouped distinct term sets, hierarchies, or facets for comprehensive indexing (even if distinct term sets or facets are not supported in the end-user interface) Taxonomy Design and Display for Indexing 9
  • 10. Indexing user interface and experience (UI/UX) with taxonomy Tagging interfaces of a commercial CMS are not user friendly. For large volume manual tagging, develop your own. Desirable features ▪ Both alphabetical and hierarchical browse options ▪ Alphabetical browse with alternative labels/nonpreferred terms ▪ Various search options: Begins with, Word/phrase within, Exact, Smart ▪ Exact term matches are validated and don’t require searching/browsing ▪ Shortcuts (abbreviations) for commonly indexed terms ▪ Auto-conversion of selected alternative labels/nonpreferred terms to preferred ▪ Indexing steps with keyboard shortcuts, and not just mouse, for speed Taxonomy Design and Display for Indexing 10
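Two of the desirable features above, auto-conversion of alternative labels to preferred terms and the "Begins with" / "Word/phrase within" / "Exact" search options, can be illustrated with a minimal Python sketch. The taxonomy contents and the function names (`build_lookup`, `resolve`, `search`) are hypothetical, chosen only for illustration; they do not come from the presentation or from any particular indexing tool.

```python
from typing import Dict, List, Optional

# Hypothetical mini-taxonomy: preferred term -> alternative (nonpreferred) labels.
TAXONOMY: Dict[str, List[str]] = {
    "Automobiles": ["Cars", "Motor vehicles"],
    "Artificial intelligence": ["AI", "Machine intelligence"],
}


def build_lookup(taxonomy: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every label (preferred or alternative), case-folded, to its preferred term."""
    lookup: Dict[str, str] = {}
    for preferred, alternatives in taxonomy.items():
        lookup[preferred.casefold()] = preferred
        for alt in alternatives:
            lookup[alt.casefold()] = preferred
    return lookup


LOOKUP = build_lookup(TAXONOMY)


def resolve(entry: str) -> Optional[str]:
    """Validate an exact entry; an alternative label is converted to its preferred term."""
    return LOOKUP.get(entry.strip().casefold())


def search(query: str, mode: str = "begins") -> List[str]:
    """Return preferred terms whose labels match the query.

    mode is one of "begins", "within", or "exact"; any other value matches nothing.
    """
    q = query.strip().casefold()
    hits = set()
    for label, preferred in LOOKUP.items():
        if mode == "begins" and label.startswith(q):
            hits.add(preferred)
        elif mode == "within" and q in label:
            hits.add(preferred)
        elif mode == "exact" and label == q:
            hits.add(preferred)
    return sorted(hits)
```

In this sketch an indexer who types "cars" gets "Automobiles" without browsing, which is the behavior the "exact term matches are validated" and "auto-conversion" bullets describe.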
  • 11. Indexing UI display Screenshot example (Gale/Cengage internal) Taxonomy Design and Display for Indexing 11
  • 12. Indexing policy, rules, documentation, should cover: ▪ Criteria for determining topic or name relevancy for indexing ▪ Depth, level of detail ▪ Comprehensiveness of aspects (what, who, where, when, how, why, etc.) ▪ Required term types/facets (and any dependencies) ▪ Number of terms (of each type) ▪ Indexing of certain terms in combination e.g.: a parent/broader term in addition to its narrower/child term ▪ Other required metadata to enter ➢ Recommendations/guidelines and rules/requirements Indexing Policy, Documentation and Training 12
  • 13. Indexer training ▪ Presenting the indexing policy/guidelines live or as a web presentation ▪ Training with examples on indexing that captures the “aboutness” of a document rather than matching words in the text to taxonomy terms. ▪ Reviewing sample indexing and providing feedback. Indexing Policy, Documentation and Training 13
  • 14. Feedback from indexing to improve the taxonomy Often based on statistics on term usage in indexing ▪ Underused terms may need added alternative labels or relationships. ▪ Overused terms may need to be split into more specific terms. ▪ Misused terms may need rewording, scope note, and/or alternative labels. ▪ Correctly used low-use terms can be dropped. Also based on indexers’ individual requests and queries Indexing Policy, Documentation and Training 14
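As a rough illustration of the usage statistics mentioned above, the following Python sketch counts how many documents each taxonomy term was assigned to and flags candidates for review. The function name `usage_report` and the thresholds `low` and `high_share` are assumptions made for this example; real cutoffs would depend on the collection and on the indexing policy.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set


def usage_report(
    assignments: List[Iterable[str]],
    taxonomy_terms: Set[str],
    low: int = 1,
    high_share: float = 0.5,
) -> Dict[str, List[str]]:
    """Flag terms by how often they were used in indexing.

    assignments: one collection of assigned terms per indexed document.
    low: a term used in at most this many documents is "underused" (assumed cutoff).
    high_share: a term used in at least this share of documents is "overused" (assumed cutoff).
    """
    # Count each term once per document, even if it was entered twice.
    counts = Counter(term for terms in assignments for term in set(terms))
    total_docs = len(assignments)
    report: Dict[str, List[str]] = {"unused": [], "underused": [], "overused": []}
    for term in sorted(taxonomy_terms):
        n = counts[term]
        if n == 0:
            report["unused"].append(term)
        elif n <= low:
            report["underused"].append(term)
        elif total_docs and n / total_docs >= high_share:
            report["overused"].append(term)
    return report
```

The report only identifies candidates. Deciding whether an underused term needs more alternative labels, or whether an overused term should be split, remains a taxonomist's judgment.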
  • 15. Indexer-taxonomist communication for new terms ▪ Taxonomist informs indexers of new and changed terms, and gives indexing tips (combinations of terms) for indexing new or recurring topics ▪ Indexers ask the taxonomist to clarify terms or create new terms Methods: ▪ Email distribution lists ▪ Intranet bulletin posts ▪ Collaboration workspace posts ▪ Indexing software feature for new term nomination Indexing Policy, Documentation and Training 15
  • 16. Automated indexing/Auto-categorization/Auto-classification 2 primary methods: machine-learning and rules-based Machine-learning based Automatically categorizes/tags based on previous examples. ▪ System has complex mathematical algorithms. ▪ Content managers must provide multiple (tens or more) representative sample documents for each taxonomy term to “train” the system. Results are reviewed and training sets are “tuned.” ▪ Matches are to terms and alternative labels, which can be individually weighted. ▪ System may also “suggest” additional terms to add to taxonomy. ▪ Best if a large body of pre-indexed records already exists (such as migrating from manual to automated indexing) Automated Indexing Methods 16
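The idea of "training" on sample documents can be sketched in a few lines of Python. This is a deliberately simple stand-in (word-count profiles compared by cosine similarity), not the algorithm of any commercial auto-classification product, which the slides do not specify. The sample texts, the stopword list, and the `min_score` threshold are all invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# Small assumed stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "by", "in", "of", "the"}

# Hypothetical training samples: (document text, taxonomy term).
SAMPLES: List[Tuple[str, str]] = [
    ("The central bank raised interest rates", "Monetary policy"),
    ("Rates and inflation targets set by the bank", "Monetary policy"),
    ("The team won the football match", "Sports"),
    ("A late goal decided the match", "Sports"),
]


def tokens(text: str) -> List[str]:
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def train(samples: List[Tuple[str, str]]) -> Dict[str, Counter]:
    """Build one word-count profile per taxonomy term from its sample documents."""
    profiles: Dict[str, Counter] = defaultdict(Counter)
    for text, term in samples:
        profiles[term].update(tokens(text))
    return dict(profiles)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest(text: str, profiles: Dict[str, Counter], min_score: float = 0.2) -> List[str]:
    """Return terms whose profile is similar enough to the text, best match first."""
    doc = Counter(tokens(text))
    scored = [(term, cosine(doc, profile)) for term, profile in profiles.items()]
    scored.sort(key=lambda pair: -pair[1])
    return [term for term, score in scored if score >= min_score]
```

With only two samples per term the profiles are very thin, which is why the slide calls for tens of representative documents per term and for review and tuning of the results.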
  • 17. Rules-based auto-indexing Rules are created for each taxonomy term. ▪ Rules are based on synonyms with more conditions. ▪ Some systems feature weighting of synonyms. ▪ Some systems feature more sophisticated rule-writing, like advanced Boolean searching (in reverse) and proximity operators or regular expressions. ▪ Some systems feature auto-generated suggested rules for each term/synonym which can be manually edited in addition to writing rules from scratch. Automated Indexing Methods 17
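A rules-based approach of the kind described above can be sketched with Python's standard `re` module. The `Rule` structure, the weights, the threshold, and the example terms are assumptions made for this illustration and do not reflect the rule syntax of any specific auto-indexing product. Each rule combines weighted synonym patterns, an optional exclusion condition (a Boolean NOT), and in one case a proximity-style regular expression.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Rule:
    term: str                      # taxonomy term the rule assigns
    patterns: Dict[str, float]     # regex for a synonym -> weight
    must_not: List[str] = field(default_factory=list)  # regexes that block the term
    threshold: float = 1.0         # minimum summed weight needed to assign the term


# Hypothetical rules for three taxonomy terms.
RULES = [
    Rule(
        term="Mergers and acquisitions",
        patterns={r"\bmergers?\b": 1.0, r"\bacquisitions?\b": 0.5, r"\btakeovers?\b": 0.5},
    ),
    Rule(
        term="Jaguars (animals)",
        patterns={r"\bjaguars?\b": 1.0},
        must_not=[r"\b(car|sedan|dealership)s?\b"],  # exclude the car brand sense
    ),
    Rule(
        term="Interest rates",
        # Proximity-style pattern: "interest" followed by "rate(s)" within three words.
        patterns={r"\binterest\W+(?:\w+\W+){0,3}?rates?\b": 1.0},
    ),
]


def apply_rules(text: str, rules: List[Rule]) -> List[str]:
    """Return the terms whose rules fire on the text, in rule order."""
    assigned = []
    for rule in rules:
        if any(re.search(p, text, re.IGNORECASE) for p in rule.must_not):
            continue
        score = sum(
            weight
            for pattern, weight in rule.patterns.items()
            if re.search(pattern, text, re.IGNORECASE)
        )
        if score >= rule.threshold:
            assigned.append(rule.term)
    return assigned
```

Here "acquisition" or "takeover" alone is too weak to assign the term, but the two together reach the threshold, which is one way synonym weighting can reduce false matches.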
  • 18. Manual tasks for automated indexing Continual update work is needed for each new term created. ▪ New training documents need to be added and taxonomy terms tuned. ▪ New rules need to be created or edited. ➢ Identifying and tuning training documents is more appropriate for subject matter experts, editors, indexers. ➢ Writing rules is more appropriate for information professionals, taxonomists, knowledge engineers. Automated Indexing Methods 18
  • 19. Benefits of manual indexing ▪ Can audit and check indexers’ work immediately and make corrections or give instructions ▪ Can respond to indexers’ requests for new tags quickly ▪ Can make and/or use compound headings ▪ Can handle terms that could go under multiple headings and make an educated and nuanced decision of where to index the information correctly ▪ Human interpretation of complex subjects is hard to automate with rules. Autoindexing can lead to inconsistent and uncertain results for complex subjects. ▪ Higher levels of precision and recall: indexers’ inconsistencies are minor, compared with potential automated indexing errors. Manual Indexing 19
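Precision and recall, mentioned in the last bullet, can be made concrete with a small Python sketch. It assumes one way of measuring them: comparing the terms assigned to a document against a reviewed reference set for that document. The slides do not say how the measures were applied, and the function name `precision_recall` is hypothetical.

```python
from typing import Iterable, Tuple


def precision_recall(assigned: Iterable[str], reference: Iterable[str]) -> Tuple[float, float]:
    """Compare assigned index terms with a reference ("gold standard") set.

    precision = correct assigned terms / all assigned terms
    recall    = correct assigned terms / all reference terms
    """
    assigned_set, reference_set = set(assigned), set(reference)
    correct = len(assigned_set & reference_set)
    precision = correct / len(assigned_set) if assigned_set else 0.0
    recall = correct / len(reference_set) if reference_set else 0.0
    return precision, recall
```

For example, if four terms are assigned and two of them appear in a three-term reference set, precision is 2/4 and recall is 2/3. Running the same comparison on manual and automated output for a sample of documents is one way to test the claim on a given collection.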
  • 20. Benefits of manual indexing: Process advantages ▪ Handles complex documents that require human interpretation to analyze terms ▪ Can figure out indexing parameters as you go. Not everything needs to be decided ahead of time; very responsive to change ▪ Make major structural changes right away because the decision maker is right there ▪ Clean out old, out-of-date, unusable tags while tagging ▪ Create new tags immediately when needed. Able to adapt and change the taxonomy while it is evolving with new information instead of later coming back to find information to tag Manual Indexing 20
  • 21. Considerations in choosing an indexing method Manual versus Automated Indexing 21 Manual methods ➢ Manageable number of documents ➢ Higher accuracy in indexing ➢ May include non-text files ➢ Investing in people ➢ Low-tech: can build your own indexing tool/user interface ➢ Internal control Automated methods ➢ Very large number of documents ➢ Greater speed in indexing ➢ Text files only ➢ Investing in technology ➢ High-tech: must purchase auto-indexing/classification software ➢ Software vendor relationship
  • 22. Who are indexers and where to find them ▪ Full-time staff (for ongoing indexing) ‒ Editorial or subject specialization background + thorough indexing training ▪ Freelance or contractors (for temporary or part-time projects) ‒ Look for those with periodical article/database indexing experience ‒ Many have back-of-the-book indexing experience only (so would require some additional training) ▪ Subject matter experts or not ‒ Scholarly or highly technical documents require subject expertise ‒ General enterprise or public content does not require subject expertise ▪ Rates can be hourly or per indexed record/document ▪ American Society for Indexing -- for finding freelance/contract indexers Finding Indexers 22
  • 23. American Society for Indexing ▪ ASI Website screen shot Indexers 23 www.asindexing.org
  • 24. ▪ Educational institution: unique scholarly article for public research access ▪ Professional indexer for first phase; will train others on subsequent phases ▪ Taxonomy consultant remained available throughout first phase indexing ▪ International organization: SharePoint intranet taxonomy ▪ Request for not just guidelines but also training for tagging ▪ Fortune 500 firm: enterprise taxonomy for tagging articles Examples of Indexing Projects 24
  • 25. Questions/Contact 25 Terry Casey Taxonomy Consultant Casey Indexing and Information Service Saint Paul, MN 651-278-2023 terry@caseyindex.com www.caseyindex.com www.linkedin.com/in/terry-casey Heather Hedden Taxonomy Consultant Hedden Information Management Carlisle, MA 978-467-5195 heather@hedden.net www.hedden-information.com accidental-taxonomist.blogspot.com www.linkedin.com/in/hedden Twitter: @hhedden