Why Are Taxonomies Necessary?  By Fred Leise ContextualAnalysis, LLC
Taxonomies are sets of terms (controlled vocabularies or CVs) used to tag documents or other content objects. Taxonomies may also be used as browsing hierarchies or for search enhancement. What Are Taxonomies?
Taxonomy terms are collected into groups called attributes. Each attribute (or facet) describes one property of your content.
Example: Attribute: Office Location. Terms: London; New York City (alternate terms: NYC, Big Apple); Washington, DC.
In this example, “NYC” and “Big Apple” are given as variants for “New York City.” Variant terms are used to expand search queries. If a user enters “New York City,” the search system expands the query to “New York City OR NYC OR Big Apple.”
Search query expansion ensures that more relevant information is found, even when that information uses terms the searcher hasn’t thought of.
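The expansion step described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular search engine's behavior: the `OFFICE_LOCATION` dictionary and the `expand_query` function are hypothetical names, and the OR-joined output string is just one way a query could be rewritten.

```python
# Hypothetical controlled vocabulary for the "Office Location" attribute.
# Each preferred term maps to its variant (alternate) terms.
OFFICE_LOCATION = {
    "London": [],
    "New York City": ["NYC", "Big Apple"],
    "Washington, DC": [],
}


def expand_query(query, vocabulary):
    """Return the matching term and all of its variants, joined with OR.

    A query that matches nothing in the vocabulary is returned unchanged.
    """
    wanted = query.strip().lower()
    for preferred, variants in vocabulary.items():
        ring = [preferred, *variants]
        if wanted in (term.lower() for term in ring):
            return " OR ".join(f'"{term}"' for term in ring)
    return query


print(expand_query("NYC", OFFICE_LOCATION))
```

Matching is case-insensitive here so that “nyc” and “NYC” expand the same way; a real system would likely need further normalization.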
Other typical attributes include: Author, Creation Date, Audience, Version Number, Subject.
There is an international standard for metadata, the Dublin Core Metadata Element Set, consisting of 15 attributes.
Good metadata schemas (collections of attributes) will adhere as closely as possible to the Dublin Core standard. More information is available at: www.dublincore.org
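One way to check how closely a schema follows Dublin Core is to map each local attribute to one of the 15 elements and list whatever is left over. The sketch below uses the typical attributes named earlier; the mapping itself is an assumption made for illustration, and `unmapped_attributes` is a hypothetical helper.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)

# Assumed mapping from local attribute names to Dublin Core elements.
# None marks an attribute with no obvious counterpart among the 15.
LOCAL_TO_DC = {
    "Author": "creator",
    "Creation Date": "date",
    "Subject": "subject",
    "Audience": None,
    "Version Number": None,
}


def unmapped_attributes(mapping):
    """Return the local attributes that do not map to a Dublin Core element."""
    return sorted(
        name for name, element in mapping.items()
        if element not in DUBLIN_CORE_ELEMENTS
    )


print(unmapped_attributes(LOCAL_TO_DC))
```

Attributes that stay unmapped are not necessarily wrong; they are simply the places where a schema extends beyond the core set.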
Well-designed taxonomies: 1. Enable users to find relevant information quickly and efficiently (improved retrieval). 2. Lead users to additional relevant information, providing upselling and cross-selling opportunities.
Well-designed taxonomies (continued): 3. Assist authors in consistently tagging content.
Proper use of taxonomies results in: less time wasted searching for information; fewer failed searches; fewer abandoned interactions; increased income; reduced customer assistance costs.
English is rich in words that mean the same or nearly the same thing: feline/cat; car/automobile; travel/journey/excursion/trip; jeans/denims/Levi's/501s. Why Are Taxonomies Important?
Result: scattering of information. No matter what term you use in a free-text search, you get only part of the relevant information. The rest is not retrieved because it uses different terms to describe the same concept.
Consider the example of mobile devices. There are many ways that users can refer to them: personal digital assistants, handheld computers, Blackberries, PDAs.
If users don’t know the term you use to label the information they are looking for, they waste time browsing or give up their search completely. They are victims of a communication chasm.
You use the term “cat.” I use “feline.” If we each search a recipe database that uses both terms with equal frequency, we will each get back only half the appropriate recipes, a recall ratio of 50%.
Solution: Add a controlled vocabulary to the search system that gives “feline” and “cat” as equivalent terms. Search queries will be expanded appropriately.
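The 50% figure follows from the usual definition of recall: relevant items retrieved divided by all relevant items. The sketch below works through it on a made-up four-document collection in which half the documents say “cat” and half say “feline.” The document titles, the `EQUIVALENTS` table, and both function names are assumptions for illustration only.

```python
# Made-up collection: half the relevant documents say "cat", half say "feline".
DOCUMENTS = {
    1: "cat care basics",
    2: "feline nutrition guide",
    3: "choosing a cat carrier",
    4: "feline dental health",
}
RELEVANT = {1, 2, 3, 4}

# Controlled vocabulary listing "cat" and "feline" as equivalent terms.
EQUIVALENTS = {"cat": {"cat", "feline"}, "feline": {"cat", "feline"}}


def search(query, expand=False):
    """Return ids of documents containing the query term (or any equivalent)."""
    terms = EQUIVALENTS.get(query, {query}) if expand else {query}
    return {
        doc_id for doc_id, text in DOCUMENTS.items()
        if terms & set(text.split())
    }


def recall(retrieved, relevant):
    """Fraction of the relevant documents that were retrieved."""
    return len(retrieved & relevant) / len(relevant)


print(recall(search("cat"), RELEVANT))               # free-text search
print(recall(search("cat", expand=True), RELEVANT))  # with query expansion
```

With the equivalence in place, either searcher retrieves all four documents regardless of which word they typed.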
English is rich in words that have more than one disparate meaning. Pitch: to throw a baseball; a tar-like substance; a salesman’s monologue.
Bank: where you store money; the side of a river; to carom a cue ball off a pool table rail; to prepare a fire for the night; to maneuver a plane for a turn.
Result: Lots of false drops (irrelevant information), resulting in poor precision.
Solution: Use a CV that includes scope notes (definitions) or that uses facets. Example: Think about searching for the term “Rembrandt.” You might get the following results.
Search: Rembrandt [Go]. Results: (1) “The painter Rembrandt was one of the greatest of all the Dutch realists….” (2) “If you want to whiten and brighten your teeth, there is no better brand than Rembrandt.”
You probably are interested in only one of these “Rembrandts,” so half of your search results are irrelevant. Now consider what happens if you are able to specify the type of object you are looking for, either an artist or a toothpaste brand.
Artist: Rembrandt. Result: “The painter Rembrandt was one of the greatest of all the Dutch realists….” Brand Name: Rembrandt. Result: “If you want to whiten and brighten your teeth, there is no better brand than Rembrandt.”
You get only results relevant to what you are interested in. Here, having search boxes identified by attribute (faceted searching) lets you home in quickly on the particular information you want.
You could also use a single search box and let users filter or narrow the results after their search.
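The effect on precision can be shown with the two Rembrandt results. In this sketch each item carries the attribute it was tagged under, and a search may optionally be restricted to one attribute. The `Item` class, the `INDEX` list, and the function names are hypothetical; precision is computed as relevant hits divided by all hits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    text: str
    facet: str  # attribute under which the item was tagged


INDEX = [
    Item("The painter Rembrandt was one of the greatest of all the Dutch realists.", "Artist"),
    Item("If you want to whiten and brighten your teeth, there is no better brand than Rembrandt.", "Brand Name"),
]


def search(term, facet=None):
    """Return items mentioning the term, optionally limited to one facet."""
    hits = [item for item in INDEX if term.lower() in item.text.lower()]
    if facet is not None:
        hits = [item for item in hits if item.facet == facet]
    return hits


def precision(hits, wanted_facet):
    """Fraction of the hits that belong to the facet the user cares about."""
    if not hits:
        return 0.0
    return sum(item.facet == wanted_facet for item in hits) / len(hits)


print(precision(search("Rembrandt"), "Artist"))                  # single search box
print(precision(search("Rembrandt", facet="Artist"), "Artist"))  # faceted search
```

Passing `facet` at query time corresponds to the attribute-labeled search boxes; calling `search` without it and filtering afterward corresponds to narrowing results after a single search.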
 
Roles for Taxonomies: Tagging documents for a content management system. Provides administrative metadata to control authoring and publishing processes. How are Taxonomies Used?
Roles for Taxonomies: Administrative metadata example: Document #, Author, Department, Creation date, Publication date, Expiration date.
Roles for Taxonomies: Tagging document contents for a content management system. Provides metadata to support search. Ensures inter-indexer consistency.
Roles for Taxonomies: Tagging document contents for a content management system. Controls subject scattering. Increases search results relevance: tags “aboutness,” not just mentions of a word.
Roles for Taxonomies: Search engine component. Translates the user’s terms into those used to tag items (increases precision and recall). Offers options for expanding or reducing the scope of a search using broader or narrower terms.
Roles for Taxonomies: Search engine component. Differentiates between multiple meanings of terms.
Taxonomy Use: Search Results (screenshot: rei.com)
Roles for Taxonomies: Operating as a browsing hierarchy. Organizes content using taxonomy terms as category labels. Represents taxonomy hierarchy by browsing levels.
(Screenshot: rei.com browsing hierarchy, with Levels 1 through 4 labeled)
Synonym Ring: Identifies words with equivalent meanings (in a given context): rock = stone; CD-ROM = CD = disk; money = dough = bucks = greenbacks = legal tender. Types of Taxonomies
Synonym Ring: When one of the words in a synonym ring is searched for, the search engine expands the search and returns items containing any of the words in the ring.
Authority File: Has all the features of a synonym ring, plus the identification of preferred terms (approved terms/descriptors/keywords) for tagging content.
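The difference between a synonym ring and an authority file can be sketched as a lookup table: every word in a ring resolves to the one preferred term that authors should use as a tag. The rings below come from the synonym ring example, but which word is marked as preferred in each ring is an assumption made here, and `build_lookup` and `preferred_term` are hypothetical names.

```python
# Hypothetical authority file: preferred term -> its variant terms.
# The choice of preferred term in each ring is an assumption.
AUTHORITY_FILE = {
    "rock": {"stone"},
    "CD-ROM": {"CD", "disk"},
    "money": {"dough", "bucks", "greenbacks", "legal tender"},
}


def build_lookup(authority_file):
    """Map every term (preferred or variant, lowercased) to its preferred term."""
    lookup = {}
    for preferred, variants in authority_file.items():
        for term in {preferred, *variants}:
            lookup[term.lower()] = preferred
    return lookup


LOOKUP = build_lookup(AUTHORITY_FILE)


def preferred_term(term):
    """Return the approved tag for a term, or None if it is not in the vocabulary."""
    return LOOKUP.get(term.strip().lower())


print(preferred_term("greenbacks"))
```

Returning `None` for unknown terms gives a tagging tool a natural point at which to flag a candidate term for review rather than accept it silently.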
Taxonomy: Also called hierarchy or classification. All features of authority files, plus the broader term (BT) and narrower term (NT) relationships.
Taxonomy: All terms must be part of a hierarchical relationship (no orphan terms). Taxonomies may be presented in hierarchical or alphabetical format.
Types of Taxonomies: Taxonomy Example
total compensation
. compensation
. . base salary (salary)
. . deferred payments (deferred compensation)
. . variable pay
. benefits
. . 401(k) plan
. . health benefits
. . . dental plan
. . . disability insurance
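A dotted outline like this one can be turned into broader-term (BT) and narrower-term (NT) relationships mechanically. The sketch below assumes one term per line with one dot per level of depth, and it leaves out the parenthetical variants for simplicity; the function names are hypothetical.

```python
OUTLINE = """\
total compensation
. compensation
. . base salary
. . deferred payments
. . variable pay
. benefits
. . 401(k) plan
. . health benefits
. . . dental plan
. . . disability insurance
"""


def parse_outline(outline):
    """Map each term to its broader term (BT); the root maps to None."""
    broader = {}
    path = []  # path[d] is the most recent term seen at depth d
    for line in outline.splitlines():
        parts = line.split()
        if not parts:
            continue
        depth = 0
        while depth < len(parts) and parts[depth] == ".":
            depth += 1
        term = " ".join(parts[depth:])
        del path[depth:]  # drop terms at this depth or deeper
        broader[term] = path[-1] if path else None
        path.append(term)
    return broader


def narrower_terms(broader, term):
    """Return the narrower terms (NT) listed directly under a term."""
    return [t for t, parent in broader.items() if parent == term]


def orphans(broader):
    """Return terms that have neither a broader nor a narrower term."""
    parents = set(broader.values())
    return [t for t, parent in broader.items() if parent is None and t not in parents]


BT = parse_outline(OUTLINE)
print(BT["dental plan"])
print(narrower_terms(BT, "compensation"))
print(orphans(BT))
```

The `orphans` check reflects the rule that every term must take part in a hierarchical relationship: a term with no broader term is acceptable only if it has narrower terms beneath it.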
Thesaurus (plural form: thesauri): All the features of taxonomies, plus the associative relationship of related terms (RT).
Types of Taxonomies: Thesaurus Example, Alphabetical
Building Permits
  BT Permits
Business Licenses
  BT Licenses
Business Taxes
  BT Taxes
Fees
  RT Taxes
Licenses
  NT Business Licenses
  RT Permits
Operating Permits
  BT Permits
Permits
  NT Building Permits; Operating Permits
  RT Licenses
Taxes
  NT Business Taxes
  RT Fees
Types of Taxonomies: Thesaurus Example, Hierarchical
Vocabulary Terms              Related Terms
Licenses, Permits & Taxes
. Fees                        Taxes
. Licenses                    Permits
. . Business Licenses
. Permits                     Licenses
. . Building Permits
. . Operating Permits
. Taxes                       Fees
. . Business Taxes
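In the alphabetical listing every relationship appears twice: a BT on one term is matched by an NT on the other, and each RT is mirrored. A small consistency check along those lines might look like the sketch below. The dictionary layout and the `missing_reciprocals` function are assumptions for illustration; the entries are copied from the alphabetical thesaurus example.

```python
THESAURUS = {
    "Building Permits": {"BT": ["Permits"]},
    "Business Licenses": {"BT": ["Licenses"]},
    "Business Taxes": {"BT": ["Taxes"]},
    "Fees": {"RT": ["Taxes"]},
    "Licenses": {"NT": ["Business Licenses"], "RT": ["Permits"]},
    "Operating Permits": {"BT": ["Permits"]},
    "Permits": {"NT": ["Building Permits", "Operating Permits"], "RT": ["Licenses"]},
    "Taxes": {"NT": ["Business Taxes"], "RT": ["Fees"]},
}

# Each relationship and the relationship expected on the other term.
RECIPROCAL = {"BT": "NT", "NT": "BT", "RT": "RT"}


def missing_reciprocals(thesaurus):
    """Return (term, relation, target) entries that should exist but do not."""
    problems = []
    for term, relations in thesaurus.items():
        for relation, targets in relations.items():
            inverse = RECIPROCAL[relation]
            for target in targets:
                if term not in thesaurus.get(target, {}).get(inverse, []):
                    problems.append((target, inverse, term))
    return problems


print(missing_reciprocals(THESAURUS))
```

An empty result means every BT, NT, and RT in the listing has its counterpart; each tuple in a non-empty result names the entry that would need to be added.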
Types of Taxonomies: Summary
Synonym Ring + preferred terms = Authority File
Authority File + broader/narrower terms = Taxonomy
Taxonomy + related terms = Thesaurus
Facets are fundamental categories by which an object or concept may be described. Example: some facets describing a toy ball: size, weight, shape, color, texture, material. Taxonomies and Facets
Uses of Facets: Browsing Hierarchies. Facets allow users to follow the path best matching the way they think (their mental model).
Uses of Facets: Browsing Hierarchies. Example: epicurious.com > recipes > browse: Main ingredient, Cuisine, Preparation method, Season/occasion, Course/dish.
(Screenshot: epicurious.com)
Uses of Facets: Fielded Search. Allows for greater specificity, thus increasing search precision. But this is usually more complicated for users than simple searching, so it is often introduced as an option on the results page.
(Screenshot: alibris.com Advanced Search)
(Screenshot: epicurious.com Advanced Search)
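Fielded search over facets can be sketched as matching a set of facet values against tagged records. The recipe records below are invented for illustration and only borrow the facet names from the browsing example; `fielded_search` and `facet_counts` are hypothetical names.

```python
from collections import Counter

# Invented records, each tagged with one value per facet.
RECIPES = [
    {"title": "Grilled lemon chicken", "Main ingredient": "chicken",
     "Cuisine": "Greek", "Preparation method": "grill", "Course/dish": "main"},
    {"title": "Chicken tikka", "Main ingredient": "chicken",
     "Cuisine": "Indian", "Preparation method": "grill", "Course/dish": "main"},
    {"title": "Lentil soup", "Main ingredient": "lentils",
     "Cuisine": "Indian", "Preparation method": "simmer", "Course/dish": "soup"},
]


def fielded_search(records, criteria):
    """Return records whose facet values match every facet in the criteria."""
    return [
        record for record in records
        if all(record.get(facet) == value for facet, value in criteria.items())
    ]


def facet_counts(records, facet):
    """Count how many records carry each value of a facet (for narrowing options)."""
    return Counter(record[facet] for record in records if facet in record)


hits = fielded_search(RECIPES, {"Main ingredient": "chicken", "Cuisine": "Indian"})
print([record["title"] for record in hits])
print(facet_counts(RECIPES, "Cuisine"))
```

Each added criterion narrows the result set, which is where the gain in precision comes from; the counts show the kind of narrowing options a results page could offer after a simple search.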
Requirements for Browsing/Search Facets: development of a metadata schema; development of appropriate controlled vocabularies; proper content tagging.
Resources
Aitchison, Jean. Thesaurus Construction and Use: A Practical Manual. 4th ed. Chicago: Fitzroy Dearborn Publishers.
Dublin Core Metadata Element Set (ISO Standard 15836-2003), the international standard for metadata. http://www.niso.org/international/SC4/n515.pdf
National Information Standards Organization. ANSI/NISO Z39.19:1993. Guidelines for the Construction, Format and Management of Monolingual Thesauri. Bethesda, MD: NISO Press, 1994.
Rosenfeld, Lou, and Peter Morville. Information Architecture for the World Wide Web: Designing Large-Scale Websites. 3d ed. O’Reilly Publishers, 2006.
Sinha, Rashmi. Beyond Cardsorting: Free-listing Methods to Explore User Categorizations. Available at: http://www.boxesandarrows.com/archives/beyond_cardsorting_freelisting_methods_to_explore_user_categorizations.php
Steckel, Mike, Karl Fast and Fred Leise. Creating a Controlled Vocabulary. 2002. Available at: http://www.boxesandarrows.com/archives/creating_a_controlled_vocabulary.php
Contact Information Fred Leise www.contextualanalysis.com [email_address] @ChicagoIndexer
