Taxonomies and Metadata in Information Architecture
An overview of the benefits of using both taxonomies and metadata to make your information easier to search. Presentation by Alice Redmond-Neal of Access Innovations, Inc.

  • Audience poll: Who works in IA? Who is into taxonomies? Who is generally curious? Disclaimer: I am not a librarian, information architect, programmer, etc. I have something to offer on taxonomies and defer to many others for IA matters, so please share!
  • Any search that requires a taxonomy term is impossible unless the site shows the taxonomy (MediaSleuth did in the past). Whatever we can do to improve the user’s experience (search and find) is good.
  • Taxonomy allows you to sort content by conceptual categories: by topic or subject, by “aboutness.” There are other forms of organization: alphabetical, chronological, geographical, audience, etc.
  • Special attention to Non-Preferred Term -- goldmine
  • All potentially applicable for a website’s IA
  • History of the term: Jack Myers coined the term “metadata” for products associated with his MetaModel and for his company, The Metadata Company, and registered a trademark for the term in 1986. Metadata can be found in Page Source or Page Info for any website.
  • If you know what you’re looking for -- Return to this point later and talk about smarter searching.
  • Function = Findability
  • There are other forms of organization – alpha, chronological, geographical, audience, etc. Taxonomy organizes by topic, by subject, by aboutness.
  • Room for improvement
  • Under the hood – the content management workflow stage, including indexing
  • Recognizing term equivalents – important point, we’ll see more on this later.
  • Facets work especially well when most items in the database can be described in multiple ways or have numerous aspects to consider: e-commerce products, pharmaceuticals, etc.
  • Search recognizes singular/plural and stemming (kangarooers, kangaroo-paws). Links to Broader and Narrower Terms and to Related Terms.
  • For MediaSleuth, we are progressing toward doing just that and more. I introduced Machine Aided Indexer, or MAI, earlier. It is the categorizing assistant that prompts taxonomy terms for indexing (ultimately for subject metadata) based on words in a document. Those words come from a wide range of synonyms that writers use. MAI expands on the query, using its rulebase to link the search word to taxonomy terms. Let’s follow a search on the word “germs.”
  • SLA also uses MAI behind the scenes to match search words to terms in their taxonomy and then to corresponding documents.
  • A search on the word “competencies” returns all documents in the category Professional competencies — documents for which MAI had suggested that term from their taxonomy.
  • Searching the word “thesaurus” (read as taxonomy term Thesauri) yields 3 documents by looking at the descriptors, but 0 hits by looking at the original metadata supplied with the document.
  • Searching on the word “taxonomy” yielded 27 documents with Taxonomies as an indexing term, but only one having that word in the document’s original metadata. SLA’s search system takes advantage of two kinds of search: (1) a targeted search for document descriptors drawn from their own taxonomy, including the synonyms for taxonomy terms, and (2) a search of the original metadata. The two could be combined by including all document descriptors in the subject metadata.
  • Expands to 2nd and 3rd levels of taxonomy, includes Related Terms.

Taxonomies and Metadata in Information Architecture: Presentation Transcript

  • Taxonomies and Metadata for Information Architecture. Alice Redmond-Neal, Thesaurus Development Manager, Access Innovations, Inc. (Booth 217), ared@accessinn.com. Internet Librarian 2005.
  • What we’ll cover: key definitions (taxonomies, metadata, information architecture); how taxonomies and metadata influence information architecture; using taxonomies to enhance retrieval. Copyright © 2005 Access Innovations, Inc.
  • Key points: A taxonomy provides both a browsable outline and descriptive metadata. Metadata provide efficient searchable handles for content. Taxonomy-based subject metadata yields the most precise retrieval. Taxonomy is the basis for information architecture. Information architecture that takes full advantage of taxonomy and subject metadata supports findability.
  • What’s a taxonomy? Words: a controlled vocabulary for a subject area; descriptive labels. Hierarchy: a simple hierarchical view of a thesaurus. Knowledge organization system: a storage and retrieval aid.
  • Info retrieval starts with a knowledge organization system. From not complex to highly complex: uncontrolled list; name authority file; synonym set/ring; controlled vocabulary; taxonomy; thesaurus; ontology; semantic network.
  • Structure of controlled vocabularies, in order of increasing complexity: list of words; synonyms; taxonomy; thesaurus. Controls added along the way: ambiguity control; synonym control; hierarchical relationships; associative relationships.
  • Taxonomy? Thesaurus? Often used interchangeably. A thesaurus is a taxonomy with extras: Related Terms; Nonpreferred Terms (USE/Used for); Scope Notes; more. Use the word your audience understands. Avoid confusion with Roget’s thesaurus.
  • Taxonomy: Thesaurus view; Term Record view.
  • Basic taxonomy / thesaurus features. Hierarchy structure: Broader Terms = more general concepts; Narrower Terms = more specific concepts. Related Terms = conceptual cousins. Term equivalents. Facets. Scope notes. Other elements as needed.
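
The features listed on this slide can be pictured as a small term record. The following is a minimal sketch in Python, not the data model of any particular thesaurus tool: the class name `Term`, its field names, the helper functions, and the synonym "plant science" are all assumptions made for illustration. The sample labels are borrowed from the closing diagram of the talk, and the hierarchy among them is assumed.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One term record: a preferred label plus its relationships."""
    label: str
    broader: list[str] = field(default_factory=list)       # BT: more general concepts
    narrower: list[str] = field(default_factory=list)      # NT: more specific concepts
    related: list[str] = field(default_factory=list)       # RT: conceptual cousins
    nonpreferred: list[str] = field(default_factory=list)  # USE/Used for equivalents
    scope_note: str = ""


def add_narrower(terms: dict[str, Term], parent: str, child: str) -> None:
    """Record a hierarchical link in both directions so BT and NT stay consistent."""
    terms.setdefault(parent, Term(parent))
    terms.setdefault(child, Term(child))
    if child not in terms[parent].narrower:
        terms[parent].narrower.append(child)
    if parent not in terms[child].broader:
        terms[child].broader.append(parent)


def outline(terms: dict[str, Term], root: str, depth: int = 0) -> list[str]:
    """Return the hierarchy under `root` as indented lines (a browsable outline)."""
    lines = ["  " * depth + root]
    for child in terms[root].narrower:
        lines.extend(outline(terms, child, depth + 1))
    return lines


# Assumed sample hierarchy, for illustration only.
terms: dict[str, Term] = {}
add_narrower(terms, "Natural science", "Biology")
add_narrower(terms, "Biology", "Botany")
add_narrower(terms, "Natural science", "Physical science")
terms["Botany"].nonpreferred.append("plant science")  # hypothetical synonym

print("\n".join(outline(terms, "Natural science")))
```

Storing each hierarchical link in both directions is one possible design choice; it keeps Broader Term and Narrower Term views consistent without a separate lookup.
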
  • Perspectives on taxonomies: taxonomist (aka lexicographer, thesaurus builder); indexer; information architect; searcher. Each has a different view and need for words in retrieving information. Each need relates to using a taxonomy for indexing / categorizing content.
  • Taxonomies for information retrieval online: conceptual framework for web content that reflects the organization of knowledge in a domain; foundation for information architecture; term records contain valuable info; often 3 levels deep, depending on domain; may be displayed in full or part, modified, or hidden.
  • Taxonomy display depends on purpose. Descriptive taxonomy: includes term variants, synonyms, nonpreferred terms; query term expansion links synonyms to valid taxonomy term; supports discovery through hierarchy, Related Terms; used primarily at indexing stage of content mgmt workflow to categorize documents. Navigational taxonomy: reflects user’s mental model; reflects user’s vernacular; supports discovery through browsing; may be modified version of full taxonomy. Hidden taxonomy: usually alphabetic links to terms; may recognize term variants.
  • Taxonomy provides a way to describe the content: the basis for subject metadata. Metadata provide a way for that description to be captured for a website.
  • What’s metadata? “When it comes to definitions, metadata is a slippery fish.” Data about data. Tags used to describe documents, pages, images, software, video and audio files, and other content objects for the purposes of improved navigation and retrieval. Finding tool. Keywords not displayed to the viewer but available to search engines. Viewable in the HTML keyword meta tag field of most web sites. (Peter Morville, Louis Rosenfeld)
  • Data about data, like what? Title; author name; date of creation; language used in the creation; publisher; subject of the creation; keywords (our focus re: taxonomies); other stuff, depending on need. Dublin Core is a well-known metadata standard, but metadata schemas are commonly custom-designed.
  • How does metadata work? A search engine / web crawler looks at the HTML header on a web page (View, then Page source). Subject metadata is one part of the HTML header: <META NAME="KEYWORDS" CONTENT= … >
  • <META NAME="KEYWORDS" CONTENT="content management software, xml thesaurus,data management, database management system,concept extraction, document management software,information management system, information retrieval,knowledge extraction, knowledge management software,machine aided indexing, taxonomy management system,text management, text retrieval,thesaurus management software, xml"> A search including these words/phrases will retrieve this website.
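
To illustrate how a crawler might read the keywords meta tag shown on this slide, here is a minimal sketch using Python's standard `html.parser` module. The class name `KeywordsParser`, the function `extract_keywords`, and the shortened sample page are assumptions for illustration; real search engines differ in whether and how they use this tag.

```python
from html.parser import HTMLParser


class KeywordsParser(HTMLParser):
    """Collect the comma-separated values of <meta name="keywords"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.keywords: list[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META NAME=...> matches.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "keywords":
            content = attr.get("content") or ""
            self.keywords.extend(k.strip() for k in content.split(",") if k.strip())


def extract_keywords(html: str) -> list[str]:
    parser = KeywordsParser()
    parser.feed(html)
    return parser.keywords


# Shortened, hypothetical page header based on the tag in the slide.
page = (
    '<html><head><META NAME="KEYWORDS" CONTENT="machine aided indexing, '
    'taxonomy management system,xml"></head><body></body></html>'
)
print(extract_keywords(page))
```
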
  • Taxonomy terms as metadata: most precise topic identifiers, 100% relevant; searchable as metadata; give more precise results than free text search, if you know what you’re looking for; prevent hits on random occurrences of your query word.
  • What’s Information Architecture? The art and science of structuring and classifying web sites and intranets to help people find and manage information. Content + Structure + Function = the basis for User Experience. (Peter Morville, Louis Rosenfeld)
  • What’s an Information Architect? “I’m an Information Architect. I organize huge amounts of information on big web sites and intranets so that people can actually find what they want. Think of me as an Internet Librarian.” (Peter Morville, Louis Rosenfeld)
  • What IA is not: graphic / visual design; software development; content management; knowledge management; coding (HTML, etc.); usability engineering; library science.
  • Information Architecture, major components: taxonomies; metadata; organization; search; labeling; navigation.
  • 1. Taxonomies aid site organization. Taxonomy provides: a framework for content organization; a hierarchical outline of your content by subject categories; a basis for faceted browsing.
  • Categories show clearly what’s covered in this domain.
  • Value of category search: searchers find info 50% faster using browsable categories than using the list returned from a free text search; results are even stronger when results are not in the top 20 returns; searchers prefer browsable category search. (Chen, H., and Dumais, S.)
  • MediaSleuth: displaying taxonomy categories improves IA. MediaSleuth is an online source of educational media (videos, software, audio, etc.), with over 96,000 products and nearly 64,000 titles, based on the NICEM database (National Information Center for Educational Media). Content: excellent. Findability: ?
  • MediaSleuth draws on XML-tagged elements under the hood.
  • Machine Aided Indexer (M.A.I.) suggests taxonomy descriptors.
  • Taxonomy terms on documents help sort and organize the content. M.A.I. suggests the correct terms from the taxonomy as descriptors. The M.A.I. rulebase recognizes term equivalents: germs → Microorganisms; vaccin* → Pharmaceutical drugs. Recognizing term equivalents enables enhanced search.
  • Taxonomy descriptors become subject metadata. Selected descriptors are XML-tagged and stored with the document. Descriptors are available as webpage metadata. Metatags enable precise document retrieval. Term equivalence enables query expansion in search (coming).
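
The idea of a rulebase that maps writers' words to taxonomy terms can be sketched in a few lines of Python. This is not how M.A.I. is implemented; it is a simplified illustration under stated assumptions. The two mappings (germs, vaccin*) come from the slide, while the regular-expression form of the rules and the names `RULES`, `suggest_descriptors`, and `expand_query` are hypothetical.

```python
import re

# Hypothetical rulebase: text pattern -> preferred taxonomy term.
# The two mappings come from the slide; expressing them as regexes is an assumption.
RULES = [
    (re.compile(r"\bgerms?\b", re.IGNORECASE), "Microorganisms"),
    (re.compile(r"\bvaccin\w*", re.IGNORECASE), "Pharmaceutical drugs"),
]


def suggest_descriptors(text: str) -> list[str]:
    """Suggest taxonomy terms whose rules match words in the text."""
    suggestions: list[str] = []
    for pattern, term in RULES:
        if pattern.search(text) and term not in suggestions:
            suggestions.append(term)
    return suggestions


def expand_query(query: str) -> list[str]:
    """Map a searcher's words to taxonomy terms; fall back to the raw query."""
    return suggest_descriptors(query) or [query]


print(suggest_descriptors("Vaccination protects children against germs."))
print(expand_query("germs"))
```

The same rules serve both directions described in the talk: suggesting descriptors at indexing time and expanding a searcher's query at search time.
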
  • Search: body growth. Complete database: free text search, 8 hits, some irrelevant; free text search on titles, 6 hits, limited recall; search by taxonomy descriptor (AKA category), 470 hits, 100% relevant, 100% recall. 1,100-document sample: category search results, 3 hits.
  • Sidebar: Recall, Precision, and Relevance. Search for body growth. Documents tagged “body growth”: B, C, F, G. Other documents: A, D, E, H, I, J. If you retrieve B, C, F, G: 100% recall, 100% precision, 100% relevant. If you retrieve B, C: 50% recall, 100% precision, 100% relevant. If you retrieve B, C, H, J: 50% recall, 50% precision, 50% relevant. If you retrieve A, D, E, H: 0% recall, 0% precision, 0% relevant.
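
The sidebar's figures follow from the usual definitions: precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that were retrieved. A minimal Python sketch, using the document letters from the slide (the function name `precision_recall` is an assumption):

```python
def precision_recall(retrieved: set[str], relevant: set[str]) -> tuple[float, float]:
    """Return (precision, recall) for one retrieved set against the relevant set."""
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Documents tagged "body growth" on the slide.
relevant = set("BCFG")

for retrieved in ("BCFG", "BC", "BCHJ", "ADEH"):
    p, r = precision_recall(set(retrieved), relevant)
    print(f"{', '.join(retrieved)}: precision {p:.0%}, recall {r:.0%}")
```
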
  • Display taxonomy categories to improve MediaSleuth search. Results from a sample of 1,100 documents (not all categories are populated).
  • See full topic coverage by revealing Narrower Terms.
  • Select a taxonomy category to see associated titles.
  • Facets offer finer organization. Add details about any term. Pre-established aspects that pertain to each item. Cross-cut a taxonomic hierarchy. Basis for fine-tuning search results: market group / audience; price; color; size range; source / company; other attributes, varying by domain and need.
  • Facets describe all / most items by Department, Price, Color, other Attributes.
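
Faceted filtering can be sketched as matching items on several independent attributes at once. The product records below are invented for illustration, and the function names `filter_by_facets` and `facet_counts` are assumptions; only the facet names (department, price, color) echo the slide.

```python
from collections import Counter

# Hypothetical product records; facet names follow the slide.
PRODUCTS = [
    {"title": "Rain jacket", "department": "Outdoor", "color": "Red", "price": 80},
    {"title": "Trail shoes", "department": "Outdoor", "color": "Blue", "price": 120},
    {"title": "Desk lamp", "department": "Home", "color": "Red", "price": 35},
]


def filter_by_facets(items, **selected):
    """Keep items whose facet values match every selected facet."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(items, facet):
    """Count how many items carry each value of a facet, e.g. to show beside links."""
    return Counter(item[facet] for item in items)


matches = filter_by_facets(PRODUCTS, department="Outdoor", color="Red")
print([item["title"] for item in matches])
print(dict(facet_counts(PRODUCTS, "color")))
```

Because each facet is checked independently, facets cross-cut whatever hierarchy the items also belong to, which is the property the slides describe.
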
  • “Taxonomies and faceted models provide users with tools to see the forest and quickly focus on a specific tree.” (Sullivan, D.)
  • Alternative ways to display content organization: alphabetically; chronologically; geographically; permuted list of taxonomy terms (Content management system; management system, Content; system, Content management).
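
The permuted list on this slide rotates a multiword term so that each word can lead an entry. A minimal Python sketch of that rotation follows; the function name `permute_term` is an assumption, and real permuted indexes may also skip stop words or sort entries, which this sketch does not attempt.

```python
def permute_term(term: str) -> list[str]:
    """Return the term plus one rotated entry per later word, e.g. 'system, Content management'."""
    words = term.split()
    entries = [term]
    for i in range(1, len(words)):
        lead = " ".join(words[i:])   # words that move to the front
        tail = " ".join(words[:i])   # words that follow the comma
        entries.append(f"{lead}, {tail}")
    return entries


for entry in permute_term("Content management system"):
    print(entry)
```
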
  • 2. Taxonomies aid search. Taxonomy provides: authority terms of a controlled vocabulary; synonyms and other alternative expressions; typos (lathes, laiths, laths, layth…); obsolete names (Cooper’s plane / Lamb’s tongue); query expansion.
  • Search: kangaroo. Leverage taxonomy term information to aid search.
  • SLA search: interpret the search word “competencies” as the taxonomy term Professional competencies.
  • Returns all documents in the Professional competencies category.
  • Search: thesaurus. Interpret “thesaurus” as the term Thesauri and return all documents in that category.
  • Search “taxonomy” in the XML descriptor field returns all documents in that category: 27. Search in original metadata: 1. Solution: include descriptors with metadata!
  • 3. Taxonomies aid labeling. Taxonomy provides: a basis for labels on the site/portal; concepts that can be re-worded for the audience.
  • SLA website and thesaurus: Navigational Taxonomy for end user; Descriptive Taxonomy for indexers.
  • Adapt taxonomy terms for labeling. What words do users use? Gather variants from search logs, user focus groups, and subject matter experts. Tailor site/portal labels to typical users. Include variants as Nonpreferred terms (USE/Used for equivalents) in the taxonomy. M.A.I. can also capture variants as rules without formalizing them as Nonpreferred terms.
  • 4. Taxonomies aid navigation. Taxonomy provides: major categories; expansion to Narrower Terms; additional term information.
  • Taxonomy: top categories; expanded categories and additional information.
  • Drop-down menus reflect Narrower Terms and Related Terms.
  • Integrate taxonomy to enhance findability: browsable categories of a directory; browsable faceted navigation; smart search for term equivalents; taxonomy terms (original or modified) as labels; navigation aids that incorporate taxonomy terms and relationships.
  • Use software tools to support IA. Thesaurus creation / management tools: ANSI/NISO standards compliant; support features you need; customizable fields; import ability. Categorization tools: human / automatic / hybrid categorizer. Content management systems.
  • [Diagram] TAXONOMY: foundation of information architecture; source of subject metadata; path to portal usability. The taxonomy (Natural science, Biology, Botany, Medicine, Physical science, Astronomy, Chemistry, Physics) connects Your Content to Your Portal (ABC Company) through content organization, portal labels, search, and navigation.
  • Recap: Taxonomies and metadata are cornerstones of information architecture. Taxonomies are the basis for content organization. Taxonomies provide a browsable outline of your content. Subject metadata using taxonomy terms yield 100% relevant retrieval. Taxonomies are the basis for search, labeling, and navigation in information architecture. Tools that recognize synonyms (query expansion) improve taxonomy implementation.
  • References: Aitchison, J., Gilchrist, A., and Bawden, D. Thesaurus Construction and Use: A Practical Manual (4th edition). Aslib, 2000. Chen, H., and Dumais, S. Bringing order to the web: automatically categorizing search results. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI ’00), ACM (2000) 145-152. Rosenfeld, L., and Morville, P. Information Architecture for the World Wide Web. O’Reilly, 1998. Sullivan, D. Proven Portals: Best Practices for Planning, Designing, and Developing Enterprise Portals. Addison Wesley, 2003.
  • Thank you! Questions? Alice Redmond-Neal, Access Innovations, Inc. Data Harmony software: Thesaurus Master and Machine Aided Indexer. ared@accessinn.com, (505) 998-0800.