Introduction To Controlled Vocabularies


A basic introduction to taxonomies/controlled vocabularies, what they are and how they are used. Presented originally at the Society of Indexers conference, July 2008.

Published in: Technology, Business

  1. 1. Introduction to Controlled Vocabularies <ul><li>Presented by Fred Leise </li></ul><ul><li>ContextualAnalysis, LLC </li></ul><ul><li>The Round Table Conference </li></ul><ul><li>Society of Indexers </li></ul><ul><li>Sunday, July 13, 2008 </li></ul>
  2. 2. About Fred Leise <ul><li>Co-Founder and Chief Operating Officer, Intuitect (software for website creators) </li></ul><ul><li>ContextualAnalysis, LLC </li></ul><ul><ul><li>Specializing in metadata and controlled vocabulary development </li></ul></ul>
  3. 3. About Fred Leise <ul><li>Recent Clients </li></ul><ul><ul><li>Scripps Newspapers </li></ul></ul><ul><ul><li>Disney Studios </li></ul></ul><ul><ul><li>Harpo, Inc. </li></ul></ul><ul><ul><li>Dow Corning </li></ul></ul><ul><ul><li>Abbott Laboratories </li></ul></ul>
  4. 4. About Fred Leise <ul><li>Freelance back-of-book indexer since 1995 </li></ul><ul><li>Scholarly texts in the humanities </li></ul><ul><li>President, American Society for Indexing </li></ul>
  5. 5. Goals for This Presentation <ul><li>Introduce basic concepts and terminology about controlled vocabularies </li></ul><ul><li>(Feel free to ask questions or contribute examples at any time) </li></ul>
  6. 6. Workshop Overview <ul><li>Indexes vs. Controlled Vocabularies (CVs) </li></ul><ul><li>An Introduction to CVs </li></ul><ul><li>Using CVs </li></ul><ul><li>Facets </li></ul><ul><li>CV Development Methodology Overview </li></ul><ul><li>CV Governance and Maintenance </li></ul>
  7. 7. Terminology Problem <ul><li>Taxonomy </li></ul><ul><ul><li>generally, any kind of controlled vocabulary </li></ul></ul><ul><ul><li>also a specific type of controlled vocabulary </li></ul></ul>
  8. 8. Indexes vs. CVs
  9. 9. Indexes vs. CVs <ul><li>Similarities </li></ul><ul><ul><li>Concept identification </li></ul></ul><ul><ul><li>Term selection </li></ul></ul>
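  10. 10. Indexes vs. CVs <ul><li>Differences </li></ul><table><tr><th>Item</th><th>Indexes</th><th>CVs</th></tr><tr><td>End product</td><td>index</td><td>term list</td></tr><tr><td>Use</td><td>content locator</td><td>content tagging, website navigation, search enhancement</td></tr><tr><td>Project time</td><td>weeks</td><td>months</td></tr><tr><td>Methodology</td><td>reading</td><td>research</td></tr></table>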
  11. 11. About Controlled Vocabularies
  12. 12. What are Controlled Vocabularies? <ul><li>H. Wellisch: </li></ul><ul><ul><li>A list of terms that may be used for indexing, produced by the operation of vocabulary control. </li></ul></ul>
  13. 13. What are Controlled Vocabularies? <ul><li>F. Leise: </li></ul><ul><ul><li>A list of terms and term relationships designed to: </li></ul></ul><ul><ul><li>1. Collect similar information, </li></ul></ul><ul><ul><li>2. Assist content authors in consistently tagging content, and </li></ul></ul>
  14. 14. What are Controlled Vocabularies? <ul><li>F. Leise: </li></ul><ul><ul><li>3. Enable users to find the information they need by translating their language into the language of the information store. </li></ul></ul>
  15. 15. Equivalence (Synonyms) <ul><li>Country = Nation </li></ul><ul><li>Chief of State = Prime Minister </li></ul><ul><li>Brunei = Sultanate of Brunei = Negara Brunei Darussalam = سلطنة بروناي = برني دارالسلام </li></ul>
  16. 16. Hierarchical Relationships <ul><li>Whole/Part </li></ul><ul><li>Automobile </li></ul><ul><ul><li>Air bags </li></ul></ul><ul><ul><li>Engine </li></ul></ul><ul><ul><li>Seats </li></ul></ul><ul><ul><li>Steering </li></ul></ul><ul><ul><li>Wheels </li></ul></ul>
  17. 17. Hierarchical Relationships <ul><li>Instances </li></ul><ul><li>Buildings </li></ul><ul><ul><li>Great Pyramid of Giza </li></ul></ul><ul><ul><li>Madison Square Garden </li></ul></ul><ul><ul><li>Petronas Towers </li></ul></ul><ul><ul><li>Sears Tower </li></ul></ul><ul><ul><li>Taipei 101 </li></ul></ul>
  18. 18. Associative Relationship <ul><li>Examples </li></ul><ul><ul><li>operation/agent (turning : lathes) </li></ul></ul><ul><ul><li>occupation/person (social work : social worker) </li></ul></ul><ul><ul><li>causal dependence (friction : wear) </li></ul></ul><ul><ul><li>agent/counteragent (pests : pesticides) </li></ul></ul><ul><ul><li>concept/opposite (tolerance : prejudice) </li></ul></ul><ul><ul><li>concept/origin (water : water wells) </li></ul></ul>
  19. 19. Types of CVs <ul><li>Synonym Ring </li></ul><ul><li>Words with equivalent meanings (in a given context) </li></ul><ul><ul><li>pound sterling = pound = quid </li></ul></ul><ul><ul><li>CD-ROM = CD = disk </li></ul></ul><ul><ul><li>chips = French fries </li></ul></ul><ul><ul><li>Houses of Parliament = Palace of Westminster </li></ul></ul>
  20. 20. Types of CVs <ul><li>Authority File </li></ul><ul><li>Has all the features of a synonym ring, plus preferred terms (approved terms/keywords) for tagging content. </li></ul>
  21. 21. Authority File: Alphabetical <ul><ul><li>community USE neighborhood </li></ul></ul><ul><ul><li>health and safety UF safety </li></ul></ul><ul><ul><li>levy USE tax </li></ul></ul><ul><ul><li>neighborhood UF community </li></ul></ul><ul><ul><li>parks UF recreation </li></ul></ul><ul><ul><li>rebate USE refund </li></ul></ul><ul><ul><li>recreation USE parks </li></ul></ul><ul><ul><li>refund UF rebate </li></ul></ul><ul><ul><li>safety USE health and safety </li></ul></ul><ul><ul><li>tax UF levy </li></ul></ul>
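  22. 22. Authority File: Spreadsheet <table><tr><th>Preferred Term</th><th>Variant Terms</th></tr><tr><td>health and safety</td><td>safety</td></tr><tr><td>neighborhood</td><td>community</td></tr><tr><td>parks</td><td>recreation</td></tr><tr><td>refund</td><td>rebate</td></tr><tr><td>tax</td><td>levy</td></tr></table>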
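The authority file above can be read as a lookup from variant terms to preferred terms. The following is a minimal Python sketch of that idea, not a description of any particular tool; the names `USE`, `preferred_term` and `used_for` are assumptions made for illustration, and only the term pairs come from the slide.

```python
# Variant term -> preferred term, copied from the authority file on the slide.
# The dictionary name and both helper functions are illustrative assumptions.
USE = {
    "community": "neighborhood",
    "levy": "tax",
    "rebate": "refund",
    "recreation": "parks",
    "safety": "health and safety",
}

PREFERRED = set(USE.values())


def preferred_term(term):
    """Return the preferred term to tag with, or None if the term is unknown."""
    key = term.strip().lower()
    if key in PREFERRED:
        return key  # already a preferred term
    return USE.get(key)  # variant -> preferred, or None when not in the file


def used_for(preferred):
    """Return the variants that point at a preferred term (its UF entries)."""
    return sorted(v for v, p in USE.items() if p == preferred)


print(preferred_term("Levy"))
print(used_for("parks"))
```

Returning `None` for unknown terms is one possible design choice: it lets a tagging tool flag candidate terms for review instead of silently accepting them.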
  23. 23. Types of CVs <ul><li>Taxonomy </li></ul><ul><li>Also called hierarchy </li></ul><ul><li>All features of authority files, plus: </li></ul><ul><li>Broader terms (BT) </li></ul><ul><li>Narrower terms (NT) </li></ul>
  24. 24. Types of CVs <ul><li>Taxonomy </li></ul><ul><li>All terms must be part of a hierarchical relationship (no orphan terms). </li></ul><ul><li>May be presented in indented (hierarchical) or alphabetical format. </li></ul>
  25. 25. Taxonomy Example (Indented) <ul><li>total compensation </li></ul><ul><ul><li>compensation </li></ul></ul><ul><ul><ul><li>base salary [salary] </li></ul></ul></ul><ul><ul><ul><li>deferred payments [deferred compensation] </li></ul></ul></ul><ul><ul><ul><li>variable pay </li></ul></ul></ul><ul><ul><li>benefits </li></ul></ul><ul><ul><ul><li>401(k) plan </li></ul></ul></ul><ul><ul><ul><li>health benefits </li></ul></ul></ul><ul><ul><ul><ul><li>dental plan </li></ul></ul></ul></ul><ul><ul><ul><ul><li>disability insurance </li></ul></ul></ul></ul>
  26. 26. Taxonomy Example (Alpha List) <ul><li>401(k) plan BT benefits </li></ul><ul><li>base salary BT compensation UF salary </li></ul><ul><li>benefits BT total compensation NT 401(k) plan; health benefits </li></ul><ul><li>compensation BT total compensation NT base salary; deferred payments; variable pay </li></ul><ul><li>deferred compensation USE deferred payments </li></ul><ul><li>deferred payments BT compensation UF deferred compensation </li></ul><ul><li>dental plan BT health benefits </li></ul><ul><li>disability insurance BT health benefits </li></ul><ul><li>health benefits BT benefits NT dental plan; disability insurance </li></ul><ul><li>salary USE base salary </li></ul><ul><li>total compensation NT benefits; compensation </li></ul><ul><li>variable pay BT compensation </li></ul>
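The BT/NT entries in the compensation example can be held in a single narrower-term table, with the broader-term side derived from it. This is a rough Python sketch under that assumption; the names `NT`, `BT`, `path_to_top` and `orphan_terms` are illustrative, and "stock options" in the usage lines is a made-up term that is not on the slide.

```python
# Narrower terms (NT) for each parent, copied from the taxonomy example.
NT = {
    "total compensation": ["benefits", "compensation"],
    "benefits": ["401(k) plan", "health benefits"],
    "compensation": ["base salary", "deferred payments", "variable pay"],
    "health benefits": ["dental plan", "disability insurance"],
}


def broader_terms(nt):
    """Invert the NT table so each child points at its broader term (BT)."""
    bt = {}
    for parent, children in nt.items():
        for child in children:
            bt[child] = parent
    return bt


BT = broader_terms(NT)


def path_to_top(term):
    """Walk BT links from a term up to the top of the hierarchy."""
    path = [term]
    while path[-1] in BT:
        path.append(BT[path[-1]])
    return path


def orphan_terms(terms):
    """Return candidate terms that have no place in the hierarchy."""
    known = set(NT) | set(BT)
    return sorted(t for t in terms if t not in known)


print(path_to_top("dental plan"))
print(orphan_terms(["base salary", "stock options"]))  # "stock options" is hypothetical
```

Deriving BT from NT keeps the two directions from drifting apart, which is one way to honor the "no orphan terms" rule when terms are added later.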
  27. 27. Types of CVs <ul><li>Thesaurus (pl. thesauri) </li></ul><ul><li>All the features of taxonomies, plus the associative relationship of related terms (RT) </li></ul>
  28. 28. Thesaurus: Alphabetical <ul><li>Building Permits BT Permits </li></ul><ul><li>Business Licenses BT Licenses </li></ul><ul><li>Business Taxes BT Taxes </li></ul><ul><li>Fees BT Licenses, Permits & Taxes; RT Taxes </li></ul><ul><li>Licenses BT Licenses, Permits & Taxes; NT Business Licenses; RT Permits </li></ul><ul><li>Operating Permits BT Permits </li></ul><ul><li>Permits BT Licenses, Permits & Taxes; NT Building Permits, Operating Permits; RT Licenses </li></ul><ul><li>Taxes BT Licenses, Permits & Taxes; NT Business Taxes; RT Fees </li></ul>
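  29. 29. Thesaurus: Indented <table><tr><th>Vocabulary Terms</th><th>Related Terms</th></tr><tr><td>Licenses, Permits & Taxes</td><td></td></tr><tr><td>. Fees</td><td>Taxes</td></tr><tr><td>. Licenses</td><td>Permits</td></tr><tr><td>. . Business Licenses</td><td></td></tr><tr><td>. Permits</td><td>Licenses</td></tr><tr><td>. . Building Permits</td><td></td></tr><tr><td>. . Operating Permits</td><td></td></tr><tr><td>. Taxes</td><td>Fees</td></tr><tr><td>. . Business Taxes</td><td></td></tr></table>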
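In the thesaurus example, every relationship appears twice: a BT entry on one term has a matching NT entry on the other, and RT entries point both ways. A maintenance script could check that reciprocity. The Python sketch below is one possible way to do it; the data structure and the name `missing_reciprocals` are assumptions, and the entry for the top term is inferred from the indented display because the alphabetical list does not show it.

```python
TOP = "Licenses, Permits & Taxes"

# Relationships copied from the alphabetical thesaurus display.
# The TOP entry is inferred from the indented display (an assumption).
THESAURUS = {
    "Building Permits": {"BT": ["Permits"]},
    "Business Licenses": {"BT": ["Licenses"]},
    "Business Taxes": {"BT": ["Taxes"]},
    "Fees": {"BT": [TOP], "RT": ["Taxes"]},
    "Licenses": {"BT": [TOP], "NT": ["Business Licenses"], "RT": ["Permits"]},
    "Operating Permits": {"BT": ["Permits"]},
    "Permits": {"BT": [TOP], "NT": ["Building Permits", "Operating Permits"], "RT": ["Licenses"]},
    "Taxes": {"BT": [TOP], "NT": ["Business Taxes"], "RT": ["Fees"]},
    TOP: {"NT": ["Fees", "Licenses", "Permits", "Taxes"]},
}

# BT and NT are inverses of each other; RT is its own inverse.
INVERSE = {"BT": "NT", "NT": "BT", "RT": "RT"}


def missing_reciprocals(thesaurus):
    """List (term, relation, target) entries that should exist but do not."""
    problems = []
    for term, relations in thesaurus.items():
        for rel, targets in relations.items():
            for target in targets:
                back = thesaurus.get(target, {}).get(INVERSE[rel], [])
                if term not in back:
                    problems.append((target, INVERSE[rel], term))
    return sorted(problems)


print(missing_reciprocals(THESAURUS))
```

An empty result means the file is internally consistent; any tuple returned names the entry that would need to be added.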
  30. 30. Types of CVs: Summary <ul><li>Synonym Ring </li></ul><ul><li>+ preferred terms </li></ul><ul><li>= Authority File </li></ul><ul><li>+ broader/narrower terms </li></ul><ul><li>= Taxonomy </li></ul><ul><li>+ related terms </li></ul><ul><li>= Thesaurus </li></ul>
  31. 31. CV Construction Standards <ul><li>International standards </li></ul><ul><ul><li>ISO 2788:1986 Guidelines for the Establishment and Development of Monolingual Thesauri (BS 5723:1987) </li></ul></ul><ul><ul><li>ISO 5964:1985 Guidelines for the Establishment and Development of Multilingual Thesauri (BS 6723:1985) </li></ul></ul>
  32. 32. CV Construction Standards <ul><li>National standards </li></ul><ul><ul><li>BS 8723-3:2007 Structured vocabularies for information retrieval. Guide. Vocabularies other than thesauri </li></ul></ul><ul><ul><li>BS 8723-2:2005 Structured vocabularies for information retrieval. Guide. Thesauri </li></ul></ul>
  33. 33. Polyhierarchies <ul><li>Terms can live in multiple categories and so have multiple parent/child relationships </li></ul><ul><ul><li>Sultanates: Audhali, Brunei, Oman </li></ul></ul><ul><ul><li>Countries: Albania, Brunei, China </li></ul></ul>
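In the polyhierarchy example, Brunei sits under both Sultanates and Countries. One way to represent this is to let a term carry a list of parents instead of a single one. The Python sketch below assumes that representation; the names `NARROWER`, `parents_by_term` and `polyhierarchical_terms` are illustrative only.

```python
from collections import defaultdict

# Categories and their members, copied from the slide.
NARROWER = {
    "Sultanates": ["Audhali", "Brunei", "Oman"],
    "Countries": ["Albania", "Brunei", "China"],
}


def parents_by_term(narrower):
    """Map each term to every category (parent) it belongs to."""
    parents = defaultdict(list)
    for parent, children in narrower.items():
        for child in children:
            parents[child].append(parent)
    return dict(parents)


def polyhierarchical_terms(narrower):
    """Return the terms that have more than one parent."""
    return sorted(t for t, ps in parents_by_term(narrower).items() if len(ps) > 1)


print(parents_by_term(NARROWER)["Brunei"])
print(polyhierarchical_terms(NARROWER))
```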
  34. 34. Using Controlled Vocabularies
  35. 35. Navigation Taxonomy <ul><li>Organizes content using CV terms as category labels </li></ul><ul><li>Represents vocabulary hierarchy by browsing levels </li></ul>
  36. 36. Level 1 Level 2 Level 3
  37. 37. Search Enhancement <ul><li>Offers options for expanding or reducing scope of search using broader or narrower terms </li></ul><ul><li>Differentiates between multiple meanings of terms </li></ul>
  38. 38. CV Use: Search Results
  39. 39. Search Enhancement <ul><li>Synonym Ring </li></ul><ul><li>During search, when one of the words in a synonym ring is searched for, the search engine returns items containing any of the words in the ring. </li></ul><ul><ul><li>“biscuit” = “cookies” </li></ul></ul>
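The synonym-ring behavior described above amounts to expanding the query to every word in its ring before matching. Here is a minimal Python sketch of that expansion, assuming naive case-insensitive substring matching; the function names and the sample documents are made up for illustration, and a real search engine would match on tokens rather than substrings.

```python
# Synonym rings taken from the slides; each ring is a set of equivalent terms.
RINGS = [
    {"biscuit", "cookies"},
    {"pound sterling", "pound", "quid"},
]


def expand_query(term):
    """Return every term in the query's synonym ring (or just the term itself)."""
    key = term.strip().lower()
    for ring in RINGS:
        if key in ring:
            return ring
    return {key}


def search(term, documents):
    """Return documents containing any term from the expanded query."""
    variants = expand_query(term)
    return [d for d in documents if any(v in d.lower() for v in variants)]


# Hypothetical document titles, used only to demonstrate the expansion.
docs = ["Chocolate chip cookies", "Tea and a biscuit", "Sourdough bread"]
print(search("biscuit", docs))
```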
  40. 40. Facets
  41. 41. Facets <ul><li>First introduced by S. R. Ranganathan in the early 1930s. </li></ul><ul><ul><li>, Personality: What is it? </li></ul></ul><ul><ul><li>; Matter: What is it made of? </li></ul></ul><ul><ul><li>: Energy: What action is it performing? </li></ul></ul><ul><ul><li>. Space: Where is it? </li></ul></ul><ul><ul><li>' Time: When is it? </li></ul></ul>
  42. 42. Facets <ul><li>“research in the cure of tuberculosis of lungs by x-ray conducted in India in 1950” </li></ul><ul><ul><li>L,45;421:6;253:f.44'N5 </li></ul></ul><ul><li>Components of this call number </li></ul><ul><ul><li>Medicine,Lungs;Tuberculosis:Treatment;X-ray:Research.India'1950 </li></ul></ul><ul><ul><li>P,P;M:E;M:E.S'T </li></ul></ul>
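The call number above can be decoded mechanically, because each punctuation mark announces the facet of the component that follows it. The Python sketch below illustrates this on the spelled-out form of the call number. It is a simplification, not a full parser for the classification scheme: the function names are assumptions, and the leading component is treated as Personality because the slide's pattern line labels it P.

```python
import re

# Separator -> facet, as listed on the preceding slide.
FACET_BY_SEPARATOR = {
    ",": "Personality",
    ";": "Matter",
    ":": "Energy",
    ".": "Space",
    "'": "Time",
}


def label_facets(notation):
    """Split a call number into (separator, facet, value) triples."""
    # Normalize typographic apostrophes to the plain Time separator.
    notation = notation.replace("\u2019", "'").replace("\u2018", "'")
    parts = re.split(r"([,;:.'])", notation)
    # Assumption: the leading component has no separator and counts as Personality.
    triples = [("", "Personality", parts[0])]
    for sep, value in zip(parts[1::2], parts[2::2]):
        triples.append((sep, FACET_BY_SEPARATOR[sep], value))
    return triples


def facet_pattern(notation):
    """Reduce a call number to its facet initials, keeping the separators."""
    return "".join(sep + facet[0] for sep, facet, _ in label_facets(notation))


spelled_out = "Medicine,Lungs;Tuberculosis:Treatment;X-ray:Research.India'1950"
for _, facet, value in label_facets(spelled_out):
    print(f"{facet}: {value}")
print(facet_pattern(spelled_out))
```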
  43. 43. Facets <ul><li>Fundamental categories by which an object or concept may be described </li></ul><ul><li>Example: facets describing a ball: </li></ul><ul><ul><li>size, weight, shape, color, texture, material </li></ul></ul><ul><li>What are some other possible facets describing this ball? </li></ul>
  44. 44. Facets <ul><li>Used for Browsing Hierarchies </li></ul><ul><li>Facets allow users to follow the path best matching the way they think. </li></ul><ul><li>They allow multiple paths to the same information. </li></ul><ul><li>Example: > recipes > browse </li></ul>
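The "multiple paths to the same information" point can be shown with a small faceted filter: the user may pick facet values in any order and still arrive at the same item. This Python sketch uses an invented recipe collection; the facet names (`course`, `cuisine`, `ingredient`), the records, and the function names are all hypothetical.

```python
from collections import Counter

# Hypothetical recipe records; each key other than "name" is a facet.
RECIPES = [
    {"name": "Chicken tikka", "course": "main", "cuisine": "Indian", "ingredient": "chicken"},
    {"name": "Coq au vin", "course": "main", "cuisine": "French", "ingredient": "chicken"},
    {"name": "Creme brulee", "course": "dessert", "cuisine": "French", "ingredient": "eggs"},
]


def browse(items, **selected):
    """Keep the items whose facet values match every selected facet."""
    return [i for i in items if all(i.get(f) == v for f, v in selected.items())]


def facet_counts(items, facet):
    """Count how many items fall under each value of one facet."""
    return dict(Counter(i[facet] for i in items))


# Two different browsing paths that end at the same recipe.
by_cuisine_first = browse(browse(RECIPES, cuisine="French"), ingredient="chicken")
by_ingredient_first = browse(browse(RECIPES, ingredient="chicken"), cuisine="French")
print([r["name"] for r in by_cuisine_first])
print([r["name"] for r in by_ingredient_first])
print(facet_counts(RECIPES, "cuisine"))
```

The counts returned by `facet_counts` are the kind of numbers a browsing interface typically shows next to each facet value.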
  45. 46. Laptop Search
  46. 47. Advanced Search
  47. 48. Facets <ul><li>Reference </li></ul><ul><li>Louise Spiteri, “A Simplified Model for Facet Analysis,” in Canadian Journal of Information and Library Science v23, 1-30 (April-July 1998). </li></ul><ul><li>Available at: </li></ul>
  48. 49. CV Maintenance
  49. 50. CV Maintenance <ul><li>Possible Taxonomy Changes </li></ul><ul><li>Add/delete facet </li></ul><ul><li>Modify facet label </li></ul><ul><li>Reorganize hierarchy </li></ul><ul><li>Add/delete taxonomy term </li></ul><ul><li>Revise taxonomy term </li></ul><ul><li>Add/delete related term relationships </li></ul>
  50. 51. CV Maintenance <ul><li>Change Control Process </li></ul><ul><ul><li>Submit the changes for approval </li></ul></ul><ul><ul><li>Decide on changes to be made </li></ul></ul><ul><ul><li>SME validation? </li></ul></ul><ul><ul><li>Approve the changes </li></ul></ul><ul><ul><li>Edit the vocabulary file to make the changes </li></ul></ul>
  51. 52. CV Maintenance <ul><li>Process for Submitting New Terms </li></ul><ul><ul><li>Who submits proposed terms? </li></ul></ul><ul><ul><li>How are proposed terms submitted? </li></ul></ul><ul><ul><li>Who receives/processes term submissions? </li></ul></ul><ul><ul><li>Are provisional terms allowed as part of the tagging process? </li></ul></ul>
  52. 53. Speed of Change <ul><li>Facets (slowest) </li></ul><ul><li>Hierarchy </li></ul><ul><li>Terms (fastest) </li></ul>
  53. 54. Impact of Change <ul><li>Facets (most) </li></ul><ul><li>Hierarchy </li></ul><ul><li>Terms (least) </li></ul>
  54. 55. CV Review <ul><li>Establish regular reviews of all existing taxonomies </li></ul><ul><ul><li>Frequency </li></ul></ul><ul><ul><li>Responsibility </li></ul></ul>
  55. 56. CV Ownership <ul><li>Department/business unit responsible for taxonomy maintenance </li></ul><ul><li>Teams/individuals responsible for maintenance </li></ul>
  56. 57. Resources <ul><li>Fast, Karl, Mike Steckel and Fred Leise: series of articles on CVs and facets for </li></ul><ul><ul><li>“Controlled Vocabularies: A Glosso-Thesaurus” available at: archives/controlled_vocabularies_a_glossothesaurus.php </li></ul></ul>
  57. 58. Contact Information <ul><li>Fred Leise </li></ul><ul><li>6530 N. Greenview Ave. </li></ul><ul><li>Chicago, IL 60626 </li></ul><ul><li>773.791.2849 </li></ul><ul><li>[email_address] </li></ul>