Published in: Technology, Education

  1. 1. Thesauri, Controlled Vocabularies, and Metadata Information Architecture
  2. 2. Why? <ul><li>A way to view the network of relationships between the IA systems </li></ul><ul><li>Glue that holds the systems together </li></ul>
  3. 3. Metadata <ul><li>“Data about data” </li></ul><ul><li>Provides information about, or documentation of, other data managed within an application or environment. </li></ul><ul><li>For example: data about elements or attributes (name, size, data type, etc.) </li></ul>
  4. 4. Metadata <ul><li>Metadata tags are used to describe documents, pages, images, software, video and audio files, and other content objects for the purpose of improving navigation and retrieval. </li></ul><ul><li>Example: </li></ul><ul><li><META name="keywords" content="information architecture, content management, knowledge management, user experience"> </li></ul>
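As a rough illustration of how such a keywords tag could be read back for navigation or retrieval, here is a minimal sketch using Python's standard `html.parser`. The class name `KeywordExtractor` and the helper `extract_keywords` are hypothetical names chosen for this example, not part of the slides.

```python
from html.parser import HTMLParser


class KeywordExtractor(HTMLParser):
    """Collects terms from <meta name="keywords" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META> matches too.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "keywords":
            content = attributes.get("content") or ""
            self.keywords.extend(
                term.strip() for term in content.split(",") if term.strip()
            )


def extract_keywords(html):
    """Return the list of keyword terms declared in the page (hypothetical helper)."""
    parser = KeywordExtractor()
    parser.feed(html)
    return parser.keywords


page = (
    '<html><head><META name="keywords" content="information architecture, '
    'content management, knowledge management, user experience"></head></html>'
)
print(extract_keywords(page))
```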
  5. 5. Metadata <ul><li>Metadata-driven websites take advantage of: </li></ul><ul><ul><li>Content management software </li></ul></ul><ul><ul><li>Controlled vocabulary </li></ul></ul><ul><li>We only need to describe the documents; the software and the vocabulary take care of the rest. </li></ul>
  6. 6. Controlled Vocabulary
  7. 7. Controlled Vocabularies <ul><li>A controlled vocabulary is a list of equivalent terms in the form of a synonym ring, or a list of preferred terms in the form of an authority file. </li></ul><ul><li>A subset of natural language </li></ul>
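  8. 8. Types of Controlled Vocabularies <ul><li>Vocabularies, from simple to complex: </li></ul><ul><ul><li>Synonym Rings </li></ul></ul><ul><ul><li>Authority Files </li></ul></ul><ul><ul><li>Classification Schemes </li></ul></ul><ul><ul><li>Thesauri </li></ul></ul><ul><li>Relationships, from simple to complex: </li></ul><ul><ul><li>Equivalence </li></ul></ul><ul><ul><li>Hierarchical </li></ul></ul><ul><ul><li>Associative </li></ul></ul>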
  9. 9. Controlled Vocabularies <ul><li>Synonym Rings </li></ul><ul><ul><li>A synonym ring connects a set of words that are defined as equivalent for the purpose of retrieval. </li></ul></ul><ul><ul><li>Example ring: Cuisinart, Cuizinart, Food processor, Blender, Kitchen aid, Kitchenaid </li></ul></ul>
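A synonym ring can be sketched as a set of equivalent strings that a search expands a query into. The snippet below is a minimal illustration under that assumption; `SYNONYM_RINGS`, `expand_query`, `search`, and the sample documents are hypothetical, and real search engines usually offer their own synonym configuration instead.

```python
# One ring per set; every member is treated as equivalent for retrieval.
SYNONYM_RINGS = [
    {"cuisinart", "cuizinart", "food processor", "blender", "kitchen aid", "kitchenaid"},
]


def expand_query(term, rings=SYNONYM_RINGS):
    """Return every term in the ring that contains `term`, or the term alone."""
    normalized = term.strip().lower()
    for ring in rings:
        if normalized in ring:
            return sorted(ring)
    return [normalized]


def search(term, documents, rings=SYNONYM_RINGS):
    """Naive substring search over documents using the expanded query."""
    variants = expand_query(term, rings)
    return [doc for doc in documents if any(v in doc.lower() for v in variants)]


documents = [
    "Cuisinart 14-cup model on sale",
    "Replacement blade for your food processor",
    "Cast iron skillet care guide",
]

# The misspelling "Cuizinart" still retrieves the first two documents,
# even though neither contains the literal query string.
print(search("Cuizinart", documents))
```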
  10. 10. Synonym Rings <ul><li>Pros: </li></ul><ul><li>Help users locate information using different terms </li></ul><ul><li>Can be easily implemented using standard capabilities of search engines </li></ul><ul><li>Increase recall </li></ul><ul><li>Cons: </li></ul><ul><li>Users can be confused by results that don’t actually include their keywords. </li></ul><ul><li>Might reduce precision </li></ul>
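  11. 11. Synonym Rings <ul><li>Recall vs. precision trade-off: expanding a query with synonyms retrieves more of the relevant items (higher recall) but can also return more irrelevant ones (lower precision). </li></ul>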
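The trade-off can be made concrete with the standard definitions: precision is the share of retrieved items that are relevant, and recall is the share of relevant items that were retrieved. The document IDs below are made up purely for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved and relevant item IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection: four documents are truly relevant to the query.
relevant = {"d1", "d2", "d3", "d4"}

# Exact keyword match finds only two of them, and nothing irrelevant.
exact_match = {"d1", "d2"}

# Synonym expansion finds all four, plus two irrelevant documents.
with_synonyms = {"d1", "d2", "d3", "d4", "d5", "d6"}

print(precision_recall(exact_match, relevant))    # higher precision, lower recall
print(precision_recall(with_synonyms, relevant))  # lower precision, higher recall
```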
  12. 12. Authority Files <ul><li>Authority Files </li></ul><ul><ul><li>An authority file is a list of preferred terms or acceptable values. </li></ul></ul><ul><ul><li>It may include variants or synonyms. </li></ul></ul><ul><li>Authority files are synonym rings in which one term has been defined as the preferred term. </li></ul>
  13. 13. Authority Files <ul><li>Example: A list of U.S. states </li></ul><ul><ul><li>AL :: Alabama </li></ul></ul><ul><ul><li>AK :: Alaska </li></ul></ul><ul><ul><li>AZ :: Arizona </li></ul></ul><ul><ul><li>AR :: Arkansas </li></ul></ul><ul><ul><li>. </li></ul></ul><ul><ul><li>. </li></ul></ul><ul><ul><li>. </li></ul></ul>
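One way to picture an authority file in code is a table that maps every accepted variant to a single preferred value. The sketch below assumes, for illustration only, that the two-letter code is the preferred value and the full state name is its variant; `AUTHORITY_FILE`, `build_lookup`, and `normalize` are hypothetical names.

```python
# Preferred value -> accepted variants (assumption: the code is preferred).
AUTHORITY_FILE = {
    "AL": ["Alabama"],
    "AK": ["Alaska"],
    "AZ": ["Arizona"],
    "AR": ["Arkansas"],
}


def build_lookup(authority_file):
    """Index every preferred value and variant, case-insensitively."""
    lookup = {}
    for preferred, variants in authority_file.items():
        lookup[preferred.lower()] = preferred
        for variant in variants:
            lookup[variant.lower()] = preferred
    return lookup


LOOKUP = build_lookup(AUTHORITY_FILE)


def normalize(value):
    """Return the preferred value for `value`, or None if it is not in the file."""
    return LOOKUP.get(value.strip().lower())


print(normalize("alaska"))  # AK
print(normalize("Ohio"))    # None: not listed in this partial file
```

Rejecting unknown values (returning `None`) is what lets an authority file enforce consistency among authors and indexers.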
  14. 14. Authority Files <ul><li>Pros </li></ul><ul><li>Can be a tool for improving consistency among content authors and indexers. </li></ul><ul><li>Can be used to “educate” users. </li></ul><ul><li>Preferred terms are useful for labeling and navigation. </li></ul><ul><li>Cons </li></ul><ul><li>If equivalent terms begin with different letters, preferred terms must be complemented with links from the other terms. </li></ul><ul><ul><li>Example: </li></ul></ul><ul><ul><li>Bayer see Aspirin </li></ul></ul>
  15. 15. Classification Schemes <ul><li>Classification Schemes or taxonomies </li></ul><ul><ul><li>A classification scheme is a hierarchical arrangement of preferred terms. </li></ul></ul><ul><ul><li>Examples: </li></ul></ul><ul><ul><ul><li>Dewey Decimal Classification (DDC) </li></ul></ul></ul><ul><ul><ul><li>Yahoo! hierarchy of categories </li></ul></ul></ul>
  16. 16. Controlled Vocabulary <ul><li>Thesauri </li></ul><ul><ul><li>“A thesaurus is a controlled vocabulary in which equivalence, hierarchical, and associative relationships are identified for purposes of improved retrieval.” </li></ul></ul>
  17. 17. Controlled Vocabulary <ul><li>Relationships around a Preferred Term: </li></ul><ul><ul><li>Equivalence Relationship: Preferred Term and its Variant Terms </li></ul></ul><ul><ul><li>Hierarchical Relationship: Preferred Term and its Broader Term and Narrower Term </li></ul></ul><ul><ul><li>Associative Relationship: Preferred Term and its Related Terms </li></ul></ul>
  18. 18. Controlled Vocabulary <ul><li>Technical Lingo </li></ul><ul><ul><li>Preferred term (PT) </li></ul></ul><ul><ul><li>Variant Term (VT) </li></ul></ul><ul><ul><li>Broader Term (BT) </li></ul></ul><ul><ul><li>Narrower Term (NT) </li></ul></ul><ul><ul><li>Related Term (RT) </li></ul></ul><ul><ul><li>Use (U) </li></ul></ul><ul><ul><li>Used For (UF) </li></ul></ul><ul><ul><li>Scope Note (SN) </li></ul></ul>
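These abbreviations map naturally onto a small record type. The sketch below is one possible representation, not a standard format: the `Term` and `Thesaurus` classes are hypothetical, and the broader, related, and scope-note values for Aspirin are invented for illustration (only the variant terms come from the slides).

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    preferred: str                                 # PT
    used_for: list = field(default_factory=list)   # UF: variants this PT replaces
    broader: list = field(default_factory=list)    # BT
    narrower: list = field(default_factory=list)   # NT
    related: list = field(default_factory=list)    # RT
    scope_note: str = ""                           # SN


class Thesaurus:
    def __init__(self):
        self.terms = {}  # lowercased preferred term -> Term
        self.use = {}    # lowercased variant term -> preferred term ("U" pointer)

    def add(self, term):
        self.terms[term.preferred.lower()] = term
        for variant in term.used_for:
            self.use[variant.lower()] = term.preferred

    def resolve(self, text):
        """Return the Term for a preferred or variant term, else None."""
        key = text.strip().lower()
        if key in self.terms:
            return self.terms[key]
        preferred = self.use.get(key)
        return self.terms[preferred.lower()] if preferred else None


thesaurus = Thesaurus()
thesaurus.add(Term(
    preferred="Aspirin",
    used_for=["Acetylsalicylic Acid", "ASA", "Bayer"],
    broader=["Analgesics"],            # hypothetical BT
    related=["Headache"],              # hypothetical RT
    scope_note="Illustrative entry.",  # hypothetical SN
))

print(thesaurus.resolve("ASA").preferred)  # Aspirin
```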
  19. 19. Controlled Vocabulary <ul><li>Example of a thesaurus in web design </li></ul><ul><ul><li>PubMed (National Library of Medicine) </li></ul></ul>
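  20. 20. Types of Thesauri <ul><li>No Thesaurus (used in neither indexing nor searching): natural language search </li></ul><ul><li>Indexing Thesaurus (used in indexing only): enables browsable indexes; value untapped by search </li></ul><ul><li>Searching Thesaurus (used in searching only): no tagging of content; can enrich queries </li></ul><ul><li>Classic Thesaurus (used in both indexing and searching): high-end, full-function tool </li></ul>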
  21. 21. Semantic Relationships <ul><li>Equivalence </li></ul><ul><ul><li>Connects preferred terms and their variants. </li></ul></ul><ul><ul><li>Example: </li></ul></ul><ul><ul><ul><li>Preferred Term </li></ul></ul></ul><ul><ul><ul><li>Aspirin </li></ul></ul></ul><ul><ul><ul><li>Variant Terms </li></ul></ul></ul><ul><ul><ul><li>Acetysal, Acetylsalicylic Acid, ASA, Bayer, Polopirin </li></ul></ul></ul>A = B
  22. 22. Semantic Relationships <ul><li>Hierarchical </li></ul><ul><ul><li>Divides up the information space into categories and subcategories. </li></ul></ul><ul><ul><li>Subtypes: </li></ul></ul><ul><ul><ul><li>Generic </li></ul></ul></ul><ul><ul><ul><li>Whole-part </li></ul></ul></ul><ul><ul><ul><li>Instance </li></ul></ul></ul>A B
  23. 24. Semantic Relationships <ul><li>Associative </li></ul><ul><ul><li>Strongly implied semantic connections that aren’t captured by equivalence or hierarchical relationships </li></ul></ul><ul><ul><li>Examples: </li></ul></ul><ul><ul><ul><li>Field and object of study : Cardiology RT Heart </li></ul></ul></ul><ul><ul><ul><li>Process and its agent : Termite Control RT Pesticides </li></ul></ul></ul><ul><ul><ul><li>Concepts and properties : Poison RT Toxicity </li></ul></ul></ul><ul><ul><ul><li>Action and product : Eating RT Indigestion </li></ul></ul></ul><ul><ul><ul><li>Causal dependency : Celebration RT New Year’s Eve </li></ul></ul></ul>B A
  24. 25. Preferred Terms <ul><li>Term form </li></ul><ul><ul><li>Grammatical form: Usually nouns </li></ul></ul><ul><ul><li>Spelling: Most common spelling form employed by users </li></ul></ul><ul><ul><li>Singular and plural form: </li></ul></ul><ul><ul><ul><li>Count nouns in plural (e.g., roads, maps) </li></ul></ul></ul><ul><ul><ul><li>Conceptual nouns in singular (e.g., math) </li></ul></ul></ul><ul><ul><li>Abbreviations and acronyms: Default to popular use. </li></ul></ul>
  25. 26. Preferred Terms <ul><li>Term Selection </li></ul><ul><ul><li>Term selection should be guided by your goals and how the thesaurus will integrate with your web site. </li></ul></ul>
  26. 27. Preferred Terms <ul><li>Term Definition </li></ul><ul><ul><li>Extreme specificity: because we want to control the vocabulary, ambiguous terms get a parenthetical qualifier. </li></ul></ul><ul><ul><ul><li>Examples </li></ul></ul></ul><ul><ul><ul><ul><li>Cells (biology) </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Cells (electric) </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Cells (prison) </li></ul></ul></ul></ul>
  27. 28. Preferred Terms <ul><li>Term specificity </li></ul><ul><ul><li>Whether to pre-coordinate terms (combine them into one compound term) or not. </li></ul></ul><ul><ul><ul><li>For example: </li></ul></ul></ul><ul><ul><ul><ul><li>“Knowledge Management Software” </li></ul></ul></ul></ul><ul><ul><ul><ul><li>OR </li></ul></ul></ul></ul><ul><ul><ul><ul><li>“Knowledge Management” </li></ul></ul></ul></ul><ul><ul><ul><ul><li>“Software” </li></ul></ul></ul></ul><ul><ul><li>The decision depends on your context. </li></ul></ul>
  28. 29. Polyhierarchy <ul><li>Polyhierarchy allows multiple parents for a single node. </li></ul><ul><li>Example: Viral Pneumonia sits under both Respiratory Tract Infections and Virus Diseases, which are both under Diseases. </li></ul>
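In code, polyhierarchy simply means that each node stores a list of parents rather than a single parent. The sketch below assumes the hierarchy implied by the slide's diagram (Viral Pneumonia under both Respiratory Tract Infections and Virus Diseases, each under Diseases); `PARENTS`, `ancestors`, and `paths_to_root` are hypothetical names.

```python
# Child -> list of parents. A plain tree would allow only one parent per node.
PARENTS = {
    "Viral Pneumonia": ["Respiratory Tract Infections", "Virus Diseases"],
    "Respiratory Tract Infections": ["Diseases"],
    "Virus Diseases": ["Diseases"],
    "Diseases": [],
}


def ancestors(term, parents=PARENTS):
    """Return every broader term reachable from `term` through any parent."""
    found = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents.get(current, []))
    return found


def paths_to_root(term, parents=PARENTS):
    """Return every root-to-term path, i.e. every place the term can be browsed to."""
    direct = parents.get(term, [])
    if not direct:
        return [[term]]
    return [path + [term] for p in direct for path in paths_to_root(p, parents)]


for path in paths_to_root("Viral Pneumonia"):
    print(" > ".join(path))
```

Each path corresponds to a separate browsing route, which is why polyhierarchy helps users who approach the same concept from different categories.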
  29. 30. Faceted Classification <ul><li>Introduced by Shiyali R. Ranganathan in the 1930s. </li></ul><ul><li>Main principle: </li></ul><ul><ul><li>Documents and objects have multiple dimensions, or facets. </li></ul></ul>
  30. 31. Faceted Classification <ul><li>The faceted classification uses multiple taxonomies that focus on different dimensions of the content. </li></ul>
  31. 32. Faceted Classification <ul><li>Ranganathan’s universal facets: </li></ul><ul><ul><li>Personality </li></ul></ul><ul><ul><li>Matter </li></ul></ul><ul><ul><li>Energy </li></ul></ul><ul><ul><li>Space </li></ul></ul><ul><ul><li>Time </li></ul></ul>
  32. 33. Faceted Classification <ul><li>Most common facets used in the business world: </li></ul><ul><ul><li>Topic </li></ul></ul><ul><ul><li>Product </li></ul></ul><ul><ul><li>Document Type </li></ul></ul><ul><ul><li>Audience </li></ul></ul><ul><ul><li>Geography </li></ul></ul><ul><ul><li>Price </li></ul></ul>
  33. 34. Faceted Classification <ul><li>Example of a faceted classification in a web site (facet: sample controlled vocabulary values): </li></ul><ul><ul><li>Type: Red, White, Sparkling, Pink, Dessert </li></ul></ul><ul><ul><li>Region (origin): Australian, Californian, French, Italian </li></ul></ul><ul><ul><li>Winery (manufacturer): Blackstone, Clos du Bois, Cakebread </li></ul></ul><ul><ul><li>Year: 1969, 1990, 199, 2000, 2001, 2002 </li></ul></ul><ul><ul><li>Price: $3.99, $29.99, <$199, Cheap, Moderate, Expensive </li></ul></ul>
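Faceted browsing over such a catalog can be sketched as filtering records on any combination of facet values. The wine records below are invented for illustration; only the facet names and some of the values are taken from the slide, and `filter_by_facets` and `facet_counts` are hypothetical helpers.

```python
# Hypothetical catalog; each record carries one value per facet.
WINES = [
    {"name": "Wine A", "type": "Red", "region": "Californian", "winery": "Blackstone", "year": 2001},
    {"name": "Wine B", "type": "White", "region": "Californian", "winery": "Clos du Bois", "year": 2002},
    {"name": "Wine C", "type": "Red", "region": "French", "winery": "Cakebread", "year": 2000},
    {"name": "Wine D", "type": "Sparkling", "region": "Italian", "winery": "Blackstone", "year": 2001},
]


def filter_by_facets(items, **selected):
    """Keep items that match every selected facet value (facets combine with AND)."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(items, facet):
    """Count items per value of one facet, as shown beside facet links in a UI."""
    counts = {}
    for item in items:
        counts[item[facet]] = counts.get(item[facet], 0) + 1
    return counts


reds = filter_by_facets(WINES, type="Red")
print([wine["name"] for wine in reds])
print(facet_counts(reds, "region"))
print([w["name"] for w in filter_by_facets(WINES, type="Red", region="Californian")])
```

Because each facet is an independent taxonomy, users can narrow by type first and region second, or the other way around, and arrive at the same result.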
  34. 35. Faceted Classification <ul><li>More information about faceted classification: </li></ul><ul><ul><li>KMconnection: </li></ul></ul><ul><ul><li>Presentation of Faceted Classification </li></ul></ul><ul><ul><li>Innovation in classification: </li></ul></ul>