Controlled Vocabulary
(Word cloud created on http://www.wordle.net)
What is Controlled Vocabulary?
“organized lists of words and phrases, or notation systems, that are used to initially tag content, and then to find it through navigation or search.” - Amy Warner
“a controlled vocabulary is a type of metadata that functions as a ‘subset of natural language.’”
“Using a controlled vocabulary is also a way to overtly display relationships among the various concepts that your database covers in order to increase findability.”
Source: http://www.boxesandarrows.com/view/what_is_a_controlled_vocabulary_
Why do we need controlled vocabulary?
“When we organize our information and label it however, there is so much richness, variance, and confusion in terminology that we often need to impose some order to facilitate agreement between the concepts within the database and the vocabulary of the person using it.”
Source: http://www.boxesandarrows.com/view/what_is_a_controlled_vocabulary_
Three Kinds Of Controlled Vocabularies In Use Today
• Thesauri
• Subject heading lists
• Ontologies
GAP Example
“Let’s say that gap.com decided to offer search. They would somehow need to translate the natural language of search into the controlled language of the website. People search in the same language they speak, natural language, so a more advanced controlled vocabulary needs to take the concepts of your users (natural language) and match them to the concepts expressed in the language of your website (controlled vocabulary).”
Source: http://www.boxesandarrows.com/view/what_is_a_controlled_vocabulary_
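As a rough illustration of the translation step described in the quote, the sketch below maps the words of a search query to controlled terms. The term list, the variant table, and the function name to_controlled_terms are all hypothetical and are not taken from gap.com; real search systems also handle phrases, stemming, and misspellings, which this sketch ignores.

```python
# Hypothetical controlled terms for an imaginary clothing site.
CONTROLLED_TERMS = {"pants", "denim pants", "t-shirts"}

# Hypothetical table: natural-language variant -> controlled term.
VARIANT_TO_PREFERRED = {
    "trousers": "pants",
    "slacks": "pants",
    "jeans": "denim pants",
    "dungarees": "denim pants",
    "tee": "t-shirts",
    "tshirt": "t-shirts",
}


def to_controlled_terms(query: str) -> list[str]:
    """Map each word of a natural-language query to a controlled term, if one exists."""
    matched = []
    for word in query.lower().split():
        word = word.strip(".,!?")
        # A word is either already a controlled term or a known variant of one.
        if word in CONTROLLED_TERMS:
            term = word
        else:
            term = VARIANT_TO_PREFERRED.get(word)
        # Words with no mapping are dropped; duplicates are kept only once.
        if term is not None and term not in matched:
            matched.append(term)
    return matched


print(to_controlled_terms("Blue slacks and jeans"))
```

Words with no entry in either table (here "blue" and "and") are ignored, so the search falls back to whatever the site does with unmatched text.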
An example to show how subject heading lists and thesauri both provide structural hierarchies, so that terms are presented in relation to their broader terms, narrower terms, and related terms.
GAP Example
Pretend the GAP only sells pants and has hundreds of types of pants. In this case, the GAP might require more from its controlled vocabulary. A systematic way to map out the different terms will help people quickly find the specific kinds of pants they want. The GAP will need a hierarchy showing the broader terms (BTs), the narrower terms (NTs), and the variant terms (“USE” and “UF”, for “Used For”). The BT and NT links show which terms are subsets of larger, broader concepts; USE and UF link variant terms to the preferred term. The GAP will start off with a jumble of words that are all related to “pants” in some way. Picture a bucket called “Pants” that holds all the terms with a relationship to the concept of pants. “Pants” is the broader term, and the kinds of pants are subsets of the whole universe of pants.
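One possible way to hold the BT, NT, and USE/UF relationships from the pants example in code is sketched below. The specific kinds of pants, the variant terms, and the helper names are assumptions made up for illustration; they are not the GAP's actual vocabulary.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Term:
    """One preferred term in a small thesaurus (all sample data is hypothetical)."""
    name: str
    broader: Optional[str] = None                       # BT: the broader term
    narrower: list = field(default_factory=list)        # NT: narrower terms
    used_for: list = field(default_factory=list)        # UF: variants this term replaces


THESAURUS = {
    "Pants": Term("Pants", narrower=["Jeans", "Khakis", "Cargo pants"],
                  used_for=["Trousers", "Slacks"]),
    "Jeans": Term("Jeans", broader="Pants",
                  narrower=["Bootcut jeans", "Skinny jeans"], used_for=["Denims"]),
    "Khakis": Term("Khakis", broader="Pants"),
    "Cargo pants": Term("Cargo pants", broader="Pants"),
    "Bootcut jeans": Term("Bootcut jeans", broader="Jeans"),
    "Skinny jeans": Term("Skinny jeans", broader="Jeans"),
}

# USE references are the inverse of UF: variant term -> preferred term.
USE = {variant: term.name for term in THESAURUS.values() for variant in term.used_for}


def preferred(label: str) -> Optional[str]:
    """Return the preferred term for a label, following a USE reference if needed."""
    if label in THESAURUS:
        return label
    return USE.get(label)


def broader_chain(label: str) -> list:
    """Walk BT links from a term up to the top of the hierarchy."""
    chain = []
    current = THESAURUS[label].broader
    while current is not None:
        chain.append(current)
        current = THESAURUS[current].broader
    return chain


def all_narrower(label: str) -> list:
    """Collect every term below `label`, depth-first."""
    result = []
    for child in THESAURUS[label].narrower:
        result.append(child)
        result.extend(all_narrower(child))
    return result


print(preferred("Trousers"))
print(broader_chain("Skinny jeans"))
print(all_narrower("Pants"))
```

Walking the BT links upward broadens a search, and collecting the NT links downward narrows it, which is the same movement the “Pants” bucket describes.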
What is the benefit of this kind of hierarchical arrangement?
There is a lot you can do with this hierarchical arrangement.
• It can help you formulate your homepage navigation.
• It can improve searching and browsing.
• It can help users broaden and narrow their search results quickly by showing them where each set of results fits into the site’s hierarchy.
Generally, few sites need to go beyond the level of a taxonomy.
Benefits of Controlled Vocabulary
• CVs can help with category analysis or keeping your categories distinct.
• CVs can help establish a site’s navigation.
• CVs can be the basis for personalization features.
• CVs get the organization using the same language as the users (which should result in better communication with them).
• CVs can help the organization (and the user) understand what concepts your site covers. Your controlled vocabulary is in reality a “concept map” of what is on your site.
Problems with Controlled Vocabularies
• They are a lot of work.
• They are often difficult and time-consuming to maintain.
• They can be very political.
• Authors have freedom in their choice of terms.
Process of Creating A Controlled Vocabulary
General Principles For Creating Controlled Vocabulary
• Specificity: the level of hierarchical depth in the concepts.
• Literary warrant: terminology is added to a subject heading list or thesaurus when a new concept shows up in the information resources that need organizing and therefore needs specific terminology assigned to it.
• Direct entry: a concept should be entered into a vocabulary using the term that names it, rather than being treated as a subdivision of a broader concept.
Steps For Applying Controlled Vocabulary
• Determine aboutness.
• Determine which subject concepts are to be represented in the metadata record.
• Look up the selected important concepts as targeted searches in the controlled vocabulary.
General Principles For Applying Controlled Vocabularies
• Specific entry: this allows the user to know when to stop searching for an appropriate controlled vocabulary term.
• Number of terms assigned: there should not be any limit on the number of terms or descriptors assigned to the concepts.
• Concepts not in CV: if a concept is not present in the controlled vocabulary, it should be represented temporarily by a more general concept, rather than by simply adding unauthorized terms to the record.
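The principles above can be sketched as a small assignment routine. This is only an illustration under stated assumptions: the authorized terms, the sample concepts, and the idea of returning a list of concepts to propose for the vocabulary are hypothetical additions, and the indexer is assumed to supply the more general concept for each concept found in the resource.

```python
# Hypothetical set of terms currently authorized in the controlled vocabulary.
AUTHORIZED = {"Pants", "Jeans", "Skinny jeans"}


def assign_terms(concepts: dict) -> tuple:
    """Assign controlled terms to a record.

    `concepts` maps each concept found in the resource to the more general
    concept the indexer would fall back on. Returns a pair:
    (terms for the record, concepts to propose for the vocabulary).
    """
    assigned = []
    proposals = []
    for concept, general in concepts.items():
        if concept in AUTHORIZED:
            # Specific entry: use the most specific authorized term and stop there.
            term = concept
        elif general in AUTHORIZED:
            # Concept not in the CV: stand in with the more general term for now,
            # and note the concept instead of adding an unauthorized term.
            term = general
            proposals.append(concept)
        else:
            raise ValueError(f"No authorized term found for {concept!r}")
        # No cap on how many terms a record may carry; only duplicates are skipped.
        if term not in assigned:
            assigned.append(term)
    return assigned, proposals


terms, to_propose = assign_terms({"Skinny jeans": "Jeans", "Jeggings": "Jeans"})
print(terms)
print(to_propose)
```

Here “Jeggings” is not authorized, so the record carries “Jeans” in its place, and the concept is set aside for whoever maintains the vocabulary.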
Natural Language or “Uncontrolled” Approaches
• NLP: Natural Language Processing
• Tagging: “a process by which a distributed mass of users applies keywords to various types of web-based resources for the purposes of collaborative information organization and retrieval.”
