The MoTIF Project: Constructing a Pilot Thesaurus of Irish Folklore Using Facet Analysis - Catherine Ryan

Presented to the Annual Seminar of the Library Association of Ireland’s Cataloguing and Metadata Group, 8th of November, 2013.

  • MoTIF is a collaborative project undertaken by the Digital Repository of Ireland and the National Library of Ireland. It produced a set of guidelines on thesaurus construction as a resource for librarians, archivists and other information professionals who wish to organise and annotate their content for improved search and retrieval. The guidelines are accompanied by a pilot thesaurus of Irish folklore, which acts as an illustrative example: a visual demonstration of the principles and processes outlined in the guidelines. Both the guidelines and the pilot have been submitted for review and will be published in December 2013.
  • Controlled vocabularies are restricted lists of terms used to provide consistency across search, remove ambiguity between terms and improve search precision. They may, but do not have to, contain equivalence relationships such as USE, Use For (UF) and ‘see’ references.
  • Taxonomies are controlled vocabularies with hierarchical relationships, which can be used for browsing up and down a tree or navigating a website.
  • Thesauri are controlled vocabularies with hierarchical, associative and equivalence relationships. They offer all the benefits previously described, but can also make more connections between terms using associative relationships, allow search over non-preferred terms using equivalence relationships, and clarify the meaning of terms using definitions and scope notes. Now, these definitions can overlap but that, broadly speaking, is how they work.
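As a rough illustration of how the three kinds of thesaurus relationship might be held in software, here is a minimal Python sketch. It is not the data model of the pilot thesaurus or of any thesaurus management tool; the class names, the `lookup` helper and the example terms (`fairies`, `good people`, `supernatural beings`) are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    preferred: str                                # preferred term (target of USE)
    used_for: set = field(default_factory=set)    # UF: non-preferred terms
    broader: set = field(default_factory=set)     # BT
    narrower: set = field(default_factory=set)    # NT
    related: set = field(default_factory=set)     # RT
    scope_note: str = ""                          # SN


class Thesaurus:
    def __init__(self):
        self.concepts = {}   # preferred term -> Concept
        self.use = {}        # lower-cased non-preferred term -> preferred term

    def add(self, preferred, used_for=(), scope_note=""):
        concept = Concept(preferred, set(used_for), scope_note=scope_note)
        self.concepts[preferred] = concept
        for term in used_for:
            self.use[term.lower()] = preferred
        return concept

    def add_broader(self, narrower, broader):
        # Hierarchical relationships are stored in both directions (BT/NT).
        self.concepts[narrower].broader.add(broader)
        self.concepts[broader].narrower.add(narrower)

    def lookup(self, term):
        """Resolve a search term to its concept, following USE references."""
        key = term.strip().lower()
        for preferred, concept in self.concepts.items():
            if preferred.lower() == key:
                return concept
        preferred = self.use.get(key)
        return self.concepts.get(preferred) if preferred else None


# Hypothetical example terms, not taken from the pilot thesaurus.
t = Thesaurus()
t.add("supernatural beings")
t.add("fairies", used_for={"good people"}, scope_note="Illustrative scope note.")
t.add_broader("fairies", "supernatural beings")
print(t.lookup("Good People").preferred)
```

A plain controlled vocabulary would need only the `preferred` and `used_for` fields, and a taxonomy would add `broader` and `narrower`; the `related` and `scope_note` fields are what make the structure a thesaurus in the sense described above.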
  • Following the Digital Archiving in Ireland DRI report, an opportunity was identified to produce guidelines which would give professionals the advice they need to improve their own data practices by adhering to international standards and best practices.
  • The guidelines act as a basic introduction to thesauri, including the main elements of a thesaurus, facet analysis, and the construction process. They also briefly cover planning, multilingual thesauri, mapping thesauri and the relationship between thesauri and the Semantic Web.
  • The construction process in brief.
  • Step 1: selecting and recording terms.
  • Step 2: determining the structure and display of the thesaurus, that is, whether it will be organised by subject or by facet.
  • Step 3: the facet analysis itself.
  • Step 4: creating relationships and notes within the now complete structure.
  • Step 5: creating an alphabetical list (if desired).
  • Step 6: expert review.
  • Step 7: documentation.
  • The result is the thesaurus.
  • The correct form of these terms was then chosen. ISO 25964-1 sets out guidelines on the form that a term should take when entered into the thesaurus. Nouns are the most common form you will encounter. Verbs are the next most common and take the gerund or verbal form (usually ending in -ing). Adjectives, adverbs (usually ending in -ly) and articles are to be avoided. Some adjectives were included in the pilot thesaurus as they did pop up as significant in the literature. For example, the significance of wearing red on a particular day might be discussed as part of the lore of a particular area.
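The form-of-entry rules above could be partly supported by a simple pre-check that flags candidate terms for a human editor. The sketch below is only a heuristic and an assumption on our part, not something prescribed by ISO 25964-1 or used in the project: the function name and the example terms are invented, and the "-ly" test will wrongly flag nouns such as "holly", so its output is a prompt for review and never an automatic rejection.

```python
ARTICLES = ("the ", "a ", "an ")


def entry_form_warnings(term):
    """Return a list of possible form-of-entry problems for a candidate term."""
    text = term.strip().lower()
    warnings = []
    if text.startswith(ARTICLES):
        warnings.append("leading article")
    if text.startswith("to "):
        warnings.append("possible infinitive")
    if text.endswith("ly"):
        warnings.append("possible adverb")
    return warnings


# Hypothetical candidate terms for illustration only.
for candidate in ["the fairies", "to churn", "churning", "quickly", "holly"]:
    print(candidate, "->", entry_form_warnings(candidate))
```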
  • The initial list of terms is all over the place and a systematic structure needs to be developed to order them in a logical way.
  • A systematic structure can be thought of as a hierarchical or classified display with subjects or facets at the top of the hierarchies. At this stage a hierarchical display was chosen, with fundamental facets as the main divisions, or top concepts. This structure is more easily updated, and it is a good demonstration of the ISO standard’s rule for hierarchical arrangement: a broader term and its narrower term should be in one of three types of relationship. The first is a thing/kind-of-thing relationship: all concepts in the Objects facet will be a kind or type of object. The second is a whole and its parts: the human anatomy will have hands, heads and so on. In the third, the narrower term is an instance of the broader term: for example, the class ‘dogs’ would have a narrower term ‘Spot’. It’s important to emphasise that we didn’t structure the thesaurus at this stage; we only made the decision on the design of the scaffolding. More detailed elements of the systematic structure were only determined during the vocabulary analysis.
  • We used the method of facet analysis, which is the analysis of a subject area into its constituent concepts, which are then grouped into facets. The ISO thesaurus standard defines a facet as a ‘grouping of concepts of the same inherent category’. Objects, materials, people, places and so on are known as fundamental facets. These fundamental categories of facets were first devised by Ranganathan as part of a library classification scheme in the 1920s and 1930s. Ranganathan proposed five categories, Personality, Matter, Energy, Space and Time, or PMEST, which could cover all aspects of a discipline or subject. These were later expanded by Brian Vickery for the Classification Research Group (CRG), based on the Aristotelian fundamental categories: thing, kind, part, property, material, process, operation, agent, patient, product, by-product, space and time. The CRG went on to state that these categories act as guides to analysis and should not be imposed on subjects. Ultimately the choice of facets will depend on the subject matter and what is most practical.
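The three permitted kinds of hierarchical relationship can be sketched in Python as follows. This is an illustration under assumptions, not the project's software: the `Hier` and `Hierarchy` names are invented, the BTG/NTG, BTP/NTP and BTI/NTI tags are assumed to match the abbreviations used by the standard, and the `animals` broader term is an invented example (the `human anatomy` and `Spot` examples come from the notes above).

```python
from enum import Enum


class Hier(Enum):
    GENERIC = ("BTG", "NTG")      # thing / kind of thing
    PARTITIVE = ("BTP", "NTP")    # whole / part
    INSTANCE = ("BTI", "NTI")     # class / individual instance


class Hierarchy:
    def __init__(self):
        self.links = []   # (broader term, relationship type, narrower term)

    def add_narrower(self, broader, narrower, kind):
        # Refuse any link that is not one of the three permitted types.
        if not isinstance(kind, Hier):
            raise ValueError("kind must be one of the three hierarchical types")
        self.links.append((broader, kind, narrower))

    def entry(self, term):
        """List the tagged hierarchical relationships of one term."""
        lines = []
        for broader, kind, narrower in self.links:
            bt_tag, nt_tag = kind.value
            if narrower == term:
                lines.append(f"{bt_tag} {broader}")
            if broader == term:
                lines.append(f"{nt_tag} {narrower}")
        return lines


h = Hierarchy()
h.add_narrower("animals", "dogs", Hier.GENERIC)        # invented example
h.add_narrower("human anatomy", "hands", Hier.PARTITIVE)
h.add_narrower("human anatomy", "heads", Hier.PARTITIVE)
h.add_narrower("dogs", "Spot", Hier.INSTANCE)
print(h.entry("dogs"))
```

Forcing every link to name its type is one way to make the rule checkable: a proposed broader/narrower pair that fits none of the three types is probably an associative relationship instead.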
  • What was most practical for the pilot were the facets listed above. How to organise...
  • ...a jumble of words into...
  • ...into lists of basic coherent facets. For example, in the literature, an agent is a person or piece of equipment which carries out transitive actions, i.e. actions that require a direct object. Following this, animals, fish and people were placed under the Agents facet, as these are living creatures which can perform actions and can have an effect on the environment around them. The category also includes supernatural beings and creatures. Other living organisms were originally located under a separate Living Entities facet. In the end, the decision was taken to include all living organisms, from people through to mythical beings and plants, under the Agents facet, as it made more sense to keep all these living entities together. It is also arguable that, in folklore, some plants, trees and other such living entities have the potential to perform actions or have an effect on others. So that made sense in the context of folklore. It may not in another. Rather than confuse people, equipment was then put into Objects. The guidelines go through a few more tricky decisions and they also outline the scope of each fundamental facet as defined for the pilot thesaurus.
  • Facet analysis II. Once the initial analysis had been completed and all terms grouped, the facets were then divided more narrowly, using node labels to divide the facets into sub-facets and to organise them according to their principles, or characteristics, of division. In the above example, the Agents facet has the sub-facets people, animals, other living organisms and supernatural beings. The animals sub-facet is then organised according to its characteristics of division, in this case animals by function, by species and so on. This is exactly the kind of division you would see on, say, a fashion website where shirts are organised by size, by colour and so on.
  • Once the hierarchies were completed and input into the thesaurus management software, associative relationships were added. This is the process recommended by the ISO standard, as the most useful associative relationships are usually across hierarchies and so this is easier to do once those hierarchies have been established. These are examples of the most common types of relationship created across hierarchies, so we have agents relating to their activities, materials referring to their products, objects with parts, etc. It should also be noted that these relationships are reciprocal, so they refer to each other.
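To make the role of node labels concrete, the Agents example can be sketched as a nested Python structure. This is an assumed representation for illustration only: the angle-bracket convention for node labels and the function names are our choices, and the leaf terms `working animals` and `horses` are invented rather than taken from the pilot thesaurus. The sketch treats node labels as headings that organise a display, not as terms that could be assigned to a record.

```python
# Nested dicts: key = term or node label, value = its children.
# Node labels are written in angle brackets; the leaf terms under them are
# hypothetical examples, not entries from the pilot thesaurus.
AGENTS = {
    "Agents": {
        "people": {},
        "animals": {
            "<animals by function>": {"working animals": {}},
            "<animals by species>": {"horses": {}},
        },
        "other living organisms": {},
        "supernatural beings": {},
    }
}


def is_node_label(label):
    return label.startswith("<") and label.endswith(">")


def hierarchical_display(tree, depth=0):
    """Return the indented lines of a hierarchical display."""
    lines = []
    for label, children in tree.items():
        lines.append("  " * depth + label)
        lines.extend(hierarchical_display(children, depth + 1))
    return lines


def indexing_terms(tree):
    """Collect every term in the tree, skipping node labels."""
    terms = []
    for label, children in tree.items():
        if not is_node_label(label):
            terms.append(label)
        terms.extend(indexing_terms(children))
    return terms


print("\n".join(hierarchical_display(AGENTS)))
```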
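The reciprocity of associative relationships is easy to enforce in code by always recording a link in both directions. The following is a minimal sketch, assuming a simple in-memory store; the `RelatedTerms` class and the example pairs (an agent with its activity, a material with its product) are invented for illustration and are not entries from the pilot thesaurus.

```python
from collections import defaultdict


class RelatedTerms:
    def __init__(self):
        self._rt = defaultdict(set)   # term -> set of related terms

    def add(self, a, b):
        if a == b:
            raise ValueError("a term cannot be related to itself")
        # Record the relationship in both directions so it is reciprocal.
        self._rt[a].add(b)
        self._rt[b].add(a)

    def related(self, term):
        return sorted(self._rt.get(term, ()))

    def is_reciprocal(self):
        """Check that every RT link has its matching return link."""
        return all(
            a in self._rt.get(b, ())
            for a, targets in self._rt.items()
            for b in targets
        )


rt = RelatedTerms()
rt.add("blacksmiths", "smithing")   # agent and activity (hypothetical terms)
rt.add("wool", "blankets")          # material and product (hypothetical terms)
print(rt.related("smithing"))
```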
  • After that, example scope notes were added to the pilot thesaurus to explain concepts. Like the relationships, the scope notes present in the pilot thesaurus should be considered as illustrative examples as this was as much as the time frame of the project would allow.
  • Two lists, alphabetical and hierarchical, were then generated within the software and exported. These formed the basis of the print version of the thesaurus. An electronic version of the thesaurus also exists and it contains both hierarchical and alphabetical displays which can be browsed. It can also be searched by keyword.
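An alphabetical display and a keyword search of the kind described can be sketched in a few lines of Python. This is not how the thesaurus management software works internally, which the notes do not describe; the function names, the input shape (a dict mapping each preferred term to its non-preferred terms) and the example terms are all assumptions made for illustration.

```python
def alphabetical_display(entries):
    """Interleave preferred and non-preferred terms in one A-Z list.

    entries maps each preferred term to a list of its non-preferred terms.
    """
    rows = []
    for preferred, non_preferred in entries.items():
        text = preferred
        for term in sorted(non_preferred, key=str.lower):
            text += f"\n  UF {term}"
        rows.append((preferred.lower(), text))
        for term in non_preferred:
            # Each non-preferred term gets its own entry pointing back.
            rows.append((term.lower(), f"{term}\n  USE {preferred}"))
    return [text for _, text in sorted(rows)]


def keyword_search(entries, keyword):
    """Return preferred terms whose own or non-preferred forms match."""
    needle = keyword.lower()
    hits = set()
    for preferred, non_preferred in entries.items():
        candidates = [preferred, *non_preferred]
        if any(needle in term.lower() for term in candidates):
            hits.add(preferred)
    return sorted(hits, key=str.lower)


# Hypothetical entries for illustration only.
ENTRIES = {"fairies": ["good people"], "dogs": []}
print("\n".join(alphabetical_display(ENTRIES)))
```

Searching on a non-preferred form, for example `keyword_search(ENTRIES, "good")`, returns the preferred term, which is the practical benefit of recording equivalence relationships.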

Transcript

  • 1. Constructing a Thesaurus of Irish Folklore Using Facet Analysis The MoTIF Project LAI CMG Annual Seminar, November 8 2013
  • 2. Project Aims: Thesaurus Construction Guidelines; MoTIF: Pilot Thesaurus of Irish Folklore
  • 3. Controlled Vocabularies: Restricted list of terms. Consistency, remove ambiguity, improve precision
  • 4. Taxonomies: Restricted list of terms; hierarchical relationships. Consistency, remove ambiguity, improve precision; browsing, navigating
  • 5. Thesauri: Restricted list of terms; hierarchical relationships; equivalence relationships, associative relationships, scope notes. Consistency, remove ambiguity, improve precision; browsing, navigating; synonyms, antonyms, making connections, definitions, scope
  • 6. Custom vocabularies; Adapted vocabularies; International standards and best practice
  • 7. Guidelines: Literature review; Main elements of a thesaurus; Terms and concepts; Relationships (USE, UF, BT, NT, RT); Notes, node labels and arrays; Facet analysis; Construction process
  • 11-19. MoTIF Construction Process (built up one step per slide): 1. Selection of terms; 2. Structure; 3. Facet Analysis; 4. Relationships; 5. Alphabetical List; 6. Expert Review; 7. Documentation; result: Thesaurus
  • 20. Term Selection I: Vocabulary Resources. A Handbook of Irish Folklore by Seán Ó Súilleabháin; Béaloideas: Journal of the Folklore of Ireland Society
  • 21. Term Selection II: Form of Entry. Nouns: count nouns (cows, dogs) in the plural, non-count (livestock, milk) in the singular. Verbs: gerund or verbal, no infinitive. Adjectives: avoid unless significant. Adverbs: avoid. Articles (the, a): avoid.
  • 23. Systematic Structure: Hierarchical (facet) or classified (subject) display. Fundamental facets as top concepts (TT). Easily updated structure. Good demonstration of the ISO 25964 hierarchy rules: thing/kind; whole/part; particular instances of a class.
  • 24. Facet Analysis. ISO: “grouping of concepts of the same inherent category”. Objects, materials, people, places, etc. Ranganathan, 1920s and 1930s: Personality, Matter, Energy, Space, Time. Classification Research Group, 1960s: thing, kind, part, property, material, process, operation, agent, patient, product, by-product, space and time.
  • 25. Time; Place / Space / Environment; Products; Activities; Processes and Phenomena; Events; Agents; Objects; Materials; Attributes and Properties; Parts; Genre; Abstract Entities and Concepts
  • 32. Current and Future Work: expansion of the pilot thesaurus to approximately 2,000 preferred terms; feasibility study into representation in SKOS. Potential Future Work: multilingual thesaurus (Irish and English); representation in SKOS; mapping to other vocabularies.
  • 33. Thank you! UF Thanks! UF Cheers! UF Ta!