
Subject Headings make information to be topic maps



This paper reports on efforts to make topic maps from Subject Headings (SHs) and discusses their practical use for organizing information and knowledge. SHs are often maintained by libraries and used in bibliographic records. They are thesauri and are well organized. Fortunately, some SHs are published on the Web, and we transformed them into topic maps. Usually each subject in an SH has its own ID, which can play the role of a PSI (Published Subject Identifier). By keeping the relationships included in SHs, such as Broader-Narrower, Related and USE-UF, in the topic maps, information or knowledge can be linked together and organized according to the structure of the SHs. In other words, by using SHs, information and knowledge can easily be turned into topic maps.

Published in: Education


  1. 1. Subject Headings make information to be topic maps. 2010-9-30. Motomu Naito, Center for Integrated Area Studies (CIAS), Kyoto University
  2. 2. Outline: 1. Background; 2. Purpose; 3. Subject Headings (3.1 NDLSH, 3.2 LCSH); 4. Practical use of Subject Headings; 5. Demo; 6. Challenges; 7. Conclusion & Future work
  3. 3. 1. Background: Area Study and Area Informatics. This work is part of the Area Informatics activities at the Center for Integrated Area Studies (CIAS), Kyoto University. Area Study is an interdisciplinary science: it aims to understand and compare areas comprehensively, across diverse languages, subjects, disciplines and methodologies (history, literature, religions, politics, economics, ethnology, folklore, agriculture, environment, etc.). Area Informatics is the informatics paradigm in area studies: it focuses on quantitative analysis (objective, comparative and reproducible approaches; spatiotemporal attributes of events) and supports knowledge discovery (integration of disciplines; creation of hypotheses). Source: Shoichiro Hara, TMJP2010
  4. 4. Model of Area Informatics Source: Shoichiro Hara, TMJP2010
  5. 5. 2. Purpose - Building and maintaining well-organized knowledge is hard, time-consuming work - Much well-organized knowledge already exists (e.g. NDLSH, BSH, LCSH, JST thesaurus) - Fortunately, some Subject Headings (SHs) are published on the Web and we can use them (e.g. NDLSH, LCSH). Purpose of our activity: to build a good system for linking and organizing information related to Area Studies. Purpose of today’s presentation: to report on and discuss our efforts to make topic maps and PSIs from SHs
  6. 6. 3. Subject Headings. What are Subject Headings? Wikipedia redirects “Subject Headings” to “Index term” and defines the term as follows: “An index term, subject term, subject heading, or descriptor, in information retrieval, is a term that captures the essence of the topic of a document. Index terms make up a controlled vocabulary for use in bibliographic records.” ・We are currently working on the following SHs: NDLSH, BSH and LCSH ・We can probably find many more SHs in various countries: German SH, Norwegian SH, Finnish SH, Thai SH, etc.
  7. 7. 3.1 NDLSH ・NDLSH: National Diet Library Subject Headings (Japan) ・We are making a topic map from the 2008 version of NDLSH - Subject Headings: 17,953 - Subject Headings + reference words: 47,816 (47,377) - BT-NT relations: 13,220; RT relations: 9,738 - USE-UF relations with LCSH: 11,663 ・Conversion from the SH to a topic map - Subject Headings -> Topics - BT-NT, RT, USE-UF relations -> Associations - USE-UF, SA relations, scope note, reading, … -> Occurrences ・Each SH has its own ID, which can be used as a PSI (e.g. 00574308) ・If NDLSH shares PSIs with LCSH, it can be merged with LCSH ・NDLSH is published on the Web and can be downloaded
  8. 8. Part of NDLSH: Subject Headings around “ビール” (Beer)
  9. 9. Original data: NDLSH is provided as TSV (tab-separated values) data. Sample records:
      ビール ビール〈地理区分〉 ID:00560674 UF:ビヤ ; 麦酒〔バクシュ〕 ; Beer BT:洋酒〔ヨウシュ〕{00574373} RT:ホップ{00563417} ; ※麦芽〔バクガ〕{00560487} NDC(9):588.54 NDLC:DL687;PA416
      ビールス ビールス USE:ウイルス{00560678}
      ビールスショウ ビールス症 USE:ウイルス感染症〔ウイルスカンセンショウ〕{00560679}
      ビールゾク ※ビール族 ID:00575193 UF:Bhil (Indic people) NDC(9):382.25;469.925 NDLC:G131;SA51
      ビールムギ ビール麦 USE:大麦〔オオムギ〕{00568818}
      ビインコウ 鼻咽腔 ID:00560662 UF:上咽頭〔ジョウイントウ〕 ; Nasopharynx BT:咽頭〔イントウ〕{00564179} NDC(9):491.134;496.8 NDLC:SC661
      ヒエ ヒエ ID:00563143 UF:稗〔ヒエ〕 BT:穀物〔コクモツ〕{00566375} ; イネ科〔イネカ〕{00564121} NDC(9):479.343;616.62 NDLC:DM221;RA347;RB134
      ヒエ 稗 USE:ヒエ{00563143}
      ヒエイリダンタイ 非営利団体 USE:NPO〈地理区分〉{00577640}
  10. 10. Conversion process: conversion from the original TSV data to topic maps
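The conversion from NDLSH records to topics and associations can be sketched roughly as follows. This is only an illustration, not the converter used in the project: it assumes each record is one line with three tab-separated columns (reading, heading, labelled fields), and the label list is taken from the sample records alone. The function names `parse_record`, `split_ref` and `to_associations` are hypothetical.

```python
import re

# Field labels seen in the sample records; the complete NDLSH label set is an assumption.
LABELS = ["ID", "UF", "BT", "RT", "USE", "NDC(9)", "NDLC"]
# A label counts only at the start of the field string or after whitespace.
LABEL_RE = re.compile(r"(?<!\S)(" + "|".join(map(re.escape, LABELS)) + r"):")
# References look like: heading{8-digit id}
REF_RE = re.compile(r"^(?P<name>.*?)\{(?P<id>\d+)\}$")


def split_ref(value):
    """Split 'heading{id}' into (heading, id); id is None if absent."""
    m = REF_RE.match(value)
    return (m.group("name"), m.group("id")) if m else (value, None)


def parse_record(line):
    """Parse one record, assumed to be: reading <TAB> heading <TAB> fields."""
    reading, heading, rest = line.rstrip("\n").split("\t", 2)
    parts = LABEL_RE.split(rest)  # ['', label, value, label, value, ...]
    record = {"reading": reading, "heading": heading}
    for label, value in zip(parts[1::2], parts[2::2]):
        values = [v.strip() for v in value.split(";") if v.strip()]
        if label in ("BT", "RT", "USE"):
            values = [split_ref(v) for v in values]
        record[label] = values
    return record


def to_associations(record):
    """Turn BT and RT references of a record that has an ID into
    (association type, first topic id, second topic id) tuples."""
    topic_id = record["ID"][0]
    assocs = [("broader-narrower", ref, topic_id) for _, ref in record.get("BT", [])]
    assocs += [("related", topic_id, ref) for _, ref in record.get("RT", [])]
    return assocs


SAMPLE = ("ビール\tビール〈地理区分〉\tID:00560674 UF:ビヤ ; 麦酒〔バクシュ〕 ; Beer "
          "BT:洋酒〔ヨウシュ〕{00574373} RT:ホップ{00563417} ; ※麦芽〔バクガ〕{00560487} "
          "NDC(9):588.54 NDLC:DL687;PA416")
record = parse_record(SAMPLE)
print(record["heading"], record["ID"], to_associations(record))
```

Records that carry only a USE reference have no ID of their own, so in this sketch they would be handled as alternative names of the referenced topic rather than passed to `to_associations`.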
  11. 11. NDLSH Ontology: ontology graph of the NDLSH topic map
  12. 12. NDLSH topic map application: screenshots of the application
  13. 13. 3.2 LCSH ・LCSH: Library of Congress Subject Headings (US) ・We are making a topic map from LCSH - We downloaded it from the Web - Subject Headings: 380,123 - BT-NT: 254,651; RT: 11,137 ・RDF (SKOS) to Topic Maps using Omnigator - SH (core:Concept) -> Topics - BT-NT, RT relations -> Associations - scopeNote, created, modified, comment, etc. -> Occurrences ・Each SH has its own identifier in the form of a URI, which can be used as a PSI ・LCSH has already been published on the Web with Linked Data in mind
  14. 14. Part of LCSH: Subject Headings around “Beer”
  15. 15. Original data: LCSH is provided as RDF data
      <rdf:Description rdf:about="">
        :
        :
        <skos:narrower rdf:resource=""/>
        <skos:broader rdf:resource=""/>
        <skos:closeMatch rdf:resource=""/>
        <skos:inScheme rdf:resource=""/>
        <skos:inScheme rdf:resource=""/>
        <rdf:type rdf:resource=""/>
        <skos:related rdf:resource=""/>
        <skos:related rdf:resource=""/>
        <skos:related rdf:resource=""/>
        <skos:prefLabel xml:lang="en">Beer</skos:prefLabel>
        <owl:sameAs rdf:resource="info:lc/authorities/sh85012832"/>
        <dcterms:modified rdf:datatype="">1989-03-22T15:09:28-04:00</dcterms:modified>
      </rdf:Description>
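To illustrate how a SKOS record of this shape maps onto topics and associations, here is a minimal sketch using only the Python standard library. It is not the Omnigator conversion used in the project. All `example.org` URIs in the sample are hypothetical placeholders, and `concept_to_topic` is an illustrative name; only the RDF and SKOS namespace URIs are the standard ones.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Hypothetical sample: the example.org URIs are placeholders, not LCSH identifiers.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <rdf:Description rdf:about="http://example.org/sh/beer">
    <skos:prefLabel xml:lang="en">Beer</skos:prefLabel>
    <skos:broader rdf:resource="http://example.org/sh/broader-1"/>
    <skos:narrower rdf:resource="http://example.org/sh/narrower-1"/>
    <skos:related rdf:resource="http://example.org/sh/related-1"/>
  </rdf:Description>
</rdf:RDF>"""


def concept_to_topic(desc):
    """Map one rdf:Description to a topic-like dict:
    the URI becomes the PSI, prefLabel a name, BT/NT/RT become associations."""
    topic = {
        "psi": desc.get(RDF_ABOUT),
        "names": [e.text for e in desc.findall("skos:prefLabel", NS)],
        "associations": [],
    }
    for relation in ("broader", "narrower", "related"):
        for e in desc.findall("skos:" + relation, NS):
            topic["associations"].append((relation, e.get(RDF_RESOURCE)))
    return topic


root = ET.fromstring(SAMPLE)
topics = [concept_to_topic(d) for d in root.findall("rdf:Description", NS)]
print(topics[0])
```

Properties such as scopeNote or modified would be added to the dict as occurrences in the same way; they are left out here to keep the sketch short.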
  16. 16. LCSH Ontology: ontology graph of the LCSH topic map
  17. 17. LCSH topic map application: screenshots of the application
  18. 18. 4. Practical use of Subject Headings. Many practical uses are possible, for example: ・Organizing internal and external information according to SHs ・Multi-language mapping using LCSH as the core system ・Mutual complementing of our own concept classification and SHs ・Providing SHs through a web service using TMRAP ・Using SHs as PSIs ・Using SHs as common test data for TM engines, TM query engines, etc.
  19. 19. (1) Organizing information according to SHs. Example: organizing Wikipedia according to SHs ・Links available to Wikipedia: NDLSH 12,051; BSH 6,086. Figure: Subject Headings around “Beer”
  20. 20. Organizing Wikipedia: the world around “Beer” in NDLSH. Figure: graph of the headings around Beer (Wine, Amenities of life, Fruit liquor, Hop, Wines and Spirits, Brandy, Liquor, Distilled liquor, Whiskey, Malt, Barley)
  21. 21. Organizing Wikipedia: a Wikipedia address is easy to generate by appending the SH (e.g. “ビール”) to the base address of Wikipedia articles
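Generating an address from a heading amounts to string concatenation plus URL encoding. A minimal sketch follows. The function name `wikipedia_address` is hypothetical, and the base address is passed in as a parameter because the transcript does not preserve it; `https://example.org/wiki/` below is a placeholder, not the real address.

```python
from urllib.parse import quote


def wikipedia_address(base, heading):
    """Append a percent-encoded subject heading to a base article address.
    Spaces become underscores, as in Wikipedia article titles."""
    return base + quote(heading.replace(" ", "_"), safe="")


# Placeholder base address (assumption); substitute the real article prefix.
print(wikipedia_address("https://example.org/wiki/", "ビール"))
```

Whether an article actually exists at the generated address still has to be checked, which is presumably why only part of the headings have links.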
  22. 22. (2) Mapping between multiple languages: if the SHs of each language are mapped to LCSH, multi-language mapping is achieved. Example: Japanese-Norwegian mapping via LCSH (English). ビール (NDLSH or BSH, Japanese) merges with Beer (LCSH, English), which in turn merges with Øl (Norwegian SH)
  23. 23. Mapping between multiple languages: links from NDLSH to LCSH (USE-UF relations between NDLSH and LCSH)
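The pivot idea can be sketched in a few lines: each national SH only needs a mapping to LCSH identifiers, and any pair of languages is then connected by composing two mappings. The LCSH identifier for Beer is taken from the RDF sample on the LCSH data slide; the Norwegian entry and the function name `map_via_pivot` are assumptions made for illustration.

```python
# Each national SH maps its own headings to an LCSH identifier (the pivot).
ndlsh_to_lcsh = {"ビール": "sh85012832"}
# Hypothetical Norwegian mapping, for illustration only.
norwegian_to_lcsh = {"Øl": "sh85012832"}


def map_via_pivot(source_to_pivot, target_to_pivot):
    """Compose two term->pivot mappings into a source term -> target terms mapping."""
    pivot_to_target = {}
    for term, pivot in target_to_pivot.items():
        pivot_to_target.setdefault(pivot, []).append(term)
    return {
        term: pivot_to_target.get(pivot, [])
        for term, pivot in source_to_pivot.items()
    }


japanese_to_norwegian = map_via_pivot(ndlsh_to_lcsh, norwegian_to_lcsh)
print(japanese_to_norwegian)
```

In a topic map the same effect comes from merging: topics that share the LCSH URI as a subject identifier are merged, so no pairwise mapping tables are needed.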
  24. 24. (3) Mutual complementing - SHs sometimes lack the subjects or vocabulary we need, yet gathering enough subjects from scratch by ourselves is very hard - By merging our own subjects with SHs we get an enriched set of subjects
  25. 25. (4) Web service for providing Subject Headings, using TMRAP. Diagram: a client web application (JSP pages, Ontopia Navigator Framework, query engine, the client’s topic map) sends a request containing a “Term or Subject” to the SH-providing web service (web application with JSP pages, Ontopia Navigator Framework, query engine, SH topic map), which returns SH-related topic map fragments for the “Subject” topic
  26. 26. 5. Demo: I will give a short demo if there is enough time
  27. 27. 6. Challenges (1) Attaching subjects to, or extracting them from, information: to organize information, subjects must either be attached to it by humans (tagging systems are required) or extracted from it (subject extraction systems are required). (2) Large data: at the moment we cannot convert large RDF data to a topic map because the conversion runs out of memory. We had to omit “skos:altLabel”, “owl:sameAs”, etc. We need a scalable and stable environment for big files. (3) Type or instance? We currently treat each Subject Heading as an instance topic, but Subject Headings are probably type topics. We want to make a topic map that treats them as type topics
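For the large-data challenge, one commonly used way to bound memory when reading big RDF/XML files is incremental parsing, where each concept is handled and then discarded instead of loading the whole document. The sketch below shows that technique with the Python standard library. It is not what the project did (the conversion there used Omnigator), and the sample data with `example.org` URIs is hypothetical.

```python
import io
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
SKOS = "{http://www.w3.org/2004/02/skos/core#}"


def iter_concepts(source):
    """Yield (uri, preferred label) one concept at a time from RDF/XML."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == RDF + "Description":
            label = elem.find(SKOS + "prefLabel")
            yield elem.get(RDF + "about"), (label.text if label is not None else None)
            # Drop the element's children and text once it has been processed.
            # The emptied element itself stays attached to the root.
            elem.clear()


# Hypothetical two-concept sample; the URIs are placeholders.
SAMPLE = b"""<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <rdf:Description rdf:about="http://example.org/sh/1">
    <skos:prefLabel xml:lang="en">Beer</skos:prefLabel>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/sh/2">
    <skos:prefLabel xml:lang="en">Malt</skos:prefLabel>
  </rdf:Description>
</rdf:RDF>"""

concepts = list(iter_concepts(io.BytesIO(SAMPLE)))
print(concepts)
```

With a real file, `iter_concepts` would be given a file path or an open binary file, and each yielded concept could be written to the topic map store immediately.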
  28. 28. 7. Conclusion & Future work No.1 ・CIAS has already stored a huge amount of information that we want to turn into topic maps ・Much well-organized knowledge, such as NDLSH, BSH and LCSH, already exists ・We are making topic maps, and web applications for them, from these SHs ・Topic maps can naturally inherit Subject Headings and their relationships, such as BT-NT, RT and USE-UF ・Following these relationships, information can be linked and organized, in other words turned into topic maps ・Provided as topic maps and PSIs for use in the context of Linked Topic Maps, Subject Headings will become powerful building blocks and be used in many ways
  29. 29. 7. Conclusion & Future work No.2 ・Make our own ontologies ・Continue turning our information into topic maps according to our ontologies and the SHs ・Continue working toward multi-language mapping using the SHs ・Try to merge our domain subjects with the SHs ・Try to find and implement good ways to link the SHs with information resources ・Try to implement the web service for providing the SHs ・Others (many, many, many, …)
  30. 30. Thank you very much (ありがとうございました, Danke schön). Any suggestions?