The Methods of Multilingual Interoperability of Art & Architecture Thesaurus 藝術與建築索引典多語互通之方法 陳淑君 Shu-Jiun (Sophy) Chen CITI, Academia Sinica Program Office, Taiwan e-Learning and Digital Archives Program Workshop on the Construction and Multilingualization of a Lexical Knowledge Base April 20, 2010. Taiwan: Academia Sinica.
Preface & Introductions
The Key Issues
Conclusions and Outlooks
Over the years, TELDAP (Taiwan e-Learning and Digital Archives Program) has accumulated more than 3 million digitized items.
We are exploring new ways to share this abundant collection with audiences worldwide.
In order to bring our collection to the world, we face the following challenges:
Most of the digital objects and the metadata are in Chinese.
How to overcome the language barrier?
How to integrate the different terms that the various institutions and projects under TELDAP use to describe similar concepts?
AAT-Taiwan multilingual research project
Officially launched by TELDAP Program Office at the beginning of 2009
The initial goal is to collaborate with GRI (Getty Research Institute) in developing the Chinese-language Art & Architecture Thesaurus (AAT), and ultimately to provide global communities with a fully integrated multilingual thesaurus online (with broader, narrower, and related concepts of the designated search item).
A research project with the Getty Research Institute (GRI) in the United States to produce a Chinese-language AAT (Art & Architecture Thesaurus), hereafter AAT-Taiwan.
Cross-reference to elements of compound terms ( 複合詞參照 ):
coal mining( 開採煤礦 )---coal + mining
2. The relationship: Hierarchical 階層關係
The broader context(s) for the concept record. Parents refer to hierarchical relationships, which are broader/narrower, reciprocal relationships between records. Hierarchical displays are system-generated from the preferred term, the qualifier, and the links to parents and other ancestors. ( 概念紀錄的上層脈絡，父節點指出層級關係，表達概念間廣義或狹義的關係。層級的顯示是從偏好詞彙、修飾語和父節點的連結及其他上層中的系統產生。 )
e.g. vessels (containers) is the immediate parent of vases, and <containers by form> is an ancestor. All the concepts directly below vases are its children, and they are siblings of each other.
.... Furnishings and Equipment (G)
........ Containers (Hierarchy Name) (G)
............ containers (receptacles) (G)
................ <containers by form> (G)
.................... vessels (containers) (G)
........................ vases (G)
............................ Alhambra vases (G)
............................ arrow vases (G)
............................ boughpots (G)
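The parent, ancestor, child, and sibling vocabulary used above can be sketched in a few lines of Python. This is only an illustration: the child-to-parent dictionary and the function names are assumptions made for this example, not the AAT data model or any Getty API. The terms are copied from the display above.

```python
from typing import Dict, List, Optional

# Child -> immediate parent, copied from the hierarchical display above.
# Storing the hierarchy as a plain dict is an assumption for illustration.
PARENT: Dict[str, Optional[str]] = {
    "Furnishings and Equipment": None,
    "Containers (Hierarchy Name)": "Furnishings and Equipment",
    "containers (receptacles)": "Containers (Hierarchy Name)",
    "<containers by form>": "containers (receptacles)",
    "vessels (containers)": "<containers by form>",
    "vases": "vessels (containers)",
    "Alhambra vases": "vases",
    "arrow vases": "vases",
    "boughpots": "vases",
}


def ancestors(term: str) -> List[str]:
    """Return the chain of broader terms, nearest parent first."""
    chain: List[str] = []
    parent = PARENT[term]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


def children(term: Optional[str]) -> List[str]:
    """Return the terms whose immediate parent is `term`."""
    return [child for child, parent in PARENT.items() if parent == term]


def siblings(term: str) -> List[str]:
    """Return the other terms that share the same immediate parent."""
    return [other for other in children(PARENT[term]) if other != term]
```

For instance, `ancestors("vases")` starts with `"vessels (containers)"` and also contains `"<containers by form>"`, matching the parent and ancestor roles described in the example.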
3. The relationship: Associative 聯想關係
This includes various types of ties or connections between concepts, excluding genus/species (hierarchical) relations. AAT categorizes these relationships by the types of the terms involved: person, activity, location, thing, field of study, style, material, etc. ( 包含各種維繫或連接概念彼此之間的類型，但不包括屬 / 種 ( 層級的 ) 關係。 AAT 以人、活動、地點、事物、學習領域、風格、材質等不同詞彙紀錄，來歸納詞彙間的關係。 )
related to( 與…相關 ) e.g. Emergency housing 緊急住屋 ---crisis shelters 庇護中心
distinguish from( 與…區別 ) e.g. columns 圓柱 ---posts 杆
<person works/exists with another person> ( 一起工作 / 存在的相對應人 )
e.g. baron( 男爵 )---baronesses( 男爵夫人 )
<person and an activity> ( 人和活動 )
e.g. clock making( 製錶業 )---clockmaker( 鐘錶匠 )
<an activity and material> ( 活動與質材 )
e.g. dyeing( 染業 )---dye( 染色 )
<a field of study and an activity/event> ( 學習的領域和活動 / 事件 )
tint (<color-related effects>, color (perceived attribute), ... Color)
Note: Color produced by mixing a basic hue with white in order to lighten it.
色調 （ < 色彩相關作用 > ，色彩（知覺特質），… 色彩）
ID: 300056173 Record Type: concept
tone (color effect) (<color-related effects>, color (perceived attribute), ... Color)
Note: The variation of color produced by changes in intensity or value.
色調 （色彩作用）（ < 色彩相關作用 > ，色彩（知覺特質），… 色彩）
Both terms translate into 色調 (se diao).
References:
[tint] Mayer, Ralph. Harper Collins Dictionary of Art Terms and Techniques, p. 456; Longman Dictionary of English Language & Culture; Longman English-Chinese Science & Technology Dictionary, p. 1686.
[tone] Longman Dictionary of English Language & Culture, p. 1854; The New Oxford Illustrated English-Chinese Dictionary, p. 1923.
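A collision of this kind, where two distinct English concepts receive the same Chinese term, can be detected mechanically before a translation batch goes to review. The sketch below is a minimal illustration: the dictionary layout and the function name `find_collisions` are assumptions for this example and are not part of the AAT-Taiwan system.

```python
from collections import defaultdict
from typing import Dict, List

# English preferred term -> proposed Chinese term, taken from the example above.
candidate_translations: Dict[str, str] = {
    "tint": "色調",
    "tone (color effect)": "色調",
}


def find_collisions(translations: Dict[str, str]) -> Dict[str, List[str]]:
    """Group source terms by target term and keep targets used more than once."""
    by_target: Dict[str, List[str]] = defaultdict(list)
    for source, target in translations.items():
        by_target[target].append(source)
    return {target: sources for target, sources in by_target.items() if len(sources) > 1}
```

Here `find_collisions(candidate_translations)` reports that 色調 is shared by both concepts. In the records above, the two are told apart by a qualifier: the Chinese term for tone carries （色彩作用）, mirroring the English qualifier (color effect).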
An Example of Mapping Issues: Culture-dependent Concepts: 福 / 祿 / 壽 fu/lu/shou (1/2)
Hierarchical position ：
motifs > characters > symbol > fu/lu/shou
紋飾資料庫 > 文字 > 符號 > 福 / 祿 / 壽
Since most motifs in the NPM Metadata Requirement Specifications are culture-specific, many terms cannot be found in AAT. We recommend creating new terms so that these culture-dependent concepts can be added.
Equivalence Mapping (M) 等同關係對照
Linked tasks: Cross-Division Coordination 跨分項協調 (C); Translation 翻譯 (T), with 英翻中 (T2_EC) English to Chinese Translation and 中翻英 (T2_CE) Chinese to English Translation; Scope Note 範圍註撰寫 (N), with 撰寫範圍註 (N2) Create a new record with required fields
－ 確認中文詞彙 (M1) Identify Chinese terms. 確認詞彙來源 Identify Sources: 故宮器物 National Palace Museum; 台灣古建築圖解事典 Traditional Architecture in Taiwan; 中國工藝美術辭典 Dictionary of Chinese Arts and Crafts; 知識網 Knowledge Web of Taiwan’s Diversity; 人類學 / 科博館 Anthropology / NMNS
－ 中英對照 (M2) Mapping: 詞彙英譯 Identify the English term; 參考資料 References; 在 AAT 對照到的架構 Locate the term in AAT
－ 確認詞彙類型 (M3) Identify Types of Equivalence Mapping: 完全等同 (M3_EE) Exact Equivalence; 不完全等同 (M3_IE) Inexact Equivalence; 部分等同 (M3_PE) Partial Equivalence; 一對多 (M3_SM) Single to Multiple; 不等同 (M3_NE) Non-Equivalence
mapped to a broader
or narrower term
importance of this
pass on the concept
1st Pilot Study
The Preliminary Results
Frequencies of equivalence mapping types
Exact equivalence (35 terms, 58%) has the highest match ratio, followed by partial equivalence (18 terms, 30%), non-equivalence (6 terms, 10%), and inexact equivalence (1 term, 2%).
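The reported percentages follow directly from the term counts. The short sketch below recomputes them; the 60-term total is implied by the four counts rather than stated on the slide, and rounding to whole percentages is an assumption about how the figures were produced.

```python
from typing import Dict

# Term counts per equivalence mapping type, as reported in the pilot study.
COUNTS: Dict[str, int] = {
    "Exact equivalence": 35,
    "Partial equivalence": 18,
    "Non-equivalence": 6,
    "Inexact equivalence": 1,
}


def percentages(counts: Dict[str, int]) -> Dict[str, int]:
    """Return each type's share of all mapped terms, rounded to a whole percent."""
    total = sum(counts.values())
    return {name: round(100 * count / total) for name, count in counts.items()}


if __name__ == "__main__":
    for name, share in percentages(COUNTS).items():
        print(f"{name}: {COUNTS[name]} terms, {share}%")
```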
Similarity Types of Conceptual Structures
Type 1 Similar Structure
Type 2 Similar Structure, but requiring further expansion or correction of the structure
Type 3 Dissimilar Structure
Type 4 Lack of Structure
Chen, H.H. and Chen, S.J. (2009). Analysis of Multilingual Equivalence Mapping for Knowledge Organization Systems. Presented at 2009 TELDAP International Conference, Feb. 24, 2009, Academia Sinica, Taipei.
Range of Equivalence
Code of Match Type | Definition | TELDAP's example | AAT's example
1-1 | Exact Equivalence | 仰韶文化 | Yangshao
1-2 | Exact cross-reference match | 掐絲 | filigree enameling
2-1 | Inexact Equivalence | 香爐 | censers
3-2 | Partial Equivalence (species-genus relationship … Subordination) | 乳丁紋 | dots
5-1 | Non-equivalence (culture-dependent) | 龍泉窯 | X
5-3 | Non-equivalence (beyond scope) | 彌勒佛 | X
Code of Match Type: 1-1
Match Type: Exact Equivalence
Example: 仰韶文化 玉鑿, 5000 B.C.-3000 B.C. (The TELDAP Collection)
TELDAP term: 仰韶文化 (<考古學文化>)
AAT term: Yangshao (<Chinese Neolithic periods>, <Chinese prehistoric periods>, ... Styles and Periods)
Translation (T) 翻譯
Linked tasks: Cross-Division Coordination 跨分項協調 (C); Equivalence Mapping 等同關係對照 (M); Scope Note of New Concept 範圍註 (N); Expert Group 專家審訂 (E), with 內容審訂 (E2) Content Verification; System Development 系統開發 (S), with 著錄翻譯內容、圖片、參考書目 Input data into the system
－ 前置作業 (T1) Preparation: 確認翻譯內容 (T1_I) Identify Translation Materials; 資料整理 / 更新 (T1_F) File compilation and update; 募集兼任譯者 (T1_R) Translators recruitment
－ 翻譯 (T2) Translation: 英翻中 (T2_EC) English to Chinese Translation; 中翻英 (T2_CE) Chinese to English Translation
－ 校訂 (T3) Proofread: 統一詞彙翻譯與格式 (T3_U) Unified terms and format; 翻譯品質管理 (T3_QC) Translation Quality Control; 募集校訂人員 (T3_R) Proofreaders recruitment
Diagram notes:
translate matched AAT terms
add required fields in AAT-Taiwan (additional note, sources, variant terms, transliterated Chinese term, etc.)
translate the terms into English, submit to AAT
review: (a) approved (b) revision
Proofread (T3) 校訂
－ 統一詞彙翻譯與格式 (T3_U) Unified terms and format
－ 翻譯品質管理 (T3_QC) Translation Quality Control
－ 募集校訂人員 (T3_R) Proofreaders recruitment
T3_QC2: make sure the translated scope note is both fluent and idiomatic.
Frequent problems:
1. The translated text is redundant or awkward and is not adapted to the cultural context of the target language, so it is not understandable in Chinese.
Scope Note of New Concept (N) 範圍註撰寫
Linked tasks: Cross-Division Coordination 跨分項協調 (C); Equivalence Mapping 等同關係對照 (M), with 部分等同 (M3_PE) Partial Equivalence; Translation 翻譯 (T), with 中翻英 (T2_CE) Chinese to English Translation and 校訂 (T3) Proofread; Expert Group 專家審訂 (E), with 內容審訂 (E2) Content Verification; System Development 系統開發 (S), with 著錄翻譯內容、圖片、參考書目 Input data into the system
－ 搜集資料 (N1) Data Collection: 參考權威資料 Authoritative References
－ 撰寫範圍註 (N2) Create a new record with required fields: 遵循 Getty 和 TELDAP 的指引 Follow editorial guideline to create a record in Chinese
－ 增加相關圖檔 (N3) Find images
review: approved
Create a new record with required fields (N2)
Provide the partial equivalence mapping (M3_PE) result to the scope note writer.
The scope note writer fills out a worksheet (N2).
Collect at least 3 authoritative references.
Follow the Getty editorial guideline to create a new record with the required fields: source, note, related term, preferred term, non-preferred term, and additional information.
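The N2 requirements above (six required fields and at least three authoritative references) lend themselves to a simple completeness check before a worksheet is handed to the Expert Group. The sketch below is hypothetical: the dictionary keys, the `references` key, and the function name are assumptions for illustration, and the real worksheet layout is not described in the slides.

```python
from typing import Any, Dict, List

# Required fields as listed for step N2; the key spellings are assumptions.
REQUIRED_FIELDS: List[str] = [
    "source",
    "note",
    "related term",
    "preferred term",
    "non-preferred term",
    "additional information",
]
MIN_REFERENCES = 3  # "collect at least 3 authoritative references"


def worksheet_problems(worksheet: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the worksheet looks complete."""
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if not worksheet.get(name)
    ]
    references = worksheet.get("references") or []
    if len(references) < MIN_REFERENCES:
        problems.append(
            f"needs at least {MIN_REFERENCES} authoritative references, "
            f"found {len(references)}"
        )
    return problems
```

A check like this only confirms that the fields are filled in; whether their content is accurate remains the content expert's judgment.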
Contribute New Candidate Terms (1/11)
Equivalence mapping of the following terms ：
－ seal script ( 篆書 )
－ bird and insect script ( 鳥蟲書 )
－ clerical script ( 隸書 )
－ semi-cursive script ( 行草書 )
－ regular script ( 楷書 )
These 5 terms are mapped to a broader term in AAT ：
.... Components (Hierarchy Name)
........ components (objects)
............ <components by specific context>
................ <information form components>
.................... <script and type forms>
........................ scripts (writing)
................................ add new terms
After the relevance of these terms has been determined, the information is passed on to the Scope Note team to create new records with the required fields. Finally, the new records are provided to the Expert Group for verification.
Expert Group (E) 專家審訂
Linked tasks: Cross-Division Coordination 跨分項協調 (C); Translation 翻譯 (T), with 內容準確性 (T3_A) Translation Accuracy 1; Scope Note 範圍註 (N), with 完成新增中文詞彙所有的欄位資料 (N3) Create a new record with required fields; System Development 系統開發 (S), with 著錄翻譯內容、圖片、參考書目 Input data into the system
－ 前置作業 (E1) Preparation: 確認內容 (E1_I) Identify Content; 審訂表整理更新 (E1_F) File compilation and update
－ 內容審訂 (E2) Content Verification: 專家修訂 (E2_RE) Revision by Expert; 範圍註準確性 (E2_NA) Scope Note Accuracy; 翻譯準確性 (E2_TA) Translation Accuracy
Content experts verify the accuracy of the provided terms. If they determine that the content is accurate, the data is entered into the system; if not, it must be revised before it can be submitted to AAT and AAT-Taiwan.
review: (a) approved (b) reject
After the translation (T3) or the scope note (N3) is completed, the records are passed to content experts for review.
The expert should closely examine the accuracy of the translated or written materials to ensure the precision and credibility of the content.
The content expert will receive the following files for review ：
1) scope note worksheet (N3) or translation files (T3)
2) content expert checklist
3) a brief summary of the thesaurus and the hierarchical structure
Scope Note Accuracy (E2_NA)
內容審訂 (E2) Content Verification: 修訂 (E2_RE) Revision by Expert; 範圍註準確性 (E2_NA) Scope Note Accuracy; 翻譯準確性 (E2_TA) Translation Accuracy
Example: Metadata Specification version 1.2 (p. 23): inscriptions > forms > calligraphy > clerical script
範圍註審訂表 Content expert checklist:
E1: Are the concepts of Chinese scripts complete?
E2: Is it the appropriate preferred term?
E3: Is it the appropriate non-preferred term?
E4: Is the content of the scope note accurate?
E5: Are these the appropriate broader, narrower, and related terms?
E6: Are the references and images credible?
Contribute New Candidate Terms (2/11)
For content verification (E2), each assigned expert receives the worksheet (N2) and a checklist. The worksheet includes the content of the new candidate terms, references, source, suggested hierarchical position, record type, etc.
Contribute New Candidate Terms (3/11)
Worksheet fields: candidate term; source; suggested hierarchical position; record type; references; additional references; scope note; preferred term; non-preferred term; broader term; narrower term; related term
Contribute New Candidate Terms (4/11)
After reviewing the content, the experts fill out the following checklist to verify its accuracy:
－ Are the concepts complete?
－ Are the preferred terms, non-preferred terms, and English translation correct?
－ Is the term mapped to the correct hierarchical position?
－ Is the scope note content correct?
Contribute New Candidate Terms (5/11)
－ Are the broader terms, narrower terms, and the related terms correct?
－ Are the references and the additional references valuable?
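The checklist can be read as a set of yes/no questions that together decide whether a record moves on. The sketch below encodes that reading. It assumes that a single "no" sends the record back for revision; the slides say only that inaccurate content must be revised before submission, so the exact decision rule and the function name are illustrative.

```python
from typing import Dict, List

# Checklist questions as worded on the slides.
CHECKLIST: List[str] = [
    "Are the concepts complete?",
    "Are the preferred terms, non-preferred terms, and English translation correct?",
    "Is the term mapped to the correct hierarchical position?",
    "Is the scope note content correct?",
    "Are the broader terms, narrower terms, and the related terms correct?",
    "Are the references and the additional references valuable?",
]


def review_outcome(answers: Dict[str, bool]) -> str:
    """Return 'approved' only when every checklist question is answered yes."""
    unanswered = [question for question in CHECKLIST if question not in answers]
    if unanswered:
        raise ValueError(f"unanswered checklist items: {unanswered}")
    if all(answers[question] for question in CHECKLIST):
        return "approved"
    return "revision needed"
```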
Contribute New Candidate Terms (6/11)
After reviewing the worksheet, the content expert decided to add more concepts under Chinese scripts, for a total of 16 terms:
Chinese scripts; seal script; clerical script; regular script; running script; cursive script
Translation Progress Update
Total Terms in AAT-Taiwan: 11781 (last updated: 2010/4/15)
Translated terms by facet and hierarchy:
Associated Concepts Facet: 1398
Physical Attributes Facet: Attributes and Properties 552; Conditions and Effects 48; Design Elements 288; Color 578
Styles and Periods Facet: 2136
Agents Facet: People 180; Organizations 185; Living Organisms 25
Activities Facet: Disciplines 328; Functions 50; Events 273; Physical and Mental Activities 0; Processes and Techniques 791
Materials Facet: 3833
Objects Facet: 8940
Total Translated Terms: 19605
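The per-hierarchy counts in the progress update can be cross-checked against the stated total of 19605 translated terms. The sketch below does that and also derives per-facet subtotals, which the slide does not show. The tuple layout is an assumption for illustration; a facet reported without a separate hierarchy name is stored with an empty hierarchy.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (facet, hierarchy, translated terms) as read from the progress table.
ROWS: List[Tuple[str, str, int]] = [
    ("Associated Concepts Facet", "", 1398),
    ("Physical Attributes Facet", "Attributes and Properties", 552),
    ("Physical Attributes Facet", "Conditions and Effects", 48),
    ("Physical Attributes Facet", "Design Elements", 288),
    ("Physical Attributes Facet", "Color", 578),
    ("Styles and Periods Facet", "", 2136),
    ("Agents Facet", "People", 180),
    ("Agents Facet", "Organizations", 185),
    ("Agents Facet", "Living Organisms", 25),
    ("Activities Facet", "Disciplines", 328),
    ("Activities Facet", "Functions", 50),
    ("Activities Facet", "Events", 273),
    ("Activities Facet", "Physical and Mental Activities", 0),
    ("Activities Facet", "Processes and Techniques", 791),
    ("Materials Facet", "", 3833),
    ("Objects Facet", "", 8940),
]
STATED_TOTAL = 19605  # "Total Translated Terms" on the slide


def facet_subtotals(rows: List[Tuple[str, str, int]]) -> Dict[str, int]:
    """Sum the translated-term counts of all hierarchies within each facet."""
    totals: Dict[str, int] = defaultdict(int)
    for facet, _hierarchy, count in rows:
        totals[facet] += count
    return dict(totals)


if __name__ == "__main__":
    computed = sum(count for _facet, _hierarchy, count in ROWS)
    print("computed total:", computed, "matches stated total:", computed == STATED_TOTAL)
    for facet, subtotal in facet_subtotals(ROWS).items():
        print(f"{facet}: {subtotal}")
```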
For example: [screenshot with callouts: corresponding images; terms in different languages; switch to English; link to content expert]
From User’s Perspective
(1) User searches for a term in AAT-Taiwan via query/browse
(2) Use Subject ID to retrieve equivalent English term in AAT
(3) Use the term in AAT-Taiwan to retrieve records from MCN Taiwan MuseFusion
(4) Use the same term to retrieve records from TELDAP Union Catalog
(5) Use the term in AAT-Taiwan to retrieve records from TELDAP Union Catalog
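The lookup flow from the user's perspective can be sketched with in-memory dictionaries. Everything here is hypothetical: the subject ID `demo-001`, the record identifiers, and the catalogue contents are placeholders, and querying each catalogue with both the Chinese and the English term is one possible reading of the steps. Only the term pair 仰韶文化 / Yangshao comes from the slides.

```python
from typing import Dict, List, Tuple

# Step 1: Chinese term in AAT-Taiwan -> subject ID (hypothetical ID).
AAT_TAIWAN: Dict[str, str] = {"仰韶文化": "demo-001"}
# Step 2: subject ID -> equivalent English term in AAT.
AAT: Dict[str, str] = {"demo-001": "Yangshao"}
# Steps 3-5: catalogue name -> term -> record identifiers (all placeholders).
CATALOGS: Dict[str, Dict[str, List[str]]] = {
    "MCN Taiwan MuseFusion": {"Yangshao": ["musefusion-record-1"]},
    "TELDAP Union Catalog": {
        "仰韶文化": ["teldap-record-1"],
        "Yangshao": ["teldap-record-2"],
    },
}


def search(chinese_term: str) -> Tuple[str, Dict[str, List[str]]]:
    """Resolve the English equivalent via the subject ID, then query each catalogue."""
    subject_id = AAT_TAIWAN[chinese_term]
    english_term = AAT[subject_id]
    hits: Dict[str, List[str]] = {}
    for name, index in CATALOGS.items():
        hits[name] = index.get(chinese_term, []) + index.get(english_term, [])
    return english_term, hits
```

The point of the design is that the subject ID, not the term string, links the two languages, so a single query in Chinese can reach records described in either language.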
Adopt the following standards:
CCO (Cataloging Cultural Objects)
Conclusion and Outlook
Provide Asian cultural heritage communities worldwide with a knowledge base that serves as a reference for collection access and as a data value standard.
The Multilingual Vocabulary Working Group was set up in October 2009 by the Getty Research Institute (GRI) as a means to establish close communication among the international collaborators.
(Chinese) Academia Sinica, TELDAP, Taiwan
(Dutch) Netherlands Institute for Art History, Netherlands
(Spanish) Centro de Documentación de Bienes Patrimoniales, DIBAM, Chile
(German) State Museums of Berlin/Institute for Museum Research, Germany
The pilot test was carried out by GRI and TELDAP between October 2009 and March 2010 to test the feasibility of contributing transliterated terms and new concepts to AAT.
Chen, S.J. (2009). Chinese-language Art & Architecture Thesaurus: methods and issues. Presented at CIDOC 2009, Chile, 29 September 2009.
Chen, S.J., Wu, D., and Chang, Y.T. (2009). Bilingual equivalence mapping: methods and issues. Presented at Getty Research Institute.
Chen, S.J., Wu, D., and Peng, P.W. (2009). A close look at bilingual translation: methods and issues. Presented at Getty Research Institute.
Chen, S.J. and Wu, D. (2009). Contribution and creation of new concepts in the bilingual thesaurus: methods and practices. Presented at Getty Research Institute.
Cheng, C., Chen, Y.N., Chen, S.J., Chung, F.C., Lin, Y.H., and Wu, D. (2009). AAT-Taiwan System: initiative and progress. Presented at Getty Research Institute.
Thank you for your attention! Sophy Shu-Jiun Chen