Knowledge Engineering for TELDAP

Keh-Jiann Chen
Principal Investigator
Core Platforms for Digital Contents Project, TELDAP
Research Fellow
Research Center for Information Technology Innovation &
Institute of Information Science, Academia Sinica

Published in: Education, Technology, Business
Transcript

  • 1. Knowledge Engineering for TELDAP. Keh-Jiann Chen: Principal Investigator, Core Platforms for Digital Contents Project, TELDAP; Research Fellow, Research Center for Information Technology Innovation & Institute of Information Science, Academia Sinica
  • 2. Outline: Introduction; Union catalog; Databases and metadata for digital contents and websites; Knowledge engineering; Future perspective
  • 3. Introduction: The integration and management of digital content has become an important issue as the amount of digital content produced by different projects and institutions increases rapidly. Our project goal is to achieve optimized preservation, retrieval, and presentation of digital collections.
  • 4. 1. Union Catalog
  • 5. What is the union catalog? It is a catalog and portal for all digital collections of TELDAP. It is an integrated platform for browsing and searching the entire digital content of TELDAP. Metadata provides core descriptions and licensing information for each digital collection.
  • 6. Home Page of Union Catalogs: browsing by topics; search by keywords
  • 7. 2. Databases and metadata for digital contents and websites
  • 8. Metadata models for different types of objects. Archived digital items: union catalog metadata model (Dublin Core+). Web sites: DCCAP (Dublin Core Collections Application Profile), plus fields for internal use only (Unique Identifier, Format, Evaluation, Cataloging History). Documents: document metadata (Dublin Core).
  • 9. Metadata for digital items: over 2 million digital items and still increasing.
      Element: Definition
      Title: A name given to the resource
      Creator: An entity primarily responsible for making the content of the resource
      Subject and Keywords: The topic of the content of the resource
      Description: An account of the content of the resource
      Publisher: An entity responsible for making the resource available
      Contributor: An entity responsible for making contributions to the content of the resource
      Date: A date associated with an event in the life cycle of the resource
      Resource Type: The nature or genre of the content of the resource
      Format: The physical or digital manifestation of the resource
      Resource Identifier: An unambiguous reference to the resource within a given context
      Source: A reference to a resource from which the present resource is derived
      Language: A language of the intellectual content of the resource
      Relation: A reference to a related resource
      Coverage: The extent or scope of the content of the resource
      Rights Management: Information about rights held in and over the resource
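The element list on slide 9 can be illustrated with a minimal sketch of a Dublin Core record. The lower-case keys follow the standard Dublin Core element names; the helper `make_record` and all sample values are invented for illustration and are not taken from the TELDAP union catalog.

```python
# The 15 Dublin Core elements corresponding to the rows on the slide.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)

def make_record(**fields):
    """Build a record, rejecting any key that is not a Dublin Core element.

    Hypothetical helper: every element is optional and repeatable,
    so each value is stored as a list.
    """
    unknown = set(fields) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    return {k: v if isinstance(v, list) else [v] for k, v in fields.items()}

# Invented sample values, for illustration only.
record = make_record(
    title="Example scroll painting",
    creator="Unknown",
    subject=["painting", "landscape"],
    identifier="example-0001",
)
print(record["subject"])
```

Storing every value as a list keeps single-valued and multi-valued elements uniform, which is one possible design choice rather than a requirement of the standard.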
  • 11. Metadata for websites: over 200 websites and still increasing. Metadata: DCCAP (Dublin Core Collections Application Profile), combining the standard with our requirements: 19 data fields.
  • 12. Metadata for websites. The Website: Homepage Picture URL. Project Information: Type, Name, Author, Subject, Description, Language, Item Type, Target. Archived Information: URL, time, authorization. Copyright, Purpose, Other Information. Figure: http://digitalarchives.tw
  • 13. Dynamic categorization. User-oriented categorization: general, elementary school students, high school students, researchers, etc. Topic-based categorization: archaeology, painting, animal, plant, document, etc. Function-based categorization: research, education, business, technology, etc. Categorization based on institutions: Academia Sinica, Taiwan U., Palace Museum, etc.
  • 14. Purpose: Education; Target: elementary school student, junior high school student, teacher, ...; Select Items: according to 40 evaluation indicators, select top 5 websites. Purpose: Creative applications; Select Items: according to 40 evaluation indicators, select top 5 websites. Purpose: Academic research; Subject: Animal, Archaeology, Anthropology, ...; Select Items: according to 40 evaluation indicators, select top 3 websites. Figure: http://digitalarchives.tw
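The selection described on slides 13 and 14 (filter websites by purpose and target, then keep the top few by evaluation score) might be sketched as follows. The slides do not say how the 40 indicators are combined, so summing them is an assumption; the `Website` class, `select_top`, and all sample data are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Website:
    name: str
    purpose: str
    targets: set = field(default_factory=set)
    scores: list = field(default_factory=list)  # one value per evaluation indicator

def select_top(sites, purpose, target=None, n=5):
    """Keep sites matching the purpose (and target, if given), ranked by total score.

    Assumption: indicators are aggregated by a plain sum.
    """
    matches = [
        s for s in sites
        if s.purpose == purpose and (target is None or target in s.targets)
    ]
    return sorted(matches, key=lambda s: sum(s.scores), reverse=True)[:n]

# Invented sample data with 3 indicators per site instead of 40.
sites = [
    Website("Site A", "Education", {"Teacher", "Elementary school student"}, [3, 4, 5]),
    Website("Site B", "Education", {"Teacher"}, [5, 5, 4]),
    Website("Site C", "Academic research", {"Researcher"}, [5, 5, 5]),
    Website("Site D", "Education", {"Junior high school student"}, [5, 5, 5]),
]
top_for_teachers = [s.name for s in select_top(sites, "Education", "Teacher", n=2)]
print(top_for_teachers)
```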
  • 15. Metadata for project documents: over 5000 documents and still increasing. Metadata: Dublin Core. Construct Teldapwiki, a Wikipedia for TELDAP: http://wiki.teldap.tw/
  • 16. 3. Knowledge Engineering
  • 17. Plans for building knowledge structures for TELDAP: Construct metadata models for different objects. Establish hyperlinks between contents and objects. Develop keyword extraction tools. Design automatic tagging tools. Construct the TELDAP ontology and thesaurus, drawing on the Art & Architecture Thesaurus by Getty and Chinese WordNet.
  • 18. (1) Metadata models for different objects. Digital collections: union catalog metadata model (Dublin Core+). Web sites: DCCAP (Dublin Core Collections Application Profile), with public fields and private fields (Unique Identifier, Format, Evaluation, Cataloging History). Documents: document metadata (Dublin Core).
  • 19. (2) Establish hyperlinks between contents and objects: Identify keywords in contents. Tag keywords with related object hyperlinks.
  • 20. Develop hyperlink tagging tools. Word segmentation tools: resolve word segmentation ambiguities and identify keywords. CKIP word segmentation system: http://ckipsvr.iis.sinica.edu.tw/
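To make the role of word segmentation on slide 20 concrete, here is a minimal sketch of dictionary-based forward maximum matching. This is a simple greedy stand-in, not the CKIP system's method, which resolves ambiguities in ways the slide does not describe; the function `segment` and the small dictionary are invented for illustration.

```python
def segment(text, dictionary):
    """Greedy forward maximum matching: at each position take the longest
    dictionary word, falling back to a single character."""
    max_len = max((len(w) for w in dictionary), default=1)
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens

# Invented toy dictionary (National Palace Museum terms as sample keywords).
DICTIONARY = {"故宮", "博物院", "故宮博物院", "青銅器"}
tokens = segment("故宮博物院的青銅器", DICTIONARY)
keywords = [t for t in tokens if t in DICTIONARY]
print(tokens, keywords)
```

Preferring the longest match keeps "故宮博物院" as one keyword instead of splitting it into "故宮" and "博物院", which is the kind of ambiguity the slide refers to.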
  • 21. Develop hyperlink tagging tools. TELDAP keyword dictionary: extract keywords from metadata and establish object-keyword relations. Extract text from the XML data for each object. The texts are classified by topics, titles, descriptions, authors, locations, eras, etc. From each class of text file, extract keywords by automatic word segmentation and keyword extraction techniques.
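The pipeline on slide 21 (pull text out of each object's XML metadata, keep track of which field it came from, and record object-keyword relations) could look roughly like the sketch below. The XML layout, the tag names, and the use of a fixed term list in place of real keyword extraction are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Invented sample metadata; the real TELDAP XML schema is not shown in the slides.
SAMPLE_XML = """
<objects>
  <object id="obj-1">
    <title>Bronze ding vessel</title>
    <description>A ritual bronze vessel from the Shang era.</description>
    <era>Shang</era>
  </object>
  <object id="obj-2">
    <title>Landscape painting</title>
    <description>Ink painting; bronze seal impressions.</description>
    <era>Song</era>
  </object>
</objects>
"""

# Stand-in for automatic keyword extraction: a fixed list of known terms.
KNOWN_TERMS = {"bronze", "painting", "shang", "song"}

def build_keyword_index(xml_text):
    """Map each keyword to the set of (object id, field name) pairs it occurs in."""
    index = defaultdict(set)
    root = ET.fromstring(xml_text)
    for obj in root.iter("object"):
        for fld in obj:  # child elements such as title, description, era
            for word in re.findall(r"\w+", (fld.text or "").lower()):
                if word in KNOWN_TERMS:
                    index[word].add((obj.get("id"), fld.tag))
    return index

index = build_keyword_index(SAMPLE_XML)
print(sorted(index["bronze"]))
```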
  • 22. Prototype system for hyperlink tagger: Identify and select keywords from the input text
  • 23. Prototype system for hyperlink tagger: Produce text with hyperlinks
  • 24. Prototype system for hyperlink tagger: Hyperlinks point to the related digital collections
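The three prototype steps on slides 22 to 24 can be sketched as a small tagger that finds known keywords in the input text and wraps them in links. This is only an illustration under stated assumptions: the function `tag_hyperlinks`, the keyword table, and the example.org URLs are hypothetical, and the prototype's real matching logic is not described in the slides.

```python
import html
import re

def tag_hyperlinks(text, keyword_urls):
    """Return HTML in which each known keyword is wrapped in a link.

    Longer keywords are tried first so that a keyword contained in a
    longer one does not steal the match.
    """
    if not keyword_urls:
        return html.escape(text)
    ordered = sorted(keyword_urls, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in ordered))
    parts, last = [], 0
    for match in pattern.finditer(text):
        parts.append(html.escape(text[last:match.start()]))
        url = html.escape(keyword_urls[match.group(0)], quote=True)
        parts.append(f'<a href="{url}">{html.escape(match.group(0))}</a>')
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)

# Hypothetical keyword-to-collection table.
KEYWORD_URLS = {
    "Su Shi": "https://example.org/collections/su-shi",
    "bronze": "https://example.org/collections/bronze",
}
tagged = tag_hyperlinks("Su Shi admired bronze & jade.", KEYWORD_URLS)
print(tagged)
```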
  • 25. (3) Construct the TELDAP ontology and thesaurus. Relation types: topical relation, hypernym/hyponym, and synonym relation (the slide's Chinese-language examples are garbled in this transcript; one of them is glossed as "Sushi"). Establish implicit links between objects by author, material, object type, etc.
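One way to picture the implicit links mentioned on slide 25 is to connect any two objects that share a value for author, material, or object type, after mapping synonyms to a canonical form. The sketch below is an assumption about how this might work, not the project's implementation; `implicit_links`, the synonym table, and the sample objects are invented.

```python
from collections import defaultdict
from itertools import combinations

# Invented synonym table: alternative names mapped to a canonical name.
SYNONYMS = {"Su Dongpo": "Su Shi"}

def implicit_links(objects, attributes=("author", "material", "object_type")):
    """Return (object_a, object_b, attribute) triples for objects that share
    the same (synonym-normalized) value of an attribute."""
    groups = defaultdict(list)
    for obj_id, meta in objects.items():
        for attr in attributes:
            if attr in meta:
                value = SYNONYMS.get(meta[attr], meta[attr])
                groups[(attr, value)].append(obj_id)
    links = set()
    for (attr, _value), ids in groups.items():
        for a, b in combinations(sorted(ids), 2):
            links.add((a, b, attr))
    return links

# Invented sample objects.
objects = {
    "obj-1": {"author": "Su Shi", "material": "paper"},
    "obj-2": {"author": "Su Dongpo", "material": "silk"},
    "obj-3": {"author": "Anonymous", "material": "paper"},
}
links = implicit_links(objects)
print(sorted(links))
```

Normalizing through the synonym table first means the two spellings of one author produce a single link, which is where a thesaurus would feed into link construction.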
  • 26. (3) Construct the TELDAP ontology and thesaurus: Establish association links between Chinese keywords and Getty AAT. Merge Chinese WordNet with English WordNet.
  • 27. Future Perspectives: Technology development. Construct multilingual thesauri (Getty AAT). Maintain the TELDAP keyword and object relation database. Construct name authority files, gazetteers, and universal calendars. Design hyperlink taggers and keyword extension tools. Design an authoring tool that automatically provides hyperlinks to keyword-related digital contents. Design a knowledge-based content retrieval system.
  • 28. Future Perspectives: Content enrichment. Within TELDAP: Standardize the object metadata model and data format. All TELDAP objects should have their metadata. Write scripts and stories for different topics with a Wiki-like knowledge structure. Enrich the digital collections: Establish hyperlinks between textbooks and TELDAP collections. Extend the knowledge sources, e.g. Wikipedia.
  • 29. Thank you for your attention! Your comments are welcome.
