Developing the AIP Thesaurus: The Platform for an Ontology
Case study of the American Institute of Physics thesaurus. Presented by Mark Cassar of the American Institute of Physics and Jack Bruce of Access Innovations, Inc. at the 2012 Data Harmony User Group meeting on February 8, 2012 at the Access Innovations, Inc. offices.

Published in: Education, Technology
Transcript

  • 1. Developing the AIP Thesaurus: The Platform for an Ontology
      Mark Cassar, American Institute of Physics
      Jack Bruce and Marjorie (Margie) M.K. Hlava, Access Innovations: 505-998-0800
  • 2. Background
      • Physics and Astronomy Classification Scheme (PACS)
      • Six-digit code schema used for indexing scholarly content
      • 10 digit based
          – Domain headings with subcategories nested under each domain
      • Precoordinated system
          – Combine terms (concepts) at the time of indexing
  • 3. Why Change?
      • Improve searchability
      • Move to a postcoordinated system
          – Combine terms at time of search
      • Semantic enrichment
      • Flexible metadata for many applications
      • Naturalize the vocabulary
          – Represent concepts succinctly and concisely
          – Easily add new concepts based on new and emerging technologies and applications
          – Allow unlimited hierarchy levels and polyhierarchy
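To illustrate what "combine terms at time of search" means in practice, here is a minimal sketch of postcoordinated retrieval. The article IDs and term assignments are made-up sample data, and `build_inverted_index` and `search_all` are hypothetical helper names, not part of any AIP or Data Harmony tool.

```python
from collections import defaultdict

# Hypothetical sample data: each article carries single-concept index terms.
ARTICLES = {
    "a1": {"Electron diffraction", "Thin films"},
    "a2": {"Electron diffraction", "Superconductors"},
    "a3": {"Thin films", "Superconductors"},
}


def build_inverted_index(articles):
    """Map each index term to the set of article IDs tagged with it."""
    index = defaultdict(set)
    for article_id, terms in articles.items():
        for term in terms:
            index[term].add(article_id)
    return index


def search_all(index, *terms):
    """Postcoordinate: intersect the postings of every requested term."""
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()


index = build_inverted_index(ARTICLES)
print(search_all(index, "Electron diffraction", "Thin films"))
```

In a precoordinated scheme the combination "electron diffraction in thin films" would need its own code assigned at indexing time; here the searcher builds the combination from two independent terms.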
  • 4. Better ROI
      • Rules-assisted indexing
          – Provide end users with a swift indexing solution based on the Machine-Aided Indexer (M.A.I.) engine
          – Batch index a large corpus of scholarly content, as well as future content
      • Reduce costs
          – Automate a large portion of electronic indexing
          – Less overhead for indexing
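The general idea of rules-assisted batch indexing can be sketched as follows. This is only an illustration under stated assumptions: the regular-expression rules below are invented for the example and do not reflect the actual rule syntax or behavior of the M.A.I. engine; `suggest_terms` and `index_batch` are hypothetical names.

```python
import re

# Hypothetical rules: a text pattern that, when matched, suggests a preferred term.
RULES = [
    (re.compile(r"\bLEED\b|\blow[- ]energy electron diffraction\b", re.IGNORECASE),
     "Low energy electron diffraction"),
    (re.compile(r"\bRHEED\b|\breflection high[- ]energy electron diffraction\b", re.IGNORECASE),
     "Reflection high energy electron diffraction"),
]


def suggest_terms(text):
    """Return the preferred terms whose rule pattern matches the text, in rule order."""
    return [term for pattern, term in RULES if pattern.search(text)]


def index_batch(documents):
    """Apply the rules to every document in a corpus (doc ID -> text)."""
    return {doc_id: suggest_terms(text) for doc_id, text in documents.items()}


corpus = {
    "d1": "We used LEED to study the surface.",
    "d2": "RHEED patterns were compared with low-energy electron diffraction data.",
    "d3": "A review of plasma confinement.",
}
print(index_batch(corpus))
```

A human indexer would then review the suggested terms rather than assign every term from scratch, which is where the reduced overhead comes from.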
  • 5. Roadmap of the AIP Thesaurus
      • Data Collection
          – Load PACS codes and terms
          – Incorporate search logs; add top searched concepts into the vocabulary
      • Analysis of Content
          – Test comparison of indexing to humanly indexed articles
      • Thesaurus Construction
          – Separate, disambiguate, and migrate concepts; break up top domains
          – Apply thesaurus and taxonomy standardization to each term
          – Multiple reviews for each top section
      • Evaluation and Feedback
          – Send back working draft to AIP for review
          – Gather feedback from subject matter experts and incorporate the changes into the thesaurus
      • Finalization and Product Delivery
  • 6. Source Data
      • PACS 2009 ed.
      • 1999 ed. of AIP Thesaurus (out of date)
      • Terms added to INSPEC since 2000
      • Internal and external search logs
      • Cumulative journal indexes
          – Digital
          – (2006 through 2009)
      • List of AIP divisions and their internal classifications
  • 7. Analysis of Content
      • Organizational warrant
          – PACS 2009 (2010)
          – www.aip.org
          – UniPHY
      • Literary warrant
          – Where we found the term used
      • Most frequent search terms loaded into thesaurus
  • 8. Thesaurus Creation Process
      • Load data (vocabulary) into Data Harmony MAIstro™
      • PACS
          – Restructure top domains
          – Separate into discrete concepts
          – Disambiguate terms
          – Remove parenthetical qualifiers
          – Create postcoordinated terms
          – Migrate separated terms into new/relevant categories
      • Sort flat lists (search logs) into the main categories determined
      • Use multiple reviewers for each physics domain
      • About 8181 preferred terms and 5217 synonyms
  • 9. PACS term:
      – Low-energy electron diffraction (LEED) and reflection high-energy electron diffraction (RHEED) (condensed matter structure determination)
    Becomes:
      – BT Condensed matter structure determination
          • NT Low energy electron diffraction
              – Synonym LEED
          • NT Reflection high energy electron diffraction
              – Synonym RHEED
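The result of this restructuring can be represented as linked term records. The sketch below is an assumption about one reasonable in-memory shape for such records; `TermRecord` and `add_narrower` are hypothetical names and not the MAIstro data model.

```python
from dataclasses import dataclass, field


@dataclass
class TermRecord:
    """One preferred term with its broader (BT), narrower (NT), and synonym links."""
    preferred: str
    broader: list = field(default_factory=list)
    narrower: list = field(default_factory=list)
    synonyms: list = field(default_factory=list)


def add_narrower(thesaurus, parent, child, synonyms=()):
    """Create both records if needed and keep the BT/NT links reciprocal."""
    parent_rec = thesaurus.setdefault(parent, TermRecord(parent))
    child_rec = thesaurus.setdefault(child, TermRecord(child))
    if child not in parent_rec.narrower:
        parent_rec.narrower.append(child)
    if parent not in child_rec.broader:
        child_rec.broader.append(parent)
    for synonym in synonyms:
        if synonym not in child_rec.synonyms:
            child_rec.synonyms.append(synonym)
    return child_rec


thesaurus = {}
top = "Condensed matter structure determination"
add_narrower(thesaurus, top, "Low energy electron diffraction", ["LEED"])
add_narrower(thesaurus, top, "Reflection high energy electron diffraction", ["RHEED"])

for name in thesaurus[top].narrower:
    print(name, thesaurus[name].synonyms)
```

The single precoordinated PACS heading becomes three separate records: the parenthetical qualifier turns into the broader term, and the two abbreviations become synonyms of their own narrower terms.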
  • 10. Evaluation and Feedback
      • Weekly scheduled live demos of the thesaurus
      • Free web-hosted version of the thesaurus and periodic spreadsheet exports
      • Collect feedback from subject matter experts (SMEs) and AIP PACS experts
          – Correspondence via email
      • Incorporate changes into thesaurus
  • 11. Available versions
      • Electronic copy of AIP thesaurus supplied in
          – XML
          – Excel
          – Web-based, read-only versions (Thesviewer)
          – MARC, SKOS, OWL, CSV, etc.
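As a rough idea of what a SKOS rendering of one term could look like, the sketch below writes a concept in Turtle syntax using the standard SKOS properties `skos:prefLabel`, `skos:altLabel`, and `skos:broader`. The base URI, the `slug` scheme, and the function names are assumptions made for this example; the actual AIP export may be structured differently. Labels are assumed to contain no double quotes.

```python
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."


def slug(label):
    """Hypothetical URI scheme: lower-case the label and hyphenate spaces."""
    return label.lower().replace(" ", "-")


def concept_to_turtle(base, preferred, synonyms=(), broader=()):
    """Serialize one thesaurus term as a skos:Concept in Turtle."""
    header = f"<{base}{slug(preferred)}> a skos:Concept"
    props = [f'    skos:prefLabel "{preferred}"@en']
    props += [f'    skos:altLabel "{s}"@en' for s in synonyms]
    props += [f"    skos:broader <{base}{slug(b)}>" for b in broader]
    return header + " ;\n" + " ;\n".join(props) + " ."


BASE = "http://example.org/thesaurus/"  # placeholder, not a real AIP namespace
turtle = concept_to_turtle(
    BASE,
    "Low energy electron diffraction",
    synonyms=["LEED"],
    broader=["Condensed matter structure determination"],
)
print(SKOS_PREFIX)
print(turtle)
```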
  • 12. [Screenshots: Taxonomy view; Thesaurus Term Record view]
  • 13. To make an ontology
      • Define additional Associative relationships
      • Define additional Hierarchical relationships
          – IsA, IsPartOf, HasA
      • Define additional Equivalence relationships
          – Multilingual options
          – Weights and measures
  • 14. Clearer disambiguation?
      [Diagram: the term "Mercury" linked by typed relationships (IsA, TypeOf, BrandOf) to Temperature, Planets, Roman god, Automobile, and Metallic element]
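The point of the diagram, that typed relationships separate the senses of an ambiguous label, can be sketched with simple triples. The transcript does not preserve which relationship type connects "Mercury" to which node, so the pairings below are guesses made for illustration, and the "Temperature" node is left out for the same reason. `TRIPLES` and `resolve` are hypothetical names.

```python
# Assumed pairings of relationship types and targets; the original diagram's
# exact arrows are not recoverable from the transcript.
TRIPLES = [
    ("Mercury (planet)", "IsA", "Planets"),
    ("Mercury (god)", "IsA", "Roman god"),
    ("Mercury (automobile)", "BrandOf", "Automobile"),
    ("Mercury (element)", "TypeOf", "Metallic element"),
]


def resolve(label, relation, target, triples):
    """Return the senses of `label` that hold `relation` to `target`."""
    prefix = label + " ("
    return [
        subject
        for subject, rel, obj in triples
        if subject.startswith(prefix) and rel == relation and obj == target
    ]


print(resolve("Mercury", "BrandOf", "Automobile", TRIPLES))
print(resolve("Mercury", "IsA", "Planets", TRIPLES))
```

With only a generic broader/narrower link, all four senses would hang off their parents in the same way; naming the relationship gives software something to reason with.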
  • 15. Knowledge Organization Systems
      • Uncontrolled list (not complex)
      • Name authority file
      • Synonym set/ring
      • Controlled vocabulary
      • Taxonomy
      • Thesaurus (AIP Thesaurus is here)
      • Ontology
      • Semantic network (highly complex)
  • 16. Lessons Learned
      • Learning the style for indexing
      • Tendency to revert to the PACS style of language and classification
      • SME feedback turnaround
          – Sit with them: 2 hours
          – Incorporate suggestions: 8 hours
          – 2117 terms added; 1354 terms changed or updated; 1333 terms deleted; 11259 other actions
  • 17. Where are we now?
      • Platform is established
      • OWL and other formats available
      • One kind of Associative relationship
          – Related terms
      • One kind of Hierarchical relationship
          – Broader/Narrower (Parent/Child)
          – Multiple broader terms for interdisciplinary options
      • One kind of Equivalence relationship
          – Synonym (non-preferred terms)
      • Built using the Z39.19 standard: interoperable
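Allowing multiple broader terms (polyhierarchy) means a term can be reached through more than one branch. The sketch below walks every broader path for a term; the vocabulary in `BROADER` is invented sample data rather than actual AIP Thesaurus content, and `ancestors` is a hypothetical helper.

```python
# Hypothetical sample hierarchy: term -> list of its broader terms.
BROADER = {
    "Magnetohydrodynamics": ["Plasma physics", "Fluid dynamics"],
    "Plasma physics": ["Physics"],
    "Fluid dynamics": ["Physics"],
    "Physics": [],
}


def ancestors(term, broader):
    """Collect every term reachable by following broader-term links."""
    seen = set()
    stack = list(broader.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(broader.get(current, []))
    return seen


print(sorted(ancestors("Magnetohydrodynamics", BROADER)))
```

The `seen` set keeps shared ancestors from being visited twice, which matters once interdisciplinary terms sit under several parents.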
  • 18. To Review AIP Thes
      • Use a web browser
      • http://thesview.accessinn.com/aipThes/
      • Enter the username and password twice; in all cases both are aip
      • This launches a Java app in your browser that shows the thesaurus starting from the top level of the hierarchy
      • Use the collaboration module to comment and discuss
  • 19. Thank you
      Marjorie Hlava
      mhlava@accessin.com
      505-998-0800
