Taxonomy Fundamentals Workshop 2013
Presented by Access Innovations, Inc. president Marjorie M.K. Hlava at the 2013 Taxonomy Boot Camp, November 5, 2013.


  • Thanks to Helen Atkins of AACR for this illustration. The real power of this is that the links can go in all directions, so we take advantage of having the user’s attention regardless of how they step into our “web”.

Taxonomy Fundamentals Workshop 2013 Presentation Transcript

  • Taxonomy Fundamentals Workshop Marjorie Hlava, President Access Innovations, Inc. www.accessinn.com Taxonomy Boot Camp Tuesday, November 5, 2013 © Access Innovations, Inc. All Rights Reserved.
  • Taxonomy Fundamentals Workshop, 10:15 a.m. – 12:00 p.m. Marjorie M.K. Hlava, President & Chairman, Access Innovations, Inc., creator of Data Harmony software. My blog is TaxoDiary.com. This interactive session starts by building a solid conceptual foundation for taxonomy creation and reinforces those concepts through audience participation. Starting with the basics, Hlava quickly advances to where and how to leverage taxonomies. This gives beginning and intermediate practitioners a good overview of the foundational knowledge for the more advanced sessions throughout the conference. Leveraging the taxonomy standards for the key components of a thesaurus, Hlava explores how those elements support the information needs of users from multiple perspectives and examines illustrative sites and behind-the-scenes solutions to see how a well-constructed taxonomy with a rich interplay of terms and synonyms leads to better information access. The workshop discusses developing a taxonomy that serves users, respecting their needs for specialized vocabularies. With hands-on activities, attendees gain insight into how a subject area can be viewed, described, and structured. This learn-by-doing session provides basic knowledge to create a taxonomy that suits your needs.
  • In our 1:45 hours together
    – Conceptual Framework – The Basics
    – Where and How to Leverage Taxonomies
    – Better Information Access (Search)
    – “Card Sort”
    – “Taxonomatch”
    – A Quick Look at Standards
    – “A Taxing Situation” – “Taxopoly”
  • Conceptual Framework – The Basics
    – What is a taxonomy?
    – What are the parts of a taxonomy?
    – How do you build one?
    – Guidelines for the terms
    – Subject matter experts (SMEs)
  • What is a Taxonomy?
    – ANSI/NISO Z39.19-2005 (R2010): “A collection of controlled vocabulary terms organized into a hierarchical structure.” Yes!
    – Missing: equivalence, associative relationships, and notes
  • The Semantic Roadmap: Knowledge Organization Systems
    Complex, high value (linked entities, contextual specificity)
    – Semantic network
    – Ontology
    – Thesaurus
    – Taxonomy
    – Controlled vocabulary
    – Synonym set/ring
    – Name authority file
    – Uncontrolled list
    Simple, low value (unrelated entities, ambiguity)
    Uncontrolled list: Highest Cost over Time!
  • Basic features – The term record
    – Main Term (MT) = subject term, heading, node, category, descriptor, class
    – Top Term (TT)
    – Broader Terms (BT)
    – Narrower Terms (NT)
    – Related Terms (RT), See also (SA)
    – Non-Preferred Term (NP), Used for (UF), See (S), Synonyms
    – Scope Note (SN)
    – History (H)
    Slide labels: TAXONOMY, THESAURUS, ONTOLOGY
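The term record above can be pictured as a small data structure. The following is a minimal sketch, assuming one field per abbreviation on the slide; the class name `TermRecord`, the helper `is_taxonomy_only`, and the sample synonym are illustrative assumptions, not part of the workshop or of Data Harmony.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TermRecord:
    """One term record; field comments give the slide's abbreviations."""
    main_term: str                                      # MT
    top_term: Optional[str] = None                      # TT
    broader: List[str] = field(default_factory=list)    # BT
    narrower: List[str] = field(default_factory=list)   # NT
    related: List[str] = field(default_factory=list)    # RT / SA
    used_for: List[str] = field(default_factory=list)   # UF / non-preferred terms
    scope_note: str = ""                                # SN
    history: str = ""                                   # H

    def is_taxonomy_only(self) -> bool:
        """True when only hierarchical fields are filled (no RT, UF, or SN)."""
        return not (self.related or self.used_for or self.scope_note)


record = TermRecord(
    main_term="Telephones",
    top_term="Communications equipment",
    broader=["Communications equipment"],
    narrower=["Smartphones"],
    used_for=["Phones"],  # hypothetical synonym, not taken from the slides
)
print(record.is_taxonomy_only())
```

Keeping the "extras" (related terms, non-preferred terms, scope notes) as optional fields mirrors the later slide's point that a thesaurus is a taxonomy with extras.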
  • Taxonomy? Thesaurus?
    – Often used interchangeably
    – Thesaurus is a taxonomy with extras
        – Related Terms
        – Non-preferred Terms (USE/Used for)
        – Scope Notes
        – More
    – Taxonomies often have the actual information object at the final node.
    – CMS and SharePoint tend to the hierarchical view only, definition, and USE
  • Taxonomy view; Thesaurus Term Record view
  • How do you build a taxonomy?
    – Define subject field
    – Collect terms
    – Organize terms
    – Fill in gaps
    – Flesh out and interrelate terms
    – Apply to your data
    You’re done!
  • Define subject field
    – Review representative collection of content
    – Determine:
        – Core areas (e.g., Sociology)
        – Peripheral topics (e.g., Psychology, Education, Law)
    – Scope can be modified later
  • Build, buy, augment?
    – Survey existing thesaurus/taxonomy resources for your domain
    – Test for:
        – Scope
        – Depth
        – Make-or-break terms
        – Cost
    – Adoption of existing taxonomies
        – Term registries
        – TaxoBank
        – Taxonomy Warehouse
        – Other resources
    Don’t reinvent the wheel!
  • Foundations
    – Start with what is known
    – Build from there
    – Use the literature, your data
    – Use the lists you already have internally
    – Build in continuous review throughout the process, and beyond
    – Who is involved?
        – Taxonomists
        – Subject matter experts
        – Project management
        – Users
  • Collect terms
    – Your documents and databases
    – Departmental terminology
    – Textbooks and their indexes
    – Book tables of contents and indexes
    – Journal quarterly indexes
    – Encyclopedias
    – Lexicons, glossaries on the topic
    – Web resources
    – Users and experts
    – Search logs
  • Gather terms from search logs
    – Top ~100 search terms from search logs
    – Terms used more than 50 times
    – Match to web site with appropriate answer
    – Basis for favorites or best bets, presented at the top of results list
    – Behavior-based taxonomy
  • Extract the terms – N-grams
    – Mine the full text for terms
    – Decide term length
        – Up to four perhaps
    – Sort into a frequency list
    – Leave full strings and just n-grams
    – Auto match to other lists
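The n-gram step can be sketched in a few lines. This is a minimal illustration, assuming simple lowercase word tokenization and a maximum term length of four words as the slide suggests; the function name `ngram_frequencies` and the sample text are made up for the example. Note that this naive version lets n-grams run across sentence boundaries.

```python
import re
from collections import Counter


def ngram_frequencies(text: str, max_len: int = 4) -> Counter:
    """Count every word n-gram of length 1..max_len in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts


sample = "Fiber optics links carry data. Fiber optics sensors measure strain."
freq = ngram_frequencies(sample)

# Frequency list, most common candidates first.
for phrase, count in freq.most_common(5):
    print(count, phrase)
```

The resulting frequency list is the raw material for the consolidation step: candidate phrases still need de-duplication and human review before they become terms.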
  • Consolidate
    – Search log terms
    – Source terms
    – N-gram extractions
    – De-dupe and frequency
    – Work the list
  • How do you choose terms?
    – Importance in the subject area
    – Use in the literature, by the organization or community
    – Necessary degree of specificity or detail
    – Relationship with other controlled vocabularies
    – Single concept = single term
  • One term / one concept
    – Terms represent simple or unitary concept
    – A unit of thought
    – May be a single-word term
    – May be a multi-word term if required to represent the concept
    – Three main categories
        – Concrete entities
        – Abstract concepts
        – Proper nouns
    “A unit of thought, formed by mentally combining some or all of the characteristics of a concrete or abstract, real or imaginary object. Concepts exist in the mind as abstract entities independent of terms used to express them.”
  • How big should it be?
    – Depends on use and your content
    – What do the users need?
        – If search logs show precise detailed requests, support them
        – Retail sites – less deep, more “facets”
        – Scholarly publishers – deep and specific
  • The levels
    – 7 – 22 top terms
    – Cognitive width supports this range
    – 3 levels for e-commerce
    – Roll up if you have more levels
    – Index / tag to the most specific level
    – More for specific and precise data
    – Smaller taxos are tougher to maintain
    – 22 x 22 = 484; 484 x 22 = 10,648
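The closing arithmetic on this slide is the size of each level when every term has the same number of children. A small sketch, assuming a uniform branching factor of 22 (the function name is illustrative):

```python
def terms_at_level(branching: int, level: int) -> int:
    """Number of terms at a level when every term has `branching` children."""
    return branching ** level


# Level 1 is the set of top terms; each further level multiplies by 22.
for level in (1, 2, 3):
    print(level, terms_at_level(22, level))
```

Three levels of 22 already reach 10,648 terms at the deepest level, which is why a modest number of top terms and levels can still cover a large vocabulary.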
  • Sample vocabulary sizes: Scholarly
    Organization   Size             Subject
    AIP            ~6500 terms      Physics
    IOP            ~5400 terms      Physics
    JSTOR          ~57000 terms     General
    OSA            ~2400 terms      Optics
    SPIE           ~3800 terms      Optical Physics and Imaging
    NICEM          ~5500 terms      Education
  • Sample vocabulary sizes: Retail. Barnes & Noble; LL Bean; 16 TT, 30 level 2 = 480; Home Depot; 10 TT, 3 levels, 10 / level = 1000 terms; Amazon; 14 TT, 30 level 2 = 420 terms; 14 TT – 3 levels – 8 per level = 896 terms. For more information on product taxos: http://gilbaneboston.com/12/presentations/T11_Hedden.pdf
  • Concrete entities as terms
    – Things and their physical parts
        – Birds: Feathers
        – Buildings: Floors
    – Materials
        – Cement
        – Wood
        – Lead
    – Cards and chips
  • Abstract concepts as terms
    – Actions and events: evolution, skating, management, ceremonies
    – Abstract entities: law, theory
    – Properties of things, materials, and actions: strength, efficiency
    – Disciplines and sciences: physics, meteorology, mathematics
    – Units of measurement: pounds, kilograms, miles, meters, nanoseconds
  • Proper nouns as terms
    – Individual entities – “classes of one” – expressed as proper nouns
        – San Francisco, Lake Michigan
    – Thesaurus standards exclude proper names, persons, and trade names → authority files.
    – Taxonomies include them as final nodes.
  • Organize terms – roughly
    – Sort terms into several major categories – logical groups of similar concepts as Top Terms
        – Identify core areas and peripheral topics
        – 10 – 20 to start
        – Consider moving proper names to authority files
    – Result: loose collection of terms under several main headings
        – Rough and tentative – see how it fits as you go
        – Initial gap analysis
        – Add / modify / delete as needed
  • How do terms relate?
    – Hierarchical relationships: Parents and their children
    – Equivalence relationships: Aliases
    – Associative relationships: Related terms, Cousins, See Also’s
    Slide labels: TAXONOMY, THESAURUS
  • Hierarchical relationships
    – Broader Term represents the class, whole, or genus
    – Narrower Term is a member, part, or species
        – Generic relationship
        – Whole-part relationship
        – Instance relationship
    – NTs inherit all the BT characteristics
    – BTs/NTs have a reciprocal relationship
  • Broader to Narrower Terms: Communications equipment; Telephones; Smartphones; Radio phones; Analog phones; Speaker phones
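The reciprocal BT/NT rule lends itself to a short sketch: every time a narrower term is attached to a broader term, the inverse link is stored in the same step. The nesting used below (Communications equipment above Telephones above Smartphones) is an assumption, since the transcript lost the slide's indentation; the function names are illustrative. The nurse example comes from the polyhierarchy slide later in the deck.

```python
from collections import defaultdict

broader = defaultdict(set)   # term -> its Broader Terms (BT)
narrower = defaultdict(set)  # term -> its Narrower Terms (NT)


def add_bt_nt(bt: str, nt: str) -> None:
    """Record one hierarchical link and its reciprocal in a single step."""
    narrower[bt].add(nt)
    broader[nt].add(bt)


def ancestors(term: str) -> set:
    """All terms above `term`, following every BT link (works with polyhierarchy)."""
    found = set()
    stack = [term]
    while stack:
        for bt in broader.get(stack.pop(), ()):
            if bt not in found:
                found.add(bt)
                stack.append(bt)
    return found


add_bt_nt("Communications equipment", "Telephones")  # assumed nesting
add_bt_nt("Telephones", "Smartphones")               # assumed nesting
add_bt_nt("Nurses", "Nurse administrators")
add_bt_nt("Health administrators", "Nurse administrators")

print(sorted(ancestors("Smartphones")))
print(sorted(broader["Nurse administrators"]))
```

Storing both directions at once keeps the hierarchy consistent, and using sets for the BT side lets one term sit under several broader terms.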
  • Hierarchy – Whole-part relationship
    – Four general types
        – Body systems and organs: Ear → Middle ear
        – Geographical locations: Bernalillo County → Albuquerque
        – Fields of study: Geology → Physical geology
        – Hierarchical social structures: Ontario → Manitoulin District
  • Hierarchy – Instance relationship
    – General category (common noun) as BT, with individual example (proper noun) as NTI (Narrower Term Instance)
        – Seas: Baltic Sea, Caspian Sea, Mediterranean Sea
        – French cathedrals: Chartres Cathedral, Rheims Cathedral, Rouen Cathedral
    – Essentially identical to “final node” in some taxonomies
  • Polyhierarchical relationship
    – Term can logically fit under more than one Broader Term – can have Multiple Broader Terms (MBT)
    – Part of ISO and ANSI/NISO standards
    – Examples:
        – Nurses → Nurse administrators; Health administrators → Nurse administrators
        – Finance → Accounting; Careers → Accounting
  • Generic relationship test – 1
    – Both terms in same fundamental category
    – “All-and-some” test
        – Rodents / Squirrels: SOME rodents are squirrels; ALL squirrels are rodents
        – Pests / Squirrels: SOME pests are squirrels; NOT ALL squirrels are pests
    – Inheritance or inclusion – what’s true of the parent (BT) is true for all children (NTs)
  • Generic relationship test – 2 (Rodents, Squirrels, Pests)
    – ✓ ALL squirrels are rodents
    – ✗ NOT ALL squirrels are pests
    – ✗ NOT ALL pests are rodents
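The all-and-some test can be expressed with sets: a narrower term passes only if all of its members are also members of the broader term. The sketch below is illustrative only; the member lists are invented stand-ins, and in practice the test is an editorial judgment rather than a computation.

```python
def all_and_some(parent_members: set, child_members: set) -> bool:
    """Generic test: ALL children belong to the parent (so SOME parents are children)."""
    return bool(child_members) and child_members <= parent_members


# Hypothetical members, chosen only to illustrate the slide's three statements.
rodents = {"gray squirrel", "red squirrel", "house mouse", "beaver"}
squirrels = {"gray squirrel", "red squirrel"}
pests = {"house mouse", "cockroach", "gray squirrel"}

print(all_and_some(rodents, squirrels))  # ALL squirrels are rodents
print(all_and_some(pests, squirrels))    # NOT ALL squirrels are pests
print(all_and_some(rodents, pests))      # NOT ALL pests are rodents
```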
  • Equivalence relationship
    – Preferred Term
        – Thesaurus term and valid for indexing
        – Thesaurus notation: USE
    – Non-Preferred Term
        – Not valid for indexing
        – An alias
        – Entry point, directs user to Preferred Term
        – Thesaurus notation: UF or NPT
    – Examples: Spiders UF Arachnids; Plant pathology USE Phytopathology
  • Equivalence – when to use
    – Synonyms, slang, quasi-synonyms
    – Scientific and trade names
        – Ibuprofen UF Motrin
    – Lexical variants
        – Fiber optics UF Fibre optics
        – Mouse UF Mice
    – Upward posting of narrow concepts not specified in taxonomy or thesaurus
        – Social class UF Elite, Middle class, Working class
    Get equivalent terms from search logs, brainstorming…
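In an application, the UF entries above become a lookup from any entry term to its preferred term. A minimal sketch, assuming case-insensitive matching (an assumption, not something the slides specify); the names `use_for`, `use`, and `preferred_term` are illustrative.

```python
# Preferred term -> its non-preferred (UF) entry terms, taken from the slide.
use_for = {
    "Ibuprofen": ["Motrin"],
    "Fiber optics": ["Fibre optics"],
    "Mouse": ["Mice"],
    "Social class": ["Elite", "Middle class", "Working class"],
}

# Invert the UF lists into a USE lookup: entry term -> preferred term.
use = {}
for preferred, aliases in use_for.items():
    use[preferred.lower()] = preferred
    for alias in aliases:
        use[alias.lower()] = preferred


def preferred_term(entry: str):
    """Return the preferred term for a query string, or None if it is unknown."""
    return use.get(entry.strip().lower())


print(preferred_term("motrin"))
print(preferred_term("Working class"))
```

Unknown entries returning `None` are useful in their own right: logged over time, they are candidate terms or candidate synonyms.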
  • Associative relationship
    – Related Terms (RTs) – cousins
    – “…terms related conceptually but not hierarchically, and are not part of an equivalence set” (i.e. not synonyms)
    – Both valid for indexing
    – Reciprocal relationship with each other
    – Expands user’s awareness, reflects thesaurus coverage of unanticipated areas
    – Main basis for the ontology
    – 14 main options offered in Z39.19
  • Scope Notes (SN)
    – Indicate meaning of the term in the context of this thesaurus, for this audience
        – Stress – Metal, Psychological, Physiological
    – Could be the definition or glossary
    – Indicate any restriction in meaning
    – Indicate range of topics covered
    – Provide direction for indexers; for terms often confused, may suggest an alternative term
    – Use as needed – may not be for every term
    – Use a style guide
    – Be concise
  • Stating the terms
    – Term format
    – Grammatical issues
    – Singular and plural forms
    – Spelling
    – Abbreviations and acronyms
    – Capitalization
    – Other punctuation
    – Consistency
  • Term format
    – KISS – Keep it short and simple
        – 1-2-3 words
        – Effect on search
        – Pre- and post-coordination
    – Establish a policy
        – E.g., follow Chicago Manual of Style
    – Grammatical issues
        – Nouns and noun phrases
        – Verbs → Gerunds
        – Adjectives – no
        – Adverbs – no
        – Initial articles – no
  • Compound Terms
    – “Terms in a thesaurus should represent simple or unitary concepts…” (ISO standard)
    – “Compound terms should be factored (split) into simple elements…” (ANSI/NISO standard)
    – Term phrases are okay (bigrams)
        – Adjective Noun: American history
    – Two concepts combined are not
        – Aromatherapy for bloating
  • Pre- and Post-Coordination
    – Pre-coordination – multiple concepts
        – Subject headings – Library of Congress: American history – Civil War
        – Back of the book
        – Put together in advance by the publisher
    – Post-coordination
        – Taxonomy terms
        – Single concept
        – Put together by the user / searcher
  • So far you’ve got
    – Hierarchy
        – Broader and Narrower Terms
        – Polyhierarchies when needed
    – Equivalence relationships
        – Preferred/Non-Preferred Terms
    – Associative relationships
        – Related Terms
    – Scope Notes
    – Complete term records
        – Correct term format
  • Taxonomy view; Thesaurus Term Record view
  • Does it work?
    – Test on your data
        – Index 500+ documents (more for variable writing style; fewer for strict style)
        – No un-indexed articles allowed
        – Consider deleting unused terms
    – Review
        – Users
        – Expert reviewers
    – Consider automated / assisted indexing software
  • Subject Matter Experts
    – Work first from the literature
    – Establish literary warrant for terms
    – Have someone else do the clerical work
    – Differentiate the lexicography work from the subject matter expert work
    – Let SMEs do the review and tailoring
    – Expert review ensures the proper term use and application
  • Subject Matter Experts
    – Interview: 1 – 2 hours
    – Make suggested enhancements: 8 hours
    – Balance competing perspectives
    – Advisory Board…advisable!
  • Review, edit, test, edit, use, edit, and maintain, i.e. edit
    – Monitor search logs
    – Allow indexers to suggest candidate terms
    – Edit and maintain
        – Add term
        – Change existing term
        – Change term status
        – Delete term
        – Add term relationship
        – Delete term relationship
        – Add/modify scope note
        – Change overall structure
  • Card Sort
    – Groups of three
    – Organize terms into the “proper” hierarchical order
    – Use as many levels as needed
    – Use polyhierarchy as needed
    – Write your top terms on the flip chart sheet to show the group
    – 15 minutes
  • Where and how to leverage taxonomies
    – Implementation and applications
    – Adding the terms to the information objects
    – Search and other applications
    – Taxonomy use cases – implementation
    – Opportunities and obstacles
  • Parts of the puzzle
    – The taxonomy
        – The words to use
        – In the order you want the users to browse
    – Applications
        – Search, Web site, CMS, SharePoint, Publishing system, Author submission, Peer reviewer ID, Recommendation engines, etc.
    – Implementation / actions
        – Making the links
        – Adding terms to information objects
        – Mash-ups
    – Most people confuse the parts, but they act very differently
  • The Workflow (diagram; fully integrated with MOSS)
    – Workflow steps: gather source data; tag and create metadata; put in database with tags; build search inverted index; create user interface
    – Components shown: client data, full text; client taxonomy, Thesaurus Master; Machine Aided Indexer (M.A.I.™); inline tagging; metadata and entity extractor; automatic summarization; database repository; search software; search presentation layer (HTML, PDF, data feeds, etc.)
    – Increases accuracy: browse by subject, auto-completion, broader terms, narrower terms, related terms
  • Adding terms to information objects
    – Part of the record
        – XML
        – MARC
    – A relational table pointing the terms to a record ID number (secondary key)
    – Adding data to the HTML
        – META NAME KEYWORD element
    – Many other options
  • Part of the record – XML
    – Added as an element in the XML record
    – Need an element to put the data in
        – Element = Field = table value = <Taxonomy Term>
        – <Taxonomy Term>Roadworks</Taxonomy Term>
    – Capture the terms when creating the records
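As a rough sketch of the XML option, the snippet below adds a taxonomy term element to a record using Python's standard `xml.etree.ElementTree`. XML element names cannot contain spaces, so the slide's `<Taxonomy Term>` is written here as `TaxonomyTerm`; that name, the `Record` and `Title` elements, and the sample values are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; only the term "Roadworks" comes from the slide.
record = ET.Element("Record", attrib={"id": "12345"})
ET.SubElement(record, "Title").text = "Road repair schedule"

# One element per taxonomy term, added when the record is created.
for term in ["Roadworks"]:
    ET.SubElement(record, "TaxonomyTerm").text = term

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Using one element per term, rather than a single delimited string, keeps each term individually addressable for search and faceting.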
  • Part of the record – MARC
    – Added as an element in the MARC record
    – Need an element to put the data in
        – 654 Roadworks
        – 345 Roadworks
        – Wherever you decide it works best for your OPAC
    – Add the terms when creating the record
  • Editorial Workflow Integration: Author Submission Module. The author fills in the data to the document template, attaching images and graphs as necessary. An API calls Data Harmony and generates a list of indexing terms based on the content.
  • Editorial Workflow Integration: Author Submission Module. Authors review the indexing and may change it. Content is stored into a data repository as HTML, XML, etc.
  • Editorial Workflow Integration: Contributor Role Tagging. A popup list of contributor role options appears for the author to choose from.
    – Contributor roles: Study conception, Methodology, Formal analysis, Computation, Investigation, Resources, Data Curation, Publication, Supervision, Project administration, Funding acquisition
    – Mouse over for explanation, e.g., Formal Analysis: “Application of statistical, mathematical, or other formal techniques to analyze study data”
  • In the HTML record
    – Makes it crawlable for the internet
    – Used in CMS applications
        – Content management systems
    – Add to the HTML
        – Manually
        – In Dreamweaver
        – In your CMS system (Drupal, WordPress, etc.)
    – Author Submissions Example
  • META NAME “KEYWORDS”
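A keywords meta element can be generated from a record's taxonomy terms. This is a minimal sketch using only the standard library; the function name `keywords_meta` and the sample terms are illustrative, and how the tag is injected into a page depends on the CMS in use.

```python
from html import escape


def keywords_meta(terms: list) -> str:
    """Build a <meta name="keywords"> element from a list of taxonomy terms."""
    content = ", ".join(terms)
    # Escape &, <, > and quotes so the attribute value stays valid HTML.
    return f'<meta name="keywords" content="{escape(content, quote=True)}">'


print(keywords_meta(["Roadworks", "Traffic & transport"]))
```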
  • In Relational Database Table
    – Primary key – for the record
    – Secondary key – all the metadata
        – Like taxonomy terms
        – Like author
        – Like publication date
    – Used in Oracle, SQL, etc.
    – Need field to put the taxonomy data in
    – Supports “Faceted Search”
        – Each item in a separate field or element or table
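The relational option can be sketched with an in-memory SQLite database: one table for records, and a second table that points taxonomy terms at record IDs. Table names, column names, and sample rows are assumptions made for this illustration, not the schema shown in the workshop.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE record (
        record_id INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        author    TEXT,
        pub_date  TEXT
    );
    -- One row per (record, term) pair, keyed back to the record ID.
    CREATE TABLE record_term (
        record_id INTEGER NOT NULL REFERENCES record(record_id),
        term      TEXT NOT NULL,
        PRIMARY KEY (record_id, term)
    );
""")
conn.execute("INSERT INTO record VALUES (1, 'Road repair schedule', 'A. Author', '2013-11-05')")
conn.execute("INSERT INTO record VALUES (2, 'Bridge inspection report', 'B. Author', '2013-10-01')")
conn.executemany(
    "INSERT INTO record_term VALUES (?, ?)",
    [(1, "Roadworks"), (2, "Roadworks"), (2, "Bridges")],
)

# Facet counts: how many records carry each taxonomy term.
facets = conn.execute(
    "SELECT term, COUNT(*) FROM record_term GROUP BY term ORDER BY term"
).fetchall()
print(facets)
```

Keeping terms in their own table is what makes faceted search cheap: a facet count is a single `GROUP BY` over the term column.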
  • Relational database diagram
  • Adding terms to SharePoint: The user uploads a document to the SharePoint space. Before uploading to the SharePoint server, the EventHandler sends the document to Data Harmony (M.A.I., TaxoTerm Server). Data Harmony automatically attaches indexing terms and returns subject metadata before the upload to MOSS (Microsoft SharePoint Server 2010).
  • SharePoint 2010 shows only 10 lines of the taxonomy. This add-on makes it all viewable.
  • Taxonomies added in search example (architecture diagram)
    – Core architectural components: administrator’s dashboard; FAST management API; content sources (email and groupware, databases, files and documents, web content, content push); connectors (email connector, database connector, custom connector, file traverser, web crawler, content API); filter server; document processor pipeline; index DB; search server; query processor pipeline; query API; alerts; agent DB
    – Front ends: vertical applications, portals, custom applications, custom front-ends, mobile devices
    – Data Harmony components: MAIstro, Governance API, Search Harmony
    – Callout: use taxonomy terms here
  • Autosuggestion of taxonomy terms: Populate keywords, descriptors, indexing terms, etc. Allow for manual review of autotagging for quality assurance.
  • HTML Header
  • Suggested taxonomy descriptors
  • Dynamic Display
  • Inline Tagging: Shows the exact point where the concept is mentioned. Mouse-over to view the term record. Statistical summary, showing the number of times each term is mentioned in the article.
  • Integrate taxonomy to enhance findability
    – Browsable categories of a directory
    – Browsable faceted navigation
    – Smart search for term equivalents
    – Taxonomy terms (original or modified) as labels
    – Navigation aids incorporate taxonomy terms and relationships
  • More Taxonomy Enrichment
    – Spelling alternatives and correction
    – Related concepts
    – Statistical information about the metadata
    – Navigation or drill-downs
    – Search refinement
        – Recursive sets
        – Concept linking
        – Dictionary lookup (in taxonomy glossary)
  • Parts of Search
    – Search software
        – Inverted index
        – Search algorithms
    – Presentation layer
        – Search box
        – Auto-completion
        – Related and narrower terms
        – Hierarchical display
  • Database Plus Search Workflow (diagram)
    – Source data: SQL for ecommerce; raw full text data feeds; printed source materials; data crawls on data sources
    – XIS creation; XIS repository
    – Clean and enhance data; add metadata
    – Taxonomy terms: MAI Concept Extractor, MAI Rule Base, Thesaurus Master
    – Load to search; search data
    – Search Harmony: display, search
  • Why does search fail?
    – Most large organizations have 5 different search software applications
        – All disappointing and on the shelf
        – Inconsistent results
        – Unclear path to results
        – Lack of single unified clear and consistent vocabulary
    – Not tied to data governance
        – Taxonomy
        – Other metadata
  • Creating an Inverted File Index: sample DOCUMENT
    Outline of Presentation
    1 Define key terminology
    2 Thesaurus tools
        Features
        Functions
    3 Costs
        Thesaurus construction
        Thesaurus tools
    4 Why & when?
  • Simple inverted file index: the terms from the “outline”
    &, 1, 2, 3, 4, construction, costs, define, features, functions, key, of, outline, presentation, terminology, thesaurus, tools, when, why
  • Complex inverted file index: placement location
    Term           Placement
    &              Stop
    1              Stop
    2              Stop
    3              Stop
    4              Stop
    construction   L7, P2, SH
    costs          L6, P1, H
    define         L2, P1, H
    features       L4, P1, SH
    functions      L5, P1, SH
    key            L2, P2, H
    of             Stop
    outline        L1, P1, T
    presentation   L1, P3, T
    terminology    L2, P3, H
    thesaurus      (1) L3, P1, H; (2) L7, P1, SH; (3) L8, P1, SH
    tools          (1) L3, P2, H; (2) L8, P2, SH
    when           L9, P3, H
    why            L9, P1, H
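The placement table can be reproduced with a short script. This is a sketch under stated assumptions: L is read as line number, P as word position within the line, and T/H/SH as title, heading, and subheading; the outline numbers 1 to 4 (stop entries on the slide) are simply left out of the input. The names `DOCUMENT`, `STOP_WORDS`, and `build_index` are illustrative.

```python
from collections import defaultdict

# (text, zone) per line; T = title, H = heading, SH = subheading (assumed reading).
DOCUMENT = [
    ("Outline of Presentation", "T"),
    ("Define key terminology", "H"),
    ("Thesaurus tools", "H"),
    ("Features", "SH"),
    ("Functions", "SH"),
    ("Costs", "H"),
    ("Thesaurus construction", "SH"),
    ("Thesaurus tools", "SH"),
    ("Why & when?", "H"),
]
STOP_WORDS = {"&", "of"}


def build_index(lines):
    """Map each non-stop word to a list of (line, position, zone) postings."""
    index = defaultdict(list)
    for line_no, (text, zone) in enumerate(lines, start=1):
        # Stop words still occupy a position, so "when" is word 3 of line 9.
        for position, raw in enumerate(text.split(), start=1):
            word = raw.strip("?.,;:!").lower()
            if word and word not in STOP_WORDS:
                index[word].append((line_no, position, zone))
    return dict(index)


index = build_index(DOCUMENT)
for term in sorted(index):
    print(term, index[term])
```

Storing line, position, and zone with each posting is what lets a search engine support phrase queries and weight a hit in a title or heading above one in body text.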
  • Access Innovations – Complex Farm with Perfect Search (diagram). Components shown: query; query servers; federators; Search Harmony presentation layer; cleanup, etc.; deploy hub; XIS repository (cache); cache builders; index builders; source data.
  • Measuring accuracy in search
    – Relevance
    – Recall
    – Precision
    – Hits, misses, noise
    – Ranking
    – Linguistics
    – Query processing
    – Results processing
    – Display
    – Search refinement
    – Usability
    – Business rules
  • Relevance
    – How well a set of returned documents answers the information need
    – “Accuracy”
    – Related to objective of search
        – Different user communities
        – Information resources
        – Tension of user needs and context available
    – A confidence “guesstimate”
  • The formulas
    – Recall = (number of relevant items retrieved) / (number of relevant items in the collection)
    – Precision = (number of relevant items retrieved) / (number of items retrieved)
    – Relevance = Germane (Precision), Pertinent (Recall)
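A worked example of the two formulas, using made-up document IDs: four items are retrieved, five items in the collection are relevant, and two items are in both sets. The function name `precision_recall` is illustrative.

```python
def precision_recall(retrieved: set, relevant: set):
    """Return (precision, recall) for one query."""
    hits = retrieved & relevant  # relevant items that were retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query result and relevance judgments.
retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d2", "d4", "d5", "d6", "d7"}

precision, recall = precision_recall(retrieved, relevant)
misses = relevant - retrieved  # relevant but not retrieved
noise = retrieved - relevant   # retrieved but not relevant
print(precision, recall, len(misses), len(noise))
```

Here precision is 2/4 and recall is 2/5. The same sets also give the misses and noise named on the earlier slide, though deciding which items count as relevant remains a human judgment.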
  • Measuring Relevance
    – Concepts
    – Context
    – Age of documents
    – Completeness (recall)
    – Quality
    – Statistically determined? Nope, it is subjective
        – Someone has to determine the rightness of the item
        – A confidence factor = canard!
  • Improve Search (www.mediasleuth.com)
    – Auto-completion using the taxonomy
    – Guide the user
    – Navigate the full taxonomy “tree”
    – BROWSE
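Auto-completion from a taxonomy can be approximated with a sorted term list and a binary search for the typed prefix. This is a simplified sketch, not how any particular search product implements it; the term list reuses the telephone examples from earlier in the deck, and `suggest` is a hypothetical helper.

```python
from bisect import bisect_left

TERMS = sorted(
    ["Telephones", "Smartphones", "Speaker phones", "Radio phones", "Analog phones"],
    key=str.lower,
)
KEYS = [t.lower() for t in TERMS]  # lowercase keys, in the same order as TERMS


def suggest(prefix: str, limit: int = 5) -> list:
    """Return up to `limit` taxonomy terms that start with the typed prefix."""
    p = prefix.lower()
    start = bisect_left(KEYS, p)  # first key that sorts at or after the prefix
    matches = []
    for i in range(start, len(KEYS)):
        if not KEYS[i].startswith(p) or len(matches) == limit:
            break
        matches.append(TERMS[i])
    return matches


print(suggest("s"))
print(suggest("sp"))
```

Because suggestions come only from the controlled vocabulary, the user is guided toward terms that are guaranteed to have been used in indexing.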
  • Subject Browsing
  • Targeted resources based on subject or user role
  • Link to Society Resources (example article page with related links; after Helen Atkins)
    Article: Cancer Epidemiology Biomarkers & Prevention Vol. 12, 161-164, February 2003. © 2003 American Association for Cancer Research. Short Communications. “Alcohol, Folate, Methionine, and Risk of Incident Breast Cancer in the American Cancer Society Cancer Prevention Study II Nutrition Cohort.” Heather Spencer Feigelson, Carolyn R. Jonas, Andreas S. Robertson, Marjorie L. McCullough, Michael J. Thun and Eugenia E. Calle. Department of Epidemiology and Surveillance Research, American Cancer Society, National Home Office, Atlanta, Georgia 30329-4251.
    Abstract excerpt: Recent studies suggest that the increased risk of breast cancer associated with alcohol consumption may be reduced by adequate folate intake. We examined this question among 66,561 postmenopausal women in the American Cancer Society Cancer Prevention Study II Nutrition Cohort.
    – Related Working Groups: Finance; Charter; Molecular Epidemiology
    – Related Think Tank Report Content
    – Related Webcasts
    – Related Awards: AACR-GlaxoSmithKline Clinical Cancer Research Scholar Awards; ACS Award; Weinstein Distinguished Lecture
    – Related Press Releases: How What and How Much We Eat (And Drink) Affects Our Risk of Cancer; Novel COX-2 Combination Treatment May Reduce Colon Cancer Risk: Combination Regimen of COX-2 Inhibitor and Fish Oil Causes Cell Death; COX-2 Levels Are Elevated in Smokers
    – Related AACR Workshops and Conferences: Frontiers in Cancer Prevention Research; Continuing Medical Education (CME); Molecular Targets and Cancer Therapeutics
    – Related Meeting Abstracts: Association between dietary folate intake, alcohol intake, and methylenetetrahydrofolate reductase C677T and A1298C polymorphisms and subsequent breast; Folate, folate cofactor, and alcohol intakes and risk for colorectal adenoma; Dietary folate intake and risk of prostate cancer in a large prospective cohort study
    – Related Education Book Content: Oral Contraceptives, Postmenopausal Hormones, and Breast Cancer; Physical Activity and Cancer; Hormonal Interventions: From Adjuvant Therapy to Breast Cancer Prevention
  • Linked Data (after Helen Atkins)
    – Journal Article on Topic A
    – Other Journal Articles on Topic A
    – CME Activity on Topic A
    – Grant Available for Researchers Working on Topic A
    – Upcoming Conference on Topic A
    – Job Posting for Expert on Topic A
    – Podcast Interview with Researcher Working on Topic A
    – Author Networks
    – Social Networking
  • Authors at a Place
  • Member Profile Tagging: User pastes or uploads CV. Button to autoextract taxonomy attributes.
  • Designed to enhance understanding and retention of the vocabulary concepts necessary for creating a taxonomy, ontology, thesaurus, or controlled vocabulary.
    – Game supplies:
        – 1 Deck of Orange Question and Challenge Cards
        – 1 Deck of Green Answer Cards
    – Game setup:
        – Shuffle the deck of Green Answer cards.
        – Deal the entire deck to the players.
        – Shuffle the deck of Orange Question and Challenge cards.
        – Place them face down in a pile in the middle of the table so that all players can reach the pile.
    Reinforce what you just heard! Have fun!
  • Game rules
    1. Play moves to the left of the dealer.
    2. Draw a card from the top of the Orange cards. Read it aloud to all of the players.
    3. The player who read the card says out loud what they think the answer is.
    4. Each player looks at the Green Answer cards in their hand.
        1. If they have the correct answer to the Question or Challenge, they show their card to everyone at the table.
        2. If everyone agrees that the answer is correct, the player holding the correct answer card gives it to the player who read the Question or Challenge card.
    5. The player places their associated pair of cards – one Orange Question and Challenge card and one Green Answer card – face up on the table in front of them.
    6. Play passes to the person who held the correct Green Answer card in their hand. Play continues as in step 2 above.
    7. Discussion among the players to arrive at the correct answer is permissible and encouraged!
    8. If players do not arrive at a consensus regarding the correct answer, the Orange Question and Challenge card may be returned to the bottom of the pile, and play passes to the person to the left of the player who drew the previous card.
    9. When all of the Orange Question and Challenge cards have been drawn, read aloud, and matched with their Green Answer cards, the game ends.
    10. If there are any Orange Question and Challenge cards remaining to which players cannot agree on an answer, players may consult their notes or ask the session speaker.
  • Using taxonomies in applications
      - Improve search
      - Subject browsing
      - Mobile intelligence
      - Targeted resources based on subject or user role
      - Link to society resources
      - Author submission module
      - Author authority database
      - Expert reviewer identification
      - Member profiles
      - Data visualization
      - More like this
      - In “indexing” or categorizing, as subject metadata
      - In content management systems
      - In SharePoint
      - In mash-ups
      - In social networking sites
      - In author tagging
      - In filtering data – e.g., spam filters and RSS feeds
      - In web crawlers
      - Social media - community
  • More Innovations!
      - Link topic to article to author to event
      - Make visual links within domain
      - Enable authors to submit and categorize conference submissions
      - Create author authority database linking to co-authors, topics, locations, etc.
      - Create expert reviewer database
      - Create member profiles with alternate names, publications, tagged by topic
      - Visualize data and domain distribution
      - Display interest connections in social network
      - Deliver accurate targeted information through mobile applications
  • Visualize your tagged data: This is a radial graph of “plosthes”. Circle size reflects the number of records in which each index term occurs.
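The sizing rule on the slide above (circle size reflects how many records carry each index term) can be sketched in a few lines. This is only an illustration under stated assumptions: the `records` data, the function names, and the choice to scale circle area rather than radius are all hypothetical and are not taken from the workshop or from any visualization tool.

```python
from collections import Counter
import math

# Hypothetical tagged records: each record ID maps to the index terms applied to it.
records = {
    "rec1": ["Genetics", "Cell biology"],
    "rec2": ["Genetics"],
    "rec3": ["Genetics", "Ecology"],
    "rec4": ["Ecology"],
}

def term_record_counts(tagged):
    """Count the number of records in which each index term occurs."""
    counts = Counter()
    for terms in tagged.values():
        counts.update(set(terms))  # set() so a term counts once per record
    return counts

def circle_radius(count, scale=10.0):
    """Radius chosen so that circle AREA is proportional to the record count (an assumption)."""
    return scale * math.sqrt(count)

counts = term_record_counts(records)
for term, n in counts.most_common():
    print(f"{term}: {n} records, radius {circle_radius(n):.1f}")
```

Scaling the area instead of the radius is one common way to keep large terms from visually overwhelming small ones; a given visualization program may make a different choice.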
  • All data up-posted to the top level
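Up-posting, as mentioned on the slide above, can be read as rolling each term's record count up to its top-level term. The sketch below assumes a single broader term per term (no polyhierarchy); the `broader` mapping, the sample counts, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical broader-term (BT) links: child term -> parent term.
# Terms absent from the mapping are treated as top-level terms.
broader = {
    "Molecular genetics": "Genetics",
    "Genetics": "Biology",
    "Ecology": "Biology",
    "Organic chemistry": "Chemistry",
}

def top_term(term, broader):
    """Follow BT links upward until a term with no broader term is reached."""
    seen = set()
    while term in broader:
        if term in seen:  # guard against accidental cycles in the hierarchy
            raise ValueError(f"cycle in hierarchy at {term!r}")
        seen.add(term)
        term = broader[term]
    return term

def up_post(term_counts, broader):
    """Roll each term's record count up to its top-level term."""
    totals = Counter()
    for term, n in term_counts.items():
        totals[top_term(term, broader)] += n
    return totals

term_counts = {
    "Molecular genetics": 5,
    "Genetics": 3,
    "Ecology": 2,
    "Organic chemistry": 4,
}
print(up_post(term_counts, broader))
```

Summing per-term counts counts a record twice when it carries two terms under the same top-level term; to count distinct records, up-post the record IDs into sets instead.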
  • Load to a visualization program such as Prefuse
  • Thesaurus, Controlled vocabulary, ontology... Taxonomy only as part of thesaurus
  • Taxonomy standards
      - Z39.19 (2005; reaffirmed 2010) Controlled Vocabularies
      - BS 8723 Parts 1 – 5
      - ISO 25964 Parts 1 and 2
      - TAG 37 and 46 standards
      - SKOS - Simple Knowledge Organization System
      - OWL - Web Ontology Language
      - AND more!
  • Taxonomies don’t exist in a vacuum
      - They are part of metadata
      - They are used to tag information objects
      - They are used
          - On Web sites
          - In search
          - To profile people
          - To link resources
      - So we have to know a little about those standards as well
  • More on ISO 25964: Part 2 Interoperability and RDA at 2:15 PM
  • W3C
      - HTML 5
      - Linked Data
      - Ontologies (OWL) and SKOS
          - Simple Knowledge Organization System
      - Cascading Style Sheets (CSS)
          - Adding style to Web content
      - Widgets
          - Widget Packaging and XML Configuration
          - Widget Interface: API to metadata and persistently storing data
          - XML Digital Signatures for Widgets
  • Big Library Followings
      - DCMI – Dublin Core Metadata Initiative
      - Functional requirements
      - Library of Congress
  • Library of Congress
      - MARC 21 formats and MARCXML
      - VRA Core -- them
      - METS (Metadata Encoding & Transmission Standard)
      - MIX (NISO Metadata for Images in XML)
      - PREMIS (Preservation Metadata)
      - TextMD (Technical Metadata for Text)
      - ALTO - Technical Metadata for Optical Character Recognition
      - Extended Date/Time Format (EDTF)
  • Thesaurus related
      - NISO Z39.19-2005 (R2010) www.niso.org
      - ISO 2788 - Monolingual (1986) (withdrawn)
      - ISO 5964 - Multilingual (1985) (withdrawn)
      - ISO 5127, Information and documentation - Vocabulary
      - BS 8723 (withdrawn) (basis for revised ISO)
      - ISO 25964 / Part 1 – Thesauri for information retrieval
      - ISO 25964 / Part 2 – Interoperability with other vocabularies
      - Dublin Core DCMI
      - Functional requirements
      - SKOS – the W3C thesaurus standard
      - OWL from W3C
  • Thesaurus and Indexing Standards – ANSI/NISO
      - NISO Z39.19-2005 (R2010) Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies
      - NISO TR02-1997 Guidelines for Indexes and Related Information Retrieval Devices by James D. Anderson
  • New ISO Taxonomy Standard
      - ISO 25964. Thesauri and interoperability with other vocabularies
          - Part 1: Thesauri for information retrieval
          - Part 2: Interoperability with other vocabularies
          - Stella Dextre Clarke, principal author
  • W3C
      - OWL – Web Ontology Language
      - RDF – Resource Description Framework
      - Topic Maps
      - SKOS - Simple Knowledge Organization System
      - SKOS 2
      - DCMI
      - Turtle
      - Which community to serve?
  • Other Relevant ISO & W3C Standards
      - Metadata standards overview: http://www.slis.kent.edu/~mzeng/metadatabasics/completelist.htm
      - Review of SKOS / DCMI / Taxonomy Standards: http://nkos.slis.kent.edu/
  • SKOS
      - SKOS 1 – no synonyms, no polyhierarchies
      - SKOS 2 –
          - Added the above
          - Allows other fields (elements) on request
          - OWL
          - Crosswalk NISO Z39.19, BSI 8723, and ISO 25964
  • Who supports SKOS: Everyone
      - Data Harmony Thesaurus Master
      - Synaptica
      - SmartLogic
      - WordMap
      - PoolParty
      - TopQuadrant
      - Protégé
      - Etc.
  • Standards and pragmatism
      - Use Standards
          - Lead to richer, more informative product
          - Promote interoperability -- Allow you to adopt or adapt other controlled vocabularies
          - Promote predictability
          - Allow repurposing within your organization and by other organizations
      - Follow thesaurus standards for taxonomy
          - Incorporate authority files / final nodes as needed
      - Your taxonomy or thesaurus must meet your needs
  • The Problem – KEEPING UP
      - Many players we know and don’t know
      - Controlled vocabulary standards
      - Groups developing guidelines and standards
          - W3C with SKOS and OWL
          - Governments worldwide developing and mandating taxonomies
      - Communities
          - Increase reuse
          - Mapping interoperability between controlled vocabularies
  • Places to watch
      - Other W3C and ISO areas
      - Support groups
          - Blogs
          - Communities of Practice
          - WSDL – Web Services Description Language
          - DCMI
          - NKOS
          - ISKO
          - Linked Data
  • The New Board Game; Applications; Implementation; The taxonomy
  • Where do I learn more?
      - Online resources
      - Taxonomy books
      - Those standards
      - Organizations
          - SLA Taxonomy Division
          - ISKO
          - NKOS
  • Lists of Taxonomy Resources: Registry?
      - NKOS KOS of KOS
      - SKOS participants – W3C
      - KOS typology – Tudhope
      - TaxoBank.org
      - Kent.edu site – Marcia Zeng
      - Taxonomy Warehouse – Synaptica
      - UMLS - Unified Medical Language System - NIH
  • IT is often Fire, Ready, Aim!
      - Choose the hardware
      - Choose the software
      - Decide on the format
      - Convert the data
      - Fix the data
      - Tack on a taxonomy
      - Ignore the standards
  • Change to Ready, Aim, Fire!
      - Follow the data
      - Look at the data, format, and content
      - Design taxonomy for data
      - Leverage the standards
      - Use taxonomy to tag data
      - Choose search and repository software for data
      - Load the data into the system
      - Keep your eye on the target
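The "use taxonomy to tag data" step could look roughly like the sketch below: each preferred term carries synonyms, and any matching label in the text yields the preferred term as the tag. The mini-thesaurus, the function names, and the simple whole-word matching rule are all assumptions made for illustration; production indexing systems use much richer rules.

```python
import re

# Hypothetical mini-thesaurus: preferred term -> non-preferred synonyms.
thesaurus = {
    "Neoplasms": ["cancer", "tumors", "tumours"],
    "Myocardial infarction": ["heart attack"],
}

def build_lookup(thesaurus):
    """Map every label (preferred or synonym, lowercased) to its preferred term."""
    lookup = {}
    for preferred, synonyms in thesaurus.items():
        for label in [preferred, *synonyms]:
            lookup[label.lower()] = preferred
    return lookup

def tag(text, lookup):
    """Return the sorted preferred terms whose labels occur in the text as whole words or phrases."""
    found = set()
    lowered = text.lower()
    for label, preferred in lookup.items():
        if re.search(r"\b" + re.escape(label) + r"\b", lowered):
            found.add(preferred)
    return sorted(found)

lookup = build_lookup(thesaurus)
print(tag("New tumours were found after a heart attack.", lookup))
```

Because synonyms resolve to one preferred term, documents that say "cancer" and documents that say "tumours" end up under the same tag, which is the point of a controlled vocabulary in search.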
  • For copies of “The Games”
  • Summary
      - We covered the basics
      - We talked about the implementation
      - Application of the terms to your content
      - Search
      - Standards
      - We reinforced the learning with activities
      - You drank from the fire hose
      - Now go hear the case studies of the next two days!
  • Thank you for your attention!
      Marjorie M.K. Hlava, President, Access Innovations, Inc.
      Data Harmony
      mhlava@accessinn.com
      505-998-0800
      www.taxodiary.com - the taxonomy news blog
      mmkhlava = Twitter
      mhlava = Facebook, LinkedIn, eAcademy, Plaxo