Re-engineering Taxonomy Warehouse as an Ontology


Re-engineering Taxonomy Warehouse as an Ontology - presented by Dave Clarke, CEO Synaptica International at Taxonomy Boot Camp, 2011

Published in: Technology, Education

  1. 1. Re-engineering Taxonomy Warehouse as an Ontology. By Dave Clarke, CEO, Synaptica International
  2. 2. What is Taxonomy Warehouse? A free online directory of:
       • Over 650 taxonomies
       • Produced by over 230 publishers
       • In over 40 languages
       • Categorized by 75 subject domains
       • With all available contact information
     plus:
       • Events
       • Standards
       • Bibliography
  3. 3. Why Bother Re-engineering It. People really need this information. The system has to become more adaptable & easier to use.
     LEGACY SITE PROS:
       • Extensive database containing years of primary research
       • Appreciative user base & high volume of visitors
       • Interest in taxonomies growing, not declining
       • Taxonomies are enablers to bigger things, like Web 3.0
     LEGACY SITE CONS:
       • Content becoming out of date
       • Back-end interface does not support access by location-independent editors
       • New content types emerging & many requests to add new data elements
       • Back-end DBMS has complex data model: changes require programmer/DBA support
  4. 4. Which Direction To Go
     Traditional:
       • Build web interface for the editors
       • Extend data model for new data elements & content types
       • Redesign front end
       • Evolve periodically by following standard change request procedure
     Ontological:
       • Adopt existing taxonomy software
       • Decompose & transform legacy data
       • Build a data-dynamic ontology front-end
       • Evolve continuously & on-the-fly
  5. 5. Peek Under the Hood: Legacy Data Model. 25 data tables with over 380 separate data elements. Dozens of DBMS indexes, rules, triggers & stored procedures. [Diagram: Tables X, Y and Z joined by one-to-many (1:∞) relationships, with IDX1 … IDX n and AUX1 … AUX n]
  6. 6. Peek Under the Hood: GUI / Application. Over 250 separate web pages and web application scripts. Over 120 image files. [Diagram: .ASP, .INC, .GIF and .JPG files]
  7. 7. [Diagram: the .ASP, .INC, .GIF and .JPG files shown together with Tables X, Y and Z, IDX1 … IDX n and AUX1 … AUX n]
  8. 8. Step 1: Whiteboard what there is. 6 entity types (facets), 62 properties (fields), 12 relationship types. About 2 days' work.
  9. 9. Step 2: Configure the software to model the whiteboard. Create Facets and their Properties. Create Semantic Relationship Types. About 2 hours' work.
  10. 10. Step 3: Transform & import legacy data. About 6 hours' work.
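The transform step above can be sketched as flattening each legacy record into subject-predicate-object statements. This is only an illustration, not the tool or script used in the project: the CSV layout, the column names and the `rows_to_triples` helper are all assumptions, and the one sample row reuses the Art & Architecture Thesaurus example from the deck.

```python
import csv
import io

# Hypothetical legacy export: one row per taxonomy (column names are assumed).
LEGACY_CSV = """title,publisher,language,category
Art & Architecture Thesaurus,J. Paul Getty Trust,English,Art & Architecture
"""

# Assumed mapping from legacy columns to the relationship types on the whiteboard.
COLUMN_TO_PREDICATE = {
    "publisher": "Published By",
    "language": "Has Language",
    "category": "Is About",
}


def rows_to_triples(csv_text):
    """Turn each non-empty cell of a legacy row into a (subject, predicate, object) triple."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = row["title"].strip()
        for column, predicate in COLUMN_TO_PREDICATE.items():
            value = (row.get(column) or "").strip()
            if value:  # skip empty legacy fields instead of importing blanks
                triples.append((subject, predicate, value))
    return triples


triples = rows_to_triples(LEGACY_CSV)
for triple in triples:
    print(triple)
```

Keeping the column-to-predicate mapping in a plain dictionary means a newly mapped legacy field is a one-line change rather than a new import routine.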
  11. 11. Step 4: Start production. From analysis to production in 3 days.
  12. 12. Step 5: Create the front-end
       • 100% of content stored in a taxonomy management tool
       • 100% of the front-end delivered by a new front-end extension of the taxonomy tool
       • Publish only approved items & control how each content channel is presented
       • Real-time publication to web
  13. 14. A new paradigm: from Records to Graphs
       • Legacy: records are the terminus of information access; they ‘dead-end’
       • Re-engineered: graphs are never-ending; they constantly offer new paths to explore
  14. 15. Graphs & Triples
     [Diagram: Art & Architecture Thesaurus (Taxonomy) linked by Is About to Art & Architecture (Category), by Published By to J. Paul Getty Trust (Publisher), and by Has Language to English (Language)]
       • Links to 2 other taxonomies published by Getty
       • Links to hundreds of other taxonomies in the English language
       • Links to 14 other taxonomies about Art & Architecture
     Subject | Predicate | Object
     A&A Thesaurus | Is About | Art & Architecture
     A&A Thesaurus | Has Language | English
     A&A Thesaurus | Published By | J. Paul Getty Trust
     J. Paul Getty Trust | Publishes | A&A Thesaurus
     J. Paul Getty Trust | Publishes | Union List of Artist Names
     J. Paul Getty Trust | Publishes | Getty Thesaurus of Geographic Names
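A minimal sketch of how triples like those on this slide might be stored and followed, assuming a simple in-memory index. The `TripleStore` class and the `INVERSES` table are hypothetical names for illustration and say nothing about how the taxonomy tool itself works. The data is taken from the slide; the two "Published By" statements for the other Getty vocabularies are the inverse of the "Publishes" rows shown there.

```python
from collections import defaultdict

# Assumed inverse pairing, based on the "Published By" / "Publishes" rows on the slide.
INVERSES = {"Published By": "Publishes"}


class TripleStore:
    """Tiny in-memory triple index keyed by subject (illustrative only)."""

    def __init__(self):
        self.by_subject = defaultdict(set)

    def add(self, subject, predicate, obj):
        self.by_subject[subject].add((predicate, obj))
        inverse = INVERSES.get(predicate)
        if inverse:
            # Store the reverse statement so the link can be followed both ways.
            self.by_subject[obj].add((inverse, subject))

    def objects(self, subject, predicate):
        """Return, sorted, every object linked from subject by predicate."""
        pairs = self.by_subject.get(subject, ())
        return sorted(o for p, o in pairs if p == predicate)


store = TripleStore()
store.add("A&A Thesaurus", "Is About", "Art & Architecture")
store.add("A&A Thesaurus", "Has Language", "English")
store.add("A&A Thesaurus", "Published By", "J. Paul Getty Trust")
store.add("Union List of Artist Names", "Published By", "J. Paul Getty Trust")
store.add("Getty Thesaurus of Geographic Names", "Published By", "J. Paul Getty Trust")

publisher = store.objects("A&A Thesaurus", "Published By")[0]
siblings = [t for t in store.objects(publisher, "Publishes") if t != "A&A Thesaurus"]
print(siblings)
```

Following "Published By" and then "Publishes" from the A&A Thesaurus reaches the 2 other Getty taxonomies mentioned on the slide, which is the sense in which a graph does not dead-end at a record.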
  15. 16. Add Facets and Predicates to the Ontology to roll out new Content Channels
     New content channels: Blogs; Products & Services; Speakers, Authors, Bloggers; Events; Books
     Average time to configure & publish a new content channel: < 1 hour
     Subject | Predicate | Object
     Taxonomy | Is About | Category
     Category | Discussed In | Book
     Book | Authored By | Person
     Person | Blogs At | Blog
     Person | Affiliated With | Organization
     Organization | Provides | Service
     Organization | Produces | Product
     Product | Instance Of | Product Type
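The idea of rolling out a content channel by adding facets and predicates, rather than changing a database schema, might look roughly like the sketch below. The `Ontology` class, its method names and the "Example ..." entities are hypothetical; only the facet and predicate names come from the slide.

```python
class Ontology:
    """Illustrative schema-as-data model: facets and predicates are ordinary entries."""

    def __init__(self):
        self.facets = set()
        self.predicates = {}  # predicate name -> (subject facet, object facet)
        self.entities = {}    # entity name -> facet
        self.triples = []

    def add_facet(self, name):
        self.facets.add(name)

    def add_predicate(self, name, subject_facet, object_facet):
        for facet in (subject_facet, object_facet):
            if facet not in self.facets:
                raise ValueError(f"unknown facet: {facet}")
        self.predicates[name] = (subject_facet, object_facet)

    def add_entity(self, name, facet):
        if facet not in self.facets:
            raise ValueError(f"unknown facet: {facet}")
        self.entities[name] = facet

    def add_triple(self, subject, predicate, obj):
        if predicate not in self.predicates:
            raise ValueError(f"unknown predicate: {predicate}")
        subject_facet, object_facet = self.predicates[predicate]
        if self.entities.get(subject) != subject_facet or self.entities.get(obj) != object_facet:
            raise ValueError("subject or object does not match the predicate's facets")
        self.triples.append((subject, predicate, obj))


ontology = Ontology()
ontology.add_facet("Book")
ontology.add_facet("Person")
ontology.add_predicate("Authored By", "Book", "Person")
ontology.add_entity("Example Book", "Book")      # hypothetical entity
ontology.add_entity("Example Author", "Person")  # hypothetical entity
ontology.add_triple("Example Book", "Authored By", "Example Author")

# Rolling out a "Blogs" channel later: one new facet and one new predicate, no code change.
ontology.add_facet("Blog")
ontology.add_predicate("Blogs At", "Person", "Blog")
ontology.add_entity("Example Blog", "Blog")      # hypothetical entity
ontology.add_triple("Example Author", "Blogs At", "Example Blog")
print(ontology.triples)
```

Because the schema lives in the same structure as the content, extending it is an editorial action rather than a programmer or DBA task, which is the contrast the deck draws with the legacy data model.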
  16. 17. Limitless Discovery. [Diagram: linked BLOG, BLOGGER, BOOK, ORGANIZATION and CATEGORY nodes]
       • Discovery of other providers of KM Consulting services
       • Discovery of related books, companies, products & services
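Discovery of this kind can be thought of as a breadth-first walk outward from whatever the user is looking at. The sketch below assumes triples are held as plain tuples and treats every link as traversable in both directions; the `discover` function and all "Example ..." and "Other ..." entities are made up for illustration.

```python
from collections import defaultdict, deque


def discover(triples, start, max_hops):
    """Return {entity: hop count} for everything reachable from start within max_hops links."""
    adjacency = defaultdict(set)
    for subject, _predicate, obj in triples:
        adjacency[subject].add(obj)
        adjacency[obj].add(subject)  # links are followed in both directions

    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in sorted(adjacency[node]):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    del hops[start]
    return hops


# Hypothetical entities; the predicates follow the deck's examples.
TRIPLES = [
    ("Example Blogger", "Blogs At", "Example Blog"),
    ("Example Book", "Authored By", "Example Blogger"),
    ("Example Blogger", "Affiliated With", "Example Organization"),
    ("Example Organization", "Provides", "KM Consulting"),
    ("Other Organization", "Provides", "KM Consulting"),
]

related = discover(TRIPLES, "Example Blog", max_hops=4)
print(related)
```

Starting from a blog, the walk reaches its blogger, then that person's book and organization, then the service the organization provides, and finally another provider of the same service.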
  17. 18. Key Points
       1. A suitable taxonomy management tool allows new content types and inter-relationships to be rapidly created and instantly published
       2. The ontology never ‘dead-ends’: it lets users explore and discover an unbounded graph of relevant information
       3. Taxonomies are building blocks for ontologies; ontologies are not just a means to access information, they ARE information
  18. 19. Email: Twitter: @AboutTaxonomy Thank You!