Taxonomy Boot Camp: Best Practices Panel
Mary Chitty, MSLS, Library Director & Taxonomist, Knowledge & Information Services
Cambridge Healthtech, Needham MA | www.healthtech.com
firstname.lastname@example.org 781 972-5416 | www.genomicglossaries.com
TODAY’S SCIENCE FICTION CAN BE TOMORROW’S SCIENCE
A Division of Cambridge Innovation Institute
Home-grown SQL database
1991 CEO created structure for
keywords – Still involved with
identifying and creating new terms
2011 Major reorganization into 25 top
2017 Nearly 1,600 concepts and
Database 2.0 in planning
Looking into new software options
Public website www.genomicglossaries.com
2015 Company migrated
to SharePoint intranet
2017 Summer Knowledge &
Information Services portal
Developing resources on using and
training about in-house keywords
All very technical and complex
1999 Started as a small glossary based on content from in-house taxonomy
2000 Launched as website
2001 Renamed Glossaries & Taxonomies
June Reviewed by Science magazine – a nice surprise!
Search works best IF:
1. You know what to call what you're looking for AND
2. You know what you're looking for exists.
Often neither one is certain for my topics. So …
1999, created glossaries on DNA and proteins for new market research products.
Really interested in poly-hierarchical and non-hierarchical relationships
-- not easily curated!
2000, when websites were still new, realized this could be a solution to update and
share my terms. This website could be valuable to others.
My company is in the information overload business, but we get overloaded too.
In 2017, major updates, including Ontologies & Taxonomies.
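The poly-hierarchical and non-hierarchical relationships mentioned above can be illustrated with a minimal sketch. It is not how the in-house database or the website stores terms; the class, the method names and the sample terms are all hypothetical. The point is only that a term may have several broader terms and also "related" links that sit outside the hierarchy, which is part of what makes curation hard.

```python
from collections import defaultdict


class Taxonomy:
    """Toy term store: multiple parents per term plus related-term links."""

    def __init__(self):
        self.broader = defaultdict(set)   # term -> its broader (parent) terms
        self.related = defaultdict(set)   # term -> associated, non-hierarchical terms

    def add_broader(self, term, parent):
        # A term may be filed under more than one parent (poly-hierarchy).
        self.broader[term].add(parent)

    def add_related(self, a, b):
        # Related links are symmetric and carry no parent/child meaning.
        self.related[a].add(b)
        self.related[b].add(a)

    def ancestors(self, term):
        # Collect every broader term reachable by following parent links.
        seen = set()
        stack = [term]
        while stack:
            current = stack.pop()
            for parent in self.broader.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Hypothetical sample terms, for illustration only.
tax = Taxonomy()
tax.add_broader("pharmacogenomics", "genomics")
tax.add_broader("pharmacogenomics", "pharmacology")
tax.add_broader("genomics", "life sciences")
tax.add_related("pharmacogenomics", "biomarkers")
```

Here "pharmacogenomics" sits under two parents at once, and its link to "biomarkers" is kept separate from the hierarchy, so each kind of relationship can be reviewed on its own.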
Because you’re going to make changes
Call projects prototype/s or proof/s of concept as long as possible
Break daunting project revisions and updates into small
Look for quick wins
Maximum effect with limited effort
More complicated projects can
Knowledge and credibility gained by
Seek metrics feedback
anywhere and everywhere
Qualitative and quantitative
Google Analytics for usage metrics
Welcome questions and emails
Look for reviews and accolades
Both NIH through the Big Data to Knowledge Program and the
European Commission with Horizon 2020 have allocated
considerable resources to making data FAIRer.
FAIR Data Principles, 2017: short version, with link to long version
The FAIR Guiding Principles for scientific data management and
stewardship. Sci Data. 2016; 3: 160018. Published online 2016
Mar 15. doi: 10.1038/sdata.2016.18
Take advantage of modularity & reusability. Don’t re-invent the wheel.
Descriptive not prescriptive definitions, if any.
Packaging and labels matter. Taxonomies or ontologies sound sexier than
thesauri or controlled vocabularies
Taxonomies inherently get more and more granular. Keep editing!
Don't try to boil the ocean.
80/20 rule or the Pareto principle
Focus on the 20% of effort that delivers 80% of usage – not the other way around.
Relevance is inherently subjective. What do your users value most?
even after years of experience!
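One way to apply the 80/20 rule above is to look at usage metrics and find the small set of terms that accounts for most of the traffic. The sketch below is an illustration under assumptions: the function name, the 80% threshold parameter and the page-view numbers are all hypothetical, and the counts would in practice come from whatever analytics export is available.

```python
def top_terms_for_share(usage_counts, share=0.8):
    """Return the most-used terms that together reach `share` of total usage."""
    total = sum(usage_counts.values())
    if total == 0:
        return []
    selected = []
    running = 0
    # Walk terms from most to least used until the target share is reached.
    ranked = sorted(usage_counts.items(), key=lambda kv: kv[1], reverse=True)
    for term, count in ranked:
        if running >= share * total:
            break
        selected.append(term)
        running += count
    return selected


# Hypothetical page-view counts per glossary page (illustrative numbers only).
usage = {
    "genomics": 500,
    "proteomics": 350,
    "biomarkers": 80,
    "microarrays": 40,
    "metabolomics": 30,
}
core = top_terms_for_share(usage)
```

With these made-up numbers, two of the five pages cover at least 80% of the views, so those are the ones to keep current first.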
MAINTENANCE AND UPKEEP
Topics morph in new directions & into new disciplines
Interoperability & reusability
Huge challenges still
Balance short term & long term needs & goals
RETURN ON INVESTMENT
Complexity and information overload trade-offs
Out-of-the-Box vs. Configurability vs. Customization
More programming = more $ - Choose software wisely
People can’t buy your products if they don’t know they exist,
or where to find them.
Choose challenging, but not impossible, projects.
Look for allies and buy-in to help make them sustainable
Use metrics and feedback to measure progress, so you
know when you've made some.
Share best practices, lessons learned and ongoing
challenges. Acknowledge issues nobody has resolved
yet, so you don't get discouraged.