Vocabulary Interoperability in the Semantic Web 
Why Linked Data will Transform the Taxonomy Profession 
James R Morris 
Associate Principal Information Scientist 
AstraZeneca 
Oct 21, 2013 
This article was originally published in Taxonomy Times : Bulletin of the SLA Taxonomy Division, issue 16 
(October 2013). Visit http://taxonomy.sla.org to learn more about becoming a member. 
Introduction: why this matters 
The management of controlled vocabularies is a traditional librarian specialty. These vocabularies, these 
networks of concepts, that librarians have created, managed, and used for decades, are more important than 
ever in today’s highly networked and complex information world. (This article uses the term "vocabularies" 
to refer to any type of controlled and structured collection of terms or concepts: thesauri, taxonomies, 
ontologies, glossaries, etc.) 
We live in a networked world--a “linked” world. The World Wide Web is more than a technology; it’s a fact of 
life. No one entity controls the content on the web, and that’s a good thing. In the same way, we can’t control 
all the information and data that we might need to do our jobs. We can’t import it all, we can’t recreate it all, 
and we can't buy it all. More critically, we can't predict what information will help us answer our questions 
until we ask. 
Vocabularies (taxonomies, thesauri, etc.), as every taxonomist knows, are essential for finding, navigating, 
and connecting the information or data we need. But just like the rest of the information we need, we don’t 
control those either (unless we own them). We can’t import them all and we can’t buy them all. And we can’t 
recreate them all--which is often what we end up doing. 
To continue to build our profession's capability in building, managing, and using taxonomies, thesauri and 
other vocabularies, we need an approach that recognizes the reality of today’s highly networked information 
world. This approach includes Linked Data principles and the technology of the Semantic Web, which are 
designed to harness the inherent open-endedness of the World Wide Web. It's the right platform on which to 
continue building our professional expertise in vocabulary management. 
The opportunities that this field presents to our profession are very exciting. We need more discussions about 
Linked Data vocabulary management among librarians and information scientists. To an equal extent, the 
values, skills, and experiences of librarians and information scientists are needed to make this emerging field 
successful. 
Background: The “hyper-connected enterprise”
The phrase “hyper-connected enterprise” was highlighted in a 2007 Gartner report, “Gartner Predicts 2008: 
CIO Success Factors” (http://www.gartner.com/id=564014. License required). Many organizations are 
moving towards that model and the progression usually looks like this: 
1. One Enterprise. It starts with a very self-contained organization, with ready access to all the information 
it needs. 
2. Connected Enterprise. Then an increasing number of suppliers, outsourced providers, and new partners 
become part of how the organization operates day-to-day. 
3. Hyper-connected Enterprise. Then, the relationship to these providers and partners begins to evolve. 
Positions of the players shift, they come and go more rapidly, and boundaries between the center and 
periphery begin to blur. 
To use Gartner's language, “Ecosystems widen, Complexity rises, Coordination increases.” The bottom line is 
that coordination is the key--not control. Of course your own information must be well controlled, but you 
cannot succeed with that alone. 
As mentioned earlier, this is about more than a hyper-connected enterprise--it’s about a hyper-connected 
world. It’s an open webbed world; we can’t predict what information we’ll need to answer--or even ask--our 
questions. But in this world how do we effectively combine resources and work with others to ultimately 
achieve our goals? 
Vocabulary Interoperability and the Semantic Web 
The Semantic Web is about building a data architecture that enables us to ask questions and solve business 
problems across a heterogeneous information landscape extending beyond the traditional boundaries of one 
organization. Traditionally, mainly professional indexers and searchers have used controlled vocabularies. 
But today they are often used in a text-analytics or informatics context: 
 Adding structure to unstructured text: essentially, tagging unstructured content with vocabulary term 
identifiers by looking for strings that match a wide array of synonyms, and by other more sophisticated 
algorithms. 
 Adding additional structure to already indexed content so that we can organize information in ways 
specific to our requirements. 
 Adding vocabulary metadata to the content or to a search index to connect information across 
repositories of information by connecting the concepts embedded in that information. 
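As a rough illustration of the first bullet, the simplest form of tagging can be sketched as a synonym lookup. This is a minimal Python sketch under stated assumptions, not a production text-analytics method: the `tag_text` function, the tiny `SYNONYMS` table, and the whole-phrase matching are all hypothetical. The concept identifier and labels are borrowed from the MeSH example discussed later in this article.

```python
import re

# Hypothetical synonym table: concept identifier -> labels that should trigger it.
SYNONYMS = {
    "mesh:D001749": [
        "Urinary Bladder Neoplasms",
        "Cancer of Bladder",
        "Bladder Neoplasm",
    ],
}


def tag_text(text, synonyms):
    """Return the sorted concept identifiers whose labels occur in the text."""
    found = set()
    for concept_id, labels in synonyms.items():
        for label in labels:
            # Whole-phrase, case-insensitive match. Real systems use far more
            # sophisticated algorithms; this only shows the basic idea.
            pattern = r"\b" + re.escape(label) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                found.add(concept_id)
    return sorted(found)


tags = tag_text("A retrospective study of bladder neoplasm outcomes.", SYNONYMS)
print(tags)
```

The text is tagged with the concept identifier rather than with the matched string, so content tagged through different synonyms still ends up connected to the same concept.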
Let’s review the hyper-connected enterprise again, this time in the context of controlled vocabularies: 
1. One Enterprise. For one organization with good control of its information, an enterprise taxonomy 
or set of enterprise vocabularies can be created and its usage enforced. A governance structure for
maintaining it can be established. Enterprise taxonomies and master data management are still 
important--companies do need to manage their own information well. 
2. Connected Enterprise. Using external information sources means using their native vocabularies. 
Delivering information to outsourced providers and/or regulatory agencies requires the use of certain 
vocabularies. For example, in the pharmaceutical and healthcare field, MedDRA, ICD-9-CM, ICD-10, and 
SNOMED are a few common ones. In the case of a merger or strategic alliance, another company 
might have its own enterprise taxonomy or product thesaurus that can be integrated with yours 
without too high a cost. 
3. Hyper-connected Enterprise. Other partners, universities, companies, divisions, and subsidiaries 
connected to the enterprise are spread across the globe, and they come and go with increasing 
frequency. Embarking on a new taxonomy integration project for every partner that appears in the 
network is not feasible. 
 A small biotech might say, “Enterprise vocabulary? We’re too small – we just use terms that we 
need; it’s not complicated. You're too bureaucratic.” 
 A large corporate partner or regulatory agency might say, “Over here, yes we follow standard 
vocabularies. They’re not the same as your standards, but…” 
 An information supplier or database provider could say, “Our proprietary databases use specially 
designed vocabs optimized for our software. You should really be using our software, you know. 
We have the information you need.” 
Traditional solutions to taxonomy development and management fall short of meeting the challenges of the 
hyper-connected enterprise. Semantic Web principles and technology offer a new approach to the 
management of our thesauri, taxonomies, and ontologies--one of vocabulary interoperability. Treating 
vocabularies as part of a network--just like the web itself--is needed. A network of controlled vocabularies to 
facilitate data interoperability and query expansion across disparate structured and unstructured, internal 
and external, information is part of the solution. A “network” implies that the vocabularies can be rapidly 
adopted, reused, mapped, and versioned using standardized, sustainable technologies and approaches. 
Principles 
The following principles were drafted in my organization. They are presented as points for discussion within 
the librarian/taxonomist community. 
 Don’t create a new vocabulary where one already exists. The preference is for authoritative, 
trustworthy, well-managed sources. Apply standard library criteria used to evaluate resources: 
authority, accuracy, currency, point of view (objectivity), coverage, relevance, and format. 
 Extend those sources if necessary, but even then they should be extended by making associations 
with other authoritative sources. Whenever possible leverage cross-vocabulary mappings already 
available publicly.
 When necessary, extend or map an external vocabulary with internal or bespoke terminology. This 
could include mapping obvious synonyms between vocabularies, or could entail more nuanced 
relationships. For example, mapping a published vocabulary to a list of values from a proprietary or 
internal database. By strategically connecting the concepts, the data is connected. 
 Do not corrupt authoritative vocabulary sources. It will always be possible to identify the source 
vocabulary, including what version, in the network. It's just like a library. A page is not taken out of a 
book and pasted into another; sections are not crossed out here and added there to the point 
where the books can’t be put back together again--that shouldn’t be done with vocabularies either. 
However, using Linked Data standards, vocabularies can be repurposed. For example, perhaps a 
deeply hierarchical taxonomy just needs to be rendered as a flat list, or only some concepts are 
relevant to the purpose at hand. Linked Data can enable these types of reuse without corrupting the 
source. 
 Obtain vocabularies directly from the authority that produces them whenever practical. Vocabularies 
will be managed as linked data. Preference is for the “SKOS” standard. SKOS stands for "Simple 
Knowledge Organization System" and is a lightweight W3C data standard that can be applied to any 
controlled vocabulary. Focusing on this simple standard allows us to standardize curation, 
management, and delivery processes. 
Capabilities 
The following is a proposed set of linked data vocabulary management capabilities. They are presented as an 
impetus for further discussion within the librarian/taxonomist community. 
 Vocabulary Mapping. Reusing, creating, and maintaining the relationships between vocabularies 
holds the power of the “vocabulary network” at the heart of Linked Data Vocabulary Management. 
Processes and technology need to be developed for lightweight, sustainable, semantics-based 
vocabulary mapping. Challenges include: 
o Mapping two or more vocabularies together using Semantic technology. 
o Facilitating the review of automatic mapping; approving, rejecting, adjusting, and amending 
mapping. 
o Applying traditional taxonomy heuristics in a Linked Data environment as quality measures. 
o Leveraging the mapping already done in published vocabularies. 
o Publishing mapped “bridging” vocabularies to customers. 
 Vocabulary Slicing. Sometimes a vocabulary consumer (system or project) needs only a subset of a 
vocabulary that is relevant to their needs; for example, only the terms that are of interest to a 
particular project or organizational unit. Rather than create a duplicate, we need to identify terms or 
branches of relevance, and make those available--all while retaining their link back to the authority,
so that updates to the authority can continue to be leveraged (new synonyms, new mappings, new 
child terms, new notes). 
 Vocabulary Version Management. As the focus is the reuse of already published vocabularies, 
successfully managing the impact of new versions on the network is critical. For example: 
o Comparing a new version to an existing version. 
o Identifying impact of vocabulary changes on the network; that is, to other vocabularies that 
map to it. 
o Identifying impact of vocabulary changes on consuming systems or related datasets. Then, 
once we identify the impact, how do we address it? 
 Vocabulary modeling and conversions. The vocabularies in the network need to be accessible as 
Linked Data, using the data standards of the Semantic Web. And, until external suppliers deliver or 
make their vocabularies available as linked data, we will need to be extremely proficient in 
converting them from their native formats. 
 Vocabulary Inventory. Managing a network of vocabularies requires a sophisticated inventory. Ideally 
managed as Linked Data itself, the inventory can directly inform queries using the vocabulary 
network. For example, identify all the vocabularies that cover concept X, the datasets that reference 
those vocabularies, and how to access them. There are developing standards in the Semantic Web 
specifically for managing these types of inventories. 
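Two of these capabilities, slicing and version comparison, can be sketched in a few lines if a vocabulary is modeled as a set of subject-predicate-object statements. This is only an illustrative Python sketch: the `ex:` identifiers, the labels, and the `slice_branch` and `compare_versions` helpers are hypothetical, and a real implementation would work against an RDF store rather than in-memory tuples.

```python
# A vocabulary modeled as a set of (subject, predicate, object) statements.
# The ex: identifiers and labels are made up for illustration.
V1 = {
    ("ex:disease", "skos:prefLabel", "Disease"),
    ("ex:neoplasm", "skos:broader", "ex:disease"),
    ("ex:neoplasm", "skos:prefLabel", "Neoplasm"),
    ("ex:bladder_neoplasm", "skos:broader", "ex:neoplasm"),
    ("ex:bladder_neoplasm", "skos:prefLabel", "Bladder Neoplasm"),
    ("ex:infection", "skos:broader", "ex:disease"),
    ("ex:infection", "skos:prefLabel", "Infection"),
}


def slice_branch(statements, root):
    """Return the root concept plus every concept below it via skos:broader."""
    selected = {root}
    changed = True
    while changed:
        changed = False
        for s, p, o in statements:
            if p == "skos:broader" and o in selected and s not in selected:
                selected.add(s)
                changed = True
    return selected


def compare_versions(old, new):
    """Return (added, removed) statements between two versions."""
    return new - old, old - new


# Slicing: only the identifiers are selected, nothing is copied or edited.
branch = slice_branch(V1, "ex:neoplasm")
print(sorted(branch))

# Version comparison: a hypothetical second release of the same vocabulary.
V2 = (V1 - {("ex:infection", "skos:prefLabel", "Infection")}) | {
    ("ex:infection", "skos:prefLabel", "Infectious Disease"),
    ("ex:bladder_neoplasm", "skos:altLabel", "Tumor of Bladder"),
}
added, removed = compare_versions(V1, V2)
print(sorted(added))
print(sorted(removed))
```

Because the slice holds only concept identifiers, labels and notes are still read from the authority, so a new synonym published there reaches the slice without any re-editing.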
Vocabulary Mapping using Linked Data--in more detail. 
For this article, just one of these capabilities, Vocabulary Mapping, will be explored in more detail. How, and 
why, would two vocabularies be brought together using Linked Data? First, a few Semantic Web 
fundamentals need to be covered. 
The Semantic Web is a web of data--not pages. 
Pages need to be read by a person, or a program mimicking a person, in order to obtain the meaning of the 
data on the page. For example, if we want to learn the times that local bars offer “happy hour”, we need to find 
the web page for each bar, scan pages, follow links, and finally read content that says something like “we have 
happy hour M-F from 4-7”. It doesn’t matter that that information might originally be in a database that builds 
the page, because to interpret the information we need to read the page. Linked Data challenges us to apply the 
principles of the web, unique identification of resources and links between those resources, to the data itself. 
URIs, RDF, and Ontologies 
 URIs. URI stands for “uniform resource identifier”. URLs are “uniform resource locators”, a type of 
URI used to uniquely identify pages and locate them. You "point" your browser to them to locate the
unique page. In the semantic web, things in the real world are also given URIs. In fact they use the 
same syntax. For example, a local bar, The Artful Dodger, might have a webpage with a URL. But in 
the semantic web the bar itself--as a unique thing, not a webpage--would have a URI. The URI could 
be used to differentiate it from the Dickens character of the same name, which might also have a URI. 
 RDF. RDF stands for Resource Description Framework. RDF is used to encode the most basic 
statement about a thing in the form of what is called a "triple": subject--predicate--object, which is 
often abbreviated S--P--O. For example: 
o The Artful Dodger (subject), Is a (predicate), Bar (object). 
o The Artful Dodger (subject), Has a (predicate), Happy Hour (object). 
 Ontologies. Ontologies are collections of information about classes of things and their relationships. 
For example, an ontology might state that a Bar (subject) is a Type of (predicate) Restaurant (object). 
So if we have data that states that the Artful Dodger (S), Is a (P), Bar (O), then, by referencing our 
ontology, we can infer that the Artful Dodger (S) is a (P), restaurant (O). Inference is a very important 
concept in the Semantic Web that will be covered more closely in the next example. 
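The triple and inference ideas above can be sketched in plain Python. This is a toy model under stated assumptions, not how an RDF engine works: the `ex:` identifiers, the predicate names `is_a` and `type_of`, and the `infer_types` function are hypothetical stand-ins. In actual RDF, "Is a" corresponds to `rdf:type` and "Type of" to `rdfs:subClassOf`.

```python
# Triples are (subject, predicate, object) tuples; the ex: identifiers are hypothetical.
data = {
    ("ex:ArtfulDodger", "is_a", "ex:Bar"),
    ("ex:ArtfulDodger", "has_a", "ex:HappyHour"),
}
ontology = {
    ("ex:Bar", "type_of", "ex:Restaurant"),
}


def infer_types(data, ontology):
    """Infer extra is_a triples by following type_of links in the ontology."""
    inferred = set()
    # Start from every (thing, class) pair stated directly in the data.
    frontier = {(s, o) for s, p, o in data if p == "is_a"}
    while frontier:
        thing, cls = frontier.pop()
        for s, p, o in ontology:
            if p == "type_of" and s == cls:
                triple = (thing, "is_a", o)
                if triple not in data and triple not in inferred:
                    inferred.add(triple)
                    # Keep climbing in case the parent class has a parent too.
                    frontier.add((thing, o))
    return inferred


inferred = infer_types(data, ontology)
print(inferred)
```

The inferred triple is returned separately from `data`, so the original statements are left untouched.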
The following is how these fundamentals apply to vocabularies, and specifically to the capability of mapping 
two vocabularies together. 
Concepts are Things 
Just like The Artful Dodger is a thing, the concept of a “Bar” is also a thing. Vocabularies, taxonomies, thesauri, 
etc. are all collections of concepts. 
Concepts have URIs 
As discussed above, things are identified in the semantic web with URIs. A vocabulary is about concepts, their 
properties, and their relationships. 
SKOS is for vocabularies 
SKOS stands for “Simple Knowledge Organization System”, and is a very popular Semantic Web ontology. It is 
a standardized set of classes and properties for representing “knowledge organization systems”, a.k.a. 
vocabularies. Unlike other thesaurus and taxonomy standards, SKOS is Linked Data, which means it 
allows for the management of vocabularies in a distributed, linkable, interoperable way; in other words, as 
part of the open-ended web of data. The following are some of the most common SKOS elements: 
 skos:Concept and skos:ConceptScheme. These are what are called "classes" in the Semantic Web. Every 
"term" in a vocabulary is of the class skos:Concept. A skos:ConceptScheme designates a particular 
thesaurus, or part of a thesaurus. Concepts and ConceptSchemes are both things, so each is 
represented by a unique URI. Concepts can belong to one or more ConceptSchemes. 
 skos:prefLabel, skos:altLabel, and skos:definition. These are used to designate fundamental properties 
of the classes like skos:Concept. skos:prefLabel is the preferred name for a concept, and skos:altLabel is 
any synonym or other type of entry term, e.g. “see from”, “used for”, etc. An important feature of 
SKOS as well as most Semantic Web ontologies is that the classes and properties are extensible. For
example, we can create a sub-property, unique to our context. We might want a specific type of 
skos:altLabel to differentiate acronyms from other synonyms. In this case, we would create a new 
property for acronym, but make it a sub-property of skos:altLabel. In this way, we can identify the 
acronyms specifically if we need to, but they can also be referenced as generic synonyms. This is 
particularly useful when working across multiple vocabularies where custom properties can all 
resolve to generic SKOS standards. 
 skos:broader, skos:narrower, skos:related. These are the essential properties that serve as the building 
blocks for a hierarchical or ontological structure, within a single vocabulary. 
 skos:exactMatch, skos:closeMatch. All of the above properties are intended for use within a particular 
vocabulary or skos:ConceptScheme. These matching relationships, on the other hand, are used to 
relate Concepts across vocabularies. For example, the concept for Bladder Cancer in one vocabulary 
might be an exactMatch to the concept for Bladder Neoplasm in another vocabulary. 
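The sub-property idea described above (an acronym property that still resolves to a generic synonym) can be sketched as follows. This is an illustrative Python sketch only: the `ex:acronym` property, the `ex:nlm` concept, its labels, and the `values` helper are hypothetical. In RDF the relationship itself would be stated with `rdfs:subPropertyOf`.

```python
# Hypothetical custom property declared as a sub-property of skos:altLabel.
SUBPROPERTY_OF = {"ex:acronym": "skos:altLabel"}

# Made-up concept used only to illustrate the lookup.
triples = {
    ("ex:nlm", "skos:prefLabel", "National Library of Medicine"),
    ("ex:nlm", "ex:acronym", "NLM"),
    ("ex:nlm", "skos:altLabel", "U.S. National Library of Medicine"),
}


def values(triples, subject, prop, subproperty_of):
    """Values of prop for subject, including values of its sub-properties."""

    def resolves_to(p):
        # Walk up the sub-property chain until prop is found or the chain ends.
        while p is not None:
            if p == prop:
                return True
            p = subproperty_of.get(p)
        return False

    return sorted(o for s, p, o in triples if s == subject and resolves_to(p))


acronyms = values(triples, "ex:nlm", "ex:acronym", SUBPROPERTY_OF)
synonyms = values(triples, "ex:nlm", "skos:altLabel", SUBPROPERTY_OF)
print(acronyms)
print(synonyms)
```

Asking for `ex:acronym` returns only the acronym, while asking for `skos:altLabel` returns the acronym together with the ordinary synonym.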
Now that some basics about Linked Data and SKOS have been covered, we’ll look at a specific example of 
using SKOS to deliver the core capability of Mapping. Here are two concepts from two popular biomedical 
thesauri: the National Cancer Institute (NCI) Thesaurus or NCIt, and the National Library of Medicine’s 
Medical Subject Headings or MeSH, encoded in SKOS: 
NCI Thesaurus – “Bladder Neoplasm” 
URI = http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C2901 
(Shorthand: ncit:C2901) 
Subject (S)    Predicate (P)      Object (O)
ncit:C2901     a                  skos:Concept
               skos:prefLabel     “Bladder Neoplasm”
               skos:altLabel      “Tumor of Bladder”
               skos:definition    “A benign or malignant…”
               skos:broader       ncit:C2900 (“Bladder Disorder”)
               skos:broader       ncit:C3431 (“Urinary System Neoplasm”)
MeSH – “Urinary Bladder Neoplasms” 
URI = http://purl.bioontology.org/ontology/MSH/D001749 
(Shorthand: mesh:D001749) 
Subject (S)     Predicate (P)      Object (O)
mesh:D001749    a                  skos:Concept
                skos:prefLabel     “Urinary Bladder Neoplasms”
                skos:altLabel      “Cancer of Bladder”
                skos:definition    “Tumors or cancer of the URINARY BLADDER…”
                skos:broader       mesh:D014571 (“Urogenital Neoplasms”)
                skos:broader       mesh:D001745 (“Urinary Bladder Diseases”)
Once we can uniquely identify each of these concepts with URIs we can record some basic things about them: 
 They're both concepts. 
 They both have preferred labels. 
 They each have synonyms or entry terms (many more, in fact, than just those listed here) 
 They have definitions. 
 They have one or more parents because both of these thesauri are poly-hierarchical. 
A couple of critical points here: If we look at the full record for this concept in MeSH 
(http://www.ncbi.nlm.nih.gov/mesh/68001749), it does in fact have the NCIt preferred label, “Bladder 
Neoplasm”, as a synonym. However, NCIt does not have the MeSH preferred name as a synonym. Note 
especially that alternate labels of “Cancer of the Bladder” in MeSH and “Tumor of the Bladder” in NCIt are 
unique to each vocabulary. 
But what if we know that mesh:D001749 is the same concept as ncit:C2901? We map those two concepts 
together by creating a new “triple” that creates a relationship between the two URIs. 
Subject (S)    Predicate (P)      Object (O)
ncit:C2901     skos:exactMatch    mesh:D001749
So now we've encoded the relationship that the MeSH concept labeled “Urinary Bladder Neoplasms” equals the 
NCIt concept labeled “Bladder Neoplasm”. How do we realize the power of this relationship? What if we’re 
searching two carefully indexed repositories--one indexed using MeSH and another using NCIt--and we want 
to produce a joined-up view of these two repositories, leveraging the power of the two separate indexing 
methods? What if we’re searching across multiple repositories and we want as rich a synonym set for 
“Bladder Cancer” as we can get? As we see in this example--by using only one of these vocabularies, we could 
end up missing things.
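The "joined-up view" scenario can be sketched as follows. This is a minimal Python sketch under stated assumptions: the two repositories, their record identifiers, and the `equivalents` and `search` helpers are hypothetical, and only the single mapping triple comes from the example above. The sketch follows direct skos:exactMatch links in either direction and does not chain mappings across more than one hop.

```python
# Mapping triples live in their own set, separate from either source vocabulary.
MAPPINGS = {("ncit:C2901", "skos:exactMatch", "mesh:D001749")}

# Hypothetical repositories: record identifier -> concept the record was indexed with.
REPO_NCIT = {"doc-1": "ncit:C2901", "doc-2": "ncit:C2900"}
REPO_MESH = {"pmid-10": "mesh:D001749", "pmid-11": "mesh:D001745"}


def equivalents(concept, mappings):
    """Return the concept plus everything directly linked to it by skos:exactMatch."""
    same = {concept}
    for s, p, o in mappings:
        if p == "skos:exactMatch":
            if s == concept:
                same.add(o)
            elif o == concept:
                same.add(s)
    return same


def search(concept, mappings, *repositories):
    """Find records indexed with the concept or with any of its exact matches."""
    wanted = equivalents(concept, mappings)
    return sorted(
        record
        for repo in repositories
        for record, indexed_with in repo.items()
        if indexed_with in wanted
    )


hits = search("ncit:C2901", MAPPINGS, REPO_NCIT, REPO_MESH)
print(hits)
```

Starting the search from the MeSH identifier instead returns the same records, because the mapping is read in both directions.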
Here’s how we use Linked Data to do really interesting things. Now we’ll create a rule that builds upon our 
new matching relationship to bring more intelligence to our vocabularies. For this we use SPARQL. SPARQL 
is akin to SQL for relational databases in that it is the language with which to query and manipulate Linked 
Data in the Semantic Web. 
SPARQL Rule: 
CONSTRUCT ?s skos:altLabel ?o . 
WHERE ?s skos:exactMatch ?match . 
      ?match skos:prefLabel | skos:altLabel ?o .
(Note that this is not executable SPARQL; the syntax has been simplified for the clarity of this example.) 
Creating these rules is key. When we hear experts talk about making your data “more intelligent”--this is what 
they mean! This three-line SPARQL command takes that new matching relationship and reasons that the 
preferred labels or synonyms in the matched MeSH concept can be inferred as additional synonyms (marked 
“inferred” below) in the NCIt concept. It would do this for all the concepts in the NCIt that have been matched to 
MeSH terms. 
Subject (S)    Predicate (P)     Object (O)
ncit:C2901     skos:prefLabel    “Bladder Neoplasm”
               skos:altLabel     “Urinary Bladder Neoplasms” (inferred)
               skos:altLabel     “Cancer of the Bladder” (inferred)
Those terms “reasoning” and “inferred” are very important. First we’re making a rule that a computer can use 
to infer additional information. Second--that’s all this has to be--an inference. We are not necessarily hard-coding 
additional vocabulary data. We are not manipulating the source vocabularies at all. Every inferred 
triple can be managed completely separately from either source vocabulary! 
A fair question to ask is, "What about meta-thesauri and large vocabulary mapping efforts? Aren't they 
intended to do the same thing? For example, isn’t that what the National Cancer Institute's Metathesaurus and 
the National Library of Medicine's UMLS (Unified Medical Language System) initiatives are for?” Yes, to a 
point. However: 
1. MeSH and NCIt are just two familiar examples. What if we try to map vocabularies or search across 
repositories that use less standard vocabularies: proprietary, internal, or custom vendor vocabs? Or 
what if we need to search and view information in an interface using the richness of NCIt, but against a 
repository that uses something different--a folksonomy perhaps, or just values selected from a drop-down 
list? 
2. What if the mapping we wanted to do was proprietary to a project or a company? Every taxonomist 
will agree that how concepts are arranged and mapped to each other has power. However, what if 
that mapping was relevant today--but not tomorrow? How can this be accomplished by leveraging 
these large published vocabularies without corrupting them, or worse--by building yet another one? 
It is important to be clear that identifying those initial matches (and they are not always as straightforward as 
an “exactMatch”) is the hard intellectual work of vocabulary management that is, and will continue to be, a 
hallmark of our profession. The semantic web doesn’t solve the problem of knowing which terms to match. 
But what it does do--unlike any other taxonomy, thesaurus, vocabulary management system--is give us the 
standards and tools to encode those matches and create rules that computers can use to make additional 
inferences. And very importantly, it does not require that we merge or integrate or otherwise corrupt original 
authoritative sources. So this approach doesn’t make it easy to manage vocabularies in a hyper-connected 
world--that’s not the point. The point is it makes it possible; possible, sustainable and extensible. 
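The matching rule discussed earlier, and the point that inferences stay separate from the sources, can be sketched in Python. This is an illustration under stated assumptions rather than an actual SPARQL engine: the `infer_alt_labels` function and the in-memory sets are hypothetical, while the identifiers and labels are the ones from the NCIt and MeSH tables above.

```python
# Each source vocabulary, the mappings, and the inferences are separate sets.
NCIT = {
    ("ncit:C2901", "skos:prefLabel", "Bladder Neoplasm"),
    ("ncit:C2901", "skos:altLabel", "Tumor of Bladder"),
}
MESH = {
    ("mesh:D001749", "skos:prefLabel", "Urinary Bladder Neoplasms"),
    ("mesh:D001749", "skos:altLabel", "Cancer of Bladder"),
}
MAPPINGS = {("ncit:C2901", "skos:exactMatch", "mesh:D001749")}

LABEL_PROPERTIES = {"skos:prefLabel", "skos:altLabel"}


def infer_alt_labels(mappings, *vocabularies):
    """For each (s, skos:exactMatch, match), infer (s, skos:altLabel, label)
    for every preferred or alternate label of the matched concept."""
    all_triples = set().union(*vocabularies)
    inferred = set()
    for s, p, match in mappings:
        if p != "skos:exactMatch":
            continue
        for subject, prop, label in all_triples:
            if subject == match and prop in LABEL_PROPERTIES:
                inferred.add((s, "skos:altLabel", label))
    return inferred


INFERRED = infer_alt_labels(MAPPINGS, NCIT, MESH)
for triple in sorted(INFERRED):
    print(triple)
```

Neither `NCIT` nor `MESH` is modified; dropping the mapping and rerunning the function simply makes the inferred synonyms disappear, which is one way a project-specific or temporary mapping could be handled.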
Linked Data Vocabulary Management Roles 
In thinking about how to make this all work and to develop and deliver vocabulary services based on these 
new approaches, the following roles could be assigned. 
Vocabulary Developer. Uses Semantic Web technologies and informatics toolkit to ingest, model, bridge, 
slice, quality check, and version-control vocabularies. This would require some programming skills, and in-depth 
knowledge of the technology. 
Vocabulary Manager. Provides hands-on management, development and enhancement of vocabularies. 
They would have a deep understanding of vocabulary processes, the internal/external vocabulary landscape, 
and best practices, often with a subject specialty. This role is most like our taxonomist role of today. 
Vocabulary Administrator. Provides first-line support for customers’ access to, use of, and exploitation of 
available vocabularies. They are accountable for management of service processes, documentation, and 
inventories. 
Conclusion: The Semantic Web needs us 
The Promise and the Reality. Linked Data approaches are the sustainable way to work with vocabularies in 
an increasingly federated and linked information world. The reality is that not enough vocabularies are 
available, from the authority that produced them, natively in SKOS or another form of RDF. This needs to 
change. But we can start to instill these practices where we work. This field is rapidly developing; it needs 
librarians and information scientists with their deep knowledge of the importance and power of controlled 
vocabularies and the information sources that use them. 
This is not the future--it’s now. Leading organizations in the US, like OCLC and the Library of Congress, are 
adopting these standards. This is even more true in Europe. The field of biomedical ontologies is very exciting 
right now, and firmly moving toward a Linked Data paradigm. The nature of business and of information 
demands new approaches. Even taking the Semantic Web’s promise of universally sharing data across the 
world out of the picture, SKOS and RDF--even just applied to vocabularies--are an opportunity on their own. These
vocabularies can be published in whatever form people or systems need them in. However, to manage a web 
of vocabularies requires tools and techniques designed for the web. 
This is not easy. Breaking down information, including vocabularies, into its most basic constructs is what 
RDF does. It is elegant in design, but can quickly become very complex in execution. Writing the SPARQL 
queries that do complete transformations of vocabularies, manage versions, etc. can become very 
sophisticated. But isn't it a challenge worth tackling? Aren’t librarians and information scientists poised to 
make this happen? 
SKOS is a great example of a simplified model with practical application. RDF can be used to model ontologies 
in linguistically accurate, but extraordinarily complex, ways as well. The discussions at a Bioontology or 
Semantic Web conference can get very deep. But the beauty of RDF is that those complex models can be 
transformed, again without corrupting the source, into a simple model like SKOS. Guarding against 
unnecessary complexity is something that librarians are especially good at. 
Managing unconnected silos of information, including taxonomies, will not harness the power of a networked, 
collaborative world. Vocabulary interoperability, enabled by the Linked Data principles of the Semantic Web, 
will--if further developed as a discipline within our profession. 
Acknowledgements 
This article was derived from a presentation given by James R Morris at the Special Libraries Association’s 
2013 PHT and BIO Division Spring Meeting, April 14-16, 2013. 
The author thanks: 
 The AstraZeneca R&D Vocabulary Team and Linked Data Community of Practice members: 
Courtland Yockey, Sorana Popa, Rob Hernandez, Dana Crowley, Tom Plasterer, Mike Westaway, 
Kerstin Forsberg, Johan Turnqvist, Tracy Wolski, Dan Ringenbach, Rajan Desai, Pete Edwards, Simon 
Rakov, Jeff Saxe, and Ian Dix. 
 Vendors and Advisors: 
o Scott Henninger, Bob Ducharme – TopQuadrant 
o Dean Allemang – Working Ontologist 
References 
• Dean Allemang, Jim Hendler. Semantic Web for the Working Ontologist, 2nd edition. Morgan 
Kaufmann, 2011.
The first few chapters of this book provide one of the best overviews of the semantic web that you’ll find, 
and are very readable. It also has a whole chapter on SKOS. Available through ScienceDirect and Safari. 
• Bob Ducharme. Learning SPARQL, 2nd edition. O’Reilly Media, 2013. 
Invaluable to have on hand when writing SPARQL. 
• World Wide Web Consortium (W3C). SKOS Simple Knowledge Organization System Primer. 
http://www.w3.org/TR/skos-primer 
Straight from the source, it is technically oriented, but once you understand the basics, this is 
essential reading. 
• National Center for Biomedical Ontology. BioPortal. http://bioportal.bioontology.org . 
A phenomenally deep resource providing access to many biomedical vocabularies, queryable through 
SPARQL. The NCBO also has several webinars available. 
• Lee Harland, et al. “Empowering industrial research with shared biomedical vocabularies” Drug 
Discovery Today, Volume 16, Issues 21-22, November 2011, Pages 940-947, ISSN 1359-6446, 
10.1016/j.drudis.2011.09.013 
Excellent overview of the challenges and opportunities for using biomedical vocabularies in 
pharmaceutical research. 
• Kerstin Forsberg. Linked Data and URI:s for Enterprises (Blog). http://kerfors.blogspot.com. 
A thought-leader and evangelist for linked data, especially in the clinical domain.

Vocabulary interoperability in the semantic web james r morris

The approach proposed here includes Linked Data principles and the technology of the Semantic Web, which are designed to harness the inherent open-endedness of the World Wide Web. It's the right platform on which to continue building our professional expertise in vocabulary management.

The opportunities that this field presents to our profession are very exciting. We need more discussions about Linked Data vocabulary management among librarians and information scientists. To an equal extent, the values, skills, and experiences of librarians and information scientists are needed to make this emerging field successful.

Background: The “hyper-connected enterprise”
The phrase “hyper-connected enterprise” was highlighted in a 2007 Gartner report, “Gartner Predicts 2008: CIO Success Factors” (http://www.gartner.com/id=564014; license required). Many organizations are moving towards that model, and the progression usually looks like this:

1. One Enterprise. It starts with a very self-contained organization, with ready access to all the information it needs.
2. Connected Enterprise. Then an increasing number of suppliers, outsourced providers, and new partners become part of how the organization operates day-to-day.
3. Hyper-connected Enterprise. Then the relationship to these providers and partners begins to evolve. Positions of the players shift, they come and go more rapidly, and boundaries between the center and periphery begin to blur. To use Gartner's language, “Ecosystems widen, Complexity rises, Coordination increases.”

The bottom line is that coordination is the key--not control. Of course your own information must be well controlled, but you cannot succeed with that alone.

As mentioned earlier, this is about more than a hyper-connected enterprise--it’s about a hyper-connected world. It’s an open webbed world; we can’t predict what information we’ll need to answer--or even ask--our questions. But in this world, how do we effectively combine resources and work with others to ultimately achieve our goals?

Vocabulary Interoperability and the Semantic Web

The Semantic Web is about building a data architecture that enables us to ask questions and solve business problems across a heterogeneous information landscape extending beyond the traditional boundaries of one organization.

Traditionally, controlled vocabularies have been used mainly by professional indexers and searchers.
But today they are often used in a text-analytics or informatics context:

• Adding structure to unstructured text; essentially tagging unstructured content with vocabulary term identifiers, by looking for strings that match a wide array of synonyms, and by other more sophisticated algorithms.
• Adding additional structure to already indexed content so that we can organize information in ways specific to our requirements.
• Adding vocabulary metadata to the content or to a search index to connect information across repositories of information by connecting the concepts embedded in that information.

Let’s review the hyper-connected enterprise again, this time in the context of controlled vocabularies:

1. One Enterprise. For one organization with good control of its information, an enterprise taxonomy or set of enterprise vocabularies can be created, and their usage enforced. A governance structure for maintaining it can be established. Enterprise taxonomies and master data management are still important--companies do need to manage their own information well.
2. Connected Enterprise. To use external information sources, use their native vocabularies. Delivering information to outsourced providers and/or regulatory agencies requires the use of certain vocabularies. For example, in the pharmaceutical and healthcare field, MedDRA, ICD-9-CM, ICD-10, and SNOMED are a few common ones. In the case of a merger or strategic alliance, another company might have its own enterprise taxonomy or product thesaurus that can be integrated with yours without too high a cost.
3. Hyper-connected Enterprise. Other partners, universities, companies, divisions, and subsidiaries connected to the enterprise are spread across the globe, and they come and go with increasing frequency. Embarking on a new taxonomy integration project for every partner that appears in the network is not feasible.
   • A small biotech might say, “Enterprise vocabulary? We’re too small--we just use terms that we need; it’s not complicated. You're too bureaucratic.”
   • A large corporate partner or regulatory agency might say, “Over here, yes we follow standard vocabularies. They’re not the same as your standards, but…”
   • An information supplier or database provider could say, “Our proprietary databases use specially designed vocabs optimized for our software. You should really be using our software, you know. We have the information you need.”

Traditional solutions to taxonomy development and management fall short of meeting the challenges of the hyper-connected enterprise. Semantic Web principles and technology offer a new approach to the management of our thesauri, taxonomies, and ontologies--one of vocabulary interoperability. Treating vocabularies as part of a network--just like the web itself--is needed.
A network of controlled vocabularies to facilitate data interoperability and query expansion across disparate structured and unstructured, internal and external information is part of the solution. A “network” implies that the vocabularies can be rapidly adopted, reused, mapped, and versioned using standardized, sustainable technologies and approaches.

Principles

The following principles were drafted in my organization. They are presented as points for discussion within the librarian/taxonomist community.

• Don’t create a new vocabulary where one already exists. The preference is for authoritative, trustworthy, well-managed sources. Apply standard library criteria used to evaluate resources: authority, accuracy, currency, point of view (objectivity), coverage, relevance, and format.
• Extend those sources if necessary, but even then they should be extended by making associations with other authoritative sources. Whenever possible, leverage cross-vocabulary mappings already available publicly.
• When necessary, extend or map an external vocabulary with internal or bespoke terminology. This could include mapping obvious synonyms between vocabularies, or could entail more nuanced relationships; for example, mapping a published vocabulary to a list of values from a proprietary or internal database. By strategically connecting the concepts, the data is connected.
• Do not corrupt authoritative vocabulary sources. It will always be possible to identify the source vocabulary, including which version, in the network. It's just like a library. A page is not taken out of one book and pasted into another; sections are not crossed out here and added there to the point where the books can’t be put back together again--that shouldn’t be done with vocabularies either. However, using Linked Data standards, vocabularies can be repurposed. For example, perhaps a deeply hierarchical taxonomy just needs to be rendered as a flat list, or only some concepts are relevant to the purpose at hand. Linked Data can enable these types of reuse without corrupting the source.
• Obtain vocabularies directly from the authority that produces them whenever practical.
• Vocabularies will be managed as linked data. Preference is for the “SKOS” standard. SKOS stands for "Simple Knowledge Organization System" and is a lightweight W3C data standard that can be applied to any controlled vocabulary. Focusing on this simple standard allows us to standardize curation, management, and delivery processes.

Capabilities

The following are a proposed set of linked data vocabulary management capabilities. They are presented as an impetus for further discussion within the librarian/taxonomist community.

• Vocabulary Mapping. Reusing, creating, and maintaining the relationships between vocabularies holds the power of the “vocabulary network” at the heart of Linked Data Vocabulary Management. Processes and technology need to be developed for lightweight, sustainable, semantics-based vocabulary mapping.
  Challenges include:
   o Mapping two or more vocabularies together using Semantic technology.
   o Facilitating the review of automatic mapping; approving, rejecting, adjusting, and amending mapping.
   o Applying traditional taxonomy heuristics in a Linked Data environment as quality measures.
   o Leveraging the mapping already done in published vocabularies.
   o Publishing mapped “bridging” vocabularies to customers.
• Vocabulary Slicing. Sometimes a vocabulary consumer (system or project) needs only a subset of a vocabulary that is relevant to their needs; for example, only the terms that are of interest to a particular project or organizational unit. Rather than create a duplicate, we need to identify terms or branches of relevance, and make those available--all while retaining their link back to the authority, so that updates to the authority can continue to be leveraged (new synonyms, new mappings, new child terms, new notes).
• Vocabulary Version Management. As the focus is the reuse of already published vocabularies, successfully managing the impact of new versions on the network is critical. For example:
   o Comparing a new version to an existing version.
   o Identifying impact of vocabulary changes on the network; that is, to other vocabularies that map to it.
   o Identifying impact of vocabulary changes on consuming systems or related datasets.
  Then, once we identify the impact, how do we address it?
• Vocabulary modeling and conversions. The vocabularies in the network need to be accessible as Linked Data, using the data standards of the Semantic Web. And, until external suppliers deliver or make their vocabularies available as linked data, we will need to be extremely proficient in converting them from their native formats.
• Vocabulary Inventory. Managing a network of vocabularies requires a sophisticated inventory. Ideally managed as Linked Data itself, the inventory can directly inform queries using the vocabulary network. For example, identify all the vocabularies that cover concept X, the datasets that reference those vocabularies, and how to access them. There are developing standards in the Semantic Web specifically for managing these types of inventories.

Vocabulary Mapping using Linked Data--in more detail

For this article, just one of these capabilities, Vocabulary Mapping, will be explored in more detail. How, and why, would two vocabularies be brought together using Linked Data? First, a few Semantic Web fundamentals need to be covered.

The Semantic Web is a web of data--not pages. Pages need to be read by a person, or a program mimicking a person, in order to obtain the meaning of the data on the page.
For example, if we want to learn the times that local bars offer “happy hour”, we need to find the web page for each bar, scan pages, follow links, and finally read content that says something like “we have happy hour M-F from 4-7”. It doesn’t matter that that information might originally be in a database that builds the page, because to interpret the information we need to read the page. Linked Data challenges us to use the principles of the web: unique identification of resources and links between those resources, to the data itself. URIs, RDF, and Ontologies  URIs. URI stands for “uniform resource identifier”. URL’s are “uniform resource locators”, a type of URI used to uniquely identify pages and locate them. You "point" your browser to them to locate the
unique page. In the semantic web, things in the real world are also given URIs. In fact they use the same syntax. For example, a local bar, The Artful Dodger, might have a webpage with a URL. But in the semantic web the bar itself--as a unique thing, not a webpage--would have a URI. The URI could be used to differentiate it from the Dickens character of the same name, which might also have a URI.
• RDF. RDF stands for Resource Description Framework. RDF is used to encode the most basic statement about a thing in the form of what is called a "triple": subject--predicate--object, which is often abbreviated S--P--O. For example:
  o The Artful Dodger (subject), Is a (predicate), Bar (object).
  o The Artful Dodger (subject), Has a (predicate), Happy Hour (object).
• Ontologies. Ontologies are collections of information about classes of things and their relationships. For example, an ontology might state that a Bar (subject) is a Type of (predicate) Restaurant (object). So if we have data that states that The Artful Dodger (S) Is a (P) Bar (O), then, by referencing our ontology, we can infer that The Artful Dodger (S) Is a (P) Restaurant (O). Inference is a very important concept in the Semantic Web that will be covered more closely in the next example.

The following is how these fundamentals apply to vocabularies, and specifically to the capability of mapping two vocabularies together.

Concepts are Things

Just like The Artful Dodger is a thing, the concept of a “Bar” is also a thing. Vocabularies, taxonomies, thesauri, etc. are all collections of concepts.

Concepts have URIs

As discussed above, things are identified in the semantic web with URIs. A vocabulary is about concepts, their properties, and their relationships.

SKOS is for vocabularies

SKOS stands for “Simple Knowledge Organization System”, and is a very popular Semantic Web ontology. It is a standardized set of classes and properties for representing “knowledge organization systems”, a.k.a.
vocabularies. Unlike other thesaurus and taxonomy standards, SKOS is Linked Data, which means it allows vocabularies to be managed in a distributed, linkable, interoperable way. In other words, as part of the open-ended web of data.

The following are some of the most common SKOS elements:

• skos:Concept and skos:ConceptScheme. These are what are called "classes" in the Semantic Web. Every "term" in a vocabulary is of the class skos:Concept. A skos:ConceptScheme designates a particular thesaurus, or part of a thesaurus. Concepts and ConceptSchemes are both things, so each is represented by a unique URI. Concepts can belong to one or more ConceptSchemes.
• skos:prefLabel, skos:altLabel, and skos:definition. These are used to designate fundamental properties of classes like skos:Concept. skos:prefLabel is the preferred name for a concept, and skos:altLabel is any synonym or other type of entry term, e.g. “see from”, “used for”, etc. An important feature of SKOS, as well as of most Semantic Web ontologies, is that the classes and properties are extensible. For
example, we can create a sub-property, unique to our context. We might want a specific type of skos:altLabel to differentiate acronyms from other synonyms. In this case, we would create a new property for acronym, but make it a sub-property of skos:altLabel. In this way, we can identify the acronyms specifically if we need to, but they can also be referenced as generic synonyms. This is particularly useful when working across multiple vocabularies where custom properties can all resolve to generic SKOS standards.
• skos:broader, skos:narrower, skos:related. These are the essential properties that serve as the building blocks for a hierarchical or ontological structure within a single vocabulary.
• skos:exactMatch, skos:closeMatch. All of the above properties are intended for use within a particular vocabulary or skos:ConceptScheme. These matching relationships, on the other hand, are used to relate Concepts across vocabularies. For example, the concept for Bladder Cancer in one vocabulary might be an exactMatch to the concept for Bladder Neoplasm in another vocabulary.

Now that some basics about Linked Data and SKOS have been covered, we’ll look at a specific example of using SKOS to deliver the core capability of Mapping.
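Before turning to that example, the two kinds of inference described so far (a Bar is a type of Restaurant, and an acronym is a kind of skos:altLabel) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: triples are modeled as tuples rather than with an RDF library, every name beginning with "ex:" is hypothetical, and the use of rdfs:subClassOf and rdfs:subPropertyOf to express "is a type of" and "sub-property" is an assumption about how the article's examples would be encoded.

```python
# Asserted triples, as (subject, predicate, object) tuples.
# All "ex:" names are hypothetical and exist only for this illustration.
ASSERTED = {
    # Data and ontology from the bar example.
    ("ex:ArtfulDodger", "rdf:type", "ex:Bar"),
    ("ex:Bar", "rdfs:subClassOf", "ex:Restaurant"),
    # A custom acronym property declared as a sub-property of skos:altLabel.
    ("ex:acronym", "rdfs:subPropertyOf", "skos:altLabel"),
    ("ex:umls", "rdf:type", "skos:Concept"),
    ("ex:umls", "skos:prefLabel", "Unified Medical Language System"),
    ("ex:umls", "ex:acronym", "UMLS"),
}


def infer(triples):
    """Return the triples plus everything implied by the two rules below."""
    closure = set(triples)
    while True:
        new = set()
        for s, p, o in closure:
            for s2, p2, o2 in closure:
                # Rule 1: s is a C, and C is a subclass of D, so s is a D.
                if p == "rdf:type" and p2 == "rdfs:subClassOf" and o == s2:
                    new.add((s, "rdf:type", o2))
                # Rule 2: s P o, and P is a sub-property of Q, so s Q o.
                if p2 == "rdfs:subPropertyOf" and p == s2:
                    new.add((s, o2, o))
        if new <= closure:  # nothing new was derived, so we are done
            return closure
        closure |= new


# Only the triples that were derived, not the ones that were asserted.
INFERRED = infer(ASSERTED) - ASSERTED
```

The asserted set is never modified; the inferred triples live in a separate set, so the acronym remains identifiable as an acronym while also being available as a generic synonym.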
Here are two concepts from two popular biomedical thesauri, encoded in SKOS: one from the National Cancer Institute (NCI) Thesaurus, or NCIt, and one from the National Library of Medicine’s Medical Subject Headings, or MeSH.

NCI Thesaurus: “Bladder Neoplasm”
URI = http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C2901 (Shorthand: ncit:C2901)

Subject (S)     Predicate (P)      Object (O)
ncit:C2901      a                  skos:Concept
                skos:prefLabel     “Bladder Neoplasm”
                skos:altLabel      “Tumor of Bladder”
                skos:definition    “A benign or malignant…”
                skos:broader       ncit:C2900 (“Bladder Disorder”)
                skos:broader       ncit:C3431 (“Urinary System Neoplasm”)

MeSH: “Urinary Bladder Neoplasms”
URI = http://purl.bioontology.org/ontology/MSH/D001749 (Shorthand: mesh:D001749)

Subject (S)     Predicate (P)      Object (O)
mesh:D001749    a                  skos:Concept
                skos:prefLabel     “Urinary Bladder Neoplasms”
                skos:altLabel      “Cancer of Bladder”
                skos:definition    “Tumors or cancer of the URINARY BLADDER…”
                skos:broader       mesh:D014571 (“Urogenital Neoplasms”)
                skos:broader       mesh:D001745 (“Urinary Bladder Diseases”)

Once we can uniquely identify each of these concepts with URIs we can record some basic things about them:

• They're both concepts.
• They both have preferred labels.
• They each have synonyms or entry terms (many more, in fact, than just those listed here).
• They have definitions.
• They have one or more parents because both of these thesauri are poly-hierarchical.

A couple of critical points here: if we look at the full record for this concept in MeSH (http://www.ncbi.nlm.nih.gov/mesh/68001749), it does in fact have the NCIt preferred label, “Bladder Neoplasm”, as a synonym. However, NCIt does not have the MeSH preferred name as a synonym. Note especially that the alternate labels “Cancer of the Bladder” in MeSH and “Tumor of the Bladder” in NCIt are unique to each vocabulary.

But what if we know that mesh:D001749 is the same concept as ncit:C2901? We map those two concepts together by creating a new “triple” that creates a relationship between the two URIs.

Subject (S)     Predicate (P)      Object (O)
ncit:C2901      skos:exactMatch    mesh:D001749

So now we've encoded the relationship that the MeSH concept labeled “Urinary Bladder Neoplasms” equals the NCIt concept labeled “Bladder Neoplasm”.

How do we realize the power of this relationship? What if we’re searching two carefully indexed repositories--one indexed using MeSH and another using NCIt--and we want to produce a joined-up view of these two repositories, leveraging the power of the two separate indexing methods? What if we’re searching across multiple repositories and we want as rich a synonym set for “Bladder Cancer” as we can get? As we see in this example, by using only one of these vocabularies we could end up missing things.
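The two questions above can be made concrete with a small sketch. It is written in plain Python rather than with an RDF library; the two repositories and their document identifiers are hypothetical, and only the URIs and labels are taken from the concept records above. It also relies on the SKOS standard's definition of skos:exactMatch as symmetric and transitive, so a match can be followed in either direction.

```python
# Labels copied from the two concept records above, plus the mapping triple.
TRIPLES = [
    ("ncit:C2901", "skos:prefLabel", "Bladder Neoplasm"),
    ("ncit:C2901", "skos:altLabel", "Tumor of Bladder"),
    ("mesh:D001749", "skos:prefLabel", "Urinary Bladder Neoplasms"),
    ("mesh:D001749", "skos:altLabel", "Cancer of Bladder"),
    # The mapping itself: one extra triple, stored apart from both sources.
    ("ncit:C2901", "skos:exactMatch", "mesh:D001749"),
]

# Hypothetical repositories: document id -> concept URI it was indexed with.
NCIT_INDEXED = {"doc-1": "ncit:C2901"}
MESH_INDEXED = {"doc-2": "mesh:D001749", "doc-3": "mesh:D001745"}


def matched_concepts(uri, triples):
    """Return uri plus every concept reachable through skos:exactMatch."""
    group = {uri}
    frontier = [uri]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p != "skos:exactMatch":
                continue
            # Follow the link in either direction (exactMatch is symmetric).
            if s == current:
                other = o
            elif o == current:
                other = s
            else:
                continue
            if other not in group:
                group.add(other)
                frontier.append(other)
    return group


def pooled_labels(uri, triples):
    """Collect preferred and alternate labels from all matched concepts."""
    group = matched_concepts(uri, triples)
    return {o for s, p, o in triples
            if s in group and p in ("skos:prefLabel", "skos:altLabel")}


def joined_search(uri, triples, *repositories):
    """Return ids of documents indexed with uri or any concept matched to it."""
    group = matched_concepts(uri, triples)
    return sorted(doc for repo in repositories
                  for doc, concept in repo.items() if concept in group)


labels = pooled_labels("ncit:C2901", TRIPLES)
hits = joined_search("ncit:C2901", TRIPLES, NCIT_INDEXED, MESH_INDEXED)
```

Starting from either URI gives the same pooled label set and the same joined-up result list. Neither source vocabulary is edited to achieve this; removing the single exactMatch triple returns each vocabulary to its original, separate view.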
Here’s how we use Linked Data to do really interesting things. Now we’ll create a rule that builds upon our new matching relationship to bring more intelligence to our vocabularies. For this we use SPARQL. SPARQL is akin to SQL for relational databases in that it is the language with which to query and manipulate Linked Data in the Semantic Web.

SPARQL Rule:

    CONSTRUCT ?s skos:altLabel ?o .
    WHERE ?s skos:exactMatch ?match .
          ?match skos:prefLabel | skos:altLabel ?o .

(Note that this is not executable SPARQL; the syntax has been simplified for the clarity of this example.)

Creating these rules is key. When we hear experts talk about making your data “more intelligent”--this is what they mean! This three-line SPARQL command takes that new matching relationship and reasons that the preferred labels or synonyms in the matched MeSH concept can be inferred as additional synonyms (marked “inferred” below) in the NCIt concept. It would do this for all the concepts in NCIt that have been matched to MeSH terms.

Subject (S)     Predicate (P)      Object (O)
ncit:C2901      skos:prefLabel     “Bladder Neoplasm”
                skos:altLabel      “Urinary Bladder Neoplasms” (inferred)
                skos:altLabel      “Cancer of the Bladder” (inferred)

Those terms “reasoning” and “inferred” are very important. First, we’re making a rule that a computer can use to infer additional information. Second--that’s all this has to be--an inference. We are not necessarily hard-coding additional vocabulary data. We are not manipulating the source vocabularies at all. Every inferred triple can be managed completely separately from either source vocabulary!

A fair question to ask is, "What about meta-thesauri and large vocabulary mapping efforts? Aren't they intended to do the same thing? For example, isn’t that what the National Cancer Institute's Metathesaurus and the National Library of Medicine's UMLS (Unified Medical Language System) initiatives are for?” Yes, to a point. However:

1. MeSH and NCIt are just two familiar examples. What if we try to map vocabularies or search across repositories that use less standard vocabularies: proprietary, internal, or custom vendor vocabs? Or what if we need to search and view information in an interface using the richness of NCIt, but against a repository that uses something different--a folksonomy perhaps, or just values selected from a drop-down list?
2. What if the mapping we wanted to do was proprietary to a project or a company? Every taxonomist will agree that how concepts are arranged and mapped to each other has power. However, what if that mapping was relevant today--but not tomorrow? How can this be accomplished by leveraging these large published vocabularies without corrupting them, or worse--by building yet another one?

It is important to be clear that identifying those initial matches (and they are not always as straightforward as an “exactMatch”) is the hard intellectual work of vocabulary management that is, and will continue to be, a hallmark of our profession. The semantic web doesn’t solve the problem of knowing which terms to match. But what it does do--unlike any other taxonomy, thesaurus, or vocabulary management system--is give us the standards and tools to encode those matches and create rules that computers can use to make additional inferences. And very importantly, it does not require that we merge or integrate or otherwise corrupt original authoritative sources.

So this approach doesn’t make it easy to manage vocabularies in a hyper-connected world--that’s not the point. The point is that it makes it possible: possible, sustainable, and extensible.

Linked Data Vocabulary Management Roles

In thinking about how to make this all work and to develop and deliver vocabulary services based on these new approaches, the following roles could be assigned.

Vocabulary Developer. Uses Semantic Web technologies and an informatics toolkit to ingest, model, bridge, slice, quality check, and version-control vocabularies. This would require some programming skills and in-depth knowledge of the technology.

Vocabulary Manager. Provides hands-on management, development, and enhancement of vocabularies. They would have a deep understanding of vocabulary processes, of the internal and external vocabulary landscape, and of best practices, often with a subject specialty. This role is most like our taxonomist role of today.
Vocabulary Administrator. Provides first-line support, helping customers access, use, and exploit available vocabularies. They are accountable for management of service processes, documentation, and inventories.

Conclusion: The Semantic Web needs us

The Promise and the Reality. Linked Data approaches are the sustainable way to work with vocabularies in an increasingly federated and linked information world. The reality is that not enough vocabularies are available, from the authority that produced them, natively in SKOS or another form of RDF. This needs to change. But we can start to instill these practices where we work. This field is rapidly developing; it needs librarians and information scientists with their deep knowledge of the importance and power of controlled vocabularies and the information sources that use them.

This is not the future--it’s now. Leading organizations in the US, like OCLC and the Library of Congress, are adopting these standards. This is even more true in Europe. The field of biomedical ontologies is very exciting right now, and firmly moving toward a Linked Data paradigm.

The nature of business and of information demands new approaches. Even taking the Semantic Web’s promise of universally sharing data across the world out of the picture, SKOS and RDF--even just applied to vocabularies--are an opportunity in their own right. These
vocabularies can be published in whatever form people or systems need them in. However, to manage a web of vocabularies requires tools and techniques designed for the web.

This is not easy. Breaking down information, including vocabularies into their most basic constructs, is what RDF does. It is elegant in design, but can quickly become very complex in execution. Writing the SPARQL queries that do complete transformations of vocabularies, manage versions, etc. can become very sophisticated. But isn't it a challenge worth tackling? Aren’t librarians and information scientists poised to make this happen?

SKOS is a great example of a simplified model with practical application. RDF can be used to model ontologies in linguistically accurate, but extraordinarily complex, ways as well. The discussions at a Bioontology or Semantic Web conference can get very deep. But the beauty of RDF is that those complex models can be transformed, again without corrupting the source, into a simple model like SKOS. Guarding against unnecessary complexity is something that librarians are especially good at.

Managing unconnected silos of information, including taxonomies, will not harness the power of a networked, collaborative world. Vocabulary interoperability, enabled by the Linked Data principles of the Semantic Web will--if further developed as a discipline within our profession.

Acknowledgements

This article was derived from a presentation given by James R Morris at the Special Libraries Association’s 2013 PHT and BIO Division Spring Meeting, April 14-16, 2013.

The author thanks:

• The AstraZeneca R&D Vocabulary Team and Linked Data Community of Practice members: Courtland Yockey, Sorana Popa, Rob Hernandez, Dana Crowley, Tom Plasterer, Mike Westaway, Kerstin Forsberg, Johan Turnqvist, Tracy Wolski, Dan Ringenbach, Rajan Desai, Pete Edwards, Simon Rakov, Jeff Saxe, and Ian Dix.
• Vendors and Advisors:
  o Scott Henninger, Bob DuCharme (TopQuadrant)
  o Dean Allemang (Working Ontologist)

References

• Dean Allemang, Jim Hendler. Semantic Web for the Working Ontologist, 2nd edition. Morgan Kaufmann, 2011.
  The first few chapters of this book provide one of the best overviews of the semantic web that you’ll find, and are very readable. It also has a whole chapter on SKOS. Available through ScienceDirect and Safari.
• Bob DuCharme. Learning SPARQL, 2nd edition. O’Reilly Media, 2013.
  Invaluable to have on hand when writing SPARQL.
• World Wide Web Consortium (W3C). SKOS Simple Knowledge Organization System Primer. http://www.w3.org/TR/skos-primer
  Straight from the source. It is technically oriented, but once you understand the basics, this is essential reading.
• National Center for Biomedical Ontology. BioPortal. http://bioportal.bioontology.org
  A phenomenally deep resource providing access to many biomedical vocabularies, queryable through SPARQL. The NCBO also has several webinars available.
• Lee Harland, et al. “Empowering industrial research with shared biomedical vocabularies”. Drug Discovery Today, Volume 16, Issues 21-22, November 2011, Pages 940-947, ISSN 1359-6446, doi:10.1016/j.drudis.2011.09.013
  Excellent overview of the challenges and opportunities for using biomedical vocabularies in pharmaceutical research.
• Kerstin Forsberg. Linked Data and URI:s for Enterprises (Blog). http://kerfors.blogspot.com
  A thought-leader and evangelist for linked data, especially in the clinical domain.