Target Audience: Potential VIVO users, local faculty members, administrators, public audiences
Presentation Goals: Provide a general overview of the VIVO project. Describe how VIVO will be implemented at a local level. Provide general information about VIVOs main features, functionalities, and advantages.
As research effort become more interdisciplinary, it can be hard to find collaborators outside your own area of expertise. VIVO enables collaboration and understanding across an institution and among institutions – and not just for scientists.
Goal of VIVO: Improve all of science by providing the means for sharing and using current, accurate, and precise information regarding scientists’ interest, activities, and accomplishments. Foster team science by providing tools for identifying potential collaborators. Improve collaboration by creating tools using this information for enhancing new and existing teams. Not limited to science – at Cornell, VIVO covers all disciplines across the entire institution
Profiles vs. web pages/social sites: richer content & more explicit
Profiles are richer in content and much more complete and explicit than typical [web pages or] social networking sites
VIVO is useful to many different users and audiences both within and outside of the participating institutions. VIVO can result in increased grant and donor funding. For instance, the Cornell Alumni Affairs and Development Office was contacted by a major company interested in making research funds available to scientists working on issues related to urology; a search in VIVO-Cornell quickly returned several faculty in diverse disciplines working in this area, translating to real dollars.
Since VIVO stores profile information drawn from a variety of sources in a single, flexible format, it can be easily “re-skinned” or “re-purposed” to present specialized views into the institution.
Authoritative data, diverse formats, filter out private information
Talk about verified data Talking points: Much of the data in VIVO profiles is ingested from authoritative sources so it is accurate and current, reducing the need for manual input. Private or sensitive information is never imported into VIVO. Only public information will be stored and displayed. Data is housed and maintained at the local institutions. There it can be updated on a regular basis.
There are three ways to get data: internal, external, individuals. Internal is authoritative!
The rich information in VIVO profiles can be repurposed and shared with other institutional web pages and consumers, reducing cost and increasing efficiencies across the institution.
Talking points: Simple format Read and write standard data and ontologies Leverage open source and now commercial tools More and more data available on the Web (e.g., Bio2RDF)
Feel for the VIVO ontology
Search results are faceted so information can be located rapidly and with less time spent sorting through information.
Emphasis these are links, not just text. Click on the name and it will take you to his profile.
This search on homeostasis reveals the departments and interdisciplinary nature of the research. This slide demonstrates that someone in Molecular Genetics might be a good person to work with! Note: refine by facets
Each statement is a complete fact unto itself.
The “Hendler Hypothesis” – a little semantics goes a long way
Vitro All-in-1 ontology editing, data management, and public display Integrates data to display on the Web or share as feeds
VIVO Visually themed and branded for hosting institution Ontology optimized for discovering people and research
VIVOweb Harvests exposed VIVO data at 1 or more points of shared discovery Capable of pulling in RDF data from sources other than VIVO
Talking points: Much of the data in VIVO profiles is ingested from authoritative sources so it is accurate and current, reducing the need for manual input. The rich information in VIVO profiles can be repurposed and shared with other institutional web pages and consumers, reducing cost and increasing efficiencies across the institution. Private or sensitive information is never imported into VIVO. Only public information will be stored and displayed. Data is housed and maintained at the local institutions. There it can be updated on a regular basis. Search results are faceted so information can be located rapidly and with less time spent sorting through information.
The VIVO application was originally developed at Cornell University in 2003. UF implemented in 2007 and in sept 2009 Cornell, UF and five partner institutions received the $12 million stimulus grant from the NIH to enable national networking of researchers
All 7 institutions have VIVO installed and data is stored and maintained locally by the institution
Because the data is: RDF compliant Data is open We have a core ontology that all VIVOs conform with – making the data interoperable and consistent
And enables networking on a national scale
We will continue to build the network
Through varying types of participation.
Adoption -Institutions (columbia, nw), consortias (CTSAs) and federal agencies can install VIVO locally. If institutions already have another platform – they can also participate by sharing their semantic web-compliant data with the VIVO network. Participation can also include the provision of data. Whereas VIVO enables networking of researchers, Eagle-I enables resource discovery. Each resource may have a “creater” and, thus, has the potential to be linked to a researcher profile. Database aggregators and professional societies are also curators of data. Open access systems (such as PubMed) or subscription databases (such as Scopus/Web of Science), or even Collexis (a similar resource to VIVO) can participate through the provision of data
Collexis has a separate system called biomed experts which is a profile system – and VIVO data can be added to those profiles – and also has a large amount of disambiguated publications extracted from pubmed which can be purchased for a local VIVO system.
Additional participation can include application developers such as DERI or Digital Vita, who find ways to extract and reuse VIVO data
Lastly, as data is linked open data, search engines, professional societies and other producers of web-compliant data can participate through VIVO data consumption
There are a number of challenges VIVO faces in facilitating adoption.
What researchers want from VIVO can be very different from what institutions want At UF researchers are interested in CVs, biosketches and collaborations. UF administration are more focused on faculty reporting needs – although they are also interested in increasing interdisciplinary collaborations. Finding common ground is important as we build the VIVO network.
Additionally, VIVO was developed primarily for a researching community – which has been enthusiastic and willing to participate. Enthusiasm may vary based on where a researcher is at in their career, but VIVO is generally welcomed.
As we expand to the biomedical community, we are finding that the clinical community is driven by different business needs.
Looking at this graph of adoption trends – Innovators and early adopters are on the front end; and a small amount of laggards on the back in, all of which tend to be excited about new technologies – regardless of marketing. In the middle of this graph, however, we have early and late majority, which includes 68% of potential adopters. Our challenge is to prove to people that it’s going to last and the return they are investing is worth the effort. One of the advantages of VIVO is the linkage with local data sources – reducing the effort of individuals who want to use VIVO. Reaching their needs is crucial to VIVO’s success.
VIVO chooses to meet these participation challenges with the help of libraries – neutral campus entities that understand their user communities and their research environment. Libraries are increasingly involved in IT decisions – in fact my first librarian position was in a merged library/IT department.
Libraries have subject experts. While I work closely with VIVO at UF, I also provide reference and instructional support to the agricultural sciences. I know what it takes to facilitate adoption within my agricultural science community.
Librarians have good knowledge of their user communities – they understand what drives their institution’s research environment.
Of course, librarians have a strong service ethic and have a long history of providing academic support
Facilitate the resolution of data integration issues endemic to legacy data/faculty reporting systems.
So what do librarians DO?
Identification of content types – ontology ;development and interface refinement We negotiate with the owners of local data sources to explain what data we need and why we need it.
We provide local and national support and training through the development of documentation, presentations and help-desk services
Using the vivoweb.org website we liaise with potential collaborators – giving demos and answering questions Creating a community of support And delivering usability feedback to the technical team.
Marketing – through demonstrations, conferences, workshops and the development PR materials designed both to attract new participants and to assist participants with local adoption
Our assumption is that if we provide strong user support and iterative user testing that will faciliatate participation – because will have confidence that it is being supported.
Slide 1 - Coalition for Networked Information
D. Krafft, V. Davis: Presenters
Co-Authors: J. Corson-Rikert, M.Conlon, M.
Devare, B.Lowe, B.Caruso, K.Börner, Y.Ding,
L.McIntosh; M. Conlon; and
Cornell University: Dean Krafft (Cornell PI), Manolo Bevia, Jim Blake, Nick Cappadona, Brian
Caruso, Jon Corson-Rikert, Elly Cramer, Medha Devare, John Fereira, Brian Lowe, Stella
Mitchell, Holly Mistlebauer, Anup Sawant, Christopher Westling, Rebecca Younes. University
of Florida: Mike Conlon (VIVO and UF PI), Cecilia Botero, Kerry Britt, Erin Brooks, Amy
Buhler, Ellie Bushhousen, Chris Case, Valrie Davis, Nita Ferree, Chris Haines, Rae Jesano,
Margeaux Johnson, Sara Kreinest, Yang Li, Paula Markes, Sara Russell Gonzalez, Alexander
Rockwell, Nancy Schaefer, Michele R. Tennant, George Hack, Chris Barnes, Narayan
Raum, Brenda Stevens, Alicia Turner, Stephen Williams. Indiana University: Katy Borner (IU
PI), William Barnett, Shanshan Chen, Ying Ding, Russell Duhon, Jon Dunn, Micah Linnemeier,
Nianli Ma, Robert McDonald, Barbara Ann O'Leary, Mark Price, Yuyin Sun, Alan Walsh, Brian
Wheeler, Angela Zoss. Ponce School of Medicine: Richard Noel (Ponce PI), Ricardo Espada,
Damaris Torres. The Scripps Research Institute: Gerald Joyce (Scripps PI), Greg Dunlap,
Catherine Dunn, Brant Kelley, Paula King, Angela Murrell, Barbara Noble, Cary Thomas,
Michaeleen Trimarchi. Washington University, St. Louis: Rakesh Nagarajan (WUSTL PI), Kristi
L. Holmes, Sunita B. Koul, Leslie D. McIntosh. Weill Cornell Medical College: Curtis Cole
(Weill PI), Paul Albert, Victor Brodsky, Adam Cheriff, Oscar Cruz, Dan Dickinson, Chris Huang,
Itay Klaz, Peter Michelini, Grace Migliorisi, John Ruffing, Jason Specland, Tru Tran, Jesse
Turner, Vinay Varughese.
• What is VIVO?
• How does it work?
• How do we implement it?
• What’s ahead?
• VIVO will help create the collaborations that are
increasingly crucial in science by facilitating communication
and collaboration across interdisciplinary and institutional
boundaries NOT ONLY for scientists but also for
administrators, students, faculty, donors, funding agencies,
and the public.
• Researchers often struggle to locate and communicate with
collaborators across fields and outside rigidly defined
Populated with detailed profiles of
faculty and researchers; displaying
items such as publications, teaching,
service, and professional affiliations.
A powerful search functionality for
locating people and information
within or across institutions.
A semantic web application that
enables the discovery of research and
scholarship across disciplines in an
social, and research
•Showcase special skills or
•Establish connections and
research areas and
•Publish the URL or link
the profile to other
A VIVO profile allows researchers to:
Who can use VIVO?
…and many more!
• Searching for
• Visualizing research
activity within an
• Locating graduate
• Looking for
seminars or events
• Looking for
collaborators on large
• Keeping abreast of
VIVO as disseminator
VIVO harvests much of its data automatically
from verified sources
• Reducing the need for manual input of data.
• Centralizing information and providing an
integrated source of data at an institutional level.
Data, Data, Data
Individuals may also edit and customize their
profiles to suit their professional needs.
Information is stored using the Resource Description Framework
Data is structured in the form of “triples” as subject-predicate-object.
Concepts and their relationships use a shared ontology to facilitate
the harvesting of data from multiple sources.
Storing Data in VIVO
is member of
has affiliations with
Subject Predicate Object
Detailed relationships for a
research area for
Mining the record: Historical evidence for…
teaches research area for
Earth and Atmospheric
Cornell’s supercomputers crunch weather data to help farmers manage chemicals
faculty appointment in
Query and explore
Everything about an event, a grant, a person
Everything about a class of events, grants, or persons
Grants with PIs from different colleges or campuses
By combinations and facets
Explore any publication, grant, or talk with a relationship to
a concept or geographic location
Explore orthogonally (navigate a concept or geographic
Advantages of an ontology approach
Provides the key to meaning
Defines a set of classes and properties in a unique
Embedded as RDF so data becomes self-
Definitions available via the namespace URI
Helps align RDF from multiple sources
VIVO core ontology maps to common shared
ontologies organized by domain
Local extensions roll up into VIVO core
Information flow with shared ontologies
Log boom on Williston Lake, British Columbia
Information flow without shared
Log jam, looking up the dalles, Taylor's Falls, St. Croix River, MN. Photo by F.E. Loomis
Linked Data principles
Use URIs as names for things
Use HTTP URIs so that people can look up those
When someone looks up a URI, provide useful
information, using the standards (RDF, SPARQL)
Include links to other URIs so that people can
discover more things
VIVO enables authoritative data about
researchers to join the Linked Data cloud
Challenges in our approach
Keynote at 2003 Int’l Semantic Web Conference http://iswc2003.semanticweb.org/hendler_files/v3_document.htm
BUT - “I believe that a WEB of Semantic terminology (anchored
in URIs qua RDF/OWL) will create a powerful network effect,
where the wealth of knowledge (despite being open and
uncurated) creates a wide range of new possibilities, without
the need for expressive logics and powerful reasoners (that
cannot scale to web sizes).” – Jim Hendler
Major VIVO Components
A general-purpose, open source Web-based
application leveraging semantic standards:
ontology editor, data manager, display manager
Customizations of this application for VIVO
Additional new software to enable a distributed
network of RDF by harvesting from VIVO
instances or other sources
Search and browse interface
Display, search and navigation setup
Data ingest Data export
& data flow
VIVO’s three functional layers
Local data flow
(RDF)> > VIVO
(RDF) > shared as
From local to national
• Cornell University
• University of Florida
• Indiana University
• Ponce School of Medicine
• The Scripps Research Institute
• Washington University, St. Louis
• Weill Cornell Medical College
Linked Open Data
working to enable
Work to enable National
Networking is supported via
stimulus funds from the
National Center for Research
Resources (NCRR) of the
National Institutes of Health
Start with the project participants
Enable a National Network
The National VIVO Network
enables the discovery of
research across institutions.
Data is imported, stored, and
maintained by the individual
VIVO allows researchers to
browse and …enables
networking on a national
potential adopters and data providers
Individual institutions – Columbia University, …
Federal agencies – NIH (NIH RePORTER), NSF, …
Consortia of schools – SURA, CTSA…
eagle-i (“enabling resource discovery” U24 award)
Publishers/vendors – PubMed, Elsevier, Collexis, ISI…
Professional societies – AAAS, …
application developers and
Semantic Web community – DERI, others…
Search Providers – Google, Bing, Yahoo, …
Professional Societies – AAAS, …
Producers, consumers of semantic web-compliant data
Facilitating adoption: Challenges
Scientist vs. administrator
Individual vs. institutional
Research vs. clinical community
Researchers: Early vs. established;
enthusiastic, willing to participate
Clinicians: Less enthusiastic;
“business” accumulated in other
Innovators, early adopters
Excited; willing to experiment
Early, late majority
Hesitant to adopt new
Facilitating adoption &
Support and dissemination through libraries
Neutral, trusted institutional entities
IT, information management and dissemination expertise
Subject experts; experience in translational/outreach roles
Knowledge of institutional, research, instruction landscape
Service ethic; tradition of academic support
Recognition of challenges posed by user communities…
Facilitate resolving data integration problems endemic to
Librarians as facilitators: Our model
Oversight of initial content development
Content types, ontology, interface refinement…
Negotiation with campus data stewards for publicly visible data
Support and training: local and national level
Documentation, presentation/demo templates
Help-desk support, videos, tutorials, web site FAQs, etc.
Maintain web site (http://www.vivoweb.org)
Engage with potential collaborators, participants
Create community of support via user forums
Usability: Feedback, new use cases from users to technical team
Demonstrations/exhibits, conferences, workshops, website
Release 1 (as of Friday April 16th)
Deployment at 7 sites on production hardware
Mapping local data to R1 ontology
Batch data ingest (extract-transform-load)
Feedback on ontology; develop local ontologies
Open software (BSD license) and ontology for download
Next 6-12 months
Expand data ingest framework
Publications | people | grants | courses
Visualization in-page and at site level
National network application design
Improvements to VIVO to support modularity & customizability
User support material development
Expand functionality to meet developing use cases
Driving future participation
“User scenario”-based ontology and feature development
Usability; robust user support
Compelling proof of concept within consortium
Administrators, scientists, clinicians, students, staff…
Addressing interest from other institutions, partners
Conferences and workshops: National VIVO Conference (8/12/10; NYC)
Individual demos; meetings
Network Analysis & Visualization
At the individual investigator level
In-page, small graphs highlighting publication
history, collaborations or co-authorship networks
At the department or institutional level
Trends in research grants or publications
Collaboration networks across a larger group
Topical alignment with base maps of science
At the network level
Patterns and clusters by geography, topic, funding
agency, institution, discipline
Incorporate external data sources for
publications and affiliations.
Future versions of VIVO will:
Display visualizations of complex
research networks and relationships.
Link data to external applications and
Realize the full potential of the
Generate CVs and biosketches for
faculty reporting or grant proposals.
Get involved as an:
data provider, or
Call to action
Thank you! Questions?
A particular slide catching your eye?
Clipping is a handy way to collect important slides you want to go back to later.