Slide 1 - Coalition for Networked Information


Published on

Published in: Education
1 Like
  • Be the first to comment

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide
  • Target Audience: Potential VIVO users, local faculty members, administrators, public audiences

    Presentation Goals:
    Provide a general overview of the VIVO project.
    Describe how VIVO will be implemented at a local level.
    Provide general information about VIVOs main features, functionalities, and advantages.
  • As research effort become more interdisciplinary, it can be hard to find collaborators outside your own area of expertise. VIVO enables collaboration and understanding across an institution and among institutions – and not just for scientists.
  • Goal of VIVO: Improve all of science by providing the means for sharing and using current, accurate, and precise information regarding scientists’ interest, activities, and accomplishments.
    Foster team science by providing tools for identifying potential collaborators.
    Improve collaboration by creating tools using this information for enhancing new and existing teams.
    Not limited to science – at Cornell, VIVO covers all disciplines across the entire institution
  • Profiles vs. web pages/social sites: richer content & more explicit

    Profiles are richer in content and much more complete and explicit than typical [web pages or] social networking sites
  • VIVO is useful to many different users and audiences both within and outside of the participating institutions.
    VIVO can result in increased grant and donor funding. For instance, the Cornell Alumni Affairs and Development Office was contacted by a major company interested in making research funds available to scientists working on issues related to urology; a search in VIVO-Cornell quickly returned several faculty in diverse disciplines working in this area, translating to real dollars.
  • Since VIVO stores profile information drawn from a variety of sources in a single, flexible format, it can be easily “re-skinned” or “re-purposed” to present specialized views into the institution.
  • Authoritative data, diverse formats, filter out private information

    Talk about verified data
    Talking points:
    Much of the data in VIVO profiles is ingested from authoritative sources so it is accurate and current, reducing the need for manual input.
    Private or sensitive information is never imported into VIVO. Only public information will be stored and displayed.
    Data is housed and maintained at the local institutions. There it can be updated on a regular basis.

    There are three ways to get data: internal, external, individuals. Internal is authoritative!

    The rich information in VIVO profiles can be repurposed and shared with other institutional web pages and consumers, reducing cost and increasing efficiencies across the institution.
  • Talking points: Simple format
    Read and write standard data and ontologies
    Leverage open source and now commercial tools
    More and more data available on the Web (e.g., Bio2RDF)

  • Feel for the VIVO ontology
  • Search results are faceted so information can be located rapidly and with less time spent sorting through information.
  • Emphasis these are links, not just text. Click on the name and it will take you to his profile.
  • This search on homeostasis reveals the departments and interdisciplinary nature of the research. This slide demonstrates that someone in Molecular Genetics might be a good person to work with!
    Note: refine by facets
  • Each statement is a complete fact unto itself.
  • The “Hendler Hypothesis” – a little semantics goes a long way
  • Vitro
    All-in-1 ontology editing, data management, and public display
    Integrates data to display on the Web or share as feeds

    Visually themed and branded for hosting institution
    Ontology optimized for discovering people and research

    Harvests exposed VIVO data at 1 or more points of shared discovery
    Capable of pulling in RDF data from sources other than VIVO

  • Talking points:
    Much of the data in VIVO profiles is ingested from authoritative sources so it is accurate and current, reducing the need for manual input.
    The rich information in VIVO profiles can be repurposed and shared with other institutional web pages and consumers, reducing cost and increasing efficiencies across the institution.
    Private or sensitive information is never imported into VIVO. Only public information will be stored and displayed.
    Data is housed and maintained at the local institutions. There it can be updated on a regular basis.
    Search results are faceted so information can be located rapidly and with less time spent sorting through information.
  • The VIVO application was originally developed at Cornell University in 2003.
    UF implemented in 2007
    and in sept 2009 Cornell, UF and five partner institutions received the $12 million stimulus grant from the NIH to enable national networking of researchers

    All 7 institutions have VIVO installed and data is stored and maintained locally by the institution
  • Because the data is:
    RDF compliant
    Data is open
    We have a core ontology that all VIVOs conform with – making the data interoperable and consistent

    And enables networking on a national scale

    We will continue to build the network
  • Through varying types of participation.

    Adoption -Institutions (columbia, nw), consortias (CTSAs) and federal agencies can install VIVO locally. If institutions already have another platform – they can also participate by sharing their semantic web-compliant data with the VIVO network.
    Participation can also include the provision of data.
    Whereas VIVO enables networking of researchers, Eagle-I enables resource discovery. Each resource may have a “creater” and, thus, has the potential to be linked to a researcher profile.
    Database aggregators and professional societies are also curators of data. Open access systems (such as PubMed) or subscription databases (such as Scopus/Web of Science), or even Collexis (a similar resource to VIVO) can participate through the provision of data

    Collexis has a separate system called biomed experts which is a profile system – and VIVO data can be added to those profiles – and also has a large amount of disambiguated publications extracted from pubmed which can be purchased for a local VIVO system.
  • Additional participation can include application developers such as DERI or Digital Vita, who find ways to extract and reuse VIVO data

    Lastly, as data is linked open data, search engines, professional societies and other producers of web-compliant data can participate through VIVO data consumption

  • There are a number of challenges VIVO faces in facilitating adoption.

    What researchers want from VIVO can be very different from what institutions want
    At UF researchers are interested in CVs, biosketches and collaborations. UF administration are more focused on faculty reporting needs – although they are also interested in increasing interdisciplinary collaborations. Finding common ground is important as we build the VIVO network.

    Additionally, VIVO was developed primarily for a researching community – which has been enthusiastic and willing to participate. Enthusiasm may vary based on where a researcher is at in their career, but VIVO is generally welcomed.

    As we expand to the biomedical community, we are finding that the clinical community is driven by different business needs.

    Looking at this graph of adoption trends –
    Innovators and early adopters are on the front end; and a small amount of laggards on the back in, all of which tend to be excited about new technologies – regardless of marketing.
    In the middle of this graph, however, we have early and late majority, which includes 68% of potential adopters. Our challenge is to prove to people that it’s going to last and the return they are investing is worth the effort.
    One of the advantages of VIVO is the linkage with local data sources – reducing the effort of individuals who want to use VIVO.
    Reaching their needs is crucial to VIVO’s success.
  • VIVO chooses to meet these participation challenges with the help of libraries – neutral campus entities that understand their user communities and their research environment.
    Libraries are increasingly involved in IT decisions – in fact my first librarian position was in a merged library/IT department.

    Libraries have subject experts. While I work closely with VIVO at UF, I also provide reference and instructional support to the agricultural sciences. I know what it takes to facilitate adoption within my agricultural science community.

    Librarians have good knowledge of their user communities – they understand what drives their institution’s research environment.

    Of course, librarians have a strong service ethic and have a long history of providing academic support

    Facilitate the resolution of data integration issues endemic to legacy data/faculty reporting systems.

  • So what do librarians DO?

    Identification of content types – ontology ;development and interface refinement
    We negotiate with the owners of local data sources to explain what data we need and why we need it.

    We provide local and national support and training through the development of documentation, presentations and help-desk services

    Using the website we liaise with potential collaborators – giving demos and answering questions
    Creating a community of support
    And delivering usability feedback to the technical team.

    Marketing – through demonstrations, conferences, workshops and the development PR materials designed both to attract new participants and to assist participants with local adoption
  • Our assumption is that if we provide strong user support and iterative user testing that will faciliatate participation – because will have confidence that it is being supported.
  • Slide 1 - Coalition for Networked Information

    1. 1. D. Krafft, V. Davis: Presenters Co-Authors: J. Corson-Rikert, M.Conlon, M. Devare, B.Lowe, B.Caruso, K.Börner, Y.Ding, L.McIntosh; M. Conlon; and VIVO Collaboration*
    2. 2. Cornell University: Dean Krafft (Cornell PI), Manolo Bevia, Jim Blake, Nick Cappadona, Brian Caruso, Jon Corson-Rikert, Elly Cramer, Medha Devare, John Fereira, Brian Lowe, Stella Mitchell, Holly Mistlebauer, Anup Sawant, Christopher Westling, Rebecca Younes. University of Florida: Mike Conlon (VIVO and UF PI), Cecilia Botero, Kerry Britt, Erin Brooks, Amy Buhler, Ellie Bushhousen, Chris Case, Valrie Davis, Nita Ferree, Chris Haines, Rae Jesano, Margeaux Johnson, Sara Kreinest, Yang Li, Paula Markes, Sara Russell Gonzalez, Alexander Rockwell, Nancy Schaefer, Michele R. Tennant, George Hack, Chris Barnes, Narayan Raum, Brenda Stevens, Alicia Turner, Stephen Williams. Indiana University: Katy Borner (IU PI), William Barnett, Shanshan Chen, Ying Ding, Russell Duhon, Jon Dunn, Micah Linnemeier, Nianli Ma, Robert McDonald, Barbara Ann O'Leary, Mark Price, Yuyin Sun, Alan Walsh, Brian Wheeler, Angela Zoss. Ponce School of Medicine: Richard Noel (Ponce PI), Ricardo Espada, Damaris Torres. The Scripps Research Institute: Gerald Joyce (Scripps PI), Greg Dunlap, Catherine Dunn, Brant Kelley, Paula King, Angela Murrell, Barbara Noble, Cary Thomas, Michaeleen Trimarchi. Washington University, St. Louis: Rakesh Nagarajan (WUSTL PI), Kristi L. Holmes, Sunita B. Koul, Leslie D. McIntosh. Weill Cornell Medical College: Curtis Cole (Weill PI), Paul Albert, Victor Brodsky, Adam Cheriff, Oscar Cruz, Dan Dickinson, Chris Huang, Itay Klaz, Peter Michelini, Grace Migliorisi, John Ruffing, Jason Specland, Tru Tran, Jesse Turner, Vinay Varughese. VIVO Collaboration:
    3. 3. • What is VIVO? • How does it work? • How do we implement it? • What’s ahead? Overview
    4. 4. • VIVO will help create the collaborations that are increasingly crucial in science by facilitating communication and collaboration across interdisciplinary and institutional boundaries NOT ONLY for scientists but also for administrators, students, faculty, donors, funding agencies, and the public. Solution: • Researchers often struggle to locate and communicate with collaborators across fields and outside rigidly defined organizational confines Problem:
    5. 5. What is VIVO?
    6. 6. VIVO is: Populated with detailed profiles of faculty and researchers; displaying items such as publications, teaching, service, and professional affiliations. A powerful search functionality for locating people and information within or across institutions. A semantic web application that enables the discovery of research and scholarship across disciplines in an institution.
    7. 7. •Illustrate academic, social, and research networks. •Showcase special skills or expertise. •Establish connections and communities within research areas and geographic expertise. •Summarize credentials and professional achievements. •Publish the URL or link the profile to other applications. A VIVO profile allows researchers to:
    8. 8. Who can use VIVO? …and many more! • Showcasing departmental activities • Centralizing faculty information • Searching for specialized expertise • Visualizing research activity within an institution • Locating graduate programs, mentors, or advisors • Looking for seminars or events • Looking for collaborators on large or multi-disciplinary grants • Keeping abreast of new research Faculty & Researchers Students Funding Agencies Administrators
    9. 9. VIVO as disseminator
    10. 10. How does it work?
    11. 11. VIVO harvests much of its data automatically from verified sources • Reducing the need for manual input of data. • Centralizing information and providing an integrated source of data at an institutional level. Data, Data, Data Individuals may also edit and customize their profiles to suit their professional needs. External data sources Internal data sources
    12. 12.  Information is stored using the Resource Description Framework (RDF) .  Data is structured in the form of “triples” as subject-predicate-object.  Concepts and their relationships use a shared ontology to facilitate the harvesting of data from multiple sources. Storing Data in VIVO Jane Smith is member of author of has affiliations with Dept. of Genetics College of Medicine Journal article Book chapter Book Genetics Institute Subject Predicate Object
    13. 13. Detailed relationships for a researcher Andrew McDonald author of has author research area research area for academic staff in academic staff Susan Riha Mining the record: Historical evidence for… author of has author teaches research area for research area headed by NYS WRI Earth and Atmospheric Sciences crop management CSS 4830 Cornell’s supercomputers crunch weather data to help farmers manage chemicals head of faculty appointment in faculty members taught by featured in features person
    14. 14. Query and explore  By individual  Everything about an event, a grant, a person  By type  Everything about a class of events, grants, or persons  By relationship  Grants with PIs from different colleges or campuses  By combinations and facets  Explore any publication, grant, or talk with a relationship to a concept or geographic location  Explore orthogonally (navigate a concept or geographic hierarchy)
    15. 15. By individual: a grant
    16. 16. Browse by Type: seminar series
    17. 17. Results by Topic: homeostasis
    18. 18. Advantages of an ontology approach  Provides the key to meaning  Defines a set of classes and properties in a unique namespace  Embedded as RDF so data becomes self- describing  Definitions available via the namespace URI  Helps align RDF from multiple sources  VIVO core ontology maps to common shared ontologies organized by domain  Local extensions roll up into VIVO core
    19. 19. Information flow with shared ontologies Log boom on Williston Lake, British Columbia
    20. 20. Information flow without shared ontologies Log jam, looking up the dalles, Taylor's Falls, St. Croix River, MN. Photo by F.E. Loomis
    21. 21. Linked Data principles  Tim Berners-Lee:  Use URIs as names for things  Use HTTP URIs so that people can look up those names  When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)  Include links to other URIs so that people can discover more things  ml 
    22. 22. VIVO enables authoritative data about researchers to join the Linked Data cloud
    23. 23. Challenges in our approach  Granularity levels  Terminologies  Scalability  Disambiguation  Provenance  Temporality Keynote at 2003 Int’l Semantic Web Conference BUT - “I believe that a WEB of Semantic terminology (anchored in URIs qua RDF/OWL) will create a powerful network effect, where the wealth of knowledge (despite being open and uncurated) creates a wide range of new possibilities, without the need for expressive logics and powerful reasoners (that cannot scale to web sizes).” – Jim Hendler
    24. 24. Major VIVO Components  A general-purpose, open source Web-based application leveraging semantic standards: ontology editor, data manager, display manager  Customizations of this application for VIVO  Ontology  Display theming  Installation  Additional new software to enable a distributed network of RDF by harvesting from VIVO instances or other sources
    25. 25. Search and browse interface Editing Display, search and navigation setup Curator editing Ontology Editing Data ingest Data export curators ontology editing & data flow end users VIVO’s three functional layers
    26. 26. Local data flow local systems of record national sources data ingest ontologies (RDF)> > VIVO (RDF) > shared as RDF interactive inputPeoplesoft Grants DB PubMed Publishers Researchers Librarians Administrative Staff Self-Editors RDFa RDF harvest SPARQL endpoint
    27. 27. From local to national > VIVO local sources nat’l sources > share as RDF website data search browse visualize share as RDF search browse visualize • Cornell University • University of Florida • Indiana University • Ponce School of Medicine • The Scripps Research Institute • Washington University, St. Louis • Weill Cornell Medical College Local National text indexing filtered RDF
    28. 28. Visuali- zation Ponce VIVO WashU VIVO Scripps VIVO UF VIVO IU VIVO WCMC VIVO Cornell VIVO RDF Triple Store RDF Triple Store Future VIVO Future VIVO Future VIVO Other RDF Other RDF Other RDF Prof. Assn. Triple Store Regional Triple Store Search Other RDF Search Linked Open Data National networking
    29. 29. How do we implement it?
    30. 30. Institutions currently working to enable National Networking with VIVO Work to enable National Networking is supported via stimulus funds from the National Center for Research Resources (NCRR) of the National Institutes of Health (NIH). Start with the project participants
    31. 31. Enable a National Network The National VIVO Network enables the discovery of research across institutions. Data is imported, stored, and maintained by the individual institutions. VIVO allows researchers to browse and …enables networking on a national scale.
    32. 32. Participation: potential adopters and data providers  Individual institutions – Columbia University, …  Federal agencies – NIH (NIH RePORTER), NSF, …  Consortia of schools – SURA, CTSA…  eagle-i (“enabling resource discovery” U24 award)  Publishers/vendors – PubMed, Elsevier, Collexis, ISI…  Professional societies – AAAS, …
    33. 33. Participation: application developers and consumers  Semantic Web community – DERI, others…  Search Providers – Google, Bing, Yahoo, …  Professional Societies – AAAS, …  Producers, consumers of semantic web-compliant data
    34. 34. Facilitating adoption: Challenges  Scientist vs. administrator  Individual vs. institutional  Research vs. clinical community  Researchers: Early vs. established; enthusiastic, willing to participate  Clinicians: Less enthusiastic; “business” accumulated in other ways  Innovators, early adopters  Excited; willing to experiment  Early, late majority  Hesitant to adopt new technologies
    35. 35. Facilitating adoption & participation  Support and dissemination through libraries  Neutral, trusted institutional entities  IT, information management and dissemination expertise  Subject experts; experience in translational/outreach roles  Knowledge of institutional, research, instruction landscape  Service ethic; tradition of academic support  Recognition of challenges posed by user communities…  Facilitate resolving data integration problems endemic to legacy systems
    36. 36. Librarians as facilitators: Our model  Oversight of initial content development  Content types, ontology, interface refinement…  Negotiation with campus data stewards for publicly visible data  Support and training: local and national level  Documentation, presentation/demo templates  Help-desk support, videos, tutorials, web site FAQs, etc.  Communication/liaising  Maintain web site (  Engage with potential collaborators, participants  Create community of support via user forums  Usability: Feedback, new use cases from users to technical team  Marketing  Demonstrations/exhibits, conferences, workshops, website  PR materials
    37. 37. What’s ahead?
    38. 38. Near-term developments  Release 1 (as of Friday April 16th)  Deployment at 7 sites on production hardware  Mapping local data to R1 ontology  Batch data ingest (extract-transform-load)  Upcoming Releases  Feedback on ontology; develop local ontologies  Improved usability  Open software (BSD license) and ontology for download  Next 6-12 months  Expand data ingest framework  Publications | people | grants | courses  Visualization in-page and at site level  National network application design  Improvements to VIVO to support modularity & customizability  User support material development  Expand functionality to meet developing use cases
    39. 39. Driving future participation  “User scenario”-based ontology and feature development  Usability; robust user support  Compelling proof of concept within consortium  Administrators, scientists, clinicians, students, staff…  Addressing interest from other institutions, partners   Conferences and workshops: National VIVO Conference (8/12/10; NYC)  Individual demos; meetings  Exhibits, publicity
    40. 40. Network Analysis & Visualization  At the individual investigator level  In-page, small graphs highlighting publication history, collaborations or co-authorship networks  At the department or institutional level  Trends in research grants or publications  Collaboration networks across a larger group  Topical alignment with base maps of science  At the network level  Patterns and clusters by geography, topic, funding agency, institution, discipline
    41. 41. Incorporate external data sources for publications and affiliations. Future versions of VIVO will: Display visualizations of complex research networks and relationships. Link data to external applications and web pages. Realize the full potential of the semantic web. Generate CVs and biosketches for faculty reporting or grant proposals.
    42. 42.  Get involved as an:  adopter,  data provider, or  application developer  Visit:  Call to action Thank you! Questions?