5-14-13 An Introduction to VIVO Presentation Slides


Hot Topics: The DuraSpace Community Webinar Series, Series Five: “VIVO: Research Discovery and Networking.” Webinar #1: An Introduction to VIVO, May 14, 2013.
Presented by: Dean Krafft, Chief Technology Strategist at Cornell University Library and Chair of the VIVO-DuraSpace Management Committee; Brian Lowe, Semantic Applications Programmer, Cornell; and Jon Corson-Rikert, VIVO Development Lead, Cornell.

Published in: Technology, Education
  • Historical overview: the motivation here at Cornell, before we thought of the larger context. Access across disciplines via a structure emphasizing connections over hierarchies. Predicated on information emerging from what people are doing: outputs, yes, but also other activities including grants, teaching, and talks.
  • Motivation for the NIH grant and the bigger vision of the VIVO network, from the opening: scholarly needs and desires, and NIH's realization of the benefits to science when communities have organized community resources like ontologies, databases, and repositories. NIH is also aware that it can’t fund the full cost, so it is looking for tools that are locally sustainable.
  • As you can see, the VIVO project itself is a rather large, geographically dispersed team: 7 institutions. Project areas: development, implementation, ontology, and outreach.
  • Abawi slide (small). Predicated on information emerging from what people are doing: outputs, yes, but also other activities including grants, teaching, and talks.
  • Reasoning example: sameAs. An ontology is a representation of entities and relations … for a part of reality … expressed in human- and computer-interpretable form.
  • The Mike Conlon slide
  • Results link back to home institution
  • See all or see nothing across colleges. Information in aggregate has been known to reveal unintended identifiable data.
  • Our philosophy is that you can save time and improve currency and accuracy if you only have to input information one time. This is a feature that is very appealing to faculty. [Cornell intends to participate in the Economic Development Portal being developed by NY????? Having data in VIVO means we can pipe the data to the Portal; faculty don’t have to fill out yet another form.] The College of Arts and Sciences is using VIVO to feed core data into departmental websites.
  • Another example of the reuse of data is the geographic display of data. CALS faculty report the impact of their work, and include the geographic focus. Using that data we can generate displays of where particular work is being done. The map in the upper right is New York State, but we can instantly render similar maps for the United States and the world.
  • Transform static data into a network. Leverage relationships: topic, place, and shared activities. We’re not just the nodes, we’re the connections.
  • IICA
  • Dean will be talking about this
  • But it’s a many-layered problem. Mention Bill Trochim, “Fifteen Most Promising Clinical Research Processes and Outcomes Metrics,” from Evaluation KFC Annual Meeting: time from IRB submission to approval; studies meeting accrual goals; time from notice of grant award to study opening (e.g., investigator initiated studies); number of technology transfer products; volume of investigators who used services; volume of types of services used; satisfaction/needs assessment; time to publication; influence of research publication (e.g., observed/expected citations); researcher collaboration (e.g., team science; collaboration index); ROI of pilot and KL2 scholars; time from publication to a research synthesis; career development; career trajectory (e.g., K-R transition); institutional collaboration (public-private; cross-institutional; community).
  • ISF, euroCRIS, CASRAI, Lattes – pushing toward data compatibility across the world's major initiatives
  • University of Colorado Boulder Laboratory for Atmospheric and Space Physics: additional information about research projects, equipment, and facilities will be stored behind a firewall, linked to the main Boulder VIVO via sameAs.
  • Focus on “enter once, use many times.” Cornell needs data, including expertise, partnerships, and geographic focus, for: the Cornell sesquicentennial campaign; NY state economic development; the SUNY Knowledge Network; Carnegie classification as an institution of community engagement; and competitive landscape analysis. Highlight Cornell’s uniqueness while feeding into national initiatives. Build on our existing partnerships, through the Library, campus-wide teams, and specific units as necessary. Strong collaboration already with Weill; build on VIVO’s momentum to make Cornell a leader in this domain.

    1. Hot Topics: The DuraSpace Community Webinar Series (May 14, 2013). Series Five: “VIVO: Research Discovery & Networking.” Curated by Dean Krafft
    2. Webinar 1: Overview of VIVO. Presented by: Brian Lowe, Semantic Applications Programmer, Cornell; Jon Corson-Rikert, VIVO Development Lead, Cornell; Dean Krafft, Chief Technology Strategist at Cornell University Library and Chair of the VIVO-DuraSpace Management Committee
    3. What is VIVO?
       • A semantic-web-based researcher and research discovery tool
         – People plus much more
       • Institution-wide, publicly-visible information
         – For external as well as internal audiences
       • An open, shared platform for connecting scholars, communities, campuses, and countries using Linked Open Data
    4. How did we get here? 31 authors, 6 institutions
    5. A brief VIVO history
       • 2003-2005: First realization for the life sciences at Cornell, as a relational database
       • 2006-2008: Expansion to all disciplines at Cornell, and conversion to Semantic Web
       • 2009-2012: National Institutes of Health-sponsored VIVO: Enabling the National Networking of Scientists project transforms VIVO to a multi-institutional open source platform
       • 2013-2014: VIVO Incubator Project with DuraSpace for open community development
    6. Major opportunity, 2009. NIH … “invites applications designed to develop, enhance, or extend infrastructure for connecting people and resources to facilitate national discovery of individuals and of scientific resources by scientists and students to encourage interdisciplinary collaboration and scientific exchange.”
    7. National partnership, 2009
    8. VIVO Collaboration
       • Cornell University: Dean Krafft (Cornell PI); Manolo Bevia; Jim Blake; Nick Cappadona; Brian Caruso; Jon Corson-Rikert; Elly Cramer; Medha Devare; Elizabeth Hines; Huda Khan; Depak Konidena; Brian Lowe; Joseph McEnerney; Holly Mistlebauer; Stella Mitchell; Anup Sawant; Christopher Westling; Tim Worrall; Rebecca Younes
       • University of Florida: Mike Conlon (VIVO and UF PI); Beth Auten; Michael Barbieri; Chris Barnes; Kaitlin Blackburn; Cecilia Botero; Kerry Britt; Erin Brooks; Amy Buhler; Ellie Bushhousen; Linda Butson; Chris Case; Christine Cogar; Valrie Davis; Mary Edwards; Nita Ferree; Rolando Garcia-Milan; George Hack; Chris Haines; Sara Henning; Rae Jesano; Margeaux Johnson; Meghan Latorre; Yang Li; Jennifer Lyon; Paula Markes; Hannah Norton; James Pence; Narayan Raum; Nicholas Rejack; Alexander Rockwell; Sara Russell Gonzalez; Nancy Schaefer; Dale Scheppler; Nicholas Skaggs; Matthew Tedder; Michele R. Tennant; Alicia Turner; Stephen Williams
       • Indiana University: Katy Borner (IU PI); Kavitha Chandrasekar; Bin Chen; Shanshan Chen; Ryan Cobine; Jeni Coffey; Suresh Deivasigamani; Ying Ding; Russell Duhon; Jon Dunn; Poornima Gopinath; Julie Hardesty; Brian Keese; Namrata Lele; Micah Linnemeier; Nianli Ma; Robert H. McDonald; Asik Pradhan Gongaju; Mark Price; Michael Stamper; Yuyin Sun; Chintan Tank; Alan Walsh; Brian Wheeler; Feng Wu; Angela Zoss
       • Ponce School of Medicine: Richard J. Noel, Jr. (Ponce PI); Ricardo Espada Colon; Damaris Torres Cruz; Michael Vega Negrón
       • The Scripps Research Institute: Gerald Joyce (Scripps PI); Catherine Dunn; Sam Katkov; Brant Kelley; Paula King; Angela Murrell; Barbara Noble; Cary Thomas; Michaeleen Trimarchi
       • Washington University School of Medicine in St. Louis: Rakesh Nagarajan (WUSTL PI); Kristi L. Holmes; Caerie Houchins; George Joseph; Sunita B. Koul; Leslie D. McIntosh
       • Weill Cornell Medical College: Curtis Cole (Weill PI); Paul Albert; Victor Brodsky; Mark Bronnimann; Adam Cheriff; Oscar Cruz; Dan Dickinson; Richard Hu; Chris Huang; Itay Klaz; Kenneth Lee; Peter Michelini; Grace Migliorisi; John Ruffing; Jason Specland; Tru Tran; Vinay Varughese; Virgil Wong
       This project is funded by the National Institutes of Health, U24 RR029822, “VIVO: Enabling National Networking of Scientists”
    9. What does VIVO do?
       • Integrates multiple sources of data
         – Systems of record
         – Faculty activity reporting
         – External sources (e.g., Scopus, PubMed, NIH RePORTER)
       • Provides a review and editing interface
         – Single sign-on for self-editing or by proxy
       • Provides integrated, filterable feeds to other websites
    10. People
    11. People and what they do
    12. Structured data for visualizations
    13. Enabling an (inter)national network
       • Open software
       • Open data
       • Local control
       • Decentralized infrastructure
    14. What does VIVO model?
       • People and more
         – Organizations, grants, programs, projects, publications, events, facilities, and research resources
       • Relationships among the above
         – Meaningful
         – Bidirectional
         – Navigable context
       • Links to URIs elsewhere
         – Concepts, identifiers
         – People, places, organizations, events
    15. Typical data sources
       • HR – people, appointments
       • Research administration – grants & contracts
       • Registrar – courses
       • Faculty reporting system(s)
         – publications, service, research areas, awards
       • Events calendar
       • Internal and external news
       • External repositories – e.g., PubMed, Scopus
    16. Value for institutions
       • Common data substrate
         – Public, granular and direct
         – Discovery via external and internal search engines
         – Available for reuse at many levels
       • Distributed curation
         – E.g., affiliations beyond what HR system tracks
         – Data coordination across functional silos
         – Feeding changes back to systems of record
         – Direct linking across campuses
       • Data that is visible gets fixed
    17. The Semantic Web
       • Turn data into a web of simple links
       • Use ontology to explain how things are linked
       • Use reasoning to add new links automatically
       • Be flexible and extensible
    18. The VIVO ontology
       • Describe people and organizations in the process of doing research
       • Stay discipline neutral
       • Use existing scientific domain terminology to describe content of research
    19. What is Linked Open Data (LOD)?
       • Data
         – Structured information, not just documents with text
         – A common, simple format
       • Open
         – Available, visible, mine-able
         – Anyone can post, consume, and reuse
       • Linked
         – Directly by reference
         – Indirectly through common references and inference
    20. Linked Open Data
    21. Linked data indexed for search. [Diagram labels: Linked Open Data; Ponce VIVO; WashU VIVO; IU VIVO; Cornell Ithaca VIVO; Weill Cornell VIVO; UF VIVO; Scripps VIVO; other VIVOs; eagle-i research resources; Harvard Profiles RDF; Digital Vita RDF; Iowa Loki RDF; Solr search index; another Solr index; vivosearch.org]
    22. Implementation challenges
       • A simple idea – take the basic public information about researchers at Cornell and make it easy to find for academic purposes
       • Why is this hard?
    23. Policy issues
       • Dirty data
       • Lack even of common definitions of organizations or who’s faculty
       • Data ownership
       • Many dimensions of privacy
       • Short-term “go it alone” vs. common good
    24. Enter data once, use it many times
    25. Weill Cornell research reporting
       • How has the number of publications co-authored with other institutions changed year to year?
    26. Multi-institutional scenarios for VIVO
       • Multiple campuses of one university
       • University and federal lab connections
         – E.g., Colorado ties with regional federal labs
       • Consortia – 60 CTSAs
       • International
         – 13 Netherlands universities and the National Library
         – AgriVIVO
    27. Benefits across institutions
       • Sharing experience provides clarity and new ideas
       • Incentives from sharing development, tools, customizations
       • Potential data-level connectivity
         – Research is happening increasingly in teams that span institutions
         – Meeting the needs of short and long-term virtual organizations
    28. From outputs to outcomes
       • Outputs like papers and patents can be tracked
         – Collaborative ontology effort to adequately represent the humanities
       • Outcomes such as economic impact or societal benefit are much harder to identify
       • Questions about return on research investment beg for consistent, comparable data
         – over time
         – across institutions
         – across domains
    29. International engagement
    30. International engagement
    31. Partnerships – ORCID
       • Open Researcher and Contributor ID
         – Attribution for works of any type
       • ORCID and VIVO
         – ORCID is an attribute in a VIVO profile
         – Tools being tested for submission of researcher registrations from VIVO
       http://orcid.org
    32. VIVO/DuraSpace Partnership
       • DuraSpace is a not-for-profit organization supporting the DSpace and Fedora repositories
       • Serves as the open source community home for future VIVO development
       • Provides a legal and financial framework, extensive tools, and proven track record of managing community developed open source projects
       • Joint two-year initial governance based on founding sponsors, management team, and dedicated development and leadership effort
    33. The VIVO Community
    34. Meeting about VIVO
       • 2nd Australian VIVO Days in February
       • CU Boulder hosted 50 attendees for the 3rd VIVO Implementation Fest in April
       • May 20th VIVO event for New York City area institutions
       • August 2013 will be the 4th Annual VIVO Conference – approximately 200-250 attendees, with workshops, papers, keynotes, invited talks, and posters
    35. Research Informatics Infrastructure
       • USDA adopting for intramural research, and also using VIVO to knit together data from their 7 major agencies to fulfill reporting mandates to Office of Science & Technology Policy and Congress
       • National Center for Atmospheric Research (NCAR) is piloting VIVO to coordinate large, multi-year, multi-institutional, multi-instrument research projects
    36. Research Informatics Infrastructure – cont.
       • Accurate, structured VIVO data can feed external profiling and discovery systems (ORCID, Google Scholar, Academic Analytics, etc.)
       • VIVO extensibility allows it to represent research resources and tie them to research datasets, publications, and researchers, promoting data discovery and reuse
    37. VIVO for atmospheric and space physics
    38. CTSAconnect and the ISF
       • VIVO and eagle-i team members won NIH funding in 2012 for a project to unify their ontologies and extend both in the clinical domain
       • The unified ontology is known as the Integrated Semantic Framework, or ISF
       • VIVO 1.6 and eagle-i’s next release will use the ISF
       • This combined ontology is modular to allow selective data population based on local needs
    39. Tying biomedical research to clinical delivery
    40. Challenges
       • Communicating VIVO’s goals to faculty, administrators, funders, and other institutions
       • Adapting to constant changes in data sources
       • Fully exploiting the opportunities provided by VIVO linked open data
       • Co-existing in a world where not everyone uses VIVO
       • Positioning VIVO on a sustainable path
    41. Next Webinar: Case Studies
       • Tuesday, June 4
       • Colorado
       • Duke
       • Brown
       • Weill Cornell Medical College
    42. 3rd Webinar – Technical Deep Dive
       • Tuesday, June 11
       • Ontology & Linked Data
       • Open source technologies used
       • What’s coming in v1.6
       • VIVO technical community touch points
       • Many ways to participate, benefit, and contribute
    43. Questions?
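
The Semantic Web slide ("turn data into a web of simple links", "use reasoning to add new links automatically") and the note about linking a firewalled VIVO to the main Boulder VIVO via sameAs can be illustrated with a small sketch. This is not VIVO code: it is a plain-Python toy, assuming data is held as (subject, predicate, object) triples, and every URI and the `ex:usesFacility` property are hypothetical. It also copies statements only; a full OWL reasoner would additionally derive the symmetric and transitive sameAs links themselves.

```python
# Toy illustration of "use reasoning to add new links automatically".
# All URIs and the ex: property are hypothetical, for illustration only.

SAME_AS = "owl:sameAs"


def infer_same_as(triples):
    """Return the input triples plus copies implied by owl:sameAs links.

    URIs connected by sameAs (directly or through a chain) are treated as
    one individual, so a statement about any of them is copied to all.
    """
    parent = {}

    def find(uri):
        # Union-find lookup with path halving.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]
            uri = parent[uri]
        return uri

    # Merge the two sides of every sameAs statement into one group.
    for s, p, o in triples:
        if p == SAME_AS:
            parent[find(s)] = find(o)

    groups = {}
    for uri in list(parent):
        groups.setdefault(find(uri), set()).add(uri)

    def aliases(term):
        # Terms never mentioned in a sameAs statement only alias themselves.
        return groups[find(term)] if term in parent else {term}

    inferred = set(triples)
    for s, p, o in triples:
        if p == SAME_AS:
            continue
        for s2 in aliases(s):
            for o2 in aliases(o):
                inferred.add((s2, p, o2))
    return inferred


triples = {
    ("boulder:person/1", "rdfs:label", "Example Researcher"),
    ("lasp:person/1", "ex:usesFacility", "lasp:facility/7"),
    ("lasp:person/1", SAME_AS, "boulder:person/1"),
}
closed = infer_same_as(triples)
new_links = sorted(closed - triples)
for triple in new_links:
    print(triple)
```

Here the single sameAs statement lets the label recorded in one instance and the facility link recorded in the other each apply to both URIs, which is the kind of indirect linking "through common references and inference" described on the Linked Open Data slide.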