VIVO at the University of Idaho

In 2012, the University of Idaho Library began implementing VIVO, an open-source Semantic Web application, both as a discovery layer for its fledgling institutional repository and as a database to describe, visualize, and report university research activity. The presenters will detail some of the challenges they encountered developing this resource, while discussing the tools and techniques they used for obtaining, editing, and uploading institutional data into the RDF-based VIVO system.

Usage Rights

© All Rights Reserved


VIVO at the University of Idaho: Presentation Transcript

  • VIVO at the University of Idaho
    Shiny Happy People Holding Nodes: Using VIVO (a Semantic Web Application) to Reveal University of Idaho Research and Researchers
  • What is VIVO?
      • An open-source …
      • Semantic Web application …
          • RDF (Resource Description Framework) triples, which are controlled subject-predicate-object expressions that produce consistent relationships and data harvesting procedures
          • Data structured so that it can be shared and reused using Linked Data practices and standards …
      • Freely available, with a community of librarians and web developers
      • Collecting, ingesting, and publishing (public/private) data in batches to create a searchable, browseable, and reusable network of information on research and researchers
  • Early History of VIVO
      • 1997-2005: VIVO Network idea developed at Cornell for life and social sciences
          • Intended to provide a view of sciences and research “across disciplinary and administrative boundaries.”
      • 2005: Released for Life Sciences
      • 2007: Expanded to all of Cornell University (through the Library)
      • 2009: $12.2 million NIH grant provided to develop a national version with several other partners
      • 2010-present: More and more institutions adopting and developing VIVO instances
      • Source: “VIVO: Enabling National Networking of Scientists”
  • VIVO at the University of Idaho: Spring 2012 to Fall 2012
      • Approached by Idaho INBRE (a biomedical researcher network in Idaho) with a question about possibly installing a VIVO instance
      • Installed VIVO and began setting up and learning the system while gathering feedback from INBRE and other stakeholders
      • Garnered approval from INBRE faculty to publish their information in the system
      • Harvested INBRE-related information from public resources: PubMed and the NIH and NSF grants databases
  • VIVO at the University of Idaho: Spring 2013
      • Began to pursue an expanded VIVO
      • Received approval from the institutional IT evaluation group to go forward
      • Re-branded the instance
      • Presented VIVO to library faculty and administration as a possible project going forward
      • Presented the instance and a proposal for a new position to the VP of Research
  • VIVO at the University of Idaho: Summer 2013
      • VP approved expanded use of VIVO for research groups on campus and funding for the position
      • Annie Gaines begins as Scholarly Communication Librarian
      • Ingest, ingest, ingest
          • Added three additional research groups, as well as the Law School, and associated faculty
          • Added thousands of grants, publications, and people to the system
  • VIVO at the University of Idaho: Fall 2013
      • Presented VIVO publicly on campus for the first time
      • VIVO goes live (accessible from off campus)
      • Additional organizational descriptions added (department, college, grant structures, etc.)
      • Gained approval and access to use the campus database system, Banner
  • VIVO at the University of Idaho: VIVO Today
      • Beginning to explore VIVO as a front end for historical documents
      • Adding all university faculty
      • Creating applications and access points for data
      • Cleaning, always cleaning …
      • Using this presentation as a prompt for further development of the application, as well as further defining:
          • the system’s presentation
          • our data’s preservation
          • our mission and goals in using the system
  • Hosting
      • Provided by the Northwest Knowledge Network (NKN): www.northwestknowledge.net
          • NKN focuses on providing technical support to researchers
          • Division of UI’s Office of Research
          • Strong relationship with the UI Library (they are in the building)
      • Data is replicated to a data center at Idaho National Laboratory
      • Presents future opportunities for integrating VIVO’s information with other research-related tools and systems
  • Technical Specs
      • Our installation
          • Red Hat Linux
          • Apache Web Server
          • Tomcat
          • MySQL
      • Current version of VIVO
          • 1.5.2
          • Probably upgrade to 1.6 in March 2014
  • Building VIVO: Two Approaches
    Approach #1: the high-resource approach (ideal)
      • Requires
          • Available programmers and developers
          • Discrete IT department
          • Formal IT project management
      • Advantages
          • Advanced customization and configuration
          • High level of integration into existing systems and services
          • Reasonably short time from inception to production
      • Disadvantages
          • Red tape
          • Represents a large commitment by the unit
  • Building VIVO: Two Approaches
    Approach #2: the low-resource approach (practical)
      • Requires
          • Experimental mindset
          • Minimum recommended staff identified in the VIVO implementation guide
          • Viewing VIVO as a series of small projects, rather than one large integration into university activities
      • Advantages
          • Simple
          • Manageable
      • Disadvantages
          • Time (takes much longer)
          • Integration with existing services
          • Creation of custom data ingest tools
  • Implementation Goals
      • Start with low-hanging fruit: the data that is easiest to collect
      • When considering custom tools and processes, our priorities are:
          1. Re-use from the community or locally
          2. Buy if possible
          3. Build as needed
      • Build institutional interest in the existing data before soliciting more resources to further our development
      • Investigate third-party solutions (Symplectic Elements) as alternatives to custom-building internal methods of collecting data
  • Data Ingestion - General
    Typical workflow:
      1. Receive data in source format
      2. Convert to RDF (usually RDF/XML or Turtle)
      3. Associate with VIVO ontology (as needed)
      4. Reconcile against existing database
      5. Load into the application
      6. Re-index if needed
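The first four workflow steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the presenters' actual tooling: the CSV columns (`award_id`, `title`), the `http://example.org/individual/` namespace, and the sample rows are all made up. `vivo:Grant` and `rdfs:label` are real vocabulary terms. The output is N-Triples, which is also valid Turtle.

```python
import csv
import io

VIVO = "http://vivoweb.org/ontology/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
BASE = "http://example.org/individual/"  # hypothetical local namespace


def literal(text):
    """Quote a string as an N-Triples literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def rows_to_triples(source):
    """Steps 1-3: read source rows and describe each one as a VIVO-typed resource."""
    for row in csv.DictReader(source):
        uri = f"<{BASE}grant-{row['award_id']}>"
        yield (uri, f"<{RDF_TYPE}>", f"<{VIVO}Grant>")
        yield (uri, f"<{RDFS_LABEL}>", literal(row["title"]))


def reconcile(triples, existing_subjects):
    """Step 4: drop triples whose subject is already in the database."""
    return [t for t in triples if t[0] not in existing_subjects]


def serialize(triples):
    """Produce N-Triples text ready for step 5 (loading)."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Made-up sample data standing in for an agency export.
sample = io.StringIO("award_id,title\n1001,Soil Microbes\n1002,Wheat Genomics\n")
already_loaded = {f"<{BASE}grant-1001>"}
new_triples = reconcile(rows_to_triples(sample), already_loaded)
print(serialize(new_triples))
```

Only the second grant survives reconciliation here, because the first is already marked as loaded. A real reconciliation step would usually compare names, identifiers, or both rather than exact URIs.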
  • Data Ingestion - Sources
      • Public sources
          • NSF, NIH, USDA awards
          • PubMed
      • Commercial sources
          • Web of Science
          • Must remove “intellectual effort”
      • CVs, publication lists
          • Must have some means of soliciting them
      • Local databases (central university, research groups)
          • Several institutional sources
          • Must work through the gatekeepers of each
          • Need a data security review to ensure that institutional concerns are met before public exposure
  • Data Ingestion - Tools
      • VIVO Harvester
          • Extract, Transform, and Load (ETL) tool that takes data from a source and loads it into VIVO automatically
      • OpenRefine
          • Data cleaning tool
          • Very flexible for different datatypes
          • Extension enables export in RDF format
          • Reconciliation service allows us to match and deduplicate entries before export
      • Custom conversion tools (in Python)
          • Used for CRIS reports output, as well as other consistent but unusual formats
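To give a feel for what matching and deduplicating entries involves, here is a minimal standard-library sketch. It is a stand-in for the idea, not OpenRefine's reconciliation service: the `normalize` and `reconcile_names` helpers, the 0.85 similarity cutoff, and the sample names are all assumptions chosen for illustration.

```python
import difflib
import re


def normalize(name):
    """Lowercase, strip punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", name.lower())
    return " ".join(cleaned.split())


def reconcile_names(incoming, existing, cutoff=0.85):
    """Map each incoming name to the closest existing name, or None if no match."""
    index = {normalize(name): name for name in existing}
    matches = {}
    for name in incoming:
        close = difflib.get_close_matches(
            normalize(name), list(index), n=1, cutoff=cutoff
        )
        matches[name] = index[close[0]] if close else None
    return matches


# Made-up names; the first list stands in for labels already in the database.
existing = ["Doe, Jane A.", "Smith, Robert"]
incoming = ["Doe, Jane A", "doe, jane a.", "Nguyen, Linh"]
result = reconcile_names(incoming, existing)
for name, match in result.items():
    print(f"{name!r} -> {match!r}")
```

Names that map to `None` would be treated as new entries. Anything that matches below 1.0 similarity probably deserves a human look before it is merged.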
  • Ontology Extensions
      • Custom University of Idaho model prefixed with “uidaho:”
      • Goals with our extensions
          • Establish the local need before creating
          • Re-use as much as possible
          • Always associate classes within the VIVO hierarchy so that data is not fully reliant on uidaho for context
      • Examples
          • Members of Idaho EPSCoR, Idaho INBRE, REACCH-PNA
          • Non-UI/Courtesy Faculty
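Anchoring a local class inside the shared hierarchy might look like the following sketch, which generates a small Turtle fragment from Python. The namespace URI for `uidaho:` and the class names `INBREMember` and `CourtesyFaculty` are hypothetical; the slides do not give the actual model. `foaf:Person` and `vivo:FacultyMember` are existing classes that a VIVO installation already understands.

```python
UIDAHO_NS = "http://example.org/ontology/uidaho#"  # hypothetical namespace URI

PREFIXES = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "vivo": "http://vivoweb.org/ontology/core#",
    "uidaho": UIDAHO_NS,
}


def local_class(name, label, parent):
    """Declare a local class and tie it to a parent in the shared hierarchy."""
    return (
        f"uidaho:{name} a owl:Class ;\n"
        f'    rdfs:label "{label}" ;\n'
        f"    rdfs:subClassOf {parent} .\n"
    )


def build_extension(classes):
    """Assemble prefix declarations plus one class declaration per entry."""
    header = "".join(f"@prefix {p}: <{uri}> .\n" for p, uri in PREFIXES.items())
    body = "\n".join(local_class(*c) for c in classes)
    return header + "\n" + body


# Class names below are illustrative guesses, not the actual uidaho model.
turtle = build_extension([
    ("INBREMember", "Idaho INBRE Member", "foaf:Person"),
    ("CourtesyFaculty", "Courtesy Faculty", "vivo:FacultyMember"),
])
print(turtle)
```

Because each local class is a subclass of a shared one, a consumer that knows nothing about `uidaho:` can still treat these individuals as people or faculty members.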
  • Data Re-use - Fuseki
      • Apache Jena Fuseki project: jena.apache.org/documentation/serving_data/
      • Enables external access to VIVO data
          • Without Fuseki, data re-use is limited to those authenticated with the system
      • Created examples of data re-use to assist in marketing efforts
          • Goal: to establish the added value of putting data in VIVO
          • Example: labs that need to report the results of their research by creating publication lists, or by displaying spatial, temporal, or conceptual aspects of UI research to stakeholders or students, could use this feature
  • Data Re-use - Fuseki
    Example 1: A very simple way to look at awards data. This presents the number of awards by agency. It uses a JavaScript library called sgvizler to turn JSON data from Fuseki into a Google Charts visualization.
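The slide's example runs in the browser with sgvizler. As a rough Python equivalent of the data side, the sketch below builds a SPARQL query for awards per agency and reduces a SPARQL JSON result to simple counts. The endpoint URL, the dataset name `vivo`, and the sample response are assumptions. The property `vivo:grantAwardedBy` is also assumed, since property names differ between VIVO versions, so check it against the installed ontology.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical Fuseki endpoint; the real host and dataset name are not given.
ENDPOINT = "http://localhost:3030/vivo/query"

# vivo:grantAwardedBy is an assumption; verify it against your VIVO version.
QUERY = """
PREFIX vivo: <http://vivoweb.org/ontology/core#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?agency (COUNT(?grant) AS ?awards)
WHERE {
  ?grant a vivo:Grant ;
         vivo:grantAwardedBy ?org .
  ?org rdfs:label ?agency .
}
GROUP BY ?agency
ORDER BY DESC(?awards)
"""


def query_url(endpoint, query):
    """Build a SPARQL-protocol GET URL for the endpoint."""
    return endpoint + "?" + urlencode({"query": query})


def fetch(endpoint, query):
    """Run the query and return the raw SPARQL JSON results as text."""
    request = Request(
        query_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urlopen(request) as response:
        return response.read().decode("utf-8")


def awards_by_agency(results_json):
    """Turn SPARQL JSON results into a {agency: count} dictionary."""
    bindings = json.loads(results_json)["results"]["bindings"]
    return {b["agency"]["value"]: int(b["awards"]["value"]) for b in bindings}


# Made-up response in the standard SPARQL JSON results layout.
XSD_INT = "http://www.w3.org/2001/XMLSchema#integer"
sample = json.dumps({
    "head": {"vars": ["agency", "awards"]},
    "results": {"bindings": [
        {"agency": {"type": "literal", "value": "NSF"},
         "awards": {"type": "literal", "datatype": XSD_INT, "value": "12"}},
        {"agency": {"type": "literal", "value": "NIH"},
         "awards": {"type": "literal", "datatype": XSD_INT, "value": "7"}},
    ]},
})
print(awards_by_agency(sample))
```

Against a live server, `awards_by_agency(fetch(ENDPOINT, QUERY))` would produce the same kind of dictionary, which can then be handed to any charting tool.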
  • Data Re-use - Fuseki
    Example 2: Another simple view using sgvizler. This shows a comparison of two variables, awards and publications, for personnel in a specific research group. It would need work as a formal graph, but it points to the way that the data can be reused.
  • Data Re-use - Fuseki
    Example 3: Another simple example of data re-use, using a JavaScript/Ajax technique to display a list of journal titles and faculty within a specific research group. Links to the faculty members’ VIVO profiles are associated with their names.
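The original example does this rendering in the browser with JavaScript. The sketch below shows the same idea in Python instead: it takes rows shaped like SPARQL JSON bindings and produces an HTML list with each name linked to a profile. The variable names `profile`, `name`, and `journal`, and the sample row, are hypothetical.

```python
from html import escape


def faculty_list_html(bindings):
    """Render SPARQL JSON bindings as an HTML list of linked faculty names."""
    items = []
    for b in bindings:
        url = escape(b["profile"]["value"], quote=True)
        name = escape(b["name"]["value"])
        journal = escape(b["journal"]["value"])
        items.append(f'<li><a href="{url}">{name}</a>: <em>{journal}</em></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


# Made-up row; a real one would come from a query against the Fuseki endpoint.
rows = [
    {
        "profile": {"type": "uri", "value": "http://example.org/individual/n1234"},
        "name": {"type": "literal", "value": "Doe, Jane"},
        "journal": {"type": "literal", "value": "Soil Biology & Biochemistry"},
    },
]
page_fragment = faculty_list_html(rows)
print(page_fragment)
```

Escaping every value before it goes into markup matters here, because journal titles and names often contain ampersands and quotation marks.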
  • VIVO as Institutional Repository
  • Background
      • When Annie was brought on for Scholarly Communications, one of her tasks was to develop an IR for the UI.
      • Some potential platforms to use for the UI IR:
          • CONTENTdm: too flat
          • Bepress: too expensive
          • VIVO?
  • ‘Institutional repositories’
      • “A set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members.” Clifford Lynch, ARL Bimonthly Report 226, Feb. 2003.
      • “Digital collections that capture and preserve the intellectual output of university communities.” Raym Crow, The Case for Institutional Repositories, SPARC, 2002.
  • ‘Institutional repositories’
      • Are:
          • Collections of scholarly work
          • Both cumulative and perpetual
          • Institutionally defined and managed
          • Open
      • Provide:
          • Long-term preservation
          • Wide dissemination
          • Showcase for scholars and the institution
  • Challenges
      • Copyright issues, varying access
      • Buy-in from faculty, voluntary submissions
      • Getting people to care
  • VIVO as IR?
      • Not your typical IR interface
          • Interconnectedness in a large network
          • Dynamic browsing and searching
      • Includes diverse materials, not just article pre-prints
      • Includes citations for all works, not just the ones hosted in the IR
      • Linked data format allows for reuse of data for a variety of purposes
      • The following page shows a thesis document in VIVO
  • Theory vs. Practice
      • Although VIVO can act as a front end, the documents must be hosted elsewhere
      • We deposit our documents in CONTENTdm and link to the PDF in VIVO
      • This makes things easier, but also more complicated
      • See the example of the same thesis document in CONTENTdm on the next page
  • Theory vs. Practice
    We wanted to close this presentation by asking the group some questions. If you have any advice for us on this project, we would love to hear from you!
      • Are more access points better or more confusing?
      • Should we include historical documents in the VIVO IR?
      • Which page should be the main collection?
      • Should we provide links to all collections? Or link from one into the other?
      • What are best practices with unusually constructed IRs?
  • Thank you!