2. 01 Data Collection
- Person and Position Data (Human Resources)
- Publications Data (Citation Data Sources)
- Grants Data (Office of Sponsored Programs)
02 Data Cleaning
- Incomplete Data, Incorrect Data, Duplicate Data
03 Data Transformation
- CSV to RDF Conversion
- XML to RDF Conversion
04 Data Storage
- Loading RDF Data to VIVO Triplestore
Data Access
Data Analysis
Data Visualization
Data Prediction
VIVO Data Life Cycle
Scope of the Presentation
@Cornell(2009 - 2016)
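The cleaning and CSV to RDF conversion steps of the life cycle can be sketched roughly as follows. This is a minimal illustration, not the pipeline used at Cornell: the column names (`netid`, `name`), the `http://example.org/individual/` namespace and the choice of `foaf:Person` and `rdfs:label` are all assumptions made for the example. It drops incomplete and duplicate rows, then writes the rest as N-Triples.

```python
import csv
import io

# Hypothetical namespace; the class and label IRIs are standard vocabulary terms
# picked only for illustration.
BASE = "http://example.org/individual/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def escape_literal(text):
    """Escape characters that may not appear raw inside an N-Triples literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def csv_to_ntriples(csv_text):
    """Clean person rows and return them as a list of N-Triples lines."""
    seen = set()
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        net_id = (row.get("netid") or "").strip().lower()
        name = (row.get("name") or "").strip()
        if not net_id or not name:  # incomplete record: skip it
            continue
        if net_id in seen:          # duplicate record: keep the first one only
            continue
        seen.add(net_id)
        subject = f"<{BASE}{net_id}>"
        lines.append(f"{subject} <{RDF_TYPE}> <{FOAF_PERSON}> .")
        lines.append(f'{subject} <{RDFS_LABEL}> "{escape_literal(name)}" .')
    return lines


# Assumed sample input: one duplicate (ABC1) and one row with no identifier.
sample = "netid,name\nabc1,Ada Example\nABC1,Ada Example\n,No Id\nxyz9,Max Sample\n"
triples = csv_to_ntriples(sample)
for line in triples:
    print(line)
```

The resulting lines could then be loaded into a triplestore with whatever bulk-load tool the store provides; that step is left out here because it depends on the installation.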
5. Advance the visibility and accessibility of Cornell scholarship and creative expression (May 2016)
&
Explore the scholarship of Cornell from the perspective of what the scholarly record itself can tell us. (August 2017)
6. ‘Us’ who?
Explore the scholarship of Cornell from the perspective of what the scholarly record itself can tell us.
Knowledge Delivery
Faculty: Faculty are interested in global visibility of their research and an authoritative source of data about their scholarship and how it relates to other scholars.
Admin: Academic deans and department chairs need macro views of research output and global impact, and data for faculty reporting.
Librarians: Librarians need data to support strategies to address issues such as prioritization and selection of library resources and collections.
Students: Students need means to discover faculty, domain experts and their research contributions.
Who are the stakeholders?
7. Explore the scholarship of Cornell from the perspective of what the scholarly record itself can tell us.
About whom?
8. FEED MACHINE
publications
Database
Graph
Scholars@Cornell - “System of Systems”
grants & contracts
people & positions
academic units
Manual
Data Analysis Module
SPARQL, JAVA
JSON, RDF
Push triples in the graph
Push JSON files
Query Database
Process Query ResultSets
Process output files
Scholars@Cornell
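The "Query Database" and "Process Query ResultSets" steps could look something like the sketch below. The slide names SPARQL and Java for the Data Analysis Module; Python is used here only for brevity, and the variable names (`person`, `title`) and the sample result are invented for illustration. What the sketch relies on is the standard SPARQL 1.1 JSON results layout (`head.vars` and `results.bindings`), which it flattens into plain records.

```python
import json


def flatten_bindings(results_json):
    """Turn a SPARQL JSON result document into a list of flat dictionaries."""
    doc = json.loads(results_json)
    variables = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # A variable may be unbound in a given row (e.g. inside OPTIONAL).
        rows.append({
            var: binding[var]["value"] if var in binding else None
            for var in variables
        })
    return rows


# Assumed sample response; a real one would come from the triplestore's SPARQL endpoint.
sample = json.dumps({
    "head": {"vars": ["person", "title"]},
    "results": {"bindings": [
        {"person": {"type": "uri", "value": "http://example.org/individual/abc1"},
         "title": {"type": "literal", "value": "A Sample Article"}},
        {"person": {"type": "uri", "value": "http://example.org/individual/xyz9"}},
    ]},
})

rows = flatten_bindings(sample)
print(json.dumps(rows, indent=2))
```

Records in this shape are easy to write out as the JSON files that the diagram shows being pushed to the front end.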
10. Image taken from http://www.servervitalsigns.com/liberate-domino-data/
VIVO records data as Linked Open Data [LOD]
Can I access it? Yes, in RDF Format
Can I reuse it? Yes / No, Maybe
Ontology Engineers (RDF, OWL)
Software Developers (JS, JSON)
11. Image taken from http://www.servervitalsigns.com/liberate-domino-data/
We need Linked Open Reusable Data [LORD]
Can I access it? Yes, in RDF Format
Can I reuse it? Yes / No, Maybe
Ontology Engineers (RDF, OWL)
Software Developers (JS, JSON)
12. Unlock your Data
1) Data Access
2) Data Reuse
3) Infographics Reuse
Ontology Engineers (RDF, OWL)
Software Developers (JS, JSON)
https://cul-it.github.io/vivo-data-distribution-api/
RDF to JSON: Data Distribution API for VIVO
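To make the "RDF to JSON" idea concrete, here is a minimal sketch of regrouping triples into a nested JSON document that a JavaScript developer could consume directly. It is not the Data Distribution API's implementation or its output format; the function name, the example IRIs and the subject/predicate/object grouping are assumptions chosen for illustration.

```python
import json
from collections import defaultdict


def triples_to_json(triples):
    """Group (subject, predicate, object) triples into nested JSON text."""
    grouped = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in triples:
        grouped[subject][predicate].append(obj)
    # defaultdict is a dict subclass, so json.dumps can serialize it as-is.
    return json.dumps(grouped, indent=2, sort_keys=True)


# Assumed sample triples, already parsed out of an RDF serialization.
triples = [
    ("http://example.org/individual/abc1",
     "http://www.w3.org/2000/01/rdf-schema#label", "Ada Example"),
    ("http://example.org/individual/abc1",
     "http://example.org/ontology/authorOf", "http://example.org/individual/pub1"),
    ("http://example.org/individual/abc1",
     "http://example.org/ontology/authorOf", "http://example.org/individual/pub2"),
]

document = triples_to_json(triples)
print(document)
```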
14. The Data Maturity Model
http://www.b-eye-network.com/view/15105
Stage One: No Usable Data
- With little or no useful data.
- Can’t run metrics.
- Doesn’t fully understand data.
- No information-backed insights.
Stage Two: Big Data
- Have access to large data.
- Steady flow of data from multiple sources.
- But few tools to turn data into information.
- Spends more time looking than analyzing.
Stage Three: The Right Data
- Have access to high quality data.
- Apply context and relevance to data models.
- Explain data in meaningful ways.
- Cornell units accept responsibility for being Content Creators.
Stage Four: Predictions
- Can conduct historical & predictive analysis.
- What is likely to happen tomorrow and beyond.
- Can predict customer behavior and market demand.
Stage Five: Strategy
- Entire business model is built around its analytical models.
- Predictive analysis is integrated into core business processes.
17. KNOWLEDGE
Linked Data
Data is merely a record & VIVO is just a (linked) data warehouse unless we pull the knowledge out of it.
From Data to Knowledge
18. How do we SHARE the KNOWLEDGE in Scholars@Cornell?
19. Access and Reuse
- List of Publications for a Faculty Member
  https://scholars.cornell.edu/api/dataRequest/listPublications?person=http://scholars.cornell.edu/individual/mjh78
- List of Grants
  - of a Person
  - of an Academic Unit
- List of Current Faculty
  - of an Academic Unit
- List of Academic / Industry-level Co-authorships *
  - of an Academic Unit
  - of a Person
- and more …
* Future Plans
- in JSON Format
- in RDF Format
via Data Distribution API
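A request to the Data Distribution API can be assembled as in the sketch below. Only the `listPublications` action and its `person` parameter appear on the slide; the helper name `build_request_url` is hypothetical, and nothing is assumed here about the shape of the response or about the names of the other list endpoints. The sketch builds the URL without sending it.

```python
from urllib.parse import urlencode

# Base path taken from the example URL on the slide.
API_BASE = "https://scholars.cornell.edu/api/dataRequest/"


def build_request_url(action, **params):
    """Build a Data Distribution API URL for an action and its parameters."""
    # Keep ':' and '/' unescaped so the person URI reads as it does on the slide.
    query = urlencode(params, safe=":/")
    return f"{API_BASE}{action}?{query}" if query else f"{API_BASE}{action}"


url = build_request_url(
    "listPublications",
    person="http://scholars.cornell.edu/individual/mjh78",
)
print(url)
```

Fetching the URL (for instance with `urllib.request.urlopen`) and decoding the body is left out, since the response format is not described on the slide.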
20. Knowledge Share via Infographics
Global Collaborations
Grants
Keyword Cloud
Co-authorships
Inter-departmental co-authorships
Research Interests (in a department)
Journals/Proceedings
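As an illustration of the kind of aggregation behind the inter-departmental co-authorship infographic, the sketch below counts how many publications each pair of departments shares. The input layout (a list of publications, each a list of author/department pairs) and the sample data are assumptions made for the example, not the Scholars@Cornell data model.

```python
from collections import Counter
from itertools import combinations


def cross_department_pairs(publications):
    """Count publications co-authored across each pair of departments."""
    counts = Counter()
    for authors in publications:
        # Each department counts once per publication, however many authors it has.
        departments = sorted({department for _, department in authors})
        for pair in combinations(departments, 2):
            counts[pair] += 1
    return counts


# Assumed sample data: (author id, department) pairs per publication.
publications = [
    [("author-a", "Physics"), ("author-b", "Chemistry")],
    [("author-a", "Physics"), ("author-c", "Physics")],
    [("author-b", "Chemistry"), ("author-d", "Physics"), ("author-e", "Biology")],
]

pair_counts = cross_department_pairs(publications)
for pair, count in sorted(pair_counts.items()):
    print(pair, count)
```

The resulting pair counts map naturally onto edge weights in a network or chord diagram.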
21. User Experience in Scholars@Cornell
DEMO
Image taken from: https://marketingland.com/breadcrumb-links-good-user-experience-yes-97848
22. The Data Maturity Model
http://www.b-eye-network.com/view/15105
VIVO Cornell
Scholars@Cornell
24. Achievements
July 2017
PARTNERS
1. College of Engineering
2. Johnson Graduate School of Management
3. Boyce Thompson Institute
DATA SOURCES
1. Upstream sources of Symplectic Elements
DATA TYPES
1. Journal Articles
DATA ACCESS
1. Data Download
   i. RDF
June 2018
PARTNERS
1. College of Engineering
2. Johnson Graduate School of Management
3. Boyce Thompson Institute
4. College of Agriculture and Life Sciences
5. Law School (Pilot)
6. College of Veterinary Sciences (Pilot)
DATA SOURCES
1. Upstream sources of Symplectic Elements
2. Activity Insight (Digital Measures)
3. Institutional Repository (Digital Commons)
DATA TYPES
1. Journal Articles
2. Conference Papers
DATA ACCESS
1. Data Download
   i. RDF
   ii. JSON (using Data Distribution API)
   iii. CSV (VIZ Data)
   iv. SVG (VIZ)
2. VIZ Embed
Cornell - International Academic Partners
OPERA - Open Research Analytics
Denmark’s Electronic Research Library Project
29. Acknowledgments
A number of 24Slides free templates were used to prepare the slides
Parraguez, P. and Maier, A.M. (2016), “Using network science to support design research: From counting to connecting”, in Cash, P., Stankovic, T. and Storga, M. (Eds.), Experimental Design Research: Approaches, perspectives, applications, Springer, Cham, pp. 153–172.
Counting to Connecting Images: Wehrli, U., Born, G. and Spehr, D. (2013), The art of clean up: life made neat and tidy, Chronicle Books, San Francisco.
A few other images were taken from Google Images.