How To Make
Your Data Count
Patricia Cruse, Exec Director, DataCite
Daniella Lowenberg, Project Lead, MDC
November 26, 2018
Agenda
❏ What is our initiative about?
❏ Milestones in 2018
❏ Our first release
❏ Implementation at your repository
❏ Code of Practice
❏ Log Processing
❏ Sending Usage Logs
❏ Pulling & Displaying Usage & Citation Metrics
❏ Questions
What is MDC?
Imagine a world where
data are considered a
first-class research
output and are valued as
such...
Making Data
Count
2014 -
2015
1. Formal recommendation for measuring data usage
2. Develop Hub for all Data Level Metrics (DLM)
3. Make usage tracking easier
4. Drive adoption by showing how it can be done
(easily)
5. Engage across all research communities
6. Iterate!
Make Data
Count
2017 -
2019
Develop New Data Level Metrics Hub
Data Citations
Repository
Usage Metrics
Future Metrics
Leverage Existing
Initiatives
Develop New
Recommendation
Server Log
Processing
Drive Adoption - Engage Communities
Display
Data Metrics
Milestones thus far
We created a data usage metrics standard
We gave MDC a narrative
We built out an open hub for usage metrics
https://api.datacite.org/events
We implemented at our own repositories
We began to address citations
What does it look like?
How does this relate to
Scholix?
Scholix is not a thing - it is a change initiative
MDC & Scholix work hand in hand to advocate for best data
citation practices
● Scholix is an information framework for submitting data
citations
● MDC allows for displaying data citations back at the
repository
Implementation at your
repository
Why it is important
Community has long
grappled with the problem
of assessing and tracking
the results of scholarship
● Researchers
● Repositories
● Funders
● Publishers
Five simple steps to Make YOUR Data Count
1. Read the data usage metrics standard “Code of Practice for
Research Data”
2. Process your usage logs against this standard
3. Send processed and standardized usage logs to an open hub
4. Pull usage and citation metrics from an open hub
5. Display standardized usage and citation metrics on your
repository interface
Getting Started
We have built a “Getting Started” guide walking through
these steps as implemented in CDL’s Dash
https://github.com/CDLUC3/Make-Data-Count/blob/master/getting-started.md
1. Code of Practice for
Research Data
2. Log Processing
Standardized Logs
● Specialized logs that are processed against Code of Practice
● Views
● Downloads
● Users: at the country level, access during a session
● Session: de-duplicate access to page within 30 seconds
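The 30-second de-duplication rule above can be sketched as follows. This is a minimal illustration, not the MDC log processor: the `(session_id, dataset_id, timestamp)` tuple layout, the function name `dedupe_hits`, and the placeholder DOI are assumptions made for the example. It compares each hit with the previous hit from the same session on the same dataset; which hit of a burst is retained should be checked against the Code of Practice.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(seconds=30)  # de-duplication window named on the slide


def dedupe_hits(hits):
    """Count hits per dataset, collapsing repeats inside the 30-second window.

    `hits` is an iterable of (session_id, dataset_id, timestamp) tuples;
    this layout is an assumption for the example, not a format from the talk.
    """
    last_seen = {}  # (session_id, dataset_id) -> timestamp of the previous hit
    counts = {}     # dataset_id -> de-duplicated hit count
    for session_id, dataset_id, ts in sorted(hits, key=lambda h: h[2]):
        key = (session_id, dataset_id)
        previous = last_seen.get(key)
        if previous is None or ts - previous >= WINDOW:
            counts[dataset_id] = counts.get(dataset_id, 0) + 1
        last_seen[key] = ts  # always remember the most recent hit
    return counts


t0 = datetime(2018, 11, 26, 12, 0, 0)
sample_hits = [
    ("s1", "doi:10.1234/example", t0),
    ("s1", "doi:10.1234/example", t0 + timedelta(seconds=10)),  # repeat, dropped
    ("s1", "doi:10.1234/example", t0 + timedelta(seconds=90)),  # counted
    ("s2", "doi:10.1234/example", t0 + timedelta(seconds=5)),   # other session
]
view_counts = dedupe_hits(sample_hits)
```

The same function could be run once over view hits and once over download hits to produce the two counts listed above.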
3. Sending Usage Reports
JSON Report - Header
JSON Report - Body
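A usage report with a header and a body could be assembled along these lines. This is a hedged sketch only: the key names (`report-header`, `report-datasets`, `performance`, and so on), the repository name, and the DOI are illustrative assumptions, not a verified schema, and should be checked against the Code of Practice and the hub documentation before use.

```python
import json


def build_report(created_by, begin_date, end_date, dataset_counts):
    """Assemble a report dict with a header and a per-dataset body.

    All key names are assumptions for illustration, not a verified schema.
    `dataset_counts` maps a DOI to a dict of metric name -> count.
    """
    header = {
        "report-name": "dataset report",
        "created-by": created_by,
        "reporting-period": {"begin-date": begin_date, "end-date": end_date},
    }
    body = []
    for doi, metrics in sorted(dataset_counts.items()):
        body.append({
            "dataset-id": [{"type": "doi", "value": doi}],
            "performance": [{
                "period": {"begin-date": begin_date, "end-date": end_date},
                "instance": [
                    {"metric-type": name, "count": count}
                    for name, count in sorted(metrics.items())
                ],
            }],
        })
    return {"report-header": header, "report-datasets": body}


report = build_report(
    created_by="example-repository",  # hypothetical repository name
    begin_date="2018-10-01",
    end_date="2018-10-31",
    dataset_counts={"10.1234/example": {"views": 12, "downloads": 3}},
)
report_json = json.dumps(report, indent=2)
```

Keeping the header and body as separate parts of one dictionary mirrors the two slide titles and lets the body grow by one entry per dataset.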
The Usage Metrics Hub
Usage Metrics Hub (hosted by DataCite)
● Aggregator of research data usage reports
● Usage reports are made available via API (in original JSON format) and, soon,
via a web interface and as CSV
● Usage reports are broken down by dataset (and request method), and can
then be aggregated over time
● Information in usage reports can be combined with data citations and dataset
metadata
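The roll-up described above, per-dataset counts aggregated over time, can be sketched as below. The record layout `(doi, month, metric, count)` is a hypothetical flattening of usage reports chosen for this example, not a structure defined by the hub, and the DOIs and numbers are made up.

```python
from collections import defaultdict


def aggregate_usage(records):
    """Sum monthly counts per dataset and metric across all months."""
    totals = defaultdict(lambda: defaultdict(int))
    for doi, _month, metric, count in records:
        totals[doi][metric] += count
    # Convert to plain dicts so the result prints and compares cleanly.
    return {doi: dict(metrics) for doi, metrics in totals.items()}


monthly_records = [
    ("10.1234/example", "2018-09", "views", 40),
    ("10.1234/example", "2018-10", "views", 25),
    ("10.1234/example", "2018-10", "downloads", 7),
    ("10.1234/other", "2018-10", "views", 3),
]
usage_totals = aggregate_usage(monthly_records)
```

Because the totals are keyed by DOI, they could later be joined with citation counts or dataset metadata that use the same identifier.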
4. Pulling Usage and
Citations
Pulling Usage and Citations
● Data usage metrics and citations are made available via public API, with one
“event” for each data citation or monthly usage count.
● Data citations are provided by DataCite metadata (i.e. come from data
repositories) and Crossref, with more to come
● The currently separate APIs for usage and citations, plus a third API for dataset
metadata, will be combined into a single API for easier retrieval of information
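Pulling events for one dataset might look like the sketch below. The base URL is the hub address given earlier in the slides; the `doi` query parameter and the response fields (`data`, `attributes`, `relation-type-id`, `total`) are assumptions about the API shape and should be verified against the DataCite API documentation. The network call sits in a function that is not invoked here, so the summing logic can be tried on a hand-written sample payload.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

EVENTS_URL = "https://api.datacite.org/events"  # hub URL from the slides


def events_url(doi):
    """Build a query URL for one DOI; the `doi` parameter is an assumption."""
    return EVENTS_URL + "?" + urlencode({"doi": doi})


def fetch_events(doi):
    """Fetch and decode events for a DOI (performs a network request)."""
    with urlopen(events_url(doi)) as response:
        return json.load(response)


def totals_by_relation(payload):
    """Sum event totals per relation type from a decoded response.

    Assumes each item under `data` has `attributes` with a
    `relation-type-id` and an optional `total` (defaulting to 1, so a
    citation event without a count still contributes one).
    """
    totals = {}
    for event in payload.get("data", []):
        attrs = event.get("attributes", {})
        relation = attrs.get("relation-type-id", "unknown")
        totals[relation] = totals.get(relation, 0) + attrs.get("total", 1)
    return totals


sample_payload = {  # hand-written sample, not real hub output
    "data": [
        {"attributes": {"relation-type-id": "views", "total": 12}},
        {"attributes": {"relation-type-id": "views", "total": 8}},
        {"attributes": {"relation-type-id": "is-cited-by"}},
    ]
}
event_totals = totals_by_relation(sample_payload)
```

Summing per relation type reflects the "one event per citation or monthly usage count" model: monthly usage events carry a count, while each citation event counts once.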
What’s next?
Looking Ahead
● Outreach and Adoption
○ Repositories & Publishers
● Iterating on our implementation
○ Adding volume and usage by regions
○ Provide aggregation through DataCite hub
○ Beyond the DOI: metrics for other types of identifiers
○ Possible: altmetrics
Questions?
www.makedatacount.org
@makedatacount

How to make your data count webinar, 26 Nov 2018
