DataShare:
Collaboration Yields Promising Tool

          Julia Kochi, UCSF Library
       Angela Rizk-Jackson, UCSF CTSI
    Perry Willett, California Digital Library
              CNI 2013 Meeting
               San Antonio, TX
The Background

    Julia Kochi
   UCSF Library
What is DataShare?
An open data repository for the UCSF
 researcher
A concept initially envisioned by Michael
 Weiner, M.D.
A collaboration between UCSF CTSI, UCSF
 Library, and the California Digital Library
The Problem
Increasing requirements to share data
  • NIH grants >$500k
  • Publisher requirements
Unequal availability of national repositories
Campus priorities
FASTR, White House Directive
The Partners
UCSF CTSI
  • Knowledge of the researcher, access to the data
UCSF Library
  • Metadata expertise, programming resources
UC3
  • Preservation tools, services, and expertise
Technical Infrastructure

         Perry Willett
   California Digital Library
DataShare Components
Merritt: CDL
EZID: CDL
XTF: CDL, UCSF Library
Ingest tool: UCSF Library
Merritt Repository Service
Built on “micro-services” principles
Content and format agnostic
Has a UI and RESTful APIs to submit and
 retrieve content, and check statuses
Can serve as either “dark” or “bright” archive
Added public access, data use
 agreements, and asynchronous downloads as part
 of the DataShare project
EZID
Service for creation and management of long-
 term identifiers
Currently supports ARKs and DOIs; other types
 in planning stages
Registers DOIs with DataCite
Has a UI and APIs with good documentation
XTF
eXtensible Text Framework
Developed and maintained by CDL
Runs several CDL services:
  • eScholarship
  • Online Archive of California
  • Calisphere
Faceted browsing, full-text search, other
 desirable features
Ingest tool
Submitting content to a digital repository is
 hard and costly
An attempt to simplify several aspects:
  • Digital object creation
  • Metadata creation
  • Object submission
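The three aspects listed above can be sketched in a few lines. This is a minimal sketch, not the UCSF ingest tool: the package layout (`data/`, `metadata.json`, `manifest-sha256.txt`) and the function name are hypothetical, and a real repository will define its own packaging rules.

```python
import hashlib
import io
import json
import zipfile


def package_dataset(files: dict, metadata: dict) -> bytes:
    """Bundle data files, a metadata record, and a checksum manifest into one zip.

    `files` maps file names to their bytes. The layout is hypothetical.
    """
    manifest_lines = []
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # Digital object creation: add each data file under data/.
        for name in sorted(files):
            content = files[name]
            archive.writestr(f"data/{name}", content)
            digest = hashlib.sha256(content).hexdigest()
            manifest_lines.append(f"{digest}  data/{name}")
        # Metadata creation: the record lives in its own file, apart from the data.
        archive.writestr("metadata.json", json.dumps(metadata, indent=2))
        # The manifest lets the receiving side verify what arrived.
        archive.writestr("manifest-sha256.txt", "\n".join(manifest_lines) + "\n")
    # Object submission would send these bytes to the repository.
    return buffer.getvalue()


package = package_dataset(
    {"readings.csv": b"id,value\n1,42\n", "README.txt": b"Example dataset\n"},
    {"title": "Example dataset", "creator": "Researcher, A."},
)
```

Building the package in memory keeps the sketch self-contained; large datasets would need streaming to disk instead.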
Interactions for submission
Ingest tool
  • Creates metadata
  • Assembles dataset
  • Packages object
  • Submits to Merritt
Merritt
  • Requests DOI
  • Submits metadata to EZID
  • Receives DOI
EZID
  • Registers DOI and metadata with DataCite
XTF
  • Requests ATOM feed for collection
  • Gets ATOM feed
  • Retrieves metadata
  • Indexes metadata
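The interactions on this slide can be walked through with in-memory stand-ins. Everything below is a hypothetical sketch: the class and method names are invented, the DOI prefix is a placeholder, and none of it reflects the real Merritt, EZID, DataCite, or XTF interfaces. It only shows the order of the hand-offs.

```python
import itertools
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"  # Atom XML namespace


class DataCiteStub:
    """Stands in for the DOI registry."""
    def __init__(self):
        self.registered = {}

    def register(self, doi, metadata):
        self.registered[doi] = metadata


class EzidStub:
    """Stands in for the identifier service; mints placeholder DOIs."""
    def __init__(self, datacite):
        self.datacite = datacite
        self._counter = itertools.count(1)

    def mint_doi(self, metadata):
        doi = f"doi:10.5072/FK2{next(self._counter):04d}"
        self.datacite.register(doi, metadata)  # registers DOI and metadata
        return doi


class MerrittStub:
    """Stands in for the repository."""
    def __init__(self, ezid):
        self.ezid = ezid
        self.objects = []

    def submit(self, package, metadata):
        doi = self.ezid.mint_doi(metadata)  # requests DOI, receives DOI
        self.objects.append({"doi": doi, "metadata": metadata, "package": package})
        return doi

    def atom_feed(self):
        feed = ET.Element(f"{{{ATOM}}}feed")
        for obj in self.objects:
            entry = ET.SubElement(feed, f"{{{ATOM}}}entry")
            ET.SubElement(entry, f"{{{ATOM}}}id").text = obj["doi"]
            ET.SubElement(entry, f"{{{ATOM}}}title").text = obj["metadata"]["title"]
        return ET.tostring(feed, encoding="unicode")


def xtf_harvest(feed_xml):
    """Stands in for the search layer: read the feed, index the metadata."""
    index = {}
    for entry in ET.fromstring(feed_xml).iter(f"{{{ATOM}}}entry"):
        index[entry.findtext(f"{{{ATOM}}}id")] = entry.findtext(f"{{{ATOM}}}title")
    return index


datacite = DataCiteStub()
merritt = MerrittStub(EzidStub(datacite))
doi = merritt.submit(b"zip bytes", {"title": "Example dataset"})  # ingest tool submits
search_index = xtf_harvest(merritt.atom_feed())
```

One consequence of this arrangement is that the search layer never talks to the identifier service directly; it learns about new objects only through the repository's feed.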
Process for End Users
  • Search, browse
  • Request dataset download
  • Fill out Data Use Agreement
  • Receive dataset
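The gating implied by these steps, where a dataset is released only after the Data Use Agreement is completed, might look like the following. This is a hypothetical sketch: the status values, field names, and the in-memory queue are invented for illustration and do not describe how DataShare implements it.

```python
from dataclasses import dataclass


@dataclass
class DownloadRequest:
    dataset_id: str
    requester: str
    dua_accepted: bool = False
    status: str = "requested"


def accept_dua(request: DownloadRequest, agreed: bool) -> DownloadRequest:
    """Record the requester's answer to the Data Use Agreement."""
    request.dua_accepted = agreed
    request.status = "dua_accepted" if agreed else "dua_declined"
    return request


def deliver(request: DownloadRequest, queue: list) -> DownloadRequest:
    """Queue the dataset for delivery, but only after the DUA is accepted."""
    if not request.dua_accepted:
        raise PermissionError("Data Use Agreement must be accepted before download")
    queue.append(request.dataset_id)  # delivery itself would happen asynchronously
    request.status = "queued"
    return request
```

A request that skips the agreement fails at `deliver`, so the check sits in one place rather than in every download path.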
Lessons learned
Partnerships
  • Many hands make light work
  • Real users uncover hidden assumptions
Scale
  • Object size
  • Number of files
  • Upload and download
If you build it, will they come?

        Angela Rizk-Jackson
            UCSF CTSI
What will it take?



Sketch by Juliana Olivera Silva via Flickr
Providing Incentives: Requirements
Organization                          Data Access Requirement                       # UCSF Studies
Funding
  NIH                                 Grants >$500K (2003 on), specific programs    318 (active projects); 693 (inactive)
  NSF                                 All funded projects (2005 on)                 19
  Foundations (e.g. Moore, Gates,     All funded projects                           3, 31, 19
    Hewlett)
Publishing
  Nature Publishing Group             All published studies (2009-2011)             58
    (Nature, Science, etc.)
  Cell Press (Cell, Neuron, etc.)     All published studies (2009-2011)             48
  PNAS                                All published studies (2005-2011)             26
Providing Incentives: Visibility
  • Enhances collaborative opportunities
  • 69% increase in citation rate for
    publications associated with shared data
    (Piwowar, 2007)
Providing Incentives: Credit
Providing Incentives:
Preservation & Access
Providing Incentives: Institutional


  • Support researcher needs
  • Improved archiving efficiency
  • Cost savings
UCLA Royce Hall photo courtesy of Adam Fagen via Flickr
Eliminating Barriers
1. Time / Effort
   - Minimal requirements
   - Specific tools (e.g. ingest)
   - Integrate into existing workflow
2. Control
   - Data Use Agreement
   - Centralized service
3. Cultural Paradigm
   - Outreach
   - Demonstrate value
Other Collaborators
Lessons Learned
Don’t underestimate technical matters
  • Separating data & metadata
Standards are not standard
  • Metadata schema (Dublin Core → DataCite)
  • Interpretation
Policy issues are ever-present
  • Data Ownership & Data Use Agreements
  • Privacy & Consent (Human subjects)
Keep in mind the entire lifecycle: ALL users
  • Discoverability & interoperability
  • README File
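To make the "standards are not standard" point concrete, here is a small sketch of a Dublin Core to DataCite crosswalk. The field names are simplified and the mapping is an assumption for illustration, not the mapping the project used. It shows where interpretation creeps in: a free-form Dublin Core date has to be reduced to a four-digit publication year, and a record that is acceptable as Dublin Core can still lack fields DataCite treats as mandatory.

```python
import re

# Simplified, assumed mapping from Dublin Core element names to DataCite properties.
DC_TO_DATACITE = {
    "creator": "creators",
    "title": "titles",
    "publisher": "publisher",
    "identifier": "identifier",
}

# Core mandatory DataCite properties (simplified names).
REQUIRED = {"identifier", "creators", "titles", "publisher", "publicationYear"}


def crosswalk(dc_record: dict) -> tuple:
    """Map a Dublin Core record to DataCite fields and list what is missing."""
    datacite = {}
    for dc_field, dc_value in dc_record.items():
        if dc_field in DC_TO_DATACITE:
            datacite[DC_TO_DATACITE[dc_field]] = dc_value
    # Interpretation step: pull a four-digit year out of a free-form date.
    match = re.search(r"\b(\d{4})\b", dc_record.get("date", ""))
    if match:
        datacite["publicationYear"] = match.group(1)
    missing = sorted(REQUIRED - datacite.keys())
    return datacite, missing


record, missing = crosswalk(
    {"creator": "Doe, J.", "title": "Example dataset", "date": "March 2013"}
)
```

Returning the missing fields alongside the mapped record lets an ingest step prompt the depositor for them instead of failing later at registration.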
Next Steps
Outreach
System enhancements
  • Design overhaul
  • Ingest mechanism
  • DUA menu
Policy navigation
Proof-of-concept
Discussion Topics
What incentives have you found useful to
 encourage adoption of this type of resource?
Are you using data use agreements? Uniform
 or individualized?
Where do you see institutional data
 repositories fitting in the larger ecosystem?
More info
Datashare: http://datashare.ucsf.edu
CDL: http://www.cdlib.org
  • Merritt: https://merritt.cdlib.org
  • EZID: http://n2t.net/ezid
  • XTF: http://xtf.cdlib.org
UCSF Library: http://www.library.ucsf.edu/
UCSF CTSI: http://ctsi.ucsf.edu/
     NCATS – NIH Grant # UL1 TR000004

Datashare cni spring2013

  • 1. DataShare: Collaboration Yields Promising Tool Julia Kochi, UCSF Library Angela Rizk-Jackson, UCSF CTSI Perry Willett, California Digital Library CNI 2013 Meeting San Antonio, TX
  • 2. The Background Julia Kochi UCSF Library
  • 3. What is DataShare? An open data repository for the UCSF researcher A concept initially envisioned by Michael Weiner, M.D. A collaboration between UCSF CTSI, UCSF Library, and the California Digital Library
  • 4. The Problem Increasing requirements to share data • NIH grants >$500k • Publisher requirements Unequal availability of national repositories Campus priorities FASTR, White House Directive
  • 5. The Partners UCSF CTSI • Knowledge of the researcher, access to the data UCSF Library • Metadata expertise, programming resources UC3 • Preservations tools, services and expertise
  • 6. Technical Infrastructure Perry Willett California Digital Library
  • 7. DataShare Components Merritt: CDL EZID: CDL XTF: CDL, UCSF Library Ingest tool: UCSF Library
  • 8. Merritt Repository Service Built on “micro-services” principles Content and format agnostic Has a UI and RESTful APIs to submit and retrieve content, and check statuses Can serve as either “dark” or “bright” archive Added public access, data use agreements, asynchronous downloads as part of Datashare project
  • 9. EZID Service for creation and management of long- term identifiers Currently supports ARKs and DOIs; other types in planning stages Registers DOIs with DataCite Has a UI and APIs with good documentation
  • 10. XTF eXtensible Text Framework Developed and maintained by CDL Runs several CDL services: • eScholarship • Online Archive of California • Calisphere Faceted browsing, full-text search, other desirable features
  • 11.
  • 12.
  • 13. Ingest tool Submitting content to a digital repository is hard and costly An attempt to simplify several aspects: • Digital object creation • Metadata creation • Object submission
  • 14.
  • 15. Interactions for submission Creates Metadata Assembles Dataset Datacite Packages object Submits to Merritt Registers DOI and Metadata Ingest Requests DOI Tool Merritt Submits Metadata to EZID Requests ATOM feed for collection Receives DOI Gets ATOM feed Retrieves Metadata XTF EZID Index metadata
  • 16. Process for Endusers  Search, browse  Request dataset download  Fill out Data Use Agreement  Receive dataset
  • 17.
  • 18.
  • 19.
  • 20. Lessons learned Partnerships • Many hands make light work • Real users uncover hidden assumptions Scale • Object size • Number of files • Upload and download
  • 21. If you build it, will they come? Angela Rizk-Jackson UCSF CTSI
  • 22. What will it take? + Sketch by Juliana Olivera Silva via Flickr
  • 23. Providing Incentives: Requirements Organization Data Access Requirement # UCSF Studies Funding NIH Grants >$500K (2003 on), Specific 318 (active programs projects) 693 (inactive) NSF All funded projects (2005 on) 19 Foundations All funded projects 3, 31, 19 (e.g. Moore, Gates, Hewlett) Publishing Nature All published studies (2009-2011) 58 Publishing Group (Nature, Science, etc.) Cell Press All published studies (2009-2011) 48 (Cell, Neuron, etc.) PNAS All published studies (2005-2011) 26
  • 24. Providing Incentives: Visibility 01010010101 00110010100 10101001001 00110001111  Enhances collaborative opportunities  69% increase in citation rate for publications associated with shared data (Piwowar, 2007)
  • 27. Providing Incentives: Institutional • Support researcher needs • Improved archiving efficiency • Cost savings UCLA Royce Hall photo courtesy of Adam Fagen via Flickr
  • 28. Eliminating Barriers 1. Time / Effort - Minimal requirements - Specific tools (e.g. ingest) - Integrate into existing workflow 2. Control - Data Use Agreement - Centralized service 3. Cultural Paradigm - Outreach - Demonstrate value
  • 30. Lessons Learned Don’t underestimate technical matters • Separating data & metadata Standards are not standard • Metadata schema (Dublin Core → DataCite) • Interpretation Policy issues are ever-present • Data Ownership & Data Use Agreements • Privacy & Consent (Human subjects) Keep in mind the entire lifecycle: ALL users • Discoverability & interoperability • README File
  • 31. Next Steps Outreach System enhancements • Design overhaul • Ingest mechanism • DUA menu Policy navigation Proof-of-concept
  • 32. Discussion Topics What incentives have you found useful to encourage adoption of this type of resource? Are you using data use agreements? Uniform or individualized? Where do you see institutional data repositories fitting in the larger ecosystem?
  • 33. More info Datashare: http://datashare.ucsf.edu CDL: http://www.cdlib.org • Merritt: https://merritt.cdlib.org • EZID: http://n2t.net/ezid • XTF: http://xtf.cdlib.org UCSF Library: http://www.library.ucsf.edu/ UCSF CTSI: http://ctsi.ucsf.edu/ NCATS – NIH Grant # UL1 TR000004
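The submission interactions on slide 15 can be sketched as a small pipeline. This is a minimal illustration, not the DataShare ingest tool's code: the names `FakeEzid`, `FakeMerritt`, `submit_dataset` and `build_index`, the placeholder identifier, and the exact ordering of calls are assumptions made for readability. The real services are reached over HTTP APIs that are not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class FakeEzid:
    """Hypothetical in-memory stand-in for EZID (no network calls)."""
    records: dict = field(default_factory=dict)
    counter: int = 0

    def mint_doi(self, metadata: dict) -> str:
        # Pretend to register a DOI together with its metadata.
        # The identifier below is a made-up placeholder.
        self.counter += 1
        doi = f"doi:10.5072/FK2EXAMPLE{self.counter}"
        self.records[doi] = dict(metadata)
        return doi


@dataclass
class FakeMerritt:
    """Hypothetical in-memory stand-in for the Merritt repository."""
    objects: dict = field(default_factory=dict)

    def submit(self, identifier: str, package: dict) -> None:
        self.objects[identifier] = package

    def atom_feed(self) -> list:
        # A real feed would be Atom XML; here it is a plain list of entries.
        return [
            {"id": identifier, "metadata": package["metadata"]}
            for identifier, package in self.objects.items()
        ]


def submit_dataset(files, metadata, ezid, merritt) -> str:
    """Ingest-tool side: request a DOI, package the object, submit it."""
    doi = ezid.mint_doi(metadata)
    package = {
        "files": list(files),
        "metadata": {**metadata, "identifier": doi},
    }
    merritt.submit(doi, package)
    return doi


def build_index(merritt) -> dict:
    """XTF side: read the collection feed and index metadata by identifier."""
    return {entry["id"]: entry["metadata"] for entry in merritt.atom_feed()}
```

A caller would create one `FakeEzid` and one `FakeMerritt`, call `submit_dataset` once per dataset, and then call `build_index` to get the searchable view that the discovery layer would work from.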

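Slide 9 notes that EZID has documented APIs. EZID's API exchanges metadata as ANVL-style `name: value` lines, and the sketch below only builds and parses such text locally, with no network calls. The escaping rules follow our reading of the EZID API guide and should be checked against the current documentation; the helper names (`to_anvl`, `from_anvl`) and any sample values are invented for illustration.

```python
import re

# Characters percent-encoded so each pair stays on one line.
# Colons are escaped in names only (assumption, per the EZID API guide).
NAME_UNSAFE = re.compile(r"[%:\r\n]")
VALUE_UNSAFE = re.compile(r"[%\r\n]")
PERCENT_CODE = re.compile(r"%([0-9A-Fa-f]{2})")


def _escape(text: str, pattern: re.Pattern) -> str:
    # Replace each unsafe character with %XX (two uppercase hex digits).
    return pattern.sub(lambda m: "%%%02X" % ord(m.group(0)), text)


def _unescape(text: str) -> str:
    # Reverse of _escape: turn %XX back into the original character.
    return PERCENT_CODE.sub(lambda m: chr(int(m.group(1), 16)), text)


def to_anvl(metadata: dict) -> str:
    """Serialize a dict of strings as 'name: value' lines."""
    lines = []
    for name, value in metadata.items():
        lines.append(
            _escape(name, NAME_UNSAFE) + ": " + _escape(value, VALUE_UNSAFE)
        )
    return "\n".join(lines)


def from_anvl(text: str) -> dict:
    """Parse 'name: value' lines back into a dict of strings."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, value = line.partition(":")
        result[_unescape(name.strip())] = _unescape(value.strip())
    return result
```

Percent-encoding keeps multi-line values (for example, a long dataset description) on a single line, so a value survives a round trip through `to_anvl` and `from_anvl` unchanged, apart from leading and trailing whitespace.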
Editor's Notes

  1. Mission: enable individual researchers to share their research data sets with the global community. A researcher at UCSF. In his work as the Principal Investigator of the Alzheimer’s Disease Neuroimaging Initiative (ADNI) he concluded that widespread data sharing can be achieved now, with great scientific and economic benefits. All ADNI raw data is immediately shared, without embargo, with all scientists in the world. The project is very successful: more than 300 publications have resulted from use of the ADNI data resource. This success demonstrates the feasibility and benefits of sharing data. Clinical and Translational Sciences Institute. Working together to develop a resource that meets the needs of the researcher while leveraging the
  2. Cell Press, Nature Publishing Group, PNAS. Over 100 papers published between 2009-11 in journals from 3 publishers that have data sharing requirements. Some researchers have national repositories for their data (e.g. GenBank) while others don’t. Campus focused on developing infrastructure for storing and analyzing data but not sharing it generally. Additionally, the current focus is on clinical data, especially anonymized data from the electronic health record, and not basic or social sciences data.
  3. CTSI: Mission is to accelerate the research enterprise and saw the sharing of data as one way to accomplish this mission. Library: Interest in, as well as an extension of, the support of open access. UC3: provide the tools to the UC community to promote digital scholarship.
  4. Screenshot of eScholarship, running XTF
  5. Screenshot of Datashare, running XTF
  6. Datashare website; enduser selects title
  7. Full information on dataset; enduser selects download
  8. Data Use Agreement (DUA) for enduser.
  9. Fulfills requirements, existing and emerging
  10. Increases visibility of work
  11. The new TR Data Citation Index provides a mechanism to discover data for re-use in the same familiar fashion as discovering publications
  12. Long-term preservation, easy access to your own data. Merritt repository is an active archival environment with format migration and integrity checks: a smart filing cabinet for digital assets.
  13. Centralizing resources improves efficiency by streamlining/standardizing the process and saves money in the aggregate. Currently gathering data to support this.
  14. Metadata, data/metadata separation, file size, DUA, Discoverability, interoperability, README
  15. Metadata, data/metadata separation, file size, DUA, Discoverability, interoperability, README
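
The "Dublin Core → DataCite" lesson on slide 30 concerns mapping one metadata schema onto another. A minimal sketch of such a crosswalk follows. The field mapping, the function name `dc_to_datacite`, and the rule for pulling a year out of a date are illustrative assumptions, not the mapping DataShare used. The required-field list reflects the DataCite properties commonly treated as mandatory (identifier, creator, title, publisher, publication year) and should be checked against the schema version in use.

```python
import re

# Assumed one-to-one mapping, for illustration only.
DC_TO_DATACITE = {
    "identifier": "identifier",
    "creator": "creator",
    "title": "title",
    "publisher": "publisher",
    "date": "publicationYear",
}

# Properties this sketch treats as required on the DataCite side.
REQUIRED = ("identifier", "creator", "title", "publisher", "publicationYear")


def dc_to_datacite(dc_record: dict):
    """Return (datacite_record, missing_required_fields)."""
    out = {}
    for dc_name, value in dc_record.items():
        target = DC_TO_DATACITE.get(dc_name)
        if target is None:
            continue  # unmapped elements are dropped in this sketch
        if target == "publicationYear":
            # Dublin Core dates are free-form; keep the first 4-digit run.
            match = re.search(r"\d{4}", str(value))
            if match is None:
                continue
            value = match.group(0)
        out[target] = value
    missing = [name for name in REQUIRED if name not in out]
    return out, missing
```

Returning the list of missing fields alongside the converted record makes the "standards are not standard" problem visible: a record that is acceptable as Dublin Core can still lack properties that the target schema requires.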