SlideShare a Scribd company logo
@dataverseorg
Mercè Crosas, Director of Data Science, IQSS, Harvard University
@mercecrosas
Harvard Purdue Data Management Symposium, June 16-17, 2015
New!
About Dataverse
● Gives credit and control to data authors and distributors
● Follows best practices, standards for data management and archiving
● Dataverse development started in 2006 at Harvard’s IQSS
● Now widely used, with a vibrant development and user community
● Helped instigate and is at the center of a cultural change toward open and
reproducible research
Science requires
community access
to data
An open source software
project for sharing, citing
and archiving data
Technology
Solution
In 2015 ...
Dataverse 4.0
A full rewrite that improves usability defines a rigorous and standardized
data publishing workflow, and leverages the latest technologies.
Rich Set of Features
● Standard, persistent data citation
● Branding for each dataverse
● Standard, extensible metadata:
○ citation metadata
○ domain-specific metadata
○ file-level metadata
● Faceted search for all metadata
● Multiple levels of access control
○ CC0/ terms of use/ restricted
● Multiple roles and permissions
● Re-formatting of tabular data files
● Extraction of file metadata
● Versioning
● APIs for search, deposit, access
Upgraded Technology
● UI improved by usability testing
● Built with open source solutions
● Enhanced UI framework
○ PrimeFaces and Bootstrap
● Widely used, community driven
enterprise software platform
○ Java EE7 and Glassfish
● Reliable, scalable search platform
○ Solr
● Web standard programmatic interfaces
○ RESTful APIs
● Standards for archiving and
interoperability
○ OAI-PMH, LOCKSS
Dataverse Installations worldwide
Dataverse software installations around the world serve as public data repositories (Harvard and
ODUM Dataverses) or institutional research data repositories.
Harvard Dataverse
● A collaboration between the Harvard Library and IQSS
● Open to research data worldwide:
○ > 1000 dataverses
○ > 58,000 datasets
○ > 270,000 files
○ > 1.3 million downloads
○ > 11,000 registered users
● Includes dataverses for:
○ individual researchers
○ research teams
○ journals
○ institutions or organizations
● Rate of data deposit has increased by a factor of 30 since last year
Dataverse is now more than a
software project and a data repository
Dataverse
Development of
Protocols and
Software
Collaboration
with the Library
User Support,
curation and
Training
User Community
Input
Grant
Partnerships
Collaboration
with Broader Data
Community
Interns Program
Outreach
(meetings,
papers)
Journals
Integration
External Tools
Integration
Technical Support
for other
Installations
A vibrant community
Contributions:
● Internalization, translation to chinese (Fudan University)
● Integration with Archivematica (University of Toronto)
● Integration with iRODS (UNC)
● Integration with Shibboleth (Netherlands Dataverse, DANS)
What’s next
•Support new data types
•Sensitive information
•Large-scale data (terabytes to petabytes)
•Streaming data (e.g., sensors, cell phones)
•Make datasets shared in Dataverse more reusable
•Extend APIs to build community of app contributors
•Build integrated tools (e.g., data visualization, analysis)
dataverse demo
https://dataverse-demo.iq.harvard.edu
Thanks
@mercecrosas

More Related Content

What's hot

Think like a Digital Curator
Think like a Digital CuratorThink like a Digital Curator
Think like a Digital Curator
DigitalLibraryServices
 
RDAP 16 Lightning: Quantifying Needs for a University Research Repository Sys...
RDAP 16 Lightning: Quantifying Needs for a University Research Repository Sys...RDAP 16 Lightning: Quantifying Needs for a University Research Repository Sys...
RDAP 16 Lightning: Quantifying Needs for a University Research Repository Sys...
ASIS&T
 
Manola-open aire and data publishing-nfdp13
Manola-open aire and data publishing-nfdp13Manola-open aire and data publishing-nfdp13
Manola-open aire and data publishing-nfdp13
DataDryad
 
December 16, 2015 NISO Webinar: Two-Part Webinar: Emerging Resource Types Pa...
December 16, 2015 NISO Webinar: Two-Part Webinar: Emerging Resource Types  Pa...December 16, 2015 NISO Webinar: Two-Part Webinar: Emerging Resource Types  Pa...
December 16, 2015 NISO Webinar: Two-Part Webinar: Emerging Resource Types Pa...
DeVonne Parks, CEM
 
Deep Impact: Metadata and SUNCAT
Deep Impact: Metadata and SUNCATDeep Impact: Metadata and SUNCAT
Deep Impact: Metadata and SUNCAT
EDINA, University of Edinburgh
 
On being a cog rather than inventing the wheel: Edinburgh DataShare as a key ...
On being a cog rather than inventing the wheel: Edinburgh DataShare as a key ...On being a cog rather than inventing the wheel: Edinburgh DataShare as a key ...
On being a cog rather than inventing the wheel: Edinburgh DataShare as a key ...
EDINA, University of Edinburgh
 
RDAP 16 Lightning: An Open Science Framework for Solving Institutional Challe...
RDAP 16 Lightning: An Open Science Framework for Solving Institutional Challe...RDAP 16 Lightning: An Open Science Framework for Solving Institutional Challe...
RDAP 16 Lightning: An Open Science Framework for Solving Institutional Challe...
ASIS&T
 
Research data spring: giving researchers credit for their data
Research data spring: giving researchers credit for their dataResearch data spring: giving researchers credit for their data
Research data spring: giving researchers credit for their data
Jisc RDM
 
Implementing Archivematica, research data network
Implementing Archivematica, research data networkImplementing Archivematica, research data network
Implementing Archivematica, research data network
Jisc RDM
 
Caldrone - Specific Needs and Concerns Associated with Data Repositories
Caldrone - Specific Needs and Concerns Associated with Data RepositoriesCaldrone - Specific Needs and Concerns Associated with Data Repositories
Caldrone - Specific Needs and Concerns Associated with Data Repositories
National Information Standards Organization (NISO)
 
Research Data Support at the University of Edinburgh
Research Data Support at the University of EdinburghResearch Data Support at the University of Edinburgh
Research Data Support at the University of Edinburgh
Robin Rice
 
Scottish Digital Library Consortium Meeting: Edinburgh DataShare
Scottish Digital Library Consortium Meeting: Edinburgh DataShareScottish Digital Library Consortium Meeting: Edinburgh DataShare
Scottish Digital Library Consortium Meeting: Edinburgh DataShare
Robin Rice
 
Wilcox - Open Source Repositories and the Future of Fedora
Wilcox - Open Source Repositories and the Future of FedoraWilcox - Open Source Repositories and the Future of Fedora
Wilcox - Open Source Repositories and the Future of Fedora
National Information Standards Organization (NISO)
 
SMRUDAS
SMRUDAS SMRUDAS
SMRUDAS
Jisc RDM
 
SCURL and SUNCAT serials holdings comparison service
SCURL and SUNCAT serials holdings comparison serviceSCURL and SUNCAT serials holdings comparison service
SCURL and SUNCAT serials holdings comparison service
EDINA, University of Edinburgh
 
PECAN Phase 2: Pilot for Ensuring Continuity of Access via Nesli2
PECAN Phase 2: Pilot for Ensuring Continuity of Access via Nesli2 PECAN Phase 2: Pilot for Ensuring Continuity of Access via Nesli2
PECAN Phase 2: Pilot for Ensuring Continuity of Access via Nesli2 EDINA, University of Edinburgh
 
Establishing a UQ Research Data Management Service
Establishing a UQ Research Data Management Service Establishing a UQ Research Data Management Service
Establishing a UQ Research Data Management Service
ARDC
 
Rusbridge Feb 8 Improving Clarity around Continuing Access
Rusbridge Feb 8 Improving Clarity around Continuing AccessRusbridge Feb 8 Improving Clarity around Continuing Access
Rusbridge Feb 8 Improving Clarity around Continuing Access
National Information Standards Organization (NISO)
 
JISC Managing Research Data: Liaison Librarian Training
JISC Managing Research Data: Liaison Librarian Training JISC Managing Research Data: Liaison Librarian Training
JISC Managing Research Data: Liaison Librarian Training
EDINA, University of Edinburgh
 
Six Use Cases for Edinburgh DataShare
Six Use Cases for Edinburgh DataShareSix Use Cases for Edinburgh DataShare
Six Use Cases for Edinburgh DataShare
Robin Rice
 

What's hot (20)

Think like a Digital Curator
Think like a Digital CuratorThink like a Digital Curator
Think like a Digital Curator
 
RDAP 16 Lightning: Quantifying Needs for a University Research Repository Sys...
RDAP 16 Lightning: Quantifying Needs for a University Research Repository Sys...RDAP 16 Lightning: Quantifying Needs for a University Research Repository Sys...
RDAP 16 Lightning: Quantifying Needs for a University Research Repository Sys...
 
Manola-open aire and data publishing-nfdp13
Manola-open aire and data publishing-nfdp13Manola-open aire and data publishing-nfdp13
Manola-open aire and data publishing-nfdp13
 
December 16, 2015 NISO Webinar: Two-Part Webinar: Emerging Resource Types Pa...
December 16, 2015 NISO Webinar: Two-Part Webinar: Emerging Resource Types  Pa...December 16, 2015 NISO Webinar: Two-Part Webinar: Emerging Resource Types  Pa...
December 16, 2015 NISO Webinar: Two-Part Webinar: Emerging Resource Types Pa...
 
Deep Impact: Metadata and SUNCAT
Deep Impact: Metadata and SUNCATDeep Impact: Metadata and SUNCAT
Deep Impact: Metadata and SUNCAT
 
On being a cog rather than inventing the wheel: Edinburgh DataShare as a key ...
On being a cog rather than inventing the wheel: Edinburgh DataShare as a key ...On being a cog rather than inventing the wheel: Edinburgh DataShare as a key ...
On being a cog rather than inventing the wheel: Edinburgh DataShare as a key ...
 
RDAP 16 Lightning: An Open Science Framework for Solving Institutional Challe...
RDAP 16 Lightning: An Open Science Framework for Solving Institutional Challe...RDAP 16 Lightning: An Open Science Framework for Solving Institutional Challe...
RDAP 16 Lightning: An Open Science Framework for Solving Institutional Challe...
 
Research data spring: giving researchers credit for their data
Research data spring: giving researchers credit for their dataResearch data spring: giving researchers credit for their data
Research data spring: giving researchers credit for their data
 
Implementing Archivematica, research data network
Implementing Archivematica, research data networkImplementing Archivematica, research data network
Implementing Archivematica, research data network
 
Caldrone - Specific Needs and Concerns Associated with Data Repositories
Caldrone - Specific Needs and Concerns Associated with Data RepositoriesCaldrone - Specific Needs and Concerns Associated with Data Repositories
Caldrone - Specific Needs and Concerns Associated with Data Repositories
 
Research Data Support at the University of Edinburgh
Research Data Support at the University of EdinburghResearch Data Support at the University of Edinburgh
Research Data Support at the University of Edinburgh
 
Scottish Digital Library Consortium Meeting: Edinburgh DataShare
Scottish Digital Library Consortium Meeting: Edinburgh DataShareScottish Digital Library Consortium Meeting: Edinburgh DataShare
Scottish Digital Library Consortium Meeting: Edinburgh DataShare
 
Wilcox - Open Source Repositories and the Future of Fedora
Wilcox - Open Source Repositories and the Future of FedoraWilcox - Open Source Repositories and the Future of Fedora
Wilcox - Open Source Repositories and the Future of Fedora
 
SMRUDAS
SMRUDAS SMRUDAS
SMRUDAS
 
SCURL and SUNCAT serials holdings comparison service
SCURL and SUNCAT serials holdings comparison serviceSCURL and SUNCAT serials holdings comparison service
SCURL and SUNCAT serials holdings comparison service
 
PECAN Phase 2: Pilot for Ensuring Continuity of Access via Nesli2
PECAN Phase 2: Pilot for Ensuring Continuity of Access via Nesli2 PECAN Phase 2: Pilot for Ensuring Continuity of Access via Nesli2
PECAN Phase 2: Pilot for Ensuring Continuity of Access via Nesli2
 
Establishing a UQ Research Data Management Service
Establishing a UQ Research Data Management Service Establishing a UQ Research Data Management Service
Establishing a UQ Research Data Management Service
 
Rusbridge Feb 8 Improving Clarity around Continuing Access
Rusbridge Feb 8 Improving Clarity around Continuing AccessRusbridge Feb 8 Improving Clarity around Continuing Access
Rusbridge Feb 8 Improving Clarity around Continuing Access
 
JISC Managing Research Data: Liaison Librarian Training
JISC Managing Research Data: Liaison Librarian Training JISC Managing Research Data: Liaison Librarian Training
JISC Managing Research Data: Liaison Librarian Training
 
Six Use Cases for Edinburgh DataShare
Six Use Cases for Edinburgh DataShareSix Use Cases for Edinburgh DataShare
Six Use Cases for Edinburgh DataShare
 

Similar to Dataverse hpdm symposium

Dataverse for Journals
Dataverse for JournalsDataverse for Journals
Dataverse for Journals
Merce Crosas
 
IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020
IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020
IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020
OpenAIRE
 
Scholze liber 2015-06-25_final
Scholze liber 2015-06-25_finalScholze liber 2015-06-25_final
Scholze liber 2015-06-25_final
Karlsruhe Institute of Technology (KIT)
 
Research methods group accelarating impact by sharing data
Research methods group  accelarating impact by sharing dataResearch methods group  accelarating impact by sharing data
Research methods group accelarating impact by sharing dataWorld Agroforestry (ICRAF)
 
Research Data Service at the University of Edinburgh
Research Data Service at the University of EdinburghResearch Data Service at the University of Edinburgh
Research Data Service at the University of Edinburgh
Robin Rice
 
Draux "Working with Scholarly APIs: A NISO Training Series, Session Four: Dig...
Draux "Working with Scholarly APIs: A NISO Training Series, Session Four: Dig...Draux "Working with Scholarly APIs: A NISO Training Series, Session Four: Dig...
Draux "Working with Scholarly APIs: A NISO Training Series, Session Four: Dig...
National Information Standards Organization (NISO)
 
Building an electronic repository and archives on Dataverse in the European O...
Building an electronic repository and archives on Dataverse in the European O...Building an electronic repository and archives on Dataverse in the European O...
Building an electronic repository and archives on Dataverse in the European O...
vty
 
Dataverse opportunities
Dataverse opportunitiesDataverse opportunities
Dataverse opportunities
vty
 
DSpace CRIS EFS Miami.pdf
DSpace CRIS EFS Miami.pdfDSpace CRIS EFS Miami.pdf
DSpace CRIS EFS Miami.pdf
4Science
 
Moving content across the OpenAIRE infrastructure boundaries (6th RDA Plenary)
Moving content across the OpenAIRE infrastructure boundaries (6th RDA Plenary) Moving content across the OpenAIRE infrastructure boundaries (6th RDA Plenary)
Moving content across the OpenAIRE infrastructure boundaries (6th RDA Plenary)
OpenAIRE
 
A Data Curation Framework: Data Curation and Research Support Services
A Data Curation Framework: Data Curation and Research Support ServicesA Data Curation Framework: Data Curation and Research Support Services
A Data Curation Framework: Data Curation and Research Support Services
SusanMRob
 
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
Sarah Anna Stewart
 
Integrating repositories and eLab notebooks through an open science framework
Integrating repositories and eLab notebooks through an open science frameworkIntegrating repositories and eLab notebooks through an open science framework
Integrating repositories and eLab notebooks through an open science framework
rmacneil88
 
Setting up a data repository, what does it entail?
Setting up a data repository, what does it entail?Setting up a data repository, what does it entail?
Setting up a data repository, what does it entail?
International Food Policy Research Institute (IFPRI)
 
12.10.14 Slides, “Roadmap to the Future of SHARE”
12.10.14 Slides, “Roadmap to the Future of SHARE”12.10.14 Slides, “Roadmap to the Future of SHARE”
12.10.14 Slides, “Roadmap to the Future of SHARE”
DuraSpace
 
RDAP 15: Research Data Integration in the Purdue Libraries
RDAP 15: Research Data Integration in the Purdue LibrariesRDAP 15: Research Data Integration in the Purdue Libraries
RDAP 15: Research Data Integration in the Purdue Libraries
ASIS&T
 
Lynch & Dirks - Platforms for Open Research - Charleston Conference 2011
Lynch & Dirks  - Platforms for Open Research - Charleston Conference 2011Lynch & Dirks  - Platforms for Open Research - Charleston Conference 2011
Lynch & Dirks - Platforms for Open Research - Charleston Conference 2011Lee Dirks
 
Managing, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital EnvironmentManaging, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital Environment
philipdurbin
 
The UC Curation Center (UC3): Developing Tools & Services for Managing Research
The UC Curation Center (UC3): Developing Tools & Services for Managing ResearchThe UC Curation Center (UC3): Developing Tools & Services for Managing Research
The UC Curation Center (UC3): Developing Tools & Services for Managing Research
University of California Curation Center
 
Goldman "Collaboratively Build Data Science Services and Skills"
Goldman "Collaboratively Build Data Science Services and Skills"Goldman "Collaboratively Build Data Science Services and Skills"
Goldman "Collaboratively Build Data Science Services and Skills"
National Information Standards Organization (NISO)
 

Similar to Dataverse hpdm symposium (20)

Dataverse for Journals
Dataverse for JournalsDataverse for Journals
Dataverse for Journals
 
IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020
IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020
IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020
 
Scholze liber 2015-06-25_final
Scholze liber 2015-06-25_finalScholze liber 2015-06-25_final
Scholze liber 2015-06-25_final
 
Research methods group accelarating impact by sharing data
Research methods group  accelarating impact by sharing dataResearch methods group  accelarating impact by sharing data
Research methods group accelarating impact by sharing data
 
Research Data Service at the University of Edinburgh
Research Data Service at the University of EdinburghResearch Data Service at the University of Edinburgh
Research Data Service at the University of Edinburgh
 
Draux "Working with Scholarly APIs: A NISO Training Series, Session Four: Dig...
Draux "Working with Scholarly APIs: A NISO Training Series, Session Four: Dig...Draux "Working with Scholarly APIs: A NISO Training Series, Session Four: Dig...
Draux "Working with Scholarly APIs: A NISO Training Series, Session Four: Dig...
 
Building an electronic repository and archives on Dataverse in the European O...
Building an electronic repository and archives on Dataverse in the European O...Building an electronic repository and archives on Dataverse in the European O...
Building an electronic repository and archives on Dataverse in the European O...
 
Dataverse opportunities
Dataverse opportunitiesDataverse opportunities
Dataverse opportunities
 
DSpace CRIS EFS Miami.pdf
DSpace CRIS EFS Miami.pdfDSpace CRIS EFS Miami.pdf
DSpace CRIS EFS Miami.pdf
 
Moving content across the OpenAIRE infrastructure boundaries (6th RDA Plenary)
Moving content across the OpenAIRE infrastructure boundaries (6th RDA Plenary) Moving content across the OpenAIRE infrastructure boundaries (6th RDA Plenary)
Moving content across the OpenAIRE infrastructure boundaries (6th RDA Plenary)
 
A Data Curation Framework: Data Curation and Research Support Services
A Data Curation Framework: Data Curation and Research Support ServicesA Data Curation Framework: Data Curation and Research Support Services
A Data Curation Framework: Data Curation and Research Support Services
 
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
 
Integrating repositories and eLab notebooks through an open science framework
Integrating repositories and eLab notebooks through an open science frameworkIntegrating repositories and eLab notebooks through an open science framework
Integrating repositories and eLab notebooks through an open science framework
 
Setting up a data repository, what does it entail?
Setting up a data repository, what does it entail?Setting up a data repository, what does it entail?
Setting up a data repository, what does it entail?
 
12.10.14 Slides, “Roadmap to the Future of SHARE”
12.10.14 Slides, “Roadmap to the Future of SHARE”12.10.14 Slides, “Roadmap to the Future of SHARE”
12.10.14 Slides, “Roadmap to the Future of SHARE”
 
RDAP 15: Research Data Integration in the Purdue Libraries
RDAP 15: Research Data Integration in the Purdue LibrariesRDAP 15: Research Data Integration in the Purdue Libraries
RDAP 15: Research Data Integration in the Purdue Libraries
 
Lynch & Dirks - Platforms for Open Research - Charleston Conference 2011
Lynch & Dirks  - Platforms for Open Research - Charleston Conference 2011Lynch & Dirks  - Platforms for Open Research - Charleston Conference 2011
Lynch & Dirks - Platforms for Open Research - Charleston Conference 2011
 
Managing, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital EnvironmentManaging, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital Environment
 
The UC Curation Center (UC3): Developing Tools & Services for Managing Research
The UC Curation Center (UC3): Developing Tools & Services for Managing ResearchThe UC Curation Center (UC3): Developing Tools & Services for Managing Research
The UC Curation Center (UC3): Developing Tools & Services for Managing Research
 
Goldman "Collaboratively Build Data Science Services and Skills"
Goldman "Collaboratively Build Data Science Services and Skills"Goldman "Collaboratively Build Data Science Services and Skills"
Goldman "Collaboratively Build Data Science Services and Skills"
 

More from Merce Crosas

Practical Implementation of research data policies: Solutions with Dataverse
Practical Implementation of research data policies: Solutions with DataversePractical Implementation of research data policies: Solutions with Dataverse
Practical Implementation of research data policies: Solutions with Dataverse
Merce Crosas
 
Research Data Management @Harvard
Research Data Management @HarvardResearch Data Management @Harvard
Research Data Management @Harvard
Merce Crosas
 
Cloud Dataverse: A Data repository platform for an OpenStack Cloud
Cloud Dataverse: A Data repository platform for an OpenStack CloudCloud Dataverse: A Data repository platform for an OpenStack Cloud
Cloud Dataverse: A Data repository platform for an OpenStack Cloud
Merce Crosas
 
Can data access combat fake news?
Can data access combat fake news?Can data access combat fake news?
Can data access combat fake news?
Merce Crosas
 
Data Repositories Impact
Data Repositories ImpactData Repositories Impact
Data Repositories Impact
Merce Crosas
 
Dataverse, Cloud Dataverse, and DataTags
Dataverse, Cloud Dataverse, and DataTagsDataverse, Cloud Dataverse, and DataTags
Dataverse, Cloud Dataverse, and DataTags
Merce Crosas
 
FAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data SharingFAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data Sharing
Merce Crosas
 
The Data Lifecycle (Harvard DataFest)
The Data Lifecycle (Harvard DataFest)The Data Lifecycle (Harvard DataFest)
The Data Lifecycle (Harvard DataFest)
Merce Crosas
 
Cloud Dataverse
Cloud DataverseCloud Dataverse
Cloud Dataverse
Merce Crosas
 
Making Data Accessible
Making Data AccessibleMaking Data Accessible
Making Data Accessible
Merce Crosas
 
Abcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosasAbcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosas
Merce Crosas
 
The DataTags System: Sharing Sensitive Data with Confidence
The DataTags System: Sharing Sensitive Data with ConfidenceThe DataTags System: Sharing Sensitive Data with Confidence
The DataTags System: Sharing Sensitive Data with Confidence
Merce Crosas
 
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
Merce Crosas
 
Connecting Dataverse with the Research Life Cycle
Connecting Dataverse with the Research Life CycleConnecting Dataverse with the Research Life Cycle
Connecting Dataverse with the Research Life Cycle
Merce Crosas
 
The Rise of Data Publishing in the Digital World (and how Dataverse and DataT...
The Rise of Data Publishing in the Digital World (and how Dataverse and DataT...The Rise of Data Publishing in the Digital World (and how Dataverse and DataT...
The Rise of Data Publishing in the Digital World (and how Dataverse and DataT...
Merce Crosas
 
A very Brief History of Communicating Science
A very Brief History of Communicating ScienceA very Brief History of Communicating Science
A very Brief History of Communicating Science
Merce Crosas
 
Data Citation Implementation at Dataverse
Data Citation Implementation at DataverseData Citation Implementation at Dataverse
Data Citation Implementation at Dataverse
Merce Crosas
 
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Merce Crosas
 
Dataverse on the MOC
Dataverse on the MOCDataverse on the MOC
Dataverse on the MOC
Merce Crosas
 
The Dataverse Commons
The Dataverse CommonsThe Dataverse Commons
The Dataverse Commons
Merce Crosas
 

More from Merce Crosas (20)

Practical Implementation of research data policies: Solutions with Dataverse
Practical Implementation of research data policies: Solutions with DataversePractical Implementation of research data policies: Solutions with Dataverse
Practical Implementation of research data policies: Solutions with Dataverse
 
Research Data Management @Harvard
Research Data Management @HarvardResearch Data Management @Harvard
Research Data Management @Harvard
 
Cloud Dataverse: A Data repository platform for an OpenStack Cloud
Cloud Dataverse: A Data repository platform for an OpenStack CloudCloud Dataverse: A Data repository platform for an OpenStack Cloud
Cloud Dataverse: A Data repository platform for an OpenStack Cloud
 
Can data access combat fake news?
Can data access combat fake news?Can data access combat fake news?
Can data access combat fake news?
 
Data Repositories Impact
Data Repositories ImpactData Repositories Impact
Data Repositories Impact
 
Dataverse, Cloud Dataverse, and DataTags
Dataverse, Cloud Dataverse, and DataTagsDataverse, Cloud Dataverse, and DataTags
Dataverse, Cloud Dataverse, and DataTags
 
FAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data SharingFAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data Sharing
 
The Data Lifecycle (Harvard DataFest)
The Data Lifecycle (Harvard DataFest)The Data Lifecycle (Harvard DataFest)
The Data Lifecycle (Harvard DataFest)
 
Cloud Dataverse
Cloud DataverseCloud Dataverse
Cloud Dataverse
 
Making Data Accessible
Making Data AccessibleMaking Data Accessible
Making Data Accessible
 
Abcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosasAbcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosas
 
The DataTags System: Sharing Sensitive Data with Confidence
The DataTags System: Sharing Sensitive Data with ConfidenceThe DataTags System: Sharing Sensitive Data with Confidence
The DataTags System: Sharing Sensitive Data with Confidence
 
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
 
Connecting Dataverse with the Research Life Cycle
Connecting Dataverse with the Research Life CycleConnecting Dataverse with the Research Life Cycle
Connecting Dataverse with the Research Life Cycle
 
The Rise of Data Publishing in the Digital World (and how Dataverse and DataT...
The Rise of Data Publishing in the Digital World (and how Dataverse and DataT...The Rise of Data Publishing in the Digital World (and how Dataverse and DataT...
The Rise of Data Publishing in the Digital World (and how Dataverse and DataT...
 
A very Brief History of Communicating Science
A very Brief History of Communicating ScienceA very Brief History of Communicating Science
A very Brief History of Communicating Science
 
Data Citation Implementation at Dataverse
Data Citation Implementation at DataverseData Citation Implementation at Dataverse
Data Citation Implementation at Dataverse
 
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
 
Dataverse on the MOC
Dataverse on the MOCDataverse on the MOC
Dataverse on the MOC
 
The Dataverse Commons
The Dataverse CommonsThe Dataverse Commons
The Dataverse Commons
 

Recently uploaded

Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
GetInData
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdfUnleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Enterprise Wired
 
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
mzpolocfi
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
rwarrenll
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
roli9797
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
balafet
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTESAdjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Subhajit Sahu
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
TravisMalana
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
dwreak4tg
 
Nanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdfNanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdf
eddie19851
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
g4dpvqap0
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
74nqk8xf
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
John Andrews
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptxData_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
AnirbanRoy608946
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 

Recently uploaded (20)

Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdfUnleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
 
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTESAdjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
 
Nanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdfNanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdf
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptxData_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 

Dataverse hpdm symposium

  • 1. @dataverseorg Mercè Crosas, Director of Data Science, IQSS, Harvard University @mercecrosas Harvard Purdue Data Management Symposium, June 16-17, 2015 New!
  • 2. About Dataverse ● Gives credit and control to data authors and distributors ● Follows best practices, standards for data management and archiving ● Dataverse development started in 2006 at Harvard’s IQSS ● Now widely used, with a vibrant development and user community ● Helped instigate and is at the center of a cultural change toward open and reproducible research Science requires community access to data An open source software project for sharing, citing and archiving data Technology Solution
  • 4. Dataverse 4.0 A full rewrite that improves usability defines a rigorous and standardized data publishing workflow, and leverages the latest technologies.
  • 5. Rich Set of Features ● Standard, persistent data citation ● Branding for each dataverse ● Standard, extensible metadata: ○ citation metadata ○ domain-specific metadata ○ file-level metadata ● Faceted search for all metadata ● Multiple levels of access control ○ CC0/ terms of use/ restricted ● Multiple roles and permissions ● Re-formatting of tabular data files ● Extraction of file metadata ● Versioning ● APIs for search, deposit, access Upgraded Technology ● UI improved by usability testing ● Built with open source solutions ● Enhanced UI framework ○ PrimeFaces and Bootstrap ● Widely used, community driven enterprise software platform ○ Java EE7 and Glassfish ● Reliable, scalable search platform ○ Solr ● Web standard programmatic interfaces ○ RESTful APIs ● Standards for archiving and interoperability ○ OAI-PMH, LOCKSS
  • 6. Dataverse Installations worldwide Dataverse software installations around the world serve as public data repositories (Harvard and ODUM Dataverses) or institutional research data repositories.
  • 7. Harvard Dataverse ● A collaboration between the Harvard Library and IQSS ● Open to research data worldwide: ○ > 1000 dataverses ○ > 58,000 datasets ○ > 270,000 files ○ > 1.3 million downloads ○ > 11,000 registered users ● Includes dataverses for: ○ individual researchers ○ research teams ○ journals ○ institutions or organizations ● Rate of data deposit has increased by a factor of 30 since last year
  • 8.
  • 9.
  • 10. Dataverse is now more than a software project and a data repository Dataverse Development of Protocols and Software Collaboration with the Library User Support, curation and Training User Community Input Grant Partnerships Collaboration with Broader Data Community Interns Program Outreach (meetings, papers) Journals Integration External Tools Integration Technical Support for other Installations
  • 11. A vibrant community Contributions: ● Internalization, translation to chinese (Fudan University) ● Integration with Archivematica (University of Toronto) ● Integration with iRODS (UNC) ● Integration with Shibboleth (Netherlands Dataverse, DANS)
  • 12. What’s next •Support new data types •Sensitive information •Large-scale data (terabytes to petabytes) •Streaming data (e.g., sensors, cell phones) •Make datasets shared in Dataverse more reusable •Extend APIs to build community of app contributors •Build integrated tools (e.g., data visualization, analysis)