Data management-handout


Data Management
Responsible Conduct of Research Seminar Series
Jeffery Loo, UCB Library, jloo@berkeley.edu
April 16, 2012
Presentation slides at http://goo.gl/bYnkA

Definition
Data management is about how you organize, store, use, and share your research data.

Purpose
To review some first steps of data management with a focus on data organization, storage, and sharing.

Saving data
• Traditional storage devices = personal computers, departmental and university servers
• Institutional archives and repositories
    o UC3 Merritt, http://merritt.cdlib.org/
    o Data repository management services at UCB, http://ist.berkeley.edu/ds
• Public archives and repositories (search for one at http://databib.lib.purdue.edu/)
• 3rd party cloud storage (e.g., Amazon S3, Google Docs, Dropbox)
• Save with file formats that offer long-term access
• Backup 3 copies: original master, local external storage, remote external storage
• Online backup services
    o At UC Berkeley, http://ist.berkeley.edu/services/catalog/storage
    o 3rd party services, http://www.cdlib.org/services/uc3/dmp/security.html#secure

Describing/documenting data (metadata)
• Overview of metadata, http://www.niso.org/publications/press/UnderstandingMetadata.pdf
• Example metadata elements, http://libraries.mit.edu/guides/subjects/data-management/metadata.html
• Recording metadata
    o readme.txt file
    o metadata form/record in an archive/repository
    o annotate data (e.g., XML, http://www.w3schools.com/xml/)
• Assign descriptive file and folder names

Sharing data
• Reinventing Discovery by Michael Nielsen (book, 2011)
• Data sharing associated with increased citation rate, http://www.plosone.org/article/info:doi%2F10.1371%2Fjournal.pone.0000308
• NIH Data Sharing Policy, http://grants.nih.gov/grants/policy/data_sharing/data_sharing_guidance.htm
• NSF Data Sharing Policy, http://www.nsf.gov/bfa/dias/policy/dmp.jsp
• Modes of data sharing
    o Share-upon-request
    o Self-archive (on personal websites)
    o Publish as supporting online materials in journals
    o Institutional archives or repositories
        • UC3 Merritt, http://merritt.cdlib.org/
    o Public archives or repositories
        • Ask colleagues for recommendations or search http://databib.lib.purdue.edu/
• Resolve DOIs by visiting the URL http://dx.doi.org/ with the DOI appended at the end
• Generate DOIs and other permanent identifiers with EZID, http://n2t.net/ezid
    o Request your free account at data-consult@lists.berkeley.edu
• Online services for sharing data among your team
    o UC Berkeley’s Research Hub, https://hub.berkeley.edu
    o 3rd party services, http://www.readwriteweb.com/biz/2011/06/8-simple-ways-to-share-data-on.php

Data management planning
• A plan for organizing, storing, and sharing data
• NSF data management plans
    o Requirements, http://www.nsf.gov/bfa/dias/policy/dmp.jsp
    o Examples, http://rci.ucsd.edu/dmp/examples.html
• NIH data sharing plans
    o Requirements, implementation guide, and examples, http://grants.nih.gov/grants/policy/data_sharing/data_sharing_guidance.htm
• Online service for building data plans with step-by-step instructions for meeting funding requirements
    o DMPTool, https://dmp.cdlib.org/
• Guidelines for Responsible Data Management in Scientific Research, http://ori.hhs.gov/images/ddblock/data.pdf

Data ethics
• Cite data that you use. Here are some citation styles: http://www.dcc.ac.uk/resources/how-guides/cite-datasets#x1-5000
• Keep raw, original data, and log changes
• When using external data sets, check for licenses and other restrictions
• Keep current with changing data requirements
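
Worked example: resolving a DOI
The Sharing data section notes that a DOI resolves when it is appended to http://dx.doi.org/. The Python sketch below shows one way to build that URL. The function name doi_to_url, the handling of a leading "doi:" prefix, and the basic format check are illustrative assumptions, not part of the handout. The sample DOI is taken from the citation-rate article linked under Sharing data.

```python
from urllib.parse import quote

# Resolver base URL as given in the handout.
DOI_RESOLVER = "http://dx.doi.org/"


def doi_to_url(doi: str) -> str:
    """Build a resolver URL by appending the DOI to the resolver base.

    doi_to_url is a hypothetical helper written for this example.
    """
    doi = doi.strip()
    # Assumption: tolerate a leading "doi:" label, which often appears in citations.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Assumption: a minimal sanity check (DOIs start with "10." and contain a "/").
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping "/" as is.
    return DOI_RESOLVER + quote(doi, safe="/")


if __name__ == "__main__":
    print(doi_to_url("10.1371/journal.pone.0000308"))
```

Opening the printed URL in a browser should redirect to the article's landing page. Storing the bare DOI in your metadata and building the URL on demand keeps citations independent of any one website's address.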