MSU Libraries
Research Data Management
Research Data
Management
Aaron Collie
collie@msu.edu
@aaroncollie
Introductions
• Please tell us your
name and department
• A brief description of
your primary research
area
• What do you consider
to be your research
data
• Experience and/or
comfort level with
managing research
data?
cc http://www.flickr.com/photos/quinnanya/
Agenda
• Introduction
• Background
– The Impetus: NSF Data Management Plan Mandate
– The Effect: Policy to Practice
– The Response: Changing Data Landscape
• Fundamental Practices
– File Organization
– Data Documentation
– Reliable Backup
– Data Publishing, Sharing, & Reuse
– Protecting Data & Responsible Reuse
• Data Lifecycle Resources
Data Management. Isn’t that…
trivial?
• Not so much. Data is a primary output of research, and
high-quality data is very expensive to produce. Data may
be collected in nanoseconds, but generating it takes the
expert application of research protocol and design.
CC-BY-SA-3.0 Rob Lavinsky
Data is the input of a process that generates
higher orders of understanding.
Wisdom
Knowledge
Information
Data
Understanding is
hierarchical!
Russell Ackoff
This is the engine of the academic
industry…
The scientific method “is often
misrepresented as a fixed
sequence of steps,” rather than
being seen for what it truly is,
“a highly variable and creative
process” (AAAS 2000:18).
Gauch, Hugh G. Scientific Method in Practice. New York: Cambridge University Press, 2010. Print. (Emphasis added)
And, things get a little messy.
May you live in interesting times
Public Trust of Science: The Crisis Triforce
• Scholarly Communication crisis
– Challenge: Increasing costs of
access
– Opportunity: Open Access
• Reproducibility crisis
– Challenge: Failure-to-replicate
rates
– Opportunity: Open Science
• Higher Education Crisis
– Challenge: Value of education
– Opportunity: Open Education
Crisis, Part 1:
Scholarly Communication “The
cage”
• https://www.lib.msu.edu/about/collections/scholcomm/more/
• http://www.arl.org/storage/documents/monograph-serial-costs.pdf
Crisis, Part 2:
Replication Crisis “The Canary”
• Ioannidis JPA (2005) Why Most Published
Research Findings Are False. PLoS Med
2(8): e124.
doi:10.1371/journal.pmed.0020124
• "Estimating the reproducibility of
psychological science". Science 349
(6251). August 28, 2015.
doi:10.1126/science.aac4716. Retrieved
September 12, 2015.
Crisis, Part 3:
Higher Education “The coal mine”
http://news.bbc.co.uk/onthisday/hi/dates/stories/december/30/newsid_2547000/2547587.stm
The Research Depth Chart
Scientific Method
Research Design
Research Method
Research Tasks
More Generic (top) to More Domain Specific (bottom)
Problem
Identification
Study Concept
Literature
Review
Environmental
Scan
Funding &
Proposal
Research
Design
Research
Methodology
Research
Workflow
Hypothesis
Formation
Design
Validation
Research
Activity
Data
Management
Data
Organization
Data
Storage
Data
Description
Data Sharing
Scholarly
Communication
Report
Findings
Publish
Peer Review
Data Management
• The process of
planning for and
implementing a
system of care for
your research data
before, during, and
after a research
project in order to
ensure a (re)usable
resource.
So why are we here?
Good science!
Government and Research
Funder Mandates
But why are we really here?
• Impetus: NSF has mandated that all grant applications
submitted after January 18th, 2011 must include a
supplemental “Data Management Plan”
• Effect: The original NSF mandate has had a domino
effect, and many funders now require or state guidelines
for data management of grant funded research
• Response: Data management has not traditionally
received a full treatment in (many) graduate and doctoral
curricula; intervention is necessary
Positive reinforcement…
• National Science Foundation Data
Management Plan mandate (January 18,
2011)
• Presidential Memorandum on Managing
Government Records (August 24, 2012)
– Managing Government Records Directive: All
permanent electronic records in Federal
agencies will be managed electronically to the
fullest extent possible for eventual transfer
and accessioning by NARA in an electronic
format.
Positive reinforcement… (cont.)
• White House policy memo (February 22,
2013)
– Increasing Access to the Results of Federally Funded Scientific
Research: Federal agencies with more than $100M in R&D
expenditures must develop plans to make the published results
of federally funded research freely available to the public within
one year of publication.
• OSTP policy memo (March 20, 2014)
– Improving the Management of and Access to Scientific
Collections: directs each Federal agency that owns, maintains,
or otherwise financially supports permanent scientific collections
to develop a draft scientific-collections management and access
policy within six months.
Positive reinforcement… (cont. w/
teeth!)
• AHRQ = “…all AHRQ-funded researchers will be
required to include a data management plan for
sharing final research data in digital format, or state
why data sharing is not possible.”
• NASA = “This plan extends NASA’s culture of open
data access to all NASA-funded research.”
• USDA = Phased approach beginning with DMP
• More: http://www.arl.org/focus-areas/public-access-policies/federally-funded-research/2696-white-house-directive-on-public-access-to-federally-funded-research-and-data#agency-policies
Federally funded research data
• 1980, Forsham v. Harris: research data not subject to FOIA
• 1999, Data Access Act of 1999: OMB Circular A-110 revised; data produced with Federal monies subject to FOIA
• 2003, NIH Data Sharing Plans: grants over $500,000 require plans
• 2010, America COMPETES Reauthorization Act: OSTP must coordinate policies for dissemination and stewardship of scholarly publications and data produced with Federal funds
• 2011, NSF Data Management Plans: grants require supplementary DMP
• 2013, OSTP memo, Increasing Access to the Results of Federally Funded Research: requires agencies to develop plans for public access to publications and data
Fischer, E. A. (2013). Public Access to Data from Federally Funded Research: Provisions in OMB Circular A-110 (Congressional Research Service). Retrieved from
HTTP://congressional.proquest.com.proxy2.cl.msu.edu/congressional/docview/t21.d22.crs-2013-rsi-0116?accountid=12598
Funder Policies
NASA “promotes the full and open sharing of all data”
“requires that data…be submitted to and archived by
designated national data centers.”
“expects the timely release and sharing of final research data"
"IMLS encourages sharing of research data."
“…should describe how the project team will manage and
disseminate data generated by the project”
• Policies for re-use, re-distribution, and creation of
derivatives
• Plans for archiving data, samples, and other research
outcomes, maintaining access
• Types of data, samples, physical collections, software
generated
• Standards for data and metadata format and content
• Access and sharing policies, with stipulations for
privacy, confidentiality, security, intellectual property, or
other rights or requirements
• NSF will not evaluate any proposal
missing a DMP
• PI may state that project will not generate
data
• DMP is reviewed as part of intellectual
merit or broader impacts of application, or
both
• Costs to implement DMP may be included
in proposal’s budget
• May be up to two pages long
• Investigators seeking $500,000 or more in direct costs in any year
should include a description of how final research data will be
shared, or explain why data sharing is not possible.
• The precise content of the data-sharing plan will vary, depending on
the data being collected and how the investigator is planning to
share the data.
• More stringent data management and sharing requirements may be
required in specific NIH Funding Opportunity Announcements.
Principal Investigators must discuss how these requirements will be
met in their Data Sharing Plans.
• Roles and responsibilities
• Expected data
• Period of data retention
• Data formats and dissemination
• Data storage and preservation of access
Local Policy
University Research Council Best Practices:
https://rio.msu.edu/research-data
Research Data: Management, Control, and
Access
– To assure that research data are appropriately
recorded, archived for a reasonable period of
time, and available for review under the
appropriate circumstances.
• Ownership = MSU
• “Stewardship” = You
• Period of Retention = 3 years
• Transfer of Responsibility = Written Request
Broader Response: Changing
Data Landscapes
• Data Management Competencies
– Standards & Best Practices
– Discipline Specific Discourse
• Data sharing and open data
– Data sets as publications
– Data journals
– Citations for data (e.g., used in secondary
analysis)
– Data as supplementary materials to traditional
articles
– Data repositories and archives
Science Paradigms
• Thousand years ago:
science was empirical
describing natural phenomena
• Last few hundred years:
theoretical branch
using models, generalizations
• Last few decades:
a computational branch
simulating complex phenomena
• Today:
data exploration (eScience)
unify theory, experiment, and simulation
– Data captured by instruments
Or generated by simulator
– Processed by software
– Information/Knowledge stored in computer
– Scientist analyzes database / files
using data management and statistics
$$\left(\frac{\dot{a}}{a}\right)^2 = \frac{4\pi G\rho}{3} - K\frac{c^2}{a^2}$$
Slide credit: Gray, J. & Szalay, A. (11 January 2007). eScience Talk at NRC-CSTB meeting. http://research.microsoft.com/en-us/um/people/gray/talks/NRC-CSTB_eScience.ppt
Curation responsibilities (Carlson, The Chronicle, 2006)
“Data from Big Science is … easier to handle, understand and archive.
Small Science is horribly heterogeneous and far more vast. In time Small
Science will generate 2-3 times more data than Big Science.”
big science
data
small science data
institution?
domain?
MacColl, John (2010). The Role of libraries in data curation. RLG Partnership Annual Meeting, Chicago. June 2010
What’s in it for me?
• Better organization = fewer headaches
– Course management
– Bibliographic management
– File management
– Research
• Career advancement
– Publish datasets and list on your CV
– Data management is an “unnamed practice” –
name it for yourself and your students!
Data Sharing Impacts
• Reinforces open
scientific inquiry
• Encourages diversity
of analysis and opinion
• Promotes new
research, testing of
new or alternative
hypotheses and
methods of analysis
• Supports studies on
data collection
methods and
measurement
Cc http://www.flickr.com/photos/pinchof_10/
Data Sharing Impacts
• Facilitates education
of new researchers
• Enables exploration
of topics not
envisioned by initial
investigators
• Permits creation of
new datasets by
combining data from
multiple sources
Data as
Publication
Figure 1. To be
published, datasets are
typically deposited in a
repository to make them
available, documented
to support reproduction
and reuse, and assigned
an identifier to facilitate
citation.
Kratz J and Strasser C (2014) [version 3] Data publication consensus and controversies.
F1000Research 3:94 (doi: 10.12688/f1000research.3979.3)
What could come next?
• Increasing emphasis on making data available over long time
periods from institutionally maintained repositories.
– The end of project or PI based data repositories
• Increasing pressure for rapid release of data and project
transparency
– The annual reports may be used to dictate data release in the
near future
• How these challenges are resolved is a core institutional
responsibility. It is important that MSU be a leader in this
initiative.
– Institutions are broadly competing to establish these data
repositories, and I suspect these will be a competitive indicator
for proposal success in the future.
Slide from Messina, Joseph. (2015) Data Plans in SBE and Geography. CI Forum October 22, 2015
http://retractionwatch.com/2014/01/07/doing-the-right-thing-authors-retract-brain-paper-with-systematic-human-error-in-coding/
Research Data Management
Fundamentals
• Documentation
• File Organization
• Storage & Backup
• Data Publishing, Sharing,
& Reuse
• Protecting Data
& Responsible Reuse
Documentation Practices:
Overview
• Researchers benefit from proper
documentation to decipher or reuse their
datasets – even prior to thinking about
sharing
• Think “downstream”
Documentation Practices: Overview
1. At minimum create a
README file that you can
use to document your
project
2. Utilize standards for
describing data including
Metadata Standards
3. If applicable, use in-line
code commentary to
explain code
(cc) Will Scullin
Create a README file
• At minimum, store documentation in
readme.txt file or equivalent, with data
– What data consists of
– How it was collected
– Restrictions to distribution or use
– Other descriptive information
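The README contents listed above can be sketched as a small template generator. This is a minimal illustration, assuming a plain-text layout; the function names `build_readme` and `write_readme`, the section labels, and the sample values are hypothetical and do not come from the slides.

```python
from pathlib import Path
import tempfile


def build_readme(title, contents, collection, restrictions, other=""):
    """Return README text covering the items listed on the slide."""
    sections = [
        ("TITLE", title),
        ("WHAT THE DATA CONSISTS OF", contents),
        ("HOW IT WAS COLLECTED", collection),
        ("RESTRICTIONS ON DISTRIBUTION OR USE", restrictions),
        ("OTHER DESCRIPTIVE INFORMATION", other or "None"),
    ]
    return "\n\n".join(f"{label}\n{text}" for label, text in sections) + "\n"


def write_readme(data_dir, text):
    """Store readme.txt in the same directory as the data it describes."""
    path = Path(data_dir) / "readme.txt"
    path.write_text(text, encoding="utf-8")
    return path


# Hypothetical example values, for illustration only.
readme_text = build_readme(
    title="Usability survey",
    contents="One CSV file of anonymized survey responses.",
    collection="Online questionnaire.",
    restrictions="Shareable with attribution.",
)

# A temporary directory stands in for the real data folder.
with tempfile.TemporaryDirectory() as tmp:
    saved = write_readme(tmp, readme_text)
    saved_name = saved.name
```

Keeping the README next to the data means the description travels with the files when the folder is copied or archived.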
Use Metadata Standards
• “Data about data”
• Standardized way of describing data
• Explains who, what, where, when of data
creation and methods of use
• Data more easily found
• Data more easily compared to other data sets
Use Metadata Standards
Basic project metadata:
• Title, Creator, Identifier, Subject, Funders, Rights, Access Information
• Language, Dates, Location, Methodology, Data Processing, Sources, List of File Names
• File Formats, File Structure, Variable List, Code Lists, Versions, Checksums
Use Metadata Standards
• Dublin Core: Commonly-used descriptive
metadata format facilitates dataset discovery
across the Web.
• Data Documentation Initiative (DDI): Defines
metadata content, presentation, transport, and
preservation for the social and behavioral
sciences.
• ISO 19115:2003: Describes geographic data such
as maps and charts.
• More examples: http://www.lib.msu.edu/about/diginfo/collect.jsp
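As one illustration of a descriptive metadata record, the sketch below serializes a few Dublin Core elements to XML using Python's standard library. The wrapper element name `metadata`, the helper `dublin_core_record`, and all field values are assumptions made for the example; a real repository will usually prescribe its own schema and required fields.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Serialize a mapping of Dublin Core element names to values as XML."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical values, for illustration only.
record = dublin_core_record({
    "title": "Krill micrograph backscatter images",
    "creator": "Sharpe, W.",
    "date": "2011-01-17",
    "format": "image/tiff",
})
print(record)
```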
Use In-Line Code Commentary
Example of R code commentary
# Cumulative normal density
pnorm(c(-1.96,0,1.96))
• If applicable, in-line code commentary helps
explain code
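The same commenting habit carries over to other languages. As a sketch, here is a Python counterpart of the R call above, using the standard library's `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

# Cumulative standard normal probabilities at the same three points
# as the R call pnorm(c(-1.96, 0, 1.96)).
standard_normal = NormalDist(mu=0.0, sigma=1.0)
probabilities = [standard_normal.cdf(z) for z in (-1.96, 0.0, 1.96)]

# Round for display only; `probabilities` keeps full precision.
print([round(p, 4) for p in probabilities])  # [0.025, 0.5, 0.975]
```

The comments state why each step exists (which distribution, why rounding happens), which is what a later reader of the analysis needs most.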
File Organization Practices:
Overview
1. Design a file plan
for your research
project
2. Use file naming
conventions that
work for your project
3. Choose file formats
to maximize
usefulness
“When I was a
freshman I named
my assignments
Paper Paperr
Paperrr Paperrrr”
-Undergrad
Design a File Plan
• File structure is the framework
• Classification system makes it easier to
locate folders/files
• Benefits:
– Simple organization intuitive to team
members and colleagues
– Reduces duplicate copies in personal drives
and e-mail attachments
Design a File Plan
Choose a sortable directory hierarchy
• Example 1: Investigator, Process, Date
Collie/TEI_Encoding/20110117
• Example 2: Instrument, Date, Sample
Usability Survey/2012043/sample_1
Design a File Plan
Example documentation of Directory Hierarchy:
/[Project]/[Grant Number]/[Event]/[Investigator/Date]
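A sortable hierarchy such as the Investigator, Process, Date pattern can be generated rather than typed by hand. The sketch below is one possible approach, assuming dates are written as YYYYMMDD so that alphabetical order matches chronological order; the helper name `directory_for` is hypothetical.

```python
from datetime import date
from pathlib import PurePosixPath


def directory_for(investigator, process, day):
    """Return Investigator/Process/YYYYMMDD as a relative path."""
    return PurePosixPath(investigator) / process / day.strftime("%Y%m%d")


path = directory_for("Collie", "TEI_Encoding", date(2011, 1, 17))
print(path)  # Collie/TEI_Encoding/20110117

# Because the date is zero-padded and year-first, a plain string sort
# puts the folders in chronological order.
folders = sorted(
    str(directory_for("Collie", "TEI_Encoding", d))
    for d in (date(2011, 3, 2), date(2010, 12, 30), date(2011, 1, 17))
)
```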
Use File Naming Conventions
– Enable better access/retrieval of files
– Create logical sequences for file sorting
– More easily identify what you’re searching for
Use File Naming Conventions
• Meaningful but short (255-character limit)
• Use alphanumeric characters
– Example: abc123
• Capital letters or underscores differentiate
between words
• Surname first followed by initials of first name
Use File Naming Conventions
• Year-month-day format for dates, with or
without hyphens
Example 1: 2006-03-13
Example 2: 20060313
• Decide on a simple versioning method
Example: file_v001
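A simple versioning method like file_v001 is easy to automate. The following is a minimal sketch, assuming versions are a zero-padded number after a `_v` suffix; the function name `next_version` is hypothetical.

```python
import re


def next_version(stem):
    """Return the next versioned file stem, e.g. file_v001 -> file_v002."""
    match = re.fullmatch(r"(.+_v)(\d+)", stem)
    if match is None:
        # No version suffix yet: start the sequence at v001.
        return f"{stem}_v001"
    prefix, digits = match.groups()
    # Keep the original zero-padding width so names still sort correctly.
    return f"{prefix}{int(digits) + 1:0{len(digits)}d}"


print(next_version("file_v001"))  # file_v002
print(next_version("file"))       # file_v001
```

Zero-padding matters: without it, `file_v10` would sort before `file_v2` in most file browsers.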
Use File Naming Conventions
• To create consistent file names, specify a
template such as:
[investigator]_[descriptor]_[YYYYMMDD].[ext]
This: sharpeW_krillMicrograph_backscatter3_20110117.tif
Not this: KrillData2011.tif
This: borgesJ_collocation_20080414.xml
Not this: Borges_Textbase.xml
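The template can also be enforced in code. The sketch below builds names from the template and checks existing names against it; the regular expression is one possible reading of the convention (alphanumeric parts joined by underscores, an eight-digit date, then an extension), and the names `make_filename` and `follows_convention` are hypothetical.

```python
import re
from datetime import date

# investigator, one or more descriptors, an 8-digit date, then an extension.
NAME_PATTERN = re.compile(r"[A-Za-z0-9]+(_[A-Za-z0-9]+)+_\d{8}\.[A-Za-z0-9]+")


def make_filename(investigator, descriptor, day, ext):
    """Fill in [investigator]_[descriptor]_[YYYYMMDD].[ext]."""
    return f"{investigator}_{descriptor}_{day:%Y%m%d}.{ext}"


def follows_convention(filename):
    """Check the template and the 255-character limit."""
    return len(filename) <= 255 and NAME_PATTERN.fullmatch(filename) is not None


name = make_filename("borgesJ", "collocation", date(2008, 4, 14), "xml")
print(name)                                       # borgesJ_collocation_20080414.xml
print(follows_convention("KrillData2011.tif"))    # False
```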
Choose Appropriate File Formats
• Non-proprietary
• Open, documented standard
• Common usage by research community
• Standard representation (ASCII, Unicode)
• Unencrypted
• Uncompressed
Choose Appropriate File Formats
Format Genre: Optimal Standards
TEXT: .txt; .odt; .xml; .html
AUDIO: .flac; .wav
VIDEO: .mp2/.mp4; .mkv
IMAGE: .tif; .png; .svg; .jpg
DATA: .sql; .csv
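For tabular data, exporting to CSV in a standard text encoding satisfies most of the criteria above (open, uncompressed, unencrypted). A minimal sketch with Python's standard `csv` module follows; the column names and rows are made up for illustration, and `rows_to_csv` is a hypothetical helper.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Serialize a header and rows as CSV text with Unix line endings."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample data.
csv_text = rows_to_csv(
    ["sample_id", "collected", "reading"],
    [["sample_1", "2006-03-13", 0.82], ["sample_2", "2006-03-14", 0.79]],
)

# Encode explicitly as UTF-8 (a Unicode encoding) before writing to disk.
csv_bytes = csv_text.encode("utf-8")
print(csv_text)
```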
Storage & Backup Practices
1. Avoid single points of
failure
2. Ensure data redundancy &
replication
3. Understand common
types of storage
(cc) George Ornbo
Data is at significant risk of loss without a storage
and backup plan
Avoid Single Points of Failure
A single point of failure occurs when it
would only take one event to destroy all
data on a device
• Use managed networked storage when
possible
• Move data off of portable media
• Never rely on one copy of data
• Do not rely on CD or DVD copies to be
readable
• Be wary of software lifespans
Ensure Data Redundancy
• Effective data storage plan provides for 3
copies:
– Primary authoritative copy
– Secondary local backup
– Tertiary remote backup
• Geographically distribute and secure
– Local vs. remote, depending on needed recovery
time
• Personal computer, external hard drives,
departmental, or university servers may be
used
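The three-copy rule can be sketched as a copy-and-verify routine. This is an illustration only: temporary directories stand in for the local and remote backup locations, SHA-256 is one reasonable checksum choice rather than something the slides prescribe, and the function names `sha256_of` and `replicate` are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(primary, backup_dirs):
    """Copy the primary file into each backup directory and verify each copy."""
    expected = sha256_of(primary)
    copies = []
    for directory in backup_dirs:
        Path(directory).mkdir(parents=True, exist_ok=True)
        target = Path(directory) / Path(primary).name
        shutil.copy2(primary, target)
        if sha256_of(target) != expected:
            raise IOError(f"Checksum mismatch for {target}")
        copies.append(target)
    return copies


# Temporary folders stand in for real local and remote storage.
with tempfile.TemporaryDirectory() as root:
    primary = Path(root) / "primary" / "survey.csv"
    primary.parent.mkdir()
    primary.write_text("id,score\n1,4\n", encoding="utf-8")
    copies = replicate(
        primary,
        [Path(root) / "local_backup", Path(root) / "remote_backup"],
    )
    copy_count = 1 + len(copies)  # primary plus two backups
    checksums_match = all(sha256_of(c) == sha256_of(primary) for c in copies)
```

Verifying the checksum after each copy catches silent corruption, which a copy count alone would miss.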
Ensure Data Redundancy
• Cloud storage
– Amazon S3
– Google
– MS Azure
– DuraCloud
– Rackspace
– Glacier
Note that many enterprise
cloud storage services
charge for data transfers
in and out
$$$
Understand Common Types of
Storage
• Optical Media
• Portable Flash Media
• Commercial Hard Drives
• Commercial NAS
• Cloud Storage
• Enterprise Network Storage
• Trusted Archival Storage
Understand Common Types of
Storage
• Features of storage types:
• Portable data transfers
• Short-term storage
• Project term storage
• Networked data transfer
• Long-term storage
• Reliable backup option
Understand Common Types of Storage
Columns: Portable Data Transfer | Short Term Storage | Project Term Storage | Networked Data Transfer | Long Term Storage | Reliable Backup Option
Optical Media: ✔ | ✗ | ✗ | ✗ | ✗ | ✗
Portable Flash Media: ✔ | ✔ | ✗ | ✗ | ✗ | ✗
Commercial Hard Drives: ✔ | ✔ | ✔ | ✗ | ✗ | ✗
Commercial NAS: ✗ | ✔ | ✔ | ✔ | ✗ | ✗
Cloud Storage: ✗ | ✔ | ✔ | ✔ | ✗ | ✗
Enterprise Network Storage: ✗ | ✔ | ✔ | ✔ | ✔ | ✔
Trusted Archival Storage: ✗ | ✗ | ✗ | ✔ | ✔ | ✔
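The comparison above can be encoded as a lookup so that a storage choice is checked against the feature needed. The sketch below transcribes the check marks from the slide; the identifier names and the helper `storage_types_with` are hypothetical.

```python
FEATURES = (
    "portable_data_transfer",
    "short_term_storage",
    "project_term_storage",
    "networked_data_transfer",
    "long_term_storage",
    "reliable_backup_option",
)

# One row per storage type, in the same column order as FEATURES.
MATRIX = {
    "Optical Media": (True, False, False, False, False, False),
    "Portable Flash Media": (True, True, False, False, False, False),
    "Commercial Hard Drives": (True, True, True, False, False, False),
    "Commercial NAS": (False, True, True, True, False, False),
    "Cloud Storage": (False, True, True, True, False, False),
    "Enterprise Network Storage": (False, True, True, True, True, True),
    "Trusted Archival Storage": (False, False, False, True, True, True),
}


def storage_types_with(feature):
    """List the storage types the slide marks as offering a feature."""
    index = FEATURES.index(feature)
    return [name for name, flags in MATRIX.items() if flags[index]]


print(storage_types_with("reliable_backup_option"))
```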
Understand Common Types of Storage
Media: Storage @ MSU
• Optical Media
– MSU Computer Store: Sells Optical Media and hardware accessories
– UAHC Media Storage Service: Offers physical lock-box like storage for MSU
• Flash Media
– MSU Computer Store: Sells Optical Media and hardware accessories
– UAHC Media Storage Service: Offers physical lock-box like storage for MSU
• Commercial Hard Drives
– MSU Computer Store: Sells Optical Media and hardware accessories
– UAHC Media Storage Service: Offers physical lock-box like storage for MSU
• Enterprise Cloud Storage
– Angel: Free. Ideal for collaboration; not storage space. Phase out 2015
– Desire2Learn: Free. Ideal for collaboration; not storage space. Replaces Angel
– GoogleApps: Free. Ideal for collaboration; not intended as storage space
• Enterprise Network Storage
– AFS Space: Free to 1GB, add’l space can be purchased w/dept. account
– IT Services Individual, Mid-Tier and Enterprise Storage: Fee based
– HPCC Home or Research: Free up to 1TB. Fee based additions available
• Trusted Archival Storage
– Disciplinary Repositories: Disciplinary repositories offer archival services for pertinent research data.
Data Publishing, Sharing, Reuse
1. Time-intensive, with potentially
high return on investment
2. Publish data in several data
publication venues to more
broadly share results of research
Research datasets on par with peer-reviewed
journal articles as first-class scholarly contributions
Sharing & Publishing Data
• Data preparation for sharing and publication
is a time-intensive process
• Potential positive outcomes:
• Increased research impact and citations
• Enable additional scientific inquiry
• Opportunities for co-authorship and
collaboration
• Enhance your grant proposal’s
competitiveness
Data Publication Venues
• Multiple ways to publish research data
• Faculty or project website
• Journal supplementary materials
• Disciplinary data repository (data archive)
• Varying levels of support for indexing, access
controls, and long-term curation
Data Publication Venues
• Disciplinary Data Repository
• Securely share data, ensure long-term access
• High visibility
• Often offer persistent citations
• Availability varies across domains
• Databib.org directory
Protecting Data & Responsible Reuse
1. Consider how to protect
data and intellectual
property rights while
encouraging reuse
2. Keep in mind ethical
concerns when sharing
data
(cc) Will Scullin
Intellectual Property
• IP refers to exclusive rights of creators of
works
• Individual data cannot be protected by US
copyright
• Organization of data such as database,
creative work produced by data, and research
instruments used may be protected
Intellectual Property
• Principal investigator’s institution holds IP
rights
• Provide clearly stated license for producing
derivatives, reusing, and redistributing
datasets
• License under Creative Commons
• State if any restrictions or embargos on use
• Provide example of how work should be cited
to encourage proper attribution on reuse
• Document any IP / copyright issues
Ethics & Data Sharing
• Keep in mind the following ethical concerns
when sharing your data:
• Privacy
• Confidentiality
• Security and integrity of the data
• For data involving human subjects, obtain
written permission or consent stating how the
data may be reused
Best Practices = High Impact Data
• File organization ensures easier access and
retrieval of data
• Documentation makes datasets accessible
and intelligible to users
• Storage and backup safeguards data
• Data publishing and sharing encourages the
most widespread reuse of data
• Data protection ensures responsible reuse
Volunstrordinaries!
Aaron Collie
Devin Higgins
Brandon Locke
Ranti Junus
Judy Matthews
Tina Qin
We teach people about RDM
Librarianship
Training
Assessment
Consultation
Ad-hoc
6-12 new clients per semester
100% satisfied / 100% would use
again
71% of new clients are referrals
60% requested additional services
15% through NFO, 14% through
website
http://www.lib.msu.edu/rdmg
RDM@MSU 101
• Who: You, as the designated steward
• What: “the data”
• When: Minimum 3 years after
publ./degree
• Where: Managed networked storage
• Why: Legal, Ethical, Scholarly
• How: With fidelity and documentation
sufficient to reproduce the research
Contact
Aaron Collie
collie@msu.edu
@aaroncollie
http://www.lib.msu.edu/rdmg

More Related Content

What's hot

RDMG Service Overview
RDMG Service OverviewRDMG Service Overview
RDMG Service OverviewAaron Collie
 
Research Data Management Guidance overview
Research Data Management Guidance overviewResearch Data Management Guidance overview
Research Data Management Guidance overviewAaron Collie
 
Data Sharing with ICPSR: Fueling the Cycle of Science through Discovery, Acce...
Data Sharing with ICPSR: Fueling the Cycle of Science through Discovery, Acce...Data Sharing with ICPSR: Fueling the Cycle of Science through Discovery, Acce...
Data Sharing with ICPSR: Fueling the Cycle of Science through Discovery, Acce...ICPSR
 
Understanding ICPSR - An Orientation and Tours of ICPSR Data Services and Edu...
Understanding ICPSR - An Orientation and Tours of ICPSR Data Services and Edu...Understanding ICPSR - An Orientation and Tours of ICPSR Data Services and Edu...
Understanding ICPSR - An Orientation and Tours of ICPSR Data Services and Edu...ICPSR
 
Meeting Federal Research Requirements
Meeting Federal Research RequirementsMeeting Federal Research Requirements
Meeting Federal Research RequirementsICPSR
 
2011 dlf-halbert-data res
2011 dlf-halbert-data res2011 dlf-halbert-data res
2011 dlf-halbert-data resmartin-halbert
 
Instructional Data Sets from Q-step Launch Event (Univ of Exeter) 3-20-2014
Instructional Data Sets from Q-step Launch Event (Univ of Exeter) 3-20-2014Instructional Data Sets from Q-step Launch Event (Univ of Exeter) 3-20-2014
Instructional Data Sets from Q-step Launch Event (Univ of Exeter) 3-20-2014ICPSR
 
CU Anschutz Health Science Library Data Services
CU Anschutz Health Science Library Data ServicesCU Anschutz Health Science Library Data Services
CU Anschutz Health Science Library Data ServicesC. Tobin Magle
 
Introduction to data management
Introduction to data managementIntroduction to data management
Introduction to data managementCunera Buys
 
From Data Sharing to Data Stewardship
From Data Sharing to Data StewardshipFrom Data Sharing to Data Stewardship
From Data Sharing to Data StewardshipICPSR
 
Building and providing data management services a framework for everyone!
Building and providing data management services  a framework for everyone!Building and providing data management services  a framework for everyone!
Building and providing data management services a framework for everyone!Renaine Julian
 
Poster RDAP13: Data information literacy multiple paths to a single goal
Poster RDAP13: Data information literacy multiple paths to a single goalPoster RDAP13: Data information literacy multiple paths to a single goal
Poster RDAP13: Data information literacy multiple paths to a single goalASIS&T
 
Organizational Implications of Data Science Environments in Education, Resear...
Organizational Implications of Data Science Environments in Education, Resear...Organizational Implications of Data Science Environments in Education, Resear...
Organizational Implications of Data Science Environments in Education, Resear...Victoria Steeves
 
Using Quantitative Data in Teaching: ICPSR Resources
Using Quantitative Data in Teaching: ICPSR ResourcesUsing Quantitative Data in Teaching: ICPSR Resources
Using Quantitative Data in Teaching: ICPSR ResourcesICPSR
 
Meeting Federal Research Requirements for Data Management Plans, Public Acces...
Meeting Federal Research Requirements for Data Management Plans, Public Acces...Meeting Federal Research Requirements for Data Management Plans, Public Acces...
Meeting Federal Research Requirements for Data Management Plans, Public Acces...ICPSR
 
Research Data Management
Research Data ManagementResearch Data Management
Research Data Managementaaroncollie
 
2013 DataCite Summer Meeting - Closing Keynote: Building Community Engagement...
2013 DataCite Summer Meeting - Closing Keynote: Building Community Engagement...2013 DataCite Summer Meeting - Closing Keynote: Building Community Engagement...
2013 DataCite Summer Meeting - Closing Keynote: Building Community Engagement...datacite
 


Research Data Management

  • 1. MSU Libraries Research Data Management Research Data Management Aaron Collie collie@msu.edu @aaroncollie
  • 2. Introductions • Please tell us your name and department • A brief description of your primary research area • What do you consider to be your research data? • Your experience and/or comfort level with managing research data cc http://www.flickr.com/photos/quinnanya/
  • 3. Agenda • Introduction • Background • The Impetus: NSF Data Management Plan Mandate • The Effect: Policy to Practice • The Response: Changing Data Landscape • Fundamental Practices • File Organization • Data Documentation • Reliable Backup • Data Publishing, Sharing, & Reuse • Protecting Data & Responsible Reuse • Data Lifecycle Resources
  • 4. Data Management. Isn’t that… trivial? • Not so much. Data is a primary output of research, and high-quality data is very expensive to produce. Data may be collected in nanoseconds, but it takes the expert application of research protocol and design to generate data. CC-BY-SA-3.0 Rob Lavinsky CC-BY-SA-3.0 Rob
  • 5. Data is the input of a process that generates higher orders of understanding. Wisdom • Knowledge • Information • Data. Understanding is hierarchical! (Russell Ackoff)
  • 6. This is the engine of the academic industry…
  • 7. This is the engine of the academic industry…
  • 9. The scientific method “is often misrepresented as a fixed sequence of steps,” rather than being seen for what it truly is, “a highly variable and creative process” (AAAS 2000:18). Gauch, Hugh G. Scientific Method in Practice. New York: Cambridge University Press, 2010. Print. (Emphasis added)
  • 10. And, things get a little messy.
  • 11. May you live in interesting times
  • 12. Public Trust of Science: The Crisis Triforce • Scholarly Communication Crisis – Challenge: Increasing costs of access – Opportunity: Open Access • Reproducibility Crisis – Challenge: Failure-to-replicate rates – Opportunity: Open Science • Higher Education Crisis – Challenge: Value of education – Opportunity: Open Education
  • 13. Crisis, Part 1: Scholarly Communication “The cage” • https://www.lib.msu.edu/about/collections/scholcomm/more/ • http://www.arl.org/storage/documents/monograph-serial-costs.pdf
  • 14. Crisis, Part 2: Replication Crisis “The Canary” • Ioannidis JPA (2005) Why Most Published Research Findings Are False. PLoS Med 2(8): e124. doi:10.1371/journal.pmed.0020124 • "Estimating the reproducibility of psychological science". Science 349 (6251). August 28, 2015. doi:10.1126/science.aac4716. Retrieved September 12, 2015.
  • 15. Crisis, Part 3: Higher Education “The coal mine” http://news.bbc.co.uk/onthisday/hi/dates/stories/december/30/newsid_2547000/2547587.stm
  • 16. This is the engine of the academic industry…
  • 18. The Research Depth Chart • Scientific Method • Research Design • Research Method • Research Tasks (axis labels: More Domain Specific, More Generic)
  • 19. Problem Identification Study Concept Literature Review Environmental Scan Funding & Proposal Research Design Research Methodology Research Workflow Hypothesis Formation Design Validation Research Activity Data Management Data Organization Data Storage Data Description Data Sharing Scholarly Communication Report Findings Publish Peer Review
  • 20. Problem Identification Study Concept Literature Review Environmental Scan Funding & Proposal Research Design Research Methodology Research Workflow Hypothesis Formation Design Validation Research Activity Data Management Data Organization Data Storage Data Description Data Sharing Scholarly Communication Report Findings Publish Peer Review
  • 21. Data Management • The process of planning for and implementing a system of care for your research data before, during, and after a research project in order to ensure a (re)usable resource.
  • 22. Agenda • Introduction • Background • The Impetus: NSF Data Management Plan Mandate • The Effect: Policy to Practice • The Response: Changing Data Landscape • Fundamental Practices • File Organization • Data Documentation • Reliable Backup • Data Publishing, Sharing, & Reuse • Protecting Data & Responsible Reuse • Data Lifecycle Resources
  • 23. Agenda • Introduction • Background • The Impetus: NSF Data Management Plan Mandate • The Effect: Policy to Practice • The Response: Changing Data Landscape • Fundamental Practices • File Organization • Data Documentation • Reliable Backup • Data Publishing, Sharing, & Reuse • Protecting Data & Responsible Reuse • Data Lifecycle Resources
  • 24. So why are we here? Good science! Government and Research Funder Mandates
  • 25. But why are we really here? • Impetus: NSF has mandated that all grant applications submitted after January 18, 2011 must include a supplemental “Data Management Plan” • Effect: The original NSF mandate has had a domino effect, and many funders now require or state guidelines for data management of grant-funded research • Response: Data management has not traditionally received a full treatment in (many) graduate and doctoral curricula; intervention is necessary
  • 26. Positive reinforcement… • National Science Foundation Data Management Plan mandate (January 18, 2011) • Presidential Memorandum on Managing Government Records (August 24, 2012) – Managing Government Records Directive: All permanent electronic records in Federal agencies will be managed electronically to the fullest extent possible for eventual transfer and accessioning by NARA in an electronic format.
  • 27. Positive reinforcement… (cont.) • White House policy memo (February 22, 2013) – Increasing Access to the Results of Federally Funded Scientific Research: Federal agencies with more than $100M in R&D expenditures must develop plans to make the published results of federally funded research freely available to the public within one year of publication. • OSTP policy memo (March 20, 2014) – Improving the Management of and Access to Scientific Collections: directs each Federal agency that owns, maintains, or otherwise financially supports permanent scientific collections to develop a draft scientific-collections management and access policy within six months.
  • 28. Positive reinforcement… (cont. w/ teeth!) • AHRQ = “…all AHRQ-funded researchers will be required to include a data management plan for sharing final research data in digital format, or state why data sharing is not possible.” • NASA = “This plan extends NASA’s culture of open data access to all NASA-funded research.” • USDA = Phased approach beginning with DMP • More: http://www.arl.org/focus-areas/public-access-policies/federally-funded-research/2696-white-house-directive-on-public-access-to-federally-funded-research-and-data#agency-policies
  • 29. Federally funded research data • 1980 Forsham v. Harris: Research data not subject to FOIA • 1999 Data Access Act of 1999: OMB Circular A-110 revised; data produced with Federal monies subject to FOIA • 2003 NIH Data Sharing Plans: Grants over $500,000 require plans • 2010 America COMPETES Reauthorization Act: OSTP must coordinate policies for dissemination and stewardship of scholarly publications and data produced with Federal funds • 2011 NSF Data Management Plans: Grants require supplementary DMP • 2013 OSTP memo, Increasing Access to the Results of Federally Funded Research: Requires agencies to develop plans for public access to publications and data. Fischer, E. A. (2013). Public Access to Data from Federally Funded Research: Provisions in OMB Circular A-110 (Congressional Research Service). Retrieved from HTTP://congressional.proquest.com.proxy2.cl.msu.edu/congressional/docview/t21.d22.crs-2013-rsi-0116?accountid=12598
  • 30. Funder Policies • NASA “promotes the full and open sharing of all data” • “requires that data…be submitted to and archived by designated national data centers.” • “expects the timely release and sharing of final research data” • “IMLS encourages sharing of research data.” • “…should describe how the project team will manage and disseminate data generated by the project”
  • 31. • Policies for re-use, re-distribution, and creation of derivatives • Plans for archiving data, samples, and other research outcomes, maintaining access • Types of data, samples, physical collections, software generated • Standards for data and metadata format and content • Access and sharing policies, with stipulations for privacy, confidentiality, security, intellectual property, or other rights or requirements
  • 32. MSU Libraries Research Data Management • NSF will not evaluate any proposal missing a DMP • PI may state that project will not generate data • DMP is reviewed as part of intellectual merit or broader impacts of application, or both • Costs to implement DMP may be included in proposal’s budget • May be up to two pages long
  • 33. MSU Libraries Research Data Management • Investigators seeking $500,000 or more in direct costs in any year should include a description of how final research data will be shared, or explain why data sharing is not possible. • The precise content of the data-sharing plan will vary, depending on the data being collected and how the investigator is planning to share the data. • More stringent data management and sharing requirements may be required in specific NIH Funding Opportunity Announcements. Principal Investigators must discuss how these requirements will be met in their Data Sharing Plans.
  • 34. MSU Libraries Research Data Management  Roles and responsibilities  Expected Data  Period of data retention • Data formats and dissemination • Data storage and preservation of access
  • 35. MSU Libraries Research Data Management Local Policy University Research Council Best Practices: https://rio.msu.edu/research-data Research Data: Management, Control, and Access – To assure that research data are appropriately recorded, archived for a reasonable period of time, and available for review under the appropriate circumstances. • Ownership = MSU • “Stewardship” = You • Period of Retention = 3 years • Transfer of Responsibility = Written Request
  • 36. MSU Libraries Research Data Management Broader Response: Changing Data Landscapes • Data Management Competencies – Standards & Best Practices – Discipline Specific Discourse • Data sharing and open data – Data sets as publications – Data journals – Citations for data (e.g., used in secondary analysis) – Data as supplementary materials to traditional articles – Data repositories and archives
  • 37. MSU Libraries Research Data Management Science Paradigms • Thousand years ago: science was empirical describing natural phenomena • Last few hundred years: theoretical branch using models, generalizations • Last few decades: a computational branch simulating complex phenomena • Today: data exploration (eScience) unify theory, experiment, and simulation – Data captured by instruments or generated by simulator – Processed by software – Information/Knowledge stored in computer – Scientist analyzes database / files using data management and statistics Slide credit: Gray, J. & Szalay, A. (11 January 2007). eScience Talk at NRC-CSTB meeting. http://research.microsoft.com/en-us/um/people/gray/talks/NRC-CSTB_eScience.ppt
  • 38. MSU Libraries Research Data Management Curation responsibilities (Carlson, The Chronicle, 2006) “Data from Big Science is … easier to handle, understand and archive. Small Science is horribly heterogeneous and far more vast. In time Small Science will generate 2-3 times more data than Big Science.” big science data small science data institution? domain? MacColl, John (2010). The Role of libraries in data curation. RLG Partnership Annual Meeting, Chicago. June 2010
  • 39. MSU Libraries Research Data Management What’s in it for me? • Better organization = fewer headaches – Course management – Bibliographic management – File management – Research • Career advancement – Publish datasets and list on your CV – Data management is an “unnamed practice” – name it for yourself and your students!
  • 40. MSU Libraries Research Data Management Data Sharing Impacts • Reinforces open scientific inquiry • Encourages diversity of analysis and opinion • Promotes new research, testing of new or alternative hypotheses and methods of analysis • Supports studies on data collection methods and measurement Cc http://www.flickr.com/photos/pinchof_10/
  • 41. MSU Libraries Research Data Management Data Sharing Impacts • Facilitates education of new researchers • Enables exploration of topics not envisioned by initial investigators • Permits creation of new datasets by combining data from multiple sources
  • 42. MSU Libraries Research Data Management Data as Publication Figure 1. To be published, datasets are typically deposited in a repository to make them available, documented to support reproduction and reuse, and assigned an identifier to facilitate citation. Kratz J and Strasser C (2014) [version 3] Data publication consensus and controversies. F1000Research 3:94 (doi: 10.12688/f1000research.3979.3)
  • 43. MSU Libraries Research Data Management What could come next? • Increasing emphasis on making data available over long time periods from institutionally maintained repositories. – The end of project or PI based data repositories • Increasing pressure for rapid release of data and project transparency – The annual reports may be used to dictate data release in the near future • How these challenges are resolved is a core institutional responsibility. It is important that MSU be a leader in this initiative. – Institutions are broadly competing to establish these data repositories, and I suspect these will be a competitive indicator for proposal success in the future. Slide from Messina, Joseph. (2015) Data Plans in SBE and Geography. CI Forum October 22, 2015
  • 44. MSU Libraries Research Data Management http://retractionwatch.com/2014/01/07/doing-the-right-thing-authors-retract-brain-paper-with-systematic-human-error-in-coding/
  • 45. MSU Libraries Research Data Management • Introduction • Background • The Impetus: NSF Data Management Plan Mandate • The Effect: Policy to Practice • The Response: Changing Data Landscape • Fundamental Practices • File Organization • Data Documentation • Reliable Backup • Data Publishing, Sharing, & Reuse • Protecting Data & Responsible Reuse • Data Lifecycle Resources Agenda
  • 46. MSU Libraries Research Data Management Research Data Management Fundamentals • Documentation • File Organization • Storage & Backup • Data Publishing, Sharing, & Reuse • Protecting Data & Responsible Reuse
  • 47. MSU Libraries Research Data Management Documentation Practices: Overview • Researchers benefit from proper documentation to decipher or reuse their datasets – even prior to thinking about sharing • Think “downstream”
  • 48. MSU Libraries Research Data Management Documentation Practices: Overview 1. At minimum create a README file that you can use to document your project 2. Utilize standards for describing data including Metadata Standards 3. If applicable, use in-line code commentary to explain code (cc) Will Scullin
  • 49. MSU Libraries Research Data Management Create a README file • At minimum, store documentation in readme.txt file or equivalent, with data – What data consists of – How it was collected – Restrictions to distribution or use – Other descriptive information
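As a rough illustration of the README slide, the sketch below writes a readme.txt next to the data with one section per point listed on the slide. The function names `build_readme` and `write_readme` are hypothetical, and the section headings are simply the slide's bullets; this is one possible layout, not a prescribed MSU template.

```python
from pathlib import Path


def build_readme(what, how_collected, restrictions, other=""):
    """Return README text covering the four points listed on the slide."""
    sections = [
        ("What the data consists of", what),
        ("How it was collected", how_collected),
        ("Restrictions to distribution or use", restrictions),
        ("Other descriptive information", other or "None recorded"),
    ]
    lines = []
    for heading, body in sections:
        lines.append(heading.upper())
        lines.append(body.strip())
        lines.append("")  # blank line between sections
    return "\n".join(lines)


def write_readme(data_dir, text):
    """Store readme.txt in the same directory as the data it describes."""
    path = Path(data_dir) / "readme.txt"
    path.write_text(text, encoding="utf-8")
    return path
```

Keeping the file as plain text means it stays readable without any particular software, which fits the later advice on non-proprietary formats.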
  • 50. MSU Libraries Research Data Management • “Data about data” • Standardized way of describing data • Explains who, what, where, when of data creation and methods of use • Data more easily found • Data more easily compared to other data sets Use Metadata Standards
  • 51. MSU Libraries Research Data Management Use Metadata Standards Basic project metadata: • Title • Language • File Formats • Creator • Dates • File Structure • Identifier • Location • Variable List • Subject • Methodology • Code Lists • Funders • Data Processing • Versions • Rights • Sources • Checksums • Access Information • List of File Names
  • 52. MSU Libraries Research Data Management Use Metadata Standards • Dublin Core: Commonly-used descriptive metadata format facilitates dataset discovery across the Web. • Data Documentation Initiative (DDI): Defines metadata content, presentation, transport, and preservation for the social and behavioral sciences. • ISO 19115:2003: Describes geographic data such as maps and charts. • More examples: http://www.lib.msu.edu/about/diginfo/collect.jsp
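To make the Dublin Core bullet concrete, here is a minimal sketch that serializes a few of the basic project metadata fields as Dublin Core elements. The `<metadata>` wrapper element, the function name `dublin_core_record`, and the sample values are all assumptions for illustration; a repository will normally prescribe its own container and required fields.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Serialize a dict of Dublin Core element names to a small XML record.

    The <metadata> wrapper is a placeholder chosen for this sketch.
    """
    root = ET.Element("metadata")
    for name, value in fields.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical dataset description used only to exercise the function.
record = dublin_core_record({
    "title": "Krill micrograph backscatter images",
    "creator": "Sharpe, W.",
    "date": "2011-01-17",
    "format": "image/tiff",
    "rights": "CC BY 4.0",
})
print(record)
```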
  • 53. MSU Libraries Research Data Management Use In-Line Code Commentary Example of R code commentary # Cumulative normal density pnorm(c(-1.96,0,1.96)) • If applicable, in-line code commentary helps explain code
  • 54. MSU Libraries Research Data Management File Organization Practices: Overview 1. Design a file plan for your research project 2. Use file naming conventions that work for your project 3. Choose file formats to maximize usefulness “When I was a freshmen I named my assignments Paper Paperr Paperrr Paperrrr” -Undergrad
  • 55. MSU Libraries Research Data Management Design a File Plan • File structure is the framework • Classification system makes it easier to locate folders/files • Benefits: – Simple organization intuitive to team members and colleagues – Reduces duplicate copies in personal drives and e-mail attachments
  • 56. MSU Libraries Research Data Management Design a File Plan Choose a sortable directory hierarchy • Example 1: Investigator, Process, Date Collie TEI_Encoding 20110117 • Example 2: Instrument, Date, Sample Usability Survey 2012043 sample_1
  • 57. MSU Libraries Research Data Management Design a File Plan Example documentation of Directory Hierarchy: /[Project]/[Grant Number]/[Event]/[Investigator/Date]
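The documented hierarchy can also be enforced in code so that every new folder follows it. The sketch below assumes the final `[Investigator/Date]` level means an investigator folder containing a date folder; the function name `project_path` and the sample values are hypothetical.

```python
from pathlib import PurePosixPath


def project_path(project, grant_number, event, investigator, date):
    """Build /[Project]/[Grant Number]/[Event]/[Investigator]/[Date].

    Treating Investigator and Date as two nested levels is an assumption
    about the slide's template, not something the slide states.
    """
    parts = [project, grant_number, event, investigator, date]
    for part in parts:
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return PurePosixPath("/", *parts)


# Dates written as YYYYMMDD sort chronologically when sorted as text.
example = project_path("KrillSurvey", "G0001", "Cruise1", "Collie", "20110117")
print(example)
```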
  • 58. MSU Libraries Research Data Management Use File Naming Conventions – Enable better access/retrieval of files – Create logical sequences for file sorting – More easily identify what you’re searching for
  • 59. MSU Libraries Research Data Management • Meaningful but short—255 character limit • Use alphanumeric characters – Example: abc123 • Capital letters or underscores differentiate between words • Surname first followed by initials of first name Use File Naming Conventions
  • 60. MSU Libraries Research Data Management • Year-month-day format for dates, with or without hyphens Example 1: 2006-03-13 Example 2: 20060313 • Decide on a simple versioning method Example: file_v001 Use File Naming Conventions
  • 61. MSU Libraries Research Data Management Use File Naming Conventions • To create consistent file names, specify a template such as: [investigator]_[descriptor]_[YYYYMMDD].[ext] • This: sharpeW_krillMicrograph_backscatter3_20110117.tif Not this: KrillData2011.tif • This: borgesJ_collocation_20080414.xml Not this: Borges_Textbase.xml
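A naming template is easier to keep consistent when file names are generated and checked by a small helper. The following sketch builds names from the [investigator]_[descriptor]_[YYYYMMDD].[ext] template and tests existing names against it; the regular expression and the function names are assumptions about how strictly a project might read the template.

```python
import re
from datetime import date

# investigator, then descriptor (may contain underscores), then an 8-digit date
NAME_PATTERN = re.compile(r"^[A-Za-z0-9]+_[A-Za-z0-9_]+_\d{8}\.[A-Za-z0-9]+$")


def follows_convention(name):
    """Return True if a file name matches the template's overall shape."""
    return bool(NAME_PATTERN.match(name))


def make_filename(investigator, descriptor, day, ext):
    """Build a name from the template; `day` is a datetime.date."""
    name = f"{investigator}_{descriptor}_{day:%Y%m%d}.{ext}"
    if not follows_convention(name):
        raise ValueError(f"name does not fit the convention: {name!r}")
    return name
```

The check only looks at the shape of the name (for example, it does not confirm that the eight digits form a real calendar date), so it is a first-pass filter rather than full validation.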
  • 62. MSU Libraries Research Data Management Choose Appropriate File Formats • Non-proprietary • Open, documented standard • Common usage by research community • Standard representation (ASCII, Unicode) • Unencrypted • Uncompressed
  • 63. MSU Libraries Research Data Management Choose Appropriate File Formats • Format genre: optimal standards • TEXT: .txt; .odt; .xml; .html • AUDIO: .flac; .wav • VIDEO: .mp2/.mp4; .mkv • IMAGE: .tif; .png; .svg; .jpg • DATA: .sql; .csv
  • 64. MSU Libraries Research Data Management Storage & Backup Practices 1. Avoid single points of failure 2. Ensure data redundancy & replication 3. Understand common types of storage (cc) George Ornbo Data at significant risk of loss without storage and backup plan
  • 65. MSU Libraries Research Data Management Avoid Single Points of Failure A single point of failure occurs when it would only take one event to destroy all data on a device • Use managed networked storage when possible • Move data off of portable media • Never rely on one copy of data • Do not rely on CD or DVD copies to be readable • Be wary of software lifespans
  • 66. MSU Libraries Research Data Management Ensure Data Redundancy • Effective data storage plan provides for 3 copies: – Primary authoritative copy – Secondary local backup – Tertiary remote backup • Geographically distribute and secure – Local vs. remote, depending on needed recovery time • Personal computer, external hard drives, departmental, or university servers may be used
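Keeping three copies only helps if the copies are actually identical. One common way to check is to compare checksums, which the metadata slide also lists as a basic project metadata field. This sketch uses SHA-256 from Python's standard library; the function names and the choice of SHA-256 are assumptions, not something the slides specify.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(primary, *backups):
    """Map each backup path to True if its checksum equals the primary's."""
    expected = sha256_of(primary)
    return {str(backup): sha256_of(backup) == expected for backup in backups}
```

A call such as `copies_match("data.csv", "/mnt/local/data.csv", "/mnt/remote/data.csv")` (paths hypothetical) would report which of the secondary and tertiary copies still agree with the primary authoritative copy.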
  • 67. MSU Libraries Research Data Management Ensure Data Redundancy • Cloud storage – Amazon s3 – Google – MS Azure – DuraCloud – Rackspace – Glacier Note that many enterprise cloud storage services include a charge for in/out of data transfers $$$
  • 68. MSU Libraries Research Data Management Understand Common Types of Storage • Optical Media • Portable Flash Media • Commercial Hard Drives • Commercial NAS • Cloud Storage • Enterprise Network Storage • Trusted Archival Storage
  • 69. MSU Libraries Research Data Management Understand Common Types of Storage • Features of storage types: • Portable data transfers • Short-term storage • Project term storage • Networked data transfer • Long-term storage • Reliable backup option
  • 70. MSU Libraries Research Data Management Understand Common Types of Storage (columns, in order: Portable Data Transfer | Short Term Storage | Project Term Storage | Networked Data Transfer | Long Term Storage | Reliable Backup Option) • Optical Media: ✔ ✗ ✗ ✗ ✗ ✗ • Portable Flash Media: ✔ ✔ ✗ ✗ ✗ ✗ • Commercial Hard Drives: ✔ ✔ ✔ ✗ ✗ ✗ • Commercial NAS: ✗ ✔ ✔ ✔ ✗ ✗ • Cloud Storage: ✗ ✔ ✔ ✔ ✗ ✗ • Enterprise Network Storage: ✗ ✔ ✔ ✔ ✔ ✔ • Trusted Archival Storage: ✗ ✗ ✗ ✔ ✔ ✔
  • 71. MSU Libraries Research Data Management Understand Common Types of Storage: Media Storage @ MSU • Optical Media: MSU Computer Store (sells Optical Media and hardware accessories); UAHC Media Storage Service (offers physical lock-box like storage for MSU) • Flash Media: MSU Computer Store (sells Optical Media and hardware accessories); UAHC Media Storage Service (offers physical lock-box like storage for MSU) • Commercial Hard Drives: MSU Computer Store (sells Optical Media and hardware accessories); UAHC Media Storage Service (offers physical lock-box like storage for MSU) • Enterprise Cloud Storage: Angel (free; ideal for collaboration, not storage space; phase out 2015); Desire2Learn (free; ideal for collaboration, not storage space; replaces Angel); GoogleApps (free; ideal for collaboration, not intended as storage space) • Enterprise Network Storage: AFS Space (free to 1GB, add’l space can be purchased w/dept. account); IT Services Individual, Mid-Tier and Enterprise Storage (fee based); HPCC Home or Research (free up to 1TB; fee based additions available) • Trusted Archival Storage: Disciplinary Repositories (disciplinary repositories offer archival services for pertinent research data)
  • 72. MSU Libraries Research Data Management Data Publishing, Sharing, Reuse 1. Time-intensive, with potentially high return on investment 2. Publish data in several data publication venues to more broadly share results of research Research datasets on par with peer-reviewed journal articles as first-class scholarly contributions
  • 73. MSU Libraries Research Data Management Sharing & Publishing Data • Data preparation for sharing and publication is a time-intensive process • Potential positive outcomes: • Increased research impact and citations • Enable additional scientific inquiry • Opportunities for co-authorship and collaboration • Enhance your grant proposal’s competitiveness
  • 74. MSU Libraries Research Data Management Data Publication Venues • Multiple ways to publish research data • Faculty or project website • Journal supplementary materials • Disciplinary data repository (data archive) • Varying levels of support for indexing, access controls, and long-term curation
  • 75. MSU Libraries Research Data Management Data Publication Venues • Disciplinary Data Repository • Securely share data, ensure long-term access • High visibility • Often offer persistent citations • Availability varies across domains • Databib.org directory
  • 77. MSU Libraries Research Data Management Protecting Data & Responsible Reuse 1. Consider how to protect data and intellectual property rights while encouraging reuse 2. Keep in mind ethical concerns when sharing data (cc) Will Scullin
  • 78. MSU Libraries Research Data Management Intellectual Property • IP refers to exclusive rights of creators of works • Individual data cannot be protected by US copyright • Organization of data such as database, creative work produced by data, and research instruments used may be protected
  • 79. MSU Libraries Research Data Management Intellectual Property • Principal investigator’s institution holds IP rights • Provide clearly stated license for producing derivatives, reusing, and redistributing datasets • License under Creative Commons • State if any restrictions or embargos on use • Provide example of how work should be cited to encourage proper attribution on reuse • Document any IP / copyright issues
  • 80. MSU Libraries Research Data Management Ethics & Data Sharing • Keep in mind the following ethical concerns when sharing your data: • Privacy • Confidentiality • Security and integrity of the data • For data involving human subjects, obtain written permission or consent stating how the data may be reused
  • 81. MSU Libraries Research Data Management Best Practices = High Impact Data • File organization ensures easier access and retrieval of data • Documentation makes datasets accessible and intelligible to users • Storage and backup safeguards data • Data publishing and sharing encourages the most widespread reuse of data • Data protection ensures responsible reuse
  • 82. MSU Libraries Research Data Management • Introduction • Background • The Impetus: NSF Data Management Plan Mandate • The Effect: Policy to Practice • The Response: Changing Data Landscape • Fundamental Practices • File Organization • Data Documentation • Reliable Backup • Data Publishing, Sharing, & Reuse • Protecting Data & Responsible Reuse • Data Lifecycle Resources Agenda
  • 83. MSU Libraries Research Data Management Volunstrordinaries! Aaron Collie Devin Higgins Brandon Locke Ranti Junus Judy Matthews Tina Qin
  • 84. MSU Libraries Research Data Management We teach people about RDM Librarianship Training Assessment Consultation Ad-hoc 6-12 new clients per semester 100% satisfied / 100% would use again 71% of new clients are referrals 60% requested additional services 15% through NFO, 14% through website
  • 85. MSU Libraries Research Data Management http://www.lib.msu.edu/rdmg
  • 86. MSU Libraries Research Data Management RDM@MSU 101 • Who: You, as the designated steward • What: “the data” • When: Minimum 3 years after publ./degree • Where: Managed networked storage • Why: Legal, Ethical, Scholarly • How: With fidelity and documentation sufficient to reproduce the research
  • 87. MSU Libraries Research Data Management Contact Aaron Collie collie@msu.edu @aaroncollie http://www.lib.msu.edu/rdmg

Editor's Notes

  1. Data management is about more than just the lost back-pack. It is about expert application. Expert application in any industry is expensive.
  2. In the academic industry data is the input to our final product. It takes years of training and experience to succeed in this field.
  3. Research is a process, it is scientific, and we use an overarching model to describe the process at a high level. But this is a conceptual model, it is not a process model. But this is a pretty sterile model; and we know that because it is not prescriptive to all academic disciplines.
  4. Research is a process, it is scientific, and we use an overarching model to describe the process at a high level. But this is a conceptual model, it is not a process model. But this is a pretty sterile model; and we know that because it is not prescriptive to all academic disciplines.
  5. In practice, research is a complicated process. It is a creative process as well as a scientific process.
  6. This has been noticed.
  7. Research is hard, managing research is boring.
  8. And right now is a really good time to be O.K. with this, because science is under attack. And, given the interesting times we live in, it might be a good opportunity to get our house in order.
  9. So back to this. Research is a process, it is scientific, and we use an overarching model to describe the process at a high level. But this is a conceptual model, it is not a process model. But this is a pretty sterile model; and we know that because it is not prescriptive to all academic disciplines.
  10. You might think of the scientific method as a bit of an iceberg model. At the tip of the iceberg are these general activities, but research isn’t really conducted at this high of a level.
  11. Research is a thing that happens at many levels simultaneously. The more experience you gain with research, the more of the depth chart you develop expertise within.
  12. Data management is a subprocess of research. It is part of a holistic research method that includes a ton of other functions like funding, literature reviews, workflows and publication.
  13. Today we are just going to focus on one of these areas: data management.
  14. HANDOUT: DMP (blue)
  15. Federal debate on the right of public access to government-funded research data dates back to 1980 and Forsham v. Harris, which held that research data are not subject to FOIA. Private grantees are not agencies subject to FOIA, and the data had not been created or obtained by a federal agency. The plaintiffs were physicians seeking to obtain data underlying a Dept. of Health, Education, and Welfare report on diabetes treatment regimens. Scientific research funding via grants was meant to allow the scientific knowledge system to operate on its established norms free from partisan influence. The 1999 “Shelby Amendment” changed this by revising OMB Circular A-110, making research data subject to FOIA if it is “research data relating to published research findings produced under an award that were used by the Federal Government in developing an agency action that has the force and effect of law.” Senator Shelby advocated for the amendment on the basis of (1) transparency and (2) accountability. A further policy argument for data access was “return on scientific capital.” The amendment was based on an effort to obtain data from Harvard’s “Six Cities” study (funded by NIH) showing a link between particulate air pollution and health, which was the scientific basis of debates around EPA air quality regulations. NIH: a 2003 policy requires data sharing plans for grants over $500K, and since 1994 NIH has included a policy requiring that results be made available to the public. NSF: has had a data sharing policy in place since 1989 which states that grantees are expected to share data and are encouraged to facilitate data sharing, but it does not specifically require formal data publication. Neither does the 2011 DMP policy, although it has been a major force in increased attention to open data. In recent years, reports from the National Academies and elsewhere on the merits of CI and data sharing have snowballed.
In 2010 the ACRA required the Director of OSTP to coordinate agency policies “related to the dissemination and long-term stewardship of the results of unclassified research, including digital data and peer-reviewed scholarly publications, supported wholly, or in part, by funding from the Federal science agencies.” As a result, the 2013 OSTP memo requires federal agencies funding more than $100 million in R&D annually to develop and implement plans for public access to publications and data.
  16. National Oceanic and Atmospheric Administration (NOAA). IMLS encourages sharing of research data. Applications that develop digital products must fill out an additional form with ten questions focused on “Developing Data Management Plans for Research Projects.” “The federal government has the right to obtain, reproduce, publish or otherwise use the data first produced under an award and authorize others to do so for government purposes.” Ex: Digging Into Data
  17. HANDOUT: DMP examples (white)
  18. NSF’s data management plan requirement: may be up to two pages long; PI may state that project will not generate data or samples; DMP is reviewed as part of intellectual merit or broader impacts of application, or both.
  19. HANDOUT: DMP examples (white)
  20. HANDOUT: DMP examples (white)
  21. (OMB Circular A-110, Sec. 53; 42CFR, Part 50, Subpart A)
  22. Replication, transparency, re-use, mashups, repurposing, extending grant dollars and enabling more research…
  23. 37
  24. In fact, it is one thing to share data and quite another to publish data. Data publication allows for data to be a “first class object” as part of the scholarly record, allowing for collection, management, curation, and citation. Dissenting opinion – data publication is a round peg in a square hole, shouldn’t try and make it conform to an antiquated system of scholarly communication. Pragmatism = treat as publication
  25. Bad press
  26. Benefits include: electronic documents maintained together in one place and easily accessible to project staff; data backed up and recoverable in the event of system failure; promoting a culture of sharing information as an institutional resource, rather than individual ownership; fewer duplicate copies in personal drives and email attachments.
  27. Starting point
  28. nuances of metadata -- data dictionaries, lab notebooks / journals,
  29. Starting point
  30. Descriptive documentation that accompanies a dataset
  31. Better project transitions
  32. Electronic documents maintained together in one place, easily accessible to project staff. Reduces duplicate copies in personal drives and email attachments. (Hierarchical/taxonomical/temporal)
  33. Benefits include: electronic documents maintained together in one place and easily accessible to project staff; data backed up and recoverable in the event of system failure; promoting a culture of sharing information as an institutional resource, rather than individual ownership; fewer duplicate copies in personal drives and email attachments.
  34. Benefits include: electronic documents maintained together in one place and easily accessible to project staff; data backed up and recoverable in the event of system failure; promoting a culture of sharing information as an institutional resource, rather than individual ownership; fewer duplicate copies in personal drives and email attachments.
  35. Will know how to name future folders as your project grows.
  36. Good practices
  37. Good choices include… Consider later lifecycle activities Flexible What format used for analysis, preservation, etc.
  38. Consider later lifecycle activities Flexible What format used for analysis, preservation, etc.
  39. Data at significant risk of loss without storage and backup plan, including: hardware / network failures; bit rot; human error; singular commercial grade hard drives. Effective data storage plan provides for: primary authoritative copy; secondary local backup; tertiary remote backup.
  40. One event might be a dropped hard drive. Good practices: be wary of software lifespans, such as with course management software like ANGEL or Desire2Learn.
  41. Examples of 3 copies: original + external/local + external/remote; original + 2 formats on 2 drives in 2 locations. Mention new Backup Media Storage service offered by the University Archives.
  42. Mention new Backup Media Storage service offered by the University Archives. ANGEL, Desire2Learn, and Google Apps might be considered Cloud offerings from MSU. Good for collaboration and short term, don’t use for long-term storage. Not immune to data loss – Dedoose example.
  43. In booklet For example….
  44. Include description
  45. Angel and Desire2Learn not intended as storage space For more information on disciplinary repositories, contact RDMG or peruse Databib.org
  46. In booklet For example….
  47. In booklet For example….
  48. In booklet For example….
  49. In booklet For example….
  50. Principal investigator’s institution holds IP rights-- usually
  51. File organization ensures easier access and retrieval of data during and after project. Documentation makes datasets accessible and intelligible to users. Storage and backup safeguards data against technical failure, human error, and natural catastrophe. Data publishing and sharing encourages the most widespread reuse of data. Data protection ensures responsible reuse in light of intellectual property and ethical concerns. Together these practices increase the impact of data and promote new research opportunities.
  52. Service model
  53. A Plus / Delta exercise focusing on extant infrastructure and services. Weave known MSU resources. Discussion starters: Describe your interaction with dept, college, university, external bodies? What makes managing research data difficult? What services/tools do you need/want? Advice; website; database designers; targeted seminar series; data storage and curation options.