Spring 2014 Data Management Lab: Session 2 Slides (more details at http://ulib.iupui.edu/digitalscholarship/dataservices/datamgmtlab)
What you will learn:
1. Build awareness of research data management issues associated with digital data.
2. Introduce methods to address common data management issues and facilitate data integrity.
3. Introduce institutional resources supporting effective data management methods.
4. Build proficiency in applying these methods.
5. Build strategic skills that enable attendees to solve new data management problems.
3. DMP
Data map: complete the partially mapped
research question
OR
Start your own data map
Don’t forget to upload your DMP to Box.
Suggested file name: DMP_20140401
7. What is Metadata
[Figure, from Michener et al 1997: data details (vertical axis) plotted against time (horizontal axis). Labels along the curve, in order:
• Time of data development
• Specific details about problems with individual items or specific dates are lost relatively rapidly
• General details about datasets are lost through time
• Accident or technology change may make data unusable
• Retirement or career change makes access to “mental storage” difficult or unlikely
• Loss of data developer leads to loss of remaining information]
8. Why do we document?
“Scientific publications have at least two goals:
(i) to announce a result and (ii) to convince
readers that the result is correct… papers in
experimental science should describe the results
and provide a clear enough protocol to allow
successful repetition and extension”
-Mesirov, 2010
9. Analysis and Workflows
• Reproducibility at core of scientific method
• Complex process = more difficult to reproduce
• Good documentation required for reproducibility
o Metadata: data about data
o Process metadata: data about process used to create, manipulate,
and analyze data
CC image by Richard Carter on Flickr
Provenance: where your data
came from and what has been
done to it
Crucial for replication/
reproducibility
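One way to capture process metadata is to log each processing step as it happens. The following is a minimal Python sketch, not part of the original slides; the function names, the JSON Lines log format, and the example file names are all assumptions chosen for illustration.

```python
import json
import os
import tempfile
from datetime import datetime, timezone


def record_step(log_path, action, inputs, outputs, tool):
    """Append one processing step to a JSON Lines provenance log."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": action,      # what was done to the data
        "inputs": inputs,      # files that were read
        "outputs": outputs,    # files that were written
        "tool": tool,          # script or software used
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry


def read_log(log_path):
    """Return all recorded steps, oldest first."""
    with open(log_path, encoding="utf-8") as log:
        return [json.loads(line) for line in log if line.strip()]


if __name__ == "__main__":
    # Hypothetical file and script names, used only for the demonstration.
    with tempfile.TemporaryDirectory() as scratch:
        log_path = os.path.join(scratch, "provenance.jsonl")
        record_step(log_path, "remove duplicate rows",
                    ["survey_raw.csv"], ["survey_cleaned.csv"], "clean.py")
        for step in read_log(log_path):
            print(step["timestamp"], step["action"])
```

An append-only text log keeps the record of what has been done to the data next to the data itself, in a format that needs no special software to read.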
10. Why do we document?
• Provide an accurate, reliable record of your work
– Including all the details you will not remember when
it’s time to write up the project
• Facilitate writing of high quality publications
• Necessary for reproducibility, a core principle of
scientific process
• Establish provenance
– Relevant to commercial application and patents
(legal), defending your publications (scientific),
responsible conduct of research (scientific)
11. Best Practices
Best Practices for Preparing Ecological Data Sets, ESA, August 2010
The 20-Year Rule
• The metadata accompanying a data set should be
written for a user 20 years into the future--what
does that investigator need to know to use the
data?
• Prepare the data and documentation for a user
who is unfamiliar with your project, methods, and
observations
13. What?
• Everything that is crucial for others to
understand, interpret, evaluate, and build on
your work
• What do YOU think?
• Metadata should capture the who, what,
when, where, how, why of your data
14. Think-Pair-Share
What do you think you need to document
about your project?
Share with your partner/group
Share with the class
15. What? How much?
Project-level
• Project history, aims, objectives and hypotheses
• Data collection methods: data collection protocol, sampling design,
instruments, hardware and software used, data scale and resolution,
temporal coverage and geographic coverage
• Dataset structure of data files, cases, relationships between files
• Data sources used (enough detail to find it again)
• Data validation, checking, proofing, cleaning and other quality assurance
procedures carried out
• Modifications made to data over time since their original creation and
identification of different versions of datasets
• Information on data confidentiality, access and use conditions
16. What? How much?
Data-level
• Names, labels and descriptions for variables, records and their values
• Units of measurement
• Explanation of codes and classification schemes used
• Codes of, and reasons for, missing values
• Derived data created after collection, with code, algorithm or command
file used to create them
• Weighting and grossing variables created
• Data listing with descriptions for cases, individuals or items studied
• Equipment, instruments, or other data collection tools used
• Field, lab, or interview conditions
17. What? How much?
• What went right
– So you can repeat/replicate it
• What went wrong
– So you can determine the cause (e.g., human error,
machine error, etc.) and prevent it from happening again
18. Some Effective Strategies
• Data
– Data models
– Data dictionaries
– Metadata
• Project (see Documentation Instructions for
examples)
– Procedures Manual
– Protocols
– Lab Notebooks
– Codebook
– Reference Libraries
20. Data Dictionary
A description of all study variables; for each variable:
• Variable name
• Role of the variable (analytical)
• Variable label
• Unit of measurement (if applicable)
• Type of variable
• Permissible values or range of values
• Definitions of redefined or derived variables
• Additional edits to be performed (logic & consistency)
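A data dictionary can also be kept in machine-readable form so that permissible values are checked automatically. The sketch below is one possible Python representation, not the format used in the lab; the class, the field names, and the two example variables are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class VariableEntry:
    """One row of a data dictionary (fields mirror the list above)."""
    name: str                                   # variable name
    label: str                                  # variable label
    role: str                                   # analytical role, e.g. "outcome"
    var_type: str                               # type of variable
    unit: Optional[str] = None                  # unit of measurement, if applicable
    allowed_values: Optional[Sequence] = None   # permissible categorical values
    value_range: Optional[tuple] = None         # (min, max) for numeric variables
    derivation: Optional[str] = None            # definition of a derived variable


def check_value(entry, value):
    """Return True if value is permissible under the dictionary entry."""
    if entry.allowed_values is not None and value not in entry.allowed_values:
        return False
    if entry.value_range is not None:
        low, high = entry.value_range
        if not (low <= value <= high):
            return False
    return True


# Hypothetical example entries, not taken from any real study.
DATA_DICTIONARY = [
    VariableEntry("age_years", "Age at enrollment", "covariate", "integer",
                  unit="years", value_range=(18, 99)),
    VariableEntry("smoker", "Current smoking status", "exposure", "categorical",
                  allowed_values=("yes", "no", "unknown")),
]

if __name__ == "__main__":
    age = DATA_DICTIONARY[0]
    print(check_value(age, 42))    # inside the permitted range
    print(check_value(age, 140))   # outside the permitted range
```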
22. What is Metadata?
• A structured set of terms describing a defined world
– Standardized
– Structured
• Metadata can be created automatically or manually
• Ex: ClinicalTrials.gov
23. Why Use/Create Metadata?
• Metadata is critical for communicating context for data
• How is metadata used?
– To find things
– To describe things
– To merge things
• Metadata standards define a common set of terms and
structure to communicate information
– Enables consistency, shared definitions, shared language, and
shared structure for interoperability
• Different standards have been developed for different
purposes (social science data, clinical trials, ecology)
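To illustrate what "structured" means in practice, here is a small Python sketch of a metadata record with a fixed set of fields and a completeness check. It does not follow any particular metadata standard; the who/what/when/where/how/why field names are borrowed from the earlier slide, and the record values are placeholders.

```python
import json

# Assumed field set for illustration; a real standard defines its own terms.
REQUIRED_FIELDS = ("who", "what", "when", "where", "how", "why")


def missing_fields(record):
    """Return the required fields that are absent or empty in a record."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f, "")).strip()]


# Hypothetical record; the values are placeholders, not a real study.
record = {
    "who": "Example Lab, Example University",
    "what": "Survey responses on sleep habits",
    "when": "2014-01-15/2014-03-30",
    "where": "Example City",
    "how": "Online questionnaire, convenience sample",
    "why": "",
}

if __name__ == "__main__":
    print(json.dumps(record, indent=2))
    print("Missing:", missing_fields(record))
```

Because every record uses the same field names, records from different projects can be searched, compared, and merged, which is the point of a shared structure.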
27. Think-Pair-Share
Transform narrative description to structured
metadata using the provided template.
Write the information corresponding to the field
on your index card.
Abstract at http://doi.org/10.1542/peds.2013-1488
32. LEARNING OUTCOMES
• Develop a consistent and coherent file organization and naming convention scheme for all project files.
• Select appropriate non-proprietary hardware and software formats for storing data.
• Create protected copies of files at crucial points in your study.
• Use versioning software or documentation for tracking changes to files over time.
33. File Organization & Naming
• Be Clear, Concise, Consistent, Correct, and
Conformant
• Consider what is necessary to find and access
files next year and when the project is
complete.
• Develop a scheme and use it.
• Track changes.
34. Organization: Filing v. Piling
• Filing (hierarchical)
– When organizing files, the name of the top-level folder
should include the project title, unique identifier, and
date (year).
– The substructure should have a clear, documented
naming convention; for example, each run of an
experiment, each version of a dataset, and/or each
person in the group.
• Piling (tags)
– All files in one directory, rely on sorting and searching.
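The filing approach can be set up once with a short script so every project starts from the same skeleton. This is a minimal Python sketch under assumed names: the project title, identifier, and subfolder layout below are hypothetical, and the demonstration writes only to a temporary directory.

```python
import tempfile
from pathlib import Path


def create_project_tree(root, title, project_id, year, subfolders):
    """Create <title>_<id>_<year>/ under root with the given subfolders."""
    top = Path(root) / f"{title}_{project_id}_{year}"
    for name in subfolders:
        (top / name).mkdir(parents=True, exist_ok=True)
    return top


if __name__ == "__main__":
    # Hypothetical project details, used only for the demonstration.
    with tempfile.TemporaryDirectory() as scratch:
        top = create_project_tree(
            scratch, "SleepStudy", "P0042", 2014,
            ["data/raw", "data/cleaned", "docs", "analysis/run_01"],
        )
        for path in sorted(p.relative_to(scratch) for p in top.rglob("*")):
            print(path)
```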
38. Naming Files
• Be Clear, Concise, Consistent, Correct, and
Conformant
• Make it meaningful
• Remember the purpose is to provide context
39. Elements of a File Name
• Project/grant name and/or number
• Date of creation/modification
• Name of creator/investigator: last name first
followed by (initials of) first name
• Research team/department associated with the data
• Content or subject descriptor
• Data collection method (instrument, site, etc.)
• Version number
• Project phase
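A naming scheme is easier to apply consistently when the name is assembled by a function rather than typed by hand. The Python sketch below combines a few of the elements listed above; the order of the elements, the underscore separator, and the example values are assumptions, not a prescribed convention.

```python
from datetime import date


def build_file_name(project, created, creator, descriptor, version, extension):
    """Join name elements as project_date_creator_descriptor_vNN.ext."""
    parts = [
        project,
        created.strftime("%Y%m%d"),   # year-month-day sorts chronologically
        creator,                      # last name first, then initial
        descriptor,
        f"v{version:02d}",            # zero-padded version number
    ]
    return "_".join(parts) + "." + extension


if __name__ == "__main__":
    # Hypothetical values for illustration.
    print(build_file_name("DMLab", date(2014, 4, 1), "SmithJ", "survey", 3, "csv"))
    # prints: DMLab_20140401_SmithJ_survey_v03.csv
```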
41. Technical Tips
• For sequential numbering, use leading zeros.
– For example, a sequence of 1-10 should be numbered 01-10; a
sequence of 1-100 should be numbered 001-100 (e.g., 001, 010, 100).
• No special characters in file names
& , * % # ; * ( ) ! @$ ^ ~ ' { } [ ] ? < > -
• Use only one period, and only before the file extension (e.g.,
name_paper.doc NOT name.paper.doc OR
name_paper..doc)
• Will your files still be unique and comprehensible
if moved to another location?
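These tips can be checked mechanically. The following Python sketch pads sequence numbers with leading zeros and tests a name against a conservative pattern: letters, digits, and underscores, followed by a single period and an extension. The pattern is an assumption; it rejects the special characters listed above and also rejects spaces, which the slide does not mention.

```python
import re

# Letters, digits, and underscores, then exactly one period and an extension.
SAFE_NAME = re.compile(r"[A-Za-z0-9_]+\.[A-Za-z0-9]+")


def padded_labels(count):
    """Number items 1..count with enough leading zeros to sort correctly."""
    width = len(str(count))
    return [str(i).zfill(width) for i in range(1, count + 1)]


def is_safe_name(file_name):
    """True if the name has no special characters and a single period."""
    return SAFE_NAME.fullmatch(file_name) is not None


if __name__ == "__main__":
    print(padded_labels(10)[:3])            # ['01', '02', '03']
    print(is_safe_name("name_paper.doc"))   # True
    print(is_safe_name("name.paper.doc"))   # False
```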
42. Think-Pair-Share
• Develop a file naming scheme for your
project (enter it in your DMP).
• Share it with your partner.
• Share with class.
43. File Formats
• Choose formats that are more likely to be accessible
in the future (10-20 years)
– Non-proprietary
– Open, documented standard
– Commonly used
– Standardized (ASCII, Unicode)
• Also, if possible
– Unencrypted
– Uncompressed
• Ex: PDF/A (not .doc/x), ASCII (not .xls/x), MPEG-4,
TIFF or JPEG2000, XML or RDF (not RDBMS)
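As an illustration of storing tabular data in an open, plain-text format, the Python sketch below writes rows to CSV encoded as UTF-8 using only the standard library. The column names and values are hypothetical, and CSV is just one reasonable choice of open format.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Serialize tabular data as plain-text CSV."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def save_csv(path, header, rows):
    """Write the CSV text to disk as UTF-8."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(rows_to_csv(header, rows))


if __name__ == "__main__":
    # Hypothetical columns and values for illustration.
    print(rows_to_csv(["id", "value"], [[1, 2.5], [2, "a,b"]]))
```

Fields that contain the delimiter are quoted automatically, so the file can be read back without ambiguity by any tool that understands CSV.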
44. Master Files
• Provide snapshots of key phases in the data
life cycle
– Raw
– Cleaned
– Phases of processing
• In combination with detailed documentation,
these files make write-up easier and support
reproducibility and reuse
• Demonstrate provenance (i.e., an audit trail)
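A master file can be created by copying the working file, marking the copy read-only, and recording a checksum so later changes are detectable. This Python sketch shows one way to do that; the function names, the naming pattern for the copy, and the choice of SHA-256 are assumptions rather than a procedure from the lab.

```python
import hashlib
import shutil
import stat
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def lock_master_copy(source, master_dir, phase):
    """Copy source to master_dir as <stem>_<phase><suffix> and make it read-only."""
    source = Path(source)
    master_dir = Path(master_dir)
    master_dir.mkdir(parents=True, exist_ok=True)
    target = master_dir / f"{source.stem}_{phase}{source.suffix}"
    shutil.copy2(source, target)
    target.chmod(stat.S_IREAD)    # owner may read; write permission removed
    return target, sha256_of(target)


if __name__ == "__main__":
    # Hypothetical file name and contents, written to a temporary directory.
    with tempfile.TemporaryDirectory() as scratch:
        working = Path(scratch) / "survey.csv"
        working.write_text("id,value\n1,2\n", encoding="utf-8")
        master, checksum = lock_master_copy(working, Path(scratch) / "master", "raw")
        print(master.name, checksum)
```

Storing the checksum in the project documentation gives a simple audit trail: if the digest of the master file ever differs, the file has changed since it was locked.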
45. Version Control
• Manual – file names
– Sequential numbered system
– Dated
• Automatic – version control software
– Mercurial
– TortoiseSVN
– Git (e.g., hosted on GitHub)
• Keep log files, supplement with documentation
(e.g., readme.txt, comments, etc.)
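For manual versioning through file names, a helper can generate the next sequential name so numbering stays consistent. The Python sketch below assumes a hypothetical `_vNN` suffix placed just before the file extension; it is not a convention defined in these slides.

```python
import re

# Assumed convention: a version suffix such as _v01 at the end of the stem.
VERSION_SUFFIX = re.compile(r"_v(\d+)$")


def next_version_name(file_name):
    """Return file_name with its _vNN suffix incremented (or _v01 added)."""
    stem, dot, extension = file_name.rpartition(".")
    if not dot:                       # no extension present
        stem, extension = file_name, ""
    match = VERSION_SUFFIX.search(stem)
    if match:
        width = len(match.group(1))   # keep the existing zero padding
        number = int(match.group(1)) + 1
        stem = stem[:match.start()] + f"_v{number:0{width}d}"
    else:
        stem += "_v01"
    return stem + dot + extension


if __name__ == "__main__":
    print(next_version_name("analysis_v02.csv"))   # analysis_v03.csv
    print(next_version_name("notes.txt"))          # notes_v01.txt
```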
46. DMP
Sections to work on:
• Format (revise)
– Are you choosing the best formats?
• Data organization (write)
– File & Folder structure
– File naming convention
– Master files/Data locks