Be aware of the research process so that you have some context for your experience. This can also help you organize your thoughts about carrying out your projects.
Goal: help you translate your research protocol into a practical plan to carry out your project/study.
Although these steps take some extra time at the beginning of your project, they will make analysis and writing much, much easier because you will be clear about what was done.
-data model: map out relationships between data, especially aggregated or calculated variables; translate research questions into analyses, then map to data to be used; can be particularly important if you are integrating data from multiple sources or have large quantitative datasets
-data organization strategy: it should be part of the planning process and answer where, when, how? (will talk more about this in the next slide)
-software: IUWare, IUanyWare, StatMath, RFS, SDA (links on handout)
-ethical & legal issues: confidentiality, privacy, HIPAA, intellectual property, and copyright issues may arise; discuss these potential problems with your advisor (links for further information on handout)
-although facts cannot be copyrighted, specific instances of them (such as a database) can be
-research project: one option is to write a structured abstract (see handout)
-dataset organization: use your plan and update it as things change (more on the next slide)
-describe your data files: what do you need to know to interpret the data? parameters, units, define coded values, define missing values
-methods:
-standards: don’t deviate from standards in your discipline or research community, unless you have a good reason for doing so; these standards reflect a common understanding and help to make data interoperable
-citation: if you use someone else’s data, you should document and cite it: source, URL/DOI, detailed title of dataset, version information, date retrieved, authors/creators, brief description
-timeframe: particularly if you’re using data from multiple sources or collecting data over a period of time, this needs to be documented clearly
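As one way to act on the "describe your data files" note, here is a minimal sketch of a small codebook kept alongside a data file. The file name, variable names, codes, and missing-value markers are all hypothetical and chosen only for illustration.

```python
# Hypothetical codebook for an illustrative survey file; every name and
# value below is an assumption, not something taken from the presentation.
codebook = {
    "file": "survey_responses.csv",
    "variables": {
        "age": {"description": "Age at enrollment", "units": "years", "missing": -99},
        "sex": {
            "description": "Self-reported sex",
            "codes": {"1": "female", "2": "male"},
            "missing": -99,
        },
        "visit_date": {"description": "Date of visit", "format": "YYYY-MM-DD"},
    },
}


def describe(codebook):
    """Return readme-style lines: one for the file, then one per variable."""
    lines = [f"File: {codebook['file']}"]
    for name, info in codebook["variables"].items():
        # Everything except the description is listed as key=value details.
        details = [f"{key}={value}" for key, value in info.items() if key != "description"]
        lines.append(f"{name}: {info['description']} ({'; '.join(details)})")
    return lines


readme_lines = describe(codebook)
print("\n".join(readme_lines))
```

Keeping the codebook as structured data means the same source can feed a readme file now and a methods section later.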
-data typing: use the appropriate field type for your data: a date field for dates; comments kept in a separate column
-document your folder structure & file naming system
  -don’t rely on the computer’s time and date metadata; it’s not reliable and can be manipulated
  -keep file names short but descriptive; use a coding system to include project name, file contents, date, etc.
-QA & data integrity: minimize opportunity to introduce human error, automate processing, check and verify periodically
-version control & authenticity: especially important if multiple people are working on the same dataset; keep copies of your data before/after each major processing step; this can save you lots of work if errors creep in, because you won’t have to start all over from the raw data; document how this is done
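The file naming advice can be sketched as a small helper. This is only one possible coding system: the pattern project_contents_YYYYMMDD_vNN.ext and the function name `build_file_name` are assumptions made for the example, not a standard from the presentation.

```python
from datetime import date


def build_file_name(project, contents, collected_on, version, extension):
    """Build a short, descriptive name: project_contents_YYYYMMDD_vNN.ext"""
    parts = [project, contents, collected_on.strftime("%Y%m%d"), f"v{version:02d}"]
    # Replace spaces with hyphens so the name behaves the same on any system.
    cleaned = [part.strip().replace(" ", "-") for part in parts]
    return "_".join(cleaned) + "." + extension.lstrip(".")


# Hypothetical project and contents labels.
name = build_file_name("sleepstudy", "actigraphy raw", date(2012, 1, 12), 3, "csv")
print(name)
```

Putting the date in the name itself, in year-month-day order, keeps files sortable and avoids depending on the computer's time and date metadata.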
-backup strategy: a quick and dirty way to verify a backup is to compare file quantity and file size, and to randomly check values in the original and the copies
-if you need to share or transfer files, use Slashtmp instead of a flash drive, especially if the data involve human subjects
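The quick backup check can be automated. The sketch below compares file quantity and file size as the notes suggest; as an assumption of this example, it replaces the random spot check of values with a full SHA-256 comparison of file contents. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def compare_backup(original_dir, backup_dir):
    """Return a list of problems found when comparing a folder to its backup."""
    original_dir, backup_dir = Path(original_dir), Path(backup_dir)
    originals = {p.relative_to(original_dir): p for p in original_dir.rglob("*") if p.is_file()}
    copies = {p.relative_to(backup_dir): p for p in backup_dir.rglob("*") if p.is_file()}

    problems = []
    # Check 1: file quantity.
    if len(originals) != len(copies):
        problems.append(f"file count differs: {len(originals)} vs {len(copies)}")
    for rel, src in sorted(originals.items()):
        dst = copies.get(rel)
        if dst is None:
            problems.append(f"missing in backup: {rel}")
        # Check 2: file size.
        elif src.stat().st_size != dst.stat().st_size:
            problems.append(f"size differs: {rel}")
        # Check 3: contents (assumption: checksum instead of a random spot check).
        elif file_digest(src) != file_digest(dst):
            problems.append(f"contents differ: {rel}")
    return problems
```

Calling `compare_backup("data", "backup/data")` with your own folder paths returns an empty list when the backup matches the original. Reading whole files into memory is fine for small datasets; very large files would call for reading in chunks.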
Good data practices for graduate students
Graduate Office Student Success Series
GOOD DATA PRACTICES FOR RESEARCH
January 12, 2012
Heather Coates, MLS, MS | Digital Scholarship & Data Management Librarian
CONTEXT: DATA LIFECYCLE
Source: DDI Structural Reform Group. "DDI Version 3.0 Conceptual Model." DDI Alliance. 2004. Accessed on 11 August 2008. <http://www.icpsr.umich.edu/DDI/committee-info/Concept-Model-WD.pdf>.
OVERVIEW
-Planning
-Describing the data
-Handling data files
-Storage & backup
PLAN AHEAD: BEYOND THE PROTOCOL
-Plan early, before data collection
-Identify ethical and legal issues
-Define the data model
-Think about a data organization strategy
-Identify the most appropriate tools: instruments & software
ETHICAL & LEGAL ISSUES
-Privacy
  -Are there people (human subjects) involved in your project? Animals?
  -Does the study involve personal or health information? Can it be used to identify an individual?
-Copyright
  -Are you using copyrighted data? Have you sought permission?
-Intellectual Property
  -You should cite any product that you use for your project: data, publications, software, etc.
DESCRIBING YOUR DATA
-Describe the research project
-Describe overall organization of your dataset
-Describe your data files
-Describe the methods used to create your data
  -Describe measurement techniques (protocols, instruments)
  -Data processing – why, how, assumptions
  -Sensor network, taxonomic information, spatial location
-Choose & use standard terminology (concepts, methods, tools)
-Identify and use relevant metadata standards
-Data citation
-Describe the timeframe
HANDLING DATA FILES
-Create, manage, and document your data storage system
-Use descriptive file names
-Define
  -Formats for date and time
  -Units of measurement
  -Parameters
  -Missing code values
  -Values that are estimated
-Use consistent codes
-Use appropriate field delimiters
-Store data values separately from data annotations or notes
-Store data at the right level of precision
-Quality assurance & data integrity
-Version control & authenticity
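Several of the points on this slide (a defined date format, defined missing codes, notes stored separately from values) can be illustrated with a short sketch. The column names, the missing codes, and the sample rows below are hypothetical; your own definitions would come from your codebook.

```python
import csv
import io
from datetime import datetime

MISSING_CODES = {"-99", "NA", ""}  # assumed missing-value codes
DATE_FORMAT = "%Y-%m-%d"           # assumed date format (year-month-day)

# Illustrative data only; notes live in their own column, not in the value column.
raw = io.StringIO(
    "subject_id,visit_date,weight_kg,notes\n"
    "001,2012-01-12,70.5,\n"
    "002,2012-01-13,-99,scale unavailable\n"
)


def clean_row(row):
    """Convert one raw row into typed values, mapping missing codes to None."""
    weight = row["weight_kg"]
    return {
        "subject_id": row["subject_id"],  # kept as text to preserve leading zeros
        "visit_date": datetime.strptime(row["visit_date"], DATE_FORMAT).date(),
        "weight_kg": None if weight in MISSING_CODES else float(weight),
        "notes": row["notes"] or None,
    }


rows = [clean_row(row) for row in csv.DictReader(raw)]
for row in rows:
    print(row)
```

Because the missing code is defined in one place, a value such as -99 is never mistaken for a real weight during analysis.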
STORAGE & BACKUP
-Back up your data: regular intervals, 3 copies
  -Local
  -Semi-local
  -Remote
-Document your backup strategy
-Make sure backup locations are secure and accessible
-Use standard file formats
  -Non-proprietary, open format
  -Commonly used in your community
  -Unencrypted*
  -Uncompressed*
PROCESSING & ANALYSIS
-Defining your research questions and documenting your data are iterative processes
  -Inform each other
  -Are never done, until the project is complete
-Developing good documentation will make analysis easier and more efficient
-Having good documentation will make writing your paper/thesis/dissertation much easier
  -Use your readme or codebook files as source documents for your methods sections
-Having good documentation will identify problems sooner, when it may be possible to resolve them or minimize the damage to your data
RESOURCES
-@IU
  -IUWare
  -IUanyWARE
  -StatMath
  -ITTraining
  -RFS & SDA
-Open access/public use data sets
  -DataCite
  -ICPSR
  -Data.gov
-Subject liaison librarians can assist in locating data on your topic
THANK YOU
Find us at http://ulib.iupui.edu/digitalscholarship
Heather Coates, MLS, MS
Digital Scholarship & Data Management Librarian
hcoates@iupui.edu
317-278-7125