Meeting the NSF DMP Requirement: March 7, 2012

March 7, 2012 version of the IUPUI workshop "Meeting the NSF Data Management Plan Requirement: What You Need to Know." This workshop is co-sponsored by the Office of the Vice Chancellor for Research and the University Library.

Published in: Education, Technology

  • Housekeeping: hold questions until the end; make sure everyone has handouts. Resources: slides, DSP Guide to NSF DMP, NSF policy language handout, CIC Author Addendum.
  • We’re going to spend the majority of our time today walking through each section of the DMP, but there are some basic things you need to know first.
  • The main reason people are talking about data management and curation right now is the funding agency requirements. These came about within the context of broader discussions about scholarly communication in the sciences, so we’ll quickly review that discussion before getting into the practical steps of developing a DMP. (1) We want to prepare you to engage in discussions about scholarly communication, specifically open access and data management and sharing. (2) We will provide information so that you can make informed decisions regarding copyright, IP, patent, and other issues when choosing how your research is disseminated, who has rights to it, and who can access it. There are many other compelling reasons to plan for curating and/or sharing your data. The good news is that data sharing can boost the scholarly impact of your data and research in general, which is always good for promotion and tenure. Exposure: increased citation for scholarly works; people are working out ways to cite datasets as well. Collaborations: funders are increasingly looking for interdisciplinary and multi-institution collaborations. However, the benefits of digital data come with costs. Unlike with physical specimens or paper-based data, we can’t assume that we’ll be able to access and use digital data in 5, 10, or 50 years. We need to plan, manage, and preserve valuable digital data so that the scientific record isn’t lost. Essentially, if we can’t find it, it didn’t happen. These issues of persistent access and long-term preservation are challenges that libraries have been solving for a very long time.
  • Some people wonder why the library is taking on the challenge of helping researchers manage and preserve their data. There are several good reasons. Every college or university has a library. Our place within IUPUI facilitates collaboration: we have existing relationships with each department, and these collaborations are another way to build capacity for data curation using resources that are already available. Libraries and librarians have been caring for information in many formats for thousands of years; while the formats change more rapidly these days, our core principles remain the same. Other campus units can help you with your research, but have a different focus, such as compliance with human subjects or animal use guidelines, contracts and grants, bioethics, etc. I’ll provide information about these resources at the end.
  • The Data Services Program is part of the University Library’s Program of Digital Scholarship. The Data Services Program offers workshops and consultations for developing an NSF data management plan as well as for data management and curation in general. In addition, we have established a data repository for IUPUI research. The repository is one of many tools available to you for preserving and sharing your research data. On our website, we’ve provided links to: sample NSF DMPs from other institutions; various tools; guidance from institutions like the ICPSR and the Digital Curation Centre (UK); and significant publications discussing data management and curation.
  • I want to clarify some terms so we’re all on the same page. Data management is largely seen as the purview of scientists and biostatisticians since it varies by research community and discipline. Data sharing is not an all-or-none proposition. It encompasses a wide spectrum of activities, ranging from open data publishing on the internet without restriction to controlled access by pre-defined partners or collaborators. Data citation is a concept similar to citation of scholarly publications. It refers to mechanisms that allow easy reuse and verification of data, allow the impact of data to be tracked, and create a scholarly structure that recognizes and rewards data producers (DataCite).
  • As I said earlier, these policies came about as a result of broader conversations about scholarly communication. In case you aren’t familiar with the term, it refers to the processes by which we produce and disseminate information relating to teaching, research, and other scholarly activities. Our goal is to provide you with the information necessary to engage in these discussions within IU and your research communities so that you can make informed decisions about how your research gets out there, who retains rights, and who can access it. The NSF policy is not a radical change, and it is not likely to go away. These policies illustrate slow progress towards increasing public access to research results and products and greater awareness of data management challenges. See http://www.nsf.gov/bfa/dias/policy/dmpfaqs.jsp
  • IU as an institution has been engaged in discussions about scholarly communication for several years and has voiced its commitment to these issues by including data management and curation in the IT Strategic Plan.
  • The open access conversation is focused on the dissemination of research products like peer-reviewed articles and books at the end of the research life cycle, whereas data management planning is most effective when it’s initiated before data collection begins and implemented throughout the research life cycle.
  • It is important to know that the language was crafted to allow the research community to shape the implementation and to give communities of practice a role in developing relevant best practices. The budget allocations and narrative should tell a cohesive story; if you identify big challenges in data storage and preservation but do not allocate funds to address them, it will likely raise a red flag for the review committee.
  • Ultimately, this document should demonstrate that you are aware of data management and preservation issues in general and of the relevant practices within your research community or discipline in particular, and that you have thought through how these affect your proposed project. As you develop each section of your DMP, it’s important to do two things. First, explain your reasoning; it could just be that something is standard practice in your field or community. Second, identify roles for data management and curation activities: think about who on your team or in another campus unit will carry out the major elements of your plan. This may include the PI, staff, students, external contractors, institutional IT, the library, and external data repositories. The plans proposed should be feasible, for you and for us.
  • If you look at the guide I’ve provided, you’ll see these topics are broken down into a variety of specific questions to address. We’ll go through each section in more detail. It may be helpful to begin your DMP with a few sentences describing the research project in general, to provide context for the detailed information in each section.
  • In this first section, you want to describe two things: the data you will generate or use and the documentation you will create to facilitate data management and curation.
  • In addition to describing your plan in the DMP, data collection and processing activities should be described in working documents throughout the life of the project. Research methods, even within a single lab, change over time. Creating data documentation is easiest and most efficient at the beginning of a project. Good documentation ensures three things: a shared understanding of the data throughout a project; that future researchers will be able to understand data within the context in which they were created; and that re-users of data are able to interpret the data appropriately. You don’t need to spend a lot of time or space describing the planned documentation, but it is worthwhile to mention what format it will take and who will be responsible for creating and maintaining it. This documentation is often deposited along with the data for preservation and sharing. Good documentation can facilitate efficient data collection and processing and preserve data integrity.
  • Data screening tests: histograms, boxplots, Z-scores, etc.
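As one illustration of a data screening test, the sketch below flags readings whose Z-score exceeds a cutoff. The function name, the cutoff of 3.0, and the sample readings are all assumptions made for this example; appropriate screening rules vary by discipline and dataset.

```python
from statistics import mean, stdev


def zscore_flags(values, threshold=3.0):
    """Return (index, value, z) for points whose |z| exceeds the threshold.

    The threshold of 3.0 is a common rule of thumb, not a universal standard.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation; needs at least 2 values
    if sigma == 0:
        return []  # all values identical, nothing stands out
    flagged = []
    for i, v in enumerate(values):
        z = (v - mu) / sigma
        if abs(z) > threshold:
            flagged.append((i, v, z))
    return flagged


# Hypothetical readings: 19 plausible values plus one decimal-point slip.
readings = [390.0 + 0.1 * (i % 5) for i in range(19)] + [3901.0]

for index, value, z in zscore_flags(readings):
    print(f"row {index}: value {value} has z-score {z:.2f}; check the source record")
```

With very small samples a single extreme value cannot reach a Z-score of 3, so a screen like this is best paired with a visual check such as a histogram or boxplot.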
  • Who created the data? What is the content of the data? When were the data created? Where is it geographically? How were the data developed? Why were the data developed?
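A minimal sketch of how these six questions could be captured as a small metadata record stored alongside a dataset. The field names, the `build_metadata` helper, and the sample values are hypothetical; they are not drawn from any particular metadata standard, and a repository or discipline standard should take precedence where one applies.

```python
import json
from datetime import date

# The six questions, used here as illustrative field names (an assumption).
REQUIRED_FIELDS = ("who", "what", "when", "where", "how", "why")


def build_metadata(**fields):
    """Return a metadata dict, refusing records that leave a question blank."""
    missing = [f for f in REQUIRED_FIELDS if not str(fields.get(f, "")).strip()]
    if missing:
        raise ValueError(f"missing metadata fields: {', '.join(missing)}")
    return {name: fields[name] for name in REQUIRED_FIELDS}


# Hypothetical example values.
record = build_metadata(
    who="Example Lab (hypothetical)",
    what="Hourly air temperature readings, degrees Celsius",
    when=date(2012, 3, 7).isoformat(),
    where="Indianapolis, IN, USA",
    how="Automated sensor logging, one reading per hour",
    why="Illustrative record for a data management plan",
)

# A JSON file like this can sit next to the data file it describes.
print(json.dumps(record, indent=2))
```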
  • Ask yourself: are your data self-explanatory? Consider it from the perspective of a typical reader of a journal you publish in or a colleague who might be interested in collaborating. The answer to this question is no; the solution is good documentation and metadata. Increasingly, the people analyzing the data are not those who collected it. Metadata and good data documentation facilitate a stronger understanding of the data, including its quality and appropriate use. There are a lot of standards out there; the best approach to determining which to use is to see what others in your discipline or research community are doing. Another option, if you know you will be depositing your data in a particular repository, is to ask the repository what its requirements or recommendations are. Researchers on interdisciplinary and longitudinal studies should think carefully about how their data will be used across multiple disciplines and about the potential for re-use. You may want to favor standards that are well-supported and established over specialized standards that may complicate re-use and analysis in the future.
  • Let’s take a look at the handout with the NSF policy language. Again, the language is broad and allows for practices to vary by research community. As you can see from the policy, data dissemination and sharing do not refer to publishing in scholarly journals. Also, the requirements vary by Directorate, so be sure to check whether your Directorate has different expectations. In this section, you should define what you will share, how, and the procedures for access. If you plan to use a specific data repository, its staff can help you develop this section; likely, they will have standard processes in place. Acceptable practices for data sharing vary by discipline; some have very mature data repositories while others rely on informal channels. Best practices for persistent access call for more permanent and secure mechanisms than a faculty or department website. The solution at IUPUI is our data repository (IUPUIDataWorks). In terms of the access procedures, you want to think about what mechanism will be used for requests, whether registration and authentication are necessary, and what information you want to keep for your own records about those who request and receive your data. This can be useful information to demonstrate the value and impact of your research. Data sharing encompasses a wide spectrum of activities. Even if you are part of a community in which data sharing is not common practice, I urge you to think about what data might be shared or re-used without compromising your intellectual property or competitiveness. You may have older data on which you’ve completed analysis. These data might be useful to students or beginning researchers in your field or here at IUPUI. We can help you figure out how to share your data securely.
  • This section relates back to the access and sharing information, but should focus on policies and permissions for re-use, re-distribution, and production of derivative works as opposed to the mechanisms described in the previous section. It’s possible to protect your ability to use the data for ongoing analysis while sharing as much of it as possible with your research community and the general public. While you can’t plan for every case, it is useful to imagine who might be interested in the data and how it might be used, and to set up a process for handling those cases. Depending on where you decide to deposit your data, this could be very formalized or relatively informal. If you decide to share your data through a repository, there are often built-in mechanisms for applying Creative Commons licenses. This is true for our data repository as well.
  • Here, you should build on the information you’ve outlined in previous sections to describe your long-term preservation strategy. A key component of your plan is the description of the cyberinfrastructure available to you and how you will use it to carry out your plan as a responsible data steward. Although your lab may be equipped to store and maintain the data for a project while it’s active, you may not have the capacity to make sure the data is preserved once the project is complete and your lab resources are dedicated to new endeavors. Neither IU nor NSF wants to see scientific data lost, and both are investing significant effort and resources in maintaining the scientific record. This is an opportunity for you to discuss the long-term plan for keeping your data safe with us or with an external data repository in your discipline. If you are completely unsure how to approach this, feel free to contact the DSP for support. We can help you develop a feasible and appropriate preservation strategy that relies on existing services and infrastructure, whether at IU or elsewhere. These are activities that the Library specifically is invested in and equipped to do; our focus is on long-term preservation, curation, and access. What this means will likely vary by dataset, project, and lab; we’re happy to think this through with you to develop a plan that meets your needs.
  • There is a wealth of resources at IU to help you with your research. These are just a few of those relevant to data management and curation.

    1. 1. DATA MANAGEMENT PLANS & PLANNING: MEETING THE NSF REQUIREMENT (March 7, 2012)
    2. 2. WHO ARE WE? Heather Coates, Digital Scholarship & Data Management Librarian, University Library; Kristi Palmer, Digital Libraries Team Leader, University Library
    3. 3. LEARNING OBJECTIVES After attending this workshop: You will understand the NSF data policies. You will be aware of the relevant data-related services at IUPUI. You will have resources to develop a data management plan (DMP) for your NSF proposal(s). You will be able to write a comprehensive DMP for your NSF proposal(s). You will send your DMP draft to the Data Services Program for review and assistance as needed.
    4. 4. OVERVIEW Context for the NSF data policies Meeting the NSF DMP requirement  The requirement: 5 elements  Developing a Data Management Plan  Implementing your plan Workshop Evaluation
    5. 5. CONTEXT: SCHOLARLY COMMUNICATIONS Funding agency requirements Scholarly Impact  Exposure: increased citation  More equal access (especially for students)  Facilitates reproducibility  Facilitates new discoveries via secondary analysis/data re-use  Fosters productive collaborations  Leads to new computational techniques Planning for the future  If we can’t find it, it doesn’t exist  Persistent access  Long-term preservation
    6. 6. CONTEXT: WHY THE LIBRARY? preservation, curation, access Trusted member of the institution Organizational structure lends itself to collaboration with researchers Interdisciplinary by nature Existing infrastructure for digital information Existing expertise in preserving and providing access to information  Program of Digital Scholarship  Archives
    7. 7. CONTEXT: DATA SERVICES PROGRAM Part of the Program of Digital Scholarship Mission  Identifying data issues and connecting you to the solutions Services  Workshops  Individual consultations  Data repository Resources  Guide to NSF Data Management Plan Requirement  Website
    8. 8. CONTEXT: TERMINOLOGY Cyberinfrastructure: computing resources & networks, services, & people (see Empowering People, 2009 for more) Data management: technical processing and preparation of data for analysis Data curation: selection of data for preservation and adding value for current and future use Data citation: mechanisms to enable easy reuse and verification, track impact of data, and create structures to recognize and reward researchers (DataCite) Data sharing: must take into account ethical and legal issues; a spectrum with many options
    9. 9. CONTEXT: FEDERAL POLICIES Issues in scholarly communication  Open access  Open data & data citation  Data management & curation Federal policies (incremental steps towards openness)  National Research Council, 1985  Office of Management & Budget, 1999: Circular A-110  NIH Data Sharing Policy, 2003  NIH Public Access Policy, 2008  NSF DMP Requirement, 2011  Other policies: Wellcome Trust, Howard Hughes Medical Institute, NOAA, NEH
    10. 10. CONTEXT: IU STRATEGIC PLAN IU Empowering People Strategic Plan for IT (2009) Action 33: “IU should provision a data utility service for research data that affords abundant near- and long-term storage, ease of use, and preservation capabilities. This data utility will need to offer a range of services for securing data, providing authorized access within and beyond IU; ensuring metadata description, annotation, and provenance; and providing backup/recovery services.”
    11. 11. CONTEXT: OPEN ACCESS What is Open Access?  Freely available, online, and free of most copyright restrictions Why should you care?  Right thing to do?  Increase your citations  “We analysed 119,924 conference articles in computer science and related disciplines. The mean number of citations to offline articles is 2.74, and the mean number of citations to online articles is 7.03, an increase of 157%.” (Lawrence, 2001)  Publisher functions need not reside in for-profit hands  “Between 1975 and 2005 the average cost of journals in chemistry and physics rose from $76.84 to $1,879.56. In the same period, the cost of a gallon of unleaded regular gasoline rose from 55 cents to $1.82. If the gallon of gas had increased in price at the same rate as chemistry and physics journals over this period it would have reached $12.43 in 2005, and would be over $14.50 today.” (Lewis, 2008)
    12. 12. CONTEXT: OPEN ACCESS @ IUPUI IUPUI University Library Program of Digital Scholarship http://www.ulib.iupui.edu/digitalscholarship  Open Journals  IUPUIScholarWorks-Faculty Scholarship  Electronic Theses and Dissertations  Cultural Heritage Collections  Data  eArchives
    13. 13. CONTEXT: RESEARCH LIFE CYCLE Source: DDI Structural Reform Group. “DDI Version 3.0 Conceptual Model.” DDI Alliance. 2004. Accessed on 11 August 2008. <http://www.icpsr.umich.edu/DDI/committee-info/Concept-Model-WD.pdf>.
    14. 14. CONTEXT: BENEFITS OF PLANNING Saves time  Less reorganization down the road Increases efficiency  Gathers necessary information for analysis and writing  Prevents problems in understanding data and metadata Makes it easier to preserve your data Requirements from some funding agencies and institutions
    15. 15. DMP: THE REQUIREMENT Why?  Increased impact of research money  Reduce redundant data collection  Enhance use and value of existing data  Further scientific research Language is broad to allow input from research communities Implementation costs of the DMP CAN be included in direct costs
    16. 16. DMP: PRACTICAL TIPS The gist of it…  Describe what you will do with your data during and after the proposed project  Ensures data is safe now and in the future DMP should reflect…  Awareness of data management and curation in your discipline  Feasible plan to utilize available cyberinfrastructure Try to…  Explain the rationale for your choices  Identify roles for data management and curation activities
    17. 17. DMP: ELEMENTS Types of data Standards and metadata Access and sharing Re-use, re-distribution, and the production of derivatives Long-term preservation [Budget]
    18. 18. DMP: TYPES OF DATA [1] Use standards common in your research community Characterize the data to be generated or used  Types of data?  experimental, observational, raw or derived, models, simulations, curriculum materials, software, images, audio, video, etc.  What file formats will be used?  Text, spreadsheet, database, etc.  How will it be collected? (describe the process)  How much data?  Will the data be reproducible? How does the project relate to existing data?  If dataset will be combined, how to ensure interoperability?
    19. 19. DMP: TYPES OF DATA [2] How will data be collected?  How? (tools, instruments, measurements, etc.)  When? (timeframe, series)  Where? How will data be processed?  Workflows  Software packages How will the data be stored and managed?  File naming conventions  Version control
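A file naming convention is easier to follow when it is generated and checked by a small script rather than typed by hand. The sketch below assumes a hypothetical pattern of project, data type, collection date, and two-digit version (`project_datatype_YYYYMMDD_vNN.ext`, lowercase, no spaces); the pattern and function names are illustrative only and should be replaced by whatever your lab or research community has agreed on.

```python
import re
from datetime import date

# Hypothetical convention: project_datatype_YYYYMMDD_vNN.ext
NAME_PATTERN = re.compile(r"^[a-z0-9]+_[a-z0-9]+_\d{8}_v\d{2}\.[a-z0-9]+$")


def is_valid_name(filename):
    """Check an existing file name against the convention."""
    return bool(NAME_PATTERN.match(filename))


def make_filename(project, datatype, collected, version, ext):
    """Build a file name that follows the convention, or raise ValueError."""

    def slug(text):
        # Lowercase and drop spaces and punctuation.
        return re.sub(r"[^a-z0-9]", "", text.lower())

    extension = ext.lower().lstrip(".")
    name = f"{slug(project)}_{slug(datatype)}_{collected:%Y%m%d}_v{version:02d}.{extension}"
    if not is_valid_name(name):
        raise ValueError(f"cannot build a valid name from the given parts: {name!r}")
    return name


print(make_filename("Mauna Loa", "CO2 raw", date(2012, 3, 7), 1, ".csv"))
print(is_valid_name("final data (2).xlsx"))
```

Embedding the version in the name is a lightweight form of version control; a dedicated version control system may be a better fit for code and frequently edited files.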
    20. 20. DMP: TYPES OF DATA [3] What QA & QC measures will be used?  Identify steps during processing and analysis to eliminate bad data points  Examples: double data entry, data screening tests What is the backup and security plan?  Plan for particular security or confidentiality issues  Location & frequency Roles & responsibilities  Who will carry out data collection, processing, and backup activities?
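Double data entry, mentioned above as a QA/QC example, means keying the same records twice and reconciling the differences. A minimal sketch of the comparison step follows, assuming both passes are well-formed CSV with a shared `id` column; the column names and sample values are made up for illustration.

```python
import csv
import io


def _clean(value):
    """Normalize a cell for comparison; missing cells count as empty."""
    return (value or "").strip()


def compare_entries(first_csv, second_csv, key="id"):
    """Return (record_id, field, first_value, second_value) for each mismatch."""
    first = {row[key]: row for row in csv.DictReader(io.StringIO(first_csv))}
    second = {row[key]: row for row in csv.DictReader(io.StringIO(second_csv))}
    mismatches = []
    for record_id in sorted(first.keys() | second.keys()):
        a, b = first.get(record_id), second.get(record_id)
        if a is None or b is None:
            # The record was keyed in one pass but not the other.
            mismatches.append((
                record_id,
                "<record>",
                "present" if a is not None else "missing",
                "present" if b is not None else "missing",
            ))
            continue
        for field in a:
            if _clean(a[field]) != _clean(b.get(field)):
                mismatches.append((record_id, field, a[field], b.get(field)))
    return mismatches


# Hypothetical first and second keying passes of the same paper forms.
entry_a = "id,temp_c,site\n1,21.5,A\n2,19.8,B\n3,20.1,C\n"
entry_b = "id,temp_c,site\n1,21.5,A\n2,18.9,B\n3,20.1,C\n"

for record_id, field, first_value, second_value in compare_entries(entry_a, entry_b):
    print(f"record {record_id}, field {field}: {first_value!r} vs {second_value!r}")
```

Each mismatch still has to be resolved against the original source document; the script only tells you where to look.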
    21. 21. EXAMPLE: TYPES OF DATA Atmospheric Concentrations of CO2, Mauna Loa Observatory, Hawaii, 2011-2013: https://www.dataone.org/sites/all/documents/DMP_MaunaLoa_Formatted.pdf Arthropod responses to grassland nutrient limitation: https://www.dataone.org/sites/all/documents/DMP_NutNet_Formatted.pdf
    22. 22. DMP: STANDARDS & METADATA [1] Metadata describes the who, what, when, where, how, why of the data Purpose of metadata is to enable finding, organization, interoperability, identification, archiving & preservation Standards are commonly agreed upon terms and definitions in a structured format
    23. 23. DMP: STANDARDS & METADATA [2] Will your datasets be self-explanatory or understandable in isolation? Decisions to make about metadata  Relevant standard(s)  Format  Content  What information is needed to use and interpret in 5 years, 25 years?  Ask your fellow researchers and check with data centers or repositories How are metadata created?  Automatically generated  Manually created
    24. 24. EXAMPLE: STANDARDS & METADATA [1] Atmospheric Concentrations of CO2, Mauna Loa Observatory, Hawaii, 2011-2013: https://www.dataone.org/sites/all/documents/DMP_MaunaLoa_Formatted.pdf Metadata will be comprised of two formats: contextual information about the data in a text based document and ISO 19115 standard metadata in an xml file. These two formats for metadata were chosen to provide a full explanation of the data (text format) and to ensure compatibility with international standards (xml format). The standard XML file will be more complete; the document file will be a human-readable summary of the XML file.
    25. 25. EXAMPLE: STANDARDS & METADATA [2] Rio Grande Hydrologic Geodatabase Compendium: https://www.dataone.org/sites/all/documents/DMP_Hydrologic_Formatted.pdf Microsoft Access Database format will be used since it is readily accessible and it is compatible with ESRI ArcGIS (http://www.esri.com/software/arcgis/index.html), a Geographic Information System software package used by the stakeholders. Naming conventions will be consistent – no spaces will be used in table names or field names. The file naming convention will consist of the datasource_datatype format for raw data files. Data reporting functionality will be built into the VBA processing programs to provide output in .txt file format for number of records per source when updatable data sources are refreshed. Every effort will be made to go back to the authoritative source for an identified dataset. Quality control of the database will be performed using SQL statements that capitalize on the database structure to ensure relational database integrity. Appropriate primary keys will be assigned to manage possible data duplicates. Potential duplicate site IDs will be handled through automated procedures and the creation of alternate ID tables. A data dictionary will be created that defines the table definition, table fields, and table field data types. An entity-relationship diagram will be created that defines the relational structure of the database. A metadata record will be produced using the FGDC standard that describes the entire geodatabase. The FGDC standard was chosen due to required Federal government standards.
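The Rio Grande example describes using SQL statements to find duplicate site IDs and producing a data dictionary of table fields and data types. The sketch below illustrates both ideas with Python's built-in `sqlite3` module rather than Microsoft Access, which is a substitution made here only so the example is self-contained. The table name, columns, and rows are hypothetical and are not taken from the cited plan.

```python
import sqlite3

# Hypothetical raw records; RG002 appears twice, from two different sources.
RAW_ROWS = [
    ("RG001", "usgs", 12.5),
    ("RG002", "usgs", 40.0),
    ("RG002", "state_engineer", 40.1),
    ("RG003", "usgs", 77.3),
]


def load_raw(rows):
    """Load rows into an in-memory table (no spaces in table or field names)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE site_raw (site_id TEXT NOT NULL, source TEXT NOT NULL, river_mile REAL)"
    )
    conn.executemany("INSERT INTO site_raw VALUES (?, ?, ?)", rows)
    return conn


def duplicate_site_ids(conn):
    """Return (site_id, count) for every site ID that occurs more than once."""
    query = """
        SELECT site_id, COUNT(*) AS n
        FROM site_raw
        GROUP BY site_id
        HAVING COUNT(*) > 1
        ORDER BY site_id
    """
    return conn.execute(query).fetchall()


def data_dictionary(conn, table):
    """Return (field_name, declared_type) pairs for a table.

    PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk.
    The table name is interpolated directly, so pass trusted names only.
    """
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(row[1], row[2]) for row in rows]


conn = load_raw(RAW_ROWS)
print("Duplicate site IDs:", duplicate_site_ids(conn))
print("Data dictionary:", data_dictionary(conn, "site_raw"))
```

Once duplicates are resolved, declaring `site_id` as a primary key in the cleaned table lets the database itself reject any later duplicates.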
    26. 26. DMP: ACCESS & SHARING What are your obligations for sharing?  Funding agency, institution, other organization, legal, etc. What are the ethical or legal issues? (e.g., privacy, confidentiality, security, intellectual property, or other rights) How will the data be made available? When will the data be made available?  When will the data become available?  For how long will the data be available? What is the process for gaining access? Who will have access to the data?
    27. 27. DMP: RE-USE, RE-DISTRIBUTION, ETC. What rights will you retain before data is made available? Will permission restrictions be necessary?  Limits or conditions for political, commercial, or patent reasons? Is there an embargo period? Why? Future users and uses  Who might be interested in the data?  How might you anticipate this data being used?  What value might the data have for these people?
    28. 28. EXAMPLE: ACCESS, SHARING, RE-USE Development of a NanoKlein Calorimeter: http://libguides.unm.edu/content.php?pid=137795&sid=1422879 We expect to apply for a patent for this instrument. All of the materials submitted as part of the patent process will be a matter of public record. We will also make technical drawings, test data and calibration data available through our institutional repository. Cave Microbiology: http://libguides.unm.edu/content.php?pid=137795&sid=1422879
    29. 29. DMP: LONG-TERM PRESERVATION What data will be preserved?  What transformations are necessary to prepare the data? How long do you think the data will be useful? How long will the data be preserved? Contextual information needed to make the data reusable  metadata, references, reports, manuscripts, grant proposal, etc. Where will it be preserved?  Links to published materials and other outcomes? Use of persistent citation?  Procedures for preservation and back-up? Who will be the contact for the dataset?
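One common building block for preservation and back-up procedures is a checksum manifest: record a hash for every file when the data are deposited, then recompute later to confirm nothing has changed or gone missing. The sketch below uses SHA-256 from Python's standard library. The function names and the sample file are assumptions for illustration; a repository such as the ones named in this workshop may already perform such checks on your behalf.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large data files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its SHA-256 checksum."""
    folder = Path(folder)
    return {
        path.relative_to(folder).as_posix(): sha256_of(path)
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def verify_manifest(folder, manifest):
    """Return the names of files that are missing or whose checksum changed."""
    current = build_manifest(folder)
    return sorted(name for name in manifest if current.get(name) != manifest[name])


# Demonstration with a temporary folder and a made-up data file.
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "abundance.csv"
    data_file.write_text("site,count\nA,12\nB,7\n", encoding="utf-8")
    manifest = build_manifest(tmp)
    unchanged = verify_manifest(tmp, manifest)

    # Simulate silent corruption or an accidental edit.
    data_file.write_text("site,count\nA,12\nB,70\n", encoding="utf-8")
    changed = verify_manifest(tmp, manifest)

print("Problems right after deposit:", unchanged)
print("Problems after the file was altered:", changed)
```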
    30. 30. EXAMPLE: LONG-TERM PRESERVATION [1] Arthropod responses to grassland nutrient limitation: https://www.dataone.org/sites/all/documents/DMP_NutNet_Formatted.pdf We will preserve both arthropod datasets generated during this project (abundance and stoichiometry) for the long term in the Digital Conservancy at the U of M. We will include the .csv files, along with the associated metadata files. We will also submit an abstract with the datasets that describe their original context and any potentially relevant project information. Borer will be responsible for preparing data for long-term preservation and for updating contact information for investigators.
    31. 31. EXAMPLE: LONG-TERM PRESERVATION [2] Improving the long-term preservability of HDF-formatted data by creating maps to file contents: https://www.dataone.org/sites/all/documents/DMP_HDFMap_Formatted.pdf The writer software will be preserved by the HDF Group for the life of the HDF libraries. The HDF Group uses industry-standard best practices to ensure the integrity of their software and systems. Once the map writer has been used to generate maps for every HDF file in existence, the continued existence of the writer software is not required. The reader software will be preserved at SourceForge.org for as long as there is community interest. The collection of HDF files will be preserved at NSIDC as long as utility is deemed high.
    32. 32. IMPLEMENTING YOUR PLAN [1] The DMP is a working document NSF expects progress to be reported Incorporate implementation into the project startup process  C&G, IRB, IACUC all have to be in place before data collection can begin  Review, revise, and set up your system during startup Good documentation ensures…  A shared understanding of the data throughout a project  That future researchers will be able to understand data within the relevant context  That re-users of data are able to interpret the data appropriately Resources for backing up data during a project  Research File System: http://pti.iu.edu/storage/rfs  Scholarly Data Archive: http://pti.iu.edu/storage/sda
    33. 33. IMPLEMENTING YOUR PLAN [2]
        Program of Digital Scholarship: http://ulib.iupui.edu/digitalscholarship
        Center for Research & Learning: http://crl.iupui.edu/
        OVCR: http://research.iupui.edu/development/
        Office of Academic Affairs: http://www.academicaffairs.iupui.edu
        Intellectual Property Policy: https://www.indiana.edu/~vpfaa/academicguide/index.php/Policy_I-11
        Research File System: http://pti.iu.edu/storage/rfs
        Scholarly Data Archive: http://pti.iu.edu/storage/sda
        Research Technologies, UITS: http://uits.iu.edu/page/avel
        Core Services, UITS: http://pti.iu.edu/cs
        Scholarly Cyberinfrastructure, UITS: http://uits.iu.edu/page/amee
        CTSI Tools: http://www.indianactsi.org/rct (Alfresco Share, REDCap)
        IUWare: https://iuware.iu.edu
        IUanyWare: https://iuanyware.iu.edu/vpn/index.html
        StatMath: http://www.indiana.edu/~statmath/
        Statistics Consulting Center: http://www.math.iupui.edu/asci/
    34. 34. RESOURCES [1]
        Data Services Program site: http://ulib.iupui.edu/digitalscholarship/dataservices.html
        National Science Board, Digital Research Data Sharing & Management, 2012 (pre-publication): http://www.nsf.gov/nsb/publications/2011/nsb1124.pdf
        National Institutes of Health, Data Sharing Policy: http://grants.nih.gov/grants/policy/data_sharing/data_sharing_guidance.htm
        NIH Public Access Policy Implications: http://publicaccess.nih.gov/public_access_policy_implications_2012.pdf
        IU New Employee Compliance Orientation (NECO): http://researchadmin.iu.edu/EO/eo_sessions.html
    35. 35. RESOURCES [2]
        UK Data Archive: Managing & Sharing Data Brochure: http://www.data-archive.ac.uk/media/2894/managingsharing.pdf
        UK Data Archive Costing Tool: http://www.data-archive.ac.uk/media/257647/ukda_jiscdmcosting.pdf
        Creative Commons Licenses & Data: http://wiki.creativecommons.org/Data
        Licensing Research Data, Digital Curation Centre: http://www.dcc.ac.uk/resources/how-guides/license-research-data
        CIC Author Addendum: http://www.cic.net/authors
        DMPTool: https://dmp.cdlib.org/
        DMPOnline: https://dmponline.dcc.ac.uk/
    36. 36. COMPELLING CASES FOR OPEN DATA
        Tim Berners-Lee: http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html
        Open-source cancer research: http://www.ted.com/talks/jay_bradner_open_source_cancer_research.html
        Polymath problem blogs: http://polymathprojects.org/about/
        http://stevekochscience.blogspot.com/2011/02/open-data-success-story.html
        http://eaves.ca/2011/09/07/the-economics-of-open-data-mini-case-transit-data-translink/
    37. 37. REFERENCES
        1. Higgins, S. (n.d.). What are metadata standards. http://www.dcc.ac.uk/resources/briefing-papers/standards-watch-papers/what-are-metadata-standards
        2. Digital Curation Centre. (n.d.). DCC Charter and Statement of Principles. Retrieved from http://www.dcc.ac.uk/about-us/dcc-charter
        3. Indiana University. (2011). Indiana University’s Advanced Cyberinfrastructure. Retrieved from http://pti.iu.edu/cyberinfrastructure.pdf
        4. Indiana University. (2009). Empowering People: Indiana University’s Strategic Plan for Information Technology. Retrieved from http://ovpit.iu.edu/strategic2/
        5. National Science Foundation. (2011). Award and Administration Guide: Chapter IV C.4., Dissemination and Sharing of Research Results. Retrieved from http://www.nsf.gov/pubs/policydocs/pappguide/nsf11001/aag_6.jsp#VID4
        6. Lawrence, S. Free online availability substantially increases a paper’s impact. Nature, 31 May 2001. http://www.nature.com/nature/debates/e-access/Articles/lawrence.html (accessed November 5, 2008)
        7. Lewis, David W. "Library budgets, open access, and the future of scholarly communication: Transformations in academic publishing." C&RL News, May 2008, Vol. 69, No. 5. Available at: http://www.ala.org/ala/mgrps/divs/acrl/publications/crlnews/2008/may/ALA_print_layout_1_471139_471139.cfm
    38. 38. THANK YOU Tell us what you think: take a brief survey. Find us @ http://ulib.iupui.edu/digitalscholarship Heather Coates, hcoates@iupui.edu, 317-278-7125; Kristi Palmer, klpalmer@iupui.edu, 317-274-8230
    39. 39. EXTRA: NIH DATA SHARING POLICY $500,000 or more in direct costs in any year of the proposed research Final research data, not summary statistics or tables, not underlying pathology reports and other clinical source documents, might include both raw data and derived variables If an application describes a data-sharing plan, NIH expects that plan to be enacted. NIH expects the timely release and sharing of data to be no later than the acceptance for publication of the main findings from the final dataset. It is the responsibility of the investigators, their Institutional Review Board (IRB), and their institution to protect the rights of subjects and the confidentiality of the data. Prior to sharing, data should be redacted to strip all identifiers, and effective strategies should be adopted to minimize risks of unauthorized disclosure of personal identifiers.
    40. 40. EXTRA: NIH DATA SHARING PLAN describe briefly the expected schedule for data sharing the format of the final dataset the documentation to be provided whether or not any analytic tools also will be provided whether or not a data-sharing agreement will be required  if so, a brief description of such an agreement (including the criteria for deciding who can receive the data and whether or not any conditions will be placed on their use) mode of data sharing (e.g., under their own auspices by mailing a disk or posting data on their institutional or personal website, through a data archive or enclave) Applicants may request funds in their application for data sharing.
