Lesson 1 in a set of 10 created by DataONE on Best Practices for Data Management. The full module can be downloaded from the DataONE.org website at: http://www.dataone.org/education-modules. Released under a CC0 license, attribution and citation requested.
1. Why Data Management
Lesson 1: Introduction to Data Management
Why Data Management?
CC image by University of Maryland Press Releases on Flickr
2. Why Data Management
• The data world around us
• Importance of data management
• The data lifecycle
• The case for data management
CC image by interpunct on Flickr
3. Why Data Management
After completing this lesson, the participant will be able to:
• Give two general examples of why increasing amounts of data are a concern
• Explain, using two examples, how lack of data management makes an impact
• Define the research data lifecycle
• Give one example of how well-managed data can result in new scientific conclusions
6. Why Data Management
Data are collected from sensors, sensor networks, remote sensing, observations, and more - this calls for increased attention to data management and stewardship
Photo courtesy of www.carboafrica.net
Photo courtesy of http://modis.gsfc.nasa.gov/
Photo courtesy of http://www.futurlec.com
CC image by tajai on Flickr
CC image by CIMMYT on Flickr
Image collected by Viv Hutchinson
7. Source: John Gantz, IDC Corporation: The Expanding Digital Universe
[Chart: Information versus Available Storage, in petabytes worldwide (axis 0 to 1,000,000), 2005 to 2010; annotated "Transient information or unfilled demand for storage"]
8. Why Data Management
• Natural disaster
• Facilities infrastructure failure
• Storage failure
• Server hardware/software failure
• Application software failure
• External dependencies (e.g. PKI failure)
• Format obsolescence
• Legal encumbrance
• Human error
• Malicious attack by human or automated agents
• Loss of staffing competencies
• Loss of institutional commitment
• Loss of financial stability
• Changes in user expectations and requirements
CC image by Sharyn Morrow on Flickr
CC image by momboleum on Flickr
10. “MEDICARE PAYMENT ERRORS NEAR $20B” (CNN) December 2004
Miscoding and billing errors from doctors and hospitals totaled $20 billion in FY 2003 (9.3%
error rate). The error rate measured claims that were paid despite being medically
unnecessary, inadequately documented, or improperly coded. This error rate actually was an
improvement over the previous fiscal year (9.8% error rate).
“AUDIT: JUSTICE STATS ON ANTI-TERROR CASES FLAWED” (AP) February 2007
The Justice Department Inspector General found only two sets of data out of 26 concerning
terrorism attacks were accurate. The Justice Department uses these statistics to argue for
their budget. The Inspector General said the data “appear to be the result of decentralized
and haphazard methods of collections … and do not appear to be intentional.”
“SOCIAL SECURITY DATA CAN TURN PEOPLE INTO THE LIVING DEAD” (NPR) August 2016
In 2011, an audit found that about 1,000 people a month in the U.S. were marked deceased
when they were very much alive. Rona Lawson, who works in the Office of the Inspector
General at the Social Security Administration, says that number has gone down. It's now
around 500 people a month. Lawson says 90 percent of the time, the cascade of
misinformation starts with an input error by Social Security staff — a regular mistake on a
regular office day that just happens to kill a person off, at least on paper.
Slide courtesy of BLM
11. Poor Science Data Management Example
A wildlife biologist for a small field office was the in-house GIS expert and provided support for all the staff’s GIS needs. However, the data were stored on her own workstation. When the biologist relocated to another office, no one understood how the data were stored or managed.
Solution: A state office GIS specialist retrieved the workstation and sifted through files trying to salvage relevant data.
Cost: 1 work month ($4,000) plus the value of data that were not recovered
Consider that the situation could have been worse, because the data were not being backed up as they would have been if stored on a server.
12. Why Data Management
In preparation for a Resource Management Plan, an office discovered 14 duplicate GPS inventories of roads. However, because none of the inventories had enough metadata, it was impossible to know which inventory was best or if any of the inventories actually met their requirements.
Solution: Re-inventory roads
Cost: Estimated 9 work months/inventory @ $4,000/work month (14 inventories = $504,000)
CC image by ruffin_ready on Flickr
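The cost figure on this slide can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated on the slide; the constant and function names are illustrative and not part of the original material.

```python
# Figures taken from the slide; the names are hypothetical, chosen for this sketch.
WORK_MONTH_COST_USD = 4_000   # cost of one work month
MONTHS_PER_INVENTORY = 9      # estimated effort for one road inventory
DUPLICATE_INVENTORIES = 14    # inventories found without enough metadata


def inventory_cost(inventories: int) -> int:
    """Return the estimated cost in USD of carrying out `inventories` road inventories."""
    return inventories * MONTHS_PER_INVENTORY * WORK_MONTH_COST_USD


print(f"One inventory: ${inventory_cost(1):,}")
print(f"{DUPLICATE_INVENTORIES} inventories: ${inventory_cost(DUPLICATE_INVENTORIES):,}")
```

On these numbers a single inventory comes to $36,000, so the $504,000 total corresponds to fourteen inventories at that rate.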
14. Why Data Management
“Please forgive my paranoia about protocols, standards, and data review. I'm
in the latter stages of a long career with USGS (30 years, and counting), and
have experienced much. Experience is the knowledge you get just after you
needed it.
Several times, I've seen colleagues called to court in order to testify about
conditions they have observed.
Without a strong tradition of constant review and approval of basic data, they
would've been in deep trouble under cross-examination. Instead, they were
able to produce field notes, data approval records, and the like, to back up
their testimony.
It's one thing to be questioned by a college student who is working on a
project for school. It's another entirely to be grilled by an attorney under oath
with the media present.”
- Nelson Williams, Scientist
US Geological Survey
15. Why Data Management
The climate scientists at the centre of a media storm
over leaked emails were yesterday cleared of
accusations that they fudged their results and silenced
critics, but a review found they had failed to be open
enough about their work.
16. Why Data Management
• Manage your data for yourself:
o Keep yourself organized – be able to find your files (data inputs, analytic scripts, outputs at various stages of the analytic process, etc.)
o Track your science processes for reproducibility – be able to match up your outputs with exact inputs and transformations that produced them
o Better control versions of data – easily identify versions that can be periodically purged
o Quality control your data more efficiently
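One possible way to "match up your outputs with exact inputs and transformations" is to log a checksum of every input, script, and output each time an analysis runs. The following is a minimal sketch under that assumption; the function names and the `provenance_log.jsonl` file name are hypothetical and do not come from the DataONE material.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_run(inputs, script, outputs, log_path="provenance_log.jsonl"):
    """Append one JSON line linking outputs to the inputs and script that produced them."""
    entry = {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "script": {"path": str(script), "sha256": sha256_of(script)},
        "inputs": [{"path": str(p), "sha256": sha256_of(p)} for p in inputs],
        "outputs": [{"path": str(p), "sha256": sha256_of(p)} for p in outputs],
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry


# Example call (the file names are placeholders):
# record_run(["raw/counts.csv"], "scripts/clean.py", ["derived/counts_clean.csv"])
```

Because each entry stores content hashes rather than file names alone, a later change to an input or script shows up as a different hash, which also helps identify stale versions that can be purged.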
17. Why Data Management
• To avoid data loss (e.g. making backups)
• Format your data for re-use (by yourself or others)
• Be prepared: Document your data for your own recollection, accountability, and re-use (by yourself or others)
• Gain credibility and recognition for your science efforts through data sharing!
CC image by UWWResNet on Flickr
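As an illustration of "making backups", the sketch below copies a file to a backup folder and then confirms that the copy is byte-identical by comparing checksums. It is only one simple approach, and the function names are hypothetical; a real backup strategy would also involve off-site copies and a schedule, which are not shown here.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source, backup_dir):
    """Copy `source` into `backup_dir` and raise an error if the copy differs."""
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    if file_digest(source) != file_digest(target):
        raise IOError(f"Backup of {source} does not match the original")
    return target


# Example call (the paths are placeholders):
# backup_and_verify("data/survey_2016.csv", "/mnt/backup/survey")
```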
18. Why Data Management
• Data are a valuable asset – they are expensive and time-consuming to collect
• Data should be managed to:
o maximize the effective use and value of data and information assets
o continually improve the quality including: data accuracy, integrity, integration, timeliness of data capture and presentation, relevance, and usefulness
o ensure appropriate use of data and information
o facilitate data sharing
o ensure sustainability and accessibility in the long term for re-use in science
22. Why Data Management
Here are a few reasons (from the UK Data Archive):
• Increases the impact and visibility of research
• Promotes innovation and potential new data uses
• Leads to new collaborations between data users and creators
• Maximizes transparency and accountability
• Enables scrutiny of research findings
• Encourages improvement and validation of research methods
• Reduces cost of duplicating data collection
• Provides important resources for education and training
23. Why Data Management
Spatio-Temporal Exploratory Models predict the probability of occurrence of bird species across the United States at a 35 km x 35 km grid.
Data sources shown: eBird, Land Cover, Meteorology, MODIS – Remote sensing data
Model results: Occurrence of Indigo Bunting (2008), panels for Jan, Apr, Jun, Sep, Dec
Potential Uses:
• Examine patterns of migration
• Infer impacts of climate change
• Measure patterns of habitat usage
• Measure population trends
Slide courtesy of DataONE
24. Why Data Management
A new image processing technique reveals something not before seen in this Hubble Space
Telescope image taken 11 years ago: A faint planet (arrows), the outermost of three
discovered with ground-based telescopes last year around the young star HR 8799.
D. Lafrenière et al., Astrophysical Journal Letters.
“The first thing it tells you is how valuable maintaining long-term archives can be. Here is a major
discovery that’s been lurking in the data for about 10 years!” comments Matt Mountain, director
of the Space Telescope Science Institute in Baltimore, which operates Hubble.
“The second thing it tells you is having a well calibrated archive is necessary but not sufficient to
make breakthroughs — it also takes a very innovative group of people to develop very smart
extraction routines that can get rid of all the artifacts to reveal the planet hidden under all that
telescope and detector structure.”
D. Lafrenière et al., ApJ Letters
26. Why Data Management
• …there are best practices… and… tools to help!
• The following data management lessons will illustrate in detail each stage of the data lifecycle
• Your well-managed and accessible data can contribute to science in ways you may not even imagine today!
27. Why Data Management
• If data are:
o Well-organized
o Documented
o Preserved
o Accessible
o Verified as to accuracy and validity
• Result is:
o High quality data
o Easy to share and re-use in science
o Citation and credibility to the researcher
o Cost-savings to science
28. Why Data Management
• The data deluge has created a surge of information that
needs to be well-managed and made accessible.
• The cost of not doing data management can be very high.
• Be cognizant of best practices and tools associated with the
data lifecycle to manage your data well.
• Many benefits are associated with the act of managing
data, including the ability to find, access, understand,
integrate, and re-use data.
29. Why Data Management
1. Chatfield, T., Selbach, R. February 2011. Data Management Training
Workshop. Bureau of Land Management (BLM).
2. Strasser, Carly. February 2012. Data Management for Scientists.
http://www.slideshare.net/carlystrasser/oceansciences2012workshop
3. UK Data Archive. May 2011. Managing and Sharing Data: Best
Practices for Researchers.
http://www.data-archive.ac.uk/media/2894/managingsharing.pdf
4. DAMA International, The DAMA Guide to the Data Management Body
of Knowledge. https://www.dama.org/content/body-knowledge
30. Why Data Management
The full slide deck may be downloaded from:
http://www.dataone.org/education-modules
Suggested citation:
DataONE Education Module: Data Management. DataONE.
Retrieved Nov 16, 2016, from
http://www.dataone.org/sites/all/documents/L01_DataManagement.pptx
Copyright license information:
No rights reserved; you may enhance and reuse for
your own purposes. We do ask that you provide
appropriate citation and attribution to DataONE.
Editor's Notes
The topics covered in this module will answer the questions:
Data is being generated in massive quantities daily. Improvements in technology enable higher precision and coverage in data acquisition, and higher-capacity systems store and migrate more data, increasing the importance of managing, integrating, and re-using data.
Manage your data for yourself:
• Keep yourself organized: be able to find your files (data inputs, analytic scripts, outputs at various stages of the analytic process, etc.)
• Track your science processes for reproducibility: be able to match up your outputs with the exact inputs and transformations that produced them
• Better control versions of data: easily identify versions that can be periodically purged
• Quality control your data more efficiently
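One way to match outputs with the exact inputs and transformations that produced them is to keep a small provenance log. The Python sketch below is only an illustration under stated assumptions: the function names `sha256_of` and `record_provenance`, and the one-JSON-line-per-step log layout, are hypothetical and not part of the DataONE materials.

```python
import hashlib
import json


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_provenance(inputs, outputs, transformation, log_path):
    """Append one JSON line linking outputs to the inputs that produced them.

    Hypothetical helper: the log layout is an assumption for illustration.
    """
    entry = {
        "transformation": transformation,
        "inputs": {str(p): sha256_of(p) for p in inputs},
        "outputs": {str(p): sha256_of(p) for p in outputs},
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry
```

Calling `record_provenance` after each processing step leaves a plain-text trail. If an input file later changes, its checksum will no longer match the logged value, which flags outputs that may need to be regenerated.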
Manage your data for yourself:
• Make backups to avoid data loss
• Format your data for re-use (by yourself or others)
• Be prepared: document your data for your own recollection and re-use (by yourself or others)
• Prepare it to share it: gain credibility and recognition for your science efforts
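As a minimal sketch of the backup point above, the Python snippet below copies a file and then checks that the copy is byte-identical to the original. The function name `backup_file` and the single-directory layout are assumptions for illustration; a real backup plan would also keep copies on separate media or at another site.

```python
import filecmp
import shutil
from pathlib import Path


def backup_file(source, backup_dir):
    """Copy source into backup_dir and confirm the copy matches byte for byte.

    Hypothetical helper for illustration only.
    """
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also tries to preserve file metadata
    # shallow=False forces a comparison of file contents, not just size/mtime
    if not filecmp.cmp(source, target, shallow=False):
        raise IOError(f"Backup of {source} does not match the original")
    return target
```

Verifying the copy immediately matters because a backup that was silently truncated or corrupted gives false confidence until the day it is needed.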
Data is a valuable asset; it is expensive and time-consuming to collect. Data should be managed to:
• maximize the effective use and value of data and information assets
• continually improve quality, including data accuracy, integrity, integration, timeliness of data capture and presentation, relevance, and usefulness
• ensure appropriate use of data and information
• facilitate data sharing
• ensure sustainability and accessibility in the long term for re-use in science
Data management and organization facilitate archiving, sharing and publishing data. These activities feed data re-use and reproducibility in science.
By re-using data collected from a variety of sources (the eBird database, land cover data, meteorology, and remotely sensed data), this project was able to compile and process the data using supercomputing to determine bird migration routes for particular species.
An abundance of data, and metadata (if any were created), ends up in filing cabinets, on discarded hard drives, in hard-copy journals on library shelves, or on the web, where many journals are subscription-only.
Data should be properly managed and eventually be placed where they are accessible, understandable, and re-usable.
A data lifecycle illustrates the stages through which well-managed data pass from the inception of a research project to its conclusion. In the reality of science research, the stages do not always follow a continuous circle.
For each stage of the data lifecycle, there are best practices and tools to help! The following data management lessons will illustrate in detail each stage of the data lifecycle. Your well-managed and accessible data can contribute to science in ways you may not even imagine today!
The data deluge has created a surge of information that needs to be well-managed and made accessible. The cost of not doing data management is very high. Be cognizant of best practices and tools associated with the data lifecycle to manage your data well. Many benefits are associated with the act of managing data, including the ability to find, access, understand, integrate, and re-use data.