This document discusses the importance of digital preservation. It notes that digital data is costly to produce and can contribute to scientific progress if preserved and shared. Preserving data ensures it can be found in the future as technologies and standards change over time. The document outlines reasons to preserve data, what types of data and associated materials should be preserved, methods for ensuring long-term access and retrieval such as metadata and standards, and challenges around curation and maintenance of preserved data collections.
6. Data deluge
• At the end of 2011 – information created and replicated > 1.8 zettabytes
• 90% of data created in the last 2 years
• 5-hour flight – 240 TB
• Facebook – 200 million users, >70 languages
• Each person in England is filmed 300 times/day
• Teenagers in the US send an average of 110 text messages a day
=> We need to build arks during the deluge - PRESERVATION
7. Outline
• Why preserve?
• What to preserve?
• How to preserve?
• Where to preserve?
And a few associated challenges
9. WHY PRESERVE
• Costly to produce
• Contribute to progress of science
• Intrinsic value
culture/science/sustainability
10. WHY PRESERVE
• Costly to produce
– Infrastructure, power, software, models, visualization,
people
– Hardware, Software, Peopleware
• Contribute to progress of science
– Reproducibility and reusability
– Publication and sharing
– Quality
• Intrinsic value culture/science/sustainability
– Digital humanities
– Domesday project
– Fonoteca Neotropical Jacques Vieillard
13. The Domesday Project 1086-1986
• Digital decay
• Equipment obsolescence
• Software obsolescence
20. What to preserve?
• Data
• BUT what is “data”?
– Files and records
– Models, documentation, annotations, sketches,
experiments, recordings
• Only data?
21. What to preserve?
• Data
• BUT what is “data”?
– Files and records
– Models, documentation, annotations, sketches,
experiments, recordings
• Only data?
– How it was produced – workflows, devices,
methodologies, materials and methods,
reasoning, logs: provenance
22. What to preserve?
• Data
• Environment in which it was produced
• The data needed for preservation occupies more
space than the data itself
• Preservation means storing more than the object
itself
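One way to read the slides above is that each preserved object travels together with a description of how it was produced. The following is a minimal sketch in Python, assuming hypothetical `ProvenanceRecord` and `PreservedObject` structures; the field names and sample values are illustrative and do not come from the talk.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ProvenanceRecord:
    """Hypothetical description of how a data object was produced."""
    workflow: str
    devices: list = field(default_factory=list)
    methodology: str = ""
    logs: list = field(default_factory=list)


@dataclass
class PreservedObject:
    """A data file plus the context needed to interpret it later."""
    data_file: str
    documentation: str
    provenance: ProvenanceRecord


def to_preservation_package(obj: PreservedObject) -> str:
    """Serialise the object and its provenance into one JSON document."""
    return json.dumps(asdict(obj), indent=2, sort_keys=True)


# Illustrative values only (assumptions, not data from the slides).
recording = PreservedObject(
    data_file="recording_0001.wav",
    documentation="Field recording, annotated by the collector.",
    provenance=ProvenanceRecord(
        workflow="record -> digitise -> annotate",
        devices=["field recorder"],
        methodology="manual annotation",
        logs=["digitised from tape"],
    ),
)
package = to_preservation_package(recording)
```

The package is deliberately larger than the file name it describes, which mirrors the point that preservation stores more than the object itself.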
23. What about our research data?
(slide adapted from Jim Gray)
[Diagram labels: Experiments, Instruments, Simulations, Models, Files, Papers, Questions, Answers, DATA. Caption: Data-driven science “Collaboratory”]
24. Data sources?
Table of Product Characteristics
id | Property name | Value
MilkProd | productsrep | MilkA
MilkProd | quantity | 10000
MilkProd | validity date | 10/06/2006
CheeseProd | productsrep | Minas
CheeseProd | quantity | 2000
CheeseProd | validity date | 12/02/2006
CheeseProd | shape | Circular
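The table on this slide stores one property per row (id, property name, value). As a rough sketch, the rows can be regrouped into one record per product, which makes it easier to see which properties each product carries. The function name `pivot` is hypothetical, and reading the split tokens in the transcript as `productsrep` and `quantity` is an assumption.

```python
from collections import defaultdict

# (id, property name, value) rows as transcribed from the slide.
ROWS = [
    ("MilkProd", "productsrep", "MilkA"),
    ("MilkProd", "quantity", "10000"),
    ("MilkProd", "validity date", "10/06/2006"),
    ("CheeseProd", "productsrep", "Minas"),
    ("CheeseProd", "quantity", "2000"),
    ("CheeseProd", "validity date", "12/02/2006"),
    ("CheeseProd", "shape", "Circular"),
]


def pivot(rows):
    """Group property rows into one dictionary per product id."""
    records = defaultdict(dict)
    for entity_id, prop, value in rows:
        records[entity_id][prop] = value
    return dict(records)


records = pivot(ROWS)
for entity_id, props in records.items():
    print(entity_id, sorted(props))
```

Values stay as strings here: without documentation, nothing in the table says whether `10/06/2006` is day/month or month/day, which is the kind of context the talk argues must be preserved alongside the data.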
30. How to preserve?
How to construct the ark during the
deluge?
Praeservare, Manutenere and Share
31. How to preserve?
• To ensure retrievability and sharing
– Index structures
– Ontologies, metadata, keywords, standards
– Workflows
• To ensure longevity
– Media decay, software decay, hardware decay
• To ensure quality
– Curation procedures
• To afford maintenance costs
– Cloud? CAP theorem?
36. How to preserve?
• To ensure retrievability and sharing
– Index structures
– Ontologies, metadata, keywords, standards
– Workflows
• To ensure longevity
– Media decay, software decay, hardware decay
– PEOPLE DECAY
• To ensure quality
– Curation procedures, metadata, standards
• To afford maintenance costs
– Cloud? CAP theorem? => WHERE
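The slide lists media decay as a threat to longevity without naming a countermeasure. One common technique, shown here as an assumption and not as the speaker's method, is to store a checksum with each file and recompute it later to detect silent corruption. The helper names `record_fixity` and `is_intact` are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_fixity(path):
    """Create a small metadata record holding the file name and checksum."""
    return {"file": Path(path).name, "sha256": sha256_of(path)}


def is_intact(path, fixity):
    """Return True if the file still matches its recorded checksum."""
    return sha256_of(path) == fixity["sha256"]


with tempfile.TemporaryDirectory() as workdir:
    sample = Path(workdir) / "sample.dat"
    sample.write_bytes(b"preserved bytes")
    fixity = record_fixity(sample)
    intact_before = is_intact(sample, fixity)
    # Simulate decay by changing a single byte.
    sample.write_bytes(b"preserved bytez")
    intact_after = is_intact(sample, fixity)
```

A checksum only detects damage; repairing it still requires a second copy, which is where the storage-location question on the slide comes in.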
37. Sharing and open access
NSF Data Management Policy
Paper and data publication
39. Sharing of Data Leads to Progress on Alzheimer’s
By Gina Kolata, The New York Times
Published: August 12, 2010
In 2003, a group of scientists and executives from the National Institutes of Health, the Food and Drug Administration, the drug and medical-imaging industries, universities and nonprofit groups joined in a project that experts say had no precedent: a collaborative effort to find the biological markers that show the progression of Alzheimer’s disease in the human brain.
“... share all the data, making every single finding public immediately, available to anyone with a computer anywhere in the world”
=> AVAILABILITY and REUSE
40. • Data must be properly curated throughout its
life-cycle and released with the appropriate
high-quality metadata.
• Medical Research Council UK
41. • Research data should be made available for
use by other researchers. Researchers must
retain research data, including electronic data,
in a durable, indexed and retrievable form.
• Australian Government National Health and
Medical Research Council
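The requirement above that data be kept in an "indexed and retrievable form" can be illustrated with a small keyword index over metadata records. This is a minimal sketch under stated assumptions: the record ids, keywords and the function names `build_keyword_index` and `search` are all hypothetical.

```python
from collections import defaultdict


def build_keyword_index(records):
    """Map each lower-cased keyword to the sorted ids of records carrying it."""
    index = defaultdict(set)
    for record_id, keywords in records.items():
        for keyword in keywords:
            index[keyword.lower()].add(record_id)
    return {keyword: sorted(ids) for keyword, ids in index.items()}


def search(index, *keywords):
    """Return the ids of records that carry every requested keyword."""
    hits = [set(index.get(keyword.lower(), ())) for keyword in keywords]
    return sorted(set.intersection(*hits)) if hits else []


# Illustrative catalogue (assumed values, not from the slides).
catalogue = {
    "rec-001": ["Bird", "recording", "Brazil"],
    "rec-002": ["frog", "recording"],
    "rec-003": ["bird", "photograph"],
}
index = build_keyword_index(catalogue)
```

Lower-casing keywords is a simple stand-in for the controlled vocabularies and ontologies mentioned earlier, which serve the same purpose of making differently written terms match.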
44. • Citing data is as important as citing papers
• For researchers, publishers, data centers
• Over 1 million DOIs, several major national research
libraries
– Germany, France, Korea, Netherlands, Australia,
USA...
• Present manager – German National Library of
Science and Technology
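To make the idea of citing data concrete, the sketch below builds a citation string from a dataset's metadata and its DOI. The layout (creators, year, title, publisher, DOI link) is an assumed convention chosen for illustration, and every value in the example, including the DOI `10.1234/example`, is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DatasetMetadata:
    """Minimal metadata needed to cite a dataset (assumed field set)."""
    creators: list
    year: int
    title: str
    publisher: str
    doi: str


def format_data_citation(meta):
    """Render a dataset citation ending in a resolvable DOI link."""
    authors = "; ".join(meta.creators)
    return (
        f"{authors} ({meta.year}). {meta.title}. "
        f"{meta.publisher}. https://doi.org/{meta.doi}"
    )


# Hypothetical dataset used only to show the output shape.
example = DatasetMetadata(
    creators=["Doe, J.", "Roe, R."],
    year=2012,
    title="Example sound recordings",
    publisher="Example Data Centre",
    doi="10.1234/example",
)
citation = format_data_citation(example)
```

Because the DOI is persistent, the citation can stay valid even if the dataset moves to a different host.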
45. Publish on the Cloud
Add metadata
Pre-print sharing
46. FNJV
proj.lis.ic.unicamp.br/fnjv
• Sharing by publishing on the Web
• Retrievability by extending metadata
54. Outline
• Why preserve?
• What to preserve?
• How to preserve?
• Where to preserve?
And a few associated challenges
PRE-SAVE and MANU-TENERE
55. Outline
• Why preserve?
– Costly to produce (hardware, software, peopleware)
– Contribute to progress of science
– Value – culture, science, sustainability
• What to preserve?
– Data [WHAT IS DATA?]
– Context of production and use
• How to preserve?
– Accessibility and sharing – standards, metadata,
ontologies
– Integrity and quality – context to use (hw, sw),
standards
57. References
NSF – CISE Data management policy
The Domesday Project
http://www.atsf.co.uk/dottext/domesday.html
The CLARIN Project (languages)
Eigenfactor.org
Altmetrics movement