myExperiment and the Rise of Social Machines - David De Roure
Talk at hubbub 2012, Indianapolis, 25 September 2012. The talk introduces myExperiment and Wf4Ever, discusses the future of research communication including FORCE11, and introduces the SOCIAM project (Theory and Practice of Social Machines) which launches in October 2012.
Large Scale Data Mining using Genetics-Based Machine Learning - Xavier Llorà
We are living in the petabyte era. We have larger and larger data to analyze, process, and transform into useful answers for the domain experts. Robust data mining tools, able to cope with petascale volumes and/or high dimensionality while producing human-understandable solutions, are key in several domain areas. Genetics-based machine learning (GBML) techniques are perfect candidates for this task, among other reasons due to recent advances in representations, learning paradigms, and theoretical modeling. If evolutionary learning techniques aspire to be a relevant player in this context, they need the capacity to process these vast amounts of data, and to process them within reasonable time. Moreover, massive computation cycles are getting cheaper every day, giving researchers access to unprecedented degrees of parallelization. Several topics are interlaced with these two requirements; a few of them are: (1) having the proper learning paradigms and knowledge representations, (2) understanding them and knowing when they are suitable for the problem at hand, (3) using efficiency enhancement techniques, and (4) transforming and visualizing the produced solutions to give back as much insight as possible to the domain experts.
This tutorial addresses these challenges, following a roadmap that starts with the questions of what "large" means and why large is a challenge for GBML methods. Afterwards, we will discuss different facets in which we can overcome this challenge: efficiency enhancement techniques, representations able to cope with high-dimensional spaces, and scalability of learning paradigms. We will also review a topic interlaced with all of them: how we can model the scalability of the components of our GBML systems to better engineer them and get the best performance out of them on large datasets. The roadmap continues with examples of real applications of GBML systems and finishes with an analysis of further directions.
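To make the scalability requirement concrete: the dominant cost in most GBML systems is matching every rule against every record, and that step parallelizes cleanly over chunks of the data. Below is a minimal sketch of data-parallel fitness evaluation; the rule encoding and all names are invented for illustration, not taken from the tutorial.

from multiprocessing import Pool

# Illustrative encoding: a rule is a conjunction of (attribute, value)
# tests; a record matches if every test holds.
def matches(rule, record):
    return all(record.get(attr) == value for attr, value in rule)

def chunk_hits(args):
    rule, chunk = args
    # Count records whose label agrees with the rule's prediction.
    return sum(1 for record, label in chunk if matches(rule, record) == label)

def fitness(rule, data, workers=4):
    # Split the dataset into chunks and score them in parallel, then
    # combine the partial counts into a single accuracy figure.
    chunks = [data[i::workers] for i in range(workers)]
    with Pool(workers) as pool:
        hits = sum(pool.map(chunk_hits, [(rule, c) for c in chunks]))
    return hits / len(data)

if __name__ == "__main__":
    data = [({"x": 1, "y": 0}, True), ({"x": 0, "y": 0}, False)] * 1000
    print(fitness([("x", 1)], data))   # 1.0: the rule separates the labels

The same map-then-combine decomposition carries over to cluster-scale infrastructures once the data no longer fits on a single machine.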
Opening talk at the “Interdisciplinary Data Resources to Address the Challenges of Urban Living” Workshop at the Urban Big Data Centre, University of Glasgow, 4 April 2016
Digital identity is fundamental to collaboration in bioinformatics research and development because it enables attribution, contribution, and publication to be recorded and quantified.
However, current models of identity are often obsolete and have problems capturing both small contributions (“micro-attribution”) and large contributions (“mega-attribution”) in science. Without adequate identity mechanisms, the incentive for collaboration is reduced and the utility of collaborative social tools hindered.
Using examples of metabolic pathway analysis with the Taverna workbench and myExperiment.org, this talk illustrates problems and solutions in identifying scientists accurately and effectively in collaborative bioinformatics networks on the Web.
Knowledge Infrastructure for Global Systems Science - David De Roure
Presentation at the First Open Global Systems Science Conference, Brussels, 8-10 November 2012
http://www.gsdp.eu/nc/news/news/date/2012/10/31/first-open-global-systems-science-conference/
Cyberinfrastructure Day 2010: Applications in Biocomputing - Jeremy Yang
UNM Cyberinfrastructure Day 2010 presentation: applications in biocomputing, and cyberinfrastructure issues in biomedical and cheminformatics research computing.
Trust and Accountability: experiences from the FAIRDOM Commons Initiative - Carole Goble
Presented at Digital Life 2018, Bergen, March 2018. In the Trust and Accountability session.
In recent years we have seen a change in expectations for the management and availability of all the outcomes of research (models, data, SOPs, software, etc.) and for greater transparency and reproducibility in the method of research. The “FAIR” (Findable, Accessible, Interoperable, Reusable) Guiding Principles for stewardship [1] have proved to be an effective rallying cry for community groups and for policy makers.
The FAIRDOM Initiative (FAIR Data Models Operations, http://www.fair-dom.org) supports Systems Biology research projects with their research data, methods and model management, with an emphasis on standards and sensitivity to asset sharing and credit anxiety. Our aim is a FAIR Research Commons that blends together the doing of research with the communication of research. The Platform has been installed by over 30 labs/projects and our public, centrally hosted FAIRDOMHub [2] supports the outcomes of 90+ projects. We are proud to support projects in Norway’s Digital Life programme.
2018 is our 10th anniversary. Over the past decade we have learned a lot about trust: between researchers, between researchers and platform developers and curators, and between both these groups and funders. We have experienced the Tragedy of the Commons but have also seen shifts in attitudes.
In this talk we will use our experiences in FAIRDOM to explore the political, economic, social, and technical practicalities of Trust.
[1] Wilkinson et al. (2016) The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data 3, doi:10.1038/sdata.2016.18
[2] Wolstencroft et al. (2016) FAIRDOMHub: a repository and collaboration environment for sharing systems biology research. Nucleic Acids Research 45(D1): D404-D407, doi:10.1093/nar/gkw1032
Keynote presentation by Professor Carole Goble at BOSC (Bioinformatics Open Source Conference) Long Beach, California, USA, July 14 2012. Co-located with ISMB, Intelligent Systems in Molecular Biology
This presentation provides an overview of issues in digital preservation. It was delivered during the joint DPE/Planets/CASPAR/nestor training event, ‘The Preservation challenge: basic concepts and practical applications’ (Barcelona, March 2009).
What to curate? Preserving and Curating Software-Based Art - neilgrindley
This is a presentation given at the CHArt (Computers and History of Art) conference held in London in November 2011. The slides on the title page are images taken from works exhibited at the V&A Decode exhibition.
Text-Fabric: how to do text research in a FAIR way.
Text is one of the simplest and most common data types in computer science.
But there is more in text than meets the eye, and so people have been annotating texts, century after century.
When you research texts, you consume and produce such annotations.
Suddenly you find yourself in the midst of a big fabric of thoughts, contributed by many authors.
Text-Fabric is a tool that helps you to follow the threads that came before you and to weave a few of your own and add them to the scholarly record.
I'll show you how that looks for clay tablets of the Uruk period (the oldest writing on earth), the much more recent Hebrew Bible, and the ultramodern General Missives of the VOC era.
Towards TextPy, a module for processing text.
If we define annotated text as a graph with additional structure, we can make text processing more efficient, in the same way that Pandas makes processing dataframes more efficient.
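The dataframe analogy invites a concrete picture. Here is a minimal sketch of the idea under my own assumptions (the names are illustrative, not an actual TextPy design): nodes stand for words and for spans of words, annotations are feature mappings over nodes, and feature lookup becomes a column-wise operation.

# Sketch only: invented names, not the actual TextPy design.
slots = ["In", "the", "beginning"]         # nodes 0..2: the words themselves
spans = {3: (0, 3)}                        # node 3: a phrase over slots 0..2
features = {                               # annotations: feature -> node -> value
    "pos":  {0: "prep", 1: "art", 2: "noun"},
    "type": {3: "PP"},
}

def text_of(node):
    # Resolve a node to its surface text, whether it is a slot or a span.
    if node < len(slots):
        return slots[node]
    start, end = spans[node]
    return " ".join(slots[start:end])

# Feature lookup works column-wise, like selecting on a dataframe column:
pp_nodes = [n for n, t in features["type"].items() if t == "PP"]
print([text_of(n) for n in pp_nodes])      # ['In the beginning']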
More Related Content
Similar to 2010 Digital Humanities London - Evolution of Preservation
We demonstrate how Text-Fabric can handle the display of text and annotations, even when chunks of text are not properly embedded in each other. This demo contains examples from the Hebrew Bible and the Old Babylonian Letters (cuneiform clay tablets).
Researchers in ancient text corpora can take control of their data. We show a way to do so by means of Text-Fabric.
Co-production of Cody Kingham and Dirk Roorda
Biblia Hebraica Stuttgartensia Amstelodamensis. Coding the Hebrew Bible with an Open Science ethos: Text-Fabric.
Text-Fabric is several things: (1) a browser for ancient text corpora; (2) a Python3 package for processing ancient corpora.
A corpus of ancient texts and linguistic annotations represents a large body of knowledge. Text-Fabric makes that knowledge accessible to non-programmers by means of a built-in search interface that runs in your browser.
From there, the step to programming your own analytics is not so big anymore, because you can call the Text-Fabric API from your Python programs, and it works really well in Jupyter notebooks.
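To illustrate that step, here is a minimal Jupyter-style session against the ETCBC Hebrew Bible corpus; treat it as a sketch, since the exact corpus identifier passed to use varies a little between Text-Fabric versions.

from tf.app import use

# The first call downloads the corpus data; requires: pip install text-fabric
A = use("etcbc/bhsa", hoist=globals())

# hoist puts F (feature lookup), L (level moves) and T (text rendering) in scope.
verse = T.nodeFromSection(("Genesis", 1, 1))
words = L.d(verse, otype="word")
print(T.text(verse))                 # the Hebrew text of Genesis 1:1
print([F.lex.v(w) for w in words])   # the lexeme feature of each word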
Developing a tool for handling text with linguistic annotations. Text-Fabric is meant to support researchers who want to contribute portions of the data, and it weaves the contributions into a meaningful whole. Currently, it is primarily meant for working with the Hebrew Bible, based on the ETCBC (Amsterdam) linguistic database.
Conference presentation for 2016 annual meeting of the Society of Biblical Literature, San Antonio. (https://www.sbl-site.org).
Authors: Janet Dyk (linguistic ideas) and Dirk Roorda (computational implementation).
A verb organizes the elements in a sentence. Different patterns of constituents affect the meaning of a verb in a given context. The potential of a verb to combine with patterns of elements is known as its valence. A single set of questions, organized as a flow chart, selects the relevant building blocks within the context of a verb. The resulting pattern provides a particular significance for the verb in question. Because all contexts are submitted to the same flow chart, similarities and differences between verbs come to light. For example, verbs of movement in their causative formation manifest the same patterns as transitive verbs with an object that gets moved. We apply this approach to the whole Hebrew Bible, using the database of the Eep Talstra Centre for Bible and Computer (ETCBC), which contains the relevant linguistic annotations. This allows us to have a complete listing of all patterns for all verbs. It provides the basis for consistent proposals for the significance of specific patterns occurring with a particular verb. The valence results are made available in SHEBANQ, an online research tool based on the ETCBC database. It presents the basic data, text and linguistic features, together with annotations by researchers. The valence results consist of a set of algorithmically generated annotations which show up between the lines of the text. The algorithm itself and its documentation can be found at https://shebanq.ancient-data.org/tools?goto=valence. By using SHEBANQ we achieve several goals with respect to the scholarly workflow: (1) all our results are openly accessible online, and other researchers may comment on them; (2) all resources needed to reproduce this research are available online and can be downloaded (Open Access).
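As an illustration only (the labels and their order below are invented; the real flow chart is documented at the SHEBANQ link above), the flow-chart idea amounts to a single decision function applied uniformly to every clause, which is exactly what makes the patterns of different verbs comparable:

def valence_pattern(constituents):
    # constituents: the set of labelled building blocks found around a verb.
    # One fixed, ordered list of questions classifies every context.
    if "object" in constituents and "indirect_object" in constituents:
        return "ditransitive"
    if "object" in constituents:
        return "transitive"
    if "complement_of_direction" in constituents:
        return "movement"
    return "intransitive"

print(valence_pattern({"subject", "object"}))                   # transitive
print(valence_pattern({"subject", "complement_of_direction"}))  # movement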
Text as Data: processing the Hebrew Bible - Dirk Roorda
The merits of stand-off markup (LAF) versus inline markup (TEI) for processing text as data. Ideas applied to work with the Hebrew Bible, resulting in tools for researchers and end-users.
Data Management for Research: A Case Study - Dirk Roorda
How practices of data sharing can help researchers to produce more science.
Session in the data management course organized by RDNL (Research Data in the Netherlands)
Hebrew Bible as Data: Laboratory, Sharing, Lessons - Dirk Roorda
Recently, the Hebrew Bible has been published online as a database. We show what you can do with it, and how to share your results with others. Work by the Amsterdam scholars of the Eep Talstra Centre for Bible and Computer, supported by CLARIN-NL.
LAF-Fabric: a tool to process the ETCBC Hebrew Text Database in the Linguistic Annotation Framework.
How researchers in theology and linguistics can create workflows to analyse the text of the Hebrew Bible and extract data for visualization. Those workflows can be written in Python, and run conveniently in the IPython Notebook.
Joint work with Martijn Naaijer (VU University).
With the Hebrew Bible encoded in the Linguistic Annotation Framework (LAF-ISO), and with a new LAF processing tool, we demonstrate how you can do practical data analysis. The tool, LAF-Fabric, integrates with the IPython notebook approach. Our example here is lexeme cooccurrence analysis of Bible books. For now, the road from data to visualization is more important than the exact visualization.
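The cooccurrence step itself is ordinary Python once the lexemes have been extracted from the LAF annotations; a minimal sketch, in which the extraction step and the transliterated example lexemes are assumed:

from collections import Counter
from itertools import combinations

# Assumed input: (book, lexeme) pairs already pulled out of the LAF
# annotations with LAF-Fabric.
occurrences = [("Genesis", "BR>"), ("Genesis", ">LHJM"), ("Exodus", ">LHJM")]

by_book = {}
for book, lexeme in occurrences:
    by_book.setdefault(book, set()).add(lexeme)

# Count, over all books, how often each pair of lexemes shares a book.
pair_counts = Counter()
for lexemes in by_book.values():
    pair_counts.update(combinations(sorted(lexemes), 2))

print(pair_counts.most_common(3))   # pairs ranked by shared books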
2010 Digital Humanities London - Evolution of Preservation
1. Ecology of Longevity
the relevance of evolutionary theory for digital preservation
Peter.Doorn@dans.knaw.nl
Dirk.Roorda@dans.knaw.nl
http://www.dans.knaw.nl/en
2. Sustainability
http://en.wikipedia.org/wiki/Sustainability
the capacity to endure. In ecology the word describes how biological systems remain diverse and productive over time
http://brtf.sdsc.edu/biblio/BRTF_Final_Report.pdf
4. Darwin versus Lamarck
Darwin: populations of organisms in an environment, living and reproducing; imperfect inheritance of traits; the environment cannot sustain all offspring; individuals have different survival chances; natural selection of “the fittest”
Lamarck: inheritance of acquired properties; much speedier evolution; valid for cultural inheritance/heredity
✖ Mendel ✖ genetics ✖ DNA (unknown to both)
5. Modern synthesis
heredity works through genes, collectively: DNA
DNA is copied, recombined, damaged, repaired, mutated
At the core of life is information processing.
6. Paradoxes and Lessons
(i) individuals don’t survive, the system does
(ii) survival is an unintentional result
(iii) there is no long-term, only short-term
... with extensions ...
7. complexity science
patterns and processes of individuals versus communities, in: sociology (the Chicago school, http://en.wikipedia.org/wiki/Chicago_school_(sociology)), economy, psychology, mechanics, ... biology!
M. Mitchell Waldrop
Critical Events in Evolving Networks - http://www.creen.org/
Back from the Brink - http://ideas.repec.org/p/dgr/umamer/2000005.html
8. evolution in ICT
hardware: tube - transistor - chip
computing capacity: mainframe batch - personal computer - cloud
media: numbers - text - image - audio - video - simulation
networking: device - world-wide
software: dos - windows - android; basic - pascal - java; wordperfect - ODF - TEI
organisation
9. strategies to preserve
• emulation: preserve the environment of the data
• migration: adapt the data to the environment
10. Parallel 1: use and retain
current digital preservation thought: first preserve, then re-use
perverse consequence: preserving too well prevents re-use!
evolution gets rid of unused functions: color view - eyes - odor receptors
evolutionary preservation thought: first re-use, then preserve
11. Parallel 2: copies and evolvability
copies are free to evolve
originals are fixed
evolution is not interested in originals
make copies in evolvable formats
migration preferable over emulation
http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000112
12. Parallel 3: Sexual Selection
indicators of survival success promote mating
hard-to-fake: considerable investment
13. Paradigms: bio - tech
part - construction - machine - assembly line - identical copies - refactoring - market
15. Toy ecosystem
clouds = { workspaces } (http://www.duraspace.org/duracloud.php)
works = { copies }
users = owners of workspaces
16. Infrastructure rules
• I access therefore I copy: accessing a work is copying it to your own workspace
• the cloud may de-duplicate identical copies in different workspaces, for storage optimisation
• the cloud may re-duplicate copies, for access optimisation
• the cloud extracts metadata from works and has mechanisms to search
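Read as a protocol, the de-duplication rule is ordinary content addressing. A minimal sketch with invented names, which also yields the copy count that slide 19 lets users request:

import hashlib

blob_store = {}   # content hash -> bytes, shared across the cloud
workspaces = {}   # user -> {work_id: content hash}

def copy_into(user, work_id, content: bytes):
    # Accessing a work is copying it into your own workspace (slide 16);
    # identical copies collapse onto one stored blob.
    digest = hashlib.sha256(content).hexdigest()
    blob_store.setdefault(digest, content)
    workspaces.setdefault(user, {})[work_id] = digest
    return digest

def extant_copies(digest):
    # How many workspaces hold a copy of this work.
    return sum(1 for ws in workspaces.values() if digest in ws.values())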
17. Usability rules
• the cloud provides a linking system between works, not between copies: intelligent persistent identifiers
• the cloud maintains a system of access rights, which is respected by the linking behaviour
18. Economic rules
• users pay for their workspace: for storage capacity multiplied by time (GByte * month)
• users pay for making a copy of a work
• this fee goes to the current owners of a copy of the same work, divided in equal parts
• this fee is not for intellectual property, but is purely infrastructural
19. Preservation rules
• a preservation institution is just a workspace user
• all users may request from the cloud the number of extant copies of works
• a preservation institution specialises in: rare works!
20. Incentives for preservation
• adding value to a work implies more access => more income
• keeping only rare works implies every access is lucrative
• a surviving preservation institution keeps looking for value enhancement, improving usability of its resources, throwing away abundant resources
21. Ecology of sustainable information
• financial incentive to preserve
• optimisations in value / cost
• data use is instrumental in preservation
• {actors} ≈ {stakeholders}
it is already faintly realistic
22. Simulation
• play with the monetary parameters
• study models with interesting behaviour
• maybe there are emergent patterns
• hopefully meaningful in the real world
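The rules of slides 16 to 19 are concrete enough to run as a toy program before any serious modelling; every parameter and name below is invented for illustration.

import random

STORAGE_FEE, COPY_FEE = 0.01, 1.0           # per copy per month; per access

owners = {"work_A": {"archive"}, "work_B": {"alice", "bob", "carol"}}
balance = {u: 100.0 for u in ["archive", "alice", "bob", "carol", "dave"]}

def access(user, work):
    # Slide 16: accessing a work copies it into your own workspace.
    current = owners[work]
    if user not in current:
        balance[user] -= COPY_FEE
        for owner in current:               # slide 18: fee shared in equal parts
            balance[owner] += COPY_FEE / len(current)
        current.add(user)

for month in range(12):
    for work, holders in owners.items():    # slide 18: pay GByte * month
        for user in holders:
            balance[user] -= STORAGE_FEE
    access(random.choice(list(balance)), random.choice(list(owners)))

print({w: len(o) for w, o in owners.items()}, round(balance["archive"], 2))

Even this crude run shows the incentive of slide 20: the fewer owners a work has, the larger each owner's share of every copy fee, so holding rare works pays.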
23. Evolution and Digital Preservation
Thank You
a preservation institution helps rare works to survive, and thus sustains its own survival
the Future is in the Clouds
Editor's Notes
title
“Solving” Digital Preservation
intro (2): sustainability (from Wikipedia): endure – diverse – natural resources; Blue Ribbon report – incentives, value, economy (handy review: http://www.sdsc.edu/News%20Items/PR022610_blueribbon.html)
intro (3):
- an evolutionary approach trades certainty for enhanced insight in probabilities
- past successes of biological evolution are no guarantees for future success of digital evolution: only parallels can be drawn
- the parallels between biological evolution and digital preservation serve exploration and heuristics
- it is not yet tested, but it will lead to testable hypotheses (see example at the end)
core – biology: crash course in 19th-century evolution theory
core – biology: evolutionary biology. The modern synthesis is Darwinian evolution plus genetics: the mechanism of heredity unveiled. Consequences:
- life is much more an information process than previously thought: a deep connection with computers: symbolic processing
- the provenance of life is much clearer: the genetic clock
- there are more parallels between digital preservation and the evolution of life
core – paradoxes and lessons
1. What survives is life as an ecosystem, not individuals, not even individual species. In digital preservation: we should focus on preserving the ecosystem in which information thrives, not only on the individual datasets.
2. Trying to keep data exactly as it is (media conservation, bitstream preservation) prevents it from evolving: it will die out. Preservation involves active transformation, on all levels. So what is preserved, you might wonder?
3. In biological evolution, long-term survival is largely unintentional and a matter of luck. The better chances, however, are where species are versatile and evolving. In digital preservation, there are other activities that might contribute more to digital preservation than digital preservation proper: adding value to data by using it and commenting on it; combining data; making data interoperable.
4. Long term is sustainable short term. Every surviving species consists of individuals that overcome the daily dangers of the environment. A species can survive a long time if it accumulates techniques that can be used in other environments as well, or if it can evolve rapidly. In digital preservation, it is not handy to focus on the indeterminate future, not even on time-spans such as the next 20 years or so. We should preserve the data now, in its most usable form, optimised for access and reuse, optimised for further transformation and interoperability. Then in the future it will take less effort to preserve the data for the next round of change.
intro (4): the Wikipedia article gives a nice overview. To be highlighted: a warning against reductionism and too-shallow biological models.
core – ICT: the spectrum of change in ICT; reflect on what has become obsolete.
core – preservation: how to preserve in a fast-changing world.
- emulation: the preservation of dynamics, performance
- migration: the preservation of semantics, meaning, information
- sustainability factors: make the unit of preservation self-contained, self-explanatory, self-everything
core – parallels: reproduction and metabolism. What evolved first: the capability to reproduce or to metabolise?
In biology: probably metabolic reactions developed first, in confined spaces, taking foodstuffs from the environment and secreting waste in turn. There was not yet a machinery for self-preservation or copying; that developed along the way.
In digital preservation: you cannot wait with re-using until you have policies for preservation in place. The practice of using and re-using information precedes the preservation of information. Paradox! In the short term you can only (re)use information that you have at hand (preserved); in the long run you will lose information that you do not use, and you will preserve that which has been used a lot.
core – parallels: a good gene will not mutate in a population. If it does, it loses function, the individual will not procreate, and the mutation is lost. But sometimes genes get duplicated in the genome, and one of the copies is free to evolve.
In digital preservation: you can make a repository with ur-authentic copies, complying with the sustainability factors. But if nothing else is done, they are not optimised for access. If you make copies for access, the changing needs of the users of the data will tend to change / migrate / transform the data. Over time, the new forms might become more important than the old authentic forms in the repository.
core – parallels: sexual selection occurs when a species evolves to optimise reproductive capacity at the expense of survival capacity for the individual. Males or females develop exaggerated characteristics on which they select each other. These features can be quite costly: antlers, peacock feathers. The features represent hard-to-fake interesting properties.
Application in digital preservation: certificates, Data Seals of Approval, assessment procedures, meant as a sign that interoperability is safe. Quite costly; it might get out of hand with respect to the original goal.
paradigm: biology versus technology; synthesis versus analysis; very different production processes.
core – a model of sustainable information: if you combine elements from Google, BitTorrent, DuraCloud, LOCKSS, and persistent identifiers, you can in fact start realising this.