SlideShare a Scribd company logo
1 of 48
Download to read offline
La préservation des logiciels: défis et opportunités
pour la reproductibilité en science et technologie
Roberto Di Cosmo
roberto@dicosmo.org
3 Décembre 2015
Roberto Di Cosmo Software Preservation 3 Décembre 2015 1 / 35
Outline
1 The scientific method
2 Reproducibility
3 The state of Software reproducibility
4 Software is Fragile
5 Software Heritage
6 Bits from the drawing board
7 Call to action
8 Questions
Roberto Di Cosmo Software Preservation 3 Décembre 2015 2 / 35
How we built our scientific knowledge
The experimental method
make an observation
formulate an hypothesis
set up an experiment
formulate a theory
And then we reproduce and verify.
Reproducibility is the key
non-reproducible single occurrences are of no
significance to science
Karl Popper, The Logic of Scientific Discovery, 1934
Roberto Di Cosmo Software Preservation 3 Décembre 2015 3 / 35
Reproducibility, today
Reproducibility (Wikipedia)
the ability of an entire experiment or study to be reproduced, either
by the researcher or by someone else working independently.
It is one of the main principles of the scientific method.
Why we want it
foundation of the scientific method
accelerator of research: allows to build upon previous work
visibility: reproducible results are cited more often
transparency of results eases acceptance
necessary for industrial transfer
reproducibility is the essence of industry!
Roberto Di Cosmo Software Preservation 3 Décembre 2015 4 / 35
Outline
1 The scientific method
2 Reproducibility
3 The state of Software reproducibility
4 Software is Fragile
5 Software Heritage
6 Bits from the drawing board
7 Call to action
8 Questions
Roberto Di Cosmo Software Preservation 3 Décembre 2015 5 / 35
Reproducibility in the digital age
For an experiment involving software, we need
open access to the scientific article describing it
open data sets used in the experiment
source code of all the components
environment of execution
stable references between all this
Remark
The first two items are already widely discussed!
Roberto Di Cosmo Software Preservation 3 Décembre 2015 6 / 35
Software is Knowledge
Software is an essential component of modern scientific research
[...] the vast majority describe exper-
imental methods or sofware that have
become essential in their fields.
Top 100 papers (Nature,
October 2014) http://www.nature.com/
news/the-top-100-papers-1.16224
Roberto Di Cosmo Software Preservation 3 Décembre 2015 7 / 35
Use the Source, Luke!
Some people claim that having (all) the source of the code used in
an experiment is not worth the effort (see “Replicability is not
Reproducibility: Nor is it Good Science”, Chris Drummond, ICML
2009)
Roberto Di Cosmo Software Preservation 3 Décembre 2015 8 / 35
Use the Source, Luke!
Some people claim that having (all) the source of the code used in
an experiment is not worth the effort (see “Replicability is not
Reproducibility: Nor is it Good Science”, Chris Drummond, ICML
2009)
Sure, diversity is important, but:
Source code is like the proof used in a theorem: can we really
accept Fermat statements like “the details are omitted due to
lack of space”?
modern complex systems makes even the simplest experiment
depend on a wealth of components and configuration options
access to all the source code is not just necessary to reproduce,
it is also useful to evolve and modify, to build new experiments
from the old ones
Roberto Di Cosmo Software Preservation 3 Décembre 2015 8 / 35
Outline
1 The scientific method
2 Reproducibility
3 The state of Software reproducibility
4 Software is Fragile
5 Software Heritage
6 Bits from the drawing board
7 Call to action
8 Questions
Roberto Di Cosmo Software Preservation 3 Décembre 2015 9 / 35
Software and reproducibility
A fundamental question
How are we doing, regarding reproducibility, in Software?
The case of Computer Systems Research
A field with Computer experts ... we have high expectations!
Christian Collberg set out to check them.
Measuring Reproducibility in Computer Systems Research
Long and detailed technical report, March 2014
http:
//reproducibility.cs.arizona.edu/v1/tr.pdf
Roberto Di Cosmo Software Preservation 3 Décembre 2015 10 / 35
Collberg’s report from the trenches
Analysis of 613 papers
8 ACM conferences:
ASPLOS’12, CCS’12,
OOPSLA’12, OSDI’12,
PLDI’12, SIGMOD’12,
SOSP’11, VLDB’12
5 journals: TACO’9,
TISSEC’15, TOCS’30,
TODS’37, TOPLAS’34
all very practical oriented
The basic question
can we get the code to build
and run?
The workflow
Roberto Di Cosmo Software Preservation 3 Décembre 2015 11 / 35
The result
Even if these numbers can be debated ...
... that’s a whopping 81% of non reproducible works!
Roberto Di Cosmo Software Preservation 3 Décembre 2015 12 / 35
The reasons (or, “the dog ate my program”)
Many issues, nice anectotes, and it finally boils down to
Availability
Traceability
Environment
Automation (do you use continuous integration?)
Documentation
Understanding ( including Open Source)
Roberto Di Cosmo Software Preservation 3 Décembre 2015 13 / 35
The reasons (or, “the dog ate my program”)
Many issues, nice anectotes, and it finally boils down to
Availability
Traceability
Environment
Automation (do you use continuous integration?)
Documentation
Understanding ( including Open Source)
The first two are important software preservation issues
Yes, code is fragile:
it can be destroyed, and we can loose trace of it
Roberto Di Cosmo Software Preservation 3 Décembre 2015 13 / 35
Outline
1 The scientific method
2 Reproducibility
3 The state of Software reproducibility
4 Software is Fragile
5 Software Heritage
6 Bits from the drawing board
7 Call to action
8 Questions
Roberto Di Cosmo Software Preservation 3 Décembre 2015 14 / 35
Like all digital information, software is fragile
An example is worth a thousand words...
let’s see a few
Roberto Di Cosmo Software Preservation 3 Décembre 2015 15 / 35
Inconsiderate or malicious loss of code
The Year 2000 Bug ... uncovered an inconvenient truth
in 1999, an estimated 40% of companies had either
lost, or thrown away the original source code for their
systems!
CodeSpaces: source code hosting, 2007-2014
Yes, for seven years all seemed ok.
Roberto Di Cosmo Software Preservation 3 Décembre 2015 16 / 35
Inconsiderate or malicious loss of code
The Year 2000 Bug ... uncovered an inconvenient truth
in 1999, an estimated 40% of companies had either
lost, or thrown away the original source code for their
systems!
CodeSpaces: source code hosting, 2007-2014
Yes, for seven years all seemed ok.
No, they did not recover the data.
Roberto Di Cosmo Software Preservation 3 Décembre 2015 16 / 35
Business-driven loss of code support: Google
Roberto Di Cosmo Software Preservation 3 Décembre 2015 17 / 35
Business-driven loss of code support: Google, cont’d.
http://google-opensource.blogspot.fr/2015/Roberto Di Cosmo Software Preservation 3 Décembre 2015 18 / 35
Business-driven loss of code support: Gitorious
From: Rolf Bjaanes <rolf@gitorious.org>
To: zack@upsilon.cc
Subject: Gitorious.org is dead, long live
Gitorious.org
Message-Id: <30589491.20150416155909.552fdc4d1647
Hi zacchiro,
I’m Rolf Bjaanes, CEO of Gitorious, and you
are receiving this email because you have
a user on gitorious.org. As you may know,
Gitorious was acquired by GitLab [1] about a
month ago (NDLR: 3/3/2015), and we announced
that Gitorious.org would be shutting down at
the end of May, 2015.
... Rolf
Roberto Di Cosmo Software Preservation 3 Décembre 2015 19 / 35
Disruption of the web of reference
Web links are not permanent (even permalinks)}
there is no general guarantee that a URL... which at
one time points to a given object continues to do so
T. Berners-Lee et al. Uniform Resource Locators. RFC
1738.
Roberto Di Cosmo Software Preservation 3 Décembre 2015 20 / 35
Disruption of the web of reference
Web links are not permanent (even permalinks)}
there is no general guarantee that a URL... which at
one time points to a given object continues to do so
T. Berners-Lee et al. Uniform Resource Locators. RFC
1738.
URLs used in articles decay!
Analysis of IEEE Computer (Computer), and the Communications of
the ACM (CACM): 1995-1999
the half-life of a referenced URL is approximately 4 years from
its publication date D. Spinellis. The Decay and Failures of URL
References.
Communications of the ACM, 46(1):71-77, January 2003.
Roberto Di Cosmo Software Preservation 3 Décembre 2015 20 / 35
Disruption of the web of reference: our Gforge
Roberto Di Cosmo Software Preservation 3 Décembre 2015 21 / 35
Disruption of the web of reference: our Gforge
Fixed, adding a redirection, by the Gforge team
in 1 day this one was fixed!
Roberto Di Cosmo Software Preservation 3 Décembre 2015 21 / 35
Disruption of the web of reference: our Gforge
Fixed, adding a redirection, by the Gforge team
in 1 day this one was fixed!
Not always that lucky, though ...
Roberto Di Cosmo Software Preservation 3 Décembre 2015 21 / 35
Outline
1 The scientific method
2 Reproducibility
3 The state of Software reproducibility
4 Software is Fragile
5 Software Heritage
6 Bits from the drawing board
7 Call to action
8 Questions
Roberto Di Cosmo Software Preservation 3 Décembre 2015 22 / 35
The Software Heritage Project
Our mission
Collect, organise, preserve and share all the software that lies at the
heart of our culture and our society.
Roberto Di Cosmo Software Preservation 3 Décembre 2015 23 / 35
Our mission in our logo
Collect
We collect software, all of it.
Preserve
We preserve software, because it embodies our technical
and scientific knowledge.
Share
We will index, organise, and make widely accessible all
the Software we collect
Roberto Di Cosmo Software Preservation 3 Décembre 2015 24 / 35
The Knowledge Conservancy Magic Triangle
The Knowledge Conservancy Magic Triangle
Roberto Di Cosmo Software Preservation 3 Décembre 2015 25 / 35
The Knowledge Conservancy Magic Triangle
The Knowledge Conservancy Magic Triangle
Legenda (links are important!)
articles: ArXiv, HAL, ...
data: Zenodo, ...
software: Software Heritage to the rescue
Roberto Di Cosmo Software Preservation 3 Décembre 2015 25 / 35
Outline
1 The scientific method
2 Reproducibility
3 The state of Software reproducibility
4 Software is Fragile
5 Software Heritage
6 Bits from the drawing board
7 Call to action
8 Questions
Roberto Di Cosmo Software Preservation 3 Décembre 2015 26 / 35
Free an Open Source Software is crucial
D. Rosenthal, EUDAT, 9/2014
you have to do [digital preservation] with open-source
software; closed-source preservation has the same fatal "just
trust me" aspect that closed-source encryption (and cloud
storage) suffer from.
Roberto Di Cosmo Software Preservation 3 Décembre 2015 27 / 35
Free an Open Source Software is crucial
D. Rosenthal, EUDAT, 9/2014
you have to do [digital preservation] with open-source
software; closed-source preservation has the same fatal "just
trust me" aspect that closed-source encryption (and cloud
storage) suffer from.
recommendation
our preferred platform(s) should:
provide full details on their architecture
make available all the source code used
use open standards
encourage a collaborative development process
Roberto Di Cosmo Software Preservation 3 Décembre 2015 27 / 35
Web links are not permanent (even permalinks)
T. Berners-Lee et al. Uniform Resource Locators. RFC 1738.
Users should beware that there is no general guarantee that
a URL which at one time points to a given object continues
to do so, and does not even at some later time point to a
different object due to the movement of objects on servers.
Roberto Di Cosmo Software Preservation 3 Décembre 2015 28 / 35
Web links are not permanent (even permalinks)
T. Berners-Lee et al. Uniform Resource Locators. RFC 1738.
Users should beware that there is no general guarantee that
a URL which at one time points to a given object continues
to do so, and does not even at some later time point to a
different object due to the movement of objects on servers.
The Decay and Failures of URL References
half life of web references is 4 years
Diomidis Spinellis, CACM 2003
Roberto Di Cosmo Software Preservation 3 Décembre 2015 28 / 35
Web links are not permanent (even permalinks)
T. Berners-Lee et al. Uniform Resource Locators. RFC 1738.
Users should beware that there is no general guarantee that
a URL which at one time points to a given object continues
to do so, and does not even at some later time point to a
different object due to the movement of objects on servers.
The Decay and Failures of URL References
half life of web references is 4 years
Diomidis Spinellis, CACM 2003
recommendation
our preferred platform(s) should:
provide intrinsic resource identifiers
avoid volatile identifiers like DOI or URLs
Roberto Di Cosmo Software Preservation 3 Décembre 2015 28 / 35
Replication is the key
Thomas Jefferson, February 18, 1791
...let us save what remains: not by vaults and locks which
fence them from the public eye and use in consigning them
to the waste of time, but by such a multiplication of copies,
as shall place them beyond the reach of accident.
Roberto Di Cosmo Software Preservation 3 Décembre 2015 29 / 35
Replication is the key
Thomas Jefferson, February 18, 1791
...let us save what remains: not by vaults and locks which
fence them from the public eye and use in consigning them
to the waste of time, but by such a multiplication of copies,
as shall place them beyond the reach of accident.
recommendation
our preferred platform(s) should:
provide easy means for making copies
encourage the growth of a mirror network (like ArXiv did)
Roberto Di Cosmo Software Preservation 3 Décembre 2015 29 / 35
Caring for the long term
not just a project
projects have limited time-frame
not commercial
business interests come and go
a shared concern
cultural heritage
scientific infrastructure
industrial infrastructure
Unix philosophy
do one thing, do it well
Roberto Di Cosmo Software Preservation 3 Décembre 2015 30 / 35
Outline
1 The scientific method
2 Reproducibility
3 The state of Software reproducibility
4 Software is Fragile
5 Software Heritage
6 Bits from the drawing board
7 Call to action
8 Questions
Roberto Di Cosmo Software Preservation 3 Décembre 2015 31 / 35
Need your help
make it easy to integrate
in development workflow
in publishing workflow
Roberto Di Cosmo Software Preservation 3 Décembre 2015 32 / 35
Need your help
make it easy to integrate
in development workflow
in publishing workflow
make it ok to integrate, from the legal point of view
make licences explicit
make licences of dependencies explicit
Roberto Di Cosmo Software Preservation 3 Décembre 2015 32 / 35
Need your help
make it easy to integrate
in development workflow
in publishing workflow
make it ok to integrate, from the legal point of view
make licences explicit
make licences of dependencies explicit
make it sustainable
support/sponsorship
open process
collaboration
Roberto Di Cosmo Software Preservation 3 Décembre 2015 32 / 35
Focus on the legal issues
a plurality of concerns
Who owns the rights to your research?
articles, data, software
too often forgotten: metadata
Software Track in Science of Computer Programming, 2015
You own the software, but who owns the metadata?
we need to recover our rigths
it is possible!
compulsory exclusive copyright transfer for free
is illegal in France (art L. 131-4 of CPI)
is debatable in all jurisdictions
see Free Scientific Publication
paying the editors (OpenAire) is not a solution
Roberto Di Cosmo Software Preservation 3 Décembre 2015 33 / 35
Outline
1 The scientific method
2 Reproducibility
3 The state of Software reproducibility
4 Software is Fragile
5 Software Heritage
6 Bits from the drawing board
7 Call to action
8 Questions
Roberto Di Cosmo Software Preservation 3 Décembre 2015 34 / 35
Questions
Questions?
Pour rester en contact
mailing list: swh-science@inria.fr
https:
//sympa.inria.fr/sympa/info/swh-science
Roberto Di Cosmo Software Preservation 3 Décembre 2015 35 / 35

More Related Content

What's hot

Automated Detection of Software Bugs and Vulnerabilities in Linux
Automated Detection of Software Bugs and Vulnerabilities in LinuxAutomated Detection of Software Bugs and Vulnerabilities in Linux
Automated Detection of Software Bugs and Vulnerabilities in LinuxSilvio Cesare
 
Software Development Graveyard
Software Development GraveyardSoftware Development Graveyard
Software Development GraveyardErika Barron
 
Variability, Bugs, and Cognition
Variability, Bugs, and CognitionVariability, Bugs, and Cognition
Variability, Bugs, and CognitionAndrzej Wasowski
 
Software Bertillonage: Finding the Provenance of an Entity
Software Bertillonage: Finding the Provenance of an EntitySoftware Bertillonage: Finding the Provenance of an Entity
Software Bertillonage: Finding the Provenance of an Entitymigod
 
Clean Code - Design Patterns and Best Practices for Bay.NET SF User Group (01...
Clean Code - Design Patterns and Best Practices for Bay.NET SF User Group (01...Clean Code - Design Patterns and Best Practices for Bay.NET SF User Group (01...
Clean Code - Design Patterns and Best Practices for Bay.NET SF User Group (01...Theo Jungeblut
 
Clean Code Part III - Craftsmanship at SoCal Code Camp
Clean Code Part III - Craftsmanship at SoCal Code CampClean Code Part III - Craftsmanship at SoCal Code Camp
Clean Code Part III - Craftsmanship at SoCal Code CampTheo Jungeblut
 
Clean Code Part I - Design Patterns at SoCal Code Camp
Clean Code Part I - Design Patterns at SoCal Code CampClean Code Part I - Design Patterns at SoCal Code Camp
Clean Code Part I - Design Patterns at SoCal Code CampTheo Jungeblut
 
DevOps Days Kyiv 2019 -- continuous Infrafirstructure First //Kris buytaert
DevOps Days Kyiv 2019 -- continuous Infrafirstructure First //Kris buytaertDevOps Days Kyiv 2019 -- continuous Infrafirstructure First //Kris buytaert
DevOps Days Kyiv 2019 -- continuous Infrafirstructure First //Kris buytaertMykola Marzhan
 
What is Software Engineering Research Good For?
What is Software Engineering Research Good For?What is Software Engineering Research Good For?
What is Software Engineering Research Good For?Andrzej Wasowski
 
Clean Code for East Bay .NET User Group
Clean Code for East Bay .NET User GroupClean Code for East Bay .NET User Group
Clean Code for East Bay .NET User GroupTheo Jungeblut
 

What's hot (11)

Automated Detection of Software Bugs and Vulnerabilities in Linux
Automated Detection of Software Bugs and Vulnerabilities in LinuxAutomated Detection of Software Bugs and Vulnerabilities in Linux
Automated Detection of Software Bugs and Vulnerabilities in Linux
 
Clean code
Clean codeClean code
Clean code
 
Software Development Graveyard
Software Development GraveyardSoftware Development Graveyard
Software Development Graveyard
 
Variability, Bugs, and Cognition
Variability, Bugs, and CognitionVariability, Bugs, and Cognition
Variability, Bugs, and Cognition
 
Software Bertillonage: Finding the Provenance of an Entity
Software Bertillonage: Finding the Provenance of an EntitySoftware Bertillonage: Finding the Provenance of an Entity
Software Bertillonage: Finding the Provenance of an Entity
 
Clean Code - Design Patterns and Best Practices for Bay.NET SF User Group (01...
Clean Code - Design Patterns and Best Practices for Bay.NET SF User Group (01...Clean Code - Design Patterns and Best Practices for Bay.NET SF User Group (01...
Clean Code - Design Patterns and Best Practices for Bay.NET SF User Group (01...
 
Clean Code Part III - Craftsmanship at SoCal Code Camp
Clean Code Part III - Craftsmanship at SoCal Code CampClean Code Part III - Craftsmanship at SoCal Code Camp
Clean Code Part III - Craftsmanship at SoCal Code Camp
 
Clean Code Part I - Design Patterns at SoCal Code Camp
Clean Code Part I - Design Patterns at SoCal Code CampClean Code Part I - Design Patterns at SoCal Code Camp
Clean Code Part I - Design Patterns at SoCal Code Camp
 
DevOps Days Kyiv 2019 -- continuous Infrafirstructure First //Kris buytaert
DevOps Days Kyiv 2019 -- continuous Infrafirstructure First //Kris buytaertDevOps Days Kyiv 2019 -- continuous Infrafirstructure First //Kris buytaert
DevOps Days Kyiv 2019 -- continuous Infrafirstructure First //Kris buytaert
 
What is Software Engineering Research Good For?
What is Software Engineering Research Good For?What is Software Engineering Research Good For?
What is Software Engineering Research Good For?
 
Clean Code for East Bay .NET User Group
Clean Code for East Bay .NET User GroupClean Code for East Bay .NET User Group
Clean Code for East Bay .NET User Group
 

Viewers also liked

Software Metadata: Describing "dark software" in GeoSciences
Software Metadata: Describing "dark software" in GeoSciencesSoftware Metadata: Describing "dark software" in GeoSciences
Software Metadata: Describing "dark software" in GeoSciencesdgarijo
 
Etude d’un répartiteur générale téléphonique
Etude d’un répartiteur générale téléphoniqueEtude d’un répartiteur générale téléphonique
Etude d’un répartiteur générale téléphoniqueAchref Ben helel
 
Financement des sociétés innovantes - Le Point du LIEGE science park - 30 sep...
Financement des sociétés innovantes - Le Point du LIEGE science park - 30 sep...Financement des sociétés innovantes - Le Point du LIEGE science park - 30 sep...
Financement des sociétés innovantes - Le Point du LIEGE science park - 30 sep...Interface ULg, LIEGE science park
 
Science et technologie :une rencontre des esprits, general articles bp holdin...
Science et technologie :une rencontre des esprits, general articles bp holdin...Science et technologie :une rencontre des esprits, general articles bp holdin...
Science et technologie :une rencontre des esprits, general articles bp holdin...Wilhemina Neumann
 
De la technique à la technologie
De la technique à la technologieDe la technique à la technologie
De la technique à la technologiezaoyuma
 
Fondation la main à la pâte
Fondation la main à la pâteFondation la main à la pâte
Fondation la main à la pâteEduSkills OECD
 

Viewers also liked (7)

Software Metadata: Describing "dark software" in GeoSciences
Software Metadata: Describing "dark software" in GeoSciencesSoftware Metadata: Describing "dark software" in GeoSciences
Software Metadata: Describing "dark software" in GeoSciences
 
Etude d’un répartiteur générale téléphonique
Etude d’un répartiteur générale téléphoniqueEtude d’un répartiteur générale téléphonique
Etude d’un répartiteur générale téléphonique
 
Financement des sociétés innovantes - Le Point du LIEGE science park - 30 sep...
Financement des sociétés innovantes - Le Point du LIEGE science park - 30 sep...Financement des sociétés innovantes - Le Point du LIEGE science park - 30 sep...
Financement des sociétés innovantes - Le Point du LIEGE science park - 30 sep...
 
Science et technologie :une rencontre des esprits, general articles bp holdin...
Science et technologie :une rencontre des esprits, general articles bp holdin...Science et technologie :une rencontre des esprits, general articles bp holdin...
Science et technologie :une rencontre des esprits, general articles bp holdin...
 
De la technique à la technologie
De la technique à la technologieDe la technique à la technologie
De la technique à la technologie
 
Fondation la main à la pâte
Fondation la main à la pâteFondation la main à la pâte
Fondation la main à la pâte
 
Building an Online Profile Using Social Networking and Amplification Tools fo...
Building an Online Profile Using Social Networking and Amplification Tools fo...Building an Online Profile Using Social Networking and Amplification Tools fo...
Building an Online Profile Using Social Networking and Amplification Tools fo...
 

Similar to La préservation des logiciels: défis et opportunités pour la reproductibilité en science et technologie

Software Preservation: challenges and opportunities for reproductibility (Sci...
Software Preservation: challenges and opportunities for reproductibility (Sci...Software Preservation: challenges and opportunities for reproductibility (Sci...
Software Preservation: challenges and opportunities for reproductibility (Sci...Roberto Di Cosmo
 
ScilabTEC 2015 - Irill
ScilabTEC 2015 - IrillScilabTEC 2015 - Irill
ScilabTEC 2015 - IrillScilab
 
Observability für alle
Observability für alleObservability für alle
Observability für alleQAware GmbH
 
The genesis of clusterlib - An open source library to tame your favourite sup...
The genesis of clusterlib - An open source library to tame your favourite sup...The genesis of clusterlib - An open source library to tame your favourite sup...
The genesis of clusterlib - An open source library to tame your favourite sup...Arnaud Joly
 
Will Postgres Live Forever?
Will Postgres Live Forever?Will Postgres Live Forever?
Will Postgres Live Forever?EDB
 
Software Carpentry and the Hydrological Sciences @ AGU 2013
Software Carpentry and the Hydrological Sciences @ AGU 2013Software Carpentry and the Hydrological Sciences @ AGU 2013
Software Carpentry and the Hydrological Sciences @ AGU 2013Aron Ahmadia
 
ASFWS 2013 - Cryptocat: récents défis en faisant la cryptographie plus facile...
ASFWS 2013 - Cryptocat: récents défis en faisant la cryptographie plus facile...ASFWS 2013 - Cryptocat: récents défis en faisant la cryptographie plus facile...
ASFWS 2013 - Cryptocat: récents défis en faisant la cryptographie plus facile...Cyber Security Alliance
 
Hacktoberfest 2020 - Open source for beginners
Hacktoberfest 2020 - Open source for beginnersHacktoberfest 2020 - Open source for beginners
Hacktoberfest 2020 - Open source for beginnersDeepikaRana30
 
What does "monitoring" mean? (FOSDEM 2017)
What does "monitoring" mean? (FOSDEM 2017)What does "monitoring" mean? (FOSDEM 2017)
What does "monitoring" mean? (FOSDEM 2017)Brian Brazil
 
#OSSPARIS17 - Logiciel libre pour une science reproductible, par ROBERTO DI C...
#OSSPARIS17 - Logiciel libre pour une science reproductible, par ROBERTO DI C...#OSSPARIS17 - Logiciel libre pour une science reproductible, par ROBERTO DI C...
#OSSPARIS17 - Logiciel libre pour une science reproductible, par ROBERTO DI C...Paris Open Source Summit
 
Software Heritage: Archiving the Free Software Commons for Fun & Profit
Software Heritage: Archiving the Free Software Commons for Fun & ProfitSoftware Heritage: Archiving the Free Software Commons for Fun & Profit
Software Heritage: Archiving the Free Software Commons for Fun & ProfitSpeck&Tech
 
PyConPL 2017 - with python: security
PyConPL 2017 - with python: securityPyConPL 2017 - with python: security
PyConPL 2017 - with python: securityPiotr Dyba
 
A look inside the European Covid Green Certificate (Codemotion 2021)
A look inside the European Covid Green Certificate (Codemotion 2021)A look inside the European Covid Green Certificate (Codemotion 2021)
A look inside the European Covid Green Certificate (Codemotion 2021)Luciano Mammino
 
Software Carpentry for the Geophysical Sciences
Software Carpentry for the Geophysical SciencesSoftware Carpentry for the Geophysical Sciences
Software Carpentry for the Geophysical SciencesAron Ahmadia
 
Enhancing Developer Productivity with Code Forensics
Enhancing Developer Productivity with Code ForensicsEnhancing Developer Productivity with Code Forensics
Enhancing Developer Productivity with Code ForensicsTechWell
 
Reveal the Security Risks in the software Development Lifecycle Meetup 060320...
Reveal the Security Risks in the software Development Lifecycle Meetup 060320...Reveal the Security Risks in the software Development Lifecycle Meetup 060320...
Reveal the Security Risks in the software Development Lifecycle Meetup 060320...lior mazor
 
Software Heritage, a revolutionary infrastructure for software source code, O...
Software Heritage, a revolutionary infrastructure for software source code, O...Software Heritage, a revolutionary infrastructure for software source code, O...
Software Heritage, a revolutionary infrastructure for software source code, O...OW2
 

Similar to La préservation des logiciels: défis et opportunités pour la reproductibilité en science et technologie (20)

Software Preservation: challenges and opportunities for reproductibility (Sci...
Software Preservation: challenges and opportunities for reproductibility (Sci...Software Preservation: challenges and opportunities for reproductibility (Sci...
Software Preservation: challenges and opportunities for reproductibility (Sci...
 
ScilabTEC 2015 - Irill
ScilabTEC 2015 - IrillScilabTEC 2015 - Irill
ScilabTEC 2015 - Irill
 
Observability für alle
Observability für alleObservability für alle
Observability für alle
 
The genesis of clusterlib - An open source library to tame your favourite sup...
The genesis of clusterlib - An open source library to tame your favourite sup...The genesis of clusterlib - An open source library to tame your favourite sup...
The genesis of clusterlib - An open source library to tame your favourite sup...
 
Will Postgres Live Forever?
Will Postgres Live Forever?Will Postgres Live Forever?
Will Postgres Live Forever?
 
Software Carpentry and the Hydrological Sciences @ AGU 2013
Software Carpentry and the Hydrological Sciences @ AGU 2013Software Carpentry and the Hydrological Sciences @ AGU 2013
Software Carpentry and the Hydrological Sciences @ AGU 2013
 
ASFWS 2013 - Cryptocat: récents défis en faisant la cryptographie plus facile...
ASFWS 2013 - Cryptocat: récents défis en faisant la cryptographie plus facile...ASFWS 2013 - Cryptocat: récents défis en faisant la cryptographie plus facile...
ASFWS 2013 - Cryptocat: récents défis en faisant la cryptographie plus facile...
 
Hacktoberfest 2020 - Open source for beginners
Hacktoberfest 2020 - Open source for beginnersHacktoberfest 2020 - Open source for beginners
Hacktoberfest 2020 - Open source for beginners
 
What does "monitoring" mean? (FOSDEM 2017)
What does "monitoring" mean? (FOSDEM 2017)What does "monitoring" mean? (FOSDEM 2017)
What does "monitoring" mean? (FOSDEM 2017)
 
#OSSPARIS17 - Logiciel libre pour une science reproductible, par ROBERTO DI C...
#OSSPARIS17 - Logiciel libre pour une science reproductible, par ROBERTO DI C...#OSSPARIS17 - Logiciel libre pour une science reproductible, par ROBERTO DI C...
#OSSPARIS17 - Logiciel libre pour une science reproductible, par ROBERTO DI C...
 
Raising the Bar
Raising the BarRaising the Bar
Raising the Bar
 
Software Heritage: Archiving the Free Software Commons for Fun & Profit
Software Heritage: Archiving the Free Software Commons for Fun & ProfitSoftware Heritage: Archiving the Free Software Commons for Fun & Profit
Software Heritage: Archiving the Free Software Commons for Fun & Profit
 
PyConPL 2017 - with python: security
PyConPL 2017 - with python: securityPyConPL 2017 - with python: security
PyConPL 2017 - with python: security
 
A look inside the European Covid Green Certificate (Codemotion 2021)
A look inside the European Covid Green Certificate (Codemotion 2021)A look inside the European Covid Green Certificate (Codemotion 2021)
A look inside the European Covid Green Certificate (Codemotion 2021)
 
Software Carpentry for the Geophysical Sciences
Software Carpentry for the Geophysical SciencesSoftware Carpentry for the Geophysical Sciences
Software Carpentry for the Geophysical Sciences
 
Enhancing Developer Productivity with Code Forensics
Enhancing Developer Productivity with Code ForensicsEnhancing Developer Productivity with Code Forensics
Enhancing Developer Productivity with Code Forensics
 
Reveal the Security Risks in the software Development Lifecycle Meetup 060320...
Reveal the Security Risks in the software Development Lifecycle Meetup 060320...Reveal the Security Risks in the software Development Lifecycle Meetup 060320...
Reveal the Security Risks in the software Development Lifecycle Meetup 060320...
 
Software Heritage, a revolutionary infrastructure for software source code, O...
Software Heritage, a revolutionary infrastructure for software source code, O...Software Heritage, a revolutionary infrastructure for software source code, O...
Software Heritage, a revolutionary infrastructure for software source code, O...
 
Preserve or preserve not
Preserve or preserve notPreserve or preserve not
Preserve or preserve not
 
Icpc16.ppt
Icpc16.pptIcpc16.ppt
Icpc16.ppt
 

Recently uploaded

Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...jana861314
 
Caco-2 cell permeability assay for drug absorption
Caco-2 cell permeability assay for drug absorptionCaco-2 cell permeability assay for drug absorption
Caco-2 cell permeability assay for drug absorptionPriyansha Singh
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡anilsa9823
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxSwapnil Therkar
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptMAESTRELLAMesa2
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PPRINCE C P
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfnehabiju2046
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxAArockiyaNisha
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhousejana861314
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Nistarini College, Purulia (W.B) India
 
Cultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxCultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxpradhanghanshyam7136
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfSumit Kumar yadav
 

Recently uploaded (20)

Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
Caco-2 cell permeability assay for drug absorption
Caco-2 cell permeability assay for drug absorptionCaco-2 cell permeability assay for drug absorption
Caco-2 cell permeability assay for drug absorption
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.ppt
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C P
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdf
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
Cultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxCultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptx
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 

La préservation des logiciels: défis et opportunités pour la reproductibilité en science et technologie

  • 1. La préservation des logiciels: défis et opportunités pour la reproductibilité en science et technologie Roberto Di Cosmo roberto@dicosmo.org 3 Décembre 2015 Roberto Di Cosmo Software Preservation 3 Décembre 2015 1 / 35
  • 2. Outline 1 The scientific method 2 Reproducibility 3 The state of Software reproducibility 4 Software is Fragile 5 Software Heritage 6 Bits from the drawing board 7 Call to action 8 Questions Roberto Di Cosmo Software Preservation 3 Décembre 2015 2 / 35
  • 3. How we built our scientific knowledge The experimental method make an observation formulate an hypothesis set up an experiment formulate a theory And then we reproduce and verify. Reproducibility is the key non-reproducible single occurrences are of no significance to science Karl Popper, The Logic of Scientific Discovery, 1934 Roberto Di Cosmo Software Preservation 3 Décembre 2015 3 / 35
  • 4. Reproducibility, today Reproducibility (Wikipedia) the ability of an entire experiment or study to be reproduced, either by the researcher or by someone else working independently. It is one of the main principles of the scientific method. Why we want it foundation of the scientific method accelerator of research: allows to build upon previous work visibility: reproducible results are cited more often transparency of results eases acceptance necessary for industrial transfer reproducibility is the essence of industry! Roberto Di Cosmo Software Preservation 3 Décembre 2015 4 / 35
  • 5. Outline 1 The scientific method 2 Reproducibility 3 The state of Software reproducibility 4 Software is Fragile 5 Software Heritage 6 Bits from the drawing board 7 Call to action 8 Questions Roberto Di Cosmo Software Preservation 3 Décembre 2015 5 / 35
  • 6. Reproducibility in the digital age For an experiment involving software, we need open access to the scientific article describing it open data sets used in the experiment source code of all the components environment of execution stable references between all this Remark The first two items are already widely discussed! Roberto Di Cosmo Software Preservation 3 Décembre 2015 6 / 35
  • 7. Software is Knowledge Software is an essential component of modern scientific research [...] the vast majority describe exper- imental methods or sofware that have become essential in their fields. Top 100 papers (Nature, October 2014) http://www.nature.com/ news/the-top-100-papers-1.16224 Roberto Di Cosmo Software Preservation 3 Décembre 2015 7 / 35
  • 8. Use the Source, Luke! Some people claim that having (all) the source of the code used in an experiment is not worth the effort (see “Replicability is not Reproducibility: Nor is it Good Science”, Chris Drummond, ICML 2009) Roberto Di Cosmo Software Preservation 3 Décembre 2015 8 / 35
  • 9. Use the Source, Luke! Some people claim that having (all) the source of the code used in an experiment is not worth the effort (see “Replicability is not Reproducibility: Nor is it Good Science”, Chris Drummond, ICML 2009) Sure, diversity is important, but: Source code is like the proof used in a theorem: can we really accept Fermat statements like “the details are omitted due to lack of space”? modern complex systems makes even the simplest experiment depend on a wealth of components and configuration options access to all the source code is not just necessary to reproduce, it is also useful to evolve and modify, to build new experiments from the old ones Roberto Di Cosmo Software Preservation 3 Décembre 2015 8 / 35
  • 10. Outline 1 The scientific method 2 Reproducibility 3 The state of Software reproducibility 4 Software is Fragile 5 Software Heritage 6 Bits from the drawing board 7 Call to action 8 Questions Roberto Di Cosmo Software Preservation 3 Décembre 2015 9 / 35
  • 11. Software and reproducibility A fundamental question How are we doing, regarding reproducibility, in Software? The case of Computer Systems Research A field with Computer experts ... we have high expectations! Christian Collberg set out to check them. Measuring Reproducibility in Computer Systems Research Long and detailed technical report, March 2014 http: //reproducibility.cs.arizona.edu/v1/tr.pdf Roberto Di Cosmo Software Preservation 3 Décembre 2015 10 / 35
  • 12. Collberg’s report from the trenches Analysis of 613 papers 8 ACM conferences: ASPLOS’12, CCS’12, OOPSLA’12, OSDI’12, PLDI’12, SIGMOD’12, SOSP’11, VLDB’12 5 journals: TACO’9, TISSEC’15, TOCS’30, TODS’37, TOPLAS’34 all very practical oriented The basic question can we get the code to build and run? The workflow Roberto Di Cosmo Software Preservation 3 Décembre 2015 11 / 35
  • 13. The result Even if these numbers can be debated ... ... that’s a whopping 81% of non reproducible works! Roberto Di Cosmo Software Preservation 3 Décembre 2015 12 / 35
  • 14. The reasons (or, “the dog ate my program”) Many issues, nice anectotes, and it finally boils down to Availability Traceability Environment Automation (do you use continuous integration?) Documentation Understanding ( including Open Source) Roberto Di Cosmo Software Preservation 3 Décembre 2015 13 / 35
  • 15. The reasons (or, “the dog ate my program”) Many issues, nice anectotes, and it finally boils down to Availability Traceability Environment Automation (do you use continuous integration?) Documentation Understanding ( including Open Source) The first two are important software preservation issues Yes, code is fragile: it can be destroyed, and we can loose trace of it Roberto Di Cosmo Software Preservation 3 Décembre 2015 13 / 35
  • 16. Outline 1 The scientific method 2 Reproducibility 3 The state of Software reproducibility 4 Software is Fragile 5 Software Heritage 6 Bits from the drawing board 7 Call to action 8 Questions Roberto Di Cosmo Software Preservation 3 Décembre 2015 14 / 35
  • 17. Like all digital information, software is fragile An example is worth a thousand words... let’s see a few Roberto Di Cosmo Software Preservation 3 Décembre 2015 15 / 35
  • 18. Inconsiderate or malicious loss of code The Year 2000 Bug ... uncovered an inconvenient truth in 1999, an estimated 40% of companies had either lost, or thrown away the original source code for their systems! CodeSpaces: source code hosting, 2007-2014 Yes, for seven years all seemed ok. Roberto Di Cosmo Software Preservation 3 Décembre 2015 16 / 35
  • 19. Inconsiderate or malicious loss of code The Year 2000 Bug ... uncovered an inconvenient truth in 1999, an estimated 40% of companies had either lost, or thrown away the original source code for their systems! CodeSpaces: source code hosting, 2007-2014 Yes, for seven years all seemed ok. No, they did not recover the data. Roberto Di Cosmo Software Preservation 3 Décembre 2015 16 / 35
  • 20. Business-driven loss of code support: Google Roberto Di Cosmo Software Preservation 3 Décembre 2015 17 / 35
  • 21. Business-driven loss of code support: Google, cont’d. http://google-opensource.blogspot.fr/2015/Roberto Di Cosmo Software Preservation 3 Décembre 2015 18 / 35
  • 22. Business-driven loss of code support: Gitorious From: Rolf Bjaanes <rolf@gitorious.org> To: zack@upsilon.cc Subject: Gitorious.org is dead, long live Gitorious.org Message-Id: <30589491.20150416155909.552fdc4d1647 Hi zacchiro, I’m Rolf Bjaanes, CEO of Gitorious, and you are receiving this email because you have a user on gitorious.org. As you may know, Gitorious was acquired by GitLab [1] about a month ago (NDLR: 3/3/2015), and we announced that Gitorious.org would be shutting down at the end of May, 2015. ... Rolf Roberto Di Cosmo Software Preservation 3 Décembre 2015 19 / 35
  • 23. Disruption of the web of reference Web links are not permanent (even permalinks)} there is no general guarantee that a URL... which at one time points to a given object continues to do so T. Berners-Lee et al. Uniform Resource Locators. RFC 1738. Roberto Di Cosmo Software Preservation 3 Décembre 2015 20 / 35
  • 24. Disruption of the web of reference Web links are not permanent (even permalinks)} there is no general guarantee that a URL... which at one time points to a given object continues to do so T. Berners-Lee et al. Uniform Resource Locators. RFC 1738. URLs used in articles decay! Analysis of IEEE Computer (Computer), and the Communications of the ACM (CACM): 1995-1999 the half-life of a referenced URL is approximately 4 years from its publication date D. Spinellis. The Decay and Failures of URL References. Communications of the ACM, 46(1):71-77, January 2003. Roberto Di Cosmo Software Preservation 3 Décembre 2015 20 / 35
  • 25. Disruption of the web of reference: our Gforge Roberto Di Cosmo Software Preservation 3 Décembre 2015 21 / 35
  • 26. Disruption of the web of reference: our Gforge Fixed, adding a redirection, by the Gforge team in 1 day this one was fixed! Roberto Di Cosmo Software Preservation 3 Décembre 2015 21 / 35
  • 27. Disruption of the web of reference: our Gforge Fixed, adding a redirection, by the Gforge team in 1 day this one was fixed! Not always that lucky, though ... Roberto Di Cosmo Software Preservation 3 Décembre 2015 21 / 35
  • 28. Outline 1 The scientific method 2 Reproducibility 3 The state of Software reproducibility 4 Software is Fragile 5 Software Heritage 6 Bits from the drawing board 7 Call to action 8 Questions Roberto Di Cosmo Software Preservation 3 Décembre 2015 22 / 35
  • 29. The Software Heritage Project Our mission Collect, organise, preserve and share all the software that lies at the heart of our culture and our society. Roberto Di Cosmo Software Preservation 3 Décembre 2015 23 / 35
  • 30. Our mission in our logo Collect We collect software, all of it. Preserve We preserve software, because it embodies our technical and scientific knowledge. Share We will index, organise, and make widely accessible all the Software we collect Roberto Di Cosmo Software Preservation 3 Décembre 2015 24 / 35
  • 31. The Knowledge Conservancy Magic Triangle The Knowledge Conservancy Magic Triangle Roberto Di Cosmo Software Preservation 3 Décembre 2015 25 / 35
  • 32. The Knowledge Conservancy Magic Triangle The Knowledge Conservancy Magic Triangle Legenda (links are important!) articles: ArXiv, HAL, ... data: Zenodo, ... software: Software Heritage to the rescue Roberto Di Cosmo Software Preservation 3 Décembre 2015 25 / 35
  • 33. Outline 1 The scientific method 2 Reproducibility 3 The state of Software reproducibility 4 Software is Fragile 5 Software Heritage 6 Bits from the drawing board 7 Call to action 8 Questions Roberto Di Cosmo Software Preservation 3 Décembre 2015 26 / 35
  • 34. Free an Open Source Software is crucial D. Rosenthal, EUDAT, 9/2014 you have to do [digital preservation] with open-source software; closed-source preservation has the same fatal "just trust me" aspect that closed-source encryption (and cloud storage) suffer from. Roberto Di Cosmo Software Preservation 3 Décembre 2015 27 / 35
  • 35. Free an Open Source Software is crucial D. Rosenthal, EUDAT, 9/2014 you have to do [digital preservation] with open-source software; closed-source preservation has the same fatal "just trust me" aspect that closed-source encryption (and cloud storage) suffer from. recommendation our preferred platform(s) should: provide full details on their architecture make available all the source code used use open standards encourage a collaborative development process Roberto Di Cosmo Software Preservation 3 Décembre 2015 27 / 35
  • 36. Web links are not permanent (even permalinks) T. Berners-Lee et al. Uniform Resource Locators. RFC 1738. Users should beware that there is no general guarantee that a URL which at one time points to a given object continues to do so, and does not even at some later time point to a different object due to the movement of objects on servers. Roberto Di Cosmo Software Preservation 3 Décembre 2015 28 / 35
  • 37. Web links are not permanent (even permalinks) T. Berners-Lee et al. Uniform Resource Locators. RFC 1738. Users should beware that there is no general guarantee that a URL which at one time points to a given object continues to do so, and does not even at some later time point to a different object due to the movement of objects on servers. The Decay and Failures of URL References half life of web references is 4 years Diomidis Spinellis, CACM 2003 Roberto Di Cosmo Software Preservation 3 Décembre 2015 28 / 35
  • 38. Web links are not permanent (even permalinks) T. Berners-Lee et al. Uniform Resource Locators. RFC 1738. Users should beware that there is no general guarantee that a URL which at one time points to a given object continues to do so, and does not even at some later time point to a different object due to the movement of objects on servers. The Decay and Failures of URL References half life of web references is 4 years Diomidis Spinellis, CACM 2003 recommendation our preferred platform(s) should: provide intrinsic resource identifiers avoid volatile identifiers like DOI or URLs Roberto Di Cosmo Software Preservation 3 Décembre 2015 28 / 35
  • 39. Replication is the key Thomas Jefferson, February 18, 1791 ...let us save what remains: not by vaults and locks which fence them from the public eye and use in consigning them to the waste of time, but by such a multiplication of copies, as shall place them beyond the reach of accident. Roberto Di Cosmo Software Preservation 3 Décembre 2015 29 / 35
  • 40. Replication is the key Thomas Jefferson, February 18, 1791 ...let us save what remains: not by vaults and locks which fence them from the public eye and use in consigning them to the waste of time, but by such a multiplication of copies, as shall place them beyond the reach of accident. recommendation our preferred platform(s) should: provide easy means for making copies encourage the growth of a mirror network (like ArXiv did) Roberto Di Cosmo Software Preservation 3 Décembre 2015 29 / 35
  • 41. Caring for the long term not just a project projects have limited time-frame not commercial business interests come and go a shared concern cultural heritage scientific infrastructure industrial infrastructure Unix philosophy do one thing, do it well Roberto Di Cosmo Software Preservation 3 Décembre 2015 30 / 35
  • 42. Outline 1 The scientific method 2 Reproducibility 3 The state of Software reproducibility 4 Software is Fragile 5 Software Heritage 6 Bits from the drawing board 7 Call to action 8 Questions Roberto Di Cosmo Software Preservation 3 Décembre 2015 31 / 35
  • 43. Need your help make it easy to integrate in development workflow in publishing workflow Roberto Di Cosmo Software Preservation 3 Décembre 2015 32 / 35
  • 44. Need your help make it easy to integrate in development workflow in publishing workflow make it ok to integrate, from the legal point of view make licences explicit make licences of dependencies explicit Roberto Di Cosmo Software Preservation 3 Décembre 2015 32 / 35
  • 45. Need your help make it easy to integrate in development workflow in publishing workflow make it ok to integrate, from the legal point of view make licences explicit make licences of dependencies explicit make it sustainable support/sponsorship open process collaboration Roberto Di Cosmo Software Preservation 3 Décembre 2015 32 / 35
  • 46. Focus on the legal issues a plurality of concerns Who owns the rights to your research? articles, data, software too often forgotten: metadata Software Track in Science of Computer Programming, 2015 You own the software, but who owns the metadata? we need to recover our rigths it is possible! compulsory exclusive copyright transfer for free is illegal in France (art L. 131-4 of CPI) is debatable in all jurisdictions see Free Scientific Publication paying the editors (OpenAire) is not a solution Roberto Di Cosmo Software Preservation 3 Décembre 2015 33 / 35
  • 47. Outline 1 The scientific method 2 Reproducibility 3 The state of Software reproducibility 4 Software is Fragile 5 Software Heritage 6 Bits from the drawing board 7 Call to action 8 Questions Roberto Di Cosmo Software Preservation 3 Décembre 2015 34 / 35
  • 48. Questions Questions? Pour rester en contact mailing list: swh-science@inria.fr https: //sympa.inria.fr/sympa/info/swh-science Roberto Di Cosmo Software Preservation 3 Décembre 2015 35 / 35