Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Software Heritage
Why and How We Preserve all of Mankind’s Source Code
Roberto Di Cosmo
roberto@dicosmo.org
June 5th, 2018...
Outline
1 Software all around us
2 The Software Heritage initiative
3 Building for the long term
4 Conclusion
Roberto Di C...
Software is everywhere
Software
Software embodies our collective Knowledge and Cultural Heritage
Roberto Di Cosmo www.dico...
Software Source Code is special
Harold Abelson, Structure and Interpretation of Computer Programs (1st ed.) 1985
“Programs...
~ 50 years, a lightning fast growth
Apollo 11 Guidance Computer (~60.000 lines), 1969
"When I first got into it, nobody kn...
Software is spread all around
Roberto Di Cosmo www.dicosmo.org www.softwareheritage.org June 5th, 2018 5 / 14
Software is fragile
Roberto Di Cosmo www.dicosmo.org www.softwareheritage.org June 5th, 2018 6 / 14
Software lacks its own research infrastructure
Photo: ALMA(ESO/NAOJ/NRAO), R. Hills
Roberto Di Cosmo www.dicosmo.org www.s...
Outline
1 Software all around us
2 The Software Heritage initiative
3 Building for the long term
4 Conclusion
Roberto Di C...
The Software Heritage Project www.softwareheritage.org
Our mission
Collect, preserve and share the source code of all the ...
A principled infrastructure http://bit.ly/swhpaper
Technology
transparency and FOSS
replicas all the way down
Content
intr...
Architecture (simplified)
Git
loader
Mercurial
loader
Debian source
package loader
tar loader
.
.
.
Software Heritage Arch...
Using the archive
Preview just for you: a "wayback machine" for archived code!
go to http://archive.softwareheritage.org/b...
Outline
1 Software all around us
2 The Software Heritage initiative
3 Building for the long term
4 Conclusion
Roberto Di C...
Growing Support
Landmark Inria Unesco agreement, April 3rd, 2017
Sharing the vision
Contributing to the mission
>= 100Ke/y...
You can help!
Coding, advice see forge.softwareheritage.org
Current development priorities ... all contributions equally w...
Outline
1 Software all around us
2 The Software Heritage initiative
3 Building for the long term
4 Conclusion
Roberto Di C...
Come in, we’re open!
www.softwareheritage.org @swheritage
Grand opening, June 7th, UNESCO heaquarters!
Library of Alexandr...
Upcoming SlideShare
Loading in …5
×

Osis18_Cloud : Software-heritage

37 views

Published on



Le logiciel est au cœur de notre société numérique et le code source des logiciels contient une part croissante de nos connaissances scientifiques, techniques et organisationnelles, au point d'être devenu désormais une partie intégrante du patrimoine de l'Humanité.

La mission de «Software Heritage» est de veiller à ce que cette précieuse masse des connaissances soit collectée, préservée, organisée et mise à la disposition de tous.

Construire une telle infrastructure pose des défis importants, à la fois techniques et stratégiques, et nous pouvons tous contribuer à les résoudre.

Published in: Business
  • Be the first to comment

  • Be the first to like this

Osis18_Cloud : Software-heritage

  1. 1. Software Heritage Why and How We Preserve all of Mankind’s Source Code Roberto Di Cosmo roberto@dicosmo.org June 5th, 2018 THE GREAT LIBRARY OF SOURCE CODE Roberto Di Cosmo www.dicosmo.org www.softwareheritage.org June 5th, 2018 1 / 14
  2. 2. Outline 1 Software all around us 2 The Software Heritage initiative 3 Building for the long term 4 Conclusion Roberto Di Cosmo www.dicosmo.org www.softwareheritage.org June 5th, 2018 2 / 14
  3. 3. Software is everywhere Software Software embodies our collective Knowledge and Cultural Heritage Roberto Di Cosmo www.dicosmo.org www.softwareheritage.org June 5th, 2018 2 / 14
  4. 4. Software Source Code is special Harold Abelson, Structure and Interpretation of Computer Programs (1st ed.) 1985 “Programs must be written for people to read, and only incidentally for machines to execute.” Quake 2 source code (excerpt) Net. queue in Linux (excerpt) Len Shustek, Computer History Museum “Source code provides a view into the mind of the designer.” Roberto Di Cosmo www.dicosmo.org www.softwareheritage.org June 5th, 2018 3 / 14
  5. 5. ~ 50 years, a lightning fast growth Apollo 11 Guidance Computer (~60.000 lines), 1969 "When I first got into it, nobody knew what it was that we were doing. It was like the Wild West." Margaret Hamilton Linux Kernel ... now in your pockets! are we taking care of all this? Roberto Di Cosmo www.dicosmo.org www.softwareheritage.org June 5th, 2018 4 / 14
  6. 6. Software is spread all around Roberto Di Cosmo www.dicosmo.org www.softwareheritage.org June 5th, 2018 5 / 14
  7. 7. Software is fragile Roberto Di Cosmo www.dicosmo.org www.softwareheritage.org June 5th, 2018 6 / 14
  8. 8. Software lacks its own research infrastructure Photo: ALMA(ESO/NAOJ/NRAO), R. Hills Roberto Di Cosmo www.dicosmo.org www.softwareheritage.org June 5th, 2018 7 / 14
  9. 9. Outline 1 Software all around us 2 The Software Heritage initiative 3 Building for the long term 4 Conclusion Roberto Di Cosmo www.dicosmo.org www.softwareheritage.org June 5th, 2018 8 / 14
  10. 10. The Software Heritage Project www.softwareheritage.org Our mission Collect, preserve and share the source code of all the software that is available Past, present and future Preserving the past, enhancing the present, preparing the future Roberto Di Cosmo www.dicosmo.org www.softwareheritage.org June 5th, 2018 8 / 14
  11. 11. A principled infrastructure http://bit.ly/swhpaper Technology transparency and FOSS replicas all the way down Content intrinsic identifiers facts and provenance Organization non-profit multi-stakeholder Roberto Di Cosmo www.dicosmo.org www.softwareheritage.org June 5th, 2018 9 / 14
  12. 12. Architecture (simplified) Git loader Mercurial loader Debian source package loader tar loader . . . Software Heritage Archive Merkle DAG + blob storage Loading & deduplication dsc dsc hg hg hg git git git git svn svn svn tar zip software origins Package repos Forges GitHub lister GitLab lister Debian lister PyPi lister . . . Distros ... Scheduling Listing (full/incremental) full development history permanently archived origins: GitHub (automated), Debian (automated), Gitorious, Google Code, GNU ~150Tb raw contents, ~10Tb graph (7+Bn nodes, 60+Bn edges) Roberto Di Cosmo www.dicosmo.org www.softwareheritage.org June 5th, 2018 10 / 14
  13. 13. Using the archive Preview just for you: a "wayback machine" for archived code! go to http://archive.softwareheritage.org/browse use the credentials: devoxx / 2018 No time for a demo, let’s highlight some features... Origin search Directory browsing Revisions as diffs Roberto Di Cosmo www.dicosmo.org www.softwareheritage.org June 5th, 2018 11 / 14
  14. 14. Outline 1 Software all around us 2 The Software Heritage initiative 3 Building for the long term 4 Conclusion Roberto Di Cosmo www.dicosmo.org www.softwareheritage.org June 5th, 2018 12 / 14
  15. 15. Growing Support Landmark Inria Unesco agreement, April 3rd, 2017 Sharing the vision Contributing to the mission >= 100Ke/year >= 50Ke/year >= 25Ke/year >= 10Ke/year Roberto Di Cosmo www.dicosmo.org www.softwareheritage.org June 5th, 2018 12 / 14
  16. 16. You can help! Coding, advice see forge.softwareheritage.org Current development priorities ... all contributions equally welcome! documentation listers/loaders for unsupported forges, VCS Web UI improvements Yes, we’ll be hiring, see www.softwareheritage.org/jobs Funding make your company a sponsor : sponsorship.softwareheritage.org give your own contribution : www.softwareheritage.org/donate Spread the word! follow and relay project news share the vision, tell others how to support the mission Roberto Di Cosmo www.dicosmo.org www.softwareheritage.org June 5th, 2018 13 / 14
  17. 17. Outline 1 Software all around us 2 The Software Heritage initiative 3 Building for the long term 4 Conclusion Roberto Di Cosmo www.dicosmo.org www.softwareheritage.org June 5th, 2018 14 / 14
  18. 18. Come in, we’re open! www.softwareheritage.org @swheritage Grand opening, June 7th, UNESCO heaquarters! Library of Alexandria of code recover the past structure the future A CERN for Software build better software for industry for society as a whole Roberto Di Cosmo www.dicosmo.org www.softwareheritage.org June 5th, 2018 14 / 14

×