Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Perl Dist::Surveyor 2011


Published on

Slides on my lightning talk at the London Perl Workshop, November 2011.

Published in: Technology, Art & Photos
  • Be the first to comment

Perl Dist::Surveyor 2011

  1. 1. Dist::Surveyor“what’s in that lib directory?” Tim Bunce - Nov 2011 Creative Commons BY-NC-SA 3.0
  2. 2. The ContextPerl 5.8 Applications CPAN Businessmodules modules
  3. 3. The Context• A large library of CPAN distributions - In a local::lib style dir .../cpan-5.008/{man,bin,lib}/ - Installed over many years - No external record of what has been installed - Almost 5000 modules - In production in many systems on many machines
  4. 4. The Itch• Want to upgrade from perl 5.8 - so need to clone our local library of CPAN modules - to .../cpan-5.012/{man,bin,lib}/ - with recompiled perl extensions• Want the exact set of distribution versions - so when testing “nothing but perl changed”
  5. 5. “What’s in that lib directory?”
  6. 6. Innocence and Hope• Vague memory of something called ‘packlists’• Vague memory of perllocal.pod install log• Vague memory of some work by brian d foy• Usual hope that someone’s already done this• “How hard can it be?”
  7. 7. /.packlist• Records only what files were installed• Doesn’t record the origin distribution• Useless for my needs
  8. 8.• Chris Williams’s• Matches installed modules to distributions• Only matches to the latest distributions• Looked like a good place to start• I hacked it to use perllocal.pod data and a bunch of heuristics.• It worked, mostly. Annoying edge cases.• Lots of hacks, heuristics, and blind luck.
  9. 9. perllocal.pod• Records a “name” and “version”• Name is the Makefile.PL NAME - can be the module or distribution name - or something else entirely• Version is the Makefile.PL VERSION - not always the version in the distribution filename• Incomplete! - Not written by Module::Build based distributions
  10. 10. BackPAN::Version::Discover• “Figure out exactly which dist versions you have installed”• Based on BackPAN::Index• Incomplete and “very alpha”• Matching logic not very robust• Just doesn’t work very well for us
  11. 11. DPAN• “start with an existing Perl distribution and work backward to the MiniCPAN that would re-install the same thing” - brian d foy• Indexes MD5 and other metadata for all BackPAN modules and scripts• Incomplete: doesn’t yet work out what distribution versions are installed.
  12. 12. GitPAN• Git repo for every distribution on CPAN• Includes all distro versions on BackPAN• Pondered using git hashes and the github API• But GitPAN isn’t being maintained
  13. 13.
  14. 14. MetaCPAN
  15. 15. MetaCPAN• Repository for CPAN metadata - ElasticSearch distributed database (Lucene) - RESTful API• CPAN and entire BackPAN fully indexed• Very detailed metadata• Full Of Awesome
  16. 16. MetaCPAN• Find all releases that contain a particular version of a module:curl -XPOST -d { "query": { "filtered":{ "query":{"match_all":{}}, "filter":{"and":[ {"term":{"":"DBI::Profile"}}, {"term":{"file.module.version":"2.014123"}} ]} }}, "fields":["release"]}
  17. 17.
  18. 18. The Method• Get installed module names, versions, file sizes• For every module: - find “candidate distributions” that included that module version, ideally also matching the file size.• For every candidate distribution: - get all modules and versions shipped in that distro - score each candidate by the proportion of its modules and versions which match what’s installed
  19. 19. An Example
  20. 20. Cloning From The List
  21. 21. Cloning From The List• Can’t simply feed results to cpanm - It’ll fetch the latest version of any prereqs• Tried to put the list in dependancy order• Tried to use MiniCPAN::Inject• Finally added a --makecpan dir option - Fetches distro tarballs and writes index - can be used as CPAN repo by cpanm
  22. 22. Typical UsageSurvey what distributions are installed in a library:$ --makecpan my_cpan /a/perl/lib/dir > installed_dists.txtInstall exactly those distributions in a new library:$ cpanm --mirror file:$PWD/my_cpan --mirror-only -l new_lib < installed_dists.txtBonus: re-tests all distros with current prereqs
  23. 23. Status• Currently a single script• Ought to be turned into a module• Looking for a maintainer
  24. 24. Interested?Tim.Bunce@pobox.com @timbunce on twitter