Perl Dist::Surveyor 2011
Slides on my lightning talk at the London Perl Workshop, November 2011.


Published in Technology, Art & Photos
  • 1. Dist::Surveyor: “what’s in that lib directory?”
      Tim Bunce - Nov 2011
      Creative Commons BY-NC-SA 3.0
  • 2. The Context
      [Diagram labels: Perl 5.8; Applications; CPAN modules; Business modules]
  • 3. The Context
      • A large library of CPAN distributions
          - In a local::lib style dir .../cpan-5.008/{man,bin,lib}/
          - Installed over many years
          - No external record of what has been installed
          - Almost 5000 modules
          - In production in many systems on many machines
  • 4. The Itch
      • Want to upgrade from perl 5.8
          - so need to clone our local library of CPAN modules
          - to .../cpan-5.012/{man,bin,lib}/
          - with recompiled perl extensions
      • Want the exact set of distribution versions
          - so when testing “nothing but perl changed”
  • 5. “What’s in that lib directory?”
  • 6. Innocence and Hope
      • Vague memory of something called ‘packlists’
      • Vague memory of perllocal.pod install log
      • Vague memory of some work by brian d foy
      • Usual hope that someone’s already done this
      • “How hard can it be?”
  • 7. /.packlist
      • Records only what files were installed
      • Doesn’t record the origin distribution
      • Useless for my needs
  • 8. what_dists.pl
      • Chris Williams’s github.com/bingos/throwaway
      • Matches installed modules to distributions
      • Only matches to the latest distributions
      • Looked like a good place to start
      • I hacked it to use perllocal.pod data and a bunch of heuristics.
      • It worked, mostly. Annoying edge cases.
      • Lots of hacks, heuristics, and blind luck.
  • 9. perllocal.pod
      • Records a “name” and “version”
      • Name is the Makefile.PL NAME
          - can be the module or distribution name
          - or something else entirely
      • Version is the Makefile.PL VERSION
          - not always the version in the distribution filename
      • Incomplete!
          - Not written by Module::Build based distributions
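To make the “name” and “version” fields concrete, here is a rough sketch of pulling them out of perllocal.pod text. It assumes the entry layout ExtUtils::MakeMaker conventionally appends (a `=head2` line naming the module, followed by `C<VERSION: ...>` items); the sample entry, the module `Foo::Bar`, and the function name `parse_perllocal` are invented for illustration and are not taken from the talk's script.

```python
import re

# Assumed layout of a MakeMaker-written entry; the sample data is made up.
SAMPLE = """\
=head2 Tue Nov  1 12:00:00 2011: C<Module> L<Foo::Bar|Foo::Bar>

=over 4

=item *

C<installed into: /a/perl/lib/dir>

=item *

C<LINKTYPE: dynamic>

=item *

C<VERSION: 1.02>

=back
"""

HEAD = re.compile(r"^=head2 .*C<Module> L<([^|>]+)\|")
VERSION = re.compile(r"^C<VERSION: (.*)>$")


def parse_perllocal(text):
    """Return (name, version) pairs, one per install entry found in text."""
    entries = []
    name = None
    for line in text.splitlines():
        head = HEAD.match(line)
        if head:
            name = head.group(1)  # the Makefile.PL NAME, not always a dist name
            continue
        ver = VERSION.match(line.strip())
        if ver and name is not None:
            entries.append((name, ver.group(1)))
            name = None
    return entries


entries = parse_perllocal(SAMPLE)
print(entries)
```

As the slide notes, the name may be a module name, a distribution name, or something else, so these pairs are hints at best and say nothing about Module::Build installs.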
  • 10. BackPAN::Version::Discover
      • “Figure out exactly which dist versions you have installed”
      • Based on BackPAN::Index
      • Incomplete and “very alpha”
      • Matching logic not very robust
      • Just doesn’t work very well for us
  • 11. DPAN
      • “start with an existing Perl distribution and work backward to the MiniCPAN that would re-install the same thing” - brian d foy
      • Indexes MD5 and other metadata for all BackPAN modules and scripts
      • Incomplete: doesn’t yet work out what distribution versions are installed.
  • 12. GitPAN
      • Git repo for every distribution on CPAN
      • Includes all distro versions on BackPAN
      • Pondered using git hashes and the github API
      • But GitPAN isn’t being maintained
  • 13.
  • 14. MetaCPAN
  • 15. MetaCPAN
      • Repository for CPAN metadata
          - ElasticSearch distributed database (Lucene)
          - RESTful API
      • CPAN and entire BackPAN fully indexed
      • Very detailed metadata
      • Full Of Awesome
  • 16. MetaCPAN
      • Find all releases that contain a particular version of a module:
          curl -XPOST api.metacpan.org/v0/file/_search -d '{
            "query": { "filtered": {
              "query": {"match_all": {}},
              "filter": {"and": [
                {"term": {"file.module.name": "DBI::Profile"}},
                {"term": {"file.module.version": "2.014123"}}
              ]}
            }},
            "fields": ["release"]
          }'
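The search body on this slide can also be generated for any module and version. The sketch below only builds and serializes the JSON; it makes no network call. The helper name `build_release_query` is hypothetical, and the field names and query shape are copied from the slide as of 2011, so they may not match the current MetaCPAN API.

```python
import json


def build_release_query(module_name, module_version):
    """Build the slide's search body for one module name and version."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "and": [
                        {"term": {"file.module.name": module_name}},
                        {"term": {"file.module.version": module_version}},
                    ]
                },
            }
        },
        "fields": ["release"],
    }


# Same module and version as the slide's curl example.
body = json.dumps(build_release_query("DBI::Profile", "2.014123"))
print(body)
```

The resulting string is what would be passed to `curl -d` in the command above.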
  • 17.
  • 18. The Method
      • Get installed module names, versions, file sizes
      • For every module:
          - find “candidate distributions” that included that module version, ideally also matching the file size.
      • For every candidate distribution:
          - get all modules and versions shipped in that distro
          - score each candidate by the proportion of its modules and versions which match what’s installed
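The scoring step can be sketched as follows. This is a minimal illustration of “proportion of its modules and versions which match what’s installed”, not the script's actual code: the function names, the distribution names `Foo-1.01` and `Foo-1.02`, and all versions are invented, versions are compared as plain strings, and file sizes are ignored.

```python
def score_candidate(shipped, installed):
    """Fraction of a candidate's modules whose versions match the install.

    shipped:   {module_name: version} for one candidate distribution
    installed: {module_name: version} found in the library directory
    """
    if not shipped:
        return 0.0
    matches = sum(1 for mod, ver in shipped.items() if installed.get(mod) == ver)
    return matches / len(shipped)


def best_candidate(candidates, installed):
    """Return the candidate name with the highest score (first one on ties)."""
    return max(candidates, key=lambda name: score_candidate(candidates[name], installed))


# Invented sample data: what is installed, and what two releases shipped.
installed = {"Foo": "1.02", "Foo::Util": "0.5", "Bar": "2.0"}
candidates = {
    "Foo-1.01": {"Foo": "1.01", "Foo::Util": "0.5"},
    "Foo-1.02": {"Foo": "1.02", "Foo::Util": "0.5"},
}

scores = {name: score_candidate(mods, installed) for name, mods in candidates.items()}
winner = best_candidate(candidates, installed)
print(scores, winner)
```

Here `Foo::Util` 0.5 appears in both releases, so it alone cannot identify the release; scoring across every module the release shipped is what separates `Foo-1.02` from `Foo-1.01`.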
  • 19. An Example
  • 20. Cloning From The List
  • 21. Cloning From The List
      • Can’t simply feed results to cpanm
          - It’ll fetch the latest version of any prereqs
      • Tried to put the list in dependency order
      • Tried to use MiniCPAN::Inject
      • Finally added a --makecpan dir option
          - Fetches distro tarballs and writes index
          - can be used as CPAN repo by cpanm
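As a rough picture of what “writes index” involves, a CPAN-style repository carries a package index that maps each module to the distribution tarball providing it. The sketch below formats such a listing in the conventional `02packages.details.txt` shape (header lines, a blank line, then package name, version, and path columns). The function name, the abbreviated header, and the sample paths are assumptions for illustration; this is not the script's own output routine, and real indexes carry more header fields and are stored gzipped.

```python
def format_index(entries):
    """Format (module, version, dist_path) tuples as an index listing."""
    body = [f"{mod:<40} {ver:>8}  {path}" for mod, ver, path in sorted(entries)]
    header = [
        "File:         02packages.details.txt",
        "Columns:      package name, version, path",
        f"Line-Count:   {len(body)}",
        "",  # a blank line separates the header from the package lines
    ]
    return "\n".join(header + body) + "\n"


# Invented sample: one surveyed distribution providing two modules.
index_text = format_index([
    ("Foo::Util", "0.5", "A/AU/AUTHOR/Foo-1.02.tar.gz"),
    ("Foo", "1.02", "A/AU/AUTHOR/Foo-1.02.tar.gz"),
])
print(index_text)
```

Because every module points at the exact tarball that was surveyed, an installer resolving prerequisites against such an index would pick those versions rather than the latest ones.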
  • 22. Typical Usage
      Survey what distributions are installed in a library:
          $ dist_surveyor.pl --makecpan my_cpan /a/perl/lib/dir > installed_dists.txt
      Install exactly those distributions in a new library:
          $ cpanm --mirror file:$PWD/my_cpan --mirror-only -l new_lib < installed_dists.txt
      Bonus: re-tests all distros with current prereqs
  • 23. Status
      • Currently a single script
      • Ought to be turned into a module
      • Looking for a maintainer
  • 24. Interested?
      Tim.Bunce@pobox.com
      http://blog.timbunce.org
      @timbunce on twitter