Perl Dist::Surveyor 2011


Slides on my lightning talk at the London Perl Workshop, November 2011.

Transcript

  • 1. Dist::Surveyor: “what’s in that lib directory?”
      Tim Bunce - Nov 2011
      Creative Commons BY-NC-SA 3.0
  • 2. The Context
      [Diagram labels: Perl 5.8, Applications, CPAN modules, Business modules]
  • 3. The Context
      • A large library of CPAN distributions
          - In a local::lib style dir .../cpan-5.008/{man,bin,lib}/
          - Installed over many years
          - No external record of what has been installed
          - Almost 5000 modules
          - In production in many systems on many machines
  • 4. The Itch
      • Want to upgrade from perl 5.8
          - so need to clone our local library of CPAN modules
          - to .../cpan-5.012/{man,bin,lib}/
          - with recompiled perl extensions
      • Want the exact set of distribution versions
          - so when testing “nothing but perl changed”
  • 5. “What’s in that lib directory?”
  • 6. Innocence and Hope
      • Vague memory of something called ‘packlists’
      • Vague memory of perllocal.pod install log
      • Vague memory of some work by brian d foy
      • Usual hope that someone’s already done this
      • “How hard can it be?”
  • 7. /.packlist
      • Records only what files were installed
      • Doesn’t record the origin distribution
      • Useless for my needs
  • 8. what_dists.pl
      • Chris Williams’s github.com/bingos/throwaway
      • Matches installed modules to distributions
      • Only matches to the latest distributions
      • Looked like a good place to start
      • I hacked it to use perllocal.pod data and a bunch of heuristics.
      • It worked, mostly. Annoying edge cases.
      • Lots of hacks, heuristics, and blind luck.
  • 9. perllocal.pod
      • Records a “name” and “version”
      • Name is the Makefile.PL NAME
          - can be the module or distribution name
          - or something else entirely
      • Version is the Makefile.PL VERSION
          - not always the version in the distribution filename
      • Incomplete!
          - Not written by Module::Build based distributions
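
The “name” and “version” recorded in perllocal.pod can be pulled out with a few lines of code. The following is a minimal sketch in Python, not part of the talk. It assumes the entry layout that ExtUtils::MakeMaker appends (a `=head2` line naming the module, followed by `C<VERSION: ...>` items). The sample entry, its install path, and the function name `parse_perllocal` are made up for illustration.

```python
import re

# Illustrative entry only: the path and timestamp are invented for this sketch.
SAMPLE = """\
=head2 Thu Nov  3 12:00:00 2011: C<Module> L<DBI|DBI>

=over 4

=item *

C<installed into: /opt/cpan-5.008/lib/perl5>

=item *

C<VERSION: 1.616>

=back
"""

# Assumed layout: "=head2 <date>: C<Module> L<NAME|NAME>" starts each entry.
HEAD_RE = re.compile(r"^=head2 .*?: C<Module> L<([^|>]+)\|")
VERSION_RE = re.compile(r"^C<VERSION: (.*)>$")


def parse_perllocal(text):
    """Return a list of {"name", "version"} dicts, one per install entry."""
    entries = []
    current = None
    for line in text.splitlines():
        head = HEAD_RE.match(line)
        if head:
            # A new install record begins; version is filled in if one follows.
            current = {"name": head.group(1), "version": None}
            entries.append(current)
            continue
        version = VERSION_RE.match(line)
        if version and current is not None:
            current["version"] = version.group(1)
    return entries


entries = parse_perllocal(SAMPLE)
```

As the slide notes, the name found this way is whatever Makefile.PL declared, so it cannot be relied on to identify the distribution.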
  • 10. BackPAN::Version::Discover
      • “Figure out exactly which dist versions you have installed”
      • Based on BackPAN::Index
      • Incomplete and “very alpha”
      • Matching logic not very robust
      • Just doesn’t work very well for us
  • 11. DPAN
      • “start with an existing Perl distribution and work backward to the MiniCPAN that would re-install the same thing” - brian d foy
      • Indexes MD5 and other metadata for all BackPAN modules and scripts
      • Incomplete: doesn’t yet work out what distribution versions are installed.
  • 12. GitPAN
      • Git repo for every distribution on CPAN
      • Includes all distro versions on BackPAN
      • Pondered using git hashes and the github API
      • But GitPAN isn’t being maintained
  • 13.
  • 14. MetaCPAN
  • 15. MetaCPAN
      • Repository for CPAN metadata
          - ElasticSearch distributed database (Lucene)
          - RESTful API
      • CPAN and entire BackPAN fully indexed
      • Very detailed metadata
      • Full Of Awesome
  • 16. MetaCPAN
      • Find all releases that contain a particular version of a module:
          curl -XPOST api.metacpan.org/v0/file/_search -d '{
            "query": { "filtered": {
              "query": {"match_all": {}},
              "filter": {"and": [
                {"term": {"file.module.name": "DBI::Profile"}},
                {"term": {"file.module.version": "2.014123"}}
              ]}
            }},
            "fields": ["release"]
          }'
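
To run this lookup for every installed module, the request body has to be generated rather than typed by hand. A minimal sketch in Python, assuming the same query shape as on the slide: the helper name `release_query` is hypothetical, and the sketch only builds the JSON body. It does not contact the API, whose endpoint and query syntax may have changed since 2011.

```python
import json


def release_query(module_name, module_version):
    """Build the search body from the slide for one module name and version."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "and": [
                        {"term": {"file.module.name": module_name}},
                        {"term": {"file.module.version": module_version}},
                    ]
                },
            }
        },
        "fields": ["release"],
    }


# Serialised body, suitable for passing to curl -d or an HTTP client.
body = json.dumps(release_query("DBI::Profile", "2.014123"))
```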
  • 17.
  • 18. The Method
      • Get installed module names, versions, file sizes
      • For every module:
          - find “candidate distributions” that included that module version, ideally also matching the file size.
      • For every candidate distribution:
          - get all modules and versions shipped in that distro
          - score each candidate by the proportion of its modules and versions which match what’s installed
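
The scoring step can be sketched as follows. This is an illustration in Python of the idea on the slide, not the code in dist_surveyor.pl: the function names, the `Foo::Bar` modules, and the candidate distributions are all invented, and file-size matching is left out for brevity.

```python
def score_candidate(installed, candidate_modules):
    """Fraction of a candidate's (module, version) pairs that match what is installed."""
    if not candidate_modules:
        return 0.0
    matches = sum(
        1
        for name, version in candidate_modules.items()
        if installed.get(name) == version
    )
    return matches / len(candidate_modules)


def best_candidate(installed, candidates):
    """Pick the candidate distribution with the highest score."""
    return max(candidates, key=lambda dist: score_candidate(installed, candidates[dist]))


# Hypothetical data: module -> version found in the lib directory.
installed = {"Foo::Bar": "1.02", "Foo::Bar::Util": "0.07"}

# Hypothetical candidates: distribution -> modules and versions it shipped.
candidates = {
    "Foo-Bar-1.01": {"Foo::Bar": "1.01", "Foo::Bar::Util": "0.07"},
    "Foo-Bar-1.02": {"Foo::Bar": "1.02", "Foo::Bar::Util": "0.07"},
}

scores = {dist: score_candidate(installed, mods) for dist, mods in candidates.items()}
winner = best_candidate(installed, candidates)
```

Here both candidates shipped the installed version of `Foo::Bar::Util`, so both are found as candidates, but only `Foo-Bar-1.02` matches on every module it contains.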
  • 19. An Example
  • 20. Cloning From The List
  • 21. Cloning From The List
      • Can’t simply feed results to cpanm
          - It’ll fetch the latest version of any prereqs
      • Tried to put the list in dependency order
      • Tried to use MiniCPAN::Inject
      • Finally added a --makecpan dir option
          - Fetches distro tarballs and writes index
          - can be used as CPAN repo by cpanm
  • 22. Typical Usage
      Survey what distributions are installed in a library:
          $ dist_surveyor.pl --makecpan my_cpan /a/perl/lib/dir > installed_dists.txt
      Install exactly those distributions in a new library:
          $ cpanm --mirror file:$PWD/my_cpan --mirror-only -l new_lib < installed_dists.txt
      Bonus: re-tests all distros with current prereqs
  • 23. Status
      • Currently a single script
      • Ought to be turned into a module
      • Looking for a maintainer
  • 24. Interested?
      Tim.Bunce@pobox.com
      http://blog.timbunce.org
      @timbunce on twitter