CPANTS: Kwalitative website and its tools
CPANTS talk at YAPC::EU 2012

  1. CPANTS: Kwalitative website and its tools. Kenichi Ishigaki (charsbar) @ YAPC::EU 2012, August 22, 2012
  2. Kenichi Ishigaki (charsbar). From Shibuya.pm, Tokyo, Japan.
  3. Freelancer: Perl programmer, writer/translator
  4. Around 40 CPAN distributions
  5. DBD::SQLite
  6. Acme::CPANAuthors
  7. We have been enjoying the CPANTS game since 2005.
  8. "Shine! The All-Japan Strongest CPAN Author Championship" by Koichi Taniguchi http://blog.livedoor.jp/nipotan/archives/16108466.html
  9. He picked up Japanese authors by eye.
  10. Our names are easy to find.
  11. There were not so many authors. Total: ~4000; Japanese: ~50
  12. YAPC::Asia increased the number of Japanese authors.
  13. YAPC::Asia    Japanese authors
      2006 (Mar)     98
      2007 (Apr)    154
      2008 (May)    191
      2009 (Sep)    228
      2010 (Oct)    255
      2011 (Oct)    270
  14. I needed something to pick up Japanese authors more easily.
  15. That's why I created a list of Japanese authors and a script to maintain it.
  16. I've been reporting the Japanese top 10 authors since 2008.
  17. I've been adding something new every year.
  18. 2008: sum of the kwalitee scores per author
  19. 2009: authors who released most in the year
  20. 2010: authors/population ratio
  21. 2011: launched a website (finally): acme.cpanauthors.org
  22. It had one big problem.
  23. No data.
  24. The official CPANTS site had been down for some time.
  25. I needed to set up my own.
  26. I created a private repository and put everything into it.
  27. Merged recent commits from domm's repository.
  28. Added a few columns.
  29. Tweaked Catalyst/DBIC stuff.
  30. It worked.
  31. Warnings were left.
  32. I needed to find some tuits to remove them.
  33. Perl QA Hackathon
  34. Warnings were removed.
  35. Ported some of the changes I made locally to daxim's repository.
  36. Showed a new acme.cpanauthors.org featuring CPANTS info.
  37. Unfortunately, the porting took too much time.
  38. I didn't merge the changes back to my repository.
  39. OSDC.TW
  40. I finally merged the changes.
  41. Got several reports that CPANTS was broken.
  42. What broke CPANTS was a small change.
  43. "modules" : [
        {
          "file" : "lib/Path/Extended.pm",
          "in_basedir" : 0,
          "in_lib" : 1,
          "module" : "Path::Extended",
          "uses" : {
            "Sub::Install" : 1,
            "strict" : 1,
            "warnings" : 1
          }
        }
      ]
  44. I don't think this change is bad.
  45. Module::CPANTS::ProcessCPAN shouldn't have died because of this.
  46. It should have had tests.
  47. It should have run faster.
  48. It should have been easier to fix the analysis.
  49. Enough issues for a summer.
  50. What should we do?
  51. - We need tests.
      - We need to find test cases.
      - We need to do it many times.
  52. Making it run faster is the first priority.
  53. I wrote a barebones script to store data in parallel.
  54. JSON
      create table if not exists analysis (
        id integer primary key autoincrement,
        path text unique,
        distv text,
        author text,
        json text,
        duration integer
      );
  55. Raw SQL statements
  56. Parallel::ForkManager
  57. SQLite queue
  58. Beware a race condition: another worker can select the same row between these two statements.
      my ($id) = $dbh->selectrow_array("
        SELECT id FROM queue
        WHERE status = 0 LIMIT 1
      ");
      $dbh->do("
        UPDATE queue SET status = 1
        WHERE id = ?
      ", undef, $id);
  59. sqlite_update_hook
      my $id;
      $dbh->sqlite_update_hook(sub {
        (undef, undef, undef, $id) = @_;
      });
  60. $dbh->do("
        UPDATE queue
        SET status = 1
        WHERE id IN (
          SELECT id FROM queue
          WHERE status = 0 LIMIT 1
        )
      ");
  61. Archive::Any::Lite
  62. Archive::Any::Plugin::Bzip2
  63. WorePAN
      - Bundling is bad
      - We need a specific version
      - Derived from OrePAN
  64. use WorePAN;
      my $worepan = WorePAN->new(
        root => 'path/to/a/directory/',
        files => [qw(
          I/IS/ISHIGAKI/WorePAN-0.01.tar.gz
        )],
        use_backpan => 1,
        no_network => 0,
        cleanup => 1,
      );
  65. use WorePAN;
      my $worepan = WorePAN->new(
        root => 'path/to/a/directory/',
        files => [qw(
          I/IS/ISHIGAKI/WorePAN-0.01.tar.gz
        )],
        local_mirror => '/home/ishigaki/minicpan/',
        no_network => 1,
        cleanup => 1,
      );
  66. use WorePAN;
      my $worepan = WorePAN->new(
        root => 'path/to/a/directory/',
        dists => {
          'Catalyst-Runtime' => 5.9,
          'DBIx-Class' => 0,
        },
        cleanup => 1,
      );
  67. Bonus features
      my $worepan = WorePAN->new(
        root => 'path/to/a/CPAN/mirror/',
        cleanup => 0,
      );
      my $authors = $worepan->authors;
      my $modules = $worepan->modules;
      my $files = $worepan->files;
      my $dists = $worepan->latest_distributions;
  68. $worepan->add_files(qw{
        /path/to/a/local/distribution-0.01.tar.gz
      });
      $worepan->update_indices;
  69. Now we have enough tools.
  70. Processing time has decreased significantly.
  71. What's next?
  72. ::Site refactoring
  73. I'm preparing the data now.
  74. Creating more databases/tables.
  75. Merging information from external sources.
      - CPAN indices
      - CPAN uploads database
  76. Calculating scores on prerequisite modules.
  77. It will be this year's something new in my annual report.
  78. And then, I'll move on to fixing the metrics.
  79. Some of them are badly broken.
      "versions" : {
        "lib/Data/Phrasebook.pm" : "use vars qw($VERSION);\n",
        "lib/Data/Phrasebook/Debug.pm" : "use vars qw($VERSION);\n",
        "lib/Data/Phrasebook/Generic.pm" : "use vars qw($VERSION);\n",
        "lib/Data/Phrasebook/Loader.pm" : "use vars qw($VERSION);\n",
        "lib/Data/Phrasebook/Loader/Base.pm" : "use vars qw($VERSION);\n",
        "lib/Data/Phrasebook/Loader/Text.pm" : "use vars qw($VERSION);\n",
        "lib/Data/Phrasebook/Plain.pm" : "use vars qw($VERSION);\n",
        "lib/Data/Phrasebook/SQL.pm" : "use vars qw($VERSION);\n",
        "lib/Data/Phrasebook/SQL/Query.pm" : "use vars qw($VERSION);\n"
      },
  80. Error is not a stash.
      "error" : {
        "easily_repackageable" : "easily_repackageable_by_fedora",
        "easily_repackageable_by_fedora" : "fits_fedora_license",
        "metayml_conforms_spec_current" : [
          "1.4",
          "Expected a map structure from data string or file. [Validation: 1.4]"
        ],
        "metayml_conforms_to_known_spec" : [
          "1.0",
          "Expected a map structure from data string or file. [Validation: 1.0]"
        ],
        "no_pod_errors" : " home cpants tmp analyze 11442 8001be43fb65..."
      }
  81. Should have initialize/finalize phases.
      "Module::CPANTS::Kwalitee::Distros doesn't clean up after mirrored Debian CPANTS file"
      https://rt.cpan.org/Ticket/Display.html?id=51514
  82. There is much more to do.
      - JSON API for metacpan.org and so on
      - Email reporting like CPAN Testers
      - Evaluate new Kwalitee indicators
      - New metrics like portable filenames
      - Blog about recent trends
      - More comprehensive tests
      - Analysis per perl version/architecture
      - Cover Perl::Critic, CPAN::Critic::Module::Abstract
      - 35 RT tickets and several GitHub issues
  83. Resources
      - github.com/charsbar/www-cpants
      - github.com/charsbar/worepan
      - github.com/daxim/Module-CPANTS-Analyse
  84. Questions?
  85. Thank you
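
Slides 58 to 60 replace a SELECT-then-UPDATE pair with a single UPDATE and read the claimed row id back through sqlite_update_hook. A minimal sketch of how those pieces might fit together follows, assuming DBI and DBD::SQLite are installed. Only the queue's id and status columns come from the slides; the in-memory database, the path column, the sample paths, the claim_next helper, and the ORDER BY clause (added so the claim order is predictable) are illustrative assumptions, not the talk's actual script.

```perl
use strict;
use warnings;
use DBI;

# In-memory database for illustration only; a queue shared between
# forked workers would have to live in a file.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '', {
    RaiseError => 1,
    PrintError => 0,
    AutoCommit => 1,
});

# Hypothetical queue schema: status 0 = pending, 1 = claimed.
$dbh->do(<<'SQL');
CREATE TABLE queue (
    id     INTEGER PRIMARY KEY AUTOINCREMENT,
    path   TEXT UNIQUE,
    status INTEGER NOT NULL DEFAULT 0
)
SQL

# Hypothetical queue entries.
$dbh->do('INSERT INTO queue (path) VALUES (?)', undef, $_)
    for qw(A/AA/AAA/Foo-0.01.tar.gz B/BB/BBB/Bar-0.02.tar.gz);

# The hook receives (action code, database name, table name, rowid)
# for every row changed through this handle. Because id is an
# INTEGER PRIMARY KEY, the rowid is the queue id.
my $claimed_id;
$dbh->sqlite_update_hook(sub {
    my (undef, undef, $table, $rowid) = @_;
    $claimed_id = $rowid if $table eq 'queue';
});

# Claim one pending row with a single UPDATE statement and return
# its id, or undef when nothing is pending.
sub claim_next {
    my ($dbh) = @_;
    $claimed_id = undef;
    $dbh->do(<<'SQL');
UPDATE queue SET status = 1
 WHERE id IN (
   SELECT id FROM queue WHERE status = 0 ORDER BY id LIMIT 1
 )
SQL
    return $claimed_id;
}

# Drain the queue, collecting the ids in the order they were claimed.
my @claimed;
while (defined(my $id = claim_next($dbh))) {
    push @claimed, $id;
}
print "claimed: @claimed\n";
```

In a parallel setup, each worker forked by Parallel::ForkManager would presumably open its own handle on the same database file and call something like claim_next in a loop; since selecting and marking happen in one statement, two workers should not end up with the same row.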