(Ab)Using the MetaCPAN API for Fun and Profit

  • 1,648 views
Uploaded on

A quick

A quick

More in: Education
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Be the first to comment
No Downloads

Views

Total Views
1,648
On Slideshare
0
From Embeds
0
Number of Embeds
1

Actions

Shares
Downloads
7
Comments
0
Likes
3

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide
  • show of hands: \n1) have used the metacpan search site \n2) use it as their default search site \n3) have worked with the API\n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • CPAN visualization tool\n
  • \n
  • \n
  • \n
  • Exports Pod into a format you can import right into your Kindle app.\n
  • \n
  • Drop-in replacement for Perldoc. Read documentation for modules which you haven’t even installed. Genius.\n
  • \n
  • \n
  • \n
  • \n
  • \n
  • You can see that for the email and website fields, we allow you to provide a list rather than a single value. Now, for our example we need an author’s Github profile. Matt Trout does not provide this, so he’s a bad test case for our script.\n
  • Things like StackOverflow, Twitter and Github usernames are all provided by authors voluntarily after logging in to MetaCPAN. In order to see what the profiles look like in a data structure, we need to find an author who has filled these fields.\n
  • You can see from Mo’s example here that he has filled out some of his profile information. He’s a good test case. Note the MetaCPAN explorer link on the bottom left corner. These links can also be found on the module and release pages.\n
  • This is a great way to explore the various endpoints of the API and practice crafting queries by hand. However, today we’re just concerned with the /author endpoint.\n
  • \n
  • \n
  • You can see here that since we’re no longer using a convenience endpoint, the output is a little busier. What we generally care about here is the list provided inside of hits->{hits}. In each list item, we care about _source and _source->{profile} in particular.\n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n

Transcript

  • 1. (Ab)Using theMetaCPAN API for Fun and Profit Olaf Alders (OALDERS) @wundercounter
  • 2. Architecture• Built on ElasticSearch• Uses Catalyst as a thin wrapper• You don’t need to know this
  • 3. Real life examples
  • 4. iCPAN - iPhone
  • 5. iCPAN - iPad
  • 6. Android
  • 7. What can we build?
  • 8. What can we build?• Do something with Github
  • 9. What can we build?• Do something with Github• Get a list of all CPAN authors who have enabled the “hireable” flag in their Github profiles
  • 10. Let’s Get Started
  • 11. Let’s Get Started• We want to fetch some data
  • 12. Let’s Get Started• We want to fetch some data• We’ll use Sawyer’s MetaCPAN::API
  • 13. #!/usr/bin/env perluse strict;use warnings;use MetaCPAN::API;my $mcpan = MetaCPAN::API->new();my $author = $mcpan->author(MSTROUT);
  • 14. { dir => "id/M/MS/MSTROUT", email => ["perl-stuff@trout.me.uk"], gravatar_url => "https://secure.gravatar.com/avatar/...", name => "Matt S Trout", pauseid => "MSTROUT", website => ["http://www.trout.me.uk/"],}
  • 15. MetaCPAN Explorer
  • 16. my $author = $mcpan->author(MSTROUT);
  • 17. my $result = $mcpan->post( author, { query => { match_all => {} }, size => 1, },);
  • 18. { _shards => { failed => 0, successful => 5, total => 5 }, hits => { hits => [ { _id => "KHAMPTON", _index => "cpan_v1", _score => 1, _source => { city => "Los Angeles", country => "US", dir => "id/K/KH/KHAMPTON", email => ["khampton@totalcinema.com", "kip.hampton@tamarou.com"], gravatar_url => "http://www.gravatar.com/avatar/...", name => "Kip Hampton", pauseid => "KHAMPTON", profile => [ { id => "ubu", name => "coderwall" }, { id => "ubu", name => "github" }, { id => "kiphampton", name => "twitter" }, ], region => "CA", updated => "2011-07-22T20:42:06", website => ["http://totalcinema.com/"], }, _type => "author", }, ], max_score => 1, total => 9780, }, timed_out => bless(do{(my $o = 0)}, "JSON::XS::Boolean"), took => 1,}
  • 19. my $result = $mcpan->post( author, { query => { match_all => {} }, size => 1, },);# dump $result->{hits}->{hits}->[0]->{_source};
  • 20. { city => "Los Angeles", country => "US", dir => "id/K/KH/KHAMPTON", email => ["khampton@totalcinema.com", "kip.hampton@tamarou.com"], gravatar_url => "http://www.gravatar.com/avatar/...", name => "Kip Hampton", pauseid => "KHAMPTON", profile => [ { id => "ubu", name => "coderwall" }, { id => "ubu", name => "github" }, { id => "kiphampton", name => "twitter" }, ], region => "CA", updated => "2011-07-22T20:42:06", website => ["http://totalcinema.com/"],}
  • 21. my $result = $mcpan->post( author, { query => { match_all => {} }, size => 100, },);
  • 22. my $filter = { { term => { author.profile.name => stackoverflow, } },};my $result = $mcpan->post( author, { query => { match_all => {} }, filter => $filter, size => 100, },);
  • 23. use Pithub;my $p = Pithub->new;AUTHOR:foreach my $author ( @{ $result->{hits}->{hits} } ) { foreach my $profile ( @{ $author->{_source}->{profile} } ) { if ( $profile->{name} eq github ) { my $username = $profile->{id}; $username =~ s{https?://github.com/(w*)/?}{$1}i; next AUTHOR if !$username; if ( $p->users->get( user => $username )->content->{hireable} ) { # do something... } next AUTHOR; } }}
  • 24. Getting fancy
  • 25. my $filter = { and => [ { term => { author.profile.name => github, } }, { term => { author.country => US, } } ]};my $result = $mcpan->post( author, { query => { match_all => {} }, filter => $filter, size => 100, },);
  • 26. my $filter = { and => [ { term => { author.profile.name => github, } }, { term => { author.country => US, } }, { exists => { field => author.region } }, ]};my $result = $mcpan->post( author, { query => { match_all => {} }, filter => $filter, size => 100, },);
  • 27. my $filter = { and => [ { term => { author.profile.name => github, } }, { term => { author.country => US, } }, { exists => { field => author.region } }, { missing => { field => author.location } }, ]};my $result = $mcpan->post( author, { query => { match_all => {} }, filter => $filter, size => 100, },);# “missing” isn’t really helpful in this search# just an example of how you might use it
  • 28. my $filter = { or => [ { term => { author.profile.name => github, } }, { term => { author.country => US, } }, { exists => { field => author.region } }, { missing => { field => author.location } }, ]};my $result = $mcpan->post( author, { query => { match_all => {} }, filter => $filter, size => 100, },);
  • 29. my $filter = { or => [ { term => { author.profile.name => github, } }, { term => { author.country => US, } }, { exists => { field => author.region } }, { missing => { field => author.location } }, ]};my $result = $mcpan->post( author, { query => { match_all => {} }, filter => $filter, size => 100, fields => [ pauseid, country ], },);
  • 30. my $filter = { or => [ { term => { author.profile.name => github, } }, { term => { author.country => US, } }, { exists => { field => author.region } }, { missing => { field => author.location } }, ]};my $result = $mcpan->post( author, { query => { match_all => {} }, filter => $filter, size => 100, sort => [ { author.pauseid => ASC } ], },);
  • 31. Getting Help• #metacpan or irc.perl.org• https://metacpan.org/about/resources
  • 32. Resources• https://github.com/CPAN-API/cpan- api/wiki/Beta-API-docs• http://www.slideshare.net/ clintongormley/terms-of-endearment- the-elasticsearch-query-dsl-explained
  • 33. Bonus Slides
  • 34. Base URL• http://api.metacpan.org/v0
  • 35. Convenience Endpoints
  • 36. Convenience Endpoints• /author/DOY• /distribution/Moose• /release/Moose• /module/Moose• /pod/Moose
  • 37. Exporting Pod• /pod/Moose?content-type=text/html (default)• /pod/Moose?content-type=text/plain• /pod/Moose?content-type=text/x-pod• /pod/Moose?content-type=text/x-markdown
  • 38. The (real) Endpoints
  • 39. The (real) Endpoints• /author
  • 40. The (real) Endpoints• /author• /distribution
  • 41. The (real) Endpoints• /author• /distribution• /favorite
  • 42. The (real) Endpoints• /author• /distribution• /favorite• /rating
  • 43. The (real) Endpoints• /author• /distribution• /favorite• /rating• /release
  • 44. The (real) Endpoints• /author• /distribution• /favorite• /rating• /release• /file
  • 45. The (real) Endpoints• /author• /distribution• /favorite• /rating• /release• /file
  • 46. Using a cacheuse HTTP::Tiny::Mech;use MetaCPAN::API;use WWW::Mechanize::Cached;my $mcpan = MetaCPAN::API->new( ua => HTTP::Tiny::Mech->new( mechua => WWW::Mechanize::Cached->new() ));
  • 47. Enable Compression• use WWW::Mechanize::Gzip• use WWW::Mechanize::Cached::Gzip• Or set the appropriate request header
  • 48. Use the scrolling API• The scrolling API allows you to iterate over an arbitrary number of results• Be aware that when you scroll, your docs will come back unsorted