(Ab)Using the MetaCPAN API for Fun and Profit
 

  • Show of hands:
    1) have used the MetaCPAN search site
    2) use it as their default search site
    3) have worked with the API
  • CPAN visualization tool
  • Exports Pod into a format you can import right into your Kindle app.
  • Drop-in replacement for Perldoc. Read documentation for modules which you haven’t even installed. Genius.
  • You can see that for the email and website fields, we allow you to provide a list rather than a single value. Now, for our example we need an author’s GitHub profile. Matt Trout does not provide this, so he’s a bad test case for our script.
  • Things like StackOverflow, Twitter and GitHub usernames are all provided by authors voluntarily after logging in to MetaCPAN. In order to see what the profiles look like in a data structure, we need to find an author who has filled in these fields.
  • You can see from Mo’s example here that he has filled out some of his profile information. He’s a good test case. Note the MetaCPAN Explorer link in the bottom left corner. These links can also be found on the module and release pages.
  • This is a great way to explore the various endpoints of the API and practice crafting queries by hand. However, today we’re just concerned with the /author endpoint.
  • You can see here that since we’re no longer using a convenience endpoint, the output is a little busier. What we generally care about here is the list provided inside of hits->{hits}. In each list item, we care about _source and _source->{profile} in particular.
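The last note describes where the useful data sits in a search response: `hits->{hits}`, then `_source`, then `_source->{profile}`. A minimal sketch of walking that structure follows. The helper name `profiles_by_author` is hypothetical, and the sample data is trimmed from the response shown in the slides rather than fetched live.

```perl
use strict;
use warnings;

# Hypothetical helper: map each author's PAUSE id to a hash of
# { profile name => profile id }. Authors without profiles get an empty hash.
sub profiles_by_author {
    my ($result) = @_;
    my %profiles;
    for my $hit ( @{ $result->{hits}{hits} || [] } ) {
        my $source = $hit->{_source} or next;
        my %by_name;
        $by_name{ $_->{name} } = $_->{id} for @{ $source->{profile} || [] };
        $profiles{ $source->{pauseid} } = \%by_name;
    }
    return \%profiles;
}

# Sample data trimmed from the response shown in the slides.
my $result = {
    hits => {
        hits => [
            {
                _id     => 'KHAMPTON',
                _source => {
                    pauseid => 'KHAMPTON',
                    profile => [
                        { id => 'ubu',        name => 'coderwall' },
                        { id => 'ubu',        name => 'github' },
                        { id => 'kiphampton', name => 'twitter' },
                    ],
                },
            },
        ],
        total => 9780,
    },
};

my $profiles = profiles_by_author($result);
print "$profiles->{KHAMPTON}{github}\n";    # prints "ubu"
```

Keying the profiles by name makes the later "does this author have a GitHub profile?" check a single hash lookup.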

Presentation Transcript

  • (Ab)Using the MetaCPAN API for Fun and Profit
    Olaf Alders (OALDERS)
    @wundercounter
  • Architecture
    • Built on ElasticSearch
    • Uses Catalyst as a thin wrapper
    • You don’t need to know this
  • Real life examples
  • iCPAN - iPhone
  • iCPAN - iPad
  • Android
  • What can we build?
    • Do something with GitHub
    • Get a list of all CPAN authors who have enabled the “hireable” flag in their GitHub profiles
  • Let’s Get Started
    • We want to fetch some data
    • We’ll use Sawyer’s MetaCPAN::API
  • #!/usr/bin/env perl
    use strict;
    use warnings;
    use MetaCPAN::API;

    my $mcpan  = MetaCPAN::API->new();
    my $author = $mcpan->author('MSTROUT');
  • {
        dir          => "id/M/MS/MSTROUT",
        email        => ["perl-stuff@trout.me.uk"],
        gravatar_url => "https://secure.gravatar.com/avatar/...",
        name         => "Matt S Trout",
        pauseid      => "MSTROUT",
        website      => ["http://www.trout.me.uk/"],
    }
  • MetaCPAN Explorer
  • my $author = $mcpan->author('MSTROUT');
  • my $result = $mcpan->post(
        'author',
        {
            query => { match_all => {} },
            size  => 1,
        },
    );
  • {
        _shards => { failed => 0, successful => 5, total => 5 },
        hits => {
            hits => [
                {
                    _id     => "KHAMPTON",
                    _index  => "cpan_v1",
                    _score  => 1,
                    _source => {
                        city         => "Los Angeles",
                        country      => "US",
                        dir          => "id/K/KH/KHAMPTON",
                        email        => ["khampton@totalcinema.com", "kip.hampton@tamarou.com"],
                        gravatar_url => "http://www.gravatar.com/avatar/...",
                        name         => "Kip Hampton",
                        pauseid      => "KHAMPTON",
                        profile      => [
                            { id => "ubu",        name => "coderwall" },
                            { id => "ubu",        name => "github" },
                            { id => "kiphampton", name => "twitter" },
                        ],
                        region  => "CA",
                        updated => "2011-07-22T20:42:06",
                        website => ["http://totalcinema.com/"],
                    },
                    _type => "author",
                },
            ],
            max_score => 1,
            total     => 9780,
        },
        timed_out => bless(do{\(my $o = 0)}, "JSON::XS::Boolean"),
        took      => 1,
    }
  • my $result = $mcpan->post(
        'author',
        {
            query => { match_all => {} },
            size  => 1,
        },
    );
    # dump $result->{hits}->{hits}->[0]->{_source};
  • {
        city         => "Los Angeles",
        country      => "US",
        dir          => "id/K/KH/KHAMPTON",
        email        => ["khampton@totalcinema.com", "kip.hampton@tamarou.com"],
        gravatar_url => "http://www.gravatar.com/avatar/...",
        name         => "Kip Hampton",
        pauseid      => "KHAMPTON",
        profile      => [
            { id => "ubu",        name => "coderwall" },
            { id => "ubu",        name => "github" },
            { id => "kiphampton", name => "twitter" },
        ],
        region  => "CA",
        updated => "2011-07-22T20:42:06",
        website => ["http://totalcinema.com/"],
    }
  • my $result = $mcpan->post(
        'author',
        {
            query => { match_all => {} },
            size  => 100,
        },
    );
  • # Only match authors who list a StackOverflow profile.
    my $filter = {
        term => { 'author.profile.name' => 'stackoverflow' },
    };

    my $result = $mcpan->post(
        'author',
        {
            query  => { match_all => {} },
            filter => $filter,
            size   => 100,
        },
    );
  • use Pithub;

    my $p = Pithub->new;

    AUTHOR:
    foreach my $author ( @{ $result->{hits}->{hits} } ) {
        foreach my $profile ( @{ $author->{_source}->{profile} } ) {
            if ( $profile->{name} eq 'github' ) {
                my $username = $profile->{id};

                # Some authors enter a full profile URL; keep only the username.
                $username =~ s{https?://github.com/(\w*)/?}{$1}i;
                next AUTHOR if !$username;

                if ( $p->users->get( user => $username )->content->{hireable} ) {
                    # do something...
                }
                next AUTHOR;
            }
        }
    }
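The loop above mixes two jobs: cleaning up the profile id and calling GitHub. As a sketch, the clean-up step can be pulled into small functions that are easy to check without any network access. The names `github_username` and `github_usernames` are hypothetical, and the pattern here is a slightly different one from the slide: it keeps everything up to the first slash, on the assumption that usernames may contain hyphens.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce a profile id to a bare GitHub username.
# Accepts either a plain username or a full profile URL.
sub github_username {
    my ($id) = @_;
    return undef if !defined $id;
    $id =~ s{\A\s+|\s+\z}{}g;                          # trim whitespace
    $id =~ s{\Ahttps?://(?:www\.)?github\.com/}{}i;    # drop a URL prefix
    $id =~ s{/.*\z}{}s;                                # drop any trailing path
    return length $id ? $id : undef;
}

# Hypothetical helper: collect the usernames of all authors in a search
# response who list a GitHub profile.
sub github_usernames {
    my ($result) = @_;
    my @usernames;
    for my $hit ( @{ $result->{hits}{hits} || [] } ) {
        for my $profile ( @{ $hit->{_source}{profile} || [] } ) {
            next if ( $profile->{name} // '' ) ne 'github';
            my $username = github_username( $profile->{id} );
            push @usernames, $username if defined $username;
            last;    # one GitHub profile per author is enough
        }
    }
    return @usernames;
}
```

Each returned username could then be passed to `$p->users->get( user => $username )` as in the slide.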
  • Getting fancy
  • my $filter = {
        and => [
            { term => { 'author.profile.name' => 'github' } },
            { term => { 'author.country'      => 'US' } },
        ]
    };

    my $result = $mcpan->post(
        'author',
        {
            query  => { match_all => {} },
            filter => $filter,
            size   => 100,
        },
    );
  • my $filter = {
        and => [
            { term   => { 'author.profile.name' => 'github' } },
            { term   => { 'author.country'      => 'US' } },
            { exists => { field => 'author.region' } },
        ]
    };

    my $result = $mcpan->post(
        'author',
        {
            query  => { match_all => {} },
            filter => $filter,
            size   => 100,
        },
    );
  • my $filter = {
        and => [
            { term    => { 'author.profile.name' => 'github' } },
            { term    => { 'author.country'      => 'US' } },
            { exists  => { field => 'author.region' } },
            { missing => { field => 'author.location' } },
        ]
    };

    my $result = $mcpan->post(
        'author',
        {
            query  => { match_all => {} },
            filter => $filter,
            size   => 100,
        },
    );

    # “missing” isn’t really helpful in this search
    # just an example of how you might use it
  • my $filter = {
        or => [
            { term    => { 'author.profile.name' => 'github' } },
            { term    => { 'author.country'      => 'US' } },
            { exists  => { field => 'author.region' } },
            { missing => { field => 'author.location' } },
        ]
    };

    my $result = $mcpan->post(
        'author',
        {
            query  => { match_all => {} },
            filter => $filter,
            size   => 100,
        },
    );
  • my $filter = {
        or => [
            { term    => { 'author.profile.name' => 'github' } },
            { term    => { 'author.country'      => 'US' } },
            { exists  => { field => 'author.region' } },
            { missing => { field => 'author.location' } },
        ]
    };

    my $result = $mcpan->post(
        'author',
        {
            query  => { match_all => {} },
            filter => $filter,
            size   => 100,
            fields => [ 'pauseid', 'country' ],
        },
    );
  • my $filter = {
        or => [
            { term    => { 'author.profile.name' => 'github' } },
            { term    => { 'author.country'      => 'US' } },
            { exists  => { field => 'author.region' } },
            { missing => { field => 'author.location' } },
        ]
    };

    my $result = $mcpan->post(
        'author',
        {
            query  => { match_all => {} },
            filter => $filter,
            size   => 100,
            sort   => [ { 'author.pauseid' => 'ASC' } ],
        },
    );
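The filters in this section all share one shape: a list of `term` clauses wrapped in `and` or `or`. A small sketch of generating that shape from field/value pairs may make the pattern easier to see. The helper `combine_terms` is hypothetical and not part of MetaCPAN::API; the request body is only printed as JSON here, not sent.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: build one "term" filter per field/value pair and
# combine them with the given boolean operator ('and' or 'or').
sub combine_terms {
    my ( $operator, @pairs ) = @_;
    die "operator must be 'and' or 'or'\n" if $operator !~ /\A(?:and|or)\z/;
    my @terms;
    while ( my ( $field, $value ) = splice @pairs, 0, 2 ) {
        push @terms, { term => { $field => $value } };
    }
    return { $operator => \@terms };
}

my $filter = combine_terms(
    'and',
    'author.profile.name' => 'github',
    'author.country'      => 'US',
);

# Same request body as in the slides, with the generated filter.
my $body = {
    query  => { match_all => {} },
    filter => $filter,
    size   => 100,
};

# canonical() sorts keys so the printed JSON is stable between runs.
print JSON::PP->new->canonical->pretty->encode($body);
```

Printing the encoded body is a cheap way to compare what you are about to send against a query crafted by hand in the MetaCPAN Explorer.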
  • Getting Help
    • #metacpan on irc.perl.org
    • https://metacpan.org/about/resources
  • Resources
    • https://github.com/CPAN-API/cpan-api/wiki/Beta-API-docs
    • http://www.slideshare.net/clintongormley/terms-of-endearment-the-elasticsearch-query-dsl-explained
  • Bonus Slides
  • Base URL
    • http://api.metacpan.org/v0
  • Convenience Endpoints
    • /author/DOY
    • /distribution/Moose
    • /release/Moose
    • /module/Moose
    • /pod/Moose
  • Exporting Pod
    • /pod/Moose?content-type=text/html (default)
    • /pod/Moose?content-type=text/plain
    • /pod/Moose?content-type=text/x-pod
    • /pod/Moose?content-type=text/x-markdown
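The four export formats above differ only in the `content-type` query parameter. As a sketch, the URLs can be built from the base URL given in these slides; the helper name `pod_url` and the short format names (`html`, `plain`, `pod`, `markdown`) are assumptions made for illustration.

```perl
use strict;
use warnings;

my $BASE_URL = 'http://api.metacpan.org/v0';    # base URL from the slides

# Short format names (hypothetical) mapped to the content types listed above.
my %POD_FORMAT = (
    html     => 'text/html',
    plain    => 'text/plain',
    pod      => 'text/x-pod',
    markdown => 'text/x-markdown',
);

# Hypothetical helper: build the /pod URL for a module in a given format.
# The format defaults to 'html', matching the API default.
sub pod_url {
    my ( $module, $format ) = @_;
    $format = 'html' if !defined $format;
    my $type = $POD_FORMAT{$format}
        or die "unknown pod format '$format'\n";
    return "$BASE_URL/pod/$module?content-type=$type";
}

print pod_url( 'Moose', 'markdown' ), "\n";
# prints "http://api.metacpan.org/v0/pod/Moose?content-type=text/x-markdown"
```

The resulting URL can be fetched with any HTTP client, for example the core HTTP::Tiny module.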
  • The (real) Endpoints
    • /author
    • /distribution
    • /favorite
    • /rating
    • /release
    • /file
  • Using a cache
    use HTTP::Tiny::Mech;
    use MetaCPAN::API;
    use WWW::Mechanize::Cached;

    my $mcpan = MetaCPAN::API->new(
        ua => HTTP::Tiny::Mech->new(
            mechua => WWW::Mechanize::Cached->new()
        )
    );
  • Enable Compression
    • use WWW::Mechanize::Gzip
    • use WWW::Mechanize::Cached::Gzip
    • Or set the appropriate request header
  • Use the scrolling API
    • The scrolling API allows you to iterate over an arbitrary number of results
    • Be aware that when you scroll, your docs will come back unsorted