Miyagawa
Miyagawa Presentation Transcript

  • 1. Plagger – RSS/Atom remixing platform
    • Tatsuhiko Miyagawa [email_address]
    • Six Apart, Ltd. / Shibuya Perl Mongers
    • YAPC::NA 2006 Chicago
  • 2.
    • What is Plagger?
  • 3.
    • Pluggable
    • RSS/Atom
    • Aggregator
  • 4.
    • Platform for
    • Aggregation
    • / remixing
  • 5.
    • Whatever
  • 6.
    • Speaking of
    • RSS/Atom aggregator
  • 7.
    • Who here is
    • using Bloglines?
    • (or any other web-based aggregators)
  • 8.
    • Who here
    • thinks that it sucks?
  • 9.
    • Plagger is for you.
    • see Bloglines2gmail
  • 10.
    • Who here has
    • ever written a tool
    • using XML::RSS?
  • 11.
    • Welcome aboard.
    • There's a chance that you can transform
    • your script into a Plagger plugin.
    • And I can give you a svn commit bit!
  • 12.
    • Why Pluggable?
    • Just for feed aggregation?
  • 13.
    • 2002 Apr.
    • baseball2rss
    • http://search.cpan.org/dist/WWW-Baseball-NPB/
  • 14.
    • 2003 Oct.
    • rss2javascript
    • http://blog.bulknews.net/cookbook/blosxom/rss/rss2js.html
  • 15.
    • 2004 Sep.
    • bloglines2ipod
    • http://bulknews.net/lib/utils/bloglines2ipod/
  • 16.
    • 2004 Oct.
    • rss2audiobook
    • http://bulknews.net/lib/utils/rss2audiobook/
  • 17.
    • 2005 Aug.
    • bloglines2gmail
    • http://svn.bulknews.net/repos/public/bloglines2email/trunk/
  • 18.
    • Looks like
    • It's not only me
    • doing these things.
  • 19.
    • rss2opml
    • http://aruntx.com/software/rss2opml/
  • 20.
    • rss2pdf
    • http://rss2pdf.com/
  • 21.
    • rss2atom
    • brian.wanamaker.com/mybicycle/2004/02/rss2atom.html
  • 22.
    • atom2rss
    • http://www.2rss.com/software.php?page=atom2rss
  • 23.
    • rss2ical
    • http://bura-bura.com/blog/archives/2004/06/22/rss2ical/
  • 24.
    • Bloglines2opml
    • http://mycvs.org/wp/wp-content/wp-transform.php
  • 25.
    • rss2gmail
    • http://www.cs.utexas.edu/~karu/gmailrss/
  • 26.
    • rss2imap
    • http://rss2imap.sourceforge.jp/
  • 27.
    • ebay2rss
    • http://www.2rss.com/software.php?page=ebay2rss
  • 28.
    • svn2rss
    • http://twiki.org/cgi-bin/view/Codev/Svn2rss
  • 29.
    • <anything>2<anything>
  • 30.
    • Being sick of
    • writing the same code
    • again and again
  • 31.
    • Why not create
    • A pluggable platform
    • Instead?
  • 32.
    • With reusable
    • Parsers / Emitters
    • / Filters?
  • 33.
    • That's what
    • Plagger is.
  • 34. Plugin overview (diagram)
    • Subscription: Bloglines, Config, OPML, XOXO, File, DBI, FOAF …
    • CustomFeed: Mixi, Yahoo360JP, POP3, iCal, iTunes, Amazon, YouTube …
    • Filter: StripRSSAd, TruePermalink, EntryFullText, Pipe, Thumbnail, FindEnclosures, FetchEnclosure, SpamAssassin, RSSLiberalDateTime, URLBL, ResolveRelativeLink …
    • Publish: Gmail, Delicious, PDF, MT, Feed, Planet, Speech …
    • Notify: IRC, Eject, Growl, MSAgent, SSTP …
  • 38.
    • Just like
    • Lego™ Block
  • 39.
    • Create an app
    • With combo of
    • Plugins!
  • 40.
    • Example App #1
    • Bloglines to Gmail
  • 41. bloglines2gmail.yaml
    plugins:
      - module: Subscription::Bloglines
        config:
          username: you@example.com
          password: foobar
          mark_read: 1
      - module: Publish::Gmail
        config:
          mailto: [email_address]
          mailfrom: miyagawa@example.com
          mailroute:
            via: smtp
            host: smtp.example.com
  • 42.
    • Run it on crontab
    • % ./plagger -c bloglines2gmail.yaml
  • 43. RSS in Gmail
  • 44. HTML + Images
  • 45. Feed Image (Logo / Buddy Icon)
  • 46. Search
  • 47. Auto grouping (“Conversations”)
  • 48. Diff
  • 49. Tips: Filter
  • 50. Tips: is:unread
  • 51.
    • Example App #2
    • RSS to ircbot
  • 52. RSS bot in action
    • #plagger on freenode
  • 53. Config for RSS bot (1/3)
    plugins:
      - module: Subscription::Config
        config:
          feed:
            # Trac's feed for changesets
            - http://plagger.org/…/rss
  • 54. Config for RSS bot (2/3)
      # I don't like to be notified of the same items
      # more than once!
      - module: Filter::Rule
        rule:
          module: Fresh
          mtime:
            path: /tmp/rssbot.time
            autoupdate: 1
  • 55. Config for RSS bot (3/3)
      - module: Notify::IRC
        config:
          daemon_port: 9999
          nickname: plaggerbot
          server_host: chat.freenode.net
          server_channels:
            # quoted, because a bare # starts a YAML comment
            - "#plagger-ja"
            - "#plagger"
  • 56.
    • See more in
    • examples/irc.yaml
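
The Fresh rule in the RSS bot config (slide 54) can be pictured roughly as follows. This is a minimal Python sketch of the idea, not Plagger's actual Perl implementation: the function name `fresh_entries`, the entry dict shape, and the use of the state file's modification time as the "last run" marker are all assumptions made for illustration.

```python
import os
import time


def fresh_entries(entries, state_path, autoupdate=True, now=None):
    """Return entries newer than the state file's mtime (hypothetical helper).

    entries: list of dicts with a 'date' key holding a Unix timestamp.
    state_path: file whose modification time records the previous run.
    """
    now = time.time() if now is None else now
    # If the state file does not exist yet, treat every entry as fresh.
    last_run = os.path.getmtime(state_path) if os.path.exists(state_path) else 0.0
    fresh = [e for e in entries if e["date"] > last_run]
    if autoupdate:
        # Create the file if needed, then stamp it with the current run time
        # so the next run only reports newer entries.
        with open(state_path, "a"):
            pass
        os.utime(state_path, (now, now))
    return fresh
```

Running this from cron with the same state path each time means an item is reported at most once, which is the behaviour the config comment asks for.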
  • 57.
    • Example App #3
    • Planet
  • 58.
    • http://planet.yapcchicago.org/
  • 59. planet-yapcna.yaml (1/4)
    plugins:
      - module: Subscription::Config
        config:
          feed:
            - http://yapcchicago.org/feed/
            - http://use.perl.org/search.pl?query=YAPC…
            - http://del.icio.us/rss/tag/yapcna2006
            - http://feeds.technorati.com/feed/posts/…
            - http://bloglines.com/search?q=YAPC+NA&…
            # etc, etc …
  • 60. planet-yapcna.yaml (2/4)
      # Normalize feed title and permalinks
      - module: Filter::FeedBurnerPermalink
      - module: Filter::TruePermalink
      - module: Filter::StripTagsFromTitle
  • 61. planet-yapcna.yaml (3/4)
      # Create a smartfeed for all the entries merged
      - module: SmartFeed::All
        rule_op: AND
        rule:
          - module: Fresh
            duration: 10080  # seven days
          - module: URLBL
            dnsbl: rbl.bulkfeeds.jp
        config:
          title: Planet YAPC::NA
  • 62. planet-yapcna.yaml (4/4)
      # Generate nice XHTML out of the SmartFeed
      - module: Publish::Planet
        rule:
          expression: $args->{feed}->id eq 'smartfeed:all'
        config:
          dir: /path/to/htdocs
          skin: sixapart-std
          template:
            members_list: 1
            style_url: http://example.com/style.css
  • 63.
    • (I admit this Planet config is so clumsy
    • and I'll work on that to make it suck less.)
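
The `rule_op: AND` combination on slide 61 can be sketched like this. It is an illustration in Python under stated assumptions, not Plagger code: `duration` is taken to be in minutes (10080 minutes is seven days, matching the slide's comment), and the blocklist check uses a static set of host names where the real URLBL rule queries a DNS blocklist. All function names here are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlsplit


def is_fresh(entry, now, duration_minutes):
    """True if the entry was published within the last duration_minutes."""
    return now - entry["date"] <= timedelta(minutes=duration_minutes)


def not_blocklisted(entry, blocked_hosts):
    """True if the entry's link host is not in the (stand-in) blocklist."""
    return urlsplit(entry["link"]).hostname not in blocked_hosts


def select_entries(entries, rules, rule_op="AND"):
    """Keep entries that satisfy all rules (AND) or at least one rule (OR)."""
    combine = all if rule_op == "AND" else any
    return [e for e in entries if combine(rule(e) for rule in rules)]


now = datetime(2006, 6, 27, tzinfo=timezone.utc)
entries = [
    {"link": "http://good.example/new", "date": now - timedelta(days=1)},
    {"link": "http://good.example/old", "date": now - timedelta(days=8)},
    {"link": "http://spam.example/new", "date": now - timedelta(days=1)},
]
rules = [
    lambda e: is_fresh(e, now, 10080),
    lambda e: not_blocklisted(e, {"spam.example"}),
]
merged = select_entries(entries, rules, rule_op="AND")
```

With AND, only the recent entry from the non-blocked host ends up in the merged feed.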
  • 64.
    • Example App #4
    • YouTube downloader
  • 65. youtube.yaml
    plugins:
      - module: Subscription::Config
        config:
          feed:
            - http://www.youtube.com/rss/tag/yapc.rss
      # discover real .flv URLs on YouTube.com
      - module: Filter::FindEnclosures
      # fetch them to local directory
      - module: Filter::FetchEnclosure
        config:
          dir: path/to/save
  • 66.
    • Coming soon …
    • Filter::ffmpeg, Sync::PSP, Sync::iPodVideo
  • 67.
    • Plagger
    • Core features
  • 68.
    • RSS/Atom
    • Auto-Discovery
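
Feed auto-discovery usually means reading the `<link rel="alternate">` elements of an HTML page. A minimal sketch in Python, using only the standard library, is shown here; it illustrates the general technique and is not Plagger's own code. The names `FeedLinkFinder` and `discover_feeds` are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect feed URLs advertised via <link rel="alternate" ...>."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        is_feed = attr.get("type", "").lower() in FEED_TYPES
        if "alternate" in rels and is_feed and attr.get("href"):
            # Resolve relative hrefs against the page URL.
            self.feeds.append(urljoin(self.base_url, attr["href"]))


def discover_feeds(html, base_url):
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feeds
```

Given a blog's front page and its URL, `discover_feeds` returns the absolute feed URLs in document order, so a subscription can list the human-readable page rather than the feed itself.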
  • 69.
    • Support various
    • Feed formats
    • RSS 0.91 to Atom 1.0
  • 70.
    • Support parsing
    • Broken XML feeds
    • (XML::Liberal)
  • 71.
    • Podcast / Videocast
    • Support
    • (RSS 2.0 & Atom 1.0)
  • 72.
    • Photocast
    • Media RSS
    • iTunes RSS*
  • 73.
    • Sane I18N impl.
    • Unicode & Timezone
  • 74.
    • Access to
    • browser's Cookies
    • IE, Safari, Firefox and w3m
    • Thanks to brian d foy
  • 75.
    • Quick tour
    • On available plugins
  • 76. Plugin phases (types)
    • Subscription
    • Aggregator
    • CustomFeed
    • Filter
    • Publish
    • Notify
    • Search
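
One way to picture plugin phases is a runner that walks the phases in a fixed order and calls whichever plugins registered a hook for each. The Python sketch below is a toy model under loud assumptions: it covers only four of the phases, assumes they run in the order listed, and represents plugins as plain dicts. Plagger's real hook mechanism is written in Perl and is not shown on these slides.

```python
# Assumed phase order for this toy model (a subset of the slide's list).
PHASES = ["subscription", "aggregator", "filter", "publish"]


def run_pipeline(plugins, context=None):
    """Run every plugin hook phase by phase, sharing one context dict."""
    context = {} if context is None else context
    for phase in PHASES:
        for plugin in plugins:
            hook = plugin.get(phase)
            if hook is not None:
                hook(context)
    return context


def load_subscriptions(ctx):
    ctx["urls"] = ["http://www.yapcchicago.org/feed/"]


def aggregate(ctx):
    # A real aggregator would fetch and parse; this fakes one entry per feed.
    ctx["entries"] = [{"feed": u, "title": "  Hello YAPC  "} for u in ctx["urls"]]


def strip_titles(ctx):
    for entry in ctx["entries"]:
        entry["title"] = entry["title"].strip()


def publish(ctx):
    ctx["published"] = [entry["title"] for entry in ctx["entries"]]


# The list order does not matter: the phase order decides who runs first.
PLUGINS = [
    {"publish": publish},
    {"filter": strip_titles},
    {"subscription": load_subscriptions},
    {"aggregator": aggregate},
]
result = run_pipeline(PLUGINS)
```

Swapping the publish plugin for another one changes where the entries go without touching the other three, which is the Lego-block idea in miniature.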
  • 77.
    • Subscription
    • load subscriptions
    • (list the feeds/URLs to aggregate)
  • 78.
    • Subscription::Config
    - module: Subscription::Config
      config:
        feed:
          - http://www.yapcchicago.org/feed/
          - http://tokyo.yapcasia.org/blog/
  • 79.
    • Subscription::OPML
    - module: Subscription::OPML
      config:
        url: http://www.example.com/subs.opml

    # subs.opml
    <opml>
      <outline xmlUrl="http://www.yapcchicago.org/feed/" />
      <outline htmlUrl="http://tokyo.yapcasia.org/blog/" />
    </opml>
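
Reading subscriptions out of an OPML file like the one on this slide takes only a few lines. The Python sketch below is an illustration, not the plugin's code: `opml_urls` is a hypothetical helper, and treating an `htmlUrl`-only outline as a page that still needs feed auto-discovery is an assumption suggested by the slide's example.

```python
import xml.etree.ElementTree as ET


def opml_urls(opml_text):
    """Collect (kind, url) pairs from every <outline>, in document order."""
    urls = []
    for outline in ET.fromstring(opml_text).iter("outline"):
        if outline.get("xmlUrl"):
            urls.append(("feed", outline.get("xmlUrl")))
        elif outline.get("htmlUrl"):
            # Only a page URL is given; the feed would have to be discovered.
            urls.append(("page", outline.get("htmlUrl")))
    return urls


SUBS_OPML = """
<opml>
  <outline xmlUrl="http://www.yapcchicago.org/feed/" />
  <outline htmlUrl="http://tokyo.yapcasia.org/blog/" />
</opml>
"""
subscriptions = opml_urls(SUBS_OPML.strip())
```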
  • 80.
    • Subscription::File
    - module: Subscription::File
      config:
        url: file:///path/to/subscription.txt

    % cat subscription.txt
    http://www.yapcchicago.org/feed/
    http://tokyo.yapcasia.org/blog/
    %
  • 81.
    • Subscription::XOXO
    - module: Subscription::XOXO
      config:
        url: http://www.example.com/subscription.html

    # subscription.html
    <ul class="xoxo">
      <li><a href="http://www.yapcchicago.org/feed/">YAPC::NA</a></li>
      <li><a href="http://tokyo.yapcasia.org/blog/">YAPC::Asia</a></li>
    </ul>
  • 82.
    • Subscription::Bookmarks
    • Read bookmarks file of IE, Firefox and Safari
  • 83.
    • Aggregator
    • Aggregate and parse the feeds
    • listed in subscription(s)
  • 84.
    • Aggregator::Simple
    • The "dumb" aggregator
    • Using LWP and XML::Feed
    • Might be okay for < 20 feeds
  • 85.
    • Aggregator::Xango
    • The "fast" aggregator using Xango.pm
    • the POE based scalable web crawler
    • For > 100 feeds
  • 86.
    • CustomFeed
    • Feed formats other than RSS/Atom
    • Scrapers
  • 87.
    • CustomFeed::POP3
    • Each email is a feed.
    • Attachments are enclosures.
  • 88.
    • CustomFeed::MySpace*
    • Your friends' journals as a feed
    • (* indicates it's not developed yet)
  • 89.
    • CustomFeed::FlickrSearch
    • Search results as feed
    • Each photo found is an entry
    • (with enclosures).
  • 90.
    • Filter
    • Normalize / Repair feed metadata
    • Upgrade feed content
    • Filter feed content using text filters
    • Invoke some action on entries
  • 91.
    • Filter::StripRSSAd
    • Supports: Google AdSense, FeedBurner, Pheedo
  • 92.
    • Filter::EntryFullText
    • Upgrade content-less feed to fulltext feed
    • by fetching individual HTML
    • and extracting the content body
  • 93.
    • Filter::TruePermalink
    • Resolves nasty redirection URL
    • to the "true" permalink
    • (e.g. http://…/go.php?url=….)
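
For redirect URLs of the `go.php?url=…` kind, the target can often be read straight out of the query string. The Python sketch below shows only that one case; `true_permalink` and the default parameter name `url` are assumptions for illustration, and the real plugin may well work differently (for example by following HTTP redirects).

```python
from urllib.parse import parse_qs, urlsplit


def true_permalink(url, param="url"):
    """Return the target embedded in a redirector's query string.

    Falls back to the original URL when no usable target is found.
    """
    targets = parse_qs(urlsplit(url).query).get(param)
    if targets and targets[0].startswith(("http://", "https://")):
        return targets[0]
    return url


# redirect.example and blog.example are placeholder hosts.
redirected = "http://redirect.example/go.php?url=http%3A%2F%2Fblog.example%2Fpost%2F1"
resolved = true_permalink(redirected)
```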
  • 94.
    • Filter::FindEnclosures
    • Find enclosures from content body
    • <a href="http://…./foo.mp3">episode #1</a>
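
Finding enclosures in an entry body can be approximated by scanning `<a href>` links for media file extensions. This Python sketch is an illustration of that approach only: the extension list and the names `EnclosureFinder` and `find_enclosures` are assumptions, and the actual plugin evidently does more (slide 65 mentions discovering the real .flv URLs on YouTube).

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Illustrative list; a real implementation would likely cover more types.
MEDIA_EXTENSIONS = (".mp3", ".m4a", ".mp4", ".mov", ".flv")


class EnclosureFinder(HTMLParser):
    """Collect hrefs of <a> tags that point at media files."""

    def __init__(self):
        super().__init__()
        self.enclosures = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # Compare the path only, so query strings do not hide the extension.
        if urlsplit(href).path.lower().endswith(MEDIA_EXTENSIONS):
            self.enclosures.append(href)


def find_enclosures(body_html):
    finder = EnclosureFinder()
    finder.feed(body_html)
    return finder.enclosures
```

Each URL found this way could then be attached to the entry as an enclosure, which is what makes a plain blog post usable as a podcast item.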
  • 95.
    • Filter::RSSLiberalDateTime
    • Deal with broken RSS datetime formats
    • <pubDate>2006/06/27 01:45:22 +0900</pubDate>
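
A liberal date parser typically tries the standard format first and then falls back to the malformed variants seen in the wild. A minimal Python sketch of that idea follows; the format list and the name `parse_liberal` are assumptions for illustration, not the plugin's actual rules.

```python
from datetime import datetime

# Tried in order: RFC 822 style (what RSS 2.0 expects), then two
# non-standard variants, including the one shown on the slide.
FORMATS = [
    "%a, %d %b %Y %H:%M:%S %z",
    "%Y/%m/%d %H:%M:%S %z",
    "%Y-%m-%d %H:%M:%S %z",
]


def parse_liberal(value):
    """Return a timezone-aware datetime, or None if no format matches."""
    value = value.strip()
    for fmt in FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    return None


broken = parse_liberal("2006/06/27 01:45:22 +0900")
proper = parse_liberal("Tue, 27 Jun 2006 01:45:22 +0900")
```

Both calls yield the same moment in time, so downstream plugins see a consistent date whatever the feed emitted.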
  • 96.
    • Publish
    • Publish aggregated entries to online services
    • Convert feeds to other formats
  • 97.
    • Publish::Feed
    • Republish feed in RSS/Atom
    • Good to use with scrapers
  • 98.
    • Publish::Delicious
    • Auto-post entries to your del.icio.us
    • using its REST API
  • 99.
    • Publish::iCal*
    • Publish iCal feeds out of RSS/Atom
  • 100.
    • Publish::MTWidget
  • 101.
    • Publish::Excel
    • If your boss is unhappy about you reading blogs in a browser.
  • 102.
    • Search
    • Index aggregated entries on search engines
  • 103.
    • Search::Spotlight
  • 104.
    • Search::Estraier
    • Uses HyperEstraier XMLRPC node API
  • 105.
    • Notify
    • Notify feed updates in various ways
  • 106.
    • Notify::Campfire
  • 107.
    • Notify::Growl
  • 108.
    • Notify::MSAgent
  • 109.
    • Notify::Eject
    • Supports: Windows, Linux, FreeBSD and Mac OSX!
  • 110.
    • So far,
    • Plagger rocks 
  • 111.
    • Actually,
    • Plagger sucks 
  • 112.
    • No good
    • Documentation
    • (Not a big deal if you can read Perl code)
  • 113.
    • Horrible lots of
    • CPAN deps.
    % grep requires Makefile.PL | wc -l
    25
    % grep recommends Makefile.PL | wc -l
    70
  • 114.
    • cpan Plagger
    • Doesn't work (partially)
  • 115.
    • 144 open tickets
    • On Trac
    • http://plagger.org/trac/query
    • (Not a bad sign. I use it as a Wishlist)
  • 116.
    • The way it
    • de-dupes entries
    • is clumsy
  • 117.
    • No database
    • Backend (yet)
    • Planned to be in core of 0.8
  • 118.
    • Plugin invocations
    • Can be rule-based
    • but undocumented
  • 119.
    • Different Plugin functionalities
    • On the same namespaces
    • (CustomFeed, Filter, Publish)
  • 120.
    • Publish::Gmail
    • was badly named
  • 121.
    • I want you
    • To fix & improve it.
  • 122.
    • Plagger
    • dev. Status
  • 123.
    • Version
    • 0.7.3
  • 124.
    • Coming Soon …
  • 125.
    • iTunes RSS support
  • 126.
    • Enclosure processors
    • ffmpeg, Sync::PSP, Sync::iPodVideo
  • 127.
    • Rich Media metadata
    • ID3 tag in enclosures
    • Links to imdb.com / amazon.com
    • hReview microformats
  • 128.
    • Database Storage
    • & Server API
    • branches/plagger-server
  • 129.
    • Calendar Support
    • iCal parser & emitter
    • hCalendar microformats
    • .ics attached in emails
    • Sync::SyncML
  • 130.
    • How's the dev
    • going on?
  • 131.
    • 31 authors
    • 128 plugins
    • (most of them are from Japan)
  • 132.
    • Buzz in Japan
  • 133.
    • I am Happy
    • With "the Buzz"
  • 134.
    • I am !Happy
    • With "In Japan"
  • 135.
    • Help spread
    • the word
  • 136.
    • Doc contributions
    • are highly welcome
  • 137.
    • http://plagger.org/
    • Planet, Mailing List, IRC
    • Bug Tracking, SVN repository
  • 138.
    • #plagger on freenode
  • 139.
    • Join Us!
  • 140.
    • Thank you
    • Questions?