Transcript

  • 1. Plagger – RSS/Atom remixing platform
    • Tatsuhiko Miyagawa [email_address]
    • Six Apart, Ltd. / Shibuya Perl Mongers
    • YAPC::NA 2006 Chicago
  • 2.
    • What is Plagger?
  • 3.
    • Pluggable
    • RSS/Atom
    • Aggregator
  • 4.
    • Platform for
    • Aggregation
    • / remixing
  • 5.
    • Whatever
  • 6.
    • Speaking of
    • RSS/Atom aggregator
  • 7.
    • Who here is
    • using Bloglines?
    • (or any other web-based aggregators)
  • 8.
    • Who here
    • thinks that it sucks?
  • 9.
    • Plagger is for you.
    • see Bloglines2gmail
  • 10.
    • Who here has
    • ever written a tool
    • using XML::RSS?
  • 11.
    • Welcome aboard.
    • There's a chance that you can transform
    • your script into a Plagger plugin.
    • And I can give you a svn commit bit!
  • 12.
    • Why Pluggable?
    • Just for a feed aggregation?
  • 13.
    • 2002 Apr.
    • baseball2rss
    • http://search.cpan.org/dist/WWW-Baseball-NPB/
  • 14.
    • 2003 Oct.
    • rss2javascript
    • http://blog.bulknews.net/cookbook/blosxom/rss/rss2js.html
  • 15.
    • 2004 Sep.
    • bloglines2ipod
    • http://bulknews.net/lib/utils/bloglines2ipod/
  • 16.
    • 2004 Oct.
    • rss2audiobook
    • http://bulknews.net/lib/utils/rss2audiobook/
  • 17.
    • 2005 Aug.
    • bloglines2gmail
    • http://svn.bulknews.net/repos/public/bloglines2email/trunk/
  • 18.
    • Looks like
    • It's not only me
    • doing these things.
  • 19.
    • rss2opml
    • http://aruntx.com/software/rss2opml/
  • 20.
    • rss2pdf
    • http://rss2pdf.com/
  • 21.
    • rss2atom
    • http://brian.wanamaker.com/mybicycle/2004/02/rss2atom.html
  • 22.
    • atom2rss
    • http://www.2rss.com/software.php?page=atom2rss
  • 23.
    • rss2ical
    • http://bura-bura.com/blog/archives/2004/06/22/rss2ical/
  • 24.
    • Bloglines2opml
    • http://mycvs.org/wp/wp-content/wp-transform.php
  • 25.
    • rss2gmail
    • http://www.cs.utexas.edu/~karu/gmailrss/
  • 26.
    • rss2imap
    • http://rss2imap.sourceforge.jp/
  • 27.
    • ebay2rss
    • http://www.2rss.com/software.php?page=ebay2rss
  • 28.
    • svn2rss
    • http://twiki.org/cgi-bin/view/Codev/Svn2rss
  • 29.
    • <anything>2<anything>
  • 30.
    • Being sick of
    • writing the same code
    • again and again
  • 31.
    • Why not create
    • a pluggable platform
    • instead?
  • 32.
    • With reusable
    • Parsers / Emitters
    • / Filters?
  • 33.
    • That's what
    • Plagger is.
  • 34. Plugin map
    • Subscription: Bloglines, Config, OPML, XOXO, File, DBI, FOAF …
    • CustomFeed: Mixi, Yahoo360JP, POP3, iCal, iTunes, Amazon, YouTube …
    • Filter: StripRSSAd, TruePermalink, EntryFullText, Pipe, Thumbnail, FindEnclosures, FetchEnclosure, SpamAssassin, RSSLiberalDateTime, URLBL, ResolveRelativeLink …
    • Publish: Gmail, Delicious, PDF, MT, Feed, Planet, Speech …
    • Notify: IRC, Eject, Growl, MSAgent, SSTP …
  • 38.
    • Just like
    • Lego™ blocks
  • 39.
    • Create an app
    • with a combo of
    • plugins!
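
The "combo of plugins" idea can be sketched in a few lines. This is only an illustration of the composition pattern, not Plagger's actual plugin API: the names `subscription`, `require_link`, `publish_to` and `run` are hypothetical, and entries are plain dicts.

```python
from typing import Callable, Dict, List

Entry = Dict[str, str]                      # e.g. {"title": ..., "link": ...}
Plugin = Callable[[List[Entry]], List[Entry]]


def subscription(feed_entries: List[Entry]) -> Plugin:
    """Hypothetical source plugin: ignores its input and yields fixed entries."""
    return lambda _entries: [dict(e) for e in feed_entries]


def require_link(entries: List[Entry]) -> List[Entry]:
    """Hypothetical filter plugin: drop entries that have no link."""
    return [e for e in entries if e.get("link")]


def publish_to(outbox: List[str]) -> Plugin:
    """Hypothetical publish plugin: append one formatted line per entry."""
    def plugin(entries: List[Entry]) -> List[Entry]:
        outbox.extend(f"{e['title']} <{e['link']}>" for e in entries)
        return entries
    return plugin


def run(plugins: List[Plugin]) -> List[Entry]:
    """Feed the output of each plugin into the next one, in order."""
    entries: List[Entry] = []
    for plugin in plugins:
        entries = plugin(entries)
    return entries
```

Swapping `publish_to` for another emitter changes the app without touching the source or the filter, which is the point of the Lego comparison.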
  • 40.
    • Example App #1
    • Bloglines to Gmail
  • 41. bloglines2gmail.yaml
      plugins:
        - module: Subscription::Bloglines
          config:
            username: you@example.com
            password: foobar
            mark_read: 1
        - module: Publish::Gmail
          config:
            mailto: [email_address]
            mailfrom: miyagawa@example.com
            mailroute:
              via: smtp
              host: smtp.example.com
  • 42.
    • Run it from crontab
    • % ./plagger -c bloglines2gmail.yaml
  • 43. RSS in Gmail
  • 44. HTML + Images
  • 45. Feed Image (Logo / Buddy Icon)
  • 46. Search
  • 47. Auto grouping (“Conversations”)
  • 48. Diff
  • 49. Tips: Filter
  • 50. Tips: is:unread
  • 51.
    • Example App #2
    • RSS to ircbot
  • 52. RSS bot in action
    • #plagger on freenode
  • 53. Config for RSS bot (1/3)
      plugins:
        - module: Subscription::Config
          config:
            feed:
              # Trac's feed for changesets
              - http://plagger.org/…/rss
  • 54. Config for RSS bot (2/3)
        # I don't like to be notified of the same items
        # more than once!
        - module: Filter::Rule
          rule:
            module: Fresh
            mtime:
              path: /tmp/rssbot.time
              autoupdate: 1
  • 55. Config for RSS bot (3/3)
        - module: Notify::IRC
          config:
            daemon_port: 9999
            nickname: plaggerbot
            server_host: chat.freenode.net
            server_channels:
              # quoted, because a bare # starts a YAML comment
              - "#plagger-ja"
              - "#plagger"
  • 56.
    • See more in
    • examples/irc.yaml
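
The "don't notify me twice" rule on slide 54 boils down to comparing each entry's timestamp with the time of the previous run. The sketch below assumes that the state file's modification time stores the last run and that entries are `(title, updated_epoch)` tuples. Both are assumptions made for illustration, not a description of how the Fresh rule is actually implemented.

```python
import os


def fresh_entries(entries, state_path, now):
    """Return entries updated after the last run, then record this run.

    entries:    list of (title, updated_epoch) tuples (hypothetical shape)
    state_path: file whose mtime stores the time of the previous run
    now:        current time as epoch seconds
    """
    # On the first run there is no state file, so everything counts as fresh.
    last_run = os.path.getmtime(state_path) if os.path.exists(state_path) else 0.0
    fresh = [entry for entry in entries if entry[1] > last_run]

    # Rough equivalent of "autoupdate: 1": touch the state file.
    with open(state_path, "a"):
        pass
    os.utime(state_path, (now, now))
    return fresh
```

Run from cron, each invocation only passes on what arrived since the previous one.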
  • 57.
    • Example App #3
    • Planet
  • 58.
    • http://planet.yapcchicago.org/
  • 59. planet-yapcna.yaml (1/4)
      plugins:
        - module: Subscription::Config
          config:
            feed:
              - http://yapcchicago.org/feed/
              - http://use.perl.org/search.pl?query=YAPC…
              - http://del.icio.us/rss/tag/yapcna2006
              - http://feeds.technorati.com/feed/posts/…
              - http://bloglines.com/search?q=YAPC+NA&…
              # etc, etc …
  • 60. planet-yapcna.yaml (2/4)
        # Normalize feed title and permalinks
        - module: Filter::FeedBurnerPermalink
        - module: Filter::TruePermalink
        - module: Filter::StripTagsFromTitle
  • 61. planet-yapcna.yaml (3/4)
        # Create a smartfeed for all the entries merged
        - module: SmartFeed::All
          rule_op: AND
          rule:
            - module: Fresh
              duration: 10080 # seven days
            - module: URLBL
              dnsbl: rbl.bulkfeeds.jp
          config:
            title: Planet YAPC::NA
  • 62. planet-yapcna.yaml (4/4)
        # Generate nice XHTML out of the SmartFeed
        - module: Publish::Planet
          rule:
            expression: $args->{feed}->id eq 'smartfeed:all'
          config:
            dir: /path/to/htdocs
            skin: sixapart-std
            template:
              members_list: 1
              style_url: http://example.com/style.css
  • 63.
    • (I admit this Planet config is so clumsy
    • and I'll work on that to make it suck less.)
  • 64.
    • Example App #4
    • YouTube downloader
  • 65. youtube.yaml
      plugins:
        - module: Subscription::Config
          config:
            feed:
              - http://www.youtube.com/rss/tag/yapc.rss
        # discover real .flv URLs on YouTube.com
        - module: Filter::FindEnclosures
        # fetch them to local directory
        - module: Filter::FetchEnclosure
          config:
            dir: path/to/save
  • 66.
    • Coming soon …
    • Filter::ffmpeg, Sync::PSP, Sync::iPodVideo
  • 67.
    • Plagger
    • Core features
  • 68.
    • RSS/Atom
    • Auto-Discovery
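
Feed auto-discovery conventionally means looking for `<link rel="alternate">` elements with an RSS or Atom media type in a page's HTML. A minimal sketch of that convention follows; the names `FeedLinkFinder` and `discover_feeds` are made up for illustration, and this is not a description of how Plagger itself does it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect feed URLs advertised via <link rel="alternate" ...>."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        media_type = (attr.get("type") or "").lower()
        if "alternate" in rel and media_type in FEED_TYPES and attr.get("href"):
            # Resolve relative hrefs against the page URL.
            self.feeds.append(urljoin(self.base_url, attr["href"]))


def discover_feeds(html, base_url):
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feeds
```

With something like this in place, a subscription list can hold ordinary blog URLs rather than feed URLs.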
  • 69.
    • Support various
    • Feed formats
    • RSS 0.91 to Atom 1.0
  • 70.
    • Support parsing
    • Broken XML feeds
    • (XML::Liberal)
  • 71.
    • Podcast / Videocast
    • Support
    • (RSS 2.0 & Atom 1.0)
  • 72.
    • Photocast
    • Media RSS
    • iTunes RSS*
  • 73.
    • Sane I18N impl.
    • Unicode & Timezone
  • 74.
    • Access to
    • browser's Cookies
    • IE, Safari, Firefox and w3m
    • Thanks to brian d foy
  • 75.
    • Quick tour
    • On available plugins
  • 76. Plugin phases (types)
    • Subscription
    • Aggregator
    • CustomFeed
    • Filter
    • Publish
    • Notify
    • Search
  • 77.
    • Subscription
    • load subscriptions
    • (list the feeds/URLs to aggregate)
  • 78.
    • Subscription::Config
      - module: Subscription::Config
        config:
          feed:
            - http://www.yapcchicago.org/feed/
            - http://tokyo.yapcasia.org/blog/
  • 79.
    • Subscription::OPML
      - module: Subscription::OPML
        config:
          url: http://www.example.com/subs.opml

      # subs.opml
      <opml>
        <outline xmlUrl="http://www.yapcchicago.org/feed/" />
        <outline htmlUrl="http://tokyo.yapcasia.org/blog/" />
      </opml>
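
Reading a subscription list out of OPML like the one above takes only a few lines. The sketch below prefers `xmlUrl` and falls back to `htmlUrl`, on the assumption that a later step can auto-discover the feed from the page. The function name `opml_urls` is hypothetical.

```python
import xml.etree.ElementTree as ET


def opml_urls(opml_text):
    """Return one URL per <outline>, preferring xmlUrl over htmlUrl."""
    root = ET.fromstring(opml_text)
    urls = []
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl") or outline.get("htmlUrl")
        if url:
            urls.append(url)
    return urls
```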
  • 80.
    • Subscription::File
      - module: Subscription::File
        config:
          url: file:///path/to/subscription.txt

      % cat subscription.txt
      http://www.yapcchicago.org/feed/
      http://tokyo.yapcasia.org/blog/
      %
  • 81.
    • Subscription::XOXO
      - module: Subscription::XOXO
        config:
          url: http://www.example.com/subscription.html

      # subscription.html
      <ul class="xoxo">
        <li><a href="http://www.yapcchicago.org/feed/">YAPC::NA</a></li>
        <li><a href="http://tokyo.yapcasia.org/blog/">YAPC::Asia</a></li>
      </ul>
  • 82.
    • Subscription::Bookmarks
    • Read bookmarks file of IE, Firefox and Safari
  • 83.
    • Aggregator
    • Aggregate and parse the feeds
    • listed in subscription(s)
  • 84.
    • Aggregator::Simple
    • The "dumb" aggregator
    • Using LWP and XML::Feed
    • Might be okay for < 20 feeds
  • 85.
    • Aggregator::Xango
    • The "fast" aggregator using Xango.pm
    • the POE based scalable web crawler
    • For > 100 feeds
  • 86.
    • CustomFeed
    • Feed formats other than RSS/Atom
    • Scrapers
  • 87.
    • CustomFeed::POP3
    • Each email is a feed.
    • Attachments are enclosures.
  • 88.
    • CustomFeed::MySpace*
    • Your friends' journals as a feed
    • (* indicates it's not developed yet)
  • 89.
    • CustomFeed::FlickrSearch
    • Search results as feed
    • Each photo found is an entry
    • (with enclosures).
  • 90.
    • Filter
    • Normalize / Repair feed metadata
    • Upgrade feed content
    • Filter feed content using text filters
    • Invoke some action on entries
  • 91.
    • Filter::StripRSSAd
    • Supports: Google AdSense, FeedBurner, Pheedo
  • 92.
    • Filter::EntryFullText
    • Upgrade content-less feed to fulltext feed
    • by fetching individual HTML
    • and extracting the content body
  • 93.
    • Filter::TruePermalink
    • Resolves nasty redirection URL
    • to the "true" permalink
    • (e.g. http://…/go.php?url=….)
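
For the `go.php?url=…` style of redirect, the real permalink is often sitting in the query string. Here is a rough sketch of that one case. The parameter name `url` and the helper name `true_permalink` are assumptions, and redirectors that hide the target (where the request would have to be followed) are out of scope.

```python
from urllib.parse import parse_qs, urlparse


def true_permalink(link, param="url"):
    """Return the URL embedded in a redirect link, or the link unchanged."""
    query = parse_qs(urlparse(link).query)
    target = query.get(param, [None])[0]
    # Only accept targets that look like absolute web URLs.
    if target and urlparse(target).scheme in ("http", "https"):
        return target
    return link
```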
  • 94.
    • Filter::FindEnclosures
    • Find enclosures from content body
    • <a href="http://…./foo.mp3">episode #1</a>
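
Finding enclosures in a content body can be approximated by scanning anchors for media-looking file extensions. The extension list and the names `EnclosureFinder` / `find_enclosures` below are illustrative assumptions. A real filter would likely also look at MIME types and site-specific rules.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumed set of extensions, chosen only for this example.
MEDIA_EXTENSIONS = (".mp3", ".m4a", ".mp4", ".mov", ".flv")


class EnclosureFinder(HTMLParser):
    """Collect hrefs of <a> tags that point at media files."""

    def __init__(self):
        super().__init__()
        self.enclosures = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # Compare against the path only, so query strings do not interfere.
        if urlparse(href).path.lower().endswith(MEDIA_EXTENSIONS):
            self.enclosures.append(href)


def find_enclosures(body_html):
    finder = EnclosureFinder()
    finder.feed(body_html)
    return finder.enclosures
```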
  • 95.
    • Filter::RSSLiberalDateTime
    • Deal with broken RSS datetime formats
    • <pubDate>2006/06/27 01:45:22 +0900</pubDate>
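
A liberal date parser can simply try the expected format first and then fall back to the malformed variants seen in the wild. The format list below is an assumption for illustration (only the second one comes from the slide), and `parse_liberal` is a hypothetical name.

```python
from datetime import datetime

FORMATS = (
    "%a, %d %b %Y %H:%M:%S %z",  # RFC 822 style, as RSS 2.0 expects
    "%Y/%m/%d %H:%M:%S %z",      # the broken form shown on the slide
    "%Y-%m-%dT%H:%M:%S%z",       # ISO 8601 style, sometimes seen in RSS
)


def parse_liberal(text):
    """Return a timezone-aware datetime, or None if no format matches."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None
```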
  • 96.
    • Publish
    • Publish aggregated entry to online services
    • Convert feeds to other formats
  • 97.
    • Publish::Feed
    • Republish feed in RSS/Atom
    • Good to use with scrapers
  • 98.
    • Publish::Delicious
    • Auto-post entries to your del.icio.us
    • using its REST API
  • 99.
    • Publish::iCal*
    • Publish iCal feeds out of RSS/Atom
  • 100.
    • Publish::MTWidget
  • 101.
    • Publish::Excel
    • If your boss is unhappy about you reading blogs in a browser.
  • 102.
    • Search
    • Index aggregated entries on search engines
  • 103.
    • Search::Spotlight
  • 104.
    • Search::Estraier
    • Uses HyperEstraier XMLRPC node API
  • 105.
    • Notify
    • Notify feed updates in various ways
  • 106.
    • Notify::Campfire
  • 107.
    • Notify::Growl
  • 108.
    • Notify::MSAgent
  • 109.
    • Notify::Eject
    • Supports: Windows, Linux, FreeBSD and Mac OSX!
  • 110.
    • So far,
    • Plagger rocks 
  • 111.
    • Actually,
    • Plagger sucks 
  • 112.
    • No good
    • Documentation
    • (Not a big deal if you can read Perl code)
  • 113.
    • A horrible lot of
    • CPAN deps.
      % grep requires Makefile.PL | wc -l
      25
      % grep recommends Makefile.PL | wc -l
      70
  • 114.
    • cpan Plagger
    • Doesn't work (partially)
  • 115.
    • 144 open tickets
    • On Trac
    • http://plagger.org/trac/query
    • (Not a bad sign. I use it as a Wishlist)
  • 116.
    • The way it
    • de-dupes entries
    • is clumsy
  • 117.
    • No database
    • Backend (yet)
    • Planned to be in core of 0.8
  • 118.
    • Plugin invocations
    • Can be rule-based
    • but undocumented
  • 119.
    • Different Plugin functionalities
    • On the same namespaces
    • (CustomFeed, Filter, Publish)
  • 120.
    • Publish::Gmail
    • was badly named
  • 121.
    • I want you
    • To fix & improve it.
  • 122.
    • Plagger
    • dev. Status
  • 123.
    • Version
    • 0.7.3
  • 124.
    • Coming Soon …
  • 125.
    • iTunes RSS support
  • 126.
    • Enclosure processors
    • ffmpeg, Sync::PSP, Sync::iPodVideo
  • 127.
    • Rich Media metadata
    • ID3 tag in enclosures
    • Links to imdb.com / amazon.com
    • hReview microformats
  • 128.
    • Database Storage
    • & Server API
    • branches/plagger-server
  • 129.
    • Calendar Support
    • iCal parser & emitter
    • hCalendar microformats
    • .ics attached in emails
    • Sync::SyncML
  • 130.
    • How's the dev
    • going on?
  • 131.
    • 31 authors
    • 128 plugins
    • (most of them are from Japan)
  • 132.
    • Buzz in Japan
  • 133.
    • I am Happy
    • With "the Buzz"
  • 134.
    • I am !Happy
    • With "In Japan"
  • 135.
    • Help spread
    • the word
  • 136.
    • Doc contributions
    • are highly welcome
  • 137.
    • http://plagger.org/
    • Planet, Mailing List, IRC
    • Bug Tracking, SVN repository
  • 138.
    • #plagger on freenode
  • 139.
    • Join Us!
  • 140.
    • Thank you
    • Questions?