Miyagawa Presentation Transcript

  • Plagger – RSS/Atom remixing platform Tatsuhiko Miyagawa [email_address] Six Apart, Ltd. / Shibuya Perl Mongers YAPC::NA 2006 Chicago
    • What is Plagger?
    • Pluggable
    • RSS/Atom
    • Aggregator
    • Platform for
    • Aggregation
    • / remixing
    • Whatever
    • Speaking of
    • RSS/Atom aggregator
    • Who here is
    • using Bloglines?
    • (or any other web-based aggregators)
    • Who here
    • thinks that it sucks?
    • Plagger is for you.
    • see Bloglines2gmail
    • Who here has
    • ever written a tool
    • using XML::RSS?
    • Welcome aboard.
    • There's a chance that you can transform
    • your script into a Plagger plugin.
    • And I can give you a svn commit bit!
    • Why Pluggable?
    • Just for a feed aggregation?
    • 2002 Apr.
    • baseball2rss
    • http://search.cpan.org/dist/WWW-Baseball-NPB/
    • 2003 Oct.
    • rss2javascript
    • http://blog.bulknews.net/cookbook/blosxom/rss/rss2js.html
    • 2004 Sep.
    • bloglines2ipod
    • http://bulknews.net/lib/utils/bloglines2ipod/
    • 2004 Oct.
    • rss2audiobook
    • http://bulknews.net/lib/utils/rss2audiobook/
    • 2005 Aug.
    • bloglines2gmail
    • http://svn.bulknews.net/repos/public/bloglines2email/trunk/
    • Looks like
    • It's not only me
    • doing these things.
    • rss2opml
    • http://aruntx.com/software/rss2opml/
    • rss2pdf
    • http://rss2pdf.com/
    • rss2atom
    • brian.wanamaker.com/mybicycle/2004/02/rss2atom.html
    • atom2rss
    • http://www.2rss.com/software.php?page=atom2rss
    • rss2ical
    • http://bura-bura.com/blog/archives/2004/06/22/rss2ical/
    • Bloglines2opml
    • http://mycvs.org/wp/wp-content/wp-transform.php
    • rss2gmail
    • http://www.cs.utexas.edu/~karu/gmailrss/
    • rss2imap
    • http://rss2imap.sourceforge.jp/
    • ebay2rss
    • http://www.2rss.com/software.php?page=ebay2rss
    • svn2rss
    • http://twiki.org/cgi-bin/view/Codev/Svn2rss
    • <anything>2<anything>
    • Being sick of
    • writing the same code
    • again and again
    • Why not create
    • A pluggable platform
    • Instead?
    • With reusable
    • Parsers / Emitters
    • / Filters?
    • That's what
    • Plagger is.
  • Plugin map (diagram): Subscription (Bloglines, Config, OPML, XOXO, File, DBI, FOAF, …); CustomFeed (Mixi, Yahoo360JP, POP3, iCal, iTunes, Amazon, YouTube, …); Filter (StripRSSAd, TruePermalink, EntryFullText, Pipe, Thumbnail, FindEnclosures, FetchEnclosure, SpamAssassin, RSSLiberalDateTime, URLBL, ResolveRelativeLink, …); Publish (Gmail, Delicious, PDF, MT, Feed, Planet, Speech, …); Notify (IRC, Eject, Growl, MSAgent, SSTP, …)
    • Just like
    • Lego™ Block
    • Create an app
    • With combo of
    • Plugins!
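The "combo of plugins" idea can be sketched in a few lines of Python. This is not Plagger's code (Plagger is written in Perl); the `Entry` class and every function name below are hypothetical, chosen only to show how a subscription step, a filter step and a publish step can be chained as interchangeable blocks.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Entry:
    """Hypothetical stand-in for one aggregated feed entry."""
    title: str
    link: str


# In this sketch a plugin is any callable that takes the entries
# collected so far and returns the entries to hand to the next plugin.
Plugin = Callable[[List[Entry]], List[Entry]]


def subscription_static(entries: List[Entry]) -> List[Entry]:
    # Stands in for the subscription and aggregation steps:
    # it simply adds one hard-coded entry.
    return entries + [Entry("Hello <b>YAPC</b>", "http://example.com/1")]


def filter_strip_tags(entries: List[Entry]) -> List[Entry]:
    # Removes anything that looks like an HTML tag from each title.
    return [Entry(re.sub(r"<[^>]+>", "", e.title), e.link) for e in entries]


def make_publish_collect(sink: List[str]) -> Plugin:
    # Builds a publish step that appends formatted lines to `sink`.
    def publish(entries: List[Entry]) -> List[Entry]:
        sink.extend(f"{e.title} <{e.link}>" for e in entries)
        return entries
    return publish


def run_pipeline(plugins: List[Plugin]) -> List[Entry]:
    # Runs the plugins in the order given, threading the entries through.
    entries: List[Entry] = []
    for plugin in plugins:
        entries = plugin(entries)
    return entries
```

Swapping `make_publish_collect` for a function that sends mail or posts to IRC would change the application without touching the other two steps, which is the point of the Lego comparison.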
    • Example App #1
    • Bloglines to Gmail
  • bloglines2gmail.yaml
      plugins:
        - module: Subscription::Bloglines
          config:
            username: you@example.com
            password: foobar
            mark_read: 1
        - module: Publish::Gmail
          config:
            mailto: [email_address]
            mailfrom: miyagawa@example.com
            mailroute:
              via: smtp
              host: smtp.example.com
    • Run it on crontab
    • % ./plagger -c bloglines2gmail.yaml
  • RSS in Gmail
  • HTML + Images
  • Feed Image (Logo / Buddy Icon)
  • Search
  • Auto grouping (“Conversations”)
  • Diff
  • Tips: Filter
  • Tips: is:unread
    • Example App #2
    • RSS to ircbot
  • RSS bot in action
    • #plagger on freenode
  • Config for RSS bot (1/3)
      plugins:
        - module: Subscription::Config
          config:
            feed:
              # Trac's feed for changesets
              - http://plagger.org/…/rss
  • Config for RSS bot (2/3)
        # I don't like to be notified of same items
        # more than once!
        - module: Filter::Rule
          rule:
            module: Fresh
            mtime:
              path: /tmp/rssbot.time
              autoupdate: 1
  • Config for RSS bot (3/3)
        - module: Notify::IRC
          config:
            daemon_port: 9999
            nickname: plaggerbot
            server_host: chat.freenode.net
            server_channels:
              - '#plagger-ja'
              - '#plagger'
    • See more in
    • examples/irc.yaml
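The Fresh rule in the RSS bot config is given an `mtime` path with `autoupdate: 1`. The slides do not show how it works internally; one plausible reading is "keep only entries newer than the state file's modification time, then touch the file". A minimal Python sketch of that reading, with a hypothetical `fresh_entries` helper and entries modelled as `(timestamp, title)` pairs:

```python
import os
import time
from typing import List, Optional, Tuple

TimedEntry = Tuple[float, str]  # (Unix timestamp, title); an assumption of this sketch


def fresh_entries(
    entries: List[TimedEntry],
    state_path: str,
    autoupdate: bool = True,
    now: Optional[float] = None,
) -> List[TimedEntry]:
    """Return the entries that are newer than the state file's mtime."""
    try:
        last_run = os.path.getmtime(state_path)
    except FileNotFoundError:
        last_run = 0.0  # first run: every entry counts as fresh

    fresh = [(ts, title) for ts, title in entries if ts > last_run]

    if autoupdate:
        # Create the state file if needed, then stamp it with the run time
        # so the next run only sees entries that arrived after this one.
        with open(state_path, "a"):
            pass
        stamp = time.time() if now is None else now
        os.utime(state_path, (stamp, stamp))
    return fresh
```

The `now` parameter exists only to make the behaviour reproducible in tests; a cron job would leave it unset.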
    • Example App #3
    • Planet
    • http://planet.yapcchicago.org/
  • planet-yapcna.yaml (1/4)
      plugins:
        - module: Subscription::Config
          config:
            feed:
              - http://yapcchicago.org/feed/
              - http://use.perl.org/search.pl?query=YAPC…
              - http://del.icio.us/rss/tag/yapcna2006
              - http://feeds.technorati.com/feed/posts/…
              - http://bloglines.com/search?q=YAPC+NA&…
              # etc, etc …
  • planet-yapcna.yaml (2/4)
        # Normalize feed title and permalinks
        - module: Filter::FeedBurnerPermalink
        - module: Filter::TruePermalink
        - module: Filter::StripTagsFromTitle
  • planet-yapcna.yaml (3/4)
        # Create a smartfeed for all the entries merged
        - module: SmartFeed::All
          rule_op: AND
          rule:
            - module: Fresh
              duration: 10080 # seven days
            - module: URLBL
              dnsbl: rbl.bulkfeeds.jp
          config:
            title: Planet YAPC::NA
  • planet-yapcna.yaml (4/4)
        # Generate nice XHTML out of the SmartFeed
        - module: Publish::Planet
          rule:
            expression: $args->{feed}->id eq 'smartfeed:all'
          config:
            dir: /path/to/htdocs
            skin: sixapart-std
            template:
              members_list: 1
              style_url: http://example.com/style.css
    • (I admit this Planet config is so clumsy
    • and I'll work on that to make it suck less.)
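The SmartFeed step in the Planet config merges entries from every feed and keeps those inside a freshness window of 10080 minutes (seven days). A rough Python sketch of that merge follows. The dictionary layout and the `merge_fresh` name are assumptions made for illustration, and the URLBL spam check from the config is left out.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

FeedEntry = Dict[str, object]  # expects "title" (str) and "date" (aware datetime)


def merge_fresh(
    feeds: List[List[FeedEntry]],
    duration_minutes: int,
    now: datetime,
) -> List[FeedEntry]:
    """Merge all feeds, drop entries older than the window, newest first."""
    cutoff = now - timedelta(minutes=duration_minutes)
    merged = [
        entry
        for feed in feeds
        for entry in feed
        if entry["date"] >= cutoff
    ]
    return sorted(merged, key=lambda e: e["date"], reverse=True)


def utc(year: int, month: int, day: int) -> datetime:
    # Small helper so the sample data below stays readable.
    return datetime(year, month, day, tzinfo=timezone.utc)


SAMPLE_FEEDS = [
    [{"title": "b", "date": utc(2006, 6, 25)}, {"title": "old", "date": utc(2006, 6, 10)}],
    [{"title": "a", "date": utc(2006, 6, 26)}],
]
```

With `now` set to 27 June 2006 and a duration of 10080, the entry dated 10 June falls outside the window and the other two come back newest first.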
    • Example App #4
    • YouTube downloader
  • youtube.yaml
      plugins:
        - module: Subscription::Config
          config:
            feed:
              - http://www.youtube.com/rss/tag/yapc.rss
        # discover real .flv URLs on YouTube.com
        - module: Filter::FindEnclosures
        # fetch them to local directory
        - module: Filter::FetchEnclosure
          config:
            dir: path/to/save
    • Coming soon …
    • Filter::ffmpeg, Sync::PSP, Sync::iPodVideo
    • Plagger
    • Core features
    • RSS/Atom
    • Auto-Discovery
    • Support various
    • Feed formats
    • RSS 0.91 to Atom 1.0
    • Support parsing
    • Broken XML feeds
    • (XML::Liberal)
    • Podcast / Videocast
    • Support
    • (RSS 2.0 & Atom 1.0)
    • Photocast
    • Media RSS
    • iTunes RSS*
    • Sane I18N impl.
    • Unicode & Timezone
    • Access to
    • browser's Cookies
    • IE, Safari, Firefox and w3m
    • Thanks to brian d foy
    • Quick tour
    • On available plugins
  • Plugin phases (types)
    • Subscription
    • Aggregator
    • CustomFeed
    • Filter
    • Publish
    • Notify
    • Search
    • Subscription
    • load subscriptions
    • (list the feeds/URLs to aggregate)
    • Subscription::Config
    - module: Subscription::Config
      config:
        feed:
          - http://www.yapcchicago.org/feed/
          - http://tokyo.yapcasia.org/blog/
    • Subscription::OPML
    - module: Subscription::OPML
      config:
        url: http://www.example.com/subs.opml

    # subs.opml
    <opml>
      <outline xmlUrl="http://www.yapcchicago.org/feed/" />
      <outline htmlUrl="http://tokyo.yapcasia.org/blog/" />
    </opml>
    • Subscription::File
    - module: Subscription::File
      config:
        url: file:///path/to/subscription.txt

    % cat subscription.txt
    http://www.yapcchicago.org/feed/
    http://tokyo.yapcasia.org/blog/
    %
    • Subscription::XOXO
    - module: Subscription::XOXO
      config:
        url: http://www.example.com/subscription.html

    # subscription.html
    <ul class="xoxo">
      <li><a href="http://www.yapcchicago.org/feed/">YAPC::NA</a></li>
      <li><a href="http://tokyo.yapcasia.org/blog/">YAPC::Asia</a></li>
    </ul>
    • Subscription::Bookmarks
    • Read bookmarks file of IE, Firefox and Safari
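To make the Subscription::OPML example concrete, here is a small Python sketch that pulls subscription URLs out of an OPML document like the one in the slide. It is an illustration, not Plagger's parser. The `feeds_from_opml` name is hypothetical, and treating `htmlUrl` entries as pages that would still need feed auto-discovery is an assumption.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

SAMPLE_OPML = """<opml>
  <outline xmlUrl="http://www.yapcchicago.org/feed/" />
  <outline htmlUrl="http://tokyo.yapcasia.org/blog/" />
</opml>"""


def feeds_from_opml(opml_text: str) -> Tuple[List[str], List[str]]:
    """Return (feed URLs, page URLs) found in the outline elements."""
    root = ET.fromstring(opml_text)
    feed_urls: List[str] = []
    page_urls: List[str] = []
    # iter() walks nested outlines too, so folders in the OPML are covered.
    for outline in root.iter("outline"):
        if outline.get("xmlUrl"):
            feed_urls.append(outline.get("xmlUrl"))
        elif outline.get("htmlUrl"):
            # Only a page address is known; a feed would have to be
            # discovered from the page itself.
            page_urls.append(outline.get("htmlUrl"))
    return feed_urls, page_urls
```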
    • Aggregator
    • Aggregate and parse the feeds
    • listed in subscription(s)
    • Aggregator::Simple
    • The "dumb" aggregator
    • Using LWP and XML::Feed
    • Might be okay for < 20 feeds
    • Aggregator::Xango
    • The "fast" aggregator using Xango.pm
    • the POE based scalable web crawler
    • For > 100 feeds
    • CustomFeed
    • Feed formats other than RSS/Atom
    • Scrapers
    • CustomFeed::POP3
    • Each email is a feed.
    • Attachments are enclosures.
    • CustomFeed::MySpace*
    • Your friends' journals as a feed
    • (* indicates it's not developed yet)
    • CustomFeed::FlickrSearch
    • Search results as feed
    • Each photo found is an entry
    • (with enclosures).
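The CustomFeed::POP3 idea (mail becomes feed content, attachments become enclosures) can be illustrated with Python's standard `email` package. This sketch skips the POP3 download and only shows the mapping from one parsed message to a dictionary. The dictionary keys and both function names are hypothetical, not Plagger's data model.

```python
from email.message import EmailMessage
from typing import Dict


def email_to_entry(msg: EmailMessage) -> Dict[str, object]:
    """Map one message to a feed-like record with enclosures."""
    body = msg.get_body(preferencelist=("plain",))
    return {
        "title": str(msg["Subject"]),
        "body": body.get_content().strip() if body is not None else "",
        # Every attachment is reported as an enclosure.
        "enclosures": [
            {"filename": part.get_filename(), "type": part.get_content_type()}
            for part in msg.iter_attachments()
        ],
    }


def sample_message() -> EmailMessage:
    # Builds a message in memory so the sketch runs without a mail server.
    msg = EmailMessage()
    msg["Subject"] = "Episode 1"
    msg.set_content("Show notes")
    msg.add_attachment(
        b"\x00\x01", maintype="audio", subtype="mpeg", filename="ep1.mp3"
    )
    return msg
```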
    • Filter
    • Normalize / Repair feed metadata
    • Upgrade feed content
    • Filter feed content using text filters
    • Invoke some action on entries
    • Filter::StripRSSAd
    Supports: Google AdSense, FeedBurner, Pheedo
    • Filter::EntryFullText
    • Upgrade content-less feed to fulltext feed
    • by fetching individual HTML
    • and extracting the content body
    • Filter::TruePermalink
    • Resolves nasty redirection URL
    • to the "true" permalink
    • (e.g. http://…/go.php?url=….)
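For redirect links of the `go.php?url=…` shape, the simplest case is that the real address sits in a query parameter. A minimal Python sketch of that case follows. The `true_permalink` name and the `redirect.example.com` host used in the test are made up, and Plagger's own filter may use different or additional techniques.

```python
from urllib.parse import parse_qs, urlsplit


def true_permalink(link: str, param: str = "url") -> str:
    """Return the target carried in `param`, or the link unchanged."""
    query = parse_qs(urlsplit(link).query)  # parse_qs also percent-decodes
    targets = query.get(param)
    # Only accept values that look like absolute web URLs.
    if targets and targets[0].startswith(("http://", "https://")):
        return targets[0]
    return link
```

Links without the parameter pass through untouched, so the function is safe to apply to every entry.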
    • Filter::FindEnclosures
    • Find enclosures from content body
    • <a href="http://…./foo.mp3">episode #1</a>
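Finding enclosures in a content body amounts to scanning the HTML for links that point at media files. A short Python sketch using the standard `html.parser` module is below. The extension list and the names `EnclosureFinder` and `find_enclosures` are assumptions for illustration; the slides do not say how the real filter decides what counts as an enclosure.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urlsplit

# Assumed set of extensions; the real filter may use MIME types instead.
MEDIA_EXTENSIONS = (".mp3", ".m4a", ".mp4", ".flv", ".mov")


class EnclosureFinder(HTMLParser):
    """Collects href values of <a> tags that look like media files."""

    def __init__(self) -> None:
        super().__init__()
        self.enclosures: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # Compare on the path only, so query strings do not hide the extension.
        if urlsplit(href).path.lower().endswith(MEDIA_EXTENSIONS):
            self.enclosures.append(href)


def find_enclosures(body: str) -> List[str]:
    finder = EnclosureFinder()
    finder.feed(body)
    finder.close()
    return finder.enclosures
```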
    • Filter::RSSLiberalDateTime
    • Deal with broken rss datetime format
    • <pubDate>2006/06/27 01:45:22 +0900</pubDate>
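RSS 2.0 expects RFC 822 style dates, so a `pubDate` such as the one above fails a strict parser. One way to be liberal is to try a list of known formats in turn. The Python sketch below does that. The format list and the `parse_liberal` name are assumptions, and the third format is included only as an example of another commonly seen variant.

```python
from datetime import datetime
from typing import Optional

FORMATS = (
    "%a, %d %b %Y %H:%M:%S %z",  # RFC 822 style, the well-formed case
    "%Y/%m/%d %H:%M:%S %z",      # the broken form shown in the slide
    "%Y-%m-%d %H:%M:%S %z",      # assumed extra variant
)


def parse_liberal(text: str) -> Optional[datetime]:
    """Try each known format; return None if nothing matches."""
    cleaned = text.strip()
    for fmt in FORMATS:
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            continue  # not this format, try the next one
    return None
```

Returning `None` instead of raising lets a caller decide whether to drop the entry or fall back to the fetch time.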
    • Publish
    • Publish aggregated entry to online services
    • Convert feeds to other formats
    • Publish::Feed
    • Republish feed in RSS/Atom
    • Good to use with scrapers
    • Publish::Delicious
    • Auto-post entries to your del.icio.us
    • using its REST API
    • Publish::iCal*
    • Publish iCal feeds out of RSS/Atom
    • Publish::MTWidget
    • Publish::Excel
    If your boss is unhappy about you reading blogs in a browser.
    • Search
    • Index aggregated entries on search engines
    • Search::Spotlight
    • Search::Estraier
    • Uses HyperEstraier XMLRPC node API
    • Notify
    • Notify feed updates in various ways
    • Notify::Campfire
    • Notify::Growl
    • Notify::MSAgent
  • Notify::Eject Supports: Windows, Linux, FreeBSD and Mac OSX!
    • So far,
    • Plagger rocks 
    • Actually,
    • Plagger sucks 
    • No good
    • Documentation
    • (Not a big deal if you can read Perl code)
    • Horrible lots of
    • CPAN deps.
    % grep requires Makefile.PL | wc -l
    25
    % grep recommends Makefile.PL | wc -l
    70
    • cpan Plagger
    • Doesn't work (partially)
    • 144 open tickets
    • On Trac
    • http://plagger.org/trac/query
    • (Not a bad sign. I use it as a Wishlist)
    • The way it
    • de-dupes entries
    • is clumsy
    • No database
    • Backend (yet)
    • Planned to be in core of 0.8
    • Plugin invocations
    • Can be rule-based
    • but undocumented
    • Different Plugin functionalities
    • On the same namespaces
    • (CustomFeed, Filter, Publish)
    • Publish::Gmail
    • was badly named
    • I want you
    • To fix & improve it.
    • Plagger
    • dev. Status
    • Version
    • 0.7.3
    • Coming Soon …
    • iTunes RSS support
    • Enclosure processors
    • ffmpeg, Sync::PSP, Sync::iPodVideo
    • Rich Media metadata
    • ID3 tag in enclosures
    • Links to imdb.com / amazon.com
    • hReview microformats
    • Database Storage
    • & Server API
    • branches/plagger-server
    • Calendar Support
    • iCal parser & emitter
    • hCalendar microformats
    • .ics attached in emails
    • Sync::SyncML
    • How's the dev
    • going on?
    • 31 authors
    • 128 plugins
    • (most of them are from Japan)
    • Buzz in Japan
    • I am Happy
    • With "the Buzz"
    • I am !Happy
    • With "In Japan"
    • Help spread
    • the word
    • Doc Contribution
    • Is highly welcome
    • http://plagger.org/
    • Planet, Mailing List, IRC
    • Bug Tracking, SVN repository
    • #plagger on freenode
    • Join Us!
    • Thank you
    • Questions?