Hacking RSS: Filtering & Processing Obscene Amounts of Information (short version)

The 15 minute version of the longer talk that I delivered at SXSW in March. More details: http://fastwonderblog.com/yahoo-pipes-and-rss-hacks/

Usage Rights

CC Attribution-NonCommercial License

Presentation Transcript

    • Hacking RSS: Filtering & Processing Obscene Amounts of Information
      #hackingRSS
      Dawn Foster
      Intel Community Manager for MeeGo
      dawn@fastwonder.com
    • Information Overload
      CD Photo: http://www.flickr.com/photos/chefranden/2751354004/
    • Who Cares?
      ● Most of it is …
        – complete crap
        – out of date / obsolete
        – not interesting to you
        – irrelevant for you
      Junk Pile: http://www.flickr.com/photos/zen/4013525/
    • You Want to Find the Needle
      Haystacks: http://www.flickr.com/photos/rasekh/4911673659/
    • RSS Alone is a Start
      ● Sources you care about delivered right to you. But …
        – Do you care about everything in each feed?
        – What about the feeds you aren't subscribed to?
        – Can you keep up with what you have?
    • Prioritize Your Reader
      ● Put things you care about at the top
      ● Categorize
      ● Don't try to read everything
    • The Real Magic is in Filtering
      Diagram labels: RSS, Complete Crap, Interesting, Maybe Relevant, Yay!
      ● In my Google Reader right now:
        – Analyst research blogs mentioning Online Community
        – Analyst research blogs mentioning MeeGo
        – Searches across social sites mentioning me, my projects, my websites etc., filtering out things I don't care about
        – My favorite blogs filtered using PostRank to find only the ones with a lot of comments or social mentions
    • RSS Filtering Tools
      ● Yahoo Pipes (my favorite)
        – More powerful & flexible: options to filter any data found in any field in the RSS feed (URL, title, description, author …)
        – Downside: takes some time to learn & can be a little flaky at times. Also a single point of failure if Yahoo ever killed it.
      ● Other Options
        – FeedRinse: easy to use, not as flexible. Import RSS feeds, add filters, get new RSS feeds out.
        – RSS readers with filtering / alerts (FeedDemon)
        – Code: write your own filters
        – Note: many free RSS filtering services have gone out of business (they can be bandwidth intensive & costly to host).
    • Yahoo Pipes Filtering Example
      ● Input:
        – WebWorkerDaily
        – ReadWriteWeb
      ● Filter by content:
        – Collaborate
        – Collaboration
        – Collaborative
      ● Output:
        – 1 RSS Feed
        – Matching 3 keywords
      2 Minute Yahoo Pipe Video How-tos: http://fastwonderblog.com/yahoo-pipes-and-rss-hacks/
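The same filter can be sketched outside Pipes, in the spirit of the "Code: write your own filters" option. This is a minimal sketch, not the pipe from the talk: the two inline feeds, their example.com links, and the function name `filter_items` are made up for illustration, and a real script would fetch the feed XML over HTTP first.

```python
import re
import xml.etree.ElementTree as ET

# Tiny stand-in feeds (hypothetical content); real feeds would be downloaded.
FEED_A = """<rss version="2.0"><channel><title>Feed A</title>
<item><title>Five ways to collaborate remotely</title><link>http://example.com/a1</link><description>Tips for teams.</description></item>
<item><title>New phone review</title><link>http://example.com/a2</link><description>Battery life is fine.</description></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Feed B</title>
<item><title>Startup funding news</title><link>http://example.com/b1</link><description>A collaborative editing tool raised money.</description></item>
</channel></rss>"""

KEYWORDS = ["collaborate", "collaboration", "collaborative"]


def filter_items(feed_xml_list, keywords):
    """Merge several feeds and keep items whose title or description mentions a keyword."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    matches = []
    for feed_xml in feed_xml_list:
        root = ET.fromstring(feed_xml)
        for item in root.iter("item"):
            text = " ".join(
                item.findtext(tag, default="") for tag in ("title", "description")
            )
            if pattern.search(text):
                matches.append(
                    {
                        "title": item.findtext("title", default=""),
                        "link": item.findtext("link", default=""),
                    }
                )
    return matches


matched = filter_items([FEED_A, FEED_B], KEYWORDS)
for entry in matched:
    print(entry["title"], entry["link"])
```

Two inputs go in, one list comes out, and only items that mention one of the three keywords survive.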
    • PostRank
      ● Best Posts in a feed
      ● Ranked on engagement (links, sharing, comments)
      ● Can get output as RSS feed
      ● Feed includes postrank number as a field
    • What's In a Feed?
      PostRank (Yahoo Pipes View)
      ● Content in feeds varies wildly depending on site.
      ● Common: title, author, pubDate, link, content, description
      ● Site-specific: postrank, lat/long, image links, username, twitter source … (most RSS readers don't show these)
      ● API: usually has additional data & can output RSS
      ● If it's in the feed, you can use it!
    • Reformatting / Modifying RSS Feeds
      Don't be satisfied with default RSS feed formats!
      Twitter Search
      Twitter RSS Feed
      Modify & more quickly scan key data
    • Yahoo Pipes: Reformat Twitter Feed
      ● Input:
        – Twitter Search feed
      ● Loop String Build:
        – Author
        – : (spacing)
        – Title
      ● Loop Assign:
        – Store result back into title
      ● Output:
        – 1 RSS feed
        – Efficient format
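The string-build-and-assign loop has a rough code equivalent: build "author: title" for each item and write it back into the title. This is a sketch under assumptions: the sample feed is invented, it assumes a plain RSS `<author>` element (a real Twitter search feed may be structured differently), and `prefix_titles_with_author` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Invented sample feed; assumes each item carries a plain <author> element.
SAMPLE = """<rss version="2.0"><channel><title>Search results</title>
<item><title>Trying out Yahoo Pipes today</title><author>alice</author></item>
<item><title>RSS is not dead</title><author>bob</author></item>
</channel></rss>"""


def prefix_titles_with_author(feed_xml):
    """Rewrite every item title as 'author: title' and return the new feed XML."""
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        title = item.find("title")
        author = item.findtext("author", default="").strip()
        if title is not None and author:
            # String build (author + ": " + title), then assign back to title.
            title.text = f"{author}: {title.text or ''}"
    return ET.tostring(root, encoding="unicode")


reformatted = prefix_titles_with_author(SAMPLE)
new_titles = [item.findtext("title") for item in ET.fromstring(reformatted).iter("item")]
print(new_titles)
```

Items without an author are left untouched, so the feed stays valid even when the field is missing.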
    • BackTweets (BackType API)
      ● Data about links on Twitter
      ● Finds links regardless of shortening service
      ● No RSS Feeds
      ● But … You can use API + Pipes to build one!
    • BackType + Twitter API + Pipes Output
      ● Data from BackType + Twitter
      ● Built an RSS feed using Yahoo Pipes
      ● Included the information relevant for me
      ● Could have included or filtered on: name, listed count, location, profile image, user URL, ...
    • Admit it, we ALL do vanity searches
      ● You can enter your search queries in Google, Twitter, Flickr …
        – Add a new project & have to update all of them
        – Can be hard to filter out some results
        – May have duplicates from multiple searches
      ● Yahoo Pipes
        – Update keywords in a CSV file
        – Use CSV file as input into a bunch of searches (RSS or API inputs)
        – Filter out what you don't want
        – Get 1 filtered RSS feed as output
      2 minute video: http://fastwonderblog.com/2009/05/01/keyword-csv-files-and-searching-2-minute-yahoo-pipes-demo/
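The keyword-CSV idea carries over to a script: keep the keywords in one file, expand them into one search URL per keyword per search source, then drop duplicates and unwanted results. Everything concrete below is an assumption for illustration: the search URL templates point at made-up hosts, the CSV has a single `keyword` column, and the result items are hand-written stand-ins for fetched feed entries.

```python
import csv
import io
from urllib.parse import quote_plus

# Stand-in for a keywords.csv file with one "keyword" column (illustrative values).
KEYWORDS_CSV = "keyword\nDawn Foster\nMeeGo\nfastwonderblog\n"

# Hypothetical search-feed URL templates; real services each have their own format.
SEARCH_TEMPLATES = [
    "http://search.example.com/feed?q={query}",
    "http://social.example.org/search.rss?term={query}",
]


def load_keywords(csv_text):
    """Read the keyword column, skipping empty values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["keyword"].strip() for row in reader if row["keyword"].strip()]


def build_search_urls(keywords, templates):
    """One URL per keyword per search source."""
    return [t.format(query=quote_plus(k)) for k in keywords for t in templates]


def dedupe_and_exclude(items, excluded_words):
    """Drop repeated links and any item whose title contains an excluded word."""
    seen = set()
    kept = []
    for item in items:
        if item["link"] in seen:
            continue
        if any(w.lower() in item["title"].lower() for w in excluded_words):
            continue
        seen.add(item["link"])
        kept.append(item)
    return kept


urls = build_search_urls(load_keywords(KEYWORDS_CSV), SEARCH_TEMPLATES)

# Hand-written results standing in for items fetched from the URLs above.
results = [
    {"title": "MeeGo community update", "link": "http://example.com/1"},
    {"title": "MeeGo community update", "link": "http://example.com/1"},
    {"title": "MeeGo job listing", "link": "http://example.com/2"},
]
cleaned = dedupe_and_exclude(results, excluded_words=["job listing"])
print(urls[0], len(cleaned))
```

Adding a new project then means adding one line to the CSV rather than editing every saved search.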
    • How Should / Shouldn't You Use All of This?
      ● Do:
        – Use this for personal productivity
        – Play around, create prototypes and understand the possibilities
      ● Don't:
        – Don't violate licenses on content or republish w/o permission
        – Don't use in critical or production environments
      ● For production use or putting data on websites:
        – Re-write in a real programming language with cached results and error checking
      XKCD Comic: http://xkcd.com/327/
    • Learn More
      About Dawn:
      ● Intel Community Manager for MeeGo
      ● Author of Companies and Communities
      ● More Info: http://fastwonderblog.com
      ● Dawn@FastWonder.com
      ● @geekygirldawn on Twitter
      Additional Reading & audio from 1 hour version of this talk:
      ● http://fastwonderblog.com/yahoo-pipes-and-rss-hacks/
      Photo of Dawn: http://www.flickr.com/photos/ahockley/3036575066/
    • Backup
    • Outsource / Crowdsource New Sources
    • Yahoo Pipes: Reformat PostRank Feed
      ● Input:
        – 3 PostRank feeds
      ● Loop String Build:
        – PostRank
        – : (spacing)
        – Title
      ● Loop Assign:
        – Store result back into title
      ● Output:
        – 1 RSS feed
        – Efficient format
    • Yahoo Pipes PostRank Example
      ● Input PostRank Feeds:
        – Engadget
        – CrunchGear
        – Boy Genius
      ● Filter by content
        – Tablet
      ● Sort:
        – PostRank
      ● Output
        – 1 RSS feed
        – Best tablet posts
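Filter-then-sort is easy to mirror in code. The sketch below assumes the score arrives as a plain `<postrank>` element on each item; the actual element name and namespace in a PostRank feed may differ, and the sample items and the `best_posts` helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample; assumes the score is exposed as a plain <postrank> element.
SAMPLE = """<rss version="2.0"><channel><title>Gadget posts</title>
<item><title>Tablet A hands-on</title><postrank>6.2</postrank></item>
<item><title>New laptop announced</title><postrank>9.1</postrank></item>
<item><title>Best tablet deals this week</title><postrank>8.7</postrank></item>
</channel></rss>"""


def best_posts(feed_xml, keyword):
    """Keep items whose title mentions the keyword, highest score first."""
    root = ET.fromstring(feed_xml)
    hits = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        if keyword.lower() in title.lower():
            # Items without a score are treated as 0 so they sort last.
            score = float(item.findtext("postrank", default="0"))
            hits.append((score, title))
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return hits


ranked = best_posts(SAMPLE, "tablet")
for score, title in ranked:
    print(score, title)
```

The laptop post has the highest score but is dropped by the keyword filter before sorting, which is the same order of operations as the pipe.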
    • Using Web APIs 101
      ● Many API calls are basically URLs
      ● Constructing URLs
        – Use API documentation/examples to format the URL
        – http://api.twitter.com/1/statuses/show/ID.xml
          ● Version 1 of API, show status for ID in .format
      ● API keys
        – Tells API who you are (password)
      ● Rate limiting
        – Only get so much & you're cut off
        – Limited by IP or API key
        – Chill out for a while & come back
      XKCD Comic: http://xkcd.com/844/
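Both ideas on this slide, building the URL from a pattern and backing off when rate limited, fit in a few lines. This is a sketch with stated assumptions: the URL pattern is copied from the slide (that version 1 Twitter endpoint is long retired, so treat it purely as an example of the pattern), while `RateLimited`, `fetch_with_backoff`, and the fake fetch function are hypothetical names invented here so the example runs without a network.

```python
import time


def status_url(tweet_id, fmt="xml"):
    """Fill the ID and format into the URL pattern shown on the slide."""
    return f"http://api.twitter.com/1/statuses/show/{tweet_id}.{fmt}"


class RateLimited(Exception):
    """Raised by the (hypothetical) fetch function when the API says slow down."""


def fetch_with_backoff(fetch, url, retries=3, wait_seconds=1.0, sleep=time.sleep):
    """Call fetch(url); on RateLimited, wait longer each time, then try again."""
    for attempt in range(retries + 1):
        try:
            return fetch(url)
        except RateLimited:
            if attempt == retries:
                raise  # out of retries, let the caller decide what to do
            sleep(wait_seconds * (2 ** attempt))


# Fake fetch that is rate limited twice, then succeeds (no network involved).
calls = []


def fake_fetch(url):
    calls.append(url)
    if len(calls) < 3:
        raise RateLimited()
    return "<status/>"


waits = []  # record the waits instead of actually sleeping
result = fetch_with_backoff(fake_fetch, status_url(12345), sleep=waits.append)
print(result, waits)
```

Passing `sleep` in as a parameter is only there so the demo and tests do not actually pause; a real script would leave the default.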
    • Backtweets API + Twitter API + Yahoo Pipes
      ● What we want to do:
        – Start with a set of URLs (blog posts in a feed)
        – Find any tweet mentioning those URLs
        – Return the tweet and data about the person who posted it
      ● Mission: Build feed using only data from these 2 APIs
      ● BackType API provides Tweet ID (not humanly useful)
        – http://api.backtype.com/tweets/search/links.xml?q=URL&mode=batch&key=KEY
        – List of Twitter Status IDs for Tweets linking to URL
        – Note: I think this feature may be deprecated
      ● Twitter API uses Tweet ID to get everything else
        – http://api.twitter.com/1/statuses/show/ID.xml
        – Returns a single status with all relevant data for ID
    • BackTweets API: Get Tweet ID
      ● Take WebWorkerDaily Author Feed
      ● Use WWD URLs to build URLs for BackType API call
      ● Fetch data from BackType URLs to get Tweet ID
    • Twitter API: Get Data Based on Tweet ID
      ● Use BackType tweet ID to build URL for Twitter API
      ● Fetch data about Tweet & User from Twitter API
      ● Re-Build title to show “user (followers): tweet”
    • Add Filters to BackType + Twitter Example
      ● Show only tweets from people with 1000+ followers
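The whole BackType + Twitter chain (post URL, then tweet IDs, then tweet data, then rebuilt title, then follower filter) can be sketched as one function. Nothing here calls the real APIs: the two lookup tables are fake stand-ins for the BackType and Twitter responses, the field names `user`, `followers`, and `text` are assumptions, and `build_entries` is a hypothetical helper.

```python
# Fake lookup tables standing in for the two API calls (all data invented).
FAKE_LINK_INDEX = {
    "http://example.com/post-1": ["101", "102"],
    "http://example.com/post-2": ["103"],
}
FAKE_TWEETS = {
    "101": {"user": "alice", "followers": 2500, "text": "Great post"},
    "102": {"user": "bob", "followers": 40, "text": "Reading this now"},
    "103": {"user": "carol", "followers": 1000, "text": "Worth a look"},
}


def build_entries(post_urls, find_tweet_ids, get_tweet, min_followers=1000):
    """Chain the lookups and keep tweets from accounts with enough followers."""
    entries = []
    for url in post_urls:
        # Step 1: link search gives tweet IDs for this post URL.
        for tweet_id in find_tweet_ids(url):
            # Step 2: status lookup gives the tweet and its author data.
            tweet = get_tweet(tweet_id)
            # Step 3: filter, "1000+" is read here as 1000 or more.
            if tweet["followers"] < min_followers:
                continue
            # Step 4: rebuild the title as "user (followers): tweet".
            title = f'{tweet["user"]} ({tweet["followers"]}): {tweet["text"]}'
            entries.append({"title": title, "link": url})
    return entries


entries = build_entries(
    list(FAKE_LINK_INDEX),
    find_tweet_ids=lambda url: FAKE_LINK_INDEX.get(url, []),
    get_tweet=FAKE_TWEETS.__getitem__,
)
for entry in entries:
    print(entry["title"])
```

Because the two lookups are passed in as functions, swapping the fakes for real, cached, error-checked API calls would not change the chaining logic.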