SXSW Hacking RSS: Filtering & Processing Obscene Amounts of Information

Dawn Foster, Director of Open Source Community Strategy
Hacking RSS:
        Filtering & Processing
    Obscene Amounts of Information
              #hackingRSS

       Dawn Foster
Intel Community Manager
        for MeeGo
 dawn@fastwonder.com
Information Overload




                       CD Photo: http://www.flickr.com/photos/chefranden/2751354004/
Who Cares?


●   Most of it is …
    –   complete crap
    –   out of date / obsolete
    –   not interesting to you
    –   irrelevant for you




                                 Junk Pile: http://www.flickr.com/photos/zen/4013525/
You Want to Find the Needle




                      Haystacks: http://www.flickr.com/photos/rasekh/4911673659/
RSS Alone is a Start
●   Sources you care about delivered right to you. But …
    –   Do you care about everything in each feed?
    –   What about the feeds you aren't subscribed to?
    –   Can you keep up with what you have?
Prioritize Your Reader



●   Put things you care about at the top
●   Categorize
●   Don't try to read everything
Outsource / Crowdsource New Sources
The Real Magic is in Filtering RSS
                       Complete Crap
                         Interesting
                        Maybe Relevant
                               Yay!
●   In my Google Reader right now:
    –   Analyst research blogs mentioning Online Community
    –   Analyst research blogs mentioning MeeGo
    –   Searches across social sites mentioning me, my projects, my
        websites etc. - filtering out things I don't care about
    –   My favorite blogs filtered using PostRank to find only the
        ones with a lot of comments or social mentions
RSS Filtering Tools
●   Yahoo Pipes (my favorite)
    –   More powerful & flexible: options to filter any data found in
        any field in the RSS feed (URL, title, description, author …)
    –   Downside: takes some time to learn & can be a little flaky at
        times. Also a single point of failure if Yahoo ever killed it.



●   Other Options
    –   FeedRinse: easy to use, not as flexible. Import RSS feeds,
        add filters, get new RSS feeds out.
    –   RSS readers with filtering / alerts (FeedDemon)
    –   Code: write your own filters
    –   Note: many free RSS filtering services have gone out of
        business – can be bandwidth intensive & costly to host.
Yahoo Pipes Filtering Example
●   Input:
    –   WebWorkerDaily
    –   ReadWriteWeb
●   Filter by content:
    –   Collaborate
    –   Collaboration
    –   Collaborative
●   Output:
    –   1 RSS Feed
    –   Matching 3 keywords




          2 Minute Yahoo Pipe Video How-to's: http://fastwonderblog.com/yahoo-pipes-and-rss-hacks/
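The same input / filter / output flow can be sketched outside of Pipes. This is a minimal sketch, not how Yahoo Pipes works internally: it assumes each feed has already been downloaded as RSS 2.0 text, and the function names (parse_items, matches, filter_feeds) are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The three keywords from the slide, lower-cased for case-insensitive matching.
KEYWORDS = ("collaborate", "collaboration", "collaborative")


def parse_items(rss_xml):
    """Return a list of {title, link, description} dicts from RSS 2.0 text."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


def matches(item, keywords=KEYWORDS):
    """True if the title or description mentions any keyword."""
    text = (item["title"] + " " + item["description"]).lower()
    return any(keyword in text for keyword in keywords)


def filter_feeds(feeds_xml, keywords=KEYWORDS):
    """Merge several feeds and keep only the items that match a keyword."""
    merged = []
    for xml_text in feeds_xml:
        merged.extend(parse_items(xml_text))
    return [item for item in merged if matches(item, keywords)]
```

Passing the text of the two input feeds as a list to filter_feeds gives one combined list of matching items, which could then be written back out as a single feed.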
PostRank
●   Best Posts in a
    feed
●   Ranked on
    engagement (links,
    sharing, comments)
●   Can get output as
    RSS feed
●   Feed includes
    postrank number as
    a field
What's In a Feed? PostRank (Yahoo Pipes View)




●   Content in feeds varies wildly depending on site.
●   Common: title, author, pubDate, link, content, description
●   Site-specific: postrank, lat/long, image links, username,
    twitter source … (most RSS readers don't show these)
●   API: usually has additional data & can output RSS
●   If it's in the feed, you can use it!
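To see what a feed really contains, you can list every field of every item, including the site-specific ones a reader hides. A minimal sketch, assuming the feed is RSS 2.0 text already in memory; item_fields is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def item_fields(rss_xml):
    """Return, for each <item>, a dict of every child tag and its text."""
    root = ET.fromstring(rss_xml)
    result = []
    for item in root.iter("item"):
        fields = {}
        for child in item:
            # Namespaced tags are reported as "{uri}name"; keep the local name.
            name = child.tag.rsplit("}", 1)[-1]
            fields[name] = (child.text or "").strip()
        result.append(fields)
    return result
```

Any key that shows up in these dicts is something you can filter or sort on, whether or not your RSS reader displays it.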
Yahoo Pipes PostRank Example
●   Input PostRank
    Feeds:
    –   Engadget
    –   CrunchGear
    –   Boy Genius
●   Filter by content
    –   Tablet
●   Sort:
    –   PostRank
●   Output
    –   1 RSS feed
    –   Best tablet posts
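The filter-then-sort step can be sketched in a few lines. This assumes the feed items have already been parsed into dicts, and that the score sits in a key named "postrank" (the key name is an assumption for illustration).

```python
def best_posts(items, keyword="tablet"):
    """Keep items whose title mentions the keyword, highest PostRank first."""
    kept = [item for item in items if keyword in item["title"].lower()]
    # Feed fields arrive as text, so convert the score before comparing.
    return sorted(kept, key=lambda item: float(item["postrank"]), reverse=True)
```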
Reformatting / Modifying RSS Feeds
   Don't be satisfied with default RSS feed formats!

 Twitter
 Search




 Twitter
 RSS
 Feed

           Modify & more quickly scan key data
Yahoo Pipes: Reformat Twitter Feed
●   Input:
    –   Twitter Search
        feed
●   Loop String Build:
    –   Author
    –   : (spacing)
    –   Title
●   Loop Assign:
    –   Store result back
        into title
●   Output:
    –   1 RSS feed
    –   Efficient format
Yahoo Pipes: Reformat PostRank Feed
●   Input:
    –   3 PostRank feeds
●   Loop String Build:
    –   PostRank
    –   : (spacing)
    –   Title
●   Loop Assign:
    –   Store result back
        into title
●   Output:
    –   1 RSS feed
    –   Efficient format
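Both reformatting pipes do the same thing: build a string from a field, a separator and the title, then store it back into the title. A minimal sketch, assuming items are dicts; prefix_titles is a hypothetical name, and the field would be "author" for the Twitter feed or "postrank" for the PostRank feeds.

```python
def prefix_titles(items, field):
    """Return copies of items whose title is '<field value>: <old title>'."""
    reformatted = []
    for item in items:
        new_item = dict(item)  # copy so the original items are left untouched
        new_item["title"] = f"{item[field]}: {item['title']}"
        reformatted.append(new_item)
    return reformatted
```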
Using Web APIs 101
●   Many API calls are basically URLs
●   Constructing URLs
    –   Use API documentation/examples to
        format the URL
    –   http://api.twitter.com/1/statuses/show
        /ID.xml
         ●   Version 1 of the API: show the status for ID
             in the given format (.xml)
●   API keys
    –   Tell the API who you are (like a password)
●   Rate limiting
    –   Only get so much & you're cut off
    –   Limited by IP or API key
    –   Chill out for a while & come back
                                                   XKCD Comic: http://xkcd.com/844/
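The two ideas on this slide, building a URL and backing off when rate limited, can be sketched with the standard library. The function names are hypothetical, and treating HTTP 429 as the rate-limit signal is an assumption: check the documentation of the API you are calling, since some use other status codes.

```python
import time
import urllib.error
import urllib.request
from urllib.parse import urlencode


def build_url(base, params):
    """Build an API call URL from a base address and query parameters."""
    return base + "?" + urlencode(params)


def fetch_with_backoff(url, retries=3, wait_seconds=60, rate_limit_codes=(429,),
                       opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch a URL; if the API says we are rate limited, wait and try again."""
    for attempt in range(retries):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Give up on other errors, or once the retries are used up.
            if err.code not in rate_limit_codes or attempt == retries - 1:
                raise
            sleep(wait_seconds)  # chill out for a while & come back
```

The opener and sleep parameters exist only so the function can be tried without real network calls or real waiting.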
BackTweets (BackType API)
●   Data about links on
    Twitter
●   Finds links regardless of
    shortening service
●   No RSS Feeds
●   But … You can use
    API + Pipes to build
    one!
Backtweets API + Twitter API + Yahoo Pipes
●   What we want to do:
    –   Start with a set of URLs (blog posts in a feed)
    –   Find any tweet mentioning those URLs
    –   Return the tweet and data about the person who posted it
●   Mission: Build feed using only data from these 2 APIs
●   BackType API provides Tweet ID (not humanly useful)
    –   http://api.backtype.com/tweets/search/links.xml?
        q=URL&mode=batch&key=KEY
    –   List of Twitter Status IDs for Tweets linking to URL
    –   Note: I think this feature may be deprecated
●   Twitter API uses Tweet ID to get everything else
    –   http://api.twitter.com/1/statuses/show/ID.xml
    –   Returns a single status with all relevant data for ID
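The two URL patterns above can be filled in with a couple of small helpers. This sketch only builds strings and makes no network calls, since these endpoints may no longer respond; the helper names are hypothetical, and URL-encoding the blog post address is my assumption about what the API expected.

```python
from urllib.parse import quote

# URL patterns copied from the slide; URL, KEY and ID are the placeholders.
BACKTYPE_TEMPLATE = ("http://api.backtype.com/tweets/search/links.xml"
                     "?q={url}&mode=batch&key={key}")
TWITTER_TEMPLATE = "http://api.twitter.com/1/statuses/show/{tweet_id}.xml"


def backtype_url(post_url, api_key):
    """URL asking BackType for tweets that link to post_url."""
    return BACKTYPE_TEMPLATE.format(url=quote(post_url, safe=""),
                                    key=quote(api_key, safe=""))


def twitter_status_url(tweet_id):
    """URL asking Twitter for the full status behind one tweet ID."""
    return TWITTER_TEMPLATE.format(tweet_id=int(tweet_id))
```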
BackTweets API: Get Tweet ID




●   Take WebWorkerDaily Author Feed
●   Use WWD URLs to build URLs for BackType API call
●   Fetch data from BackType URLs to get Tweet ID
Twitter API: Get Data Based on Tweet ID




●   Use BackType tweet ID to build URL for Twitter API
●   Fetch data about Tweet & User from Twitter API
●   Re-Build title to show “user (followers): tweet”
BackType + Twitter API + Pipes Output
●   Data from BackType + Twitter
●   Built an RSS feed using Yahoo Pipes
●   Included the information relevant for me
●   Could have included or filtered on: name, listed count,
    location, profile image, user URL, ...
Add Filters to BackType + Twitter Example
●   Show only tweets from people with 1000+ followers
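The follower filter and the “user (followers): tweet” title can be sketched together. This assumes the tweet and user data have already been fetched and flattened into dicts with "user", "followers" and "text" keys; those key names are made up for illustration.

```python
def format_tweets(tweets, min_followers=1000):
    """Keep tweets from accounts with enough followers and build the titles."""
    titles = []
    for tweet in tweets:
        if tweet["followers"] >= min_followers:
            titles.append(f'{tweet["user"]} ({tweet["followers"]}): {tweet["text"]}')
    return titles
```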
Admit it, we ALL do vanity searches
 ●   You can enter your search queries in Google, Twitter,
     Flickr …
       –   Add a new project & have to update all of them
       –   Can be hard to filter out some results
       –   May have duplicates from multiple searches
 ●   Yahoo Pipes
       –   Update keywords in a CSV file
       –   Use CSV file as input into a bunch of searches (RSS or
           API inputs)
       –   Filter out what you don't want
       –   Get 1 filtered RSS feed as output



2 minute video: http://fastwonderblog.com/2009/05/01/keyword-csv-files-and-searching-2-minute-yahoo-pipes-demo/
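The keyword-file approach can be sketched as: read keywords from CSV, run each one through every search, drop unwanted results and duplicates, and return one list. This is a sketch under assumptions: each search is a function you supply that takes a keyword and returns dicts with "title" and "link", and both function names here are hypothetical.

```python
import csv
import io


def load_keywords(csv_text):
    """Read every non-empty cell of a CSV file as one search keyword."""
    return [cell.strip()
            for row in csv.reader(io.StringIO(csv_text))
            for cell in row
            if cell.strip()]


def combined_results(keywords, searches, blocked_words=()):
    """Run each keyword through each search; drop blocked and duplicate items."""
    seen_links = set()
    merged = []
    for keyword in keywords:
        for search in searches:
            for item in search(keyword):
                if item["link"] in seen_links:
                    continue  # same result already found by another search
                title = item["title"].lower()
                if any(word.lower() in title for word in blocked_words):
                    continue  # filter out what you don't want
                seen_links.add(item["link"])
                merged.append(item)
    return merged
```

Adding a new project then means adding one line to the CSV file rather than editing every saved search.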
How Should / Shouldn't You Use All of This?
●   Do:
    –   Use this for personal productivity
    –   Play around and understand the possibilities
    –   Create prototypes for something you might want to build
●   Don't: Use in critical or production environments




●   Everything I've done here could be done in most
    programming languages
●   For production use or putting data on websites:
    –   Re-write in a real programming language with cached
        results and error checking
                      XKCD Comic: http://xkcd.com/327/
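As one illustration of "cached results and error checking", here is a minimal sketch of a caching wrapper around whatever function does the real fetching. The class name, the five-minute default and the choice to serve stale data on failure are all assumptions, not something from the talk.

```python
import time


class CachedFetcher:
    """Wrap a fetch function with a time-limited cache and a stale fallback."""

    def __init__(self, fetch, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # url -> (time fetched, value)

    def get(self, url):
        now = self._clock()
        cached = self._cache.get(url)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # still fresh: no request made
        try:
            value = self._fetch(url)
        except Exception:
            # If the service is down, serve the last good copy when we have one.
            if cached is not None:
                return cached[1]
            raise
        self._cache[url] = (now, value)
        return value
```

Caching like this also helps with rate limits, because repeated requests for the same URL inside the time window never reach the API.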
Q&A
About Dawn:
● Intel Community Manager for MeeGo

● More Info: http://fastwonderblog.com

● Dawn@FastWonder.com

● @geekygirldawn on Twitter






Additional Reading:
● http://fastwonderblog.com/yahoo-pipes-and-rss-hacks/


                             Photo of Dawn: http://www.flickr.com/photos/ahockley/3036575066/
03/15/11
Extending KVM Host HA for Non-NFS Storage - Alex Ivanov - StorPool by ShapeBlue
Extending KVM Host HA for Non-NFS Storage -  Alex Ivanov - StorPoolExtending KVM Host HA for Non-NFS Storage -  Alex Ivanov - StorPool
Extending KVM Host HA for Non-NFS Storage - Alex Ivanov - StorPool
ShapeBlue123 views
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlue by ShapeBlue
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlueCloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlue
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlue
ShapeBlue135 views
VNF Integration and Support in CloudStack - Wei Zhou - ShapeBlue by ShapeBlue
VNF Integration and Support in CloudStack - Wei Zhou - ShapeBlueVNF Integration and Support in CloudStack - Wei Zhou - ShapeBlue
VNF Integration and Support in CloudStack - Wei Zhou - ShapeBlue
ShapeBlue203 views
State of the Union - Rohit Yadav - Apache CloudStack by ShapeBlue
State of the Union - Rohit Yadav - Apache CloudStackState of the Union - Rohit Yadav - Apache CloudStack
State of the Union - Rohit Yadav - Apache CloudStack
ShapeBlue297 views
Enabling DPU Hardware Accelerators in XCP-ng Cloud Platform Environment - And... by ShapeBlue
Enabling DPU Hardware Accelerators in XCP-ng Cloud Platform Environment - And...Enabling DPU Hardware Accelerators in XCP-ng Cloud Platform Environment - And...
Enabling DPU Hardware Accelerators in XCP-ng Cloud Platform Environment - And...
ShapeBlue106 views
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha... by ShapeBlue
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
ShapeBlue180 views
Business Analyst Series 2023 - Week 4 Session 7 by DianaGray10
Business Analyst Series 2023 -  Week 4 Session 7Business Analyst Series 2023 -  Week 4 Session 7
Business Analyst Series 2023 - Week 4 Session 7
DianaGray10139 views
Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ... by ShapeBlue
Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ...Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ...
Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ...
ShapeBlue184 views

SXSW Hacking RSS: Filtering & Processing Obscene Amounts of Information

  • 1. Hacking RSS: Filtering & Processing Obscene Amounts of Information #hackingRSS Dawn Foster Intel Community Manager for MeeGo dawn@fastwonder.com
  • 2. Information Overload CD Photo: http://www.flickr.com/photos/chefranden/2751354004/
  • 3. Who Cares? ● Most of it is … – complete crap – out of date / obsolete – not interesting to you – irrelevant for you Junk Pile: http://www.flickr.com/photos/zen/4013525/
  • 4. You Want to Find the Needle Haystacks: http://www.flickr.com/photos/rasekh/4911673659/
  • 5. RSS Alone is a Start ● Sources you care about delivered right to you. But … – Do you care about everything in each feed? – What about the feeds you aren't subscribed to? – Can you keep up with what you have?
  • 6. Prioritize Your Reader ● Put things you care about at the top ● Categorize ● Don't try to read everything
  • 8. The Real Magic is in Filtering RSS Complete Crap Interesting Maybe Relevant Yay! ● In my Google Reader right now: – Analyst research blogs mentioning Online Community – Analyst research blogs mentioning MeeGo – Searches across social sites mentioning me, my projects, my websites etc. - filtering out things I don't care about – My favorite blogs filtered using PostRank to find only the ones with a lot of comments or social mentions
  • 9. RSS Filtering Tools ● Yahoo Pipes (my favorite) – More powerful & flexible: options to filter any data found in any field in the RSS feed (URL, title, description, author …) – Downside: takes some time to learn & can be a little flaky at times. Also a single point of failure if Yahoo ever kills it. ● Other Options – FeedRinse: easy to use, not as flexible. Import RSS feeds, add filters, get new RSS feeds out. – RSS readers with filtering / alerts (FeedDemon) – Code: write your own filters – Note: many free RSS filtering services have gone out of business; they can be bandwidth intensive & costly to host.
  • 10. Yahoo Pipes Filtering Example ● Input: – WebWorkerDaily – ReadWriteWeb ● Filter by content: – Collaborate – Collaboration – Collaborative ● Output: – 1 RSS Feed – Matching 3 keywords 2 Minute Yahoo Pipe Video How-to's: http://fastwonderblog.com/yahoo-pipes-and-rss-hacks/
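The same merge-and-filter idea can be sketched in plain Python for anyone taking the "write your own filters" route. This is a minimal sketch, not a reproduction of the Pipe: the two feeds are made-up stand-ins embedded as strings, and the function names `items_from_rss` and `filter_by_keywords` are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# The three keywords from the slide, lowercased for matching.
KEYWORDS = ("collaborate", "collaboration", "collaborative")


def items_from_rss(xml_text):
    """Return title/link/description dicts for every <item> in RSS 2.0 text."""
    root = ET.fromstring(xml_text)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        }
        for item in root.iter("item")
    ]


def filter_by_keywords(items, keywords=KEYWORDS):
    """Keep items whose title or description mentions any keyword."""
    kept = []
    for item in items:
        text = (item["title"] + " " + item["description"]).lower()
        if any(keyword in text for keyword in keywords):
            kept.append(item)
    return kept


# Made-up sample feeds standing in for the two input sources.
FEED_A = """<rss version="2.0"><channel><title>Feed A</title>
<item><title>Tools for remote collaboration</title><link>http://example.com/a1</link><description>A roundup.</description></item>
<item><title>New phone released</title><link>http://example.com/a2</link><description>Specs inside.</description></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Feed B</title>
<item><title>Startup funding news</title><link>http://example.com/b1</link><description>Teams that collaborate well ship faster.</description></item>
</channel></rss>"""

# Merge both inputs, then filter: one combined list out.
merged = items_from_rss(FEED_A) + items_from_rss(FEED_B)
matches = filter_by_keywords(merged)
```

Matching is a plain case-insensitive substring test, so "collaborat" alone would also cover all three keywords; the list is kept explicit to mirror the slide.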
  • 11. PostRank ● Best Posts in a feed ● Ranked on engagement (links, sharing, comments) ● Can get output as RSS feed ● Feed includes postrank number as a field
  • 12. What's In a Feed? PostRank (Yahoo Pipes View) ● Content in feeds varies wildly depending on site. ● Common: title, author, pubDate, link, content, description ● Site-specific: postrank, lat/long, image links, username, twitter source … (most RSS readers don't show these) ● API: usually has additional data & can output RSS ● If it's in the feed, you can use it!
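To see which fields a feed actually carries, including the site-specific ones most readers hide, one option is to dump every child element of each item. The sketch below uses a made-up feed; the `ex:score` element and its namespace are hypothetical placeholders, not the real PostRank field name.

```python
import xml.etree.ElementTree as ET

# Made-up feed with one hypothetical site-specific field (ex:score).
SAMPLE = """<rss version="2.0" xmlns:ex="http://example.com/ns">
<channel><title>Sample</title>
<item><title>Post</title><link>http://example.com/1</link><pubDate>Tue, 15 Mar 2011 10:00:00 GMT</pubDate><ex:score>7.5</ex:score></item>
</channel></rss>"""


def item_fields(xml_text):
    """Return one dict per <item>, mapping every child tag to its text."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "") for child in item}
        for item in root.iter("item")
    ]


fields = item_fields(SAMPLE)
for name, value in fields[0].items():
    # Namespaced tags show up as "{namespace-uri}localname".
    print(name, "=", value)
```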
  • 13. Yahoo Pipes PostRank Example ● Input PostRank Feeds: – Engadget – CrunchGear – Boy Genius ● Filter by content – Tablet ● Sort: – PostRank ● Output – 1 RSS feed – Best tablet posts
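Filter-then-sort is short in code as well. A minimal sketch, assuming the items are already parsed into dicts and that the score lives under a key called `postrank` (an assumed name; the sample items are invented):

```python
def best_posts(items, keyword, score_field="postrank"):
    """Keep items whose title mentions keyword, highest score first."""
    keyword = keyword.lower()
    matching = [item for item in items if keyword in item["title"].lower()]
    return sorted(matching, key=lambda item: float(item[score_field]), reverse=True)


# Invented sample items standing in for three merged PostRank feeds.
items = [
    {"title": "Tablet review roundup", "postrank": "6.2"},
    {"title": "New laptop announced", "postrank": "9.1"},
    {"title": "Why this TABLET matters", "postrank": "8.7"},
]

best = best_posts(items, "tablet")
```

The score is converted with `float` because feed fields arrive as text; sorting the raw strings would rank "10" below "9".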
  • 14. Reformatting / Modifying RSS Feeds Don't be satisfied with default RSS feed formats! Twitter Search Twitter RSS Feed Modify & more quickly scan key data
  • 15. Yahoo Pipes: Reformat Twitter Feed ● Input: – Twitter Search feed ● Loop String Build: – Author – : (spacing) – Title ● Loop Assign: – Store result back into title ● Output: – 1 RSS feed – Efficient format
  • 16. Yahoo Pipes: Reformat PostRank Feed ● Input: – 3 PostRank feeds ● Loop String Build: – PostRank – : (spacing) – Title ● Loop Assign: – Store result back into title ● Output: – 1 RSS feed – Efficient format
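The Loop String Build and Loop Assign steps on the two slides above amount to building a new string per item and storing it back into the title. A rough Python equivalent, with `prefix_title` as an illustrative name and invented sample items:

```python
def prefix_title(items, field, sep=": "):
    """Return copies of items with item[field] prepended to the title."""
    return [
        {**item, "title": f"{item[field]}{sep}{item['title']}"}
        for item in items
    ]


# Invented sample items: one tweet-like, one PostRank-like.
tweets = [{"author": "someuser", "title": "Trying out a new feed filter"}]
posts = [{"postrank": "8.7", "title": "Why this tablet matters"}]

tweet_titles = [t["title"] for t in prefix_title(tweets, "author")]
post_titles = [p["title"] for p in prefix_title(posts, "postrank")]
```

The originals are left untouched; each output item is a copy with only the title replaced.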
  • 17. Using Web APIs 101 ● Many API calls are basically URLs ● Constructing URLs – Use API documentation/examples to format the URL – http://api.twitter.com/1/statuses/show/ID.xml ● Version 1 of the API: show the status for ID in the given format ● API keys – Tell the API who you are (like a password) ● Rate limiting – You only get so much & then you're cut off – Limited by IP or API key – Chill out for a while & come back XKCD Comic: http://xkcd.com/844/
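Both ideas on this slide, building the URL from a pattern and waiting when rate limited, can be sketched as follows. The URL pattern is the one shown on the slide and may no longer be served. `RateLimited`, `fetch_with_backoff`, and `fake_fetch` are hypothetical names for illustration; a real client would raise its own error when the API refuses a request.

```python
import time
from urllib.parse import quote


def status_url(tweet_id, fmt="xml"):
    """Fill the slide's URL pattern with a status ID and a format."""
    return f"http://api.twitter.com/1/statuses/show/{quote(str(tweet_id))}.{fmt}"


class RateLimited(Exception):
    """Hypothetical signal that the API refused the call for now."""


def fetch_with_backoff(fetch, url, retries=3, wait_seconds=60, sleep=time.sleep):
    """Call fetch(url); on RateLimited, wait and retry up to `retries` times."""
    for attempt in range(retries):
        try:
            return fetch(url)
        except RateLimited:
            if attempt == retries - 1:
                raise  # out of attempts: let the caller decide
            sleep(wait_seconds)  # chill out for a while & come back


# Demo with a fake fetcher that is rate limited twice, then succeeds.
calls, naps = [], []


def fake_fetch(url):
    calls.append(url)
    if len(calls) < 3:
        raise RateLimited()
    return "<status/>"


result = fetch_with_backoff(
    fake_fetch, status_url(12345), wait_seconds=5, sleep=naps.append
)
```

Passing `fetch` and `sleep` in as arguments keeps the sketch runnable without network access or real waiting.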
  • 18. BackTweets (BackType API) ● Data about links on Twitter ● Finds links regardless of shortening service ● No RSS Feeds ● But … You can use API + Pipes to build one!
  • 19. Backtweets API + Twitter API + Yahoo Pipes ● What we want to do: – Start with a set of URLs (blog posts in a feed) – Find any tweet mentioning those URLs – Return the tweet and data about the person who posted it ● Mission: Build feed using only data from these 2 APIs ● BackType API provides Tweet ID (not humanly useful) – http://api.backtype.com/tweets/search/links.xml?q=URL&mode=batch&key=KEY – List of Twitter Status IDs for Tweets linking to URL – Note: I think this feature may be deprecated ● Twitter API uses Tweet ID to get everything else – http://api.twitter.com/1/statuses/show/ID.xml – Returns a single status with all relevant data for ID
  • 20. BackTweets API: Get Tweet ID ● Take WebWorkerDaily Author Feed ● Use WWD URLs to build URLs for BackType API call ● Fetch data from BackType URLs to get Tweet ID
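Building the BackType call from a post URL mostly comes down to URL-encoding the `q` parameter. The sketch below follows the URL pattern quoted on the earlier slide; the function name, the sample post URL, and the `KEY` placeholder are assumptions, and the response format is not shown because the slides do not describe it.

```python
from urllib.parse import urlencode


def backtype_links_url(post_url, api_key):
    """Build the links-search URL for one post, encoding the query safely."""
    query = urlencode({"q": post_url, "mode": "batch", "key": api_key})
    return "http://api.backtype.com/tweets/search/links.xml?" + query


# Invented post URLs standing in for the links in an author feed.
post_urls = ["http://example.com/post?id=1", "http://example.com/another-post"]
api_urls = [backtype_links_url(url, "KEY") for url in post_urls]
```

Encoding matters here because the post URL itself contains `:`, `/`, and possibly `?` or `&`, which would otherwise be read as part of the API URL.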
  • 21. Twitter API: Get Data Based on Tweet ID ● Use BackType tweet ID to build URL for Twitter API ● Fetch data about Tweet & User from Twitter API ● Re-Build title to show “user (followers): tweet”
  • 22. BackType + Twitter API + Pipes Output ● Data from BackType + Twitter ● Built an RSS feed using Yahoo Pipes ● Included the information relevant for me ● Could have included or filtered on: name, listed count, location, profile image, user URL, ...
  • 23. Add Filters to BackType + Twitter Example ● Show only tweets from people with 1000+ followers
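A follower-count filter plus the "user (followers): tweet" title format could look like this in Python. The dict keys `user`, `followers`, and `text` are assumed names for already-fetched tweet data, and the sample tweets are invented.

```python
def popular_tweets(tweets, min_followers=1000):
    """Keep only tweets whose author has at least min_followers followers."""
    return [t for t in tweets if int(t["followers"]) >= min_followers]


def as_title(tweet):
    """Format a tweet as 'user (followers): tweet'."""
    return f"{tweet['user']} ({tweet['followers']}): {tweet['text']}"


# Invented sample data; counts arrive as text, as they would from XML.
tweets = [
    {"user": "bigaccount", "followers": "15200", "text": "Great post on RSS"},
    {"user": "newuser", "followers": "42", "text": "Reading this now"},
    {"user": "edgecase", "followers": "1000", "text": "Right at the cutoff"},
]

titles = [as_title(t) for t in popular_tweets(tweets)]
```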
  • 24. Admit it, we ALL do vanity searches ● You can enter your search queries in Google, Twitter, Flickr … – Add a new project & have to update all of them – Can be hard to filter out some results – May have duplicates from multiple searches ● Yahoo Pipes – Update keywords in a CSV file – Use CSV file as input into a bunch of searches (RSS or API inputs) – Filter out what you don't want – Get 1 filtered RSS feed as output 2 minute video: http://fastwonderblog.com/2009/05/01/keyword-csv-files-and-searching-2-minute-yahoo-pipes-demo/
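The keyword-file approach can be sketched as three small steps: read keywords from CSV, expand them into search feed URLs, then drop unwanted results and duplicates. The search endpoints below are hypothetical placeholders on example.com, the CSV is embedded as a string, and all function names are illustrative.

```python
import csv
import io
from urllib.parse import urlencode

# Stand-in for the keyword CSV file; add a row here to update every search.
KEYWORDS_CSV = "keyword\nDawn Foster\nMeeGo\nfastwonder\n"

# Hypothetical search endpoints that return RSS for a query.
SEARCH_TEMPLATES = [
    "http://search.example.com/feed?",
    "http://photos.example.com/rss?",
]


def read_keywords(csv_text):
    """Return the non-empty values of the 'keyword' column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["keyword"].strip() for row in reader if row["keyword"].strip()]


def search_urls(keywords, templates=SEARCH_TEMPLATES):
    """Build one search URL per (endpoint, keyword) pair."""
    return [t + urlencode({"q": k}) for t in templates for k in keywords]


def dedupe_and_exclude(items, blocked_words=()):
    """Drop items with a blocked word in the title, then drop repeat links."""
    seen_links = set()
    kept = []
    for item in items:
        title = item["title"].lower()
        if any(word.lower() in title for word in blocked_words):
            continue
        if item["link"] in seen_links:
            continue
        seen_links.add(item["link"])
        kept.append(item)
    return kept


keywords = read_keywords(KEYWORDS_CSV)
urls = search_urls(keywords)

# Invented results, as if two searches returned overlapping items.
results = [
    {"title": "MeeGo release notes", "link": "http://example.com/1"},
    {"title": "MeeGo release notes", "link": "http://example.com/1"},
    {"title": "MeeGo job listing spam", "link": "http://example.com/2"},
]
clean = dedupe_and_exclude(results, blocked_words=["spam"])
```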
  • 25. How Should / Shouldn't You Use All of This? ● Do: – Use this for personal productivity – Play around and understand the possibilities – Create prototypes for something you might want to build ● Don't: Use in critical or production environments ● Everything I've done here could be done in most programming languages ● For production use or putting data on websites: – Re-write in a real programming language with cached results and error checking XKCD Comic: http://xkcd.com/327/
  • 26. Q&A About Dawn: ● Intel Community Manager for MeeGo ● More Info: http://fastwonderblog.com ● Dawn@FastWonder.com ● @geekygirldawn on Twitter Additional Reading: ● http://fastwonderblog.com/yahoo-pipes-and-rss-hacks/ Photo of Dawn: http://www.flickr.com/photos/ahockley/3036575066/
  • 27. 03/15/11