SlideShare a Scribd company logo
1 of 27
Download to read offline
Hacking RSS:
        Filtering & Processing
    Obscene Amounts of Information
              #hackingRSS

       Dawn Foster
Intel Community Manager
        for MeeGo
 dawn@fastwonder.com
Information Overload




                       CD Photo: http://www.flickr.com/photos/chefranden/2751354004/
Who Cares?


●   Most of it is …
    –   complete crap
    –   out of date / obsolete
    –   not interesting to you
    –   irrelevant for you




                                 Junk Pile: http://www.flickr.com/photos/zen/4013525/
You Want to Find the Needle




                      Haystacks: http://www.flickr.com/photos/rasekh/4911673659/
RSS Alone is a Start
●   Sources you care about delivered right to you. But …
    –   Do you care about everything in each feed?
    –   What about the feeds you aren't subscribed to?
    –   Can you keep up with what you have?
Prioritize Your Reader



●   Put things you care about at the top
●   Categorize
●   Don't try to read everything
The Real Magic is in Filtering RSS
                       Complete Crap
                         Interesting
                        Maybe Relevant
                               Yay!
●   In my Google Reader right now:
    –   Analyst research blogs mentioning Online Community
    –   Analyst research blogs mentioning MeeGo
    –   Searches across social sites mentioning me, my projects, my
        websites etc. - filtering out things I don't care about
    –   My favorite blogs filtered using PostRank to find only the
        ones with a lot of comments or social mentions
RSS Filtering Tools
●   Yahoo Pipes (my favorite)
    –   More powerful & fexible: options to filter any data found in
        any field in the rss feed (URL, title, description, author …)
    –   Downside: takes some time to learn & can be a little faky at
        times. Also a single point of failure if Yahoo ever killed it.



●   Other Options
    –   FeedRinse: easy to use, not as fexible. Import RSS feeds,
        add filters, get new RSS feeds out.
    –   RSS readers with filtering / alerts (FeedDemon)
    –   Code: write your own filters
    –   Note: many free RSS filtering services have gone out of
        business – can be bandwidth intensive & costly to host.
Yahoo Pipes Filtering Example
●   Input:
    –   WebWorkerDaily
    –   ReadWriteWeb
●   Filter by content:
    –   Collaborate
    –   Collaboration
    –   Collaborative
●   Output:
    –   1 RSS Feed
    –   Matching 3 keywords




          2 Minute Yahoo Pipe Video How-to's: http://fastwonderblog.com/yahoo-pipes-and-rss-hacks/
PostRank
●   Best Posts in a
    feed
●   Ranked on
    engagement (links,
    sharing, comments)
●   Can get output as
    RSS feed
●   Feed includes
    postrank number as
    a field
What's In a Feed? PostRank (Yahoo Pipes View)




●   Content in feeds varies wildly depending on site.
●   Common: title, author, pubDate, link, content, description
●   Site-specific: postrank, lat/long, image links, username,
    twitter source … (most RSS readers don't show these)
●   API: usually has additional data & can output RSS
●   If it's in the feed, you can use it!
Reformatting / Modifying RSS Feeds
   Don't be satisfied with default RSS feed formats!

 Twitter
 Search




 Twitter
 RSS
 Feed

           Modify & more quickly scan key data
Yahoo Pipes: Reformat Twitter Feed
●   Input:
    –   Twitter Search
        feed
●   Loop String Build:
    –   Author
    –   : (spacing)
    –   Title
●   Loop Assign:
    –   Store result back
        into title
●   Output:
    –   1 RSS feed
    –   Efficient format
BackTweets (BackType API)
●   Data about links on
    Twitter
●   Finds links regardless of
    shortening service
●   No RSS Feeds
●   But … You can use
    API + Pipes to build
    one!
BackType + Twitter API + Pipes Output
●   Data from BackType + Twitter
●   Built an RSS feed using Yahoo Pipes
●   Included the information relevant for me
●   Could have included or filtered on: name, listed count,
    location, profile image, user URL, ...
Admit it, we ALL do vanity searches
 ●   You can enter your search queries in Google, Twitter,
     Flickr …
       –   Add a new project & have to update all of them
       –   Can be hard to filter out some results
       –   May have duplicates from multiple searches
 ●   Yahoo Pipes
       –   Update keywords in a CSV file
       –   Use CSV file as input into a bunch of searches (RSS or
           API inputs)
       –   Filter out what you don't want
       –   Get 1 filtered RSS feed as output



2 minute video: http://fastwonderblog.com/2009/05/01/keyword-csv-files-and-searching-2-minute-yahoo-pipes-demo/
How Should / Shouldn't You Use All of This?
●   Do:
    –   Use this for personal productivity
    –   Play around, create prototypes and understand the possibilities
●   Don't:
    –   Don't violate licenses on content or republish w/o permission
    –   Don't use in critical or production environments




●   For production use or putting data on websites:
    –   Re-write in a real programming language with cached results
        and error checking
                       XKCD Comic: http://xkcd.com/327/
Learn More
About Dawn:
● Intel Community Manager for MeeGo

● Author of Companies and Communities

● More Info: http://fastwonderblog.com

● Dawn@FastWonder.com

● @geekygirldawn on Twitter




                                                                         18


Additional Reading & audio from 1 hour version of this talk:
● http://fastwonderblog.com/yahoo-pipes-and-rss-hacks/


                            Photo of Dawn: http://www.flickr.com/photos/ahockley/3036575066/
Backup
Outsource / Crowdsource New Sources
Yahoo Pipes: Reformat PostRank Feed
●   Input:
    –   3 PostRank feeds
●   Loop String Build:
    –   PostRank
    –   : (spacing)
    –   Title
●   Loop Assign:
    –   Store result back
        into title
●   Output:
    –   1 RSS feed
    –   Efficient format
Yahoo Pipes PostRank Example
●   Input PostRank
    Feeds:
    –   Engadget
    –   CrunchGear
    –   Boy Genius
●   Filter by content
    –   Tablet
●   Sort:
    –   PostRank
●   Output
    –   1 RSS feed
    –   Best tablet posts
Using Web APIs 101
●   Many API calls are basically URLs
●   Constructing URLs
    –   Use API documentation/examples to
        format the URL
    –   http://api.twitter.com/1/statuses/show
        /ID.xml
         ●   Version 1 of API show status for ID
             in .format
●   API keys
    –   Tells API who you are (password)
●   Rate limiting
    –   Only get so much & you're cut of
    –   Limited by IP or API key
    –   Chill out for a while & come back
                                                   XKCD Comic: http://xkcd.com/844/
Backtweets API + Twitter API + Yahoo Pipes
●   What we want to do:
    –   Start with a set of URLs (blog posts in a feed)
    –   Find any tweet mentioning those URLs
    –   Return the tweet and data about the person who posted it
●   Mission: Build feed using only data from these 2 APIs
●   BackType API provides Tweet ID (not humanly useful)
    –   http://api.backtype.com/tweets/search/links.xml?
        q=URL&mode=batch&key=KEY
    –   List of Twitter Status IDs for Tweets linking to URL
    –   Note: I think this feature may be deprecated
●   Twitter API uses Tweet ID to get everything else
    –   http://api.twitter.com/1/statuses/show/ID.xml
    –   Returns a single status all relevant data for ID
BackTweets API: Get Tweet ID




●   Take WebWorkerDaily Author Feed
●   Use WWD URLs to build URLs for BackType API call
●   Fetch data from BackType URLs to get Tweet ID
Twitter API: Get Data Based on Tweet ID




●   Use BackType tweet ID to build URL for Twitter API
●   Fetch data about Tweet & User from Twitter API
●   Re-Build title to show “user (followers): tweet”
Add Filters to BackType + Twitter Example
●   Show only tweets from people with 1000+ followers

More Related Content

Similar to Hacking RSS: Filtering & Processing Obscene Amounts of Information (short version)

Optimizing Content Visibility (St. Louis WordCamp)
Optimizing Content Visibility (St. Louis WordCamp)Optimizing Content Visibility (St. Louis WordCamp)
Optimizing Content Visibility (St. Louis WordCamp)Teresa Lane
 
Webinar Structured Data
Webinar Structured DataWebinar Structured Data
Webinar Structured DataBotify
 
SMX Advanced 2015 Seattle | SEO Recap
SMX Advanced 2015 Seattle | SEO RecapSMX Advanced 2015 Seattle | SEO Recap
SMX Advanced 2015 Seattle | SEO RecapRenee Girard
 
How to annotate_with_wordpress
How to annotate_with_wordpressHow to annotate_with_wordpress
How to annotate_with_wordpressSTIinnsbruck
 
SEO for Developers - Little Rock Tech Fest 2014
SEO for Developers - Little Rock Tech Fest 2014SEO for Developers - Little Rock Tech Fest 2014
SEO for Developers - Little Rock Tech Fest 2014Bill Hartzer
 
Tracking online conversations with Yahoo Pipes
Tracking online conversations with Yahoo PipesTracking online conversations with Yahoo Pipes
Tracking online conversations with Yahoo PipesCorinne Weisgerber
 
StripeCon EU 2021 - Can you make it more like google?
StripeCon EU 2021 - Can you make it more like google?StripeCon EU 2021 - Can you make it more like google?
StripeCon EU 2021 - Can you make it more like google?Andrew Paxley
 
Social Media Data Collection & Analysis
Social Media Data Collection & AnalysisSocial Media Data Collection & Analysis
Social Media Data Collection & AnalysisScott Sanders
 
The Zeitgeist Movement
The Zeitgeist MovementThe Zeitgeist Movement
The Zeitgeist Movementguest915c8c5
 
India Pr Wire May 11, 2009 Sensex Down 193 Points On Profit Booking
India Pr Wire May 11, 2009 Sensex Down 193 Points On Profit BookingIndia Pr Wire May 11, 2009 Sensex Down 193 Points On Profit Booking
India Pr Wire May 11, 2009 Sensex Down 193 Points On Profit BookingJagannadham Thunuguntla
 
Miyagawa
MiyagawaMiyagawa
Miyagawaguru100
 
Curiosity Bits Python Tutorial: Mining Facebook Fan Page - getting posts and ...
Curiosity Bits Python Tutorial: Mining Facebook Fan Page - getting posts and ...Curiosity Bits Python Tutorial: Mining Facebook Fan Page - getting posts and ...
Curiosity Bits Python Tutorial: Mining Facebook Fan Page - getting posts and ...Weiai Wayne Xu
 
DMAP: Data Aggregation and Presentation Framework
DMAP: Data Aggregation and Presentation FrameworkDMAP: Data Aggregation and Presentation Framework
DMAP: Data Aggregation and Presentation FrameworkParang Saraf
 
Working Smarter: SEO Automation to Increase Efficiency and Effectiveness - Pa...
Working Smarter: SEO Automation to Increase Efficiency and Effectiveness - Pa...Working Smarter: SEO Automation to Increase Efficiency and Effectiveness - Pa...
Working Smarter: SEO Automation to Increase Efficiency and Effectiveness - Pa...State of Search Conference
 

Similar to Hacking RSS: Filtering & Processing Obscene Amounts of Information (short version) (20)

Optimizing Content Visibility (St. Louis WordCamp)
Optimizing Content Visibility (St. Louis WordCamp)Optimizing Content Visibility (St. Louis WordCamp)
Optimizing Content Visibility (St. Louis WordCamp)
 
Webinar Structured Data
Webinar Structured DataWebinar Structured Data
Webinar Structured Data
 
SMX Advanced 2015 Seattle | SEO Recap
SMX Advanced 2015 Seattle | SEO RecapSMX Advanced 2015 Seattle | SEO Recap
SMX Advanced 2015 Seattle | SEO Recap
 
How to annotate_with_wordpress
How to annotate_with_wordpressHow to annotate_with_wordpress
How to annotate_with_wordpress
 
SEO for Developers - Little Rock Tech Fest 2014
SEO for Developers - Little Rock Tech Fest 2014SEO for Developers - Little Rock Tech Fest 2014
SEO for Developers - Little Rock Tech Fest 2014
 
Tracking online conversations with Yahoo Pipes
Tracking online conversations with Yahoo PipesTracking online conversations with Yahoo Pipes
Tracking online conversations with Yahoo Pipes
 
StripeCon EU 2021 - Can you make it more like google?
StripeCon EU 2021 - Can you make it more like google?StripeCon EU 2021 - Can you make it more like google?
StripeCon EU 2021 - Can you make it more like google?
 
Recsys 2016
Recsys 2016Recsys 2016
Recsys 2016
 
Social Media Data Collection & Analysis
Social Media Data Collection & AnalysisSocial Media Data Collection & Analysis
Social Media Data Collection & Analysis
 
The Zeitgeist Movement
The Zeitgeist MovementThe Zeitgeist Movement
The Zeitgeist Movement
 
India Pr Wire May 11, 2009 Sensex Down 193 Points On Profit Booking
India Pr Wire May 11, 2009 Sensex Down 193 Points On Profit BookingIndia Pr Wire May 11, 2009 Sensex Down 193 Points On Profit Booking
India Pr Wire May 11, 2009 Sensex Down 193 Points On Profit Booking
 
Miyagawa
MiyagawaMiyagawa
Miyagawa
 
Miyagawa
MiyagawaMiyagawa
Miyagawa
 
Miyagawa
MiyagawaMiyagawa
Miyagawa
 
Miyagawa
MiyagawaMiyagawa
Miyagawa
 
Rss Feeds
Rss FeedsRss Feeds
Rss Feeds
 
Curiosity Bits Python Tutorial: Mining Facebook Fan Page - getting posts and ...
Curiosity Bits Python Tutorial: Mining Facebook Fan Page - getting posts and ...Curiosity Bits Python Tutorial: Mining Facebook Fan Page - getting posts and ...
Curiosity Bits Python Tutorial: Mining Facebook Fan Page - getting posts and ...
 
DMAP: Data Aggregation and Presentation Framework
DMAP: Data Aggregation and Presentation FrameworkDMAP: Data Aggregation and Presentation Framework
DMAP: Data Aggregation and Presentation Framework
 
Indexing repositories: Pitfalls & best practices
Indexing repositories: Pitfalls & best practicesIndexing repositories: Pitfalls & best practices
Indexing repositories: Pitfalls & best practices
 
Working Smarter: SEO Automation to Increase Efficiency and Effectiveness - Pa...
Working Smarter: SEO Automation to Increase Efficiency and Effectiveness - Pa...Working Smarter: SEO Automation to Increase Efficiency and Effectiveness - Pa...
Working Smarter: SEO Automation to Increase Efficiency and Effectiveness - Pa...
 

More from Dawn Foster

CHAOSS Metrics Overview and Examples
CHAOSS Metrics Overview and ExamplesCHAOSS Metrics Overview and Examples
CHAOSS Metrics Overview and ExamplesDawn Foster
 
Be a Good Corporate Citizen in Kubernetes
Be a Good Corporate Citizen in KubernetesBe a Good Corporate Citizen in Kubernetes
Be a Good Corporate Citizen in KubernetesDawn Foster
 
Overcoming Imposter Syndrome to Become a Conference Speaker!
Overcoming Imposter Syndrome to Become a Conference Speaker!Overcoming Imposter Syndrome to Become a Conference Speaker!
Overcoming Imposter Syndrome to Become a Conference Speaker!Dawn Foster
 
How to Be a Good Corporate Citizen in Open Source
How to Be a Good Corporate Citizen in Open SourceHow to Be a Good Corporate Citizen in Open Source
How to Be a Good Corporate Citizen in Open SourceDawn Foster
 
Open Source Collaboration and Companies: Finding the Right Balance
Open Source Collaboration and Companies: Finding the Right BalanceOpen Source Collaboration and Companies: Finding the Right Balance
Open Source Collaboration and Companies: Finding the Right BalanceDawn Foster
 
Navigating Open Source Risk
Navigating Open Source RiskNavigating Open Source Risk
Navigating Open Source RiskDawn Foster
 
Measuring Project Health at VMware
Measuring Project Health at VMwareMeasuring Project Health at VMware
Measuring Project Health at VMwareDawn Foster
 
Navigating Open Source Risk
Navigating Open Source RiskNavigating Open Source Risk
Navigating Open Source RiskDawn Foster
 
Collaborative Leadership: Governance Beyond Company Affiliation
Collaborative Leadership: Governance Beyond Company AffiliationCollaborative Leadership: Governance Beyond Company Affiliation
Collaborative Leadership: Governance Beyond Company AffiliationDawn Foster
 
Collaborative Leadership: Governance Beyond Company Affiliation
Collaborative Leadership: Governance Beyond Company AffiliationCollaborative Leadership: Governance Beyond Company Affiliation
Collaborative Leadership: Governance Beyond Company AffiliationDawn Foster
 
Collaborative Leadership: Governance Beyond Company Affiliation
Collaborative Leadership: Governance Beyond Company AffiliationCollaborative Leadership: Governance Beyond Company Affiliation
Collaborative Leadership: Governance Beyond Company AffiliationDawn Foster
 
Collaborative Leadership: Governance Beyond Company Affiliation
Collaborative Leadership: Governance Beyond Company AffiliationCollaborative Leadership: Governance Beyond Company Affiliation
Collaborative Leadership: Governance Beyond Company AffiliationDawn Foster
 
Is this Open Source Project Healthy or Lifeless?
Is this Open Source Project Healthy or Lifeless?Is this Open Source Project Healthy or Lifeless?
Is this Open Source Project Healthy or Lifeless?Dawn Foster
 
Collaboration in Linux Kernel Mailing Lists
Collaboration in Linux Kernel Mailing Lists Collaboration in Linux Kernel Mailing Lists
Collaboration in Linux Kernel Mailing Lists Dawn Foster
 
Be a Good Corporate Citizen in Kubernetes
Be a Good Corporate Citizen in KubernetesBe a Good Corporate Citizen in Kubernetes
Be a Good Corporate Citizen in KubernetesDawn Foster
 
Being a Good Corporate Citizen in Open Source
Being a Good Corporate Citizen in Open SourceBeing a Good Corporate Citizen in Open Source
Being a Good Corporate Citizen in Open SourceDawn Foster
 
Building Community for your Company’s OSS Projects
Building Community for your Company’s OSS ProjectsBuilding Community for your Company’s OSS Projects
Building Community for your Company’s OSS ProjectsDawn Foster
 
Building Community for your Company’s OSS Project
Building Community for your Company’s OSS ProjectBuilding Community for your Company’s OSS Project
Building Community for your Company’s OSS ProjectDawn Foster
 
How to be a terrible hiring manager
How to be a terrible hiring managerHow to be a terrible hiring manager
How to be a terrible hiring managerDawn Foster
 
A week in the Life of Kubernetes
A week in the Life of KubernetesA week in the Life of Kubernetes
A week in the Life of KubernetesDawn Foster
 

More from Dawn Foster (20)

CHAOSS Metrics Overview and Examples
CHAOSS Metrics Overview and ExamplesCHAOSS Metrics Overview and Examples
CHAOSS Metrics Overview and Examples
 
Be a Good Corporate Citizen in Kubernetes
Be a Good Corporate Citizen in KubernetesBe a Good Corporate Citizen in Kubernetes
Be a Good Corporate Citizen in Kubernetes
 
Overcoming Imposter Syndrome to Become a Conference Speaker!
Overcoming Imposter Syndrome to Become a Conference Speaker!Overcoming Imposter Syndrome to Become a Conference Speaker!
Overcoming Imposter Syndrome to Become a Conference Speaker!
 
How to Be a Good Corporate Citizen in Open Source
How to Be a Good Corporate Citizen in Open SourceHow to Be a Good Corporate Citizen in Open Source
How to Be a Good Corporate Citizen in Open Source
 
Open Source Collaboration and Companies: Finding the Right Balance
Open Source Collaboration and Companies: Finding the Right BalanceOpen Source Collaboration and Companies: Finding the Right Balance
Open Source Collaboration and Companies: Finding the Right Balance
 
Navigating Open Source Risk
Navigating Open Source RiskNavigating Open Source Risk
Navigating Open Source Risk
 
Measuring Project Health at VMware
Measuring Project Health at VMwareMeasuring Project Health at VMware
Measuring Project Health at VMware
 
Navigating Open Source Risk
Navigating Open Source RiskNavigating Open Source Risk
Navigating Open Source Risk
 
Collaborative Leadership: Governance Beyond Company Affiliation
Collaborative Leadership: Governance Beyond Company AffiliationCollaborative Leadership: Governance Beyond Company Affiliation
Collaborative Leadership: Governance Beyond Company Affiliation
 
Collaborative Leadership: Governance Beyond Company Affiliation
Collaborative Leadership: Governance Beyond Company AffiliationCollaborative Leadership: Governance Beyond Company Affiliation
Collaborative Leadership: Governance Beyond Company Affiliation
 
Collaborative Leadership: Governance Beyond Company Affiliation
Collaborative Leadership: Governance Beyond Company AffiliationCollaborative Leadership: Governance Beyond Company Affiliation
Collaborative Leadership: Governance Beyond Company Affiliation
 
Collaborative Leadership: Governance Beyond Company Affiliation
Collaborative Leadership: Governance Beyond Company AffiliationCollaborative Leadership: Governance Beyond Company Affiliation
Collaborative Leadership: Governance Beyond Company Affiliation
 
Is this Open Source Project Healthy or Lifeless?
Is this Open Source Project Healthy or Lifeless?Is this Open Source Project Healthy or Lifeless?
Is this Open Source Project Healthy or Lifeless?
 
Collaboration in Linux Kernel Mailing Lists
Collaboration in Linux Kernel Mailing Lists Collaboration in Linux Kernel Mailing Lists
Collaboration in Linux Kernel Mailing Lists
 
Be a Good Corporate Citizen in Kubernetes
Be a Good Corporate Citizen in KubernetesBe a Good Corporate Citizen in Kubernetes
Be a Good Corporate Citizen in Kubernetes
 
Being a Good Corporate Citizen in Open Source
Being a Good Corporate Citizen in Open SourceBeing a Good Corporate Citizen in Open Source
Being a Good Corporate Citizen in Open Source
 
Building Community for your Company’s OSS Projects
Building Community for your Company’s OSS ProjectsBuilding Community for your Company’s OSS Projects
Building Community for your Company’s OSS Projects
 
Building Community for your Company’s OSS Project
Building Community for your Company’s OSS ProjectBuilding Community for your Company’s OSS Project
Building Community for your Company’s OSS Project
 
How to be a terrible hiring manager
How to be a terrible hiring managerHow to be a terrible hiring manager
How to be a terrible hiring manager
 
A week in the Life of Kubernetes
A week in the Life of KubernetesA week in the Life of Kubernetes
A week in the Life of Kubernetes
 

Recently uploaded

Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbuapidays
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...apidays
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 

Recently uploaded (20)

Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 

Hacking RSS: Filtering & Processing Obscene Amounts of Information (short version)

  • 1. Hacking RSS: Filtering & Processing Obscene Amounts of Information #hackingRSS Dawn Foster Intel Community Manager for MeeGo dawn@fastwonder.com
  • 2. Information Overload CD Photo: http://www.flickr.com/photos/chefranden/2751354004/
  • 3. Who Cares? ● Most of it is … – complete crap – out of date / obsolete – not interesting to you – irrelevant for you Junk Pile: http://www.flickr.com/photos/zen/4013525/
  • 4. You Want to Find the Needle Haystacks: http://www.flickr.com/photos/rasekh/4911673659/
  • 5. RSS Alone is a Start ● Sources you care about delivered right to you. But … – Do you care about everything in each feed? – What about the feeds you aren't subscribed to? – Can you keep up with what you have?
  • 6. Prioritize Your Reader ● Put things you care about at the top ● Categorize ● Don't try to read everything
  • 7. The Real Magic is in Filtering RSS Complete Crap Interesting Maybe Relevant Yay! ● In my Google Reader right now: – Analyst research blogs mentioning Online Community – Analyst research blogs mentioning MeeGo – Searches across social sites mentioning me, my projects, my websites etc. - filtering out things I don't care about – My favorite blogs filtered using PostRank to find only the ones with a lot of comments or social mentions
  • 8. RSS Filtering Tools ● Yahoo Pipes (my favorite) – More powerful & fexible: options to filter any data found in any field in the rss feed (URL, title, description, author …) – Downside: takes some time to learn & can be a little faky at times. Also a single point of failure if Yahoo ever killed it. ● Other Options – FeedRinse: easy to use, not as fexible. Import RSS feeds, add filters, get new RSS feeds out. – RSS readers with filtering / alerts (FeedDemon) – Code: write your own filters – Note: many free RSS filtering services have gone out of business – can be bandwidth intensive & costly to host.
  • 9. Yahoo Pipes Filtering Example ● Input: – WebWorkerDaily – ReadWriteWeb ● Filter by content: – Collaborate – Collaboration – Collaborative ● Output: – 1 RSS Feed – Matching 3 keywords 2 Minute Yahoo Pipe Video How-to's: http://fastwonderblog.com/yahoo-pipes-and-rss-hacks/
  • 10. PostRank ● Best Posts in a feed ● Ranked on engagement (links, sharing, comments) ● Can get output as RSS feed ● Feed includes postrank number as a field
  • 11. What's In a Feed? PostRank (Yahoo Pipes View) ● Content in feeds varies wildly depending on site. ● Common: title, author, pubDate, link, content, description ● Site-specific: postrank, lat/long, image links, username, twitter source … (most RSS readers don't show these) ● API: usually has additional data & can output RSS ● If it's in the feed, you can use it!
  • 12. Reformatting / Modifying RSS Feeds Don't be satisfied with default RSS feed formats! Twitter Search Twitter RSS Feed Modify & more quickly scan key data
  • 13. Yahoo Pipes: Reformat Twitter Feed ● Input: – Twitter Search feed ● Loop String Build: – Author – : (spacing) – Title ● Loop Assign: – Store result back into title ● Output: – 1 RSS feed – Efficient format
  • 14. BackTweets (BackType API) ● Data about links on Twitter ● Finds links regardless of shortening service ● No RSS Feeds ● But … You can use API + Pipes to build one!
  • 15. BackType + Twitter API + Pipes Output ● Data from BackType + Twitter ● Built an RSS feed using Yahoo Pipes ● Included the information relevant for me ● Could have included or filtered on: name, listed count, location, profile image, user URL, ...
  • 16. Admit it, we ALL do vanity searches ● You can enter your search queries in Google, Twitter, Flickr … – Add a new project & have to update all of them – Can be hard to filter out some results – May have duplicates from multiple searches ● Yahoo Pipes – Update keywords in a CSV file – Use CSV file as input into a bunch of searches (RSS or API inputs) – Filter out what you don't want – Get 1 filtered RSS feed as output 2 minute video: http://fastwonderblog.com/2009/05/01/keyword-csv-files-and-searching-2-minute-yahoo-pipes-demo/
  • 17. How Should / Shouldn't You Use All of This? ● Do: – Use this for personal productivity – Play around, create prototypes and understand the possibilities ● Don't: – Don't violate licenses on content or republish w/o permission – Don't use in critical or production environments ● For production use or putting data on websites: – Re-write in a real programming language with cached results and error checking XKCD Comic: http://xkcd.com/327/
  • 18. Learn More About Dawn: ● Intel Community Manager for MeeGo ● Author of Companies and Communities ● More Info: http://fastwonderblog.com ● Dawn@FastWonder.com ● @geekygirldawn on Twitter 18 Additional Reading & audio from 1 hour version of this talk: ● http://fastwonderblog.com/yahoo-pipes-and-rss-hacks/ Photo of Dawn: http://www.flickr.com/photos/ahockley/3036575066/
  • 20. Outsource / Crowdsource New Sources
  • 21. Yahoo Pipes: Reformat PostRank Feed ● Input: – 3 PostRank feeds ● Loop String Build: – PostRank – : (spacing) – Title ● Loop Assign: – Store result back into title ● Output: – 1 RSS feed – Efficient format
  • 22. Yahoo Pipes PostRank Example ● Input PostRank Feeds: – Engadget – CrunchGear – Boy Genius ● Filter by content – Tablet ● Sort: – PostRank ● Output – 1 RSS feed – Best tablet posts
  • 23. Using Web APIs 101 ● Many API calls are basically URLs ● Constructing URLs – Use API documentation/examples to format the URL – http://api.twitter.com/1/statuses/show /ID.xml ● Version 1 of API show status for ID in .format ● API keys – Tells API who you are (password) ● Rate limiting – Only get so much & you're cut of – Limited by IP or API key – Chill out for a while & come back XKCD Comic: http://xkcd.com/844/
  • 24. Backtweets API + Twitter API + Yahoo Pipes ● What we want to do: – Start with a set of URLs (blog posts in a feed) – Find any tweet mentioning those URLs – Return the tweet and data about the person who posted it ● Mission: Build feed using only data from these 2 APIs ● BackType API provides Tweet ID (not humanly useful) – http://api.backtype.com/tweets/search/links.xml? q=URL&mode=batch&key=KEY – List of Twitter Status IDs for Tweets linking to URL – Note: I think this feature may be deprecated ● Twitter API uses Tweet ID to get everything else – http://api.twitter.com/1/statuses/show/ID.xml – Returns a single status all relevant data for ID
  • 25. BackTweets API: Get Tweet ID ● Take WebWorkerDaily Author Feed ● Use WWD URLs to build URLs for BackType API call ● Fetch data from BackType URLs to get Tweet ID
  • 26. Twitter API: Get Data Based on Tweet ID ● Use BackType tweet ID to build URL for Twitter API ● Fetch data about Tweet & User from Twitter API ● Re-Build title to show “user (followers): tweet”
  • 27. Add Filters to BackType + Twitter Example ● Show only tweets from people with 1000+ followers