This is not just about the affiliate penalty: the big challenge with data feeds is differentiation. Typically we want the site itself to be unique and, as a consequence, to carry unique content too.
Two causes of the affiliate penalty: algorithms and people.
This is the classical model of unique content. People think of unique content in terms of absolutes.
However, the truth is much more complicated. Unique content is a sliding scale. You can have unique content duplicated across your own site, or you can mash up public information. Pages can be combinations of unique and non-unique content.
User-generated content is awesome, but hard to get hold of without a community.
Ways of getting content from “users”.
Ways of getting content from users on other sites. Or mashing up content from sites behind login walls etc.
A springboard of ideas for building your own tools and mashups.
Content generated by users, but they didn’t imagine it would make it onto the site. E.g. “queries used to find this page”
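A minimal sketch of the “queries used to find this page” idea, assuming your server logs still expose the referrer URL with the search phrase in its query string (many search engines no longer pass it). The `QUERY_PARAMS` table and the function name are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical mapping: search-engine host fragment -> query-string parameter.
QUERY_PARAMS = {"google": "q", "bing": "q", "yahoo": "p"}


def search_query_from_referrer(referrer):
    """Return the lowercased search phrase in a referrer URL, or None."""
    parsed = urlparse(referrer)
    host = parsed.netloc.lower()
    for engine, param in QUERY_PARAMS.items():
        if engine in host:
            values = parse_qs(parsed.query).get(param)
            if values and values[0].strip():
                return values[0].strip().lower()
    return None


if __name__ == "__main__":
    print(search_query_from_referrer(
        "http://www.google.co.uk/search?q=Kingston+USB+memory+stick"))
```

Phrases collected this way could be counted per landing page and the most frequent ones displayed on that page.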
Talking about the conversation you had with Chewie – how it’s not just about unique content, but about
The example keyword on the left is pulled from an Amazon feed. The keyword on the right is what people search for. Perform keyword research intelligently and group/theme your keywords so that the products you’re fed match up with what people search for.
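One way to sketch the grouping/theming step is a simple phrase lookup. The `THEMES` table below is a hypothetical, hand-built example; in practice it would come from your own keyword research.

```python
from collections import defaultdict

# Hypothetical themes: each theme lists the phrases that signal it.
THEMES = {
    "usb memory stick": ["memory stick", "flash drive", "usb stick", "pen drive"],
    "memory card": ["sd card", "memory card"],
}


def group_by_theme(keywords):
    """Bucket keywords under the first theme whose phrase they contain."""
    groups = defaultdict(list)
    for keyword in keywords:
        lowered = keyword.lower()
        theme = next(
            (name for name, phrases in THEMES.items()
             if any(phrase in lowered for phrase in phrases)),
            "unthemed",
        )
        groups[theme].append(keyword)
    return dict(groups)


if __name__ == "__main__":
    print(group_by_theme(["Kingston USB flash drive 4gb", "cheap memory stick"]))
```

Feed products and researched search phrases can both be passed through the same function, so that each product lands next to the phrases people actually use for it.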
You can be unique by taking non-unique content and displaying it in valuable ways.
Transcript of "Data Feed SEO for Affiliates by Will Critchlow"
Data Feed SEO<br />A4uexpo London, October 2010<br />Will Critchlow<br />
Building quick & dirty SEO Tools: A Cheat Sheet & Inspiration<br />
Sources<br />
Magic<br />
Horsepower<br />
APIs (more on programmable web)<br />
AdWords – Keywords<br />
Alchemy – Structured data & text<br />
Bing – Search, news, spelling<br />
Evri – Sentiment and popularity<br />
Face.com – Face detection<br />
Facebook – Social graph<br />
Google Analytics – Visitor data<br />
Hostip – Geo data<br />
LinkedIn – Professional data<br />
Pingdom – Website uptime<br />
Postrank (1, 2, 3) – Real-time & influence<br />
Rapleaf – Social media profiles<br />
Twitter – Real time and social<br />
... And of course:<br />
Linkscape – Links<br />
YQL – Yahoo! Query Language<br />
select * from html where url="<url>" and xpath="<xpath>"<br />
select * from html where url="<url>"<br />
select * from feed where url="<url>"<br />
select * from search.web where query="<query>"<br />
Crawlers / Scrapers<br />
Mozenda<br />
80legs<br />
Google App Engine<br />
Amazon Web Services<br />
Human Touch<br />
Amazon Mechanical Turk<br />
Smartsheet (interface to Mechanical Turk)<br />
oDesk<br />
Python<br />
Since Python is the language of Google App Engine, here is how you can use YQL easily within Python:<br />
Download source – extract to yql folder within your application<br />
import yql<br />
y = yql.Public()<br />
result = y.execute("<yql query>")<br />
XPath (more examples)<br />
/foo – the root element ‘foo’<br />
//bar – all elements ‘bar’<br />
foo/bar – all bar elements that are children of foo<br />
foo//bar – bar at arbitrary levels below foo<br />
foo/*/bar – bar grandchildren of foo<br />
foo/* – all child elements of foo<br />
foo/@bar – the bar attribute on foo<br />
foo[@bar] – foo elements that have a bar attribute<br />
foo[@bar='baz'] – foo elements whose bar attribute equals baz<br />
Data (more on infochimps)<br />
Data.gov – US government data<br />
Data.gov.uk – UK government data<br />
Delicious list – from Peter Skomoroch<br />
Google Public Data – Directory<br />
Guardian – content and data<br />
World Bank – finance, health, etc.<br />
80legs – prepackaged crawl data<br />
By Will Critchlow, www.distilled.co.uk. First published: www.seomoz.org<br />
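The XPath patterns in the cheat sheet can be tried locally before pointing them at a live page. The sketch below uses Python's standard-library `xml.etree.ElementTree`, which supports only a subset of XPath: paths are relative to the element you search from, and attribute values are read with `.get()` rather than selected with `@bar`. The sample document and the `texts` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample document; element and attribute names mirror the cheat sheet.
FEED = """
<root>
  <foo bar="baz"><bar>direct</bar><mid><bar>nested</bar></mid></foo>
  <foo><bar>plain</bar></foo>
</root>
"""
root = ET.fromstring(FEED)


def texts(path):
    """Return the text of every element matched by an ElementTree path."""
    return [el.text for el in root.findall(path)]


if __name__ == "__main__":
    print(texts("foo/bar"))      # bar children of foo
    print(texts(".//bar"))       # bar anywhere in the document
    print(texts("foo//bar"))     # bar at any depth below foo
    print(texts("foo/*/bar"))    # bar grandchildren of foo
    # foo elements that carry a bar attribute, and the attribute's value
    print([el.get("bar") for el in root.findall("foo[@bar]")])
    # foo elements whose bar attribute equals 'baz'
    print(len(root.findall("foo[@bar='baz']")))
```

For scraping real-world HTML, which is rarely well-formed XML, a more forgiving parser would be needed; this example only illustrates how the path expressions select nodes.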
User Generated “Content”<br /><ul><li>External search queries</li>
<li>FAQs/Support emails</li></ul>
Tracking # of Reviews<br />
_gaq.push(['_setCustomVar',<br />
 1, // This custom var is set to slot #1.<br />
 'Number of Reviews', // The top-level name for the variable<br />
 '1', // The number of reviews<br />
 3 // Page-level variable<br />
]);<br />
Context Is Key<br />Google News: Google likes alternative facts<br />Lyrics: Never considered duplicate content<br />Context is key<br />Look to stand out from your competitors<br />“Use a source of content that’s not unique, but that no-one else in your space is using”<br />
Manipulate & Clean Your Data<br />“Kingston DataTraveler 101 USB flash drive - 4 GB – Cyan”<br />vs<br />“Kingston USB memory stick 4gb”<br />
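A rough sketch of how a feed title like the one on this slide might be cleaned toward the phrase people search for. The noise pattern and synonym table are hypothetical and tuned only to this one example; a real feed would need rules built from your own keyword research.

```python
import re

# Hypothetical rules, written for the slide's example title only.
NOISE = re.compile(r"\b(datatraveler\s*\d*|cyan)\b")  # model names, colours
SYNONYMS = {"flash drive": "memory stick"}            # feed term -> search term


def clean_title(title):
    """Normalise a feed product title toward a searcher-style phrase."""
    text = title.lower()
    text = NOISE.sub(" ", text)                         # drop model name and colour
    text = re.sub(r"(\d+)\s*gb\b", r"\g<1>gb", text)    # "4 gb" -> "4gb"
    for feed_term, search_term in SYNONYMS.items():
        text = text.replace(feed_term, search_term)
    text = re.sub(r"[^\w\s]", " ", text)                # strip dashes and punctuation
    return " ".join(text.split())                       # collapse whitespace


if __name__ == "__main__":
    print(clean_title("Kingston DataTraveler 101 USB flash drive - 4 GB – Cyan"))
```

The output is lowercased; restoring brand capitalisation for display would be a separate step.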
Of Course, Links Always Win<br />http://www.seobook.com/black-hat-seo-case-study<br />
Manual Reviews – aka “Hand Jobs”<br />Check out the quality rater guidelines<br />“Add value to users”<br />“Relevant”<br />These are subjective!!<br />