Google’s Ranking Factors 2011: Early data from SEOmoz’s survey of 132 SEO professionals and correlation data from 10,000+ keyword rankings. Rand Fishkin, SEOmoz CEO, April 2011. Download at: http://bit.ly/rankfactorssydney
SEOmoz Makes Software! We don’t offer consulting.
Understanding, Interpreting & Using Survey Opinion Data: Everybody’s wrong sometimes, but there’s a lot we can learn from the aggregation of opinions.
#1: Opinions are Not Fact (these are smart people, but they can’t know everything about Google’s rankings). #2: Not Everyone Agrees (standard deviation can help show us the degree of consensus). #3: Data is Still Preliminary (these are raw responses without any filtering). Many thanks to all who contributed their time to take the survey! http:/googleblog.blogspot.com/2010/06/our-new-search-index-caffeine.html
Understanding, Interpreting & Using Correlation Data: This is powerful, useful information, but with that power comes responsibility to present it accurately.
Methodology: 10,271 Keywords, pulled from Google AdWords US Suggestions (all SERPs were pulled from Google in March 2011, after the Panda/Farmer update). Top 30 Results Retrieved for Each Keyword (excluding all vertical/non-standard results). Correlations are for Pages/Sites that Appear Higher in the Top 30 (we use the mean of Spearman’s correlation coefficient across all SERPs). Results Where <2 URLs Contain a Given Feature Are Excluded (this also holds true for results where all the URLs contain the same values for a feature). More details, including complete documentation and the raw dataset, will be released in May with the published version of the 2011 Ranking Factors.
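The methodology above can be sketched in a few lines of Python. This is a minimal illustration, not SEOmoz's actual pipeline: the input layout (a list of SERPs, each a list of `(position, feature_value)` pairs with `None` for a missing feature) and every function name are assumptions made for the example. It computes Spearman's coefficient per SERP, skips SERPs where fewer than two URLs have the feature or where all values are identical, and averages what remains.

```python
from statistics import mean

def average_ranks(values):
    """Rank values 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average rank for the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def spearman(xs, ys):
    """Spearman's coefficient is Pearson's coefficient computed on ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

def mean_serp_correlation(serps):
    """Mean per-SERP Spearman correlation between ranking and a feature.

    serps: hypothetical layout, a list of SERPs, each a list of
    (position, feature_value) pairs; feature_value is None when missing.
    """
    per_serp = []
    for serp in serps:
        rows = [(pos, val) for pos, val in serp if val is not None]
        values = [val for _, val in rows]
        # Exclusion rules from the slide: <2 URLs with the feature,
        # or every URL carrying the same value.
        if len(rows) < 2 or len(set(values)) < 2:
            continue
        # Negate position so a positive result means "ranks higher".
        per_serp.append(spearman([-pos for pos, _ in rows], values))
    return mean(per_serp) if per_serp else None
```

Negating the position is a design choice for readability: position 1 is the best result, so without the sign flip a feature that helps rankings would show up as a negative number.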
Correlation & Dolphins: Dolphins who swim at the front of the pod tend to have larger dorsal fins, more muscular tails and more damage on their flippers. The first two might have a causal link, but the damaged flippers are likely a result of swimming at the front (i.e. having damaged flippers doesn’t make a dolphin a better front-of-the-pod swimmer). Likewise, with ranking correlations, there are probably many features that are correlated with, but not necessarily the cause of, the positive/negative rankings.
Correlation IS NOT Causation: Earning more linking root domains to a URL may indeed increase that page’s ranking. But will adding more characters to the HTML code of a page increase rankings? Probably not. Just because a feature is correlated, even very highly, doesn’t necessarily mean that improving that metric on your site will improve your rankings.
How Confident Can We Be in the Accuracy of these Correlations? Because we have such a large data set, standard error is extremely low. This means that even for small correlations, our estimates of the mean correlation are close to the actual mean correlation across all searches. Standard error won’t be reported in this presentation, but it’s less than 0.0035 for all of the Spearman correlation results (so we can feel quite confident about our numbers).
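As a rough illustration of why a large keyword set drives standard error down, the standard error of a mean is the sample standard deviation divided by the square root of the sample size. The sketch below assumes you already have one Spearman coefficient per SERP; the function name and the 0.3 spread used in the comment are hypothetical, chosen only to show the scale of the effect.

```python
from math import sqrt
from statistics import mean, stdev

def mean_and_standard_error(correlations):
    """Return (mean, standard error of the mean) for per-SERP correlations."""
    n = len(correlations)
    return mean(correlations), stdev(correlations) / sqrt(n)

# Illustrative arithmetic only: if per-SERP correlations had a standard
# deviation of 0.3 (an assumed figure) across 10,000 SERPs, the standard
# error would be 0.3 / sqrt(10000) = 0.003.
assumed_spread = 0.3
assumed_serps = 10_000
print(assumed_spread / sqrt(assumed_serps))
```

Because the divisor grows with the square root of the number of SERPs, quadrupling the keyword set only halves the standard error.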
Do Correlations in this Range Have Value/Meaning? A factor w/ 1.0 correlation would explain 100% of Google’s algorithm across 10K+ keywords. Most of our data is in this range. A rough rule of thumb with linear fit numbers is that they explain the number squared of the system’s variance. Thus, a factor with correlation 0.3 would explain ~9% of Google’s algorithm.
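The squaring rule of thumb is easy to check by hand. The short sketch below (the function name is made up for the example) simply squares a correlation to get the share of variance a linear fit would account for; keep in mind the slide itself calls this a rough rule, and it applies to linear fits rather than to rank correlations in any strict sense.

```python
def variance_explained(r):
    """Share of variance accounted for by a linear fit with correlation r."""
    return r * r

for r in (0.1, 0.2, 0.3, 1.0):
    print(f"r = {r:.1f} -> about {variance_explained(r):.0%} of variance")
```

The relationship is quadratic, so halving a correlation cuts the explained share to a quarter: a 0.15 correlation accounts for roughly 2% rather than 4.5%.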
Are You Ready for Some Data?!
Overall Algorithmic Factors: These compare opinion/survey data from 2009 vs. 2011.
In 2009, link-based factors (page and domain-level) comprised 65%+ of voters’ algorithmic assessment.
In 2011, link-based factors (page and domain-level) have shrunk in the voters’ minds to only ~45% of algorithmic components. Note: because the question options changed slightly (and more options were added), direct comparison may not be entirely fair.
Page-Specific Link Signals: These metrics are based on links that point specifically to the ranking page.
Most Important Page-Level Link Factors (as voted on by 132 SEOs). My guess: Some voters didn’t fully understand the “linking c-blocks” choice. With opinion data, voters ordered the factors from most important to least. Thus, when looking at opinion stats, the factor voters felt was most important will have the smallest rank.
In the rest of this deck, we’ll use linking c-blocks as a reference point, hence the red. This data is exactly what an SEO would expect – the more diverse the sources, the greater the correlation with higher rankings. These numbers are relatively similar to June 2010 data.
Correlations of Page-Level, Anchor Text-Based Link Data. No Surprise: Total links (including internal) w/ anchor text is less well-correlated than external links w/ anchor text. Partial anchor text matches have greater correlation than exact match. This might be correlation only, or could indicate that the common SEO wisdom to vary anchor text is accurate.
Rand’s Takeaways. #1: SEOs Believe the Power of Links Has Declined (correlation of link data w/ rankings has fallen slightly from 2010 to 2011 as well). #2: Diversity of Links > Raw Quantity (This fits well with most SEOs’ expectations. Also helps me feel better about the correlation data). #3: Exact Match Anchor Text Appears Slightly Less Well Correlated than Partial Anchor Text in External Links (This was surprising to me, though from Google’s perspective, it makes good sense. The aggregated voter opinions agreed with this, too.) These are my personal takeaways from the data; others’ interpretations may vary.
Domain-Wide Link Signals: These metrics are based on links that point to anywhere on the ranking domain.
Most Important Domain-Level Link Factors (as voted on by 132 SEOs). C-Blocks: Likely the same vote interpretation issue as with page-level. Voters seem to believe that diversity/quantity is more important than quality.
Correlation of Domain-Level Link Data. Nice Work! Excluding the “c-blocks” issue, voters + correlations match nicely. Domain-level link data is surprisingly similar to page-level link data in correlation.
Rand’s Takeaways. #1: Google May Rank Pages, But Domains Matter Too (the closeness of correlation data and the opinions of voters both back this up). #2: Link Velocity & Diversity of Link Types Would Be Interesting to Measure Given Voters’ Opinions (Hopefully we can look at these in future analyses). #3: Correlations w/ “All” Links vs. Followed-Only is Odd (Let’s take a closer look at these correlations).
Something Funny About Nofollows: These compare followed vs. nofollowed links to the domain + page.
Correlation of Followed vs. Nofollowed Links. Nofollowed Matters? Many SEOs have been saying that nofollow links can help w/ rankings. The correlation suggests maybe they’re right. These numbers exhibit why we like to build ranking models using machine learning. Models can help determine whether nofollowed links have a causal impact or whether it’s mere correlation.
Correlation of Followed Links to Nofollowed Links (i.e. Are nofollowed links well correlated w/ rankings only because they’re indicative of followed links?). Hard to know for sure, but based on this data, it could go either way – nofollowed links, in some way, seem to have a positive impact on rankings. Some live tests are likely in order.
On-Page Signals: These metrics are based on keyword usage and features of the ranking document.
Most Important On-Page, Keyword-Use Factors (as voted on by 132 SEOs). My guess: Some voters didn’t fully understand the internal/external link anchors choice. NOTE: We surveyed SEOs about more on-page optimization features, but I didn’t include them all on this chart as it would make the labels very tiny and hard to read.
Correlation of On-Page Keyword-Use Elements. Curious: Longer documents seem to rank better than shorter ones. Keyword-based factors are generally less well correlated w/ higher rankings than links. This is just a sampling of the on-page elements we observed; some factors haven’t yet been calculated and thus couldn’t be compared for this presentation. They’ll be in the full version.
Correlation of On-Page Keyword-Use Elements. The theory that AdSense use boosts rankings isn’t supported by the data. More reason to believe Google when they say page load speed is a factor, but a very small one. There’s a longtime rumor that linking externally to Google.com (or Microsoft on Bing) helps with rankings. It’s comforting to see that, correlation-wise, linking to MS is better on Google.
Rand’s Takeaways. #1: Very Tough to Differentiate w/ On-Page Optimization (as in the past, the data suggests that lots of results are getting on-page right). #2: Longer/Larger Documents Tend to Rank Better (It could be that post-Panda/Farmer update, robust content is rewarded more). #3: Long Titles + URLs are Still Likely Bad for SEO (In addition to the negative correlations, they’re harder to share, to type in and to link to). #4: Using Keywords Earlier in Tags/Docs Seems Wise (Correlation backs up the common wisdom that keywords closer to the top matter more). We definitely need to look at more on-page factors in the data for the full report, too.
Domain Name Match Signals: These signals are based on data from users of Twitter, Facebook & Google Buzz via their APIs.
Domain Name Extensions in the Search Results: Google may not love .info and .biz, but they like them better than Canadians!
Spearman’s Correlation with Google Rankings for Exact Match Domain Names, June 2010 vs. March 2011. Whoa! The influence of exact match domain names seems to have waned considerably. Links… not so much. The sample data sets are fairly comparable in every way – both come via Google AdWords suggestions, both include approx. 10K keyword rankings and both were gathered from Google US.
Rand’s Takeaways. #1: Exact Match Domains May Not Be as Powerful (though it’s possible that both numbers reflect correlation only, not causation). #2: Exact .coms Fell Farther than Any Other Factor (Possibly a lot of gaming or manipulation happening w/ those sites?). #3: Link Count Correlations Remain Similar (This fits w/ my experience and makes me more comfortable comparing the data sets). Domain names are still powerful (0.22 correlation for .com exacts), but perhaps losing ground.
Social Signals: These signals are based on data from users of Twitter, Facebook & Google Buzz via their APIs.
Most Important Social Media-Based Factors (as voted on by 132 SEOs). Curious: For Twitter, voters felt authority matters more, while for Facebook, it’s raw quantity (could be because GG doesn’t have as much access to FB graph data). Although we didn’t ask voters for a cutoff on what they believe matters vs. doesn’t, I suspect many/most would have said that Google Buzz and Digg/Reddit/SU aren’t used in the rankings.
Correlation of Social Media-Based Factors (data via Topsy API & Google Buzz API). Amazing: Facebook Shares is our single highest correlated metric with higher Google rankings. Although voters thought Twitter data / tweets to URLs were more influential, Facebook’s metrics are substantially better correlated with rankings. Time to get more FB Shares!
Percent of Results (from our 10,200 Keyword Set) in Which the Feature Was Present. It amazed me that Facebook Share data was present for 61% of pages in the top 30 results. For most link factors, 99%+ of results had data from Linkscape; for social data, this was much lower, but still high enough that standard error is below 0.0025 for each of the metrics.
Correlation of Social Metrics, Controlling for Links (i.e. Are pages ranking well because of links and social metrics are simply good predictors of linking activity?). Chart series: Correlations Controlling for Links; Raw Correlations. Twitter’s correlation wanes dramatically, but Facebook features, while lower, still appear quite influential. Facebook likely deserves much more SEO attention than it currently receives.
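The deck doesn't say how links were controlled for, so the following is only one standard way to approach the question, not a description of SEOmoz's method. A first-order partial correlation takes three pairwise correlations (social metric vs. ranking, social metric vs. links, links vs. ranking) and estimates what is left of the first once the shared link signal is removed. The function name and the input numbers are hypothetical.

```python
from math import sqrt

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y with the influence of z removed.

    r_xy: e.g. social metric vs. ranking
    r_xz: e.g. social metric vs. link count
    r_yz: e.g. link count vs. ranking
    """
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Made-up inputs for illustration: a raw correlation of 0.30 shrinks
# once a moderately related control variable is accounted for.
print(partial_correlation(0.30, 0.50, 0.25))
```

If the social metric were unrelated to links (`r_xz = 0`), the partial correlation would equal the raw one scaled only by the link-to-ranking term, which is why a metric that tracks links closely loses the most when links are controlled for.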
Rand’s Takeaways. #1: Social is Shockingly Well-Correlated (it’s hard to doubt causation, particularly after reading the SearchEngineLand interview here). #2: Facebook may be more influential than Twitter (Or it may be that Facebook data is simply more robust/available for URLs in the SERPs). #3: Google Buzz is Probably Not in Use Directly (Since so many users simply have their Tweet streams go to Buzz, and correlation is lower). #4: We Need to Learn More About How Social is Used (Understanding how Google uses social metrics, parses “anchor text,” etc. looms large). Expect more experimentation and, sadly, some gaming attempts w/ Twitter + Facebook by SEOs (and spammers) in the future.
Highest Positively + Negatively Correlated Metrics Overall: These are the features most indicative of higher vs. lower rankings.
Top 8 Strongest Correlated Metrics. Exact match domain is actually not in the top 8, but I thought I should include it, as it was, previously, one of the metrics most predictive of positive rankings.
Top 8 Most Negatively Correlated Metrics. Be concise and to-the-point; it’s good for users and for your rankings. Long domain names, titles and URLs all had negative correlations with rankings. Again, I’ve included # of words in title, which isn’t technically in the top 8, but is still interesting.
Top 8 Most Negatively Correlated Metrics. One of the most surprising finds in our dataset. We double-checked to be sure. 40% of URLs in the set had only followed links, and these tended to have lower Page Authority (and lower rankings) than those w/ both followed and nofollowed links. Our data scientist thinks there’s some correlation between having nofollowed links and other good/natural link signals. Also note that % of followed links on a page has a slightly negative correlation with rankings. Perhaps sites that make all their links out followed aren’t being careful about what they link to?
Which Domains Appeared Most Frequently in Our 10K+ SERPs?
Top 20 Root Domains Most Prevalent in our 10,200 keyword set (top 30 rank positions). SEOs may be disappointed to see eHow.com performing so well, but classic content aggregators like About.com + Wikipedia still beat them.
What Do the Experts Think the Future Holds?
What Do SEOs Believe Will Happen w/ Google’s Use of Ranking Features in the Future? While there was some significant contention about issues like paid links and ads vs. content, the voters nearly all agreed that social signals and perceived user value signals have bright futures.
IMPORTANT! Don’t Misuse or Misattribute Correlation Data! Think of correlation data as a way of seeing features of sites that rank well, rather than a way of seeing what metrics search engines are actually measuring and counting. A well-correlated metric can often be its own reward, even if it doesn’t directly impact search engine rankings. Virtually all the data in this report reflect the best practices of inbound marketing overall – and using the data to help support these is an excellent application. We are looking forward to sharing the full data in the new version of the Search Ranking Factors report coming in May 2011. Lots more cool info along with the full dataset will be available then. Thanks much! Rand
Q+A. Download at: http://bit.ly/rankfactorssydney You can now try SEOmoz PRO Free! http://www.seomoz.org/freetrial Rand Fishkin, CEO & Co-Founder, SEOmoz. Twitter: @randfish

Ranking Factors Data 2011: SMX Elite Sydney
