iStrategy AMS 2011 - Gillian Muessig, SEO Moz


iStrategy Amsterdam 2011: Gillian Muessig, Co-Founder, SEO Moz - From SEO to Cloud Marketing (Day Two - Keynote Session Panel Discussion)

Published in: Technology, News & Politics

  • Some queries are very simple: a search for "wikipedia" is unambiguous, and even a very basic web search engine can return the right result. Other searches aren't nearly as simple. Let's look at how engines might order two results, a problem that is simple most of the time but can become complex depending on the situation. Since Content A contains the word “Batman” and Content B does not, the engine can easily choose which one to rank.
  • The search engine can use TF*IDF to determine that “Wiggum” is a much less common word than “chief” and thus, Content A is more relevant to the query than Content B. NOTE: This example also does a good job of showing the inherent weakness of a metric like keyword density.
  • Using co-occurrence, the engine can determine that phrases like “Daily Planet” and “Clark Kent” appear with “Superman” and thus, Content B is more relevant than Content A.
  • As humans reading both sentences, we can infer that Content B is obviously about the musical instrument – a piano – and the woman playing it. But a search engine armed with only the methods we described above will struggle, since both sentences use the words “keys” and “notes”, some of the few clues to the puzzle. NOTE: We were pretty excited to see that our LDA modeling tool correctly scored B higher than A… but then things got REALLY interesting.
  • For complex queries, or when relating large quantities of results with lots of content-related signals, search engines need ways to determine the intent of a particular page. Just because a page contains a keyword 4 or 5 times in prominent places, or even mentions similar phrases/synonyms, doesn’t necessarily mean that it's truly relevant to the searcher's query.
  • In this imaginary example, every word in the English language is related to either "cat" or "dog". They are the only topics available. To measure whether a word is more related to "dog," we use a vector space model that displays those relationships mathematically. The illustration does a reasonable job showing our simplistic world. Words like "bigfoot" are perfectly in the middle, with no more closeness to "cat" than to "dog." But words like "canine" and "feline" are clearly closer to one than the other, and the degree of the angle in the vector model illustrates this and gives us a number. BTW, in an LDA vector space model, topics wouldn't have exact label associations like "dog" and "cat" but would instead be things like "the vector around the topic of dogs." Now take the simple model above and scale it to thousands or millions of topics, each of which would have its own dimension. Using this construct, the model can compute the similarity between any word or group of words and the topics it has created. You can learn more about this from Stanford University's posting of Introduction to Information Retrieval, which has a specific section on Vector Space Models.
  • The correlation of the LDA scores with rankings is uncanny. Certainly, it's not a perfect correlation, but that’s expected, given the complexity of Google's ranking algorithm. Seeing LDA scores show this dramatic result makes us seriously question whether there was causation at work here. We hope to do additional research via our ranking models to attempt to show that impact. Perhaps good links are more likely to point to pages that are more "relevant" via a topic model, or perhaps some other aspect of Google's algorithm that we don't yet understand naturally biases towards these pages.
  • Like anything else in the SEO world, manipulatively applying the process is probably a terrible idea. Even if this tool worked perfectly to measure keyword relevance and topic modeling in Google, it would be unwise to simply stuff 50 keywords on your page to get the highest LDA score you could. Quality content that real people actually want to find should be the goal of SEO, and Google is sophisticated enough to determine the difference between junk content that matches topic models and real content that real users will like, even if the tool's scoring can't do that.
  • We've just made the LDA Labs tool available. You can use this to input a word, phrase, chunk of text or an entire page's content (via the URL input box) along with a desired query (the keyword term/phrase you want to rank for) and the tool will give back a score that represents the cosine similarity in a percentage form (100% = perfect, 0% = no relationship).
  • If you're trying to do serious SEO analysis and improvement, Rand suggests you build a chart something like this. This chart shows SERPs analysis of "SEO" in w/Linkscape Metrics + LDA
  • Search engines have, classically, relied on a relatively universal algorithm - one that rates pages based on the metrics available, without massive swings between verticals. In the past few years, however, savvy searchers and many SEOs have noted a distinct shift to a model where certain types of sites have a greater opportunity to perform for certain queries. The odds aren't necessarily stacked against outsiders, but the engines appear to bias toward the types of content providers that are likely to fulfill the users' intent. For example, when a user performs a search for "lamb shanks," it could make a lot of sense to give an extra boost to sites whose content is focused on recipes and food. Bill Slawski reported on Entity Association - rather than just looking for brands, it’s more likely that Google is trying to understand when a query includes an entity – a specific person, place, or thing. And if it can identify an entity, that identification can influence the search results that you see...
  • Click and visit data is being used to rank results for better personalization.
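
The TF*IDF note above ("Wiggum" is much rarer than "chief") can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four-sentence corpus is invented (the actual Content A and Content B text from the slide is not in these notes), the helper names `tokenize`, `idf` and `tfidf_score` are hypothetical, and the smoothed IDF formula is one common variant, not a description of what any real search engine computes.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def idf(term, docs):
    """Smoothed inverse document frequency: rarer terms get larger weights."""
    containing = sum(1 for doc in docs if term in doc)
    return math.log((1 + len(docs)) / (1 + containing)) + 1.0


def tfidf_score(query, doc, docs):
    """Sum of (term frequency * IDF) over the query terms for one document."""
    counts = Counter(doc)
    return sum(counts[term] / len(doc) * idf(term, docs) for term in tokenize(query))


# Hypothetical corpus; the first two entries stand in for Content A and Content B.
corpus = [tokenize(text) for text in (
    "chief wiggum patrols springfield",
    "the police chief met the fire chief",
    "the chief executive spoke today",
    "a new chief of staff was named",
)]
content_a, content_b = corpus[0], corpus[1]
query = "chief wiggum"

idf_chief = idf("chief", corpus)
idf_wiggum = idf("wiggum", corpus)
score_a = tfidf_score(query, content_a, corpus)
score_b = tfidf_score(query, content_b, corpus)

# Keyword density for "chief" alone would prefer Content B.
density_a = content_a.count("chief") / len(content_a)
density_b = content_b.count("chief") / len(content_b)

print(f"idf(chief)={idf_chief:.3f} idf(wiggum)={idf_wiggum:.3f}")
print(f"tf-idf A={score_a:.3f} B={score_b:.3f}")
print(f"density A={density_a:.3f} B={density_b:.3f}")
```

In this toy corpus the stand-in for Content B has the higher density of "chief", yet the TF*IDF score favours the stand-in for Content A because the rare term carries more weight. That is the weakness of keyword density the note points out.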

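The cat/dog vector space note and the Labs tool's percentage score both come down to cosine similarity, which can be sketched as follows. The two-dimensional coordinates for "feline", "canine" and "bigfoot" are invented for illustration, and `cosine_similarity` and `as_percent` are hypothetical helper names; a real LDA model would learn many topic dimensions from data rather than use hand-picked axes.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def as_percent(similarity):
    """Express a similarity in [0, 1] as a percentage, rounded to one decimal."""
    return round(100.0 * similarity, 1)


# Axes are (cat, dog). All coordinates are invented for illustration.
TOPIC_CAT = (1.0, 0.0)
TOPIC_DOG = (0.0, 1.0)
WORDS = {
    "feline": (0.9, 0.1),
    "canine": (0.1, 0.9),
    "bigfoot": (0.5, 0.5),
}

# For each word: (similarity to "cat" in percent, similarity to "dog" in percent).
scores = {
    word: (as_percent(cosine_similarity(vec, TOPIC_CAT)),
           as_percent(cosine_similarity(vec, TOPIC_DOG)))
    for word, vec in WORDS.items()
}

for word, (cat_pct, dog_pct) in scores.items():
    print(f"{word}: cat={cat_pct}% dog={dog_pct}%")
```

With these made-up vectors, "bigfoot" sits at the same angle from both topics, while "canine" and "feline" each lean strongly toward one axis, mirroring the illustration described in the note.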
    1. 1. From SEO to Cloud Marketing
        Where We Came From
        Where We’re Headed
        What To Do About It
        Gillian Muessig – iStrategy May 2011
    2. 2. The Web’s Most Popular Search Marketing Software
    3. 3. What IS SEO?
        It is enabling the dissemination of ideas on the web
    4. 4.
    5. 5. Socio-Political Ramifications of Our Work
    6. 6. What IS SEO?
    7. 7. Revolution is fomented on the empty bellies of men
        Karl Marx, Leon Trotsky, Fidel Castro, Che Guevara, and many others
    8. 8. It’s Like This…
    9. 9. 1999 - 2002
    10. 10. On-Page Optimization
    11. 11. Slaves to PageRank
    12. 12. PageRank Busted!
    13. 13. 2003 - 2005
    14. 14. Anchor Text
    15. 15. Keyword-Match Domain Names
    16. 16. Registration & Historical Information
    17. 17. 1999-2008: What Page Ranked #1 for the Queries “Exit” & “Leave”?
    18. 18. Topic Modeling
        LDA correlates w/ Google rankings better than any other on-page feature
    19. 19. Why Engines Need Topic Modeling
    20. 20. Term Frequency & Inverse Document Frequency
    21. 21. Co-Occurrence
    22. 22. Topic Modeling
    23. 23. Content-related Signals Require Ability to Determine INTENT
        Rock, grenade, or baseball?
        Are you SURE?
    24. 24. Simplistic Term Vector Model
    25. 25. Causation? Not So Fast!
        • Good links may be more likely to point to more "relevant" pages
        • Other aspects of Google's algorithm may naturally bias towards these pages
    26. 26. Expect the Percentage Output to Fluctuate ~1-5%
        • Think of it like polling, not counting
        • Checking every possibility would take an exceptionally long time
        Image credit:
    27. 27. Out of our SERPs!
        • Keyword Spamming might improve your LDA Score, but probably not your rankings
    28. 28. The Labs Tool
        keywords here
        Text from url here
    29. 29. Build a Chart: Compare Your Friends
    30. 30. How to Use Topic Tool
        • Think about negative keywords in a similar fashion to negative keywords for PPC
        • Think about positive keywords in localized terms
    31. 31. Perspective: It’s All Relative
        • The numbers are RELATIVE
        • Track numbers over time – shoot for improvement
    32. 32. 2006 - 2009
    33. 33. Domain Authority
    34. 34. External Link Source Diversity
    35. 35. Nofollow, Sitemaps & Webmaster Tools
    36. 36. Search Quality Raters
    37. 37. Where We Are
        2009 - 2011
    38. 38. Twitter Data
        Danny Sullivan: If an article is retweeted or referenced much in Twitter, do you count that as a signal outside of finding any non-nofollowed links that may naturally result from it?
        Google: Yes, we do use it as a signal. It is used as a signal in our organic and news rankings. We also use it to enhance our news universal by marking how many people shared an article
    39. 39. Twitter Test
        Page A: 646 links from 36 root domains, 2 tweets
        Page B: 1 link from 1 root domain, 522 tweets
    40. 40. Twitter: Clearly Influencing Google
        Page B – the tweeted version – ranks #1!
        Page A: 646 links from 36 root domains, 2 tweets
        Page B: 1 link from 1 root domain, 522 tweets
    41. 41. Twitter Data: Very Powerful for QDF
    42. 42. Don’t Bother Abusing Twitter for SEO
    43. 43. Author Authority
        Danny Sullivan: Do you try to calculate the authority of someone who tweets that might be assigned to their Twitter page. Do you try to “know,” if you will, who they are?
        Bing: Yes. We do calculate the authority of someone who tweets. For known public figures or publishers, we do associate them with who they are. (For example, query for Danny Sullivan)
        Google: Yes we do compute and use author quality. We don’t know who anyone is in real life :-)
    44. 44. Facebook Likes & Shares
    45. 45. Brand Signals
    46. 46. Brand Signals
        Brands:
        • Have real people working at a physical address
        • Have authentic, followed social accounts
        • Display obvious, robust contact information
        • Register with government/civic organizations
        • Receive traffic from diverse sources
        • Generate branded search query volume
        • Run offline marketing/advertising campaigns
        Generics:
        • Often exist only online
        • Rarely have significant social accounts
        • Frequently use email forms only
        • Stay “under the radar”
        • Search is often 90%+ of traffic
        • Have little-no branded search demand
        • Ignore the offline world
    60. 60. Entity Association
    61. 61. User & Usage Behavior
    62. 62. Where We Are
    63. 63. The Ranking Factors Are Changing
    64. 64. Search Engine Ranking Factors 2009
    65. 65. Search Engine Ranking Factors 2011
        Preliminary Data
    66. 66. Big Changes from 2009 to 2011
        • Link-Based Factors are waning
        • Social Data is increasing
        • Page-Level Link Metrics Fell the Most (43% - 22%)
        • Keyword-Level Domain Metrics, Brand Data + Social Rising
        • The Survey itself asked for more detail/specificity which may be responsible for some of these shifts
        The new version of the ranking factors will be online in April, 2011
    71. 71. Pandas & Farmers
    72. 72. From the Mouths of Googlers
        How do you recognize a shallow-content site?
        Singhal: (W)e asked… “Would you be comfortable giving this site your credit card? Would you be comfortable giving medicine prescribed by this site to your kids?”
        Cutts: (Using) a rigorous set of questions… “Do you consider this site to be authoritative? Would it be okay if this was in a magazine? Does this site have excessive ads?”
        Singhal: And based on that, we basically formed some definition of what could be considered low quality.
    73. 73. From the Mouths of Googlers
        But how do you implement that algorithmically?
        Cutts: I think you look for signals that recreate that same intuition, that same experience that you have as an engineer and that users have.
        Singhal: You can imagine in a hyperspace a bunch of points, some points are red, some points are green, and in others there’s some mixture. Your job is to find a plane which says that most things on this side of the plane are red, and most of the things on that side of the plane are the opposite of red.
    74. 74. From the Mouths of Googlers
        Googlers want to know:
        • Trustworthy?
        • Expert/enthusiast author?
        • Facts checked, well edited, or sloppy?
        • Genuinely interesting?
        • Original, insightful, comprehensive, print worthy?
        • Worthy of bookmarking? Sharing? Recommending?
    80. 80. From the Mouths of Googlers
        Googlers want to know:
        • Safe place for your credit card info?
        • Do excessive ads distract from or interfere with content?
        • Is it better than competitive pages?
    83. 83. Where We’re Headed
    84. 84. Google SERPs circa 1999
    85. 85. Google SERPs 2011
    86. 86. Google SERPs Tomorrow
    87. 87. What’s Shrinking
        • Low quality links
        • Low quality content
        • Low value/fake social profiles
        • Over SEO’ing
        • Simple gaming tactics
        • Link farms, keyword stuffing, social spamming
    92. 92. Thin Content, Content Aggregating, Plagiarism: OUT!
    93. 93. Over SEO’ing: OUT!<br />seoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseoseo<br />seoseoseoseoseoseoseo<br />seoseoseoseoseoseoseo<br />seoseoseoseoseoseoseo<br />seoseoseoseoseoseoseo<br />seoseoseoseoseoseoseo<br />
    94. 94. Sheer # of Followers: OUT!
    95. 95. Ignoring Reputation: OUT!
    96. 96. Just a Blog: OUT!
    97. 97. Blogging With Social Promotion: IN!
    98. 98. Domain Authority: IN!
    99. 99. Depth of Engagement & Social Reputation: IN!
        Engagement ring… get it?
    100. 100. Well Written Content & LDA Scoring: IN!
    101. 101. Well Managed & Tagged Content: IN!
    102. 102. Reputation Management & Monitoring: IN!
    103. 103. Q&A Sites: IN!
        Build Your Personal Brand
    104. 104. Competitive Research: IN!
    105. 105. Where We’re Headed
        More searches on Social platforms
    106. 106. Where We’re Headed
        First touch may no longer be on traditional search engines
    107. 107. Where We’re Headed
        Search engines are mining multiple social platforms to connect keywords to data
    108. 108. Where We’re Headed
        Mobile Searches Are WAY Up
        Mobile Commerce Has Slower Adoption
    109. 109. Darwinian Tips
    110. 110. Don’t “Look” Like a Content Farm
    111. 111. Avoid “Classic” SEO Tactics
        Directory Link Building
        Keyword-Variant Abuse
        Reciprocal Link Pages
        Paid Links w/ Manipulative Anchor Text
        Sitewide, Footer Links
        Navigation for Engines, Not Humans
        Low Cost/Quality, Outsourced Content
        Generic Design and Layout
        Anchor-Text Rich Internal Links
        Anonymous Contact Forms
        Keyword Stuffed Titles + Pages
        Ad Blocks Dominating the Page
        It’s great to do good SEO, just don’t look like the only reason the site exists is to draw Google traffic
    112. 112. Take Advantage of New and Evolving Opportunities
    113. 113. Become a “Brand”
        Brands:
        • Have real people working at a physical address
        • Have authentic, followed social accounts
        • Display obvious, robust contact information
        • Register with government/civic organizations
        • Receive traffic from diverse sources
        • Generate branded search query volume
        • Run offline marketing/advertising campaigns
        Generics:
        • Often exist only online
        • Rarely have significant social accounts
        • Frequently use email forms only
        • Stay “under the radar”
        • Search is often 90%+ of traffic
        • Have little-no branded search demand
        • Ignore the offline world
    127. 127. Do Competitive Research
        Where do these brands earn their links?
    128. 128. Research Brand “Mention” Sources
        Facebook page
        Blippr?
        Crunchbase profile
        App profile on Blackberry
        Twitter account
        BusinessWeek Profile
        Chrome Extension
        Mashable Article
    129. 129. Focus on the User | Don’t Forget Engines
    130. 130. Rich Snippets
    131. 131. Re-blog, tweet, target
    132. 132. Be Personal Brand: Q & A Sites
    133. 133. Infographics: Hire Guys Like This!
    134. 134. Get Your Social On
        • Stumble (upon)
        • Thumb up
        • Re-tweet
        • Like it
        • Share it
        • Digg it
        • Redd It
    141. 141. When in Rome…
        Find Your Corporate Voice
        Phenomenal analysis of statements by Googlers + how they translate to content/marketing actions:
        I’m excited to be able to share my life’s passion with you.
    142. 142. Embrace All of Inbound Marketing
        INBOUND MARKETING! (AKA all the “free” traffic sources)
        News/Media/PR
        SEO
        Email
        Research/White Papers
        Blogs + Blogging
        Infographics
        Comment Marketing
        Social Networks
        Online Video
        Webinars
        Forums
        Document Sharing
        Social Bookmarking
        Word of Mouth
        Podcasting
        Direct/Referring Links
        Type-In Traffic
        Q+A Sites
    143. 143. Part II: Stop Competing. Start Owning.
    144. 144. Why are there Billions of Searches?
        Billions of Queries
    145. 145. How is Demand Created?
        via ~diP on Flickr
    146. 146. Inherent Need
        via dimnikolov on Flickr
    147. 147. AD-FUELED DEMAND
        Advertising $300 Billion
        Online Advertising $24 Billion
        Search Ads $10.7 Billion
        Via Interactive Advertising Bureau 2010
    148. 148. You Can Cater to Demand
        via lumierefl on Flickr
    149. 149. Or…
        You Can Create Demand
    150. 150.
    151. 151.
    152. 152.
    153. 153. A Case Study in “Making the Market”
    154. 154.
    155. 155.
    156. 156. DON’T CALL IT “SEO.”
    157. 157.
    158. 158.
    159. 159.
    160. 160.
    161. 161.
    162. 162.
    163. 163.
    164. 164.
    165. 165. [Chart: vertical axis from 2000 to 3250; horizontal axis label Sep-10]
    166. 166. That’s how you beat Wikipedia & Amazon!
    167. 167. Gillian Muessig
        President & Co-Founder, SEOmoz
        • Twitter: @SEOmom
        • Blog:
        • Email:
        Try SEOmoz Free for 60 Days
        iStrategy2011