
iStrategy AMS 2011 - Gillian Muessig, SEO Moz



iStrategy Amsterdam 2011: Gillian Muessig, Co-Founder, SEO Moz - From SEO to Cloud Marketing (Day Two - Keynote Session Panel Discussion)


Published in: Technology, News & Politics

  • Some queries are very simple: a search for "wikipedia" is unambiguous. It’s straightforward and can be answered effectively by even a very basic web search engine. Other searches aren't nearly as simple. Let's look at how engines might order two results, a problem that is simple most of the time but can be somewhat complex depending on the situation. Since Content A contains the word “Batman” and Content B does not, the engine can easily choose which one to rank.
  • The search engine can use TF*IDF to determine that “Wiggum” is a much less common word than “chief” and thus, Content A is more relevant to the query than Content B. NOTE: This example also does a good job of showing the inherent weakness of a metric like keyword density.
  • Using co-occurrence, the engine can determine that phrases like “Daily Planet” and “Clark Kent” appear with “Superman” and thus, Content B is more relevant than Content A.
  • As humans reading both sentences, we can infer that Content B is obviously about the musical instrument – a piano – and the woman playing it. But a search engine armed with only the methods we described above will struggle, since both sentences use the words “keys” and “notes”, some of the few clues to the puzzle. NOTE: We were pretty excited to see that our LDA modeling tool correctly scored B higher than A… but then things got REALLY interesting.
  • For complex queries, or when relating large quantities of results with lots of content-related signals, search engines need ways to determine the intent of a particular page. Simply because a page contains a keyword 4 or 5 times in prominent places, or even mentions similar phrases/synonyms, doesn’t necessarily mean that it's truly relevant to the searcher's query.
  • In this imaginary example, every word in the English language is related to either "cat" or "dog"; they are the only topics available. To measure whether a word is more related to "dog," we use a vector space model that displays those relationships mathematically. The illustration does a reasonable job of showing our simplistic world. Words like "bigfoot" sit perfectly in the middle, no closer to "cat" than to "dog." But words like "canine" and "feline" are clearly closer to one than the other, and the angle in the vector model illustrates this and gives us a number. BTW, in an LDA vector space model, topics wouldn't have exact label associations like "dog" and "cat" but would instead be things like "the vector around the topic of dogs." Take the simple model above and scale it to thousands or millions of topics, each of which would have its own dimension. Using this construct, the model can compute the similarity between any word or group of words and the topics it has created. You can learn more about this from Stanford University's posting of Introduction to Information Retrieval, which has a specific section on Vector Space Models.
  • The correlation of the LDA scores with rankings is uncanny. Certainly, it's not a perfect correlation, but that’s expected, given the complexity of Google's ranking algorithm. Seeing LDA scores show this dramatic result makes us seriously question whether there was causation at work here. We hope to do additional research via our ranking models to attempt to show that impact. Perhaps good links are more likely to point to pages that are more "relevant" under a topic model, or some other aspect of Google's algorithm that we don't yet understand naturally biases towards these pages.
  • Like anything else in the SEO world, manipulatively applying the process is probably a terrible idea. Even if this tool worked perfectly to measure keyword relevance and topic modeling in Google, it would be unwise to simply stuff 50 keywords on your page to get the highest LDA score you could. Quality content that real people actually want to find should be the goal of SEO, and Google is sophisticated enough to determine the difference between junk content that matches topic models and real content that real users will like, even if the tool's scoring can't do that.
  • We've just made the LDA Labs tool available. You can use this to input a word, phrase, chunk of text or an entire page's content (via the URL input box) along with a desired query (the keyword term/phrase you want to rank for) and the tool will give back a score that represents the cosine similarity in a percentage form (100% = perfect, 0% = no relationship).
  • If you're trying to do serious SEO analysis and improvement, Rand suggests you build a chart something like this. This chart shows SERPs analysis of "SEO" in w/Linkscape Metrics + LDA
  • Search engines have, classically, relied on a relatively universal algorithm, one that rates pages based on the metrics available, without massive swings between verticals. In the past few years, however, savvy searchers and many SEOs have noted a distinct shift to a model where certain types of sites have a greater opportunity to perform for certain queries. The odds aren't necessarily stacked against outsiders, but the engines appear to bias toward the types of content providers that are likely to fulfill the users' intent. For example, when a user performs a search for "lamb shanks," it could make a lot of sense to give an extra boost to sites whose content is focused on recipes and food. Bill Slawski reported on Entity Association: rather than just looking for brands, it’s more likely that Google is trying to understand when a query includes an entity – a specific person, place, or thing. And if it can identify an entity, that identification can influence the search results that you see...
  • Click and visit data is being used to rank results for better personalization.
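
The TF*IDF and cosine-similarity ideas in the notes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four short documents and the query "chief wiggum" are made up for the example (the slide's actual Content A and Content B are not in the notes), the helper names are hypothetical, the IDF uses one common smoothed form, and the sketch scores plain TF*IDF term vectors rather than the LDA topic vectors the Labs tool uses.

```python
import math
from collections import Counter

# Hypothetical mini-corpus (assumption: invented for this sketch).
CONTENT_A = "wiggum runs the springfield police department"
CONTENT_B = "the fire chief met the police chief and the chief of staff"
BACKGROUND = [
    "the chief editor approved the story",
    "a chief concern is the budget",
]

def tokenize(text):
    """Lowercase and split on whitespace (enough for this toy corpus)."""
    return text.lower().split()

def idf_table(token_lists):
    """Smoothed inverse document frequency: rarer terms get larger weights."""
    n = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))  # count each term once per document
    return {t: math.log((1 + n) / (1 + df)) + 1 for t, df in doc_freq.items()}

def tfidf_vector(tokens, idf):
    """Sparse vector: term -> (relative term frequency) * IDF."""
    counts = Counter(tokens)
    total = len(tokens)
    # Terms missing from the corpus get weight 0 in this sketch.
    return {t: (c / total) * idf.get(t, 0.0) for t, c in counts.items()}

def cosine_percent(a, b):
    """Cosine similarity of two sparse vectors, as a percentage (0-100)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return 100.0 * dot / (norm_a * norm_b)

corpus = [tokenize(d) for d in [CONTENT_A, CONTENT_B] + BACKGROUND]
idf = idf_table(corpus)

query_vec = tfidf_vector(tokenize("chief wiggum"), idf)
vec_a = tfidf_vector(corpus[0], idf)
vec_b = tfidf_vector(corpus[1], idf)

score_a = cosine_percent(query_vec, vec_a)
score_b = cosine_percent(query_vec, vec_b)
print(f"Content A: {score_a:.1f}%  Content B: {score_b:.1f}%")
```

In this toy corpus, Content B repeats "chief" three times, so a raw keyword-density measure would favor it, while the TF*IDF weighting favors the page containing the rarer word "wiggum". As with the Labs tool, the percentages are only meaningful relative to each other.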
  • Transcript

    • 1. From SEO to Cloud Marketing
      Where We Came From
      Where We’re Headed
      What To Do About It
      Gillian Muessig – iStrategy May 2011
    • 2. The Web’s Most Popular Search Marketing Software
    • 3. What IS SEO?
      It is enabling the dissemination of ideas on the web
    • 4.
    • 5. Socio-Political Ramifications of Our Work
    • 6. What IS SEO?
    • 7. Revolution is fomented on the empty bellies of men
      Karl Marx, Leon Trotsky, Fidel Castro, Che Guevara, and many others
    • 8. It’s Like This…
    • 9. 1999 - 2002
    • 10. On-Page Optimization
    • 11. Slaves to PageRank
    • 12. PageRank Busted!
    • 13. 2003 - 2005
    • 14. Anchor Text
    • 15. Keyword-Match Domain Names
    • 16. Registration & Historical Information
    • 17. 1999-2008: What Page Ranked #1 for the Queries “Exit” & “Leave”?
    • 18. Topic Modeling
      LDA correlates w/ Google rankings better than any other on-page feature
    • 19. Why Engines Need Topic Modeling
    • 20. Term Frequency & Inverse Document Frequency
    • 21. Co-Occurrence
    • 22. Topic Modeling
    • 23. Content-related Signals Require Ability to Determine INTENT
      Rock, grenade, or baseball?
      Are you SURE?
    • 24. Simplistic Term Vector Model
    • 25. Causation? Not So Fast!
      • Good links may be more likely to point to more "relevant" pages
      • 26. Other aspect of Google's algorithm may naturally bias towards these pages
    • Expect the Percentage Output to Fluctuate ~1-5%
      • Think of it like polling, not counting
      • 27. Checking every possibility would take an exceptionally long time
      Image credit:
    • 28. Out of our SERPs!
      • Keyword Spamming might improve your LDA Score, but probably not your rankings
    • The Labs Tool
      keywords here
      Text from url here
    • 29. Build a Chart: Compare Your Friends
    • 30. How to Use Topic Tool
      • Think about negative keywords in a similar fashion to negative keywords for ppc.
      • 31. Think about positive keywords in localized terms
    • Perspective: It’s All Relative
      • The numbers are RELATIVE
      • 32. Track numbers over time – shoot for improvement
    • 2006 - 2009
    • 33. Domain Authority
    • 34. External Link Source Diversity
    • 35. Nofollow, Sitemaps & Webmaster Tools
    • 36. Search Quality Raters
    • 37. Where We Are: 2009 - 2011
    • 38. Twitter Data
      Danny Sullivan: If an article is retweeted or referenced much in Twitter, do you count that as a signal outside of finding any non-nofollowed links that may naturally result from it?
      Google: Yes, we do use it as a signal. It is used as a signal in our organic and news rankings. We also use it to enhance our news universal by marking how many people shared an article.
    • 39. Twitter Test
      Page A
      646 links from 36 root domains
      2 tweets
      Page B
      1 link from 1 root domain
      522 tweets
    • 40. Twitter: Clearly Influencing Google
      Page B – the tweeted version – ranks #1!
      Page A
      646 links from 36 root domains
      2 tweets
      Page B
      1 link from 1 root domain
      522 tweets
    • 41. Twitter Data: Very Powerful for QDF
    • 42. Don’t Bother Abusing Twitter for SEO
    • 43. Author Authority
      Danny Sullivan: Do you try to calculate the authority of someone who tweets that might be assigned to their Twitter page? Do you try to “know,” if you will, who they are?
      Bing: Yes. We do calculate the authority of someone who tweets. For known public figures or publishers, we do associate them with who they are. (For example, query for Danny Sullivan)
      Google: Yes we do compute and use author quality. We don’t know who anyone is in real life :-)
    • 44. Facebook Likes & Shares
    • 45. Brand Signals
    • 46. Brand Signals
      • Have real people working at a physical address
      • 47. Have authentic, followed social accounts
      • 48. Display obvious, robust contact information
      • 49. Register with government/civic organizations
      • 50. Receive traffic from diverse sources
      • 51. Generate branded search query volume
      • 52. Run offline marketing/advertising campaigns
      • 53. Often exist only online
      • 54. Rarely have significant social accounts
      • 55. Frequently use email forms only
      • 56. Stay “under the radar”
      • 57. Search is often 90%+ of traffic
      • 58. Have little-no branded search demand
      • 59. Ignore the offline world
    • 60. Entity Association
    • 61. User & Usage Behavior
    • 62. Where We Are
    • 63. The Ranking Factors Are Changing
    • 64. Search Engine Ranking Factors 2009
    • 65. Search Engine Ranking Factors 2011
      Preliminary Data
    • 66. Big Changes from 2009 to 2011
      • Link-Based Factors are waning
      • 67. Social Data is increasing
      • 68. Page-Level Link Metrics Fell the Most (43% - 22%)
      • 69. Keyword-Level Domain Metrics, Brand Data + Social Rising
      • 70. The Survey itself asked for more detail/specificity which may be responsible for some of these shifts
      The new version of the ranking factors will be online in April, 2011
    • 71. Pandas & Farmers
    • 72. From the Mouths of Googlers: How do you recognize a shallow-content site?
      Singhal: (W)e asked… “Would you be comfortable giving this site your credit card? Would you be comfortable giving medicine prescribed by this site to your kids?”
      Cutts: (Using) a rigorous set of questions… “Do you consider this site to be authoritative? Would it be okay if this was in a magazine? Does this site have excessive ads?”
      Singhal: And based on that, we basically formed some definition of what could be considered low quality.
    • 73. From the Mouths of Googlers: But how do you implement that algorithmically?
      Cutts: I think you look for signals that recreate that same intuition, that same experience that you have as an engineer and that users have.
      Singhal: You can imagine in a hyperspace a bunch of points, some points are red, some points are green, and in others there’s some mixture. Your job is to find a plane which says that most things on this side of the plane are red, and most of the things on that side of the plane are the opposite of red.
    • 74. From the Mouths of Googlers
      Googlers want to know:
      • Trustworthy?
      • 75. Expert/enthusiast author?
      • 76. Facts checked, well edited, or sloppy?
      • 77. Genuinely interesting?
      • 78. Original, insightful, comprehensive, print worthy?
      • 79. Worthy of bookmarking? Sharing? Recommending?
    • 80. From the Mouths of Googlers
      Googlers want to know:
      • Safe place for your credit card info?
      • 81. Do excessive ads distract from or interfere with content?
      • 82. Is it better than competitive pages?
    • 83. Where We’re Headed
    • 84. Google SERPs circa 1999
    • 85. Google SERPs 2011
    • 86. Google SERPs Tomorrow
    • 87. What’s Shrinking
      • Low quality links
      • 88. Low quality content
      • 89. Low value/fake social profiles
      • 90. Over SEO’ing
      • 91. Simple gaming tactics
      • 92. Link farms, keyword stuffing, social spamming
    • Thin Content, Content Aggregating, Plagiarism: OUT!
    • 93. Over SEO’ing: OUT!
    • 94. Sheer # of Followers: OUT!
    • 95. Ignoring Reputation: OUT!
    • 96. Just a Blog: OUT!
    • 97. Blogging With Social Promotion: IN!
    • 98. Domain Authority: IN!
    • 99. Depth of Engagement & Social Reputation: IN!
      Engagement ring… get it?
    • 100. Well Written Content & LDA Scoring: IN!
    • 101. Well Managed & Tagged Content: IN!
    • 102. Reputation Management & Monitoring: IN!
    • 103. Q&A Sites: IN!
      Build Your Personal Brand
    • 104. Competitive Research: IN!
    • 105. Where We’re Headed
      More searches on Social platforms
    • 106. Where We’re Headed
      First touch may no longer be on traditional search engines
    • 107. Where We’re Headed
      Search engines are mining multiple social platforms to connect keywords to data
    • 108. Where We’re Headed
      Mobile Searches Are WAY Up
      Mobile Commerce Has Slower Adoption
    • 109. Darwinian Tips
    • 110. Don’t “Look” Like a Content Farm
    • 111. Avoid “Classic” SEO Tactics
      Directory Link Building
      Keyword-Variant Abuse
      Reciprocal Link Pages
      Paid Links w/ Manipulative Anchor Text
      Sitewide, Footer Links
      Navigation for Engines, Not Humans
      Low Cost/Quality, Outsourced Content
      Generic Design and Layout
      Anchor-Text Rich Internal Links
      Anonymous Contact Forms
      Keyword Stuffed Titles + Pages
      Ad Blocks Dominating the Page
      It’s great to do good SEO, just don’t look like the only reason the site exists is to draw Google traffic
    • 112. Take Advantage of New and Evolving Opportunities
    • 113. Become a “Brand”
      • Have real people working at a physical address
      • 114. Have authentic, followed social accounts
      • 115. Display obvious, robust contact information
      • 116. Register with government/civic organizations
      • 117. Receive traffic from diverse sources
      • 118. Generate branded search query volume
      • 119. Run offline marketing/advertising campaigns
      • 120. Often exist only online
      • 121. Rarely have significant social accounts
      • 122. Frequently use email forms only
      • 123. Stay “under the radar”
      • 124. Search is often 90%+ of traffic
      • 125. Have little-no branded search demand
      • 126. Ignore the offline world
    • 127. Do Competitive Research
      Where do these brands earn their links?
    • 128. Research Brand “Mention” Sources
      Facebook page
      Crunchbase profile
      App profile on Blackberry
      Twitter account
      BusinessWeek Profile
      Chrome Extension
      Mashable Article
    • 129. Focus on the User | Don’t Forget Engines
    • 130. Rich Snippets
    • 131. Re-blog, tweet, target
    • 132. Be a Personal Brand: Q&A Sites
    • 133. Infographics: Hire Guys Like This!
    • 134. Get Your Social On
    • 141. When in Rome…
      Find Your Corporate Voice
      Phenomenal analysis of statements by Googlers + how they translate to content/marketing actions:
      I’m excited to be able to share my life’s passion with you.
    • 142. Embrace All of Inbound Marketing
      Research/White Papers
      Blogs + Blogging
      Comment Marketing
      Social Networks
      Online Video
      INBOUND MARKETING! (AKA all the “free” traffic sources)
      Document Sharing
      Social Bookmarking
      Word of Mouth
      Direct/Referring Links
      Type-In Traffic
      Q+A Sites
    • 143. Part II: Stop Competing. Start Owning.
    • 144. Why are there Billions of Searches?
      Billions of Queries
    • 145. via ~diP on Flickr
      How is Demand Created?
    • 146. Inherent Need
      via dimnikolov on Flickr
      Advertising $300 Billion
      Online Advertising $24 Billion
      Search Ads $10.7 Billion
      Via Interactive Advertising Bureau 2010
    • 148. You Can Cater to Demand
      via lumierefl on Flickr
    • 149. Or…
      You Can Create Demand
    • 150.
    • 151.
    • 152.
    • 153. A Case Study in “Making the Market”
    • 154.
    • 155.
    • 156. DON’T CALL IT “SEO.”
    • 157.
    • 158.
    • 159.
    • 160.
    • 161.
    • 162.
    • 163.
    • 164.
    • 165. 3250
    • 166. That’s how you beat Wikipedia & Amazon!
    • 167. Gillian Muessig
      President & Co-Founder, SEOmoz
      • Twitter: @SEOmom
      • 168. Blog:
      • 169. Email:
      Try SEOmoz Free for 60 Days