How to deal with Rankbrain and AI in SEO?
Talk given by Lionel Kappelhoff, VP Customer Success at OnCrawl, at BrightonSEO (April 2018).

  1. BEYOND TECHNICAL SEO, HOW TO DEAL WITH RANKBRAIN AND AI IN SEO @OnCrawl BrightonSEO 2018
  2. #seocamp Speaker today: François Goube, OnCrawl CEO & founder, SEO expert, advisor for French VC firms, serial entrepreneur & good ideas
  3. Menu: • Introduction • Crawl, indexation, ranking and AI principles • The tables of the law of Data Exploits • Technical SEO methodology to make you a winner
  4. How a search engine works: 1. Crawl (discover), 2. Index (organize), 3. Rank (respond), then AI re-ranking
  5. How does Google work? "Google's algorithms are computer programs designed to navigate through billions of pages, find the right clues and send you exactly the answer to your question." https://www.google.com/search/howsearchworks/ "The life of a query begins long before you type it, with the exploring and indexing of the billions of documents that make up the Web."
  6. Google consumes annually as much energy as the city of San Francisco (journaldugeek, 12/12/2016)
  7. Google crawl budget: are the resources ($) that Google sets up to crawl your website optimized?
  8. What Google says about crawl budget: "If you observe that new pages are usually explored the same day they are published, then you don't really have to worry about the exploration budget […] if a site has less than a few thousand URLs, it will be browsed correctly most of the time […] we do not have a single term to describe everything this term seems to mean on the outside"
  9. 100% of the sites in GSC have exploration data! Tracking a site's "crawl behavior" through the analysis of its logs can quickly detect an anomaly in the bot's behavior. Is crawl budget related to ranking and visits? The more often the index is updated, the better Google knows whether the page fits "the best response to a query"
  10. How to organize a crawl effectively: define a crawl budget to spend; prioritize the URLs to explore according to important (and changing) factors; schedule: important pages first; adapt the budget according to needs to reduce costs; optimize. Behind this method there are only algorithms: they use the quality data of your site to make choices. Mobile first is coming!
  11. Monitor your crawl frequency: analyze your log files. It's a health indicator for your SEO. Create alerts when crawl frequencies drop or increase
  12. "Your log files contain the only data that accurately reflects how search engines browse your website" - Moz Blog
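The log monitoring described in slides 11 and 12 can be sketched roughly as follows. This is a minimal illustration, not OnCrawl's method: it assumes access logs in the common "combined" format, identifies the bot by user-agent string only (which can be spoofed), and uses hypothetical helper names (`googlebot_hits_per_day`, `crawl_alerts`) and a made-up 50% alert threshold.

```python
import re
from collections import Counter

# Assumption: logs use the "combined" format, where the timestamp
# looks like [12/Apr/2018:10:15:32 +0000].
DATE_RE = re.compile(r"\[(\d{2}/\w{3}/\d{4}):")

def googlebot_hits_per_day(log_lines):
    """Count, per day, the log lines whose user agent mentions Googlebot."""
    hits = Counter()
    for line in log_lines:
        if "Googlebot" not in line:
            continue
        match = DATE_RE.search(line)
        if match:
            hits[match.group(1)] += 1
    return hits

def crawl_alerts(daily_hits, threshold=0.5):
    """Flag days whose hit count moved by more than `threshold` (relative)
    compared with the previous day.

    `daily_hits` is a list of (day, count) pairs in chronological order.
    """
    alerts = []
    for (_, prev), (day, count) in zip(daily_hits, daily_hits[1:]):
        if prev and abs(count - prev) / prev > threshold:
            alerts.append((day, prev, count))
    return alerts

# Made-up sample data for illustration only.
BOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

def make_line(day, path, user_agent):
    return (f'66.249.66.1 - - [{day}:10:15:32 +0000] '
            f'"GET {path} HTTP/1.1" 200 512 "-" "{user_agent}"')

sample_log = (
    [make_line("12/Apr/2018", f"/page-{i}", BOT_UA) for i in range(4)]
    + [make_line("12/Apr/2018", "/page-0", "Mozilla/5.0 (Windows NT 10.0)")]
    + [make_line("13/Apr/2018", "/page-0", BOT_UA)]
)

hits = googlebot_hits_per_day(sample_log)
# Counter keeps insertion order, so items() is chronological
# as long as the log itself is.
print(crawl_alerts(list(hits.items())))
```

In practice the threshold and the comparison window (day over day, week over week) would need tuning per site; the point is only that crawl frequency is easy to track once bot hits are counted per day.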
  13. To optimize crawl budget spending it is necessary to return the most beautiful data. The algorithms will "decide" according to your technical quality, popularity and semantic scores, and user behaviors. We know that some metrics are more important than others in triggering an increase in crawl frequency. Understand that algorithms use variables
  14. Indexation: how does Google choose whether or not to index a page? Internal metrics • Page quality: volume of content / keywords; topics and entity detection; Title, Hn, Schema.org, …; payload; OnCrawl InRank®. External metrics • Page authority: PageRank; Majestic Trust Flow & Citation Flow
  15. Return the most beautiful data! The algorithms will decide according to your semantic alignment, semantic quality, content quality. Web scaling principles: • use massive data interpretation: natural language analysis, word embeddings • use massive data correlation: the web is full of entities • the ontology also includes all the semantics that describe the relationships between terms or between named entities. Optimize indexation
  16. Rankings: once your pages are indexed, Google scores each page of your website. Many different metrics are computed: PageRank, quality scores, website trust… Google attaches many attributes to each page: metadata, title, Schema.org, n-grams, payload, content interpretation
  17. Return the most beautiful data! The algorithms will decide according to your semantic alignment, semantic quality, content quality. Web scaling principles: • use massive data aggregation: human / computed • use massive data interrelationship / qualification: human / computed. Optimize ranking
  18. Algorithms, data aggregation, human/supervised validation, massive usage of data models, knowledge and interrelationship analysis. Machine learning can be a part of the process by adding multiple iteration cycles. Test and learn! This is artificial intelligence
  19. #seocamp
  20. Everybody works for Google
  21. This is how machine learning works
  22. What we know about re-ranking: interpretation of the request → matching with the knowledge base → assumption of what the Internet user is looking for (context) → does the user like the result? Yes: perfect, I keep my ranking. No: I'll try a new ranking next time
  23. And so what?
  24. You need to help Google!
  25. You need to help Google. Why? How? Maximize exploration sessions and point the bot (and users!) in the right direction: • Reduce errors • Rectify technical issues • Reinforce your content • Create depth shortcuts • Organize linking by objectives • Speed up the website. More pages/freq. in less time 🧐
  26. Page importance, Patent US20110179178: "A query-independent score (also called a document score) is computed for each URL by URL page rankers. The page rankers compute a page importance score for a given URL […] the page importance score is computed by considering not only the number of URLs that reference a given URL but also the page importance score of such referencing URLs"
  27. Page importance, Patent US20110179178: "Page importance score data is provided to URL managers, which pass a page importance score for each URL to robots and content processing servers. One example of a page importance score is PageRank, which is the page importance metric used in the Google search engine." The crawl rate is SEO data driven
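The recursive definition quoted in slide 26 (a page's score depends on how many URLs reference it and on the scores of those referencing URLs) is the idea behind PageRank, which slide 27 names as one example. Below is a minimal sketch of the textbook power-iteration form. It is not Google's implementation and not OnCrawl's InRank; the function name `page_importance`, the damping factor of 0.85 and the toy link graph are all assumptions for illustration.

```python
def page_importance(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    `links` maps each URL to the list of URLs it links to.
    Returns a dict of URL -> score; scores sum to roughly 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    score = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline, independent of links.
        new_score = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its own score, split across its outlinks,
                # so a link from an important page is worth more.
                share = damping * score[page] / len(targets)
                for target in targets:
                    new_score[target] += share
            else:
                # Page without outlinks: spread its score evenly.
                for other in pages:
                    new_score[other] += damping * score[page] / n
        score = new_score
    return score

# Hypothetical four-page site, for illustration only.
toy_site = {
    "home": ["a", "b"],
    "a": ["home"],
    "b": ["home", "a"],
    "c": ["a"],  # nothing links to "c"
}

scores = page_importance(toy_site)
for url, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{url}: {value:.3f}")
```

In this toy graph "c" links out but receives no links, so it ends up with only the baseline score, while "home" collects score from both "a" and "b".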
  28. Page importance can be optimized by playing on the right metrics: • depth and page location in the site • PageRank: Majestic Trust Flow, Citation Flow • internal PageRank: OnCrawl InRank • type of document: PDF, HTML, TXT • sitemap.xml inclusion • quality/spread of anchors • number of words, few near duplicates • parent pages' importance. From the excellent Dawn Anderson article on Search Engine Land https://patents.google.com/patent/US8042112B1/en?q=(page)&q=(importance)&q=url&q=scheduling&assignee=Google+Inc.&oq=(page)+(importance)+assignee:(Google+Inc.)+url+scheduling
  29. And so what?
  30. What we know: Google does not like digging too deep into a site
  31. What we know: Google is sensitive to the volume of content
  32. What we know: Google is sensitive to internal popularity (OnCrawl InRank®). More links = better positions. More links = better crawl frequency
  33. What we know: Google is sensitive to CTR and bounce rate (BR). Lower BR = more bot hits. Better CTR = better positions
  34. To remember: your priority pages should be linked from the home page, or ideally sit 1-2 levels from the home page. To be crawled frequently, a page must be fast and have wonderful content… Return the most beautiful data!
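The depth rule in slide 34 can be checked with a breadth-first search over the internal link graph. The sketch below assumes you already have a crawl export as a mapping of URL to outgoing internal links; the function names (`click_depths`, `too_deep`), the `max_depth=2` default and the sample site are hypothetical.

```python
from collections import deque

def click_depths(links, home):
    """Return the minimum number of clicks from `home` to each reachable URL.

    `links` maps each URL to the list of internal URLs it links to.
    """
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

def too_deep(links, home, priority_pages, max_depth=2):
    """List priority pages that are deeper than `max_depth` or unreachable."""
    depths = click_depths(links, home)
    return sorted(
        page for page in priority_pages
        if depths.get(page, float("inf")) > max_depth
    )

# Hypothetical site structure, for illustration only.
site = {
    "/": ["/cat", "/about"],
    "/cat": ["/cat/p1"],
    "/cat/p1": ["/cat/p1/reviews"],
}

print(click_depths(site, "/"))
print(too_deep(site, "/", ["/cat/p1", "/cat/p1/reviews", "/orphan"]))
```

Pages reported by `too_deep` are candidates for the "depth shortcuts" mentioned earlier in the deck, for example a direct link from the home page or a category hub.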
  35. Indexation and content interpretation
  36. What we know: Google classifies types of request: • Transactional • Informational • Navigational. Depending on the type of request you want to rank for, Google will crawl you more or less. Google does not understand the content, but it seeks to understand the concepts by detecting named entities: • it creates term and word weight matrices • it deduces page themes for the ranking
  37. Transactional: here the user wants to get to a website where there will be more interaction. Pages on “converse men's chuck taylor”: cold content, transactional request → average crawl frequency. Needs well-thought-out linking based on brand/product entity relationships
  38. Informational: this is when the user is looking for a specific bit of information. Pages on “Clinton” or “Trump” relate to a hot “informational” subject → high crawl frequency. Need to use powerful linking to pages, use of named entities
  39. Navigational: the user is looking to reach a particular website. Pages on “BrightonSEO speakers” are known as part of the BrightonSEO website → low crawl frequency. Need to use good semantic SEO optimization, Trust and Citation
  40. Word embeddings is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers
  41. NLP as Rankbrain's foundation: "Our (Rankbrain) algorithm is able to represent strings of text in very high-dimensional space and “see” how they relate to one another"
  42. Automated Language & Rankbrain Processing: Google maintains a knowledge base on named entities and understands the relationships between entities:
  43. Word embeddings: in a search engine algorithm, a tool is needed to calculate a "similarity" score between two documents. This score is strategic for creating a relevant ranking, but it is used in combination with a very large number of other signals, some major (like the popularity of the page), others minor (like the presence of a keyword in the page URL)
  44. Each entity or concept is vectorized for the machines to understand
  45. Google can then evaluate the distance between two concepts (diagram: each entity is connected to its related entities by vectorized distances)
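One common way to measure how close two vectorized concepts are is cosine similarity. The deck does not say which measure Google uses, so the sketch below is only an illustration of the general idea; the three-dimensional vectors are invented toy values, and real embeddings have hundreds of dimensions learned from text.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def nearest(entity, vectors):
    """Return the other entity whose vector is most similar to `entity`."""
    return max(
        (name for name in vectors if name != entity),
        key=lambda name: cosine_similarity(vectors[entity], vectors[name]),
    )

# Made-up toy vectors, for illustration only.
toy_vectors = {
    "iphone": [0.9, 0.8, 0.1],
    "macbook": [0.8, 0.9, 0.2],
    "orchard": [0.1, 0.2, 0.9],
}

print(nearest("iphone", toy_vectors))
print(round(cosine_similarity(toy_vectors["iphone"], toy_vectors["orchard"]), 2))
```

With these toy values "iphone" sits much closer to "macbook" than to "orchard", which is the kind of signal that lets an engine separate the brand sense of a word from its everyday sense.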
  46. For each entity (Entity #1), Google knows: • sentences that contain the entity • in which context / topic the entity is used • whether it is often used with entity #2 in a paragraph • whether it is often used with entity #2 in a site on the subject • whether it is often used with entity #2 on the same page
  47. For my SEO? WTF?
  48. Concretely? "how old is the wife of bill gates?": assumption of a request on age; type of relation = wife; individual / personality
  49. An example: “Apple”: Phone, Computer, Brand /Apple; Local store /”apple”
  50. An example: “Apple”: Phone, Computer, Brand /Apple; Local store /”apple”. It is entity detection that infers the context of the search and refines the results
  51. To remember: Google will tend to crawl pages by “package”: • on the same path (discover) • on the same topic (recrawl). The more content it meets with the expected entities (related to the theme), the deeper it will crawl. The type of entity, its rarity or popularity will directly impact the crawl frequency. Internal linking must be designed around the relationships between the entities present in your content
  52. How to check my named entities? You can do it for each URL in the OnCrawl Toolbox. Now integrated in the OnCrawl crawl reports!
  53. How to maximize your efforts
  54. Understand your website. Map your website by: • type of content • page categories. Understand which entities are present in your content • anchor texts. Analyze how pages with entities are linked in your site
  55. Which steps? Crawl your site. Categorize your pages. Extract named entities by page group. Identify pages with/without entities → adjust your content. Monitor the number of words per group of pages / per packet. The goal is to define the “ideal content metrics” to maximize your crawlability. Data Explorer export, comparison with log data (filter by group, average; crawled pages by Google / ignored pages): number of pages concerned 875 256 / 1 340 872; number of words 897 / 457; number of entities in the content 18 / 3; number of entities in anchors 6 / 0
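The comparison in slide 55 (average content metrics for pages Google crawls versus pages it ignores) can be reproduced on any crawl export joined with log data. The sketch below is a hypothetical stand-in for that report, not the OnCrawl Data Explorer: the field names (`words`, `entities`, `anchor_entities`), the function name `group_metrics` and the sample pages are all assumptions.

```python
from statistics import mean

METRICS = ("words", "entities", "anchor_entities")

def group_metrics(pages, crawled_urls):
    """Average content metrics for crawled vs. ignored pages.

    `pages` is a list of dicts with a "url" key plus the keys in METRICS.
    `crawled_urls` is the set of URLs seen in bot hits in the log files.
    """
    groups = {"crawled": [], "ignored": []}
    for page in pages:
        key = "crawled" if page["url"] in crawled_urls else "ignored"
        groups[key].append(page)

    report = {}
    for name, members in groups.items():
        row = {"pages": len(members)}
        for metric in METRICS:
            # An empty group gets 0 rather than raising on mean([]).
            row[f"avg_{metric}"] = mean(p[metric] for p in members) if members else 0
        report[name] = row
    return report

# Made-up sample data, for illustration only.
sample_pages = [
    {"url": "/a", "words": 900, "entities": 20, "anchor_entities": 6},
    {"url": "/b", "words": 700, "entities": 16, "anchor_entities": 4},
    {"url": "/c", "words": 400, "entities": 2, "anchor_entities": 0},
    {"url": "/d", "words": 500, "entities": 4, "anchor_entities": 0},
]
seen_in_logs = {"/a", "/b"}

for group, row in group_metrics(sample_pages, seen_in_logs).items():
    print(group, row)
```

The gap between the two rows gives a rough target for the "ideal content metrics" per page group; it shows correlation only, so any content change should be validated against later log data.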
  56. Quick wins: ✓ Use named entities in your link anchors ✓ Create packages of linked pages according to entity typology ✓ Example of a media site:
  57. OnCrawl detects entities
  58. Follow rankings and cross data with OnCrawl Rankings
  59. Conclusions: crawling, indexing, ranking and re-ranking are all based on artificial intelligence and machine learning principles. It is not so intelligent, because it needs us to validate models. Never forget that they are only algorithms: you have to know and manipulate the metrics they take into account to manipulate them!
  60. Conclusions: crawling consumes energy, so simplify the bots' life: pay attention to depth, navigation shortcuts, no duplicates and especially load time and page weight in the current mobile-first index context. Follow crawl budget with your logs!
  61. Conclusions: indexing is based on internal/external metrics and content. Knowledge Graph is Google's learning base on named entities. Understand: NLP, word embeddings
  62. Conclusions: ranking is the consistency of all these data with user intentions. It depends on quality scores (technical), relevance scores (semantic) + knowledge of the user's behaviors and intentions, and their personal search and visit history
  63. Conclusions: make users want to come back to manipulate the CTR and BR. Prioritize titles, meta descriptions, content, speed, UX/UI
  64. www.oncrawl.com OnCrawl helps e-commerce & online media make better SEO decisions and grow their revenues by providing access to the most advanced SEO software: semantic SEO crawler, comprehensive log analyser, API & platform to combine all of a website's data
  65. Try OnCrawl for free. Start your free trial