Why Googlebot & The URL Scheduler Should Be Amongst Your Key Personas And How To Train Them
TALK TO THE SPIDER
Dawn Anderson @dawnieando
THE KEY PERSONAS
9 types of Googlebot
SUPPORTING ROLES
Indexer / Ranking Engine
The URL Scheduler
History Logs
Link Logs
Anchor Logs
GOOGLEBOT’S JOBS
‘Ranks nothing at all’
Takes a list of URLs to crawl from URL Scheduler
Job varies based on ‘bot’ type
Runs errands & makes deliveries for the URL server, indexer / ranking engine and logs
Makes notes of outbound linked pages and additional links for future crawling
Takes notes of ‘hints’ from URL scheduler when crawling
Tells tales of URL accessibility status, server response codes, notes relationships between links and collects content checksums (binary data equivalent of web content) for comparison with past visits by history and link logs
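The checksum comparison mentioned above can be sketched roughly as follows. This is only an illustration: the choice of SHA-256, the in-memory `history` dictionary and the function names are assumptions made for the example, not a description of how Google's history logs actually work.

```python
import hashlib


def content_checksum(body: bytes) -> str:
    """Return a hex digest standing in for the 'content checksum' of a page."""
    # SHA-256 is an assumption for this sketch; any stable hash would do.
    return hashlib.sha256(body).hexdigest()


def has_changed(url: str, body: bytes, history: dict) -> bool:
    """Compare this visit's checksum with the one stored from the last visit.

    `history` is a hypothetical stand-in for the history logs: it maps a URL
    to the checksum recorded on the previous visit. The new checksum is stored
    so that the next visit compares against this one.
    """
    new_checksum = content_checksum(body)
    old_checksum = history.get(url)
    history[url] = new_checksum
    # A first visit has nothing to compare against, so it is not a 'change'.
    return old_checksum is not None and old_checksum != new_checksum
```

A URL whose checksum rarely differs between visits gives the scheduler little reason to send Googlebot back often.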
ROLES – MAJOR PLAYERS – A ‘BOSS’ - URL SCHEDULER
Think of it as Google’s line manager or ‘air traffic controller’ for Googlebots in the web crawling system
Schedules Googlebot visits to URLs
Decides which URLs to ‘feed’ to Googlebot
Uses data from the history logs about past visits
Assigns visit regularity of Googlebot to URLs
Drops ‘hints’ to Googlebot to guide on types of content NOT to crawl and excludes some URLs from schedules
Analyses past ‘change’ periods and predicts future ‘change’ periods for URLs for the purposes of scheduling Googlebot visits
Checks ‘page importance’ in scheduling visits
Assigns URLs to ‘layers / tiers’ for crawling schedules
TOO MUCH CONTENT
Indexed Web contains at least 4.73 billion pages (13/11/2015)
[Chart: Total number of websites, 2000 to 2014, y-axis from 250,000,000 to 1,000,000,000]
SINCE 2013 THE WEB IS THOUGHT TO HAVE INCREASED IN SIZE BY 1/3
TOO MUCH CONTENT
Capacity limits on Google’s crawling system
How have search engines responded?
By creating work ‘schedules’ for Googlebots
By prioritising URLs for crawling
By assigning crawl period intervals to URLs
GOOGLE CRAWL SCHEDULER PATENTS
Include:
‘Managing items in a crawl schedule’
‘Scheduling a recrawl’
‘Web crawler scheduler that utilizes sitemaps from websites’
‘Document reuse in a search engine crawler’
‘Minimizing visibility of stale content in web searching including revising web crawl intervals of documents’
‘Scheduler for search engine’
MANAGING ITEMS IN A CRAWL SCHEDULE (GOOGLE PATENT)
3 layers / tiers
Real Time Crawl: crawled multiple times daily
Daily Crawl: crawled daily or bi-daily
Base Layer Crawl: crawled least, on a ‘round robin’ basis – only the ‘active’ segment is crawled. Split into segments on random rotation
URLs are moved in and out of layers based on past visits data
Scheduler checks URLs for ‘importance’, ‘boost factor’ candidacy, ‘probability of modification’
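As a rough mental model of the three layers, the sketch below assigns a URL to a layer from two scores and rotates through base layer segments. Everything here is hypothetical: the 0 to 1 scores, multiplying them together, the cutoff values and the CRC-based segmenting are assumptions chosen for illustration and are not taken from the patent.

```python
import zlib
from dataclasses import dataclass


@dataclass
class UrlRecord:
    url: str
    importance: float          # hypothetical score between 0 and 1
    change_probability: float  # hypothetical estimate drawn from past visits


def assign_layer(record: UrlRecord,
                 realtime_cutoff: float = 0.8,
                 daily_cutoff: float = 0.4) -> str:
    """Pick a crawl layer; the scoring rule and cutoffs are illustrative only."""
    score = record.importance * record.change_probability
    if score >= realtime_cutoff:
        return "real time"
    if score >= daily_cutoff:
        return "daily"
    return "base"


def base_segment(url: str, n_segments: int) -> int:
    """Place a base layer URL in a segment using a stable hash of the URL."""
    return zlib.crc32(url.encode("utf-8")) % n_segments


def is_active_segment(url: str, crawl_cycle: int, n_segments: int) -> bool:
    """Only one segment is 'active' per cycle, rotating round robin."""
    return base_segment(url, n_segments) == crawl_cycle % n_segments
```

With four segments, a base layer URL in this model is only eligible for a visit in one cycle out of every four, which is why escaping the base layer matters.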
GOOGLEBOT’S BEEN PUT ON A URL CONTROLLED DIET
The URL Scheduler controls the meal planner
Carefully controls the list of URLs Googlebot visits
‘Budgets’ are allocated
CRAWL BUDGET
WHAT IS A CRAWL BUDGET? - An allocation of ‘crawl visit frequency’ apportioned to URLs on a site
Roughly proportionate to Page Importance (Link Equity) & speed
Pages with a lot of healthy links get crawled more (Can include internal links??)
Apportioned by the URL scheduler to Googlebots
But there are other factors affecting frequency of Googlebot visits aside from importance / speed
The vast majority of URLs on the web don’t get a lot of budget allocated to them
HINTS & CRITICAL MATERIAL CONTENT CHANGE
$C = \sum_{i=0}^{n-1} \mathrm{weight}_i \cdot \mathrm{feature}_i$
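Read as code, the formula is a plain weighted sum over n features of a page. The sketch below shows the arithmetic only; which features are measured and what weights they carry are not given in the slides, so the example values are invented.

```python
def change_score(weights, features):
    """Compute C = sum of weight_i * feature_i for i = 0 .. n-1."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * f for w, f in zip(weights, features))


# Hypothetical example: two features, the second weighted four times
# as heavily as the first.
example_score = change_score([0.5, 2.0], [2.0, 1.0])
```

A change to a heavily weighted feature moves C far more than the same change to a lightly weighted one, which fits the idea that not every change on a page counts as critical material change.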
POSITIVE FACTORS AFFECTING GOOGLEBOT VISIT FREQUENCY
Current capacity of the web crawling system is high
Your URL is ‘important’
Your URL is in the real time, daily crawl or ‘active’ base layer segment
Your URL changes a lot with critical material content change
Probability and predictability of critical material content change is high for your URL
Your website speed is fast and Googlebot gets the time to visit your URL
Your URL has been ‘upgraded’ to a daily or real time crawl layer
NEGATIVE FACTORS AFFECTING GOOGLEBOT VISIT FREQUENCY
Current capacity of web crawling system is low
Your URL has been detected as a ‘spam’ URL
Your URL is in an ‘inactive’ base layer segment
Your URLs are ‘tripping hints’ built into the system to detect non-critical change dynamic content
Probability and predictability of critical material content change is low for your URL
Your website speed is slow and Googlebot doesn’t get the time to visit your URL
Your URL has been ‘downgraded’ to an ‘inactive’ base layer segment
Your URL has returned an ‘unreachable’ server response code recently
IT’S NOT JUST ABOUT ‘FRESHNESS’
It’s about the probability & predictability of future ‘freshness’
BASED ON DATA FROM THE HISTORY LOGS - HOW CAN WE INFLUENCE THEM TO ESCAPE THE BASE LAYER?
CRAWL OPTIMISATION – STAGE 1 - UNDERSTAND GOOGLEBOT & URL SCHEDULER - LIKES & DISLIKES
LIKES
Going ‘where the action is’ in sites
The ‘need for speed’
Logical structure
Correct ‘response’ codes
XML sitemaps
‘Successful’ crawl visits
‘Seeing everything’ on a page
Taking ‘hints’
Clear unique single ‘URL fingerprints’ (no duplicates)
Predicting likelihood of ‘future change’
DISLIKES
Slow sites
Too many redirects
Being bored (Meh) (‘Hints’ are built in by the search engine systems – Takes ‘hints’)
Being lied to (e.g. on XML sitemap priorities)
Crawl traps and dead ends
Going round in circles (infinite loops)
Spam URLs
Crawl wasting minor content change URLs
‘Hidden’ and blocked content
Uncrawlable URLs
CHANGE IS KEY
Not just any change
Critical material change
Predicting future change
Dropping ‘hints’ to Googlebot
Sending Googlebot where ‘the action is’
FIND GOOGLEBOT
AUTOMATE SERVER LOG RETRIEVAL VIA CRON JOB
grep Googlebot access_log > googlebot_access.txt
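The same filtering can be done in Python when you also want a tally of the response codes Googlebot received. This is a minimal sketch that assumes the server writes the common Apache/Nginx "combined" log format; the regular expression and function names are illustrative and will need adjusting for other formats. Matching on the user agent string alone also lets in anything that merely claims to be Googlebot.

```python
import re
from collections import Counter

# Assumption: "combined" log format, e.g.
# IP - - [time] "METHOD /path PROTO" status bytes "referer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Yield one dict per parsable log line whose user agent mentions Googlebot."""
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "Googlebot" in match.group("agent"):
            yield match.groupdict()


def status_counts(lines):
    """Count how often each response code was served to Googlebot."""
    return Counter(hit["status"] for hit in googlebot_hits(lines))
```

Passing an open file works because a file object iterates line by line, for example `status_counts(open("access_log"))`.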
LOOK THROUGH ‘SPIDER EYES’ VIA LOG ANALYSIS – ANALYSE GOOGLEBOT
PREPARE TO BE HORRIFIED
Incorrect URL header response codes (e.g. 302s)
301 redirect chains
Old files or XML sitemaps left on server from years ago
Infinite / endless loops (circular dependency)
On parameter-driven sites, URLs crawled which produce the same output
URLs generated by spammers
Dead image files being visited
Old CSS files still being crawled
Identify your ‘real time’, ‘daily’ and ‘base layer’ URLs
ARE THEY THE ONES YOU WANT THERE?
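One rough way to identify which layer a URL appears to sit in is to count Googlebot requests per URL over a known period. The sketch below assumes you already have the list of requested paths taken from the filtered log; the thresholds (more than one visit a day, at least one visit every two days) are invented for illustration, since the real layer boundaries are not published.

```python
from collections import Counter


def observed_layers(paths, days_observed):
    """Guess a crawl layer per URL from Googlebot hits over `days_observed` days.

    `paths` holds one entry per Googlebot request. The cutoffs are
    hypothetical and should be tuned against your own logs.
    """
    if days_observed <= 0:
        raise ValueError("days_observed must be positive")
    layers = {}
    for path, count in Counter(paths).items():
        visits_per_day = count / days_observed
        if visits_per_day > 1:
            layers[path] = "real time"
        elif visits_per_day >= 0.5:
            layers[path] = "daily"
        else:
            layers[path] = "base layer"
    return layers
```

URLs that never appear in the log at all do not show up in the result, so compare the output against a full list of the URLs you care about.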
FIX GOOGLEBOT’S JOURNEY
SPEED UP YOUR SITE TO ‘FEED’ GOOGLEBOT MORE
SAVE BUDGET
TECHNICAL ‘FIXES’
Speed up your site
Implement compression, minification, caching
Fix incorrect header response codes
Fix nonsensical ‘infinite loops’ generated by database driven parameters or ‘looping’ relative URLs
Use absolute rather than relative internal links
Ensure no parts of content are blocked from crawlers (e.g. in carousels, concertinas and tabbed content)
Ensure no CSS or JavaScript files are blocked from crawlers
Unpick 301 redirect chains
Minimise 301 redirects
Minimise canonicalisation
Use ‘If-Modified-Since’ headers on low importance ‘hygiene’ pages
Use ‘expires after’ headers on content with short shelf life (e.g. auctions, job sites, event sites)
Noindex low search volume or near duplicate URLs (use a robots meta tag or X-Robots-Tag header; Google no longer honours a noindex directive in robots.txt)
Use 410 ‘gone’ headers on dead URLs liberally
Revisit .htaccess file and review legacy pattern matched 301 redirects
Combine CSS and JavaScript files
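Unpicking redirect chains is easier once the redirects are held as a simple source to target map. The sketch below follows each chain to its final destination and flags loops; the dictionary of redirects is a hypothetical stand-in for rules exported from .htaccess or a crawl, and the hop limit is an arbitrary choice.

```python
def resolve_redirect(url, redirects, max_hops=10):
    """Follow `redirects` from `url`; return (final_url, number_of_hops).

    Raises ValueError when the chain loops back on itself or is longer
    than `max_hops`.
    """
    seen = [url]
    while url in redirects:
        url = redirects[url]
        if url in seen:
            raise ValueError("redirect loop: " + " -> ".join(seen + [url]))
        seen.append(url)
        if len(seen) - 1 > max_hops:
            raise ValueError("redirect chain too long from " + seen[0])
    return url, len(seen) - 1


def flatten_redirects(redirects):
    """Point every source straight at its final destination (one hop each)."""
    return {src: resolve_redirect(src, redirects)[0] for src in redirects}
```

For a map such as `{"/a": "/b", "/b": "/c"}`, flattening points both `/a` and `/b` directly at `/c`, so Googlebot spends one request instead of two on the old URL.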
TRAIN GOOGLEBOT – ‘TALK TO THE SPIDER’ (PROMOTE URLS TO HIGHER CRAWL LAYERS)
EMPHASISE PAGE IMPORTANCE
BE CLEAR TO GOOGLEBOT WHICH ARE YOUR MOST IMPORTANT PAGES
Revisit ‘Votes for self’ via internal links in GSC
Clear ‘unique’ URL fingerprints
Use XML sitemaps for your important URLs (don’t put everything on it)
Use ‘mega menus’ (very selectively) to key pages
Use ‘breadcrumbs’ (for hierarchical structure)
Build ‘bridges’ and ‘shortcuts’ via HTML sitemaps and supplementary content for ‘cross modular’ ‘related’ internal linking to key pages
Consolidate (merge) important but similar content (e.g. merge FAQs)
Consider flattening your site structure so ‘importance’ flows further
Reduce internal linking to low priority URLs
TRAIN ON CHANGE
GOOGLEBOT GOES WHERE THE ACTION IS AND IS LIKELY TO BE IN THE FUTURE
Not just any change – Critical material change
Keep the ‘action’ in the key areas - NOT JUST THE BLOG
Use relevant ‘supplementary content’ to keep key pages ‘fresh’
Remember the negative impact of ‘crawl hints’
Regularly update key content
Consider ‘updating’ rather than replacing seasonal content URLs
Build ‘dynamism’ into your web development (sites that ‘move’ win)
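A sitemap limited to important URLs can be generated from a short hand-picked list rather than a full site export. The sketch below builds one with Python's standard library; the function name and example URL are placeholders, and only the `loc` and `lastmod` elements of the sitemap protocol are used.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML for an iterable of (loc, lastmod) pairs.

    `lastmod` is expected as a W3C date string such as "2015-11-13".
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Placeholder example: list only the pages you most want crawled.
sitemap_xml = build_sitemap([("https://www.example.com/", "2015-11-13")])
```

Keeping `lastmod` honest matters here, given that being lied to in sitemaps is on the list of dislikes.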
TOOLS YOU CAN USE
SPEED
YSlow
Pingdom
Google Page Speed Tests
Minification – JS Compress and CSS Minifier
Image Compression – Compressjpeg.com, tinypng.com
SPIDER EYES
GSC Crawl Stats
Deepcrawl
Screaming Frog
Server Logs
SEMRush (auditing tools)
Webconfs (header responses / similarity checker)
Powermapper (bird’s eye view of site)
URL IMPORTANCE
GSC Internal links Report (URL importance)
Link Research Tools (Strongest sub pages reports)
GSC Internal links (add site categories and sections as additional profiles)
Powermapper
SAVINGS & CHANGE
GSC Index levels (over indexation checks)
GSC Crawl stats
Last Accessed Tools (versus competitors)
Server logs
Webmaster Hangout Office Hours
WARNING SIGNS – TOO MANY VOTES BY SELF FOR WRONG PAGES
IS THIS YOUR BLOG?? HOPE NOT
Most Important Page 1
Most Important Page 2
Most Important Page 3
WARNING SIGNS – OVER INDEXATION
FIX IT FOR A BETTER CRAWL
WARNING SIGNS – TAG MAN
Tags: I, must, tag, this, blog, post, with, every, possible, word, that, pops, into, my, head, when, I, look, at, it, and, dilute, all, relevance, from, it, to, a, pile, of, mush, cow, shoes, sheep, the, and, me, of, it
Image Credit: Buzzfeed
Creating ‘thin’ content and even more URLs to crawl
GOOGLE THINKS SO
REMEMBER
“Googlebot’s On A Strict Diet”
“Make sure the right URLs get on the menu”
Dawn Anderson @dawnieando

Sasconbeta 2015 Dawn Anderson - Talk To The Spider

  • 1. TALK TO THE SPIDER: Why Googlebot & The URL Scheduler Should Be Amongst Your Key Personas And How To Train Them. Dawn Anderson @dawnieando
  • 2. THE KEY PERSONAS: 9 types of Googlebot. SUPPORTING ROLES: Indexer / Ranking Engine; The URL Scheduler; History Logs; Link Logs; Anchor Logs.
  • 3. GOOGLEBOT’S JOBS: ‘Ranks nothing at all’. Takes a list of URLs to crawl from the URL Scheduler. Job varies based on ‘bot’ type. Runs errands and makes deliveries for the URL server, indexer / ranking engine and logs. Makes notes of outbound linked pages and additional links for future crawling. Takes notes of ‘hints’ from the URL Scheduler when crawling. Tells tales of URL accessibility status and server response codes, notes relationships between links, and collects content checksums (binary data equivalent of web content) for comparison with past visits by history and link logs.
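The checksum comparison mentioned on this slide can be sketched as follows. This is only an illustration: the slides do not say which hash function is used, so SHA-256 and the function names `content_checksum` and `has_changed` are assumptions.

```python
import hashlib
from typing import Optional


def content_checksum(body: bytes) -> str:
    """Return a fixed-length fingerprint of the fetched page content.

    SHA-256 is an assumption; the slides only say 'content checksums'.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_checksum: Optional[str], body: bytes) -> bool:
    """Compare the current fetch with the checksum stored from a past visit.

    A URL with no stored checksum (first visit) is treated as changed.
    """
    if previous_checksum is None:
        return True
    return content_checksum(body) != previous_checksum


# Usage: store the checksum from one visit, compare on the next.
stored = content_checksum(b"<html>version 1</html>")
unchanged = has_changed(stored, b"<html>version 1</html>")
changed = has_changed(stored, b"<html>version 2</html>")
```

Storing only the checksum, rather than the full page, keeps the history record small while still letting a later visit detect that the bytes differ.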
  • 4. ROLES – MAJOR PLAYERS – A ‘BOSS’: URL SCHEDULER. Think of it as Google’s line manager or ‘air traffic controller’ for Googlebots in the web crawling system. Schedules Googlebot visits to URLs. Decides which URLs to ‘feed’ to Googlebot. Uses data from the history logs about past visits. Assigns visit regularity of Googlebot to URLs. Drops ‘hints’ to Googlebot to guide on types of content NOT to crawl and excludes some URLs from schedules. Analyses past ‘change’ periods and predicts future ‘change’ periods for URLs for the purposes of scheduling Googlebot visits. Checks ‘page importance’ in scheduling visits. Assigns URLs to ‘layers / tiers’ for crawling schedules.
  • 5. TOO MUCH CONTENT: Indexed Web contains at least 4.73 billion pages (13/11/2015). Since 2013 the web is thought to have increased in size by 1/3. [Chart: total number of websites, 2000 to 2014; vertical axis from 250,000,000 to 1,000,000,000]
  • 6. TOO MUCH CONTENT: Capacity limits on Google’s crawling system. How have search engines responded? By creating work ‘schedules’ for Googlebots; by prioritising URLs for crawling; by assigning crawl period intervals to URLs.
  • 7. GOOGLE CRAWL SCHEDULER PATENTS include: ‘Managing items in a crawl schedule’; ‘Scheduling a recrawl’; ‘Web crawler scheduler that utilizes sitemaps from websites’; ‘Document reuse in a search engine crawler’; ‘Minimizing visibility of stale content in web searching including revising web crawl intervals of documents’; ‘Scheduler for search engine’.
  • 8. MANAGING ITEMS IN A CRAWL SCHEDULE (GOOGLE PATENT): 3 layers / tiers. Real Time Crawl: crawled multiple times daily. Daily Crawl: crawled daily or bi-daily. Base Layer Crawl: crawled least, on a ‘round robin’ basis; split into segments on random rotation, and only the ‘active’ segment is crawled. URLs are moved in and out of layers based on past visits data.
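A rough sketch of the three-layer idea on this slide, under loud assumptions: the thresholds, the `assign_layer` rule based on how often past visits saw a change, and every name below are hypothetical and are not taken from the patent.

```python
import random
from enum import Enum
from typing import List


class Layer(Enum):
    REAL_TIME = "real time"   # crawled multiple times daily
    DAILY = "daily"           # crawled daily or bi-daily
    BASE = "base"             # crawled least, one segment at a time


def assign_layer(changes_seen: int, visits: int) -> Layer:
    """Pick a layer from past-visit data (hypothetical thresholds)."""
    if visits == 0:
        return Layer.BASE
    change_rate = changes_seen / visits
    if change_rate >= 0.8:
        return Layer.REAL_TIME
    if change_rate >= 0.3:
        return Layer.DAILY
    return Layer.BASE


def split_into_segments(urls: List[str], n_segments: int, seed: int = 0) -> List[List[str]]:
    """Shuffle base-layer URLs and deal them into n_segments groups."""
    shuffled = list(urls)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_segments] for i in range(n_segments)]


def active_segment(segments: List[List[str]], cycle: int) -> List[str]:
    """Round robin: only one segment is 'active' in a given crawl cycle."""
    return segments[cycle % len(segments)]


base_urls = [f"/page-{i}" for i in range(10)]
segments = split_into_segments(base_urls, n_segments=3)
to_crawl_now = active_segment(segments, cycle=4)  # the second segment (index 1)
```

Re-running `assign_layer` after each visit is one way URLs could move in and out of layers as their observed change rate shifts.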
  • 9. GOOGLEBOT’S BEEN PUT ON A URL CONTROLLED DIET: The URL Scheduler controls the meal planner and carefully controls the list of URLs Googlebot visits. The Scheduler checks URLs for ‘importance’, ‘boost factor’ candidacy and ‘probability of modification’. ‘Budgets’ are allocated.
  • 10. CRAWL BUDGET: What is a crawl budget? An allocation of ‘crawl visit frequency’ apportioned to URLs on a site. It is apportioned by the URL scheduler to Googlebots and is roughly proportionate to page importance (link equity) and speed. Pages with a lot of healthy links get crawled more (can include internal links??). But there are other factors affecting the frequency of Googlebot visits aside from importance / speed. The vast majority of URLs on the web don’t get a lot of budget allocated to them.
  • 11. HINTS & CRITICAL MATERIAL CONTENT CHANGE: $C = \sum_{i=0}^{n-1} \text{weight}_i \cdot \text{feature}_i$
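As a worked illustration of the weighted sum on this slide, the sketch below multiplies each feature by its weight and adds the results. The three features and their weights are invented for the example; the slide does not say which features are used or how they are weighted.

```python
def change_score(weights, features):
    """Weighted sum C = sum(weight_i * feature_i) over i = 0 .. n-1."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * f for w, f in zip(weights, features))


# Hypothetical features: 1.0 means the signal was observed, 0.0 means it was not.
weights = [0.5, 0.3, 0.2]   # assumed importance of each feature
features = [1.0, 0.0, 1.0]  # e.g. main content changed, title unchanged, new links found
score = change_score(weights, features)
print(f"change score: {score:.2f}")
```

Under these assumed numbers, a change that touches only a low-weight feature produces a low score, which matches the slide's point that not every change counts as critical.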
  • 12. POSITIVE FACTORS AFFECTING GOOGLEBOT VISIT FREQUENCY: Current capacity of the web crawling system is high. Your URL is ‘important’. Your URL is in the real time crawl, the daily crawl or the ‘active’ base layer segment. Your URL changes a lot with critical material content change. Probability and predictability of critical material content change is high for your URL. Your website speed is fast and Googlebot gets the time to visit your URL. Your URL has been ‘upgraded’ to a daily or real time crawl layer.
  • 13. NEGATIVE FACTORS AFFECTING GOOGLEBOT VISIT FREQUENCY: Current capacity of the web crawling system is low. Your URL has been detected as a ‘spam’ URL. Your URL is in an ‘inactive’ base layer segment. Your URLs are ‘tripping hints’ built into the system to detect dynamic content with non-critical change. Probability and predictability of critical material content change is low for your URL. Your website speed is slow and Googlebot doesn’t get the time to visit your URL. Your URL has been ‘downgraded’ to an ‘inactive’ base layer segment. Your URL has returned an ‘unreachable’ server response code recently.
  • 14. IT’S NOT JUST ABOUT ‘FRESHNESS’: It’s about the probability & predictability of future ‘freshness’, based on data from the history logs. How can we influence them to escape the base layer?
  • 15. CRAWL OPTIMISATION – STAGE 1 – UNDERSTAND GOOGLEBOT & URL SCHEDULER – LIKES & DISLIKES. LIKES: going ‘where the action is’ in sites; the ‘need for speed’; logical structure; correct ‘response’ codes; XML sitemaps; ‘successful’ crawl visits; ‘seeing everything’ on a page; taking ‘hints’; clear unique single ‘URL fingerprints’ (no duplicates); predicting likelihood of ‘future change’. DISLIKES: slow sites; too many redirects; being bored (Meh) (‘hints’ are built in by the search engine systems – takes ‘hints’); being lied to (e.g. on XML sitemap priorities); crawl traps and dead ends; going round in circles (infinite loops); spam URLs; crawl-wasting minor content change URLs; ‘hidden’ and blocked content; uncrawlable URLs. CHANGE IS KEY: not just any change; critical material change; predicting future change; dropping ‘hints’ to Googlebot; sending Googlebot where ‘the action is’.
  • 16. FIND GOOGLEBOT: Automate server log retrieval via cron job: grep Googlebot access_log > googlebot_access.txt
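For anyone who prefers to do the same filtering in a script, here is a minimal Python sketch of the grep step. It only does a substring match on the user agent, exactly like the grep command, so it will also keep requests that merely claim to be Googlebot. The sample log lines and file names are made up for illustration.

```python
def googlebot_lines(lines):
    """Keep only log lines that mention Googlebot (same idea as `grep Googlebot`)."""
    return [line for line in lines if "Googlebot" in line]


# Made-up sample lines in a typical access log layout.
sample_log = [
    '66.249.66.1 - - [13/Nov/2015:10:00:00 +0000] "GET /products HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [13/Nov/2015:10:00:05 +0000] "GET /products HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (Windows NT 10.0) Firefox/42.0"',
]

hits = googlebot_lines(sample_log)
print(len(hits), "Googlebot request(s) found")

# With a real file (path is an assumption):
# with open("access_log") as source, open("googlebot_access.txt", "w") as target:
#     target.writelines(googlebot_lines(source))
```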
  • 17. LOOK THROUGH ‘SPIDER EYES’ VIA LOG ANALYSIS – ANALYSE GOOGLEBOT. PREPARE TO BE HORRIFIED: incorrect URL header response codes (e.g. 302s); 301 redirect chains; old files or XML sitemaps left on the server from years ago; infinite / endless loops (circular dependency); on parameter-driven sites, URLs crawled which produce the same output; URLs generated by spammers; dead image files being visited; old CSS files still being crawled. Identify your ‘real time’, ‘daily’ and ‘base layer’ URLs. ARE THEY THE ONES YOU WANT THERE?
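A rough sketch of this kind of log analysis is shown below: it counts the response codes Googlebot received and how often each path was requested. The regular expression assumes the common "combined" access log layout, and the sample lines are invented; real server logs may need a different pattern.

```python
import re
from collections import Counter

# Assumed "combined" log layout: ip, identity, user, [time], "request", status, size, "referer", "agent".
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def summarise_googlebot(lines):
    """Return (status code counts, hits per path) for Googlebot requests only."""
    status_counts = Counter()
    hits_per_path = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match or "Googlebot" not in match["agent"]:
            continue  # skip unparseable lines and other user agents
        status_counts[match["status"]] += 1
        hits_per_path[match["path"]] += 1
    return status_counts, hits_per_path


GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample_log = [
    f'66.249.66.1 - - [13/Nov/2015:10:00:00 +0000] "GET /products HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [13/Nov/2015:10:05:00 +0000] "GET /old-page HTTP/1.1" 302 0 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.7 - - [13/Nov/2015:10:06:00 +0000] "GET /products HTTP/1.1" 200 5120 "-" "Firefox/42.0"',
]

status_counts, hits_per_path = summarise_googlebot(sample_log)
print(dict(status_counts))
print(hits_per_path.most_common(5))
```

Run over several weeks of logs, the per-path counts give a first impression of which URLs are visited many times a day, roughly daily, or only rarely.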
  • 18. FIX GOOGLEBOT’S JOURNEY – SPEED UP YOUR SITE TO ‘FEED’ GOOGLEBOT MORE. TECHNICAL ‘FIXES’: Speed up your site. Implement compression, minification and caching. Fix incorrect header response codes. Fix nonsensical ‘infinite loops’ generated by database-driven parameters or ‘looping’ relative URLs. Use absolute rather than relative internal links. Ensure no parts of the content are blocked from crawlers (e.g. in carousels, concertinas and tabbed content). Ensure no CSS or JavaScript files are blocked from crawlers. Unpick 301 redirect chains.
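One way to find redirect chains and loops before unpicking them is to walk a map of known redirects. The sketch below assumes the redirects have already been collected into a source-to-target dictionary (for example from a crawl export or the server configuration); the paths and function names are hypothetical.

```python
def follow_redirects(redirects, start, max_hops=10):
    """Follow a source -> target redirect map; return (chain, is_loop)."""
    chain = [start]
    seen = {start}
    current = start
    while current in redirects and len(chain) <= max_hops:
        current = redirects[current]
        chain.append(current)
        if current in seen:
            return chain, True  # came back to a URL already visited
        seen.add(current)
    return chain, False


def flatten(redirects):
    """Point each source at the end of its chain; loops are left out.

    Chains longer than max_hops are truncated by follow_redirects.
    """
    flat = {}
    for source in redirects:
        chain, is_loop = follow_redirects(redirects, source)
        if not is_loop:
            flat[source] = chain[-1]
    return flat


# Hypothetical redirect map: one two-hop chain and one loop.
redirects = {"/a": "/b", "/b": "/c", "/x": "/y", "/y": "/x"}

print(follow_redirects(redirects, "/a"))
print(follow_redirects(redirects, "/x"))
print(flatten(redirects))
```

The flattened map is what the redirect rules would look like if every old URL pointed straight at its final destination in a single hop.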
  • 19. FIX GOOGLEBOT’S JOURNEY – SAVE BUDGET: Minimise 301 redirects. Minimise canonicalisation. Use ‘if modified’ headers on low importance ‘hygiene’ pages. Use ‘expires after’ headers on content with a short shelf life (e.g. auctions, job sites, event sites). Noindex low search volume or near duplicate URLs (use noindex directive on robots.txt). Use 410 ‘gone’ headers on dead URLs liberally. Revisit the .htaccess file and review legacy pattern-matched 301 redirects. Combine CSS and JavaScript files.
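The ‘if modified’ tip is read here as the standard HTTP conditional request: the crawler sends an If-Modified-Since date and the server answers 304 Not Modified when the page has not changed since then. The sketch below shows only that decision, under the assumption that the page's last-modified time is known as a timezone-aware datetime; the function name is made up and it is not tied to any particular web framework.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(last_modified, if_modified_since=None):
    """Return 304 if the page is unchanged since the given HTTP date, else 200."""
    if if_modified_since is None:
        return 200  # no conditional header sent: serve the full page
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: serve the full page
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200


# Hypothetical 'hygiene' page last edited on 1 November 2015 (UTC).
page_last_modified = datetime(2015, 11, 1, tzinfo=timezone.utc)

print(conditional_status(page_last_modified, "Fri, 13 Nov 2015 10:00:00 GMT"))
print(conditional_status(page_last_modified, "Thu, 01 Oct 2015 00:00:00 GMT"))
```

A 304 response carries no body, so an unchanged page costs the crawler very little time.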
  • 20. TRAIN GOOGLEBOT – ‘TALK TO THE SPIDER’ (PROMOTE URLS TO HIGHER CRAWL LAYERS). EMPHASISE PAGE IMPORTANCE – BE CLEAR TO GOOGLEBOT WHICH ARE YOUR MOST IMPORTANT PAGES: Revisit ‘votes for self’ via internal links in GSC. Clear ‘unique’ URL fingerprints. Use XML sitemaps for your important URLs (don’t put everything on it). Use ‘mega menus’ (very selectively) to key pages. Use ‘breadcrumbs’ (for hierarchical structure). Build ‘bridges’ and ‘shortcuts’ via HTML sitemaps and supplementary content for ‘cross modular’ ‘related’ internal linking to key pages. Consolidate (merge) important but similar content (e.g. merge FAQs). Consider flattening your site structure so ‘importance’ flows further. Reduce internal linking to low priority URLs. TRAIN ON CHANGE – GOOGLEBOT GOES WHERE THE ACTION IS AND IS LIKELY TO BE IN THE FUTURE: Not just any change – critical material change. Keep the ‘action’ in the key areas, NOT JUST THE BLOG. Use ‘relevant’ supplementary content to keep key pages ‘fresh’. Remember the negative impact of ‘crawl hints’. Regularly update key content. Consider ‘updating’ rather than replacing seasonal content URLs. Build ‘dynamism’ into your web development (sites that ‘move’ win).
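To illustrate the tip about keeping the XML sitemap to important URLs only, the sketch below builds a small sitemap from a hand-picked list. The URLs and dates are placeholders, and the function name is hypothetical; the element names and namespace follow the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, lastmod) pairs and return it as a string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod  # W3C date, e.g. YYYY-MM-DD
    return ET.tostring(urlset, encoding="unicode")


# Placeholder list: only the pages that matter most, not every URL on the site.
important_pages = [
    ("https://example.com/", "2015-11-13"),
    ("https://example.com/key-category/", "2015-11-10"),
]

sitemap_xml = build_sitemap(important_pages)
print(sitemap_xml)
```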
  • 21. TOOLS YOU CAN USE. SPEED: YSlow; Pingdom; Google Page Speed Tests; minification (JS Compress and CSS Minifier); image compression (Compressjpeg.com, tinypng.com). SPIDER EYES: GSC Crawl Stats; Deepcrawl; Screaming Frog; server logs; SEMRush (auditing tools); Webconfs (header responses / similarity checker); Powermapper (bird’s eye view of site). URL IMPORTANCE: GSC Internal Links report (URL importance); Link Research Tools (strongest sub pages reports); GSC Internal Links (add site categories and sections as additional profiles); Powermapper. SAVINGS & CHANGE: GSC Index levels (over indexation checks); GSC Crawl Stats; Last Accessed Tools (versus competitors); server logs. Webmaster Hangout Office Hours.
  • 22. WARNING SIGNS – TOO MANY VOTES BY SELF FOR WRONG PAGES: Is this your blog?? Hope not. Most Important Page 1; Most Important Page 2; Most Important Page 3.
  • 23. WARNING SIGNS – OVER INDEXATION: FIX IT FOR A BETTER CRAWL
  • 24. WARNING SIGNS – TAG MAN: Tags: I, must, tag, this, blog, post, with, every, possible, word, that, pops, into, my, head, when, I, look, at, it, and, dilute, all, relevance, from, it, to, a, pile, of, mush, cow, shoes, sheep, the, and, me, of, it. Creating ‘thin’ content and even more URLs to crawl. Image credit: Buzzfeed.
  • 26. REMEMBER: “Googlebot’s On A Strict Diet”. “Make sure the right URLs get on the menu”. Dawn Anderson @dawnieando