#pubcon
Logs Don’t Lie – SEO Wins &
Server Log File Analysis
Presented by:
Dawn Anderson @dawnieando
#pubcon
Dawn Anderson
• Move It Marketing
• University Lecturer – Digital Marketing
• From Manchester, UK (rains a lot)
• International SEO Consultant – 11+ yrs in SEO
• Pomeranian pooch lover – Bert & Tedward
• Fascinated by crawling (practice & academia)
• Doesn’t fare well in YouTube screen grabs ;P
• Party trick: Remembering UK postcode areas
(US Zip code equivalent)
• Search Awards Judge
• Twitter chatterer @dawnieando
#pubcon
What Server Log File Analysis is NOT
#pubcon
It’s not just about
‘crawl budget’
#pubcon
Crawl Budget
1. It’s not a real term, but 2 notions together
2. Host Load + URL Scheduling combined
3. Crawling ‘Politeness’ IS A BIG THING
4. How much can we crawl & how often?
5. What is important to crawl & how often?
6. We need to not ‘obsess’ over this if we have small sites
7. WE NEED TO TAKE A LOOK THROUGH ‘SPIDER EYES’
#pubcon
When could it be worth it?
• You have a huge site with many parameters
• You’ve spotted anomalies in Google Analytics traffic URLs
• You’re trying to consolidate site URLs
• You’ve got big legacy
#pubcon
Why Server Log File Analysis?
#pubcon
Many reasons… But… Here’s one… An
infinite loop can kill a site over time
#pubcon
Exponentially Multiplicative URLs From Faceted Navigation…
100 DRESSES
5 COLOURS
10 SIZES
2 LENGTHS
4 SUPPLIERS
100 x 5 x 10
x 2 x 4 =
40,000
URLs
#pubcon
And that’s without HTTPS, WWW/non or internationalization
100 DRESSES
5 COLOURS
10 SIZES
2 LENGTHS
4 SUPPLIERS
100 x 5 x 10
x 2 x 4 =
40,000
URLs
X 2 BECAUSE…
HTTPS VERSION
80,000
URLs
X 2…
BECAUSE…
WWW / NON
WWW VERSION
160,000
URLs
X 5…
BECAUSE…
EN / FR / ES /
DE / IT (e.g.)
800,000
URLs
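The multiplication on this slide can be reproduced in a few lines. This is only a sketch: the facet counts and multipliers come from the slide, while the variable names are illustrative.

```python
from math import prod

# Facet counts from the slide: every facet multiplies the crawlable URL variants.
facets = {"dresses": 100, "colours": 5, "sizes": 10, "lengths": 2, "suppliers": 4}
base_urls = prod(facets.values())

# Site-level duplication from the slide: protocol, host name and language versions.
multipliers = {"http/https": 2, "www/non-www": 2, "languages": 5}
total_urls = base_urls * prod(multipliers.values())

print(base_urls)   # 40000
print(total_urls)  # 800000
```

Adding one more facet with only three values would triple both figures, which is why parameter-heavy sites grow so quickly.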
#pubcon
Why is Server Log File Analysis Important?
Detecting orphaned URLs
Understanding URL crawl frequency
Detecting server errors
Understanding the % of ‘healthy’ crawling
IDENTIFYING WEAK AREAS IN A SITE
#pubcon
A consolidation of signals
to preferred URLs can win
with SEO
CONSISTENCY
IS KING
#pubcon
Detective
Meets
Detective
#pubcon
We are stalking Googlebots (as
detectives) and trying to walk
their paths to understand their
experiences as they traverse a
site
WHAT ‘CLUES’ ARE WE
PROVIDING?
#pubcon
Is Google (the detective) picking
up on our clues?
Canonical tags
XML sitemaps
Hreflang
Internal links
Pagination
URL parameters
#pubcon
Is Google getting your ’hints’?
ONLY
HINTS
‘NOT’
DIRECTIVES??
#pubcon
Directive or Hint?… Either way, we need
to ensure our clues are working
Likely pretty strong
hints… or maybe
’nearly’ directives ;P
;P ;P…
It depends ;P
#pubcon
Every site will have its own crawling rules
DUSTBUSTER
CRAWLING
RULES
BUILDS ‘HINTS’
ON WHAT NOT
TO CRAWL
DO NOT CRAWL
IN THE DUST
#pubcon
Popular CMSs might help with ‘rule-building’
ALL WILL HAVE SOME COMMON
CANONICALIZATION PATTERNS
WHICH CAN BE LEARNED FOR
EFFICIENCY
#pubcon
Why ‘Sampling’ in crawling for efficiency?
Is it worth it?
Should we crawl more?
Are there lots of important URLs here?
Do the URLs ‘genuinely’ change frequently?
Are the changes ‘important’ (weighted) or is it just
‘DUST’?
#pubcon
There is likely also ‘quilting’ detection
Detecting Quilted
Web Pages at Scale
(Najork, M., 2012)
Finds pages
‘stitched’ together
to make other
pages
Image credit: Najork, Marc
#pubcon
Breadth First Crawling
or Other Crawling
Strategies (OR
SOMETHING MUCH
BETTER THAN THIS
SINCE CAFFEINE??)
#pubcon
Do NOT get me started on JavaScript &
dependent files
THEY ARE ALL
NEEDED IN
INDEXING &
GATHERED IN
CRAWLING
#pubcon
If you use a ‘Builder’ theme in
WordPress this will be very evident
THE CACHES
CREATED GET
CRAWLED… a
LOT
#pubcon
Yoast and Googlebot
access…
• Yoast has unblocked access to Googlebot
in its plugin pretty much everywhere
• You might find Googlebot is trying to
access wp-admin even
• Googlebot needs all dependent files to
render the page (including JavaScript and
CSS files)
#pubcon
Hunting for
Googlebots? Where
can we find them?
#pubcon
In your
quest…You
may face some
challenges
along the way
#pubcon
Google Search Console is Where It’s At… Right?
#pubcon
We Can See Some Symptoms Here
Though
•SPEED ISSUES
•‘BOREDOM’
ISSUES
•ROGUE CODE
#pubcon
We Can See
Some
Symptoms
Here Too
AFTER REMOVAL OF
CANONICAL AT SCALE
#pubcon
Signs in Google Search Console Crawl Stats
• Possible signs of ‘quality-impacted’ content
• Near duplicates & duplicates
• Not just speed… ‘boredom’ too
• Major site changes or switches to https protocol
• Signs of ‘Sampling’ visits for quality
• Getting the best yield for crawling
• Transitive nature
• Slow sites
#pubcon
GOOGLE SEARCH CONSOLE IS NOT REALLY
JUST ‘WEB PAGES’
• Includes ALL CSS, JS, Zip,
XML, PDF, AMP, HTML files
crawled
• Pages are NOT just single
webpages
https://support.google.com/webmasters/answer/35253
Not just ‘web pages’
#pubcon
VISITS BY ALL THE TYPES OF GOOGLEBOTS
ARE RECORDED TOGETHER IN GSC
• Web
• Image
• News
• Video
• Feature Phone
• Smartphone
• Mobile AdSense
• AdSense
• AdsBot
• App Crawler
ALL The Googlebot Family
#pubcon
ONLY CONTAINS STATUS CODES BETWEEN
200s & 30X (ALL PROTOCOLS THOUGH)
#pubcon
Don’t Be Fooled By Those ‘Big Success’
Screen Grabs on Twitter
CAN BE BOTH
HTTP AND HTTPS
OR SIMPLY A
MAJOR IN-SITE
REDIRECTION
EXERCISE
#pubcon
Google Search Console is Like a Visit To
The GP Before Referral
VAGUE AT BEST
#pubcon
So…We Need To Dig Deeper… Be More Curious
#pubcon
Finding
Awstats on
cPanel
#pubcon
What URLs Though?
#pubcon
REALITY – Server Logs & Log Analysis Is
Where It’s At
AUTOMATE SERVER LOG
RETRIEVAL VIA CRON JOB
grep Googlebot access_log > googlebot_access.txt
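The grep command keeps every line that mentions Googlebot. A rough Python equivalent, which may help when archived logs arrive gzip-compressed, could look like the sketch below. The function names and file paths are illustrative and not part of any tool named in this deck. Matching on the user agent text alone does not prove the visit really came from Google.

```python
import gzip
from pathlib import Path


def filter_googlebot_lines(lines):
    """Yield only the lines that mention 'Googlebot' (like `grep Googlebot`)."""
    for line in lines:
        if "Googlebot" in line:
            yield line


def open_log(path):
    """Open a plain or gzip-compressed log file as text."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return path.open("r", encoding="utf-8", errors="replace")


def extract_googlebot(src, dest):
    """Write the Googlebot lines from src to dest and return how many were kept."""
    kept = 0
    with open_log(src) as infile, open(dest, "w", encoding="utf-8") as outfile:
        for line in filter_googlebot_lines(infile):
            outfile.write(line)
            kept += 1
    return kept
```

A call such as `extract_googlebot("access_log", "googlebot_access.txt")` mirrors the command on the slide and could be scheduled from the same cron job.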
#pubcon
But… what are logs? Log files? Log file
analysis… really?
#pubcon
Not just this… :) But kind of this :)
#pubcon
Logs are everywhere
#pubcon
So… What’s a log file?
A document
containing one or
more logs
Usually exported as a
.txt file in ‘Common
Log Format’ (W3C)
from the server
Common log format
contains specific
fields
May be bundled in a
tar or .gz file
Can be exported
from ‘raw access’ in
the server
#pubcon
Elements of Server Log File (Common
Log Format)
• IP Address of bot
• Date Accessed (and time)
• Request type (GET, HEAD, POST)
• URL Requested
• Server Response Code Returned
• HTTP Code
• Bytes Served (file size)
• User Agent (recorded by the ‘Combined’ variant of the format, alongside the referrer)
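To make these fields concrete, here is a minimal sketch that splits one log line into named parts. It assumes the server writes the Combined Log Format (Common Log Format plus quoted referrer and user agent). The sample line is hypothetical, and the field names are my own labels rather than anything the server defines.

```python
import re

# Common Log Format fields, with the Combined format's referrer and user agent optional.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)")?'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # Servers write "-" when no response body was sent.
    record["bytes"] = 0 if record["bytes"] == "-" else int(record["bytes"])
    return record


# Hypothetical example line, not taken from a real server.
sample = (
    '66.249.66.1 - - [10/Apr/2018:06:25:14 +0000] '
    '"GET /dresses?colour=red HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)
record = parse_line(sample)
print(record["ip"], record["method"], record["url"], record["status"])
```

Custom log formats are common, so the pattern would need adjusting to whatever the server is actually configured to write.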
#pubcon
Purposes & Types of logs on Servers
01. Error logs (database errors + server response codes, bad code warnings)
02. Suspicious activity (security implications / monitoring) – DDOS, spammers, hackers
03. Visits to a file / page / page URL requests (both human & bots)
#pubcon
Monitor
Other
Types of
Logs Too
Error logs provide great
insight into where there
may be issues with
overloading the server etc
If errors (e.g. 500 codes)
are sent frequently,
Google will pull back and
crawl less
#pubcon
A Better
Explanation of
Log File Analysis
• Interrogation of data
• Looking for patterns
• Looking for anomalies
• Looking for split messaging
about URL importance to
Google and finding ways to
consolidate consistent
signaling to single content
fingerprints - STRENGTH
#pubcon
Recap - Simple
Explanation
• Logs – simply a notation or record of
something
• Log files – simply the document where the
log is stored
• Log file analysis – simply analysing and
exploring the log files to identify areas
where optimisation is possible or wastage
occurs
#pubcon
Hunting for
Googlebots?
How will we
know we’ve
found them?
#pubcon
The
Good,
The Bad
& The
Ugly
Good bots (polite, un-
malicious, usually from
respectable organisations
(e.g. Search Engines &
good tool providers))
Bad bots (impolite, may
be malicious, scraper
bots, spammers)
#pubcon
‘Politeness’ Crawling Rules
Do NOT
damage the
server
01
Do NOT
damage the
server
02
Do NOT
damage the
server
03
#pubcon
There are ALSO many,
many ‘POLITE’ bots
• Yahoo Slurp
• Bingbot
• Other Search Engine Bots
• 10 Types of Googlebot
• SEO Tool Bots (On Page, Sistrix,
Deepcrawl, etc)
#pubcon
Verifying it’s really Googlebot
• Spoofing
• Google don’t publish a list
of IPs any longer (they
change too frequently)
• Need to verify by reverse
DNS lookup on server using
HOST to ensure visits are
really from Googlebot and
not spoof bots (Screaming
Frog)
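The reverse DNS check can also be scripted. The sketch below is one possible approach, not an official tool: it reverse-resolves the IP, checks that the host name ends in a Google domain, then resolves that host name forwards and confirms it maps back to the same IP. The accepted domain suffixes are an assumption based on Google's published verification guidance, the function name is illustrative, and `socket.gethostbyname_ex` only covers IPv4.

```python
import socket

# Assumed suffixes for genuine Googlebot host names.
GOOGLE_DOMAINS = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=socket.gethostbyaddr,
                          forward_lookup=socket.gethostbyname_ex):
    """Reverse-resolve the IP, check the host name, then forward-confirm it."""
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:
        # No reverse DNS record: cannot be verified.
        return False
    if not hostname.endswith(GOOGLE_DOMAINS):
        return False
    try:
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False
    # The forward lookup must lead back to the IP seen in the log.
    return ip in addresses
```

The lookup functions are parameters so that results can be cached or replaced with stubs. DNS lookups are slow, so it usually makes sense to verify each distinct IP once rather than every log line.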
#pubcon
Examples of Googlebot
Organic Search Calling
Cards (user agents)
• Desktop - Mozilla/5.0 (compatible;
Googlebot/2.1;
+http://www.google.com/bot.html)
• Smartphone - Mozilla/5.0 (Linux; Android
6.0.1; Nexus 5X Build/MMB29P)
AppleWebKit/537.36 (KHTML, like Gecko)
Chrome/41.0.2272.96 Mobile
Safari/537.36 (compatible;
Googlebot/2.1;
+http://www.google.com/bot.html)
• Googlebot-News
• Googlebot-Image/1.0
#pubcon
So what’s the server log file analysis
process?
Gather → Ship → Analyse
#pubcon
Gathering server logs
#pubcon
Gathering logs straight from the server
FIND ‘RAW
ACCESS’
#pubcon
Gathering logs straight from the server
These only represent a few
hours’ worth of data /
activity
GOOD FOR A QUICK
LOOK IF YOU SEE
SOME PROBLEMS
OCCURRING
#pubcon
Archived raw logs… is what you want
• A log of everything on
the server
• All of the separate
subdomains
• All of the separate
protocols
• Zipped Up
#pubcon
Shipping server logs
#pubcon
Shipping logs to analytical tools
• Docker
• Naked Eye
• Excel
• Text File
• Cloud based log analysis software
• GREP
• Command Line
• Downloaded log analysis software
Carrying ‘ALL’ the
logs
#pubcon
Opening log files manually
YOU’LL NEED A TEXT
EDITOR OR EVEN BETTER
AN IDE (INTEGRATED
DEVELOPMENT
ENVIRONMENT)
e.g.
NOTEPAD++
BRACKETS (on MAC)
• Notepad++
• Notepad
• Komodo
• Aptana
• Netbeans
• Eclipse
• Brackets (Mac)
#pubcon
They look something like
this… SERVER LOGS
EXAMPLE IN
‘BRACKETS’ IDE
TEXT EDITOR
#pubcon
Move them all to Excel
CTRL+A, copy and
paste all into an
Excel
spreadsheet
#pubcon
Convert text to data in Excel
Choose ‘DATA’
and convert text
to columns.
Delimit with
‘space’ (usually)
#pubcon
Filter by verified user agent
FILTER USER
AGENT on verified
DNS lookup HOST
http://google.com/bot.html
#pubcon
Many ‘scale’ tools for log file analysis
#pubcon
SCREAMING FROG LOG ANALYZER
#pubcon
You cannot ‘emulate’ a Googlebot
crawl
It is not a ‘from start to finish’ crawl
through a site
#pubcon
You may see strange URLs
• Old .htaccess rules run in order of their placement in
the .htaccess file
• Old .htaccess rules on ‘migrated-from’ sites
• Old XML sitemaps left on server
• Old plugins removed but folders left
• Un-optimized MySQL or other database (cluttered
with legacy)
• Spammers hitting your search queries & randomly
spinning new links
#pubcon
Understanding the problems
#pubcon
Crawl prioritization & queuing is evident
MANY OF THESE FILES
WERE PREVIOUSLY
CALLING THESE NOW
REMOVED PLUGINS AND
WERE QUEUED TO CRAWL
TOGETHER
#pubcon
If you migrate or switch protocol… import &
monitor all logs… they will chain
#pubcon
What followed the 301? FILTER ON BOT &
RESPONSE CODE
INCLUDING ONLY 301
AND 200 RESPONSE
CODES TO FIND THE
NEXT PART OF THE BOT
JOURNEY
(CONSIDER MULTIPLE
CONNECTIONS)
#pubcon
Filter on bot and 301 to identify bad crawl chains &
problematic parameters
SPOT THE
ISSUES
#pubcon
Filter on bot and 304 code served
304 – ‘NOT MODIFIED’
(NOTICE NOTHING HAS
BEEN DOWNLOADED)
CONDITIONAL REQUEST
(‘IF-MODIFIED-SINCE’).
NO BODY RE-SENT
#pubcon
When consolidating… check 410 progress
Filter on 410 and User
Agent
410 ‘GONE’
#pubcon
Build a monitoring dashboard
#pubcon
Check The Split of Smartphone v Desktop Googlebot
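One way to get this split from parsed logs is sketched below. It relies on the two organic search user agent strings shown earlier in this deck: the smartphone crawler's string contains "Mobile" and the desktop one does not. The function name is illustrative, and the rule should be re-checked whenever Google changes its user agent strings.

```python
from collections import Counter


def googlebot_device(user_agent):
    """Classify a Googlebot user agent as 'smartphone', 'desktop' or None."""
    if "Googlebot/2.1" not in user_agent:
        # Other crawlers (e.g. Googlebot-Image) are left out of the split.
        return None
    return "smartphone" if "Mobile" in user_agent else "desktop"


# User agent strings as listed on the 'calling cards' slide.
user_agents = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile "
    "Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Googlebot-Image/1.0",
]
devices = (googlebot_device(ua) for ua in user_agents)
split = Counter(device for device in devices if device is not None)
print(split["desktop"], split["smartphone"])
```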
#pubcon
Check The
Split of
Response
Codes
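A response code split is a simple count over the status column. This is a minimal sketch, and the status codes in it are made-up sample data rather than real log output.

```python
from collections import Counter


def status_split(statuses):
    """Return {status_code: percentage of requests}, rounded to one decimal place."""
    counts = Counter(statuses)
    total = sum(counts.values())
    return {code: round(100 * n / total, 1) for code, n in sorted(counts.items())}


# Hypothetical status codes from verified Googlebot requests.
statuses = [200, 200, 200, 200, 301, 301, 304, 404, 410, 500]
print(status_split(statuses))
```

Tracking the same split week by week makes a sudden rise in 301, 404 or 500 responses easy to spot.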
#pubcon
Splunk &
Deepcrawl log
file filter query
#pubcon
Explore & drill down into issues
#pubcon
Discover anomalies / gaps between
analytics & crawls from user-agent logs
#pubcon
Find & reconnect orphaned pages
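Orphan candidates can be found by comparing two URL sets: the URLs bots requested according to the logs, and the URLs a link-following crawl of the site reached. The sketch below assumes both lists are already normalised to the same form (protocol, host name, trailing slash). The function name and the URLs are hypothetical.

```python
def find_orphans(logged_urls, crawled_urls):
    """Return URLs seen in the logs that the site crawl never reached."""
    return sorted(set(logged_urls) - set(crawled_urls))


# Hypothetical sample data.
logged = ["/dresses", "/dresses?colour=red", "/old-campaign", "/about"]
crawled = ["/dresses", "/dresses?colour=red", "/about", "/contact"]
print(find_orphans(logged, crawled))
```

Swapping the arguments gives the opposite list: pages linked on the site that the bot never requested in the period covered by the logs.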
#pubcon
Identify the discrepancies & weak
areas to prioritise
#pubcon
Do Regular Log Analysis & Crawling
• Weekly or monthly crawls
• Download logs or run them into the cloud
automatically (RECOMMENDED)
• Compare log file activity against crawls of the
site
• Compare crawls and log file activity against
Google Analytics & GSC ‘active’ URLs
#pubcon
Closing words –
Pressing
‘Recrawl now’
(April Fools)
will not fix your
content
#pubcon
But… Fixing your
content might
positively impact
crawling
#pubcon
Happy
Sleuthing
Thank you
#pubcon
Appendix, References & Further
Resources
#pubcon
Pros and cons of Excel
CONS
• Fiddly
• Mostly Manual process
• Limited capacity
• Need to keep updating with
more data
PROS
• Easy to set up
• Suitable for small
analysis
• Simple to understand
• Easy to filter & sort
#pubcon
Loggly
CONS
• Free version very limited
• Not initially intuitive
• Integrates with server
• Integrates with log-shipper
intermediaries like
‘Docker’
PROS
• Option to upload files
• Good graphical analysis
• Can build nice reports
• Cloud based software
• Great dashboard
#pubcon
Splunk Light
CONS
• Free version limited
• Based on usage
• Can soon add up
• Not easily configured in
the cloud
• UI not intuitive
PROS
• Downloadable version
• Good for medium projects
• You have control (on own
machine)
• Easy to pick out essentials
• Can integrate with Deepcrawl
#pubcon
Screaming Frog Log Analyzer
CONS
• Need to increase RAM
allowance almost always
• Very limited free version
• Again, has limits as it’s not
cloud based
PROS
• Very easy to configure
• Can compare log URIs
with crawl data
• Very similar to Excel
• Some graphics to build
reports
• Overlay GA & GSC
#pubcon
Places to find bot
footprints
• On server analytics & visitor analytics
screens – e.g. Awstats
• Google search console
• Server logs (raw access logs)

More Related Content

What's hot

Infinite Loops Dirty Architecture And Too Many Indexed URLs
Infinite Loops Dirty Architecture And Too Many Indexed URLsInfinite Loops Dirty Architecture And Too Many Indexed URLs
Infinite Loops Dirty Architecture And Too Many Indexed URLs
Dawn Anderson MSc DigM
 
SEO Cannibalisation of Your Own SEO Success
SEO Cannibalisation of Your Own SEO SuccessSEO Cannibalisation of Your Own SEO Success
SEO Cannibalisation of Your Own SEO Success
Dawn Anderson MSc DigM
 
Negotiating crawl budget with googlebots
Negotiating crawl budget with googlebotsNegotiating crawl budget with googlebots
Negotiating crawl budget with googlebots
Dawn Anderson MSc DigM
 
Bringing in the family to emphasise importance and win during crawling
Bringing in the family to emphasise importance and win during crawlingBringing in the family to emphasise importance and win during crawling
Bringing in the family to emphasise importance and win during crawling
Dawn Anderson MSc DigM
 
Sasconbeta 2015 Dawn Anderson - Talk To The Spider
Sasconbeta 2015 Dawn Anderson - Talk To The SpiderSasconbeta 2015 Dawn Anderson - Talk To The Spider
Sasconbeta 2015 Dawn Anderson - Talk To The Spider
Dawn Anderson MSc DigM
 
How to Perform SEO Audits for Maximized Efficiency & Value
How to Perform SEO Audits for Maximized Efficiency & ValueHow to Perform SEO Audits for Maximized Efficiency & Value
How to Perform SEO Audits for Maximized Efficiency & Value
alanbleiweiss
 
Technical SEO - Generational cruft in SEO - there is never a new site when th...
Technical SEO - Generational cruft in SEO - there is never a new site when th...Technical SEO - Generational cruft in SEO - there is never a new site when th...
Technical SEO - Generational cruft in SEO - there is never a new site when th...
Dawn Anderson MSc DigM
 
SEO - Stop Eating Your Words - Avoid Cannibalisation Of Your Sites
SEO - Stop Eating Your Words - Avoid Cannibalisation Of Your SitesSEO - Stop Eating Your Words - Avoid Cannibalisation Of Your Sites
SEO - Stop Eating Your Words - Avoid Cannibalisation Of Your Sites
Dawn Anderson MSc DigM
 
Pubcon Vegas 2017 You're Going To Screw Up International SEO - Patrick Stox
Pubcon Vegas 2017 You're Going To Screw Up International SEO - Patrick StoxPubcon Vegas 2017 You're Going To Screw Up International SEO - Patrick Stox
Pubcon Vegas 2017 You're Going To Screw Up International SEO - Patrick Stox
patrickstox
 
Using Competitive Gap Analyses to Discover Low-Hanging Fruit
Using Competitive Gap Analyses to Discover Low-Hanging FruitUsing Competitive Gap Analyses to Discover Low-Hanging Fruit
Using Competitive Gap Analyses to Discover Low-Hanging Fruit
Keith Goode
 
A Technical Look at Content - PUBCON SFIMA 2017 - Patrick Stox
A Technical Look at Content - PUBCON SFIMA 2017 - Patrick StoxA Technical Look at Content - PUBCON SFIMA 2017 - Patrick Stox
A Technical Look at Content - PUBCON SFIMA 2017 - Patrick Stox
patrickstox
 
Keeping Things Lean & Mean: Crawl Optimisation - Search Marketing Summit AU
Keeping Things Lean & Mean: Crawl Optimisation - Search Marketing Summit AUKeeping Things Lean & Mean: Crawl Optimisation - Search Marketing Summit AU
Keeping Things Lean & Mean: Crawl Optimisation - Search Marketing Summit AU
Jason Mun
 
An SEO's Guide to Website Migrations | Faye Watt | BrightonSEO's Advanced Tec...
An SEO's Guide to Website Migrations | Faye Watt | BrightonSEO's Advanced Tec...An SEO's Guide to Website Migrations | Faye Watt | BrightonSEO's Advanced Tec...
An SEO's Guide to Website Migrations | Faye Watt | BrightonSEO's Advanced Tec...
Faye Watt
 
Lots of ways to speed up your site
Lots of ways to speed up your siteLots of ways to speed up your site
Lots of ways to speed up your site
Ian Lurie
 
SearchLove Boston 2018 - Tom Anthony - Hacking Google: what you can learn fro...
SearchLove Boston 2018 - Tom Anthony - Hacking Google: what you can learn fro...SearchLove Boston 2018 - Tom Anthony - Hacking Google: what you can learn fro...
SearchLove Boston 2018 - Tom Anthony - Hacking Google: what you can learn fro...
Distilled
 
SearchLeeds 2018 - Steve Chambers - Stickyeyes - How not to F**K up a Migration
SearchLeeds 2018 - Steve Chambers - Stickyeyes - How not to F**K up a Migration SearchLeeds 2018 - Steve Chambers - Stickyeyes - How not to F**K up a Migration
SearchLeeds 2018 - Steve Chambers - Stickyeyes - How not to F**K up a Migration
Branded3
 
BrightonSEO 2017 - SEO quick wins from a technical check
BrightonSEO 2017  - SEO quick wins from a technical checkBrightonSEO 2017  - SEO quick wins from a technical check
BrightonSEO 2017 - SEO quick wins from a technical check
Chloe Bodard
 
Google's Search Signals For Page Experience - SMX Advanced 2021 Patrick Stox
Google's Search Signals For Page Experience - SMX Advanced 2021 Patrick StoxGoogle's Search Signals For Page Experience - SMX Advanced 2021 Patrick Stox
Google's Search Signals For Page Experience - SMX Advanced 2021 Patrick Stox
Ahrefs
 
Crawl Budget - Some Insights & Ideas @ seokomm 2015
Crawl Budget - Some Insights & Ideas @ seokomm 2015Crawl Budget - Some Insights & Ideas @ seokomm 2015
Crawl Budget - Some Insights & Ideas @ seokomm 2015
Jan Hendrik Merlin Jacob
 

What's hot (19)

Infinite Loops Dirty Architecture And Too Many Indexed URLs
Infinite Loops Dirty Architecture And Too Many Indexed URLsInfinite Loops Dirty Architecture And Too Many Indexed URLs
Infinite Loops Dirty Architecture And Too Many Indexed URLs
 
SEO Cannibalisation of Your Own SEO Success
SEO Cannibalisation of Your Own SEO SuccessSEO Cannibalisation of Your Own SEO Success
SEO Cannibalisation of Your Own SEO Success
 
Negotiating crawl budget with googlebots
Negotiating crawl budget with googlebotsNegotiating crawl budget with googlebots
Negotiating crawl budget with googlebots
 
Bringing in the family to emphasise importance and win during crawling
Bringing in the family to emphasise importance and win during crawlingBringing in the family to emphasise importance and win during crawling
Bringing in the family to emphasise importance and win during crawling
 
Sasconbeta 2015 Dawn Anderson - Talk To The Spider
Sasconbeta 2015 Dawn Anderson - Talk To The SpiderSasconbeta 2015 Dawn Anderson - Talk To The Spider
Sasconbeta 2015 Dawn Anderson - Talk To The Spider
 
How to Perform SEO Audits for Maximized Efficiency & Value
How to Perform SEO Audits for Maximized Efficiency & ValueHow to Perform SEO Audits for Maximized Efficiency & Value
How to Perform SEO Audits for Maximized Efficiency & Value
 
Technical SEO - Generational cruft in SEO - there is never a new site when th...
Technical SEO - Generational cruft in SEO - there is never a new site when th...Technical SEO - Generational cruft in SEO - there is never a new site when th...
Technical SEO - Generational cruft in SEO - there is never a new site when th...
 
SEO - Stop Eating Your Words - Avoid Cannibalisation Of Your Sites
SEO - Stop Eating Your Words - Avoid Cannibalisation Of Your SitesSEO - Stop Eating Your Words - Avoid Cannibalisation Of Your Sites
SEO - Stop Eating Your Words - Avoid Cannibalisation Of Your Sites
 
Pubcon Vegas 2017 You're Going To Screw Up International SEO - Patrick Stox
Pubcon Vegas 2017 You're Going To Screw Up International SEO - Patrick StoxPubcon Vegas 2017 You're Going To Screw Up International SEO - Patrick Stox
Pubcon Vegas 2017 You're Going To Screw Up International SEO - Patrick Stox
 
Using Competitive Gap Analyses to Discover Low-Hanging Fruit
Using Competitive Gap Analyses to Discover Low-Hanging FruitUsing Competitive Gap Analyses to Discover Low-Hanging Fruit
Using Competitive Gap Analyses to Discover Low-Hanging Fruit
 
A Technical Look at Content - PUBCON SFIMA 2017 - Patrick Stox
A Technical Look at Content - PUBCON SFIMA 2017 - Patrick StoxA Technical Look at Content - PUBCON SFIMA 2017 - Patrick Stox
A Technical Look at Content - PUBCON SFIMA 2017 - Patrick Stox
 
Keeping Things Lean & Mean: Crawl Optimisation - Search Marketing Summit AU
Keeping Things Lean & Mean: Crawl Optimisation - Search Marketing Summit AUKeeping Things Lean & Mean: Crawl Optimisation - Search Marketing Summit AU
Keeping Things Lean & Mean: Crawl Optimisation - Search Marketing Summit AU
 
An SEO's Guide to Website Migrations | Faye Watt | BrightonSEO's Advanced Tec...
An SEO's Guide to Website Migrations | Faye Watt | BrightonSEO's Advanced Tec...An SEO's Guide to Website Migrations | Faye Watt | BrightonSEO's Advanced Tec...
An SEO's Guide to Website Migrations | Faye Watt | BrightonSEO's Advanced Tec...
 
Lots of ways to speed up your site
Lots of ways to speed up your siteLots of ways to speed up your site
Lots of ways to speed up your site
 
SearchLove Boston 2018 - Tom Anthony - Hacking Google: what you can learn fro...
SearchLove Boston 2018 - Tom Anthony - Hacking Google: what you can learn fro...SearchLove Boston 2018 - Tom Anthony - Hacking Google: what you can learn fro...
SearchLove Boston 2018 - Tom Anthony - Hacking Google: what you can learn fro...
 
SearchLeeds 2018 - Steve Chambers - Stickyeyes - How not to F**K up a Migration
SearchLeeds 2018 - Steve Chambers - Stickyeyes - How not to F**K up a Migration SearchLeeds 2018 - Steve Chambers - Stickyeyes - How not to F**K up a Migration
SearchLeeds 2018 - Steve Chambers - Stickyeyes - How not to F**K up a Migration
 
BrightonSEO 2017 - SEO quick wins from a technical check
BrightonSEO 2017  - SEO quick wins from a technical checkBrightonSEO 2017  - SEO quick wins from a technical check
BrightonSEO 2017 - SEO quick wins from a technical check
 
Google's Search Signals For Page Experience - SMX Advanced 2021 Patrick Stox
Google's Search Signals For Page Experience - SMX Advanced 2021 Patrick StoxGoogle's Search Signals For Page Experience - SMX Advanced 2021 Patrick Stox
Google's Search Signals For Page Experience - SMX Advanced 2021 Patrick Stox
 
Crawl Budget - Some Insights & Ideas @ seokomm 2015
Crawl Budget - Some Insights & Ideas @ seokomm 2015Crawl Budget - Some Insights & Ideas @ seokomm 2015
Crawl Budget - Some Insights & Ideas @ seokomm 2015
 

Similar to Pubcon florida 2018 logs dont lie dawn anderson

How Search Works
How Search WorksHow Search Works
How Search Works
Ahrefs
 
Google Analytics Referral Spam - Pubcon Las Vegas 2015
Google Analytics Referral Spam - Pubcon Las Vegas 2015Google Analytics Referral Spam - Pubcon Las Vegas 2015
Google Analytics Referral Spam - Pubcon Las Vegas 2015
Search Commander, Inc.
 
Demand Quest SEO Training - Session 2
Demand Quest SEO Training - Session 2Demand Quest SEO Training - Session 2
Demand Quest SEO Training - Session 2
Nate Plaunt
 
Search Engine Optimize for WordPress in 3 Easy Steps
Search Engine Optimize for WordPress in 3 Easy StepsSearch Engine Optimize for WordPress in 3 Easy Steps
Search Engine Optimize for WordPress in 3 Easy Steps
Anna Belle Leiserson
 
The CBC on a diet - Slimming down for a whole nation
The CBC on a diet - Slimming down for a whole nationThe CBC on a diet - Slimming down for a whole nation
The CBC on a diet - Slimming down for a whole nation
Barbara Bermes
 
A Guide to Log Analysis with Big Query
A Guide to Log Analysis with Big QueryA Guide to Log Analysis with Big Query
A Guide to Log Analysis with Big Query
Dominic Woodman
 
SearchLove London 2016 | Dom Woodman | How to Get Insight From Your Logs
SearchLove London 2016 | Dom Woodman | How to Get Insight From Your LogsSearchLove London 2016 | Dom Woodman | How to Get Insight From Your Logs
SearchLove London 2016 | Dom Woodman | How to Get Insight From Your Logs
Distilled
 
SearchLove Boston 2017 | Dom Woodman | How to Get Insight From Your Logs
SearchLove Boston 2017 | Dom Woodman | How to Get Insight From Your LogsSearchLove Boston 2017 | Dom Woodman | How to Get Insight From Your Logs
SearchLove Boston 2017 | Dom Woodman | How to Get Insight From Your Logs
Distilled
 
Intro to Google, SEO, and You in 2017
Intro to Google, SEO, and You in 2017Intro to Google, SEO, and You in 2017
Intro to Google, SEO, and You in 2017
Kristine Schachinger SEO and Online Marketing
 
Demand Quest SEO training session 2
Demand Quest SEO training session 2Demand Quest SEO training session 2
Demand Quest SEO training session 2
Nate Plaunt
 
How to do a SEO Site Audit
How to do a SEO Site AuditHow to do a SEO Site Audit
How to do a SEO Site Audit
Kathy Alice Brown
 
SEO Training Slides October 2016
SEO Training Slides October 2016SEO Training Slides October 2016
SEO Training Slides October 2016
Noisy Little Monkey
 
Seo Made Easy
Seo Made EasySeo Made Easy
Seo Made Easy
InfluenceOlogy
 
The Future of SEO #LearnInbound
The Future of SEO #LearnInboundThe Future of SEO #LearnInbound
The Future of SEO #LearnInbound
Britney Muller
 
Sunday Business Post SEO Masterclass - John RIng
Sunday Business Post SEO Masterclass �- John RIngSunday Business Post SEO Masterclass �- John RIng
Sunday Business Post SEO Masterclass - John RIng
TinderPoint
 
How to Master SEO in 2017
How to Master SEO in 2017How to Master SEO in 2017
How to Master SEO in 2017
Digital Vidya
 
Google Tag Manager - Introduction & Implementation
Google Tag Manager - Introduction & ImplementationGoogle Tag Manager - Introduction & Implementation
Google Tag Manager - Introduction & Implementation
Search Commander, Inc.
 
Google and Beyond: Advanced Search Engine Hacking
Google and Beyond: Advanced Search Engine HackingGoogle and Beyond: Advanced Search Engine Hacking
Google and Beyond: Advanced Search Engine Hacking
amirrullohacmad
 
Technical SEO Training Day | Igoo
Technical SEO Training Day | Igoo Technical SEO Training Day | Igoo
Technical SEO Training Day | Igoo
Charlie Whitworth
 
SEO Checklists
SEO ChecklistsSEO Checklists
SEO Checklists
Jon Payne
 

Similar to Pubcon florida 2018 logs dont lie dawn anderson (20)

How Search Works
How Search WorksHow Search Works
How Search Works
 
Google Analytics Referral Spam - Pubcon Las Vegas 2015
Google Analytics Referral Spam - Pubcon Las Vegas 2015Google Analytics Referral Spam - Pubcon Las Vegas 2015
Google Analytics Referral Spam - Pubcon Las Vegas 2015
 
Demand Quest SEO Training - Session 2
Demand Quest SEO Training - Session 2Demand Quest SEO Training - Session 2
Demand Quest SEO Training - Session 2
 
Search Engine Optimize for WordPress in 3 Easy Steps
Search Engine Optimize for WordPress in 3 Easy StepsSearch Engine Optimize for WordPress in 3 Easy Steps
Search Engine Optimize for WordPress in 3 Easy Steps
 
The CBC on a diet - Slimming down for a whole nation
The CBC on a diet - Slimming down for a whole nationThe CBC on a diet - Slimming down for a whole nation
The CBC on a diet - Slimming down for a whole nation
 
A Guide to Log Analysis with Big Query
A Guide to Log Analysis with Big QueryA Guide to Log Analysis with Big Query
A Guide to Log Analysis with Big Query
 
SearchLove London 2016 | Dom Woodman | How to Get Insight From Your Logs
SearchLove London 2016 | Dom Woodman | How to Get Insight From Your LogsSearchLove London 2016 | Dom Woodman | How to Get Insight From Your Logs
SearchLove London 2016 | Dom Woodman | How to Get Insight From Your Logs
 
SearchLove Boston 2017 | Dom Woodman | How to Get Insight From Your Logs
SearchLove Boston 2017 | Dom Woodman | How to Get Insight From Your LogsSearchLove Boston 2017 | Dom Woodman | How to Get Insight From Your Logs
SearchLove Boston 2017 | Dom Woodman | How to Get Insight From Your Logs
 
Intro to Google, SEO, and You in 2017
Intro to Google, SEO, and You in 2017Intro to Google, SEO, and You in 2017
Intro to Google, SEO, and You in 2017
 
Demand Quest SEO training session 2
Demand Quest SEO training session 2Demand Quest SEO training session 2
Demand Quest SEO training session 2
 
How to do a SEO Site Audit
How to do a SEO Site AuditHow to do a SEO Site Audit
How to do a SEO Site Audit
 
SEO Training Slides October 2016
SEO Training Slides October 2016SEO Training Slides October 2016
SEO Training Slides October 2016
 
Seo Made Easy
Seo Made EasySeo Made Easy
Seo Made Easy
 
The Future of SEO #LearnInbound
The Future of SEO #LearnInboundThe Future of SEO #LearnInbound
The Future of SEO #LearnInbound
 
Sunday Business Post SEO Masterclass - John RIng
Sunday Business Post SEO Masterclass �- John RIngSunday Business Post SEO Masterclass �- John RIng
Sunday Business Post SEO Masterclass - John RIng
 
How to Master SEO in 2017
How to Master SEO in 2017How to Master SEO in 2017
How to Master SEO in 2017
 
Google Tag Manager - Introduction & Implementation
Google Tag Manager - Introduction & ImplementationGoogle Tag Manager - Introduction & Implementation
Google Tag Manager - Introduction & Implementation
 
Google and Beyond: Advanced Search Engine Hacking
Google and Beyond: Advanced Search Engine HackingGoogle and Beyond: Advanced Search Engine Hacking
Google and Beyond: Advanced Search Engine Hacking
 
Technical SEO Training Day | Igoo
Technical SEO Training Day | Igoo Technical SEO Training Day | Igoo
Technical SEO Training Day | Igoo
 
SEO Checklists
SEO ChecklistsSEO Checklists
SEO Checklists
 

Pubcon Florida 2018 - Logs Don’t Lie - Dawn Anderson

  • 1. #pubcon Logs Don’t Lie – SEO Wins & Server Log File Analysis Presented by: Dawn Anderson @dawnieando
  • 2. #pubcon Dawn Anderson • Move It Marketing • University Lecturer – Digital Marketing • From Manchester, UK (rains a lot) • International SEO Consultant – 11+ yrs in SEO • Pomeranian pooch lover – Bert & Tedward • Fascinated by crawling (practice & academia) • Doesn’t fare well in YouTube screen grabs ;P • Party trick: Remembering UK postcode areas (US Zip code equivalent) • Search Awards Judge • Twitter chatterer @dawnieando
  • 3. #pubcon What Server Log File Analysis is NOT
  • 4. #pubcon It’s not just about ‘crawl budget’
  • 5. #pubcon Crawl Budget 1 It’s not a real term, but 2 notions together 2 Host Load + URL Scheduling combined 3 Crawling ’Politeness’ IS A BIG THING 4 How much can we crawl & how often? 5 What is important to crawl & how often? 6 We need to not ‘obsess’ over this if we have small sites 7 WE NEED TO TAKE A LOOK THROUGH ‘SPIDER EYES’
  • 6. #pubcon When could it be worth it? You have a huge site with many parameters You’ve spotted anomalies in Google Analytics traffic URLs You’re trying to consolidate site URLs You’ve got big legacy
  • 7. #pubcon Why Server Log File Analysis?
  • 8. #pubcon Many reasons… But… Here’s one… An infinite loop can kill a site over time
  • 9. #pubcon Exponentially Multiplicative URLs From Faceted Navigation… 100 DRESSES 5 COLOURS 10 SIZES 2 LENGTHS 4 SUPPLIERS 100 x 5 x 10 x 2 x 4 = 40,000 URLs
  • 10. #pubcon And that’s without HTTPS, WWW/non or internationalization 100 DRESSES 5 COLOURS 10 SIZES 2 LENGTHS 4 SUPPLIERS 100 x 5 x 10 x 2 x 4 = 40,000 URLs X 2 BECAUSE… HTTPS VERSION 80,000 URLs X 2… BECAUSE… WWW / NON WWW VERSION 160,000 URLs X 5… BECAUSE… EN / FR / ES / DE / IT (e.g.) 800,000 URLs
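The arithmetic on the two slides above can be checked with a short sketch. The facet counts and multipliers are the ones from the slides; the variable names are illustrative only.

```python
from math import prod

# Facet counts taken from the slide.
facets = {"dresses": 100, "colours": 5, "sizes": 10, "lengths": 2, "suppliers": 4}

# Site-wide duplication factors from the slide: protocol, hostname, language.
multipliers = {"http/https": 2, "www/non-www": 2, "languages": 5}

# Every combination of facet values can produce its own URL.
base_urls = prod(facets.values())

# Each duplication factor multiplies the whole set again.
total_urls = base_urls * prod(multipliers.values())

print(base_urls)   # 40000
print(total_urls)  # 800000
```

Adding one more facet or one more language multiplies the total rather than adding to it, which is why faceted navigation grows so quickly.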
  • 11. #pubcon Why is Server Log File Analysis Important? Detecting orphaned URLs Understanding URL crawl frequency Detecting server errors Understanding the % of ‘healthy’ crawling IDENTIFYING WEAK AREAS IN A SITE
  • 12. #pubcon A consolidation of signals to preferred URLs can win with SEO CONSISTENCY IS KING
  • 14. #pubcon We are stalking Googlebots (as detectives) and trying to walk their paths to understand their experiences as they traverse a site WHAT ‘CLUES’ ARE WE PROVIDING?
  • 15. #pubcon Is Google (the detective) picking up on our clues? Canonical tags XML sitemaps Hreflang Internal links Pagination URL parameters
  • 16. #pubcon Is Google getting your ’hints’? ONLY HINTS ‘NOT’ DIRECTIVES??
  • 17. #pubcon Directive or Hint?… Either way, we need to ensure our clues are working Likely pretty strong hints… or maybe ’nearly’ directives ;P ;P ;P… It depends ;P
  • 18. #pubcon Every site will have its own crawling rules DUSTBUSTER CRAWLING RULES BUILDS ‘HINTS’ ON WHAT NOT TO CRAWL DO NOT CRAWL IN THE DUST
  • 19. #pubcon Popular CMSs might help with ‘rule-building’ ALL WILL HAVE SOME COMMON CANONICALIZATION PATTERNS WHICH CAN BE LEARNED FOR EFFICIENCY
  • 20. #pubcon Why ‘Sampling’ in crawling for efficiency? Is it worth it? Should we crawl more? Are there lots of important URLs here? Do the URLs ‘genuinely’ change frequently? Are the changes ‘important’ (weighted) or is it just ‘DUST’?
  • 21. #pubcon There is likely also ‘quilting’ detection Detecting Quilted Web Pages at Scale (Najork, M., 2012) Finds pages ‘stitched’ together to make other pages Image credit: Najork, Marc
  • 22. #pubcon Breadth First Crawling or Other Crawling Strategies (OR SOMETHING MUCH BETTER THAN THIS SINCE CAFFEINE??)
  • 23. #pubcon Do NOT get me started on JavaScript & dependent files THEY ARE ALL NEEDED IN INDEXING & GATHERED IN CRAWLING
  • 24. #pubcon If you use a ‘Builder’ theme in WordPress this will be very evident THE CACHES CREATED GET CRAWLED… a LOT
  • 25. #pubcon Yoast and Googlebot access… • Yoast has unblocked access to Googlebot in its plugin pretty much everywhere • You might even find Googlebot is trying to access wp-admin • Googlebot needs all dependent files to render the page (including JavaScript and CSS files)
  • 27. #pubcon In your quest…You may face some challenges along the way
  • 28. #pubcon Google Search Console is Where It’s At… Right?
  • 29. #pubcon We Can See Some Symptoms Here Though • SPEED ISSUES • ‘BOREDOM’ ISSUES • ROGUE CODE
  • 30. #pubcon We Can See Some Symptoms Here Too AFTER REMOVAL OF CANONICAL AT SCALE
  • 31. #pubcon Signs in Google Search Console Crawl Stats • Possible signs of ‘quality-impacted’ content • Near duplicates & duplicates • Not just speed… ‘boredom’ too • Major site changes or switches to https protocol • Signs of ‘Sampling’ visits for quality • Getting the best yield for crawling • Transitive nature • Slow sites
  • 32. #pubcon GOOGLE SEARCH CONSOLE IS NOT REALLY JUST ‘WEB PAGES’ • Includes ALL CSS, JS, Zip, XML, PDF, AMP, HTML files crawled • Pages are NOT just single webpages https://support.google.com/webmasters/answer/35253
  • 33. #pubcon VISITS BY ALL THE TYPES OF GOOGLEBOTS ARE RECORDED TOGETHER IN GSC Web, Image, News, Video, Feature Phone, Smartphone, Mobile Adsense, Adsense, Adsbot, App Crawler ALL The Googlebot Family
  • 34. #pubcon ONLY CONTAINS STATUS CODES BETWEEN 200s & 30X (ALL PROTOCOLS THOUGH)
  • 35. #pubcon Don’t Be Fooled By Those ‘Big Success’ Screen Grabs on Twitter CAN BE BOTH HTTP AND HTTPS OR SIMPLY A MAJOR IN-SITE REDIRECTION EXERCISE
  • 36. #pubcon Google Search Console is Like a Visit To The GP Before Referral VAGUE AT BEST
  • 37. #pubcon So…We Need To Dig Deeper… Be More Curious
  • 40. #pubcon REALITY – Server Logs & Log Analysis Is Where It’s At AUTOMATE SERVER LOG RETRIEVAL VIA CRON JOB grep Googlebot access_log >googlebot_access.txt
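For anyone who prefers a script to the `grep` one-liner on the slide above, here is a minimal Python sketch of the same filter. The function names and file paths are hypothetical, and the gzip handling is an added convenience for archived logs rather than something the slide prescribes.

```python
import gzip
from pathlib import Path


def open_log(path):
    """Open a plain or gzip-compressed log file as text."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")


def filter_googlebot(lines):
    """Yield only lines that mention Googlebot (case-sensitive, like plain grep)."""
    for line in lines:
        if "Googlebot" in line:
            yield line


def extract(src, dest):
    """Copy the Googlebot lines from src into dest (hypothetical paths)."""
    with open_log(src) as infile, open(dest, "w", encoding="utf-8") as outfile:
        outfile.writelines(filter_googlebot(infile))
```

A call such as `extract("access_log", "googlebot_access.txt")` could then be scheduled from cron in the same way as the shell command. Matching on the user agent string alone does not prove the visitor is really Googlebot.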
  • 41. #pubcon But… what are logs? Log files? Log file analysis… really?
  • 42. #pubcon Not just this… :) But kind of this :)
  • 43. #pubcon Not just this… :) But kind of this :) Logs are everywhere
  • 44. #pubcon So… What’s a log file? A document containing one or more logs Usually exported as a .txt file in ‘Common Log Format’ (W3C) from the server Common log format contains specific fields May be bundled in a tar or .gz file Can be exported from ‘raw access’ in the server
  • 45. #pubcon Elements of Server Log File (Common Log Format) • IP Address of bot • Date Accessed (and time) • Request type (GET, HEAD, POST) • URL Requested • Server Response Code Returned • HTTP Code • Bytes Served (file size) • User Agent
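The fields listed on the slide above can be pulled out of a raw line with a regular expression. This is a sketch under assumptions: it targets the widely used "combined" variant of the Common Log Format (the variant that appends referrer and user agent), the pattern and function names are illustrative, and the sample line is invented for the example rather than taken from a real server.

```python
import re

# One named group per field; referrer and user agent are optional so that
# plain Common Log Format lines (without them) still parse.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)(?: (?P<protocol>[^"]*))?" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)")?'
)


def parse_line(line):
    """Return a dict of log fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # A "-" in the size field means no body was sent.
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    return entry


# Invented sample line for illustration only.
sample = (
    '66.249.66.1 - - [10/Apr/2018:06:25:14 +0000] '
    '"GET /dresses?colour=red HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)
entry = parse_line(sample)
print(entry["url"], entry["status"], entry["bytes"])
```

Servers can be configured with custom log formats, so the pattern may need adjusting to match what a given host actually writes.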
  • 46. #pubcon Purposes & Types of logs on Servers 01 Error logs (database errors + server response codes, bad code warnings) 02 Suspicious activity (Security implications / monitoring) – DDOS, spammers, hackers 03 Visits to a file / page / Page URL requests (both human & bots)
  • 47. #pubcon Monitor Other Types of Logs Too Error logs provide great insight into where there may be issues with overloading the server etc. If errors (e.g. 500 codes) are sent frequently, Google will pull back and crawl less
  • 48. #pubcon A Better Explanation of Log File Analysis • Interrogation of data • Looking for patterns • Looking for anomalies • Looking for split messaging about URL importance to Google and finding ways to consolidate consistent signaling to single content fingerprints - STRENGTH
  • 49. #pubcon Recap - Simple Explanation • Logs – simply a notation or record of something • Log files – simply the document where the log is stored • Log file analysis – simply analysing and exploring the log files to identify areas where optimisation is possible or wastage occurs
  • 50. #pubcon Hunting for Googlebots? How will we know we’ve found them?
  • 51. #pubcon The Good, The Bad & The Ugly Good bots (polite, non-malicious, usually from respectable organisations, e.g. search engines & good tool providers) Bad bots (impolite, may be malicious: scraper bots, spammers)
  • 52. #pubcon ‘Politeness’ Crawling Rules Do NOT damage the server 01 Do NOT damage the server 02 Do NOT damage the server 03
  • 53. #pubcon There are ALSO many, many ‘POLITE’ bots • Yahoo Slurp • Bingbot • Other Search Engine Bots • 10 Types of Googlebot • SEO Tool Bots (On Page, Sistrix, Deepcrawl, etc)
  • 54. #pubcon Verifying it’s really Googlebot • Spoofing • Google don’t publish a list of IPs any longer (they change too frequently) • Need to verify by reverse DNS lookup on server using HOST to ensure visits are really from Googlebot and not spoof bots (Screaming frog)
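The slide above describes the check with the `host` command. The same reverse-then-forward DNS idea can be sketched in Python with the standard `socket` module, which is a swap-in for illustration and not the tool named on the slide. The function name and the injectable lookup parameters are hypothetical; the two hostname suffixes follow Google's published guidance for Googlebot, and `gethostbyname_ex` only covers IPv4.

```python
import socket

# Hostname suffixes expected for genuine Googlebot reverse DNS records.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip,
                          reverse_lookup=socket.gethostbyaddr,
                          forward_lookup=socket.gethostbyname_ex):
    """Reverse-resolve the IP, check the domain, then confirm it resolves back."""
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return False
    if not hostname.lower().endswith(GOOGLE_SUFFIXES):
        return False
    try:
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False
    # A spoofed reverse record will not resolve back to the same IP.
    return ip in addresses
```

The lookup functions are passed in as parameters so results can be cached or the check exercised without network access. Running it once per distinct IP, not once per log line, keeps DNS traffic low.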
  • 55. #pubcon Examples of Googlebot Organic Search Calling Cards (user agents) • Desktop - Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) • Smartphone - Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) • Googlebot-News • Googlebot-Image/1.0
  • 56. #pubcon So what’s the server log file analysis process? Gather Ship Analyse
  • 58. #pubcon Gathering logs straight from the server FIND ‘RAW ACCESS’
  • 59. #pubcon Gathering logs straight from the server These only represent a few hours’ worth of data / activity GOOD FOR A QUICK LOOK IF YOU SEE SOME PROBLEMS OCCURRING
  • 60. #pubcon Archived raw logs… is what you want • A log of everything on the server • All of the separate subdomains • All of the separate protocols • Zipped Up
  • 62. #pubcon Shipping logs to analytical tools • Docker • Naked Eye • Excel • Text File • Cloud based log analysis software • GREP • Command Line • Downloaded log analysis software Carrying ‘ALL’ the logs
  • 63. #pubcon Opening log files manually YOU’LL NEED A TEXT EDITOR OR EVEN BETTER AN IDE (INTEGRATED DEVELOPMENT ENVIRONMENT) e.g. NOTEPAD++ BRACKETS (on MAC) • Notepad++ • Notepad • Komodo • Aptana • Netbeans • Eclipse • Brackets (Mac)
  • 64. #pubcon They look something like this… SERVER LOGS EXAMPLE IN ‘BRACKETS’ IDE TEXT EDITOR
  • 65. #pubcon Move them all to Excel Select all (CTRL+A), copy, and paste into an Excel spreadsheet
  • 66. #pubcon Convert text to data in Excel Choose ‘DATA’ and convert text to columns. Delimit with ‘space’ (usually)
  • 67. #pubcon Filter by verified user agent FILTER USER AGENT on verified DNS lookup HOST http://google.com/bot.html
  • 68. #pubcon Many ‘scale’ tools for log file analysis
  • 70. #pubcon You cannot ‘emulate’ a Googlebot crawl It is not a ‘from start to finish’ crawl through a site
  • 71. #pubcon You may see strange URLs • Old .htaccess rules run in order of their placement in the .htaccess file • Old .htaccess rules on ‘migrated-from’ sites • Old XML sitemaps left on server • Old plugins removed but folders left • Un-optimized MySQL or other database (cluttered with legacy) • Spammers hitting your search queries & randomly spinning new links
  • 73. #pubcon Crawl prioritization & queuing is evident MANY OF THESE FILES WERE PREVIOUSLY CALLING THESE NOW REMOVED PLUGINS AND WERE QUEUED TO CRAWL TOGETHER
  • 74. #pubcon If you migrate or switch protocol… import & monitor all logs… they will chain
  • 75. #pubcon What followed the 301? FILTER ON BOT & RESPONSE CODE INCLUDING ONLY 301 AND 200 RESPONSE CODES TO FIND THE NEXT PART OF THE BOT JOURNEY (CONSIDER MULTIPLE CONNECTIONS)
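The filter described on the slide above can be approximated in code. This is a rough sketch, assuming log entries have already been parsed into dictionaries with `ip`, `url`, `status` and `user_agent` keys and are in chronological order; the function name is hypothetical. It pairs each 301 with the next 200 or 301 request from the same IP, which is only a heuristic because the bot may continue the journey later or from a different connection.

```python
def next_hops(entries, bot_token="Googlebot"):
    """Pair each bot 301 with the next 200/301 request from the same IP."""
    pending = {}  # ip -> URL of the most recent 301 not yet paired
    pairs = []
    for entry in entries:
        if bot_token not in entry["user_agent"]:
            continue  # keep only the bot being investigated
        if entry["status"] not in (200, 301):
            continue  # mirror the slide: only 301 and 200 responses
        ip = entry["ip"]
        if ip in pending:
            pairs.append((pending.pop(ip), entry["url"], entry["status"]))
        if entry["status"] == 301:
            pending[ip] = entry["url"]
    return pairs
```

A pair whose second status is also 301 suggests a redirect chain worth inspecting by hand.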
  • 76. #pubcon Filter on bot and 301 to identify bad crawl chains & problematic parameters SPOT THE ISSUES
  • 77. #pubcon Filter on bot and 304 code served 304 – ‘IF MODIFIED’ (NOTICE NOTHING HAS BEEN DOWNLOADED) HEAD CHECKED ONLY. NO ‘GET’ REQUEST
  • 78. #pubcon When consolidating… check 410 progress Filter on 410 and User Agent 410 ‘GONE’
  • 80. #pubcon Check The Split of Smartphone v Desktop Googlebot
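One way to compute the split on the slide above is to classify each user agent string and count the results. This sketch assumes the user agent patterns shown earlier in the deck (slide 55); the function names and the simple substring rules are illustrative, and real logs may contain variants these rules do not cover.

```python
from collections import Counter


def googlebot_type(user_agent):
    """Classify a user agent string using the patterns from slide 55."""
    if "Googlebot" not in user_agent:
        return None
    if "Googlebot-Image" in user_agent:
        return "image"
    if "Googlebot-News" in user_agent:
        return "news"
    if "Android" in user_agent and "Mobile" in user_agent:
        return "smartphone"
    return "desktop"


def split_by_type(user_agents):
    """Return each Googlebot type's share of all Googlebot requests."""
    counts = Counter(t for t in map(googlebot_type, user_agents) if t)
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()} if total else {}
```

Feeding it the user agent column from the logs gives the proportion of requests made by each type of Googlebot.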
  • 83. #pubcon Explore & drill down into issues
  • 84. #pubcon Discover anomalies / gaps between analytics & crawls from user-agent logs
  • 85. #pubcon Find & reconnect orphaned pages
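Orphan candidates can be found with a set comparison: URLs the bot requested in the logs that a crawl of the site's internal links never reached. A minimal sketch, assuming both inputs are lists of URLs already normalised to the same form (same protocol, host and trailing-slash handling); the function name is illustrative.

```python
def find_orphan_candidates(log_urls, crawled_urls):
    """Return URLs seen in the logs but not found by the site crawl."""
    return sorted(set(log_urls) - set(crawled_urls))

def find_uncrawled_by_bot(log_urls, crawled_urls):
    """Return URLs found by the site crawl that the bot never requested."""
    return sorted(set(crawled_urls) - set(log_urls))
```

The second function gives the reverse view: linked pages that do not appear in the bot's log activity for the period analysed.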
  • 86. #pubcon Identify the discrepancies & weak areas to prioritise
  • 87. #pubcon Do Regular Log Analysis & Crawling • Weekly or monthly crawls • Download logs or run them into the cloud automatically (RECOMMENDED) • Compare log file activity against crawls of the site • Compare crawls and log file activity against Google Analytics & GSC ‘active’ URLs
  • 88. #pubcon Closing words – Pressing ‘Recrawl now’ (April Fools) will not fix your content
  • 89. #pubcon But… Fixing your content might positively impact crawling
  • 91. #pubcon Appendix, References & Further Resources
  • 92. #pubcon Pros and cons of Excel CONS • Fiddly • Mostly Manual process • Limited capacity • Need to keep updating with more data PROS • Easy to set up • Suitable for small analysis • Simple to understand • Easy to filter & sort
  • 93. #pubcon Loggly CONS • Free version very limited • Not initially intuitive • Integrates with server • Integrates with log-shipper intermediaries like ‘Docker’ PROS • Option to upload files • Good graphical analysis • Can build nice reports • Cloud based software • Great dashboard
  • 94. #pubcon Splunk Light CONS • Free version limited • Based on usage • Can soon add up • Not easily configured in the cloud • UI not intuitive PROS • Downloadable version • Good for medium projects • You have control (on own machine) • Easy to pick out essentials • Can integrate with Deepcrawl
  • 95. #pubcon Screaming Frog Log Analyzer CONS • Need to increase RAM allowance almost always • Very limited free version • Again, has limits as it’s not cloud based PROS • Very easy to configure • Can compare log URIs with crawl data • Very similar to Excel • Some graphics to build reports • Overlay GA & GSC
  • 96. #pubcon Places to find bot footprints • On server analytics & visitor analytics screens – e.g. Awstats • Google search console • Server logs (raw access logs)