Log Files
The Overlooked Source of SEO Opportunities
Robin Rozhon
Why should I care about how search
engines crawl my site?
@RobinRozhon
Rossignol Soul 7 HD Skis - Men's - MEC
https://www.mec.ca › All › Snowsports › Backcountry skiing › Gear › Backcountry skis
★★★★★ Rating: 4.1 - 3 reviews - $799.00 - In stock
Soul 7 HD Skis: A fat freeride ski that can serve as your entire quiver. Floaty in powder with
crud-busting power, it's lightweight and easy to turn in all conditions. Powder-turn rocker.
55% increase in revenue YOY
SEO crawlers don’t reveal real
search engine behavior.
What’s crawl budget?
Crawl rate limit & Crawl demand
Optimizing crawl budget for a 100-page
website doesn’t make sense.
The hardest thing?
Getting access to log files.
Breakdown of a log file
A log file is a recording of everything
that goes in and out of a server.
181.224.137.56 - - [28/Feb/2018:20:11:07 -0600] "POST /wp-cron.php?doing_wp_cron=1519870267.7953920364379882812500 HTTP/1.0" 200 - "https://rozhon.com/wp-cron.php?doing_wp_cron=1519870267.7953920364379882812500" "WordPress/4.9.4; https://rozhon.com"
2600:3c03::f03c:91ff:fe08:7a61 - - [28/Feb/2018:20:11:07 -0600] "GET /feed/ HTTP/1.0" 200 6212 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2; +feeder.co/crawler) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36"
66.249.65.104 - - [28/Feb/2018:20:11:29 -0600] "GET / HTTP/1.0" 200 49415 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
181.224.137.56 - - [28/Feb/2018:20:21:11 -0600] "POST /wp-cron.php?doing_wp_cron=1519870871.4666459560394287109375 HTTP/1.0" 200 - "https://rozhon.com/wp-cron.php?doing_wp_cron=1519870871.4666459560394287109375" "WordPress/4.9.4; https://rozhon.com"
2600:3c03::f03c:91ff:fe08:7a61 - - [28/Feb/2018:20:21:11 -0600] "GET /feed/ HTTP/1.0" 200 6212 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2; +feeder.co/crawler) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36"
148.64.56.79 - - [28/Feb/2018:20:30:18 -0600] "GET /robots.txt HTTP/1.0" 200 67 "-" "Mozilla/5.0 (compatible; GrapeshotCrawler/2.0; +http://www.grapeshot.co.uk/crawler.php)"
148.64.56.79 - - [28/Feb/2018:20:30:21 -0600] "GET /blog/crawling-indexing-technical-seo-basics-that-drive-revenue/ HTTP/1.0" 200 45892 "-" "Mozilla/5.0 (compatible; GrapeshotCrawler/2.0; +http://www.grapeshot.co.uk/crawler.php)"
181.224.137.56 - - [28/Feb/2018:20:31:14 -0600] "POST /wp-cron.php?doing_wp_cron=1519871474.0248880386352539062500 HTTP/1.0" 200 - "https://rozhon.com/wp-cron.php?doing_wp_cron=1519871474.0248880386352539062500" "WordPress/4.9.4; https://rozhon.com"
2600:3c03::f03c:91ff:fe08:7a61 - - [28/Feb/2018:20:31:13 -0600] "GET /feed/ HTTP/1.0" 200 6212 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2; +feeder.co/crawler) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36"
181.224.137.56 - - [28/Feb/2018:20:32:44 -0600] "POST /wp-cron.php?doing_wp_cron=1519871564.8240199089050292968750 HTTP/1.0" 200 - "https://rozhon.com/wp-cron.php?doing_wp_cron=1519871564.8240199089050292968750" "WordPress/4.9.4; https://rozhon.com"
8.29.198.26 - - [28/Feb/2018:20:32:44 -0600] "GET /feed/ HTTP/1.0" 200 6212 "-" "Feedly/1.0 (+http://www.feedly.com/fetcher.html; like FeedFetcher-Google)"
2601:600:997f:fcfe:1827:adb1:b306:c39a - - [28/Feb/2018:20:35:36 -0600] "GET /blog/crawling-indexing-technical-seo-basics-that-drive-revenue/?utm_source=moztop10&utm_medium=email&utm_campaign=moztop10&_hsenc=p2ANqtz-94WR530ovn2zygDfHyy9vLK8rL7eVPTdTPdFlXy9C4zZ9tTHGMZeHFzP-vHe-6mUdyxC-z1xYa2kZ1ZdWDkwE6sxt0MA&_hsmi=60846479 HTTP/1.0" 200 46022 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.167 Safari/537.36"
2601:600:997f:fcfe:1827:adb1:b306:c39a - - [28/Feb/2018:20:35:36 -0600] "GET /wp-content/plugins/yet-another-related-posts-plugin/style/widget.css?ver=ebb0b10aabc902f625c6c73e10147729 HTTP/1.0" 200 384 "https://rozhon.com/blog/crawling-indexing-technical-seo-basics-that-drive-revenue/?utm_source=moztop10&utm_medium=email&utm_campaign=moztop10&_hsenc=p2ANqtz-94WR530ovn2zygDfHyy9vLK8rL7eVPTdTPdFlXy9C4zZ9tTHGMZeHFzP-vHe-6mUdyxC-z1xYa2kZ1ZdWDkwE6sxt0MA&_hsmi=60846479" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.167 Safari/537.36"
2601:600:997f:fcfe:1827:adb1:b306:c39a - - [28/Feb/2018:20:35:36 -0600] "GET /wp-content/plugins/jquery-smooth-scroll/css/jss-style.css?ver=ebb0b10aabc902f625c6c73e10147729 HTTP/1.0" 200 372 "https://rozhon.com/blog/crawling-indexing-technical-seo-basics-that-drive-revenue/?utm_source=moztop10&utm_medium=email&utm_campaign=moztop10&_hsenc=p2ANqtz-94WR530ovn2zygDfHyy9vLK8rL7eVPTdTPdFlXy9C4zZ9tTHGMZeHFzP-vHe-6mUdyxC-z1xYa2kZ1ZdWDkwE6sxt0MA&_hsmi=60846479" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.167 Safari/537.36"
66.249.65.107 - - [08/Dec/2017:04:54:20 -0400] "GET /contact/ HTTP/1.1" 200 11179 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
IP Address (WHO): 66.249.65.107
Timestamp (WHEN): [08/Dec/2017:04:54:20 -0400]
Access Request (WHAT): "GET /contact/ HTTP/1.1"
Status Code (RESULT): 200
Bytes Transferred (SIZE): 11179
Referrer URL (SOURCE): "-"
User Agent (SIGNATURE): "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
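The fields labeled above can be pulled out of each line programmatically. The following is a minimal sketch in Python, assuming the server writes the common "combined" log layout seen in these samples; the names LOG_PATTERN and parse_line are illustrative and not from the talk.

```python
import re
from datetime import datetime

# Assumed layout: ip identity user [time] "request" status bytes "referrer" "user agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # A "-" in the size column means no byte count was logged.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    return fields


SAMPLE = (
    '66.249.65.107 - - [08/Dec/2017:04:54:20 -0400] "GET /contact/ HTTP/1.1" '
    '200 11179 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"'
)

if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

Servers can be configured with other field orders, so the pattern would need adjusting to match the actual log format in use.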
66.249.65.107 - - [08/Dec/2017:04:54:20 -0400] "GET /contact/ ...
Reverse DNS Lookup
66.249.65.107 >> crawl-66-249-65-107.googlebot.com
Forward DNS Lookup
crawl-66-249-65-107.googlebot.com >> 66.249.65.107
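The two lookups shown above (IP to hostname, then hostname back to the same IP) can be sketched in Python with the standard socket module. This is an illustration under assumptions: the function name is_verified_googlebot and the injectable lookup parameters are hypothetical, the default forward lookup handles IPv4 only, and the accepted suffixes (.googlebot.com from the slide, plus .google.com) should be checked against Google's current guidance.

```python
import socket

# Assumption: hostnames under these domains count as genuine Googlebot.
ALLOWED_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-resolve the IP, check the domain, then forward-resolve and compare."""
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex returns (hostname, aliases, ip_list); IPv4 only.
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]
    try:
        hostname = reverse_lookup(ip)
    except OSError:
        return False  # no reverse DNS record
    if not hostname.endswith(ALLOWED_SUFFIXES):
        return False  # user agent claimed Googlebot, DNS says otherwise
    try:
        return ip in forward_lookup(hostname)
    except OSError:
        return False


if __name__ == "__main__":
    # Performs real DNS queries, so the result depends on network access.
    print(is_verified_googlebot("66.249.65.107"))
```

The lookups are passed in as parameters so the check can be exercised without network access, and so results can be cached when the same IP appears thousands of times in a log.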
Which search engines crawl my website?
Google
Bing
Yandex
Baidu
Seznam
Which URLs are crawled most often?
/
/products/cycling
/products/snowsports
/product/intense-bike
/contact
Which content types are crawled most often?
homepage
pdp
plp
content
Which status codes are returned?
200
301
404
500
410
Which URLs return 404?
/en/brand/castelli+continental+crank-brothers+fizik+smith+oakley+spy_plus_/helmets/c/1221?sort=newest 188
/en/avalanche-beacons-and-transceivers/c/1112 183
/en/water-and-board-shorts/c/1666 104
/en/brand/castelli+crank-brothers+oakley+spy_plus_/helmets/helmet-covers-and-accessories/c/1223?sort=newest 102
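Questions like the ones above (which status codes are returned, which URLs return 404) reduce to counting over parsed log hits. A minimal sketch, assuming each hit has already been parsed into a (path, status, user agent) tuple; the HITS sample data and the function names are made up for illustration.

```python
from collections import Counter

# Hypothetical parsed hits: (path, status, user_agent)
HITS = [
    ("/", 200, "Googlebot/2.1"),
    ("/contact", 200, "Googlebot/2.1"),
    ("/en/water-and-board-shorts/c/1666", 404, "Googlebot/2.1"),
    ("/en/water-and-board-shorts/c/1666", 404, "bingbot/2.0"),
    ("/", 200, "bingbot/2.0"),
    ("/old-page", 301, "Googlebot/2.1"),
]


def status_counts(hits):
    """Number of hits per HTTP status code."""
    return Counter(status for _, status, _ in hits)


def top_paths(hits, status=None, n=10):
    """Most requested paths, optionally restricted to one status code."""
    paths = Counter(p for p, s, _ in hits if status is None or s == status)
    return paths.most_common(n)


if __name__ == "__main__":
    print(status_counts(HITS))
    print(top_paths(HITS, status=404))
```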
Desktop-first: desktop crawler 80%, smartphone crawler 20%
Mobile-first: smartphone crawler 80%, desktop crawler 20%
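The desktop versus smartphone split can be estimated from the user agent column. A rough sketch, assuming Googlebot's smartphone crawler includes "Mobile" in its user agent string while the desktop crawler does not; the function names are illustrative, the smartphone string in the usage example is modeled on the documented format with an assumed Chrome version, and other Googlebot variants (for example image crawlers) would need separate handling.

```python
from collections import Counter


def googlebot_type(user_agent):
    """Classify a user agent as Googlebot 'smartphone', 'desktop', or None."""
    if "Googlebot" not in user_agent:
        return None
    return "smartphone" if "Mobile" in user_agent else "desktop"


def crawler_share(user_agents):
    """Percentage of Googlebot hits per crawler type."""
    counts = Counter(t for t in map(googlebot_type, user_agents) if t)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: 100 * n / total for kind, n in counts.items()}


if __name__ == "__main__":
    desktop = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    smartphone = (
        "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 "
        "Mobile Safari/537.36 (compatible; Googlebot/2.1; "
        "+http://www.google.com/bot.html)"
    )
    print(crawler_share([desktop] * 4 + [smartphone]))
```

A shift in this ratio over time is one signal that a site has moved to mobile-first indexing.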
Tips
1) Segment data
2) Merge log files with other data sources
3) Use log files to debug Google Analytics
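As one illustration of merging log data with another source, the sketch below compares crawled paths against a list of paths from an XML sitemap. The choice of a sitemap as the second source is an assumption made for this example, and compare_with_sitemap is a hypothetical helper; the same set comparison would apply to a site crawl export or an analytics report.

```python
def compare_with_sitemap(sitemap_paths, crawled_paths):
    """Find paths search engines ignore and paths they crawl that are not listed."""
    sitemap = set(sitemap_paths)
    crawled = set(crawled_paths)
    return {
        "never_crawled": sorted(sitemap - crawled),
        "crawled_not_in_sitemap": sorted(crawled - sitemap),
    }


if __name__ == "__main__":
    sitemap = ["/", "/products/cycling", "/products/snowsports", "/contact"]
    crawled = ["/", "/contact", "/product/intense-bike", "/contact"]
    print(compare_with_sitemap(sitemap, crawled))
```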
Mountain Equipment Co-op – MEC – Shop climbing, cycling, running ...
https://www.mec.ca/en/
Canada's go-to place for outdoor gear, know-how and inspiration, MEC is a co-op owned
by the people who shop here. A lifetime membership is $5.
Mountain Equipment Co-op – MEC – Shop climbing, cycling, running ...
https://www.mec.ca/
Canada's go-to place for outdoor gear, know-how and inspiration, MEC is a co-op owned
by the people who shop here. A lifetime membership is $5.
Request: https://www.mec.ca/
Response: 302 (redirect)
Referrer: https://www.google.ca/
Request: https://www.mec.ca/en/
Response: 200 (success)
Referrer: https://www.mec.ca/
Direct Traffic
Request: https://www.mec.ca/en/
Response: 200 (success)
Referrer: https://www.google.ca/
Organic Traffic
Web Application Firewall
(WAF) disabled.
Request: https://www.mec.ca/
Response: 302 (redirect)
Referrer: https://www.google.ca/
Request: https://www.mec.ca/en/
Response: 200 (success)
Referrer: https://www.google.ca/
Organic Traffic
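The request sequences above hinge on which referrer reaches the final page. A simplified sketch of that attribution logic follows; it assumes a visit is labeled by the referrer of the page that returns 200, that a referrer on the site's own host is treated as direct (as in the slides), and that only Google hosts count as organic. The classify function and the "Referral" label are illustrative, and real analytics tools apply more rules than this.

```python
from urllib.parse import urlparse


def classify(referrer, site_host="www.mec.ca"):
    """Label a landing-page hit by its referrer, in the spirit of the slides."""
    host = urlparse(referrer).hostname or ""
    if not host or host == site_host:
        # No referrer, or the redirect replaced it with the site's own URL.
        return "Direct Traffic"
    if host.startswith("www.google."):
        return "Organic Traffic"
    return "Referral"


if __name__ == "__main__":
    # Referrer overwritten during the 302 redirect
    print(classify("https://www.mec.ca/"))
    # Referrer preserved through the redirect
    print(classify("https://www.google.ca/"))
```

Running the same check over logged referrers for the landing page shows how many search visits are being misattributed.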
Thank you!
@RobinRozhon
rozhon.com/statcitycrawl2018

Italy Agriculture Equipment Market Outlook to 2027
 
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
 
国外证书(Lincoln毕业证)新西兰林肯大学毕业证成绩单不能毕业办理
国外证书(Lincoln毕业证)新西兰林肯大学毕业证成绩单不能毕业办理国外证书(Lincoln毕业证)新西兰林肯大学毕业证成绩单不能毕业办理
国外证书(Lincoln毕业证)新西兰林肯大学毕业证成绩单不能毕业办理
 
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
 

Log files: The Overlooked Source of SEO Opportunities

  • 1. Log Files The Overlooked Source of SEO Opportunities Robin Rozhon
  • 2. Why should I care about how search engines crawl my site? @RobinRozhon
  • 3. Rossignol Soul 7 HD Skis - Men's - MEC https://www.mec.ca › All › Snowsports › Backcountry skiing › Gear › Backcountry skis ★★★★★ Rating: 4.1 - 3 reviews - $799.00 - In stock Soul 7 HD Skis: A fat freeride ski that can serve as your entire quiver. Floaty in powder with crud-busting power, it's lightweight and easy to turn in all conditions. Powder-turn rocker.
  • 5. SEO crawlers don’t reveal how search engines really behave.
  • 7. What’s crawl budget? Crawl rate limit & Crawl demand
  • 8. Optimizing crawl budget for a 100-page website doesn’t make sense.
  • 10. The hardest thing? Getting access to log files.
  • 12. Breakdown of a log file
  • 13. A log file is a recording of everything that goes in and out of a server.
  • 14. 181.224.137.56 - - [28/Feb/2018:20:11:07 -0600] "POST /wp-cron.php?doing_wp_cron=1519870267.7953920364379882812500 HTTP/1.0" 200 - "https://rozhon.com/wp-cron.php?doing_wp_cron=1519870267.7953920364379882812500" "WordPress/4.9.4; https://rozhon.com"
      2600:3c03::f03c:91ff:fe08:7a61 - - [28/Feb/2018:20:11:07 -0600] "GET /feed/ HTTP/1.0" 200 6212 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2; +feeder.co/crawler) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36"
      66.249.65.104 - - [28/Feb/2018:20:11:29 -0600] "GET / HTTP/1.0" 200 49415 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
      181.224.137.56 - - [28/Feb/2018:20:21:11 -0600] "POST /wp-cron.php?doing_wp_cron=1519870871.4666459560394287109375 HTTP/1.0" 200 - "https://rozhon.com/wp-cron.php?doing_wp_cron=1519870871.4666459560394287109375" "WordPress/4.9.4; https://rozhon.com"
      2600:3c03::f03c:91ff:fe08:7a61 - - [28/Feb/2018:20:21:11 -0600] "GET /feed/ HTTP/1.0" 200 6212 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2; +feeder.co/crawler) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36"
      148.64.56.79 - - [28/Feb/2018:20:30:18 -0600] "GET /robots.txt HTTP/1.0" 200 67 "-" "Mozilla/5.0 (compatible; GrapeshotCrawler/2.0; +http://www.grapeshot.co.uk/crawler.php)"
      148.64.56.79 - - [28/Feb/2018:20:30:21 -0600] "GET /blog/crawling-indexing-technical-seo-basics-that-drive-revenue/ HTTP/1.0" 200 45892 "-" "Mozilla/5.0 (compatible; GrapeshotCrawler/2.0; +http://www.grapeshot.co.uk/crawler.php)"
      181.224.137.56 - - [28/Feb/2018:20:31:14 -0600] "POST /wp-cron.php?doing_wp_cron=1519871474.0248880386352539062500 HTTP/1.0" 200 - "https://rozhon.com/wp-cron.php?doing_wp_cron=1519871474.0248880386352539062500" "WordPress/4.9.4; https://rozhon.com"
      2600:3c03::f03c:91ff:fe08:7a61 - - [28/Feb/2018:20:31:13 -0600] "GET /feed/ HTTP/1.0" 200 6212 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2; +feeder.co/crawler) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36"
      181.224.137.56 - - [28/Feb/2018:20:32:44 -0600] "POST /wp-cron.php?doing_wp_cron=1519871564.8240199089050292968750 HTTP/1.0" 200 - "https://rozhon.com/wp-cron.php?doing_wp_cron=1519871564.8240199089050292968750" "WordPress/4.9.4; https://rozhon.com"
      8.29.198.26 - - [28/Feb/2018:20:32:44 -0600] "GET /feed/ HTTP/1.0" 200 6212 "-" "Feedly/1.0 (+http://www.feedly.com/fetcher.html; like FeedFetcher-Google)"
      2601:600:997f:fcfe:1827:adb1:b306:c39a - - [28/Feb/2018:20:35:36 -0600] "GET /blog/crawling-indexing-technical-seo-basics-that-drive-revenue/?utm_source=moztop10&utm_medium=email&utm_campaign=moztop10&_hsenc=p2ANqtz-94WR530ovn2zygDfHyy9vLK8rL7eVPTdTPdFlXy9C4zZ9tTHGMZeHFzP-vHe-6mUdyxC-z1xYa2kZ1ZdWDkwE6sxt0MA&_hsmi=60846479 HTTP/1.0" 200 46022 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.167 Safari/537.36"
      2601:600:997f:fcfe:1827:adb1:b306:c39a - - [28/Feb/2018:20:35:36 -0600] "GET /wp-content/plugins/yet-another-related-posts-plugin/style/widget.css?ver=ebb0b10aabc902f625c6c73e10147729 HTTP/1.0" 200 384 "https://rozhon.com/blog/crawling-indexing-technical-seo-basics-that-drive-revenue/?utm_source=moztop10&utm_medium=email&utm_campaign=moztop10&_hsenc=p2ANqtz-94WR530ovn2zygDfHyy9vLK8rL7eVPTdTPdFlXy9C4zZ9tTHGMZeHFzP-vHe-6mUdyxC-z1xYa2kZ1ZdWDkwE6sxt0MA&_hsmi=60846479" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.167 Safari/537.36"
      2601:600:997f:fcfe:1827:adb1:b306:c39a - - [28/Feb/2018:20:35:36 -0600] "GET /wp-content/plugins/jquery-smooth-scroll/css/jss-style.css?ver=ebb0b10aabc902f625c6c73e10147729 HTTP/1.0" 200 372 "https://rozhon.com/blog/crawling-indexing-technical-seo-basics-that-drive-revenue/?utm_source=moztop10&utm_medium=email&utm_campaign=moztop10&_hsenc=p2ANqtz-94WR530ovn2zygDfHyy9vLK8rL7eVPTdTPdFlXy9C4zZ9tTHGMZeHFzP-vHe-6mUdyxC-z1xYa2kZ1ZdWDkwE6sxt0MA&_hsmi=60846479" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.167 Safari/537.36"
  • 15. 66.249.65.107 - - [08/Dec/2017:04:54:20 -0400] "GET /contact/ HTTP/1.1" 200 11179 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
  • 16. IP Address (WHO): 66.249.65.107
  • 17. Timestamp (WHEN): [08/Dec/2017:04:54:20 -0400]
  • 18. Access Request (WHAT): "GET /contact/ HTTP/1.1"
  • 19. Status Code (RESULT): 200
  • 20. Bytes Transferred (SIZE): 11179
  • 21. Referrer URL (SOURCE): "-"
  • 22. User Agent (SIGNATURE): "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
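The field labels on the slides above map onto a single regular expression. What follows is a minimal sketch, assuming the server writes the common "combined" access log layout seen in the sample line; the names `LOG_PATTERN`, `parse_line`, and `SAMPLE` are illustrative and do not come from the deck.

```python
import re
from datetime import datetime

# One named group per field labeled on the slides (WHO, WHEN, WHAT, ...).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Example timestamp: 08/Dec/2017:04:54:20 -0400
    entry["timestamp"] = datetime.strptime(entry["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A "-" size means no body bytes were logged.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


# The sample line from the slides.
SAMPLE = (
    '66.249.65.107 - - [08/Dec/2017:04:54:20 -0400] "GET /contact/ HTTP/1.1" '
    '200 11179 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"'
)

entry = parse_line(SAMPLE)
```

Lines that do not fit the pattern return `None` rather than raising, since real log files tend to contain a few malformed entries.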
  • 24. 66.249.65.107 - - [08/Dec/2017:04:54:20 -0400] "GET /contact/ ...
  • 25. Reverse DNS Lookup: 66.249.65.107 >> crawl-66-249-65-107.googlebot.com
  • 26. Forward DNS Lookup
  • 27. Forward DNS Lookup: crawl-66-249-65-107.googlebot.com >> 66.249.65.107
  • 28. Reverse DNS Lookup: 66.249.65.107 >> crawl-66-249-65-107.googlebot.com; Forward DNS Lookup: crawl-66-249-65-107.googlebot.com >> 66.249.65.107
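The two-step check on these slides (IP to hostname, then hostname back to the same IP) can be sketched as below. This is an illustration, not the author's tooling: the function name and the injectable `reverse_lookup` / `forward_lookup` parameters are hypothetical, the accepted hostname suffixes are an assumption based on the `googlebot.com` hostname shown on the slide, and the default forward lookup only resolves IPv4 addresses.

```python
import socket

# Assumed suffixes for genuine Googlebot hostnames (the slide shows googlebot.com).
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-resolve the IP, check the domain, then forward-resolve to confirm."""
    if reverse_lookup is None:
        # gethostbyaddr returns (hostname, aliases, addresses).
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex returns (hostname, aliases, IPv4 addresses).
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]
    try:
        hostname = reverse_lookup(ip)
        if not hostname.endswith(GOOGLE_SUFFIXES):
            return False
        # The hostname must resolve back to the IP that made the request.
        return ip in forward_lookup(hostname)
    except OSError:
        # DNS failures (socket.herror, socket.gaierror) count as unverified.
        return False
```

Calling `is_verified_googlebot("66.249.65.107")` performs real DNS queries; passing stub lookups makes the function usable offline, for example against a cached table of already resolved IPs.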
  • 29. Which search engines crawl my website? Google, Bing, Yandex, Baidu, Seznam
  • 30. Which URLs are crawled most often? /, /products/cycling, /products/snowsports, /product/intense-bike, /contact
  • 31. Which content types are crawled most often? homepage, pdp, plp, content
  • 32. Which status codes are returned? 200, 301, 404, 500, 410
  • 34. Which URLs return 404?
      /en/brand/castelli+continental+crank-brothers+fizik+smith+oakley+spy_plus_/helmets/c/1221?sort=newest 188
      /en/avalanche-beacons-and-transceivers/c/1112 183
      /en/water-and-board-shorts/c/1666 104
      /en/brand/castelli+crank-brothers+oakley+spy_plus_/helmets/helmet-covers-and-accessories/c/1223?sort=newest 102
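Questions like "which status codes are returned?" and "which URLs return 404?" reduce to counting over parsed log lines. The sketch below assumes the same access log layout as the sample line earlier in the deck; `summarize`, `REQUEST_RE`, and the lines in `SAMPLE_LINES` are hypothetical and made up for illustration (only the URL paths are borrowed from the slides), so the counts say nothing about the real site.

```python
import re
from collections import Counter

# Captures the requested URL and the status code that follows the request string.
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (?P<url>\S+) [^"]*" (?P<status>\d{3}) ')


def summarize(lines, bot_token="Googlebot"):
    """Count status codes and 404 URLs for requests whose line mentions bot_token."""
    status_counts = Counter()
    not_found = Counter()
    for line in lines:
        if bot_token not in line:
            continue  # skip browsers and other crawlers
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # skip malformed lines
        status_counts[match["status"]] += 1
        if match["status"] == "404":
            not_found[match["url"]] += 1
    return status_counts, not_found


GOOGLEBOT_UA = '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'

# Hypothetical lines, invented for this example.
SAMPLE_LINES = [
    '66.249.65.107 - - [08/Dec/2017:04:54:20 -0400] "GET /contact/ HTTP/1.1" 200 11179 "-" ' + GOOGLEBOT_UA,
    '66.249.65.107 - - [08/Dec/2017:04:55:01 -0400] "GET /en/water-and-board-shorts/c/1666 HTTP/1.1" 404 512 "-" ' + GOOGLEBOT_UA,
    '66.249.65.107 - - [08/Dec/2017:04:56:40 -0400] "GET /en/water-and-board-shorts/c/1666 HTTP/1.1" 404 512 "-" ' + GOOGLEBOT_UA,
    '203.0.113.9 - - [08/Dec/2017:04:57:02 -0400] "GET /missing HTTP/1.1" 404 512 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

status_counts, not_found = summarize(SAMPLE_LINES)
```

`not_found.most_common(10)` then gives a ranked list in the same shape as the 404 slide. Matching on the user agent string alone trusts whatever the client claims to be, so in practice this filter would be combined with the DNS verification shown earlier.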
  • 37. Tips
  • 39. 2) Merge log files with other data sources
  • 40. 3) Use log files to debug Google Analytics
  • 41. Mountain Equipment Co-op – MEC – Shop climbing, cycling, running ... https://www.mec.ca/en/ Canada's go-to place for outdoor gear, know-how and inspiration, MEC is a co-op owned by the people who shop here. A lifetime membership is $5.
  • 42. Mountain Equipment Co-op – MEC – Shop climbing, cycling, running ... https://www.mec.ca/ Canada's go-to place for outdoor gear, know-how and inspiration, MEC is a co-op owned by the people who shop here. A lifetime membership is $5.
  • 45. Request: https://www.mec.ca/en/, Response: 200 (success), Referrer: https://www.google.ca/
  • 46. Request: https://www.mec.ca/en/, Response: 200 (success), Referrer: https://www.google.ca/ (Organic Traffic)
  • 47. Request: https://www.mec.ca/, Response: 302 (redirect), Referrer: https://www.google.ca/
      Request: https://www.mec.ca/en/, Response: 200 (success), Referrer: https://www.google.ca/ (Organic Traffic)
  • 48. Request: https://www.mec.ca/, Response: 302 (redirect), Referrer: https://www.google.ca/
      Request: https://www.mec.ca/en/, Response: 200 (success), Referrer: https://www.mec.ca/
      Request: https://www.mec.ca/en/, Response: 200 (success), Referrer: https://www.google.ca/ (Organic Traffic)
  • 49. Request: https://www.mec.ca/, Response: 302 (redirect), Referrer: https://www.google.ca/
      Request: https://www.mec.ca/en/, Response: 200 (success), Referrer: https://www.mec.ca/ (Direct Traffic)
      Request: https://www.mec.ca/en/, Response: 200 (success), Referrer: https://www.google.ca/ (Organic Traffic)
  • 50. Web Application Firewall (WAF) disabled.
  • 51. Request: https://www.mec.ca/, Response: 302 (redirect), Referrer: https://www.google.ca/
      Request: https://www.mec.ca/, Response: 302 (redirect), Referrer: https://www.google.ca/
      Request: https://www.mec.ca/en/, Response: 200 (success), Referrer: https://www.mec.ca/ (Direct Traffic)
      Request: https://www.mec.ca/en/, Response: 200 (success), Referrer: https://www.google.ca/ (Organic Traffic)
  • 52. Request: https://www.mec.ca/, Response: 302 (redirect), Referrer: https://www.google.ca/
      Request: https://www.mec.ca/en/, Response: 200 (success), Referrer: https://www.google.ca/
      Request: https://www.mec.ca/, Response: 302 (redirect), Referrer: https://www.google.ca/
      Request: https://www.mec.ca/en/, Response: 200 (success), Referrer: https://www.mec.ca/ (Direct Traffic)
      Request: https://www.mec.ca/en/, Response: 200 (success), Referrer: https://www.google.ca/ (Organic Traffic)
  • 53. Request: https://www.mec.ca/, Response: 302 (redirect), Referrer: https://www.google.ca/
      Request: https://www.mec.ca/en/, Response: 200 (success), Referrer: https://www.google.ca/ (Organic Traffic)
      Request: https://www.mec.ca/, Response: 302 (redirect), Referrer: https://www.google.ca/
      Request: https://www.mec.ca/en/, Response: 200 (success), Referrer: https://www.mec.ca/ (Direct Traffic)
      Request: https://www.mec.ca/en/, Response: 200 (success), Referrer: https://www.google.ca/ (Organic Traffic)
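The pattern on these slides, a redirect whose follow-up request arrives with the site's own URL as referrer, can be searched for in request data. Below is a rough sketch under several assumptions: requests are already parsed into dicts with `url`, `status`, and `referrer` keys, consecutive entries belong to the same visitor, and the traffic labels simply mirror the ones on the slides. The function names and the `www.google.` prefix check are illustrative.

```python
from urllib.parse import urlparse


def classify_landing(referrer, site_host="www.mec.ca"):
    """Label a landing request the way the slides do, based only on its referrer."""
    host = urlparse(referrer).hostname or ""
    if host == "" or host == site_host:
        return "Direct Traffic"
    if host.startswith("www.google."):
        return "Organic Traffic"
    return "Referral Traffic"


def lost_referrers(requests, site_host="www.mec.ca"):
    """Return (redirect URL, landing URL) pairs where a redirect replaced an external referrer."""
    flagged = []
    for prev, cur in zip(requests, requests[1:]):
        prev_is_redirect = 300 <= prev["status"] < 400
        prev_host = urlparse(prev["referrer"]).hostname
        cur_host = urlparse(cur["referrer"]).hostname
        # External referrer on the redirect, own site as referrer on the next request.
        if prev_is_redirect and prev_host not in (None, site_host) and cur_host == site_host:
            flagged.append((prev["url"], cur["url"]))
    return flagged


# The request sequences shown on the slides, before and after the WAF was disabled.
with_waf = [
    {"url": "https://www.mec.ca/", "status": 302, "referrer": "https://www.google.ca/"},
    {"url": "https://www.mec.ca/en/", "status": 200, "referrer": "https://www.mec.ca/"},
]
without_waf = [
    {"url": "https://www.mec.ca/", "status": 302, "referrer": "https://www.google.ca/"},
    {"url": "https://www.mec.ca/en/", "status": 200, "referrer": "https://www.google.ca/"},
]
```

Running `lost_referrers` over each sequence flags only the first one, which matches the slides: the visit is labeled Direct Traffic while the referrer is rewritten and Organic Traffic once it survives the redirect.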