SlideShare a Scribd company logo
1 of 25
Download to read offline
LOG FILE ANALYSIS 
The most powerful tool in your SEO toolkit 
Tom Bennet 
Consultant, Builtvisible 
@tomcbennet
Getting Started
What is a log file? 
A record of all hits that a server has received – humans and robots. 
http://www.brightonseo.com/about/ 
1. Protocol 
2. Host name 
3. File name 
Host name -> IP Address via DNS -> Connection to Server -> 
HTTP Get Request via Protocol for File -> HTML to Browser
They’re not pretty…
…but they’re very powerful. 
188.65.114.122 - - [30/Sep/2013:08:07:05 -0400] "GET /resources/whitepapers/retail-whitepaper/ HTTP/1.1" 200 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; + http://www.google.com/bot.html)" 
Server IP 
Timestamp (date & time) 
Method (GET / POST) 
Request URI 
HTTP status code 
User-agent
Log Files & SEO
What is Crawl Budget? 
Crawl Budget = The number of URLs crawled on each visit to your site. 
Higher Authority = Higher Crawl Budget
Crawl Budget Utilisation 
http://example.com/thin-product-page-1 
http://example.com/category/thin-product-page-1 
http://example.com/category/subcategory/thin-product-page-1 
http://example.com/category/subcategory/thin-product-page-1?colour=blue 
Etc… 
Conservation of crawl budget is key.
Working With Logs
Preparing Your Data 
Extraction: Varies by server. See accompanying guide. 
Filter: By Googlebot user-agent, validate the IP range. https://support.google.com/webmasters/answer/80553?hl=en 
Tools: Gamut and Splunk are great, but you can’t beat Excel.
Working in Excel 
1. Convert .log to .csv 
(cool tip: just change the file extension)
Working in Excel 
2. Sample size 
(60-120k Googlebot requests / rows is a good size)
Working in Excel 
3. Text-to-columns 
(a space will usually be a suitable delimiter)
Working in Excel 
4. Create a table 
(Label your columns, sort by timestamp)
Investigate
Most vs Least Crawled 
Formula: Use COUNTIF on Request URL. 
Tip: Extract top-level category for crawl distribution by site-section. 
http://www.brightonseo.com/speakers/person-name/
Crawl Frequency Over Time 
Formula: Pivot date against count of requests. 
Tip: Segment by site section or by user-agent (G-bot Mobile, Images, Video, etc).
HTTP Response Codes 
Formula: Total up HTTP Response Codes. 
Tip: Find most common 302s or 404s, filter by code and sort by URL occurrence.
Level Up 
Robots.txt – Crawl all URLs with Screaming Frog to determine if they are blocked in robots.txt. Investigate most frequently crawled. 
Faceted Nav Issues – Dedupe a list of unique resources, sort by times requested. 
Sitemap – Add your sitemap URLs into an Excel table, VLOOKUP against your logs. Which mapped URLs are crawl deficient? 
CSS / JS – These resources should be crawlable, but are files unnecessary for render absorbing an inordinate amount of crawl budget?
Top Level Crawl Waste 
Formula: Use IF statements to check for every cause of waste.
Crime = Solved
All Brighton SEO attendees will receive the guide via email.
THANKS FOR LISTENING 
Get in touch 
e: tom@builtvisible.com 
t: @tomcbennet 
Tom Bennet 
Consultant, Builtvisible 
@tomcbennet

More Related Content

What's hot

How the E-A-T Ecosystem has Transformed Organic Search - Lily Ray - MozCon 2021
How the E-A-T Ecosystem has Transformed Organic Search - Lily Ray - MozCon 2021How the E-A-T Ecosystem has Transformed Organic Search - Lily Ray - MozCon 2021
How the E-A-T Ecosystem has Transformed Organic Search - Lily Ray - MozCon 2021Lily Ray
 
Introduction to SEO
Introduction to SEOIntroduction to SEO
Introduction to SEOKevin Jonas
 
Google Sheets + SEO = 15 tips en 15 minutos #VamosTalegon
Google Sheets + SEO = 15 tips en 15 minutos #VamosTalegonGoogle Sheets + SEO = 15 tips en 15 minutos #VamosTalegon
Google Sheets + SEO = 15 tips en 15 minutos #VamosTalegonAleyda Solís
 
Seo campaign strategy
Seo campaign strategySeo campaign strategy
Seo campaign strategyCanbayInc
 
Beth Barnham Schema Auditing BrightonSEO Slides.pptx
Beth Barnham Schema Auditing BrightonSEO Slides.pptxBeth Barnham Schema Auditing BrightonSEO Slides.pptx
Beth Barnham Schema Auditing BrightonSEO Slides.pptxBethBarnham1
 
SEO Automation Without Using Hard Code by Tevfik Mert Azizoglu - BrightonSEO ...
SEO Automation Without Using Hard Code by Tevfik Mert Azizoglu - BrightonSEO ...SEO Automation Without Using Hard Code by Tevfik Mert Azizoglu - BrightonSEO ...
SEO Automation Without Using Hard Code by Tevfik Mert Azizoglu - BrightonSEO ...Tevfik Mert Azizoglu
 
Improving Crawling and Indexing using Real-Time Log File Insights
Improving Crawling and Indexing using Real-Time Log File InsightsImproving Crawling and Indexing using Real-Time Log File Insights
Improving Crawling and Indexing using Real-Time Log File InsightsSteven van Vessum
 
[BrightonSEO 2022] Unlocking the Hidden Potential of Product Listing Pages
[BrightonSEO 2022] Unlocking the Hidden Potential of Product Listing Pages[BrightonSEO 2022] Unlocking the Hidden Potential of Product Listing Pages
[BrightonSEO 2022] Unlocking the Hidden Potential of Product Listing PagesAreej AbuAli
 
Goodbye SEO fck ups! Learn to set an SEO Quality Assurance Framework
Goodbye SEO fck ups! Learn to set an SEO Quality Assurance FrameworkGoodbye SEO fck ups! Learn to set an SEO Quality Assurance Framework
Goodbye SEO fck ups! Learn to set an SEO Quality Assurance FrameworkAleyda Solís
 
Seo proposal for tensator group
Seo proposal for tensator groupSeo proposal for tensator group
Seo proposal for tensator groupParixit Dwivedi
 
Intent-Based International Keyword Research - International Search Summit, Ba...
Intent-Based International Keyword Research - International Search Summit, Ba...Intent-Based International Keyword Research - International Search Summit, Ba...
Intent-Based International Keyword Research - International Search Summit, Ba...LazarinaStoyanova
 
Beth Barnham Schema Auditing BrightonSEO Slides.pptx
Beth Barnham Schema Auditing BrightonSEO Slides.pptxBeth Barnham Schema Auditing BrightonSEO Slides.pptx
Beth Barnham Schema Auditing BrightonSEO Slides.pptxBethBarnham1
 
[BrightonSEO October 2022] On-page SEO: from intention to conversion
[BrightonSEO October 2022] On-page SEO: from intention to conversion[BrightonSEO October 2022] On-page SEO: from intention to conversion
[BrightonSEO October 2022] On-page SEO: from intention to conversionFelipe Bazon
 
Kleecks - AI-Martech as a game changer-DEF.pdf
Kleecks - AI-Martech as a game changer-DEF.pdfKleecks - AI-Martech as a game changer-DEF.pdf
Kleecks - AI-Martech as a game changer-DEF.pdfKleecks
 
Discovering SEO Opportunities through Log Analysis #DTDConf
 Discovering SEO Opportunities through Log Analysis #DTDConf Discovering SEO Opportunities through Log Analysis #DTDConf
Discovering SEO Opportunities through Log Analysis #DTDConfAleyda Solís
 

What's hot (20)

How the E-A-T Ecosystem has Transformed Organic Search - Lily Ray - MozCon 2021
How the E-A-T Ecosystem has Transformed Organic Search - Lily Ray - MozCon 2021How the E-A-T Ecosystem has Transformed Organic Search - Lily Ray - MozCon 2021
How the E-A-T Ecosystem has Transformed Organic Search - Lily Ray - MozCon 2021
 
Introduction to SEO
Introduction to SEOIntroduction to SEO
Introduction to SEO
 
On Page SEO
On Page SEOOn Page SEO
On Page SEO
 
Log File Analysis
Log File AnalysisLog File Analysis
Log File Analysis
 
Google Sheets + SEO = 15 tips en 15 minutos #VamosTalegon
Google Sheets + SEO = 15 tips en 15 minutos #VamosTalegonGoogle Sheets + SEO = 15 tips en 15 minutos #VamosTalegon
Google Sheets + SEO = 15 tips en 15 minutos #VamosTalegon
 
Seo campaign strategy
Seo campaign strategySeo campaign strategy
Seo campaign strategy
 
Beth Barnham Schema Auditing BrightonSEO Slides.pptx
Beth Barnham Schema Auditing BrightonSEO Slides.pptxBeth Barnham Schema Auditing BrightonSEO Slides.pptx
Beth Barnham Schema Auditing BrightonSEO Slides.pptx
 
SEO Automation Without Using Hard Code by Tevfik Mert Azizoglu - BrightonSEO ...
SEO Automation Without Using Hard Code by Tevfik Mert Azizoglu - BrightonSEO ...SEO Automation Without Using Hard Code by Tevfik Mert Azizoglu - BrightonSEO ...
SEO Automation Without Using Hard Code by Tevfik Mert Azizoglu - BrightonSEO ...
 
Improving Crawling and Indexing using Real-Time Log File Insights
Improving Crawling and Indexing using Real-Time Log File InsightsImproving Crawling and Indexing using Real-Time Log File Insights
Improving Crawling and Indexing using Real-Time Log File Insights
 
Don't be a cannibal
Don't be a cannibalDon't be a cannibal
Don't be a cannibal
 
[BrightonSEO 2022] Unlocking the Hidden Potential of Product Listing Pages
[BrightonSEO 2022] Unlocking the Hidden Potential of Product Listing Pages[BrightonSEO 2022] Unlocking the Hidden Potential of Product Listing Pages
[BrightonSEO 2022] Unlocking the Hidden Potential of Product Listing Pages
 
Goodbye SEO fck ups! Learn to set an SEO Quality Assurance Framework
Goodbye SEO fck ups! Learn to set an SEO Quality Assurance FrameworkGoodbye SEO fck ups! Learn to set an SEO Quality Assurance Framework
Goodbye SEO fck ups! Learn to set an SEO Quality Assurance Framework
 
Seo proposal for tensator group
Seo proposal for tensator groupSeo proposal for tensator group
Seo proposal for tensator group
 
Seo for-content
Seo for-contentSeo for-content
Seo for-content
 
Intent-Based International Keyword Research - International Search Summit, Ba...
Intent-Based International Keyword Research - International Search Summit, Ba...Intent-Based International Keyword Research - International Search Summit, Ba...
Intent-Based International Keyword Research - International Search Summit, Ba...
 
SEO - a brief introduction
SEO - a brief introductionSEO - a brief introduction
SEO - a brief introduction
 
Beth Barnham Schema Auditing BrightonSEO Slides.pptx
Beth Barnham Schema Auditing BrightonSEO Slides.pptxBeth Barnham Schema Auditing BrightonSEO Slides.pptx
Beth Barnham Schema Auditing BrightonSEO Slides.pptx
 
[BrightonSEO October 2022] On-page SEO: from intention to conversion
[BrightonSEO October 2022] On-page SEO: from intention to conversion[BrightonSEO October 2022] On-page SEO: from intention to conversion
[BrightonSEO October 2022] On-page SEO: from intention to conversion
 
Kleecks - AI-Martech as a game changer-DEF.pdf
Kleecks - AI-Martech as a game changer-DEF.pdfKleecks - AI-Martech as a game changer-DEF.pdf
Kleecks - AI-Martech as a game changer-DEF.pdf
 
Discovering SEO Opportunities through Log Analysis #DTDConf
 Discovering SEO Opportunities through Log Analysis #DTDConf Discovering SEO Opportunities through Log Analysis #DTDConf
Discovering SEO Opportunities through Log Analysis #DTDConf
 

Similar to Analyze Log Files and Improve Your SEO with Excel

Jeremy cabral search marketing summit - scraping data-driven content (1)
Jeremy cabral   search marketing summit - scraping data-driven content (1)Jeremy cabral   search marketing summit - scraping data-driven content (1)
Jeremy cabral search marketing summit - scraping data-driven content (1)Jeremy Cabral
 
12 core technologies you should learn, love, and hate to be a 'real' technocrat
12 core technologies you should learn, love, and hate to be a 'real' technocrat12 core technologies you should learn, love, and hate to be a 'real' technocrat
12 core technologies you should learn, love, and hate to be a 'real' technocratlinoj
 
The Technical SEO Full Course how to do
The Technical SEO  Full Course  how to doThe Technical SEO  Full Course  how to do
The Technical SEO Full Course how to doasadkhan888889990
 
Future of Search Engine Factors, AMP, On-Page Key to Success
Future of Search Engine Factors, AMP, On-Page Key to SuccessFuture of Search Engine Factors, AMP, On-Page Key to Success
Future of Search Engine Factors, AMP, On-Page Key to SuccessAnetwork
 
Web Architectures - Web Technologies (1019888BNR)
Web Architectures - Web Technologies (1019888BNR)Web Architectures - Web Technologies (1019888BNR)
Web Architectures - Web Technologies (1019888BNR)Beat Signer
 
Introduction to Web Programming - first course
Introduction to Web Programming - first courseIntroduction to Web Programming - first course
Introduction to Web Programming - first courseVlad Posea
 
RESTful SOA - 中科院暑期讲座
RESTful SOA - 中科院暑期讲座RESTful SOA - 中科院暑期讲座
RESTful SOA - 中科院暑期讲座Li Yi
 
Software performance testing_overview
Software performance testing_overviewSoftware performance testing_overview
Software performance testing_overviewRohan Bhattarai
 
How To Web - Introduction To Data Mining For Web Applications
How To Web - Introduction To Data Mining For Web ApplicationsHow To Web - Introduction To Data Mining For Web Applications
How To Web - Introduction To Data Mining For Web ApplicationsWembrio
 
Restful web-services
Restful web-servicesRestful web-services
Restful web-servicesrporwal
 
Improving your team’s source code searching capabilities
Improving your team’s source code searching capabilitiesImproving your team’s source code searching capabilities
Improving your team’s source code searching capabilitiesNikos Katirtzis
 
Improving your team's source code searching capabilities - Voxxed Thessalonik...
Improving your team's source code searching capabilities - Voxxed Thessalonik...Improving your team's source code searching capabilities - Voxxed Thessalonik...
Improving your team's source code searching capabilities - Voxxed Thessalonik...Nikos Katirtzis
 
Lesson 6 web based attacks
Lesson 6 web based attacksLesson 6 web based attacks
Lesson 6 web based attacksFrank Victory
 

Similar to Analyze Log Files and Improve Your SEO with Excel (20)

White Hat Cloaking
White Hat CloakingWhite Hat Cloaking
White Hat Cloaking
 
OTG-Recon
OTG-ReconOTG-Recon
OTG-Recon
 
Jeremy cabral search marketing summit - scraping data-driven content (1)
Jeremy cabral   search marketing summit - scraping data-driven content (1)Jeremy cabral   search marketing summit - scraping data-driven content (1)
Jeremy cabral search marketing summit - scraping data-driven content (1)
 
12 core technologies you should learn, love, and hate to be a 'real' technocrat
12 core technologies you should learn, love, and hate to be a 'real' technocrat12 core technologies you should learn, love, and hate to be a 'real' technocrat
12 core technologies you should learn, love, and hate to be a 'real' technocrat
 
The Technical SEO Full Course how to do
The Technical SEO  Full Course  how to doThe Technical SEO  Full Course  how to do
The Technical SEO Full Course how to do
 
Future of Search Engine Factors, AMP, On-Page Key to Success
Future of Search Engine Factors, AMP, On-Page Key to SuccessFuture of Search Engine Factors, AMP, On-Page Key to Success
Future of Search Engine Factors, AMP, On-Page Key to Success
 
Web Architectures - Web Technologies (1019888BNR)
Web Architectures - Web Technologies (1019888BNR)Web Architectures - Web Technologies (1019888BNR)
Web Architectures - Web Technologies (1019888BNR)
 
Web hacking
Web hackingWeb hacking
Web hacking
 
ProjectHub
ProjectHubProjectHub
ProjectHub
 
Introduction to Web Programming - first course
Introduction to Web Programming - first courseIntroduction to Web Programming - first course
Introduction to Web Programming - first course
 
Fundamentals Of Search
Fundamentals Of SearchFundamentals Of Search
Fundamentals Of Search
 
RESTful SOA - 中科院暑期讲座
RESTful SOA - 中科院暑期讲座RESTful SOA - 中科院暑期讲座
RESTful SOA - 中科院暑期讲座
 
Software performance testing_overview
Software performance testing_overviewSoftware performance testing_overview
Software performance testing_overview
 
Apex REST
Apex RESTApex REST
Apex REST
 
internet workshop
internet workshopinternet workshop
internet workshop
 
How To Web - Introduction To Data Mining For Web Applications
How To Web - Introduction To Data Mining For Web ApplicationsHow To Web - Introduction To Data Mining For Web Applications
How To Web - Introduction To Data Mining For Web Applications
 
Restful web-services
Restful web-servicesRestful web-services
Restful web-services
 
Improving your team’s source code searching capabilities
Improving your team’s source code searching capabilitiesImproving your team’s source code searching capabilities
Improving your team’s source code searching capabilities
 
Improving your team's source code searching capabilities - Voxxed Thessalonik...
Improving your team's source code searching capabilities - Voxxed Thessalonik...Improving your team's source code searching capabilities - Voxxed Thessalonik...
Improving your team's source code searching capabilities - Voxxed Thessalonik...
 
Lesson 6 web based attacks
Lesson 6 web based attacksLesson 6 web based attacks
Lesson 6 web based attacks
 

Recently uploaded

Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 

Recently uploaded (20)

Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 

Analyze Log Files and Improve Your SEO with Excel

  • 1. LOG FILE ANALYSIS The most powerful tool in your SEO toolkit Tom Bennet Consultant, Builtvisible @tomcbennet
  • 2.
  • 4. What is a log file? A record of all hits that a server has received – humans and robots. http://www.brightonseo.com/about/ 1. Protocol 2. Host name 3. File name Host name -> IP Address via DNS -> Connection to Server -> HTTP Get Request via Protocol for File -> HTML to Browser
  • 6. …but they’re very powerful. 188.65.114.122 - - [30/Sep/2013:08:07:05 -0400] "GET /resources/whitepapers/retail-whitepaper/ HTTP/1.1" 200 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; + http://www.google.com/bot.html)" Server IP Timestamp (date & time) Method (GET / POST) Request URI HTTP status code User-agent
  • 8. What is Crawl Budget? Crawl Budget = The number of URLs crawled on each visit to your site. Higher Authority = Higher Crawl Budget
  • 9. Crawl Budget Utilisation http://example.com/thin-product-page-1 http://example.com/category/thin-product-page-1 http://example.com/category/subcategory/thin-product-page-1 http://example.com/category/subcategory/thin-product-page-1?colour=blue Etc… Conservation of crawl budget is key.
  • 11. Preparing Your Data Extraction: Varies by server. See accompanying guide. Filter: By Googlebot user-agent, validate the IP range. https://support.google.com/webmasters/answer/80553?hl=en Tools: Gamut and Splunk are great, but you can’t beat Excel.
  • 12. Working in Excel 1. Convert .log to .csv (cool tip: just change the file extension)
  • 13. Working in Excel 2. Sample size (60-120k Googlebot requests / rows is a good size)
  • 14. Working in Excel 3. Text-to-columns (a space will usually be a suitable delimiter)
  • 15. Working in Excel 4. Create a table (Label your columns, sort by timestamp)
  • 17. Most vs Least Crawled Formula: Use COUNTIF on Request URL. Tip: Extract top-level category for crawl distribution by site-section. http://www.brightonseo.com/speakers/person-name/
  • 18. Crawl Frequency Over Time Formula: Pivot date against count of requests. Tip: Segment by site section or by user-agent (G-bot Mobile, Images, Video, etc).
  • 19. HTTP Response Codes Formula: Total up HTTP Response Codes. Tip: Find most common 302s or 404s, filter by code and sort by URL occurrence.
  • 20.
  • 21. Level Up Robots.txt – Crawl all URLs with Screaming Frog to determine if they are blocked in robots.txt. Investigate most frequently crawled. Faceted Nav Issues – Dedupe a list of unique resources, sort by times requested. Sitemap – Add your sitemap URLs into an Excel table, VLOOKUP against your logs. Which mapped URLs are crawl deficient? CSS / JS – These resources should be crawlable, but are files unnecessary for render absorbing an inordinate amount of crawl budget?
  • 22. Top Level Crawl Waste Formula: Use IF statements to check for every cause of waste.
  • 24. All Brighton SEO attendees will receive the guide via email.
  • 25. THANKS FOR LISTENING Get in touch e: tom@builtvisible.com t: @tomcbennet Tom Bennet Consultant, Builtvisible @tomcbennet