
Advanced Technical SEO: A Sneak Peek at the Bing Crawler By Frederic Dubut

From the #SMX Advanced Conference in Seattle, Washington, June 11-13, 2018. SESSION: Advanced Technical SEO: Page Speed, Site Migrations, Crawling. PRESENTATION: Advanced Technical SEO: A Sneak Peek at the Bing Crawler, given by Frederic Dubut (@CoperniX), Senior Program Manager, Bing.

Published in: Marketing

  1. #SMX #13A @CoperniX The truth behind crawl budget and site migrations. Advanced Technical SEO: A Sneak Peek at the Bing Crawler. Frédéric Dubut
  2. The Search Engine Crawler. What Google says I look like. How SEOs see me. What I really look like.
  3. Service Unavailable. HTTP Error 503. The service is unavailable. Building a crawler is easy…
  4. Crawl Manager. …making it polite is harder!
  5. Crawl budget is how much the crawler thinks it can crawl without hurting your site performance.
  6. Each bottleneck has its own crawl budget. Server: contoso.com, www.contoso.com, blog.contoso.com, www.fabrikam.com, www.proseware.com. Assigned IPs: 20.190.133.0/28, 40.78.208.32/30. To be crawled, a URL must fit in all the applicable budgets.
  7. Determining crawl budget is an iterative process. [Diagram: Crawl Queue; HTTP Status (2xx, 3xx, 4xx, 5xx) + Connection Errors + Download Time + Content Size + Other Signals; Demand vs. Budget; Increase Budget / Decrease Budget]
  8. When crawl budget meets crawl demand. [Chart: small websites vs. large websites, great SEO vs. poor SEO, on a scale from easier to harder; each of the four cases compares Demand vs. Budget]
  9. Rule of thumb: your crawl budget should allow the crawler to recrawl your entire site in about two weeks. YMMV: publishing schedule, update frequency, exceptional events (e.g. site migration), etc.
  10. Share these #SMXInsights on Technical SEO: Crawling. § Crawl budget is how much the crawler thinks it can crawl without hurting your site performance. § Each bottleneck has its own crawl budget – monitor budget for each property, domain, IP (or IP range).
  12. Scale your infrastructure in number of users and content.
  13. Freeing up server resources to increase crawl budget: reduce resource consumption; eliminate waste; performance; security.
  14. How to waste your crawl budget? Duplicate content; no sitemap; heavy dynamic rendering; no redirects; no RSS feed; many secondary resources (JS, CSS…); long redirect chains; no “lastmod” in sitemap; mobile “m.” URLs; no canonical tags; many useless or junk URLs; many useless URL parameters.
  15. HTTP 301 redirect step by step. Index table states, in slide order (URL | Status | Content | Target | Signals):
      State 1:
        http://www.contoso.com/ | Indexed | <html>… | N/A | Score=1000
      State 2:
        http://www.contoso.com/ | Indexed | <html>… | https://www.contoso.com/ | Score=1000
        https://www.contoso.com/ | Discovered | N/A | N/A | N/A
      State 3:
        http://www.contoso.com/ | Indexed | <html>… | https://www.contoso.com/ | Score=1000
        https://www.contoso.com/ | Indexed | <html>… | N/A | N/A
      State 4:
        http://www.contoso.com/ | Redirect | N/A | https://www.contoso.com/ | Score=1000
        https://www.contoso.com/ | Indexed | <html>… | N/A | N/A
      State 5:
        http://www.contoso.com/ | Redirect | N/A | https://www.contoso.com/ | N/A
        https://www.contoso.com/ | Indexed | <html>… | N/A | Score=1000
      Crawl queue before the 301: http://www.contoso.com/, http://blog.contoso.com/, http://www.contoso.com/about.php, …
      Crawl queue after the 301 and after the 200: http://www.contoso.com/, http://blog.contoso.com/, https://www.contoso.com/, http://www.contoso.com/about.php
  16. If both pages return HTTP 200… (URL | Status | Content | Target | Signals):
        http://www.contoso.com/ | Indexed | <html>… | N/A | Score=1000
        https://www.contoso.com/ | Indexed | <html>… | N/A | Score=100
      Crawl queue: http://www.contoso.com/, http://blog.contoso.com/, https://www.contoso.com/, http://www.contoso.com/about.php
  17. If the old page is blocked in robots.txt… (URL | Status | Content | Target | Signals):
        http://www.contoso.com/ | Indexed | N/A | N/A | Score=100
        https://www.contoso.com/ | Indexed | <html>… | N/A | Score=100
      Crawl queue: http://www.contoso.com/, http://blog.contoso.com/, https://www.contoso.com/, http://www.contoso.com/about.php
  18. Share these #SMXInsights on Technical SEO: Site Migrations. § Scale your infrastructure in both number of users and content. § Performance and security work also positively impact crawl budget. § HTTP 301 redirects are the gold standard of site migrations.
  19. Learn more: upcoming @SMX events. Thank you! See you at the next #SMX.
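
Slide 6 says that a URL must fit in all the applicable budgets to be crawled. A minimal Python sketch of that check follows, assuming one budget per host, per domain, and per IP range. The `Budget` class, the function names, and the budget numbers are hypothetical illustrations and are not taken from the talk; only the host names and IP ranges come from the slide.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Budget:
    limit: int      # fetches allowed in the current window (hypothetical unit)
    used: int = 0

    def has_room(self) -> bool:
        return self.used < self.limit


def applicable_budgets(host, ip, host_budgets, domain_budgets, range_budgets):
    """Collect every budget that applies to a URL's host and server IP."""
    found = []
    if host in host_budgets:
        found.append(host_budgets[host])
    for domain, budget in domain_budgets.items():
        if host == domain or host.endswith("." + domain):
            found.append(budget)
    addr = ipaddress.ip_address(ip)
    for cidr, budget in range_budgets.items():
        if addr in ipaddress.ip_network(cidr):
            found.append(budget)
    return found


def try_crawl(host, ip, host_budgets, domain_budgets, range_budgets):
    """Spend one fetch only if every applicable budget still has room."""
    budgets = applicable_budgets(host, ip, host_budgets, domain_budgets, range_budgets)
    if not all(b.has_room() for b in budgets):
        return False
    for b in budgets:
        b.used += 1
    return True


# Example values: the limits are made up for illustration.
host_budgets = {"www.contoso.com": Budget(5), "blog.contoso.com": Budget(5)}
domain_budgets = {"contoso.com": Budget(2)}
range_budgets = {"20.190.133.0/28": Budget(10)}

first = try_crawl("www.contoso.com", "20.190.133.4", host_budgets, domain_budgets, range_budgets)
second = try_crawl("blog.contoso.com", "20.190.133.5", host_budgets, domain_budgets, range_budgets)
third = try_crawl("www.contoso.com", "20.190.133.4", host_budgets, domain_budgets, range_budgets)
```

In this example the third fetch is refused even though the host and IP-range budgets have room, because the shared domain budget is already spent. That is the "tightest bottleneck wins" behavior the slide describes.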

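Slide 7 describes crawl budget as the result of an iterative loop driven by HTTP status codes, connection errors, download time, and other signals, and slide 9 gives a two-week recrawl rule of thumb. The talk does not state a formula, so the sketch below is only one plausible shape for such a loop: the thresholds, the halving step, and the 10% increase are assumptions chosen for illustration, and both function names are hypothetical.

```python
import math


def adjust_budget(budget, demand, statuses, connection_errors=0,
                  avg_download_ms=0.0, max_failure_rate=0.05, slow_ms=2000.0):
    """Return the next budget after one crawl round (assumed rule, not Bing's)."""
    attempts = len(statuses) + connection_errors
    if attempts == 0:
        return budget
    # Treat 5xx responses and connection errors as signs of server strain.
    failures = connection_errors + sum(1 for s in statuses if 500 <= s <= 599)
    if failures / attempts > max_failure_rate or avg_download_ms > slow_ms:
        return max(1, budget // 2)            # back off quickly
    if demand > budget:
        return budget + max(1, budget // 10)  # grow slowly while healthy
    return budget                             # demand is already covered


def daily_fetches_for_full_recrawl(total_urls, days=14):
    """Fetches per day needed to revisit every URL within `days` days."""
    return math.ceil(total_urls / days)
```

For example, under these assumptions a site with 140,000 URLs would need a budget of about 10,000 fetches per day to meet the two-week rule of thumb.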
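
The index states on slide 15 can be replayed as a small simulation. The sketch below is a reading of the slide's tables, not Bing's implementation: the `Record` class and the four step functions are hypothetical names, and the crawl queue's ordering policy is not modeled (new URLs are simply appended).

```python
from dataclasses import dataclass
from typing import Optional

OLD = "http://www.contoso.com/"
NEW = "https://www.contoso.com/"


@dataclass
class Record:
    status: str
    content: Optional[str] = None   # None stands for N/A on the slide
    target: Optional[str] = None
    score: Optional[int] = None


def on_301(index, queue, url, location):
    """State 2: remember the redirect target and discover the new URL."""
    index[url].target = location
    if location not in index:
        index[location] = Record(status="Discovered")
        queue.append(location)


def on_200(index, url, html):
    """State 3: the new URL is fetched and indexed with its content."""
    index[url].status = "Indexed"
    index[url].content = html


def mark_redirect(index, url):
    """State 4: once the target is indexed, the old URL becomes a redirect."""
    old = index[url]
    if old.target and index[old.target].status == "Indexed":
        old.status, old.content = "Redirect", None


def transfer_signals(index, url):
    """State 5: the old URL's signals move to the redirect target."""
    old = index[url]
    if old.status == "Redirect" and old.score is not None:
        index[old.target].score, old.score = old.score, None


# State 1: only the old URL is indexed.
index = {OLD: Record("Indexed", "<html>…", None, 1000)}
queue = []
on_301(index, queue, OLD, NEW)
on_200(index, NEW, "<html>…")
mark_redirect(index, OLD)
transfer_signals(index, OLD)
```

After the last step the old URL holds no content or score and the new URL carries `Score=1000`, matching the final table on the slide. Slide 16 shows what happens without the 301: both URLs stay indexed with separate scores, so nothing is transferred.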