
11 Advanced Uses of Screaming Frog Nov 2019 DMSS

I lied, there are actually 12 advanced uses of Screaming Frog in this presentation. 

Published in: Marketing

  1. 1. 11 Advanced Uses For Screaming Frog OLIVER BRETT, SEO MANAGER, SCREAMING FROG
  2. 2. Can you make it pop?
  4. 4. http://taylortomita.com/seo-graph-portfolio/ @TVYLORTOMITV
  6. 6. @LordOfTheSERPs
  10. 10. I Accept Fan Submissions
  11. 11. John Mueller Follows Me ;-)
  12. 12. 3 Rules For Today:
  13. 13. 1. This isn’t a sales pitch.
  14. 14. 2. This isn’t a walkthrough.
  15. 15. Links will be provided for in-depth guides. Don’t panic.
  16. 16. 3. Just ask if you need help!
  17. 17. Part One: What is Screaming Frog?
  18. 18. Why is it called ‘Screaming Frog’?
  19. 19. Screaming Frog is built by SEOs, for SEOs.
  20. 20. Screaming Frog (usually) runs locally. But that’s a good thing.
  21. 21. “The Swiss Army Knife of SEO” (IT’S CUTTING EDGE…)
  22. 22. Screaming Frog Can…
  23. 23. https://www.seerinteractive.com/blog/screaming-frog-guide/
  27. 27. We don’t have the answers. We have the data.
  28. 28. Screaming Frog keeps getting bigger
  29. 29. Timeline: V1.0 (2011) to V11.0 (2019)
  30. 30. V1.0 (2011): Launch
  31. 31. V2.0 (2012): Word Counts, URL Rewrites, Crawl Control
  32. 32. No big updates (2013/2014)
  33. 33. V3.0 (2015): Tree View, Insecure Content Audits, Image Sitemaps + Enhanced XML Sitemaps
  34. 34. V4.0 (2015): Google Analytics Integration, Custom Extraction
  35. 35. V5.0 (2015): Webmaster Tools (Google Search Console) Integration, Better Robots.txt Auditing/Control
  36. 36. http://taylortomita.com/seo-graph-portfolio/
  38. 38. V6.0 (2016): JavaScript Rendering, Rendered Crawling, Crawl Indexed Sitemaps, SERP Emulator
  39. 39. V7.0 (2016): Fetch and Render, Custom Robots.txt Rules, Hreflang, Web Form Authentication
  40. 40. V8.0 (2017): Made it look nice(r), More APIs, Custom Profiles
  41. 41. V9.0 (2018): Database Storage, Store HTML and Rendered HTML, Better Search Functions
  42. 42. V10.0 (2018): Scheduling, Visualisations, CLI (Headless Mode)
  43. 43. V11.0 (2019): Structured Data (Audit, Validate and Export)
  44. 44. Lots of Updates… V1.0 (2011): Launch; V2.0 (2012): Word Counts, URL Rewrites, Crawl Control; V3.0 (2015): Tree View, Insecure Content, Image Sitemaps; V4.0 (2015): GA Integration, Custom Extraction; V5.0 (2015): GSC Integration, Better Robots.txt; V6.0 (2016): Rendered Crawling, Crawl Indexed Sitemaps, SERP Emulator; V7.0 (2016): Fetch and Render, Custom Robots.txt, hreflang, Web Form Authentication; V8.0 (2017): Made it look nice(r), More APIs, Custom Profiles; V9.0 (2018): Database Storage, Store HTML and Rendered HTML, Better Search; V10.0 (2018): Scheduling, Visualisations, Headless Mode; V11.0 (2019): Structured Data (Audit, Validate and Export)
  46. 46. V12.0 (2019)… literally last week! PageSpeed Insights Integration, Autosave, Faster Crawling/Loading, More Extractors
  47. 47. 11 Advanced Uses For Screaming Frog OLIVER BRETT, SEO MANAGER, SCREAMING FROG
  50. 50. Part Two: Setup and Data
  51. 51. 1. Crawling tricky sites
  52. 52. The number one question our support guys get asked:
  53. 53. Why won’t this Logfile Analyser crawl my site?
  55. 55. The number one question our support team get asked:
  56. 56. Q: Why isn’t the spider crawling more than one page?
  57. 57. A: JAVASCRIPT!!
  58. 58. Tricky site 1: JavaScript websites: https://www.screamingfrog.co.uk/crawl-javascript-seo/
  59. 59. Tricky site 2: big websites: https://www.screamingfrog.co.uk/how-to-crawl-large-websites/
  60. 60. What do we mean by big?
  61. 61. Do you really need to crawl the whole damn site?
  62. 62. That’s gonna be a big Excel doc…
  63. 63. Crawl in sections, subdomains, or subfolders.
  64. 64. Use the exclude function to avoid faceted navigation.
  65. 65. https://www.johnlewis.com/browse/men/mens-trousers/adidas/allsaints/gant-rugger/hymn/kin/selected-femme/homme/size=36r/_/N-ebiZ1z13yvxZ1z0g0g6Z1z04nruZ1z0s0laZ1z0vl67Z1z01kl3Z1z0swk1
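
The exclude function takes regular expressions, so it can help to test a pattern against a few sample URLs before starting a big crawl. The sketch below is only an illustration: the pattern, the `is_excluded` helper and the example.com URLs are assumptions, not taken from the slides. The leading and trailing `.*` are there so the pattern behaves the same whether your version of the Spider matches the whole URL or only part of it (check the user guide for your version).

```python
import re

# Hypothetical exclude pattern: any /browse/ URL that carries a facet state
# segment ("/_/N-...") like the faceted navigation URL on the slide above.
FACET_PATTERN = re.compile(r".*/browse/.*/_/N-.*")


def is_excluded(url: str) -> bool:
    """Return True if the whole URL matches the exclude pattern."""
    return FACET_PATTERN.fullmatch(url) is not None


urls = [
    "https://www.example.com/browse/men/mens-trousers/",
    "https://www.example.com/browse/men/mens-trousers/size=36r/_/N-abc123",
]

# Only the unfaceted category URL should survive the filter.
kept = [u for u in urls if not is_excluded(u)]
print(kept)
```
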
  67. 67. Database Storage Mode CONFIGURATION > STORAGE > DATABASE STORAGE MODE
  68. 68. The Spider saves crawl data to disk (ideally an SSD) rather than holding it in RAM.
  69. 69. Tricky site 3: password protected websites: https://www.screamingfrog.co.uk/crawling-password-protected-websites/
  70. 70. Disclaimer:
  71. 71. Disclaimer: You need to know what the password is.
  72. 72. Disclaimer: You need to know what the password is. We can’t help you hack it…
  73. 73. (Maybe like a dev site for example…)
  74. 74. Forms Based Authentication CONFIGURATION > AUTHENTICATION
  75. 75. Disclaimer 2:
  76. 76. Disclaimer 2: This will click every link on the page.
  77. 77. Disclaimer 2: This will click every link on the page. Yes, really…
  78. 78. 2. Scheduling
  79. 79. How to schedule crawls https://www.screamingfrog.co.uk/seo-spider/user-guide/general/#scheduling
  80. 80. CLI (Headless Mode)
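
For anyone scripting headless crawls, here is a minimal sketch of assembling the command line from Python. The launcher name and flags are the ones I recall from the command line section of the user guide, so confirm them with `--help` for your version and platform. The `build_crawl_command` helper, the start URL and the output folder are illustrative assumptions.

```python
import shlex
from typing import List


def build_crawl_command(start_url: str, output_dir: str) -> List[str]:
    """Build an argv list for a headless Screaming Frog crawl."""
    return [
        "screamingfrogseospider",  # Linux launcher name; differs on Windows/macOS
        "--crawl", start_url,
        "--headless",              # run without the user interface
        "--save-crawl",            # keep the crawl file for later review
        "--output-folder", output_dir,
        "--timestamped-output",    # one dated subfolder per run
    ]


cmd = build_crawl_command("https://www.example.com/", "/tmp/sf-crawls")

# Print the command for inspection; hand `cmd` to subprocess.run() or a
# scheduler (cron, Task Scheduler, a cloud VM) once it looks right.
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids quoting problems when URLs or folder names contain spaces.
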
  82. 82. You can use a cloud service for scheduled crawls. (You don’t have to though.)
  83. 83. https://online.marketing/guide/screaming-frog-in-google-cloud/
  84. 84. https://ipullrank.com/how-to-run-screaming-frog-and-url-profiler-on-amazon-web-services/
  85. 85. https://linki.cz/postavte-si-seo-stroj-amazon-cloudu-aws/
  86. 86. 3. Integrating APIs
  87. 87. Google Analytics https://www.screamingfrog.co.uk/seo-spider/user-guide/configuration/#google-analytics-integration
  88. 88. Google Analytics: Sessions, % New Sessions, New Users, Bounce Rate, Page Views Per Session, Avg Session Duration, Page Value, Goal Conversion Rate, Goal Completions All, Goal Value All; Sessions Above 0, Bounce Rate Above 70%, No GA Data, Non-Indexable with GA Data, Orphan URLs
  89. 89. Web Master Tools https://www.screamingfrog.co.uk/seo-spider/user-guide/configuration/#google-search-console-integration
  90. 90. Google Search Console (formerly Webmaster Tools) https://www.screamingfrog.co.uk/seo-spider/user-guide/configuration/#google-search-console-integration
  91. 91. Google Search Console: Clicks, Impressions, CTR, Position; Clicks Above 0, No GSC Data, Non-Indexable with GSC Data, Orphan URLs
  92. 92. Ahrefs https://www.screamingfrog.co.uk/seo-spider/user-guide/configuration/#ahrefs
  93. 93. Ahrefs: Backlinks, Referring Domains, URL Rating, Social Shares etc., Referring Pages, No/Do Follow, Gov/Edu
  94. 94. Majestic https://www.screamingfrog.co.uk/seo-spider/user-guide/configuration/#majestic
  95. 95. Majestic: External Backlinks, Referring Domains, Trust Flow, Citation Flow, Edu/Gov Links, Anchor Text, Historic/Fresh Datasets
  96. 96. Moz https://www.screamingfrog.co.uk/seo-spider/user-guide/configuration/#moz
  97. 97. Moz: Page Authority, MozRank, Time Last Crawled, Total Links, Domain Authority
  98. 98. Get Ya DAs Done 1. Put the spider into list mode 2. Connect Moz API, enable DA 3. Paste in list of coverage URLs 4. Crawl 5. Export list of URLs + DAs 6. Make link builders happy
  100. 100. PageSpeed Insights https://www.screamingfrog.co.uk/seo-spider/user-guide/configuration/#pagespeed-insights-integration
  101. 101. More on PageSpeed Insights Soon….
  102. 102. 4. Post Crawl Analysis
  103. 103. Link Score
  104. 104. A good way to estimate internal PageRank.
  105. 105. Link score calculates the relative value of a page based on its internal links.
  106. 106. Link score uses a relative point scale of 0-100, and it takes into account redirects, canonicals, nofollow links, and more.
  107. 107. The tool needs to find all your pages before it can assign link score values to them.
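
To make the idea of a relative internal score concrete, here is a toy textbook PageRank over a tiny internal link graph, rescaled so the strongest page scores 100. This is not Screaming Frog's Link Score algorithm (the slides do not describe it, and this sketch ignores redirects, canonicals and nofollow). The `link_scores` function, the damping factor and the four-page site are all assumptions for illustration.

```python
def link_scores(links, damping=0.85, iterations=50):
    """Toy PageRank over an internal link graph, rescaled so the top page is 100.

    `links` maps each URL to the list of URLs it links to. Every linked page
    must also appear as a key, which mirrors the point on the slide above:
    all pages have to be found before scores can be assigned.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, targets in links.items():
            if targets:
                # Split this page's rank evenly between the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over every page.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
        rank = new_rank
    top = max(rank.values())
    return {p: round(100 * r / top, 1) for p, r in rank.items()}


site = {
    "/": ["/about", "/blog"],
    "/about": ["/"],
    "/blog": ["/", "/blog/post"],
    "/blog/post": ["/"],
}
scores = link_scores(site)
print(scores)
```

In this example the homepage comes out on top because every other page links to it, and the deep blog post scores lowest because only one page links to it.
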
  108. 108. Pagination UNLINKED PAGINATION URLS, PAGINATION LOOPS
  110. 110. Hreflang UNLINKED/MISSING HREFLANG URLS
  111. 111. AMP MISSING <HTML AMP> TAG, NON-200 RESPONSE
  112. 112. Sitemaps URLS IN/OUT OF SITEMAPS, ORPHAN URLS, NON-INDEXABLE URLS IN SITEMAP, URLS IN MULTIPLE SITEMAPS
  113. 113. Analytics + Search Console ORPHAN PAGES
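
An orphan page here means a URL that an external source (Analytics, Search Console, a sitemap) knows about but that the crawl never reached through internal links. At its simplest that is a set difference, sketched below with made-up example.com URLs. The `find_orphans` helper is hypothetical and only illustrates the idea, not how the Spider computes it.

```python
def find_orphans(crawled_urls, external_source_urls):
    """URLs known to an external source but never reached via internal links."""
    return sorted(set(external_source_urls) - set(crawled_urls))


crawled = [
    "https://www.example.com/",
    "https://www.example.com/blog/",
]
from_search_console = [
    "https://www.example.com/",
    "https://www.example.com/old-landing-page/",
]

print(find_orphans(crawled, from_search_console))
# ['https://www.example.com/old-landing-page/']
```
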
  114. 114. Part Three: Reports and Jobs
  115. 115. 5. Visualisations
  116. 116. https://www.screamingfrog.co.uk/seo-spider-10/ https://www.screamingfrog.co.uk/site-architecture-crawl-visualisations/
  117. 117. Force-Directed Crawl Diagram SHORTEST PATH TO THE PAGE THAT THE SPIDER TOOK
  118. 118. Crawl Tree Graph SHORTEST PATH TO THE PAGE THAT THE SPIDER TOOK
  119. 119. Force-Directed Directory Tree Diagram
  120. 120. Directory Tree Graph FOLLOWS URL STRUCTURE
  121. 121. Inlink Anchor Text Word Cloud WHAT TEXT ARE YOU LINKING WITH?
  122. 122. Body Text Word Cloud WHAT IS YOUR CONTENT ABOUT?
  123. 123. 6. Creating/auditing Sitemaps
  124. 124. How to create XML Sitemaps: https://www.screamingfrog.co.uk/seo-spider/user-guide/general/#xml-sitemap-creation
  125. 125. How to Audit XML Sitemaps: https://www.screamingfrog.co.uk/how-to-audit-xml-sitemaps/
  126. 126. Or, paste your sitemap URLs into list mode.
  127. 127. Sitemap Reports: URLs In Sitemap, URLs Not In Sitemap, Orphan URLs, Non-Indexable URLs in Sitemap, URLs In Multiple Sitemaps, XML Sitemap With Over 50k URLs, XML Sitemap With Over 50MB
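
The two size checks in that list come from the sitemaps protocol limits of 50,000 URLs and 50MB uncompressed per file. A rough standalone version of those checks, plus a duplicate check, might look like the sketch below. The `audit_sitemap` function and the sample XML are assumptions for illustration, and this does not reproduce the Spider's own reports.

```python
import xml.etree.ElementTree as ET
from collections import Counter

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50MB, uncompressed


def audit_sitemap(xml_bytes: bytes) -> dict:
    """Count <loc> entries and flag protocol limit breaches and duplicates."""
    root = ET.fromstring(xml_bytes)
    locs = [
        el.text.strip()
        for el in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if el.text
    ]
    counts = Counter(locs)
    return {
        "url_count": len(locs),
        "over_url_limit": len(locs) > MAX_URLS,
        "over_size_limit": len(xml_bytes) > MAX_BYTES,
        "duplicate_urls": sorted(u for u, c in counts.items() if c > 1),
    }


sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/blog/</loc></url>
  <url><loc>https://www.example.com/</loc></url>
</urlset>"""

result = audit_sitemap(sample)
print(result)
```
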
  128. 128. 7. Structured Data Audits
  129. 129. How to audit structured data: https://www.screamingfrog.co.uk/structured-data-testing-validation/
  130. 130. Structured Data Issues: Contains Structured Data, Missing Structured Data, Validation Errors, Validation Warnings, Parse Errors, Microdata/JSON-LD/RDFa URLs
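
As a rough picture of what "contains", "missing" and "parse errors" mean for JSON-LD, the sketch below pulls `application/ld+json` script blocks out of a page with the standard library and counts the ones that fail to parse. It does not validate against Schema.org or Google's feature requirements, which is the part the Spider adds. The `JsonLdExtractor` class and the sample HTML are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect JSON-LD blocks from a page and count the ones that fail to parse."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []
        self.parse_errors = 0

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                self.parse_errors += 1


html = """<html><head>
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "Organization", "name": "Example Ltd"}</script>
<script type="application/ld+json">{"@type": "Product", </script>
</head><body></body></html>"""

parser = JsonLdExtractor()
parser.feed(html)
print([item.get("@type") for item in parser.items], parser.parse_errors)
# ['Organization'] 1
```
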
  131. 131. 8. PageSpeed Insights with SF
  132. 132. PSI (Lighthouse) ft. CrUX https://www.screamingfrog.co.uk/seo-spider-12/
  133. 133. 9. Audit Redirects in List Mode
  134. 134. How to audit redirects using list mode: https://www.screamingfrog.co.uk/audit-redirects/
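
A redirect audit mostly comes down to following each old URL hop by hop and flagging chains and loops. The sketch below does that over an in-memory map instead of live requests, so it runs offline. The `trace_redirects` function, the status labels and the example.com URLs are assumptions for illustration, not the Spider's report format.

```python
def trace_redirects(start_url, redirect_map, max_hops=10):
    """Follow `redirect_map` (source URL -> target URL) from `start_url`.

    Returns (chain, status), where status is "ok", "loop" or "too_many_hops".
    """
    chain = [start_url]
    seen = {start_url}
    current = start_url
    while current in redirect_map:
        if len(chain) > max_hops:
            return chain, "too_many_hops"
        current = redirect_map[current]
        chain.append(current)
        if current in seen:
            return chain, "loop"
        seen.add(current)
    return chain, "ok"


redirects = {
    "http://example.com/old": "https://example.com/old",
    "https://example.com/old": "https://example.com/new",
    "https://example.com/a": "https://example.com/b",
    "https://example.com/b": "https://example.com/a",
}

print(trace_redirects("http://example.com/old", redirects))
print(trace_redirects("https://example.com/a", redirects))
```

The first call walks a two-hop chain to its final destination, and the second reports a loop because `/a` and `/b` redirect to each other.
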
  135. 135. Part Four: Fun Extra Stuff
  136. 136. 10. Reviving old search console reports
  137. 137. @RichLawther SCREAMING FROG TECHNICAL SEO MANAGER
  138. 138. Reviving old Search Console reports using SF https://www.screamingfrog.co.uk/reviving-search-console/
  139. 139. I miss the old structured data report…
  140. 140. I miss the old HTML Improvements tab…
  141. 141. I miss the old International Targeting…
  142. 142. I miss Blocked Resources…
  143. 143. I miss robots.txt Tester…
  144. 144. 11. SF for Content Marketing
  145. 145. Content Marketing
  146. 146. So we scraped all the headlines we could…
  147. 147. @ShannonMcGuirk_ HEAD OF PR AND CONTENT AIRA.NET
  148. 148. Scraping news sites to work out who gives out the most links: https://www.slideshare.net/shannonmcguirk/search-leeds-making-headlines
  149. 149. 12. Scraping SERPs
  150. 150. @patlangridge HEAD OF SEO SCREAMING FROG
  151. 151. How to scrape SERP features https://www.screamingfrog.co.uk/how-to-scrape-google-search-features-using-xpath/
  152. 152. Pat wants to know what questions people also ask…
  153. 153. Google of your choice: https://www.google.co.uk; search query parameter: /search?q=; keywords with + symbols: hmrc+complaints; join them with =CONCATENATE in Excel
  154. 154. Google query string: find and replace spaces with the + symbol, then =CONCATENATE(A2,B2). Google Doc template: https://bit.ly/2lAN5sE
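
If a spreadsheet is not to hand, the same find-and-replace plus CONCATENATE step can be sketched in a few lines of Python. The `build_serp_url` helper and the second keyword are illustrative assumptions. The base URL and the `hmrc+complaints` example come from the slide.

```python
from urllib.parse import quote_plus

GOOGLE_SEARCH = "https://www.google.co.uk/search?q="


def build_serp_url(keyword: str, base: str = GOOGLE_SEARCH) -> str:
    """Join the search base and a keyword, with spaces encoded as +."""
    return base + quote_plus(keyword.strip())


keywords = ["hmrc complaints", "hmrc phone number"]
for url in map(build_serp_url, keywords):
    print(url)
# https://www.google.co.uk/search?q=hmrc+complaints
# https://www.google.co.uk/search?q=hmrc+phone+number
```

Unlike a plain find and replace, `quote_plus` also escapes characters such as `&` and `?` that would otherwise break the query string. The resulting URLs are what you would paste into list mode.
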
  155. 155. Configuration > Spider > Rendering > JavaScript; Configuration > robots.txt > Settings > Ignore robots.txt; Configuration > User-Agent > Preset User Agents > Chrome; Configuration > Speed > Max Threads = 1, Max URI/s = 0.5
  156. 156. Configuration > Custom > Extraction
  158. 158. Configuration > Custom > Extraction Snippet page title: (//div[@class='ellip'])[1]/text() Snippet text paragraph: (//span[@class="e24Kjd"])[1] Snippet URL: (//div[@class="xpdopen"]//a/@href)[2]
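
It can be worth checking what each extractor is meant to pick out on a saved snippet before running a crawl. The sketch below mirrors the three extractors using the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath: it has no `(...)[1]`, `/text()` or `/@href`, so `find`, `.text` and `.get("href")` stand in for them (a full XPath 1.0 engine such as lxml would accept the expressions as written). The sample markup is hypothetical, and Google's class names change often, so treat the classes on the slide as a point-in-time example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a featured snippet block.
sample = """<html><body>
<div class="xpdopen">
  <span class="e24Kjd">Example answer text.</span>
  <a href="#skip">First link</a>
  <a href="https://www.example.com/answer">
    <div class="ellip">Example page title</div>
  </a>
</div>
</body></html>"""

root = ET.fromstring(sample)

# (//div[@class='ellip'])[1]/text()  ->  first matching element's text
title = root.find(".//div[@class='ellip']").text

# (//span[@class="e24Kjd"])[1]  ->  first matching element
text = root.find(".//span[@class='e24Kjd']").text

# (//div[@class="xpdopen"]//a/@href)[2]  ->  second href inside the block.
# XPath positions start at 1, so [2] corresponds to Python index 1.
box = root.find(".//div[@class='xpdopen']")
links = [a.get("href") for a in box.iter("a")]
snippet_url = links[1]

print(title, text, snippet_url, sep=" | ")
```
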
  159. 159. Thanks! Questions? @OliverBrett @LordofTheSERPs @ScreamingFrog support@screamingfrog.co.uk
