Screaming Frog + Xpath: BrightonSEO April 2019

Here is my talk about "Screaming Frog & Xpath: How to Analyse The Pants Off Your Competition"

Published in: Marketing

  1. Screaming Frog + Xpath: A Guide to Analyse the Pants Off Your Competition. Sabine Langmann // sabine-langmann.com // @SabTheLa // https://slideshare.net/sabinelangmann
  2. 12.04.2019 Sabine Langmann bit.ly/sfx-2019
  3. Level 1
  4. What is this about?
  5. We'd like to: crawl specific elements on our own web pages or those of our competition. We use: Screaming Frog's Custom Extraction + XPath
  9. Level 2
  10. Who am I? https://www.sabine-langmann.com, https://www.linkedin.com/in/sabine-langmann/, @SabTheLa. Slides: bit.ly/sfx-2019
  11. Level 3
  12. XPath
  13. "XPath (XML Path Language) is a query language for selecting nodes from an XML document." (Source: Wikipedia)
  14. Simple Syntax: node = any page element (e.g. h2, a, p, div); // addresses a certain node; attribute = an attribute of a node (e.g. class, id); @ addresses a certain attribute; count() counts the addressed nodes
  15. Simple Syntax: //node[@attribute="attribute_name"]
  16. Simple Syntax: //node[@attribute1="attribute_name1" and @attribute2="attribute_name2"]
  17. Simple Syntax: count(//node[@attribute="attribute_name"])
  18. Level 4
  19. Examples
  20. BBC.com vs. TheGuardian.com
  21. BBC.com
  23. How many images? How many H2, H3, etc.? How many words? How many links to which pages?
  27. What am I searching for?
  28. I'm searching for text (the name of the topic tag)
  29. Suitable XPath selector: //li[@class="tags-list__tags" and @data-entityid="topic_link_bottom"]
  34. https://www.bbc.com/news/uk-politics-.*
  35. Don't forget ;)
  37. Result
  38. Bit.ly/abjsd Ain't nobody got time for Excel. Better listen to Ben!
  39. TheGuardian.com
  44. What am I searching for?
  45. In <div class="submeta"> I'm searching for the topic tag names
  46. Suitable XPath selectors: //div[@class="submeta__section-labels"] and //div[@class="submeta__keywords"]
  48. https://www.theguardian.com/sitemaps/news.xml
  49. Result
  50. Yourcat.co.uk vs. TheFoodaholic.co.uk
  51. Yourcat.co.uk
  53. How many links in editorial content? How many internal/external?
  55. <div class="post-body-container">
  56. “How many links in editorial content?” Suitable XPath selector: count(//div[@class="post-body-container"]//p//a)
  57. “How many internal editorial links?” Suitable XPath selector: count(//div[@class="post-body-container"]//p//a[starts-with(@href, "https://www.yourcat.co.uk/")])
  61. Result
  62. TheFoodaholic.co.uk
  64. How many links in editorial content? How many internal/external?
  67. “How many links in editorial content?” Suitable XPath selector: count(//div[@itemprop="articleBody"]//p//a[not(contains(@href, "wp-content"))])
  68. “How many internal editorial links?” Suitable XPath selector: count(//div[@itemprop="articleBody"]//p//a[starts-with(@href, "http://www.thefoodaholic.co.uk") and not(contains(@href, "wp-content"))])
  69. Result
  70. Level 5
  71. Recap: Which data do I need? Can I crawl the respective elements? What is the right XPath selector? That's it!
  72. More XPath Cases, SERP-Crawling, Regex, …
  73. Wait no more. Max shows how!
  74. Thanks!!
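
To try the "Simple Syntax" selectors outside of Screaming Frog, a rough Python sketch might look like this. It is not part of the talk: the sample markup is invented for illustration, and Python's built-in ElementTree only understands a subset of XPath, so the `and` condition is written as two chained predicates and `count()` is replaced by `len()`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup; the real BBC pages will differ.
HTML = """
<html><body>
  <ul>
    <li class="tags-list__tags" data-entityid="topic_link_bottom"><a href="/topics/a">Brexit</a></li>
    <li class="tags-list__tags" data-entityid="topic_link_bottom"><a href="/topics/b">Theresa May</a></li>
    <li class="tags-list__tags" data-entityid="something_else"><a href="/topics/c">Ignored</a></li>
  </ul>
</body></html>
""".strip()

root = ET.fromstring(HTML)

# Slide selector: //li[@class="tags-list__tags" and @data-entityid="topic_link_bottom"]
# ElementTree has no "and" operator; two chained predicates select the same nodes.
TAG_XPATH = './/li[@class="tags-list__tags"][@data-entityid="topic_link_bottom"]'
tag_nodes = root.findall(TAG_XPATH)

# Extract the visible text of each matched node (the topic tag name).
tag_names = ["".join(node.itertext()).strip() for node in tag_nodes]

# Stand-in for count(//li[...]): count the matched nodes in Python.
tag_count = len(tag_nodes)

print(tag_names, tag_count)
```

Real pages are rarely well-formed XML, so for live HTML an HTML parser with full XPath support would be the more realistic choice; the sketch only shows how the selector parts fit together.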

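The link-counting selectors from the Yourcat.co.uk and TheFoodaholic.co.uk examples can be approximated in the same way. Again, this is only a sketch under assumptions: the markup is made up, `count_links` is a hypothetical helper, and `starts-with()` and `not(contains())` are emulated with Python string checks because ElementTree does not support XPath functions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup; real article pages will differ.
HTML = """
<html><body>
  <div class="post-body-container">
    <p>Read our <a href="https://www.yourcat.co.uk/cat-advice/">advice</a>
       and this <a href="https://example.org/study">study</a>.</p>
    <p><a href="https://www.yourcat.co.uk/wp-content/cat.jpg">Photo</a></p>
    <ul><li><a href="https://www.yourcat.co.uk/shop/">Shop</a></li></ul>
  </div>
</body></html>
""".strip()

CONTAINER = './/div[@class="post-body-container"]'
PREFIX = "https://www.yourcat.co.uk/"


def count_links(root, container_xpath, internal_prefix, exclude=None):
    """Return (total, internal) for links inside paragraphs of the container.

    Mirrors count(<container>//p//a) and the starts-with(@href, ...) variant;
    `exclude` emulates not(contains(@href, ...)).
    """
    hrefs = [
        a.get("href", "")
        for div in root.findall(container_xpath)
        for a in div.findall(".//p//a")
    ]
    if exclude is not None:
        hrefs = [h for h in hrefs if exclude not in h]
    internal = sum(h.startswith(internal_prefix) for h in hrefs)
    return len(hrefs), internal


root = ET.fromstring(HTML)
total, internal = count_links(root, CONTAINER, PREFIX)
print(total, internal, total - internal)  # total, internal, external
```

Passing `exclude="wp-content"` corresponds to the `not(contains(@href, "wp-content"))` condition used for TheFoodaholic.co.uk, which leaves uploaded media links out of the count.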