Cut The Crap: Running Content Audits With Crawlers - Sam Marsden, Technical SEO, DeepCrawl

Sites with any level of content production quickly build up pages that are outdated, no longer relevant or performing poorly. Left unmanaged, crawl budget may be wasted on low-quality pages and penalties may be incurred.
In this presentation, Sam shows you how to run a content audit in a way that saves you time.

Published in: Marketing

  1. 1. Cut the Crap: Running Content Audits With Crawlers Sam Marsden, Technical SEO Executive SEOCAMPIXX 2018 @sam_marsden SEOCAMPIXX
  2. 2. A bit about me... Technical SEO Executive at DeepCrawl
  3. 3. Last year I started at DeepCrawl...
  4. 4. Soon after, we received Series A funding...
  5. 5. This meant we could scale up... (Very happy CEO)
  6. 6. ...and money was made available for a redesign of the website
  7. 7. A website redesign is a long process... Source: http://ezsitecms.com/services/website-redesign/
  8. 8. ...and we wanted to migrate to a new CMS Source: https://juliandontcheff.wordpress.com/2014/05/25/cross-platform-transportable-database-and-oracle-engineered-systems/
  9. 9. ...because we were suffering from plugin bloat Source: https://www.greenlanemarketing.com/wp-content/uploads/2015/03/index-bloat.jpg
  10. 10. ...and we needed to manually re-enter the data into the new CMS Source: http://blog.transactionpro.com//wp-content/uploads/2015/07/shutterstock_139392815.jpg
  11. 11. ...so we only wanted to migrate the content that we needed
  12. 12. A content audit was in order! We needed to:
      ● Discover the full extent of the site’s content inventory
      ● Attach relevant performance data to each of the site’s pages
      ● Create a set of criteria to decide what content to keep and what to get rid of
      ● Apply those criteria to the site’s pages
      ● Decide if the content we keep should remain in its current form
  13. 13. How can we do this in a thorough but time-efficient way?
  14. 14. What is a Content Audit?
  15. 15. What do the search results say?
  16. 16. We want a more data-driven approach... Source: https://www.pexels.com/photo/coding-computer-data-depth-of-field-577585/
  17. 17. What guides are out there?
  18. 18. We want a fresh approach: thorough, time-saving and replicable
  19. 19. Content auditing is like a spring clean (German: Tiefenreinigung, a deep clean)
      Think of running a content audit like you would a spring clean of your home.
      1. First you need to find all the crap you have hidden away in your home - Discovering all of your URLs
      2. Then decide what is off-limits and definitely going to be kept - Taking your core pages out of the equation
      3. What’s your reasoning behind what will go? - Creating a set of criteria to make decisions on pages
      4. Making the call on what gets binned, what stays and what gets a new lease of life - Deciding what to do with your pages
      Source: https://tookapic.com/photos/36415
  20. 20. A Crawl Centred Approach: The Discovery Phase
  21. 21. The Discovery Phase
      Aim: To discover all of the existing URLs on your site.
      Other guides suggest one of the following:
      ● Exporting a list of pages from your CMS
        ○ BUT pages may be missed - not thorough
      ● Running a crawl
        ○ BUT only running a crawl will give you a limited view of the data
      ● Exporting data from third-party tools and joining it to your crawl data
        ○ BUT joining the data is laborious and time-consuming - not easily replicable
  22. 22. Here’s where DeepCrawl comes in...
  23. 23. Putting Crawl Data at the Centre of Your Audit
      With a cloud crawling solution like DeepCrawl you are:
      ● Not limited by scale - sites with hundreds, thousands or millions of URLs can be crawled
      ● Able to bring in multiple data sources easily, without the need to export tool data and import it into an Excel table
      Instead of seeing the crawler as a single data source, put it at the centre of your content audit.
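
To make the idea of joining other data sources onto crawl data concrete, here is a minimal Python sketch using only the standard library. It is not how DeepCrawl works internally; the field names (`url`, `title`, `word_count`, `pageviews`, `impressions`) and the sample rows are assumptions made up for illustration.

```python
def normalise_url(url):
    """Strip whitespace and a trailing slash so keys match across sources."""
    return url.strip().rstrip("/")


def join_on_url(crawl_rows, *other_sources):
    """Left-join extra data sources onto the crawl rows, keyed by URL.

    Every crawled URL is kept; a metric missing from a source is simply absent.
    """
    joined = {normalise_url(row["url"]): dict(row) for row in crawl_rows}
    for source in other_sources:
        for row in source:
            key = normalise_url(row["url"])
            if key in joined:
                extra = {k: v for k, v in row.items() if k != "url"}
                joined[key].update(extra)
    return list(joined.values())


# Hypothetical sample data standing in for a crawl export and two tool exports.
crawl = [
    {"url": "https://example.com/blog/a/", "title": "Post A", "word_count": 850},
    {"url": "https://example.com/blog/b", "title": "Post B", "word_count": 300},
]
analytics = [{"url": "https://example.com/blog/a", "pageviews": 1200}]
search_console = [{"url": "https://example.com/blog/b/", "impressions": 40}]

pages = join_on_url(crawl, analytics, search_console)
for page in pages:
    print(page)
```

The crawl is the left-hand side of the join, so a URL that only appears in a tool export is ignored, while a crawled URL with no performance data still shows up in the audit.
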
  24. 24. Running a crawl
  25. 25. Using Custom Extractions
      You can also use custom extractions to pull out information from your site which can help inform your content audit.
      ● Authors - content performance by writer
      ● Published date, last modified date - to examine data in specific date ranges
      ● Structured data and metadata - does the presence of certain markup correlate with better organic performance?
      ● Tagging - extract on-page article tags and meta keywords
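
As a rough sketch of what a custom extraction does, the Python below applies regular expressions to a page's HTML to pull out the author, publish date and tags. The markup and the patterns are hypothetical; real patterns depend entirely on your own templates, and this is not a reproduction of any crawler's configuration.

```python
import re

# Hypothetical page markup; your templates will differ.
html = """
<meta name="author" content="Jane Doe">
<meta property="article:published_time" content="2018-01-15T09:30:00+00:00">
<a class="tag" href="/tag/seo/">SEO</a>
<a class="tag" href="/tag/crawling/">Crawling</a>
"""

# One capture group per pattern: the value we want to store against the URL.
PATTERNS = {
    "author": r'<meta name="author" content="([^"]+)"',
    "published": r'<meta property="article:published_time" content="([^"]+)"',
    "tags": r'<a class="tag"[^>]*>([^<]+)</a>',
}


def extract(html_text):
    """Return every match for each named pattern as a list of strings."""
    return {name: re.findall(pattern, html_text) for name, pattern in PATTERNS.items()}


extracted = extract(html)
print(extracted)
```
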
  26. 26. A Crawl Centred Approach: The Refining Phase
  27. 27. Now you’re going to have a large dataset
  28. 28. The Refining Phase
      Aim: To take this raw data and cut it down to what is necessary and useful for making decisions about the content of your site.
      Source: https://www.sharpen-up.com/whittle-beginners-guide-wonderful-craft-whittling/
  29. 29. Chopping the data down to size
      Go through the spreadsheet and start chopping it down to size. There are two parts:
      1. Getting rid of unnecessary metrics (columns)
      2. Removing pages that sit outside of the audit (rows)
  30. 30. The Whittling Phase: Removing pages that sit outside of the audit
      Once you’ve decided on the metrics, you will want to remove pages that sit outside of the audit. These may include:
      ● Category pages
      ● Paginated pages
      ● Core pages
      ● Faceted URLs
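
The two refining steps (dropping columns, then dropping rows) can be sketched in a few lines of Python. The column names, the URL patterns and the list of core pages below are all assumptions for illustration; every site will need its own.

```python
import re

# Columns to keep; everything else in the export is dropped (hypothetical names).
KEEP_COLUMNS = ["url", "title", "word_count", "pageviews"]

# Hypothetical URL patterns for pages that sit outside the audit.
EXCLUDE_PATTERNS = [
    re.compile(r"/category/"),    # category pages
    re.compile(r"/page/\d+/?$"),  # paginated pages
    re.compile(r"\?"),            # faceted URLs carrying query parameters
]
# Core pages that are definitely being kept, so they need no decision.
CORE_PAGES = {"https://example.com/", "https://example.com/pricing/"}


def in_scope(url):
    """True if the URL should stay in the audit."""
    if url in CORE_PAGES:
        return False
    return not any(pattern.search(url) for pattern in EXCLUDE_PATTERNS)


def refine(rows):
    """Keep only in-scope rows, and only the chosen columns for each row."""
    return [
        {column: row.get(column) for column in KEEP_COLUMNS}
        for row in rows
        if in_scope(row["url"])
    ]


rows = [
    {"url": "https://example.com/", "title": "Home", "word_count": 200, "pageviews": 9000, "h1_count": 1},
    {"url": "https://example.com/blog/post-a/", "title": "Post A", "word_count": 850, "pageviews": 1200, "h1_count": 1},
    {"url": "https://example.com/category/news/", "title": "News", "word_count": 50, "pageviews": 300, "h1_count": 1},
    {"url": "https://example.com/blog/page/2/", "title": "Blog 2", "word_count": 40, "pageviews": 20, "h1_count": 1},
    {"url": "https://example.com/shoes?colour=red", "title": "Red shoes", "word_count": 30, "pageviews": 15, "h1_count": 2},
]
refined = refine(rows)
print(refined)
```
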
  31. 31. With your reduced dataset you can avoid this...
  32. 32. And let the streamlined and efficient content auditing commence...
  33. 33. What are you left with?
      After you’ve cut down your dataset you want to be left with:
      ● Page descriptors - URLs, page titles, meta descriptions
      ● Page attributes - word count, published and last modified dates, links in/out, duplication, categories, author
      ● Performance metrics - backlink data, social shares, traffic, SERP rankings, impressions, time on page
  34. 34. A Crawl Centred Approach: Four Questions You Need to Answer
  35. 35. Question No. 1: What is and isn’t performing well?
  36. 36. Defining a set of criteria for content performance
      You need to define a set of criteria by which you judge content performance. These will vary depending on the nature of the site.
      For example, a news site that generates revenue through ad impressions will define successful content differently from a B2B site that provides a niche service.
      You may also have different expectations of content performance depending on the content type, e.g. mass appeal vs. targeted content.
  37. 37. Defining a set of criteria for content performance
      In the case of the DeepCrawl content audit we assessed content performance based on:
      ● Unique pageviews
      ● Share count
      ● Backlink count
      ● Page Value* (Analytics)
      *Inclusion relies on correct goal implementation
  38. 38. Question No. 2: How can you deal with content that isn’t performing well? Source: https://mylearningsolutions.org/2014/08/13/five-decision-making-pitfalls/
  39. 39. Dealing with poor performing content
      In your spreadsheet you’ll want to create an ‘Action’ column. This column will feature a set of options which you will use to categorise each page: the four C’s (or K’s) of content audit decision making.
      1. Keep - Pages that are performing well and will not be changed significantly
      2. Cut - Low value pages that don’t deserve a place on your site, e.g. outdated content
      3. Combine - Pages whose content doesn’t warrant its own dedicated page but can be used to bolster another existing page
      4. Convert - Pages with potential that you want to invest time in improving, e.g. partially duplicate content
  40. 40. Dealing with poor performing content
      In DeepCrawl’s case we knew there was a lot of outdated content no longer providing value. We could afford to be cut-throat and only keep content that:
      ● Had a publish date within the last year
      ● Or had a specified amount of traffic from Analytics or number of impressions from GSC Search Analytics
      As a medium-sized site, we could review each poor performing page and decide whether it could be combined with relevant pages or marked for rewriting.
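
The keep rule described above (published within the last year, or enough traffic or impressions) is easy to express as a first pass over the spreadsheet. The Python below is a sketch under stated assumptions: the thresholds, field names and sample pages are invented, and anything that fails the rule is only flagged for manual review rather than cut automatically.

```python
from datetime import date, timedelta

MIN_PAGEVIEWS = 100    # hypothetical threshold, pick your own
MIN_IMPRESSIONS = 500  # hypothetical threshold, pick your own


def suggest_action(page, today):
    """Return 'Keep' if the page meets any keep criterion, else 'Review'.

    'Review' means a person still decides between Cut, Combine and Convert.
    """
    published_recently = page["published"] >= today - timedelta(days=365)
    has_traffic = page["pageviews"] >= MIN_PAGEVIEWS
    has_impressions = page["impressions"] >= MIN_IMPRESSIONS
    if published_recently or has_traffic or has_impressions:
        return "Keep"
    return "Review"


today = date(2018, 3, 1)
pages = [
    {"url": "/blog/new-post/", "published": date(2017, 11, 2), "pageviews": 20, "impressions": 50},
    {"url": "/blog/old-hit/", "published": date(2015, 6, 10), "pageviews": 450, "impressions": 3000},
    {"url": "/blog/old-miss/", "published": date(2014, 1, 5), "pageviews": 3, "impressions": 12},
]
actions = {page["url"]: suggest_action(page, today) for page in pages}
print(actions)
```
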
  41. 41. Criteria creation
      For pages where you aren’t sure what action to take, ask yourself:
      ● Is the page being seen in search and receiving traffic?
      ● Is the page actually bringing value to the site?
      ● How would the page fare if it were put in front of Google’s search quality raters?
        ○ Does it exude Expertise, Authoritativeness and Trustworthiness?
        ○ If not, can the content be merged with a stronger page on a related topic, or is there the resource available to elevate that content?
  42. 42. Question No. 3: What can you do to get the most out of content that is performing well? Source: http://theleagueam.com/2017/06/24/coaching/
  43. 43. Filter your spreadsheet by what you want to keep... Source: https://balancedcarend.com/2013/11/21/healthy-holiday/squirrel-nut/
  44. 44. This will effectively be an exercise in content optimisation
      This is where you can start using a fuller breadth of the data you’ve pulled in. Key areas to focus on for content optimisation:
      ● Optimising titles & meta descriptions - Are titles and descriptions appealing propositions? Do they match the user intent in search?
      ● Keyword cannibalisation - Are multiple pages ranking for topically similar queries?
      ● Duplication issues - Is the content unique? Are near or true duplicates diluting authority?
      ● Linking - Are there internal/external linking opportunities? Relevant CTAs? Where does the page sit in the user journey?
      ● Page speed - Are there ways to reduce load time, e.g. optimising images or cleaning up clunky code?
      ● Structured data - Is the existing implementation correct? Could additional markup be added?
      ● Tag pages - These can drive value if done well, but sites often have too many. What is the ratio of tags to articles? Can this be reduced to consolidate authority?
  45. 45. Question No. 4: How can you use this data to inform your content strategy? Source: https://www.eventbrite.co.uk/blog/video-event-industry-trends-and-the-future-of-events-ds00/
  46. 46. You need data-driven insights because content marketing resources are finite... Source: https://www.freepik.com/premium-photo/empty-piggy-bank_1568162.htm
  47. 47. And you need to ensure resources are invested into more of what works... Source: http://blog.peerform.com/will-banks-survive-competition-from-alternative-financial-markets/
  48. 48. Achieving this is all about finding relationships... Source: http://hopesrising.com/?p=5677
  49. 49. Finding patterns and relationships...
      Aim: To establish patterns in the data, taking into account the objective of the site, in order to make decisions on content strategy.
      ● Particularly useful for large sites where page-by-page analysis isn’t an option.
      ● Doesn’t have to be a one-off exercise; it can form the basis of ongoing reporting for clients or internal teams.
      Let’s take a look at some relationships which may be of interest.
  50. 50. Tool of choice: The Pivot Table - Pivoting variables around metrics
  51. 51. 1. Performance by channel/category/content type
      Do some types of content perform better than others?
      Group content into categories and compare them in terms of performance (views, shares, backlinks) and volume of production (number of articles published).
      Are you allocating content efforts efficiently? Are time, money and effort being spent on the right types of content?
  52. 52. 1. Performance by channel/category/content type
      ● The majority of content resources are going into Sport and News.
      ● TV/Showbiz articles receive a much higher average number of pageviews, but there are far fewer of them.
      ● You’d want to investigate the possibility of upping production of TV/Showbiz articles to see if the higher average volume of traffic can be maintained.
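
A pivot table of this kind boils down to grouping rows by one variable and summarising a metric for each group. The Python sketch below shows the same operation outside a spreadsheet; the category names echo the example above, but the pageview figures are invented purely for illustration.

```python
from collections import defaultdict

# Invented sample data: one row per article.
articles = [
    {"category": "Sport", "pageviews": 400},
    {"category": "Sport", "pageviews": 600},
    {"category": "Sport", "pageviews": 500},
    {"category": "News", "pageviews": 300},
    {"category": "News", "pageviews": 500},
    {"category": "TV/Showbiz", "pageviews": 2000},
]


def pivot_by(rows, key, metric):
    """Group rows by `key`; report article count and average `metric` per group."""
    totals = defaultdict(lambda: {"count": 0, "total": 0})
    for row in rows:
        bucket = totals[row[key]]
        bucket["count"] += 1
        bucket["total"] += row[metric]
    return {
        group: {"articles": b["count"], "average": b["total"] / b["count"]}
        for group, b in totals.items()
    }


summary = pivot_by(articles, "category", "pageviews")
# Print the groups with the highest average first.
for category, stats in sorted(summary.items(), key=lambda kv: kv[1]["average"], reverse=True):
    print(category, stats["articles"], stats["average"])
```

Putting volume (article count) next to the average makes it easy to spot categories that get a lot of production effort for little return, or the reverse.
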
  53. 53. 2. Content length and engagement
      Is content length positively correlated with engagement?
      ● You can look at word count and time on page to determine this.
      ● Is there a point of diminishing returns, e.g. beyond 1,000 words?
  54. 54. 2. Content length and engagement
      If engagement doesn’t increase linearly with content length, then resources for content production can be used more efficiently:
      ● Create guidelines for content length based on insights.
      ● Select topics based on impact rather than content length.
      ● Build greater awareness of the time taken to create content and the likely impact that can be expected.
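
One simple way to look for diminishing returns is to bucket pages by word count and compare the average time on page per bucket. The Python below is a sketch only: the 500-word bucket size is an arbitrary choice, and the field names and figures are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

BUCKET_SIZE = 500  # words per bucket; an arbitrary choice for this sketch


def bucket_label(word_count):
    """Label such as '500-999' for the bucket a word count falls into."""
    lower = (word_count // BUCKET_SIZE) * BUCKET_SIZE
    return f"{lower}-{lower + BUCKET_SIZE - 1}"


def average_time_by_length(pages):
    """Average time on page (seconds) for each word-count bucket."""
    buckets = defaultdict(list)
    for page in pages:
        buckets[bucket_label(page["word_count"])].append(page["time_on_page"])
    return {label: mean(times) for label, times in buckets.items()}


# Invented sample data.
pages = [
    {"word_count": 300, "time_on_page": 40},
    {"word_count": 450, "time_on_page": 60},
    {"word_count": 800, "time_on_page": 120},
    {"word_count": 1200, "time_on_page": 130},
    {"word_count": 1400, "time_on_page": 126},
]
averages = average_time_by_length(pages)
print(averages)
```

In this made-up data the jump from the first bucket to the second is large and the jump to the third is small, which is the shape you would read as diminishing returns.
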
  55. 55. 3. Relationship between page speed and engagement
      Is page speed harming bounce rate and conversion rate?
      ● Do some pages load more slowly than others?
      ● Why? Are some resource heavy? Do images need to be optimised?
      ● Important, especially for eCommerce, as load time and bounce rate are closely tied to conversion rate.
      Source: https://www.branded3.com/blog/mobile-speed-experience-googles-2-4-second-sweet-spot/
  56. 56. 4. Performance and engagement by author
      How does content performance vary by author?
      ● Useful for sites with high turnover of content, like news sites.
      ● Define ranges by which to rate content performance
        ○ E.g. poor, average, good, excellent based on pageviews
      ● Can be replicated on a weekly, monthly or quarterly basis for ongoing monitoring.

      Author name            Poor   Average   Good   Excellent
      Barton Haberkorn         26        64     11          60
      Jacquelynn Kline         19        79      4          49
      Claudette Etheredge      87        79     77          11
      Sharell Phinney          73        31      8          20
      Dane Shiner              51        54      7          90
      Francesco Kirwin         84        90     21          57
      Issac Asberry            54        78     29          47
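
A table like the one above can be produced by rating each article against pageview ranges and counting ratings per author. The Python below is a minimal sketch; the range boundaries, author names and figures are all hypothetical.

```python
from collections import Counter, defaultdict

LABELS = ["Poor", "Average", "Good", "Excellent"]

# Hypothetical pageview ranges, highest first: (minimum pageviews, rating).
RATING_BANDS = [(1000, "Excellent"), (500, "Good"), (100, "Average")]


def rate(pageviews):
    """Return the first rating whose minimum the article reaches, else 'Poor'."""
    for minimum, label in RATING_BANDS:
        if pageviews >= minimum:
            return label
    return "Poor"


def ratings_by_author(articles):
    """Count how many articles each author has in each rating band."""
    table = defaultdict(Counter)
    for article in articles:
        table[article["author"]][rate(article["pageviews"])] += 1
    return table


# Invented sample data.
articles = [
    {"author": "Author A", "pageviews": 1500},
    {"author": "Author A", "pageviews": 80},
    {"author": "Author A", "pageviews": 650},
    {"author": "Author B", "pageviews": 120},
    {"author": "Author B", "pageviews": 40},
]
table = ratings_by_author(articles)
for author, counts in table.items():
    print(author, [counts[label] for label in LABELS])
```
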
  57. 57. 5. Performance fluctuations by publish date and time
      Is content better received on specific days of the week, times of the day or months of the year?
      Can you tailor content publication to times that are likely to get more exposure?
      This may involve working non-standard hours or days to meet the demand of your audience.
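
If publish dates have been extracted during the crawl, grouping performance by day of the week takes only a few lines. The Python below is a sketch under assumptions: the timestamp format, field names and figures are invented, and the same approach would work for hour of day or month.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def average_by_weekday(articles):
    """Average pageviews grouped by the weekday each article was published."""
    by_day = defaultdict(list)
    for article in articles:
        # Assumed timestamp format; change it to match your extracted dates.
        published = datetime.strptime(article["published"], "%Y-%m-%d %H:%M")
        by_day[DAY_NAMES[published.weekday()]].append(article["pageviews"])
    return {day: mean(views) for day, views in by_day.items()}


# Invented sample data.
articles = [
    {"published": "2018-03-05 09:00", "pageviews": 300},
    {"published": "2018-03-12 14:30", "pageviews": 500},
    {"published": "2018-03-10 11:15", "pageviews": 1200},
]
weekday_averages = average_by_weekday(articles)
print(weekday_averages)
```
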
  58. 58. But this is just the beginning...
  59. 59. From here you want to automate the auditing process...
  60. 60. ...and pull this data into dashboards for continuous monitoring
  61. 61. And so to wrap up...
      Within each audit you want to be able to answer:
      1. What is and isn’t working well?
      2. What should you do with poor performing content?
      3. How can you get even more out of your better performing content?
      4. What patterns can you find to ensure content resources are better allocated?
  62. 62. And so to wrap up...
      The content auditing process should be centred around a cloud-based web crawling solution and be:
      ● Data driven - So that it is thorough and backed by insights rather than intuition
      ● Automated - To save you time and ensure it’s a quick and painless process
      ● Frequent - Regularly replicated to assess the impact of changes and change course accordingly
  63. 63. THANK YOU. ANY QUESTIONS? Sam Marsden, Technical SEO, @sam_marsden
