Working in the digital industry? Learn how to perform a website technical audit for SEO. Hannah Thorpe explains:
What are the core elements to check for an SEO audit?
What are the warning signs your site might be struggling?
How to fix common problems flagged during a technical audit
3. Hannah Thorpe
Head of SEO
@hannahjthorpe
hthorpe@white.net
Technical SEO
Content Marketing
Social Strategy
I improve websites
for customers,
which helps brands
sell more
@hannahjthorpe @outreachdigit
4. Our expertise
Technical SEO
Content Marketing
Digital PR/Social Media
Paid Media
5. There are approximately
54,261 searches on
Google every second
http://www.internetlivestats.com/one-second/#google-band
6. In 2016, global B2C
ecommerce sales are
predicted to reach 1.92
trillion US dollars
http://www.statista.com/statistics/261245/b2c-e-commerce-sales-worldwide/
7. TL;DR
The internet is important.
Take your website
performance seriously.
8. Sometimes blue sky thinking
just isn’t enough.
Let’s get practical.
9. One technical audit, fully implemented =
105.2% increase in number of organic sessions last
month
10. There are approximately
54,261 different ways to
do a technical audit*
*This is probably not true
12. CRAWL YOUR WEBSITE
13. WHEN BOTS CRAWL A PAGE, THEY "READ" THE CONTENT ON THE
PAGE AND ASSOCIATE IT WITH A URL. THIS INFORMATION IS
THEN PUT IN GOOGLE'S INDEX (DATABASE OF CONTENT AND
URLS) AND IT BECOMES ACCESSIBLE VIA SEARCH. THIS
PROCESS IS CALLED INDEXING.
INDEXING
14. HOW TO SEE LEVELS OF INDEXATION
Check for disparity between all measures of indexed pages
Number of HTML pages found by Screaming Frog
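One way to make "check for disparity" concrete is to compare the page counts from each source and flag a large relative gap. This is only a sketch: the deck names Screaming Frog as one measure, while the other two sources, all the figures, and the 20% tolerance are assumptions for illustration.

```python
def indexation_disparity(counts: dict[str, int]) -> float:
    """Return the relative gap between the largest and smallest count (0.0 means identical)."""
    highest = max(counts.values())
    lowest = min(counts.values())
    return 0.0 if highest == 0 else (highest - lowest) / highest


# Hypothetical figures, not taken from the deck.
measures = {
    "crawler_html_pages": 1200,      # e.g. HTML pages found by a crawler
    "xml_sitemap_urls": 1150,        # assumed second measure
    "search_console_indexed": 600,   # assumed third measure
}

TOLERANCE = 0.2  # arbitrary assumption: investigate gaps above 20%

gap = indexation_disparity(measures)
if gap > TOLERANCE:
    print(f"Investigate: measures differ by {gap:.0%}")
```

A large gap does not say which number is wrong; it only signals that the sources disagree and deserve a closer look.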
15. Does this number make
sense for your
business?
@hannahjthorpe @ODGlobal
16. CONTROLLING YOUR CRAWLING & INDEXING
                                        Robots.txt   Meta Robots   X-Robots-Tag
Prevent crawling                        Yes          No            No
Prevent indexing                        Yes          Yes           Yes
Prevent URL from showing up in index    No           Yes           Yes
Remove content from index               No           Yes           Yes
Easily implement on specific pages      No           Yes           Yes
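The "prevent crawling" behaviour of robots.txt can be checked offline with Python's standard `urllib.robotparser`. A minimal sketch, assuming a hypothetical robots.txt and example.com URLs (neither comes from the deck):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for url in ("http://www.example.com/blog/", "http://www.example.com/admin/login"):
    print(url, "crawlable" if is_crawlable(ROBOTS_TXT, url) else "blocked")
```

This only covers crawling. Whether a page is indexed depends on the meta robots tag or the X-Robots-Tag header, which this sketch does not inspect.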
17. WHEN GOOGLEBOT, BINGBOT, OR THE OTHER SEARCH
CRAWLERS ACCESS A PAGE ON YOUR WEBSITE, THE SERVER
HEADER INFORMATION IS SERVED TO THEM. THE WRONG
STATUS CODE CAN SEND THE WRONG SIGNAL TO A SEARCH
CRAWLER AND NEGATIVELY AFFECT CRAWLABILITY AND
RANKABILITY OF A WEBSITE PAGE.
STATUS CODES
18. 200. Everything is fine with the page and the crawler should crawl, cache, and index the page.
301. The page has been moved permanently to another page, so the crawler should instead crawl, cache, and index the new final page.
302. The page has been moved temporarily. The page should not be removed from the index and link equity is not passed, but both crawlers and visitors are redirected.
404. The page is no longer in existence or accessible to the search crawlers. Crawlers will eventually drop the page from the index and users will usually receive a 404 page from the website.
410. Page is gone, and is not expected to be back. Remove page from the index.
500. A server error exists and no content is accessible to either crawlers or users.
503. A “temporarily unavailable” status code that informs the crawlers and users to return later. The 503 is the best choice for site maintenance, as it is crawler friendly.
MOST COMMON STATUS CODES
20. The cause? A trailing slash
http://www.outreachdigital.org/our-team-2016/
21. HOW TO FIX UNWANTED STATUS CODES
500 SERVER PROBLEMS
This needs to go to your developers. 500 status codes are necessary at certain times so don’t eliminate them!
403 & 404 PAGES
Remove all internal references to these. Getting these response codes to display when needed can be done with a plugin/developer
301 & 302 REDIRECTS
Update the internal links; usually these are templated ones and are due to a change in site architecture
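The three fixes above can be turned into a simple triage of crawl output. This is a sketch only: the `(url, status)` pair format and the sample paths are assumptions, not the export format of any particular crawler.

```python
from collections import defaultdict


def triage(crawl_results):
    """Group (url, status) pairs into the fix buckets described on the slide."""
    buckets = defaultdict(list)
    for url, status in crawl_results:
        if status in (301, 302):
            buckets["update internal links"].append(url)
        elif status in (403, 404):
            buckets["remove internal references"].append(url)
        elif 500 <= status < 600:
            buckets["send to developers"].append(url)
        else:
            buckets["no action"].append(url)
    return dict(buckets)


# Hypothetical crawl output for illustration.
sample = [
    ("/", 200),
    ("/old-team", 301),
    ("/missing", 404),
    ("/checkout", 500),
]

for action, urls in triage(sample).items():
    print(f"{action}: {urls}")
```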
22. YOUR URL STRUCTURE & SITE ARCHITECTURE ARE KEY TO
HELPING GOOGLEBOT UNDERSTAND YOUR WEBSITE. FOLDERS
ARE YOUR FRIEND FOR ORGANISING PAGES AND BEING ABLE TO
MAKE BIG EDITS TO THE SITE.
URL STRUCTURE
24. 1. Protocol – HTTP or HTTPS
2. Subdomain – www or custom
3. Domain
4. Top Level Domain (TLD) - .com, .org
5. Category
6. Slug
http://www.domain.com/category/slug
URL COMPONENTS
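The six components can be pulled out of the example URL with Python's standard `urllib.parse`. A rough sketch, assuming an absolute URL with a single-label TLD (so a host such as `www.example.co.uk` would be split incorrectly) and treating the last path segment as the slug:

```python
from urllib.parse import urlsplit


def url_components(url: str) -> dict[str, str]:
    """Split an absolute URL into the components named on the slide."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")  # assumes the URL includes a protocol and host
    segments = [s for s in parts.path.split("/") if s]
    return {
        "protocol": parts.scheme,
        "subdomain": ".".join(labels[:-2]),  # empty when there is no subdomain
        "domain": labels[-2],
        "tld": "." + labels[-1],             # naive: single-label TLD only
        "category": "/".join(segments[:-1]),
        "slug": segments[-1] if segments else "",
    }


print(url_components("http://www.domain.com/category/slug"))
```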
25. DUPLICATE CONTENT IS SOMETHING THAT HAPPENS TO MANY
SITES. MANY TIMES THE DUPLICATE CONTENT IS NOT
INTENTIONAL AND IS CREATED BY THE WEBSITE'S CMS OR
HOSTING SERVER. THERE ARE OTHER TYPES OF DUPLICATE
CONTENT THAT ARE INTENTIONAL AND HAVE THE POSSIBILITY
OF HURTING THE PERFORMANCE OF AN ENTIRE SITE.
DUPLICATE CONTENT
26. PANDA
First hit: Feb 2011
The Panda algorithm update
affected up to 12% of search
results. Panda cracked down on
websites with thin content,
content farms and high ad-to-
content ratio.
27. CHECK FOR COMMONLY FOUND PROBLEMS
To a robot crawling a site, every different way of typing a URL can be seen as a unique page. E.g.
• http://www.outreachdigital.org
• http://outreachdigital.org
• https://www.outreachdigital.org
• https://outreachdigital.org
• www.outreachdigital.org
• outreachdigital.org
• http://www.outreachdigital.org/
• http://outreachdigital.org/
• https://www.outreachdigital.org/
• https://outreachdigital.org/
• outreachdigital.org/
• www.outreachdigital.org/
Use 301 redirects to point to one single version of each page, consistently across the site.
Implement self-referencing canonical tags on the core URLs to safeguard the site.
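To illustrate the idea of one single version per page, the sketch below maps every typed variant onto a preferred URL and builds the matching self-referencing canonical tag. The choice of HTTPS, the www host, and a trailing slash is an assumption for illustration; the deck does not say which version to prefer, and a real site would enforce this with server-side 301 redirects.

```python
from urllib.parse import urlsplit, urlunsplit

PREFERRED_SCHEME = "https"                  # assumption, not stated in the deck
PREFERRED_HOST = "www.outreachdigital.org"  # assumption, not stated in the deck


def canonical_url(url: str) -> str:
    """Map any typed variant of a page on this site onto the preferred version."""
    if "://" not in url:
        url = "//" + url  # so a bare "www.example.org/" is parsed as a host, not a path
    parts = urlsplit(url)
    path = parts.path or "/"
    if not path.endswith("/"):
        path += "/"  # assumption: the trailing-slash form is preferred
    return urlunsplit((PREFERRED_SCHEME, PREFERRED_HOST, path, parts.query, ""))


def canonical_tag(url: str) -> str:
    """Build a self-referencing canonical link element for the page."""
    return f'<link rel="canonical" href="{canonical_url(url)}">'


variants = [
    "http://www.outreachdigital.org", "http://outreachdigital.org",
    "https://www.outreachdigital.org", "https://outreachdigital.org",
    "www.outreachdigital.org", "outreachdigital.org",
    "http://www.outreachdigital.org/", "http://outreachdigital.org/",
    "https://www.outreachdigital.org/", "https://outreachdigital.org/",
    "outreachdigital.org/", "www.outreachdigital.org/",
]

print({canonical_url(v) for v in variants})  # all twelve variants collapse to one URL
print(canonical_tag("outreachdigital.org"))
```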
28. TOOLS TO FIND INTERNAL DUPLICATE CONTENT
Other types of duplicate content may be found from a user perspective, or by utilising a tool such as Siteliner:
http://www.siteliner.com/
29. EXTERNAL DUPLICATE CONTENT
http://www.copyscape.com/
External duplicate content is a higher risk for
websites, as it can trigger an algorithmic filter
more severe than internal duplication.
Common Causes:
• Stockists using similar product descriptions
• Affiliate sites using the same content
• Legacy SEO tactics
• Low quality sites stealing your content
30. IF YOU FIND YOUR PAGES ARE OFTEN TOO SIMILAR OR FLAGGED
FOR DUPLICATE CONTENT THEN YOU CAN USE CANONICALS OR
PAGINATION TO HANDLE THE ISSUE. THESE ARE WAYS OF
TALKING DIRECTLY TO THE SEARCH ENGINE BOTS TO EXPLAIN
WHY YOUR SITE IS SET UP IN THAT WAY
CANONICALS & PAGINATION
31. CANONICALS HELP TO ADD CONTEXT
A canonical indicates which version of the content you’d prefer to be indexed; however, it does not stop the user accessing the other page, or it being indexed for specific queries
https://econsultancy.com/blog/67811-how-canonical-tags-helped-waterstones-solve-a-product-ranking-nightmare/
32. PAGINATION IS GOOD FOR LISTS OF CONTENT
Use pagination to mark up that a list of blog posts (page 1, 2, 3, etc.) will all be similar and follow on from one another
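The slide does not name the exact mark-up. One common form is a pair of `rel="prev"` and `rel="next"` link elements in each page's head, and the sketch below assumes that is what is meant. The `?page=N` URL pattern and the example.com address are hypothetical, and search engines differ in how much weight they give these hints.

```python
def pagination_links(base_url: str, page: int, last_page: int) -> list[str]:
    """Return the prev/next link elements for one page of a paginated list."""

    def page_url(n: int) -> str:
        # Assumption: page 1 lives at the base URL, later pages use ?page=N.
        return base_url if n == 1 else f"{base_url}?page={n}"

    links = []
    if page > 1:
        links.append(f'<link rel="prev" href="{page_url(page - 1)}">')
    if page < last_page:
        links.append(f'<link rel="next" href="{page_url(page + 1)}">')
    return links


for tag in pagination_links("https://www.example.com/blog/", page=2, last_page=3):
    print(tag)
```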
33. THE SPEED AT WHICH YOUR WEBSITE LOADS AND CAN BE USED
BY ITS VISITORS IS IMPORTANT FROM A USER PERSPECTIVE. IT
IS ALSO A KNOWN PART OF GOOGLE’S RANKING ALGORITHM SO
KEEP AN EYE ON THE SIZE OF YOUR PAGES & OPTIMISE THEIR
LOADING.
SITE SPEED
35. IT’S THOUGHT THAT CLICK-THROUGH RATE IS A RANKING
FACTOR. WITH THIS IN MIND, MAKING YOUR ORGANIC SEARCH
RESULTS STAND OUT AND ADD EXTRA VALUE FOR THE USER IS
CRUCIAL TO YOUR SEO AUDIT.
STRUCTURED MARK-UP
36. Structured mark-up is the way you
make your results have those
yellow stars in them…
… you can also mark-up your
products, events, brand, phone
number, etc.
37. • Nested Values – allows
lots of information in one
place
• Quicker & simpler – both
learning JSON-LD and
editing it
WHAT IS JSON-LD?
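To show what "nested values" looks like in practice, here is a small sketch that builds a JSON-LD block for a product with a star rating nested inside it. The product name and rating figures are invented for illustration; the property names follow schema.org's Product and AggregateRating types.

```python
import json

# Hypothetical product data for illustration.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Widget",
    "aggregateRating": {  # nested value: a rating object inside the product
        "@type": "AggregateRating",
        "ratingValue": "4.5",
        "reviewCount": "12",
    },
}


def json_ld_script(data: dict) -> str:
    """Wrap structured data in the script element used to embed JSON-LD in a page."""
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(json_ld_script(product))
```

Because the mark-up is plain JSON in a single script element, it can be edited in one place without touching the visible HTML.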
40. YOU’VE PROBABLY ALL HEARD OF ‘MOBILEGEDDON’. AS GOOGLE
PUTS MORE OF AN EMPHASIS ON MOBILE-FRIENDLY WEBSITES
IN THE SEARCH RESULTS, IT’S KEY THAT WE THINK ABOUT
MOBILE AS PART OF OUR STRATEGIES.
MOBILE
41. DEVICE PREFERENCES THROUGHOUT THE DAY
EARLY MORNING
Mobiles brighten the commute
DAYTIME
PCs dominate working hours
EVENING
Tablets are popular at night
42. • Is the text size legible?
• How close together are the links?
• Is the content wider than the screen?
Tool:
https://www.google.com/webmasters/tools/mobile-friendly/
44. OUR ADVICE FOR WEBMASTERS IS TO
FOCUS ON CREATING HIGH QUALITY SITES
THAT CREATE A GOOD USER EXPERIENCE
AND EMPLOY WHITE HAT SEO METHODS
INSTEAD OF ENGAGING IN AGGRESSIVE
WEBSPAM TACTICS…
45. The Core Takeaways
1. Remember it’s humans who use your sites not robots, so make sure all changes make sense to users too
2. There’s no such thing as a quick win; but prioritise getting full tasks completed quickly
3. You will never be done optimising your site; keep on top of it and constantly re-audit