THE COMPLETE SEO
TECHNICAL AUDIT
@hannahjthorpe @outreachdigit
Hannah Thorpe
Head of SEO
@hannahjthorpe
hthorpe@white.net
Technical SEO
Content Marketing
Social Strategy
I improve websites
for customers,
which helps brands
sell more
Our expertise
Technical SEO
Content Marketing
Digital PR / Social Media
Paid Media
There are approximately
54,261 searches on
Google every second
http://www.internetlivestats.com/one-second/#google-band
In 2016, global B2C
ecommerce sales are
predicted to reach 1.92
trillion US dollars
http://www.statista.com/statistics/261245/b2c-e-commerce-sales-worldwide/
TL;DR
The internet is important.
Take your website
performance seriously.
Sometimes blue sky thinking
just isn’t enough.
Let’s get practical.
One technical audit, fully implemented =
105.2% increase in number of organic sessions last
month
There are approximately
54,261 different ways to
do a technical audit*
*This is probably not true
CRAWL YOUR WEBSITE
INDEXING
WHEN BOTS CRAWL A PAGE, THEY "READ" THE CONTENT ON THE
PAGE AND ASSOCIATE IT WITH A URL. THIS INFORMATION IS
THEN PUT IN GOOGLE'S INDEX (DATABASE OF CONTENT AND
URLS) AND IT BECOMES ACCESSIBLE VIA SEARCH. THIS
PROCESS IS CALLED INDEXING.
HOW TO SEE LEVELS OF INDEXATION
Check for disparity between
all measures of indexed
pages
Number of HTML pages found by Screaming Frog
Does this number make
sense for your
business?
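A minimal sketch of the disparity check, assuming you have already collected page counts from a few sources by hand. The source names and the numbers are hypothetical and only illustrate the comparison; the 20% threshold is likewise an arbitrary assumption.

```python
def indexation_report(counts: dict[str, int], baseline: str) -> dict[str, float]:
    """Express each source's page count as a ratio of the baseline count."""
    base = counts[baseline]
    if base <= 0:
        raise ValueError("baseline count must be positive")
    return {
        source: round(n / base, 2)
        for source, n in counts.items()
        if source != baseline
    }


# Hypothetical numbers for illustration only.
counts = {
    "crawler_html_pages": 1200,    # e.g. HTML pages found by a crawler
    "xml_sitemap_urls": 1150,      # URLs listed in the sitemap
    "site_search_estimate": 3400,  # rough count from a site: search
}

report = indexation_report(counts, baseline="crawler_html_pages")

# Flag any source that differs from the crawl by more than 20% (assumed threshold).
flagged = {source: ratio for source, ratio in report.items() if abs(ratio - 1) > 0.2}
```

A ratio well above 1 suggests more URLs are indexed than the crawl can find (often duplicates or parameters); a ratio well below 1 suggests pages are not being indexed.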
@hannahjthorpe @ODGlobal
CONTROLLING YOUR CRAWLING & INDEXING
                                        Robots.txt   Meta Robots   X-Robots-Tag
Prevent crawling                        Yes          No            No
Prevent indexing                        Yes          Yes           Yes
Prevent URL from showing up in index    No           Yes           Yes
Remove content from index               No           Yes           Yes
Easily implement on specific pages      No           Yes           Yes
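As a rough illustration of the difference between the columns in the table above, the sketch below checks a robots.txt rule with Python's standard `urllib.robotparser` and checks a meta robots or X-Robots-Tag value for `noindex`. The robots.txt content, the example.com URLs and the `blocks_indexing` helper are assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent crawl url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def blocks_indexing(directives: str) -> bool:
    """Check a meta robots content value or X-Robots-Tag header for noindex.

    'none' is shorthand for 'noindex, nofollow'.
    """
    tokens = {token.strip().lower() for token in directives.split(",")}
    return "noindex" in tokens or "none" in tokens


# Hypothetical robots.txt: blocks crawling of one folder for every bot.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""
```

Note that a crawler blocked by robots.txt never fetches the page, so it cannot see a `noindex` placed on that page.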
STATUS CODES
WHEN GOOGLEBOT, BINGBOT, OR THE OTHER SEARCH
CRAWLERS ACCESS A PAGE ON YOUR WEBSITE, THE SERVER
HEADER INFORMATION IS SERVED TO THEM. THE WRONG
STATUS CODE CAN SEND THE WRONG SIGNAL TO A SEARCH
CRAWLER AND NEGATIVELY AFFECT CRAWLABILITY AND
RANKABILITY OF A WEBSITE PAGE.
MOST COMMON STATUS CODES
200. Everything is fine with the page and the crawler should crawl, cache, and index the page.
301. The page has been moved permanently to another page, so the crawler should instead crawl, cache, and index the new final page.
302. The page has been moved temporarily. The page should not be removed from the index and link equity is not passed, but both crawlers and visitors are redirected.
404. The page no longer exists or is not accessible to the search crawlers. Crawlers will eventually drop the page from the index and users will usually receive a 404 page from the website.
410. The page is gone and is not expected to be back. Crawlers should remove the page from the index.
500. A server error exists and no content is accessible to either crawlers or users.
503. A “temporarily unavailable” status code that informs the crawlers and users to return later. The 503 is the best choice for site maintenance, as it is crawler friendly.
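One way to use the list above in an audit is to summarise a crawl export by status code. This is only a sketch: the `(url, status)` row format is an assumption, not the export format of any particular crawler, and the action strings paraphrase the list above.

```python
from collections import Counter

# Expected crawler behaviour per status code, paraphrasing the list above.
CRAWLER_ACTION = {
    200: "crawl, cache and index the page",
    301: "index the redirect target instead",
    302: "keep the original URL indexed",
    404: "drop from the index eventually",
    410: "remove from the index",
    500: "server error, nothing to index",
    503: "come back later",
}


def summarise_statuses(crawl: list[tuple[str, int]]) -> dict[int, tuple[int, str]]:
    """Count crawled URLs per status code, skipping healthy 200 responses."""
    counts = Counter(status for _, status in crawl)
    return {
        status: (count, CRAWLER_ACTION.get(status, "not covered in the list above"))
        for status, count in sorted(counts.items())
        if status != 200
    }


# Hypothetical crawl rows for illustration.
crawl = [("/", 200), ("/old", 301), ("/gone", 404), ("/old-2", 301)]
summary = summarise_statuses(crawl)
```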
Ayima Redirect Path Chrome Extension
The cause? A trailing slash
http://www.outreachdigital.org/our-team-2016/
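A small sketch of how a missing trailing slash turns into an extra redirect hop. The redirect map is hypothetical: it assumes the server redirects the slash-less URL to the version with the slash, which is a common setup but is not confirmed here. No network requests are made.

```python
def trace_redirects(url: str, redirects: dict[str, str], max_hops: int = 10) -> list[str]:
    """Follow a URL through a redirect map and return every URL visited."""
    chain = [url]
    while url in redirects:
        url = redirects[url]
        if url in chain:
            raise ValueError(f"redirect loop at {url}")
        chain.append(url)
        if len(chain) - 1 > max_hops:
            raise ValueError("too many redirect hops")
    return chain


# Hypothetical redirect rule: the slash-less URL redirects to the slashed one.
REDIRECTS = {
    "http://www.outreachdigital.org/our-team-2016":
        "http://www.outreachdigital.org/our-team-2016/",
}

chain = trace_redirects("http://www.outreachdigital.org/our-team-2016", REDIRECTS)
extra_hops = len(chain) - 1  # 0 means the link already points at the final URL
```

Linking directly to the final URL in internal links avoids the extra hop.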
HOW TO FIX UNWANTED STATUS CODES
500 SERVER PROBLEMS
This needs to go to your developers. 500 status codes are necessary at certain times, so don’t eliminate them!
403 & 404 PAGES
Remove all internal references to these. A plugin or a developer can make sure these response codes are returned when they are needed.
301 & 302 REDIRECTS
Update the internal links; usually these are templated links, and the redirects are due to a change in site architecture.
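The fixes above can be sketched as a simple triage of internal links. The `(source_page, link_target, status_code)` row format and the action labels are assumptions for illustration; a real crawler export will need mapping into this shape first.

```python
from collections import defaultdict


def links_to_fix(crawl_rows):
    """Group internal links by the fix they need, based on the target's status code."""
    fixes = defaultdict(list)
    for source, target, status in crawl_rows:
        if status in (301, 302):
            fixes["update link to final URL"].append((source, target))
        elif status in (403, 404):
            fixes["remove or replace link"].append((source, target))
        elif 500 <= status < 600:
            fixes["send to developers"].append((source, target))
        # 200 and anything else is left alone in this sketch
    return dict(fixes)


# Hypothetical crawl rows for illustration.
rows = [
    ("/", "/about", 200),
    ("/", "/team", 301),
    ("/blog/", "/old-post", 404),
    ("/blog/", "/search", 500),
]
todo = links_to_fix(rows)
```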
URL STRUCTURE
YOUR URL STRUCTURE & SITE ARCHITECTURE ARE KEY TO
HELPING GOOGLEBOT UNDERSTAND YOUR WEBSITE. FOLDERS
ARE YOUR FRIEND FOR ORGANISING PAGES AND BEING ABLE TO
MAKE BIG EDITS TO THE SITE.
SITE ARCHITECTURE
URL COMPONENTS
http://www.domain.com/category/slug
1. Protocol – HTTP or HTTPS
2. Subdomain – www or custom
3. Domain
4. Top Level Domain (TLD) – .com, .org
5. Category
6. Slug
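The six components can be pulled apart with Python's standard `urllib.parse`. This is a naive sketch: it assumes a single-label TLD such as .com or .org, so a host like example.co.uk would be split incorrectly, and it treats everything before the last path segment as the category.

```python
from urllib.parse import urlsplit


def url_components(url: str) -> dict[str, str]:
    """Split a URL into the six components listed above (naive TLD handling)."""
    parts = urlsplit(url)
    labels = (parts.hostname or "").split(".")
    has_domain = len(labels) >= 2
    segments = [segment for segment in parts.path.split("/") if segment]
    return {
        "protocol": parts.scheme,
        "subdomain": ".".join(labels[:-2]),
        "domain": labels[-2] if has_domain else "",
        "tld": labels[-1] if has_domain else "",
        "category": "/".join(segments[:-1]),
        "slug": segments[-1] if segments else "",
    }


components = url_components("http://www.domain.com/category/slug")
```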
DUPLICATE CONTENT
DUPLICATE CONTENT IS SOMETHING THAT HAPPENS TO MANY
SITES. OFTEN THE DUPLICATE CONTENT IS NOT
INTENTIONAL AND IS CREATED BY THE WEBSITE'S CMS OR
HOSTING SERVER. OTHER TYPES OF DUPLICATE
CONTENT ARE INTENTIONAL AND HAVE THE POSSIBILITY
OF HURTING THE PERFORMANCE OF AN ENTIRE SITE.
PANDA
First hit: Feb 2011
The Panda algorithm update
affected up to 12% of search
results. Panda cracked down on
websites with thin content,
content farms and sites with a high
ad-to-content ratio.
CHECK FOR COMMONLY FOUND PROBLEMS
To a robot crawling a site, every different way of typing a URL can be seen as a unique page. E.g.
• http://www.outreachdigital.org
• http://outreachdigital.org
• https://www.outreachdigital.org
• https://outreachdigital.org
• www.outreachdigital.org
• outreachdigital.org
• http://www.outreachdigital.org/
• http://outreachdigital.org/
• https://www.outreachdigital.org/
• https://outreachdigital.org/
• outreachdigital.org/
• www.outreachdigital.org/
Use 301 redirects to point to one single version of each page, consistently across the site.
Implement self-referencing canonical tags on the core URLs to safeguard the site.
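A sketch of the "one single version" idea: every variant typed above is mapped to one preferred form, which is the URL a 301 redirect or canonical tag would point at. Choosing HTTPS, www and a trailing slash as the preferred form is an assumption for this example; any choice works as long as it is applied consistently.

```python
from urllib.parse import urlsplit


def preferred_url(raw: str) -> str:
    """Map any protocol/www/trailing-slash variant to one preferred URL."""
    if "://" not in raw:
        raw = "http://" + raw  # bare hostnames need a scheme to parse correctly
    parts = urlsplit(raw)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]
    path = parts.path or "/"
    # Assumed preference: https + www + explicit path.
    return f"https://www.{host}{path}"


VARIANTS = [
    "http://www.outreachdigital.org", "http://outreachdigital.org",
    "https://www.outreachdigital.org", "https://outreachdigital.org",
    "www.outreachdigital.org", "outreachdigital.org",
    "http://www.outreachdigital.org/", "http://outreachdigital.org/",
    "https://www.outreachdigital.org/", "https://outreachdigital.org/",
    "outreachdigital.org/", "www.outreachdigital.org/",
]

targets = {preferred_url(variant) for variant in VARIANTS}
```

All twelve variants collapse to a single target, which is the behaviour the redirects should reproduce on the live site.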
TOOLS TO FIND INTERNAL DUPLICATE CONTENT
Other types of duplicate content may be found from a user perspective, or by utilising a
tool such as Siteliner:
http://www.siteliner.com/
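For a rough do-it-yourself check, two pages' text can be compared with word shingles and Jaccard similarity. This is a generic technique swapped in for illustration; it is not a description of how Siteliner or any search engine measures duplication, and the shingle size of 3 is an arbitrary assumption.

```python
def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of two texts: 0.0 means no overlap, 1.0 means identical shingles."""
    set_a, set_b = shingles(a, size), shingles(b, size)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)
```

Pages scoring close to 1.0 against each other are candidates for consolidation or a canonical tag.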
EXTERNAL DUPLICATE CONTENT
http://www.copyscape.com/
External duplicate content is a higher risk for
websites, as it can trigger a more severe algorithmic
filter than internal duplication does.
Common Causes:
• Stockists using similar product descriptions
• Affiliate sites using the same content
• Legacy SEO tactics
• Low quality sites stealing your content
CANONICALS & PAGINATION
IF YOU FIND YOUR PAGES ARE OFTEN TOO SIMILAR OR FLAGGED
FOR DUPLICATE CONTENT, YOU CAN USE CANONICALS OR
PAGINATION TO HANDLE THE ISSUE. THESE ARE WAYS OF
TALKING DIRECTLY TO THE SEARCH ENGINE BOTS TO EXPLAIN
WHY YOUR SITE IS SET UP IN THAT WAY.
CANONICALS HELP TO ADD CONTEXT
A canonical indicates which version of the content you’d prefer to be indexed. However, it does
not stop the user accessing the other page, or stop that page being indexed for specific queries.
https://econsultancy.com/blog/67811-how-canonical-tags-helped-waterstones-solve-a-product-ranking-nightmare/
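A minimal sketch of generating the canonical tag that goes in a page's `<head>`. The `canonical_tag` helper and the example.com URL are made up for illustration; the `<link rel="canonical">` element itself is the standard mark-up.

```python
from html import escape


def canonical_tag(preferred_url: str) -> str:
    """Build the <link rel="canonical"> element for a page's <head>."""
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'


# Every variant of a product page would output the same tag.
tag = canonical_tag("https://www.example.com/product?id=1&ref=a")
```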
PAGINATION IS GOOD FOR LISTS OF CONTENT
Use pagination mark-up to show that a list of blog
posts split across pages 1, 2, 3, etc. is a series of similar pages that
follow on from one another.
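A sketch of pagination mark-up, assuming the `rel="prev"` and `rel="next"` link elements are what is meant here. The `?page=N` URL pattern and the example.com address are assumptions. Google has since said it no longer uses these links as an indexing signal, so treat this as an illustration of the mark-up rather than a current recommendation.

```python
def pagination_tags(base_url: str, page: int, last_page: int) -> list[str]:
    """Return the rel=prev/next link elements for one page of a paginated list."""
    if not 1 <= page <= last_page:
        raise ValueError("page must be between 1 and last_page")

    def page_url(n: int) -> str:
        # Assumed URL pattern: page 1 is the base URL, later pages use ?page=N.
        return base_url if n == 1 else f"{base_url}?page={n}"

    tags = []
    if page > 1:
        tags.append(f'<link rel="prev" href="{page_url(page - 1)}">')
    if page < last_page:
        tags.append(f'<link rel="next" href="{page_url(page + 1)}">')
    return tags
```

The first page only gets a `next` link and the last page only a `prev` link, so the series has a clear start and end.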
SITE SPEED
THE SPEED AT WHICH YOUR WEBSITE LOADS AND CAN BE USED
BY ITS VISITORS IS IMPORTANT FROM A USER PERSPECTIVE. IT
IS ALSO A KNOWN PART OF GOOGLE’S RANKING ALGORITHM, SO
KEEP AN EYE ON THE SIZE OF YOUR PAGES & OPTIMISE THEIR
LOADING.
TEST, TEST & TEST AGAIN
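One simple way to keep an eye on page size between tests is a page weight budget. The sketch below assumes you have already exported resource sizes from a speed testing tool; the file names, sizes and the 1 MB budget are hypothetical.

```python
def page_weight_report(resources: dict[str, int], budget_bytes: int) -> dict:
    """Total the page weight and list the three heaviest resources."""
    total = sum(resources.values())
    heaviest = sorted(resources.items(), key=lambda item: item[1], reverse=True)
    return {
        "total_bytes": total,
        "over_budget": total > budget_bytes,
        "heaviest": heaviest[:3],
    }


# Hypothetical resource sizes in bytes for one page.
resources = {
    "hero.jpg": 900_000,
    "app.js": 400_000,
    "styles.css": 80_000,
    "index.html": 30_000,
}
weight = page_weight_report(resources, budget_bytes=1_000_000)
```

Re-running the same report after each change shows whether the optimisation actually reduced what visitors have to download.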
STRUCTURED MARK-UP
IT’S THOUGHT THAT CLICK-THROUGH RATE IS A RANKING
FACTOR; WITH THIS IN MIND, MAKING YOUR ORGANIC SEARCH
RESULTS STAND OUT AND ADD EXTRA VALUE FOR THE USER IS
CRUCIAL TO YOUR SEO AUDIT.
Structured mark-up is the way you
make your results have those
yellow stars in them…
… you can also mark-up your
products, events, brand, phone
number, etc.
WHAT IS JSON-LD?
• Nested values – allows lots of information in one place
• Quicker & simpler – both learning JSON-LD and editing it
JSON-LD IN REAL LIFE
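A minimal sketch of what JSON-LD looks like in practice, using schema.org's Product and AggregateRating types (the rating is the kind of data behind star results). The product name and numbers are hypothetical, and whether a search engine shows stars for a given page is entirely up to the search engine.

```python
import json


def product_jsonld(name: str, rating_value: float, review_count: int) -> str:
    """Build a JSON-LD script block for a product with an aggregate rating."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        # Nested value: the rating sits inside the product it describes.
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": rating_value,
            "reviewCount": review_count,
        },
    }
    return (
        '<script type="application/ld+json">\n'
        + json.dumps(data, indent=2)
        + "\n</script>"
    )


snippet = product_jsonld("Example Widget", 4.5, 27)
```

The block goes in the page's HTML and can be edited without touching the visible content, which is where the "quicker & simpler" point comes from.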
TEST YOUR MARK-UP
https://developers.google.com/structured-data/testing-tool/
MOBILE
YOU’VE PROBABLY ALL HEARD OF ‘MOBILEGEDDON’. AS GOOGLE
PUTS MORE OF AN EMPHASIS ON MOBILE-FRIENDLY WEBSITES
IN THE SEARCH RESULTS, IT’S KEY THAT WE THINK ABOUT
MOBILE AS PART OF OUR STRATEGIES.
DEVICE PREFERENCES THROUGHOUT THE DAY
EARLY MORNING – Mobiles brighten the commute
DAYTIME – PCs dominate working hours
EVENING – Tablets are popular at night
• Is the text size legible?
• How close together are the links?
• Is the content wider than the screen?
Tool:
https://www.google.com/webmasters/tools/mobile-friendly/
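As a quick first pass before running the tool, you can check whether a page declares a responsive viewport, since a missing viewport tag is a common reason for content being wider than the screen. This is only a proxy chosen for illustration; it is not how Google's mobile-friendly test works, and the HTML samples are made up.

```python
from html.parser import HTMLParser


class ViewportFinder(HTMLParser):
    """Record the content of the first-or-last <meta name="viewport"> tag seen."""

    def __init__(self):
        super().__init__()
        self.viewport = None

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attr = dict(attrs)
            if (attr.get("name") or "").lower() == "viewport":
                self.viewport = attr.get("content") or ""


def has_responsive_viewport(html: str) -> bool:
    """Return True if the page sets width=device-width in a viewport meta tag."""
    finder = ViewportFinder()
    finder.feed(html)
    if finder.viewport is None:
        return False
    return "width=device-width" in finder.viewport.replace(" ", "").lower()


# Hypothetical pages for illustration.
RESPONSIVE = '<head><meta name="viewport" content="width=device-width, initial-scale=1"></head>'
FIXED = "<head><title>Desktop only</title></head>"
```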
FINAL THOUGHTS
OUR ADVICE FOR WEBMASTERS IS TO
FOCUS ON CREATING HIGH QUALITY SITES
THAT CREATE A GOOD USER EXPERIENCE
AND EMPLOY WHITE HAT SEO METHODS
INSTEAD OF ENGAGING IN AGGRESSIVE
WEBSPAM TACTICS…
The Core Takeaways
1. Remember it’s humans who use your sites, not robots, so make sure all changes make sense to users too
2. There’s no such thing as a quick win, but prioritise getting full tasks completed quickly
3. You will never be done optimising your site; keep on top of it and constantly re-audit
Extra Resources…
http://www.themediaflow.com/technical-seo-audit-checklist
https://moz.com/blog/technical-site-audit-for-2015
https://www.distilled.net/resources/technical-audit-checklist-for-human-beings/
https://www.semrush.com/blog/my-top-8-tools-for-a-technical-seo-audit/
https://www.koozai.com/resources/whitepapers/how-to-perform-a-technical-seo-audit/
white.net
@hannahjthorpe
hthorpe@white.net
Tools…
Crawling: Screaming Frog, Deep Crawl, Xenu
Redirect Monitoring: Ayima Redirect Path, Check My Links
Duplicate Content Detection: Siteliner, Copyscape
Site Speed: tools.pingdom.com, WebPageTest
Thanks for listening!
