Getting to Grips with
Python & Machine Learning
for SEO
Ruth Everett // DeepCrawl
https://www.slideshare.net/RuthEverett1
@rvtheverett
Ruth Everett
Technical SEO Analyst
@rvtheverett
@rvtheverett @deepcrawl #BrightonSEO
Allow: /dogs
Allow: /SEO
Allow: /python
My coding partner
in crime
PROBLEM
SEOs are busy
SOLUTION
Automation
Enter Data Analysis & Automation
with Python
What We’ll Cover
Getting Started with Python
How Python can help with Technical SEO
An Introduction to Machine Learning for SEO
GETTING STARTED WITH PYTHON
Before
Now
WHAT IS PYTHON?
Code written in the terminal
Results generated
Open-source interactive programming language
Interpreted line by line
COMPANIES USING PYTHON
"Python has been an important part of Google since the beginning, and remains so as the system grows and evolves. Today dozens of Google engineers use Python, and we're looking for more people with skills in this language."
- Peter Norvig, Google
"Python is fast enough for our site and allows us to produce maintainable features in record times, with a minimum of developers"
- Cuong Do, YouTube
CODECADEMY
20 week online course
Mixture of theory and practical
A range of projects to undertake
Code console & terminal to play and test
DATACAMP
Wide range of skill tracks
Interactive exercises
Instant explanations
Challenges and projects
https://www.datacamp.com/learn/python/
SOLOLEARN
Free mobile app
Learn Python on the go
Over 200 practice questions
Code Playground
https://www.sololearn.com/Course/Python/
CODECOMBAT
https://codecombat.com/
USING PYTHON
Mac - Terminal
Windows - Command Line
Google Colab
Jupyter Notebook
PYTHON LIBRARIES
Data extraction & analysis
Scientific Computing
Natural Language Processing
Machine Learning
HOW PYTHON CAN HELP WITH TECHNICAL SEO
WHY SHOULD WE CARE?
Data extraction and analysis to solve complex problems
Future-proofing your job
Efficiency and time-saving
Automating repetitive tasks
https://www.ranksense.com/empowering-a-new-generation-of-seos-with-python/
Spend 5 hours a week using Excel
That's 20 hours a month
Over 200 hours a year
Imagine what we could achieve if we spent this time on other important tasks (that can’t be automated)
Redirect Relevancy
Pivot Tables
WHY IS PYTHON GROWING IN POPULARITY IN THE SEO SPACE?
Make data-driven decisions
Focus on other important optimisation efforts
Confidence in recommendations
Provide concrete insights
Better understand data
AUTOMATING WITH PYTHON
Parameter Finder
404 Checker
Internal Linking Analysis
Image Optimisation
Website Scraping
Keyword Research
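To make one of the automation ideas above concrete, here is a minimal sketch of a parameter finder built on the standard library only. It is not the script used in the talk: the function name `find_parameters` and the sample URLs are assumptions made up for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def find_parameters(urls):
    """Count how often each query parameter name appears across a list of URLs."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # keep_blank_values=True so parameters such as "?sort=" are still counted
        for name, _value in parse_qsl(query, keep_blank_values=True):
            counts[name] += 1
    return counts


# Hypothetical crawl export, invented for this example
sample_urls = [
    "https://example.com/shoes?colour=red&size=9",
    "https://example.com/shoes?colour=blue",
    "https://example.com/blog/python-seo",
    "https://example.com/search?q=boots&sort=",
]

parameter_counts = find_parameters(sample_urls)
for name, count in parameter_counts.most_common():
    print(f"{name}: {count}")
```

In practice the list of URLs would come from a crawl export rather than being typed in by hand.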
CHALLENGE - MISSING ALT TEXT
SOLUTION - IMAGE CAPTIONING WITH PYTHIA
IMAGE CAPTIONING WITH PYTHIA
Pythia Modular Framework
https://paperswithcode.com/paper/bottom-up-and-top-down-attention-for-image
https://learnpythia.readthedocs.io/en/latest/
Google Colab Link
It’s not perfect though!
CHALLENGE - LARGE IMAGE FILE SIZES
SOLUTION - OPTIMISE IMAGES
OPTIMISE IMAGES WITH PILLOW
Pure Python using the Pillow library
This script optimises images destructively, overwriting the original files
Optimise a single image:
optimize-images filename.jpg
Optimise a folder with multiple images:
optimize-images ./
Github Link
Original Optimised
UNDERSTANDING PAGERANK
https://colab.research.google.com/drive/1zQ8VFcNmwVLKEMwJ3lhTginPoSC5TdpB
No coding knowledge required!
OTHER POSSIBILITIES
Log File analysis
Validate hreflang
Identify duplicate URLs
Perform competitor analysis
Automate page speed audits
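As an illustration of the first item in that list, log file analysis, the sketch below counts Googlebot requests per URL and status code. It assumes the common "combined" access log format and invented log lines. Matching on the user agent string alone does not verify that a request really came from Google, so treat it as a starting point only.

```python
import re
from collections import Counter

# Pattern for the common "combined" access log format; real logs may differ
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)


def googlebot_hits(lines):
    """Count (path, status) pairs for requests whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and "Googlebot" in match.group("agent"):
            hits[(match.group("path"), int(match.group("status")))] += 1
    return hits


GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Invented log lines for illustration only
sample_log = [
    f'66.249.66.1 - - [01/Oct/2020:10:00:00 +0000] "GET /dogs HTTP/1.1" 200 512 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [01/Oct/2020:10:00:05 +0000] "GET /old-page HTTP/1.1" 404 128 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.7 - - [01/Oct/2020:10:00:09 +0000] "GET /dogs HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0)"',
    f'66.249.66.1 - - [01/Oct/2020:10:01:00 +0000] "GET /dogs HTTP/1.1" 200 512 "-" "{GOOGLEBOT_UA}"',
]

hits = googlebot_hits(sample_log)
for (path, status), count in hits.most_common():
    print(path, status, count)
```

Sorting the result by status code is a quick way to spot crawl budget being spent on 404s or redirects.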
Think about what you can automate!
PAGESPEED API WITH PYTHON
https://colab.research.google.com/drive/1Oe1VTocg21KIVDqROXSt15H6CoO905D0
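The linked notebook is not reproduced here. As a rough sketch of what a PageSpeed audit script might look like, the example below builds a request URL for the PageSpeed Insights v5 endpoint and reads the Lighthouse performance score from the JSON reply. The helper names are hypothetical, and whether an API key is needed depends on the current quotas for the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def build_request_url(page_url, strategy="mobile", api_key=None):
    """Build a PageSpeed Insights v5 request URL for one page."""
    params = {"url": page_url, "strategy": strategy}
    if api_key:  # only sent when supplied; check current quotas for keyless use
        params["key"] = api_key
    return f"{PSI_ENDPOINT}?{urlencode(params)}"


def performance_score(response):
    """Return the Lighthouse performance score as 0-100, or None if it is missing."""
    score = (
        response.get("lighthouseResult", {})
        .get("categories", {})
        .get("performance", {})
        .get("score")
    )
    return None if score is None else round(score * 100)


def fetch_score(page_url, strategy="mobile"):
    """Call the API over the network and return the performance score."""
    with urlopen(build_request_url(page_url, strategy)) as reply:
        return performance_score(json.load(reply))


# No network call is made here; fetch_score("https://www.example.com/") would make one
request_url = build_request_url("https://www.example.com/")
print(request_url)
```

Looping `fetch_score` over a list of URLs and writing the results to a CSV file is one way to turn this into a repeatable audit.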
PYTRENDS
OTHER FUN PYTHON PROJECTS
Create a bot using Python, Telegram and RandomDog API
https://www.practicepython.org/
https://realpython.com/pygame-a-primer/
https://inventwithpython.com/pygame/
AN INTRODUCTION TO MACHINE LEARNING FOR SEO
WHAT IS MACHINE LEARNING?
“Machine learning is an application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed.”
https://www.expertsystem.com/machine-learning-definition/
POWERING MACHINE LEARNING
Run a script to train the computer, using a dataset
Summarise & Visualise the dataset
Evaluate the algorithms
Make Predictions
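The steps above can be sketched end to end in a few lines of plain Python. This is a toy illustration under loud assumptions: the dataset is invented, the two features (word count and internal links) are arbitrary, and a 1-nearest-neighbour classifier stands in for whatever algorithm a real project would choose.

```python
from statistics import mean

# Toy dataset invented for this example: (word_count, internal_links) -> traffic label
pages = [
    ((300, 2), "low"), ((450, 3), "low"), ((500, 1), "low"), ((350, 4), "low"),
    ((1500, 20), "high"), ((1800, 25), "high"), ((1300, 18), "high"), ((2000, 30), "high"),
]

# Hold back some rows so the model is evaluated on data it has not seen
train = pages[:3] + pages[4:7]
test = [pages[3], pages[7]]

# Summarise the training data
for label in ("low", "high"):
    word_counts = [features[0] for features, y in train if y == label]
    print(label, "mean word count:", mean(word_counts))


def distance(a, b):
    """Squared Euclidean distance between two feature tuples (features are not scaled here)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(features, training_rows):
    """1-nearest-neighbour: copy the label of the closest training row."""
    _, label = min(training_rows, key=lambda row: distance(row[0], features))
    return label


# Evaluate on the held-back rows
correct = sum(predict(features, train) == y for features, y in test)
accuracy = correct / len(test)
print("accuracy:", accuracy)

# Make a prediction for a new, unseen page
new_page_label = predict((1600, 22), train)
print("prediction:", new_page_label)
```

With a nearest-neighbour model, "training" is simply storing the labelled rows, which keeps the sketch short. Libraries such as scikit-learn follow the same train, evaluate, predict pattern with far more capable algorithms.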
REAL WORLD MACHINE LEARNING EXAMPLES
RankBrain
NLP
Computer Vision
BERT
Twitter Curated Timelines
Facebook Chatbots
https://ipullrank.com/machine-learning-guide/how-to-set-up-a-chatbot/
Personalised Recommendations
https://medium.com/netflix-techblog/artwork-personalization-c589f074ad76
DATA IS THE FUEL FOR MACHINE LEARNING
SUPERVISED LEARNING
UNSUPERVISED LEARNING
MACHINE LEARNING SIMPLIFIED
“Machine learning will help us make sense of an increasingly complex world. Already we are exposed to more data than what our sensors can cope with or our brains can process.”
- Ethem Alpaydin
SEO POSSIBILITIES WITH MACHINE LEARNING
Evaluating Content Quality
Log File Analysis
Predictive analysis
Title Tag Optimisation
User Engagement Insights
Audio Transcribing
PREDICTIVE PREFETCHING
Automate the process of predictive prefetching
Predict the next page a user is likely to visit and prefetch these pages.
Predict the next piece of content (article, product, video) a user is likely to want to view and adjust or filter the user experience to account for this.
Predict the types of widgets an individual user is likely to interact with more (e.g. games) and use this data to tailor a more custom experience.
https://guess-js.github.io/docs
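The first of these ideas, predicting the next page a user is likely to visit, can be illustrated with a first-order Markov chain over navigation sessions. This is a simplified sketch and not how Guess.js is implemented: the session data is invented, and the function names and the 0.5 threshold are assumptions chosen for the example.

```python
from collections import Counter, defaultdict


def build_transitions(sessions):
    """Count page-to-page transitions across sessions (each session is a list of paths)."""
    transitions = defaultdict(Counter)
    for session in sessions:
        for current_page, next_page in zip(session, session[1:]):
            transitions[current_page][next_page] += 1
    return transitions


def predict_next(transitions, page, threshold=0.5):
    """Return the most likely next page if its share of transitions meets the threshold."""
    following = transitions.get(page)
    if not following:
        return None
    candidate, count = following.most_common(1)[0]
    probability = count / sum(following.values())
    return candidate if probability >= threshold else None


# Invented navigation sessions; real ones would come from analytics data
sessions = [
    ["/", "/dogs", "/dogs/training"],
    ["/", "/dogs", "/dogs/breeds"],
    ["/", "/python"],
    ["/dogs", "/dogs/training"],
]

transitions = build_transitions(sessions)
next_page = predict_next(transitions, "/")
print(next_page)
```

A page could then prefetch the predicted URL, for example with a `<link rel="prefetch">` hint. The threshold keeps low-confidence guesses from wasting bandwidth.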
INTERNAL LINKING
Crawl to identify broken internal links
Algorithm to suggest the most accurate replacement page
Replace broken internal links
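The talk does not say which algorithm suggests the replacement page. As one simple possibility, the sketch below matches each broken URL to the most similar live URL using string similarity from the standard library's `difflib`. The URLs are invented, and a production approach would more likely compare page content than URL strings.

```python
from difflib import get_close_matches


def suggest_replacements(broken_urls, live_urls, cutoff=0.6):
    """Map each broken URL to the most similar live URL, or None if nothing is close enough."""
    suggestions = {}
    for broken in broken_urls:
        matches = get_close_matches(broken, live_urls, n=1, cutoff=cutoff)
        suggestions[broken] = matches[0] if matches else None
    return suggestions


# Invented crawl data for illustration
live_urls = [
    "/blog/python-for-seo",
    "/blog/machine-learning-for-seo",
    "/services/technical-seo",
]
broken_urls = ["/blog/python-seo", "/contact-us-old"]

suggestions = suggest_replacements(broken_urls, live_urls)
for broken, replacement in suggestions.items():
    print(broken, "->", replacement)
```

URLs with no suggestion are left as `None` so that a person can review them rather than having a poor match applied automatically.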
CONTENT QUALITY
Search Volume
Uniqueness
Freshness
Internal Links
Word Count
Search Traffic
Heading Tags
Time on page
Bounce Rate
Conversion Rate
Model generates insights on the factors that are most important.
Important content factors -> Machine Learning Model -> Content Quality Score
USER EXPERIENCE
Sentiment analysis - Instagram bullying language
Image cropping - Twitter
Computer Vision - Making images accessible
Chatbots - Helping users find the most useful content
Remember trust is important - let users know if they are talking to a bot rather than a human
https://github.com/mgechev/guess-next
NATURAL LANGUAGE PROCESSING
MACHINE LEARNING TOOLS
Google’s NLP Model
Natural Language uses machine learning to reveal the structure and meaning of text.
Analyses text to understand the sentiment, as well as extract key information.
https://cloud.google.com/natural-language/
https://github.com/BritneyMuller/colab-notebooks
@BritneyMuller
Entity Salience
Entity Categorisation
IMAGE CATEGORISATION
TENSORFLOW FOR POETS
Retrain an already trained model using transfer learning for a similar problem.
Train a simple classifier to classify images of flowers.
https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/#0
THE FUTURE OF SEO
Understand and solve problems faster
Make data-driven decisions
Focus on other important optimisation activities
Improve user experience
TALK TO YOUR DEVELOPERS
JOIN COMMUNITIES
https://pyslackers.com/web
https://www.100daysofcode.com/
KEEP PRACTICING AND HAVE FUN
PEOPLE TO FOLLOW
@britneymuller
@hamletbatista
@TylerReardon
@DataChaz
@dawnieando
@jroakes
@jessthebp
@aysunakarsu
@math_rachel
DEEPCRAWL PROFESSIONAL SERVICES
@BermanHale
@allophonousrex
@rachelleighrva
@NeilDesai
@theJimmyB0b
@Rick_BarK
KEY TAKEAWAYS
Python can help technical SEOs increase their efficiency.
Being able to better understand data will lead to better decisions being made.
Anyone can learn Python, with a little commitment. Have fun with it and see what you can create.
USEFUL RESOURCES
https://www.python.org/
https://www.searchenginejournal.com/python-seo-data-reference-guide/287927/
https://www.searchenginewatch.com/2019/02/06/using-python-to-recover-seo-site-traffic-part-one/
https://cs109.github.io/2015/
https://www.deepcrawl.com/blog/webinars/scaling-automated-quality-text-generation-for-enterprise-sites/
https://automatetheboringstuff.com/
https://towardsdatascience.com/beginners-guide-to-machine-learning-with-python-b9ff35bc9c51
https://www.searchenginejournal.com/python-technical-seo/330515
https://www.searchenginejournal.com/introduction-to-python-seo-spreadsheets/342779/
https://www.fullstackpython.com/
https://www.tensorflow.org/learn
THANK YOU
#BrightonSEO
Ruth Everett
Technical SEO Analyst
@rvtheverett // @deepcrawl

Getting Started with Python and Machine Learning for SEO | BrightonSEO October 2020 | Ruth Everett