‘Natural Semantic SEO’ – Surfacing walnuts in densely represented, ever-increasingly small worlds – ‘Umbrella’ and ‘Sidecar’ approaches
Dawn Anderson, Bertey
Semantic SEO – So many definitions float about
IT’S ALL ABOUT SYNONYMS
IT’S ALL ABOUT TOPICS
IT’S ALL ABOUT INTENT
IT’S ALL ABOUT ‘THINGS’ NOT ‘STRINGS’
IT’S ABOUT ENTITIES
Simplified… it’s the connection & understanding of 3 types of data:
STRUCTURED – EASY TO UNDERSTAND
SEMI-STRUCTURED – SOME FORM / CATEGORISATION
UNSTRUCTURED – LOOSE / DIFFICULT TO UNDERSTAND
Assisting a reflection in a website as a mirror of ‘real world’ contexts
True Structured Data Is Easy to Disambiguate
Resides in a database
Is quantifiable
Can be tabular data
Is well organised
Is likely stored in rows & columns
Has relational keys mapped to fields
A simple table is ‘true structured data’ (a relational database)
[Illustrative table: rows keyed by Row ID, columns keyed by Column ID, with placeholder cell values]
Wikipedia is the best example of semi-structured data & structured data combined
But it’s estimated 80-90% of the world’s data is unstructured (2021)
Source: https://venturebeat.com/2021/07/22/why-unstructured-data-is-the-future-of-data-management/
Unstructured (disorganised) data is chaos
The bigger the multi-dimensional data grows, the harder the problem
Because…
NUANCE
AMBIGUITY
LACK OF CONTEXT
MANY TYPES OF MEDIA
SHEER VOLUME
SPARSITY OF MATCHES
Bellman’s ‘Curse of Dimensionality’
But… there is good news
Semantic search seeks to tie structured data & unstructured data together – disambiguating context
Unstructured data (text in web content)
Structured data (knowledge graphs / knowledge repositories)
‘Natural language’ research is helping search engines fill ‘contextual gaps’ between identifiable entities
Better understanding is becoming easier to achieve, even without schema markup, through natural semantic SEO
Research progress is on fire
Let’s look at the bigger picture
4 key areas of search will help us understand these developments:
Precision & Recall
Ranking & Re-ranking
Comprehensiveness v Specificity
Lexical & Vector Search
Precision & Recall
Precision (accuracy / preciseness) – of the documents retrieved, how many are actually relevant
Recall – of all the relevant documents in the collection, how many were retrieved
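As a minimal sketch (with hypothetical retrieved and relevant document IDs invented for illustration), the two measures could be computed like this:

```python
# Minimal sketch: precision & recall over hypothetical document IDs.
retrieved = {"doc1", "doc2", "doc3", "doc4"}   # what the engine returned
relevant = {"doc2", "doc4", "doc7"}            # what actually answers the query

true_positives = retrieved & relevant
precision = len(true_positives) / len(retrieved)  # share of retrieved docs that are relevant
recall = len(true_positives) / len(relevant)      # share of relevant docs that were retrieved

print(f"precision={precision:.2f}, recall={recall:.2f}")  # precision=0.50, recall=0.67
```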
Ranking & Re-ranking
Stages of search engine ranking (at least two):
Initial ranking – a shortlist
Re-ranking – the grand finale
Query Comprehensiveness v Query Specificity
The under-specified query == generic head terms
Mostly, search engines have no clue what the user wants with broad, generic terms
They’ve even started asking the user to extend the query
e.g. ‘Dresses’ – what does the user even want?
e.g. ‘Wedding Dresses’ – still under-specified
Most SEO tools categorise search terms with a single intent
But there will be a descending order of intent probability
Head Terms Will Have Multiple Intents
Most tools only categorise head terms with a single intent
But this is not the case
Intent will have a descending order of predictable probability
Intent shifts often, and with temporal patterns / predictability
Comprehensive search results (Search Result Diversification) seek to meet all of these possible intents, probably predicting intents in descending order of likelihood
Winner?
Head – ranking via comprehensiveness (meet ALL the needs of the under-specified query, within the collection)
Tail – ranking by rareness / specificity / precision (meet that one SPECIFIC need with rarity)
Lexical & Semantic / Similarity Search
Lexical search – ‘bag of words’ (e.g. TF-IDF & BM25 algorithms) (word matches)
Semantic search – vector similarity search
Vectors – mathematical representations where words with similar meaning ‘live’ near each other
Measured with Euclidean distance or cosine similarity
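A small sketch of both measures with NumPy. The three-dimensional ‘word vectors’ here are toy values purely for illustration; real embeddings have hundreds of dimensions:

```python
import numpy as np

# Toy word vectors (illustrative values only).
dress = np.array([0.9, 0.1, 0.3])
gown = np.array([0.8, 0.2, 0.4])
tractor = np.array([0.1, 0.9, 0.7])

def euclidean(a, b):
    # Straight-line distance between two points in the vector space.
    return np.linalg.norm(a - b)

def cosine_similarity(a, b):
    # Angle-based similarity: 1.0 means pointing the same way.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(euclidean(dress, gown), euclidean(dress, tractor))                  # gown is nearer
print(cosine_similarity(dress, gown), cosine_similarity(dress, tractor))  # gown is more similar
```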
Lexical search is sparse
“By definition, a sparse matrix is called “sparse” if most of its elements are zero. In the bag of words model, each document is represented as a word-count vector.”
Source: https://sebastianraschka.com/faq/docs/bag-of-words-sparsity.html
Vector / similarity search word embeddings are somewhat denser
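A sketch of that sparsity with scikit-learn’s bag-of-words vectoriser (the three toy documents are invented for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "wedding dresses for summer",
    "summer dresses on sale",
    "vintage motorcycle sidecar",
]

vectorizer = CountVectorizer()
bow = vectorizer.fit_transform(docs)   # sparse document-term matrix

# Most cells are zero: each document uses only a handful of the full vocabulary.
total_cells = bow.shape[0] * bow.shape[1]
print(bow.shape, f"{bow.nnz}/{total_cells} cells are non-zero")
```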
The biggest leap forward in natural language understanding is undoubtedly BERT (Devlin, 2018)
BERT == separate contextual vectors to disambiguate
Vector 1 – Bank: money, cash
Vector 2 – Bank: river, water
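A minimal sketch of that sense separation using Hugging Face’s transformers library with the public bert-base-uncased checkpoint (an illustrative setup, not a production search stack). The same word ‘bank’ gets a different contextual vector depending on its sentence:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def bank_vector(sentence):
    """Return the contextual vector BERT assigns to the token 'bank' in this sentence."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # one vector per token
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return hidden[tokens.index("bank")]

money = bank_vector("she deposited cash at the bank")
money2 = bank_vector("the bank approved the loan")
river = bank_vector("they fished from the river bank")

cos = torch.nn.functional.cosine_similarity
# The two money-sense vectors should sit closer together than money vs river sense.
print(cos(money, money2, dim=0).item(), cos(money, river, dim=0).item())
```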
But computational scale is a challenge
The ‘Curse of Dimensionality’ strikes again
How has this challenge been addressed?
Use BERT sparingly, in the re-ranking stages of ranking pipelines
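A sketch of that two-stage pattern with the sentence-transformers CrossEncoder class. The query, shortlist and the ms-marco-MiniLM-L-6-v2 checkpoint are illustrative choices, not a description of any particular search engine’s pipeline:

```python
from sentence_transformers import CrossEncoder

# Stage 1 (cheap lexical / ANN retrieval) has already produced a shortlist:
query = "waterproof walking boots"
shortlist = [
    "Lightweight waterproof hiking boots for muddy trails",
    "How to waterproof leather walking boots at home",
    "Summer sandals sale",
]

# Stage 2: run the expensive BERT-style cross-encoder only on the shortlist.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
scores = reranker.predict([(query, doc) for doc in shortlist])

for doc, score in sorted(zip(shortlist, scores), key=lambda pair: -pair[1]):
    print(round(float(score), 3), doc)
```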
Build teacher & student BERTs
Build ‘lighter’ natural language models
Increase performance dramatically whilst trading off minimal loss
Break the problem down further:
Break down a document and turn it into passages
Use BERT on ‘passages’ of documents
Passage indexing / ranking & re-ranking
Put the document back together
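A minimal sketch of the passage idea: chunk a long document, score each passage against the query, and keep the best passage score for the document. The naive 50-word chunker, the file name and the all-MiniLM-L6-v2 model are assumptions for illustration only:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

def passage_chunks(text, size=50):
    """Naive chunker: split a document into ~50-word passages."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

query = "how do I clean walnut shells"
document = open("long_article.txt").read()   # hypothetical long document

passages = passage_chunks(document)
query_vec = model.encode(query, convert_to_tensor=True)
passage_vecs = model.encode(passages, convert_to_tensor=True)

scores = util.cos_sim(query_vec, passage_vecs)[0]   # one score per passage
best = int(scores.argmax())
print("document score =", float(scores[best]), "| best passage:", passages[best][:80])
```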
Fight back against ‘The Curse of Dimensionality’
Dense retrieval
Go smaller still: break the problem down into sentences & compare cosine similarity of sentence pairs
Sentence-BERT
65 hours with a normal BERT / RoBERTa model to find the two most similar sentences among 10,000 sentences
The same task takes around 5 seconds with Sentence-BERT
Siamese BERT networks – two BERT encoders working together (with shared weights)
Siamese BERT – utilising cosine similarity of the two output sentence embeddings
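A sketch with the sentence-transformers library, which implements the SBERT approach; the sentences and the all-MiniLM-L6-v2 checkpoint are illustrative assumptions:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "Wedding dresses with lace sleeves",
    "Bridal gowns featuring lace arms",
    "Motorcycle sidecar maintenance tips",
]

# Each sentence passes through the same shared-weight encoder: the 'Siamese' setup.
embeddings = model.encode(sentences, convert_to_tensor=True)

# Cosine similarity between every pair of sentence embeddings.
similarity = util.cos_sim(embeddings, embeddings)
print(similarity)   # the two dress sentences score far higher with each other
```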
Turn the multi-stage retrieval pipeline on its head
Feed the re-ranked output as training data to the first-stage ranker so it learns better ‘precision’ from the outset
Machine-learned Dense Passage Retrieval (with hard negatives)
Hard negatives == ‘rule out the obvious incorrects from the start’
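A hedged sketch of training a dense retriever with hard negatives using sentence-transformers. The triplet data is invented for illustration, and MultipleNegativesRankingLoss is one loss in that library that accepts (query, positive, hard-negative) triples; this is not a description of any specific production system:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("all-MiniLM-L6-v2")

# Each example: (query, relevant passage, hard negative that *looks* relevant but isn't).
train_examples = [
    InputExample(texts=[
        "security alarm installation cost",
        "Typical prices for installing a monitored intruder alarm system",
        "History of fire alarm regulations in commercial buildings",   # hard negative
    ]),
]

loader = DataLoader(train_examples, shuffle=True, batch_size=1)
loss = losses.MultipleNegativesRankingLoss(model)

# One tiny epoch purely as a sketch; real dense-retrieval training uses large triplet sets.
model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=0)
```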
How about clusters of similarity packed together for density?
How about Sentence-BERT with FAISS (Facebook AI Similarity Search)?
“A library for efficient similarity search and clustering of dense vectors” – Facebook Research
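A sketch combining SBERT embeddings with a FAISS index for fast similarity search; the corpus, query, index type and model name are illustrative choices:

```python
import numpy as np
import faiss
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "wedding dresses with long sleeves",
    "lace bridal gowns",
    "motorcycle sidecar fitting service",
    "walnut and peanut mixed snack packs",
]

embeddings = model.encode(corpus, normalize_embeddings=True)  # unit-length vectors
index = faiss.IndexFlatIP(embeddings.shape[1])                # inner product == cosine on unit vectors
index.add(np.asarray(embeddings, dtype="float32"))

query = model.encode(["long sleeved bridal dress"], normalize_embeddings=True)
scores, ids = index.search(np.asarray(query, dtype="float32"), k=2)
print([(corpus[i], float(s)) for i, s in zip(ids[0], scores[0])])
```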
What about adapting TF-IDF to include topic classes (c-TF-IDF)?
BERTopic & c-TF-IDF (class-based TF-IDF)
Utilises topical classes / concepts combined with term frequency / inverse document frequency
Builds topic clusters as an alternative to simple word count in proportion to document length
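A minimal BERTopic sketch; the library embeds documents, clusters them, then applies c-TF-IDF per cluster. The 20 newsgroups sample is a stand-in corpus purely for illustration; in SEO practice this would be your own page copy or query data:

```python
from sklearn.datasets import fetch_20newsgroups
from bertopic import BERTopic

# Public sample corpus used only so the sketch runs end to end.
docs = fetch_20newsgroups(subset="train", remove=("headers", "footers", "quotes")).data[:2000]

topic_model = BERTopic()                  # embed -> cluster -> c-TF-IDF per cluster
topics, probs = topic_model.fit_transform(docs)

print(topic_model.get_topic_info().head())   # top c-TF-IDF terms per topic cluster
```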
Take similarity seeking beyond the document
Adapt Navigable Small World (NSW) graph algorithms (small world networks) with hierarchical, machine-learned ‘similarity distances’
‘Hierarchical Navigable Small World’ graphs use greedy traversal across hierarchical graph layers to identify the ‘nearness’ of semantically similar neighbours (Approximate Nearest Neighbours)
Search for the ‘Approximate Nearest Neighbour’ (ANN) across multiple tree levels
Semantic ‘small worlds’ across multiple layers of a connected link graph
To find the most semantically similar ‘cluster centroids’ to a query
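A sketch of HNSW-based approximate nearest neighbour search using the open-source hnswlib library; the corpus, query, model name and index parameters are illustrative assumptions:

```python
import numpy as np
import hnswlib
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "wedding dresses", "bridal gowns", "lace wedding dress with sleeves",
    "intruder alarms", "security alarm systems", "walnut snack packs",
]
embeddings = model.encode(corpus)

dim = embeddings.shape[1]
index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=len(corpus), ef_construction=200, M=16)  # M = 'friends list' size per node
index.add_items(embeddings, np.arange(len(corpus)))

labels, distances = index.knn_query(model.encode(["long sleeve bridal dress"]), k=3)
print([corpus[i] for i in labels[0]])   # approximate nearest neighbours, found via layered graph hops
```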
Huge progress overall in semantic similarity search
So why does this matter to SEOs?
Tree-graph ‘small world’ semantic distance matters increasingly…
And… Zipfian distribution in search demand (Pareto / power laws) also matters: IMPORTANCE
Walnuts, umbrellas & sidecars are your friends
Some peanuts too… but with precision
In a bowl of mixed nuts… walnuts rise to the top
Search engines are seeking consistent, strong learned confidence in ANN cluster centroids
But… ‘Approximate Nearest Neighbour’ is still just that… approximate
Natural semantic SEO helps to pinpoint perfect centres whilst also considering Zipfian distribution (importance)
It’s a reflection of:
How things fit together
Their importance as a reflection of real-world search demand
Walnuts of importance must be the heart of the target clusters
Peanuts (lower value) may steal from the target due to semantic closeness to the centroid
Skewing the centre of semantic clusters (centroids)
But peanuts do matter too
Walnuts == categorical or subcategorical comprehensiveness
Peanuts == specific precision (rareness)
Internal links can help or hinder here
HNSW / ANN utilises a ‘friends list’ (links)
Create ‘small worlds’
In Hierarchical Navigable Small Worlds in SEO, the friends list should be closely related internal links
But… if the centre of the cluster is unclear, Google will just keep guessing
Wrong pages ranking?
Perhaps… too MANY internal links / too MANY internal links with similar long-tail anchors to pages at a similar semantic distance
Competing with yourself on ‘too semantically similar’ linked pages
Peanuts semantically fighting with other peanuts or walnuts
TO DO: Internally link in sectional ‘at scale’ template positions ONLY where there is search volume
Make sure the page is important enough (to the real world) for its internal link love
TO DO: Understand when ‘things’ have the same meaning
E.g.
Enterprise architect & business architect
Intruder alarm & security alarm
Information security & cyber security
Check Wikipedia redirects & merge
On redirects: keep the semantic distance short
Also… it’s a ‘fine line’ between long-tail pages and being dynamically pruned / excluded from the search index
Too many internal links from low-value (sparse value) pages to each other
At-scale ‘linked-to’ pages not meeting queries (e.g. location + service)
Using multipliers? ‘Inadvertent’ web content SPAM?
Peanut dilution
You just dropped your site off the long tail of the web
You turned your site into an ‘unimportance’ colander
Overflow SEO
Divide & rule (split categories ONLY when ready to overflow)
E.g. when pagination gets indexed, consider an overflow split
As Content / Offerings Grow… ‘Overflow’
[Diagram: overflow relationships between a part of a page, a leaf page, a subcategory and a category]
Build increasingly ‘small worlds’ of semantic similarity
Wrong pages ranking? – Surface a clear winner
Where’s the strong centroid of a cluster when there are ‘no clear winners’?
Sibling URLs without a parent category, connecting competing siblings?
Your current site structure won’t allow it, you say?
Use ‘Umbrella & Sidecar’ approaches – build & connect intents upwards & sideways
Umbrella approaches add intent hubs ‘above’ & are navigational / triage systems by nature
[Diagram: the umbrella hub connects to the primary suspected intent and to multiple secondary intents]
‘Sidecar pages’ – divert low-value + multiplier pages to ‘sidecar pages’
(e.g. colours, sizes, teams, services, locations)
Link from sidecars back into precise peanuts
Use internal links
Hack the breadcrumb
Connect by diverting links via sidecars
Show a semantic hierarchical relationship
Use umbrella approaches
Surfacing Walnuts
Dampening Peanuts
Multi-search is multi-dimensional similarity search at scale & across different media types & geo-data
Thank you

Natural Semantic SEO – Surfacing Walnuts in Densely Represented, Ever-Increasingly Small Worlds