Scott Stouffer - Advanced Search Summit Napa 2021
- 1. © Market Brew 2021
Ask The Search Engineer
Brought to you by:
SEO Software, Made By Search Engineers
- 2. © Market Brew 2021
marketbrew.ai
• Why Did Google Switch to ML-based Search?
• How Does ML-based Search Work?
• Using AI to solve ML-based Search
• Live: Ask The Search Engineer Q&A
Let’s Talk About…
- 3. © Market Brew 2021
•
•
•
Everything Google will tell you about the
technical underpinnings of their search engine:
- 4. © Market Brew 2021
• Carnegie Institute of Technology at Carnegie Mellon
University.
• B.S. and M.S. in Computer and Electrical Engineering.
• CTO + Co-Founder of Market Brew in 2006.
• Google for Entrepreneurs Advisor
• Inventor and author of multiple utility patents in both
the software and search space.
scott_stouffer
in/scottstouffer/
About Me…
- 5. © Market Brew 2021
• Learned BASIC at age SIX.
• Had to write my own games.
• No internal memory.
• Learned to rapidly code!
Learning to Code
- 6. © Market Brew 2021
• Built automated trading systems
for NYSE / NASDAQ.
• Lacked broker / floor connections.
• Looked for where my programming
skills were the missing piece.
Learning to Scale
- 7. © Market Brew 2021
• Met business owners in 2006 who had
“decoded” Google.
• Were outranking all the national brands in
their local market.
• Asked me if I could figure out a way to
scale nationwide.
• They called their tricks “search engine
optimization”.
Learning About SEO
- 8. © Market Brew 2021
• Unleashed the first major link
network.
• Probably caused Google to create
the “supplemental” index.
• Learned that this was not a long-term
success strategy.
Learning My Lesson
- 9. © Market Brew 2021
• Decided to “switch teams”.
• Wanted to show others why “hacking” Google
was a pointless, short-term exercise.
• How can we enable non-programmers to
understand how Google works?
Learning The Way
- 10. © Market Brew 2021
• At 1am the next night, I woke up in a sweat.
• Started furiously writing up an idea for a new type of “generic
search engine”.
• Started building basic pieces of a search engine.
Learning Search Engineering
- 11. © Market Brew 2021
• Started building crawler first.
• In 2006, no open source libraries to
do this.
• Before Ahrefs.
• Before Open Site Explorer.
• Before Yahoo Site Explorer!
Learning To Crawl
- 12. © Market Brew 2021
• Had to model all major families of
SPAM algorithms.
• How important was each family of
algorithms?
• Different from standalone tools that
evaluate “in a bubble”.
750,000 lines of code later…
Learning The Algorithms
- 13. © Market Brew 2021
• First version was a RULES-BASED search
engine.
• Around 2013, switched to a MACHINE
LEARNING-BASED search engine.
• We anticipated loss of transparency
into Google.
• Built an artificial intelligence component on top
of our “generic search engine”.
My (small) mark on the SEO industry…
Learning Artificial Intelligence
- 14. © Market Brew 2021
No More Black Box!
[Diagram: algorithm inputs/outputs, query parser, and scoring methodologies, cloned using machine learning]
Key Milestone: Search Engine Modeling
- 15. © Market Brew 2021
Why Did Google switch from
RULES-BASED
to
ML-BASED
Search Engine?
- 16. © Market Brew 2021
• Machine Learning encompasses all kinds of
models.
• Simple models are great at classifying
distinct groups.
• In the real world, simple models have a
hard time solving complex problems.
Algorithm Equations Used To Be Simple
- 17. © Market Brew 2021
• What happens when those groups of data
are not so distinct?
• Link / Content Algorithms are often dealing
with random looking data.
• Search is a nonlinear classification
problem, too hard to do with
rules-based approach!
What equation defines the line that separates these?
Now They Are Complex
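A minimal sketch of why a single straight-line rule breaks down once the groups are not distinct. The four XOR-style points and the hand-written nonlinear rule are illustrative assumptions, not data or equations from the talk.

```python
from itertools import product

# Four points in an XOR layout: no single straight line separates the classes.
points = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def accuracy(predict):
    """Fraction of points that a rule classifies correctly."""
    return sum(predict(x) == label for x, label in points) / len(points)


def best_linear_accuracy(steps=21):
    """Brute-force a grid of linear rules of the form w1*x1 + w2*x2 + b > 0."""
    grid = [-2 + 4 * i / (steps - 1) for i in range(steps)]
    best = 0.0
    for w1, w2, b in product(grid, repeat=3):
        def rule(x, w1=w1, w2=w2, b=b):
            return int(w1 * x[0] + w2 * x[1] + b > 0)
        best = max(best, accuracy(rule))
    return best


def nonlinear_rule(x):
    """Adding one nonlinear term (x1 * x2) is enough to separate the classes."""
    return int(x[0] + x[1] - 2 * x[0] * x[1] > 0.5)


linear_best = best_linear_accuracy()
nonlinear_acc = accuracy(nonlinear_rule)
print(f"best linear rule: {linear_best:.2f}, nonlinear rule: {nonlinear_acc:.2f}")
```

The linear search tops out at three of the four points, while the rule with one nonlinear term gets all four. In this toy case the nonlinear term was written by hand; in practice it is what a trained model has to discover.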
- 18. © Market Brew 2021
• Can we program a machine to “learn”
the equations?
• Machine Learning can learn anything,
so why not equations?
• Where do we start?
There’s got to be a better way…
Need To Solve Complex Equations
- 19. © Market Brew 2021
• Neural Networks, after many
iterations, give us the equations.
• You can also use this for learning
internal algorithm settings!
Neural Networks to The Rescue
- 20. © Market Brew 2021
• Humans assign “labels” and “features” to
data.
• Labels = “is this site spammy?”
• Features = content, links, rules-based
algorithmic outputs.
• Training = a “model” iteratively learns how
to “infer” relationships based on the labels
and features.
Start With Labels and Work Backwards
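A minimal sketch of the labels, features, and training loop described on this slide. It swaps in logistic regression (effectively a single neuron) for a full neural network, and the feature names, values, and labels are hypothetical.

```python
import math

# Hypothetical training set: each row is (features, label).
# Features: [share of duplicate content, share of links from low-quality sites].
# Label: 1 = a human rater marked the site as spammy, 0 = not spammy.
training_data = [
    ([0.9, 0.8], 1),
    ([0.8, 0.6], 1),
    ([0.7, 0.9], 1),
    ([0.2, 0.1], 0),
    ([0.1, 0.3], 0),
    ([0.3, 0.2], 0),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Estimated probability that a site with these features is spammy."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train(data, epochs=2000, learning_rate=0.5):
    """Iteratively nudge the weights so that predictions move toward the labels."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, label in data:
            error = predict_proba(weights, bias, features) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


weights, bias = train(training_data)
predictions = [int(predict_proba(weights, bias, f) > 0.5)
               for f, _ in training_data]
print("learned weights:", weights, "bias:", bias)
print("predictions:", predictions)
```

Nobody writes the decision rule by hand here: the learned weights are the "equation", worked out backwards from the human labels.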
- 21. © Market Brew 2021
• Google created “Search Quality
Rater Guidelines”
• Humans used for labeling their
supervised machine learning models!
• It’s the best of the “portal” and “search
engine” models, put together.
How Does Google Label Their Model?
- 22. © Market Brew 2021
• In 2013, Google started its shift to a machine
learning-based search engine.
• Google started to reduce transparency (even
their own engineers don’t know how the
neural networks work).
• As of 2021, 95% of SEO software
platforms still use a rules-based
paradigm.
What should be the approach to optimization now?
SEO Platforms Haven’t Caught Up
- 23. © Market Brew 2021
"Machine Learning changes the way you think
about a problem. The focus shifts from a
mathematical science to a natural science,
running experiments and using statistics,
not logic, to analyze its results."
PETER NORVIG - GOOGLE RESEARCH DIRECTOR
- 24. © Market Brew 2021
As SEO Professionals,
we want to know:
“What’s in the BLACK BOX?”
- 25. © Market Brew 2021
• Google’s Black Box isn’t even completely
understood by their search engineers.
• Complex nonlinear equations THAT ARE
CONSTANTLY CHANGING.
Most tools source their data directly from the BLACK BOX!
The Black Box is Constantly Changing
- 26. © Market Brew 2021
• We can use the same concept of Machine
Learning and Neural Networks to determine
the makeup of what is inside the black box.
• Labels = ranking data (search results).
• Features = algorithm outputs.
Same problem as Google, but one level higher!
Let’s Use AI to Discover The Black Box
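One way to picture "labels = ranking data, features = algorithm outputs" is a pairwise ranking learner: adjust the weight on each algorithm's output until the model orders pages the same way the observed results do. This is a hedged sketch using a simple perceptron-style update, not Market Brew's actual method; the page names, scores, and observed ranking are hypothetical.

```python
# Hypothetical algorithm outputs per page: [link score, content score, spam penalty].
pages = {
    "a.example": [0.9, 0.4, 0.0],
    "b.example": [0.5, 0.8, 0.1],
    "c.example": [0.7, 0.3, 0.6],
    "d.example": [0.2, 0.5, 0.2],
}

# Label: the order observed in the real search results, best first.
observed_ranking = ["a.example", "b.example", "c.example", "d.example"]


def score(weights, features):
    return sum(w * x for w, x in zip(weights, features))


def learn_weights(pages, ranking, max_epochs=100, learning_rate=0.1):
    """Pairwise perceptron: fix every pair that the model orders incorrectly."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(max_epochs):
        mistakes = 0
        for i, better in enumerate(ranking):
            for worse in ranking[i + 1:]:
                diff = [b - w for b, w in zip(pages[better], pages[worse])]
                if score(weights, diff) <= 0:
                    # Move the weights toward ranking `better` above `worse`.
                    weights = [w + learning_rate * d
                               for w, d in zip(weights, diff)]
                    mistakes += 1
        if mistakes == 0:
            break
    return weights


weights = learn_weights(pages, observed_ranking)
model_ranking = sorted(pages, key=lambda url: score(weights, pages[url]),
                       reverse=True)
print("learned weights:", weights)
print("model ranking:", model_ranking)
```

The learned weights are a rough readout of how much each algorithm family matters for this particular set of results.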
- 27. © Market Brew 2021
• Need to create a “generic” search engine
first.
• How do we determine which algorithms
to model?
How do we create a new search engine model?
Where Do We Start?
- 28. © Market Brew 2021
Search Engine Models
don’t reverse engineer Google's code…
they approximate it using
FIRST PRINCIPLES
- 29. © Market Brew 2021
• Start with core-level:
• PageRank
• Duplicate Content
• Panda (extension of “Supplemental
Index”)
• Penguin (Link Neighborhood)
• Fill in smaller missing gaps until model
converges.
Start With “First Principle” Algorithms
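As an illustration of a "first principle" algorithm, here is a minimal power-iteration sketch of PageRank as described in the published literature. The four-page link graph is hypothetical, and the damping factor of 0.85 is the commonly cited default, not a value taken from this talk.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power iteration over a dict mapping each page to the pages it links to."""
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # Split this page's rank evenly across its outgoing links.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical site: every link target also appears as a key.
links = {
    "home": ["products", "blog"],
    "products": ["home"],
    "blog": ["home", "products"],
    "orphan": ["home"],
}

ranks = pagerank(links)
for page, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {value:.3f}")
```

Pages that collect links from well-linked pages rise to the top, and a page with no inbound links keeps only the baseline share.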
- 30. © Market Brew 2021
• TV sales correlated with deaths alone is
mostly meaningless.
• Growing population means more TV sales
and more deaths.
• If TV sales suddenly flatten out but deaths
don't, some new factor has disturbed
the correlation and may need
investigating.
But Correlation can establish relationships
“Correlation DOES NOT Imply Causation”
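The TV sales example can be sketched with a plain Pearson correlation. All figures below are made up for illustration: both series grow together for six years, then TV sales flatten while deaths keep rising.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical yearly figures (millions).
tv_sales = [30, 33, 36, 39, 42, 45, 45, 44, 46, 45]
deaths = [1.00, 1.11, 1.19, 1.31, 1.40, 1.49, 1.61, 1.70, 1.81, 1.90]

early_r = pearson(tv_sales[:6], deaths[:6])   # both driven by population growth
late_r = pearson(tv_sales[5:], deaths[5:])    # TV sales have flattened out

print(f"correlation while both grow: {early_r:.2f}")
print(f"correlation after TV sales flatten: {late_r:.2f}")
```

The strong early correlation says nothing about TVs causing deaths, but the drop in the later window is a useful signal that a new factor has entered the picture.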
- 31. © Market Brew 2021
• Here’s an example of an overly complicated
data set with too many features.
• We can create a nonlinear equation, using
neural networks, to solve this! Wow!
• But…what happens if the data slightly
changes?
Don’t want an overly complex model…
The Danger of Overfitting
- 32. © Market Brew 2021
• Uh oh. The new data is clearly not fitting the
previously trained model.
• What happened here?
• We’re overfitting the data.
• Too many specific / narrow / low-level
algorithms.
Don’t want an overly complex model…
Overfitting Leads To An Unsolvable Approach
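A small sketch of the overfitting problem described on the last two slides. It assumes a made-up linear relationship with noise, then compares a simple straight-line fit against a polynomial that passes through every training point exactly (Lagrange interpolation, chosen here only because it is easy to write without extra libraries).

```python
def fit_line(xs, ys):
    """Least-squares straight line: the simple model."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_exact_polynomial(xs, ys):
    """Lagrange interpolation: a polynomial through every training point."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def true_relationship(x):
    return 2 * x + 1


# Training data: the true line plus alternating noise (made-up values).
train_x = [float(i) for i in range(8)]
noise = [0.5, -0.5] * 4
train_y = [true_relationship(x) + e for x, e in zip(train_x, noise)]

# "New data": points between the training points, without noise.
test_x = [i + 0.5 for i in range(7)]
test_y = [true_relationship(x) for x in test_x]

line = fit_line(train_x, train_y)
wiggly = fit_exact_polynomial(train_x, train_y)

line_train, line_test = mse(line, train_x, train_y), mse(line, test_x, test_y)
wiggly_train, wiggly_test = mse(wiggly, train_x, train_y), mse(wiggly, test_x, test_y)

print(f"simple model:  train error {line_train:.3f}, new-data error {line_test:.3f}")
print(f"complex model: train error {wiggly_train:.3f}, new-data error {wiggly_test:.3f}")
```

The complex model wins on the data it was trained on and loses badly on the new data, which is the pattern the slides warn about.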
- 33. © Market Brew 2021
William of Ockham, a 14th-century friar
and philosopher, loved simplicity.
The less complex a machine learning
model, the more likely that a good
empirical result is not just due to the
peculiarities of the sample.
Don’t want an overly complex model…
Keep It Simple
- 34. © Market Brew 2021
• This doesn’t mean an oversimplified
search engine model!
• Prefer a feature set that showcases the top
layer of algorithms, some of which can
be very complex.
Prefer feature set of TOP LAYER OF ALGORITHMS
Only Allow Top Layer to Adjust
- 35. © Market Brew 2021
• Also need to be very careful
about weighting ranges.
• L2 Regularization is used to
prevent curve-fitting.
• W3 is contributing to almost ALL
of the complexity.
Need to prevent curve-fitting data…
Avoid Features Overwhelming The Model
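A minimal sketch of L2 regularization, assuming a plain linear model trained by gradient descent. The data, the three weights, and the penalty strength are hypothetical; the point is only that the penalty term pulls the weights toward zero so no single weight can dominate.

```python
import math


def train_linear(rows, targets, l2_strength, epochs=500, learning_rate=0.05):
    """Gradient descent on mean squared error plus l2_strength * sum(w^2)."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for features, target in zip(rows, targets):
            error = sum(w * x for w, x in zip(weights, features)) - target
            for k, x in enumerate(features):
                gradient[k] += 2.0 * error * x / n
        # The extra 2 * l2_strength * w term is the L2 penalty's gradient.
        weights = [w - learning_rate * (g + 2.0 * l2_strength * w)
                   for w, g in zip(weights, gradient)]
    return weights


def l2_norm(weights):
    return math.sqrt(sum(w * w for w in weights))


# Hypothetical feature rows (three algorithm outputs each) and target scores.
rows = [
    [1.0, 0.5, 0.9],
    [0.8, -0.3, 0.1],
    [-0.5, 0.9, -0.7],
    [0.2, -0.8, 0.4],
    [-0.9, 0.1, -0.2],
    [0.4, 0.6, -0.6],
]
targets = [3.0, 0.5, -2.0, 1.0, -1.0, -1.5]

unregularized = train_linear(rows, targets, l2_strength=0.0)
regularized = train_linear(rows, targets, l2_strength=0.5)

print("weights without L2:", [round(w, 3) for w in unregularized])
print("weights with L2:   ", [round(w, 3) for w in regularized])
print("norms:", round(l2_norm(unregularized), 3), round(l2_norm(regularized), 3))
```

With the penalty switched on, the overall size of the weight vector shrinks, which limits how far the model can bend to fit quirks in the training data.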
- 36. © Market Brew 2021
What are the Benefits
of modeling the Black Box?
- 37. © Market Brew 2021
Think of it as a search
engine that can calibrate
its own settings using
machine learning to behave
like any search engine you
want.
It’s Very Accurate
- 38. © Market Brew 2021
Users can test their website
changes in the model and
can predict how their actual
ranking results will be
affected, months before
those changes show up in
their rank trackers.
True Predictability (and Risk Avoidance)
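A hedged sketch of what testing a change inside a calibrated model could look like: score every page with fixed weights, apply a proposed change to one page, and re-rank. The weights, page names, and feature values are hypothetical and stand in for whatever a real model would supply.

```python
# Hypothetical calibrated weights, e.g. the result of fitting the model
# to observed rankings.
weights = {"link_score": 1.0, "content_score": 0.5, "spam_penalty": -0.5}

# Hypothetical algorithm outputs for three competing pages.
pages = {
    "rival-a.example": {"link_score": 0.9, "content_score": 0.4, "spam_penalty": 0.0},
    "rival-b.example": {"link_score": 0.5, "content_score": 0.8, "spam_penalty": 0.1},
    "mysite.example": {"link_score": 0.5, "content_score": 0.5, "spam_penalty": 0.2},
}


def rank_pages(pages):
    """Order pages by weighted score, best first."""
    def total(url):
        return sum(weights[name] * value for name, value in pages[url].items())
    return sorted(pages, key=total, reverse=True)


def simulate_change(pages, url, changes):
    """Re-rank a copy of the data with `changes` applied to one page."""
    trial = {u: dict(features) for u, features in pages.items()}
    trial[url].update(changes)
    return rank_pages(trial)


before = rank_pages(pages)
after = simulate_change(pages, "mysite.example",
                        {"content_score": 0.9, "spam_penalty": 0.0})

print("current model ranking:  ", before)
print("predicted after changes:", after)
```

Because the simulation works on a copy, the current model stays untouched and several alternative changes can be compared before anything is deployed.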
- 39. © Market Brew 2021
MODELING
• Search Rankings
• Search Ranking Distances
• Link Flow Distribution
RISK FACTORS
• Algorithmic Penalties
• Good / Bad Site Associations
• Site & Page Scorecards
PREDICTION
• Reach Simulations
• High ROI Optimizations
A Deeper Understanding
- 40. © Market Brew 2021
RETHINK
SEO