2. ABOUT ME
UPASNA GAUTAM
MANAGER, SEARCH
ZIFF DAVIS
AUSTIN, TX
• Search Lead for PCMag & Mashable
• Former Clinical Research Scientist & Lab Rat
• Loud & Proud Michigander (Go Green!)
• Dance Teacher & Fitness Instructor
@upasnagautam
3. OBJECTIVES
• WHAT IS SEMANTIC SEARCH
• WHAT IS NOT SEMANTIC SEARCH
• HOW DOES GOOGLE MAKE IT WORK
• HOW CAN YOU MAKE IT WORK
4. AGENDA
• SEO: Then and Now
• What is Semantic Search?
• Optimizing for Voice Search
• Tactical Takeaways
6. SEO: THEN AND NOW
BACK THEN:
Keyword-Focused
• Text retrieval system
• Relied on exact-match keywords
• Weighted documents by keyword frequency
Unable to Distinguish Synonyms and Homographs
• Synonym: Words that share the same meaning (“car” and “automobile”)
• Homograph: Words having more than one meaning depending on context (“charge”)
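The keyword-focused behavior above can be illustrated with a minimal sketch. It assumes a toy scorer that only counts exact term matches; the function name `keyword_score` and the two sample documents are hypothetical and are not taken from any real search engine.

```python
from collections import Counter


def keyword_score(query: str, document: str) -> int:
    """Score a document by counting exact occurrences of each query term."""
    doc_counts = Counter(document.lower().split())
    return sum(doc_counts[term] for term in query.lower().split())


# Two hypothetical documents on the same topic, using different synonyms.
docs = {
    "a": "car dealers sell every car model and car part",
    "b": "an automobile dealer sells each automobile model",
}

# The query "car" only rewards the document that repeats the exact word.
scores = {name: keyword_score("car", text) for name, text in docs.items()}
print(scores)
```

Document "b" is about the same thing but scores zero, which is the synonym problem the slide describes.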
7. SEO: THEN AND NOW
NOW:
• Driven by Intent and Context
• Relevant Answers to Specific & Vague Queries
14. WHAT IS SEMANTIC SEARCH?
SEMANTICS
A branch of linguistics that studies the relationship between words and
sentences and their actual meanings.
SEMANTIC SEARCH
The improvement of search accuracy by understanding intent and context,
using various on-site elements to crawl, index, and serve relevant results.
15. WHAT IS SEMANTIC SEARCH?
• ENTITY OPTIMIZATION
• KNOWLEDGE GRAPH
• STRUCTURED DATA
• INFORMATION ARCHITECTURE
• CO-OCCURRENCE & CLUSTERING
17. WHAT IS SEMANTIC SEARCH?
KNOWLEDGE GRAPH
•Understands relationships between things
•Stores and understands the intelligence between different entities
•Not just a catalog of objects, but a data model for inter-relationships
Why don’t you explain this
to me like I’m 5?
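One way to picture "a data model for inter-relationships" is a small graph of (subject, relation, object) triples. The sketch below is only an illustration of that idea; the relation names and the helper functions `build_graph` and `follow` are assumptions, and nothing here reflects how Google's Knowledge Graph is actually stored.

```python
from collections import defaultdict

# Illustrative facts stored as (subject, relation, object) triples.
TRIPLES = [
    ("Austin", "is_a", "City"),
    ("Austin", "located_in", "Texas"),
    ("Texas", "located_in", "United States"),
]


def build_graph(triples):
    """Index triples by subject so relationships can be traversed."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def follow(graph, start, relation):
    """Walk one relation repeatedly and return every entity reached."""
    path, current, seen = [], start, {start}
    while True:
        targets = [obj for rel, obj in graph.get(current, []) if rel == relation]
        if not targets or targets[0] in seen:
            return path
        current = targets[0]
        seen.add(current)
        path.append(current)


graph = build_graph(TRIPLES)
# A catalog would only list "Austin"; the graph can also answer "where is it?"
print(follow(graph, "Austin", "located_in"))
```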
18. WHAT IS SEMANTIC SEARCH?
STRUCTURED DATA
•Google is a data-driven machine that needs to be fed in order for it to learn
•Pieces of intelligence the crawler uses to build semantic relevance & authority
•This is how entities are indexed
•Speakable Schema is HERE & it’s just the beginning for voice search markup
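As a rough illustration of the Speakable markup mentioned above, the sketch below builds a JSON-LD snippet using the schema.org `speakable` property and `SpeakableSpecification` type. The helper name `speakable_jsonld`, the page name, the example.com URL, and the CSS selectors are placeholders chosen for this example.

```python
import json


def speakable_jsonld(name, url, css_selectors):
    """Return JSON-LD that marks page sections as suitable for text-to-speech."""
    markup = {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "name": name,
        "url": url,
        "speakable": {
            "@type": "SpeakableSpecification",
            # Selectors pointing at the parts of the page worth reading aloud.
            "cssSelector": list(css_selectors),
        },
    }
    return json.dumps(markup, indent=2)


# Placeholder values; swap in a real page title, URL, and selectors.
snippet = speakable_jsonld(
    "What Is Semantic Search?",
    "https://example.com/semantic-search",
    [".headline", ".summary"],
)
print(snippet)
```

The printed JSON would typically be placed inside a `<script type="application/ld+json">` tag on the page.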
19. WHAT IS SEMANTIC SEARCH?
INFORMATION ARCHITECTURE
•Allows for a crawler to clearly understand content and how it’s connected
•Provide a clear and hierarchical path of information
•Lends itself to a good UX
•The RIGHT approach is the most LOGICAL approach
•Must read: Information Architecture for the World Wide Web, 3rd Edition, by Peter Morville:
https://www.amazon.com/Information-Architecture-World-Wide-Web/dp/0596527349
20. WHAT IS SEMANTIC SEARCH?
CO-OCCURRENCE & CLUSTERING
Word Co-Occurrence Clustering
•Generates topics from words frequently occurring together
Weighted Bigraph Clustering
•Uses URLs from Google search results to induce query similarity & generate topics
The combination of these two methods demonstrated greater usefulness and
accuracy when compared to Latent Semantic Analysis.
Read the patent here: https://pdfs.semanticscholar.org/dcf7/05ba07ee1b73fda0c94e9d01b2474173e470.pdf
21. WHAT IS SEMANTIC SEARCH?
CO-OCCURRENCE & CLUSTERING
Word Co-Occurrence
•A set of anchor words serves as initial topics, which are then generalized to other
words co-appearing with the same queries.
•Topics are created using hierarchical clustering on query similarity, which
measures to what extent two queries agree on their intersections with the list of words
in each topic.
Bigraph Clustering
•Uses organic results to create a bigraph with a set of queries and a set of URLs as
nodes. Weights of the graph are computed with the impression and click data.
•Bigraph clustering works very well even if the queries do not share common words.
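The bigraph idea above can be sketched with a toy query-to-URL graph. This is a simplified illustration, not the method from the linked document: it uses made-up click counts as edge weights (the slide mentions impressions and clicks), placeholder example.com URLs, and plain cosine similarity with a threshold in place of a full clustering algorithm.

```python
import math

# Hypothetical click counts forming a bipartite graph: query -> {url: clicks}.
CLICKS = {
    "cheap laptops": {"example.com/laptops": 8, "example.com/deals": 2},
    "budget notebook computer": {"example.com/laptops": 6, "example.com/deals": 3},
    "dance classes austin": {"example.com/dance": 9},
}


def cosine(a, b):
    """Cosine similarity between two sparse URL-weight vectors."""
    dot = sum(weight * b.get(url, 0) for url, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def similar_queries(clicks, threshold=0.8):
    """Pair up queries whose clicked URLs overlap strongly."""
    queries = sorted(clicks)
    return [
        (q1, q2)
        for i, q1 in enumerate(queries)
        for q2 in queries[i + 1:]
        if cosine(clicks[q1], clicks[q2]) >= threshold
    ]


# The two laptop queries share no words, yet they land on the same URLs.
print(similar_queries(CLICKS))
```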
24. WHAT IS SEMANTIC SEARCH?
•Learning the mathematical relevance
helps to understand search on a
functional level
•LSI uses Singular Value Decomposition (SVD),
a linear-algebra factorization that underlies
many of our modern algorithms
•It is not a way to “do SEO”
•LSI KEYWORDS ARE NOT A THING
25. WHAT IS SEMANTIC SEARCH?
Latent Semantic Indexing (LSI):
•Mathematical algorithm based on Singular Value
Decomposition (SVD)
•Text indexing and retrieval method
•How terms and concepts are related
•Projects a large multi-dimensional space down into
a smaller number of dimensions
•Semantically similar words get bunched together
•Boundary blurring allows LSI to go beyond exact
keyword matching
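The projection described above can be sketched on a tiny term-document matrix. This is a teaching example under stated assumptions, not a production LSI implementation: the three documents are invented, and the top singular vectors are approximated with plain power iteration and deflation (a stand-in for a real SVD routine) so that the code needs only the standard library.

```python
import math

# Rows are terms, columns are three tiny hypothetical documents:
# d1 = "car engine", d2 = "automobile engine", d3 = "dance class".
TERMS = ["car", "automobile", "engine", "dance", "class"]
A = [
    [1, 0, 0],  # car
    [0, 1, 0],  # automobile
    [1, 1, 0],  # engine
    [0, 0, 1],  # dance
    [0, 0, 1],  # class
]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0


def top_singular_vectors(matrix, k, iterations=200):
    """Approximate the top-k right singular vectors by power iteration."""
    n = len(matrix[0])
    # Gram matrix A^T A: its eigenvectors are the right singular vectors of A.
    gram = [[sum(row[i] * row[j] for row in matrix) for j in range(n)]
            for i in range(n)]
    vectors = []
    for _ in range(k):
        v = [float(i + 1) for i in range(n)]
        for _ in range(iterations):
            w = [dot(g_row, v) for g_row in gram]
            norm = math.sqrt(dot(w, w))
            if norm == 0:
                break
            v = [x / norm for x in w]
        eigenvalue = dot(v, [dot(g_row, v) for g_row in gram])
        # Deflate: remove the component just found before the next round.
        gram = [[gram[i][j] - eigenvalue * v[i] * v[j] for j in range(n)]
                for i in range(n)]
        vectors.append(v)
    return vectors


# Project each term from 3 document dimensions down to 2 latent dimensions.
basis = top_singular_vectors(A, k=2)
coords = {term: [dot(row, v) for v in basis] for term, row in zip(TERMS, A)}

raw_sim = cosine(A[0], A[1])                          # never share a document
lsi_sim = cosine(coords["car"], coords["automobile"])  # both sit near "engine"
print(round(raw_sim, 3), round(lsi_sim, 3))
```

In the raw matrix "car" and "automobile" never appear in the same document, so their similarity is zero; after the reduction they collapse onto the same latent dimension because both co-occur with "engine".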
26. WHAT IS SEMANTIC SEARCH?
Latent Semantic Indexing (LSI):
•Noise reduction
•Reveals similarities that were latent
•Similar terms become more similar, while dissimilar terms remain distinct
This method is a widely used technique to unveil latent themes in text data, as
these models learn the hidden topics by understanding document-level word
co-occurrence patterns.
27. WHAT IS SEMANTIC SEARCH?
Latent Semantic Indexing (LSI):
Short texts, such as search queries, tweets, or instant messages, suffer from data sparsity,
which causes problems for traditional topic modeling techniques. Unlike proper documents,
short text snippets do not provide enough word counts for models to learn how words
are related and to disambiguate multiple meanings of a single word.
*This is why the binary co-occurrence/clustering model works better*
29. OPTIMIZING FOR VOICE SEARCH
AUTOMATIC SPEECH RECOGNITION
Automatic Speech Recognition (ASR),
powered by deep neural networks,
is the system behind applications like
speech transcription and voice search.
31. OPTIMIZING FOR VOICE SEARCH
AUTOMATIC SPEECH RECOGNITION
How do humans do it?
Human articulation produces sound waves which the ear
conveys to the brain for processing.
New phone who dis
33. OPTIMIZING FOR VOICE SEARCH
GOOGLE’S VOICE SEARCH QUALITY METRICS
“We strive to find metrics that illuminate the end-user experience, to make sure that we
optimize the most important aspects and make effective tradeoffs. We also design metrics
which can bring to light specific issues with the underlying technology.” -GOOG
• Google has defined and uses a set of metrics
to track the quality of its voice search system.
• They use these metrics to drive their research
directions as well as provide insight and
guidance for solving specific problems and
tuning system performance.
34. OPTIMIZING FOR VOICE SEARCH
GOOGLE’S VOICE SEARCH QUALITY METRICS
•Word Error Rate (WER)
•Semantic Quality (Webscore)
•Perplexity (PPL)
•Out-of-Vocabulary Rate (OOV)
•Latency
Google Voice Search Case Study
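Two of the metrics listed above are simple enough to compute by hand. The sketch below uses the standard definitions: Word Error Rate as word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, and Out-of-Vocabulary rate as the share of words missing from a known vocabulary. The function names, sample phrases, and vocabulary are made up for illustration; Google's internal tooling is not described in the source.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed part of ref
    # and the first j words of hyp.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def oov_rate(words, vocabulary) -> float:
    """Fraction of words that the recognizer's vocabulary does not contain."""
    return sum(word not in vocabulary for word in words) / len(words)


spoken = "what is semantic search"
recognized = "what is romantic search"
print(word_error_rate(spoken, recognized))
print(oov_rate(spoken.split(), {"what", "is", "search"}))
```

One substituted word out of four gives a WER of 0.25, and one unknown word out of four gives an OOV rate of 0.25.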
35. OPTIMIZING FOR VOICE SEARCH
GOOGLE’S VOICE SEARCH QUALITY METRICS
The SERP has evolved into a
dynamic, purchase-driven environment,
with the integration of product carousels,
featured snippets with product rankings,
research carousels,
and of course,
the shopping carousel.
36. OPTIMIZING FOR VOICE SEARCH
GOOGLE’S VOICE SEARCH QUALITY METRICS
A High-Quality UX is a Fast UX
From the time it takes to detect end-of-speech to the time it takes to render
search results, time is of the essence for speech processing.
“It is generally desirable to reduce any user noticeable latency, and in
certain circumstances, may be desirable to reduce latency even if
improved speed comes at the cost of reduced quality ASR results.” -GOOG
38. TACTICAL TAKEAWAYS
•Craft and optimize content for topics and concepts, not just keywords
•Use structured data to feed the crawler the semantic intelligence it needs to
understand your site better
--Speakable Schema is HERE and it’s just the beginning
•Align the information architecture of your website to the consumer journey
--Navigation, sitemaps, page structure, content organization
39. TACTICAL TAKEAWAYS
• Invest in speed optimization
• Provide answers to SPECIFIC questions about your products and
services (Featured Snippets!)