@KorayGubur
Semantic Search Engine &
Query Parsing
In the Light of Semantic Search Principles
About Me
Koray Tuğberk GÜBÜR
Owner and Founder of Holistic SEO & Digital
• Educates his team
• Publishes SEO Case Studies, Researches & Guides
• Twitter: @KorayGubur
• Email: ktgubur@holisticseo.digital
• Official Site: https://www.holisticseo.digital
SEO Case Studies of KTG
SEO Guides of KTG
Webinars and Interviews of KTG
What is Query Parsing?
• Query Parsing is the process of
understanding the different sections of a
query.
• Types: Entity-seeking Query, Substitute
Term, or Synonym Term.
• Canonical and Represented Versions: A
Canonical Query can represent its close
variations.
• Query Character: Affects the SERP design
and the Dominant and Minor Search Intent
assignments.
• Query Process: Another name for Query
Parsing.
Multi-Stage Query Processing
• The first patent that talks about the «Context
of Words».
• It removes the stop words.
• It stems the concrete words.
• It expands words with synonyms and
co-occurring terms.
• Some criteria: Absent Queries, Boolean
Logic, Query Term Weights, Document
Popularity, Word Proximity (Distance),
Word Adjacency.
• It uses «VIPS» and Web Page Layout.
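The stop-word, stemming, and expansion stages above can be sketched roughly as follows. This is a minimal illustration, not the patent's method: the stop-word list, the synonym table, and the naive suffix stripper are all assumptions made for the example.

```python
# Hypothetical resources; a real system would use far larger lists.
STOP_WORDS = {"the", "a", "an", "of", "for", "in", "to", "is"}
SYNONYMS = {"car": ["auto", "automobile"], "repair": ["fix"]}


def stem(word: str) -> str:
    """Naive suffix stripping; real systems use proper stemmers."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def process_query(query: str) -> dict:
    """Run the three stages: stop-word removal, stemming, expansion."""
    tokens = query.lower().split()
    content = [t for t in tokens if t not in STOP_WORDS]
    stems = [stem(t) for t in content]
    expanded = {s: SYNONYMS.get(s, []) for s in stems}
    return {"tokens": tokens, "stems": stems, "expanded": expanded}


result = process_query("repairing the cars")
print(result["stems"])     # stemmed content words
print(result["expanded"])  # each stem with its assumed synonyms
```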
Inventors: Jeffrey Adgate Dean, Paul G.
Haahr, Olcan Sercinoglu, and Amitabh
K. Singhal
US Patent Application 20060036593
Filed: August 13, 2004
Published February 16, 2006
Query Breadth
• This is for «adjacent words» and
«unknown entities».
• It uses the related document count to measure
the ‘query breadth’.
• Query Breadth can be decreased with
the ‘adjacent word’ count.
• Query Breadth can be used for ‘Named
Entity Recognition’, or Triple Creation
(a subject, a predicate, and an object).
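One way to picture query breadth is as a document count that shrinks once the query words must appear next to each other. The sketch below is only an illustration under that assumption; the function names and the tiny document list are hypothetical, and the patent's actual scoring is not reproduced here.

```python
def query_breadth(query: str, documents: list[str]) -> int:
    """Count documents that contain every query term, in any position."""
    terms = query.lower().split()
    return sum(
        1 for doc in documents
        if all(term in doc.lower().split() for term in terms)
    )


def adjacent_breadth(query: str, documents: list[str]) -> int:
    """Count documents that contain the query terms as an adjacent phrase."""
    phrase = query.lower().split()
    n = len(phrase)

    def has_phrase(doc: str) -> bool:
        words = doc.lower().split()
        return any(words[i:i + n] == phrase for i in range(len(words) - n + 1))

    return sum(1 for doc in documents if has_phrase(doc))


docs = [
    "new york pizza",
    "york has a new mayor",
    "pizza in new york city",
    "old york",
]
print(query_breadth("new york", docs))     # broad: terms anywhere
print(adjacent_breadth("new york", docs))  # narrower: terms adjacent
```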
Invented by Karl Pfleger and Brian Larson
Assigned to Google
US Patent 7,925,657
Granted April 12, 2011
Filed: March 17, 2004
Query Analysis
• Selection Over Time: For different timespans,
a document can be chosen more frequently.
• Documents with Hot Topics: Rising Queries
can boost documents that include these
queries.
• Documents with Related Hot Topics: Related
queries for rising queries can boost the
documents with related queries.
• Constant Queries with Consistently Changing
Results: Constant Query is the always popular
query with changing information for a topic.
• Freshness of Documents: Date of the
information on the web page, not the date of
the document’s last version.
Invented by Karl Pfleger and Brian Larson
Assigned to Google
US Patent 7,925,657
Granted April 12, 2011
Filed: March 17, 2004
Query Analysis
• Staleness of Documents: The amount of
historical data can be a positive ranking signal
for a page for a query.
• Overly Broad Pages: Pages that include discordant
queries, which is a signal for spam.
• A continuation patent was filed in 2011 for the
«document locator», and some terms
changed.
Inventors: DEAN; Jeffrey; (Palo Alto,
CA) ; Haahr; Paul; (San Francisco, CA) ;
Henzinger; Monika; (Corseaux, CH) ;
Lawrence; Steve; (Mountain View, CA) ;
Pfleger; Karl; (Mountain View, CA) ;
Sercinoglu; Olcan; (Mountain View, CA) ;
Tong; Simon; (Mountain View, CA)
Assignee: GOOGLE INC.
Mountain View
CA
Family ID: 34381362
Appl. No.: 13/244853
Filed: September 26, 2011
Query Analysis
• Trends Related to Topics and Search Terms: Grouping
Topics, and Subtopics announced for Trending Queries.
• Access Times to Determine Freshness and Staleness:
Compares the First Access and Last Access time for
certain documents.
• Frequency of Selection: Compares the selection count
for an earlier and a later time period.
• When Staleness Might be Preferred: Even if there are
fresh news items or documents, the user can choose the stale
document. Such documents are not demoted for being
stale.
• Spam Determination Based Upon Breadth of Rankings,
and Authority: If the document is popular, or
authoritative (link-based), or the source is relevant
enough, it will be an exception.
Query Analysis
• Continuation of the Historical Data
Patent.
• Speaks about Topics, and Query
Categorization based on Topics.
• It is important because, in the following
year, Google launched its Knowledge Graph
with 500 million entities and 3.5 billion
facts.
Midpage Query Refinements
• In 2006, Google published the
«Midpage Query Refinements», known
today as Search Suggestions.
• The GUI test ran between 2004 and 2006.
• The patent was filed in 2003.
• It includes Semantic Query Clusters for
different contexts.
• It consists of a Matcher, a Clusterer, a Scorer, and a
Presenter.
Inventors: Haahr, Paul; (San Francisco, CA) ; Baker, Steven;
(Mountain View, CA)
Correspondence Address:
PATRICK J S INOUYE P S
810 3RD AVENUE
SUITE 258
SEATTLE
WA
98104
US
Family ID: 34228721
Appl. No.: 10/668721
Filed: September 22, 2003
Midpage Query Refinements
• Precomputation Engine has four parts.
• Associator: Query and Document
Association.
• Selector: Document and Query Section
Selector.
• Regenerator: Checks the query logs to
refresh the selections.
• Inverter: Checks the Cached Data for
presenting.
Midpage Query Refinements
• Query Ambiguity: If the query is ambiguous,
the Search Engine can use the query clusters.
• Homonyms, General Terms, Improper
Context, and Narrow Terms can create a
stateless SERP Instance.
• To prevent this, Semantic Grouping,
Centroids and Centroid distance are used.
• A Query Cluster and Document Cluster can
be paired. If Document cluster is larger, or
more relevant, the query cluster will be
used as query suggestion.
Midpage Query Refinements
• Matcher: Stored query variations are put into a
cluster, and document phrase variations are
matched.
• Clusterer: The matched query variations, and
documents are clustered together. Different
than query clusters.
• Scorer: Determines the centroid of each cluster.
If the term vectors are distant from the centroid,
another cluster will be chosen by the Clusterer
for the Scorer.
• Presenter: Created Clusters, and Centroids are
presented to the user. According to the
preferred choices, presenter will use sub-
centroids.
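The centroid and centroid-distance idea behind the Scorer can be sketched as below. This is a simplified illustration, not the patented pipeline: the two-dimensional term vectors, the cluster names, and the use of cosine similarity as the distance measure are assumptions for the example.

```python
import math


def centroid(vectors: list[list[float]]) -> list[float]:
    """Component-wise mean of a cluster's term vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def closest_cluster(query_vec: list[float], clusters: dict) -> str:
    """Return the name of the cluster whose centroid is nearest the query."""
    return max(
        clusters,
        key=lambda name: cosine_similarity(query_vec, centroid(clusters[name])),
    )


# Hypothetical clusters for an ambiguous query such as "jaguar".
clusters = {
    "animal": [[1.0, 0.0], [0.8, 0.2]],
    "car": [[0.0, 1.0], [0.1, 0.9]],
}
print(closest_cluster([0.9, 0.1], clusters))
```

A query vector that sits far from every centroid would be a candidate for a different cluster or a new one, which matches the Scorer behaviour described above.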
Midpage Query Refinements
• In 2017, the patent was
updated.
• The Scorer method was changed.
• Representative Queries are chosen based
on centroids.
• For every cluster, a representative query is
chosen.
• According to the cluster size, and relevance
scores, the clusters are aligned.
• And, sub-queries are used as the
refinement queries.
Inventors: Paul Haahr and Steven D. Baker
Assignee: Google Inc.
The United States Patent 9,552,388
Granted: January 24, 2017
Filed: January 31, 2014
Midpage Query Refinements
• Inventors of the Midpage Query Refinement
Methodology are Paul Haahr and Steven D.
Baker.
• Steven Baker wrote the Google
Synonyms blog post for Google’s Synonym
Update before the RankBrain announcement.
• Helping Computers Understand
Language:
https://googleblog.blogspot.com/2010/01/helping-computers-understand-language.html
• Paul Haahr is the presenter of the How Google
Works presentation from SMX West, which includes
lots of useful insights.
Context-Vectors
• Midpage Query Refinements and Query-
Document Logical Pairs with Centroids and
Clusters are the beginning of RankBrain.
• Context-Vectors were the second step for
completing the journey.
• Word Vectors and Context Vectors are
different from each other.
• Word Vectors are combinations of
words.
• Context Vectors are lists of word
combinations for a Contextual Domain.
• A Term Vector is a word combination from a
Contextual Domain.
Inventors: David C. Taylor
Application Date: 09/04/2012
Grant Number: 09449105
Grant Date: 09/20/2016
Context-Vectors
• Context-Vectors are close to the ‘Lexicon’
of Google’s first research paper,
The Anatomy of a Large-Scale Hypertextual Web
Search Engine.
• Context-Vectors are a version of the Lexicon
with different Contextual Domains.
• Context-Vectors are located in Domain List
Terms.
• A Domain List Terms can include 800,000
words and word combinations.
• A Domain List Terms can include a macro-
context, and a sub-context with sub-
portions.
Context-Vectors
• Context-vectors use ‘Topical Entries’.
• A Topical Entry can be used for a
macro-context.
• These topical entries can be used for
question generation.
• Generated questions can be used for
differentiating the different sub-contexts
from each other.
• A Macro-context can have a Dominant
Knowledge Domain. A Context-Vector can
be used for intersectional areas.
Categorical Quality
• This is a ‘Re-ranking’ Algorithm Patent.
• There is a strong difference between
Re-ranking and Initial Ranking.
• Re-ranking Algorithms are the algorithms
that modify the Query Results.
• The inventor is Trystan Upstill, author of the
Evidence-based Ranking Research.
• Categorical Quality doesn’t focus on
relevance or authoritativeness; it focuses
on understanding the category of the
query.
Inventors: Trystan G. Upstill, Abhishek Das, Jeongwoo
Ko, Neesha Subramaniam, and Vishnu P. Natchu
US Patent Application: 20190155948
Published on: May 23, 2019
Filed: March 31, 2015
Categorical Quality
• This patent mentions the ‘social media shares’
and community size.
• If the query satisfies the ‘categorical query’
conditions, the search results will be evaluated
for related and close queries too.
• If a resource also satisfies the related categorical
queries, a categorical quality score will be
assigned to the source.
• Categorical Quality Methodology collects
Navigational Queries for different sources.
• If the source has more navigational queries, it
means that it has a popularity for the category.
• Categorical Quality mentions «Topicality Score».
Categorical Quality
• If a source includes all query terms for a
topic, it will have more Categorical Quality
and Topicality Score.
• This method also mentions ‘Click
Selection.’
• To measure the model’s success, they
do not take every click or CTR into
account.
• They take CTR and clicks into account if they
meet certain criteria such as time,
frequency, or personal interest.
Substitute Query
• A Substitute Query is a query that can replace
another query. These queries are used for
bolding some sections of the content.
• Substitute Queries make ‘context’ more
important, because synonyms may change
the context. For example, car and auto can be the
same thing for ‘repair’, but they are not the same
for ‘railroad’.
• There is a railroad car, but not a railroad auto.
• Thus, Substitute Queries are not synonyms.
They are words that can be replaced without
changing the context.
Invented by Daisuke Ikeda and Ke Yang
Assigned to Google
US Patent 8,504,562
Granted August 6, 2013
Filed: April 3, 2012
Substitute Query
• A Co-occurrence Matrix and Phrase-
based Indexing are used to support
the Substitute Queries.
• The method uses the Space Vectors
to compare the word vectors to each
other.
• If the queries are similar to each
other with enough co-occurring
words, it means that they can be
substitutes for each other.
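A rough way to see how co-occurrence can separate substitutes from non-substitutes is to compare the words that appear alongside each term. The sketch below is an assumption-laden illustration: the query log, the threshold of 0.5, and the function names are invented for the example and are not taken from the patent.

```python
import math
from collections import Counter


def cooccurrence_vector(term: str, queries: list[str]) -> Counter:
    """Count the words that appear alongside `term` in the given queries."""
    counts = Counter()
    for query in queries:
        words = query.lower().split()
        if term in words:
            counts.update(w for w in words if w != term)
    return counts


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def can_substitute(t1: str, t2: str, queries: list[str], threshold: float = 0.5) -> bool:
    """Treat two terms as substitutes when their co-occurrence vectors are close."""
    v1 = cooccurrence_vector(t1, queries)
    v2 = cooccurrence_vector(t2, queries)
    return cosine(v1, v2) >= threshold


# Hypothetical query log.
log = [
    "car repair shop",
    "auto repair shop",
    "car insurance",
    "auto insurance",
    "railroad car museum",
]
print(can_substitute("car", "auto", log))
print(can_substitute("car", "railroad", log))
```

A production system would condition this on context (for example, only queries about repair), which is the point of the car and auto example.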
Synthetic Query
• Synthetic Query is the re-written version of
the query of the user by the search engine.
• A search engine can re-write a query by
augmenting it to diversify the SERP
Features and increase the possibility of
search activity satisfaction.
• Some score types that Synthetic Queries
include are ‘Edit Distance Score’, ‘Similarity
Score’, ‘Transformation Cost Score’.
• Synthetic Queries can be collected from web
documents, Structured Data, and Similarity
Between Documents.
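To make the ‘Edit Distance Score’ concrete, here is a standard Levenshtein distance with a simple normalized similarity. How the patent actually defines or combines these scores is not stated on the slide, so the normalization by the longer string is an assumption for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def similarity_score(a: str, b: str) -> float:
    """Assumed normalization: 1.0 for identical strings, 0.0 for fully different."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - edit_distance(a, b) / longest


print(edit_distance("kitten", "sitting"))
print(similarity_score("sylvia plath biography", "sylvia plath biografy"))
```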
Inventors: Anand Shukla, Mark Pearson, Krishna
Bharat and Stefan Buettcher
Assignee: Google LLC
US Patent: 9,916,366
Granted: March 13, 2018
Filed: July 28, 2015
Synthetic Query and
Query Templates
• Query Templates are intermediary forms between the
Seed Queries and Synthetic Queries.
• Synthetic Queries are helpful for a Search Engine to
create pre-defined and pre-ordered SERP Instances.
• Synthetic Queries can be generated from HTML Tags,
IDF Scores, Close Phrases.
• If a document has «Dorothy Parker Biography» as H1
and «Sylvia Plath» as H2, the search engine can use
«Sylvia Plath Biography» as
a synthetic query.
• If the results are good enough for relevance and
quality, the Synthetic Query will become a Seed
Query.
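The Dorothy Parker and Sylvia Plath example can be sketched as a template with one slot. The helper names and the `{entity}` placeholder syntax are assumptions made for this illustration; the patent's own template representation is not shown on the slide.

```python
def make_template(seed_query: str, entity: str) -> str:
    """Turn a seed query into a template by replacing its entity with a slot."""
    if entity not in seed_query:
        raise ValueError("entity does not occur in the seed query")
    return seed_query.replace(entity, "{entity}")


def synthetic_queries(template: str, entities: list[str]) -> list[str]:
    """Fill the template slot with other entities found in the document."""
    return [template.format(entity=e) for e in entities]


template = make_template("Dorothy Parker Biography", "Dorothy Parker")
print(template)
print(synthetic_queries(template, ["Sylvia Plath"]))
```

Each generated query would still need to pass the relevance and quality check described above before it is promoted to a seed query.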
Invented by Steven D. Baker, Michael Flaster,
Nitin Gupta, Paul Haahr, Srinivasan Venkatachary,
and Yonghui Wu
Assigned to Google
US Patent 8,346,792
Granted January 1, 2013
Filed: November 9, 2010
Synthetic Query and
Query Templates
• Synthetic Queries can be generated from
the same author, journal, source, or
time period.
• Synthetic Queries and Open Information
Extraction are closely related to each
other.
• Before entering the world of entities,
understanding the world of phrases is
important.
• Open Information Extraction and
Unknown Phrases or Entities are connected
to each other.
Open Information Extraction
• Google bought Wavii for $30,000,000 in
2013.
• Open Information Extraction is about ‘fact
extraction’ around nouns.
• It connects different nouns to each
other based on relations.
• A classifier assigns a confidence score to
a relation between two nouns.
• This is a text-to-data example.
• Wavii was originally a news aggregator
based on topics, not phrases.
Invented by Michael J. Cafarella, Michele Banko,
and Oren Etzioni
Assigned to: University of Washington through its
Center for Commercialization
United States Patent 7,877,343
Granted January 25, 2011
Open Information Extraction
• The relational tuples include at least two
nouns connected to each other by at least
one verb or adverb phrase, such as ‘created by’,
‘author of’, ‘is from’, ‘located there’.
• ‘... Moreover, the number and complexity
of entity types on the Web means that
existing NER systems are inapplicable...’
• Open IE is for Unknown Entities, and
recognizing Minor Entities without a
registration to the Knowledge Base.
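A toy version of relational tuple extraction can be written with a single pattern over a fixed list of relation phrases. Real Open IE learns its extractor and scores each tuple with a classifier, so this regex-based sketch is only an assumption-heavy stand-in; the phrase list comes from the slide and everything else is illustrative.

```python
import re

# Relation phrases taken from the slide; a real system would not use a fixed list.
RELATION_PHRASES = ["created by", "author of", "is from"]

_REL = "|".join(re.escape(p) for p in RELATION_PHRASES)
PATTERN = re.compile(
    r"^(?P<arg1>.+?)\s+(?:(?:is|was)\s+)?(?:the\s+)?(?P<rel>" + _REL + r")\s+(?P<arg2>.+?)\.?$",
    re.IGNORECASE,
)


def extract_tuple(sentence: str):
    """Return (noun phrase, relation, noun phrase), or None if no relation is found."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        return None
    return (match.group("arg1"), match.group("rel").lower(), match.group("arg2"))


print(extract_tuple("Hamlet was created by Shakespeare."))
print(extract_tuple("Sylvia Plath is the author of Ariel"))
print(extract_tuple("Hello world"))
```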
Answer-seeking Query
• Answer-seeking Queries have specific
elements within the questions, and
answers.
• Google’s purpose is to extract
question and answer formats for
answer-seeking queries.
• Answer-seeking queries require concise
answers without any skepticism.
• The Answer-seeking Query is an important
bridge between Natural Language
Queries and Intent.
Inventors: Yi Liu, Preyas Popat, Nitin Gupta, and Afroz
Mohiuddin
Assignee: Google LLC
US Patent: 10,592,540
Granted: March 17, 2020
Filed: June 28, 2016
Answer-seeking Query
• Question Elements are, Entity Instance,
Entity Class, Part of Speech Class, Root
Word, N-Gram and Question Triggering
Words.
• Answer Elements are Measurement, N-
Gram, Verb, Preposition, Entity_instance,
N-gram near entity, verb near entity,
preposition near_entity, verb class, skip
grams.
• Answer-seeking Queries trigger the Answer
Scoring Engine.
Natural Language Queries
• Natural Language Queries are queries
written in daily language.
• They do not follow proper grammar rules
or form complete sentences.
• They do not explicitly tell their intent.
• That’s why these queries are also called Intent
Queries, or queries with a specific minor
intent.
• For such a query, a Search Engine should
return an answer without lots of details
or structure.
International Application No WO/2014/197227
Published:11.12.2014
International Filing Date: 23.05.2014
Applicant: Google
Inventors: Tomer Shmiel, Dvir Keysar, and Yonatan Erez
Natural Language Queries
• Natural Language Queries are not factual queries; this is
the main difference from Answer-seeking Queries.
• Natural Language Queries are related to Intent
Template Generation.
• A Natural Language Query can have multiple intents with
non-factual information, such as ‘How do I make
hummus?’.
• There might be different methods to make hummus, and
there are different types of hummus. Also, the query
includes ‘I’, so no one can know how you make hummus.
• The answer-seeking version of this query is ‘How to make
hummus’.
• One of the important methodology points here is that
Google creates ‘heading-text’ pairs to understand the
topics of the sub-sections of an article.
Natural Language Queries
• Variable and Non-Variable Portions are
important concepts for the intent templates.
• Non-variable section of the intent for the
previous query is ‘hummus’.
• The variable section or portion can be a
‘place, method, tool, or style’. And ‘I’ can
change to a child, a woman, a man, an adult,
or a blind person.
• For Natural Language Queries, the Intent
Templates can be implemented to different
Query Patterns such as X Causes, X Reasons.
• If someone searches for only X, the intent
templates will be used to assign the natural
language results to the query.
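Intent templates with a fixed part and a slot can be sketched as simple patterns. The template strings, the intent labels, and the choice of which portion is treated as the slot are assumptions for this illustration; they are not taken from the patent application.

```python
import re

# Hypothetical templates: the fixed text is the non-variable part, {X} is the slot.
INTENT_TEMPLATES = {
    "{X} causes": "causes",
    "{X} reasons": "reasons",
    "how do i make {X}": "how-to",
}


def compile_template(template: str) -> re.Pattern:
    """Escape the fixed text and turn the {X} slot into a named capture group."""
    escaped = re.escape(template).replace(re.escape("{X}"), r"(?P<X>.+)")
    return re.compile("^" + escaped + "$", re.IGNORECASE)


def match_intent(query: str):
    """Return the first matching template with its intent and slot value."""
    cleaned = query.strip().rstrip("?")
    for template, intent in INTENT_TEMPLATES.items():
        match = compile_template(template).match(cleaned)
        if match:
            return {"intent": intent, "template": template, "variable": match.group("X")}
    return None


print(match_intent("How do I make hummus?"))
print(match_intent("headache causes"))
```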
Query Rewriting for Same
Intent Across Languages
• Google has tried to connect different search
intents, the data for these intents, and the phrases
that represent these intents to each other
to improve the search results.
• This is called Query Expansion. Query
Expansion can compare the results for a query
in one language to the results for the same
query in a different language.
• If the click satisfaction possibility is higher
for another language, for the same intent,
the search engine can re-rank the results for the
first language.
Invented by Stefan Riezler, Alexander L. Vasserman
Assigned to Google
US Patent Application 20080319962
Published December 25, 2008
Filed: March 17, 2008
Seed-Queries
• Seed Queries can be synthetic queries or
user-generated queries. The main
requirement for a seed query is that the
query should be satisfied by a set of
documents.
• If a query is logical, popular and satisfying
for the user, it will be marked as seed
query whether it is synthetic or searcher
generated.
• Seed Queries are used to determine the
representative queries for query
variations, query and intent templates.
Inventors: Manaal Faruqui and Dipanjan Das
Applicant: Google LLC
Publication Number: 20200167379
Filed: January 18, 2019
Publication Date: May 28, 2020
End of Phrase-based Indexing and Query
Processing Chaos
• Query Parsing
• Seed Query
• Substitute Query
• Natural Language Query
• Answer-seeking Query
• Factual Query
• Non-factual Query
• Non-variable Portion in Query
• Variable Portion in Query
• Discordant Query
• Query Re-writing
• Open Information Extraction
• Synthetic Query
• Categorical Query
• Contextual Vectors
• Term Vectors
• Intent Templates
• Question and Answer Elements
• Co-occurrence Matrix
• Query Expansion
• Query Term Weight
• Multi-stage Query Processing
• Query Breadth
• Query Template
• Relation Types and Noun Tuples
• Macro-context
• Topical Entry
• Mid-page Query Refinement
• Query Ambiguity
• Query Cluster – Document Cluster for Logical Pair
• Associator, Matcher, Scorer for Query, Document
Association
• ‘Edit Distance Score’, ‘Similarity Score’, ‘Transformation
Cost Score’
• Phrase-based Indexing
• Contextual Domains
• Contextual Domain Word List
• Query Analysis
• Representative Query
• Canonical Query
• Minor Intent
• Space Vectors
• Navigational Query as a
Popularity Signal
• Evidence Based Ranking
• Word Proximity
• Word Adjacency
First Semantic Web Announcement
• The Semantic Web Road Map was published
in September 1998 by Tim Berners-Lee.
• Semantic HTML, the Semantic Web, and
Semantic User Patterns were the principles
of Semantic Search.
• The main purpose of the Semantic Web is
making the web understandable to machines
so that machines can help human beings with
better web surfing.
• Tim Berners-Lee talked about Agents,
Ontology, Structured Data, RDFa, or Semantic
HTML Tags and Digital Signature.
• ‘Such an agent coming to the clinic's Web
page will know not just that the page has
keywords such as "treatment, medicine,
physical, therapy" (as might be encoded
today) but also that Dr. Hartman works at
this clinic on Mondays, Wednesdays and
Fridays and that the script takes a date
range in yyyy-mm-dd format and returns
appointment times. And it will "know" all
this without needing artificial intelligence...’
‘The Semantic Web is an extension of the current web in
which information is given well-defined meaning, better
enabling computers and people to work in cooperation.’
-Tim Berners-Lee
First Semantic Search Patent
• Google’s first Semantic Search Engine patent
is from 1999, one year after Tim
Berners-Lee’s announcement.
• The inventor is Sergey Brin himself.
• The document doesn’t use legal language, like
other early patents of Google.
• The document says that things of a similar
type share the same features.
• Things on the web can be collected for
a certain type of information and stored with
this information.
Invented by Sergey Brin
Assigned to Google
US Patent 6,678,681
Granted January 13, 2004
Filed: March 9, 2000
First Semantic Search Patent
• Sergey Brin encountered some problems
such as Named Entity Recognition, or Main
Entity, and Entity Relation Detection.
• These problems were not described in terms of
entities, but the books were entities with
string representations.
• Even a single-letter difference resulted in big
problems for Sergey Brin.
• Some books didn’t have a price or a proper
title, and some of them were not even real
books.
• In the first attempt, the cost was high, the process
was slow, and the results were incomplete, but Google kept
going.
Knowledge Graph Launch
• ‘Things, not strings.’ is the motto of
Knowledge Graph. Everything on the web is
divided into different entities, entity types,
entity connections.
• Named Entity Recognition, and Natural
Language Processing increased its value and
prominence within the algorithmic hierarchy
of Google.
• Knowledge Graph supported the Knowledge
Panels.
• Fact Extraction, Question Answering,
Accuracy Audit, and Entity Relations are the
pillars of an Entity-oriented Search Engine.
• ‘Wouldn’t it be great understanding every
word of user, instead of matching words?’, by
Jack Menzel.
Inventors: John R. Provine
Assignee: Google LLC
US Patent: 10,922,326
Granted: February 16, 2021
Filed: March 14, 2013
Browsable Fact Repository
• The Browsable Fact Repository is the main and
primitive version of the Google Knowledge
Graph.
• There are three important problems for the
Browsable Fact Repository.
1. Updating the Knowledge Graph.
2. Extracting the New Entities.
3. Auditing the Fact Accuracy.
Invented by Andrew W. Hogue and Jonathan T. Betz
Assigned to Google Inc.
US Patent 7,774,328
Granted August 10, 2010
Filed: February 17, 2006
Entity-seeking Query
• Today’s last Query type.
• Entity-seeking Queries are one of the
basic pillars of Entity-oriented search.
• The method identifies whether the query seeks a single
entity or multiple things of the same type.
• If it is singular, the entity-seeking query will
match the term and the entity based on
an attribute.
• Entity-seeking Queries include a Semantic
Dependency Tree and a Relevance Threshold.
Inventors: Mugurel Ionut Andreica, Tatsiana Sakhar,
Behshad Behzadi, Marcin M. Nowak-Przygodzki, and
Adrian-Marius Dumitran
US Patent Application: 20190370326
Published: December 5, 2019
Filed: May 29, 2018
Structured Search Engine
• Sergey Brin said ‘Structured Form’ in 1999.
• In 2011, Andrew Hogue said ‘Structured
Search Engine’.
• Andrew Hogue introduced the Open-Domain
Fact Extraction methodologies for
extracting and clustering entities from the web.
• Andrew Hogue showed some concrete
examples to future Google engineers of
the direction that they wanted to head.
The cartoon was created by Gary Larson.
Semantic Search Engine
• Google can extract all attributes of an entity
to understand its general features.
• According to the Source Attribute, these
features can be changed, detected or
altered.
• Based on the entity types, and candidate
entities, Google can generate more entity
types, and connections between them.
• Structured Search Engine’s other name is
Semantic Search Engine.
• Semi-structured Text Understanding,
Question Generation from Keywords, and
Question-Answer Pairing are the main
objectives of Semantic Search Engine.
Semantic Search Engine
This is a Query Parsing Example from a Google Engineer
for Entity-oriented Search.
Source: The Structured Search Engine by Andrew Hogue
The first step is the Named Entity Recognition process for the
query.
• Entity-seeking Queries are the backbone
of the entity oriented search.
• Recognizing an entity from a Query is not
easy, or cheap.
• Neural Matching, RankBrain, Sub-topic
Update, BERT, MUM, LaMDA... All of
them are used for recognizing the entity
and its related attributes.
The second step is Entity Resolution.
• Entity Resolution, and Attribute
Extraction are for understanding the
related attribute of the entity.
• Entity-seeking Queries usually try to find
an Entity’s Attribute such as look, height,
taste, inception or history.
• After the entity and its attribute are taken
from the query, at the next step,
Question Format will be taken.
The third step is Synonym Extraction.
• Synonym Extraction strengthens the
confidence score.
• Another function of Synonym Extraction
is that it helps to use alternate
documents for the same question.
• According to the Synonyms, the question
format can change.
Question format is necessary to understand
the query by increasing the confidence
score, and matching the similar successful
documents.
• Question format is important to
determine the answer format.
• Question term order and answer term
order can increase the success rate.
• The last important thing here is the
‘answer data type’, which is a date.
The fourth step is Entity Reconciliation and a data accuracy audit.
At the next step, Google can check the related search
activity, possible search activity, and choose the best
answer.
• The answer formats and answer phrases will be used
for entity reconciliation.
• Entity reconciliation includes the standardization of the
entity with the correct information.
• Five Rand Fishkin entity records exist in the Knowledge
Graph for the same Rand Fishkin.
Entity Reconciliation
Inventors: Oksana Yakhnenko and Norases
Vesdapunt
Assignee: GOOGLE LLC
US Patent: 10,331,706
Granted: June 25, 2019
Filed: October 4, 2017
Entity Reconciliation is another patent from Google.
• It includes checking multiple sources to complete the missing
information in the Knowledge Graph.
• It also uses a similarity threshold between different sources and the
Knowledge Graph.
• If the source is authoritative, it will be easier to modify the
Knowledge Graph.
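The reconciliation idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method described in the patent: the similarity measure, the `base_threshold` value, the `authority` score, and the sample records are all hypothetical.

```python
def similarity(kg_attrs, src_attrs):
    """Share of the attributes defined on both sides that agree (0.0 to 1.0)."""
    shared = set(kg_attrs) & set(src_attrs)
    if not shared:
        return 0.0
    agreeing = sum(1 for key in shared if kg_attrs[key] == src_attrs[key])
    return agreeing / len(shared)


def reconcile(kg_entity, sources, base_threshold=0.8):
    """Fill missing attributes of a knowledge graph entity from similar sources.

    Assumption: a more authoritative source needs a lower similarity score
    before it is allowed to contribute missing attributes.
    """
    merged = dict(kg_entity)
    for source in sources:
        threshold = base_threshold - 0.3 * source["authority"]
        if similarity(kg_entity, source["attributes"]) >= threshold:
            for key, value in source["attributes"].items():
                # Only complete missing information; never overwrite.
                merged.setdefault(key, value)
    return merged


# Hypothetical records, invented for this example.
KG_ENTITY = {"name": "Example Person", "occupation": "marketer"}
SOURCES = [
    {"authority": 0.9,
     "attributes": {"name": "Example Person", "occupation": "marketer",
                    "founded": "Example Co"}},
    {"authority": 0.1,
     "attributes": {"name": "Example Person", "occupation": "chef",
                    "city": "Springfield"}},
]

print(reconcile(KG_ENTITY, SOURCES))
```

In this sketch the second source disagrees on the occupation and has low authority, so it stays below its threshold and its `city` value is not merged.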
“For other people it can be a little more complicated. Like me, for
example, John Mueller. If you search for me you’ll find Wikipedia pages,
barbecue restaurants, bands, all kinds of people who are called John
Mueller.
And if, on my site, I don’t specify who I actually am, then it could
happen that our systems look at my page and go: “oh this is that guy
that runs that barbecue restaurant.” And suddenly I’m associated with
a barbecue restaurant, which might be a move up, I don’t know.
But these subtle things make it easier for us to recognize who is
actually behind something. We call that reconciliation when it comes to
structured data, kind of recognizing which of these entities belong
together.”
John Mueller
Important Terms and Concepts for NER and Semantic Search Engine
• Semantic Role Labeling
• Named Entity Resolution
• Named Entity Extraction
• Relation Detection
• Lexical Semantics
• Taxonomy
• Ontology
• Onomastics
Entity Extraction
• Entity Extraction is a complementary step to
Named Entity Recognition.
• A recognized entity can be extracted from the
text and stored in a Knowledge Base.
• Entity Extraction connects the entity with its
meaning, prominence, and attributes.
• Take the sentence ‘The 46th President of the United
States (US) had decided to go to Paris on
Monday, 2nd June, 2002.’
• ‘46th President of the United States’ is the
named entity.
• The president’s decision, together with its
date, is the attribute that entity extraction
captures.
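A toy version of this extraction step can be sketched with regular expressions. It is a minimal illustration only: the patterns, the `ExtractedFact` type, and the cleaned-up example sentence are assumptions made for this sketch, and a production system would use a trained NER model rather than hand-written rules.

```python
import re
from dataclasses import dataclass
from typing import Optional

ORDINAL = r"\d+(?:st|nd|rd|th)"
# Hypothetical hand-written patterns for this one sentence shape.
ENTITY_RE = re.compile(rf"{ORDINAL} President of (?:the )?United States")
DATE_RE = re.compile(r"(\d{1,2})(?:st|nd|rd|th)? ([A-Z][a-z]+),? (\d{4})")
DEST_RE = re.compile(r"\bgo (?:to )?([A-Z][a-z]+)")


@dataclass
class ExtractedFact:
    entity: str                 # the named entity
    destination: Optional[str]  # attribute: where the entity decided to go
    date: Optional[str]         # attribute: when


def extract(sentence: str) -> Optional[ExtractedFact]:
    """Return the named entity and its attributes, or None if no entity is found."""
    entity = ENTITY_RE.search(sentence)
    if entity is None:
        return None
    dest = DEST_RE.search(sentence)
    date = DATE_RE.search(sentence)
    return ExtractedFact(
        entity=entity.group(0),
        destination=dest.group(1) if dest else None,
        date=" ".join(date.groups()) if date else None,
    )


SENTENCE = ("The 46th President of the United States (US) had decided "
            "to go to Paris on Monday, 2nd June, 2002.")
print(extract(SENTENCE))
```

The extracted record (entity plus destination and date attributes) is the kind of structure that could then be stored in a Knowledge Base.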
Entity Resolution
• Entity Resolution has two phases.
• The first phase is finding the mentioned entity’s
correct identity.
• The second phase is finding the correct profile of
the mentioned entity.
• For instance, a name like Bill Clinton can refer to a
U.S. President, but also to an actor in Hollywood.
An American football player can also be a cook or a
journalist.
• To find the right entity from an entity
reference, a search engine can use related
entities and their types.
• Entity Resolution helps feed the text-to-data
systems of search engines.
• If you write ‘Barry Schwartz entered the
classroom and asked the students
questions’, Entity Resolution will decide
that it is Professor Barry Schwartz, not our Barry.
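The use of related entities and context to pick the right candidate can be sketched as a simple overlap score. This is a minimal illustration, not how any search engine actually resolves entities: the candidate list, the identifiers, and the context terms are invented for the example.

```python
import re
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Candidate:
    entity_id: str
    context_terms: FrozenSet[str]  # words that tend to appear near this entity


# Hypothetical candidates sharing one name; the term lists are illustrative.
CANDIDATES = [
    Candidate("barry_schwartz_professor",
              frozenset({"classroom", "students", "lecture", "university"})),
    Candidate("barry_schwartz_search_journalist",
              frozenset({"seo", "search", "google", "news"})),
]


def tokenize(text: str) -> set:
    """Lowercase the text and split it into alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def resolve(sentence: str, candidates: List[Candidate]) -> Optional[Candidate]:
    """Pick the candidate whose context terms overlap most with the sentence."""
    tokens = tokenize(sentence)
    best, best_score = None, 0
    for candidate in candidates:
        score = len(tokens & candidate.context_terms)
        if score > best_score:
            best, best_score = candidate, score
    return best  # None when no candidate shares any context term


mention = "Barry Schwartz entered the classroom and asked the students questions"
print(resolve(mention, CANDIDATES))
```

Here ‘classroom’ and ‘students’ match only the professor profile, so that candidate wins; a sentence without any matching context returns no decision.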
Relation Detection
• Relation Detection is the process of
understanding the relation type and labels
between different entities within a text.
• There are different types of relations, such as
‘isSimilarOf’, ‘locatedIn’, ‘superiorOf’,
‘closeTo’, ‘sameAs’.
• Some of these relation types are familiar
from Structured Data.
• Some relation types are unique to
specific entities and specific topics.
• Relation Detection draws on
Lexical Semantics.
• Relation Detection can be used for visual-to-text
algorithms too.
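Relation Detection can be illustrated with a small pattern-based sketch that turns sentences into (subject, relation, object) triples. The surface patterns below are assumptions chosen for the example; only the relation labels ‘locatedIn’, ‘closeTo’, and ‘sameAs’ come from the slide, and real systems learn such patterns rather than hard-coding them.

```python
import re

# Hypothetical surface patterns mapped to relation labels from the list above.
PATTERNS = [
    (re.compile(r"(?P<subj>[A-Z]\w+) is located in (?P<obj>[A-Z]\w+)"), "locatedIn"),
    (re.compile(r"(?P<subj>[A-Z]\w+) is close to (?P<obj>[A-Z]\w+)"), "closeTo"),
    (re.compile(r"(?P<subj>[A-Z]\w+) is the same as (?P<obj>[A-Z]\w+)"), "sameAs"),
]


def detect_relations(text):
    """Return (subject, relation, object) triples found in the text."""
    triples = []
    for pattern, label in PATTERNS:
        for match in pattern.finditer(text):
            triples.append((match.group("subj"), label, match.group("obj")))
    return triples


TEXT = "Paris is located in France. Versailles is close to Paris."
for triple in detect_relations(TEXT):
    print(triple)
```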
Lexical Semantics
• Every human being needs Lexical Semantics
to think and speak clearly.
• Lexical Semantics covers the meaning
connections between different words.
• Lexical Semantics is used to understand the
relational connections between named
entities.
• For instance, ‘Boy’ includes the meanings ‘single’,
‘teenage’, ‘male’, and ‘young’ by default. But
some of these meanings are highly probable,
and some are less so.
• For instance, someone young, male, and teenage
can also be married.
• Lexical Semantics is used to understand a
named entity’s resolution and its connection
with other things.
Lexeme: a unit that cannot be analyzed further by itself.
Lexicon: the list of lexemes.
Semantic Role Labeling
• Semantic Role Labeling is the process of
understanding the parts of a sentence by
assigning related labels.
• Semantic Role Labeling draws on
Lexical Semantics and Part-of-Speech tagging.
• Semantic Role Labeling helps Relation
Detection.
• There are more than 32 Semantic Roles.
• For Semantic Role Labeling, the most
important part is finding the theme,
predicate, agent, and effect.
• Semantic Role Labeling is useful for auditing
the content’s accuracy and for extracting
facts from propositions.
Taxonomy
• Taxonomy, from the Greek taxis (arrangement) and nomos
(law), means the arrangement of things.
• It was first used for animal classification, in Ancient
Greece.
• In the modern era, it has been used to classify all living
things in biology, and later to classify chemicals and
other kinds of existing things.
• In the field of Search Engine Optimization, Semantic
Entity Types and the Semantic Dependency Tree are
important.
• Creating a hierarchy between entities based on their
type, size, prominence, or superiority and
inferiority is important for increasing contextual
relevance and specifying the relevance of the article.
• Every entity type has a different attribute group, and
the hierarchy can be refreshed.
• If the context is the size of cities, ‘berlin’, ‘paris’, and ‘istanbul’
can have a different taxonomy, in terms of big, small,
and medium cities.
• If the context is the countries of these cities, the taxonomy
can be aligned with country names, and with region and
continent names.
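The point that the same entities fall into different taxonomies depending on the context can be sketched as a grouping by attribute. This is a minimal illustration: the `size_class` labels are placeholders invented for the example, not measured data, and `build_taxonomy` is a hypothetical helper.

```python
from collections import defaultdict

# Attribute values are illustrative; 'size_class' is a placeholder label.
ENTITIES = {
    "berlin": {"country": "Germany", "size_class": "medium"},
    "paris": {"country": "France", "size_class": "medium"},
    "istanbul": {"country": "Turkey", "size_class": "large"},
}


def build_taxonomy(entities, context):
    """Group entity names under the value they have for the chosen context attribute."""
    groups = defaultdict(list)
    for name, attributes in entities.items():
        groups[attributes[context]].append(name)
    return {label: sorted(names) for label, names in groups.items()}


print(build_taxonomy(ENTITIES, "size_class"))
print(build_taxonomy(ENTITIES, "country"))
```

Switching the `context` argument regroups the same three cities, which is the sense in which the hierarchy can be refreshed.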
Ontology
• Ontology completes the taxonomy.
• Ontos-logos: the study of the essence of things.
• It is a branch of philosophy.
• Ontology is a reflex for all human beings.
• An ontology can be created based on the
common points of different entities.
• According to the common attribute between
entities, the taxonomy can change, and the
ontology can follow it.
• If three named entities are from the same region,
the region name is the common attribute, and the
entities can have other types of connections based
on it.
Onomastics
• Onomastics is the science of naming and of
analyzing the name patterns of different
languages.
• Every entity type has a different naming pattern.
• Name patterns are used to recognize entities,
entity types, and attributes of entities.
• It comes from the Greek onomastikos (‘of naming’),
from onoma (‘name’).
• Science names, city names, event
names, situation names, and institution names can
all have naming patterns.
• Some onomastics sub-type examples:
1. helonyms: proper names of swamps, marshes and bogs.
2. limnonyms: proper names of lakes and ponds.
3. oceanonyms: proper names of oceans.
4. pelagonyms: proper names of seas and maritime bays.
5. potamonyms: proper names of rivers and streams.
• Onomastics can be used for taxonomy and
ontology creation too. Even a body of water can have
multiple naming patterns based on its sub-type.
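As a small sketch of how a naming pattern can reveal an entity's sub-type, the function below assigns a water-body name to one of the five classes listed above by looking for a generic term in the name. The lookup table and `classify_water_name` are assumptions made for illustration; real name patterns are language-specific and far richer.

```python
# Generic terms per class, taken from the sub-type definitions above.
HYDRONYM_CLASSES = {
    "helonym": ("swamp", "marsh", "bog"),
    "limnonym": ("lake", "pond"),
    "oceanonym": ("ocean",),
    "pelagonym": ("sea", "bay"),
    "potamonym": ("river", "stream"),
}


def classify_water_name(name):
    """Return the onomastic class suggested by a generic term in the name, or None."""
    tokens = name.lower().split()
    for class_name, generic_terms in HYDRONYM_CLASSES.items():
        if any(token in generic_terms for token in tokens):
            return class_name
    return None


for example in ("Lake Geneva", "Pacific Ocean", "Black Sea", "Danube River"):
    print(example, "->", classify_water_name(example))
```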
Important Announcements for Structured Search Engine
• BERT - SMITH
• MUM
• LaMDA
• Conversational Search
BERT - SMITH
Uses a Masked Language Model.
It masks 15% of the tokens for the prediction task.
Uses Bidirectional Language Understanding.
It reads the whole sentence at once, from both directions.
It predicts the next sentence.
SMITH handles inputs longer than 512 tokens.
Uses a fine-tuning based representation model.
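The masking step can be sketched as follows. This is a simplified illustration of the idea only: it always substitutes `[MASK]`, whereas BERT's actual pre-training recipe sometimes keeps the original token or swaps in a random one, and the function name, whitespace tokenization, and fixed seed are assumptions made for this example.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Hide roughly `mask_rate` of the tokens and remember the originals.

    Returns the masked sequence and a {position: original_token} mapping,
    which is what a masked language model would be trained to predict.
    """
    if not tokens:
        return [], {}
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_to_mask = max(1, round(len(tokens) * mask_rate))
    positions = rng.sample(range(len(tokens)), n_to_mask)
    masked = list(tokens)
    labels = {}
    for position in positions:
        labels[position] = masked[position]
        masked[position] = MASK_TOKEN
    return masked, labels


sentence = "the search engine reads the whole sentence at once from both directions".split()
masked_sentence, targets = mask_tokens(sentence)
print(masked_sentence)
print(targets)
```

Because the model sees the words on both sides of each `[MASK]`, it can use left and right context together when predicting the hidden token.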
MUM
The research papers date back to March 2021.
In May 2021, Google announced MUM.
In June 2021, Google announced that it had started to use MUM.
The whole system is about understanding ‘Related Search Activity’ in order to predict future queries.
MUM
If you search for trekking on a mountain, there are three possible contexts:
• Trekking
• Mountain
• Trekking on that specific mountain
LaMDA
LaMDA connects one question to another in a way that is sensible to humans.
• Specificity
• Factuality
• Interestingness
• Sensibleness
LaMDA is a part of Conversational AI.
Conversational Search
Conversational Search is close to Conversational AI.
It connects different entities, concepts, and intents to each
other.
It creates new Contextual Domains and Co-occurrence
Matrices.
The Conversational Search announcement covers only
past queries.
MUM and LaMDA also cover future queries.
Important Language Models for the Near Future in the Context of Semantic Search Engine
• REALM
• KELM
REALM
Retrieval-Augmented Language Model
Based on the Entity Dependency Tree, missing attributes and facts can be extracted.
Source: https://ai.googleblog.com/2020/08/realm-integrating-retrieval-into.html
REALM
Inventors: Kenton Chiu Tsun Lee,
Kelvin Gu, Zora Tung, Panupong
Pasupat, and Ming-Wei Chang
Assignee: Google LLC
US Patent: 11,003,865
Granted: May 11, 2021
Filed: May 20, 2020
First, a research paper.
Then, a patent.
Lastly, an update with an official
or unofficial statement.
KELM
Knowledge Graph Integrated Language Model for Fact and
Accuracy Checking.
Source: https://ai.googleblog.com/2021/05/kelm-integrating-knowledge-graphs-with.html
Data-to-text Triple Example
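The data-to-text idea, turning a knowledge graph triple into a natural language sentence, can be sketched with plain templates. This swaps in a much simpler technique than the one in the linked post, which relies on a trained text generation model; the templates, predicate names, and `verbalize` helper are hypothetical.

```python
# Hypothetical sentence templates keyed by predicate name.
TEMPLATES = {
    "birth_date": "{subject} was born on {object}.",
    "occupation": "{subject} worked as a {object}.",
}


def verbalize(triple):
    """Turn a (subject, predicate, object) triple into a plain sentence."""
    subject, predicate, obj = triple
    template = TEMPLATES.get(predicate)
    if template is None:
        # Fallback for unknown predicates: read the predicate name as words.
        return f"{subject} {predicate.replace('_', ' ')} {obj}."
    return template.format(subject=subject, object=obj)


TRIPLES = [
    ("Ada Lovelace", "birth_date", "10 December 1815"),
    ("Ada Lovelace", "occupation", "mathematician"),
]
for triple in TRIPLES:
    print(verbalize(triple))
```

Sentences produced this way can be compared against the claims in a document, which is the sense in which the triples support fact and accuracy checking.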
Encazip.com.
Holistic SEO Case Study based on Semantic SEO.
Used Entity-oriented Search.
From 150 daily clicks to 6,000 daily clicks.
An Education Brand
11,000 queries and 30,000 monthly clicks within 25 days
An unpublished case study.
422,000 queries, 220,000 clicks in 66 days.
It is also a Technical SEO Case Study.
Indexed 73,000 pages in 66 days.
15,000 new queries.
35,000 in monthly traffic.
In 3 months.
Used Semantic SEO
‘Without understanding Query Processing through the eyes of the
Search Engine, you can’t create a relevant and satisfying
document based on minor and dominant search activity
types.’
Thank You

Brighton SEO April 2024 - The Good, the Bad & the Ugly of SEO SuccessBrighton SEO April 2024 - The Good, the Bad & the Ugly of SEO Success
Brighton SEO April 2024 - The Good, the Bad & the Ugly of SEO Success
 
Jai Institute for Parenting Program Guide
Jai Institute for Parenting Program GuideJai Institute for Parenting Program Guide
Jai Institute for Parenting Program Guide
 
Turn Digital Reputation Threats into Offense Tactics - Daniel Lemin
Turn Digital Reputation Threats into Offense Tactics - Daniel LeminTurn Digital Reputation Threats into Offense Tactics - Daniel Lemin
Turn Digital Reputation Threats into Offense Tactics - Daniel Lemin
 
Creator Influencer Strategy Master Class - Corinne Rose Guirgis
Creator Influencer Strategy Master Class - Corinne Rose GuirgisCreator Influencer Strategy Master Class - Corinne Rose Guirgis
Creator Influencer Strategy Master Class - Corinne Rose Guirgis
 
Unraveling the Mystery of Roanoke Colony: What Really Happened?
Unraveling the Mystery of Roanoke Colony: What Really Happened?Unraveling the Mystery of Roanoke Colony: What Really Happened?
Unraveling the Mystery of Roanoke Colony: What Really Happened?
 
The Pitfalls of Keyword Stuffing in SEO Copywriting
The Pitfalls of Keyword Stuffing in SEO CopywritingThe Pitfalls of Keyword Stuffing in SEO Copywriting
The Pitfalls of Keyword Stuffing in SEO Copywriting
 
9654467111 Call Girls In Mahipalpur Women Seeking Men
9654467111 Call Girls In Mahipalpur Women Seeking Men9654467111 Call Girls In Mahipalpur Women Seeking Men
9654467111 Call Girls In Mahipalpur Women Seeking Men
 
pptx.marketing strategy of tanishq. pptx
pptx.marketing strategy of tanishq. pptxpptx.marketing strategy of tanishq. pptx
pptx.marketing strategy of tanishq. pptx
 
CALL ON ➥8923113531 🔝Call Girls Hazratganj Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Hazratganj Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Hazratganj Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Hazratganj Lucknow best sexual service Online
 
Uncover Insightful User Journey Secrets Using GA4 Reports
Uncover Insightful User Journey Secrets Using GA4 ReportsUncover Insightful User Journey Secrets Using GA4 Reports
Uncover Insightful User Journey Secrets Using GA4 Reports
 

Semantic Search Engine: Semantic Search and Query Parsing with Phrases and Entities

  • 1. @KorayGubur Semantic Search Engine & Query Parsing In the Light of Semantic Search Principles
  • 2. @KorayGubur About Me Koray Tuğberk GÜBÜR Owner and Founder of Holistic SEO & Digital • Educates his team • Publishes SEO Case Studies, Research & Guides • Twitter: @KorayGubur • Email: ktgubur@holisticseo.digital • Official Site: https://www.holisticseo.digital
  • 3. @KorayGubur SEO Case Studies of KTG
  • 4. @KorayGubur SEO Guides of KTG
  • 5. @KorayGubur Webinars and Interviews of KTG
  • 6. @KorayGubur What is Query Parsing? • Query Parsing is the process of understanding the different sections of a query. • Types: Entity-seeking Query, Substitute Term, or Synonym Term. • Canonical and Represented Versions: A Canonical Query can represent close variations. • Query Character: Affects the SERP Design and the Dominant and Minor Search Intent Assignments. • Query Processing: Another name for Query Parsing.
  • 7. @KorayGubur Multi-Stage Query Processing • The first patent that talks about «Context of Words». • It tries to remove the stop words. • It stems the concrete words. • It expands words with Synonyms and Co-occurrence. • Some Criteria: Absent Queries, Boolean Logic, Query Term Weights, Document Popularity, Word Proximity (Distance), Word Adjacency. • It uses «VIPS» and Web Page Layout. Inventors: Jeffrey Adgate Dean, Paul G. Haahr, Olcan Sercinoglu, and Amitabh K. Singhal US Patent Application 20060036593 Filed: August 13, 2004 Published: February 16, 2006
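The stages named on this slide (stop word removal, stemming, synonym expansion) can be sketched in a few lines. This is only an illustration under stated assumptions: the stop word list, the naive suffix stemmer, and the synonym table are all hypothetical stand-ins, not what the patent or Google actually uses.

```python
# Hypothetical toy resources; a real system would use far larger ones.
STOP_WORDS = {"the", "a", "an", "of", "in", "for", "to", "and"}
SYNONYMS = {"car": ["auto", "automobile"], "repair": ["fix"]}


def stem(word):
    """Naive suffix stripping (an assumption, not a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def process_query(query):
    """Run the three stages and return each stem with its expansions."""
    tokens = query.lower().split()
    content = [t for t in tokens if t not in STOP_WORDS]   # stage 1: drop stop words
    stems = [stem(t) for t in content]                     # stage 2: stem
    return {s: [s] + SYNONYMS.get(s, []) for s in stems}   # stage 3: expand


print(process_query("Repairing the cars"))
```

Each stage feeds the next, so the order matters: stemming before synonym lookup lets "cars" reach the "car" entry of the synonym table.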
  • 8. @KorayGubur Query Breadth • This is for «adjacent words» and «unknown entities». • It uses the related document count to see the ‘query breadth’. • Query Breadth can be decreased with the ‘adjacent word’ count. • Query Breadth can be used for ‘Named Entity Recognition’, or Triple Creation (an Object and two Subjects). Invented by Karl Pfleger and Brian Larson Assigned to Google US Patent 7,925,657 Granted April 12, 2011 Filed: March 17, 2004
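As a rough illustration of the idea that the related document count reflects query breadth, and that adding adjacent words narrows it, here is a minimal sketch. The function name, the tiny document list, and the use of plain substring matching are assumptions made for the example, not details from the patent.

```python
def query_breadth(query, documents):
    """Count documents that contain the query as an exact phrase.

    Substring matching is a simplification; it is not token-aware.
    """
    phrase = query.lower()
    return sum(1 for doc in documents if phrase in doc.lower())


# Hypothetical mini-corpus for illustration only.
DOCUMENTS = [
    "Jaguar car dealer",
    "Jaguar habitat in the Amazon",
    "Used jaguar car prices",
]

print(query_breadth("jaguar", DOCUMENTS))      # broader query
print(query_breadth("jaguar car", DOCUMENTS))  # narrower after adding an adjacent word
```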
  • 9. @KorayGubur Query Analysis • Selection Over Time: A document can be chosen more frequently in some timespans than in others. • Documents with Hot Topics: Rising Queries can boost documents that include these queries. • Documents with Related Hot Topics: Queries related to rising queries can boost the documents that include them. • Constant Queries with Consistently Changing Results: A Constant Query is an always-popular query whose topic keeps getting new information. • Freshness of Documents: The date of the information on the web page, not the date of the document’s last version. Invented by Karl Pfleger and Brian Larson Assigned to Google US Patent 7,925,657 Granted April 12, 2011 Filed: March 17, 2004
  • 10. @KorayGubur Query Analysis • Staleness of Documents: The amount of Historical Data can be a positive ranking signal for a page for a query. • Overly Broad Pages: Pages that include discordant queries, which is a signal for spam. • A Continuation Patent was filed in 2011 for «document locator», and some terms changed. Inventors: DEAN; Jeffrey; (Palo Alto, CA) ; Haahr; Paul; (San Francisco, CA) ; Henzinger; Monika; (Corseaux, CH) ; Lawrence; Steve; (Mountain View, CA) ; Pfleger; Karl; (Mountain View, CA) ; Sercinoglu; Olcan; (Mountain View, CA) ; Tong; Simon; (Mountain View, CA) Assignee: GOOGLE INC. Mountain View CA Family ID: 34381362 Appl. No.: 13/244853 Filed: September 26, 2011
  • 11. @KorayGubur Query Analysis • Trends Related to Topics and Search Terms: Grouping Topics, and Subtopics announced for Trending Queries. • Access Times to Determine Freshness and Staleness: Compares the First Access and Last Access time for certain documents. • Frequency of Selection: Compares the selection count for the first and latter time. • When Staleness Might be Preferred: Even if there is fresh news, or documents, the user can choose the stale document. These documents are not affected by stale information. • Spam Determination Based Upon Breadth of Rankings, and Authority: If the document is popular, or authoritative (link-based), or the source is relevant enough, it will be an exception. Inventors: DEAN; Jeffrey; (Palo Alto, CA) ; Haahr; Paul; (San Francisco, CA) ; Henzinger; Monika; (Corseaux, CH) ; Lawrence; Steve; (Mountain View, CA) ; Pfleger; Karl; (Mountain View, CA) ; Sercinoglu; Olcan; (Mountain View, CA) ; Tong; Simon; (Mountain View, CA) Assignee: GOOGLE INC. Mountain View CA Family ID: 34381362 Appl. No.: 13/244853 Filed: September 26, 2011
  • 12. @KorayGubur Query Analysis • Continuation of the Historical Data Patent. • Speaks about Topics, and Query Categorization based on Topics. • It is important because, in the following year (2012), Google launched its Knowledge Graph with more than 500 million entities and 3.5 billion facts. Inventors: DEAN; Jeffrey; (Palo Alto, CA) ; Haahr; Paul; (San Francisco, CA) ; Henzinger; Monika; (Corseaux, CH) ; Lawrence; Steve; (Mountain View, CA) ; Pfleger; Karl; (Mountain View, CA) ; Sercinoglu; Olcan; (Mountain View, CA) ; Tong; Simon; (Mountain View, CA) Assignee: GOOGLE INC. Mountain View CA Family ID: 34381362 Appl. No.: 13/244853 Filed: September 26, 2011
  • 13. @KorayGubur Midpage Query Refinements • In 2006, Google published the «Midpage Query Refinements», a.k.a. today’s Search Suggestions. • The GUI test ran between 2004 and 2006. • The patent was filed in 2003. • Includes Semantic Query Clusters for Different Contexts. • A Matcher, a Clusterer, a Scorer, and a Presenter. Inventors: Haahr, Paul; (San Francisco, CA) ; Baker, Steven; (Mountain View, CA) Correspondence Address: PATRICK J S INOUYE P S 810 3RD AVENUE SUITE 258 SEATTLE WA 98104 US Family ID: 34228721 Appl. No.: 10/668721 Filed: September 22, 2003
  • 14. @KorayGubur Midpage Query Refinements • Precomputation Engine has four parts. • Associator: Query and Document Association. • Selector: Document and Query Section Selector. • Regenerator: Checks the query logs to refresh the selections. • Inverter: Checks the Cached Data for presenting. @KorayGubur Inventors: Haahr, Paul; (San Francisco, CA) ; Baker, Steven; (Mountain View, CA) Correspondence Address: PATRICK J S INOUYE P S 810 3RD AVENUE SUITE 258 SEATTLE WA 98104 US Family ID: 34228721 Appl. No.: 10/668721 Filed: September 22, 2003
  • 15. @KorayGubur Midpage Query Refinements • Query Ambiguity: If the query is ambiguous, the Search Engine can use the query clusters. • Homonyms, General Terms, Improper Context, and Narrow Terms can create a stateless SERP Instance. • To prevent this, Semantic Grouping, Centroids and Centroid distance are used. • A Query Cluster and a Document Cluster can be paired. If the Document Cluster is larger, or more relevant, the Query Cluster will be used as a query suggestion. Inventors: Haahr, Paul; (San Francisco, CA) ; Baker, Steven; (Mountain View, CA) Correspondence Address: PATRICK J S INOUYE P S 810 3RD AVENUE SUITE 258 SEATTLE WA 98104 US Family ID: 34228721 Appl. No.: 10/668721 Filed: September 22, 2003
  • 16. @KorayGubur Midpage Query Refinements • Matcher: Stored query variations are put into a cluster, and document phrase variations are matched. • Clusterer: The matched query variations, and documents are clustered together. Different than query clusters. • Scorer: Determines the center of the centroid. If the term vectors are distant to the centroid, another cluster will be chosen by the Clusterer for Scorer. • Presenter: Created Clusters, and Centroids are presented to the user. According to the preferred choices, presenter will use sub- centroids. @KorayGubur Inventors: Haahr, Paul; (San Francisco, CA) ; Baker, Steven; (Mountain V CA) Correspondence Address: PATRICK J S INOUYE P S 810 3RD AVENUE SUITE 258 SEATTLE WA 98104 US Family ID: 34228721 Appl. No.: 10/668721 Filed: September 22, 2003
  • 17. @KorayGubur Midpage Query Refinements • In 2017, the patent was refreshed. • The Scorer Method was changed. • Representative Queries are chosen based on centroids. • For every cluster, a representative query is chosen. • The clusters are ordered according to cluster size and relevance scores. • And, sub-queries are used as the refinement queries. Inventors: Paul Haahr and Steven D. Baker Assignee: Google Inc. The United States Patent 9,552,388 Granted: January 24, 2017 Filed: January 31, 2014
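Choosing a representative query per cluster based on a centroid can be sketched as follows. The bag-of-words term vectors, the mean-vector centroid, and cosine similarity are assumptions chosen for the illustration; the slide does not say how the patent computes term vectors or distances.

```python
import math
from collections import Counter


def vectorize(query):
    """Bag-of-words term vector (a simplifying assumption)."""
    return Counter(query.lower().split())


def centroid(vectors):
    """Mean of the term vectors in a cluster."""
    total = Counter()
    for vector in vectors:
        total.update(vector)
    return {term: count / len(vectors) for term, count in total.items()}


def cosine(a, b):
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def representative_query(cluster):
    """Pick the query whose vector lies closest to the cluster centroid."""
    center = centroid([vectorize(q) for q in cluster])
    return max(cluster, key=lambda q: cosine(vectorize(q), center))


cluster = ["jaguar car price", "jaguar car", "jaguar car review"]
print(representative_query(cluster))
```

In this toy cluster the shortest query wins because it contains only the terms all members share, which is one plausible reading of "representative".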
  • 18. @KorayGubur Midpage Query Refinements • Inventors of the Midpage Query Refinement Methodology are Paul Haahr and Steven D. Baker. • Steven Baker wrote the Google Synonyms Blog Post for Google’s Synonym Update before the RankBrain Announcement. • Helping Search Engines to Understand Language: https://googleblog.blogspot.com/2010/01/helping-computers-understand-language.html • Paul Haahr is the presenter of the How Google Works Presentation from SMX West, which includes lots of useful insights. Inventors: Paul Haahr and Steven D. Baker Assignee: Google Inc. The United States Patent 9,552,388 Granted: January 24, 2017 Filed: January 31, 2014
  • 19. @KorayGubur Context-Vectors • Midpage Query Refinements and Query-Document Logical Pairs with Centroids and Clusters are the beginning of RankBrain. • Context-Vectors were the second step for completing the journey. • Word Vectors and Context Vectors are different from each other. • Word Vectors are combinations of words. • Context Vectors are lists of word combinations for a Contextual Domain. • A Term Vector is a word combination from a Contextual Domain. Inventors: David C. Taylor Application Date: 09/04/2012 Grant Number: 09449105 Grant Date: 09/20/2016
  • 21. @KorayGubur Context-Vectors • Context-Vectors are close to the ‘Lexicon’ of Google’s first research paper, The Anatomy of a Large-Scale Hypertextual Web Search Engine. • Context-Vectors are the version of the Lexicon with different Contextual Domains. • Context-Vectors are located in Domain List Terms. • A Domain List Terms can include 800,000 words and word combinations. • A Domain List Terms can include a macro-context, and a sub-context with sub-portions. Inventors: David C. Taylor Application Date: 09/04/2012 Grant Number: 09449105 Grant Date: 09/20/2016
  • 22. @KorayGubur Context-Vectors • Context-vectors use ‘Topical Entries’. • A Topical Entry can be used for a macro-context. • These topical entries can be used for question generation. • Generated questions can be used for differentiating the different sub-contexts from each other. • A Macro-context can have a Dominant Knowledge Domain. A Context-Vector can be used for intersectional areas. Inventors: David C. Taylor Application Date: 09/04/2012 Grant Number: 09449105 Grant Date: 09/20/2016
  • 23. @KorayGubur Categorical Quality • This is a ‘Re-ranking’ Algorithm Patent. • There is a strong difference between Re-ranking and Initial Ranking. • Re-ranking Algorithms are the algorithms that modify the Query Results. • The inventor is Trystan Upstill, author of the Evidence-based Ranking Research. • Categorical Quality doesn’t focus on relevance or authoritativeness; it focuses on Understanding the Category of the Query. Inventors: Trystan G. Upstill, Abhishek Das, Jeongwoo Ko, Neesha Subramaniam, and Vishnu P. Natchu US Patent Application: 20190155948 Published on: May 23, 2019 Filed: March 31, 2015
  • 24. @KorayGubur Categorical Quality • This patent mentions ‘social media shares’ and community size. • If the query satisfies the ‘categorical query’ conditions, the search results will be evaluated for related and close queries too. • If a resource also satisfies the related categorical queries, a categorical quality score will be assigned to the source. • The Categorical Quality Methodology collects Navigational Queries for different sources. • If a source has more navigational queries, it means that it is popular for the category. • Categorical Quality mentions a «Topicality Score». Inventors: Trystan G. Upstill, Abhishek Das, Jeongwoo Ko, Neesha Subramaniam, and Vishnu P. Natchu US Patent Application: 20190155948 Published on: May 23, 2019 Filed: March 31, 2015
  • 25. @KorayGubur Categorical Quality • If a source includes all query terms for a topic, it will have a higher Categorical Quality and Topicality Score. • This method also mentions ‘Click Selection.’ • To understand the Model’s Success, they do not take every click or CTR into account. • They take CTR and Clicks into account only if they meet certain criteria such as time, frequency, or personal interest. Inventors: Trystan G. Upstill, Abhishek Das, Jeongwoo Ko, Neesha Subramaniam, and Vishnu P. Natchu US Patent Application: 20190155948 Published on: May 23, 2019 Filed: March 31, 2015
  • 26. @KorayGubur Substitute Query • A Substitute Query is a query that can replace another query. These queries are used for bolding some sections of the content. • Substitute Queries make ‘context’ more important, because synonyms can change the context. For example, car and auto can be the same thing for ‘repair’, but they are not the same for ‘railroad’. • There is a railroad car, but not a railroad auto. • Thus, Substitute Queries are not synonyms. They are words that can be replaced without changing the context. Invented by Daisuke Ikeda and Ke Yang Assigned to Google US Patent 8,504,562 Granted August 6, 2013 Filed: April 3, 2012
  • 27. @KorayGubur Substitute Query • A Co-occurrence Matrix and Phrase-based Indexing are used to support the Substitute Queries. • The method uses Space Vectors to compare the word vectors to each other. • If the queries are similar to each other, with enough co-occurring words, it means that they can substitute for each other. Invented by Daisuke Ikeda and Ke Yang Assigned to Google US Patent 8,504,562 Granted August 6, 2013 Filed: April 3, 2012
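The car/auto example can be made concrete with a small co-occurrence matrix. The sketch below is an assumption-laden illustration: the toy corpus, the cosine comparison, the 0.5 threshold, and the function names are all hypothetical and are not taken from the patent.

```python
import math
from collections import defaultdict
from itertools import permutations


def cooccurrence(corpus):
    """Count how often two distinct terms appear in the same text."""
    matrix = defaultdict(lambda: defaultdict(int))
    for text in corpus:
        for a, b in permutations(set(text.lower().split()), 2):
            matrix[a][b] += 1
    return matrix


def cosine(a, b):
    dot = sum(weight * b.get(term, 0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def can_substitute(matrix, term, candidate, context, threshold=0.5):
    """Candidate may replace term only if both are similar overall
    AND both co-occur with the given context word."""
    similar = cosine(matrix[term], matrix[candidate]) >= threshold
    shared = matrix[term][context] > 0 and matrix[candidate][context] > 0
    return similar and shared


# Hypothetical corpus for illustration only.
corpus = [
    "car repair shop", "auto repair shop",
    "car insurance quote", "auto insurance quote",
    "railroad car museum",
]
matrix = cooccurrence(corpus)
print(can_substitute(matrix, "car", "auto", "repair"))
print(can_substitute(matrix, "car", "auto", "railroad"))
```

The second check fails because "auto" never co-occurs with "railroad" in the toy corpus, which mirrors the slide's point that substitution depends on context rather than on synonymy alone.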
  • 28. @KorayGubur Synthetic Query • A Synthetic Query is the version of the user’s query re-written by the search engine. • A search engine can re-write a query by augmenting it to diversify the SERP Features and raise the possibility of a satisfying search activity. • Some score types that Synthetic Queries include are ‘Edit Distance Score’, ‘Similarity Score’, ‘Transformation Cost Score’. • Synthetic Queries can be collected from web documents, Structured Data, and Similarity Between Documents. Inventors: Anand Shukla, Mark Pearson, Krishna Bharat and Stefan Buettcher Assignee: Google LLC US Patent: 9,916,366 Granted: March 13, 2018 Filed: July 28, 2015
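The slide names an ‘Edit Distance Score’ without saying how it is computed. One common way to measure edit distance is the Levenshtein distance, sketched below; using it here is an assumption, and the patent may define its score differently.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or token lists)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


# Character-level and word-level usage.
print(edit_distance("kitten", "sitting"))
print(edit_distance("sylvia plath biography".split(),
                    "dorothy parker biography".split()))
```

Working on token lists instead of characters gives a word-level distance, which is often the more useful granularity when comparing an original query with a rewritten one.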
  • 29. @KorayGubur Synthetic Query and Query Templates • Query Templates are intermediary forms between the Seed Queries and Synthetic Queries. • Synthetic Queries help a Search Engine create pre-defined and pre-ordered SERP Instances. • Synthetic Queries can be generated from HTML Tags, IDF Scores, Close Phrases. • If a Document has «Dorothy Parker Biography» as H1 and «Sylvia Plath» as H2, the Search Engine can use «Sylvia Plath Biography» as a synthetic query. • If the results are good enough for relevance and quality, the Synthetic Query will become a Seed Query. Invented by Steven D. Baker, Michael Flaster, Nitin Gupta, Paul Haahr, Srinivasan Venkatachary, and Yonghui Wu Assigned to Google US Patent 8,346,792 Granted January 1, 2013 Filed: November 9, 2010
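The Dorothy Parker and Sylvia Plath example can be read as a seed query turned into a template and then refilled. The sketch below assumes the entity inside the seed query is already known; the function names and the `{entity}` slot syntax are hypothetical.

```python
def make_template(seed_query, entity):
    """Replace the known entity in a seed query with a slot."""
    return seed_query.replace(entity, "{entity}")


def synthesize(template, entities):
    """Fill the slot with other entities found in the same document."""
    return [template.format(entity=entity) for entity in entities]


template = make_template("Dorothy Parker Biography", "Dorothy Parker")
print(template)
print(synthesize(template, ["Sylvia Plath"]))
```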
  • 30. @KorayGubur Synthetic Query and Query Templates • Synthetic Queries can be generated from the same author, same journal, source, or time period. • Synthetic Queries and Open Information Extraction are closely related to each other. • Before entering the world of entities, understanding the world of phrases is important. • Open Information Extraction, Unknown Phrases, and Entities are connected to each other. Invented by Steven D. Baker, Michael Flaster, Nitin Gupta, Paul Haahr, Srinivasan Venkatachary, and Yonghui Wu Assigned to Google US Patent 8,346,792 Granted January 1, 2013 Filed: November 9, 2010
  • 31. @KorayGubur Open Information Extraction • Google bought Wavii for about $30 million in 2013. • Open Information Extraction is about ‘fact extraction’ around nouns. • It is for connecting different nouns to each other based on relations. • A classifier assigns a confidence score to a relation between two nouns. • This is a text-to-data example. • Wavii was originally a news aggregator based on topics, not phrases. Invented by Michael J. Cafarella, Michele Banko, and Oren Etzioni Assigned to: University of Washington through its Center for Commercialization United States Patent 7,877,343 Granted January 25, 2011
  • 32. @KorayGubur Open Information Extraction • The relational tuples include at least two nouns connected to each other by at least one verb and adverb, such as ‘created by’, ‘author of’, ‘is from’, ‘located there’. • ‘... Moreover, the number and complexity of entity types on the Web means that existing NER systems are inapplicable...’ • Open IE is for Unknown Entities, and for recognizing Minor Entities without a registration in the Knowledge Base. Invented by Michael J. Cafarella, Michele Banko, and Oren Etzioni Assigned to: University of Washington through its Center for Commercialization United States Patent 7,877,343 Granted January 25, 2011
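A relational tuple of the kind described here (two noun phrases joined by a relation phrase) can be illustrated with a deliberately naive pattern matcher. Real Open IE systems learn extractors and score them with a classifier; the fixed relation list and the regular expression below are assumptions made only to show the shape of the output.

```python
import re

# Relation phrases taken from the slide's examples; the list is not exhaustive.
RELATIONS = ("created by", "author of", "is from")


def extract_tuples(sentence):
    """Return (noun phrase, relation, noun phrase) tuples found in a sentence."""
    tuples = []
    text = sentence.strip()
    for relation in RELATIONS:
        pattern = rf"^(.+?)\s+(?:is\s+|was\s+)?{re.escape(relation)}\s+(.+?)\.?$"
        match = re.search(pattern, text)
        if match:
            tuples.append((match.group(1), relation, match.group(2)))
    return tuples


print(extract_tuples("Python was created by Guido van Rossum."))
```

No confidence score is modeled in this sketch; in the approach the slides describe, a classifier would attach one to each extracted relation.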
  • 33. @KorayGubur Answer-seeking Query • Answer-seeking Queries have specific elements within the questions and answers. • Google’s purpose is to extract question and answer formats for answer-seeking queries. • Answer-seeking queries require concise answers without any skepticism. • The Answer-seeking Query is an important bridge to the Natural Language Queries with an Intent. Inventors: Yi Liu, Preyas Popat, Nitin Gupta, and Afroz Mohiuddin Assignee: Google LLC US Patent: 10,592,540 Granted: March 17, 2020 Filed: June 28, 2016
  • 34. @KorayGubur Answer-seeking Query • Question Elements are Entity Instance, Entity Class, Part of Speech Class, Root Word, N-Gram and Question Triggering Words. • Answer Elements are Measurement, N-Gram, Verb, Preposition, Entity Instance, N-gram near entity, verb near entity, preposition near entity, verb class, skip grams. • Answer-seeking Queries trigger the Answer Scoring Engine. Inventors: Yi Liu, Preyas Popat, Nitin Gupta, and Afroz Mohiuddin Assignee: Google LLC US Patent: 10,592,540 Granted: March 17, 2020 Filed: June 28, 2016
  • 35. @KorayGubur Natural Language Queries • Natural Language Queries are queries written in everyday language. • They do not follow proper grammar rules or form complete sentences. • They do not explicitly tell their intent. • That’s why these queries are also called Intent Queries, or Queries with a specific minor intent. • For such a query, a Search Engine should return an answer without lots of details or structure. International Application No WO/2014/197227 Published: 11.12.2014 International Filing Date: 23.05.2014 Applicant: Google Inventors: Tomer Shmiel, Dvir Keysar, and Yonatan Erez
  • 36. @KorayGubur Natural Language Queries • Natural Language Queries are not Factual queries; this is the main difference from Answer-seeking queries. • Natural Language Queries are related to Intent Template Generation. • A Natural Language Query can have multiple intents with non-factual information, such as ‘How do I make hummus?’. • There might be different methods to make hummus, and there are different types of hummus; also, the query includes ‘I’. So, no one can know how you make hummus. • The answer-seeking version of this query is ‘How to make hummus’. • One of the important methodology points here is that Google creates ‘heading-text’ pairs to understand the topics of the sub-sections of the article. International Application No WO/2014/197227 Published: 11.12.2014 International Filing Date: 23.05.2014 Applicant: Google Inventors: Tomer Shmiel, Dvir Keysar, and Yonatan Erez
  • 37. @KorayGubur Natural Language Queries • Variable and Non-Variable Portions are important concepts for the intent templates. • The non-variable portion of the intent for the previous query is ‘hummus’. • The variable portion can be a ‘place, method, tool, or style’. And, ‘I’ can change to a child, a woman, a man, an adult, or a blind person. • For Natural Language Queries, the Intent Templates can be applied to different Query Patterns such as X Causes, X Reasons. • If someone searches for only X, the intent templates will be used to assign the natural language results to the query. International Application No WO/2014/197227 Published: 11.12.2014 International Filing Date: 23.05.2014 Applicant: Google Inventors: Tomer Shmiel, Dvir Keysar, and Yonatan Erez
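The ‘X Causes’ and ‘X Reasons’ query patterns can be sketched as simple templates with one slot. The code below only illustrates the mechanics of splitting a query into a slot value and a pattern, and of expanding a bare X into patterned queries; the pattern list and function names are hypothetical.

```python
# Hypothetical query patterns with a single slot, following the slide's examples.
QUERY_PATTERNS = ["{x} causes", "{x} reasons"]


def expand(term):
    """Expand a bare term into every patterned query."""
    return [pattern.format(x=term) for pattern in QUERY_PATTERNS]


def parse(query):
    """Split a query into (slot value, matched pattern word), if any."""
    text = query.lower().strip()
    for pattern in QUERY_PATTERNS:
        suffix = pattern.replace("{x}", "").strip()
        if text.endswith(" " + suffix):
            return text[: -len(suffix)].strip(), suffix
    return text, None


print(expand("headache"))
print(parse("Headache Causes"))
print(parse("headache"))
```

A query that matches no pattern is returned with `None`, which is the case where a search engine would fall back to expanding the bare term, as the last bullet of the slide describes.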
  • 38. @KorayGubur Query Rewriting for Same Intent Across Languages • To improve the search results, Google has tried to unite different search intents, the data for these intents, and the phrases that represent these intents. • This is called Query Expansion. Query Expansion can compare the results for a query in one language to the results for the same query in a different language. • If the click satisfaction possibility is higher in another language for the same intent, the search engine can re-rank the results for the first language. Invented by Stefan Riezler, Alexander L. Vasserman Assigned to Google US Patent Application 20080319962 Published December 25, 2008 Filed: March 17, 2008
  • 39. @KorayGubur Seed-Queries • Seed Queries can be synthetic queries or user-generated queries. The main requirement for a seed query is that it can be satisfied with a set of documents. • If a query is logical, popular and satisfying for the user, it will be marked as a seed query whether it is synthetic or searcher-generated. • Seed Queries are used to determine the representative queries for query variations, and for query and intent templates. Inventors Manaal Faruqui and Dipanjan Das Applicants Google LLC Publication Number 20200167379 Filed: January 18, 2019 Publication Date May 28, 2020
  • 40. @KorayGubur End of Phrase-based Indexing and Query Processing Chaos • Query Parsing • Seed Query • Substitute Query • Natural Language Query • Answer-seeking Query • Factual Query • Non-factual Query • Non-variable Portion in Query • Variable Portion in Query • Discordant Query • Query Re-writing • Open Information Extraction • Synthetic Query • Categorical Query • Contextual Vectors • Term Vectors • Intent Templates • Question and Answer Elements • Co-occurrence Matrix • Query Expansion • Query Term Weight • Multi-stage Query Processing • Query Breadth • Query Template • Relation Types and Noun Tuples • Macro-context • Topical Entry • Mid-page Query Refinement • Query Ambiguity • Query Cluster – Document Cluster for Logical Pair • Associator, Matcher, Scorer for Query, Document Association • ‘Edit Distance Score’, ‘Similarity Score’, ‘Transformation Cost Score’ • Phrase-based Indexing • Contextual Domains • Contextual Domain Word List • Query Analysis • Representative Query • Canonical Query • Minor Intent • Space Vectors • Navigational Query as a Popularity Signal • Evidence Based Ranking • Word Proximity • Word Adjacency
  • 41. @KorayGubur First Semantic Web Announcement • The Semantic Web Roadmap was published in September 1998 by Tim Berners-Lee. • Semantic HTML, the Semantic Web, and Semantic User Patterns were the principles of Semantic Search. • The main purpose of the Semantic Web is making the web understandable to machines so that machines can help human beings surf the web better. • Tim Berners-Lee talked about Agents, Ontology, Structured Data, RDFa, or Semantic HTML Tags and Digital Signature. • ‘Such an agent coming to the clinic's Web page will know not just that the page has keywords such as "treatment, medicine, physical, therapy" (as might be encoded today) but also that Dr. Hartman works at this clinic on Mondays, Wednesdays and Fridays and that the script takes a date range in yyyy-mm-dd format and returns appointment times. And it will "know" all this without needing artificial intelligence’ ‘The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.’ -Tim Berners-Lee
  • 42. @KorayGubur First Semantic Search Patent • Google’s first Semantic Search Engine patent is from 1999, one year after Tim Berners-Lee’s announcement. • The inventor is Sergey Brin himself. • The document doesn’t use legal language, like other early patents of Google. • The document says that all things of a similar type have the same features. • Things on the web can be collected for a certain type of information and stored with this information. Invented by Sergey Brin Assigned to Google US Patent 6,678,681 Granted January 13, 2004 Filed: March 9, 2000
  • 43. @KorayGubur First Semantic Search Patent • Sergey Brin encountered some problems such as Named Entity Recognition, or Main Entity and Entity Relation Detection. • These problems were not named in terms of Entities, but these books were entities with string representations. • Even a single letter difference resulted in big problems for Sergey Brin. • And, some books didn’t have a price or a proper title, and some of them were not even real books. • In the first attempt, the cost was high, the process was slow, and the results were incomplete, but Google kept going. Invented by Sergey Brin Assigned to Google US Patent 6,678,681 Granted January 13, 2004 Filed: March 9, 2000
  • 44. @KorayGubur Knowledge Graph Launch • ‘Things, not strings.’ is the motto of the Knowledge Graph. Everything on the web is divided into different entities, entity types, and entity connections. • Named Entity Recognition and Natural Language Processing increased their value and prominence within the algorithmic hierarchy of Google. • The Knowledge Graph supported the Knowledge Panels. • Fact Extraction, Question Answering, Accuracy Audit, and Entity Relations are the pillars of the Entity-oriented Search Engine. • ‘Wouldn’t it be great understanding every word of user, instead of matching words?’, by Jack Menzel. Inventors: John R. Provine Assignee: Google LLC US Patent: 10,922,326 Granted: February 16, 2021 Filed: March 14, 2013
  • 45. @KorayGubur Browsable Fact Repository • The Browsable Fact Repository is the main and primitive version of the Google Knowledge Graph. • There are three important problems for the Browsable Fact Repository. 1. Updating the Knowledge Graph. 2. Extracting the New Entities. 3. Auditing the Fact Accuracy. Invented by Andrew W. Hogue and Jonathan T. Betz Assigned to Google Inc. US Patent 7,774,328 Granted August 10, 2010 Filed: February 17, 2006
  • 46. @KorayGubur Entity-seeking Query • Today’s last Query type. • Entity-seeking Queries are one of the basic pillars of Entity-oriented search. • It identifies whether the Query seeks a single entity or multiple things of the same type. • If it is singular, the entity-seeking query will match the term and the entity based on an attribute. • Entity-seeking Queries include a Semantic Dependency Tree and a Relevance Threshold. Inventors: Mugurel Ionut Andreica, Tatsiana Sakhar, Behshad Behzadi, Marcin M. Nowak-Przygodzki, and Adrian-Marius Dumitran US Patent Application: 20190370326 Published: December 5, 2019 Filed: May 29, 2018
  • 48. @KorayGubur Structured Search Engine • Sergey Brin used the term ‘Structured Form’ in 1999. • In 2011, Andrew Hogue used the term ‘Structured Search Engine’. • Andrew Hogue introduced the Open-Domain Fact Extraction methodologies for extracting and clustering entities from the web. • Andrew Hogue showed future Google engineers some concrete examples of the direction they wanted to head. Cartoon created by Gary Larson.
  • 49. @KorayGubur Semantic Search Engine • Google can extract all attributes of an entity to understand its general features. • Depending on the Source Attribute, these features can be detected, changed, or altered. • Based on the entity types and candidate entities, Google can generate more entity types and connections between them. • Semantic Search Engine is another name for the Structured Search Engine. • Semi-structured Text Understanding, Question Generation from Keywords, and Question-Answer Pairing are the main objectives of the Semantic Search Engine.
  • 50. @KorayGubur Semantic Search Engine: This is a Query Parsing Example from a Google Engineer for Entity-oriented Search. Source: The Structured Search Engine by Andrew Hogue
  • 51. @KorayGubur Semantic Search Engine: The first step is the Named Entity Recognition process for the query. • Entity-seeking Queries are the backbone of Entity-oriented Search. • Recognizing an entity from a query is neither easy nor cheap. • Neural Matching, RankBrain, the Sub-topic Update, BERT, MUM, and LaMDA are all used for recognizing the entity and its related attributes.
  • 52. @KorayGubur Semantic Search Engine: The second step is Entity Resolution. • Entity Resolution and Attribute Extraction are for understanding the related attribute of the entity. • Entity-seeking Queries usually try to find an attribute of an entity, such as look, height, taste, inception, or history. • After the entity and its attribute are taken from the query, the next step is to determine the Question Format.
  • 53. @KorayGubur Semantic Search Engine: The third step is Synonym Extraction. • Synonym Extraction strengthens the confidence score. • Another function of Synonym Extraction is that it helps to use alternate documents for the same question. • Depending on the synonyms, the question format can change.
  • 54. @KorayGubur Semantic Search Engine: The question format is necessary for understanding the query, because it increases the confidence score and helps match similar successful documents. • The question format is important for determining the answer format. • Question term order and answer term order can increase the success rate. • The last important thing here is the ‘answer data type’, which in this example is a date.
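The parsing steps described on these slides (recognizing the entity, finding the attribute through synonyms, reading the question format, and deriving the answer data type) can be sketched as a toy pipeline. Everything here is an assumption made for illustration: the lookup tables `KNOWN_ENTITIES`, `ATTRIBUTE_SYNONYMS`, and `QUESTION_FORMATS`, the sample query, and the substring matching are hypothetical stand-ins for what a real search engine would learn from data.

```python
import re

# Hypothetical lookup tables; a real system would learn these from data.
KNOWN_ENTITIES = {"google", "eiffel tower"}
ATTRIBUTE_SYNONYMS = {
    "inception": {"founded", "established", "started"},
    "height": {"tall", "high", "height"},
}
QUESTION_FORMATS = {"when": "date", "how": "quantity", "who": "person"}


def parse_query(query):
    """Split a query into entity, attribute, question format, and answer data type."""
    text = query.lower().strip("?! .")
    tokens = re.findall(r"[a-z0-9]+", text)
    # Step 1: naive entity recognition, longest known entity name first.
    entity = next(
        (e for e in sorted(KNOWN_ENTITIES, key=len, reverse=True) if e in text),
        None,
    )
    # Steps 2 and 3: map any synonym in the query to a canonical attribute.
    attribute = next(
        (attr for attr, synonyms in ATTRIBUTE_SYNONYMS.items() if synonyms & set(tokens)),
        None,
    )
    # Question format: the leading question word decides the answer data type.
    question_word = tokens[0] if tokens and tokens[0] in QUESTION_FORMATS else None
    return {
        "entity": entity,
        "attribute": attribute,
        "question_format": question_word,
        "answer_data_type": QUESTION_FORMATS.get(question_word),
    }


parsed = parse_query("When was Google founded?")
```

For the sample query, the sketch recognizes the entity `google`, maps the word `founded` to the attribute `inception`, and concludes from `when` that the answer data type is a date.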
  • 55. @KorayGubur Semantic Search Engine: The fourth step is Entity Reconciliation and a data accuracy audit. At the next step, Google can check the related search activity and the possible search activity, and choose the best answer. • The answer formats and answer phrases are used for entity reconciliation. • Entity reconciliation includes standardizing the entity with the correct information. • Five Rand Fishkin entity records exist in the Knowledge Graph for the same Rand Fishkin.
  • 56. @KorayGubur Semantic Search Engine: Entity Reconciliation • Entity Reconciliation is another patent from Google. • It includes checking multiple sources to complete the missing information in the Knowledge Graph. • It also uses a similarity threshold between the different sources and the Knowledge Graph. • If the source is authoritative, it is easier to modify the Knowledge Graph. • Patent: Inventors: Oksana Yakhnenko and Norases Vesdapunt; Assignee: Google LLC; US Patent: 10,331,706; Granted: June 25, 2019; Filed: October 4, 2017.
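The idea of filling missing Knowledge Graph fields from a source only when the source is similar enough can be sketched as below. This is not the patent's algorithm: Jaccard similarity over shared facts, the `threshold` value, and the `Jane Example` records are assumptions chosen for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def reconcile(kg_record, source_record, threshold=0.5):
    """Fill missing knowledge-graph fields from a source when the records look alike."""
    # Compare only the fields that both records actually know.
    shared = [k for k in kg_record if k in source_record and kg_record[k] is not None]
    kg_facts = {(k, kg_record[k]) for k in shared}
    source_facts = {(k, source_record[k]) for k in shared}
    similarity = jaccard(kg_facts, source_facts)
    if similarity < threshold:
        return dict(kg_record), similarity  # too different: leave the record alone
    merged = dict(kg_record)
    for key, value in source_record.items():
        if merged.get(key) is None:
            merged[key] = value
    return merged, similarity


# Hypothetical records for a made-up entity.
kg = {"name": "Jane Example", "occupation": "author", "birth_year": None}
matching_source = {"name": "Jane Example", "occupation": "author", "birth_year": 1980}
conflicting_source = {"name": "Jane Example", "occupation": "chef", "birth_year": 1975}

merged, high_similarity = reconcile(kg, matching_source)
unchanged, low_similarity = reconcile(kg, conflicting_source)
```

One way to model the slide's point about authoritative sources would be to pass a lower `threshold` for them, although that mapping is an interpretation rather than something the slide states.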
  • 57. @KorayGubur Semantic Search Engine: “For other people it can be a little more complicated. Like me, for example, John Mueller. If you search for me you’ll find Wikipedia pages, barbecue restaurants, bands, all kinds of people who are called John Mueller. And if, on my site, I don’t specify who I actually am, then it could happen that our systems look at my page and go: ‘oh this is that guy that runs that barbecue restaurant.’ And suddenly I’m associated with a barbecue restaurant, which might be a move up, I don’t know. But these subtle things make it easier for us to recognize who is actually behind something. We call that reconciliation when it comes to structured data, kind of recognizing which of these entities belong together.” John Mueller
  • 64. @KorayGubur Semantic Search Engine: Important Terms and Concepts for NER and Semantic Search Engine: Semantic Role Labeling, Named Entity Resolution, Named Entity Extraction, Relation Detection, Lexical Semantics, Taxonomy, Ontology, Onomastics.
  • 65. @KorayGubur Semantic Search Engine: Entity Extraction • Entity Extraction is a complementary step for Named Entity Recognition. • A recognized entity can be extracted from the text to be stored in a Knowledge Base. • Entity Extraction uses attributes to connect the entity with its meaning and prominence. • Take the sentence ‘46th President of the United States (US) had decided to go to Paris on Monday, 2nd June, 2002.’ • ‘46th President of the United States’ is the named entity. • The president’s decision is the attribute, and together with its date it is included in the entity extraction.
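A rough sketch of pulling the named entity and the date out of the slide's example sentence is shown below. It uses hand-written regular expressions purely for illustration. The sentence is a lightly normalized version of the slide's example, and both patterns are assumptions that only fit this one sentence shape; production systems use trained models instead.

```python
import re

SENTENCE = (
    "The 46th President of the United States had decided to go to Paris "
    "on Monday, 2nd June, 2002."
)

# Hypothetical patterns that only cover this sentence shape.
ENTITY_PATTERN = re.compile(r"\d+(?:st|nd|rd|th) President of the United States")
DATE_PATTERN = re.compile(r"(\d{1,2})(?:st|nd|rd|th) (\w+), (\d{4})")


def extract(sentence):
    """Return the named entity and the date attached to its decision."""
    entity = ENTITY_PATTERN.search(sentence)
    date = DATE_PATTERN.search(sentence)
    return {
        "entity": entity.group(0) if entity else None,
        "date": " ".join(date.groups()) if date else None,
    }


record = extract(SENTENCE)
```

The resulting record pairs the entity string with the date, which is the kind of entry that could then be stored in a knowledge base.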
  • 66. @KorayGubur Semantic Search Engine: Entity Resolution • Entity Resolution has two phases. • The first phase is finding the mentioned entity’s correct identity. • The second phase is finding the correct profile of the mentioned entity. • For instance, Bill Clinton was a U.S. President, but also an actor in Hollywood. An American football player can also be a cook or a journalist. • To find the right entity from the entity reference, the Search Engine can use related entities and their types. • Entity Resolution helps feed the text-to-data systems of Search Engines. • If you say ‘Barry Schwartz entered the classroom and asked questions to the students’, Entity Resolution will decide that it is Professor Barry, not our Barry.
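Using related terms in the context to pick the right entity can be sketched as a simple overlap score. The candidate profiles and their context words below are hypothetical, and counting shared words is a deliberately naive stand-in for the related-entity and entity-type signals the slide mentions.

```python
# Hypothetical context words for two profiles that share a name.
CANDIDATES = {
    "Barry Schwartz (professor)": {"classroom", "students", "lecture", "psychology"},
    "Barry Schwartz (SEO journalist)": {"seo", "google", "search", "news"},
}


def resolve(mention_context, candidates=CANDIDATES):
    """Pick the candidate whose context words overlap most with the sentence."""
    tokens = {t.strip(".,").lower() for t in mention_context.split()}
    scores = {name: len(tokens & context) for name, context in candidates.items()}
    best = max(scores, key=scores.get)
    # With no overlap at all, the mention stays unresolved.
    return best if scores[best] > 0 else None


resolved = resolve(
    "Barry Schwartz entered the classroom and asked questions to the students"
)
```

In the slide's classroom sentence, the words `classroom` and `students` tip the score toward the professor profile.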
  • 67. @KorayGubur Semantic Search Engine: Relation Detection • Relation Detection is the process of understanding the relation types and labels between different entities within a text. • There are different types of relations, such as ‘isSimilarOf’, ‘locatedIn’, ‘superiorOf’, ‘closeTo’, ‘sameAs’. • Some of these relation types are familiar from Structured Data. • Some of the relation types are unique to specific entities and specific topics. • Relation Detection draws on Lexical Semantics. • Relation Detection can be used for Visual-to-text algorithms too.
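Relation detection can be illustrated with a small pattern-based sketch that turns sentences into (subject, relation, object) triples. The relation labels come from the slide; the surface patterns and the sample text are assumptions, and real systems rely on learned models rather than fixed phrases.

```python
import re

# Hypothetical surface patterns mapped to relation labels from the slide.
PATTERNS = [
    (re.compile(r"(?P<s>[A-Z]\w+) is located in (?P<o>[A-Z]\w+)"), "locatedIn"),
    (re.compile(r"(?P<s>[A-Z]\w+) is close to (?P<o>[A-Z]\w+)"), "closeTo"),
    (re.compile(r"(?P<s>[A-Z]\w+) is also known as (?P<o>[A-Z]\w+)"), "sameAs"),
]


def detect_relations(text):
    """Return (subject, relation, object) triples found by the patterns."""
    triples = []
    for pattern, label in PATTERNS:
        for match in pattern.finditer(text):
            triples.append((match.group("s"), label, match.group("o")))
    return triples


triples = detect_relations("Berlin is located in Germany. Potsdam is close to Berlin.")
```

Each detected triple has the same shape as a Structured Data statement, which is why some of the relation labels look familiar from markup vocabularies.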
  • 68. @KorayGubur Semantic Search Engine: Lexical Semantics • Every human being should know Lexical Semantics in order to think and speak in a healthy way. • Lexical Semantics covers the meaning connections between different words. • Lexical Semantics is used to understand the relational connections between named entities. • For instance, ‘Boy’ carries the meanings ‘single’, ‘teenage’, ‘male’, and ‘young’ by default, but some of these meanings have a high likelihood and some a low one. • For instance, someone who is young, male, and teenage can also be married. • Lexical Semantics is used to understand a named entity’s resolution and its connection with other things. • Lexeme: a unit that cannot be analyzed further by itself. Lexicon: a list of lexemes.
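The point that a word carries default meanings with different likelihoods can be sketched as a tiny lexicon. The numbers are invented for illustration only; they are not measurements.

```python
# Hypothetical likelihoods for the default meanings carried by a word.
LEXICON = {
    "boy": {"male": 1.0, "young": 0.95, "single": 0.9, "teenage": 0.5},
}


def default_meanings(word, min_likelihood=0.8):
    """Return the meanings a word implies with at least the given likelihood."""
    features = LEXICON.get(word.lower(), {})
    return sorted(f for f, p in features.items() if p >= min_likelihood)


strong = default_meanings("Boy")
everything = default_meanings("Boy", min_likelihood=0.4)
```

Lowering `min_likelihood` admits the weaker default meanings as well, which mirrors the slide's remark that some implied meanings are only probable.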
  • 69. @KorayGubur Semantic Search Engine: Semantic Role Labeling • Semantic Role Labeling is the process of understanding the parts of a sentence by assigning related labels. • Semantic Role Labeling draws on Lexical Semantics and Part-of-Speech Tags. • Semantic Role Labeling helps Relation Detection. • There are more than 32 Semantic Roles. • For Semantic Role Labeling, the most important part is finding the theme, predicate, agent, and effect. • Semantic Role Labeling is beneficial for auditing the content’s accuracy and for fact extraction from the prepositions.
  • 70. @KorayGubur Semantic Search Engine: Taxonomy • Taxonomy, from the Greek taxis (arrangement) and nomos (law), means the arrangement of things. • It was first used for animal classification in Ancient Greece. • In the modern era, it is used for the classification of all living things in biology, and it has since been used for the classification of chemicals and other types of existing things. • In the field of Search Engine Optimization, Semantic Entity Types and the Semantic Dependency Tree are important. • Creating a hierarchy between entities based on their type, size, prominence, or superiority and inferiority is important for increasing the contextual relevance and specifying the relevance of the article. • Every entity type has a different attribute group, and the hierarchy can be refreshed. • If the context is the size of cities, ‘berlin’, ‘paris’, ‘istanbul’ can have a different taxonomy in terms of big, medium, and small cities. • If the context is the countries of these cities, the taxonomy can be aligned with country names, region names, and continent names.
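How the same entities fall into a different taxonomy depending on the context can be sketched as follows. The population figures are rough approximations and the size cut-offs are arbitrary assumptions made only for this illustration.

```python
from collections import defaultdict

# Approximate city-proper populations in millions; illustrative values only.
CITIES = {
    "berlin": {"country": "Germany", "population_millions": 3.7},
    "paris": {"country": "France", "population_millions": 2.1},
    "istanbul": {"country": "Turkey", "population_millions": 15.5},
}


def size_class(population_millions):
    """Bucket a city by size using hypothetical cut-offs."""
    if population_millions >= 10:
        return "big"
    if population_millions >= 3:
        return "medium"
    return "small"


def build_taxonomy(context):
    """Group the same cities differently depending on the context attribute."""
    taxonomy = defaultdict(list)
    for city, attributes in CITIES.items():
        if context == "size":
            key = size_class(attributes["population_millions"])
        else:
            key = attributes[context]
        taxonomy[key].append(city)
    return dict(taxonomy)


by_size = build_taxonomy("size")
by_country = build_taxonomy("country")
```

Switching the context from `size` to `country` regroups the same three cities under different parent nodes.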
  • 71. @KorayGubur Semantic Search Engine: Ontology • Ontology completes the taxonomy. • From the Greek ontos (being) and logos (study): the essence of things. • It is a branch of philosophy. • Ontology is a reflex for all human beings. • An ontology can be created based on the mutual points of different entities. • Depending on the mutual attribute between entities, the taxonomy can change, and the ontology can follow it. • If three named entities are from the same region, the region name is the mutual attribute, and the entities can have other types of connections based on it.
  • 72. @KorayGubur Semantic Search Engine: Onomastics • Onomastics is the science of naming and of analyzing the name patterns of different languages. • Every entity type has a different naming pattern. • Name patterns are used to recognize entities, entity types, and attributes of entities. • The word comes from the Greek onomastikos (of naming), from onoma (name). • Different science names, city names, event names, situation names, or institution names can have naming patterns. • Some examples of onomastics sub-types: 1. helonyms: proper names of swamps, marshes and bogs. 2. limnonyms: proper names of lakes and ponds. 3. oceanonyms: proper names of oceans. 4. pelagonyms: proper names of seas and maritime bays. 5. potamonyms: proper names of rivers and streams. • Onomastics can be used for taxonomy and ontology creation too. Even a body of water can have multiple naming patterns based on its sub-type.
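Recognizing an entity type from its naming pattern can be sketched with the water-body sub-types listed on the slide. The keyword table is a hypothetical simplification that only handles English names containing a generic term such as "Lake" or "River".

```python
# Hypothetical keyword table mapping generic terms to onomastic classes.
KEYWORD_TO_CLASS = [
    ("swamp", "helonym"),
    ("marsh", "helonym"),
    ("bog", "helonym"),
    ("lake", "limnonym"),
    ("pond", "limnonym"),
    ("ocean", "oceanonym"),
    ("sea", "pelagonym"),
    ("bay", "pelagonym"),
    ("river", "potamonym"),
    ("stream", "potamonym"),
]


def classify_water_name(name):
    """Guess the onomastic class of a water body from the words in its name."""
    words = name.lower().split()
    for keyword, name_class in KEYWORD_TO_CLASS:
        if keyword in words:
            return name_class
    return None


lake = classify_water_name("Lake Victoria")
river = classify_water_name("Nile River")
```

Names that do not contain one of the generic terms return `None`, which shows how much a purely pattern-based approach leaves unrecognized.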
  • 73. @KorayGubur Semantic Search Engine: Important Announcements for Structured Search Engine: BERT - SMITH, MUM, LaMDA, Conversational Search.
  • 74. @KorayGubur Semantic Search Engine: BERT - SMITH • Uses a Masked Language Model: it masks 15% of the tokens and predicts them. • Uses Bidirectional Language Understanding: it reads the whole sentence at once, from both directions. • It predicts the next sentence. • SMITH handles inputs longer than 512 tokens. • Uses a fine-tuning based representation model.
  • 75. @KorayGubur Semantic Search Engine: MUM • The research papers date from March 2021. • In May 2021, they announced MUM. • In June 2021, they announced that they had started to use MUM. • The whole system is about understanding ‘Related Search Activity’ in order to predict future queries.
  • 76. @KorayGubur Semantic Search Engine: MUM • If you search for trekking to a mountain, there are three possible contexts: Trekking; Mountain; and Specific Mountain Trekking.
  • 77. @KorayGubur Semantic Search Engine: LaMDA • LaMDA connects one question to another in a way that is sensible to humans. • Criteria: Specificity, Factuality, Interestingness, Sensibleness. • LaMDA is a part of Conversational AI.
  • 78. @KorayGubur Semantic Search Engine: Conversational Search • Conversational Search is close to Conversational AI. • It connects different entities, concepts, and intents to each other. • It creates new Contextual Domains and Co-occurrence Matrices. • The Conversational Search announcement covers only past queries. • MUM and LaMDA cover future queries.
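A co-occurrence matrix over past queries can be sketched with a counter of term pairs. The sample queries are invented, and counting terms that share a query is only one of many possible ways to define co-occurrence.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(queries):
    """Count how often two terms appear together in the same query."""
    counts = Counter()
    for query in queries:
        terms = sorted(set(query.lower().split()))
        # Sorting makes (a, b) and (b, a) land on the same key.
        for a, b in combinations(terms, 2):
            counts[(a, b)] += 1
    return counts


matrix = cooccurrence(
    ["mountain trekking gear", "trekking boots", "mountain trekking routes"]
)
```

Pairs with high counts, such as `mountain` and `trekking` here, hint at a shared contextual domain.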
  • 79. @KorayGubur Semantic Search Engine: Important Language Models for the Near Future in the context of Semantic Search Engine: REALM, KELM.
  • 80. @KorayGubur Semantic Search Engine: REALM • Retrieval Augmented Language Model. • Based on the Entity Dependency Tree, missing attributes and facts can be extracted. Source: https://ai.googleblog.com/2020/08/realm-integrating-retrieval-into.html
  • 81. @KorayGubur Semantic Search Engine: REALM • First a Research Paper, then a Patent, and lastly an Update with an Official or Non-Official Statement. • Patent: Inventors: Kenton Chiu Tsun Lee, Kelvin Gu, Zora Tung, Panupong Pasupat, and Ming-Wei Chang; Assignee: Google LLC; US Patent: 11,003,865; Granted: May 11, 2021; Filed: May 20, 2020.
  • 82. @KorayGubur Semantic Search Engine: KELM • Knowledge Graph Integrated Language Model for Fact and Accuracy Checking. Source: https://ai.googleblog.com/2021/05/kelm-integrating-knowledge-graphs-with.html • Data-to-text Triple Example.
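The data-to-text idea of turning a knowledge graph triple into a sentence can be sketched with fixed templates. This is a swapped-in simplification: the approach described in the linked post uses a trained model to verbalize triples, whereas the `TEMPLATES` table and the `verbalize` helper below are hypothetical.

```python
# Hypothetical sentence templates keyed by relation label.
TEMPLATES = {
    "locatedIn": "{subject} is located in {object}.",
    "foundedBy": "{subject} was founded by {object}.",
}


def verbalize(triple):
    """Turn a (subject, predicate, object) triple into a plain sentence."""
    subject, predicate, obj = triple
    # Unknown predicates fall back to a generic subject-predicate-object sentence.
    template = TEMPLATES.get(predicate, "{subject} {predicate} {object}.")
    return template.format(subject=subject, predicate=predicate, object=obj)


sentence = verbalize(("Berlin", "locatedIn", "Germany"))
```

Sentences produced this way can be compared against web text, which is how triples could support the fact and accuracy checking the slide mentions.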
  • 83. @KorayGubur Semantic Search Engine: Encazip.com • Holistic SEO Case Study based on Semantic SEO. • Used Entity-oriented Search. • From 150 to 6,000 daily clicks.
  • 84. @KorayGubur Semantic Search Engine: An Education Brand • 11,000 queries and 30,000 monthly clicks within 25 days.
  • 85. @KorayGubur Semantic Search Engine: An unpublished case study • 422,000 queries and 220,000 clicks in 66 days. • It is also a Technical SEO Case Study. • Indexed 73,000 pages in 66 days.
  • 86. @KorayGubur Semantic Search Engine: 15,000 new queries and 35,000 monthly traffic in 3 months. • Used Semantic SEO.
  • 87. @KorayGubur ‘Without understanding Query Processing through the eyes of the Search Engine, you can’t create a relevant and satisfying document based on minor and dominant search activity types.’ Thank You