This document summarizes several patents related to query parsing and semantic search. It describes patents for multi-stage query processing, query breadth, query analysis, midpage query refinements (search suggestions), context vectors, and categorical quality (re-ranking search results based on the category of the query). Each patent is briefly described, including inventors, filing dates, and some technical details. The document aims to provide an overview of the evolution of semantic search and query understanding technologies at Google.
Search Query Processing: The Secret Life of Queries, Parsing, Rewriting & SEO
Koray Tugberk GUBUR
Query Processing covers query term weight calculation, query augmentation, query context definition, and more. Query understanding and query clustering are Information Retrieval tasks for search engines. To improve search engine optimization efforts and project results, organic search performance optimizers need to apply query processing methodologies. Digital marketing and SEO are connected to each other. Understanding a query includes query parsing, query rewriting, question generation, and answer pairing. Multi-stage Query Processing, Candidate Answer Passages, and Answer Term Weighting are some of the concepts the Google search engine uses to parse queries.
The presentation "The Secret Life of Queries, Parsing, Rewriting & SEO" was given at the Brighton SEO event in April 2022. The talk focused on explaining theoretical SEO and practical SEO examples together.
Query Processing methodologies go beyond synonym matching or synonym finding. They involve multiple aspects of words and their meanings: the theme of words, the centrality of words, attention windows, context windows, word co-occurrence matrices, GloVe, Word2Vec, word embeddings, character embeddings, and more.
Themes of words contain the word probability, as in Continuous Bag of Words (CBOW).
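The context window and co-occurrence ideas above can be sketched in a few lines. This is only an illustration, assuming whitespace tokenization and made-up sample queries; the function names `cooccurrence_counts` and `cbow_pairs` are hypothetical and are not taken from the slides or from any library.

```python
from collections import Counter

def cooccurrence_counts(queries, window=2):
    """Count how often two words appear within `window` tokens of each other."""
    counts = Counter()
    for query in queries:
        tokens = query.lower().split()
        for i, target in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    counts[(target, tokens[j])] += 1
    return counts

def cbow_pairs(tokens, window=2):
    """Build (context_words, target_word) pairs, the training input shape of CBOW."""
    pairs = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + window + 1]
        if context:
            pairs.append((context, target))
    return pairs

# Made-up sample queries, used only for illustration.
sample_queries = [
    "best running shoes",
    "best trail running shoes",
    "running shoes for women",
]
counts = cooccurrence_counts(sample_queries, window=2)
pairs = cbow_pairs(["best", "running", "shoes"], window=1)
```

A CBOW model learns to predict the target word from its context words, so the pairs above are the kind of input it would be trained on; the co-occurrence counts are the raw material for count-based methods such as GloVe.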
The search engine optimization community approaches keyword research by matching queries. Query processing involves changes to query word order, word type, and word combination, as well as query phrase synonym usage, query question generation, and query clustering. Query processing and document processing are correlated. Query processing aims to understand a query, while document processing aims to process a web document. Both processes serve the ranking algorithms. A better ranking algorithm requires better query understanding, and achieving better rankings as an SEO requires a better understanding of search engines. Thus, understanding the methods of query processing is necessary.
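A few of the query changes listed above (synonym usage, word order change, question generation) can be illustrated with a toy rewriter. This is a minimal sketch, not how any search engine implements rewriting; the `SYNONYMS` table and the question templates are hypothetical values chosen for the example.

```python
from itertools import permutations

# Hypothetical synonym table, for illustration only.
SYNONYMS = {"cheap": ["affordable"], "laptop": ["notebook"]}

# Hypothetical question templates, for illustration only.
QUESTION_TEMPLATES = ["what is the best {query}", "where to buy {query}"]

def synonym_variants(query):
    """Replace one token at a time with each of its known synonyms."""
    tokens = query.split()
    variants = set()
    for i, token in enumerate(tokens):
        for synonym in SYNONYMS.get(token, []):
            variants.add(" ".join(tokens[:i] + [synonym] + tokens[i + 1:]))
    return sorted(variants)

def order_variants(query):
    """Generate every other word order of the query."""
    tokens = query.split()
    return sorted({" ".join(p) for p in permutations(tokens)} - {query})

def question_variants(query):
    """Turn a keyword query into question-shaped queries."""
    return [template.format(query=query) for template in QUESTION_TEMPLATES]

rewrites = synonym_variants("cheap laptop") + order_variants("cheap laptop")
questions = question_variants("cheap laptop")
```

Word order permutations grow factorially with query length, so a sketch like this is only practical for short queries.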
Search Query Processing is the implementation of query processing for search engines. A search query is the phrase that search engine users type when searching. Search intent understanding and search intent grouping are two different things, but query templates, question templates, and document templates work together. Search queries reflect organic search behaviors. A web search engine answers millions of queries every day. Search query processing is a fundamental task for search engine optimization and search engine result page optimization.
The "Semantic Search Engine: Query Processing" slides from Koray Tuğberk GÜBÜR supported the presentation of "Search Query Processing: The Secret Life of Queries, Parsing, Rewriting & SEO". The presentation was created by dear Rebecca Berbel.
Many thanks to the Google engineers who created the Semantic Search Engine patents, including Larry Page.
Semantic Content Networks - Ranking Websites on Google with Semantic SEO
Koray Tugberk GUBUR
Semantic Content Networks are semantic networks of things with relations, directed graphs, attributes, and facts. Every declaration and proposition for semantic search represents a factual repository. Open Information Extraction is a methodology for creating a semantic network. The Knowledge Base and the Knowledge Graph are connected to each other in terms of factual repository usage. The Knowledge Base represents a factual repository with descriptions and triples. The Knowledge Graph is the visualized version of the Knowledge Base. A semantic network is a form of knowledge representation. It is useful for understanding the value of an individual node, or the similar and distant members of the same network. Semantic networks are implemented for search engine result pages, and they are used to create factual, connected question and answer networks. A semantic network can consist of textual and visual content, and it includes lexical parts and lexical units.
Links, nodes, and labels are the parts of a semantic network. The procedural parts are constructors, destructors, writers, and readers; they expand the semantic network and refresh the information in it.
The structural part has links and nodes. The semantic part has the associated meanings, which are represented as labels.
Semantic content networks have different types of relations.
Semantic content networks have "AND/OR" trees.
Semantic content networks have "Relation Type Examples" with "IS-A" hierarchies.
Semantic content networks have "IS-PART" hierarchies.
Inheritance, reification, multiple inheritance, range queries and values, intersection search, complex semantic networks, inferential distance, partial ordering, semantic distance, and semantic relevance are concepts from semantic networks.
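To make the nodes, labeled links, and inheritance concrete, here is a small sketch of a semantic network with IS-A lookups. The class name `SemanticNetwork`, the label strings (`is_a`, `part_of`, `can`), and the canary example are assumptions made for illustration; they do not come from the slides.

```python
class SemanticNetwork:
    """Toy semantic network: nodes connected by labeled, directed links."""

    def __init__(self):
        self.links = {}  # (source, label) -> set of target nodes

    def add(self, source, label, target):
        """Constructor-style procedural part: add one labeled link."""
        self.links.setdefault((source, label), set()).add(target)

    def targets(self, source, label):
        """Reader-style procedural part: follow one labeled link."""
        return self.links.get((source, label), set())

    def ancestors(self, node):
        """All nodes reachable through is_a links, nearest first."""
        seen, order, frontier = set(), [], [node]
        while frontier:
            next_frontier = []
            for current in frontier:
                for parent in sorted(self.targets(current, "is_a")):
                    if parent not in seen:
                        seen.add(parent)
                        order.append(parent)
                        next_frontier.append(parent)
            frontier = next_frontier
        return order

    def inherited(self, node, label):
        """Look up `label` on the node itself, then on its is_a ancestors."""
        for current in [node] + self.ancestors(node):
            found = self.targets(current, label)
            if found:
                return found
        return set()

net = SemanticNetwork()
net.add("canary", "is_a", "bird")
net.add("bird", "is_a", "animal")
net.add("wing", "part_of", "bird")
net.add("bird", "can", "fly")
net.add("canary", "has_color", "yellow")
```

Here `net.inherited("canary", "can")` finds the value on "bird", because "canary" has no `can` link of its own. The number of is_a steps needed to find a value is one simple way to think about inferential distance.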
Semantic networks help in understanding semantic search engines and semantic SEO because they contain all of the related lexical relations, semantic role labels, entity-attribute pairs, and triples such as entity, predicate, and object. Search engines prefer to use semantic networks to understand the factuality of a website. Knowledge-Based Trust is related to semantic networks because it provides a factuality-related trust score to balance PageRank. Knowledge-Based Trust was announced by Luna DONG. Ramanathan V. Guha is another inventor from Google and Schema.org. He focuses on the semantic web and semantic search engine behaviors, and he researched and invented technologies related to semantic search engines.
Semantic Content Networks are used as a concept by Koray Tuğberk GÜBÜR, the founder of Holistic SEO & Digital. Expressing semantic content networks helps to shape semantic networks via textual and visual content pieces. Semantic content networks help to shape the truth on the open web, and help a search engine to rank a website even if there is no external PageRank flow.
The Reason Behind Semantic SEO: Why does Google Avoid the Word PageRank?
Koray Tugberk GUBUR
This article delves into the concepts of Semantic SEO, Topical Authority, and PageRank, exploring their relationships and how they benefit both website owners and search engines. By leveraging Natural Language Processing (NLP) techniques, Semantic SEO improves search engine comprehension of content and enhances user experience, ultimately leading to better search results.
In the ever-evolving world of Search Engine Optimization (SEO), understanding the intricate connections between Semantic SEO, Topical Authority, and PageRank is crucial for webmasters, content creators, and marketers. These concepts play a vital role in enhancing the visibility and relevance of websites in search results.
Semantic SEO: Going Beyond Keywords
Semantic SEO involves optimizing content by focusing on the meaning and context of words, phrases, and sentences rather than merely targeting specific keywords. This is achieved through NLP techniques such as topic modeling, sentiment analysis, and entity recognition, which allow search engines to comprehend the true essence of content.
Topical Authority: Establishing Expertise and Trustworthiness
Topical Authority refers to the perceived expertise of a website or content creator in a specific subject area. By producing high-quality, relevant, and in-depth content, websites can establish themselves as authorities, earning the trust of both users and search engines. This translates into higher search rankings and increased visibility.
PageRank: Measuring the Importance of Webpages
PageRank is an algorithm used by Google to determine the significance of a webpage by analyzing the quality and quantity of its inbound links. A higher PageRank implies that a website is more authoritative and valuable, thus warranting a better position in search results.
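As a rough illustration of how inbound links turn into a score, the sketch below runs the classic PageRank power iteration on a tiny made-up link graph. It is a simplified textbook version, not Google's production system; the damping factor of 0.85, the iteration count, and the four-page graph are assumptions for the example.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    `graph` maps each page to the list of pages it links to.
    Every linked page must also appear as a key in `graph`.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every page receives the same small baseline share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, outlinks in graph.items():
            if outlinks:
                # A page splits its damped rank evenly among its outlinks.
                share = damping * rank[node] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Made-up link graph: page "c" has the most inbound links, "d" has none.
link_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(link_graph)
```

In this graph, "c" ends up with the highest score because three pages link to it, and "a" benefits from being the only page "c" links to, which shows that the rank of the linking page matters as well as the number of links.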
The Interrelation of Semantic SEO, Topical Authority, and PageRank
Semantic SEO, Topical Authority, and PageRank are interconnected concepts that work in tandem to improve a website's search performance. By focusing on Semantic SEO, content creators can enhance their Topical Authority and establish a solid online presence. This, in turn, can lead to higher PageRank and improved search visibility.
The Benefits of Semantic SEO for Search Engines
Semantic SEO not only benefits website owners but also search engines by reducing the cost of understanding documents. With the help of NLP techniques, search engines can efficiently analyze and comprehend content, making it easier to identify and index relevant webpages. This ultimately leads to more accurate search results and a better user experience.
In conclusion, embracing Semantic SEO, Topical Authority, and PageRank is essential for achieving higher search rankings and increased online visibility. By leveraging NLP techniques, Semantic SEO offers a more sophisticated and efficient approach to understanding and optimizing content, ultimately benefiting both website owners and search engines.
Lexical Semantics, Semantic Similarity and Relevance for SEO
Koray Tugberk GUBUR
Lexical semantics and relations between words include relations of superiority, inferiority, part, whole, opposition, and sameness between the meanings of words. The same word can be a meronym, hyponym, or antonym of another word, depending on the word before or after it. The lexical relation value of the first word can affect the structure of the next word, which affects the context of the sentence and the Information Retrieval Score. The Information Retrieval Score (IR Score) measures how closely content is related to a query, how close the different variants of the related query are, and how well the structure processed by the search engine's query processor matches the relevant document. A higher IR Score represents better relevance and possible click satisfaction.
The problem with a semi-structured and distracting context is that, if a document is not configured for a single topic, the IR Score can be diluted by the two different contexts, resulting in a relative rank loss to another textual document.
IR Score Dilution involves badly structured lexical relations, along with bad word proximity. The relevant words that complete each other within the meaning map should be used close together, within a paragraph or section of the document, to signal the context more clearly and increase the IR Score. A search engine can check whether the document contains a hyponym of the words within the query. A possible query prediction can be generated from the hypernyms of the query. A search engine can check only the anchor texts to see whether there is a word within the "hyponym distance", which represents the hyponym depth between two different words.
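The hyponym check and the "hyponym distance" described above can be sketched with a small hand-written taxonomy. This is a toy example under stated assumptions: the `HYPERNYM_OF` table, the function names, and the `max_depth` threshold are all hypothetical, and a real system would use a far larger lexical resource.

```python
# Hypothetical child -> parent (hypernym) table, for illustration only.
HYPERNYM_OF = {
    "sneaker": "shoe",
    "boot": "shoe",
    "shoe": "footwear",
    "sandal": "footwear",
    "footwear": "clothing",
}

def hypernym_chain(word):
    """Return the word followed by its hypernyms, from nearest to most general."""
    chain = [word]
    while chain[-1] in HYPERNYM_OF:
        chain.append(HYPERNYM_OF[chain[-1]])
    return chain

def hyponym_distance(word, ancestor):
    """Number of is-a steps from `word` up to `ancestor`, or None if unrelated."""
    chain = hypernym_chain(word)
    return chain.index(ancestor) if ancestor in chain else None

def contains_hyponym(query_word, tokens, max_depth=2):
    """Check whether any token is a hyponym of `query_word` within `max_depth` steps."""
    for token in tokens:
        distance = hyponym_distance(token, query_word)
        if distance is not None and 0 < distance <= max_depth:
            return True
    return False

anchor_text = ["red", "sneaker"]
covers_footwear = contains_hyponym("footwear", anchor_text)
query_predictions = hypernym_chain("sneaker")[1:]
```

The last line shows the reverse direction mentioned in the text: walking up the hypernyms of a query term gives broader candidate terms that could feed a query prediction.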
Lexical Relations can represent the semantic annotations for a document. A semantic annotation is a word that describes the document overall in terms of category and main context that carries the purpose of the document. A semantic annotation can contain the main entity of the document or a general concept for covering a broader meaning area (knowledge domain). Semantic Annotations can be generated with the lexical relations between words. A semantic annotation can be used to match the document to the query. Semantic annotations are factors for a better IR Score.
A search engine can generate phrase patterns from the lexical relationships between words within the queries or the documents. A phrase pattern contains sections that define a concept with qualifiers. Phrase patterns can contain a hyponym just after an adjective, or a hypernym with the antonym of the same adjective. Many of these connections and patterns are used within Recurrent Neural Networks (RNNs) for next-word prediction. A phrase pattern helps a search engine to increase its confidence score for relating the document to the specific query, or to the meaning of the query.
40 Deep #SEO Insights for 2023:
-In 2022, I said to focus on Natural Language Generation, and it happened.
-In 2023, F-O-C-U-S on "Information Density, Richness, and Unique Added Value" with Microsemantics.
I call the collection of these, "Information Responsiveness".
1/40 🧵.
1. PageRank Increases its Prominence for Weighting Sources
Reason: #AI and automation will bloat the web, and the real authority signals will come from PageRank, and Exogenous Factors.
The expert-like AI content and real expertise are differentiated with historical consistency.
2. Indexing and relevance thresholds will increase.
Reason: A bloated web creates the need for unique value to be added to the web with real-world expertise and organizational signals. The knowledge domain terms, or #PageRank, will be important in the future of a web source.
3. AI and #automation filters will be created.
Reason: Google needs to filter the websites that publish 500 articles a day on multiple topics to find non-expert websites. This is already happening.
4. #Google will start to make mistakes in filtering websites that use spam and AI.
Reason: The need for AI-generated content filtration forced Google to check and audit "momentum", in other words, content publication frequency.
I first used "momentum" in the Topical Authority (TA) Case Study.
5. Google uses #Author Vectors, and Author Recognition.
Reason: LLMs use certain types of language styles and word sequences, leaving a watermark behind them. It is easy to identify which websites do not use a real expert for their articles and content.
6. #Microsemantics will be the name of the next game.
Reason: The bloating on the web will create bigger web document clusters, and being a representative source will be more important.
Thus, micro-differences inside the content will create higher unique value.
7. Custom #LLMs will be rented.
Reason: Custom and unique LLMs will be trained and rented to the people who try to create 100 websites with 100,000 content items per website.
NLP in SEO will show its true monetary value in mid-2023.
8. Advanced Semantic SEO will be a must for every SEO.
Reason: 20-year-old websites will lose their rankings to new websites that come with 60,000 articles. This creates the need for advanced #Semantics and Linguistics capabilities for SEOs.
9. Cost-of-retrieval will be a base concept for #SEO, as TA.
Reason: TA explains a big portion of how the web works. Information Responsiveness and Cost-of-retrieval will complete it further.
For two books, I will be publishing only these two concepts.
10. Google Keys
Reason: The biggest Google leak after Quality Rater Guidelines will happen in 2023. And, I will be involved, but no more information, for now, I am not allowed to share more.
Check the slides for the next SEO Insights for 2023.
#searchengineoptimization #future #nlp #semantic #chatgpt #ai #content #quality #publishing #trend #seotrend #seo #searchengineoptimisation
Opinion-based Article Ranking for Information Retrieval Systems: Factoids and...
Koray Tugberk GUBUR
How Do Search Engines Leverage Opinion-based Articles for Ranking?
Search engines use opinions and factoids to understand the consensus. News search engines use different reports and opinions in their search results to satisfy newsreaders' urgent need for news information. News search engines differentiate disinformation from information to protect newsreaders. Google, Microsoft Bing, Yandex, and DuckDuckGo have different algorithms and priorities for classifying news sources and for prioritizing news and newsworthy topics.
Corroboration of the Web Answers from the Open Web is a research paper from Amelia Marian and Minji Wu explaining how a search engine can rank information according to its accuracy.
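The general idea of corroboration, ranking an answer by how many sources agree on it, can be sketched as below. This is a deliberately simplified illustration and not the method from the paper; the `corroborate` function, the optional `source_weight` trust values, and the sample extractions are all hypothetical.

```python
from collections import defaultdict

def corroborate(extractions, source_weight=None):
    """Rank candidate answers by the total weight of the sources supporting them.

    `extractions` is a list of (source, answer) pairs.
    `source_weight` optionally maps a source to a trust weight (default 1.0).
    Returns (answer, normalized_score) pairs, best first.
    """
    source_weight = source_weight or {}
    scores = defaultdict(float)
    seen = set()
    for source, answer in extractions:
        key = (source, answer.strip().lower())
        if key in seen:
            continue  # a source repeating itself adds no new evidence
        seen.add(key)
        scores[key[1]] += source_weight.get(source, 1.0)
    total = sum(scores.values())
    if total == 0:
        return []
    return sorted(
        ((answer, score / total) for answer, score in scores.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )

# Made-up extractions: three sources answering the same factual question.
sample = [
    ("site-a", "1889"),
    ("site-b", "1889"),
    ("site-c", "1887"),
    ("site-a", "1889"),  # duplicate from the same source, ignored
]
ranked = corroborate(sample)
weighted = corroborate(sample, source_weight={"site-c": 5.0})
```

The second call shows why source weighting matters: a single highly trusted source can outweigh two ordinary ones, so the choice of weights largely decides the outcome.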
Google started to explain that Expertise-Authoritativeness-Trustworthiness is the most important group of signals for making sure that a result won't embarrass the search engine. Embarrassment factors for search engines include wrong information in the title of a news story, or a wrong featured snippet. A search engine might be embarrassed by a bad result that is ranking on the SERP.
Related concepts include dense retrieval, context scoring, named entity recognition, semantic role labeling, truth ranges, fix points, confidence scores, query processing, and parsing.
Context understanding requires processing the text, and tokenizing the words by recognizing the word sense. Processing the text of the news articles requires time. And, most of the time, news search engines do not have enough time for processing the text. Thus, PageRank provides a sustainable timeline for the news sources for rankings.
PageRank is a quick signal for search engines to show the authenticity of the news web source. The highly cited sources are ranked higher, and longer on the top stories. Usually, Google protects the high PageRank sources by trusting the judgment of the websites. But, fact-finding algorithms do not use PageRank mostly, unless they couldn't decide by looking at other factors, or they do not have enough resources to process the text among the hundreds of sources.
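Why PageRank is cheap compared with text processing can be seen in a minimal sketch of the classic power-iteration algorithm, which needs only the link graph and never reads article text. This is the textbook formulation, not Google's production system; the three domain names and the link graph are invented for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the teleport share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical citation graph between three news sources.
links = {
    "wire.example": ["daily.example"],
    "blog.example": ["daily.example"],
    "daily.example": ["wire.example"],
}
ranks = pagerank(links)
top = max(ranks, key=ranks.get)
```

In this toy graph `daily.example` is cited by both other sources and comes out on top, while `blog.example`, which nobody cites, keeps only the teleport share.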
News ranking algorithms differentiate opinions, reports, and breaking news from each other. News-related entities, their co-existence, and their contextual relations change. Google inventors suggest differentiating these entities from each other for proper news categorization.
News categorization is important for matching users' topics of interest in queryless news feeds such as Google Discover. Google Discover is a queryless news feed that serves news stories according to the users' interest areas.
An opinion in a news story might be misleading. Some news titles might be too harsh or strict. Search engines use these headlines to differentiate the non-trustworthy news sources from the trustworthy ones. The opinions of journalists, or their different interpretations of events, might also change the ranking of a document according to the fact-finding algorithms.
Coronavirus and Future of SEO: Digital Marketing and Remote Culture Koray Tugberk GUBUR
I attended a great SEO and Digital Marketing webinar with Mr. Mert Erkal, founder of Stradiji and SEMrush Turkey Lead, and my dear friend, SEO consultant Atakan Erdoğan.
Small Note: After I uploaded the presentation, Google launched a new Covid-19 news page, similar to Bing/covid-19. You may want to look at it -> https://www.google.com/covid-19
I prepared a presentation about the coronavirus's effects on Search Engine Optimization (SEO).
You will find how the coronavirus is changing digital marketing and the psychology of global society when using search engines.
I also focused on how search engines, social media brands, and e-commerce sites reacted to the coronavirus pandemic.
You will see which websites and categories gained traffic and which lost it. You will also see the differences in conversion rates caused by the coronavirus.
I also covered the differences between search engines, their attitudes towards the coronavirus pandemic, their future, and their updates during the pandemic.
In the last part, you will see some new 2020 web technology and design trends involving AI.
There is also Google research on better search engine technologies.
Questions:
1- What are the differences between Yandex, Google, Bing, and DuckDuckGo during the coronavirus pandemic?
2- Twitter, Instagram, Amazon, or Apple: what are they doing?
3- What do people search for most during the coronavirus crisis?
4- What changes from country to country?
5- What are the future technologies of the web and apps?
6- How and why do search engines improve AI, and what are the latest developments?
7- Which sites lose traffic and which gain more?
8- Lots of quotes from international SEOs about the pandemic.
And more...
I am Koray Tuğberk GÜBÜR, a Holistic SEO Expert.
I sincerely thank my dear friend Atakan Erdoğan and Mr. Mert Erkal for this awesome webinar opportunity and experience.
To watch the webinar, please visit Stradiji's official YouTube channel.
https://www.youtube.com/watch?v=V4sJTNcRqaM&t=100s
Slawski New Approaches for Structured Data: Evolution of Question Answering Bill Slawski
Google has moved from Search to Knowledge. Its focus on answering questions with Knowledge Graph entity information has led to answering queries with knowledge graphs built for those questions, with confidence scores between entities and other entities or attributes of entities, based upon freshness, reliability, popularity, and proximity between an entity and another entity or an attribute.
Google Lighthouse is super valuable but it only checks one page at a time.
Hamlet will show you how to get it to check all pages of a site, and how to run automated Lighthouse checks on-demand at scheduled intervals and from automated tests.
He'll also cover how to set performance budgets, how to get alerts when budgets are exceeded, and how to aggregate page reports using BigQuery and Google Data Studio.
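A performance budget check of the kind described above can be sketched in a few lines of Python. This is not the speaker's implementation. It assumes the report is a parsed Lighthouse JSON file whose `audits` entries expose a `numericValue`, and the budget numbers, the sample report, and the `over_budget` helper are hypothetical.

```python
def over_budget(report, budgets):
    """Return {audit_id: (measured, limit)} for every audit above its budget."""
    failures = {}
    audits = report.get("audits", {})
    for audit_id, limit in budgets.items():
        audit = audits.get(audit_id)
        if audit is None or audit.get("numericValue") is None:
            continue  # Audit missing from this report; nothing to compare.
        measured = audit["numericValue"]
        if measured > limit:
            failures[audit_id] = (measured, limit)
    return failures


# Hypothetical budgets: milliseconds for timings, unitless for layout shift.
budgets = {
    "largest-contentful-paint": 2500,
    "total-blocking-time": 200,
    "cumulative-layout-shift": 0.1,
}

# Trimmed stand-in for one page's parsed report; the numbers are made up.
report = {
    "audits": {
        "largest-contentful-paint": {"numericValue": 3100.0},
        "total-blocking-time": {"numericValue": 120.0},
        "cumulative-layout-shift": {"numericValue": 0.05},
    }
}

failures = over_budget(report, budgets)
```

Running the same function over one report per URL gives a site-wide view, and a non-empty `failures` dict is a natural trigger for an alert or a failing automated test.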
The Python Cheat Sheet for the Busy Marketer Hamlet Batista
What percentage of an Inbound marketer's day doesn't involve working with spreadsheets? How much of this work is time-consuming and repetitive? In this interactive session, you will learn how to manipulate Google Sheets to automate common data analysis workflows using Python, a very easy to use programming language.
BrightonSEO March 2021 | Dan Taylor, Image Entity Tags Dan Taylor
My talk from BrightonSEO 2021; focusing on using Google's image category labels (glancing into the Knowledge Graph and Google's image annotation processes) for better topic research and content optimization.
SEO Case Study - Hangikredi.com From 12 March to 24 September Core Update Koray Tugberk GUBUR
Start Summary:
"131% Organic Session Increase in 5 Months
62% Impression Increase in 5 Months
144% Clicks Increase in 5 Months"
This SEO case study is about Google Core Updates and their impact on the biggest financial institution website in Turkey.
I started working at Hangikredi.com on 26 March 2019. By then, the company's website had been hit very hard by the 12 March Google Core Update.
I started working there in the middle of a crisis.
I examined the website and figured out that the real problems were crawl budget, authority signals, and the relevancy-entity connection. I activated the social media and Google My Business accounts, and joined financial forums and every other alternative channel. I created a network of news publishers writing about us. I cleaned up the misleading status codes and the HTML and CSS mistakes, optimised meta tags, fixed the redirection chains, compressed images, deleted lots of unnecessary URLs and their contents, and rebuilt the internal link structure from scratch.
By the 5 June Google Core Update, we were winners again.
We had regained all of our lost traffic. Until the 1 August server attack we were okay; then, in one day, everything went wrong.
I had to start from zero again...
I optimised the website's off-page signals to regain the trust of Google's AI, and I supported this strategy with on-page elements.
After the 24 September Google Core Update, there was another success. We broke the site's all-time records for crawl load/rate, average position, CTR, impressions, and clicks.
In this case study, you will find the details of an SEO success story with graphics, along with some funny censored images from my life.
End Summary:
"The 12 March, 5 June, and 24 September Google Core Updates, together with the 1 August server attack, are the milestones of this SEO case study. You will find all the details from our point of view. I hope you will like it."
My presentation at the Semantic Technology and Business Conference in San Jose on August 19, 2014, with Barbara Starr (Her slides are separate, and cover a vast array of semantic tools and approaches for assessing and understanding your pages).
Building a semantic search system - one that can correctly parse and interpret end-user intent and return the ideal results for users’ queries - is not an easy task. It requires semantically parsing the terms, phrases, and structure within queries, disambiguating polysemous terms, correcting misspellings, expanding to conceptually synonymous or related concepts, and rewriting queries in a way that maps the correct interpretation of each end user’s query into the ideal representation of features and weights that will return the best results for that user. Not only that, but the above must often be done within the confines of a very specific domain - ripe with its own jargon and linguistic and conceptual nuances.
This talk will walk through the anatomy of a semantic search system and how each of the pieces described above fit together to deliver a final solution. We'll leverage several recently-released capabilities in Apache Solr (the Semantic Knowledge Graph, Solr Text Tagger, Statistical Phrase Identifier) and Lucidworks Fusion (query log mining, misspelling job, word2vec job, query pipelines, relevancy experiment backtesting) to show you an end-to-end working Semantic Search system that can automatically learn the nuances of any domain and deliver a substantially more relevant search experience.
How to approach SEO in a world where Google has moved from strings and keywords to things, topics, and entities. Dixon Jones is the CEO of InLinks, which has built a proprietary NLP algorithm and Knowledge Graph designed for the SEO industry.
SEO Success Story - Hangikredi.com From 12 March to 24 September Google Core Updat... Koray Tugberk GUBUR
Start Summary:
"131% Organic Session Increase in 5 Months
62% Impression Increase in 5 Months
144% Click Increase in 5 Months"
This SEO case study is about Google Core Updates and their effects on the biggest financial institution website in Turkey.
I started working at Hangikredi.com on 26 March 2019. However, the company's website had been affected very negatively by the 12 March Google Core Update.
I started working here while a crisis was under way.
I examined the website and understood that the real problems were crawl budget, authority signals, and the relevance-entity relationship. I activated social media and the Google My Business accounts, and entered financial forums and every other alternative channel. I created a news publisher network about us. I cleaned up the misleading status codes and the HTML and CSS errors, optimized the meta tags, fixed the redirect chains, used image compression, deleted many unnecessary URLs and their contents, and built the internal link structure from scratch.
By the 5 June Google Core Update, we were winning again.
We regained all of the traffic we had lost. We were fine until the 1 August server attack; then, in one day, everything went wrong.
I started from 0 again...
I was optimizing our website's off-page signals to regain the trust of Google's AI, and I was supporting this strategy with on-page elements.
After the 24 September Google Core Update, there was another success. We broke the site's historical records for crawl load/rate, average position, CTR, impressions, and clicks.
In this case study, you will find the details of an SEO success story with graphics, along with some funny censored images from my life.
End Summary:
"The 12 March, 5 June, and 24 September Google Core Updates, together with the 1 August server attack, are the milestones of this SEO case study. You will see all the details from our point of view. I hope you will like it."
How to Incorporate ML in your SERP Analysis, Lazarina Stoy -BrightonSEO Oct, ... LazarinaStoyanova
How are you currently doing SERP analysis? Is your approach efficient or scalable?
How can SEOs incorporate programmatic approaches and advances in machine learning in order to identify winning strategies?
This talk will leave you with a better understanding of what is possible for SERP analysis at scale, what insights you can capture with the help of machine learning quickly, and how to incorporate insights into your strategy and visualize your findings to impress your stakeholders.
Presentation given at the British Library Turing workshop on Software Citation, considering what lessons could be learned from the world of data citation
How to manage the complete content strategy in WordPress using plugins. Do your content inventory in WordPress -- no spreadsheets! Do content modeling using custom post types, taxonomies, and fields. Video: http://wordpress.tv/2013/08/02/stephanie-leary-content-strategy-wordpress-case-studies/
Public access to research results at USDA Cyndy Parr
An update on public access activities at the National Agricultural Library and next steps, presented 11 January 2017 at the Earth Science Information Partners (ESIP) meeting in Bethesda, Maryland.
Workshop - finding and accessing data - Cambridge August 22 2016 Fiona Nielsen
Finding and accessing human genomic data for research
University of Cambridge, United Kingdom | Seminar Room G
Monday, 22 August 2016 from 10:00 to 12:00 (BST)
Charlotte, Nadia and Fiona presented an overview of data sources around the world where you can find genomics data for your research and gave examples of the data access application for dbGaP and EGA with specific details relevant for University of Cambridge researchers.
SciDataCon 2014 Data Papers and their applications workshop - NPG Scientific ... Susanna-Assunta Sansone
Part of the SciDataCon14 workshop on "Data Papers and their applications" run by myself and Brian Hole to help attendees understand current data-publishing journals and trends and help them understand the editorial processes on NPG's Scientific Data and Ubiquity's Open Health Data.
Finding things to write about can be difficult for bloggers. Here is how to get the most out of your content by using resources already available to you.
Lesson 8 in a set of 10 created by DataONE on Best Practices for Data Management. The full module can be downloaded from the DataONE.org website at: http://www.dataone.org/educaiton-modules. Released under a CC0 license, attribution and citation requested.
There is a method to it: Making meaning in information research through a mix... Lynn Connaway
Connaway, L. S., Faniel, I. M., Narayan, B., & Abdi, E. S. (2019). There is a method to it: Making meaning in information research through a mix of paradigms and methods. Panel presented at ASIS&T Annual Meeting, October 21, 2019, Melbourne, Australia.
Scaling Recommendations, Semantic Search, & Data Analytics with solr Trey Grainger
This presentation is from the inaugural Atlanta Solr Meetup held on 2014/10/21 at Atlanta Tech Village.
Description: CareerBuilder uses Solr to power their recommendation engine, semantic search, and data analytics products. They maintain an infrastructure of hundreds of Solr servers, holding over a billion documents and serving over a million queries an hour across thousands of unique search indexes. Come learn how CareerBuilder has integrated Solr into their technology platform (with assistance from Hadoop, Cassandra, and RabbitMQ) and walk through api and code examples to see how you can use Solr to implement your own real-time recommendation engine, semantic search, and data analytics solutions.
Speaker: Trey Grainger is the Director of Engineering for Search & Analytics at CareerBuilder.com and is the co-author of Solr in Action (2014, Manning Publications), the comprehensive example-driven guide to Apache Solr. His search experience includes handling multi-lingual content across dozens of markets/languages, machine learning, semantic search, big data analytics, customized Lucene/Solr scoring models, data mining and recommendation systems. Trey is also the Founder of Celiaccess.com, a gluten-free search engine, and is a frequent speaker at Lucene and Solr-related conferences.
February 18 2015 NISO Virtual Conference Scientific Data Management: Caring for Your Institution and its Intellectual Wealth
Learning to Curate Research Data
Jennifer Doty, Research Data Librarian, Emory Center for Digital Scholarship, Emory University, Robert W. Woodruff Library
How to evaluate the whole web (without being Google) Dixon Jones
Could you build your own, private view of the Internet? One that isn't reliant on Google or Bing? Majestic has done this and now has one of the largest web indexes on the planet. Whilst known as a backlink analysis engine, Majestic in fact has its own, unique view of the Internet and is able to derive meaning, influence and context out of its dataset. Here's how they did it. (2018)
2014 Family Search Developer Conference Place 2.0 API dshellman
Presentation introducing the new Place 2.0 API developed and hosted by FamilySearch. Answers questions about why you'd use the API and what you can do with it.
Similar to Semantic Search Engine: Semantic Search and Query Parsing with Phrases and Entities (20)
AI-Powered Personalization: Principles, Use Cases, and Its Impact on CRO VWO
In today’s era of AI, personalization is more than just a trend—it’s a fundamental strategy that unlocks numerous opportunities.
When done effectively, personalization builds trust, loyalty, and satisfaction among your users—key factors for business success. However, relying solely on AI capabilities isn’t enough. You need to anchor your approach in solid principles, understand your users’ context, and master the art of persuasion.
Join us as Sarjak Patel and Naitry Saggu from 3rd Eye Consulting unveil a transformative framework. This approach seamlessly integrates your unique context, consumer insights, and conversion goals, paving the way for unparalleled success in personalization.
The Forgotten Secret Weapon of Digital Marketing: Email
Digital marketing is a rapidly changing, ever evolving industry--Influencers, Threads, X, AI, etc. But one of the most effective digital marketing tools is also one of the oldest: Email. Find out from two Houston-based digital experts how to maximize your results from email.
Key Takeaways:
Email has the best ROI of any digital tactic
It can be used at any stage of the customer journey
It is increasingly important as the cookie-less future gets closer and closer
SEO as the Backbone of Digital Marketing Felipe Bazon
In this talk, Felipe Bazon will share how he and his team at Hedgehog Digital set out to make C-levels, especially CMOs, realize that SEO is the backbone of digital marketing, by showing how SEO can contribute to brand awareness, reputation, and authority and, above all, how to use SEO to create more robust global marketing strategies.
Come learn how YOU can Animate and Illuminate the World with Generative AI's Explosive Power. Come sit in the driver's seat and learn to harness this great technology.
Short video marketing has swept the nation and is the fastest way to build an online brand on social media in 2024. In this session you will learn:
What short video marketing is
Which platforms work best for your business
Content strategies that are on brand for your business
How to sell organically without paying for ads
It's another new era of digital and marketers are faced with making big bets on their digital strategy. If you are looking at modernizing your tech stack to support your digital evolution, there are a few can't miss (often overlooked) areas that should be part of every conversation. We'll cover setting your vision, avoiding siloes, adding a democratized approach to data strategy, localization, creating critical governance requirements and more. Attendees will walk away with actions they can take into initiatives they are running today and consider for the future.
Monthly Social Media News Update May 2024Andy Lambert
TL;DR. These are the three themes that stood out to us over the course of last month.
1️⃣ Social media is becoming increasingly significant for brand discovery. Marketers are now understanding the impact of social and budgets are shifting accordingly.
2️⃣ Instagram’s new algorithm and latest guidance will help us maintain organic growth. Instagram continues to evolve, but Reels remains the most crucial tool for growth.
3️⃣ Collaboration will help us unlock growth. Who we work with will define how fast we grow. Meta continues to evolve their Creator Marketplace and now TikTok are beginning to push ‘collabs’ more too.
How to Run Landing Page Tests On and Off Paid Social PlatformsVWO
Join us for an exclusive webinar featuring Mariate, Alexandra and Nima where we will unveil a comprehensive blueprint for crafting a successful paid media strategy focused on landing page testing.With escalating costs in paid advertising, understanding how to maximize each visitor’s experience is crucial for retention and conversion.
This session will dive into the methodologies for executing and analyzing landing page tests within paid social channels, offering a blend of theoretical knowledge and practical insights.
The Pearmill team will guide you through the nuances of setting up and managing landing page experiments on paid social platforms. You will learn about the critical rules to follow, the structure of effective tests, optimal conversion duration and budget allocation.
The session will also cover data analysis techniques and criteria for graduating landing pages.
In the second part of the webinar, Pearmill will explore the use of A/B testing platforms. Discover common pitfalls to avoid in A/B testing and gain insights into analyzing A/B tests results effectively.
SMM Cheap - No. 1 SMM panel in the worldsmmpanel567
Boost your social media marketing with our SMM Panel services offering SMM Cheap services! Get cost-effective services for your business and increase followers, likes, and engagement across all social media platforms. Get affordable services perfect for businesses and influencers looking to increase their social proof. See how cheap SMM strategies can help improve your social media presence and be a pro at the social media game.
Videos are more engaging, more memorable, and more popular than any other type of content out there. That’s why it’s estimated that 82% of consumer traffic will come from videos by 2025.
And with videos evolving from landscape to portrait and experts promoting shorter clips, one thing remains constant – our brains LOVE videos.
So is there science behind what makes people absolutely irresistible on camera?
The answer: definitely yes.
In this jam-packed session with Stephanie Garcia, you’ll get your hands on a steal-worthy guide that uncovers the art and science to being irresistible on camera. From body language to words that convert, she’ll show you how to captivate on command so that viewers are excited and ready to take action.
The What, Why & How of 3D and AR in Digital CommercePushON Ltd
Vladimir Mulhem has over 20 years of experience in commercialising cutting edge creative technology across construction, marketing and retail.
Previously the founder and Tech and Innovation Director of Creative Content Works working with the likes of Next, John Lewis and JD Sport, he now helps retailers, brands and agencies solve challenges of applying the emerging technologies 3D, AR, VR and Gen AI to real-world problems.
In this webinar, Vladimir will be covering the following topics:
Applications of 3D and AR in Digital Commerce,
Benefits of 3D and AR,
Tools to create, manage and publish 3D and AR in Digital Commerce.
Mastering Local SEO for Service Businesses in the AI Era is tailored specifically for local service providers like plumbers, dentists, and others seeking to dominate their local search landscape. This session delves into leveraging AI advancements to enhance your online visibility and search rankings through the Content Factory model, designed for creating high-impact, SEO-driven content. Discover the Dollar-a-Day advertising strategy, a cost-effective approach to boost your local SEO efforts and attract more customers with minimal investment. Gain practical insights on optimizing your online presence to meet the specific needs of local service seekers, ensuring your business not only appears but stands out in local searches. This concise, action-oriented workshop is your roadmap to navigating the complexities of digital marketing in the AI age, driving more leads, conversions, and ultimately, success for your local service business.
Key Takeaways:
Embrace AI for Local SEO: Learn to harness the power of AI technologies to optimize your website and content for local search. Understand the pivotal role AI plays in analyzing search trends and consumer behavior, enabling you to tailor your SEO strategies to meet the specific demands of your target local audience. Leverage the Content Factory Model: Discover the step-by-step process of creating SEO-optimized content at scale. This approach ensures a steady stream of high-quality content that engages local customers and boosts your search rankings. Get an action guide on implementing this model, complete with templates and scheduling strategies to maintain a consistent online presence. Maximize ROI with Dollar-a-Day Advertising: Dive into the cost-effective Dollar-a-Day advertising strategy that amplifies your visibility in local searches without breaking the bank. Learn how to strategically allocate your budget across platforms to target potential local customers effectively. The session includes an action guide on setting up, monitoring, and optimizing your ad campaigns to ensure maximum impact with minimal investment.
Core Web Vitals SEO Workshop - improve your performance [pdf]Peter Mead
Core Web Vitals to improve your website performance for better SEO results with CWV.
CWV Topics include:
- Understanding the latest Core Web Vitals including the significance of LCP, INP and CLS + their impact on SEO
- Optimisation techniques from our experts on how to improve your CWV on platforms like WordPress and WP Engine
- The impact of user experience and SEO
10 Video Ideas Any Business Can Make RIGHT NOW!
You'll never draw a blank again on what kind of video to make for your business. Go beyond the basic categories and truly reimagine a brand new advanced way to brainstorm video content creation. During this masterclass you'll be challenged to think creatively and outside of the box and view your videos through lenses you may have never thought of previously. It's guaranteed that you'll leave with more than 10 video ideas, but I like to under-promise and over-deliver. Don't miss this session.
Key Takeaways:
How to use the Video Matrix
How to use additional "Lenses"
Where to source original video ideas
2. @KorayGubur
About Me
Koray Tuğberk GÜBÜR
Owner and Founder of Holistic SEO & Digital
• Educates his team
• Publishes SEO Case Studies, Research & Guides
• Twitter: @KorayGubur
• Email: ktgubur@holisticseo.digital
• Official Site: https://www.holisticseo.digital
6. @KorayGubur
What is Query Parsing?
• Query Parsing is the process of understanding the different sections of a query.
• Types: Entity-seeking Query, a Substitute Term, or Synonym Term.
• Canonical and Represented Versions: A Canonical Query can represent close variations.
• Query Character: Affects the SERP design and the Dominant and Minor Search Intent assignments.
• Query Processing: Another name for Query Parsing.
7. @KorayGubur
Multi-Stage Query Processing
• The first patent that talks about the «Context of Words».
• It tries to remove the stop words.
• It stems the concrete words.
• It expands words with synonyms and co-occurring terms.
• Some criteria: Absent Queries, Boolean Logic, Query Term Weights, Document Popularity, Word Proximity (Distance), Word Adjacency.
• It uses «VIPS» and Web Page Layout.
Inventors: Jeffrey Adgate Dean, Paul G.
Haahr, Olcan Sercinoglu, and Amitabh
K. Singhal
US Patent Application 20060036593
Filed: August 13, 2004
Published February 16, 2006
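The stages listed on this slide (stop word removal, stemming, expansion) can be sketched in a few lines. This is only a minimal illustration, not the method claimed in the patent: the stop word list, the synonym table, and the `naive_stem` helper are hypothetical placeholders, and a real system would use a proper stemmer and learned expansions.

```python
import re

# Hypothetical stop-word list and synonym table, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "for", "in", "to"}
SYNONYMS = {"car": ["auto", "automobile"], "repair": ["fix"]}


def naive_stem(word):
    """Very rough suffix stripping; a real system would use a proper stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def process_query(query):
    """Run the three stages: drop stop words, stem, expand with synonyms."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    content = [t for t in tokens if t not in STOP_WORDS]
    stems = [naive_stem(t) for t in content]
    expanded = {s: [s] + SYNONYMS.get(s, []) for s in stems}
    return {"tokens": tokens, "stems": stems, "expanded": expanded}


result = process_query("Repairing the cars")
print(result["stems"])      # stemmed content words
print(result["expanded"])   # each stem with its candidate expansions
```

Each stage is kept as a separate step so that later stages (term weights, proximity, adjacency) could be added without touching the earlier ones.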
8. @KorayGubur
Query Breadth
• This is for «adjacent words» and «unknown entities».
• It uses the related document count to measure the ‘query breadth’.
• Query Breadth can be decreased with a higher ‘adjacent word’ count.
• Query Breadth can be used for ‘Named Entity Recognition’ or Triple Creation (a subject, a predicate, and an object).
Invented by Karl Pfleger and Brian Larson
Assigned to Google
US Patent 7,925,657
Granted April 12, 2011
Filed: March 17, 2004
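As a rough illustration of the idea that breadth follows the related document count and shrinks as adjacent words are added, the sketch below counts documents that contain the query as a contiguous phrase. The tiny `DOCS` corpus and the `query_breadth` function are assumptions made for this example, not the scoring measure from the patent.

```python
# Hypothetical mini-corpus, for illustration only.
DOCS = [
    "jaguar car dealer in london",
    "jaguar animal habitat and diet",
    "jaguar car repair manual",
    "used car dealer prices",
]


def query_breadth(phrase, docs=DOCS):
    """Count documents that contain the query words adjacent to each other."""
    padded = f" {phrase.lower().strip()} "
    return sum(padded in f" {doc.lower()} " for doc in docs)


for q in ("jaguar", "jaguar car", "jaguar car repair"):
    print(q, "->", query_breadth(q))
```

Every extra adjacent word narrows the set of matching documents, so the broad query «jaguar» ends up with a larger count than «jaguar car repair».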
9. @KorayGubur
Query Analysis
• Selection Over Time: A document can be chosen more frequently during certain timespans.
• Documents with Hot Topics: Rising queries can boost documents that include these queries.
• Documents with Related Hot Topics: Queries related to rising queries can boost the documents that include them.
• Constant Queries with Consistently Changing Results: A Constant Query is an always-popular query whose information keeps changing for a topic.
• Freshness of Documents: The date of the information on the web page, not the date of the document’s last version.
Invented by Karl Pfleger and Brian Larson
Assigned to Google
US Patent 7,925,657
Granted April 12, 2011
Filed: March 17, 2004
10. @KorayGubur
Query Analysis
• Staleness of Documents: The amount of historical data can be a positive ranking signal for a page for a query.
• Overly Broad Pages: Pages that include discordant queries, which is a signal for spam.
• A continuation patent was filed in 2011 for «document locator», and some terms changed.
Inventors: DEAN; Jeffrey; (Palo Alto,
CA) ; Haahr; Paul; (San Francisco, CA) ;
Henzinger; Monika; (Corseaux, CH) ;
Lawrence; Steve; (Mountain View, CA) ;
Pfleger; Karl; (Mountain View, CA) ;
Sercinoglu; Olcan; (Mountain View, CA) ;
Tong; Simon; (Mountain View, CA)
Assignee: GOOGLE INC.
Mountain View
CA
Family ID: 34381362
Appl. No.: 13/244853
Filed: September 26, 2011
11. @KorayGubur
Query Analysis
• Trends Related to Topics and Search Terms: Grouping the topics and subtopics that appear for trending queries.
• Access Times to Determine Freshness and Staleness: Compares the first access and last access times for certain documents.
• Frequency of Selection: Compares the selection counts for an earlier and a later period.
• When Staleness Might Be Preferred: Even if there are fresh news or documents, the user can choose the stale document. These documents are not demoted for being stale.
• Spam Determination Based Upon Breadth of Rankings and Authority: If the document is popular or authoritative (link-based), or the source is relevant enough, it will be treated as an exception.
Inventors: DEAN; Jeffrey; (Palo Alto,
CA) ; Haahr; Paul; (San Francisco, CA) ;
Henzinger; Monika; (Corseaux, CH) ;
Lawrence; Steve; (Mountain View, CA) ;
Pfleger; Karl; (Mountain View, CA) ;
Sercinoglu; Olcan; (Mountain View, CA) ;
Tong; Simon; (Mountain View, CA)
Assignee: GOOGLE INC.
Mountain View
CA
Family ID: 34381362
Appl. No.: 13/244853
Filed: September 26, 2011
12. @KorayGubur
Query Analysis
• Continuation of the Historical Data Patent.
• Speaks about Topics, and Query Categorization based on Topics.
• It is important because, the following year (2012), Google launched its Knowledge Graph with more than 500 million entities and 3.5 billion facts.
Inventors: DEAN; Jeffrey; (Palo Alto,
CA) ; Haahr; Paul; (San Francisco, CA) ;
Henzinger; Monika; (Corseaux, CH) ;
Lawrence; Steve; (Mountain View, CA) ;
Pfleger; Karl; (Mountain View, CA) ;
Sercinoglu; Olcan; (Mountain View, CA) ;
Tong; Simon; (Mountain View, CA)
Assignee: GOOGLE INC.
Mountain View
CA
Family ID: 34381362
Appl. No.: 13/244853
Filed: September 26, 2011
13. @KorayGubur
Midpage Query Refinements
• In 2006, Google published «Midpage Query Refinements», known today as Search Suggestions.
• The GUI test ran between 2004 and 2006.
• The patent was filed in 2003.
• Includes Semantic Query Clusters for different contexts.
• Components: a Matcher, a Clusterer, a Scorer, and a Presenter.
Inventors: Haahr, Paul; (San Francisco, CA) ; Baker, Steven;
(Mountain View, CA)
Correspondence Address:
PATRICK J S INOUYE P S
810 3RD AVENUE
SUITE 258
SEATTLE
WA
98104
US
Family ID: 34228721
Appl. No.: 10/668721
Filed: September 22, 2003
14. @KorayGubur
Midpage Query Refinements
• Precomputation Engine has four parts.
• Associator: Query and Document
Association.
• Selector: Document and Query Section
Selector.
• Regenerator: Checks the query logs to
refresh the selections.
• Inverter: Checks the Cached Data for
presenting.
Inventors: Haahr, Paul; (San Francisco, CA) ; Baker, Steven;
(Mountain View, CA)
Correspondence Address:
PATRICK J S INOUYE P S
810 3RD AVENUE
SUITE 258
SEATTLE
WA
98104
US
Family ID: 34228721
Appl. No.: 10/668721
Filed: September 22, 2003
15. @KorayGubur
Midpage Query Refinements
• Query Ambiguity: If the query is ambiguous, the search engine can use the query clusters.
• Homonyms, General Terms, Improper Context, and Narrow Terms can create a stateless SERP instance.
• To prevent this, Semantic Grouping, Centroids, and Centroid distance are used.
• A Query Cluster and a Document Cluster can be paired. If the Document Cluster is larger or more relevant, the Query Cluster will be used as a query suggestion.
Inventors: Haahr, Paul; (San Francisco, CA) ; Baker, Steven;
(Mountain View, CA)
Correspondence Address:
PATRICK J S INOUYE P S
810 3RD AVENUE
SUITE 258
SEATTLE
WA
98104
US
Family ID: 34228721
Appl. No.: 10/668721
Filed: September 22, 2003
16. @KorayGubur
Midpage Query Refinements
• Matcher: Stored query variations are put into a cluster, and document phrase variations are matched.
• Clusterer: The matched query variations and documents are clustered together. These clusters are different from query clusters.
• Scorer: Determines the centroid of each cluster. If the term vectors are distant from the centroid, the Clusterer will choose another cluster for the Scorer.
• Presenter: The created clusters and centroids are presented to the user. According to the preferred choices, the Presenter will use sub-centroids.
Inventors: Haahr, Paul; (San Francisco, CA) ; Baker, Steven;
(Mountain View, CA)
Correspondence Address:
PATRICK J S INOUYE P S
810 3RD AVENUE
SUITE 258
SEATTLE
WA
98104
US
Family ID: 34228721
Appl. No.: 10/668721
Filed: September 22, 2003
17. @KorayGubur
Midpage Query Refinements
• In 2017, the patent was refreshed.
• The Scorer method was changed.
• Representative queries are chosen based on centroids.
• For every cluster, a representative query is chosen.
• The clusters are ordered according to cluster size and relevance scores.
• Sub-queries are used as the refinement queries.
Inventors: Paul Haahr and Steven D. Baker
Assignee: Google Inc.
The United States Patent 9,552,388
Granted: January 24, 2017
Filed: January 31, 2014
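Choosing a representative query from a centroid can be pictured as follows. This is a sketch under stated assumptions: the two-dimensional term vectors are made up, Euclidean distance is one possible choice of distance, and ordering clusters by size alone stands in for the combined size and relevance ordering that the slide mentions.

```python
import math


def centroid(vectors):
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def representative_query(cluster):
    """cluster maps query -> term vector; return the query nearest the centroid."""
    center = centroid(list(cluster.values()))
    return min(cluster, key=lambda q: distance(cluster[q], center))


def order_clusters(clusters):
    """Larger clusters first (relevance scores are left out of this sketch)."""
    return sorted(clusters, key=len, reverse=True)


# Hypothetical term vectors for one query cluster.
cars = {
    "jaguar car": [1.0, 0.0],
    "jaguar price": [0.8, 0.2],
    "jaguar dealer": [0.6, 0.4],
}
print(representative_query(cars))
```

The query whose vector sits closest to the cluster centroid is the one that would be shown to the user as the refinement for that cluster.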
18. @KorayGubur
Midpage Query Refinements
• The inventors of the Midpage Query Refinement methodology are Paul Haahr and Steven D. Baker.
• Steven Baker wrote the Google Synonyms blog post for Google’s Synonym Update, before the RankBrain announcement.
• Helping Computers Understand Language: https://googleblog.blogspot.com/2010/01/helping-computers-understand-language.html
• Paul Haahr gave the How Google Works presentation at SMX West, which includes lots of useful insights.
Inventors: Paul Haahr and Steven D. Baker
Assignee: Google Inc.
The United States Patent 9,552,388
Granted: January 24, 2017
Filed: January 31, 2014
19. @KorayGubur
Context-Vectors
• Midpage Query Refinements and Query-Document Logical Pairs with Centroids and Clusters are the beginning of RankBrain.
• Context-Vectors were the second step in completing the journey.
• Word Vectors and Context Vectors are different from each other.
• Word Vectors are combinations of words.
• Context Vectors are lists of word combinations for a Contextual Domain.
• A Term Vector is a word combination from a Contextual Domain.
Inventors: David C. Taylor
Application Date: 09/04/2012
Grant Number: 09449105
Grant Date: 09/20/2016
21. @KorayGubur
Context-Vectors
• Context-Vectors are close to the ‘Lexicon’ of Google’s first research paper, The Anatomy of a Large-Scale Hypertextual Web Search Engine.
• Context-Vectors are a version of the Lexicon with different Contextual Domains.
• Context-Vectors are located in Domain List Terms.
• Domain List Terms can include 800,000 words and word combinations.
• Domain List Terms can include a macro-context and a sub-context with sub-portions.
Inventors: David C. Taylor
Application Date: 09/04/2012
Grant Number: 09449105
Grant Date: 09/20/2016
22. @KorayGubur
Context-Vectors
• Context-Vectors use ‘Topical Entries’.
• A Topical Entry can be used for a macro-context.
• These topical entries can be used for question generation.
• Generated questions can be used to differentiate the sub-contexts from each other.
• A macro-context can have a Dominant Knowledge Domain. A Context-Vector can be used for intersectional areas.
Inventors: David C. Taylor
Application Date: 09/04/2012
Grant Number: 09449105
Grant Date: 09/20/2016
23. @KorayGubur
Categorical Quality
• This is a ‘Re-ranking’ algorithm patent.
• There is a strong difference between Re-ranking and Initial Ranking.
• Re-ranking algorithms modify the query results after the initial ranking.
• The inventor is Trystan Upstill, author of the Evidence-based Ranking research.
• Categorical Quality doesn’t focus on relevance or authoritativeness; it focuses on understanding the category of the query.
Inventors: Trystan G. Upstill, Abhishek Das, Jeongwoo
Ko, Neesha Subramaniam, and Vishnu P. Natchu
US Patent Application: 20190155948
Published on: May 23, 2019
Filed: March 31, 2015
24. @KorayGubur
Categorical Quality
• This patent mentions ‘social media shares’ and community size.
• If the query satisfies the ‘categorical query’ conditions, the search results will be evaluated for related and close queries too.
• If a resource also satisfies the related categorical queries, a categorical quality score will be assigned to the source.
• The Categorical Quality methodology collects Navigational Queries for different sources.
• If a source has more navigational queries, it is popular for the category.
• Categorical Quality mentions a «Topicality Score».
Inventors: Trystan G. Upstill, Abhishek Das, Jeongwoo
Ko, Neesha Subramaniam, and Vishnu P. Natchu
US Patent Application: 20190155948
Published on: May 23, 2019
Filed: March 31, 2015
25. @KorayGubur
Categorical Quality
• If a source includes all query terms for a topic, it will have a higher Categorical Quality and Topicality Score.
• This method also mentions ‘Click Selection’.
• To measure the model’s success, they do not take every click or CTR into account.
• They take CTR and clicks into account only if they meet certain criteria such as time, frequency, or personal interest.
Inventors: Trystan G. Upstill, Abhishek Das, Jeongwoo
Ko, Neesha Subramaniam, and Vishnu P. Natchu
US Patent Application: 20190155948
Published on: May 23, 2019
Filed: March 31, 2015
26. @KorayGubur
Substitute Query
• A Substitute Query is a query that can replace another query. These queries are used for bolding some sections of the content.
• Substitute Queries make ‘context’ more important, because synonyms may change the context. For example, car and auto can be the same thing for ‘repair’, but they are not the same for ‘railroad’.
• There is a railroad car, but not a railroad auto.
• Thus, Substitute Queries are not synonyms. They are words that can be replaced without changing the context.
Invented by Daisuke Ikeda and Ke Yang
Assigned to Google
US Patent 8,504,562
Granted August 6, 2013
Filed: April 3, 2012
27. @KorayGubur
Substitute Query
• A Co-occurrence Matrix and Phrase-based Indexing are used to support Substitute Queries.
• The method uses Space Vectors to compare the word vectors to each other.
• If the queries are similar to each other, with enough co-occurring words, they can substitute for each other.
Invented by Daisuke Ikeda and Ke Yang
Assigned to Google
US Patent 8,504,562
Granted August 6, 2013
Filed: April 3, 2012
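The comparison of co-occurrence vectors can be sketched with plain counts and cosine similarity. The contexts below are invented, and both `cooccurrence_vector` and `cosine` are illustrative helpers rather than the patent's own formulas; the point is only that car and auto share most of their neighbouring words, while railroad shares none with auto.

```python
import math
from collections import Counter

# Hypothetical query/phrase contexts, for illustration only.
CONTEXTS = [
    "car repair shop",
    "auto repair shop",
    "railroad car yard",
    "car insurance quote",
    "auto insurance quote",
]


def cooccurrence_vector(term, contexts=CONTEXTS):
    """Count the words that appear in the same context as `term`."""
    counts = Counter()
    for context in contexts:
        words = context.lower().split()
        if term in words:
            counts.update(w for w in words if w != term)
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


car = cooccurrence_vector("car")
auto = cooccurrence_vector("auto")
railroad = cooccurrence_vector("railroad")
print(round(cosine(car, auto), 3))
print(cosine(auto, railroad))
```

A threshold on this similarity (an assumption, not a value from the slides) would then decide whether two terms may substitute for each other in a given context.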
28. @KorayGubur
Synthetic Query
• A Synthetic Query is the version of the user’s query rewritten by the search engine.
• A search engine can rewrite a query by augmenting it to diversify the SERP features and raise the chance of a satisfying search.
• Some score types that Synthetic Queries include are ‘Edit Distance Score’, ‘Similarity Score’, and ‘Transformation Cost Score’.
• Synthetic Queries can be collected from web documents, Structured Data, and Similarity Between Documents.
Inventors: Anand Shukla, Mark Pearson, Krishna
Bharat and Stefan Buettcher
Assignee: Google LLC
US Patent: 9,916,366
Granted: March 13, 2018
Filed: July 28, 2015
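The slides name an ‘Edit Distance Score’ without describing how it is computed. As one plausible reading, the sketch below uses the standard Levenshtein distance and normalizes it to a 0-1 score; the normalization in `edit_distance_score` is an assumption made for this example, not something stated in the patent.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def edit_distance_score(query, candidate):
    """Hypothetical normalization: 1.0 for identical strings, lower when further apart."""
    longest = max(len(query), len(candidate))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(query, candidate) / longest


print(edit_distance("kitten", "sitting"))
print(edit_distance_score("sylvia plath biography", "sylvia plath bio"))
```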
29. @KorayGubur
Synthetic Query and
Query Templates
• Query Templates are intermediary forms between Seed Queries and Synthetic Queries.
• Synthetic Queries help a search engine create pre-defined and pre-ordered SERP instances.
• Synthetic Queries can be generated from HTML Tags, IDF Scores, and Close Phrases.
• If a document has «Dorothy Parker Biography» as its H1 and «Sylvia Plath» as an H2, the search engine can use «Sylvia Plath Biography» as a synthetic query.
• If the results are good enough in relevance and quality, the Synthetic Query will become a Seed Query.
Invented by Steven D. Baker, Michael Flaster,
Nitin Gupta, Paul Haahr, Srinivasan Venkatachary,
and Yonghui Wu
Assigned to Google
US Patent 8,346,792
Granted January 1, 2013
Filed: November 9, 2010
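The Dorothy Parker example can be written as a tiny template substitution. This is a sketch only: `extract_template` and `synthetic_queries` are hypothetical helper names, and the entity inside the heading is passed in by hand, whereas a real system would have to detect it.

```python
def extract_template(seed_query, entity):
    """Replace the known entity in a seed query with a placeholder slot."""
    if entity not in seed_query:
        raise ValueError("entity does not occur in the seed query")
    return seed_query.replace(entity, "{entity}")


def synthetic_queries(h1, h1_entity, sibling_entities):
    """Fill the template derived from the H1 with entities found in other headings."""
    template = extract_template(h1, h1_entity)
    return [template.format(entity=e) for e in sibling_entities]


print(extract_template("Dorothy Parker Biography", "Dorothy Parker"))
print(synthetic_queries("Dorothy Parker Biography", "Dorothy Parker", ["Sylvia Plath"]))
```

The template is the intermediary form: the seed query goes in, a slot replaces its entity, and each sibling entity produces one candidate synthetic query.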
30. @KorayGubur
Synthetic Query and
Query Templates
• Synthetic Queries can be generated from the same author, journal, source, or time period.
• Synthetic Queries and Open Information Extraction are closely related to each other.
• Before entering the world of entities, understanding the world of phrases is important.
• Open Information Extraction, Unknown Phrases, and Entities are connected to each other.
Invented by Steven D. Baker, Michael Flaster,
Nitin Gupta, Paul Haahr, Srinivasan Venkatachary,
and Yonghui Wu
Assigned to Google
US Patent 8,346,792
Granted January 1, 2013
Filed: November 9, 2010
31. @KorayGubur
Open Information Extraction
• Google bought Wavii for $30,000,000 in 2013.
• Open Information Extraction is about ‘fact extraction’ around nouns.
• It connects different nouns to each other based on relations.
• A classifier assigns a confidence score to a relation between two nouns.
• This is a text-to-data example.
• Wavii was originally a news aggregator based on topics, not phrases.
Invented by Michael J. Cafarella, Michele Banko,
and Oren Etzioni
Assigned to: University of Washington through its
Center for Commercialization
United States Patent 7,877,343
Granted January 25, 2011
32. @KorayGubur
Open Information Extraction
• The relational tuples include at least two nouns connected to each other by at least one verb and adverb, such as ‘created by’, ‘author of’, ‘is from’, ‘located there’.
• ‘... Moreover, the number and complexity of entity types on the Web means that existing NER systems are inapplicable...’
• Open IE is for Unknown Entities and for recognizing Minor Entities that are not registered in the Knowledge Base.
Invented by Michael J. Cafarella, Michele Banko,
and Oren Etzioni
Assigned to: University of Washington through its
Center for Commercialization
United States Patent 7,877,343
Granted January 25, 2011
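A relational tuple with a confidence score can be represented as a small data structure. The sketch below is a toy: it matches a fixed, hypothetical list of relation phrases, whereas real Open Information Extraction does not rely on a predefined relation list, and the confidence here is a placeholder argument instead of a classifier output.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical relation phrases; real Open IE discovers these from text.
RELATION_PHRASES = ("was created by", "is the author of", "is from")


@dataclass(frozen=True)
class RelationTuple:
    arg1: str
    relation: str
    arg2: str
    confidence: float


def extract_tuple(sentence: str, confidence: float = 0.5) -> Optional[RelationTuple]:
    """Split a sentence around the first known relation phrase, if any."""
    for phrase in RELATION_PHRASES:
        left, sep, right = sentence.partition(f" {phrase} ")
        if sep and left.strip() and right.strip(" ."):
            return RelationTuple(left.strip(), phrase, right.strip(" ."), confidence)
    return None


print(extract_tuple("Sylvia Plath is the author of The Bell Jar."))
```

Tuples below a chosen confidence threshold would simply be discarded before they are connected to other nouns.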
33. @KorayGubur
Answer-seeking Query
• Answer-seeking Queries have specific elements within the questions and answers.
• Google’s purpose is to extract question and answer formats for answer-seeking queries.
• Answer-seeking queries require concise answers that leave no doubt.
• The Answer-seeking Query is an important bridge to Natural Language Queries with an intent.
Inventors: Yi Liu, Preyas Popat, Nitin Gupta, and Afroz
Mohiuddin
Assignee: Google LLC
US Patent: 10,592,540
Granted: March 17, 2020
Filed: June 28, 2016
34. @KorayGubur
Answer-seeking Query
• Question Elements are Entity Instance, Entity Class, Part of Speech Class, Root Word, N-Gram, and Question Triggering Words.
• Answer Elements are Measurement, N-Gram, Verb, Preposition, Entity Instance, N-Gram near Entity, Verb near Entity, Preposition near Entity, Verb Class, and Skip Grams.
• Answer-seeking Queries trigger the Answer Scoring Engine.
Inventors: Yi Liu, Preyas Popat, Nitin Gupta, and Afroz
Mohiuddin
Assignee: Google LLC
US Patent: 10,592,540
Granted: March 17, 2020
Filed: June 28, 2016
35. @KorayGubur
Natural Language Queries
• Natural Language Queries are queries written in everyday language.
• They do not follow proper grammar rules or form complete sentences.
• They do not explicitly state their intent.
• That’s why these queries are also called Intent Queries, or queries with a specific minor intent.
• For such a query, a search engine should return an answer without lots of details or structure.
International Application No WO/2014/197227
Published:11.12.2014
International Filing Date: 23.05.2014
Applicant: Google
Inventors: Tomer Shmiel, Dvir Keysar, and Yonatan Erez
36. @KorayGubur
Natural Language Queries
• Natural Language Queries are not factual queries; this is the main difference from Answer-seeking Queries.
• Natural Language Queries are related to Intent Template Generation.
• A Natural Language Query can have multiple intents with non-factual information, such as ‘How do I make hummus?’.
• There are different methods to make hummus and different types of hummus. The query also includes ‘I’, so no one can know how you make hummus.
• The answer-seeking version of this query is ‘How to make hummus’.
• One important methodology point here is that Google creates ‘heading-text’ pairs to understand the topics of the sub-sections of an article.
International Application No WO/2014/197227
Published:11.12.2014
International Filing Date: 23.05.2014
Applicant: Google
Inventors: Tomer Shmiel, Dvir Keysar, and Yonatan Erez
37. @KorayGubur
Natural Language Queries
• Variable and Non-Variable Portions are important concepts for intent templates.
• The non-variable portion of the intent for the previous query is ‘hummus’.
• The variable portion can be a ‘place, method, tool, or style’. And ‘I’ can vary: a child, a woman, a man, an adult, or a blind person.
• For Natural Language Queries, Intent Templates can be applied to different query patterns such as ‘X Causes’ or ‘X Reasons’.
• If someone searches for only X, the intent templates will be used to assign the natural language results to the query.
International Application No WO/2014/197227
Published:11.12.2014
International Filing Date: 23.05.2014
Applicant: Google
Inventors: Tomer Shmiel, Dvir Keysar, and Yonatan Erez
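Matching a query against patterns such as ‘X Causes’ and pulling out the X can be sketched with regular expressions. The pattern set and intent names below are hypothetical and hand-written for this example; the patent describes generating intent templates, not this particular table.

```python
import re

# Hypothetical intent patterns; X is captured in the named group "x".
INTENT_PATTERNS = {
    "causes": re.compile(r"^(?P<x>.+) causes$"),
    "reasons": re.compile(r"^(?P<x>.+) reasons$"),
    "how_to_make": re.compile(r"^how (?:do i|to) make (?P<x>.+?)\??$"),
}


def match_intent(query):
    """Return (intent name, X) for the first matching pattern, or None."""
    q = query.lower().strip()
    for intent, pattern in INTENT_PATTERNS.items():
        m = pattern.match(q)
        if m:
            return intent, m.group("x")
    return None


def expand_bare_query(x):
    """For a query that is only X, list the pattern-based variants to consider."""
    return [f"{x} causes", f"{x} reasons", f"how to make {x}"]


print(match_intent("How do I make hummus?"))
print(match_intent("headache causes"))
print(expand_bare_query("hummus"))
```

When only X is searched, `expand_bare_query` shows how the same templates could be run in reverse to find natural language results worth attaching to the bare query.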
38. @KorayGubur
Query Rewriting for Same Intent Across Languages
• Google previously tried to unite different search intents, the data for these intents, and the phrases that represent them in order to improve the search results.
• This is called Query Expansion. Query Expansion can compare the results for a query in one language to the results for the same query in a different language.
• If the click satisfaction possibility is higher in another language for the same intent, the search engine can re-rank the results for the first language.
Invented by Stefan Riezler, Alexander L. Vasserman
Assigned to Google
US Patent Application 20080319962
Published December 25, 2008
Filed: March 17, 2008
39. @KorayGubur
Seed-Queries
• Seed Queries can be synthetic queries or user-generated queries. The main requirement for a seed query is that a set of documents satisfies it.
• If a query is logical, popular, and satisfying for the user, it will be marked as a seed query whether it is synthetic or searcher-generated.
• Seed Queries are used to determine the representative queries for query variations and for query and intent templates.
Inventors: Manaal Faruqui and Dipanjan Das
Applicant: Google LLC
Publication Number: 20200167379
Filed: January 18, 2019
Publication Date: May 28, 2020
40. @KorayGubur
End of Phrase-based Indexing and Query
Processing Chaos
• Query Parsing
• Seed Query
• Substitute Query
• Natural Language Query
• Answer-seeking Query
• Factual Query
• Non-factual Query
• Non-variable Portion in Query
• Variable Portion in Query
• Discordant Query
• Query Re-writing
• Open Information Extraction
• Synthetic Query
• Categorical Query
• Contextual Vectors
• Term Vectors
• Intent Templates
• Question and Answer Elements
• Co-occurrence Matrix
• Query Expansion
• Query Term Weight
• Multi-stage Query Processing
• Query Breadth
• Query Template
• Relation Types and Noun Tuples
• Macro-context
• Topical Entry
• Mid-page Query Refinement
• Query Ambiguity
• Query Cluster – Document Cluster for Logical Pair
• Associator, Matcher, Scorer for Query, Document
Association
• ‘Edit Distance Score’, ‘Similarity Score’, ‘Transformation Cost Score’
• Phrase-based Indexing
• Contextual Domains
• Contextual Domain Word List
• Query Analysis
• Representative Query
• Canonical Query
• Minor Intent
• Space Vectors
• Navigational Query as a
Popularity Signal
• Evidence Based Ranking
• Word Proximity
• Word Adjacency
41. @KorayGubur
First Semantic Web Announcement
• The Semantic Web Roadmap was published in September 1998 by Tim Berners-Lee.
• Semantic HTML, the Semantic Web, and Semantic User Patterns were the principles of Semantic Search.
• The main purpose of the Semantic Web is to make the web understandable to machines so that machines can help human beings surf the web better.
• Tim Berners-Lee talked about Agents, Ontology, Structured Data, RDFa, Semantic HTML Tags, and Digital Signatures.
• ‘Such an agent coming to the clinic's Web
page will know not just that the page has
keywords such as "treatment, medicine,
physical, therapy" (as might be encoded
today) but also that Dr. Hartman works at
this clinic on Mondays, Wednesdays and
Fridays and that the script takes a date
range in yyyy-mm-dd format and returns
appointment times. And it will "know" all
this without needing artificial intelligence’
‘The Semantic Web is an extension of the current web in
which information is given well-defined meaning, better
enabling computers and people to work in cooperation.’
-Tim Berners-Lee
42. @KorayGubur
First Semantic Search Patent
• Google’s first Semantic Search Engine patent is from 1999, one year after Tim Berners-Lee’s announcement.
• The inventor is Sergey Brin himself.
• The document doesn’t use legal language, like other early Google patents.
• The document says that things of a similar type share the same features.
• Things on the web can be collected for a certain type of information and stored with this information.
Invented by Sergey Brin
Assigned to Google
US Patent 6,678,681
Granted January 13, 2004
Filed: March 9, 2000
43. @KorayGubur
First Semantic Search Patent
• Sergey Brin encountered problems such as Named Entity Recognition, Main Entity detection, and Entity Relation Detection.
• These problems were not described in terms of entities at the time, but the books were entities with string representations.
• Even a single-letter difference caused big problems for Sergey Brin.
• Some books didn’t have a price or a proper title, and some of them were not even real books.
• In the first attempt, the cost was high, the process was slow, and the results were incomplete, but Google kept going.
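One way to picture the approach on this slide is pattern bootstrapping: start from a known (author, title) pair, learn the text that surrounds it, and reuse that context to collect new pairs. The sketch below is a simplified illustration under that assumption, not the method claimed in the patent; the sample sentences, the seed pair, and the function names are invented for the example.

```python
import re

# Hypothetical page text; real web pages are far noisier.
pages = [
    "Isaac Asimov wrote the book Foundation.",
    "Frank Herbert wrote the book Dune.",
    "Dune was reviewed by many readers.",
]
seed = ("Isaac Asimov", "Foundation")  # one known (author, title) pair

def learn_patterns(pages, author, title):
    """Collect the text found between a known author and a known title."""
    patterns = set()
    for text in pages:
        a, t = text.find(author), text.find(title)
        if a != -1 and t > a:
            patterns.add(text[a + len(author):t])
    return patterns

def apply_patterns(pages, patterns):
    """Reuse each learned middle context to extract (author, title) pairs."""
    name = r"([A-Z][\w.]*(?: [A-Z][\w.]*)*)"  # run of capitalized words
    found = set()
    for middle in patterns:
        regex = re.compile(name + re.escape(middle) + r"([A-Z][\w ]*)")
        for text in pages:
            for author, title in regex.findall(text):
                found.add((author, title.strip()))
    return found

patterns = learn_patterns(pages, *seed)
books = apply_patterns(pages, patterns)
print(sorted(books))
```

Because the match is on exact strings, a one-letter variation in a name or in the surrounding context breaks the pattern, which is the fragility described above.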
44. @KorayGubur
Knowledge Graph Launch
• ‘Things, not strings’ is the motto of the Knowledge Graph. Everything on the web is divided into entities, entity types, and entity connections.
• Named Entity Recognition and Natural Language Processing gained value and prominence within Google’s algorithmic hierarchy.
• The Knowledge Graph supported the Knowledge Panels.
• Fact Extraction, Question Answering, Accuracy Audit, and Entity Relations are the pillars of an Entity-oriented Search Engine.
• ‘Wouldn’t it be great to understand every word of the user, instead of matching words?’, by Jack Menzel.
Inventors: John R. Provine
Assignee: Google LLC
US Patent: 10,922,326
Granted: February 16, 2021
Filed: March 14, 2013
45. @KorayGubur
Browsable Fact Repository
• The Browsable Fact Repository is the early, primitive version of the Google Knowledge Graph.
• There are three important problems for the Browsable Fact Repository.
1. Updating the Knowledge Graph.
2. Extracting the New Entities.
3. Auditing the Fact Accuracy.
Invented by Andrew W.
Hogue and Jonathan T.
Betz
Assigned to Google Inc.
US Patent 7,774,328
Granted August 10, 2010
Filed: February 17, 2006
46. @KorayGubur
Entity-seeking Query
• Today’s last Query type.
• Entity-seeking Queries are one of the basic pillars of Entity-oriented search.
• The search engine identifies whether the query seeks a single entity or multiple things of the same type.
• If it is singular, the entity-seeking query will match the term and the entity based on an attribute.
• Entity-seeking Queries include a Semantic Dependency Tree and a Relevance Threshold.
Inventors: Mugurel Ionut Andreica, Tatsiana Sakhar,
Behshad Behzadi, Marcin M. Nowak-Przygodzki, and
Adrian-Marius Dumitran
US Patent Application: 20190370326
Published: December 5, 2019
Filed: May 29, 2018
48. @KorayGubur
Structured Search Engine
• Sergey Brin said ‘Structured Form’ in 1999.
• In 2011, Andrew Hogue said ‘Structured Search Engine’.
• Andrew Hogue introduced Open-Domain Fact Extraction methodologies for extracting and clustering entities from the web.
• Andrew Hogue showed some concrete examples to future Google engineers for the direction that they wanted to head in.
Cartoon is created by Gary Larson.
49. @KorayGubur
Semantic Search Engine
• Google can extract all attributes of an entity to understand its general features.
• Depending on the Source Attribute, these features can be changed, detected, or altered.
• Based on the entity types and candidate entities, Google can generate more entity types and connections between them.
• Another name for the Structured Search Engine is the Semantic Search Engine.
• Semi-structured Text Understanding, Question Generation from Keywords, and Question-Answer Pairing are the main objectives of the Semantic Search Engine.
51. @KorayGubur
Semantic Search Engine
This is a Query Parsing Example from a Google Engineer for
Entity-oriented Search.
Source: The Structured Search Engine by Andrew Hogue
The first step is the Named Entity Recognition process for the query.
• Entity-seeking Queries are the backbone of entity-oriented search.
• Recognizing an entity from a Query is neither easy nor cheap.
• Neural Matching, RankBrain, the Sub-topic Update, BERT, MUM, LaMDA... All of them are used for recognizing the entity and its related attributes.
52. @KorayGubur
Semantic Search Engine
The second step is Entity Resolution.
• Entity Resolution and Attribute Extraction are for understanding the related attribute of the entity.
• Entity-seeking Queries usually try to find an Entity’s Attribute such as look, height, taste, inception, or history.
• After the entity and its attribute are taken from the query, the Question Format is determined at the next step.
53. @KorayGubur
Semantic Search Engine
The third step is Synonym Extraction.
• Synonym Extraction strengthens the confidence score.
• Another function of Synonym Extraction is that it allows alternate documents to be used for the same question.
• Depending on the Synonyms, the question format can change.
54. @KorayGubur
Semantic Search Engine
The question format is necessary to understand the query by increasing the confidence score and matching similar successful documents.
• The question format is important for determining the answer format.
• Question term order and answer term order can increase the success rate.
• The last important thing here is the ‘answer data type’, which is a date.
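The steps described so far (entity recognition, attribute resolution, synonym matching, question format, and answer data type) can be sketched as a toy pipeline. This is a minimal sketch, assuming small hand-made lookup tables in place of a real knowledge base; the example query, the tables, and the function name `parse_query` are hypothetical and are not taken from Andrew Hogue's presentation.

```python
# Hypothetical lookup tables standing in for a knowledge base.
ENTITIES = {"eiffel tower": "Eiffel Tower"}
ATTRIBUTE_SYNONYMS = {
    "inception": ["built", "constructed", "opened"],
    "height": ["tall", "high", "height"],
}
ANSWER_TYPES = {"inception": "date", "height": "length"}
QUESTION_FORMATS = {
    "inception": "When was the {entity} built?",
    "height": "How tall is the {entity}?",
}

def parse_query(query):
    """Turn a keyword query into an entity, attribute, question, and answer type."""
    q = query.lower()
    # Entity recognition by plain dictionary lookup.
    entity = next((name for key, name in ENTITIES.items() if key in q), None)
    # Attribute resolution: any synonym in the query selects the attribute.
    words = q.split()
    attribute = next((attr for attr, syns in ATTRIBUTE_SYNONYMS.items()
                      if any(s in words for s in syns)), None)
    if entity is None or attribute is None:
        return None
    # Question format and expected answer data type for that attribute.
    return {
        "entity": entity,
        "attribute": attribute,
        "question": QUESTION_FORMATS[attribute].format(entity=entity),
        "answer_type": ANSWER_TYPES[attribute],
    }

parsed = parse_query("eiffel tower constructed")
print(parsed)
```

The synonym list is what lets ‘constructed’ and ‘built’ land on the same question format, so documents that answer either wording can be matched to the same question.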
55. @KorayGubur
Semantic Search Engine
The fourth step is Entity Reconciliation and a data accuracy audit. At the next step, Google can check the related search activity and possible search activity, and choose the best answer.
• The answer formats and answer phrases are used for entity reconciliation.
• Entity reconciliation includes the standardization of the entity with the correct information.
• Five Rand Fishkin entity records exist in the Knowledge Graph for the same Rand Fishkin.
56. @KorayGubur
Semantic Search Engine
Entity Reconciliation
Inventors: Oksana Yakhnenko and Norases
Vesdapunt
Assignee: GOOGLE LLC
US Patent: 10,331,706
Granted: June 25, 2019
Filed: October 4, 2017
Entity Reconciliation is another patent from Google.
• It includes checking multiple sources to complete the missing information in the Knowledge Graph.
• It also uses a similarity threshold between different sources and the Knowledge Graph.
• If the source is authoritative, it will be easier to modify the Knowledge Graph.
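The idea of a similarity threshold can be sketched as follows. This is a minimal illustration, not the patented method: it assumes records are flat attribute dictionaries, uses Jaccard similarity over the attributes both records share, and the `threshold` value and the sample records are invented for the example.

```python
def jaccard(a, b):
    """Share of attribute-value pairs that two records have in common."""
    a, b = set(a.items()), set(b.items())
    return len(a & b) / len(a | b) if a | b else 0.0

def reconcile(graph_record, source_record, threshold=0.5):
    """Fill gaps in the graph record when the source looks like the same entity."""
    shared = {k: v for k, v in source_record.items() if k in graph_record}
    known = {k: graph_record[k] for k in shared}
    if jaccard(known, shared) < threshold:
        return graph_record  # too different: treat it as another entity
    merged = dict(graph_record)
    for key, value in source_record.items():
        merged.setdefault(key, value)  # only add what is missing
    return merged

# Hypothetical records for illustration.
graph = {"name": "Jane Doe", "occupation": "author"}
source = {"name": "Jane Doe", "occupation": "author", "birthplace": "Springfield"}
other = {"name": "Jane Doe", "occupation": "chef", "birthplace": "Shelbyville"}

merged = reconcile(graph, source)
print(merged)
```

Under this sketch, a more authoritative source could be modeled by passing a lower `threshold`, so that its records modify the graph more easily; that mapping is an assumption, not something the slide specifies.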
57. @KorayGubur
Semantic Search Engine
“For other people it can be a little more complicated. Like me, for
example, John Mueller. If you search for me you’ll find Wikipedia pages,
barbecue restaurants, bands, all kinds of people who are called John
Mueller.
And if, on my site, I don’t specify who I actually am, then it could
happen that our systems look at my page and go: “oh this is that guy
that runs that barbecue restaurant.” And suddenly I’m associated with
a barbecue restaurant, which might be a move up, I don’t know.
But these subtle things make it easier for us to recognize who is
actually behind something. We call that reconciliation when it comes to
structured data, kind of recognizing which of these entities belong
together.”
John Mueller
64. @KorayGubur
Semantic Search Engine
Semantic Role Labeling
Named Entity Resolution
Named Entity Extraction
Relation Detection
Lexical Semantics
Taxonomy
Ontology
Onomastics
Important Terms and Concepts for NER and Semantic Search Engine
65. @KorayGubur
Semantic Search Engine
Entity Extraction
• Entity extraction is a complementary step to Named Entity Recognition.
• A recognized entity can be extracted from the text to be stored in a Knowledge Base.
• Entity Extraction uses attributes to connect the entity with its meaning, prominence, and attributes.
• Take the sentence ‘The 46th President of the United States (US) had decided to go to Paris on Monday, 2nd June, 2002.’
• ‘46th President of the United States’ is the named entity.
• The president’s decision, together with its date, is the attribute captured by entity extraction.
66. @KorayGubur
Semantic Search Engine
Entity Resolution
• Entity Resolution has two phases.
• The first phase is finding the mentioned entity’s correct identity.
• The second phase is finding the correct profile of the mentioned entity.
• For instance, Ronald Reagan was a U.S. President, but also an actor in Hollywood. An American football player can also be a cook or a journalist.
• To find the right entity from the entity reference, the Search Engine can use related entities and their types.
• Entity Resolution helps feed the text-to-data systems of Search Engines.
• If you say ‘Barry Schwartz entered the classroom and asked questions to the students’, Entity Resolution will decide that it is Professor Barry Schwartz, not our Barry.
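The classroom example can be sketched as a context-overlap score: each candidate profile carries a few related terms, and the profile that shares the most terms with the sentence wins. This is a minimal sketch under that assumption; the candidate labels and their context terms are hypothetical, and real systems use related entities and types rather than a bag of words.

```python
# Hypothetical candidate profiles for the same surface name.
CANDIDATES = {
    "Barry Schwartz (professor)": {"classroom", "students", "lecture", "psychology"},
    "Barry Schwartz (search journalist)": {"seo", "google", "search", "news"},
}

def resolve(text, candidates=CANDIDATES):
    """Pick the profile whose context terms overlap most with the text."""
    words = set(text.lower().replace(".", " ").replace(",", " ").split())
    scores = {name: len(words & terms) for name, terms in candidates.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None  # None when no context helps

winner = resolve("Barry Schwartz entered the classroom and asked questions to the students.")
print(winner)
```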
67. @KorayGubur
Semantic Search Engine
Relation Detection
• Relation Detection is the process of understanding the relation types and labels between different entities within a text.
• There are different types of relations, such as ‘isSimilarOf’, ‘locatedIn’, ‘superiorOf’, ‘closeTo’, ‘sameAs’.
• Some of these relation types are familiar from Structured Data.
• Some relation types are unique to specific entities and specific topics.
• Relation Detection draws its power from Lexical Semantics.
• Relation Detection can be used for visual-to-text algorithms too.
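A very small version of relation detection maps surface phrases to relation labels. The sketch below assumes three hand-written English patterns and reuses the labels ‘locatedIn’, ‘closeTo’, and ‘sameAs’ from the list above; the patterns, the example sentence, and the function name are illustrative assumptions, and production systems rely on lexical semantics rather than fixed phrases.

```python
import re

# Hypothetical surface patterns mapped to relation labels from the slide.
RELATION_PATTERNS = [
    (re.compile(r"(.+?) is located in (.+)"), "locatedIn"),
    (re.compile(r"(.+?) is close to (.+)"), "closeTo"),
    (re.compile(r"(.+?) is the same as (.+)"), "sameAs"),
]

def detect_relation(sentence):
    """Return a (subject, relation, object) triple, or None if no pattern fits."""
    text = sentence.strip().rstrip(".")
    for pattern, label in RELATION_PATTERNS:
        match = pattern.fullmatch(text)
        if match:
            return (match.group(1), label, match.group(2))
    return None

triple = detect_relation("The Louvre is located in Paris.")
print(triple)
```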
68. @KorayGubur
Semantic Search Engine
Lexical Semantics
• Every human being should know Lexical Semantics in order to think and speak in a healthy way.
• Lexical Semantics covers the meaning connections between different words.
• Lexical Semantics is used to understand the relational connections between named entities.
• For instance, ‘Boy’ includes the meanings ‘single’, ‘teenage’, ‘male’, and ‘young’ by default. But some of these meanings have a high probability and some a low one.
• For instance, someone young, male, and teenage can also be married.
• Lexical Semantics is used to understand a named entity’s resolution and its connection with other things.
Lexeme: a unit of meaning that cannot be analyzed further by itself.
Lexicon: the list of lexemes.
69. @KorayGubur
Semantic Search Engine
Semantic Role Labeling
• Semantic Role Labeling is the process of understanding the parts of a sentence by assigning related labels.
• Semantic Role Labeling draws its power from Lexical Semantics and Part-of-Speech Tagging.
• Semantic Role Labeling helps Relation Detection.
• There are more than 32 Semantic Roles.
• For Semantic Role Labeling, the most important part is finding the theme, predicate, agent, and effect.
• Semantic Role Labeling is useful for auditing the content’s accuracy and for extracting facts from propositions.
70. @KorayGubur
Semantic Search Engine
Taxonomy
• Taxonomy, from the Greek taxis (arrangement) and nomos (law), means the arrangement of things.
• It was first used for animal classification in Ancient Greece.
• In the modern era, it is used in biology to classify all living things, and it has since been used to classify chemicals and other kinds of existing things.
• In the field of Search Engine Optimization, Semantic Entity Types and the Semantic Dependency Tree are important.
• Creating a hierarchy between entities based on their type and size, prominence, or superiority and inferiority is important for increasing contextual relevance and specifying the relevance of the article.
• Every entity type has a different attribute group, and the hierarchy can be refreshed.
• If the context is the size of cities, ‘berlin’, ‘paris’, ‘istanbul’ can have a different taxonomy in terms of big, small, and medium cities.
• If the context is the countries of these cities, the taxonomy can be aligned with country, region, and continent names.
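The point that the same entities get a different taxonomy under a different context can be sketched with the three cities above. This is a toy illustration: the population figures are rough assumptions in millions, the size cut-offs are arbitrary, and the function names are made up for the example.

```python
# Illustrative records; population figures are rough assumptions, in millions.
CITIES = {
    "berlin": {"country": "Germany", "population_m": 3.7},
    "paris": {"country": "France", "population_m": 2.1},
    "istanbul": {"country": "Turkey", "population_m": 15.5},
}

def size_class(population_m):
    """Bucket a city by population; the cut-offs are arbitrary for the example."""
    if population_m >= 10:
        return "big"
    return "medium" if population_m >= 3 else "small"

def taxonomy(cities, context):
    """Group city names under parent nodes chosen by the context."""
    tree = {}
    for name, facts in cities.items():
        parent = size_class(facts["population_m"]) if context == "size" else facts[context]
        tree.setdefault(parent, []).append(name)
    return tree

by_size = taxonomy(CITIES, "size")
by_country = taxonomy(CITIES, "country")
print(by_size)
print(by_country)
```

The entities stay the same in both calls; only the attribute chosen by the context changes, and the hierarchy changes with it.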
71. @KorayGubur
Semantic Search Engine
Ontology
• Ontology completes the taxonomy.
• From the Greek ontos (being) and logos (study): the essence of things.
• It is a branch of philosophy.
• Ontology is a reflex for all human beings.
• An ontology can be created based on the mutual points of different entities.
• Depending on the mutual attribute between entities, the taxonomy can change, and the ontology can follow it.
• If three named entities are from the same region, the region name is the mutual attribute, and they can have other types of connections based on this.
72. @KorayGubur
Semantic Search Engine
Onomastics
• Onomastics is the science of naming and of analyzing name patterns in different languages.
• Every entity type has a different naming pattern.
• Name patterns are used to recognize entities, entity types, and attributes of entities.
• It comes from the Greek onomastikos (from onoma, ‘name’), meaning ‘of naming’.
• Science names, city names, event names, situation names, or institution names can all have naming patterns.
• Some examples of onomastics sub-types:
1. helonyms: proper names of swamps, marshes and bogs.
2. limnonyms: proper names of lakes and ponds.
3. oceanonyms: proper names of oceans.
4. pelagonyms: proper names of seas and maritime bays.
5. potamonyms: proper names of rivers and streams.
• Onomastics can be used for taxonomy and ontology creation too. Even a body of water can have multiple naming patterns based on its sub-type.
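Naming patterns like these can be turned into a simple entity-type recognizer. The sketch below assumes a handful of hand-written English patterns for the five sub-types listed above; the regular expressions and the function name are illustrative assumptions, and real names have many exceptions that such patterns miss.

```python
import re

# Hypothetical English naming patterns for bodies of water.
NAME_PATTERNS = [
    (re.compile(r"^Lake \w+|\w+ (Lake|Pond)$"), "limnonym"),
    (re.compile(r"\w+ Ocean$"), "oceanonym"),
    (re.compile(r"\w+ (Sea|Bay)$|^(Sea|Bay) of \w+"), "pelagonym"),
    (re.compile(r"\w+ (River|Stream)$|^River \w+"), "potamonym"),
    (re.compile(r"\w+ (Swamp|Marsh|Bog)$"), "helonym"),
]

def classify_name(name):
    """Return the first onomastic sub-type whose pattern fits the name."""
    for pattern, label in NAME_PATTERNS:
        if pattern.search(name):
            return label
    return None

for example in ["Lake Victoria", "Pacific Ocean", "Bay of Bengal", "River Thames"]:
    print(example, "->", classify_name(example))
```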
74. @KorayGubur
Semantic Search Engine
BERT - SMITH
Uses a Masked Language Model.
It masks 15% of the tokens and trains the model to predict them.
Uses Bidirectional Language Understanding.
It reads the whole sentence at once, from both directions.
It predicts the next sentence.
SMITH handles inputs longer than 512 tokens.
Uses a fine-tuning based representation model.
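The masking step can be sketched in a few lines. This is a simplified illustration: it replaces about 15% of the tokens with a `[MASK]` symbol and keeps the originals as prediction targets, while the original BERT recipe also sometimes keeps or randomizes the chosen tokens. The whitespace tokenizer, the sample sentence, and the function name are assumptions for the example.

```python
import random

def mask_tokens(tokens, ratio=0.15, mask="[MASK]", seed=0):
    """Replace about `ratio` of the tokens with a mask symbol."""
    if not tokens:
        return [], {}
    rng = random.Random(seed)  # fixed seed keeps the example repeatable
    count = max(1, round(len(tokens) * ratio))
    positions = set(rng.sample(range(len(tokens)), count))
    masked = [mask if i in positions else tok for i, tok in enumerate(tokens)]
    labels = {i: tokens[i] for i in positions}  # what the model must predict
    return masked, labels

tokens = "the query is parsed before the documents are ranked by the search engine today".split()
masked, labels = mask_tokens(tokens)
print(masked)
print(labels)
```

Because the model sees the words on both sides of each mask, predicting the hidden token forces it to use context from both directions.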
75. @KorayGubur
Semantic Search Engine
MUM
The research papers were published in March 2021.
In May 2021, they announced MUM.
In June 2021, they announced that they had started to use MUM.
The whole system is about understanding ‘Related Search Activity’ in order to predict future queries.
78. @KorayGubur
Semantic Search Engine
Conversational Search
Conversational Search is close to Conversational AI.
It connects different entities, concepts, and intents to each other.
It creates new Contextual Domains and Co-occurrence Matrices.
The Conversational Search announcement includes only past queries.
MUM and LaMDA include future queries.
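A co-occurrence matrix over past queries can be sketched as pair counts. This is a minimal sketch, assuming whitespace tokenization and counting two terms as co-occurring when they appear in the same query; the sample queries and the function name are invented for the example.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(queries):
    """Count how often two distinct terms appear in the same query."""
    counts = Counter()
    for query in queries:
        terms = sorted(set(query.lower().split()))
        for pair in combinations(terms, 2):  # pairs come out alphabetically ordered
            counts[pair] += 1
    return counts

past_queries = ["semantic search engine", "semantic search patent", "search engine patent"]
matrix = cooccurrence(past_queries)
print(matrix.most_common(3))
```

Terms that keep appearing together across sessions are candidates for the same contextual domain.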
81. @KorayGubur
Semantic Search Engine
ReALM
Inventors: Kenton Chiu Tsun Lee,
Kelvin Gu, Zora Tung, Panupong
Pasupat, and Ming-Wei Chang
Assignee: Google LLC
US Patent: 11,003,865
Granted: May 11, 2021
Filed: May 20, 2020
First, a Research Paper.
Then, a Patent.
Lastly, an Update with an Official or Non-Official Statement.
87. @KorayGubur
‘Without understanding Query Processing through the eyes of the Search Engine, you can’t create a relevant and satisfying document based on minor and dominant search activity types.’
Thank You