From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowledge graphs to improve search engine rankings and web publishing workflows

Do you want to learn how to use the low-hanging fruit of knowledge graphs — schema.org and JSON-LD — to annotate content and improve your SEO with semantics and entities? This hands-on workshop with one of the leading Semantic SEO practitioners will help you get started.

Published in: Data & Analytics


1. From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowledge graphs to improve search engine rankings and web publishing workflows. wordlift.io, for Connected Data, October 2019.
2. Intro. Andrea Volpini, cofounder and CEO of WordLift. David Riccitelli, cofounder and CTO of WordLift.
3. We help website owners expand traffic through organic, sustainable growth ...using semantic web technologies. Source: SEOZOOM data from an Italian blog on photography.
4. What is the plan for today? We'll dive into SEO and structured data. Our goal is to build a knowledge graph that search engines can use to understand the content of a website. We'll run a "semantic audit" on a reference website using Python (code available in Google Colab) to extract the most common entities. Starting from these entities, we will enrich them using OpenRefine with queries against DBpedia and Wikidata (code included in this presentation). Finally, we will import this data into our website and reuse it to mark up the articles, add internal links, and gain more insights on search traffic.
5. "Make Your Website Talk" using Google featured snippets and structured data.
6. Why schema.org?
● Developed by and for the search engines
● Stable, reliable and extensible
● Has become the de-facto standard for Linked Data development
● Strikes the right balance between complexity and expressiveness
● Open and community-driven
7. ...then one day, in 2010, we tell our biggest client that we have a product that uses schema markup to boost SEO!
8. The ROI of Semantic Web technologies, in the context of search optimisation, is now easy to prove:
● +12% avg. rankings growth (average position improved from 28.7 to 25.6)
● +22.22% CTR increase (from 1.8% to 2.2%)
Source: GSC data from a travel brand in Canada
9. LEARNING: "Open data needs to be taken as serious as open source software." (Chris Taggart) Semantic Open Data is an essential building block of modern SEO.
10. Danny Sullivan and John Mueller on how to enable snippets, thumbnails and rich results following the EU reform of online copyright law: https://twitter.com/cyberandy/status/1176942111767388160?s=12
11. GOOGLE MAKES BROADER USE OF SCHEMA TYPES THAN WHAT IS APPARENT THROUGH EVIDENCE LIKE RICH SNIPPETS.
12. PATENT: Query augmentation patent (3/2018). "In addition to actual queries submitted by users, augmentation queries can also include synthetic queries that are machine generated [...] A way of identifying an augmentation query is mining structured data, e.g., business telephone listings, and identifying queries that include terms of the structured data, e.g., business names." Ok Google, find the best synthetic query to answer my request. "Google may decide to add results from an augmentation query to the results for the query searched for to improve the overall search results." (Bill Slawski, SEO by the Sea)
13. Isn't this supposed to be Wikipedia?
14. mainEntityOfPage: http://data.wordlift.io/wl0737/entity/andrea_volpini
15. P6363: http://data.wordlift.io/wl0737/entity/andrea_volpini
16. BETA: What content shall be spoken aloud? Using speakable schema markup we can tell Google which sections within an article or a webpage are best suited for audio playback using text-to-speech (TTS).
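A minimal sketch of what speakable markup can look like in JSON-LD, built with Python's json module; the article URL and the CSS selectors below are illustrative placeholders, not values from the deck:

```python
import json

# Minimal speakable markup for an Article. The URL and the CSS
# selectors are hypothetical; they point TTS at a headline and summary.
speakable_markup = {
    "@context": "https://schema.org",
    "@type": "Article",
    "url": "https://www.example.com/article",
    "speakable": {
        "@type": "SpeakableSpecification",
        "cssSelector": ["article h1", ".article-summary"],
    },
}

# Serialize exactly as it would appear inside a <script> tag.
print(json.dumps(speakable_markup, indent=2))
```

The selectors tell the assistant which fragments to read aloud, so they should target short, self-contained text rather than whole articles.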
17. Make content easy for Assistant users to find:
● Optimised for Search and for the Google Assistant
● Semantically enriched content
● 5-star linked open data
● Personalise the listing in the Google Actions directory
● Let Google find your action with implicit invocation
● Add links to your intent
18. Can I book this bungalow for you? Using Schema Actions we can tell Google and personal digital assistants what actions can be triggered for a given entity. +7.5% / +43.0%. Source: GSC data from a travel brand in the Netherlands.
19. Actions in Action:
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "url": "https://www.example.com/",
  "potentialAction": {
    "@type": "SearchAction",
    "target": "https://query.example.com/search?q={search_term_string}",
    "query-input": "required name=search_term_string"
  }
}
</script>
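What the SearchAction above means in practice: a client fills the {search_term_string} placeholder in the target template with the user's URL-encoded query. A small sketch using the same template URL as the slide:

```python
from urllib.parse import quote_plus

# Target template from the slide's SearchAction markup.
target = "https://query.example.com/search?q={search_term_string}"

def fill_search_action(template: str, query: str) -> str:
    """Substitute the user's query into the SearchAction target template."""
    return template.replace("{search_term_string}", quote_plus(query))

print(fill_search_action(target, "knowledge graphs"))
# → https://query.example.com/search?q=knowledge+graphs
```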
20. Recap:
1. Using schema markup I opt in for Google's snippets, thumbnails and rich results. Google considers structured data as free-to-use open data. The ROI of rich results is now easy to demonstrate.
2. Google makes broader use of structured linked data that we can benefit from. Findability is improved by helping Google augment user queries.
3. Branding is also improved/controlled by helping Google enrich its Knowledge Graph (I can claim knowledge panels and connect data). We can also claim directory pages for the Google Assistant.
4. Using schema actions I can improve user engagement on the Google SERP. This results in new entry points for the funnel (i.e. deep links to apps) and support for conversational UIs.
21. Hands-On: Agentive SEO. WordLift + your content.
22. "A Knowledge Graph is a programmatic way to model a knowledge domain with the help of subject-matter experts, data interlinking, and machine learning algorithms." The Knowledge Graph (for SEO) is built on top of existing databases such as Wikidata and DBpedia to link all data together at web scale, combining both structured information (i.e. the list of destinations on a travel website) and unstructured content (the articles on the website).
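At its smallest, a knowledge graph of this kind is just a set of subject-predicate-object triples interlinking the site's entities with Wikidata and DBpedia. A toy sketch (the `ex:` entity names are illustrative; Q220 is Wikidata's identifier for Rome):

```python
# Toy knowledge graph as subject-predicate-object triples.
# The ex: identifiers are hypothetical; the sameAs targets are real URIs.
triples = [
    ("ex:rome", "rdf:type", "schema:Place"),
    ("ex:rome", "schema:name", "Rome"),
    ("ex:rome", "owl:sameAs", "http://www.wikidata.org/entity/Q220"),
    ("ex:rome", "owl:sameAs", "http://dbpedia.org/resource/Rome"),
    ("ex:article-1", "schema:mentions", "ex:rome"),
]

def objects(graph, subject, predicate):
    """Return all objects for a given subject/predicate pair."""
    return [o for s, p, o in graph if s == subject and p == predicate]

# All external identities of the local "Rome" entity.
print(objects(triples, "ex:rome", "owl:sameAs"))
```

The owl:sameAs links are what make the graph "linked data": they anchor local entities to the web-scale databases mentioned above.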
23. Building a SEO-friendly Knowledge Graph.
24. Pipeline: Website analysis and URL selection → Named Entity Extraction → Data Linking & Enrichment → Data Publishing → Data Exploitation & SEO. Tools: Crawler, Google Colab, OpenRefine, WordPress/WordLift (or your favorite CMS).
25. From URLs to a Knowledge Graph: data expansion for travel destinations. CRAWLER:
1. Crawl and analyze URLs that can benefit from structured data
2. Collect data from LOD to enrich each page
3. Improve impressions and clicks on Google
26. Named Entity Extraction using Google Colab! 👉 http://bit.ly/semantic-audit
27. BETA: From URLs to the most common entities. Semantic Audit: we extract named entities from the URLs using Python, going from all the URLs that matter to the most common entities.
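The audit notebook itself is behind the link above, but the final aggregation step amounts to counting entity mentions across pages. A sketch assuming the NER output is already available as lists of labels per URL (the URLs and labels below are illustrative):

```python
from collections import Counter

# Hypothetical NER output: entity labels extracted per URL.
# In the real audit these come from an NLP pipeline run in Colab.
entities_by_url = {
    "https://example.com/rome-guide": ["Rome", "Colosseum", "Italy"],
    "https://example.com/florence-guide": ["Florence", "Italy", "Uffizi"],
    "https://example.com/italy-itinerary": ["Italy", "Rome", "Florence"],
}

# Flatten all mentions and count them: the most common entities
# are the best candidates for reconciliation and markup.
counts = Counter(e for labels in entities_by_url.values() for e in labels)
print(counts.most_common(3))
```

The most frequent entities are then the rows fed into OpenRefine for linking and enrichment.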
28. How to clean data.
29. Typical cleaning tasks:
● MISSING DATA: missed measurements, incomplete fields, etc.
● WRONG VALUES: misspellings, outliers, "spurious integrity", etc.
● ENTITY RESOLUTION: different values, abbreviations, two or more entries for the same thing
● TYPE CONVERSION: e.g., ZIP code or place name to lat-lon
● DATA INTEGRATION: mismatches and inconsistencies when combining data
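Two of the cleaning tasks above, entity resolution and type conversion, can be sketched in a few lines; the alias table and the sample rows are illustrative, not taken from the deck:

```python
# Entity resolution: map spelling variants to one canonical name.
# The alias table is a hypothetical example.
CANONICAL = {"roma": "Rome", "rome": "Rome", "firenze": "Florence"}

def resolve_entity(name: str) -> str:
    """Collapse variants ("Roma", "rome") into a single canonical label."""
    return CANONICAL.get(name.strip().lower(), name.strip())

def to_float(value, default=None):
    """Type conversion with a default for missing or unparsable fields."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return default

rows = [("Roma", "41.9"), ("firenze", ""), ("Rome", "41.9")]
cleaned = [(resolve_entity(n), to_float(lat)) for n, lat in rows]
print(cleaned)  # [('Rome', 41.9), ('Florence', None), ('Rome', 41.9)]
```

In the workshop the same kind of fixes are applied interactively in OpenRefine rather than in standalone scripts.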
30. We refine data by isolating rows and by applying changes.
31. FILTER and TRANSFORM.
32. How to link & enrich data.
33. Walkthrough: Data Enrichment Pipeline.
1. Reconciliation/match against Wikidata
2. Get Wikidata URI
3. Load Wikidata JSON
4. Get Wikipedia address
5. Add synonyms from Wikidata JSON
6. Get DBpedia URI
7. Load DBpedia JSON
8. Add images from DBpedia JSON
9. Add schema entity types from DBpedia JSON
10. Add sameAs from DBpedia JSON
11. Add comment from DBpedia JSON
12. Add abstract from DBpedia JSON
13. (Add every other property from Wikidata)
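Outside OpenRefine, the two URI-building steps (Get Wikidata URI, Get DBpedia URI) are plain string transformations. A Python 3 restatement of the expressions used in the step-by-step slides:

```python
def wikidata_uri(qid: str) -> str:
    """Build the Wikidata entity URI from a reconciled QID."""
    return "http://www.wikidata.org/entity/" + qid

def dbpedia_uri(wikipedia_url: str) -> str:
    """Turn an English Wikipedia URL into a DBpedia resource URI."""
    return wikipedia_url.replace(
        "https://en.wikipedia.org/wiki/", "http://dbpedia.org/resource/"
    )

print(wikidata_uri("Q220"))
print(dbpedia_uri("https://en.wikipedia.org/wiki/Rome"))
```

The slides implement the same logic as OpenRefine Jython cell expressions, where `cells` and `value` refer to the current row.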
34. Entity Reconciliation on Wikidata. Here is a video to watch.
35. Step-by-Step 1/6 (Python/Jython)
1. Reconciliation/match against Wikidata in the required language
2. Get Wikidata URI:
   id = cells["entity"].recon.match.id
   wikidata_uri = "http://www.wikidata.org/entity/" + id
   return wikidata_uri
3. Load Wikidata JSON:
   import urllib
   q = cells['entity'].recon.match.id
   url = "https://www.wikidata.org/wiki/Special:EntityData/" + q + ".json"
   response = urllib.urlopen(url)
   return response.read()
4. Get Wikipedia address:
   import json
   dict = json.loads(value)['entities']
   q = dict.keys()[0]
   return dict.get(q)['sitelinks']['enwiki']['url']
36. Step-by-Step 2/6 (Python/Jython)
5. Get DBpedia URI:
   return value.replace('https://en.wikipedia.org/wiki/', 'http://dbpedia.org/resource/')
6. Load DBpedia JSON:
   import urllib
   url = value.replace('http://dbpedia.org/resource/', 'http://dbpedia.org/data/') + '.json'
   response = urllib.urlopen(url)
   return response.read()
37. Step-by-Step 3/6 (Python/Jython)
7. Add synonyms from Wikidata JSON:
   import json
   def get_value(synonym):
       return synonym['value']
   dict = json.loads(value)['entities']
   q = dict.keys()[0]
   synonyms = dict.get(q)['aliases']['en']
   values = map(get_value, synonyms)
   return '\t'.join(values)
8. Add images from DBpedia JSON:
   import json
   def get_value(item):
       return item['value']
   dbpedia_json = cells["DBpedia JSON"].value
   dbpedia_uri = cells["DBpedia URI"].value
   dict = json.loads(dbpedia_json)
   images = dict.get(dbpedia_uri)['http://xmlns.com/foaf/0.1/depiction']
   values = map(get_value, images)
   return ', '.join(values)
38. Step-by-Step 4/6 (Python/Jython)
9. Add schema entity types from DBpedia JSON:
   import json
   def get_value(item):
       return item['value']
   def starts_with_schema_org(value):
       return value.startswith("http://schema.org/")
   dbpedia_uri = cells["DBpedia URI"].value
   dict = json.loads(value)
   same_as = dict.get(dbpedia_uri)["http://www.w3.org/1999/02/22-rdf-syntax-ns#type"]
   values = filter(starts_with_schema_org, map(get_value, same_as))
   return ', '.join(values)
10. Add sameAs from DBpedia JSON:
   import json
   def get_value(item):
       return item['value']
   dbpedia_uri = cells["DBpedia URI"].value
   dict = json.loads(value)
   same_as = dict.get(dbpedia_uri)["http://www.w3.org/2002/07/owl#sameAs"]
   values = map(get_value, same_as)
   return ', '.join(values)
39. Step-by-Step 5/6 (Python/Jython)
11. Add comment from DBpedia JSON:
   import json
   def by_language(lang):
       def filter_by_language(item):
           return lang == item['lang']
       return filter_by_language
   def to_value(item):
       return item['value']
   dict = json.loads(value)
   dbpedia_uri = cells["DBpedia URI"].value
   items = filter(by_language('en'), dict[dbpedia_uri]['http://www.w3.org/2000/01/rdf-schema#comment'])
   return ', '.join(map(to_value, items))
12. Add abstract from DBpedia JSON:
   import json
   def by_language(lang):
       def filter_by_language(item):
           return lang == item['lang']
       return filter_by_language
   def to_value(item):
       return item['value']
   dict = json.loads(value)
   dbpedia_uri = cells["DBpedia URI"].value
   items = filter(by_language('en'), dict[dbpedia_uri]['http://dbpedia.org/ontology/abstract'])
   return ', '.join(map(to_value, items))
40. Step-by-Step 6/6 (Python/Jython)
13. Add every other property from Wikidata (all at once), either using the property name (instance of) or the identifier (P31).
41. Data Publishing. Next steps: export a TSV from OpenRefine → custom WP plugin → WordPress & WordLift (WP or Cloud) → webpages + Knowledge Graph at data.wordlift.io.
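The export step is an ordinary TSV round-trip. A sketch of reading an OpenRefine TSV export with Python's csv module; the column names are assumptions for illustration, not the plugin's actual schema:

```python
import csv
import io

# Illustrative TSV shaped like an OpenRefine export (columns assumed).
tsv_export = (
    "entity\twikidata_uri\n"
    "Rome\thttp://www.wikidata.org/entity/Q220\n"
)

# DictReader with a tab delimiter parses each row into a dict,
# ready to be pushed into the CMS / knowledge graph.
reader = csv.DictReader(io.StringIO(tsv_export), delimiter="\t")
records = list(reader)
print(records[0]["wikidata_uri"])
```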
42. DO I REALLY NEED SEMANTIC ANNOTATIONS AND LINKED DATA?
43. Google SERP result types: 1. STANDARD 2. TALLER ORGANIC CARD 3. LOCAL 3-PACK 4. HOWTO 5. SHOPPING 6. RICH SNIPPET 7. SITE CAROUSEL 8. SITE LINKS 9. SITE IMAGE CAROUSEL 10. TOP STORIES FOR NEWS 11. AMP 12. GOOGLE FLIGHTS 13. PEOPLE ALSO ASK 14. CATEGORY 15. IMAGES 16. VIDEO / TRAILERS 17. LIVE 18. TOP SIGHTS 19. REVIEWS 20. BLOGS 21. KNOWLEDGE PANEL 22. CAROUSEL 23. APPS 24. GOOGLE FOR JOBS 25. RECIPES 26. SCHOLARLY RESEARCH 27. WEATHER 28. GAME SCORES 29. TWEETS 30. DISCOVER MORE PLACES 31. SEND TO GOOGLE HOME 32. PEOPLE ALSO SEARCH FOR 33. SEE RESULTS ABOUT 34. WIDGETS 35. FOUND IN RELATED SEARCH 36. QUOTES 37. EVENTS 38. DATASETS SEARCH 39. MOVIE CAROUSEL 40. PODCAST 41. COURSE
44. (The same list of SERP result types, repeated.)
45. Can I simply add structured data with my CMS? Average values, semantically enriched vs. non-semantically enriched pages:
● Clicks: 5.23 vs 2.15 (243.09%)
● Impressions: 89 vs 57 (156.56%)
● CTR: 2.39% vs 1.60% (149.06%)
● Position: 35.32 vs 37.93 (6.88% better)
Source: GSC data from a content publisher in Germany
46. Semi-automate structured linked data using NLP:
1. Annotate/link named entities in posts and pages
2. Build a Knowledge Graph optimised for SEO
3. Improve impressions and clicks on Google
● 148,166 new users from organic search in 6 months
● +92.64% growth when compared to similar websites in Austria
Source: Google Analytics data from a travel brand in Austria
47. Driving forces behind AI transformation: is data for Google only?
48. Google Analytics goes semantic: traffic by topic; topics mentioned in an article.
49. LEARNING: "We want raw data NOW." (Tim Berners-Lee) Always link data with other data; data becomes the center of your new product/service.
50. Grazie! (Thank you!) Here is your SPECIAL OFFER: https://wordlift.io/connecteddata
51. Resources
● More information about data cleaning: https://www.aviz.fr/wiki/uploads/Teaching2018/2018-11-22-CleaningIntroTutorial.pdf
● Entity reconciliation with OpenRefine: https://www.youtube.com/watch?v=q8ffvdeyuNQ
● OpenRefine tutorial 1: https://www.youtube.com/watch?time_continue=8&v=cO8NVCs_Ba0
● OpenRefine tutorial 2: https://www.youtube.com/watch?time_continue=167&v=B70J_H_zAWM
● Wikidata for beginners: https://upload.wikimedia.org/wikipedia/commons/6/64/Wikidata_-_A_Gentle_Introduction_for_Complete_Beginners_%28WMF_February_2017%29.pdf
