Semantic MediaWiki as
Linked Open Data Platform
Bernhard Krabina, KM-A
Introduction
• Managing partner at KM-A Knowledge Management Associates
• Active member of the Semantic MediaWiki community for about 15 years
• Knowledge Graph/Wiki researcher at WU Vienna (Prof. Polleres)
• Knowledge Management lecturer at a university of applied sciences
• KM consulting
• KM training
• KM research
• open-source SMW stack
• professional hosting
Agenda
1. MediaWiki as Knowledge Base/Graph
2. Structured Data in MediaWiki
3. Semantic MediaWiki as LOD platform
Wikipedia – Wikimedia – MediaWiki
Encyclopedia – Operator – Software
The power of Knowledge Graphs
• A knowledge graph represents knowledge in the form of triples: subject – predicate – object (format: RDF)
• The triples form a network of nodes and edges
[Figure: example graph]
Hedy Lamarr – is an – actor
Hedy Lamarr – born in – Vienna
Vienna – located in – Austria
actor – subclass of – artist
• A query for Austrian artists can retrieve Hedy Lamarr even when this is not tagged on her page!
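The inference behind the "Austrian artists" query can be sketched in a few lines of Python. This is only an illustration, not how SMW or a triplestore evaluates queries: the triples are the ones from the example graph, while the function names and the rule "Austrian = born in a place located in Austria" are assumptions made for the sketch.

```python
# Triples taken from the example graph on the slide.
TRIPLES = {
    ("Hedy Lamarr", "is an", "actor"),
    ("Hedy Lamarr", "born in", "Vienna"),
    ("Vienna", "located in", "Austria"),
    ("actor", "subclass of", "artist"),
}


def objects(subject, predicate, triples=TRIPLES):
    """Return all objects o for which (subject, predicate, o) is a triple."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def superclasses(cls, triples=TRIPLES):
    """Return cls together with all of its transitive superclasses."""
    seen, todo = set(), [cls]
    while todo:
        current = todo.pop()
        if current not in seen:
            seen.add(current)
            todo.extend(objects(current, "subclass of", triples))
    return seen


def austrian_artists(triples=TRIPLES):
    """Find subjects whose class is (a subclass of) artist and whose
    birthplace is located in Austria (hypothetical rule for 'Austrian')."""
    result = set()
    for s, p, o in triples:
        if p != "is an" or "artist" not in superclasses(o, triples):
            continue
        birthplaces = objects(s, "born in", triples)
        if any("Austria" in objects(place, "located in", triples)
               for place in birthplaces):
            result.add(s)
    return result


print(austrian_artists())  # {'Hedy Lamarr'}
```

Neither "artist" nor "Austria" appears in a triple about Hedy Lamarr herself; both facts are reached by following edges in the graph.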
Knowledge Graph vs. Knowledge Base
• The term was made popular by Google: “Introducing the Knowledge Graph: things, not strings”
https://blog.google/products/search/introducing-knowledge-graph-things-not/
• Trivial definition: Ontology + Instances
[Figure: ontology (classes: actress, woman, person, ???; relations: is a, subclass of, has birthdate) plus instances (Hedy Lamarr, Romy Schneider, Christiane Hörbiger, Paula Wessely)]
https://en.wikipedia.org/wiki/Category:Actresses_
A scientific definition (Paulheim 2016)
A knowledge graph
• mainly describes real world entities and their interrelations, organized in a graph,
• defines possible classes and relations of entities in a schema,
• allows for potentially interrelating arbitrary entities,
• covers various topical domains.
In MediaWiki
• real world entities = wiki pages
• classes = categories; relations of entities = properties
• interrelating entities = linking
• topical domains = the topic of the wiki
Structures in MediaWiki
• Formatted text (headings, numbered lists, paragraphs, quotes)
• Templates
• Pages and subpages
• Namespaces
• Categories and subcategories
• Category “inflation”
• Manually curated lists
• No querying of data inside MediaWiki
Knowledge Graphs and Wikipedia vs. custom Knowledge Graphs
• extract structured information from Wikipedia and make this information available on the Web
• free knowledge base that can be read and edited by humans and machines alike… central storage for the data that may be accessed by the client Wikipedias
• turns MediaWiki into a powerful and flexible knowledge management system
• lets you store and query data within the wiki's pages
• a set of extensions for MediaWiki
Options: Knowledge Graph?
[Table: four options compared; the column headings were logos and did not survive extraction]
                | Option 1 | Option 2 | Option 3 | Option 4
Storage of data | MW database, ElasticSearch, triplestores (incl. Blazegraph) | MW database, Blazegraph | MW database | MW database
Properties      | flexible | defined before usage, unchangeable | no properties, but table fields | defined through JSON Schema
Queries         | parser function, API, triplestore (SPARQL) | API, triplestore (SPARQL) | parser function, API | parser function, API
Linking data    | RDF, importing ontologies | RDF, reusing Wikidata ontology | - | -
Knowledge Graph | ✓ | ✓ | ✗ (would need RDF or JSON-LD) | ✗ (would need RDF or JSON-LD)
Semantic MediaWiki or Wikibase?
https://www.mediawiki.org/wiki/Manual:Managing_data_in_MediaWiki
Semantic MediaWiki | Wikibase
flexible data model | data model of Wikidata
properties can be pre-defined or declared by annotating | properties need to be pre-defined
properties (and datatypes) can be changed any time | properties cannot be changed!
requires extensions for form-based input | comes with a fixed, built-in edit interface
SPARQL only with external triplestore
internal query language (easier than SPARQL) | no built-in querying of data
Semantic MediaWiki or Wikibase?
https://www.mediawiki.org/wiki/Manual:Managing_data_in_MediaWiki
Semantic MediaWiki when you | Wikibase when you
have data that is not always disputed (reference datatype exists) | need data provenance (complex statements with references for disputed data)
want a data structure that defines your (smaller) world | want a data structure that describes the world
want to re-use domain-specific vocabularies (schema.org) | want to re-use (part of) the Wikidata ontology
want to define your own edit forms | want the edit interface of Wikidata
want to interact with your data in wiki pages | want to interact with your data primarily via a SPARQL endpoint
What is Semantic MediaWiki (SMW)?
open source project:
• www.semantic-mediawiki.org
• https://github.com/SemanticMediaWiki
the “Swiss army knife” for data and semantics
built on the MediaWiki ecosystem: the wiki engine that powers Wikipedia
can be used for much more than just wikis…
MediaWiki + SMW + more extensions
MediaWiki:
• collaborative editing
• version history of every edit
• no backend: everything is a wiki page
• structure via categories and namespaces
• API
• …
SMW:
• structured data (Web database)
• result lists and formats via {{#ask:}} queries
• Semantic Web standards
• triplestore support
• …
More extensions:
• online forms for data entry
• more visualizations
• responsive skin
• authentication
• image annotation
• SPARQL
• …
Linked Data vs. Open Data
Open Data and SMW
• Make your SMW readable by everyone
– editing can still be restricted to logged-in users
• Include an open license
– see https://www.mediawiki.org/wiki/Manual:Copyright
– $wgRightsIcon = "$wgScriptPath/resources/assets/licenses/cc-by-sa.png";
• Make it easy for users to access your data:
– SMW automatically puts a link to the RDF representation of individual pages (via Special:ExportRDF) in the HTML, see https://www.semantic-mediawiki.org/wiki/Help:RDF_export
– create an RDF dump: https://www.semantic-mediawiki.org/wiki/Help:Maintenance_script_dumpRDF.php
– indicate exporting of data on pages or lists (e.g. vCard, iCalendar, BibTeX, KML)
– provide export pages with explanations and several result formats (CSV, JSON, RDF)
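A data consumer who wants the RDF of a single page can build the Special:ExportRDF address from the page title. The following is a minimal sketch: the base address `https://yourwiki/wiki` and the function name are placeholders, and it assumes the wiki serves special pages under its article path.

```python
from urllib.parse import quote


def export_rdf_url(article_base: str, page_title: str) -> str:
    """Build the Special:ExportRDF address for one wiki page.

    article_base is the (hypothetical) article path of the wiki,
    e.g. "https://yourwiki/wiki".
    """
    # MediaWiki uses underscores instead of spaces in page URLs.
    title = page_title.replace(" ", "_")
    # Percent-encode everything that is not URL-safe (e.g. umlauts).
    return f"{article_base.rstrip('/')}/Special:ExportRDF/{quote(title)}"


print(export_rdf_url("https://yourwiki/wiki", "Hedy Lamarr"))
```

The resulting address can then be fetched with any HTTP client or linked from an export page.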
Linked Data and SMW
• Re-use external vocabularies, most importantly Schema.org
• Add vocabulary definitions to properties, but also to category pages
• Use the page IDs of MediaWiki as unique identifiers
• Use the data type External identifier to link to external sources
Building your Knowledge Base
• page Vienna can have properties
– number of inhabitants, located in,
coordinates, WikidataID, …
• properties can have various data types
– page, text, number, date, URL, …
– external identifier links to external resources
• re-use external vocabularies
– “Coordinates” imported from schema:geo
• a page should be put into a category
– Also category pages should re-use vocabularies:
{{#set:Imported from=schema:City}}
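When page content is generated by a script (for example during a data import), the annotations for a page such as Vienna can be rendered as a `{{#set:}}` call. This is a rough sketch only: the helper name is made up, the property names follow the slide, and the values are illustrative. Values containing a pipe character would need escaping, which is not handled here.

```python
def render_set(annotations: dict[str, str]) -> str:
    """Render property/value pairs as a {{#set:}} parser function call."""
    pairs = [f"{prop}={value}" for prop, value in annotations.items()]
    # One property per line, separated by pipes as in wikitext templates.
    return "{{#set:\n " + "\n |".join(pairs) + "\n}}"


# Illustrative annotations for the page "Vienna".
print(render_set({"Located in": "Austria", "WikidataID": "Q1741"}))
```

The output is plain wikitext that can be saved to the page through the normal edit workflow or the MediaWiki API.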
Unique IDs in MediaWiki and Special:URIResolver in SMW
• Every page has a unique page ID
• Display it with the magic word {{PAGEID}}
• Use it to link to a page without using the page name (which could change over time)
• Use Special:URIResolver: https://yourwiki/Special:URIResolver/?curid=34
• Special:URIResolver supports content negotiation
• Furthermore: IDs for every version of a page!
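Content negotiation means that a client asks for a representation through the HTTP `Accept` header. The sketch below only prepares such a request and does not send it. The host `https://yourwiki` and page ID 34 are the placeholders from the slide, the function name is made up, and it is an assumption that the resolver answers `application/rdf+xml` with RDF.

```python
from urllib.request import Request


def rdf_request(wiki_base: str, page_id: int) -> Request:
    """Prepare a request for the RDF representation of a page,
    addressed by its stable page ID rather than its title."""
    url = f"{wiki_base.rstrip('/')}/Special:URIResolver/?curid={page_id}"
    # The Accept header tells the server which representation we prefer.
    return Request(url, headers={"Accept": "application/rdf+xml"})


req = rdf_request("https://yourwiki", 34)
print(req.full_url)
print(req.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual call against a real wiki.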
Using External Vocabularies
1. Add/edit a page MediaWiki:Smw import schema
2. Instead of a local datatype declaration, use
{{#set:Imported from=schema:geo}}
on the property page (e.g. Property:Coordinates) instead of
{{#set:Has type=Geographic coordinates}}
Add (or remove) vocabulary terms any time…
Linking to external identifiers
• Define a property
• Assign datatype “External identifier”
– Links to external ids
{{#set:Has type=External identifier
|External formatter uri=http://www.wikidata.org/entity/$1}}
• Look for other identifiers
– ORCID https://orcid.org/
– GND
– …
Even better, use Schema.org:
{{#set:Imported from=schema:sameAs}}
https://www.semantic-mediawiki.org/wiki/MediaWiki:Smw_import_schema
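The formatter URI works as a template: the `$1` placeholder is replaced by the identifier stored on the page. A small sketch of that substitution follows. The function name is hypothetical and the code only mirrors the idea, not SMW's implementation; Q1741 is the Wikidata identifier for Vienna.

```python
def format_external_id(formatter_uri: str, identifier: str) -> str:
    """Expand a formatter URI by inserting the identifier for $1."""
    if "$1" not in formatter_uri:
        raise ValueError("formatter URI must contain the $1 placeholder")
    return formatter_uri.replace("$1", identifier)


print(format_external_id("http://www.wikidata.org/entity/$1", "Q1741"))
# http://www.wikidata.org/entity/Q1741
```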
Changing data types in SMW
https://www.caf-network.eu/MediaWiki:Smw_import_dcterms
Internal query language
{{#ask:
[[Category:Practices]]
[[Country::Austria]]
|?Organisation
|?Coordinates
|format=table
}}
Internal query language
{{#ask:
[[Category:Practices]]
[[Country::Austria]]
|?Organisation
|?Coordinates
|format=map
}}
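The same query can also be sent from outside the wiki through the `ask` module of the MediaWiki API that SMW provides. A minimal sketch that only assembles the request address: the endpoint `https://yourwiki/api.php`, the function name and the default limit are assumptions, while the conditions and printouts are those of the query above.

```python
from urllib.parse import urlencode


def ask_api_url(api_url, conditions, printouts, limit=50):
    """Assemble an action=ask request for the given query parts."""
    # Conditions are concatenated; printouts and parameters are pipe-separated,
    # as in the {{#ask:}} parser function.
    query = (
        "".join(conditions)
        + "".join(f"|?{p}" for p in printouts)
        + f"|limit={limit}"
    )
    params = {"action": "ask", "query": query, "format": "json"}
    return f"{api_url}?{urlencode(params)}"


url = ask_api_url(
    "https://yourwiki/api.php",
    ["[[Category:Practices]]", "[[Country::Austria]]"],
    ["Organisation", "Coordinates"],
)
print(url)
```

Note that `format=json` here is the API response format, not an SMW result format such as `table` or `map`.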
More than 70 result formats, supporting MediaWiki templates
|format=moderntimeline
|format=calendar
|format=median
|format=D3chart
|format=gantt
|format=tagcloud
|format=json
|format=rdf
|format=bibtex
…
Maintaining your Knowledge Graph
Data/ontology curation: “semantic gardening”
• user rights
– admins, curators, users
• property annotation health
– Outdated properties/entities
– Similar properties
– Property uniqueness
– Improper annotations and failed queries
– Missing redirect annotations
https://www.semantic-mediawiki.org/wiki/Semantic_gardening
Inferencing
• subcategories
• subproperties
• equality of pages: redirects
• subqueries
https://www.semantic-mediawiki.org/wiki/Help:Inferencing
Semantic MediaWiki storage options
SQL Store (default)
• extra tables in the SQL store of MediaWiki
ElasticStore
• search engine, not a storage backend
SPARQL/RDF Store
• custom, default
• Virtuoso
• Blazegraph
• Fuseki
• Sesame
• 4store
From left to right: easy (to install) → harder to install but more powerful
Examples
• Vienna History Wiki
– https://www.geschichtewiki.wien.gv.at
• Knowledge Management Platform
– https://www.wissensmanagement.gv.at
• Austrian Public Sector Award
– https://www.verwaltungspreis.gv.at
• FINA Wiki (numismatic research)
– https://fina.oeaw.ac.at/
https://doi.org/10.1016/j.websem.2022.100771
Hacking Semantic MediaWiki
Get involved in the SMW community: www.semantic-mediawiki.org
Join us on GitHub: https://github.com/SemanticMediaWiki/
Join the mailing lists: https://www.semantic-mediawiki.org/wiki/Semantic_MediaWiki_mailing_lists
Element/Matrix/Telegram chat: https://t.me/joinchat/MCG84k3OMoaYZoFA9yhyMg
Social Media channels (Twitter, Mastodon, LinkedIn, Facebook, YouTube)
Projects you should look into
• https://canasta.wiki
• https://www.open-csp.org
SMW sponsorship: donating time
https://www.semantic-mediawiki.org/wiki/Sponsorship
Self-declaration
Donate money!
https://opencollective.com/smw/
Let’s collaborate on the future of SMW!
KMA Knowledge Management Associates | Gersthofer Straße 162 | A-1180 Wien | office@km-a.net | www.km-a.net
• Knowledge Management
• Wiki consulting, Semantic MediaWiki
• Open Government, Open Data
Bernhard Krabina
+43 676 5103593
Bernhard.krabina@km-a.net
linkedin.com/in/krabina
@krabina
