Coupling Semantic MediaWiki with MASTRO
Student: Albin Ahmeti 
Advisor: Prof. Maurizio Lenzerini 
Master thesis 26/01/2011
Outline 
• MediaWiki
• Semantic MediaWiki (SMW)
• Coupling SMW with MASTRO
• SMWQuonto Control Panel
• Conclusion
• Future work
MediaWiki 
• MediaWiki (MW) is free server-based software, licensed under the
GNU General Public License, that runs Wikipedia
• It is widely used by companies as a content-management system;
it provides fast page processing and short response times
• PHP and MySQL in the backend
• It handles content mirroring and concurrent, conflicting page edits
by different users
• Wikipedia articles consist of wikitext: plain text combined with a
lightweight markup language
MediaWiki (cont.) 
• Creating a page is kept to a minimum: type “[[Title]]” and then the
content, e.g. to create a wiki page titled “Rome”:
[[Rome]]
MediaWiki (cont.)
• It distinguishes between pages using namespaces
  – Main:
  – User:
  – Help:
  – Talk:
• Usage of templates:
  – Transclusion -> {{Template name}}
  – Substitution -> {{subst:Template name}}
Semantic MediaWiki 
• Semantic MediaWiki (SMW) is the most popular platform to date
for encoding semantic data in wiki articles
• It introduces a small syntax for machine-processable metadata
attached to page constructs (link types)
• SMW is built on top of MediaWiki (MW) with the same technology,
i.e. it is tightly coupled to MW
Semantic MediaWiki 
• Annotation of pages is page-centric:
  – Categories
  – Properties
  – Attributes
Semantic MediaWiki 
• Categories
  – Are used to classify pages for better retrieval and organization
  – Correspond to classes in OWL DL
    [[Category:City]] [[Category:Holy cities]]
• Subcategories
  – Same notation, but defined in the namespace Category:
  – Class inclusion in OWL DL -> intensional knowledge
  – Holy cities ⊑ City
Semantic MediaWiki 
• Properties
  – Typed links (relations) between wiki pages, i.e. hyperlinks
  – Correspond to OWL object properties
    Plain link: Rome is capital of [[Italy]]
    Annotated link: Rome is capital of [[capital of::Italy]]
• Subproperties
  – Are defined in the namespace Property:
  – [[subproperty of::Property:Located In]]
  – capital of ⊑ Located In
Semantic MediaWiki 
• Attributes
  – Relations between a wiki page and a data value
    Plain text: Rome has population 2,700,000.
    Annotated: Rome has population [[population::2,700,000]]
• A property becomes an attribute when it is given a datatype on its
page in the Property namespace:
[[Has type::number]]
• Attributes in wiki pages correspond to OWL datatype properties,
annotation properties and object properties
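The annotation syntax above is regular enough to be pulled out of wiki text mechanically. The following is a minimal sketch, not the parser used by SMW or by the thesis; the function name `extract_annotations` and both regular expressions are assumptions made for illustration, and they ignore nesting, aliases and other corner cases.

```python
import re

# [[Category:Name]] marks class membership of the page.
CATEGORY_RE = re.compile(r"\[\[Category:([^\]|]+)\]\]")
# [[property::value]] marks a property or attribute of the page.
PROPERTY_RE = re.compile(r"\[\[([^\]:|]+)::([^\]|]+)\]\]")


def extract_annotations(page_title, wikitext):
    """Return (categories, facts) found in the wiki text of one page.

    Each fact is a (subject, property, value) triple whose subject is
    the page itself, reflecting the page-centric annotation model.
    """
    categories = [c.strip() for c in CATEGORY_RE.findall(wikitext)]
    facts = [
        (page_title, prop.strip(), value.strip())
        for prop, value in PROPERTY_RE.findall(wikitext)
    ]
    return categories, facts


text = (
    "Rome is capital of [[capital of::Italy]]. "
    "Rome has population [[population::2,700,000]]. "
    "[[Category:City]]"
)
categories, facts = extract_annotations("Rome", text)
```

Here `categories` holds the class memberships of the page and `facts` holds one triple per annotated link, which is the shape of data that would later be pushed to a reasoner as assertions.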
Semantic MediaWiki 
• Querying in SMW (inline queries)
{{#ask: [[Category:City]] [[Located in::Italy]] 
|?Population 
|?Area 
|sort=Population, Area 
|order=descending, ascending 
}} 
Semantic MediaWiki 
{{#ask: [[Category:Student]][[degree::!Sapienza]] [[age::>24]][[age::<30]] 
|?name 
|?surname 
|?age 
}} 
Closed-world assumption (CWA) -> easy to evaluate
Open-world assumption (OWA) -> not easy to evaluate, because the ontology
does not hold complete information
Semantic MediaWiki 
• Architecture of SMW 
Semantic MediaWiki 
SMW has rather limited expressivity:
• no disjoint classes
• no disjoint properties
• no functional properties
• no inverse properties
• a query language that allows neither joins nor explicit variables
What if we add more expressivity while keeping the reasoning tasks
(query answering) polynomial?
Our approach 
• We use QuOnto as the reasoner
• We use the expressivity of DL-Lite; reasoning tasks are polynomial
w.r.t. the size of the ontology:
  – query answering
  – subsumption
  – ontology satisfiability
  – instance checking
• We use unions of conjunctive queries (UCQs) for posing queries
  – can express joins
  – allow variables
  – correspond to unions of select-project-join SQL queries
  – LOGSPACE in data complexity
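The correspondence between UCQs and select-project-join SQL can be made concrete with a small sketch. It assumes a hypothetical store with one table per predicate (`Student`, `Sapienza`, `enrolledIn`) and made-up rows; it is not the schema used by QuOnto. Each conjunctive query becomes one SELECT with joins, and the union of queries becomes a SQL UNION.

```python
import sqlite3

# Hypothetical data: one table per unary or binary predicate.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Student(x TEXT);
    CREATE TABLE Sapienza(x TEXT);
    CREATE TABLE enrolledIn(x TEXT, y TEXT);
    INSERT INTO Student VALUES ('anna'), ('marco'), ('luca');
    INSERT INTO Sapienza VALUES ('Engineering');
    INSERT INTO enrolledIn VALUES
        ('anna', 'Engineering'), ('marco', 'DIS'), ('luca', 'Bocconi');
    """
)

# UCQ with two disjuncts:
#   Q(x) :- Student(x), enrolledIn(x, y), Sapienza(y)
#   Q(x) :- Student(x), enrolledIn(x, 'DIS')
# Shared variables become join conditions, constants become WHERE filters.
ucq_sql = """
SELECT s.x FROM Student s
  JOIN enrolledIn e ON e.x = s.x
  JOIN Sapienza c ON c.x = e.y
UNION
SELECT s.x FROM Student s
  JOIN enrolledIn e ON e.x = s.x
 WHERE e.y = 'DIS'
"""
answers = sorted(row[0] for row in conn.execute(ucq_sql))
```

Because the whole UCQ is a single SQL statement, its evaluation can be delegated to the database engine that stores the instance data.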
Our approach 
• DL-Lite_core
    B → A | ∃R        R → P | P⁻
    C → B | ¬B        E → R | ¬R
  TBox assertions: B ⊑ C
  ABox assertions: A(a), P(a, b)
  A denotes an atomic concept, P an atomic role and P⁻ its inverse.
  B denotes a basic concept, R a basic role.
• DL-Lite_F adds: (funct R)
• DL-Lite_R adds: R ⊑ E
Our approach 
• Architecture
Our approach 
In addition, we added four types of assertions that are not available in SMW:
Our approach 
Our approach 
DL-Lite vs OWL DL interpretation of SMW annotations 
Our approach 
• Query translation
  – #ask queries are translated into function-free positive logic
    programming (LP) rules with minimal Herbrand model semantics,
    following Jie Bao et al. (which captures SMW 1.4.2)
  – n-ary predicates removed
  – Adjustments (tuning) to handle the latest SMW 1.5.x
    (mainly code-based)
Our approach 
• Query translation
Translation from SMW-QL to a logic program, as defined in the paper
“Knowledge Representation and Query in Semantic MediaWiki: A Formal Study” by Jie Bao et al.
Our approach 
• Query translation
1) Map query annotations to their logic counterparts (based on the schema)
2) Replace body atoms with their definitions
3) Perform a depth-first search (DFS) on the rules
Our approach 
• Query translation
{{#ask: [[Category:Student]]
[[enrolled in:: <q>[[Category:Sapienza]]</q> || DIS]] }}
Q(x) :- L(x)
L(x) :- A1(x), A2(x)
A1(x) :- Student(x)
A2(x) :- enrolledIn(x, y), N(y)
N(x) :- Sapienza(x)
N(x) :- x:=DIS ... (1)
Our approach 
• Query translation
Replace each body atom whose predicate occurs only once as a rule head
with the body of that rule:
Q(x) :- Student(x), enrolledIn(x, y), N(y)
N(x) :- Sapienza(x)
N(x) :- x:=DIS ... (2)
Our approach 
• Query translation
• Apply depth-first search (DFS) to rules (2):
Q(x) :- Student(x), enrolledIn(x, y), N(y)
N(x) :- Sapienza(x)
N(x) :- x:=DIS
Our approach 
• Query translation
After applying DFS, we obtain a union of conjunctive queries (UCQ):
Q(x) :- Student(x), enrolledIn(x, y), Sapienza(y)
Q(x) :- Student(x), enrolledIn(x, ‘DIS’)
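The unfolding from rules (2) to the UCQ can be sketched as a depth-first expansion. This is an illustrative reconstruction, not the thesis code: the data layout (`RULES`, atoms as `(predicate, args)` tuples) and the helper names are assumptions, and the sketch relies on rule bodies using only their head variables, as in the example. A general implementation would have to rename other variables apart.

```python
# Rules (2): predicate -> list of (head variables, body atoms).
# An atom is (predicate, args); "=" binds a variable to a constant.
RULES = {
    "Q": [(("x",), [("Student", ("x",)), ("enrolledIn", ("x", "y")), ("N", ("y",))])],
    "N": [
        (("x",), [("Sapienza", ("x",))]),
        (("x",), [("=", ("x", "'DIS'"))]),
    ],
}


def substitute(atom, mapping):
    """Rename the arguments of an atom according to mapping."""
    pred, args = atom
    return (pred, tuple(mapping.get(a, a) for a in args))


def apply_equalities(body):
    """Replace variables bound by '=' with their constant; drop '=' atoms."""
    bindings = {args[0]: args[1] for pred, args in body if pred == "="}
    return [substitute(atom, bindings) for atom in body if atom[0] != "="]


def unfold(body):
    """Depth-first expansion: branch on every rule of the first defined atom."""
    for i, (pred, args) in enumerate(body):
        if pred in RULES:
            for head_vars, rule_body in RULES[pred]:
                mapping = dict(zip(head_vars, args))
                expanded = [substitute(a, mapping) for a in rule_body]
                yield from unfold(body[:i] + expanded + body[i + 1:])
            return
    # No defined predicate left: this branch is one conjunctive query.
    yield apply_equalities(body)


def fmt(head, body):
    return head + " :- " + ", ".join(f"{p}({', '.join(a)})" for p, a in body)


_, q_body = RULES["Q"][0]
ucq = [fmt("Q(x)", b) for b in unfold(q_body)]
```

Each leaf of the search tree yields one conjunctive query, so the two rules for N produce the two disjuncts of the UCQ.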
Our approach 
• Query translation
{{#ask: [[Category:Student]]
[[enrolled in:: <q>[[Category:Sapienza]]</q> || DIS]] }}
• It returns all students enrolled in DIS or in an instance of the class
Sapienza or of any of its subclasses (research centers, affiliations, etc.)
• This is achieved by PerfectRef, the query rewriting algorithm
implemented in QuOnto, so no special algorithm is needed
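How a rewriting step picks up subclasses can be illustrated with a deliberately reduced sketch. It is not PerfectRef: the real algorithm also handles existential restrictions, inverse roles, role inclusions and atom unification, whereas this sketch only applies atomic concept inclusions of the form sub ⊑ sup. The inclusions `ResearchCenter ⊑ Sapienza` and `Affiliation ⊑ Sapienza` are hypothetical, chosen to match the examples named on the slide.

```python
# Hypothetical TBox: each pair (sub, sup) states the inclusion sub ⊑ sup.
INCLUSIONS = [("ResearchCenter", "Sapienza"), ("Affiliation", "Sapienza")]


def rewrite(ucq):
    """Saturate a UCQ under atomic concept inclusions.

    Whenever sub ⊑ sup holds and sup(t) occurs in a query, a copy of the
    query with sub(t) in its place is added, until nothing new appears.
    """
    result = list(ucq)
    seen = {frozenset(q) for q in ucq}
    frontier = list(ucq)
    while frontier:
        query = frontier.pop()
        for i, (pred, args) in enumerate(query):
            for sub, sup in INCLUSIONS:
                if pred == sup and len(args) == 1:
                    new_query = query[:i] + [(sub, args)] + query[i + 1:]
                    key = frozenset(new_query)
                    if key not in seen:
                        seen.add(key)
                        result.append(new_query)
                        frontier.append(new_query)
    return result


# The UCQ obtained from the #ask query; atoms are (predicate, args).
ucq = [
    [("Student", ("x",)), ("enrolledIn", ("x", "y")), ("Sapienza", ("y",))],
    [("Student", ("x",)), ("enrolledIn", ("x", "'DIS'"))],
]
rewritten = rewrite(ucq)
```

The rewritten union contains the original two queries plus one variant per subclass, so evaluating it over the stored instances returns the subclass members without any reasoning at query time.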
SMWQuonto Control Panel 
DBMS: manages the ABox
XML file: manages the TBox
SMWQuonto Control Panel 
Semantic data pushed to QuOnto
SMWQuonto Control Panel 
AskQL query posed to QuOnto
UCQs obtained
Results
Conclusion 
• We built a system that offers more expressivity than SMW while
keeping the complexity polynomial
• It can handle hundreds of thousands of pages (instances), e.g. Wikipedia
• We have a more expressive query language: UCQs
– The comparators !, <, >, ~ can also be handled by imposing EQL constraints
on UCQs (SparSQL)
– Any extension of the expressivity of the askQL query language makes the complexity NP-hard
• It is meant as a proposal for an alternative to the triple store used in SMW
• It can be considered an SMW extension
Future work 
• Use the SparSQL query language to handle the special constructs (!, <, >,
∼), thereby fully capturing askQL queries
• Provide a more scalable coupling between the two architectures, e.g. via
SOAP and Web Services
• Fetch semantic data from more than one page at a time, using the RDF/XML output
• Fetch the whole category hierarchy at once, thereby obtaining the taxonomy of the ontology
Questions?