A Multilingual Semantic Wiki Based on
Controlled Natural Language
Tobias Kuhn
Chair of Sociology, in particular of Modeling and Simulation, ETH Zurich,
Switzerland
Insight, National University of Ireland, Galway
19 August 2014
About This Talk
This talk is mainly based on the following papers:
Kaarel Kaljurand and Tobias Kuhn. A Multilingual Semantic Wiki
Based on Attempto Controlled English and Grammatical Framework.
In Proceedings of the 10th Extended Semantic Web Conference
(ESWC). 2013.
http://purl.org/tkuhn/eswc2013acewikigf
Kaarel Kaljurand, Tobias Kuhn, and Laura Canedo. Collaborative
multilingual knowledge management based on controlled natural
language. Semantic Web. Accepted, to appear.
http://www.semantic-web-journal.net/system/files/swj524.pdf
Tobias Kuhn, ETH Zurich A Multilingual Semantic Wiki 2 / 27
Imagine ...
... that Wikipedia can check consistency and answer
questions about the contained knowledge, and
... that all content is instantly available in all
languages!
• AceWiki is a semantic wiki
• Articles are written in Attempto Controlled English (ACE)
• These sentences are internally translated into the Semantic Web
language OWL
• An OWL reasoner is built in to answer questions and detect
inconsistencies
• Special editor for writing ACE statements
• Extended to support multilinguality
Monolingual AceWiki: Screenshot
Attempto Controlled English (ACE)
Subset of natural English:
• Conjunction, disjunction, negation, if-then, ...
• Anaphoric references: pronouns, definite noun phrases, variables
• Quantifiers: every, no, at least 3, ...
• Content words: proper names, nouns, verbs, adjectives, ...
Grammar is fixed, but users can change content words.
Deterministic ambiguity handling:
• Anaphora resolution (France borders Spain and it borders
Portugal.)
• Quantifier scope (Every country borders a country.)
• Attachment (Every EU-country borders a country that is an
EU-country and is a NATO-country.)
Well-defined translations to and from first-order logic, OWL, ...
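As a small worked example of the quantifier-scope bullet, and assuming ACE's rule that scope follows the surface order of the quantifiers, the sentence "Every country borders a country." would receive the following first-order reading. The predicate names are chosen here for illustration.

```latex
\[
\forall x \, \bigl( \mathit{country}(x) \rightarrow
  \exists y \, ( \mathit{country}(y) \land \mathit{border}(x, y) ) \bigr)
\]
```

Under this reading each country may border a different country; the alternative reading, in which one single country is bordered by all, is not generated.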
Predictive Editor
Consistency Checking
AceWiki ensures consistency by checking every new statement:
Question Answering
AceWiki supports simple wh-questions:
Monolingual AceWiki: Demo
http://attempto.ifi.uzh.ch/acewiki/
ACE Reasoning via Translation to OWL
Every country that does not border a sea is a landlocked-country.
SubClassOf(
  ObjectIntersectionOf(
    :country
    ObjectComplementOf(
      ObjectSomeValuesFrom(
        :border
        :sea
      )
    )
  )
  :landlocked-country
)
Which country is a landlocked-country?
ObjectIntersectionOf(
  :country
  :landlocked-country
)
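To make the mapping above more concrete, here is a minimal sketch that assembles the OWL functional-syntax axiom for one fixed sentence shape ("Every X that does not V a Y is a Z."). The regular expression and the function name ace_to_owl are hypothetical and cover only this single pattern; the actual ACE parser (APE) handles the full language and is not reproduced here.

```python
import re

# Hypothetical pattern for exactly one ACE sentence shape:
#   "Every <noun> that does not <verb> a <noun> is a <noun>."
PATTERN = re.compile(
    r"^Every (?P<sub>[\w-]+) that does not (?P<verb>[\w-]+) "
    r"an? (?P<obj>[\w-]+) is an? (?P<sup>[\w-]+)\.$"
)


def ace_to_owl(sentence: str) -> str:
    """Return an OWL functional-syntax axiom for the one supported pattern."""
    m = PATTERN.match(sentence)
    if m is None:
        raise ValueError("sentence does not fit the illustrative pattern")
    return (
        "SubClassOf("
        f"ObjectIntersectionOf(:{m['sub']} "
        f"ObjectComplementOf(ObjectSomeValuesFrom(:{m['verb']} :{m['obj']}))) "
        f":{m['sup']})"
    )


print(ace_to_owl("Every country that does not border a sea is a landlocked-country."))
```

The output is the same axiom as shown above, written on a single line.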
Evaluation
Two small usability experiments with earlier versions of AceWiki:
• Altogether 26 untrained participants
• Task: Collaborative creation of a knowledge base
Results:
• 78%–81% of the sentences were correct and sensible
• 61%–70% of them were complex (containing negations,
implications, disjunctions or number restrictions)
• Creation of a correct sentence every 5–6 minutes
• Definition of a new word every 5–7 minutes
→ Even untrained users can effectively use AceWiki
Multilingual AceWiki: AceWiki-GF
General ideas:
• Make wiki content available in different languages
• Automatically translated content using high-quality rule-based
machine translation: Grammatical Framework (GF)
• Language switching as in Wikipedia
• Localization of the user interface
Grammatical Framework (GF)
GF is a framework for multilingual grammar engineering:
• Rule-based
• Functional programming language (based on Haskell) optimized
to handle natural language
• Resource Grammar Library implementing common morphological
and syntactic structures
• Mildly context sensitive
• Bidirectional translations: concrete language ⇔ abstract syntax
GF grammars and translations
GF grammars consist of:
• One language-neutral abstract syntax
• Concrete syntaxes specify words, agreement, word order, etc. by
implementing the abstract categories and functions
Example
border : Country -> Country -> Relation
English: border x y = x!Nom ++ "borders" ++ y!Nom
Estonian: border x y = x!Gen ++ "naaber on" ++ y!Nom
GF translations consist of:
• First, parse a string in the original language to a tree (or trees)
in the abstract syntax
• Then, linearize these trees as strings in the target language
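The parse-then-linearize idea can be sketched in a few lines of Python. This is not GF and not its parsing algorithm: the tree type Border, the LEXICON data (including the Estonian forms) and the brute-force generate-and-test parser are all assumptions made for illustration, modelled on the border example above.

```python
from typing import List, NamedTuple


class Border(NamedTuple):
    """Abstract syntax tree for: border : Country -> Country -> Relation."""
    x: str
    y: str


# Illustrative lexicon (an assumption, not taken from the talk): inflected
# forms of each abstract country constant, per language and case.
LEXICON = {
    "eng": {"Portugal": {"Nom": "Portugal"}, "Spain": {"Nom": "Spain"}},
    "est": {
        "Portugal": {"Nom": "Portugal", "Gen": "Portugali"},
        "Spain": {"Nom": "Hispaania", "Gen": "Hispaania"},
    },
}


def linearize(tree: Border, lang: str) -> str:
    """Turn an abstract tree into a string of the given concrete language."""
    lex = LEXICON[lang]
    if lang == "eng":
        return f"{lex[tree.x]['Nom']} borders {lex[tree.y]['Nom']}"
    if lang == "est":
        return f"{lex[tree.x]['Gen']} naaber on {lex[tree.y]['Nom']}"
    raise ValueError(f"unknown language: {lang}")


def parse(text: str, lang: str) -> List[Border]:
    """Return every abstract tree whose linearization equals the input.

    Brute-force generate-and-test, which is only feasible for a toy grammar.
    """
    names = LEXICON[lang]
    return [
        Border(x, y)
        for x in names
        for y in names
        if linearize(Border(x, y), lang) == text
    ]


def translate(text: str, source: str, target: str) -> List[str]:
    """Parse in the source language, then linearize each tree in the target."""
    return [linearize(tree, target) for tree in parse(text, source)]


print(translate("Portugal borders Spain", "eng", "est"))
```

Because translation goes through the shared abstract tree, the same two functions serve every language pair; adding a language means adding one lexicon entry and one linearization rule.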
Multilingual AceWiki: Screenshot
Multilingual AceWiki: Demo
http://attempto.ifi.uzh.ch/acewiki-gf/
ACE-in-GF
• Multiple controlled versions of natural languages that map to
ACE (and to each other)
• As a result, they can be bidirectionally mapped to various formal
languages already supported by ACE
ACE-in-GF: Example
German: Jedes Land, das nicht an ein Meer grenzt, ist ein
Binnenland.
ACE-in-GF tree:
baseText (sText (s (vpS (everyNP (relCN (cn_as_VarCN country_CN)
(neg_predRS which_RP (v2VP border_V2 (thereNP_as_NP
(aNP (cn_as_VarCN sea_CN))))))) (npVP (thereNP_as_NP
(aNP (cn_as_VarCN landlocked_country_CN)))))))
ACE: Every country that does not border a sea is a
landlocked-country.
OWL:
SubClassOf(
  ObjectIntersectionOf(
    :country
    ObjectComplementOf(
      ObjectSomeValuesFrom( :border :sea )
    )
  )
  :landlocked-country
)
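The ACE-in-GF tree in this example is written in plain prefix notation with parentheses, so it can be read into a nested structure with very little code. The sketch below is an illustration only: parse_tree and lexical_items are hypothetical helpers, no input validation is done, and the assumption that content words carry the suffixes _CN and _V2 is inferred from the names in this one tree.

```python
TREE = (
    "baseText (sText (s (vpS (everyNP (relCN (cn_as_VarCN country_CN) "
    "(neg_predRS which_RP (v2VP border_V2 (thereNP_as_NP "
    "(aNP (cn_as_VarCN sea_CN))))))) (npVP (thereNP_as_NP "
    "(aNP (cn_as_VarCN landlocked_country_CN)))))))"
)


def parse_tree(text: str):
    """Parse a parenthesised prefix-notation tree into nested lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_application():
        # Read a function name and its arguments until ")" or end of input.
        nonlocal pos
        node = []
        while pos < len(tokens) and tokens[pos] != ")":
            if tokens[pos] == "(":
                pos += 1  # skip "("
                node.append(read_application())
                pos += 1  # skip the matching ")"
            else:
                node.append(tokens[pos])
                pos += 1
        return node

    return read_application()


def lexical_items(node):
    """Collect content-word constants (suffix _CN or _V2) in tree order."""
    found = []
    for child in node:
        if isinstance(child, list):
            found.extend(lexical_items(child))
        elif child.endswith(("_CN", "_V2")):
            found.append(child)
    return found


print(lexical_items(parse_tree(TREE)))
```

The four constants it prints are the only language-specific vocabulary in the tree; everything else is grammatical structure shared by all languages.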
ACE-in-GF: Implementation
Implementation of the ACE syntax:
• Targeting the subset of ACE that can be mapped to OWL
• Almost 100% coverage at almost 0% ambiguity
Support of most RGL languages:
• Bulgarian, Catalan, Chinese, Danish, Dutch, English, Finnish,
French, German, Greek, Hindi, Italian, Latvian, Norwegian,
Polish, Romanian, Russian, Spanish, Swedish, Thai, Urdu
• RGL-based design provides automatic increase in quality and
language-coverage over time
Status
• Some precision problems, e.g. with anaphoric references
• Ambiguity and coverage problems in some languages
Evaluation of AceWiki-GF
Hypothesis: A group of users reaches almost the same level of agreement
on the content of an article presented to them in different languages as
when the article is presented to all of them in the same language.
Design
• Based on a 500-word lexicon on European geography in three
languages: English, German and Spanish
• 30 participants accessed AceWiki-GF and wrote sentences in
their language (10 participants for each language)
• They had to enter true and false sentences and tag them as such
• In a post-editing task, each participant checked the output of
two other participants: one translated from another language
and one written in the same language (true/false tags were
removed and sentences shuffled); they were asked to remove all
false sentences
Evaluation of AceWiki-GF: Results
30 participants spent on average 37 minutes using AceWiki-GF,
creating 316 sentences in total.
Definition of agreement level: (Tk + Fd) / S
where S is the total number of sentences, Tk the number of sentences marked as
true and kept, and Fd the number marked as false and deleted.
Agreement level (difference is not significant):
• Without translation: 82.2%
• With translation: 84.0%
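The agreement level defined on this slide is a simple ratio. A minimal sketch follows; the function name and the counts in the usage line are made up for illustration and are not data from the study.

```python
def agreement_level(true_kept: int, false_deleted: int, total: int) -> float:
    """Agreement level (Tk + Fd) / S as defined above."""
    if total <= 0:
        raise ValueError("total number of sentences must be positive")
    return (true_kept + false_deleted) / total


# Made-up counts for illustration only (not data from the study).
print(f"{agreement_level(true_kept=6, false_deleted=3, total=10):.1%}")  # 90.0%
```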
Evaluation of AceWiki-GF: Results
Assumption: Translation introduces a constant translation error rate r
that has the effect of reducing the agreement level a to (1 − r) · a.
New hypothesis: The translation error rate is less than 5%.
Agreement level:
• With hypothetical translation (r = 5%): 78.1%
• With translation: 84.0%
p-value with one-tailed Wilcoxon signed-rank test: 0.046
→ With AceWiki-GF, translation error rate is less than 5%
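The 78.1% figure follows from the assumption stated on this slide. Applying (1 − r) · a to the agreement level without translation (82.2%) with r = 5% can be checked as below. The function name is hypothetical, and the Wilcoxon test itself is not reproduced because the per-participant values are not given here.

```python
def reduced_agreement(a: float, r: float) -> float:
    """Agreement level a after a constant translation error rate r."""
    return (1 - r) * a


baseline = 0.822  # agreement level without translation
hypothetical = reduced_agreement(baseline, 0.05)
print(f"{hypothetical:.1%}")  # 78.1%
```

The observed level with translation (84.0%) lies above this hypothetical value, which is the comparison the one-tailed test is applied to.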
Evaluation of AceWiki-GF: Feedback
Questionnaire for the participants contained these questions:
1. Was AceWiki Geography easy or difficult to use in general?
2. Was the sentence editor easy or difficult to use?
3. Was creating true and false statements easy or difficult to perform?
Possible answers: “very difficult” (0), “difficult” (1), “medium” (2),
“easy” (3), and “very easy” (4)
Results:
1. Average: 2.93 (∼“easy”)
2. Average: 2.77 (∼“easy”)
3. Average: 2.70 (∼“easy”)
The Future...?
Can we make a truly multilingual Wikipedia?
• Store main content in a semantic representation
• Verbalization in different languages
• All content is instantly available in all languages (once the
required vocabulary is defined)
• Breaking the current dominance of English and putting an end
to the lock-out of users speaking less widespread or
underrepresented languages
• Contributing to the Semantic Web
Related:
• http://www.wikidata.org
• http://meta.wikimedia.org/wiki/A_proposal_towards_a_multilingual_Wikipedia
Links
ACE parser (APE) source code: https://github.com/Attempto/APE
ACE-in-GF source code: http://github.com/Attempto/ACE-in-GF
AceWiki and AceWiki-GF
• Source code: http://github.com/AceWiki/AceWiki
• Demos (non-GF): http://attempto.ifi.uzh.ch/acewiki/
• Demos (GF): http://attempto.ifi.uzh.ch/acewiki-gf/
MOLTO project web site: http://www.molto-project.eu
Attempto web site: http://attempto.ifi.uzh.ch
GF: http://www.grammaticalframework.org
Thank you for your attention!
If you are interested in Controlled Natural Languages,
come visit us at CNL 2014!
Fourth Workshop on Controlled Natural Language
20–22 August 2014, Galway
http://attempto.ifi.uzh.ch/site/cnl2014/
Tobias Kuhn, ETH Zurich A Multilingual Semantic Wiki 27 / 27

More Related Content

Viewers also liked

Controlled Natural Language and Opportunities for Standardization
Controlled Natural Language and Opportunities for StandardizationControlled Natural Language and Opportunities for Standardization
Controlled Natural Language and Opportunities for StandardizationTobias Kuhn
 
How to Evaluate Controlled Natural Languages
How to Evaluate Controlled Natural LanguagesHow to Evaluate Controlled Natural Languages
How to Evaluate Controlled Natural LanguagesTobias Kuhn
 
Underspecified Scientific Claims in Nanopublications
Underspecified Scientific Claims in NanopublicationsUnderspecified Scientific Claims in Nanopublications
Underspecified Scientific Claims in NanopublicationsTobias Kuhn
 
2013 Report on AngeI Investing Activity in Canada: Accelerating the Asset Class
2013 Report on AngeI Investing Activity in Canada: Accelerating the Asset Class2013 Report on AngeI Investing Activity in Canada: Accelerating the Asset Class
2013 Report on AngeI Investing Activity in Canada: Accelerating the Asset ClassIoana Stoica
 
Data Publishing and Post-Publication Reviews
Data Publishing and Post-Publication ReviewsData Publishing and Post-Publication Reviews
Data Publishing and Post-Publication ReviewsTobias Kuhn
 
Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linke...
Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linke...Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linke...
Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linke...Tobias Kuhn
 
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...Tobias Kuhn
 
AceWiki: Controlled English in a Semantic Wiki
AceWiki: Controlled English in a Semantic WikiAceWiki: Controlled English in a Semantic Wiki
AceWiki: Controlled English in a Semantic WikiTobias Kuhn
 
A Decentralized Network for Publishing Linked Data — Nanopublications, Trusty...
A Decentralized Network for Publishing Linked Data — Nanopublications, Trusty...A Decentralized Network for Publishing Linked Data — Nanopublications, Trusty...
A Decentralized Network for Publishing Linked Data — Nanopublications, Trusty...Tobias Kuhn
 
Broadening the Scope of Nanopublications
Broadening the Scope of NanopublicationsBroadening the Scope of Nanopublications
Broadening the Scope of NanopublicationsTobias Kuhn
 

Viewers also liked (11)

Controlled Natural Language and Opportunities for Standardization
Controlled Natural Language and Opportunities for StandardizationControlled Natural Language and Opportunities for Standardization
Controlled Natural Language and Opportunities for Standardization
 
AceWiki
AceWikiAceWiki
AceWiki
 
How to Evaluate Controlled Natural Languages
How to Evaluate Controlled Natural LanguagesHow to Evaluate Controlled Natural Languages
How to Evaluate Controlled Natural Languages
 
Underspecified Scientific Claims in Nanopublications
Underspecified Scientific Claims in NanopublicationsUnderspecified Scientific Claims in Nanopublications
Underspecified Scientific Claims in Nanopublications
 
2013 Report on AngeI Investing Activity in Canada: Accelerating the Asset Class
2013 Report on AngeI Investing Activity in Canada: Accelerating the Asset Class2013 Report on AngeI Investing Activity in Canada: Accelerating the Asset Class
2013 Report on AngeI Investing Activity in Canada: Accelerating the Asset Class
 
Data Publishing and Post-Publication Reviews
Data Publishing and Post-Publication ReviewsData Publishing and Post-Publication Reviews
Data Publishing and Post-Publication Reviews
 
Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linke...
Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linke...Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linke...
Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linke...
 
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
 
AceWiki: Controlled English in a Semantic Wiki
AceWiki: Controlled English in a Semantic WikiAceWiki: Controlled English in a Semantic Wiki
AceWiki: Controlled English in a Semantic Wiki
 
A Decentralized Network for Publishing Linked Data — Nanopublications, Trusty...
A Decentralized Network for Publishing Linked Data — Nanopublications, Trusty...A Decentralized Network for Publishing Linked Data — Nanopublications, Trusty...
A Decentralized Network for Publishing Linked Data — Nanopublications, Trusty...
 
Broadening the Scope of Nanopublications
Broadening the Scope of NanopublicationsBroadening the Scope of Nanopublications
Broadening the Scope of Nanopublications
 

Similar to A Multilingual Semantic Wiki Based on Controlled Natural Language

How Controlled English can Improve Semantic Wikis
How Controlled English can Improve Semantic WikisHow Controlled English can Improve Semantic Wikis
How Controlled English can Improve Semantic WikisTobias Kuhn
 
Roberto Navigli - From Text to Concepts and Back: Going Multilingual with Bab...
Roberto Navigli - From Text to Concepts and Back: Going Multilingual with Bab...Roberto Navigli - From Text to Concepts and Back: Going Multilingual with Bab...
Roberto Navigli - From Text to Concepts and Back: Going Multilingual with Bab...MeetupDataScienceRoma
 
Towards a Universal Wordnet by Learning from Combined Evidence
Towards a Universal Wordnet by Learning from Combined EvidenceTowards a Universal Wordnet by Learning from Combined Evidence
Towards a Universal Wordnet by Learning from Combined EvidenceGerard de Melo
 
On the Limitations of Unsupervised Bilingual Dictionary Induction
On the Limitations of Unsupervised Bilingual Dictionary InductionOn the Limitations of Unsupervised Bilingual Dictionary Induction
On the Limitations of Unsupervised Bilingual Dictionary InductionSebastian Ruder
 
Multilinguals and Wikipedia Editing
Multilinguals and Wikipedia EditingMultilinguals and Wikipedia Editing
Multilinguals and Wikipedia EditingScott A. Hale
 
What you Can Make Out of Linked Data
What you Can Make Out of Linked DataWhat you Can Make Out of Linked Data
What you Can Make Out of Linked DataMarco Fossati
 
Multilingualism ifla 2014 08
Multilingualism ifla 2014 08Multilingualism ifla 2014 08
Multilingualism ifla 2014 08Janifer Gatenby
 
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...Digital Classicist Seminar Berlin
 
natasha.ppt
natasha.pptnatasha.ppt
natasha.pptMrBrave
 
Language commons wiki_final
Language commons wiki_finalLanguage commons wiki_final
Language commons wiki_finalEd Bice
 
Umd draft-2010 jun22
Umd draft-2010 jun22Umd draft-2010 jun22
Umd draft-2010 jun22Ed Bice
 
Towards Universal Language Understanding (2020 version)
Towards Universal Language Understanding (2020 version)Towards Universal Language Understanding (2020 version)
Towards Universal Language Understanding (2020 version)Yunyao Li
 
Linked Open Data Cloud
Linked Open Data CloudLinked Open Data Cloud
Linked Open Data CloudPretaLLOD
 
Flexible Open Language Education for a MultiLingual World
Flexible Open Language Education for a MultiLingual WorldFlexible Open Language Education for a MultiLingual World
Flexible Open Language Education for a MultiLingual WorldAlannah Fitzgerald
 
Wikipedia as Knowledge Organization System
Wikipedia as Knowledge Organization SystemWikipedia as Knowledge Organization System
Wikipedia as Knowledge Organization SystemJakob .
 
Topic Listener - Observing Key Topics from Multi-Channel Speech Audio Streams...
Topic Listener - Observing Key Topics from Multi-Channel Speech Audio Streams...Topic Listener - Observing Key Topics from Multi-Channel Speech Audio Streams...
Topic Listener - Observing Key Topics from Multi-Channel Speech Audio Streams...Sebastian Ruder
 
FLAX Weaving with Oxford Open Educational Resources: Open Practices for Engli...
FLAX Weaving with Oxford Open Educational Resources: Open Practices for Engli...FLAX Weaving with Oxford Open Educational Resources: Open Practices for Engli...
FLAX Weaving with Oxford Open Educational Resources: Open Practices for Engli...Alannah Fitzgerald
 
Natural Language Inference: for Humans and Machines
Natural Language Inference: for Humans and MachinesNatural Language Inference: for Humans and Machines
Natural Language Inference: for Humans and MachinesValeria de Paiva
 

Similar to A Multilingual Semantic Wiki Based on Controlled Natural Language (20)

How Controlled English can Improve Semantic Wikis
How Controlled English can Improve Semantic WikisHow Controlled English can Improve Semantic Wikis
How Controlled English can Improve Semantic Wikis
 
Roberto Navigli - From Text to Concepts and Back: Going Multilingual with Bab...
Roberto Navigli - From Text to Concepts and Back: Going Multilingual with Bab...Roberto Navigli - From Text to Concepts and Back: Going Multilingual with Bab...
Roberto Navigli - From Text to Concepts and Back: Going Multilingual with Bab...
 
Towards a Universal Wordnet by Learning from Combined Evidence
Towards a Universal Wordnet by Learning from Combined EvidenceTowards a Universal Wordnet by Learning from Combined Evidence
Towards a Universal Wordnet by Learning from Combined Evidence
 
On the Limitations of Unsupervised Bilingual Dictionary Induction
On the Limitations of Unsupervised Bilingual Dictionary InductionOn the Limitations of Unsupervised Bilingual Dictionary Induction
On the Limitations of Unsupervised Bilingual Dictionary Induction
 
Multilinguals and Wikipedia Editing
Multilinguals and Wikipedia EditingMultilinguals and Wikipedia Editing
Multilinguals and Wikipedia Editing
 
What you Can Make Out of Linked Data
What you Can Make Out of Linked DataWhat you Can Make Out of Linked Data
What you Can Make Out of Linked Data
 
Programing Language
Programing LanguagePrograming Language
Programing Language
 
Multilingualism ifla 2014 08
Multilingualism ifla 2014 08Multilingualism ifla 2014 08
Multilingualism ifla 2014 08
 
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
 
natasha.ppt
natasha.pptnatasha.ppt
natasha.ppt
 
Language commons wiki_final
Language commons wiki_finalLanguage commons wiki_final
Language commons wiki_final
 
Umd draft-2010 jun22
Umd draft-2010 jun22Umd draft-2010 jun22
Umd draft-2010 jun22
 
Towards Universal Language Understanding (2020 version)
Towards Universal Language Understanding (2020 version)Towards Universal Language Understanding (2020 version)
Towards Universal Language Understanding (2020 version)
 
Linked Open Data Cloud
Linked Open Data CloudLinked Open Data Cloud
Linked Open Data Cloud
 
Flexible Open Language Education for a MultiLingual World
Flexible Open Language Education for a MultiLingual WorldFlexible Open Language Education for a MultiLingual World
Flexible Open Language Education for a MultiLingual World
 
Wikipedia as Knowledge Organization System
Wikipedia as Knowledge Organization SystemWikipedia as Knowledge Organization System
Wikipedia as Knowledge Organization System
 
List of wikipedias
List of wikipediasList of wikipedias
List of wikipedias
 
Topic Listener - Observing Key Topics from Multi-Channel Speech Audio Streams...
Topic Listener - Observing Key Topics from Multi-Channel Speech Audio Streams...Topic Listener - Observing Key Topics from Multi-Channel Speech Audio Streams...
Topic Listener - Observing Key Topics from Multi-Channel Speech Audio Streams...
 
FLAX Weaving with Oxford Open Educational Resources: Open Practices for Engli...
FLAX Weaving with Oxford Open Educational Resources: Open Practices for Engli...FLAX Weaving with Oxford Open Educational Resources: Open Practices for Engli...
FLAX Weaving with Oxford Open Educational Resources: Open Practices for Engli...
 
Natural Language Inference: for Humans and Machines
Natural Language Inference: for Humans and MachinesNatural Language Inference: for Humans and Machines
Natural Language Inference: for Humans and Machines
 

More from Tobias Kuhn

Nanopublications and Decentralized Publishing
Nanopublications and Decentralized PublishingNanopublications and Decentralized Publishing
Nanopublications and Decentralized PublishingTobias Kuhn
 
Linked Data Publishing with Nanopublications
Linked Data Publishing with NanopublicationsLinked Data Publishing with Nanopublications
Linked Data Publishing with NanopublicationsTobias Kuhn
 
Genuine semantic publishing
Genuine semantic publishingGenuine semantic publishing
Genuine semantic publishingTobias Kuhn
 
The Controlled Natural Language of Randall Munroe’s Thing Explainer
The Controlled Natural Language of Randall Munroe’s Thing Explainer The Controlled Natural Language of Randall Munroe’s Thing Explainer
The Controlled Natural Language of Randall Munroe’s Thing Explainer Tobias Kuhn
 
Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...
Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...
Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...Tobias Kuhn
 
Semantic Publishing and Nanopublications
Semantic Publishing and NanopublicationsSemantic Publishing and Nanopublications
Semantic Publishing and NanopublicationsTobias Kuhn
 
Scientific Data Publishing
Scientific Data PublishingScientific Data Publishing
Scientific Data PublishingTobias Kuhn
 
Science Bots: A Model for the Future of Scientific Computation?
Science Bots: A Model for the Future of Scientific Computation?Science Bots: A Model for the Future of Scientific Computation?
Science Bots: A Model for the Future of Scientific Computation?Tobias Kuhn
 
Semantic Publishing with Nanopublications
Semantic Publishing with Nanopublications Semantic Publishing with Nanopublications
Semantic Publishing with Nanopublications Tobias Kuhn
 
Citation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific LiteratureCitation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific LiteratureTobias Kuhn
 
Citation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific LiteratureCitation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific LiteratureTobias Kuhn
 
Automatische Übersetzung in einem multilingualen, semantischen Wiki
Automatische Übersetzung in einem multilingualen, semantischen WikiAutomatische Übersetzung in einem multilingualen, semantischen Wiki
Automatische Übersetzung in einem multilingualen, semantischen WikiTobias Kuhn
 
Improving Text Mining with Controlled Natural Language: A Case Study for Prot...
Improving Text Mining with Controlled Natural Language: A Case Study for Prot...Improving Text Mining with Controlled Natural Language: A Case Study for Prot...
Improving Text Mining with Controlled Natural Language: A Case Study for Prot...Tobias Kuhn
 
AceRules: Executing Rules in Controlled Natural Language
AceRules: Executing Rules in Controlled Natural LanguageAceRules: Executing Rules in Controlled Natural Language
AceRules: Executing Rules in Controlled Natural LanguageTobias Kuhn
 
Wissensrepräsentation in kontrolliertem Englisch
Wissensrepräsentation in kontrolliertem EnglischWissensrepräsentation in kontrolliertem Englisch
Wissensrepräsentation in kontrolliertem EnglischTobias Kuhn
 
Codeco: A Grammar Notation for Controlled Natural Language in Predictive Editors
Codeco: A Grammar Notation for Controlled Natural Language in Predictive EditorsCodeco: A Grammar Notation for Controlled Natural Language in Predictive Editors
Codeco: A Grammar Notation for Controlled Natural Language in Predictive EditorsTobias Kuhn
 

More from Tobias Kuhn (17)

Nanopublications and Decentralized Publishing
Nanopublications and Decentralized PublishingNanopublications and Decentralized Publishing
Nanopublications and Decentralized Publishing
 
Linked Data Publishing with Nanopublications
Linked Data Publishing with NanopublicationsLinked Data Publishing with Nanopublications
Linked Data Publishing with Nanopublications
 
Genuine semantic publishing
Genuine semantic publishingGenuine semantic publishing
Genuine semantic publishing
 
The Controlled Natural Language of Randall Munroe’s Thing Explainer
The Controlled Natural Language of Randall Munroe’s Thing Explainer The Controlled Natural Language of Randall Munroe’s Thing Explainer
The Controlled Natural Language of Randall Munroe’s Thing Explainer
 
Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...
Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...
Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...
 
Semantic Publishing and Nanopublications
Semantic Publishing and NanopublicationsSemantic Publishing and Nanopublications
Semantic Publishing and Nanopublications
 
Scientific Data Publishing
Scientific Data PublishingScientific Data Publishing
Scientific Data Publishing
 
Science Bots: A Model for the Future of Scientific Computation?
Science Bots: A Model for the Future of Scientific Computation?Science Bots: A Model for the Future of Scientific Computation?
Science Bots: A Model for the Future of Scientific Computation?
 
Semantic Publishing with Nanopublications
Semantic Publishing with Nanopublications Semantic Publishing with Nanopublications
Semantic Publishing with Nanopublications
 
Nanopubs
NanopubsNanopubs
Nanopubs
 
Citation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific LiteratureCitation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific Literature
 
Citation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific LiteratureCitation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific Literature
 
Automatische Übersetzung in einem multilingualen, semantischen Wiki
Automatische Übersetzung in einem multilingualen, semantischen WikiAutomatische Übersetzung in einem multilingualen, semantischen Wiki
Automatische Übersetzung in einem multilingualen, semantischen Wiki
 
Improving Text Mining with Controlled Natural Language: A Case Study for Prot...
Improving Text Mining with Controlled Natural Language: A Case Study for Prot...Improving Text Mining with Controlled Natural Language: A Case Study for Prot...
Improving Text Mining with Controlled Natural Language: A Case Study for Prot...
 
AceRules: Executing Rules in Controlled Natural Language
AceRules: Executing Rules in Controlled Natural LanguageAceRules: Executing Rules in Controlled Natural Language
AceRules: Executing Rules in Controlled Natural Language
 
Wissensrepräsentation in kontrolliertem Englisch
Wissensrepräsentation in kontrolliertem EnglischWissensrepräsentation in kontrolliertem Englisch
Wissensrepräsentation in kontrolliertem Englisch
 
Codeco: A Grammar Notation for Controlled Natural Language in Predictive Editors
Codeco: A Grammar Notation for Controlled Natural Language in Predictive EditorsCodeco: A Grammar Notation for Controlled Natural Language in Predictive Editors
Codeco: A Grammar Notation for Controlled Natural Language in Predictive Editors
 


A Multilingual Semantic Wiki Based on Controlled Natural Language

  • 1. A Multilingual Semantic Wiki Based on Controlled Natural Language
      Tobias Kuhn
      Chair of Sociology, in particular of Modeling and Simulation, ETH Zurich, Switzerland
      Insight, National University of Ireland, Galway
      19 August 2014
  • 2. About This Talk
      This talk is mainly based on the following papers:
      Kaarel Kaljurand and Tobias Kuhn. A Multilingual Semantic Wiki Based on Attempto Controlled English and Grammatical Framework. In Proceedings of the 10th Extended Semantic Web Conference (ESWC). 2013. http://purl.org/tkuhn/eswc2013acewikigf
      Kaarel Kaljurand, Tobias Kuhn, and Laura Canedo. Collaborative multilingual knowledge management based on controlled natural language. Semantic Web. Accepted, to appear. http://www.semantic-web-journal.net/system/files/swj524.pdf
  • 3. Imagine ...
      ... that Wikipedia can check consistency and answer questions about the contained knowledge, and
      ... that all content is instantly available in all languages!
  • 4.
      • AceWiki is a semantic wiki
      • Articles are written in Attempto Controlled English (ACE)
      • These sentences are internally translated into the Semantic Web language OWL
      • An OWL reasoner is built in to answer questions and detect inconsistencies
      • Special editor for writing ACE statements
      • Extended to support multilinguality
  • 5. Monolingual AceWiki: Screenshot
  • 6. Attempto Controlled English (ACE)
      Subset of natural English:
      • Conjunction, disjunction, negation, if-then, ...
      • Anaphoric references: pronouns, definite noun phrases, variables
      • Quantifiers: every, no, at least 3, ...
      • Content words: proper names, nouns, verbs, adjectives, ...
      Grammar is fixed, but users can change content words.
      Deterministic ambiguity handling:
      • Anaphora resolution (France borders Spain and it borders Portugal.)
      • Quantifier scope (Every country borders a country.)
      • Attachment (Every EU-country borders a country that is an EU-country and is a NATO-country.)
      Well-defined translations to and from first-order logic, OWL, ...
  • 7. Predictive Editor
  • 8. Consistency Checking
      AceWiki ensures consistency by checking every new statement:
  • 9. Question Answering
      AceWiki supports simple wh-questions:
  • 10. Monolingual AceWiki: Demo
      http://attempto.ifi.uzh.ch/acewiki/
  • 11. ACE Reasoning via Translation to OWL
      Every country that does not border a sea is a landlocked-country.
        SubClassOf(
          ObjectIntersectionOf(
            :country
            ObjectComplementOf( ObjectSomeValuesFrom( :border :sea ) )
          )
          :landlocked-country
        )
      Which country is a landlocked-country?
        ObjectIntersectionOf( :country :landlocked-country )
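The structure of the OWL expressions on slide 11 can be made concrete with a small Python sketch. It only assembles OWL functional-syntax strings by hand; the helper names (`some_values_from`, `complement_of`, `intersection_of`, `sub_class_of`) are hypothetical and are not part of AceWiki or the ACE parser, which derive these expressions from the ACE sentence automatically.

```python
# Hypothetical helpers: they only assemble OWL functional-syntax strings
# and do not parse ACE. They are assumptions made for this illustration.

def some_values_from(prop: str, cls: str) -> str:
    """Existential restriction: things that have `prop` to some `cls`."""
    return f"ObjectSomeValuesFrom( {prop} {cls} )"

def complement_of(cls: str) -> str:
    """Class negation."""
    return f"ObjectComplementOf( {cls} )"

def intersection_of(*classes: str) -> str:
    """Class conjunction."""
    return f"ObjectIntersectionOf( {' '.join(classes)} )"

def sub_class_of(sub: str, sup: str) -> str:
    """Subclass axiom: every `sub` is a `sup`."""
    return f"SubClassOf( {sub} {sup} )"

# "Every country that does not border a sea is a landlocked-country."
not_bordering_sea = complement_of(some_values_from(":border", ":sea"))
axiom = sub_class_of(
    intersection_of(":country", not_bordering_sea),
    ":landlocked-country",
)

# "Which country is a landlocked-country?" becomes a class expression
# whose instances a reasoner can retrieve.
query = intersection_of(":country", ":landlocked-country")

print(axiom)
print(query)
```

The relative clause "that does not border a sea" maps to the complement of an existential restriction, and the question maps to a class expression rather than an axiom.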
  • 12. Evaluation
      Two small usability experiments with earlier versions of AceWiki:
      • Altogether 26 untrained participants
      • Task: Collaborative creation of a knowledge base
      Results:
      • 78%-81% of the sentences were correct and sensible
      • 61%-70% of them were complex (containing negations, implications, disjunctions or number restrictions)
      • Creation of a correct sentence every 5–6 minutes
      • Definition of a new word every 5–7 minutes
      → Even untrained users can effectively use AceWiki
  • 13. Multilingual AceWiki: AceWiki-GF
      General ideas:
      • Make wiki content available in different languages
      • Automatically translated content using high-quality rule-based machine translation: Grammatical Framework (GF)
      • Language switching like in Wikipedia
      • Localization of the user interface
  • 14. Grammatical Framework (GF)
      GF is a framework for multilingual grammar engineering:
      • Rule-based
      • Functional programming language (based on Haskell) optimized to handle natural language
      • Resource Grammar Library implementing common morphological and syntactic structures
      • Mildly context sensitive
      • Bidirectional translations: concrete language ⇔ abstract syntax
  • 15. GF grammars and translations
      GF grammars consist of:
      • One language-neutral abstract syntax
      • Concrete syntaxes specify words, agreement, word order, etc. by implementing the abstract categories and functions
      Example
        border : Country -> Country -> Relation
        English: border x y = x!Nom + "borders" + y!Nom
        Estonian: border x y = x!Gen + "naaber on" + y!Nom
      GF translations consist of:
      • First, parse a string in the original language to a tree (or trees) in the abstract syntax
      • Then, linearize these trees as strings in the target language
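The parse-then-linearize idea from slide 15 can be imitated in a few lines of Python. This is a toy sketch and not GF itself: the `Border` class stands in for the abstract function `border`, the two-country lexicon and its Estonian case forms are assumptions made for illustration, and parsing is done by brute-force matching against linearizations rather than by a real grammar-driven parser.

```python
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class Border:
    """Abstract syntax tree for: border : Country -> Country -> Relation."""
    x: str
    y: str

# Toy lexicon (an assumption for this sketch): abstract country name
# mapped to its case forms in each concrete language.
LEXICON = {
    "Eng": {
        "Spain": {"Nom": "Spain"},
        "Portugal": {"Nom": "Portugal"},
    },
    "Est": {
        "Spain": {"Nom": "Hispaania", "Gen": "Hispaania"},
        "Portugal": {"Nom": "Portugal", "Gen": "Portugali"},
    },
}

def linearize(tree: Border, lang: str) -> str:
    """Turn an abstract tree into a string of the given language."""
    forms = LEXICON[lang]
    if lang == "Eng":
        # border x y = x!Nom + "borders" + y!Nom
        return f"{forms[tree.x]['Nom']} borders {forms[tree.y]['Nom']}"
    if lang == "Est":
        # border x y = x!Gen + "naaber on" + y!Nom
        return f"{forms[tree.x]['Gen']} naaber on {forms[tree.y]['Nom']}"
    raise ValueError(f"no concrete syntax for {lang}")

def parse(sentence: str, lang: str) -> list[Border]:
    """Return every abstract tree whose linearization equals the sentence."""
    countries = LEXICON[lang]
    return [
        Border(x, y)
        for x, y in product(countries, repeat=2)
        if linearize(Border(x, y), lang) == sentence
    ]

def translate(sentence: str, source: str, target: str) -> list[str]:
    """Parse in the source language, then linearize in the target language."""
    return [linearize(tree, target) for tree in parse(sentence, source)]

print(translate("Spain borders Portugal", "Eng", "Est"))
print(translate("Portugali naaber on Hispaania", "Est", "Eng"))
```

Because both directions go through the same abstract tree, adding a language only requires a new concrete syntax, which is the property AceWiki-GF relies on.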
  • 16. Multilingual AceWiki: Screenshot
  • 17. Multilingual AceWiki: Demo
      http://attempto.ifi.uzh.ch/acewiki-gf/
  • 18. ACE-in-GF
      • Multiple controlled versions of natural languages that map to ACE (and to each other)
      • As a result, they can be bidirectionally mapped to various formal languages already supported by ACE
  • 19. ACE-in-GF: Example
      German: Jedes Land, das nicht an ein Meer grenzt, ist ein Binnenland.
      ACE-in-GF tree: baseText (sText (s (vpS (everyNP (relCN (cn_as_VarCN country_CN) (neg_predRS which_RP (v2VP border_V2 (thereNP_as_NP (aNP (cn_as_VarCN sea_CN))))))) (npVP (thereNP_as_NP (aNP (cn_as_VarCN landlocked_country_CN)))))))
      ACE: Every country that does not border a sea is a landlocked-country.
      OWL: SubClassOf( ObjectIntersectionOf( :country ObjectComplementOf( ObjectSomeValuesFrom( :border :sea ) ) ) :landlocked-country )
  • 20. ACE-in-GF: Implementation
      Implementation of the ACE syntax:
      • Targeting the subset of ACE that can be mapped to OWL
      • Almost 100% coverage at almost 0% ambiguity
      Support of most RGL languages:
      • Bulgarian, Catalan, Chinese, Danish, Dutch, English, Finnish, French, German, Greek, Hindi, Italian, Latvian, Norwegian, Polish, Romanian, Russian, Spanish, Swedish, Thai, Urdu
      • RGL-based design provides automatic increase in quality and language-coverage over time
      Status
      • Some precision problems, e.g. with anaphoric references
      • Ambiguity and coverage problems in some languages
  • 21. Evaluation of AceWiki-GF
      Hypothesis: A group of users reaches almost the same level of agreement on the content of an article presented to them in different languages as when the article is presented to all of them in the same language.
      Design
      • Based on a 500-word lexicon on European geography in three languages: English, German and Spanish
      • 30 participants accessed AceWiki-GF and wrote sentences in their language (10 participants for each language)
      • They had to enter true and false sentences and tag them as such
      • In a post-editing task, each participant checked the output of two other participants: one translated from another language and one written in the same language (true/false tags were removed and sentences shuffled); they were asked to remove all false sentences
  • 22. Evaluation of AceWiki-GF: Results
      30 participants spent on average 37 minutes using AceWiki-GF, creating 316 sentences in total.
      Definition of agreement level: $(T_k + F_d) / S$
      $S$ is the total number of sentences, $T_k$ the number of sentences marked as true and kept, and $F_d$ the number marked as false and deleted.
      Agreement level (difference is not significant):
      • Without translation: 82.2%
      • With translation: 84.0%
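The agreement level defined on slide 22 is a simple ratio, sketched here as a Python function. The function name and the counts in the example call are hypothetical and chosen only to show the arithmetic; they are not data from the study.

```python
def agreement_level(true_kept: int, false_deleted: int, total: int) -> float:
    """Agreement level (T_k + F_d) / S.

    true_kept     -- T_k: sentences tagged true by the author and kept by the checker
    false_deleted -- F_d: sentences tagged false by the author and deleted by the checker
    total         -- S: total number of sentences checked
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if true_kept < 0 or false_deleted < 0 or true_kept + false_deleted > total:
        raise ValueError("counts are inconsistent with the total")
    return (true_kept + false_deleted) / total

# Hypothetical counts: 6 true sentences kept and 3 false sentences deleted
# out of 10 sentences checked.
example = agreement_level(true_kept=6, false_deleted=3, total=10)
print(f"{example:.1%}")
```

The remaining sentences, true ones that were deleted and false ones that were kept, are the disagreements between author and checker.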
  • 23. Evaluation of AceWiki-GF: Results
      Assumption: Translation introduces a constant translation error rate $r$ that has the effect of reducing the agreement level $a$ to $(1 - r) \cdot a$.
      New hypothesis: The translation error rate is less than 5%.
      Agreement level:
      • With hypothetical translation (r = 5%): 78.1%
      • With translation: 84.0%
      p-value with one-tailed Wilcoxon signed-rank test: 0.046
      → With AceWiki-GF, translation error rate is less than 5%
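The hypothetical figure on slide 23 follows directly from the assumed model. A minimal sketch, with a function name that is an assumption of this example, reproduces it from the untranslated agreement level of 82.2% and r = 5%. It covers only this arithmetic; the significance test on the slide needs the per-participant data, which the slides do not include.

```python
def hypothetical_agreement(agreement: float, error_rate: float) -> float:
    """Agreement level after applying a constant translation error rate.

    Implements (1 - r) * a, where a is the agreement level without
    translation and r is the assumed translation error rate.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    return (1.0 - error_rate) * agreement

# Values from the slides: a = 82.2 (percent), r = 0.05.
reduced = hypothetical_agreement(82.2, 0.05)
print(round(reduced, 1))  # 0.95 * 82.2 = 78.09, i.e. 78.1 after rounding
```

The observed agreement with translation (84.0%) lies above this hypothetical 78.1%, which is the comparison the one-tailed test on the slide evaluates.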
  • 24. Evaluation of AceWiki-GF: Feedback
      Questionnaire for the participants contained these questions:
      1. Was AceWiki Geography easy or difficult to use in general?
      2. Was the sentence editor easy or difficult to use?
      3. Was creating true and false statements easy or difficult to perform?
      Possible answers: “very difficult” (0), “difficult” (1), “medium” (2), “easy” (3), and “very easy” (4)
      Results:
      1. Average: 2.93 (∼“easy”)
      2. Average: 2.77 (∼“easy”)
      3. Average: 2.70 (∼“easy”)
  • 25. The Future...?
      Can we make a truly multilingual Wikipedia?
      • Store main content in a semantic representation
      • Verbalization in different languages
      • All content is instantly available in all languages (once the required vocabulary is defined)
      • Breaking the current dominance of English and putting an end to the lock-out of users speaking less widespread or underrepresented languages
      • Contributing to the Semantic Web
      Related:
      • http://www.wikidata.org
      • http://meta.wikimedia.org/wiki/A_proposal_towards_a_multilingual_Wikipedia
  • 26. Links
      ACE parser (APE) source code: https://github.com/Attempto/APE
      ACE-in-GF source code: http://github.com/Attempto/ACE-in-GF
      AceWiki and AceWikiGF
      • Source code: http://github.com/AceWiki/AceWiki
      • Demos (non-GF): http://attempto.ifi.uzh.ch/acewiki/
      • Demos (GF): http://attempto.ifi.uzh.ch/acewiki-gf/
      MOLTO project web site: http://www.molto-project.eu
      Attempto web site: http://attempto.ifi.uzh.ch
      GF: http://www.grammaticalframework.org
  • 27. Thank you for your Attention!
      If you are interested in Controlled Natural Languages, come visit us at CNL 2014!
      CNL 2014: Fourth Workshop on Controlled Natural Language, 20–22 August 2014, Galway
      http://attempto.ifi.uzh.ch/site/cnl2014/