0.88
Same language: 0.86
Other language: 0.84
• No statistically significant difference between same and other
language
→ Multilingual presentation does not negatively impact
agreement on content
Tobias Kuhn, ETH Zurich
A Multilingual Semantic Wiki
28 / 31
Conclusions
• AceWiki is a semantic wiki that allows untrained users to
collaboratively build and maintain a formal knowledge base
• ACE provides a controlled language interface to formal languages
like OWL
• AceWiki-GF extends AceWiki to multilinguality using GF
• ACE-in-GF provides a multilingual grammar for ACE in GF
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammatical Framework
1. A Multilingual Semantic Wiki based on Attempto Controlled English and Grammatical Framework
Tobias Kuhn
Chair of Sociology, in particular of Modeling and Simulation, ETH Zurich,
Switzerland
UNISA, Pretoria (South Africa)
5 June 2013
2. About This Talk
This talk is mainly based on the following paper:
Kaarel Kaljurand and Tobias Kuhn. A Multilingual Semantic Wiki
Based on Attempto Controlled English and Grammatical Framework.
In Proceedings of the 10th Extended Semantic Web Conference
(ESWC). 2013.
It can be downloaded here:
http://purl.org/tkuhn/eswc2013acewikigf
3. Imagine ...
... that Wikipedia can check consistency and
answer questions about the contained
knowledge, and
... that all content is instantly available in all
languages!
4. AceWiki
• AceWiki is a semantic wiki
• Articles are written in Attempto Controlled English (ACE)
• These sentences are internally translated into the Semantic Web
language OWL
• An OWL reasoner is built in to answer questions and detect
inconsistencies
• Special editor for writing ACE statements
• Has been extended to support multilinguality
6. Attempto Controlled English (ACE)
Subset of natural English:
• Conjunction, disjunction, negation, if-then, ...
• Anaphoric references: pronouns, definite noun phrases, variables
• Quantifiers: every, no, at least 3, ...
• Content words: proper names, nouns, verbs, adjectives, ...
Grammar is fixed, but users can change content words.
Deterministic ambiguity handling:
• Anaphora resolution (France borders Spain and it borders
Portugal.)
• Quantifier scope (Every country borders a country.)
• Attachment (Every EU-country borders a country that is a
EU-country and is a NATO-country.)
Well-defined translations to and from first-order logic, OWL, ...
8. Consistency Checking
Consistency of the knowledge base is very important, because it is a
prerequisite for all other reasoning tasks.
AceWiki ensures consistency by checking every new statement.
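The idea of checking a statement before accepting it can be sketched in a few lines of Python. This is only a toy under loud assumptions: the class `ToyKnowledgeBase`, its methods, and the restriction to class assertions and disjointness axioms are all hypothetical, and a real system such as AceWiki would delegate the check to an OWL reasoner instead.

```python
def is_consistent(types, disjoint):
    """Toy check: no individual may belong to two classes declared disjoint."""
    for ind, cls in types:
        for other_ind, other_cls in types:
            if ind == other_ind and frozenset((cls, other_cls)) in disjoint:
                return False
    return True


class ToyKnowledgeBase:
    """Hypothetical stand-in for a reasoner-backed knowledge base."""

    def __init__(self):
        self.types = set()     # (individual, class) assertions
        self.disjoint = set()  # frozenset({class_a, class_b}) axioms

    def _try(self, types, disjoint):
        # Accept the candidate state only if it is still consistent.
        if not is_consistent(types, disjoint):
            return False  # statement rejected, knowledge base unchanged
        self.types, self.disjoint = types, disjoint
        return True

    def add_type(self, individual, cls):
        return self._try(self.types | {(individual, cls)}, self.disjoint)

    def add_disjoint(self, cls_a, cls_b):
        return self._try(self.types, self.disjoint | {frozenset((cls_a, cls_b))})


kb = ToyKnowledgeBase()
kb.add_disjoint("country", "sea")
kb.add_type("Switzerland", "country")
accepted = kb.add_type("Switzerland", "sea")  # contradicts the disjointness axiom
```

The design point is that the candidate statement is tested against a copy of the current state, so a rejected statement never leaves the knowledge base inconsistent.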
11. ACE Reasoning via Translation to OWL
Every country that does not border a sea is a landlocked-country.
SubClassOf(
  ObjectIntersectionOf(
    :country
    ObjectComplementOf(
      ObjectSomeValuesFrom(
        :border
        :sea
      )
    )
  )
  :landlocked-country
)
Which country is a landlocked-country?
ObjectIntersectionOf(
  :country
  :landlocked-country
)
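To make the mapping concrete, here is a minimal Python sketch that produces the two OWL expressions above (in single-line form) from the two ACE sentences. It is an assumption-laden illustration, not AceWiki's translator: it recognizes exactly these two sentence patterns with regular expressions, and all function names are hypothetical.

```python
import re


def some_values_from(prop, cls):
    return f"ObjectSomeValuesFrom(:{prop} :{cls})"


def complement_of(expr):
    return f"ObjectComplementOf({expr})"


def intersection_of(*exprs):
    return "ObjectIntersectionOf(" + " ".join(exprs) + ")"


def sub_class_of(sub, sup):
    return f"SubClassOf({sub} {sup})"


# Hypothetical patterns covering only the two sentences on the slide.
STATEMENT = re.compile(r"Every (\S+) that does not (\S+) an? (\S+) is an? (\S+)\.")
QUESTION = re.compile(r"Which (\S+) is an? (\S+)\?")


def translate(sentence):
    """Return OWL functional syntax for a supported sentence, else None."""
    m = STATEMENT.fullmatch(sentence)
    if m:
        noun, verb, obj, target = m.groups()
        restricted = intersection_of(
            f":{noun}", complement_of(some_values_from(verb, obj))
        )
        return sub_class_of(restricted, f":{target}")
    m = QUESTION.fullmatch(sentence)
    if m:
        noun, target = m.groups()
        return intersection_of(f":{noun}", f":{target}")
    return None  # sentence is outside this toy fragment


axiom = translate("Every country that does not border a sea is a landlocked-country.")
query = translate("Which country is a landlocked-country?")
```

A question becomes a class expression rather than an axiom: a reasoner answers it by listing the individuals that belong to that class.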
12. Expressiveness versus Efficiency
Trade-off: Expressiveness/Complexity ⇔ Decidability/Efficiency
• First-order logic: expressive, undecidable, very inefficient
• Description Logics, OWL: less expressive, decidable, inefficient
• OWL Profiles: even less expressive, decidable, efficient
AceWiki can use full OWL or an OWL profile for reasoning.
Sentences that are too complex for the chosen language or profile get a
red triangle and are ignored for reasoning.
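One way to picture this filtering is a check of which constructors a translated sentence uses. The sketch below is an assumption about the general approach, not AceWiki's code, and the allow-list `ALLOWED` is a hypothetical stand-in rather than the definition of any actual OWL profile.

```python
# Hypothetical allow-list of class constructors (not a real OWL profile).
ALLOWED = {"and", "some"}


def constructors(expr):
    """Collect the constructors used in a nested class expression.

    Class and role names are plain strings; compound expressions are
    tuples whose first element names the constructor.
    """
    if isinstance(expr, str):
        return set()
    used = {expr[0]}
    for part in expr[1:]:
        used |= constructors(part)
    return used


def usable_for_reasoning(expr, allowed=ALLOWED):
    """True if the expression stays within the allowed constructors."""
    return constructors(expr) <= allowed


# Left-hand side of the landlocked-country axiom: it uses negation.
with_negation = ("and", "country", ("not", ("some", "border", "sea")))
without_negation = ("and", "country", ("some", "border", "sea"))

flagged = not usable_for_reasoning(with_negation)   # would get the red triangle
kept = usable_for_reasoning(without_negation)       # would be used for reasoning
```

A sentence whose expression falls outside the allow-list stays in the wiki as text but is skipped by the reasoner.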
13. Evaluation
Two small usability experiments with earlier versions of AceWiki:
• Altogether 26 untrained participants
• Task: Collaborative creation of a knowledge base
Results:
• 78%–81% of the sentences were correct and sensible
• 61%–70% of them were complex (containing negations,
implications, disjunctions or number restrictions)
• Creation of a correct sentence every 5–6 minutes
• Definition of a new word every 5–7 minutes
→ Even untrained users can effectively use AceWiki
14. Multilingual AceWiki: AceWiki-GF
General ideas:
• Make wiki content available in different languages
• Translate content automatically using rule-based machine
translation with Grammatical Framework (GF)
• Language switching as in Wikipedia
• Localization of the user interface
15. Grammatical Framework (GF)
GF is a framework for multilingual grammar engineering:
• Rule-based (i.e. not statistical)
• Functional programming language optimized to handle natural
language
• Resource Grammar Library implementing common morphological
and syntactic structures
• Mildly context sensitive
• Bidirectional translations: concrete language ⇔ abstract syntax
16. GF Grammars and Translations
GF grammars consist of:
• One language-neutral abstract syntax
• Multiple concrete syntaxes that implement the given abstract
categories, specifying words, word order, agreement, etc.
Example
border : Country -> Country -> Relation
English: border x y = x!Nom + "borders" + y!Nom
Estonian: border x y = x!Gen + "naaber on" + y!Nom
GF translation works in two steps:
• First, parse a string in the source language into a tree (or trees)
in the abstract syntax
• Then, linearize these trees as strings in the target language
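The parse-then-linearize pipeline can be imitated in a few lines of Python. This is not GF and does not use GF's parsing algorithm: parsing is done here by brute-force generate-and-test over a tiny lexicon. The lexicon entries, including the Estonian case forms, and all function names are assumptions for illustration; only the two linearization rules for `border` follow the example above.

```python
from itertools import product

# Hypothetical mini-lexicon: case forms per country and language.
LEXICON = {
    "Germany": {"Eng": {"Nom": "Germany"},
                "Est": {"Nom": "Saksamaa", "Gen": "Saksamaa"}},
    "France": {"Eng": {"Nom": "France"},
               "Est": {"Nom": "Prantsusmaa", "Gen": "Prantsusmaa"}},
}


def linearize(tree, lang):
    """Turn an abstract tree ("border", x, y) into a string in lang."""
    fun, x, y = tree
    if fun != "border":
        raise ValueError(f"unknown function: {fun!r}")
    forms_x, forms_y = LEXICON[x][lang], LEXICON[y][lang]
    if lang == "Eng":
        return f'{forms_x["Nom"]} borders {forms_y["Nom"]}'
    if lang == "Est":
        return f'{forms_x["Gen"]} naaber on {forms_y["Nom"]}'
    raise ValueError(f"unknown language: {lang!r}")


def parse(sentence, lang):
    """Return all abstract trees whose linearization in lang is the sentence."""
    return [("border", x, y)
            for x, y in product(LEXICON, repeat=2)
            if linearize(("border", x, y), lang) == sentence]


def translate(sentence, source, target):
    """Parse in the source language, then linearize every tree in the target."""
    return [linearize(tree, target) for tree in parse(sentence, source)]


to_estonian = translate("Germany borders France", "Eng", "Est")
to_english = translate("Saksamaa naaber on Prantsusmaa", "Est", "Eng")
```

Because both directions go through the same abstract tree, adding a language only requires a new linearization rule and lexicon forms, not a new translator for every language pair.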
18. GF Resource Grammar Library (RGL)
• Morphology and syntax for ∼30 languages via language-neutral
API
• Developers do not need detailed knowledge of the languages
that they want to support in their application
19. Implementation of AceWiki-GF
Integration of ACE with GF (ACE-in-GF):
• Implemented a multilingual grammar of ACE in the GF
framework
• Covered the languages supported by the GF Resource Grammar Library
• Not fine-tuned to any particular language (apart from ACE)
Integration of AceWiki with GF (AceWiki-GF):
• Implemented connection to GF tools (GF Webservice / Cloud
Service)
• Added support for managing multilinguality, ambiguity, and the
grammar
21. ACE-in-GF
An ACE grammar implemented in GF adds multiple natural languages
as front-ends to ACE. As a result, these languages can be mapped to
and from various formal languages already supported by ACE.
22. ACE-in-GF: Example
German: Jedes Land, das nicht an ein Meer grenzt, ist ein
Binnenland.
ACE-in-GF tree:
baseText (sText (s (vpS (everyNP (relCN (cn_as_VarCN country_CN)
(neg_predRS which_RP (v2VP border_V2 (thereNP_as_NP
(aNP (cn_as_VarCN sea_CN))))))) (npVP (thereNP_as_NP
(aNP (cn_as_VarCN landlocked_country_CN)))))))
ACE: Every country that does not border a sea is a
landlocked-country.
OWL:
SubClassOf(
  ObjectIntersectionOf(
    :country
    ObjectComplementOf(
      ObjectSomeValuesFrom( :border :sea )
    )
  )
  :landlocked-country
)
23. ACE-in-GF: Implementation
Implementation of the ACE syntax:
• Extension of Angelov and Ranta (CNL 2009)
• Focus on the subset of ACE that can be mapped to OWL
• Almost 100% coverage at almost 0% ambiguity
Supports most RGL languages:
• Bulgarian, Catalan, Chinese, Danish, Dutch, English, Finnish,
French, German, Greek, Hindi, Italian, Latvian, Norwegian,
Polish, Romanian, Russian, Spanish, Swedish, Thai, Urdu
• RGL-based design provides automatic increase in quality and
language-coverage over time
Status
• Some precision problems, e.g. with anaphoric references
• Ambiguity and coverage problems in some languages
25. Evaluation of ACE-in-GF
Design
• Generated ∼100 ACE sentences/questions and automatically
translated them to all the languages
• Full coverage of all the grammar functions
• Large coverage of OWL axiom structures (subclass, range,
domain, transitivity, ...)
• Measured translation accuracy from ACE to other languages
• Used Google Translate as the baseline
• 20 human evaluators (2 per language) as the gold standard
Results
• Participants preferred ACE-in-GF translations to Google
translations and post-edited them less
• Many edits were stylistic
27. Evaluation of AceWiki-GF
Hypothesis: A group of users reaches almost the same level of agreement
on the content of an article presented to them in different languages as
when the article is presented to all of them in the same language.
Design
• Based on a 500-word lexicon on European geography in three
languages: English, German and Spanish
• 30 participants accessed AceWiki-GF and wrote sentences in
their language (10 participants for each language)
• They had to enter true and false sentences and tag them as such
• In a post-editing task, each participant checked the output of
two other participants: one translated from another language
and one written in the same language (true/false tags were
removed and sentences shuffled)
28. Evaluation of AceWiki-GF: Results
• 30 participants spent on average 37 minutes using AceWiki-GF,
creating in total 316 sentences
• Definition of agreement level: (T_k + F_d) / S
S is the total number of sentences, T_k the number of sentences marked as
true and kept, and F_d the number marked as false and deleted
• Agreement level with and without translation: 84.0% and 82.2%
(difference is not significant)
• Assumption: translation introduces a constant translation error
rate r that has the effect that the agreement level is (1 − r) × a
instead of a
• New hypothesis: The translation error rate is less than 5%.
• p-value with one-tailed Wilcoxon signed rank test: 0.046
• With AceWiki-GF, the translation error rate is below 5% (significant
at the 0.05 level)
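The two formulas on this slide are easy to compute directly. The sketch below uses made-up counts and rates purely to show the arithmetic; the function names and all input numbers are hypothetical and are not data from the study.

```python
def agreement_level(true_kept, false_deleted, total):
    """(T_k + F_d) / S: share of sentences on which the checker agreed."""
    if total <= 0:
        raise ValueError("total number of sentences must be positive")
    return (true_kept + false_deleted) / total


def agreement_with_translation(a, r):
    """Expected agreement (1 - r) * a under a constant translation error rate r."""
    if not 0 <= r <= 1:
        raise ValueError("error rate must lie between 0 and 1")
    return (1 - r) * a


# Hypothetical counts: 60 sentences, 40 true and kept, 10 false and deleted.
example_agreement = agreement_level(true_kept=40, false_deleted=10, total=60)

# Hypothetical rates: same-language agreement 0.8, translation error rate 5%.
example_translated = agreement_with_translation(a=0.8, r=0.05)
```

Under this model, a translation error rate of 5% would lower an agreement level of 0.8 to about 0.76, which is the size of drop the significance test is designed to rule out.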
29. Evaluation of AceWiki-GF: Feedback
Questionnaire for the participants contained these questions:
1. Was AceWiki Geography easy or difficult to use in general?
2. Was the sentence editor easy or difficult to use?
3. Was creating true and false statements easy or difficult to
perform?
Possible answers: “very difficult” (0), “difficult” (1), “medium” (2),
“easy” (3), and “very easy” (4)
Results:
1. Average: 2.93 (∼“easy”)
2. Average: 2.77 (∼“easy”)
3. Average: 2.70 (∼“easy”)
30. Links
ACE parser (APE) source code: https://github.com/Attempto/APE
ACE-in-GF source code: http://github.com/Attempto/ACE-in-GF
AceWiki and AceWiki-GF
• Source code: http://github.com/AceWiki/AceWiki
• Demos (non-GF): http://attempto.ifi.uzh.ch/acewiki/
• Demos (GF): http://attempto.ifi.uzh.ch/acewiki-gf/
MOLTO project web site: http://www.molto-project.eu
Attempto project web site: http://attempto.ifi.uzh.ch
Grammatical Framework project
• Web site: http://www.grammaticalframework.org
• GF Summer School, August 2013 in Germany:
http://school.grammaticalframework.org/2013/
31. Thank you for your attention!
Questions?