Dependency Analysis of Abstract Universal Structures in Korean and English
Jinho Choi
This thesis makes two contributions in the form of lexical resources to (1) dependency parsing in Korean and (2) semantic parsing in English. First, we describe our methodology for building three dependency treebanks in Korean derived from existing treebanks and pseudo-annotated according to the latest guidelines from the Universal Dependencies (UD) project. The original Google Korean UD Treebank is re-tokenized to ensure morpheme-level annotation consistency with other corpora while maintaining the linguistic validity of the revised tokens. Phrase structure trees in the Penn Korean Treebank and the Kaist Treebank are automatically converted into UD dependency trees by applying head-percolation rules and linguistically motivated heuristics. A total of 38K+ dependency trees are generated. To the best of our knowledge, this is the first time that these three Korean treebanks have been converted into UD dependency treebanks following the latest annotation guidelines. Second, we introduce an ongoing project for augmenting the OntoNotes phrase structure treebank with semantic features found in the Abstract Meaning Representation (AMR), as part of an effort to build an accurate AMR parser. We propose a novel technique for AMR parsing that first trains a dependency parser on the OntoNotes corpus augmented with numbered arguments from the Proposition Bank (PropBank), and then transfers the trained dependency parser to the AMR parsing task. A preliminary step is to prepare the dependency data by automatically replacing dependencies that define predicate-argument structure with their corresponding PropBank argument labels during constituent-to-dependency conversion.
To the best of our knowledge, this is the first time that PropBank labels have been directly inserted into dependency structures, producing a new dependency corpus whose rich syntactic information is combined with the semantic role information provided by PropBank to fully describe predicate-argument structure, making it an ideal resource for AMR parsing and, more broadly, semantic parsing.
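Constituent-to-dependency conversion with head-percolation rules, as mentioned in the abstract, can be sketched roughly as follows. This is a minimal illustration with an invented toy rule table and English labels, not the thesis's actual Korean rule set:

```python
# Simplified head-percolation sketch: convert a constituency tree into
# unlabeled dependencies. A tree is (label, [children]); a leaf is
# (pos_tag, word). The rule table below is purely illustrative.

HEAD_RULES = {
    # phrase label -> (search direction, preferred head-child labels)
    "S":  ("right", ["VP", "S"]),
    "VP": ("right", ["VB", "VP"]),
    "NP": ("right", ["NN", "NP"]),
}

def is_leaf(node):
    return isinstance(node[1], str)

def find_head(label, children):
    """Pick the head child using the percolation table."""
    direction, preferred = HEAD_RULES.get(label, ("right", []))
    order = children if direction == "left" else list(reversed(children))
    for want in preferred:
        for child in order:
            if child[0] == want:
                return child
    return order[0]  # fallback: first child in search order

def to_dependencies(tree, deps=None):
    """Return (head_word, [(dependent, head), ...]) for a tree."""
    if deps is None:
        deps = []
    if is_leaf(tree):
        return tree[1], deps
    label, children = tree
    head_child = find_head(label, children)
    head_word, _ = to_dependencies(head_child, deps)
    for child in children:
        if child is not head_child:
            dep_word, _ = to_dependencies(child, deps)
            deps.append((dep_word, head_word))
    return head_word, deps

tree = ("S", [("NP", [("NN", "Kim")]),
              ("VP", [("VB", "reads"), ("NP", [("NN", "books")])])])
root, deps = to_dependencies(tree)
print(root, deps)  # prints: reads [('books', 'reads'), ('Kim', 'reads')]
```

A real converter would additionally assign dependency labels (or, as in the thesis, PropBank argument labels) to each arc and handle coordination and punctuation specially.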
Corpus-based part-of-speech disambiguation of Persian
IDES Editor
In this paper we introduce a method for part-of-speech disambiguation of Persian texts, which uses word class probabilities from a relatively small training corpus in order to automatically tag unrestricted Persian texts. The experiment was carried out at two levels, unigram and bigram disambiguation. Comparing the results obtained at the two levels, we show that using the immediate right context to which a given word belongs can increase the accuracy of the system to a high degree.
Lecture 2: From Semantics To Semantic-Oriented Applications
Marina Santini
From the "Natural Language Processing" LinkedIn group:
John Kontos, Professor of Artificial Intelligence
"I wonder whether translating into formal logic is nothing more than transliteration which simply isolates the part of the text that can be reasoned upon using the simple inference mechanism of formal logic. The real problem, I think, lies with the part of text that CANNOT be translated on the one hand, and the part that changes its meaning due to advances in civilization on the other. My own proposal is to leave NL text alone and try building inference mechanisms for the UNTRANSLATED text depending on the task requirements.
All the best
John"
A Distributional Semantics Approach for Selective Reasoning on Commonsense Gr...
Andre Freitas
Tasks such as question answering and semantic search depend on the ability to query and reason over large-scale commonsense knowledge bases (KBs). However, dealing with commonsense data demands coping with problems such as increased schema complexity, semantic inconsistency, incompleteness, and scalability. This paper proposes a selective graph navigation mechanism based on a distributional relational semantic model which can be applied to querying and reasoning over heterogeneous KBs. The approach can be used for approximate reasoning, querying, and associational knowledge discovery. In this paper we focus on commonsense reasoning as the main motivational scenario for the approach. The approach focuses on addressing the following problems: (i) providing a semantic selection mechanism for facts which are relevant and meaningful in a specific reasoning and querying context, and (ii) coping with information incompleteness in large KBs. The approach is evaluated using ConceptNet as a commonsense KB, and achieved high selectivity, high scalability, and high accuracy in the selection of meaningful navigational paths. Distributional semantics is also used as a principled mechanism to cope with information incompleteness.
RuleML2015: The Herbrand Manifesto - Thinking Inside the Box
RuleML
The traditional semantics for first-order logic (sometimes called Tarskian semantics) is based on the notion of interpretations of constants. Herbrand semantics is an alternative semantics based directly on truth assignments for ground sentences rather than interpretations of constants. Herbrand semantics is simpler and more intuitive than Tarskian semantics; consequently, it is easier to teach and learn. Moreover, it is more expressive. For example, while it is not possible to finitely axiomatize integer arithmetic with Tarskian semantics, this can be done easily with Herbrand semantics. The downside is a loss of some common logical properties, such as compactness and completeness. However, there is no loss of inferential power: anything that can be proved according to Tarskian semantics can also be proved according to Herbrand semantics. In this presentation, we define Herbrand semantics; we look at the implications for research on logic, rule systems, and automated reasoning; and we assess the potential for popularizing logic.
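The core idea of the abstract, that a Herbrand model is nothing more than a truth assignment over ground sentences, can be made concrete with a small sketch. The constants, predicates, and rule below are invented for illustration and are not from the talk:

```python
# Herbrand semantics sketch: a "model" is just a truth assignment for
# ground atoms over the language's constants, and a universally
# quantified sentence holds iff every ground instance of it is true.

from itertools import product

constants = ["a", "b"]

# Truth assignment for the ground atoms p(a), q(a), p(b), q(b).
truth = {
    ("p", "a"): True,  ("q", "a"): True,
    ("p", "b"): False, ("q", "b"): False,
}

def holds_rule(antecedent, consequent):
    """Check the universally quantified rule  forall x. ante(x) -> cons(x)
    by enumerating every ground instance over the constants."""
    for (c,) in product(constants):
        if truth[(antecedent, c)] and not truth[(consequent, c)]:
            return False
    return True

print(holds_rule("p", "q"))  # True: every ground instance is satisfied
```

Because the quantifier ranges only over ground terms built from the language's own constants, there is no separate notion of a domain and an interpretation function, which is exactly the simplification the manifesto argues for.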
MORPHOLOGICAL SEGMENTATION WITH LSTM NEURAL NETWORKS FOR TIGRINYA
ijnlc
Morphological segmentation is a fundamental task in language processing. Some languages, such as Arabic and Tigrinya, have words packed with very rich morphological information. Therefore, unpacking this information becomes a necessary task for many downstream natural language processing tasks. This paper presents the first morphological segmentation research for Tigrinya. We constructed a new morphologically segmented corpus with 45,127 manually segmented tokens. Conditional random fields (CRF) and window-based long short-term memory (LSTM) neural networks were employed separately to develop our boundary detection models. We applied language-independent character and substring features for the CRF and character embeddings for the LSTM networks. Experiments were performed with four variants of the Begin-Inside-Outside (BIO) chunk annotation scheme. We achieved a 94.67% F1 score using bidirectional LSTMs with a fixed-size window approach to morpheme boundary detection.
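Casting morphological segmentation as BIO boundary detection, as the abstract describes, amounts to labeling each character and then reading morphemes back off the tag sequence. A minimal sketch (using an English word as a stand-in; the paper works on Tigrinya script with four BIO variants, not this toy scheme):

```python
# Toy BIO encoding for morpheme boundary detection: each character gets
# B (begins a morpheme) or I (inside a morpheme); a trained CRF/LSTM
# would predict these tags instead of taking gold boundaries as input.

def segments_to_bio(word, boundaries):
    """Tag each character B/I given the set of morpheme start indices."""
    return [(ch, "B" if i in boundaries else "I")
            for i, ch in enumerate(word)]

def bio_to_segments(word, tags):
    """Recover morphemes from a per-character B/I tag sequence."""
    segs = []
    for ch, tag in zip(word, tags):
        if tag == "B" or not segs:
            segs.append(ch)
        else:
            segs[-1] += ch
    return segs

# English stand-in for a morphologically rich word: un+pack+ing
pairs = segments_to_bio("unpacking", {0, 2, 6})
tags = [t for _, t in pairs]
print(tags)
print(bio_to_segments("unpacking", tags))  # ['un', 'pack', 'ing']
```

The decoding step shows why boundary detection suffices for segmentation: once the B tags are predicted correctly, the morphemes follow deterministically.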
Analogy is one of the most studied representatives of a family of non-classical forms of reasoning that work across different domains, usually taken to play a crucial role in creative thought and problem-solving. In the first part of the talk, I will briefly introduce general principles of computational analogy models (relying on a generalization-based approach to analogy-making). We will then take a closer look at Heuristic-Driven Theory Projection (HDTP) as an example of a theoretical framework and implemented system: HDTP computes analogical relations and inferences for domains which are represented using many-sorted first-order logic languages, applying a restricted form of higher-order anti-unification for finding shared structural elements common to both domains. The presentation of the framework will be followed by a few reflections on the "cognitive plausibility" of the approach, motivated by theoretical complexity and tractability considerations.
In the second part of the talk I will discuss an application of HDTP to modeling essential parts of concept blending processes, a current "hot topic" in Cognitive Science. Here, I will sketch an analogy-inspired formal account of concept blending (developed in the European FP7-funded Concept Invention Theory (COINVENT) project) combining HDTP with mechanisms from Case-Based Reasoning.
2007. Introduction to the panel 'Pragmatic Interfaces' organised by the authors at the International Pragmatics Conference (IPRA) in Goteborg (Sweden), July 2007. Didier Maillat and Louis de Saussure
This paper describes BABAR, a knowledge extraction and representation system, completely implemented in CLOS, that is primarily geared towards organizing and reasoning about knowledge extracted from the Wikipedia Website. The system combines natural language processing techniques, knowledge representation paradigms and machine learning algorithms. BABAR is currently an ongoing independent research project that when sufficiently mature, may provide various commercial opportunities.
BABAR uses natural language processing to parse both page name and page contents. It automatically generates Wikipedia topic taxonomies thus providing a model for organizing the approximately 4,000,000 existing Wikipedia pages. It uses similarity metrics to establish concept relevancy and clustering algorithms to group topics based on semantic relevancy. Novel algorithms are presented that combine approaches from the areas of machine learning and recommender systems. The system also generates a knowledge hypergraph which will ultimately be used in conjunction with an automated reasoner to answer questions about particular topics.
As the volume of online content continues to grow exponentially, helping search engines understand the context and topical themes within your site is increasingly important. This presentation covers some of these concepts, along with ways to utilise them in your marketing strategy.
5 Lessons Learned from Designing Neural Models for Information Retrieval
Bhaskar Mitra
Slides from my keynote talk at the Recherche d'Information SEmantique (RISE) workshop at CORIA-TALN 2018 conference in Rennes, France.
(Abstract)
Neural Information Retrieval (or neural IR) is the application of shallow or deep neural networks to IR tasks. Unlike classical IR models, these machine learning (ML) based approaches are data-hungry, requiring large scale training data before they can be deployed. Traditional learning to rank models employ supervised ML techniques—including neural networks—over hand-crafted IR features. By contrast, more recently proposed neural models learn representations of language from raw text that can bridge the gap between the query and the document vocabulary.
Neural IR is an emerging field, and research publications in the area have been increasing in recent years. While the community explores new architectures and training regimes, a new set of challenges, opportunities, and design principles is emerging in the context of these new IR models. In this talk, I will share five lessons learned from my personal research in the area of neural IR. I will present a framework for discussing different unsupervised approaches to learning latent representations of text. I will cover several challenges to learning effective text representations for IR and discuss how latent space models should be combined with observed feature spaces for better retrieval performance. Finally, I will conclude with a few case studies that demonstrate the application of neural approaches to IR that go beyond text matching.
This type of grammar enables learners to apply the language in different discourse, whether oral or written. While competence in the language is very important for learners to understand and construct grammatically correct sentences, it is still significant for them to be given the chance to produce linguistic output, so that they will truly experience the importance of having the language as a medium of communication or instruction. This should be given importance in the teaching and learning process so that they master not only linguistic competence, but also linguistic performance.
Presentation of "Challenges in transfer learning in NLP" from Madrid Natural Language Processing Meetup Event, May, 2019.
https://www.meetup.com/es-ES/Madrid-Natural-Language-Processing-meetup/
Practical related work in repository: https://github.com/laraolmos/madrid-nlp-meetup
DevOps and Testing slides at DASA Connect
Kari Kakkonen
Slides by me and Rik Marselis from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We also held a lovely workshop with the participants, trying to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
Securing your Kubernetes cluster: a step-by-step guide to success!
KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
Jen Stirrup
The Metaverse is popularized in science fiction, and now it is becoming closer to being a part of our daily lives through the use of social media and shopping companies. How can businesses survive in a world where Artificial Intelligence is becoming the present as well as the future of technology, and how does the Metaverse fit into business strategy when futurist ideas are developing into reality at accelerated rates? How do we do this when our data isn't up to scratch? How can we move towards success with our data so we are set up for the Metaverse when it arrives?
How can you help your company evolve, adapt, and succeed using Artificial Intelligence and the Metaverse to stay ahead of the competition? What are the potential issues, complications, and benefits that these technologies could bring to us and our organizations? In this session, Jen Stirrup will explain how to start thinking about these technologies as an organisation.
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
Monitoring and observability aren’t traditionally found in software curriculums, and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is part of our current company’s observability stack.
While the dev and ops silo continues to crumble, many organizations still relegate monitoring and observability to the purview of ops, infra, and SRE teams. This is a mistake: achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on:
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
Peter Spielvogel
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
Abstract KR on ParGram 2006
1. Abstract KR: Using WordNet and VerbNet for Question Answering
Valeria de Paiva
speaking for the PARC Aquaint team
Dick Crouch, Tracy King, Danny Bobrow,
Cleo Condoravdi, Ron Kaplan,
Annie Zaenen, Lauri Karttunen
2. Outline
Motivation: why use deep linguistic analysis for question answering
Basic architecture and example structures
Abstract Knowledge Representation
The experiment: WordNet/VerbNet
Discussion
3. Motivation
Knowledge-based question answering
– Deep representations allow high precision and recall, but
– Typically on restricted domains
– Hard for users to pose KR questions and interpret KR answers
– Very hard for the system to build up knowledge
Shallow, open-domain question answering
– Broad-coverage
– Lower precision and recall
– Easy to pose questions but sensitive to question form
Question answering at PARC
– Layered mapping from language to deeper semantic representations
– Broad-coverage: Matching for answers as light reasoning
– Expresses KRR answers in real English -- eventually
4. 2-way bridge between language & KR
XLE/LFG Pa Target
Text r sing KR Mapping
KRR
F-structure
Semantics
Abstract KR
Sources
Assertions
Match M M Inference
Question Query
Composed Answers
Text XLE/LFG
Text Generation
F-Structure Explanations
to user Templates Subqueries
5. 2-way bridge between language & KR
[Same pipeline diagram as slide 4, annotated with the supporting infrastructure.]
Infrastructure resources and theories:
– XLE/LFG: English Functional Grammar, F-structure generation, lexical templates
– XLE: term rewriting for the KR mapping rules
– Ambiguity management: MaxEnt models
– Unified Lexicon
6. Key Process: Canonicalize representations
F-structures are semantic representations to some – e.g. logic forms (Rus/Moldovan ’01)
Transformed into (flat and contexted) semantic structures, inspired by Glue, born out of the need to ‘pack’ semantics (talk to Dick about it..)
Transformed into (flat and contexted) AKR structures
Want to discuss semantics to AKR
but, what does canonicalization buy you?
7. Canonicalization helps matching
Argument structure:
– Mary bought an apple./An apple was bought by Mary.
Synonyms and hypernyms:
– Mary bought/purchased/acquired an apple.
Factivity and contexts:
– Mary bought an apple./Mary did not buy an apple.
– Mary managed/failed to buy an apple.
– Ed prevented Mary from buying an apple.
– We know/said/believe that Mary bought an apple.
– Mary didn’t wait to buy an apple. (talk to Lauri..)
8. Canonicalization helps matching
basic argument structure
Did Mary buy an apple?
“Mary bought an apple.” Yes
“An apple was bought by Mary.” Yes
Semantics gives us:
in_context(t, cardinality('Mary':9,sg)),
in_context(t, cardinality(apple:1,sg)),
in_context(t, specifier(apple:1,a)),
in_context(t, tense(buy:3,past)),
in_context(t, buy(buy:3,'Mary':9,apple:1)),
in_context(t, proper_name('Mary':9,name,'Mary'))
9. Canonicalization helps matching
basic synonyms/hypernyms
Did Mary buy an apple?
“Mary bought an apple.” Yes
“Mary purchased an apple.” Yes
Using Entailment detection:
“Mary bought an apple.” IMPLIES
“Mary purchased an apple.” ?
System Answer is YES
it aligns skolems in the appropriate way:
[purchase##1-buy##1, Mary##0-Mary##0, apple##3-apple##3, t-t]
10. Canonicalization helps matching
basic context structure
Negation dealt with by context.
Does
“Mary did not buy an apple.” imply
“Mary bought an apple” ? No
System Answer is NO:
Skolems are aligned, respecting contexts
[buy##1-buy##4,Ed##0-Ed##0,apple##3-apple##6,t-t]
we get a conflict:
conflict(uninstantiable(passage, buy##4, t),
         instantiable(question, buy##4, t))
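The contexted conflict check behind that “No” can be illustrated with a small sketch. This is a toy reconstruction under assumed representations (AKR facts as tuples, the skolem alignment as a dict), not the actual PARC matcher:

```python
def find_conflicts(passage, question, alignment):
    """Return (term, context) pairs where the question needs a term to be
    instantiable but the aligned passage makes it uninstantiable."""
    conflicts = []
    for (polarity, term, ctx) in sorted(question):
        if polarity != "instantiable":
            continue
        mapped = (alignment.get(term, term), alignment.get(ctx, ctx))
        if ("uninstantiable",) + mapped in passage:
            conflicts.append((term, ctx))
    return conflicts

# "Mary did not buy an apple." matched against "Did Mary buy an apple?"
passage = {("uninstantiable", "buy##4", "t"),
           ("instantiable", "Mary##0", "t"),
           ("instantiable", "apple##6", "t")}
question = {("instantiable", "buy##1", "t"),
            ("instantiable", "Mary##0", "t"),
            ("instantiable", "apple##3", "t")}
alignment = {"buy##1": "buy##4", "Mary##0": "Mary##0", "apple##3": "apple##6"}
print(find_conflicts(passage, question, alignment))  # [('buy##1', 't')]
```

As on slide 10, the skolems align cleanly, but the aligned buy event is uninstantiable in the passage’s top context, so the system answers No.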
11. Canonicalization helps matching
For negation you may think contexts are overkill, but
for implicative context structure:
Does
“Mary failed to buy an apple.” imply
“Mary bought an apple” ?
System Answer is NO:
under the right skolem alignment
[buy##1-buy##7,Mary##0-Mary##0,apple##3-apple##9,t-t]
That is, entailment detection knows about contexts
12. Overcoming language/KR misalignments:
A principled approach
Language
– Generalizations come from the structure of the language
– Representations compositionally derived from sentence structure
Knowledge representation and reasoning
– Generalizations come from the structure of the world
– Representations to support reasoning
Layered architecture helps with different constraints
But boundaries are not fixed (like beads in a string?)
13. This talk: Semantics to AKR only…
[Same pipeline diagram and infrastructure as slide 5; this talk covers only the Semantics → Abstract KR step.]
14. Ambiguity-enabled Text Analysis Pipeline
The official version
[Diagram: Text → XLE/LFG parsing (LFG grammar, morphology, lexicon) → C-structure → stochastic selection → F-structure → linguistic semantics (VerbNet semantic lexicon, WordNet, Unified Lexicon) → term rewriting (transfer, with KR mapping rules and Cyc mappings) → Abstract KR → Target KR (e.g. CycL, KM…).]
15. Ambiguity-enabled Text Analysis pipeline
Early 2005
[Same pipeline diagram; in the early-2005 version, stochastic selection was applied directly after parsing rather than later in the pipeline.]
18. Resources in a Unified Lexicon
Mapping to abstract KR requires
– Coverage of most words and constructions
– Detailed lexical entailments
Results of merging XLE, VerbNet, Cyc, WordNet
– 9,835 different verb stems
– 46,000 verb entries (indexed by subcat/sense)
– 24,000 with VerbNet information
– 2,800 with Cyc information
Triangulation to extend mappings to Cyc concepts
– 21,000 Verbs with Cyc data
– 50,000 noun entries, including N-N compounds
– 2,000 adjectives & adverbs
(Quality of data not evaluated)
19. Abstract KR: “Ed fired the boy.”
F-structure: [ PRED fire<Ed, boy>, TENSE past, SUBJ [ PRED Ed ], OBJ [ PRED boy, DEF + ] ]
Conceptual:
(subconcept Ed3 Person)
(subconcept boy2 MaleChild)
(subconcept fire_ev1 DischargeWithPrejudice)
(role fire_ev1 performedBy Ed3)
(role fire_ev1 objectActedOn boy2)
Contextual:
(context t)
(instantiable Ed3 t)
(instantiable boy2 t)
(instantiable fire_ev1 t)
Temporal:
(temporalRel startsAfterEndingOf Now fire_ev1)
20. Abstract KR: “Ed fired the gun.”
F-structure: [ PRED fire<Ed, gun>, TENSE past, SUBJ [ PRED Ed ], OBJ [ PRED gun, DEF + ] ]
Conceptual:
(subconcept Ed0 Person)
(subconcept gun3 Gun)
(subconcept fire_ev1 ShootingAGun)
(role fire_ev1 performedBy Ed0)
(role fire_ev1 deviceUsed gun3)
Contextual:
(context t)
(instantiable Ed0 t)
(instantiable gun3 t)
(instantiable fire_ev1 t)
Temporal:
(temporalRel startsAfterEndingOf Now fire_ev1)
21. Abstract Knowledge Representation
Encode different aspects of meaning
– Asserted content
» concepts and arguments, relations among objects
– Contexts
» author commitment, belief, report, denial, prevent, …
– Temporal relations
» qualitative relations among time intervals, events
Translate to various target KRs, e.g. CycL, Knowledge Machine, Description Logic, AnsProlog
Capture meaning ambiguity
– Mapping to KR can introduce and reduce ambiguity
– Need to handle ambiguity efficiently
A Basic Logic for Textual Inference (Bobrow et al, July 05)
22. Cyc oriented version of AKR
Cyc concepts: Person, Laughing,
DischargeWithPrejudice, etc..
Cyc Roles: bodilyDoer, performedBy,
infoTransferred, etc
WordNet/VerbNet as fallback mechanisms
Sortal restrictions from Cyc disambiguate, e.g. Ed fired the boy / Ed fired the gun.
Limitation: raggedness of Cyc
How do we know? Can we do better ?
23. Raggedness?
It’s clear that Cyc misses many senses of words, particularly more abstract ones.
“Ed admitted the mistake” gives only:
(A1, subconcept('admit_ev##1','AdmitToMembership')),
(A2, subconcept('admit_ev##1','PermittingEntrance'))
Nothing about “Ed admitted that [Mary had arrived]”.
Also nothing about “hesitate”, “surprise”, “please”.
Nothing about “forget” (compare: Ed forgot that Mary left → Mary left, but Ed forgot to leave → Ed did not leave)
24. Cyc KR: ambiguating too much?
“The school bought the store” (3 schools, 2 stores, 3 senses of buy in the UL):
choice([A1, A2], 1),
choice([B1, B2], A1),
choice([C1, C2], A2),
choice([D1], or(B2,C2)),
choice([E1, E2], B1),
choice([F1, F2], C1),
choice([G1], or(E1,F2)),
choice([H1], or(E2,F1)),
choice([I1], or(G1,B2,C2))
….
cf(or(B2,or(C2,E1,F2)), role(buyer,'buy_ev##2','school##1')),
cf(or(B2,or(C2,E1,F2)), role(objectPaidFor,'buy_ev##2','store##4')),
…..
cf(or(B2,or(C2,E1,F2)), subconcept('buy_ev##2','Buying')),
cf(or(E2,F1), subconcept('buy_ev##2',buy)),
cf(D1, subconcept('school##1','SchoolInstitution')),
cf(G1, subconcept('school##1','SchoolInstitution-KThrough12')),
cf(H1, subconcept('school##1','GroupFn'('Fish'))),
cf(A2, subconcept('store##4','RetailStore')),
cf(A1, subconcept('store##4','RetailStoreSpace'))
25. The Experiment:
[Diagram: Text → XLE/LFG parsing (LFG grammar, morphology, lexicon) → C-structure → F-structure → transfer semantics → linguistic semantics (VerbNet semantic lexicon, WordNet, Unified Lexicon as resources) → term rewriting (Abstract KR mapping rules, Cyc mappings) → Target KR (e.g. CycL).]
26. Experiment: UL-oriented version of AKR
Cyc only one of the lexical resources
VerbNet/WordNet “concepts” are synsets
Roles: very coarse ones from VerbNet
Goal roles: less refined than Cyc (800+), more refined than VerbNet (<20)
Have infrastructure to experiment:
Can we do knowledge representation with VerbNet/WordNet?
28. Example: The school bought the store
KR:
choice([A1, A2], 1)
Packed Clauses:
cf(1, context(t)),
cf(1, instantiable('buy##2',t)),
cf(1, instantiable('school##1',t)),
cf(1, instantiable('store##4',t)),
cf(A1, role('Agent','buy##2','school##1')),
cf(A2, role('Asset','buy##2','school##1')),
cf(1, role('Theme','buy##2','store##4')),
cf(1,subconcept('buy##2',[[2186766]])),
cf(1,subconcept('school##1',
[[8162936,8162558,7943952,7899136,7842951,29714,2236,2119,1740],
[4099321,2884748,4290445,20846,3991,3122,1930,1740],
[5686799,5682845,5681825,5631]])),
cf(1,subconcept('store##4',
[[4153847,3706735,3908195,3262760,4290445,20846,3991,3122,1930,1740],
[13194813,13194436,13087849,13084632,13084472,13084292,131597020]])),
cf(1, temporalRel(startsAfterEndingOf,'Now','buy##2'))
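How such packed clauses are consumed can be sketched in a few lines. This is a toy reading with illustrative names (not XLE’s API): cf(1, Fact) holds in every analysis, while cf(Choice, Fact) holds only when that choice is selected.

```python
def facts_under(packed_clauses, assignment):
    """packed_clauses: list of (condition, fact) pairs, where condition is 1
    (true in every analysis) or a choice label; return the facts that hold
    under the given truth assignment to choice labels."""
    return [fact for cond, fact in packed_clauses
            if cond == 1 or assignment.get(cond, False)]

packed = [
    (1,    ("instantiable", "buy##2", "t")),
    ("A1", ("role", "Agent", "buy##2", "school##1")),
    ("A2", ("role", "Asset", "buy##2", "school##1")),
    (1,    ("role", "Theme", "buy##2", "store##4")),
]

# Selecting A1 yields the Agent reading; the shared facts come along for free.
for fact in facts_under(packed, {"A1": True}):
    print(fact)
```

Selecting A2 instead swaps in the Asset role while leaving every cf(1, …) fact untouched, which is exactly how one choice space serves all readings at once.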
29. First thoughts
Surprisingly easy to change the base ontology
Modular architecture does help
Do we have as good results as before?
Well, no…
30. Improvements can be made…
Some WordNet senses of buying should only go with A1, some with A2
Sortal restrictions/preferences should come in
But ambiguity is managed without losing meanings
31. Discussion
Experiment ongoing
Not a “serious” report yet – gut feelings
UL marking of factives/implicatives/etc. very useful; (much) more needs to be done
Matching can do more inference than first thought
Ambiguity-enabling technology on concepts means no need to cluster WN senses?
Sortal restrictions/preferences a must?
34. Layered mapping
[Pipeline: Text → XLE/LFG parsing → C-structure → F-structure → linguistic semantics → term rewriting → Abstract KR → Target KRR.]
Linguistic semantics (for a sentence like “A boy arrived”):
in_context(t, arrive(arrive:2, boy:1)),
in_context(t, cardinality(boy:1, sg)),
in_context(t, specifier(boy:1, a)),
in_context(t, tense(arrive:2, past))
Introduces contexts, cardinality restrictions, etc
35. Layered mapping: “Ed knows that Mary arrived.”
[Same pipeline; the Abstract KR output:]
cf(1, context('cx_arrive##6')),
cf(1, context(t)),
cf(1, context_head('cx_arrive##6','arrive##6')),
cf(1, context_head(t,'know##1')),
cf(1, context_lifting_relation(veridical,t,'cx_arrive##6')),
cf(1, instantiable('Ed##0',t)),
cf(1, instantiable('Mary##4','cx_arrive##6')),
cf(1, instantiable('Mary##4',t)),
cf(1, instantiable('arrive##6','cx_arrive##6')),
cf(1, instantiable('arrive##6',t)),
cf(1, instantiable('know##1',t)),
cf(1, role('Agent','know##1','Ed##0')),
cf(1, role('Theme','arrive##6','Mary##4')),
cf(1, role('Theme','know##1','cx_arrive##6')),
cf(1, subconcept('Ed##0',[[7626,4576,4359,3122,7127,1930,1740]])),
cf(1, subconcept('Mary##4',[[7626,4576,4359,3122,7127,1930,1740]])),
cf(1, subconcept('arrive##6',[[1987643]])),
cf(1, subconcept('know##1',[[585692],[587146],[587430],[588050]])),
cf(1, temporalRel(startsAfterEndingOf,'Now','arrive##6')),
cf(1, temporalRel(temporallyContains,'know##1','Now'))
36. Ambiguity is rampant in language
Alternatives multiply within and across layers…
[Diagram: branching alternatives at each layer – C-structure, F-structure, linguistic semantics, Abstract KR, KRR.]
37. What not to do
Use heuristics to prune as soon as possible
Oops: strong constraints may reject the so-far-best (= only) option
[Diagram: statistics prune alternatives (marked X) at each layer – C-structure, F-structure, linguistic semantics, Abstract KR, KRR.]
Fast computation, wrong result
38. Manage ambiguity instead
The sheep liked the fish. (How many sheep? How many fish?)
Options multiplied out:
The sheep-sg liked the fish-sg.
The sheep-pl liked the fish-sg.
The sheep-sg liked the fish-pl.
The sheep-pl liked the fish-pl.
Options packed:
The sheep-{sg|pl} liked the fish-{sg|pl}
Packed representation:
– Encodes all dependencies without loss of information
– Common items represented, computed once
– Key to practical efficiency with broad-coverage grammars
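The sheep/fish point can be made concrete with a toy count: multiplied-out readings grow as a product, while the packed form stores each local choice once and grows as a sum. (Pure illustration; XLE’s packed structures are of course richer than this.)

```python
from itertools import product

# Two words, each two-ways ambiguous in number.
words = [("sheep", ["sg", "pl"]), ("fish", ["sg", "pl"])]

# Multiplied out: one complete reading per combination of local options.
unpacked = [dict(zip((w for w, _ in words), combo))
            for combo in product(*(opts for _, opts in words))]
print(len(unpacked))  # 4 whole-sentence readings (2 x 2)

# Packed: each word keeps its own local alternatives, stored once.
packed = {word: opts for word, opts in words}
print(sum(len(opts) for opts in packed.values()))  # 4 local options (2 + 2)
```

With n ambiguous words of k readings each, the contrast is k**n full readings against n*k packed options – the reason packing is key to broad-coverage efficiency.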
39. Embedded contexts:
“The man said that Ed fired a boy.”
(subconcept say_ev19 Inform-CommunicationAct)
(role say_ev19 senderOfInfo man24)
(role say_ev19 infoTransferred comp21)
(subconcept fire_ev20 DischargeWithPrejudice)
(role fire_ev20 objectActedOn boy22)
(role fire_ev20 performedBy Ed23)
(context t)
(context comp21)
(context_lifting_rules averidical t comp21)
(instantiable man24 t)
(instantiable say_ev19 t)
(instantiable Ed23 t)
(instantiable boy22 comp21)
(instantiable fire_ev20 comp21)
40. Contexts for negation
“No congressman has gone to Iraq since the war.”
Contextual relations:
(context t)
(context not58) ← context triggered by negation
(context_lifting_rules antiveridical t not58) ← interpretation of negation
Instantiability claims:
(instantiable go_ev57 not58)
(uninstantiable go_ev57 t) ← entailment of negation
41. Local Textual Inference
Broadening and Narrowing
Ed went to Boston by bus → Ed went to Boston
Ed didn’t go to Boston → Ed didn’t go to Boston by bus
Positive implicative
Ed managed to go → Ed went
Ed didn’t manage to go → Ed didn’t go
Negative implicative
Ed forgot to go → Ed didn’t go
Ed didn’t forget to go → Ed went
Factive
Ed forgot that Bill went → Bill went
Ed didn’t forget that Bill went → Bill went
42. Local Textual Inference
Matching of Abstract KR: Inference by Inclusion
Ed went to Boston by bus. → Ed went to Boston.
Premise AKR:
(context t)
(instantiable Boston8 t)
(instantiable Ed9 t)
(instantiable bus7 t)
(instantiable go_ev6 t)
(role go_ev6 objectMoving Ed9)
(role go_ev6 toLocation Boston8)
(role go_ev6 vehicle bus7)
(subconcept Boston8 CityOfBostonMass)
(subconcept Ed9 Person)
(subconcept bus7 Bus-RoadVehicle)
(subconcept go_ev6 Movement-TranslationEvent)
(temporalRel startsAfterEndingOf Now go_ev6)
Conclusion AKR (every fact is included in the premise AKR):
(context t)
(instantiable Boston8 t)
(instantiable Ed9 t)
(instantiable go_ev6 t)
(role go_ev6 objectMoving Ed9)
(role go_ev6 toLocation Boston8)
(subconcept Boston8 CityOfBostonMass)
(subconcept Ed9 Person)
(subconcept go_ev6 Movement-TranslationEvent)
(temporalRel startsAfterEndingOf Now go_ev6)
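Inference by inclusion reduces, in the simplest case, to a set-containment test once skolems are aligned. A minimal sketch under assumed tuple representations (not the PARC matcher itself):

```python
def entails_by_inclusion(premise_akr, conclusion_akr):
    """Inference by inclusion: the conclusion follows when every one of its
    AKR facts also appears in the premise AKR (skolems assumed aligned)."""
    return conclusion_akr <= premise_akr  # set inclusion

by_bus = {
    ("instantiable", "go_ev6", "t"), ("instantiable", "Ed9", "t"),
    ("instantiable", "Boston8", "t"), ("instantiable", "bus7", "t"),
    ("role", "go_ev6", "objectMoving", "Ed9"),
    ("role", "go_ev6", "toLocation", "Boston8"),
    ("role", "go_ev6", "vehicle", "bus7"),
}
plain = {
    ("instantiable", "go_ev6", "t"), ("instantiable", "Ed9", "t"),
    ("instantiable", "Boston8", "t"),
    ("role", "go_ev6", "objectMoving", "Ed9"),
    ("role", "go_ev6", "toLocation", "Boston8"),
}
print(entails_by_inclusion(by_bus, plain))  # True: by bus -> went to Boston
print(entails_by_inclusion(plain, by_bus))  # False: the reverse does not hold
```

The asymmetry mirrors the broadening/narrowing pattern of slide 41: dropping facts broadens, adding facts narrows.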
43. NYT 2000: top of the chart
be (un)able to > 12600
fail to > 5000
know that > 4800
acknowledge that > 2800
be too ADJ to > 2400
happen to > 2300
manage to > 2300
admit/concede that > 2000
be ADJ enough to > 1500
have a/the chance to > 1500
go on/proceed to > 1000
have time/money to > 1000
44. Verbs differ as to speaker commitments
“Bush realized that the US Army had to be transformed to meet new
threats”
“The US Army had to be transformed to meet new threats”
“Bush said that Khan sold centrifuges to North Korea”
“Khan sold centrifuges to North Korea”
45. Implicative verbs: carry commitments
Commitment depends on verb context
Positive (+) or negative (-)
Speaker commits to truth-value of complement
True (+) or false (-)
48. ++ Implicative
positive context only: positive commitment
“Ann forced Ed to leave”
“Ed left”
“Ann didn’t force Ed to leave”
“Ed left / didn’t leave”
49. +- Implicative
positive context only: negative commitment
“Ed refused to leave”
“Ed didn’t leave”
“Ed didn’t refuse to leave”
“Ed left / didn’t leave”
50. -+ Implicative
negative context only: positive commitment
“Ed hesitated to leave”
“Ed left / didn’t leave”
“Ed didn’t hesitate to leave”
“Ed left”
51. -- Implicative
negative context only: negative commitment
“Ed attempted to leave”
“Ed left / didn’t leave”
“Ed didn’t attempt to leave”
“Ed didn’t leave”
52. +Factive
positive commitment no matter what context
“Ann realized that Ed left”
“Ann didn’t realize that Ed left”
“Ed left”
Like ++/-+ implicative but additional commitments
in questions and conditionals
Did Ann realize that Ed left?
cf. Did Ed manage to leave?
53. - Factive
negative commitment no matter what context
“Ann pretended that Ed left”
“Ann didn’t pretend that Ed left”
“Ed didn’t leave”
Like +-/-- implicative but with additional commitments
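The implicative and factive signatures of slides 48–53 amount to a small lookup table. A hedged sketch with invented names, keyed by the polarity of the embedding context, where None marks “no commitment” (Ed left / didn’t leave):

```python
# Commitment to the complement, per verb and polarity of its context.
# "+" = complement true, "-" = complement false, absent = no commitment.
SIGNATURES = {
    "manage":    {"+": "+", "-": "-"},  # two-way positive implicative
    "forget_to": {"+": "-", "-": "+"},  # two-way negative implicative
    "force":     {"+": "+"},            # ++ implicative (slide 48)
    "refuse":    {"+": "-"},            # +- implicative (slide 49)
    "hesitate":  {"-": "+"},            # -+ implicative (slide 50)
    "attempt":   {"-": "-"},            # -- implicative (slide 51)
    "realize":   {"+": "+", "-": "+"},  # +factive (slide 52)
    "pretend":   {"+": "-", "-": "-"},  # -factive (slide 53)
}

def commitment(verb, context_polarity):
    return SIGNATURES.get(verb, {}).get(context_polarity)

print(commitment("refuse", "+"))   # '-' : "Ed refused to leave" -> didn't leave
print(commitment("force", "-"))    # None: "didn't force" -> no commitment
print(commitment("realize", "-"))  # '+' : "didn't realize Ed left" -> Ed left
```

The table deliberately leaves out the extra commitments factives carry in questions and conditionals, which slide 52 notes go beyond this polarity scheme.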
54. Coding of implicative verbs
Implicative properties of complement-taking verbs
• Not available in any external lexical resource
• No obvious machine-learning strategy
1250 embedded-clause verbs in PARC lexicon
400 examined on first pass
• Considered in BNC frequency order
• Google search when intuitions unclear
1/3 are implicative, classified in Unified Lexicon
55. Matching for implicative entailments
Term rewriting promotes commitments in Abstract KR
“Ed managed to leave.”:
context(cx_leave7)
context(t)
context_lifting_rules(veridical t cx_leave7)
instantiable(Ed0 t)
instantiable(leave_ev7 cx_leave7)
instantiable(leave_ev7 t) ← promoted commitment
instantiable(manage_ev3 t)
role(objectMoving leave_ev7 Ed0)
role(performedBy manage_ev3 Ed0)
role(whatsSuccessful manage_ev3 cx_leave7)
“Ed left.”:
context(t)
instantiable(Ed0 t)
instantiable(leave_ev1 t)
role(objectMoving leave_ev1 Ed0)
56. Embedded examples in real text
From Google:
Song, Seoul's point man, did not forget to
persuade the North Koreans to make a “strategic
choice” of returning to the bargaining table...
Song persuaded the North Koreans…
57. Promotion for a (simple) embedding
“Ed did not forget to force Dave to leave”
Dave left.
58. Conclusions
Broad-coverage, efficient language-to-KR system
Need extensive resources (ontologies, argument mappings)
– Much work to incorporate
Use layered architecture
– Matching as well as reasoning
KR mapping both increases and decreases ambiguity
59. Term-Rewriting for KR Mapping
Rule form
<Input terms> ==> <Output terms> (obligatory)
<Input terms> ?=> <Output terms> (optional)
(optional rules introduce new choices)
Input patterns allow
– Consume term if matched: Term
– Test on term without consumption: +Term
– Test that term is missing: -Term
– Procedural attachment: {ProcedureCall}
Ordered rule application
– Rule1 applied in all possible ways to Input to produce Output1
– Rule2 applied in all possible ways to Output1 to produce Output2
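A toy rewriter illustrating the rule behaviour just described (consumed terms, +tests, -tests, ordered application). This is a deliberately simplified sketch with invented names – the real transfer system applies each rule in all possible ways over packed structures:

```python
def apply_rule(terms, rule):
    """rule = (consume, require, forbid, output): fire when all `consume`
    and `require` terms are present and no `forbid` term is. Consumed
    terms are removed; `require` terms are only tested (+Term); `forbid`
    terms test for absence (-Term)."""
    consume, require, forbid, output = rule
    if all(t in terms for t in consume + require) and \
       not any(t in terms for t in forbid):
        return (terms - set(consume)) | set(output)
    return terms

def rewrite(terms, rules):
    for rule in rules:  # ordered: Rule1 feeds Rule2, and so on
        terms = apply_rule(terms, rule)
    return terms

# Obligatory rule: consume the predication, emit a subconcept fact.
facts = {("bank", "bank##1")}
rule = ([("bank", "bank##1")], [], [],
        [("sub_concept", "bank##1", "FinancialInstitution")])
print(rewrite(facts, [rule]))
# {('sub_concept', 'bank##1', 'FinancialInstitution')}
```

An optional (?=>) rule would instead return both the rewritten and the unrewritten term set as a new choice, which is how rewriting introduces ambiguity on the next slide.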
61. Term-rewriting can introduce ambiguity
Alternative lexical mappings (permanent background facts):
|- cyc_concept_map(bank, FinancialInstitution).
|- cyc_concept_map(bank, SideOfRiver).
Mapping from predicate to Cyc concept:
ist(%%, %P(%Arg)), cyc_concept_map(%P, %Concept)
==> sub_concept(%Arg, %Concept).
Input term:
ist(ctx0, bank(bank##1))
Alternative rule applications produce different outputs
Rewrite system represents this ambiguity by a new choice
Output:
– C1: sub_concept(bank##1, FinancialInstitution)
– C2: sub_concept(bank##1, SideOfRiver)
– (C1 xor C2) ↔ 1
62. Rules can prune ill-formed mappings
The bank hired Ed
Rule for mapping hire:
hire(%E), subj(%E, %Subj), obj(%E, %Obj),
+sub_concept(%Subj, %CSubj), +sub_concept(%Obj, %CObj),
{genls(%CSubj, Organization), genls(%CObj, Person)}
==> sub_concept(%E, EmployingEvent),
performedBy(%E, %Subj), personEmployed(%E, %Obj).
From Cyc: genls(FinancialInstitution, Organization) is true; genls(SideOfRiver, Organization) is false.
If bank is mapped to SideOfRiver, the rule will not fire.
This leads to a failure to consume the subject.
subj(%%,%%) ==> stop.
prunes this analysis from the choice space.
In general, later rewrites prune analyses that don’t consume grammatical roles.
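The unconsumed-role cleanup can be sketched the same way: after the mapping rules have run, any analysis that still contains a leftover grammatical-role term is discarded from the choice space. Names and representations here are illustrative assumptions:

```python
def prune_unconsumed(analyses, role_preds=("subj", "obj")):
    """Each analysis is a set of remaining terms; drop those in which a
    grammatical-role term survived rewriting (i.e. no rule consumed it)."""
    return [a for a in analyses
            if not any(t[0] in role_preds for t in a)]

# Reading 1: bank -> FinancialInstitution, the hire rule fired.
good = {("sub_concept", "hire##1", "EmployingEvent"),
        ("performedBy", "hire##1", "bank##1")}
# Reading 2: bank -> SideOfRiver, the rule did not fire; subj is left over.
bad = {("subj", "hire##1", "bank##1"),
       ("sub_concept", "bank##1", "SideOfRiver")}
print(prune_unconsumed([good, bad]) == [good])  # True
```

Only the FinancialInstitution reading survives, matching the pruning of the SideOfRiver analysis on the slide.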
63. The NLP-KR Gap
There have been parallel efforts on text-based and knowledge-based question answering
– Benefits of knowledge-based approaches inaccessible without some bridge from text to KR
Prime goal:
– Robust, broad-coverage mapping from texts and questions to KR
– Allow KRR systems with lots of world knowledge to answer questions
Second-thoughts goal:
– Robust, broad-coverage mapping from texts and questions to KR
– Allow a system with basic knowledge of English to answer questions
64. Canonical conceptual semantics
“Cool the device.” / “Lower the temperature of the device.”
[Two predicate trees – cool(addressee, device) and lower(addressee, temperature of device) – map to the same canonical form:]
decreaseCausally(lower-event, device, temperature)