Speaker: Jordi Carrera Ventura, Artificial Intelligence technologist at Telefónica R&D
Summary: Chatbots (aka conversational agents, spoken dialogue systems) allow users to interface with computers using natural language by simply asking questions or issuing commands.
Given a query, the chatbot builds a semantic representation of the input, transforms it into a logical statement, and performs all the necessary actions to fulfill the user's intent. Sometimes this simply means calculating an exact answer or retrieving a fact from a database, whereas other times it means building a contextual model and running a full-fledged conversation flow while keeping track of anaphoras and cross-references.
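As a toy sketch of that query-to-intent pipeline (every pattern, intent name, and handler below is hypothetical, not a description of any production system):

```python
import re

# Hypothetical intent patterns: each maps a regex over the user's
# query to an intent name, capturing slots from the text.
INTENT_PATTERNS = [
    (re.compile(r"what is (\d+) plus (\d+)"), "add_numbers"),
    (re.compile(r"turn (on|off) the (\w+)"), "toggle_device"),
]

def parse(query):
    """Build a crude 'semantic representation': an intent plus slots."""
    for pattern, intent in INTENT_PATTERNS:
        m = pattern.search(query.lower())
        if m:
            return {"intent": intent, "slots": m.groups()}
    return {"intent": "unknown", "slots": ()}

def fulfill(representation):
    """Dispatch the parsed intent to an action that produces an answer."""
    intent, slots = representation["intent"], representation["slots"]
    if intent == "add_numbers":
        return str(int(slots[0]) + int(slots[1]))
    if intent == "toggle_device":
        return f"Turning {slots[0]} the {slots[1]}."
    return "Sorry, I didn't understand that."

print(fulfill(parse("What is 2 plus 3")))    # prints "5"
print(fulfill(parse("Turn off the lights")))
```

A real system replaces the regex table with a statistical semantic parser and the dispatch with a dialogue manager that tracks context across turns; the calculate-an-exact-answer branch above corresponds to the simplest case described in the abstract.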
Besides the direct applications of chatbots in IoT (Amazon’s Alexa, Apple's Siri) and IT (the historical field of Information Retrieval as a whole can be seen as a sub-problem of spoken dialogue systems), chatbots' main appeal for technologists is their location at the intersection of all major Natural Language Processing technologies and many of the deepest questions in Cognitive Science today: semantic parsing, entity recognition, knowledge representation, and coreference resolution.
In this talk, I will explore those questions in the context of an applied industry setting, and I will introduce a framework suitable for addressing them, together with an overview of the state-of-the-art in chatbot technology and some original techniques.
Speaker: Vitalii Braslavskyi, Software Engineer at Grammarly
Summary:
Today, the dominant approach to software engineering is an imperative one — the best practices have been proven over time. But the world is always evolving, and in order to evolve with it and remain as productive as possible, we need to continue searching for better tools to solve problems of increasing complexity.
In this talk, we'll discuss the tools and techniques of the .NET ecosystem that can help us concentrate on the problem itself, not just on the intermediate steps (which have likely already been solved). We'll compare imperative and declarative approaches and assess solutions to problems.
We'll also offer examples of how engineers in Grammarly's Office Add-in team use these tools to improve the efficiency of our engineering and strengthen our solutions to the problems at hand.
OpenAI's GPT-3 Language Model - guest Steve Omohundro (Numenta)
In this research meeting, guest Stephen Omohundro gave a fascinating talk on GPT-3, the new massive OpenAI Natural Language Processing model. He reviewed the network architecture, training process, and results in the context of past work. There was extensive discussion on the implications for NLP and for Machine Intelligence / AGI.
Link to GPT-3 paper: https://arxiv.org/abs/2005.14165
Link to YouTube recording of Steve's talk: https://youtu.be/0ZVOmBp29E0
A study states that people are now spending more time in messaging apps than in social networking applications. Messaging apps are trending, and chatbots are the future. Learn everything about chatbots, from their history to their types to how they work, right here.
As a data science intern at Leapcheck Services Private Limited, I developed a simple chatbot using a sequence-to-sequence model built on LSTM recurrent neural networks. I am sharing this tutorial, written for deep learning enthusiasts, to provide basic insight into how a chatbot can be developed with recurrent neural networks.
GPT-3 is a transformer-based neural network that was fed roughly 45 TB of text data. It is non-deterministic, in the sense that given the same input, multiple runs of the engine may return different responses. It was trained on massive datasets, drawn from large portions of the web, containing about 500 billion tokens, and it has a humongous 175 billion parameters, a more-than-100x increase over GPT-2, which was considered state-of-the-art technology with 1.5 billion parameters.
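The non-determinism comes from sampling the next token from the model's output distribution rather than always taking the most likely one. A minimal sketch of temperature sampling over made-up logits (not the actual GPT-3 API):

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert raw scores into a probability distribution.
    Lower temperature sharpens it; higher temperature flattens it."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(tokens, logits, temperature, rng):
    """Draw one token at random according to the softmax probabilities."""
    probs = softmax(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

tokens = ["cat", "dog", "fish"]
logits = [2.0, 1.5, 0.1]  # made-up next-token scores

rng = random.Random(0)
# Repeated runs on the same input can yield different tokens:
print([sample_token(tokens, logits, 0.8, rng) for _ in range(5)])
```

At temperature 0 (greedy argmax) the output becomes deterministic; the sampling step is why identical prompts can produce different completions.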
XAI LANGUAGE TUTOR - A XAI-BASED LANGUAGE LEARNING CHATBOT USING ONTOLOGY AND... (ijnlc)
In this paper, we propose an XAI-based language learning chatbot (the XAI Language Tutor) built using ontology and transfer learning techniques. The XAI Language Tutor consists of three levels of systematic English learning: 1) a phonetics level for speech recognition and pronunciation correction; 2) a semantic level for domain-specific conversation; and 3) a simulation of "free-style conversation" in English, the highest level of language chatbot communication, as a "free-style conversation agent". In terms of academic contribution, we implement an ontology graph to explain the performance of free-style conversation, following the concept of XAI (Explainable Artificial Intelligence) to visualize the connections of the neural network in bionics and to explain the sentences output by the language model. From an implementation perspective, our XAI Language Tutor integrates a WeChat mini-program as the front-end and a GPT-2 model fine-tuned via transfer learning as the back-end, interpreting its responses through the ontology graph.
All of our source code has been uploaded to GitHub: https://github.com/p930203110/EnglishLanguageRobot
Big Data and Natural Language Processing (Michel Bruley)
Natural Language Processing (NLP) is the branch of computer science focused on developing systems that allow computers to communicate with people using everyday language.
This was presented to software developers with the goal of introducing them to the basic machine learning workflow, code snippets, possibilities, and the state of the art in NLP, and to give some clues on where to get started.
Introduction to Natural Language Processing (Pranav Gupta)
The presentation gives a gist of the major tasks and challenges involved in natural language processing. The second part covers one technique each for Part-of-Speech Tagging and Automatic Text Summarization.
OpenAI's GPT-3 language explained under 5 mins (Anshul Nema)
OpenAI, a non-profit AI research company backed by Peter Thiel, Elon Musk, Reid Hoffman, Marc Benioff, Sam Altman, et al., released its third-generation language prediction model (GPT-3) into the wild.
Natural Language Processing: L01 introduction (ananth)
This presentation introduces the Natural Language Processing (NLP) course by enumerating a number of applications, positioning the course, and outlining the challenges presented by natural language text and emerging approaches to topics like word representation.
This is a survey of dialog systems and question answering, covering three generations: (1) symbolic rule/template-based QA; (2) data-driven learning; and (3) data-driven deep learning. It also presents the available frameworks and datasets for dialog systems.
A recap of interesting points and quotes from WSO2CON, the May 2024 open-source application development conference. Focuses primarily on keynotes and panel sessions.
Four European machine translation companies joined forces to build something bigger than themselves: an intelligent platform capable of detecting domains, detecting languages, and balancing load, with a view to creating a marketplace. The project was financed by the European Commission. This is the presentation given by Pangeanic at GALA Boston 2018.
Fabrice Lacroix - Connecting a Chatbot to Your Technical Content: Myth and Re... (LavaCon Conference)
Analysts predict that chatbots will be your customers’ preferred interface. How can you transition from carefully staged demos to real-life customers?
In this session attendees will learn:
How rules-based chatbots work and why those simple demos are so impressive
Why chatbots so often fail as soon as they enter the real world
Why current technologies are inadequate and give unsatisfactory experiences
The difficulties in fueling a chatbot with text-based content
Technological approaches that will let you take your chatbot to the next level of functionality
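The rules-based pattern the session starts from can be sketched in a few lines; the rule table below is invented, and the fallback branch hints at why such bots impress in a staged demo yet break on real-world phrasing:

```python
import re

# A hypothetical rule table: (pattern, canned response).
RULES = [
    (re.compile(r"\b(hi|hello)\b", re.I), "Hello! How can I help you?"),
    (re.compile(r"\breset\b.*\bpassword\b", re.I),
     "To reset your password, open Settings > Account > Reset password."),
    (re.compile(r"\b(bye|goodbye)\b", re.I), "Goodbye!"),
]

FALLBACK = "Sorry, I don't know how to help with that."

def reply(message):
    """Return the response of the first rule whose pattern matches."""
    for pattern, response in RULES:
        if pattern.search(message):
            return response
    return FALLBACK  # any unscripted phrasing lands here

print(reply("Hello there"))
print(reply("How do I reset my password?"))
print(reply("my login thingy is broken"))  # off-script: falls back
```

In a demo, every question is scripted to hit a rule; with real customers, most utterances take the fallback path, which is exactly the failure mode described above.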
Check out my presentation about the challenges we're facing with voice and how to get started successfully. Also viewable online: https://pasteapp.com/p/ANdFfwH2si9
Building a Virtual Assistant with Azure Bot Framework and Unity (Marco Parenzan)
When we talk about (chat)bots, we think of a bot we chat with through text typed on a keyboard. Between typos, buttons, and AdaptiveCards, we tend to lose sight of the true essence of a conversational agent: conversing through dialogue, perhaps by voice. The maximum value of a bot is realized when using the Azure Cognitive Services for Speech-to-Text and Text-to-Speech. We will look at how to develop such a bot, at the use of predefined inputs and multilingual support, and at the consequences for designing a conversation.
Then we'll see how to give the bot "physicality". Using Unity, we will create a scene with a 3D-modeled avatar that synchronizes with the conversation, giving us the feeling of talking to a real person, all in the context of an enterprise scenario.
Now is the time to unravel the mysteries of cloud computing. In this session, part of Study Jams 2023, the speaker will give a briefing on Generative AI.
At its core, the conversational experience is an easy back-and-forth with a customer on a specific channel such as chat or voice.
But this alone isn't enough to understand how it works and why it matters in the current landscape. Learn more about it with this short presentation.
To build your own chatbot or virtual assistant contact us today at - https://bit.ly/2GzL2Nk
Do you understand the differences between pattern recognition, artificial intelligence, and machine learning? And most importantly, what does each bring to the table? In this week's webinar we will tackle the terminology and discuss its recent explosion in popularity, and also look at how the Ogilvy analytics team has applied machine learning methods to effectively answer client challenges and drive value.
Filip Maertens - AI, Machine Learning and Chatbots: Think AI-first (Patrick Van Renterghem)
Filip Maertens presented this "AI, Machine Learning and Chatbots" at the "Future of IT" seminar on 20th of September 2017 in Brussels. Twitter: @fmaertens Email: filip@faction.xyz
Similar to Grammarly AI-NLP Club #2 - Recent advances in applied chatbot technology - Jordi Carrera Ventura
Grammarly AI-NLP Club #10 - Information-Theoretic Probing with Minimum Descri... (Grammarly)
Speaker: Elena Voita, a Ph.D. student at the University of Edinburgh and the University of Amsterdam
Summary: How can you know whether a model (e.g., ELMo, BERT) has learned to encode a linguistic property? The most popular approach to measure how well pretrained representations encode a linguistic property is to use the accuracy of a probing classifier (probe). However, such probes often fail to adequately reflect differences in representations, and they can show different results depending on probe hyperparameters. As an alternative to standard probing, we propose information-theoretic probing which measures minimum description length (MDL) of labels given representations. In addition to probe quality, the description length evaluates “the amount of effort” needed to achieve this quality. We show that (i) MDL can be easily evaluated on top of standard probe-training pipelines, and (ii) compared to standard probes, the results of MDL probing are more informative, stable, and sensible.
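The prequential (online) flavor of description-length accounting can be illustrated with a toy predictor; this is not the paper's probing setup, only the bookkeeping: each label costs -log2 of the probability the model assigned to it before observing it, so a stream the model learns to predict quickly costs fewer bits:

```python
import math
from collections import Counter

def prequential_codelength(labels, num_classes):
    """Online codelength in bits: each label is encoded with the
    probability a Laplace-smoothed frequency model assigns to it
    *before* updating on that label."""
    counts = Counter()
    total_bits = 0.0
    for i, y in enumerate(labels):
        # Laplace-smoothed predictive probability of label y.
        p = (counts[y] + 1) / (i + num_classes)
        total_bits += -math.log2(p)
        counts[y] += 1
    return total_bits

predictable = ["a"] * 50      # easy to compress: low description length
mixed = ["a", "b"] * 25       # 50/50 labels: close to 1 bit per label

easy = prequential_codelength(predictable, num_classes=2)
hard = prequential_codelength(mixed, num_classes=2)
print(f"constant stream: {easy:.1f} bits, mixed stream: {hard:.1f} bits")
```

In MDL probing, the frequency model is replaced by the probe trained on representations, so the codelength reflects both the probe's final accuracy and how much data (effort) it took to reach it.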
Grammarly AI-NLP Club #9 - Dumpster diving for parallel corpora with efficien... (Grammarly)
Speaker: Kenneth Heafield, Lecturer at the University of Edinburgh
Summary: The ParaCrawl project is mining a petabyte of the web for translations to release freely at https://paracrawl.eu/releases.html. But the web is a messy place, with a lot of data to sift through. To find translations, we translate everything into English or at least use a neural encoder. A related project makes machine translation inference more efficient by using optimizations ranging from assembly instructions to removal of bits of model architecture.
Grammarly AI-NLP Club #8 - Arabic Natural Language Processing: Challenges and... (Grammarly)
Speaker: Nizar Habash is an Associate Professor of Computer Science at New York University Abu Dhabi (NYUAD). Professor Habash’s research includes extensive work on machine translation, morphological analysis, and computational modeling of Arabic and its dialects. Professor Habash has been a principal investigator or co-investigator on over 20 grants. He has over 200 publications including a book titled “Introduction to Arabic Natural Language Processing.” His website is www.nizarhabash.com. He is the director of the NYUAD Computational Approaches to Modeling Language (CAMeL) Lab (www.camel-lab.com).
Summary: The Arabic language presents a number of challenges to researchers and developers of language technologies. Arabic is both morphologically rich and highly ambiguous; and it has a number of dialects that vary widely amongst themselves and with Standard Arabic. The dialects have no official spelling standards, and spelling and grammar errors are common in unedited Standard Arabic. In this talk, we present some of these challenges in detail and cover some of the ongoing efforts to address them with creative language technologies.
Grammarly AI-NLP Club #6 - Sequence Tagging using Neural Networks - Artem Che... (Grammarly)
Speaker: Artem Chernodub, Chief Scientist at Clikque Technology and Associate Professor at Ukrainian Catholic University
Summary: Sequence Tagging is an important NLP problem that has several applications, including Named Entity Recognition, Part-of-Speech Tagging, and Argument Component Detection. In our talk, we will focus on a BiLSTM+CNN+CRF model — one of the most popular and efficient neural network-based models for tagging. We will discuss task decomposition for this model, explore the internal design of its components, and provide the ablation study for them on the well-known NER 2003 shared task dataset.
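The CRF layer in such a tagger is decoded with the Viterbi algorithm. A minimal sketch with made-up emission and transition scores (not the talk's trained model):

```python
def viterbi(emissions, transitions, tags):
    """Find the highest-scoring tag sequence.
    emissions[t][tag] : score of `tag` at position t (from the encoder)
    transitions[a][b] : score of moving from tag a to tag b
    """
    # best[tag] = score of the best path ending in `tag` so far
    best = {tag: emissions[0][tag] for tag in tags}
    backptr = []
    for t in range(1, len(emissions)):
        new_best, ptr = {}, {}
        for tag in tags:
            prev = max(tags, key=lambda p: best[p] + transitions[p][tag])
            ptr[tag] = prev
            new_best[tag] = best[prev] + transitions[prev][tag] + emissions[t][tag]
        backptr.append(ptr)
        best = new_best
    # Backtrack from the best final tag.
    last = max(tags, key=lambda tag: best[tag])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    return list(reversed(path))

tags = ["B", "I", "O"]
emissions = [  # invented per-token tag scores for a 3-token sentence
    {"B": 2.0, "I": 0.0, "O": 1.0},
    {"B": 0.0, "I": 1.5, "O": 1.4},
    {"B": 0.0, "I": 0.0, "O": 2.0},
]
transitions = {  # invented; O -> I is strongly penalized, as in BIO tagging
    "B": {"B": -1.0, "I": 0.5, "O": 0.0},
    "I": {"B": -1.0, "I": 0.5, "O": 0.0},
    "O": {"B": 0.0, "I": -10.0, "O": 0.5},
}

print(viterbi(emissions, transitions, tags))  # ['B', 'I', 'O']
```

The transition scores are what the CRF adds over per-token softmax decoding: they let the model veto structurally invalid sequences such as an I tag following an O.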
Grammarly AI-NLP Club #5 - Automatic text simplification in the biomedical do... (Grammarly)
Speaker: Natalia Grabar, NLP scientist
Summary: We propose a set of experiments with the general objective of ensuring a better understanding of technical health documents. Various experiments address different steps of this complex and ambitious process: (1) categorization of documents according to their complexity; (2) detection of complex passages within documents; (3) acquisition of resources for the lexical and semantic simplification of documents; (4) alignment of parallel sentences from comparable corpora for generating rules for syntactic transformation. According to the steps and tasks, various methods are exploited (rule-based, machine learning, with and without linguistic knowledge). In addition to text simplification, the results and resources can be used for other NLP applications and tasks (e.g., information retrieval and extraction, question-answering, textual entailment).
Grammarly AI-NLP Club #3 - Learning to Read for Automated Fact Checking - Isa... (Grammarly)
Speaker: Isabelle Augenstein, Assistant Professor, University of Copenhagen
Summary: The spread of misinformation and disinformation is growing, and it’s having a big impact on interpersonal communications, politics and even science.
Traditional methods, e.g., manual fact-checking by reporters, cannot keep up with the growth of information. On the other hand, there has been much progress in natural language processing recently, partly due to the resurgence of neural methods.
How can natural language processing methods fill this gap and help to automatically check facts?
This talk will explore different ways to frame fact checking and detail our ongoing work on learning to encode documents for automated fact checking, as well as describe future challenges.
Grammarly AI-NLP Club #4 - Understanding and assessing language with neural n... (Grammarly)
Speaker: Marek Rei, Senior Research Associate, University of Cambridge
Summary: The number of people learning English around the world is currently estimated at 1.5 billion and is predicted to exceed 1.9 billion by 2020. The increasing need to communicate beyond borders has created a large unmet demand for qualified language teachers across the globe. Computational models for error detection and essay scoring can alleviate this issue by giving millions of people access to affordable learning resources. Successful systems for automated language teaching will need to analyse language at various levels of granularity and provide useful feedback to individual students. In this talk, we will explore some of the latest approaches to written language assessment, using neural architectures for composing the meaning of a sentence or text, and also discuss potential future directions in the field.
Grammarly Meetup: DevOps at Grammarly: Scaling 100x (Grammarly)
Speaker: Dmitry Unkovsky, Software Engineer at Grammarly
Summary: We will tell the story of DevOps at Grammarly since 2013. We’ll talk about how we managed infrastructure growth while keeping up with the rapid pace of product development; what worked for us and what did not, and why; and what it’s like to make technical choices as an engineer at our company. We will share our current vision and future plans.
Grammarly Meetup: Memory Networks for Question Answering on Tabular Data - Sv... (Grammarly)
Tabular data is difficult to analyze and search through. There is a clear need for new tools and interfaces that would allow even non-tech-savvy users to gain insights from open datasets without resorting to specialized data analysis tools or even having to fully understand the dataset structure. We explore the End-To-End Memory Networks architecture (Sukhbaatar et al., 2015) in application to answering natural language questions from tabular data. This architecture was originally designed for the question-answering tasks from short natural language texts (bAbI tasks) (Weston et al., 2015), which include testing elements of inductive and deductive reasoning, co-reference resolution and time manipulation.
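The core read operation of a single-hop end-to-end memory network is soft attention: score each memory against the query, normalize with a softmax, and read out the weighted sum. A toy sketch with hand-picked vectors rather than learned bAbI embeddings:

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def memory_hop(query, memories):
    """One attention hop: weight each memory by its match to the query
    and read out the attention-weighted sum of memory vectors."""
    weights = softmax([dot(query, m) for m in memories])
    dim = len(memories[0])
    readout = [sum(w * m[i] for w, m in zip(weights, memories))
               for i in range(dim)]
    return weights, readout

# Hand-picked 3-d "embeddings"; memory 0 is most similar to the query.
memories = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
query = [2.0, 0.1, 0.1]

weights, readout = memory_hop(query, memories)
print([round(w, 3) for w in weights])  # highest weight on memory 0
```

The full architecture stacks several such hops and learns the embedding matrices end-to-end; for tabular QA, each memory slot would encode a cell or row of the table.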
Grammarly AI-NLP Club #1 - Domain and Social Bias in NLP: Case Study in Langu... (Grammarly)
Speaker: Tim Baldwin, Professor of Computer Science, University of Melbourne
Summary: Two forms of bias that are commonly associated with natural language processing (NLP) tasks are domain bias (implicit bias towards documents from a particular domain, with lower performance over other document types) and social bias (implicit bias towards documents authored by particular types of individuals, with lower performance over documents authored by other types of individuals). In this talk, I will discuss the importance of debiasing NLP models across these dimensions, and strategies that can be employed to achieve this. I will focus the talk on the task of language identification (i.e., identifying the language(s) a written document is authored in).
Speaker: Andriy Gryshchuk, Senior Research Engineer at Grammarly
Speaker: Yuriy Guts, Machine Learning Engineer at DataRobot
Summary: Paraphrase detection is a challenging NLP task since it requires both thorough syntactic and thorough semantic analysis to identify whether two phrases have the same intent. A few months ago, paraphrase identification became the objective of one of the most popular Kaggle competitions, Quora Question Pairs. In this talk, Yuriy Guts and Andriy Gryshchuk, silver medalists of the competition, will share their arsenal of statistical, linguistic, and Deep Learning approaches that helped them succeed in this challenge.
Natural Language Processing for biomedical text mining - Thierry Hamon (Grammarly)
Speaker: Thierry Hamon, Associate Professor in Computer Science at Université Paris, Member of the LIMSI-CNRS research lab.
Summary: Among the large amounts of unstructured data generated across the world and available nowadays, textual data represents an important source of information. This fact is particularly true in the biomedical domain, where a constantly increasing demand to access textual content is observed: the situation is relevant for accessing and processing Electronic Health Records, online discussion forums, and the scientific literature. Indeed, dealing with biomedical texts requires us to take into account a great variety of texts, languages, and users.
For several years now, a lot of NLP research has focused on mining and retrieving information (i.e., medical entities and domain-specific relations), which are relevant for biologists, physicians, terminologists, epidemiologists, and patients. We will propose an overview of the NLP methods used for tackling several such research problems through text mining applications. First, we will present the resources and rule-based approaches we designed for extracting drug-related information from clinical texts, and for acquiring domain-specific semantic relations from digital libraries. Then we will present the cross-lingual approach we are developing for building multilingual terminologies from a patient-centered Ukrainian corpus.
"Impact of front-end architecture on development cost", Viktor Turskyi (Fwdays)
I have heard many times that architecture is not important for the front-end. I have also often seen developers implement features on the front-end just by following the standard rules of a framework, thinking that this is enough to successfully launch the project, and then the project fails. How can you prevent this, and which approach should you choose? I have launched dozens of complex projects, and during this talk we will analyze which approaches have worked for me and which have not.
Search and Society: Reimagining Information Access for Radical Futures (Bhaskar Mitra)
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build, inspired by diverse, explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies need to be explicitly articulated, and we need to develop theories of change in the context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
PHP Frameworks: I want to break free (IPC Berlin 2024) (Ralf Eggert)
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk aims to encourage more independent use of PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
Essentials of Automations: Optimizing FME Workflows with Parameters (Safe Software)
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Neuro-symbolic is not enough, we need neuro-*semantic* (Frank van Harmelen)
Neuro-symbolic (NeSy) AI is on the rise. However, simply applying machine learning to just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be realized when the symbolic structures have an actual semantics. I give an operational definition of semantics as "predictable inference".
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Grammarly AI-NLP Club #2 - Recent advances in applied chatbot technology - Jordi Carrera Ventura
1. Recent advances in applied chatbot technology
to be presented at AI Ukraine 2016
Jordi Carrera Ventura
NLP scientist
Telefónica Research & Development
4. Chatbots are a form of conversational artificial intelligence (AI) in which a user
interacts with a virtual agent through natural language messaging in
• a messaging interface like Slack or Facebook Messenger
• or a voice interface like Amazon Echo or Google Assistant.
Chatbot (sense 1)
A bot that lives in a chat (an automation routine inside a UI).
Conversational agent
Virtual assistant*
A bot that takes natural language as input and returns it as output.
Chatbot (sense 2)
A virtual assistant that lives in a chat.
User> what are you?
* Generally assumed to have broader coverage and more advanced AI than chatbots in sense 2.
5. Usually intended to
• get quick answers to specific questions over some pre-defined
repository of knowledge
• perform transactions in a faster and more natural way.
Can be used to
• surface contextually relevant information
• help a user complete an online transaction
• serve as a helpdesk agent to resolve a customer’s issue without
ever involving a human.
Virtual assistants for
Customer support, e-commerce, expense management, booking
flights, booking meetings, data science...
Use cases
6. Advantages of chatbots
From: https://medium.com/@csengerszab/messenger-chatbots-is-it-a-new-revolution-6f550110c940
7. Actually,
As an interface, chatbots are not the best.
Nothing beats a spreadsheet for most things. For data
already tabulated, a chatbot may actually slow things down.
A well-designed UI can let a user do anything in a few
taps, as opposed to typing whole sentences in natural
language.
But chatbots can generally be regarded as universal: every
channel supports text. As a UX, chatbots are effective as a
protocol, like email.
8. What's really important...
Success conditions
• Talk like a person.
• The "back-end" MUST NOT BE a person.
• Provide a single interface to access all other services.
• Useful in industrial processes: response automation given
specific, highly repetitive linguistic inputs. Not conversations,
but ~auto-replies linked to more refined email filters.
• Knowledge representation.
9. What's really important...
Conversational technology should really be about
• dynamically finding the best possible way to browse a large
repository of information/actions.
• finding the shortest path to any relevant action or piece of
information (to avoid the plane-dashboard effect).
• surfacing implicit data in unstructured content ("bocadillo de
calamares in Madrid"): rather than going open-domain, taking
the closed-domain and going deep into it.
21. ... and the hype
Getting started with bots is as simple as using any of a handful of
new bot platforms that aim to make bot creation easy...
... sophisticated bots require an understanding of natural language
processing (NLP) and other areas of artificial intelligence.
The optimists
22. ... and the hype
Artificial intelligence has advanced remarkably in the last three
years, and is reaching the point where computers can have
moderately complex conversations with users in natural language
<perform simple actions in response to primitive linguistic
signals>.
The optimists
23. ... and the hype
Most of the value of deep learning today is in narrow domains where you
can get a lot of data. Here’s one example of something it cannot do: have
a meaningful conversation. There are demos, and if you cherry-pick the
conversation, it looks like it’s having a meaningful conversation, but if you
actually try it yourself, it quickly goes off the rails.
Andrew Ng (Chief Scientist, AI, Baidu)
24. ... and the hype
Many companies start off by outsourcing their conversations to human
workers and promise that they can “automate” it once they’ve collected
enough data.
That’s likely to happen only if they are operating in a pretty narrow
domain – like a chat interface to call an Uber for example.
Anything that’s a bit more open domain (like sales emails) is beyond what
we can currently do.
However, we can also use these systems to assist human workers by
proposing and correcting responses. That’s much more feasible.
Andrew Ng (Chief Scientist, AI, Baidu)
26. Lack of suitable metrics
Industrial state-of-the-art
Liu, Chia-Wei, et al. "How NOT to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation." arXiv preprint arXiv:1603.08023 (2016).
27. By Intento (https://inten.to)
Dataset
SNIPS.ai 2017 NLU Benchmark
https://github.com/snipsco/nlu-benchmark
Example intents
SearchCreativeWork (e.g. Find me the I, Robot television show)
GetWeather (e.g. Is it windy in Boston, MA right now?)
BookRestaurant (e.g. I want to book a highly rated restaurant for me and my boyfriend
tomorrow night)
PlayMusic (e.g. Play the last track from Beyoncé off Spotify)
AddToPlaylist (e.g. Add Diamonds to my roadtrip playlist)
RateBook (e.g. Give 6 stars to Of Mice and Men)
SearchScreeningEvent (e.g. Check the showtimes for Wonder Woman in Paris)
Industrial state-of-the-art
Benchmark
28. Methodology
English language.
Removed duplicates that differ only in whitespace, quotes,
lettercase, etc.
Resulting dataset parameters
7 intents, 15.6K utterances (~2K per intent)
3-fold 80/20 cross-validation.
Most providers do not offer programmatic interfaces for adding training
data.
Industrial state-of-the-art
Benchmark
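The duplicate-removal step described above (collapsing utterances that differ only in whitespace, quotes, or lettercase) can be sketched as follows. The normalization rules are illustrative, not the benchmark's exact preprocessing:

```python
import re

def normalize(utterance: str) -> str:
    """Canonical form used only for duplicate detection:
    lowercase, strip quote characters, collapse whitespace."""
    s = utterance.lower()
    s = re.sub(r"[\"'`\u2018\u2019\u201c\u201d]", "", s)  # drop quotes
    s = re.sub(r"\s+", " ", s).strip()                    # collapse whitespace
    return s

def deduplicate(utterances):
    """Keep the first surface form seen for each canonical form."""
    seen, kept = set(), []
    for u in utterances:
        key = normalize(u)
        if key not in seen:
            seen.add(key)
            kept.append(u)
    return kept

samples = [
    "Is it windy in Boston, MA right now?",
    "is it windy in boston,  ma right now?",
    "Play the last track from Beyoncé off Spotify",
]
print(deduplicate(samples))  # the second variant is dropped
```

Note that deduplicating on a canonical key while keeping the original surface form preserves the natural spelling variation that the classifiers are later evaluated on.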
29. Framework Since F1
False
positives
Response
time
Price
IBM Watson
Conversation
2015 99.7 100% 0.35 2.5
API.ai (Google 2016) 2010 99.6 40% 0.28 Free
Microsoft LUIS 2015 99.2 100% 0.21 0.75
Amazon Lex 2016 96.5 82% 0.43 0.75
Recast.ai 2016 97 75% 2.06 N/A
wit.ai (Facebook) 2013 97.4 72% 0.96 Free
SNIPS 2017 97.5 26% 0.36 Per device
Industrial state-of-the-art
32. So, let's be real...
Welcome to Macy's blah-blah... !
hi im a guy looking for leda jackets
We've got what you're looking for!
<image product_id=1 item_id=1>
that's a nice one but, i want to see
different jackets .... how can i do that
33. So, let's be real...
We've got what you're looking for!
<image product_id=1 item_id=2>
where can i find a pair of shoes that
match this yak etc.
We've got what you're looking for!
<image product_id=1 item_id=3>
pair of shoes
(, you talking dishwasher)
34. So, let's be real...
Welcome to Macy's blah-blah... !
hi im a guy looking for leda jackets
lexical variants:
synonyms, paraphrases
35. So, let's be real...
lexical variants:
synonyms, paraphrases
Intent
36. So, let's be real...
lexical variants:
synonyms, paraphrases
Intent
Entity
37. So, let's be real...
lexical variants:
synonyms, paraphrases
Intent
This can be normalized,
e.g. regular expressions
Entity
38. So, let's be real...
lexical variants:
synonyms, paraphrases
Intent
This can also be normalized,
e.g. regular expressions
Entity
39. So, let's be real...
lexical variants:
synonyms, paraphrases
Intent
This can also be normalized,
e.g. regular expressions
Entity
1-week project on a regular-expression-based filter.
From 89% to 95% F1 with only a 5% positive rate.
7-label domain classification task using a
RandomForest classifier over TFIDF-weighted vectors.
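The pipeline just described (a regular-expression normalization stage feeding a domain classifier over TFIDF-weighted vectors) can be sketched roughly as below. The patterns, labels, and training utterances are invented for illustration, and a dependency-free nearest-centroid scorer stands in for the RandomForest classifier used in the actual project:

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical normalization stage: regular expressions collapse
# entity mentions into placeholder tokens before classification.
PATTERNS = [
    (re.compile(r"\b(internet|wifi|broadband)\b"), "_CONNECTION_"),
    (re.compile(r"\b(smartphone|phone|mobile)\b"), "_DEVICE_"),
]

def normalize(text):
    text = text.lower()
    for pattern, placeholder in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text.split()

def tfidf_vectors(docs):
    """Plain TF-IDF weighting over the toy corpus."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    idf = {t: math.log(n / df[t]) for t in df}
    return [{t: c * idf[t] for t, c in Counter(d).items()} for d in docs]

def cosine(u, v):
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

TRAIN = [
    ("my internet does not work", "TechnicalSupport"),
    ("my wifi keeps dropping", "TechnicalSupport"),
    ("i want to check my bill", "Billing"),
    ("how much do i owe this month", "Billing"),
]

docs = [normalize(text) for text, _ in TRAIN]
centroids = defaultdict(Counter)
for vec, (_, label) in zip(tfidf_vectors(docs), TRAIN):
    for t, w in vec.items():
        centroids[label][t] += w

def classify(text):
    query = Counter(normalize(text))  # raw term frequencies at runtime
    return max(centroids, key=lambda label: cosine(query, centroids[label]))

print(classify("my broadband does not work"))  # → TechnicalSupport
```

The point of the regex stage is visible in the last line: "broadband" was never seen in training, but it normalizes to the same placeholder as "internet" and "wifi", so the classifier generalizes over the entity class.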
40. So, let's be real...
lexical variants:
synonyms, paraphrases
Intent My internet connection does not work.
My smartphone does not work.
My landline does not work.
My router does not work.
My SIM does not work.
My payment method does not work.
My saved movies does not work.
41. So, let's be real...
lexical variants:
synonyms, paraphrases
Intent My internet connection does not work.
My smartphone does not work.
My landline does not work.
My router does not work.
My SIM does not work.
My payment method does not work.
My saved movies does not work.
ConnectionDebugWizard(*)
TechnicalSupport(Mobile)
TechnicalSupport(Landline)
RouterDebugWizard()
VerifyDeviceSettings
AccountVerificationProcess
TVDebugWizard
42. So, let's be real...
lexical variants:
synonyms, paraphrases
Intent My internet connection does not work.
My smartphone does not work.
My landline does not work.
My router does not work.
My SIM does not work.
My payment method does not work.
My saved movies does not work.
ConnectionDebugWizard(*)
TechnicalSupport(Mobile)
TechnicalSupport(Landline)
RouterDebugWizard()
VerifyDeviceSettings
AccountVerificationProcess
TVDebugWizard
Since we associate each intent with an action,
the ability to discriminate between actions presupposes
training as many separate intents,
with just as many ambiguities.
43. So, let's be real...
lexical variants:
synonyms, paraphrases
Intent My internet connection does not work.
My smartphone does not work.
My landline does not work.
My router does not work.
My SIM does not work.
My payment method does not work.
My saved movies does not work.
ConnectionDebugWizard(*)
TechnicalSupport(Mobile)
TechnicalSupport(Landline)
RouterDebugWizard()
VerifyDeviceSettings
AccountVerificationProcess
TVDebugWizard
It also presupposes exponentially increasing
amounts of training data as we add
higher-precision entities to our model
(every entity * every intent that applies)
44. So, let's be real...
lexical variants:
synonyms, paraphrases
Intent My internet connection does not work.
My smartphone does not work.
My landline does not work.
My router does not work.
My SIM does not work.
My payment method does not work.
My saved movies does not work.
ConnectionDebugWizard(*)
TechnicalSupport(Mobile)
TechnicalSupport(Landline)
RouterDebugWizard()
VerifyDeviceSettings
AccountVerificationProcess
TVDebugWizard
The entity resolution step (which would have
helped us make the right decision) is nested
downstream in our pipeline.
Any errors from the previous stage
will propagate down.
45. So, let's be real...
Welcome to Macy's blah-blah... !
hi im a guy looking for leda jackets
We've got what you're looking for!
<image product_id=1 item_id=1>
that's a nice one but, I want to see
different jackets .... how can i do that
lexical variants:
synonyms, paraphrases formal variants
(typos, ASR errors)
multiword
chunking/parsing
morphological
variants
46. So, let's be real...
Welcome to Macy's blah-blah... !
hi im a guy looking for leda jackets
We've got what you're looking for!
<image product_id=1 item_id=1>
lexical variants:
synonyms, paraphrases formal variants
(typos, ASR errors)
multiword
chunking/parsing
morphological
variants
entity recognition
hey i need to buy a men's red leather
jacket with straps for my friend
that's a nice one but, I want to see
different jackets .... how can i do that
47. So, let's be real...
where can i find a <jacket with straps>
looking for a <black mens jacket>
do you have anything in <leather>
<present> for a friend
}Our training
dataset
buy <red jacket>
48. So, let's be real...
where can i buy <a men's red leather
jacket with straps> for my friend
where can i find a <jacket with straps>
looking for a <black mens jacket>
do you have anything in <leather>
<present> for a friend
hi i need <a men's red leather jacket with
straps> that my friend can wear
}Our training
dataset
}Predictions
at runtime
buy <red jacket>
49. So, let's be real...
where can i buy <a men's red leather
jacket with straps> for my friend
where can i find a <jacket with straps>
looking for a <black mens jacket>
do you have anything in <leather>
<present> for a friend
hi i need <a men's red leather jacket with
straps> that my friend can wear
}Our training
dataset
}Predictions
at runtime
Incomplete training data
will cause
entity segmentation issues
buy <red jacket>
50. So, let's be real...
buy <red jacket>
where can i find a <jacket with straps>
looking for a <black mens jacket>
do you have anything in <leather>
<present> for a friend
}Our training
dataset
But more importantly, in the intent-entity
paradigm we are usually unable to train two
entity types with the same distribution:
where can i buy a <red leather jacket>
where can i buy a <red leather wallet>
jacket, 0.5
wallet, 0.5
jacket, 0.5
wallet, 0.5
51. So, let's be real...
buy <red jacket>
where can i find a <jacket with straps>
looking for a <black mens jacket>
do you have anything in <leather>
<present> for a friend
}Our training
dataset
... and have no way of detecting entities
used in isolation:
pair of shoes
MakePurchase
ReturnPurchase
['START', 3-gram, 'END']
GetWeather
Greeting
Goodbye
52. that's a nice one but, I want to see
different jackets .... how can i do that
So, let's be real...
Welcome to Macy's blah-blah... !
hi im a guy looking for leda jackets
We've got what you're looking for!
<image product_id=1 item_id=1>
lexical variants:
synonyms, paraphrases formal variants
(typos, ASR errors)
multiword
chunking/parsing
morphological
variants
entity recognition
hey i need to buy a men's red leather
jacket with straps for my friend
searchItemByText("leather jacket")
searchItemByText(
"to buy a men's ... pfff TLDR... my friend"
)
[
{ "name": "leather jacket",
"product_category_id": 12,
"gender_id": 0,
"materials_id": [68, 34, ...],
"features_id": [],
},
...
{ "name": "leather jacket with straps",
"product_category_id": 12,
"gender_id": 1,
"materials_id": [68, 34, ...],
"features_id": [8754],
},
]
53. that's a nice one but, I want to see
different jackets .... how can i do that
So, let's be real...
Welcome to Macy's blah-blah... !
hi im a guy looking for leda jackets
We've got what you're looking for!
<image product_id=1 item_id=1>
lexical variants:
synonyms, paraphrases formal variants
(typos, ASR errors)
multiword
chunking/parsing
morphological
variants
entity recognition
hey i need to buy a men's red leather
jacket with straps for my friend
searchItemByText("leather jacket")
searchItemByText(
"red leather jacket with straps"
)
[
{ "name": "leather jacket",
"product_category_id": 12,
"gender_id": 0,
"materials_id": [68, 34, ...],
"features_id": [],
},
...
{ "name": "leather jacket with straps",
"product_category_id": 12,
"gender_id": 1,
"materials_id": [68, 34, ...],
"features_id": [8754],
},
]
NO IDs,
NO reasonable expectation of a
cognitively realistic answer and
NO "conversational technology".
54. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
Straps
Jackets
Leather items
Since the conditions are joined by a logical AND and we will search for an item fulfilling all of them, we could actually apply the terms of the conjunction in any order.
55. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
Given
• any known word, denoted by w,
• the vocabulary V of all known words, such that w_i ∈ V for any i,
• a search space D, where d denotes every document in the collection D, such that d ∈ D and d = {w_i, w_i+1, ..., w_n}, with w_x ∈ V for every i ≤ x ≤ n,
• a function filter(D, w_x) that returns a subset of D, D_wx, where w_x ∈ d is true for every d: d ∈ D_wx, and
• query q = {"men's", 'red', 'leather', 'jacket', 'straps'},
56. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
do filter(filter(filter(filter(
filter(D, 'leather'), 'straps'), "men's"), 'jacket'),'red'
) =
do filter(filter(filter(filter(
filter(D, 'straps'), 'red'), 'jacket'), 'leather'),"men's"
) =
do filter(filter(filter(filter(
filter(D, "men's"), 'leather'), 'red'), 'straps'),'jacket'
) =
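The order-independence shown above can be verified with a minimal sketch, modeling each document as its set of words (an assumption made for brevity; filter(D, w) is otherwise as defined on the previous slide):

```python
from functools import reduce

def filter_docs(docs, word):
    """filter(D, w) from the slide: the documents that contain w."""
    return [d for d in docs if word in d]

# Toy search space: each document modeled as its set of words.
D = [
    {"men's", "red", "leather", "jacket", "straps"},
    {"red", "leather", "wallet"},
    {"leather", "straps", "men's", "watch"},
    {"red", "jacket", "hood"},
]
q = ["men's", "red", "leather", "jacket", "straps"]

# Each filter is a set-membership AND, so any application order
# yields the same result set.
forward = reduce(filter_docs, q, D)
backward = reduce(filter_docs, list(reversed(q)), D)
assert forward == backward
print(forward)  # only the first document survives all five filters
```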
57. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
Given query q = {"men's", 'red', 'leather', 'jacket', 'straps'},
find the d: d ∈ D that satisfies:
containsSubstring(d, 'leather') ∧
containsSubstring(d, 'straps') ∧
...
containsSubstring(d, "men's")
As a result, the system will lack any notion of what counts as a partial match.
We cannot use more complex predicates because, in this scenario,
the system has no understanding of the internal structure of the
phrase, its semantic dependencies, or its syntactic dependencies.
58. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
The following partial matches will be
virtually indistinguishable:
• leather straps for men's watches (3/5)
• men's vintage red leather shoes (3/5)
• red wallet with leather straps (3/5)
• red leather jackets with hood (3/5)
59. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
do filter(filter(filter(filter(
filter(D, "men's"), 'red'), 'leather'), 'jacket'),'straps'
)
Given
• any known word, denoted by w,
• the vocabulary V of all known words, such that w_i ∈ V for any i,
• a search space D, where d denotes every document in the collection D, such that d ∈ D and d = {w_i, w_i+1, ..., w_n}, with w_x ∈ V for every i ≤ x ≤ n,
• a function filter(D, w_x) that returns a subset of D, D_wx, where w_x ∈ d is true for every d: d ∈ D_wx, and
• query q = {"men's", 'red', 'leather', 'jacket', 'straps'},
This is our problem:
we're using a function
that treats every w_x equally.
60. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
do filter(filter(filter(filter(
filter(D, "men's"), 'red'), 'leather'), 'jacket'),'straps'
)
Given
• any known word, denoted by w,
• the vocabulary V of all known words, such that w_i ∈ V for any i,
• a search space D, where d denotes every document in the collection D, such that d ∈ D and d = {w_i, w_i+1, ..., w_n}, with w_x ∈ V for every i ≤ x ≤ n,
• a function filter(D, w_x) that returns a subset of D, D_wx, where w_x ∈ d is true for every d: d ∈ D_wx, and
• query q = {"men's", 'red', 'leather', 'jacket', 'straps'},
We need a function with at least one additional parameter, r_x:
filter(D, w_x, r_x)
where r_x denotes the depth at which the dependency tree headed by w_x attaches to the node currently being evaluated.
61. So, let's be real...
Your query
i need to buy a men's red leather jacket with straps for my friend
Tagging
i/FW need/VBP to/TO buy/VB a/DT men/NNS 's/POS red/JJ leather/NN jacket/NN with/IN
straps/NNS for/IN my/PRP$ friend/NN
Parse
(ROOT
(S
(NP (FW i))
(VP (VBP need)
(S
(VP (TO to)
(VP (VB buy)
(NP
(NP (DT a) (NNS men) (POS 's))
(JJ red) (NN leather) (NN jacket))
(PP (IN with)
(NP
(NP (NNS straps))
(PP (IN for)
(NP (PRP$ my) (NN friend)))))))))))
http://nlp.stanford.edu:8080/parser/index.jsp (as of September 19th 2017)
62. So, let's be real...
Your query
i need to buy a men's red leather jacket with straps for my friend
Tagging
i/FW need/VBP to/TO buy/VB a/DT men/NNS 's/POS red/JJ leather/NN jacket/NN with/IN
straps/NNS for/IN my/PRP$ friend/NN
Parse
(ROOT
(S
(NP (FW i))
(VP (VBP need)
(S
(VP (TO to)
(VP (VB buy)
(NP
(NP (DT a) ((NNS men) (POS 's))
((JJ red) (NN leather)) (NN jacket)))
(PP (IN with)
(NP
(NP (NNS straps))
(PP (IN for)
(NP (PRP$ my) (NN friend)))))))))))
http://nlp.stanford.edu:8080/parser/index.jsp (as of September 19th 2017)
63. So, let's be real...
Your query
i need to buy a men's red leather jacket with straps for my friend
Tagging
i/FW need/VBP to/TO buy/VB a/DT men/NNS 's/POS red/JJ leather/NN jacket/NN with/IN
straps/NNS for/IN my/PRP$ friend/NN
Parse
(ROOT
(S
(NP (FW i))
(VP (VBP need)
(S
(VP (TO to)
(VP (VB buy)
(NP
(NP (DT a) ((NNS men) (POS 's))
((JJ red) (NN leather)) (NN jacket)))
(PP (IN with)
(NP
(NP (NNS straps))
(PP (IN for)
(NP (PRP$ my) (NN friend)))))))))))
http://nlp.stanford.edu:8080/parser/index.jsp (as of September 19th 2017)
(red, leather)
(leather, jacket)
(jacket, buy)
64. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
do filter(filter(filter(filter(
filter(D, "men's"), 'red'), 'leather'), 'jacket'),'straps'
)
Given
• any known word, denoted by w,
• the vocabulary V of all known words, such that w_i ∈ V for any i,
• a search space D, where d denotes every document in the collection D, such that d ∈ D and d = {w_i, w_i+1, ..., w_n}, with w_x ∈ V for every i ≤ x ≤ n,
• a function filter(D, w_x) that returns a subset of D, D_wx, where w_x ∈ d is true for every d: d ∈ D_wx, and
• query q = {"men's", 'red', 'leather', 'jacket', 'straps'},
The output of this function will no longer be a subset of search results but a ranked list. This list can be naively ranked by the score ∑_{x=1..|t|} (1 / r_x), where t is the set of query terms matched, although much more advanced scoring functions can be used.
65. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
Given:
q: "red leather jacket"
a1: "black leather jacket"
a2: "red leather wallet"
66. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
>>> t1 = dependencyTree(a1)
>>> t1.depth("black")
3
>>> t1.depth("jacket")
1
Given:
q: "red leather jacket"
a1: "black leather jacket"
a2: "red leather wallet"
67. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
>>> results = filter(D, "jacket", t1.depth("jacket"))
>>> results
[ ( Jacket1, 1.0 ),
( Jacket2, 1.0 ),
...,
( Jacketn, 1.0 )
]
Given:
q: "red leather jacket"
a1: "black leather jacket"
a2: "red leather wallet"
68. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
>>> results = filter(results, "leather", t1.depth("leather"))
>>> results
[ ( Jacket1, 1.5 ),
( Jacket2, 1.5 ),
...,
( Jacketn - x, 1.5 )
]
Given:
q: "red leather jacket"
a1: "black leather jacket"
a2: "red leather wallet"
69. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
Given:
q: "red leather jacket"
a1: "black leather jacket"
a2: "red leather wallet"
>>> results = filter(results, "red", t1.depth("red"))
>>> results
[ ( Jacket1, 1.5 ),
( Jacket2, 1.5 ),
...,
( Jacketn - x, 1.5 )
]
70. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
>>> t2 = dependencyTree(a2)
>>> t2.depth("red")
3
>>> t2.depth("jacket")
None
Given:
q: "red leather jacket"
a1: "black leather jacket"
a2: "red leather wallet"
71. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
>>> results = filter(D, "jacket", t2.depth("jacket"))
>>> results
[]
Given:
q: "red leather jacket"
a1: "black leather jacket"
a2: "red leather wallet"
72. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
>>> results = filter(results, "leather", t2.depth("leather"))
>>> results
[ ( Wallet1, 0.5 ),
( Wallet2, 0.5 ),
...,
( Strapn, 0.5 )
]
Given:
q: "red leather jacket"
a1: "black leather jacket"
a2: "red leather wallet"
73. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
Given:
q: "red leather jacket"
a1: "black leather jacket"
a2: "red leather wallet"
>>> results = filter(results, "red", t2.depth("red"))
>>> results
[ ( Wallet1, 0.83 ),
( Wallet2, 0.83 ),
...,
( Strapn - x, 0.83 )
]
74. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
Results for q2:
"red leather wallet"
[ ( Wallet1, 0.83 ),
( Wallet2, 0.83 ),
...,
( Strapn - x, 0.83 )
]
Results for q1:
"black leather jacket"
[ ( Jacket1, 1.5 ),
( Jacket2, 1.5 ),
...,
( Jacketn - x, 1.5 )
]
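The REPL traces on the preceding slides can be condensed into one runnable sketch of the naive 1/r scoring. The dependency depths are hand-assigned to mirror the trace (head = depth 1); a real system would derive them from a dependency parser:

```python
def score(query_terms, depths):
    """Naive ranking score from the slides: each query term found in the
    candidate contributes 1/r, where r is its depth in the candidate's
    dependency tree; absent terms contribute nothing."""
    return sum(1.0 / depths[w] for w in query_terms if w in depths)

q = ["red", "leather", "jacket"]

# Depths hand-assigned to mirror the REPL trace above.
a1 = {"jacket": 1, "leather": 2, "black": 3}  # "black leather jacket"
a2 = {"wallet": 1, "leather": 2, "red": 3}    # "red leather wallet"

print(score(q, a1))  # 1/1 + 1/2 = 1.5
print(score(q, a2))  # 1/2 + 1/3 ≈ 0.83
```

The head match ("jacket" at depth 1) dominates the score, which is exactly why "black leather jacket" outranks "red leather wallet" despite both sharing two surface words with the query.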
75. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
However, the dependency tree is not always available, and it
often contains parsing issues (cf. Stanford parser output).
An evaluation of parser robustness over noisy input reports
performances down to an average of 80% across datasets
and parsers.
Hashemi, Homa B., and Rebecca Hwa. "An Evaluation of Parser Robustness for Ungrammatical Sentences." EMNLP 2016.
76. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
In order to make this decision
[ [red leather] jacket ]👍
77. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
[ [red leather] jacket ]
[ red [leather jacket] ]👍
In order to make this decision
78. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
[ [red leather] jacket ]
[ red [leather jacket] ]
[ men's [leather jacket] ]👍
In order to make this decision
79. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
[ [red leather] jacket ]
[ red [leather jacket] ]
[ men's [leather jacket] ]
😱
In order to make this decision
[ [men's leather] jacket ]
80. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
[ [red leather] jacket ]
[ red [leather jacket] ]
[ men's [leather jacket] ]
[ [men's leather] jacket ]😱
In order to make this decision
... the parser already needs
as much information as we
can provide regarding head-
modifier attachments.
81. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
[ [red leather] jacket ]
[ red [leather jacket] ]
[ men's [leather jacket] ]
[ [men's leather] jacket ]
In order to make this decision
2 2 1
2 1 1
2 1 1
2 2 1
82. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
In order to make this decision
Same pattern, different score. How do we assign a probability such
that less plausible results are buried at the bottom of the
hypothesis space? (Not only for display purposes, but also for
dynamic-programming reasons.)
[ [red leather] jacket ]
[ red [leather jacket] ]
[ men's [leather jacket] ]
[ [men's leather] jacket ]
2 2 1
2 1 1
2 1 1
2 2 1
83. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
In order to make this decision
p(red | jacket) ~ p(red | leather)
p(men's | jacket) > p(men's | leather)
[ [red leather] jacket ]
[ red [leather jacket] ]
[ men's [leather jacket] ]
[ [men's leather] jacket ]
2 2 1
2 1 1
2 1 1
2 2 1
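One hedged way to obtain such probabilities is to estimate p(modifier | head) from head-modifier co-occurrence counts and score each bracketing by the product of the attachments it asserts. The counts below are invented purely to reproduce the ordering stated on the slide:

```python
from collections import Counter

# Hypothetical head-modifier counts, as might be mined from a parsed
# corpus or product catalogue; the numbers are illustrative only.
pairs = Counter({
    ("leather", "red"): 30,    # (head, modifier)
    ("leather", "black"): 80,
    ("leather", "men's"): 2,
    ("jacket", "red"): 35,
    ("jacket", "men's"): 50,
    ("jacket", "leather"): 60,
})
head_totals = Counter()
for (head, _), c in pairs.items():
    head_totals[head] += c

def p(mod, head):
    """p(modifier | head), add-one smoothed to avoid zero products."""
    return (pairs[(head, mod)] + 1) / (head_totals[head] + len(pairs))

# Each bracketing hypothesis is scored by the product of the
# head-modifier attachments it asserts.
hypotheses = {
    "[ [red leather] jacket ]":   p("red", "leather") * p("leather", "jacket"),
    "[ red [leather jacket] ]":   p("red", "jacket") * p("leather", "jacket"),
    "[ men's [leather jacket] ]": p("men's", "jacket") * p("leather", "jacket"),
    "[ [men's leather] jacket ]": p("men's", "leather") * p("leather", "jacket"),
}
best = max(hypotheses, key=hypotheses.get)
print(best)  # → [ men's [leather jacket] ]
```

With these counts, p(red | jacket) and p(red | leather) come out comparable while p(men's | jacket) far exceeds p(men's | leather), so the implausible [ [men's leather] jacket ] sinks to the bottom of the hypothesis space.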
84. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
That's a lot of data to model. Where will we get it from?
[ [red leather] jacket ]
[ red [leather jacket] ]
[ men's [leather jacket] ]
[ [men's leather] jacket ]
2 2 1
2 1 1
2 1 1
2 2 1
85. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
That's a lot of data to model. Where will we get it from?
p(Color | ClothingItem) ~ p(Color | Material)
p(Person | ClothingItem) > p(Person | Material)
[ [red leather] jacket ]
[ red [leather jacket] ]
[ men's [leather jacket] ]
[ [men's leather] jacket ]
2 2 1
2 1 1
2 1 1
2 2 1
86. So, let's be real...
hey i need to buy a men's red leather
jacket with straps for my friend
If our taxonomy is well built, semantic associations
acquired for a significant number of members of each
class will propagate to new, unseen members of that class.
With a sufficiently rich NER pipeline and a large enough semantic
relation database, we can rely on simple inference
for the vast majority of disambiguations.
More importantly, we can weight candidate hypotheses
dynamically in order to make semantic parsing tractable
over many possible syntactic trees.
87. So, let's be real...
Welcome to Macy's blah-blah... !
hi im a guy looking for leda jackets
We've got what you're looking for!
<image product_id=1 item_id=1>
that's a nice one but, I want to see
different jackets .... how can i do that
lexical variants:
synonyms, paraphrases formal variants
(typos, ASR errors)
multiword
chunking/parsing
morphological
variants
entity recognition
entity linking
hierarchical structure
88. So, let's be real...
Welcome to Macy's blah-blah... !
hi im a guy looking for leda jackets
We've got what you're looking for!
<image product_id=1 item_id=1>
that's a nice one but, i want to see
different jackets .... how can i do that
89. So, let's be real...
Welcome to Macy's blah-blah... !
hi im a guy looking for leda jackets
We've got what you're looking for!
<image product_id=1 item_id=1>
that's a nice one but, i want to see
different jackets .... how can i do that
LastResult is a nice jacket but, i want
to see different jackets .... how can i
LastCommand
90. So, let's be real...
that → LastResult
one → jacket
do that → LastCommand
that's a nice one but, i want to see
different jackets .... how can i do that
LastResult is a nice jacket but, i want
to see different jackets .... how can i
LastCommand
91. So, let's be real...
that → LastResult
IF:
p_PoS(that_DT | * be_VBZ, START *) > p_PoS(that_IN | * be_VBZ, START *)
    that_DT 's_VBZ great_JJ !_PUNCT
    that_DT 's_VBZ the_DT last_JJ...
    he said that_IN i_PRP...
    he said that_IN trump_NNP...
THEN IF:
p_isCoreferent(LastValueReturned | that_DT) > u  # may also need to condition
                                                 # on the dependency: (x, nice)
that's a nice one but, i want to see
different jackets .... how can i do that
LastResult is a nice jacket but, i want
to see different jackets .... how can i
LastCommand
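The two-step rule above can be sketched with hypothetical probability tables standing in for a real PoS tagger and coreference scorer (POS_PROBS, resolve_that, and all numbers are illustrative):

```python
POS_PROBS = {
    # p(tag | left context): right after a sentence start, "that 's" is
    # usually a demonstrative pronoun (DT), not a complementizer (IN).
    ("that", "START"): {"DT": 0.85, "IN": 0.15},
    ("that", "said"): {"DT": 0.10, "IN": 0.90},
}

def resolve_that(prev_token, coref_score, threshold=0.5,
                 last_result="LastResult"):
    """Return the antecedent for 'that' only if it is tagged DT and the
    coreference score with the last returned value beats threshold u."""
    tags = POS_PROBS[("that", prev_token)]
    if tags["DT"] > tags["IN"] and coref_score > threshold:
        return last_result
    return None
```

The same gating pattern (first a tag test, then a thresholded coreference score) generalizes to other anaphors such as "one" and "do that".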
92. So, let's be real...
one → jacket
Find the LastResult in the context that maximizes:
p_isCoreferent(
    LastResult.type=x |
    one_CD,                           # trigger for the function, trivial condition
    QualitativeEvaluation(User, x),
    ContextDistance(LastResult.node)
)
that's a nice one but, i want to see
different jackets .... how can i do that
LastResult is a nice jacket but, i want
to see different jackets .... how can i
LastCommand
93. So, let's be real...
one → jacket
Particularly important
for decision-tree re-entry
that's a nice one but, i want to see
different jackets .... how can i do that
LastResult is a nice jacket but, i want
to see different jackets .... how can i
LastCommand
94. So, let's be real...
no! forget the last thing i said
go back!
third option in the menu
i want to modify the address I gave you
95. So, let's be real...
QualitativeEvaluation(User_u, ClothingItem_i)
BrowseCatalogue(User_u, filter=[ClothingItem_n, j ∈ 1..|J|])
HowTo(User_u, Command_c), c ∈ 1..|Context|
} A single utterance
that's a nice one but, i want to see
different jackets .... how can i do that
LastResult is a nice jacket but, i want
to see different jackets .... how can i
LastCommand
97. So, let's be real...
I = [ i1, i2, i3 ]                     # candidate interpretations
L = [ |i1|, |i2|, |i3| ]               # their lengths
P = [ p(i1), p(i2), p(i3) ]            # their prior probabilities
}  random.choice(I)                    # pick one at random
   x if x ∈ I ∧ length(x) = argmax(L)  # pick the longest
   x if x ∈ I ∧ p(x) = argmax(P)       # pick the most probable
98. So, let's be real...
I = [ i1, i2, i3 ]
L = [ |i1|, |i2|, |i3| ]
P = [ p(i1 | q), p(i2 | q), p(i3 | q) ]    # now conditioned on the query q
}  random.choice(I)
   x if x ∈ I ∧ length(x) = argmax(L)
   x if x ∈ I ∧ p(x | q) = argmax(P)
However, this approach will still suffer from
segmentation issues due to passing many
more arguments than expected for resolving
any specific intent.
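The progression of selection strategies above (random, longest, most probable given the query) can be sketched in a few lines; the candidate names and probabilities are hypothetical:

```python
import random

def pick_random(candidates):
    # Baseline: ignore everything and choose at random.
    return random.choice(candidates)

def pick_longest(candidates):
    # Heuristic: prefer the longest (most specific) interpretation.
    return max(candidates, key=len)

def pick_most_probable(candidates, probs):
    # Better: choose the interpretation with the highest p(i | q),
    # i.e. conditioned on the query rather than on priors alone.
    return max(zip(candidates, probs), key=lambda pair: pair[1])[0]

interpretations = ["BrowseCatalogue", "HowToBrowseCatalogue",
                   "QualitativeEvaluation"]
chosen = pick_most_probable(interpretations, [0.2, 0.7, 0.1])
```

Even the conditional version still faces the segmentation problem noted above: scoring whole interpretations does not by itself tell us where one intent ends and the next begins.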
99. So, let's be real...
GetInfo(User_u, argmax_c(Command_c)), c ∈ 1..|Context|
SetPreference(User_u, SetRating(ClothingItem_i, k, 1))
∀(ClothingItem_n,m, n ∈ 1..|N|, m ∈ 1..|M| | n = i, k = *)
} A standard API will normally provide these definitions
  as the declaration of its methods.
100. So, let's be real...
GetInfo(User_u, argmax_c(Command_c)), c ∈ 1..|Context|
SetPreference(User_u, SetRating(ClothingItem_i, k, 1))
∀(ClothingItem_n,m, n ∈ 1..|N|, m ∈ 1..|M| | n = i, k = *)
} We can map the argument signatures to the nodes of the
  taxonomy we are using for linguistic inference.
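One way to sketch this mapping, assuming a toy taxonomy and a toy API (none of the names below come from the talk): each parsed entity type is checked against a method's declared argument types by walking up the taxonomy.

```python
# Toy taxonomy: child -> parent (None marks the root).
TAXONOMY = {"LeatherJacket": "ClothingItem", "ClothingItem": "Entity",
            "User": "Entity", "Entity": None}

# Toy API declarations: method name -> declared argument types.
API = {
    "SetPreference": ("User", "ClothingItem"),
    "GetInfo": ("User", "Command"),
}

def is_a(node, ancestor):
    """True if node equals ancestor or inherits from it in the taxonomy."""
    while node is not None:
        if node == ancestor:
            return True
        node = TAXONOMY.get(node)
    return False

def candidate_methods(arg_types):
    """Return API methods whose declared signature the parsed entity
    types can fill, walking up the taxonomy for each argument."""
    out = []
    for name, sig in API.items():
        if len(sig) == len(arg_types) and all(
                is_a(a, s) for a, s in zip(arg_types, sig)):
            out.append(name)
    return out
```

Because the check walks the taxonomy rather than matching names literally, a newly added API method declared over existing node types is linkable without touching the linguistic engine.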
101. So, let's be real...
GetInfo(User_u, argmax_c(Command_c)), c ∈ 1..|Context|
SetPreference(User_u, SetRating(ClothingItem_i, k, 1))
∀(ClothingItem_n,m, n ∈ 1..|N|, m ∈ 1..|M| | n = i, k = *)
} The system will then be able to link API calls to entity
  parses dynamically.
102. So, let's be real...
GetInfo(User_u, argmax_c(Command_c)), c ∈ 1..|Context|
SetPreference(User_u, SetRating(ClothingItem_i, k, 1))
∀(ClothingItem_n,m, n ∈ 1..|N|, m ∈ 1..|M| | n = i, k = *)
} Over time, new methods added to the API will be automatically
  supported by the pre-existing linguistic engine.
103. So, let's be real...
😁
104. So, let's be real...
If the system can reasonably solve nested semantic
dependencies hierarchically,
it can also be extended to handle multi-intent utterances in a
natural way.
105. So, let's be real...
We need:
• semantic association measures between the leaves of the tree in
order to attach them to the right nodes and be able to stop growing
a tree at the right level,
• a syntactic tree to build the dependencies, either using a parser or a
PCFG/LFG grammar (both may take input from the previous step,
which will provide the notion of verbal valence: Person Browse
Product),
• for tied analyses, an interface with the user to request clarification.
106. So, let's be real...
Convert to abstract entity types
Activate applicable
dependencies/grammar rules
(fewer and less ambiguous
thanks to the previous step)
i need to pay my bill and i cannot log into my account
i need to PAYMENT my INVOICE and i ISSUE ACCESS ACCOUNT
Resolve attachments and rank
candidates by semantic association,
keep top-n hypotheses
i need to [ PAYMENT my INVOICE ] and i [ ISSUE [ AccessAccount() ] ]
i need to [ Payment(x) ] and i [ Issue(y) ]
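The first step above (converting surface terms to abstract entity types) might be sketched as a longest-match substitution over a toy lexicon; a real system would use the NER + entity-linking pipeline instead, and the LEXICON entries below are illustrative:

```python
# Toy term -> abstract-entity-type lexicon.
LEXICON = {
    "pay": "PAYMENT",
    "bill": "INVOICE",
    "cannot log into": "ISSUE ACCESS",
    "my account": "ACCOUNT",
}

def abstract(utterance):
    """Replace recognized surface terms with abstract entity types,
    longest entries first so multiword terms win over their parts."""
    for term in sorted(LEXICON, key=len, reverse=True):
        utterance = utterance.replace(term, LEXICON[term])
    return utterance
```

Running `abstract` on the example utterance reproduces the abstracted form shown above, which is what lets the grammar activate far fewer, less ambiguous rules in the next step.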
107. So, let's be real...
AND usually joins
ontologically coordinate elements,
e.g. books and magazines versus
*humans and Donald Trump
i need to [ Payment(x) ] and i [ Issue(y) ]
108. So, let's be real...
Even if
p0 = p( Payment | Issue ) > u,
we will still split the intents because here we have
py = p( Payment | Issue, y ),
and normally py < p0 for any irrelevant case (i.e., the
posterior probability of the candidate attachment will not beat
the baseline of its own prior probability)
i need to [ Payment(x) ] and i [ Issue(y) ]
109. So, let's be real...
I.e., Payment may have been attached to Issue if no other element
had been attached to it before.
Resolution order, as given by attachment probabilities, matters.
i need to [ Payment(x) ] and i [ Issue(y) ]
110. So, let's be real...
i need to [ Payment(x) ] i [ Issue(y) ]
i need to [ Payment(x) ] and i [ Issue(y) ]
Multi-intents solved 😎
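The splitting criterion above can be sketched with toy prior/posterior values (the numbers and the attach_or_split helper are illustrative, not measured):

```python
PRIOR = {
    ("Payment", "Issue"): 0.30,    # p0 = p(Payment | Issue)
    ("Invoice", "Payment"): 0.20,  # p0 = p(Invoice | Payment)
}
POSTERIOR = {
    # py = p(Payment | Issue, y): Issue already has an argument attached,
    # so the posterior falls below the prior and the intents split.
    ("Payment", "Issue", "y"): 0.05,
    # p(Invoice | Payment, x): a natural attachment that beats its prior.
    ("Invoice", "Payment", "x"): 0.60,
}

def attach_or_split(pred_a, pred_b, existing_arg):
    """Attach pred_a under pred_b only if the posterior attachment
    probability beats pred_b's own prior for taking such an argument."""
    p0 = PRIOR[(pred_a, pred_b)]
    py = POSTERIOR[(pred_a, pred_b, existing_arg)]
    if py >= p0:
        return [f"{pred_b}({pred_a})"]         # one nested intent
    return [f"{pred_a}(x)", f"{pred_b}(y)"]    # two separate intents
```

Since attachments already made change the posteriors of later candidates, resolution order matters, exactly as noted above.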
111. Summary of desiderata
1. Normalization
1. Ability to handle input indeterminacy by activating several
candidate hypotheses
2. Ability to handle synonyms
3. Ability to handle paraphrases
2. NER
1. Transversal (cross-intent)
2. No nesting under classification (feedback loop instead)
3. Contextually unbounded
4. Entity-linked
3. Semantic parsing
1. Taxonomy-driven supersense inference
2. Semantically-informed syntactic trees
3. Ability to handle decision-tree re-entries
112. Summary of desiderata
Normalization ← word embeddings, domain thesaurus, fuzzy matching
NER ← inverted indexes, character-based neural networks for NER
113. Summary of desiderata
Normalization ← word embeddings, domain thesaurus, fuzzy matching
NER ← inverted indexes, character-based neural networks for NER
implicitly solved by Supersenses + Neural enquirer
implicitly solved by Neural enquirer (layered/modular)
115. 3.1 Supersense inference
GFT (Google Fine-Grained) taxonomy
Yogatama, Dani, Daniel Gillick, and Nevena Lazic. "Embedding Methods
for Fine Grained Entity Type Classification." ACL (2). 2015.
116. 3.1 Supersense inference
FIGER taxonomy
Yogatama, Dani, Daniel Gillick, and Nevena Lazic. "Embedding Methods
for Fine Grained Entity Type Classification." ACL (2). 2015.
117. 3.1 Supersense inference
FB = Freebase
T = OurTaxonomy¹
owem = OurWordEmbeddingsModel
for every document d in D²:
    for every sent s in d:
        sentence_vector = None
        for every (position, term, tag) tuple in TermExtractor³(s):
            if not sentence_vector:
                sentence_vector = vectorizeSentence(s, position)
            owem[ T[tag] ].update( sentence_vector )
¹ FB labels are manually mapped to fine-grained labels.
² D = 133,000 news documents.
³ TermExtractor = parser + chunker + entity resolver that assigns Freebase types to entities.
Yogatama, Dani, Daniel Gillick, and Nevena Lazic. "Embedding Methods
for Fine Grained Entity Type Classification." ACL (2). 2015.
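A runnable toy version of the loop above, with stand-ins for the term extractor, the embeddings, and the taxonomy (centroids here are plain bag-of-words counts rather than real sentence vectors, and the tag lexicon is invented for the example):

```python
# Toy fine-grained-tag -> supersense taxonomy.
TAXONOMY = {"/person/athlete": "Person", "/location/city": "Location"}

def term_extractor(sentence):
    # Stand-in for parser + chunker + entity resolver with Freebase types.
    tags = {"messi": "/person/athlete", "paris": "/location/city"}
    return [(i, tok, tags[tok]) for i, tok in enumerate(sentence)
            if tok in tags]

def vectorize_sentence(sentence, position):
    # Stand-in embedding: bag-of-words counts of the context tokens.
    vec = {}
    for i, tok in enumerate(sentence):
        if i != position:
            vec[tok] = vec.get(tok, 0) + 1
    return vec

def build_supersense_model(documents):
    """Aggregate context vectors per taxonomy node into a running
    centroid, mirroring owem[T[tag]].update(sentence_vector)."""
    model = {}
    for d in documents:
        for s in d:
            for position, term, tag in term_extractor(s):
                centroid = model.setdefault(TAXONOMY[tag], {})
                for k, v in vectorize_sentence(s, position).items():
                    centroid[k] = centroid.get(k, 0) + v
    return model

docs = [[["messi", "plays", "football"]], [["paris", "hosts", "games"]]]
model = build_supersense_model(docs)
```

New, unseen members of a class then inherit the class centroid, which is the propagation property the taxonomy section relied on.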
119. 3.1 Supersense inference
Yogatama, Dani, Daniel Gillick, and Nevena Lazic. "Embedding Methods
for Fine Grained Entity Type Classification." ACL (2). 2015.
120. 3.2 Semantic trees
Yin, Pengcheng, et al. "Neural enquirer: Learning to query tables with
natural language." arXiv preprint arXiv:1512.00965 (2015).
q = "Which city hosted the longest game before the game in Beijing?"
KB = knowledge base consisting of tables that store facts
encoder = bi-directional Recurrent Neural Network
v = encoder(q) # vector of the encoded query
kb_encoder = KB table encoder
121. 3.2 Semantic trees
f_n = embedding of the name of the n-th column
w_mn = embedding of the value in the n-th column of the m-th row
[x ; y] = vector concatenation
W = weights (what we will e.g. SGD out of the data)
b = bias
122. 3.2 Semantic trees
normalizeBetween0and1(
penaltiesOrRewards * # the weights that minimize the loss for this column
[ {sydney... australia... kangaroo...} + {city... town... place...} ] +
somePriorExpectations
)
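The normalizeBetween0and1 step above behaves like a softmax over query-column compatibility scores; a minimal sketch with hypothetical scores (real ones would come from the learned weights W and bias b):

```python
import math

def normalize_between_0_and_1(scores):
    """Softmax: penalties/rewards plus priors, squashed so the column
    weights are non-negative and sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical compatibility of the query "which city ..." with the
# columns City, Year, Duration.
weights = normalize_between_0_and_1([2.0, 0.1, 0.1])
```

Because the output is already a probability distribution over columns, downstream layers can consume it directly, which is the point made on the next slide about false positives.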
123. 3.2 Semantic trees
"olympic games near kangaroos" 👍
124. 3.2 Semantic trees
"olympic games near kangaroos" 👍
Risk of false positives? Already as probabilities!!
125. 3.2 Semantic trees
Olympic Games
City Year Duration
Barcelona 1992 15
Atlanta 1996 18
Sydney 2000 17
Athens 2004 14
Beijing 2008 16
London 2012 16
Rio de Janeiro 2016 18
126. 3.2 Semantic trees
Which city hosted the 2012 Olympic Games?
Olympic Games
City Year Duration
Barcelona 1992 15
Atlanta 1996 18
Sydney 2000 17
Athens 2004 14
Beijing 2008 16
London 2012 16
Rio de Janeiro 2016 18
127. 3.2 Semantic trees
Q(i) = Which city hosted the 2012 Olympic Games?
T(i) = (the Olympic Games table above)
y(i) = (the answer)
RNN-encoded: p(vector("column:city") | [ vector("which"), vector("city")... ]) >
             p(vector("column:city") | [ vector("city"), vector("which")... ])
128. 3.2 Semantic trees
Q(i) = Which city hosted the 2012 Olympic Games?
Executor layer L:     p(vector("column:city") | [ vec("which"), vec("city")... ]) >
                      p(vector("column:city") | [ vec("city"), vec("which")... ])
Executor layer L - 1: p(vector("column:year") | [ ... vec("2012"), vec("olympic") ... ]) >
                      p(vector("column:year") | [ ... vec("olympic"), vec("2012") ... ])
Executor layer L - 2: p(vector("FROM") | [ ... vec("olympic"), vec("games") ... ]) >
                      p(vector("FROM") | [ ... vec("games"), vec("olympic") ... ])
As many executor layers as SQL operations.
129. 3.2 Semantic trees
Each executor applies the same set of weights to all rows (since it
models a single operation).
131. 3.2 Semantic trees
The logic of max, min functions is built-in as column-wise operations.
The system learns over which ranges they apply (before, after are just
temporal versions of argmax and argmin).
132. 3.2 Semantic trees
Evaluate a candidate: the candidate vector at layer L for row m of our
database, plus a memory layer with the last output vector generated by
executor L - 1.
133. 3.2 Semantic trees
Evaluate a candidate: for row m of our table R, over the set of columns,
cumulatively weight the vector by the output weights from executor L - 1
(as an attention mechanism).
134. 3.2 Semantic trees
This allows the system to restrict the reference at each step.
135. 3.2 Semantic trees
As we weight the vector in the direction of each successive operation,
it gets closer to any applicable answer in our DB.
136. 3.2 Semantic trees
SbS is N2N with bias.
SEMPRE is a toolkit for training semantic parsers, which map natural
language utterances to denotations (answers) via intermediate logical
forms. Here's an example for querying databases.
https://nlp.stanford.edu/software/sempre/
137. 3.2 Semantic trees
😳
138. 3.3 Re-entry
Task: given a set of facts or a story, answer questions on those
facts in a logically consistent way.
Problem: Theoretically, it could be achieved by a language
modeler such as a recurrent neural network (RNN).
However, their memory (encoded by hidden states and weights)
is typically too small, and is not compartmentalized enough to
accurately remember facts from the past (knowledge is
compressed into dense vectors).
RNNs are known to have difficulty in memorization tasks, i.e.,
outputting the same input sequence they have just read.
Weston, Jason, Sumit Chopra, and Antoine Bordes. "Memory
networks." arXiv preprint arXiv:1410.3916 (2014).
139. Task: given a set of facts or a story, answer questions on those
facts in a logically consistent way.
Weston, Jason, Sumit Chopra, and Antoine Bordes. "Memory
networks." arXiv preprint arXiv:1410.3916 (2014).
Jason went to the kitchen.
Jason picked up the milk.
Jason travelled to the office.
Jason left the milk there.
Jason went to the bathroom.
jason go kitchen
jason get milk
jason go office
jason drop milk
jason go bathroom
Where is the milk? → milk?
3.3 Re-entry
140. Memory networks
At the reading stage...
F = set of facts (e.g. recent context of conversation; sentences)
M = memory
for fact f in F:   # any linguistic pre-processing may apply
    store f in the next available memory slot of M
3.3 Re-entry
141. Memory networks
At the inference stage...
F = set of facts (e.g. recent context of conversation; sentences)
M = memory
O = inference module (oracle)
x = input
q = query about our known facts
k = number of candidate facts to be retrieved
m = a supporting fact (typically, the one that maximizes support)
3.3 Re-entry
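A toy, non-neural sketch of the two stages: facts go into memory slots, and inference repeatedly selects the fact scoring highest against the query plus the facts found so far. Word overlap with a recency tie-break stands in for the learned scoring function, so unlike the trained model it settles for "jason get milk" rather than "jason go office" as the second supporting fact:

```python
def read(facts):
    # Reading stage: each fact occupies one memory slot.
    return list(facts)

def infer(memory, query_words, k=2):
    """Inference stage: greedily retrieve k supporting facts."""
    support, context = [], set(query_words)
    for _ in range(k):
        def score(item):
            idx, fact = item
            words = set(fact.split())
            # overlap with what we know, then novelty, then recency
            return (len(words & context), len(words - context), idx)
        idx, best = max(((i, f) for i, f in enumerate(memory)
                         if f not in support), key=score)
        support.append(best)
        context |= set(best.split())
    return support

memory = read(["jason go kitchen", "jason get milk", "jason go office",
               "jason drop milk", "jason go bathroom"])
supporting = infer(memory, ["milk"])
```

The real model replaces this hand-written score with a learned matching function, which is exactly what the training slides below describe.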
142. Memory networks
At the inference stage...
jason go kitchen
jason get milk
jason go office
jason drop milk
jason go bathroom
3.3 Re-entry
143. Memory networks
At the inference stage, t1...
q: milk
Candidate space for k = 2:
    jason go kitchen
    jason get milk
    jason go office
    jason drop milk
    jason go bathroom
argmax over that space → milk, [ ]
3.3 Re-entry
144. Memory networks
At the inference stage, t2...
q: milk
Candidate space for k = 2:
    jason go kitchen
    jason get milk
    jason go office
    jason drop milk
    jason go bathroom
argmax over that space → milk, jason, drop
3.3 Re-entry
145. Memory networks
At the inference stage, t2...
    jason go kitchen
    jason get milk
    jason go office
    jason drop milk
    jason go bathroom
Why not "jason get milk"? Because it does not contribute as much new
information as "jason go office".
3.3 Re-entry
146. Memory networks
Training is performed with a margin ranking loss and stochastic
gradient descent (SGD).
Simple model.
Easy to integrate with any pipeline already working with structured
facts (e.g. Neural Enquirer); it adds a temporal dimension to it.
Answering the question about the location of the milk requires
comprehension of the actions picked up and left.
3.3 Re-entry
147. Memory networks
                                                         QA CNN  CBT-NE  CBT-CN  Date
Memory networks (Weston, Jason, Sumit Chopra, and
Antoine Bordes, arXiv:1410.3916, 2014)                    69.4    66.6    63.0
Kadlec, Rudolf, et al. "Text understanding with the
attention sum reader network." arXiv:1603.01547 (2016)    75.4    71.0    68.9   4 Mar 16
Sordoni, Alessandro, et al. "Iterative alternating
neural attention for machine reading." arXiv:1606.02245
(2016)                                                    76.1    72.0    71.0   7 Jun 16
Trischler, Adam, et al. "Natural language comprehension
with the epireader." arXiv:1606.02270 (2016)              74.0    71.8    70.6   7 Jun 16
Dhingra, Bhuwan, et al. "Gated-attention readers for
text comprehension." arXiv:1606.01549 (2016)              77.4    71.9    69.0   5 Jun 16
3.3 Re-entry
148. Results on MovieQA
Response accuracy (%) by knowledge source:
                              KB     IE     Wikipedia
Memory Networks              78.5   63.4    69.9
Key-Value Memory Networks    93.9   68.3    76.2
Reference points: No Knowledge (embeddings) 54.4%; Standard QA System on KB 93.5%.
Source: Antoine Bordes, Facebook AI Research, LXMLS Lisbon July 28, 2016
150. Summary of desiderata
3. Semantic parsing
1. Taxonomy-driven supersense inference
2. Semantically-informed syntactic trees
3. Ability to handle decision-tree re-entries
Extend the neuralized architecture's domain by integrating support
for supersenses. Enable inference at training and test time.
Enable semantic parsing using a neuralized architecture with respect
to some knowledge base serving as ground truth.
Use memory networks to effectively keep track of the entities
throughout the conversation and ensure logical consistency.
Extend memory networks by integrating support for supersenses.
Enable inference at training and test time.
151. Summary of desiderata
Conversational technology should really be about
• dynamically finding the best possible way to browse a large
repository of information/actions,
• finding the shortest path to any relevant action or piece of
information (to avoid the plane-dashboard effect), and
• surfacing implicit data in unstructured content ("bocadillo de
calamares in Madrid"): rather than going open-domain, taking the
closed domain and going deep into it.
Remember?