
Byron Galbraith, Chief Data Scientist, Talla, at MLconf SEA 2017

Byron Galbraith is the Chief Data Scientist and co-founder of Talla, where he works to translate the latest advancements in machine learning and natural language processing to build AI-powered conversational agents. Byron has a PhD in Cognitive and Neural Systems from Boston University and an MS in Bioinformatics from Marquette University. His research expertise includes brain-computer interfaces, neuromorphic robotics, spiking neural networks, high-performance computing, and natural language processing. Byron has also held several software engineering roles including back-end system engineer, full stack web developer, office automation consultant, and game engine developer at companies ranging in size from a two-person startup to a multi-national enterprise.

Abstract summary

Neural Information Retrieval and Conversational Question Answering:
One of the main affordances of conversational UIs is the ability to use natural language to succinctly convey to a bot what you want. An area where this interface excels is question answering (Q&A). Research into Q&A systems often falls at the intersection of natural language processing (NLP) and information retrieval (IR), and while NLP has been getting a lot of attention from deep learning for several years now, it is largely only within the last year or so that the field of IR has seen an equivalent explosion of interest in employing these techniques. In this presentation, I will touch on challenges facing conversational bots, provide a high-level overview of the emerging field of Neural Information Retrieval, discuss how these methods can be used in a Q&A context, and then highlight some lessons learned attempting to design and deploy a conversational Q&A agent-based product.

Published in: Technology

  1. Byron Galbraith, PhD, Co-founder / Chief Data Scientist, Talla. MLConf Seattle, 2017.05.19. Neural Information Retrieval & Conversational Question Answering
  2. Intelligent Conversational Service Desk: Conversational Knowledge Base, Conversational Ticketing System, Intelligent Workflows, Human in the Loop. Talla gets smarter, faster. Stay in control.
  3. Whither Chatbots?
  4. Context and ambiguity are significant challenges: "Time flies like an arrow; fruit flies like a banana."
  5. AI to the Rescue? https://xkcd.com/1831/
  6. With apologies to George Box: "All chatbots are dumb, but some are useful."
  7. Question Answering is the most compelling use case for chatbots. [Venn diagram: Q&A at the intersection of NLP and IR.]
  8. Neural Information Retrieval. [Figure 1 from Mitra and Craswell (2017): the percentage of neural IR papers at the ACM SIGIR conference, found by manual inspection of paper titles, shows a clear trend of growing popularity: 1% in 2014, 4% in 2015, 8% in 2016, and 21% in 2017.]
  9. (Neural) Information Retrieval System. [Diagram: the query and the documents each pass through a Generate Representation step, producing q and D; an Estimate Relevance step then scores the two representations against each other.]
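The retrieval pipeline on this slide can be sketched end to end. The bag-of-words representation and cosine relevance below are deliberately simple stand-ins (the neural variants in the following slides replace exactly these two functions), and the example documents are hypothetical:

```python
import math
from collections import Counter

def represent(text):
    # "Generate Representation" step; here, a toy bag-of-words term count.
    return Counter(text.lower().split())

def relevance(q_rep, d_rep):
    # "Estimate Relevance" step; here, cosine similarity of term-count vectors.
    dot = sum(q_rep[t] * d_rep[t] for t in set(q_rep) & set(d_rep))
    norm = (math.sqrt(sum(v * v for v in q_rep.values()))
            * math.sqrt(sum(v * v for v in d_rep.values())))
    return dot / norm if norm else 0.0

def rank(query, docs):
    # Score every document against the query and sort best-first.
    q = represent(query)
    return sorted(((relevance(q, represent(d)), d) for d in docs), reverse=True)

docs = ["how do I reset my password",
        "office holiday schedule",
        "password reset instructions for new employees"]
for score, doc in rank("reset password", docs):
    print(f"{score:.2f}  {doc}")
```

A neural IR system keeps this overall shape and swaps in learned representation and relevance functions.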
  10. Word Embeddings. [The same pipeline diagram, with word embeddings providing the Generate Representation step.]
  11. Word Embeddings. [Figure 2 from Mitra et al. (2016): the architecture of a word2vec (CBOW) model.]
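As a toy illustration of embedding-based matching, the sketch below averages word vectors for a query and a document and compares them by cosine similarity. The 3-dimensional vectors are made up for the example; real word2vec embeddings are typically 100-300 dimensional and learned from large corpora:

```python
import numpy as np

# Made-up 3-d "embeddings" for illustration only.
emb = {
    "password": np.array([0.9, 0.1, 0.0]),
    "reset":    np.array([0.8, 0.2, 0.1]),
    "holiday":  np.array([0.0, 0.9, 0.3]),
    "schedule": np.array([0.1, 0.8, 0.4]),
}

def embed(text):
    # Represent a text as the average of its in-vocabulary word vectors.
    vecs = [emb[t] for t in text.lower().split() if t in emb]
    return np.mean(vecs, axis=0) if vecs else np.zeros(3)

def cosine(a, b):
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    return float(a @ b / (na * nb)) if na and nb else 0.0

q = embed("reset password")
print(cosine(q, embed("password reset instructions")))  # semantically close
print(cosine(q, embed("holiday schedule")))             # semantically distant
```

Note that averaging makes matching soft: documents sharing no exact terms with the query can still score well if their terms have nearby vectors.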
  12. Learning to Rank. [The same pipeline diagram, with a learned model providing the Estimate Relevance step.]
  13. Learning to Rank. [Architecture figure from Huang et al. (2013).]
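This is not the model from Huang et al. (2013); as a minimal sketch of the learning-to-rank idea, the snippet below trains a linear scorer with a pairwise hinge loss so that relevant query-document feature vectors outscore irrelevant ones. The feature vectors are random stand-ins for real query-document features:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4
w = np.zeros(dim)  # weights of the linear relevance scorer

def score(features):
    return w @ features

# Training triples: (features of a relevant doc, features of an irrelevant doc).
pairs = [(rng.normal(1.0, 0.5, dim), rng.normal(-1.0, 0.5, dim))
         for _ in range(200)]

lr, margin = 0.1, 1.0
for pos, neg in pairs:
    # Pairwise hinge update: penalize when the relevant doc does not
    # outscore the irrelevant doc by at least `margin`.
    if score(pos) - score(neg) < margin:
        w += lr * (pos - neg)

correct = sum(score(p) > score(n) for p, n in pairs)
print(f"{correct}/{len(pairs)} pairs ranked correctly after training")
```

Deep learning-to-rank models keep this pairwise training objective but replace the linear scorer with a neural network over learned text representations.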
  14. End-to-End Models. [The same pipeline diagram, with a single neural model spanning both representation and relevance estimation.]
  15. End-to-End Models. [Architecture figure from Severyn and Moschitti (2015).]
  16. Neural IR Resources: Mitra and Craswell (2017), Neural Models for Information Retrieval, https://arxiv.org/abs/1705.01509; Mitra and Craswell (2017), Neural Text Embeddings for IR, WSDM 2017 tutorial, https://www.slideshare.net/BhaskarMitra3/neural-text-embeddings-for-information-retrieval-wsdm-2017; Zhang et al. (2016), Neural Information Retrieval: A Literature Review, https://arxiv.org/abs/1611.06792; Neu-IR Workshop at SIGIR, http://neu-ir.weebly.com/
  17. Neural IR for Q&A: Conversational Knowledge Base
  18. Conversational Knowledge Base. Goal: automatically and efficiently answer employees' requests. Method: (1) respond with a high-confidence answer from the KB; (2) suggest up to four similar questions from the KB; (3) provide an easy path to service desk representatives and enable the rep to train the system.
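The three-step method on this slide amounts to a confidence cascade. A minimal sketch, in which the threshold values and the source of the relevance scores are assumptions for illustration:

```python
def respond(kb_scores, high=0.8, low=0.3):
    """Hypothetical response policy over (answer, score) pairs from the KB,
    with scores assumed in [0, 1]; thresholds would be tuned in practice."""
    ranked = sorted(kb_scores, key=lambda s: s[1], reverse=True)
    best_answer, best_score = ranked[0]
    if best_score >= high:
        return ("answer", best_answer)            # 1. high-confidence KB answer
    suggestions = [a for a, s in ranked[:4] if s >= low]
    if suggestions:
        return ("suggest", suggestions)           # 2. up to four similar questions
    return ("escalate", "route to a service desk rep")  # 3. path to a human

print(respond([("Use the self-service portal.", 0.92)]))
print(respond([("VPN setup guide", 0.5), ("WiFi guide", 0.4)]))
print(respond([("Holiday schedule", 0.1)]))
```

The escalation branch is also where the human-in-the-loop training signal comes from: the rep's answer can be added back to the KB.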
  19. Problem: Out-of-Vocabulary Terms. Word embeddings are susceptible to out-of-vocabulary terms: terms unseen at training time, or skipped for being too rare, can be highly discriminative. "What does cromulent mean?" "What does bigly mean?"
  20. Problem: Out-of-Vocabulary Terms. Word embeddings are susceptible to out-of-vocabulary terms: terms unseen at training time, or skipped for being too rare, can be highly discriminative. "What does UNK mean?" "What does UNK mean?"
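The substitution shown on this slide takes only a few lines to reproduce. A minimal sketch, assuming a tiny training vocabulary from which the rare terms were dropped:

```python
vocab = {"what", "does", "mean"}  # assumed training vocabulary; rare terms dropped

def tokenize(text, vocab):
    # Out-of-vocabulary terms collapse to a shared UNK token, destroying
    # exactly the term that distinguishes the two questions.
    return [t if t in vocab else "UNK" for t in text.lower().replace("?", "").split()]

q1 = tokenize("What does cromulent mean?", vocab)
q2 = tokenize("What does bigly mean?", vocab)
print(q1)
print(q1 == q2)  # the model can no longer tell the questions apart
```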
  21. Solution: Out-of-Vocabulary Terms. OOV terms can be overcome through ensembling: infer an embedding from local context, and ensemble with term-frequency methods. [Table 4 from Mitra et al. (2016): NDCG evaluations under the non-telescoping settings. Both the DESM and the LSA models perform poorly in the presence of random irrelevant documents in the candidate set; the mixture of DESM (IN-OUT) with BM25 achieves the best NDCG. Statistically significant (p < 0.05) differences from the BM25 baseline are marked with an asterisk (*).]

                                                 Explicitly Judged Test Set   Implicit Feedback Test Set
      Model                                      NDCG@1  NDCG@3  NDCG@10      NDCG@1  NDCG@3  NDCG@10
      BM25                                       21.44   26.09   37.53        11.68   22.14   33.19
      LSA                                        04.61*  04.63*  04.83*       01.97*  03.24*  04.54*
      DESM (IN-IN, trained on body text)         06.69*  06.80*  07.39*       03.39*  05.09*  07.13*
      DESM (IN-IN, trained on queries)           05.56*  05.59*  06.03*       02.62*  04.06*  05.92*
      DESM (IN-OUT, trained on body text)        01.01*  01.16*  01.58*       00.78*  01.12*  02.07*
      DESM (IN-OUT, trained on queries)          00.62*  00.58*  00.81*       00.29*  00.39*  01.36*
      BM25 + DESM (IN-IN, trained on body text)  21.53   26.16   37.48        11.96   22.58*  33.70*
      BM25 + DESM (IN-IN, trained on queries)    21.58   26.20   37.62        11.91   22.47*  33.72*
      BM25 + DESM (IN-OUT, trained on body text) 21.47   26.18   37.55        11.83   22.42*  33.60*
      BM25 + DESM (IN-OUT, trained on queries)   21.54   26.42*  37.86*       12.22*  22.96*  34.11*
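The winning rows in Table 4 are mixtures of BM25 and DESM scores, and the mixture itself is a linear interpolation. A minimal sketch, with hypothetical document IDs and scores assumed already normalized to a comparable scale:

```python
def mixture_score(bm25_score, desm_score, alpha=0.8):
    # Linear mixture of a lexical score (BM25) and an embedding score (DESM).
    # alpha would be tuned on held-out relevance judgments.
    return alpha * bm25_score + (1 - alpha) * desm_score

# Hypothetical candidates: (doc id, bm25, desm), both normalized to [0, 1].
candidates = [("doc_a", 0.9, 0.2), ("doc_b", 0.5, 0.9), ("doc_c", 0.1, 0.1)]
ranked = sorted(candidates, key=lambda c: mixture_score(c[1], c[2]), reverse=True)
print(ranked[0][0])
```

Keeping the term-frequency component in the mixture is what protects the ranker from the failure mode in Table 4, where embedding-only scores are fooled by random irrelevant candidates.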
  22. Problem: Operationalizing Deep Learning. Deep learning methods have both training and operational challenges: a lot of labeled data is required; the UX requires online, one-shot learning; poor interpretability makes models hard to debug; performance gains must be weighed against model complexity; and model persistence is needed on auto-scaling infrastructure.
  23. Solution: Operationalizing Deep Learning. In this case, deep learning is better suited for offline scenarios: use linear models for online/nearline tasks, and reserve deep learning for offline and global tasks, e.g. generating new word embeddings. https://xkcd.com/1838/
  24. Problem: User-Trained Agent. The user controls the question-answer pairs in the knowledge base: end users can update the knowledge base ad hoc, new Q&A pairs should be accessible immediately, and real-time, one-shot learning is expected.
  26. Solution: User-Trained Agent. IR-based methods give us the interpretability and speed needed for a reliable UX: a fully inspectable, editable KB via a web interface; a cascade of fast online and nearline models; and linear models with term-frequency features, which are easier to debug and modify.
  27. Problem: Users Don't Read. Conversational interfaces have their own user behavioral quirks: users skim, assume, and respond.
  28. Solution: Users Don't Read. Give the user every opportunity to succeed: constrain interaction expectations and use hybrid interfaces.
  29. Summary. Productizing conversational Q&A is not just about algorithms. Neural IR is an exciting and fast-growing field. Chatbots can actually be useful.
  30. www.talla.com
