
Machine Reading Using Neural Machines (talk at Microsoft Research Faculty Summit 2017)

More information about the event: https://www.microsoft.com/en-us/research/event/faculty-summit-2017/

Published in: Education

  1. 1. Isabelle Augenstein, UCL / UCPH Machine Reading Using Neural Machines
  2. 2. Goal: Fact Checking. Query: “Unemployment in the US is 42%” -> Machine Reader -> unemployment(US, 42%). Query: “What is the stance of HRC on immigration?” -> stance(HRC, immigration, X). SemEval 2016 Stance Detection, EMNLP 2016, SemEval 2017 RumourEval, Fake News Challenge 2017
  3. 3. Goal: Understanding Scientific Publications. Query: “Method A outperforms method B for task C” -> Machine Reader -> outperforms(A, B, C). Query: “What models exist for question answering?” -> method_for_task(QA). SemEval 2017 ScienceIE (organiser), ACL 2017, CoNLL 2017
  4. 4. Machine Reader. Questions (queries): “What models exist for question answering?” -> method_for_task(QA); “What is the stance of HRC on immigration?” -> stance(HRC, immigration, X)
  5. 5. Machine Reader. Questions (queries): “What models exist for question answering?” -> method_for_task(QA); “What is the stance of HRC on immigration?” -> stance(HRC, immigration, X). Evidence (retrieval): “We introduce an RNN-based method for QA” (RNN, method-for, QA); “Immigrants welcome!”
  6. 6. Machine Reader. Questions (queries): “What models exist for question answering?” -> method_for_task(QA); “What is the stance of HRC on immigration?” -> stance(HRC, immigration, X). Evidence (retrieval): “We introduce an RNN-based method for QA” (RNN, method-for, QA); “Immigrants welcome!”. Representations (updates, answers): method_for_task(QA)
  7. 7. Machine Reader. Questions (queries): “What models exist for question answering?” -> method_for_task(QA); “What is the stance of HRC on immigration?” -> stance(HRC, immigration, X). Evidence (retrieval): “We introduce an RNN-based method for QA” (RNN, method-for, QA); “Immigrants welcome!”. Representations (updates, answers): method_for_task(QA). Answers: “Methods based on RNNs are widely used” -> RNNs; “HRC is in favour of immigration” -> favour
  8. 8. Task: Stance Detection. Document: “… Led Zeppelin’s Robert Plant turned down £500 MILLION to reform supergroup. …” Headline/Target: “Robert Plant Ripped up $800M Led Zeppelin Reunion Contract” -> Confirming. Task: How do sequences relate to one another? (e.g. for, against, neutral) Problems: • Interpretation depends on headline/target • Target not always seen in training set • However, overlap between target + document. Fake News Challenge 2017
  9. 9. Task: Stance Detection. Tweet: “A foetus has rights too!” Target: Legalization of Abortion, Atheism, Pro Life, … Task: How do sequences relate to one another? (e.g. for, against, neutral) Problems: • Interpretation depends on target • Target not always mentioned in tweet • No training data for test target -> target-independent / unseen headline approach needed. SemEval 2016 Stance Detection, EMNLP 2016
  10. 10. Task: Scientific Paper Summarisation. The Task: Select the sentences from within a paper which best summarise that paper. Treated as a binary classification task: each sentence is classified as either summary or not. Challenges: • Extractive summarisation -> Binary classification task: for each sentence, is it a summary statement or not? • Fine neural encoding of current sentence • Simple, coarse paragraph-level and global (e.g. location) features. CoNLL 2017
  11. 11. Science Paper Summarisation Dataset. Paper title: “Statistical estimation of the names of HTTPS servers”. Author-Written Highlights (= Summary Statements!): - We present the domain name graph (DNG), which is a formal expression that can keep track of cname chains and (…) - … Summary statements highlighted in main text: “In this work, we present a novel methodology that aims to (…) The key contributions of this work are as follows. We present the domain name graph (DNG), which is a formal expression that can keep track of cname chains (challenge 1) and (…)” CoNLL 2017
  12. 12. Research Challenges: - Small datasets (weak supervision, multi-task learning, semi-supervision) - Computational cost of neural machine reading (makes small benchmark datasets more attractive for research) - Few datasets with large or multiple documents - Proliferation of datasets, some toy tasks or not well designed (e.g. small vocabulary, easy to “game”)
  13. 13. Thank you! Questions? Papers at ACL 2017: ACL 2017, CoNLL 2017, SemEval 2017 ScienceIE (organiser), SemEval 2017 RumourEval. Contact: augenstein.github.io, augenstein@di.ku.dk, @iaugenstein, github.com/isabelleaugenstein
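
The extractive summarisation setup on slides 10 and 11 (each sentence gets a binary summary / not-summary label, with author-written highlights acting as the summary statements and sentence location as a coarse feature) could be sketched roughly as below. This is only an illustrative sketch, not the method from the talk: the function names `tokens` and `label_sentences`, the token-overlap matching rule, the 0.8 threshold and the toy sentences are all assumptions made for the example.

```python
import re


def tokens(text):
    """Lowercase alphanumeric tokens; a deliberately crude tokenizer (assumption)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def label_sentences(sentences, highlights, threshold=0.8):
    """Return (sentence, label, relative_position) triples.

    label is 1 when at least `threshold` of the tokens of some highlight
    also occur in the sentence, else 0. The matching rule and the
    threshold value are hypothetical choices for this sketch.
    relative_position is a coarse location feature in [0, 1].
    """
    highlight_tokens = [tokens(h) for h in highlights]
    n = len(sentences)
    rows = []
    for i, sentence in enumerate(sentences):
        sentence_tokens = tokens(sentence)
        label = 0
        for h in highlight_tokens:
            if h and len(h & sentence_tokens) / len(h) >= threshold:
                label = 1
                break
        position = i / (n - 1) if n > 1 else 0.0
        rows.append((sentence, label, position))
    return rows


# Toy data invented for illustration; not taken from the dataset.
paper = [
    "In this work we study encrypted traffic.",
    "We propose a graph model of server names, which we evaluate later.",
    "Related work is discussed in the next section.",
]
highlights = ["We propose a graph model of server names"]

rows = label_sentences(paper, highlights)
labels = [label for _, label, _ in rows]
positions = [position for _, _, position in rows]
```

The resulting `(label, position)` pairs are the kind of training signal a sentence classifier could consume; a neural encoding of the sentence itself, as mentioned on slide 10, would replace or complement the hand-written location feature.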
