Successfully reported this slideshow.
Your SlideShare is downloading. ×

[223]기계독해 QA: 검색인가, NLP인가?

Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Upcoming SlideShare
From NLP to text mining
From NLP to text mining
Loading in …3
×

Check these out next

1 of 122 Ad

More Related Content

Similar to [223]기계독해 QA: 검색인가, NLP인가? (20)

More from NAVER D2 (20)

Advertisement

Recently uploaded (20)

[223]기계독해 QA: 검색인가, NLP인가?

  1. 1. 기계독해 QA: 검색인가, NLP인가? 이름 : 서민준 소속 : NAVER / Clova ML
  2. 2. QA = Question Answering
  3. 3. 너 큰일난듯. 탑항공 폐업했대! *실제로 일어난 일 허럴? 진짜? 왜 폐업했대? 몰라 내 표 환불가능할까? 도와줘 네이버! 도움이 안되는 친굴세. 전화도 안받어…
  4. 4. CONTENTS 1. 검색으로 “찾는” QA – 10분 2. NLP로 “읽는” QA – 10분 3. 검색과 NLP의 접점 – 20분 4. Q&A – 5분
  5. 5. 1. 검색으로 “찾는” QA
  6. 6. 탑항공 폐업
  7. 7. 탑항공 폐업 • 내용 및 제목의 관련성 • 비슷한 검색을 한 유저가 읽은 문서 • 웹사이트의 신뢰도 • 문서의 인기도 • 검색자의 정보 • … 종합적으로 고려해요!
  8. 8. 탑항공 폐업 • 내용 및 제목의 관련성 • 비슷한 검색을 한 유저가 읽은 문서 • 웹사이트의 신뢰도 • 문서의 인기도 • 검색자의 정보 • …
  9. 9. Word Matching 검색한 단어가 존재하는 문서를 가져옴 • Ctrl-F • 제목에만 적용할 경우 꽤 효과적임
  10. 10. “탑항공이 폐업한게 진짜임?” “탑항공 폐업 레알?”
  11. 11. TF-IDF Term Frequency – Inverse Document Frequency • 중요 키워드 (흔하지 않은 단어)에 더 높은 가중치를 줌. • 질문이 길어지고 문서 내용 검색을 한다면 필수
  12. 12. Okapi BM25 “Best Matching” (Robertson et al., 1970s) • TF-IDF 의 “업그레이드 버젼” • TF 부분을 변경 왜 더했다 뺐다 그러는 거야
  13. 13. LSA Latent Semantic Analysis (Deerwester et al., 1988) • Bag of words (sparse) à dense vector via SVD • 각 단어에 추상적인 “태그”를 달아줌 • 추상적인 ”태그”를 통해 다른 단어끼리도 비교할 수 있게 됨. • “폐업” ~ “망하다” ~ “몰락”
  14. 14. 내가 궁금한 걸 꼭 집어서 알려줄 순 없어? 문서는 찾아드릴 수 있는데요…
  15. 15. 검색 != 문장 독해
  16. 16. 검색의 한계 문장을 “읽는” 것이 아니다 • 단어 (lexical) 수준의 정보습득은 가능하나… • 문법적 (syntactic) 또는 의미적 (semantic) 맥락을 파악 못함. • 문서나 문단 수준 이상으로 “꼭 집어서” 답을 가져오기 힘듬.
  17. 17. 2. NLP로 “읽는” QA
  18. 18. 게으른 우리가 원하는 것
  19. 19. 제가 읽어봤는데요, 대내외적인 경영환경 악화로 폐업했대요. 왜 폐업했대? 똑똑하구만!
  20. 20. 기계학습의 첫 단계: 인풋 ,아웃풋 정의하기
  21. 21. 탑항공이 왜 폐업했대? 대내외적인 경영환경 악화 Inputs Output 일단 인풋과 아웃풋을 정의해보잣
  22. 22. 생성 or 추출?
  23. 23. 생성모델은 매력적으로 보이지만…
  24. 24. Generative Model 의 문제점 서비스 퀄리티가 안나온다. • 엉뚱한 답을 내는 경우가 너무 많음. • 데이터 퀄리티 컨트롤이 어려움. (예: MS MARCO1) 1 Nguyen et al. MS MARCO: A human generated machine reading comprehension dataset. 2016. 평가 (Evaluation) 도 어렵다. • BLEU 가 있기는 하지만…
  25. 25. 결국 Extractive
  26. 26. 5분만에 보는 Neural Extractive QA Trend
  27. 27. 7 Milestones in Extractive QA 1. Sentence-level QA (May 2015) 2. Phrase-level QA (May 2016) 3. Cross-attention (Nov 2016) 4. Self-attention (Mar 2017) 5. Transfer learning (Nov 2017) 6. Super-human level (Jan 2018) 7. What’s next? (Nov 2018) Task definition Models
  28. 28. 7 Milestones in Extractive QA 1. Sentence-level QA (May 2015) 2. Phrase-level QA (May 2016) 3. Cross-attention (Nov 2016) 4. Self-attention (Mar 2017) 5. Transfer learning (Nov 2017) 6. Super-human level (Jan 2018) 7. What’s next? (Nov 2018)
  29. 29. 1. Sentence-level QA Second Epistle to the Corinthians The Second Epistle to the Corinthians, often referred to as Second Corinthians (and written as 2 Corinthians), is the eighth book of the New Testament of the Bible. Paul the Apostle and “Timothy our brother” wrote this epistle to “the church of God which is at Corinth, with all the saints which are in all Achaia”. Who wrote second Corinthians? Yang et al. WikiQA: A Challenge Dataset for Open-domain Question Answering. EMNLP 2015.
  30. 30. 1. Sentence-level QA Second Epistle to the Corinthians The Second Epistle to the Corinthians, often referred to as Second Corinthians (and written as 2 Corinthians), is the eighth book of the New Testament of the Bible. Paul the Apostle and “Timothy our brother” wrote this epistle to “the church of God which is at Corinth, with all the saints which are in all Achaia”. Who wrote second Corinthians? Yang et al. WikiQA: A Challenge Dataset for Open-domain Question Answering. EMNLP 2015.
  31. 31. 하지만 답이 너무 길다…
  32. 32. 답만 딱 보여줄 수 없을까?
  33. 33. 2. Phrase-level QA Second Epistle to the Corinthians The Second Epistle to the Corinthians, often referred to as Second Corinthians (and written as 2 Corinthians), is the eighth book of the New Testament of the Bible. Paul the Apostle and “Timothy our brother” wrote this epistle to “the church of God which is at Corinth, with all the saints which are in all Achaia”. Who wrote second Corinthians? Rajpurkar et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text. EMNLP 2016
  34. 34. 2. Phrase-level QA Second Epistle to the Corinthians The Second Epistle to the Corinthians, often referred to as Second Corinthians (and written as 2 Corinthians), is the eighth book of the New Testament of the Bible. Paul the Apostle and “Timothy our brother” wrote this epistle to “the church of God which is at Corinth, with all the saints which are in all Achaia”. Who wrote second Corinthians? Rajpurkar et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text. EMNLP 2016
  35. 35. SQuAD: 100,000+
  36. 36. 2년동안 100+ models!
  37. 37. 7 Milestones in Extractive QA 1. Sentence-level QA (May 2015) 2. Phrase-level QA (May 2016) 3. Cross-attention (Nov 2016) 4. Self-attention (Mar 2017) 5. Transfer learning (Nov 2017) 6. Super-human level (Jan 2018) 7. What’s next? (Nov 2018)
  38. 38. 2. Cross-attention Second Epistle to the Corinthians The Second Epistle to the Corinthians, often referred to as Second Corinthians (and written as 2 Corinthians), is the eighth book of the New Testament of the Bible. Paul the Apostle and “Timothy our brother” wrote this epistle to “the church of God which is at Corinth, with all the saints which are in all Achaia”. Who wrote second Corinthians?
  39. 39. 문서를 읽으면서 질문을 참고 질문을 읽으면서 문서를 참고
  40. 40. 2. Cross-attention Second Epistle to the Corinthians The Second Epistle to the Corinthians, often referred to as Second Corinthians (and written as 2 Corinthians), is the eighth book of the New Testament of the Bible. Paul the Apostle and “Timothy our brother” wrote this epistle to “the church of God which is at Corinth, with all the saints which are in all Achaia”. Who wrote second Corinthians? Seo et al. Bi-directional attention flow for machine comprehension. ICLR 2017.
  41. 41. 2. Self-attention Second Epistle to the Corinthians The Second Epistle to the Corinthians, often referred to as Second Corinthians (and written as 2 Corinthians), is the eighth book of the New Testament of the Bible. Paul the Apostle and “Timothy our brother” wrote this epistle to “the church of God which is at Corinth, with all the saints which are in all Achaia”. Who wrote second Corinthians? Clark & Gardner. Simple and effective multi-paragraph reading comprehension. 2017
  42. 42. 문서를 읽으면서 문서의 다른 부분을 참고
  43. 43. 2. Self-attention Second Epistle to the Corinthians The Second Epistle to the Corinthians, often referred to as Second Corinthians (and written as 2 Corinthians), is the eighth book of the New Testament of the Bible. Paul the Apostle and “Timothy our brother” wrote this epistle to “the church of God which is at Corinth, with all the saints which are in all Achaia”. Who wrote second Corinthians? Clark & Gardner. Simple and effective multi-paragraph reading comprehension. 2017
  44. 44. Unlabeled Corpus를 활용할 수 없을까?
  45. 45. 4. Transfer learning 3 billion words, unlabeled 2 million words, labeled Language model Peters et al. Deep contextualized word representations. NAACL 2018.
  46. 46. 컴퓨터가 사람을 이길 수 있을까?
  47. 47. 5. Super-human level • Ensemble • NLP tools (POS, parser, etc.) • Data Augmentation • A lot of layers Hi, Nice to meet you! MT 안녕, 반가워! MT Hello, great to see you! • 이…. • 것… • 저… • 것… Yu et al. QANet: Combining local convolution with global self- attention for reading comprehension. ICLR 2018.
  48. 48. 오늘 4시 Track 4 사람보다 5% 높음.
  49. 49. 7 Milestones in Extractive QA 1. Sentence-level QA (May 2015) 2. Phrase-level QA (May 2016) 3. Cross-attention (Nov 2016) 4. Self-attention (Mar 2017) 5. Transfer learning (Nov 2017) 6. Super-human level (Jan 2018) 7. What’s next? (Nov 2018)
  50. 50. QuAC (Conversational) Choi et al., EMNLP 2018 HotpotQA (Reasoning) Yang et al., EMNLP 2018
  51. 51. 정확한 건 좋은데, 얼마나 걸려? 음… GPU를 사용하면 한 문서 읽는데 0.1초정도?
  52. 52. 0.1초
  53. 53. 하지만 Linear-time 의 굴레 에서 벗어날 수가 없다. Microsoft Research Asia. R-Net: machine reading comprehension with self matching networks. 2017.
  54. 54. 문서 560만개 단어 30억개
  55. 55. 정확한 건 좋은데, 얼마나 걸려? 음… GPU를 사용하면 한 문서 읽는데 0.1초정도? 그러니까… 6일정도요.
  56. 56. 3. 검색과 NLP의 접점
  57. 57. 질문 하나에 1주일? !#$@*%(@*@ 아 그러면 검색을 이용해서 문서를 찾고, 그거만 읽을게요!
  58. 58. Solution 1: 찾고나서 읽자!
  59. 59. 1961 Chen et al. Reading Wikipedia to Answer Open-Domain Questions. ACL 2017.
  60. 60. 잠깐, 그런데 검색엔진이 잘못된 답을 내면 어떡하지? “탑항공이 폐업한게 진짜임?”
  61. 61. Error Propagation 뿐만 아니라
  62. 62. 엘레강스가 부족하다 *End-to-End 중독
  63. 63. 1961 하나의 깔끔한 모델 ?
  64. 64. 5.8일 문서 560만개 4,000x 느리다5,000,000x 짧고 0.1초 CPU Titan Xp
  65. 65. 200억배 빠르게?!
  66. 66. Solution 2: 찾기와 읽기를 동시에?
  67. 67. 검색은 어떻게 문서를 빨리 찾을까?
  68. 68. 그린팩토리 도서관이군.
  69. 69. 역사 미술 IT . . . 소설 한국전쟁은 언제 터졌어? A B K . . .
  70. 70. 벡터를 정리하자
  71. 71. [0.3, 0.5, …] [0.7, 0.1, …] [0.6, 0.2, …] . . . [0.4, 0.4, …] 한국전쟁은 언제 터졌어? […] […] […] . . . [0.5, 0.1, …] [0.3, 0.4, …] [0.4, 0.5, …] [0.8, 0.1, …] [0.4, 0.4, …] [0.4, 0.3, …] Locality-Sensitive Hashing 비슷한 아이템의 충돌을 최대화 MIPS
  72. 72. Locality-Sensitive Hashing (LSH) • Symmetric: distance functions (Nearest Neighbor Search) • L2 • L1 • Cosine • Asymmetric: inner product (MIPS) • Dot product
  73. 73. !(#$% log $ ) !(#$) *= 근사 factor (<1) Shrivastava and Li. Asymmetric lsh (alsh) for sublinear time maximum inner product search (mips). NIPS 2014.
  74. 74. Sublinear-time 근사값. 아주 빠르다.
  75. 75. 문서 à 구문 (Phrase)?
  76. 76. Super Bowl 50 !" American football game !# National Football League !$ Denver Broncos !% … Which NFL team represented the AFC at Super Bowl 50? & MIPS
  77. 77. 수식으로 보는 기존과 비교 • 문서 d 와 쿼리 q 가 주어졌을 때: !" = argmax ) *+("|.; 0) *+("|.; 0) ∝ exp(5+ ", ., 0 ) *+("|.; 0) ∝ exp(7+(.) 8 9+(", 0)) where 기존: 매 새로운 질문마다 F 를 재계산 해야 함. 제안: H 는 미리 계산 될 수 있고 index (hash) 될 수도 있음 Decomposition
  78. 78. 그러나 Decomposition이 매우 어렵다
  79. 79. 새로운 연구문제: Phrase-Indexed QA (PIQA)
  80. 80. 5분간 듣는 PIQA 1년 삽질기 1. Baseline 은 그리 어렵지 않았다 2. Duality의 활용 3. Multimodality… 4. Sparsity: 단번에 9% 업! 5. Scalability: 가능은 하지만 만만치 않은… 작년 6월부터
  81. 81. Baseline 1: LSTM … water transforms into steam within a boiler … What does water turn into when heated? Document Question Bi-LSTM Bi-LSTM !" !# !$ !% !& !' !( Weighted Sum ) Nearest Neighbor
  82. 82. Baseline 2: Self-Attention … water transforms into steam within a boiler … What does water turn into when heated? Document Question !" !# !$ !% !& !' !( “steam” “water” + “transform” + “boiler” “What” “water” + “turn” + “heated” type clue type clue )% * dot
  83. 83. SQuAD F1 (%) EM (%) First Baseline 40.0 51.0 SOTA 91.2 85.4 PI-SQuAD F1 EM LSTM 57.2 46.8 LSTM+SA 59.8 49.0 Seo et al. Phrase-indexed question answering: a new challenge toward scalable document comprehension. EMNLP 2018.
  84. 84. Duality & Multimodality
  85. 85. Barack Obama was 44th president from 2009 to 2017. 일대다 관계 !, # $ $(!, #) Q1: Who was president in 2009? Q2: Who was the 44th president?
  86. 86. Duality를 활용할 수 없을까?
  87. 87. Duality: Question Reconstruction What does water turn into when heated? Question Bi-LSTM !" !# !$ !% !& !' !( Weighted Sum ) Nearest Neighbor Generation seq2seq decoder (without attention)
  88. 88. SQuAD F1 (%) EM (%) First Baseline 40.0 51.0 SOTA 91.2 85.4 PI-SQuAD F1 EM LSTM 57.2 46.8 LSTM+SA 59.8 49.0
  89. 89. SQuAD F1 (%) EM (%) First Baseline 40.0 51.0 SOTA 91.2 85.4 PI-SQuAD F1 EM LSTM 57.2 46.8 LSTM+SA 59.8 49.0 LSTM+SA+Dual 63.2 52.0
  90. 90. 어떻게 Multimodality를 해결할 수 있을까?
  91. 91. !"($|&; () Barack Obama was 44th president from 2009 to 2017. Who was president in 2009? Who was the 44th president? *"($|&; () Multimodality
  92. 92. 이론적인 해결책
  93. 93. Barack Obama was 44th president from 2009 to 2017. Q1: Who was president in 2009? Q2: Who was the 44th president? Latent Variable 을 사용하면 된다? !, # $ $(!, &1, #) $(!, z2, #)
  94. 94. 그래서 (1년동안!) 시도해 본 것들 1. Multiple identical models (ensemble) 2. Orthogonality regularization 3. Sequential decoding 4. Latent variable from Gaussian distribution 5. Latent variable from surrounding words
  95. 95. 그래서 (1년동안!) 시도해 본 것들 1. Multiple identical models (ensemble) 2. Orthogonality regularization 3. Sequential decoding 4. Latent variable from Gaussian distribution 5. Latent variable from surrounding words 정확성을 좀 올려주지만, 30배 이상의 storage가 필요. 안됨…
  96. 96. SQuAD F1 (%) EM (%) First Baseline 40.0 51.0 SOTA 91.2 85.4 PI-SQuAD F1 EM LSTM 57.2 46.8 LSTM+SA 59.8 49.0 LSTM+SA+Dual 63.2 52.0
  97. 97. SQuAD F1 (%) EM (%) First Baseline 40.0 51.0 SOTA 91.2 85.4 PI-SQuAD F1 EM LSTM 57.2 46.8 LSTM+SA 59.8 49.0 LSTM+SA+Dual 63.2 52.0 LSTM+SA+Multi-mode 66.5 55.1
  98. 98. Dense + Sparse?
  99. 99. Sparse vector “steam” “water” + “transform” + “boiler” type clue !" steamboiler water transform Dense vector
  100. 100. SQuAD F1 (%) EM (%) First Baseline 40.0 51.0 SOTA 91.2 85.4 PI-SQuAD F1 EM LSTM 57.2 46.8 LSTM+SA 59.8 49.0 LSTM+SA+Dual 63.2 52.0 LSTM+SA+Multi-mode 66.5 55.1
  101. 101. SQuAD F1 (%) EM (%) First Baseline 40.0 51.0 SOTA 91.2 85.4 PI-SQuAD F1 EM LSTM 57.2 46.8 LSTM+SA 59.8 49.0 LSTM+SA+Dual 63.2 52.0 LSTM+SA+Multi-mode 66.5 55.1 LSTM+SA+Sparse+ELMo 69.3 58.7 To be on arXiv soon
  102. 102. Scalability 고려사항 1 • SQuAD 는 문서 하나만 보는 것. à 벤치마크의 성격이 강함 • 실제 QA 시나리오가 아님. • End-to-end 가 Pipeline보다 더 나을거라는 보장? 추가 실험들이 필요!
  103. 103. Scalability 고려사항 2 • 영어 위키피디아 단어수: 30억개 • 단어당 구문수: 평균 7개 • 구문당 vector dimension: 1024 • Float32: 4 Byte 약 90 TB (210억개의 구문)
  104. 104. Scalability 고려사항 2 • 영어 위키피디아 단어수: 30억개 • 단어당 구문수: 평균 7개 • 구문당 vector dimension: 1024 • Float32: 4 Byte 최적화 가능 약 90 TB (210억개의 구문)
  105. 105. phrase embedding에 내포된 의미?
  106. 106. Super Bowl 50 !" American football game !# National Football League !$ Denver Broncos !% … Which NFL team represented the AFC at Super Bowl 50? & MIPS
  107. 107. According to the American Library Association, this makes … … tasked with drafting a European Charter of Human Rights, … 비슷한 타입의 고유명사 (lexical)
  108. 108. The LM engines were successfully test- fired and restarted, … Steam turbines were extensively applied … 비슷한 semantic (의미) 및 syntactic (문법) 구조
  109. 109. … primarily accomplished through the ductile stretching and thinning. … directly derived from the homogeneity or symmetry of space … 비슷한 syntactic (문법) 구조
  110. 110. 그러니까 결론이 뭐야? 검색과 NLP의 아름다운 조화 아직 갈길은 멀지만, 같이 연구하고 고민해 보자구요! 나는 당장 잘되는게 필요하다구 둘 다 할게요 ㅜㅜ
  111. 111. tl;dr: Representing the world knowledge in an elegant way
  112. 112. Thank you
  113. 113. Q & A
  114. 114. 질문은 Slido에 남겨주세요. sli.do #deview TRACK 2
  115. 115. We are Hiring! Domains • Speech Recognition • Speech Synthesis • Computer Vision • Natural Language • NSML / AutoML • Finance AI • App/Web Services Positions • Research Scientist • Research Engineer • SW Engineer • Android / iOS Engineer • Backend Engineer • Data Engineer • UI/UX Engineer • Internship Member • Global Residency clova-jobs@navercorp.com

×