2. ▪ ACL2020 overview
▪ Introduction to the ACL2020 Best Papers
▪ Best Paper
▪ Beyond Accuracy: Behavioral Testing of NLP Models with CheckList [Ribeiro+]
▪ Best Paper (Honorable Mention)
▪ Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics [Mathur+]
▪ Don't Stop Pretraining: Adapt Language Models to Domains and Tasks [Gururangan+]
AGENDA
5. ACL2020 overview: number of submissions per track
[Figure: bar chart of submission counts per track, comparing ACL2019 and ACL2020; the legend marks tracks that are new in 2020 and tracks with 200+ submissions]
2019 tracks: Information Extraction, Text Mining / Machine Learning / Machine Translation / Dialogue and Interactive Systems / Generation / Question Answering / Sentiment Analysis, Argument Mining / Word-level Semantics / Applications / Resources and Evaluation / Multidisciplinary, AC COI / Sentence-level Semantics / Tagging, Chunking, Syntax, Parsing / Social Media / Summarization / Document Analysis / Multilinguality / Textual Inference, Other Areas of Semantics / Discourse and Pragmatics / Phonology, Morphology, Word Segmentation
2020 tracks: Machine Learning for NLP / Dialogue and Interactive Systems / Machine Translation / Information Extraction / NLP Applications / Generation / Question Answering / Resources and Evaluation / Summarization / Computational Social Science and Social Media / Semantics: Sentence Level / Interpretability and Analysis of Models for NLP / Semantics: Lexical / Information Retrieval and Text Mining / Language Grounding to Vision, Robotics and Beyond / Theme / Cognitive Modeling and Psycholinguistics / Speech and Multimodality / Syntax: Tagging, Chunking and Parsing / Multidisciplinary and Area Chair COI / Discourse and Pragmatics / Phonology, Morphology and Word Segmentation / Ethics and NLP / Sentiment Analysis, Stylistic Analysis, and Argument Mining / Semantics: Textual Inference and Other Areas of Semantics / Theory and Formalism in NLP (Linguistic and Mathematical) / Vision, Robotics, Multimodal Grounding, Speech / Linguistic Theories, Cognitive, Psycholinguistics
6. ▪ ACL2020 overview
▪ Introduction to the ACL2020 Best Papers
▪ Best Paper
▪ Beyond Accuracy: Behavioral Testing of NLP Models with CheckList [Ribeiro+]
▪ Best Paper (Honorable Mention)
▪ Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics [Mathur+]
▪ Don't Stop Pretraining: Adapt Language Models to Domains and Tasks [Gururangan+]
AGENDA
18. ▪ User study with NLP practitioners (results)
Beyond Accuracy: Behavioral Testing of NLP Models with CheckList [Ribeiro+]
CheckList users vs. non-users:
● CheckList users tested a more diverse range of capabilities
● As a result, CheckList users found roughly 3× as many bugs
→ Suggests that using CheckList is effective
19. ▪ User study with NLP practitioners (results)
Template users vs. non-users:
● Both groups covered an equivalent range of capabilities
● Template users ran far more tests and, as a result, found more bugs
→ Suggests that using CheckList is effective
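The template-based test generation described above can be sketched in plain Python. This is a minimal illustration, not the actual CheckList API; the template, fill-in values, and sentiment labels below are invented for the example.

```python
from itertools import product

# A hypothetical CheckList-style template; {name} and {adj} are fill-in slots.
template = "{name} is a very {adj} person."
fillers = {
    "name": ["Alice", "Bob"],
    "adj": ["friendly", "rude"],
}
# Expected sentiment label for each adjective (an assumed labeling).
labels = {"friendly": "positive", "rude": "negative"}

# Expand the template into labeled test cases: every combination of fillers
# yields one (sentence, expected_label) pair that a model can be tested on.
tests = []
for name, adj in product(fillers["name"], fillers["adj"]):
    tests.append((template.format(name=name, adj=adj), labels[adj]))
# 2 names x 2 adjectives -> 4 test cases
```

A handful of templates like this can generate hundreds of behavioral tests, which is why template users in the study ran many more tests than non-users.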
20. ▪ ACL2020 overview
▪ Introduction to the ACL2020 Best Papers
▪ Best Paper
▪ Beyond Accuracy: Behavioral Testing of NLP Models with CheckList [Ribeiro+]
▪ Best Paper (Honorable Mention)
▪ Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics [Mathur+]
▪ Don't Stop Pretraining: Adapt Language Models to Domains and Tasks [Gururangan+]
AGENDA
21. ▪ Points out problems with existing evaluation metrics for machine translation (MT)
▪ Shows that the current methodology for evaluating metrics is sensitive to outliers
▪ Shows that BLEU, the de facto standard metric, does not always correlate with human judgments
▪ Also discusses how to choose the threshold of metric improvement above which MT performance can be said to have improved
Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics [Mathur+]
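For reference, BLEU scores the n-gram overlap between an MT output and a reference translation. The following is a simplified sentence-level sketch (clipped n-gram precisions combined by a geometric mean, with a brevity penalty and no smoothing), not the official implementation.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(hyp, ref, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of clipped
    n-gram precisions (n = 1..max_n) times a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        h, r = Counter(ngrams(hyp, n)), Counter(ngrams(ref, n))
        overlap = sum((h & r).values())       # clipped matches
        total = max(sum(h.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:                  # no smoothing: any zero kills the score
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    bp = min(1.0, math.exp(1 - len(ref) / len(hyp)))  # brevity penalty
    return bp * math.exp(log_avg)

# Identical sentences score 1.0; fully disjoint sentences score 0.0.
score = sentence_bleu("i have a pen".split(), "i have a pen".split())
```

Real evaluations use corpus-level BLEU with standardized tokenization (e.g. sacreBLEU); the sketch only shows what the metric measures, and why surface n-gram overlap need not track human judgments of quality.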
22. ▪ How the quality of MT evaluation methods has conventionally been assessed
▪ Compute the Pearson r between DA (Direct Assessment) and the metric (e.g. BLEU)
▪ DA: annotators rate the outputs of the MT systems built for WMT2019 on a 100-point scale; scores are standardized per annotator and then averaged
▪ BLEU was found to correlate highly with DA for every translation task / language pair
▪ It is still used as the de facto standard evaluation metric today
[Figure: example for the source sentence "I have a pen." with MT outputs 「ペンを持つ。」 and 「ペンを持っています。」; human annotation gives a DA score of 50 on a 0-100 scale, while BLEU gives 28]
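The per-annotator standardization step of DA can be sketched as follows. The annotators, systems, and raw scores below are hypothetical, and this simplifies the actual WMT procedure: each annotator's raw 0-100 scores are converted to z-scores (to remove individual scoring biases), then averaged per system.

```python
from statistics import mean, pstdev

# Hypothetical raw 0-100 DA scores: annotator -> list of (system, score).
# annotator_2 is a "generous" rater, but ranks the systems the same way.
raw = {
    "annotator_1": [("sysA", 80), ("sysB", 60), ("sysC", 40)],
    "annotator_2": [("sysA", 95), ("sysB", 90), ("sysC", 85)],
}

# Standardize within each annotator (z-scores), then pool per system.
per_system = {}
for scores in raw.values():
    vals = [s for _, s in scores]
    mu, sigma = mean(vals), pstdev(vals)
    for system, s in scores:
        per_system.setdefault(system, []).append((s - mu) / sigma)

# Final DA score per system: mean of its standardized scores.
da = {system: mean(zs) for system, zs in per_system.items()}
```

Standardization makes the two annotators comparable despite their very different raw scales, so the pooled DA ranking (sysA > sysB > sysC) reflects both raters.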
23. ▪ Pearson r
▪ Is high, fundamentally, when the prediction error is small
▪ However, it is also known to be inflated when outliers are present
▪ For some metrics, excluding the outliers (data from a few low-quality MT systems) changes the correlation coefficient substantially
▪ → Exclude outliers when evaluating metrics
[Figure: metric scores vs. DA, without outlier removal and with outlier removal]
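The outlier sensitivity of Pearson r is easy to reproduce: a handful of systems whose metric and human scores are only weakly related, plus one very poor system, can yield a near-perfect correlation. The scores below are made up for illustration.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical metric scores vs. human DA scores for five similar MT systems:
# among these inliers the two quantities are nearly uncorrelated.
metric = [60.0, 61.0, 59.0, 62.0, 58.0]
human = [69.0, 67.0, 70.0, 70.0, 69.0]
r_inliers = pearson_r(metric, human)          # weak correlation

# Adding one very poor system (an outlier on both axes) inflates r
# to nearly 1, even though the metric says nothing about the inliers.
r_with_outlier = pearson_r(metric + [20.0], human + [30.0])
```

This is exactly why the paper recommends reporting metric correlations after excluding outlier systems.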
24. ▪ Compares the score differences between pairs of MT systems
▪ BLEU: in about 1/4 of the cases with a 3-5 point BLEU difference, there is no significant difference in DA
▪ Many recent MT papers report BLEU differences of only 1-2 points, so they may not be capturing true quality differences
▪ Whether MT performance has improved should be judged in light of the relationship between each metric and DA, as in the figure below
▪ CHRF, YISI-1, and ESIM tend to make fewer errors than BLEU and TER
▪ For now, the authors recommend using metrics such as CHRF, YISI-1, and ESIM instead of BLEU and TER
25. ▪ ACL2020 overview
▪ Introduction to the ACL2020 Best Papers
▪ Best Paper
▪ Beyond Accuracy: Behavioral Testing of NLP Models with CheckList [Ribeiro+]
▪ Best Paper (Honorable Mention)
▪ Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics [Mathur+]
▪ Don't Stop Pretraining: Adapt Language Models to Domains and Tasks [Gururangan+]
AGENDA