SlideShare a Scribd company logo
Svitlana Vakulenko
Memory Networks for QA
on Tabular Data
Institute for Information Business
WU Vienna
@vendiSV
http://vendi12.github.io
Outline
1. Task: Question Answering (QA)
2. Method: Memory Networks
3. Application: Open Data Tables
4. Memory Networks for QA on Tabular data
QA Task
Daniel Jurafsky & James H. Martin. Speech and Language Processing (Chapter 28). 2016
QA Task
bAbI Benchmark
❖ 20 QA tasks: train/test 1K samples
	 

Factoid QA

	 	 Yes/no questions

	 	 Counting

	 	 Coreference

	 	 Time manipulation

	 	 Basic deduction/induction

	 	 Positional reasoning

	 	 Reasoning about size

	 	 Path finding …

❖ 6 dialog tasks
Jason Weston, Antoine Bordes, Sumit Chopra, Alexander M. Rush, Bart van Merriënboer, Armand Joulin and Tomas
Mikolov. Towards AI Complete Question Answering: A Set of Prerequisite Toy Tasks, arXiv:1502.05698.
2. Factoid QA with two supporting facts
1 Mary got the milk.
2 John moved to the bedroom.
3 Sandra went back to the kitchen.
4 Mary travelled to the hallway.
5 Where is the milk? hallway 1 4
https://github.com/facebook/bAbI-tasks
15. Basic deduction
1 Wolves are afraid of mice.
2 Sheep are afraid of mice.
3 Winona is a sheep.
4 Mice are afraid of cats.
5 Cats are afraid of wolves.
6 Jessica is a mouse.
10 What is winona afraid of? mouse 3 2
12 What is jessica afraid of? cat 6 4
https://github.com/facebook/bAbI-tasks
19. Path finding
1 The garden is west of the bathroom.
2 The bedroom is north of the hallway.
3 The office is south of the hallway.
4 The bathroom is north of the bedroom.
5 The kitchen is east of the bedroom.
6 How do you go from the bathroom to the hallway? s,s 4 2
https://github.com/facebook/bAbI-tasks
Memory Network (MemNN)
❖ Deep neural network architecture proposed by Facebook AI Research group
Memory: indexed array of objects (e.g. vectors)
❖ Components:
I: (input) convert incoming data to the internal representation.
G: (generalisation) update memories given input.
O: (output) produce output given the memories.
R: (response) convert output representation into a response.
J. Weston, S. Chopra, A. Bordes. Memory Networks. ICLR 2015

https://blog.acolyer.org/2016/03/10/memory-networks/
Memory Network (MemNN)
https://blog.acolyer.org/2016/03/10/memory-networks/
End-to-end Memory Network (MemN2N)
Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-end memory networks." Advances in
neural information processing systems (NIPS). 2015.
Variations
❖ Ankit Kumar, Ozan Irsoy, Peter Ondruska, Mohit Iyyer, James
Bradbury, Ishaan Gulrajani, Victor Zhong, Romain Paulus,
Richard Socher: Ask Me Anything: Dynamic Memory
Networks for Natural Language Processing. ICML 2016.
❖ Alexander H. Miller, Adam Fisch, Jesse Dodge, Amir-Hossein
Karimi, Antoine Bordes, Jason Weston: Key-Value Memory
Networks for Directly Reading Documents. EMNLP 2016.
❖ Julien Perez, Fei Liu: Gated End-to-End Memory Networks.
EACL 2017.
http://yerevann.com/dmn-ui
Application: Open Data
❖ Open Data -> Open Government
❖ Increasing transparency
❖ Empowering citizens and local communities
Global Open Data Index
https://index.okfn.org/place/
Why Tabular Data?
https://www.europeandataportal.eu/mqa-service/en
Memory Networks for QA on Tabular data
https://svakulenko.ai.wu.ac.at/tableqa
Architecture
System architecture: T - input table; Q - question; A - answer
Table Representation
1 Row1 NUTS2 AT31
2 Row1 LAU2_CODE 40404
3 Row1 LAU2_NAME Braunau_am_Inn
4 Row1YEAR 2015
5 Row1 INTERNAL_MIG_IMMIGRATION 808
6 Row1 INTERNATIONAL_MIG_IMMIGRATION 357
7 Row1 IMMIGRATION_TOTAL 1165
8 Row1 INTERNAL_MIG_EMIGRATION 607
9 Row1 INTERNATIONAL_MIG_EMIGRATION 186
10 Row1 EMIGRATION_TOTAL 793
11 Row2 NUTS2 AT31
12 Row2 LAU2_CODE 40405
13 Row2 LAU2_NAME Burgkirchen
14 Row2YEAR 2015
15 Row2 INTERNAL_MIG_IMMIGRATION 138
16 Row2 INTERNATIONAL_MIG_IMMIGRATION 91
17 Row2 IMMIGRATION_TOTAL 229
18 Row2 INTERNAL_MIG_EMIGRATION 195
19 Row2 INTERNATIONAL_MIG_EMIGRATION 12
20 Row2 EMIGRATION_TOTAL 207
21 What is the INTERNATIONAL_MIG_EMIGRATION for Burgkirchen? 12 13 19
Query Disambiguation
❖ fastText model trained on Wikipedia
❖ handles OOV words
immigration recognised as immigration_total 0.96
code recognized as lau2_code 0.86
Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. Enriching word vectors
with subword information. arXiv:1607.04606. 2016.
Evaluation
The template-based questions are modified by
‣ omitting words: one or more words are removed from
the original user query;
‣ changing the position of words in the query;
‣ querying a different column that did not appear in the
questions from the training data set;
‣ inadequate questions, for which data required to
answer this question are not present in the input table.
Results
The template-based questions are modified by
✓ omitting words: one or more words are removed from
the original user query;
✓ changing the position of words in the query;
- querying a different column that did not appear in the
questions from the training data set;
- inadequate questions, for which data required to
answer this question are not present in the input table.
Results
❖ Test set:
8 samples x 4 corruption types = 32 samples
Conclusions
❖ Fancy, but tricky!
❖ Know on what you train?
data sampling & variance to ensure generalisability
❖ Know what you trained?
interaction & visualisation of learned patterns
Future Work
❖ Scaling up experiments to real world tables (variance &
OOV words)
❖ New dataset for QA from open data tables
❖ Answering questions across tables
❖ Semantic integration of open data tables
❖ Joint training with other bAbI tasks for text
understanding
Future Work
https://m.me/OpenDataAssistant
Make data your friend!
Open Data Assistant: chatbot - dialogue interface
Sebastian Neumaier, Vadim Savenkov, and Svitlana Vakulenko. "Talking Open Data." ESWC (demo). 2017
References I
❖ Svitlana Vakulenko and Vadim Savenkov. TableQA: Question Answering on Tabular Data.
2017. https://arxiv.org/abs/1705.06504
❖ Jason Weston, Sumit Chopra, Antoine Bordes. Memory Networks. ICLR 2015 .
❖ Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-end memory networks."
Advances in neural information processing systems (NIPS). 2015.
❖ Ankit Kumar, Ozan Irsoy, Peter Ondruska, Mohit Iyyer, James Bradbury, Ishaan Gulrajani,
Victor Zhong, Romain Paulus, Richard Socher: Ask Me Anything: Dynamic Memory
Networks for Natural Language Processing. ICML 2016.
❖ Alexander H. Miller, Adam Fisch, Jesse Dodge, Amir-Hossein Karimi, Antoine Bordes, Jason
Weston: Key-Value Memory Networks for Directly Reading Documents. EMNLP 2016.
❖ Julien Perez, Fei Liu: Gated End-to-End Memory Networks. EACL 2017.
❖ Antoine Bordes, Nicolas Usunier, Sumit Chopra, Jason Weston: Large-scale Simple Question
Answering with Memory Networks. CoRR abs/1506.02075. 2015.
References II
Datasets
❖ Jason Weston, Antoine Bordes, Sumit Chopra, Alexander M. Rush, Bart van Merriënboer, Armand Joulin
and Tomas Mikolov. Towards AI Complete Question Answering: A Set of Prerequisite Toy Tasks, arXiv:
1502.05698.
❖ SimpleQuestions
❖ WebQuestions
❖ CNN QA
Videos
❖ CS224D Guest Lecture - Jason Weston - 2015 https://www.youtube.com/watch?v=6NHeIEaSie8&t=1435s
❖ Jason Weston. Memory Networks for Language Understanding, ICML Tutorial 2016
http://www.thespermwhale.com/jaseweston/icml2016/
http://techtalks.tv/talks/memory-networks-for-language-understanding/62356/
Blogs
❖ https://blog.init.ai/icml-2016-memory-networks-for-language-understanding-f2ed4c8819c4
❖ https://blog.acolyer.org/2016/03/10/memory-networks/
❖ https://yerevann.github.io/2016/02/05/implementing-dynamic-memory-networks/
AI Summit Vienna 2017
mostly.ai/summit 

More Related Content

Similar to Memory Networks for Question Answering on Tabular Data

Wimmics Research Team Overview 2017
Wimmics Research Team Overview 2017Wimmics Research Team Overview 2017
Wimmics Research Team Overview 2017
Fabien Gandon
 
Yuri A. Ivanov
Yuri A. IvanovYuri A. Ivanov
Yuri A. Ivanovbutest
 
CILIP School Libraries Group Nov 2015
CILIP School Libraries Group Nov 2015CILIP School Libraries Group Nov 2015
CILIP School Libraries Group Nov 2015
EISLibrarian
 
Question Answering - Application and Challenges
Question Answering - Application and ChallengesQuestion Answering - Application and Challenges
Question Answering - Application and Challenges
Jens Lehmann
 
Analisa
AnalisaAnalisa
Building Effective Visualization Shiny WVF
Building Effective Visualization Shiny WVFBuilding Effective Visualization Shiny WVF
Building Effective Visualization Shiny WVF
Olga Scrivner
 
The Future Of Education
The Future Of EducationThe Future Of Education
The Future Of Education
James
 
Software Mining and Software Datasets
Software Mining and Software DatasetsSoftware Mining and Software Datasets
Software Mining and Software Datasets
Tao Xie
 
Knowledge Extraction for the Web of Things (KE4WoT) Challenge: Co-located wit...
Knowledge Extraction for the Web of Things (KE4WoT) Challenge: Co-located wit...Knowledge Extraction for the Web of Things (KE4WoT) Challenge: Co-located wit...
Knowledge Extraction for the Web of Things (KE4WoT) Challenge: Co-located wit...
Amélie Gyrard
 
2021_01_15 «Learning Analytics for Large Scale Data».
2021_01_15 «Learning Analytics for Large Scale Data».2021_01_15 «Learning Analytics for Large Scale Data».
2021_01_15 «Learning Analytics for Large Scale Data».
eMadrid network
 
Benchmarking graph databases on the problem of community detection
Benchmarking graph databases on the problem of community detectionBenchmarking graph databases on the problem of community detection
Benchmarking graph databases on the problem of community detection
Sotiris Beis
 
Who are you and makes you special?
Who are you and makes you special?Who are you and makes you special?
Who are you and makes you special?
Simon Buckingham Shum
 
Benchmarking graph databases on the problem of community detection
Benchmarking graph databases on the problem of community detectionBenchmarking graph databases on the problem of community detection
Benchmarking graph databases on the problem of community detection
Symeon Papadopoulos
 
Hala skafkeynote@conferencedata2021
Hala skafkeynote@conferencedata2021Hala skafkeynote@conferencedata2021
Hala skafkeynote@conferencedata2021
hala Skaf
 
Nas integrated education 20170406 v5
Nas integrated education 20170406 v5Nas integrated education 20170406 v5
Nas integrated education 20170406 v5
ISSIP
 
Real World NLP, ML, and Big Data
Real World NLP, ML, and Big DataReal World NLP, ML, and Big Data
Real World NLP, ML, and Big Data
Devin Bost
 
Collaborative writing and common core standards in the classroom slideshare
Collaborative writing and common core standards in the classroom   slideshareCollaborative writing and common core standards in the classroom   slideshare
Collaborative writing and common core standards in the classroom slideshare
Vicki Davis
 
SCAI invited talk @EMNLP2020
SCAI invited talk @EMNLP2020SCAI invited talk @EMNLP2020
SCAI invited talk @EMNLP2020
Verena Rieser
 
Maemura_WARCnet_Developing Datasheets for Archived Web Datasets.pdf
Maemura_WARCnet_Developing Datasheets for Archived Web Datasets.pdfMaemura_WARCnet_Developing Datasheets for Archived Web Datasets.pdf
Maemura_WARCnet_Developing Datasheets for Archived Web Datasets.pdf
WARCnet
 
Wrangling Big Data in a Small Tech Ecosystem
Wrangling Big Data in a Small Tech EcosystemWrangling Big Data in a Small Tech Ecosystem
Wrangling Big Data in a Small Tech Ecosystem
Shalin Hai-Jew
 

Similar to Memory Networks for Question Answering on Tabular Data (20)

Wimmics Research Team Overview 2017
Wimmics Research Team Overview 2017Wimmics Research Team Overview 2017
Wimmics Research Team Overview 2017
 
Yuri A. Ivanov
Yuri A. IvanovYuri A. Ivanov
Yuri A. Ivanov
 
CILIP School Libraries Group Nov 2015
CILIP School Libraries Group Nov 2015CILIP School Libraries Group Nov 2015
CILIP School Libraries Group Nov 2015
 
Question Answering - Application and Challenges
Question Answering - Application and ChallengesQuestion Answering - Application and Challenges
Question Answering - Application and Challenges
 
Analisa
AnalisaAnalisa
Analisa
 
Building Effective Visualization Shiny WVF
Building Effective Visualization Shiny WVFBuilding Effective Visualization Shiny WVF
Building Effective Visualization Shiny WVF
 
The Future Of Education
The Future Of EducationThe Future Of Education
The Future Of Education
 
Software Mining and Software Datasets
Software Mining and Software DatasetsSoftware Mining and Software Datasets
Software Mining and Software Datasets
 
Knowledge Extraction for the Web of Things (KE4WoT) Challenge: Co-located wit...
Knowledge Extraction for the Web of Things (KE4WoT) Challenge: Co-located wit...Knowledge Extraction for the Web of Things (KE4WoT) Challenge: Co-located wit...
Knowledge Extraction for the Web of Things (KE4WoT) Challenge: Co-located wit...
 
2021_01_15 «Learning Analytics for Large Scale Data».
2021_01_15 «Learning Analytics for Large Scale Data».2021_01_15 «Learning Analytics for Large Scale Data».
2021_01_15 «Learning Analytics for Large Scale Data».
 
Benchmarking graph databases on the problem of community detection
Benchmarking graph databases on the problem of community detectionBenchmarking graph databases on the problem of community detection
Benchmarking graph databases on the problem of community detection
 
Who are you and makes you special?
Who are you and makes you special?Who are you and makes you special?
Who are you and makes you special?
 
Benchmarking graph databases on the problem of community detection
Benchmarking graph databases on the problem of community detectionBenchmarking graph databases on the problem of community detection
Benchmarking graph databases on the problem of community detection
 
Hala skafkeynote@conferencedata2021
Hala skafkeynote@conferencedata2021Hala skafkeynote@conferencedata2021
Hala skafkeynote@conferencedata2021
 
Nas integrated education 20170406 v5
Nas integrated education 20170406 v5Nas integrated education 20170406 v5
Nas integrated education 20170406 v5
 
Real World NLP, ML, and Big Data
Real World NLP, ML, and Big DataReal World NLP, ML, and Big Data
Real World NLP, ML, and Big Data
 
Collaborative writing and common core standards in the classroom slideshare
Collaborative writing and common core standards in the classroom   slideshareCollaborative writing and common core standards in the classroom   slideshare
Collaborative writing and common core standards in the classroom slideshare
 
SCAI invited talk @EMNLP2020
SCAI invited talk @EMNLP2020SCAI invited talk @EMNLP2020
SCAI invited talk @EMNLP2020
 
Maemura_WARCnet_Developing Datasheets for Archived Web Datasets.pdf
Maemura_WARCnet_Developing Datasheets for Archived Web Datasets.pdfMaemura_WARCnet_Developing Datasheets for Archived Web Datasets.pdf
Maemura_WARCnet_Developing Datasheets for Archived Web Datasets.pdf
 
Wrangling Big Data in a Small Tech Ecosystem
Wrangling Big Data in a Small Tech EcosystemWrangling Big Data in a Small Tech Ecosystem
Wrangling Big Data in a Small Tech Ecosystem
 

Recently uploaded

Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
Vlad Stirbu
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
НАДІЯ ФЕДЮШКО БАЦ «Професійне зростання QA спеціаліста»
НАДІЯ ФЕДЮШКО БАЦ  «Професійне зростання QA спеціаліста»НАДІЯ ФЕДЮШКО БАЦ  «Професійне зростання QA спеціаліста»
НАДІЯ ФЕДЮШКО БАЦ «Професійне зростання QA спеціаліста»
QADay
 
UiPath New York Community Day in-person event
UiPath New York Community Day in-person eventUiPath New York Community Day in-person event
UiPath New York Community Day in-person event
DianaGray10
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
Bhaskar Mitra
 
Ransomware Mallox [EN].pdf
Ransomware         Mallox       [EN].pdfRansomware         Mallox       [EN].pdf
Ransomware Mallox [EN].pdf
Overkill Security
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 

Recently uploaded (20)

Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
НАДІЯ ФЕДЮШКО БАЦ «Професійне зростання QA спеціаліста»
НАДІЯ ФЕДЮШКО БАЦ  «Професійне зростання QA спеціаліста»НАДІЯ ФЕДЮШКО БАЦ  «Професійне зростання QA спеціаліста»
НАДІЯ ФЕДЮШКО БАЦ «Професійне зростання QA спеціаліста»
 
UiPath New York Community Day in-person event
UiPath New York Community Day in-person eventUiPath New York Community Day in-person event
UiPath New York Community Day in-person event
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
Ransomware Mallox [EN].pdf
Ransomware         Mallox       [EN].pdfRansomware         Mallox       [EN].pdf
Ransomware Mallox [EN].pdf
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 

Memory Networks for Question Answering on Tabular Data

  • 1. Svitlana Vakulenko Memory Networks for QA on Tabular Data Institute for Information Business WU Vienna @vendiSV http://vendi12.github.io
  • 2. Outline 1. Task: Question Answering (QA) 2. Method: Memory Networks 3. Application: Open Data Tables 4. Memory Networks for QA on Tabular data
  • 3. QA Task Daniel Jurafsky & James H. Martin. Speech and Language Processing (Chapter 28). 2016
  • 5. bAbI Benchmark ❖ 20 QA tasks: train/test 1K samples Factoid QA Yes/no questions Counting Coreference Time manipulation Basic deduction/induction Positional reasoning Reasoning about size Path finding … ❖ 6 dialog tasks Jason Weston, Antoine Bordes, Sumit Chopra, Alexander M. Rush, Bart van Merriënboer, Armand Joulin and Tomas Mikolov. Towards AI Complete Question Answering: A Set of Prerequisite Toy Tasks, arXiv:1502.05698.
  • 6. 2. Factoid QA with two supporting facts 1 Mary got the milk. 2 John moved to the bedroom. 3 Sandra went back to the kitchen. 4 Mary travelled to the hallway. 5 Where is the milk? hallway 1 4 https://github.com/facebook/bAbI-tasks
  • 7. 15. Basic deduction 1 Wolves are afraid of mice. 2 Sheep are afraid of mice. 3 Winona is a sheep. 4 Mice are afraid of cats. 5 Cats are afraid of wolves. 6 Jessica is a mouse. 10 What is winona afraid of? mouse 3 2 12 What is jessica afraid of? cat 6 4 https://github.com/facebook/bAbI-tasks
  • 8. 19. Path finding 1 The garden is west of the bathroom. 2 The bedroom is north of the hallway. 3 The office is south of the hallway. 4 The bathroom is north of the bedroom. 5 The kitchen is east of the bedroom. 6 How do you go from the bathroom to the hallway? s,s 4 2 https://github.com/facebook/bAbI-tasks
  • 9. Memory Network (MemNN) ❖ Deep neural network architecture proposed by Facebook AI Research group Memory: indexed array of objects (e.g. vectors) ❖ Components: I: (input) convert incoming data to the internal representation. G: (generalisation) update memories given input. O: (output) produce output given the memories. R: (response) convert output representation into a response. J. Weston, S. Chopra, A. Bordes. Memory Networks. ICLR 2015 https://blog.acolyer.org/2016/03/10/memory-networks/
  • 11. End-to-end Memory Network (MemN2N) Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-end memory networks." Advances in neural information processing systems (NIPS). 2015.
  • 12. Variations ❖ Ankit Kumar, Ozan Irsoy, Peter Ondruska, Mohit Iyyer, James Bradbury, Ishaan Gulrajani, Victor Zhong, Romain Paulus, Richard Socher: Ask Me Anything: Dynamic Memory Networks for Natural Language Processing. ICML 2016. ❖ Alexander H. Miller, Adam Fisch, Jesse Dodge, Amir-Hossein Karimi, Antoine Bordes, Jason Weston: Key-Value Memory Networks for Directly Reading Documents. EMNLP 2016. ❖ Julien Perez, Fei Liu: Gated End-to-End Memory Networks. EACL 2017. http://yerevann.com/dmn-ui
  • 13. Application: Open Data ❖ Open Data -> Open Government ❖ Increasing transparency ❖ Empowering citizens and local communities
  • 14. Global Open Data Index https://index.okfn.org/place/
  • 16. Memory Networks for QA on Tabular data https://svakulenko.ai.wu.ac.at/tableqa
  • 17. Architecture System architecture: T - input table; Q - question; A - answer
  • 18. Table Representation 1 Row1 NUTS2 AT31 2 Row1 LAU2_CODE 40404 3 Row1 LAU2_NAME Braunau_am_Inn 4 Row1YEAR 2015 5 Row1 INTERNAL_MIG_IMMIGRATION 808 6 Row1 INTERNATIONAL_MIG_IMMIGRATION 357 7 Row1 IMMIGRATION_TOTAL 1165 8 Row1 INTERNAL_MIG_EMIGRATION 607 9 Row1 INTERNATIONAL_MIG_EMIGRATION 186 10 Row1 EMIGRATION_TOTAL 793 11 Row2 NUTS2 AT31 12 Row2 LAU2_CODE 40405 13 Row2 LAU2_NAME Burgkirchen 14 Row2YEAR 2015 15 Row2 INTERNAL_MIG_IMMIGRATION 138 16 Row2 INTERNATIONAL_MIG_IMMIGRATION 91 17 Row2 IMMIGRATION_TOTAL 229 18 Row2 INTERNAL_MIG_EMIGRATION 195 19 Row2 INTERNATIONAL_MIG_EMIGRATION 12 20 Row2 EMIGRATION_TOTAL 207 21 What is the INTERNATIONAL_MIG_EMIGRATION for Burgkirchen? 12 13 19
  • 19. Query Disambiguation ❖ fastText model trained on Wikipedia ❖ handles OOV words immigration recognised as immigration_total 0.96 code recognized as lau2_code 0.86 Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. Enriching word vectors with subword information. arXiv:1607.04606. 2016.
  • 20. Evaluation The template-based questions are modified by ‣ omitting words: one or more words are removed from the original user query; ‣ changing the position of words in the query; ‣ querying a different column that did not appear in the questions from the training data set; ‣ inadequate questions, for which data required to answer this question are not present in the input table.
  • 21. Results The template-based questions are modified by ✓ omitting words: one or more words are removed from the original user query; ✓ changing the position of words in the query; - querying a different column that did not appear in the questions from the training data set; - inadequate questions, for which data required to answer this question are not present in the input table.
  • 22. Results ❖ Test set: 8 samples x 4 corruption types = 32 samples
  • 23. Conclusions ❖ Fancy, but tricky! ❖ Know on what you train? data sampling & variance to ensure generalisability ❖ Know what you trained? interaction & visualisation of learned patterns
  • 24. Future Work ❖ Scaling up experiments to real world tables (variance & OOV words) ❖ New dataset for QA from open data tables ❖ Answering questions across tables ❖ Semantic integration of open data tables ❖ Joint training with other bAbI tasks for text understanding
  • 25. Future Work https://m.me/OpenDataAssistant Make data your friend! Open Data Assistant: chatbot - dialogue interface Sebastian Neumaier, Vadim Savenkov, and Svitlana Vakulenko. "Talking Open Data." ESWC (demo). 2017
  • 26. References I ❖ Svitlana Vakulenko and Vadim Savenkov. TableQA: Question Answering on Tabular Data. 2017. https://arxiv.org/abs/1705.06504 ❖ Jason Weston, Sumit Chopra, Antoine Bordes. Memory Networks. ICLR 2015 . ❖ Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-end memory networks." Advances in neural information processing systems (NIPS). 2015. ❖ Ankit Kumar, Ozan Irsoy, Peter Ondruska, Mohit Iyyer, James Bradbury, Ishaan Gulrajani, Victor Zhong, Romain Paulus, Richard Socher: Ask Me Anything: Dynamic Memory Networks for Natural Language Processing. ICML 2016. ❖ Alexander H. Miller, Adam Fisch, Jesse Dodge, Amir-Hossein Karimi, Antoine Bordes, Jason Weston: Key-Value Memory Networks for Directly Reading Documents. EMNLP 2016. ❖ Julien Perez, Fei Liu: Gated End-to-End Memory Networks. EACL 2017. ❖ Antoine Bordes, Nicolas Usunier, Sumit Chopra, Jason Weston: Large-scale Simple Question Answering with Memory Networks. CoRR abs/1506.02075. 2015.
  • 27. References II Datasets ❖ Jason Weston, Antoine Bordes, Sumit Chopra, Alexander M. Rush, Bart van Merriënboer, Armand Joulin and Tomas Mikolov. Towards AI Complete Question Answering: A Set of Prerequisite Toy Tasks, arXiv: 1502.05698. ❖ SimpleQuestions ❖ WebQuestions ❖ CNN QA Videos ❖ CS224D Guest Lecture - Jason Weston - 2015 https://www.youtube.com/watch?v=6NHeIEaSie8&t=1435s ❖ Jason Weston. Memory Networks for Language Understanding, ICML Tutorial 2016 http://www.thespermwhale.com/jaseweston/icml2016/ http://techtalks.tv/talks/memory-networks-for-language-understanding/62356/ Blogs ❖ https://blog.init.ai/icml-2016-memory-networks-for-language-understanding-f2ed4c8819c4 ❖ https://blog.acolyer.org/2016/03/10/memory-networks/ ❖ https://yerevann.github.io/2016/02/05/implementing-dynamic-memory-networks/
  • 28. AI Summit Vienna 2017 mostly.ai/summit