SlideShare a Scribd company logo
1 of 89
Download to read offline
A Framework for Automatic Question
Answering in Indian Languages
Ritwik Mishra (PhD19013)
27th July 2022
Comprehensive Panel Members
Dr Rajiv Ratn Shah
PhD advisor
IIIT-Delhi
Prof Ponnurangam Kumaraguru
PhD advisor
IIIT-Hyderabad
Prof Radhika Mamidi
External expert
IIIT-Hyderabad
Dr Arun Balaji Buduru
Internal expert
IIIT-Delhi
Dr Raghava Mutharaju
Internal expert
IIIT-Delhi
2
What is Automatic Question Answering (QnA)?
Human/user asks a question, and a computer produces an answer.
3
Outline
● Motivation
● Thesis Statement
● Contributions
○ FAQ retrieval
○ Open Information Extraction
○ Better Contextual Embeddings
● Future Plans
○ Pronoun Resolution
○ Long-context Question Answering
4
Question Answering
Open
Information
Extraction
(OIE)
Better Contextual
Embeddings
Pronoun Resolution
Our Work
5
Thesis Overview
Why languages?
● Human intelligence is very much
attributed to our ability to express
complex ideas through language.
● With the advent of text modality
(i.e. writing) our ability to store
and spread information increased
dramatically.
6
Why Indian Languages?
● Despite being spoken by billions of people, Indic languages
are considered low-resource due to the lack of annotated
resources.
“Hindi is... an underdog. It has enough resources and everything looks
promising for Hindi.”
- Dr. Monojit Choudhury, MSR India, ML India Podcast, Aug 2020
7
Cases where Indic-QnA works well
8
Cases where it fails
9
Thesis Statement
● Explore the possibility to develop a framework for Automated
Question-Answering (QnA) in Indian languages with the help of multiple
supporting tasks like Open Information Extraction (OIE) and Pronoun
Resolution, and improved contextual embeddings.
10
Thesis Statement (visualized)
11
Question-Answering
Retrieval-based
Open
Information
Extraction
Pronoun
resolution
Chunking
Improve Contextual Embeddings
Question-Answering
12
Question-Answering
Retrieval-based
Open
Information
Extraction
Pronoun
resolution
Chunking
Improve Contextual Embeddings
Types of QnA
QnA
Domain
Context
Answer Type
Discourse
Open-Domain Closed-Domain
Short
Long
No
Span-based
(MRC)
Sentence-based
(AS2)
Conversational
Memory-less
13
Methodologies in QnA
● Rule based approach
● Generative approach
● Retrieval based approach
14
Methodologies in QnA
● Rule based approach
● Generative approach
● Retrieval based approach
15
Methodologies in QnA
16
● Rule based approach
● Generative approach
● Retrieval based approach
Retrieval-based
17
Question-Answering
Retrieval-based
Open
Information
Extraction
Pronoun
resolution
Chunking
Improve Contextual Embeddings
FAQ retrieval based QnA
User query
(q)
Questioni
( Qi
)
Answeri
( Ai
)
Top-k Question-Answer pairs (QA)
Where Qi is most similar to the user query (q)
k
18
FAQ retrieval based QnA (internal view)
User query (q)
FAQ database Sentence
Similarity
Calculator
(SSC)
# Q A
# Q A score
FAQ database with each
row having its similarity
score with q 19
Example
User query (q) : बच्चे की ना भ फ
ू ली हो है ठीक करने क
े लए क्या क्या करे?
Chatbot suggested:
Q1) बच्चे की अचानक ना भ फ
ू ल जाए तो क्या करें?
A1) बच्चे की ना भ अचानक नहीं फ
ू लती आमतौर पर यह देखा गया है क जब बच्चे का पेट थोड़ा फ
ू लता है …
Q2) अगर 5 म हने क बच्चे की ना भ फ
ू ली हुई हो तो क्या करे?
A2) 5 महीने क
े बच्चे की ना भ आम तोर पर मसल्स की कमजोरी क
े कारण होती है इस में कोई चंता ….
Q3) नवजात बच्चे की फ
ू ली हुए ना भ होती है वह जब ठीक हो जाए तो क्या उसक
े बढ़ उसे गैस की प्रॉब्लम हो
जाती है?
A3) यह एक बह्रम है बच्चे की ना भ जब फ
ु ल्ली होती है तो वह खतरे क
े लक्षण नहीं है फ
ु ल्ली हुई ना भ …
20
How to evaluate?
Question ( Q )
Answer ( A )
User deciding the relevancy of the
top-k Question-Answer pairs (QA)
suggested by chatbot for 4 different queries
k
21
Question ( Q )
Answer ( A )
k
Question ( Q )
Answer ( A )
k
Question ( Q )
Answer ( A )
k
Top-k suggestions for q1 Top-k suggestions for q2
Top-k suggestions for q3 Top-k suggestions for q4
Evaluation Metrics
1. Success Rate (SR)
2. Precision at k (P@k)
3. Mean Average Precision (mAP)
4. Mean Reciprocal Rank (MRR)
5. normalized Discounted Cumulative Gain (nDCG)
22
Success Rate is the % of user queries for whom the chatbot suggested at least one relevant QA pair
among its top-k suggestions.
Evaluation techniques
1. Success Rate (SR)
2. Precision at k (P@k)
3. Mean Average Precision (mAP)
4. Mean Reciprocal Rank (MRR)
5. normalized Discounted Cumulative Gain (nDCG)
Precision at k is the proportion of recommended items in the top-k set that
are relevant. 23
Evaluation techniques
1. Success Rate (SR)
2. Precision at k (P@k)
3. Mean Average Precision (mAP)
4. Mean Reciprocal Rank (MRR)
5. normalized Discounted Cumulative Gain (nDCG)
For each user (or q)
For each relevant item
Compute precision till that item
Average them
Average them 24
Evaluation techniques
1. Success Rate (SR)
2. Precision at k (P@k)
3. Mean Average Precision (mAP)
4. Mean Reciprocal Rank (MRR)
5. normalized Discounted Cumulative Gain (nDCG)
25
Evaluation techniques
1. Success Rate (SR)
2. Precision at k (P@k)
3. Mean Average Precision (mAP)
4. Mean Reciprocal Rank (MRR)
5. normalized Discounted
Cumulative Gain (nDCG)
26
Evaluation techniques
1. Success Rate (SR)
2. Precision at k (P@k)
3. Mean Average Precision (mAP)
4. Mean Reciprocal Rank (MRR)
5. normalized Discounted
Cumulative Gain (nDCG)
27
Selected Literature
● Daniel, Jeanne E., et al. "Towards automating healthcare question answering in a noisy multilingual low-resource setting."
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019.
○ Non-public healthcare dataset in African languages
○ 230K QA pairs → 150K Qs with 126 As
○ Hence 126 clusters of Qs
○ Predict a cluster for q
● Bhagat, Pranav, Sachin Kumar Prajapati, and Aaditeshwar Seth. "Initial Lessons from Building an IVR-based Automated
Question-Answering System." Proceedings of the 2020 International Conference on Information and Communication
Technologies and Development. 2020.
○ Transcription based
○ 516 Qs with 90 As
○ User had to specify broad category of q
● Sakata, Wataru, et al. "FAQ retrieval using query-question similarity and BERT-based query-answer relevance." Proceedings of
the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2019.
○ Used stackexchange dataset (719 QA and 1K q) and scrapped localGovFAQ dataset (1.7K QA and 900 q).
○ Q-q similarity works better than QA-q or A-q similarity.
28
Selected Literature
● Daniel, Jeanne E., et al. "Towards automating healthcare question answering in a noisy multilingual low-resource setting."
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019.
○ Non-public healthcare dataset in African languages
○ 230K QA pairs → 150K Qs with 126 As
○ Hence 126 clusters of Qs
○ Predict a cluster for q
● Bhagat, Pranav, Sachin Kumar Prajapati, and Aaditeshwar Seth. "Initial Lessons from Building an IVR-based Automated
Question-Answering System." Proceedings of the 2020 International Conference on Information and Communication
Technologies and Development. 2020.
○ Transcription based
○ 516 Qs with 90 As
○ User had to specify broad category of q
● Sakata, Wataru, et al. "FAQ retrieval using query-question similarity and BERT-based query-answer relevance." Proceedings of
the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2019.
○ Used stackexchange dataset (719 QA and 1K q) and scrapped localGovFAQ dataset (1.7K QA and 900 q).
○ Q-q similarity works better than QA-q or A-q similarity.
Research Gap: There is a need to propose automated solutions to provide healthcare related information
to an end-user in remote places of India. 29
The proposed FAQ-chatbot
Chatbot
FAQ
database
if count < k
User query (q)
Finding top-k most similar
QA pair for the given q
Asking user if Q is similar to q?
No
No
Yes
A
q cannot be
answered
Yes Response
Response
User User User
30
Different SSC techniques
Sentence
Similarity
Calculator
(SSC)
Dependency Tree
Pruning (DTP) method
Cosine Similarity (COS)
method
Sentence-pair Classifier
(SPC) method
Training NOT needed Training needed
31
32
Ablation on SPC alone
33
Evaluation data
● On field testing was not feasible due to COVID.
● We collected 336 queries from the healthcare workers.
● Manually annotated each of them, and found complete/partial matching Qs.
34
q
336
Healthcare-worker
q Qs
Hold-out
test-set
For each query (q) we manually
found similar questions (Qs) from
the FAQ database
270
Results 1/2
DTP SPC COS
ℇ
DTP DTPq-e
SPC SPC+A
SPCq-e
COS COSq-e
mAP 30.5 35.1 39.4 21.1 39.1 26.5 27.9 45.3
MRR 42.6 48.5 54.6 42.2 54.2 38.7 41.0 61.6
SR 27.1 59.6 66.2 49.6 64.4 47.7 51.1 70.3
nDCG 55.5 51.2 57.1 43.9 56.5 40.8 43.3 62.5
P@3 27.1 30.0 34.6 34.6 34.6 22.7 23.9 34.6
Table 1. Comparison of all three primary approaches on hold-out test set for top-3 suggestions. Ensemble
(ℇ) is obtained by taking the best performing models, highlighted with underlined text, from each of the
primary approach.
35
Results 2/2
ℇ-COS
ℇ-DTP
ℇ-SPC
ℇ
mAP 40.9 40.8 30.3 45.3
MRR 56.2 56.5 43.8 61.6
SR 66.2 66.2 51.1 70.3
nDCG 58.4 58.2 45.5 62.5
P@3 34.6 34.6 23.9 34.6
Table 2. Results of ablation study on the Ensemble method (ℇ). We observed that each
approach plays an important role in the performance of the developed chatbot.
36
Ensemble technique ablation
37
ℇsus
ℇns
ℇqfm
ℇ
mAP 45.2 45.7 44.8 45.3
MRR 61.4 62.0 60.6 61.6
SR 71.8 70.0 67.7 70.3
nDCG 62.4 62.7 60.9 62.5
P@3 34.6 34.6 34.6 34.6
Memory footprint
38
Performance depends on diversity of database as well
Performance of ensemble models on test-sets with different level of pruning. The label ℇp2
#152
represents the ensemble results on a hold-out test-set that is pruned by removing all user
queries that have 2 or less relevant questions in the FAQ database. The resulting pruned
test-set (p2 set) has 152 user queries. 39
Limitations
● The retrieval-based approaches are limited to extracting information from an
indexed database which has to be curated manually.
● In order to retrieve information from a text dump of raw sentences, a
knowledge-graph has to constructed from the unstructured text.
● OIE tools could be used to extract facts from raw sentences in different
domains.
40
Open Information Extraction
41
Question-Answering
Retrieval-based
Open
Information
Extraction
Pronoun
resolution
Chunking
Improve Contextual Embeddings
Open Information Extraction (OIE)
● Extract facts from raw sentences.
○ Use triples to represent the facts.
○ Format of triples is <head, relation, tail>.
● Example
○ PM Modi to visit UAE in Jan marking 50 yrs of diplomatic ties.
■ one of the possible meaningful triple would be:
<PM Modi, to visit, UAE>
42
Why OIE is important?
● Its ability to extract triples from large amounts of texts in an unsupervised
manner.
● It serves as an initial step in building or augmenting a knowledge graph out of
an unstructured text.
● OIE tools have been used to solve downstream applications like
Question-Answering, Text Summarization, and Entity Linking.
43
OIE in Indian language needs a special attention
● Take an English sentence
○ Ram ate an apple
○ A sensible triple would be
■ <Ram, ate, an apple>
○ A nonsensical triple would be
■ <an apple, ate, Ram>
● Now look at a Hindi sentence (same meaning)
○ राम ने सेब खाया
○ Both these triples would be sensible
■ <राम ने, खाया, सेब>
■ <सेब, खाया, राम ने> 44
How generated triples are evaluated?
● Manual Annotations
Valid
Well-formed
Reasonable
Concrete
True
Tabular representation
Image representation 45
How generated triples are evaluated?
● Manual Annotations
● Automatic Evaluations
46
sent_id:1 वह ऑस्ट्रे लया क
े पहले प्रधान मंत्री क
े रूप में कायर्यरत थे और ऑस्ट्रे लया क
े उच्च न्यायालय क
े संस्थापक न्यायाधीश बने ।
(He served as the first Prime Minister of Australia and became a founding justice of the High Court of Australia .)
------ Cluster 1 ------
वह --> कायर्यरत थे --> [ऑस्ट्रे लया क
े ]{a} [पहले] प्रधान मंत्री क
े रूप में
(He --> served --> as [the] [first] Prime Minister [of Australia]{a})
वह --> बने --> [ऑस्ट्रे लया क
े उच्च न्यायालय क
े ]{b} [संस्थापक] न्यायाधीश
(He --> became --> [founding] justice [of the High Court of Australia]{b})
{a} ऑस्ट्रे लया क
े --> property --> [पहले] प्रधान मंत्री क
े रूप में |OR| वह --> [पहले] प्रधान मंत्री क
े रूप में कायर्यरत थे --> ऑस्ट्रे लया क
े
({a} of Australia --> property --> as [first] Prime Minister |OR| He --> served as [the] [first] Prime Minister --> of Australia)
{b} [ऑस्ट्रे लया क
े ]{c} उच्च न्यायालय क
े --> property --> [संस्थापक] न्यायाधीश |OR| वह --> [संस्थापक] न्यायाधीश बने --> [ऑस्ट्रे लया क
े ]{c} उच्च न्यायालय क
े
({b} of High Court [of Australia]{c} --> property --> [founding] justice |OR| He --> became [founding] justice --> of High Court [of
Australia]{c})
{c} ऑस्ट्रे लया क
े --> property --> उच्च न्यायालय क
े
({c} of Australia --> property --> of High Court)
------ Cluster 2 ------
वह --> [पहले] प्रधान मंत्री क
े रूप में कायर्यरत थे --> ऑस्ट्रे लया क
े
(He --> served as [the] [first] Prime Minister --> of Australia)
वह --> [संस्थापक] न्यायाधीश बने --> [ऑस्ट्रे लया क
े ]{a} उच्च न्यायालय क
े
(He --> became [founding] justice --> of High Court [of Australia]{a})
{a} ऑस्ट्रे लया क
े --> property --> उच्च न्यायालय क
े
({a} of Australia --> property --> of High Court)
Hindi-BenchIE
47
Selected Literature
● Faruqui, Manaal, and Shankar Kumar. "Multilingual Open Relation Extraction Using Cross-lingual Projection." Proceedings of
the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language
Technologies. 2015.
○ Supported Hindi but it was translation based
○ Didn’t release code but released triples
● Rao, Pattabhi RK, and Sobha Lalitha Devi. "EventXtract-IL: Event Extraction from Newswires and Social Media Text in Indian
Languages@ FIRE 2018-An Overview." FIRE (Working Notes) (2018): 282-290.
○ Event specific and domain specific IE
● Ro, Youngbin, Yukyung Lee, and Pilsung Kang. "Multiˆ2OIE: Multilingual Open Information Extraction Based on Multi-Head
Attention with BERT." Findings of the Association for Computational Linguistics: EMNLP 2020. 2020.
○ Modelled triple extraction as sequence labelling problem through BERT embeddings (end-to-end)
○ Identify relations, and then their head-tail
48
Selected Literature
● Faruqui, Manaal, and Shankar Kumar. "Multilingual Open Relation Extraction Using Cross-lingual Projection." Proceedings of
the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language
Technologies. 2015.
○ Supported Hindi but it was translation based
○ Didn’t release code but released triples
● Rao, Pattabhi RK, and Sobha Lalitha Devi. "EventXtract-IL: Event Extraction from Newswires and Social Media Text in Indian
Languages@ FIRE 2018-An Overview." FIRE (Working Notes) (2018): 282-290.
○ Event specific and domain specific IE
● Ro, Youngbin, Yukyung Lee, and Pilsung Kang. "Multiˆ2OIE: Multilingual Open Information Extraction Based on Multi-Head
Attention with BERT." Findings of the Association for Computational Linguistics: EMNLP 2020. 2020.
○ Modelled triple extraction as sequence labelling problem through BERT embeddings (end-to-end)
○ Identify relations, and then their head-tail
Research Gap:
(1) The field of Open Information Extraction (OIE) from unstructured text in Indic languages has not been
explored much. Moreover, the effectiveness of existing multilingual OIE techniques has to be evaluated on
Indic languages.
(2) There is a scarcity of annotated resources for automatic evaluation of automatically generated triples for
Indic languages.
(3) Construction of knowledge-graph using extracted triples from unstructured text in Indian languages needs
to be explored as well.
49
IndIE: our Indic OIE tool
Raw Text
Output
Sentence segmented text,
Triples for each sentence,
Execution Time
Off-the-shelf sentence segmentation and
dependency parser by Stanford
(a) XLM-roberta fine-tuned on chunk
annotated data
(b) Creating Merged-phrases
Dependency Tree (MDT)
3: Chunk tags
3: Dependency
tree
Greedy algorithm
based on
hand-crafted rules
(c) Triple extraction
2: Sentences
1: Shallow parsing
4: Passing MDT
5: Result
accumulation
50
Chunker
51
Question-Answering
Retrieval-based
Open
Information
Extraction
Pronoun
resolution
Chunking
Improve Contextual Embeddings
Chunker Implementation 1/2
Transformer
Tokenizer
Pretrained
Transformer
Tokens Token IDs Embeddings
52
Chunker Implementation 2/2
Transformer
Tokenizer
Pretrained
Transformer
Feed-forward
Layer
Tokens Token IDs
Token-level
predictions
Ground-Truth
Calculating
Loss
(a) Initial
Embeddings
(b) Taking
average
(c) Final
Embeddings
53
Results: Chunker
Classification Layers
First sub-word token
embedding
Last sub-word token
embedding
Average embedding of all
sub-word tokens
1 82±10 (50±20) 89±0.5 (62±1.0) 91±0.0 (65±0.5)
2 86±1.8 (51±6.2) 89±0.5 (54±7.4) 90±0.5 (54±4.5)
3 79±14 (43±13) 82±11 (41±12) 90±0.5 (48±2.2)
Table 3. A comparison of three approaches for solving the sub-word token embeddings for chunking task. Four different random seeds
were used to calculate the mean and standard deviation for the given samples. All the experiments were run on the combined data of
TDIL and UD. The numbers written outside round brackets represent the accuracy, whereas numbers inside round brackets represent
the macro average.
Model Hindi English Urdu Nepali Gujarati Bengali
XLM 78% 60% 84% 65% 56% 66%
CRF 67% 56% 71% 58% 53% 53%
Table 4. A comparison of (fine-tuned) XLM chunker and CRF chunker on the languages which are removed from training-set.
The numbers represent the accuracy obtained by each model when sentences from the given language are used only in the
test-set.
54
Results: Triple Evaluation (manual)
Image options ArgOE M&K Multi2OIE PredPatt IndIE
No information 17% 28% 71% 5% 4%
Most Information 22% 44% 24% 29% 17%
All information 61% 28% 5% 66% 79%
Table 5. Percentage of sentences having no/most/all information in the image representation of their generated triples. The method which
generates maximum triples with ‘All information’ is considered the best method.
#Triple ArgOE M&K Multi2OIE PredPatt IndIE
Total 45 180 50 40 272
Valid 38 142 10 39 252
Well-formed 32 75 9 36 240
Reasonable 26 69 6 32 227
Concrete 21 53 4 25 158
True 19 51 4 25 152
Table 6. Number of triples extracted by each OIE method for 106 Hindi sentences.
55
Results: Triple Evaluation (automatic)
ArgOE M&K Multi2OIE PredPatt IndIE
Precision 0.26 0.14 0.12 0.37 0.62
Recall 0.06 0.16 0.03 0.08 0.62
F1-score 0.10 0.15 0.05 0.14 0.62
Table 7. Performance of different OIE methods on Hindi-BenchIE golden set consisting of 75
Hindi sentences. It is observed that IndIE outperforms other methods on all three metrics.
56
Limitations
● Small size of evaluation datasets.
● The contextual embeddings (from transformers) are known to not take the
syntactic structure of sentence into account.
● As a result, it has been observed that syntactic information is either absent in
the transformer embeddings or it is not utilized while making the predictions.
● Therefore, there is a need to explore the possibility of incorporating syntactic
(dependency) information in transformer embeddings.
57
Improve Contextual Embeddings
58
Question-Answering
Retrieval-based
Open
Information
Extraction
Pronoun
resolution
Chunking
Improve Contextual Embeddings
Dependency-aware Transformer (DaT) Embedding
Transformer
Tokenizer
Pretrained
Transformer
Tokens Token IDs (a) Initial
Embeddings
(b) Taking
average
(c) Final
Embeddings
59
DaT Embedding: Graph
1 2 3 4 5
1 0 1 1 0 0
2 0 0 0 1 1
3 1 0 0 0 0
4 0 1 0 0 0
5 0 1 0 0 0
Adjacency Matrix A
1 2 3 4 5
1 2 0 0 0 0
2 0 3 0 0 0
3 0 0 1 0 0
4 0 0 0 1 0
5 0 0 0 0 1
Degree Matrix D
1
2 3
4 5
60
DaT Embedding: Graph Convolution Network (GCN)
1 2 3 4 5
1 0 1 1 0 0
2 0 0 0 1 1
3 1 0 0 0 0
4 0 1 0 0 0
5 0 1 0 0 0
Adjacency Matrix A
1 2 3 4 5
1 2 0 0 0 0
2 0 3 0 0 0
3 0 0 1 0 0
4 0 0 0 1 0
5 0 0 0 0 1
Degree Matrix D
1
2 3
4 5
61
DaT Embedding: GCN message passing
62
63
Selected Literature
● Jie, Zhanming, Aldrian Obaja Muis, and Wei Lu. "Efficient dependency-guided named entity recognition." Thirty-First AAAI
Conference on Artificial Intelligence. 2017.
● Marcheggiani, Diego, and Ivan Titov. "Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling."
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017.
● Zhang, Yuhao, Peng Qi, and Christopher D. Manning. "Graph Convolution over Pruned Dependency Trees Improves Relation
Extraction." Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018.
64
Selected Literature
● Jie, Zhanming, Aldrian Obaja Muis, and Wei Lu. "Efficient dependency-guided named entity recognition." Thirty-First AAAI
Conference on Artificial Intelligence. 2017.
● Marcheggiani, Diego, and Ivan Titov. "Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling."
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017.
● Zhang, Yuhao, Peng Qi, and Christopher D. Manning. "Graph Convolution over Pruned Dependency Trees Improves Relation
Extraction." Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018.
Research Gap: Incorporating dependency structure of a sentence in generating dependency-aware
contextual embeddings is an under-explored field in Indic-NLP.
65
GCN variants
0) No GCN
1)
2.1)
2.2)
3.1)
66
1/3
2/3
3/3
67
Evaluation Dataset
● Small Dataset
○ Fearspeech | #4782 | Labels: 'Fear speech':1 , 'Normal': 0
○ Codalab | #5728 | 'hate’:1, offensive':1,'defamation:1, 'fake':1, 'non-hostile':0
○ HASOC | #4665 | ‘non-hostile’: 0, ‘fake’:0, … , ‘offensive’:1 , ‘hate’: 1
● Big Dataset
○ IIITD abusive language data |
7 lacs (multilingual) | 30k Hindi |
not abusive: 0, abusive:1
68
DaT Embedding Results
69
Epoch 1 Epoch 2 Epoch 3 Epoch 4
v0
71.2 +/- 2.9
(67.7-74.6)
75.9 +/- 1.2
(74.1-77.5)
77.4 +/- 0.6
(76.8-78.1)
79.0 +/- 0.7
(78.2-79.8)
v1
72.4 +/- 1.9
(69.4-74.0)
76.6 +/- 1.2
(74.9-78.0)
78.8 +/- 1.0
(77.1-79.6)
80.3 +/- 1.1
(79.3-81.8)
v2.2
74.2 +/- 1.1
(72.5-75.3)
77.2 +/- 0.9
(75.9-78.3)
79.3 +/- 0.8
(78.4-80.2)
80.5 +/- 0.7
(79.5-81.2)
Epoch 1 Epoch 2 Epoch 3 Epoch 4
v0
71.2 +/- 2.5
(67.3-73.9)
76.2 +/- 1.2
(74.6-77.4)
77.9 +/- 0.6
(77.2-78.8)
79.8 +/- 0.6
(78.8-80.3)
v1
72.6 +/- 1.6
(70.2-74.0)
76.5 +/- 1.0
(75.5-77.8)
78.5 +/- 1.1
(76.9-79.7)
79.9 +/- 1.1
(79.0-81.8)
v2.2
74.2 +/- 1.3
(72.1-75.2)
76.6 +/- 1.0
(75.5-77.8)
79.0 +/- 1.0
(77.9-80.3)
80.0 +/- 1.2
(78.9-82.1)
Table 8. Validation Accuracy on the Small dataset
Table 9. Testing Accuracy on the Small dataset
70
Limitation
लौटे
राम अयोध्या
बने
वह राजा
।
।
nsubj
punct
obl
nsubj
xcomp
punct
root
root
राम अयोध्या लौटे । वह राजा बने । [PAD] [PAD]
राम 1 1
अयोध्या 1 1
लौटे 1 1 1 1
। 1 1
वह 1 1
राजा 1 1
बने 1 1 1 1
। 1 1
[PAD] 1
[PAD] 1 71
Future Plans
● End-to-End Pronoun Resolution in Hindi
● Indic-SpanBERT
● Long-context Question Answering
72
Pronoun Resolution
73
Question-Answering
Retrieval-based
Open
Information
Extraction
Pronoun
resolution
Chunking
Improve Contextual Embeddings
Pronoun Resolution
Every speaker had to
present his paper .
Barack Obama visited
India. Modiji came to
receive Obama.
He was brave. He was
Ashoka the great.
If you want to hire them,
then all candidates must
be treated nicely.
Ram went to forest
with his wife
Krishna was an avatar of
Vishnu.
CEO of Reliance, Mukesh
Ambani, inaugurated the
SOM building.
74
Dataset
Mujadia, Vandan, Palash Gupta, and Dipti Misra Sharma. "Coreference Annotation Scheme and Relation Types for Hindi."
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16). 2016.
75
Table 10. Corpus Details
Table 11. Distribution of co-referential entities
Coref chain
76
Unique
mention ID
0: Intermediate token
1: End token
Unique chain ID
They modify the mention
Source
Selected Literature
● Lee, Kenton, et al. "End-to-end Neural Coreference Resolution." Proceedings of the 2017 Conference on Empirical Methods in
Natural Language Processing. 2017.
● Joshi, Mandar, et al. "Spanbert: Improving pre-training by representing and predicting spans." Transactions of the Association
for Computational Linguistics 8 (2020): 64-77.
● Dakwale, Praveen, Vandan Mujadia, and Dipti Misra Sharma. "A hybrid approach for anaphora resolution in hindi." Proceedings
of the Sixth International Joint Conference on Natural Language Processing. 2013.
● Devi, Sobha Lalitha, Vijay Sundar Ram, and Pattabhi RK Rao. "A generic anaphora resolution engine for Indian languages."
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers. 2014.
● Sikdar, Utpal Kumar, Asif Ekbal, and Sriparna Saha. "A generalized framework for anaphora resolution in Indian languages."
Knowledge-Based Systems 109 (2016): 147-159.
77
Selected Literature
● Lee, Kenton, et al. "End-to-end Neural Coreference Resolution." Proceedings of the 2017 Conference on Empirical Methods in
Natural Language Processing. 2017.
● Joshi, Mandar, et al. "Spanbert: Improving pre-training by representing and predicting spans." Transactions of the Association
for Computational Linguistics 8 (2020): 64-77.
● Dakwale, Praveen, Vandan Mujadia, and Dipti Misra Sharma. "A hybrid approach for anaphora resolution in hindi." Proceedings
of the Sixth International Joint Conference on Natural Language Processing. 2013.
● Devi, Sobha Lalitha, Vijay Sundar Ram, and Pattabhi RK Rao. "A generic anaphora resolution engine for Indian languages."
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers. 2014.
● Sikdar, Utpal Kumar, Asif Ekbal, and Sriparna Saha. "A generalized framework for anaphora resolution in Indian languages."
Knowledge-Based Systems 109 (2016): 147-159.
Research Gap:
(1) A publicly available tool for coreference resolution in Indic languages is required to be built using
the state-of-art coreference resolution techniques.
(2) Pretraining of transformer models using different subword tokenization methods and pretraining
tasks are to be investigated extensively. 78
exp( )
exp( ) + exp( ) +exp( )
BERT
Loss
L1
Loss
L2
Loss
L3
= - Log ( )
Proposed Architecture: TRAINING PHASE
Mention
detector
Pronoun
detector
Pronoun
resolver
P
L
A
I
N
T
E
X
T
embeddings
79
BERT
Softmax
Proposed Architecture: INFERENCE PHASE
P
L
A
I
N
T
E
X
T
embeddings
80
Future Plans
● End-to-End Pronoun Resolution in Hindi
● Indic-SpanBERT
● Long-context Question Answering
Ritwik: In terms of architectural choices, what
are some of the promising directions for
low-resource languages?
Sebastian Ruder: (summarizing)
There are three
1. Language specific tokenization
2. Model efficiency
3. New pretraining objectives
81
QnA
Rb OIE
PR
Ch
Improve CE
Future Plans
● End-to-End Pronoun Resolution in Hindi
● Indic-SpanBERT
● Long-context Question Answering
○ One multilingual dataset called XQA by Liu et al. 2019.
○ We need to explore the applicability of our aforementioned methodologies.
82
QnA
Rb OIE
PR
Ch
Improve CE
Timeline
1. Aug 2022 - Oct 2022
▣ Incorporate the internal reviews for maternal chatbot, and submit the
manuscript to a journal like ACM Health (average citation per article is 2) or
IEEE Transactions on CSS (Impact score is 5).
▣ Finish the experimentation pipeline for DaT embeddings. Prepare the first
draft.
▣ Data preprocessing for pronoun resolution.
Today Dec 2023
Defence
2 3 4
1
83
Timeline
2. Nov 2022 - Jan 2023
▣ Incorporate internal reviews for DaT embeddings paper, and submit the draft
to ACL Rolling Reviews (ARR). Target AACL.
▣ Build the multi-task learning pipeline for pronoun resolution.
▣ Prepare data and dataloader for Indic-SpanBERT pretraining. Explore the
usability of bert and distilbert.
Today Dec 2023
Defence
2 3 4
1
84
Timeline
3. Feb 2023 - Mar 2023
▣ Implement the baselines for pronoun resolution and design evaluation
metrics.
▣ Explore the usability of OIE in long-context question answering.
▣ Use a combination of Answer Sentence Selection (AS2) and Machine
Comprehension (MC) to solve long-context question answering.
▣ Evaluate Indic-SpanBERT on various downstream tasks in Indic languages.
Document the work as a short paper.
Today Dec 2023
Defence
3
2 4
1
85
Timeline
4. May 2023 - Aug 2023
▣ Prepare the draft for pronoun resolution, and long-context question
answering. Collect internal reviews. Modify and submit in EMNLP.
▣ Begin a survey on work done in Indic NLP.
Today Dec 2023
Defence
2 4
3
1
86
Publications
1. Mishra, Ritwik, Simranjeet Singh, Rajiv Ratn Shah, Ponnurangam Kumaraguru, and
Pushpak Bhattacharya. "IndIE: A Multilingual Open Information Extraction Tool For Indic
Languages". 2022. [Submitted to a special issue of TALLIP]
2. Mishra, Ritwik, Simranjeet Singh, Jasmeet Kaur, Rajiv Ratn Shah, and Pushpendra Singh.
"Exploring the Use of Chatbots for Supporting Maternal and Child Health in
Resource-constrained Environments". 2022. [Draft ready. Under internal review]
Miscellaneous
● Mishra, Ritwik, Ponnurangam Kumaraguru, Rajiv Ratn Shah, Aanshul Sadaria, Shashank
Srikanth, Kanay Gupta, Himanshu Bhatia, and Pratik Jain. "Analyzing traffic violations
through e-challan system in metropolitan cities (workshop paper)." In 2020 IEEE Sixth
International Conference on Multimedia Big Data (BigMM), pp. 485-493. IEEE, 2020.
87
Acknowledgements
● I would like to express my gratitude to pillars group members for their valuable
guidance.
● I would like to thank Simranjeet, Samarth, Ajeet, and Jasmeet for being diligent
co-authors.
● I would also like to thank Prof Pushpak Bhattacharyya, and CFILT lab members for
providing me great insights and hosting me at IIT Bombay under Anveshan Setu
program.
● I would also like to University Grant Commission (UGC) Junior Research Fellowship
(JRF) / Senior Research Fellowship (SRF) for funding my PhD program
88
Thank You
89

More Related Content

Similar to A Framework For Automatic Question Answering in Indian Languages

Critical appraisal of a journal article
Critical appraisal of a journal articleCritical appraisal of a journal article
Critical appraisal of a journal articlePreethi Selvaraj
 
Multimodal Question Answering in the Medical Domain (CMU/LTI 2020) | Dr. Asma...
Multimodal Question Answering in the Medical Domain (CMU/LTI 2020) | Dr. Asma...Multimodal Question Answering in the Medical Domain (CMU/LTI 2020) | Dr. Asma...
Multimodal Question Answering in the Medical Domain (CMU/LTI 2020) | Dr. Asma...Asma Ben Abacha
 
114Sun Coast RemediationMichell MuldrowCo
114Sun Coast RemediationMichell MuldrowCo114Sun Coast RemediationMichell MuldrowCo
114Sun Coast RemediationMichell MuldrowCoBenitoSumpter862
 
114Sun Coast RemediationMichell MuldrowCo
114Sun Coast RemediationMichell MuldrowCo114Sun Coast RemediationMichell MuldrowCo
114Sun Coast RemediationMichell MuldrowCoSantosConleyha
 
CASP Randomised Controlled Trial Standard Checklist 11 que.docx
CASP Randomised Controlled Trial Standard Checklist  11 que.docxCASP Randomised Controlled Trial Standard Checklist  11 que.docx
CASP Randomised Controlled Trial Standard Checklist 11 que.docxbartholomeocoombs
 
Simulation: From theory to implementation
Simulation: From theory to implementationSimulation: From theory to implementation
Simulation: From theory to implementationAdam Dubrowski
 
Multi-method Evaluation in Scientific Paper Recommender Systems
Multi-method Evaluation in Scientific Paper Recommender SystemsMulti-method Evaluation in Scientific Paper Recommender Systems
Multi-method Evaluation in Scientific Paper Recommender SystemsAravind Sesagiri Raamkumar
 
SPC_Jeopardy_Download.ppt
SPC_Jeopardy_Download.pptSPC_Jeopardy_Download.ppt
SPC_Jeopardy_Download.pptDeeptiBhoknal
 
OWA BASED MAGDM TECHNIQUE IN EVALUATING DIAGNOSTIC LABORATORY UNDER FUZZY ENV...
OWA BASED MAGDM TECHNIQUE IN EVALUATING DIAGNOSTIC LABORATORY UNDER FUZZY ENV...OWA BASED MAGDM TECHNIQUE IN EVALUATING DIAGNOSTIC LABORATORY UNDER FUZZY ENV...
OWA BASED MAGDM TECHNIQUE IN EVALUATING DIAGNOSTIC LABORATORY UNDER FUZZY ENV...ijfls
 
AI in AGRICULTURE.ppt
AI in AGRICULTURE.pptAI in AGRICULTURE.ppt
AI in AGRICULTURE.pptHoang Tran
 
AI in AGRICULTURE_Group8.ppt
AI in AGRICULTURE_Group8.pptAI in AGRICULTURE_Group8.ppt
AI in AGRICULTURE_Group8.pptPrithviKollara
 
Causal discovery
Causal discoveryCausal discovery
Causal discoverydagunisa
 
Purpose The purpose of this assignment is to document sourc.docx
Purpose The purpose of this assignment is to document sourc.docxPurpose The purpose of this assignment is to document sourc.docx
Purpose The purpose of this assignment is to document sourc.docxwoodruffeloisa
 
Santipur Webinar on COVID-19 pandemic
 Santipur Webinar on COVID-19 pandemic Santipur Webinar on COVID-19 pandemic
Santipur Webinar on COVID-19 pandemicDeeptak Biswas
 
What Programmers Say About Refactoring Tools? An Empirical Investigation of ...
What Programmers Say About Refactoring Tools? An Empirical Investigation of ...What Programmers Say About Refactoring Tools? An Empirical Investigation of ...
What Programmers Say About Refactoring Tools? An Empirical Investigation of ...UFPA
 

Similar to A Framework For Automatic Question Answering in Indian Languages (20)

Critical appraisal of a journal article
Critical appraisal of a journal articleCritical appraisal of a journal article
Critical appraisal of a journal article
 
Multimodal Question Answering in the Medical Domain (CMU/LTI 2020) | Dr. Asma...
Multimodal Question Answering in the Medical Domain (CMU/LTI 2020) | Dr. Asma...Multimodal Question Answering in the Medical Domain (CMU/LTI 2020) | Dr. Asma...
Multimodal Question Answering in the Medical Domain (CMU/LTI 2020) | Dr. Asma...
 
114Sun Coast RemediationMichell MuldrowCo
114Sun Coast RemediationMichell MuldrowCo114Sun Coast RemediationMichell MuldrowCo
114Sun Coast RemediationMichell MuldrowCo
 
114Sun Coast RemediationMichell MuldrowCo
114Sun Coast RemediationMichell MuldrowCo114Sun Coast RemediationMichell MuldrowCo
114Sun Coast RemediationMichell MuldrowCo
 
Basic quantitative research
Basic quantitative researchBasic quantitative research
Basic quantitative research
 
Active learning
Active learningActive learning
Active learning
 
CASP Randomised Controlled Trial Standard Checklist 11 que.docx
CASP Randomised Controlled Trial Standard Checklist  11 que.docxCASP Randomised Controlled Trial Standard Checklist  11 que.docx
CASP Randomised Controlled Trial Standard Checklist 11 que.docx
 
Simulation: From theory to implementation
Simulation: From theory to implementationSimulation: From theory to implementation
Simulation: From theory to implementation
 
Recommender systems
Recommender systemsRecommender systems
Recommender systems
 
Multi-method Evaluation in Scientific Paper Recommender Systems
Multi-method Evaluation in Scientific Paper Recommender SystemsMulti-method Evaluation in Scientific Paper Recommender Systems
Multi-method Evaluation in Scientific Paper Recommender Systems
 
SPC_Jeopardy_Download.ppt
SPC_Jeopardy_Download.pptSPC_Jeopardy_Download.ppt
SPC_Jeopardy_Download.ppt
 
OWA BASED MAGDM TECHNIQUE IN EVALUATING DIAGNOSTIC LABORATORY UNDER FUZZY ENV...
OWA BASED MAGDM TECHNIQUE IN EVALUATING DIAGNOSTIC LABORATORY UNDER FUZZY ENV...OWA BASED MAGDM TECHNIQUE IN EVALUATING DIAGNOSTIC LABORATORY UNDER FUZZY ENV...
OWA BASED MAGDM TECHNIQUE IN EVALUATING DIAGNOSTIC LABORATORY UNDER FUZZY ENV...
 
AI in AGRICULTURE.ppt
AI in AGRICULTURE.pptAI in AGRICULTURE.ppt
AI in AGRICULTURE.ppt
 
AI in AGRICULTURE_Group8.ppt
AI in AGRICULTURE_Group8.pptAI in AGRICULTURE_Group8.ppt
AI in AGRICULTURE_Group8.ppt
 
Causal discovery
Causal discoveryCausal discovery
Causal discovery
 
Habib, Researcher Awareness + Perception: A Year in Review
Habib, Researcher Awareness + Perception: A Year in ReviewHabib, Researcher Awareness + Perception: A Year in Review
Habib, Researcher Awareness + Perception: A Year in Review
 
Purpose The purpose of this assignment is to document sourc.docx
Purpose The purpose of this assignment is to document sourc.docxPurpose The purpose of this assignment is to document sourc.docx
Purpose The purpose of this assignment is to document sourc.docx
 
Santipur Webinar on COVID-19 pandemic
 Santipur Webinar on COVID-19 pandemic Santipur Webinar on COVID-19 pandemic
Santipur Webinar on COVID-19 pandemic
 
What Programmers Say About Refactoring Tools? An Empirical Investigation of ...
What Programmers Say About Refactoring Tools? An Empirical Investigation of ...What Programmers Say About Refactoring Tools? An Empirical Investigation of ...
What Programmers Say About Refactoring Tools? An Empirical Investigation of ...
 
KGCTutorial_AIISC_2022.pptx
KGCTutorial_AIISC_2022.pptxKGCTutorial_AIISC_2022.pptx
KGCTutorial_AIISC_2022.pptx
 

More from IIIT Hyderabad

Responsible & Safe AI Systems at ACM India ROCS at IIT Bombay
Responsible & Safe AI Systems at ACM India ROCS at IIT BombayResponsible & Safe AI Systems at ACM India ROCS at IIT Bombay
Responsible & Safe AI Systems at ACM India ROCS at IIT BombayIIIT Hyderabad
 
International Collaboration: Experiences, Challenges, Success stories
International Collaboration: Experiences, Challenges, Success storiesInternational Collaboration: Experiences, Challenges, Success stories
International Collaboration: Experiences, Challenges, Success storiesIIIT Hyderabad
 
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBias
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBiasResponsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBias
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBiasIIIT Hyderabad
 
Identify, Inspect and Intervene Multimodal Fake News
Identify, Inspect and Intervene Multimodal Fake NewsIdentify, Inspect and Intervene Multimodal Fake News
Identify, Inspect and Intervene Multimodal Fake NewsIIIT Hyderabad
 
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafety
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafetyData Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafety
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafetyIIIT Hyderabad
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...IIIT Hyderabad
 
Beyond the Surface: A Computational Exploration of Linguistic Ambiguity
Beyond the Surface: A Computational Exploration of Linguistic AmbiguityBeyond the Surface: A Computational Exploration of Linguistic Ambiguity
Beyond the Surface: A Computational Exploration of Linguistic AmbiguityIIIT Hyderabad
 
Data Science for Social Good: #LegalNLP #AlgorithmicBias...
Data Science for Social Good:                      #LegalNLP #AlgorithmicBias...Data Science for Social Good:                      #LegalNLP #AlgorithmicBias...
Data Science for Social Good: #LegalNLP #AlgorithmicBias...IIIT Hyderabad
 
How to Write a (Good) Research Paper
How to Write a (Good) Research Paper How to Write a (Good) Research Paper
How to Write a (Good) Research Paper IIIT Hyderabad
 
Data Science for Social Good: #LegalNLP #AlgorithmicBias
Data Science for Social Good: #LegalNLP #AlgorithmicBiasData Science for Social Good: #LegalNLP #AlgorithmicBias
Data Science for Social Good: #LegalNLP #AlgorithmicBiasIIIT Hyderabad
 
Social Computing Research in India
Social Computing Research in IndiaSocial Computing Research in India
Social Computing Research in IndiaIIIT Hyderabad
 
Social Computing Research in India
Social Computing Research in IndiaSocial Computing Research in India
Social Computing Research in IndiaIIIT Hyderabad
 
Modeling Online User Interactions and their Offline effects on Socio-Technica...
Modeling Online User Interactions and their Offline effects on Socio-Technica...Modeling Online User Interactions and their Offline effects on Socio-Technica...
Modeling Online User Interactions and their Offline effects on Socio-Technica...IIIT Hyderabad
 
Privacy. Winter School on “Topics in Digital Trust”. IIT Bombay
Privacy. Winter School on “Topics in Digital Trust”. IIT BombayPrivacy. Winter School on “Topics in Digital Trust”. IIT Bombay
Privacy. Winter School on “Topics in Digital Trust”. IIT BombayIIIT Hyderabad
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...IIIT Hyderabad
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...IIIT Hyderabad
 
Leveraging Social Media for Financial Advice
Leveraging Social Media for Financial AdviceLeveraging Social Media for Financial Advice
Leveraging Social Media for Financial AdviceIIIT Hyderabad
 
Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...IIIT Hyderabad
 
Exposing, Examining and Intervening Fake News
Exposing, Examining and Intervening Fake NewsExposing, Examining and Intervening Fake News
Exposing, Examining and Intervening Fake NewsIIIT Hyderabad
 

More from IIIT Hyderabad (20)

Responsible & Safe AI Systems at ACM India ROCS at IIT Bombay
Responsible & Safe AI Systems at ACM India ROCS at IIT BombayResponsible & Safe AI Systems at ACM India ROCS at IIT Bombay
Responsible & Safe AI Systems at ACM India ROCS at IIT Bombay
 
International Collaboration: Experiences, Challenges, Success stories
International Collaboration: Experiences, Challenges, Success storiesInternational Collaboration: Experiences, Challenges, Success stories
International Collaboration: Experiences, Challenges, Success stories
 
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBias
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBiasResponsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBias
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBias
 
Identify, Inspect and Intervene Multimodal Fake News
Identify, Inspect and Intervene Multimodal Fake NewsIdentify, Inspect and Intervene Multimodal Fake News
Identify, Inspect and Intervene Multimodal Fake News
 
#ChatGPT #ResponsibleAI
#ChatGPT #ResponsibleAI#ChatGPT #ResponsibleAI
#ChatGPT #ResponsibleAI
 
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafety
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafetyData Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafety
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafety
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...
 
Beyond the Surface: A Computational Exploration of Linguistic Ambiguity
Beyond the Surface: A Computational Exploration of Linguistic AmbiguityBeyond the Surface: A Computational Exploration of Linguistic Ambiguity
Beyond the Surface: A Computational Exploration of Linguistic Ambiguity
 
Data Science for Social Good: #LegalNLP #AlgorithmicBias...
Data Science for Social Good:                      #LegalNLP #AlgorithmicBias...Data Science for Social Good:                      #LegalNLP #AlgorithmicBias...
Data Science for Social Good: #LegalNLP #AlgorithmicBias...
 
How to Write a (Good) Research Paper
How to Write a (Good) Research Paper How to Write a (Good) Research Paper
How to Write a (Good) Research Paper
 
Data Science for Social Good: #LegalNLP #AlgorithmicBias
Data Science for Social Good: #LegalNLP #AlgorithmicBiasData Science for Social Good: #LegalNLP #AlgorithmicBias
Data Science for Social Good: #LegalNLP #AlgorithmicBias
 
Social Computing Research in India
Social Computing Research in IndiaSocial Computing Research in India
Social Computing Research in India
 
Social Computing Research in India
Social Computing Research in IndiaSocial Computing Research in India
Social Computing Research in India
 
Modeling Online User Interactions and their Offline effects on Socio-Technica...
Modeling Online User Interactions and their Offline effects on Socio-Technica...Modeling Online User Interactions and their Offline effects on Socio-Technica...
Modeling Online User Interactions and their Offline effects on Socio-Technica...
 
Privacy. Winter School on “Topics in Digital Trust”. IIT Bombay
Privacy. Winter School on “Topics in Digital Trust”. IIT BombayPrivacy. Winter School on “Topics in Digital Trust”. IIT Bombay
Privacy. Winter School on “Topics in Digital Trust”. IIT Bombay
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...
 
Leveraging Social Media for Financial Advice
Leveraging Social Media for Financial AdviceLeveraging Social Media for Financial Advice
Leveraging Social Media for Financial Advice
 
Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...
 
Exposing, Examining and Intervening Fake News
Exposing, Examining and Intervening Fake NewsExposing, Examining and Intervening Fake News
Exposing, Examining and Intervening Fake News
 

Recently uploaded

Artificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewArtificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewsandhya757531
 
TEST CASE GENERATION GENERATION BLOCK BOX APPROACH
TEST CASE GENERATION GENERATION BLOCK BOX APPROACHTEST CASE GENERATION GENERATION BLOCK BOX APPROACH
TEST CASE GENERATION GENERATION BLOCK BOX APPROACHSneha Padhiar
 
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdfDEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdfAkritiPradhan2
 
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptJohnWilliam111370
 
"Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ..."Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ...Erbil Polytechnic University
 
Input Output Management in Operating System
Input Output Management in Operating SystemInput Output Management in Operating System
Input Output Management in Operating SystemRashmi Bhat
 
List of Accredited Concrete Batching Plant.pdf
List of Accredited Concrete Batching Plant.pdfList of Accredited Concrete Batching Plant.pdf
List of Accredited Concrete Batching Plant.pdfisabel213075
 
Javier_Fernandez_CARS_workshop_presentation.pptx
Javier_Fernandez_CARS_workshop_presentation.pptxJavier_Fernandez_CARS_workshop_presentation.pptx
Javier_Fernandez_CARS_workshop_presentation.pptxJavier Fernández Muñoz
 
US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionMebane Rash
 
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfComprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfalene1
 
Theory of Machine Notes / Lecture Material .pdf
Theory of Machine Notes / Lecture Material .pdfTheory of Machine Notes / Lecture Material .pdf
Theory of Machine Notes / Lecture Material .pdfShreyas Pandit
 
priority interrupt computer organization
priority interrupt computer organizationpriority interrupt computer organization
priority interrupt computer organizationchnrketan
 
Prach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism CommunityPrach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism Communityprachaibot
 
Forming section troubleshooting checklist for improving wire life (1).ppt
Forming section troubleshooting checklist for improving wire life (1).pptForming section troubleshooting checklist for improving wire life (1).ppt
Forming section troubleshooting checklist for improving wire life (1).pptNoman khan
 
Novel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending ActuatorsNovel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending ActuatorsResearcher Researcher
 
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTES
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTESCME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTES
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTESkarthi keyan
 
Secure Key Crypto - Tech Paper JET Tech Labs
Secure Key Crypto - Tech Paper JET Tech LabsSecure Key Crypto - Tech Paper JET Tech Labs
Secure Key Crypto - Tech Paper JET Tech Labsamber724300
 
Cost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionCost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionSneha Padhiar
 
Curve setting (Basic Mine Surveying)_MI10412MI.pptx
Curve setting (Basic Mine Surveying)_MI10412MI.pptxCurve setting (Basic Mine Surveying)_MI10412MI.pptx
Curve setting (Basic Mine Surveying)_MI10412MI.pptxRomil Mishra
 
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithmComputer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithmDeepika Walanjkar
 

Recently uploaded (20)

Artificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewArtificial Intelligence in Power System overview
Artificial Intelligence in Power System overview
 
TEST CASE GENERATION GENERATION BLOCK BOX APPROACH
TEST CASE GENERATION GENERATION BLOCK BOX APPROACHTEST CASE GENERATION GENERATION BLOCK BOX APPROACH
TEST CASE GENERATION GENERATION BLOCK BOX APPROACH
 
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdfDEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
 
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
 
"Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ..."Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ...
 
Input Output Management in Operating System
Input Output Management in Operating SystemInput Output Management in Operating System
Input Output Management in Operating System
 
List of Accredited Concrete Batching Plant.pdf
List of Accredited Concrete Batching Plant.pdfList of Accredited Concrete Batching Plant.pdf
List of Accredited Concrete Batching Plant.pdf
 
Javier_Fernandez_CARS_workshop_presentation.pptx
Javier_Fernandez_CARS_workshop_presentation.pptxJavier_Fernandez_CARS_workshop_presentation.pptx
Javier_Fernandez_CARS_workshop_presentation.pptx
 
US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of Action
 
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfComprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
 
Theory of Machine Notes / Lecture Material .pdf
Theory of Machine Notes / Lecture Material .pdfTheory of Machine Notes / Lecture Material .pdf
Theory of Machine Notes / Lecture Material .pdf
 
priority interrupt computer organization
priority interrupt computer organizationpriority interrupt computer organization
priority interrupt computer organization
 
Prach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism CommunityPrach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism Community
 
Forming section troubleshooting checklist for improving wire life (1).ppt
Forming section troubleshooting checklist for improving wire life (1).pptForming section troubleshooting checklist for improving wire life (1).ppt
Forming section troubleshooting checklist for improving wire life (1).ppt
 
Novel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending ActuatorsNovel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending Actuators
 
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTES
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTESCME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTES
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTES
 
Secure Key Crypto - Tech Paper JET Tech Labs
Secure Key Crypto - Tech Paper JET Tech LabsSecure Key Crypto - Tech Paper JET Tech Labs
Secure Key Crypto - Tech Paper JET Tech Labs
 
Cost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionCost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based question
 
Curve setting (Basic Mine Surveying)_MI10412MI.pptx
Curve setting (Basic Mine Surveying)_MI10412MI.pptxCurve setting (Basic Mine Surveying)_MI10412MI.pptx
Curve setting (Basic Mine Surveying)_MI10412MI.pptx
 
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithmComputer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithm
 

A Framework For Automatic Question Answering in Indian Languages

  • 1. A Framework for Automatic Question Answering in Indian Languages Ritwik Mishra (PhD19013) 27th July 2022
  • 2. Comprehensive Panel Members Dr Rajiv Ratn Shah PhD advisor IIIT-Delhi Prof Ponnurangam Kumaraguru PhD advisor IIIT-Hyderabad Prof Radhika Mamidi External expert IIIT-Hyderabad Dr Arun Balaji Buduru Internal expert IIIT-Delhi Dr Raghava Mutharaju Internal expert IIIT-Delhi 2
  • 3. What is Automatic Question Answering (QnA)? Human/user asks a question, and a computer produces an answer. 3
  • 4. Outline ● Motivation ● Thesis Statement ● Contributions ○ FAQ retrieval ○ Open Information Extraction ○ Better Contextual Embeddings ● Future Plans ○ Pronoun Resolution ○ Long-context Question Answering 4
  • 6. Why languages? ● Human intelligence is very much attributed to our ability to express complex ideas through language. ● With the advent of text modality (i.e. writing) our ability to store and spread information increased dramatically. 6
  • 7. Why Indian Languages? ● Despite being spoken by billions of people, Indic languages are considered low-resource due to the lack of annotated resources. “Hindi is... an underdog. It has enough resources and everything looks promising for Hindi.” - Dr. Monojit Choudhury, MSR India, ML India Podcast, Aug 2020 7
  • 8. Cases where Indic-QnA works well 8
  • 9. Cases where it fails 9
  • 10. Thesis Statement ● Explore the possibility to develop a framework for Automated Question-Answering (QnA) in Indian languages with the help of multiple supporting tasks like Open Information Extraction (OIE) and Pronoun Resolution, and improved contextual embeddings. 10
  • 13. Types of QnA QnA Domain Context Answer Type Discourse Open-Domain Closed-Domain Short Long No Span-based (MRC) Sentence-based (AS2) Conversational Memory-less 13
  • 14. Methodologies in QnA ● Rule based approach ● Generative approach ● Retrieval based approach 14
  • 15. Methodologies in QnA ● Rule based approach ● Generative approach ● Retrieval based approach 15
  • 16. Methodologies in QnA 16 ● Rule based approach ● Generative approach ● Retrieval based approach
  • 18. FAQ retrieval based QnA User query (q) Questioni ( Qi ) Answeri ( Ai ) Top-k Question-Answer pairs (QA) Where Qi is most similar to the user query (q) k 18
  • 19. FAQ retrieval based QnA (internal view) User query (q) FAQ database Sentence Similarity Calculator (SSC) # Q A # Q A score FAQ database with each row having its similarity score with q 19
  • 20. Example User query (q) : बच्चे की ना भ फ ू ली हो है ठीक करने क े लए क्या क्या करे? Chatbot suggested: Q1) बच्चे की अचानक ना भ फ ू ल जाए तो क्या करें? A1) बच्चे की ना भ अचानक नहीं फ ू लती आमतौर पर यह देखा गया है क जब बच्चे का पेट थोड़ा फ ू लता है … Q2) अगर 5 म हने क बच्चे की ना भ फ ू ली हुई हो तो क्या करे? A2) 5 महीने क े बच्चे की ना भ आम तोर पर मसल्स की कमजोरी क े कारण होती है इस में कोई चंता …. Q3) नवजात बच्चे की फ ू ली हुए ना भ होती है वह जब ठीक हो जाए तो क्या उसक े बढ़ उसे गैस की प्रॉब्लम हो जाती है? A3) यह एक बह्रम है बच्चे की ना भ जब फ ु ल्ली होती है तो वह खतरे क े लक्षण नहीं है फ ु ल्ली हुई ना भ … 20
  • 21. How to evaluate? Question ( Q ) Answer ( A ) User deciding the relevancy of the top-k Question-Answer pairs (QA) suggested by chatbot for 4 different queries k 21 Question ( Q ) Answer ( A ) k Question ( Q ) Answer ( A ) k Question ( Q ) Answer ( A ) k Top-k suggestions for q1 Top-k suggestions for q2 Top-k suggestions for q3 Top-k suggestions for q4
  • 22. Evaluation Metrics 1. Success Rate (SR) 2. Precision at k (P@k) 3. Mean Average Precision (mAP) 4. Mean Reciprocal Rank (MRR) 5. normalized Discounted Cumulative Gain (nDCG) 22 Success Rate is the % of user queries for whom the chatbot suggested at least one relevant QA pair among its top-k suggestions.
  • 23. Evaluation techniques 1. Success Rate (SR) 2. Precision at k (P@k) 3. Mean Average Precision (mAP) 4. Mean Reciprocal Rank (MRR) 5. normalized Discounted Cumulative Gain (nDCG) Precision at k is the proportion of recommended items in the top-k set that are relevant. 23
  • 24. Evaluation techniques 1. Success Rate (SR) 2. Precision at k (P@k) 3. Mean Average Precision (mAP) 4. Mean Reciprocal Rank (MRR) 5. normalized Discounted Cumulative Gain (nDCG) For each user (or q) For each relevant item Compute precision till that item Average them Average them 24
  • 25. Evaluation techniques 1. Success Rate (SR) 2. Precision at k (P@k) 3. Mean Average Precision (mAP) 4. Mean Reciprocal Rank (MRR) 5. normalized Discounted Cumulative Gain (nDCG) 25
  • 26. Evaluation techniques 1. Success Rate (SR) 2. Precision at k (P@k) 3. Mean Average Precision (mAP) 4. Mean Reciprocal Rank (MRR) 5. normalized Discounted Cumulative Gain (nDCG) 26
  • 27. Evaluation techniques 1. Success Rate (SR) 2. Precision at k (P@k) 3. Mean Average Precision (mAP) 4. Mean Reciprocal Rank (MRR) 5. normalized Discounted Cumulative Gain (nDCG) 27
  • 28. Selected Literature ● Daniel, Jeanne E., et al. "Towards automating healthcare question answering in a noisy multilingual low-resource setting." Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019. ○ Non-public healthcare dataset in African languages ○ 230K QA pairs → 150K Qs with 126 As ○ Hence 126 clusters of Qs ○ Predict a cluster for q ● Bhagat, Pranav, Sachin Kumar Prajapati, and Aaditeshwar Seth. "Initial Lessons from Building an IVR-based Automated Question-Answering System." Proceedings of the 2020 International Conference on Information and Communication Technologies and Development. 2020. ○ Transcription based ○ 516 Qs with 90 As ○ User had to specify broad category of q ● Sakata, Wataru, et al. "FAQ retrieval using query-question similarity and BERT-based query-answer relevance." Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2019. ○ Used stackexchange dataset (719 QA and 1K q) and scrapped localGovFAQ dataset (1.7K QA and 900 q). ○ Q-q similarity works better than QA-q or A-q similarity. 28
  • 29. Selected Literature ● Daniel, Jeanne E., et al. "Towards automating healthcare question answering in a noisy multilingual low-resource setting." Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019. ○ Non-public healthcare dataset in African languages ○ 230K QA pairs → 150K Qs with 126 As ○ Hence 126 clusters of Qs ○ Predict a cluster for q ● Bhagat, Pranav, Sachin Kumar Prajapati, and Aaditeshwar Seth. "Initial Lessons from Building an IVR-based Automated Question-Answering System." Proceedings of the 2020 International Conference on Information and Communication Technologies and Development. 2020. ○ Transcription based ○ 516 Qs with 90 As ○ User had to specify broad category of q ● Sakata, Wataru, et al. "FAQ retrieval using query-question similarity and BERT-based query-answer relevance." Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2019. ○ Used stackexchange dataset (719 QA and 1K q) and scrapped localGovFAQ dataset (1.7K QA and 900 q). ○ Q-q similarity works better than QA-q or A-q similarity. Research Gap: There is a need to propose automated solutions to provide healthcare related information to an end-user in remote places of India. 29
  • 30. The proposed FAQ-chatbot Chatbot FAQ database if count < k User query (q) Finding top-k most similar QA pair for the given q Asking user if Q is similar to q? No No Yes A q cannot be answered Yes Response Response User User User 30
  • 31. Different SSC techniques Sentence Similarity Calculator (SSC) Dependency Tree Pruning (DTP) method Cosine Similarity (COS) method Sentence-pair Classifier (SPC) method Training NOT needed Training needed 31
  • 32. 32
  • 33. Ablation on SPC alone 33
  • 34. Evaluation data ● On field testing was not feasible due to COVID. ● We collected 336 queries from the healthcare workers. ● Manually annotated each of them, and found complete/partial matching Qs. 34 q 336 Healthcare-worker q Qs Hold-out test-set For each query (q) we manually found similar questions (Qs) from the FAQ database 270
  • 35. Results 1/2 DTP SPC COS ℇ DTP DTPq-e SPC SPC+A SPCq-e COS COSq-e mAP 30.5 35.1 39.4 21.1 39.1 26.5 27.9 45.3 MRR 42.6 48.5 54.6 42.2 54.2 38.7 41.0 61.6 SR 27.1 59.6 66.2 49.6 64.4 47.7 51.1 70.3 nDCG 55.5 51.2 57.1 43.9 56.5 40.8 43.3 62.5 P@3 27.1 30.0 34.6 34.6 34.6 22.7 23.9 34.6 Table 1. Comparison of all three primary approaches on hold-out test set for top-3 suggestions. Ensemble (ℇ) is obtained by taking the best performing models, highlighted with underlined text, from each of the primary approach. 35
  • 36. Results 2/2 ℇ-COS ℇ-DTP ℇ-SPC ℇ mAP 40.9 40.8 30.3 45.3 MRR 56.2 56.5 43.8 61.6 SR 66.2 66.2 51.1 70.3 nDCG 58.4 58.2 45.5 62.5 P@3 34.6 34.6 23.9 34.6 Table 2. Results of ablation study on the Ensemble method (ℇ). We observed that each approach plays an important role in the performance of the developed chatbot. 36
  • 37. Ensemble technique ablation 37 ℇsus ℇns ℇqfm ℇ mAP 45.2 45.7 44.8 45.3 MRR 61.4 62.0 60.6 61.6 SR 71.8 70.0 67.7 70.3 nDCG 62.4 62.7 60.9 62.5 P@3 34.6 34.6 34.6 34.6
  • 39. Performance depends on diversity of database as well Performance of ensemble models on test-sets with different level of pruning. The label ℇp2 #152 represents the ensemble results on a hold-out test-set that is pruned by removing all user queries that have 2 or less relevant questions in the FAQ database. The resulting pruned test-set (p2 set) has 152 user queries. 39
  • 40. Limitations ● The retrieval-based approaches are limited to extracting information from an indexed database which has to be curated manually. ● In order to retrieve information from a text dump of raw sentences, a knowledge-graph has to constructed from the unstructured text. ● OIE tools could be used to extract facts from raw sentences in different domains. 40
  • 42. Open Information Extraction (OIE) ● Extract facts from raw sentences. ○ Use triples to represent the facts. ○ Format of triples is <head, relation, tail>. ● Example ○ PM Modi to visit UAE in Jan marking 50 yrs of diplomatic ties. ■ one of the possible meaningful triple would be: <PM Modi, to visit, UAE> 42
  • 43. Why OIE is important? ● Its ability to extract triples from large amounts of texts in an unsupervised manner. ● It serves as an initial step in building or augmenting a knowledge graph out of an unstructured text. ● OIE tools have been used to solve downstream applications like Question-Answering, Text Summarization, and Entity Linking. 43
  • 44. OIE in Indian language needs a special attention ● Take an English sentence ○ Ram ate an apple ○ A sensible triple would be ■ <Ram, ate, an apple> ○ A nonsensical triple would be ■ <an apple, ate, Ram> ● Now look at a Hindi sentence (same meaning) ○ राम ने सेब खाया ○ Both these triples would be sensible ■ <राम ने, खाया, सेब> ■ <सेब, खाया, राम ने> 44
  • 45. How generated triples are evaluated? ● Manual Annotations Valid Well-formed Reasonable Concrete True Tabular representation Image representation 45
  • 46. How generated triples are evaluated? ● Manual Annotations ● Automatic Evaluations 46
  • 47. sent_id:1 वह ऑस्ट्रे लया क े पहले प्रधान मंत्री क े रूप में कायर्यरत थे और ऑस्ट्रे लया क े उच्च न्यायालय क े संस्थापक न्यायाधीश बने । (He served as the first Prime Minister of Australia and became a founding justice of the High Court of Australia .) ------ Cluster 1 ------ वह --> कायर्यरत थे --> [ऑस्ट्रे लया क े ]{a} [पहले] प्रधान मंत्री क े रूप में (He --> served --> as [the] [first] Prime Minister [of Australia]{a}) वह --> बने --> [ऑस्ट्रे लया क े उच्च न्यायालय क े ]{b} [संस्थापक] न्यायाधीश (He --> became --> [founding] justice [of the High Court of Australia]{b}) {a} ऑस्ट्रे लया क े --> property --> [पहले] प्रधान मंत्री क े रूप में |OR| वह --> [पहले] प्रधान मंत्री क े रूप में कायर्यरत थे --> ऑस्ट्रे लया क े ({a} of Australia --> property --> as [first] Prime Minister |OR| He --> served as [the] [first] Prime Minister --> of Australia) {b} [ऑस्ट्रे लया क े ]{c} उच्च न्यायालय क े --> property --> [संस्थापक] न्यायाधीश |OR| वह --> [संस्थापक] न्यायाधीश बने --> [ऑस्ट्रे लया क े ]{c} उच्च न्यायालय क े ({b} of High Court [of Australia]{c} --> property --> [founding] justice |OR| He --> became [founding] justice --> of High Court [of Australia]{c}) {c} ऑस्ट्रे लया क े --> property --> उच्च न्यायालय क े ({c} of Australia --> property --> of High Court) ------ Cluster 2 ------ वह --> [पहले] प्रधान मंत्री क े रूप में कायर्यरत थे --> ऑस्ट्रे लया क े (He --> served as [the] [first] Prime Minister --> of Australia) वह --> [संस्थापक] न्यायाधीश बने --> [ऑस्ट्रे लया क े ]{a} उच्च न्यायालय क े (He --> became [founding] justice --> of High Court [of Australia]{a}) {a} ऑस्ट्रे लया क े --> property --> उच्च न्यायालय क े ({a} of Australia --> property --> of High Court) Hindi-BenchIE 47
  • 48. Selected Literature ● Faruqui, Manaal, and Shankar Kumar. "Multilingual Open Relation Extraction Using Cross-lingual Projection." Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2015. ○ Supported Hindi but it was translation based ○ Didn’t release code but released triples ● Rao, Pattabhi RK, and Sobha Lalitha Devi. "EventXtract-IL: Event Extraction from Newswires and Social Media Text in Indian Languages@ FIRE 2018-An Overview." FIRE (Working Notes) (2018): 282-290. ○ Event specific and domain specific IE ● Ro, Youngbin, Yukyung Lee, and Pilsung Kang. "Multiˆ2OIE: Multilingual Open Information Extraction Based on Multi-Head Attention with BERT." Findings of the Association for Computational Linguistics: EMNLP 2020. 2020. ○ Modelled triple extraction as sequence labelling problem through BERT embeddings (end-to-end) ○ Identify relations, and then their head-tail 48
  • 49. Selected Literature ● Faruqui, Manaal, and Shankar Kumar. "Multilingual Open Relation Extraction Using Cross-lingual Projection." Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2015. ○ Supported Hindi but it was translation based ○ Didn’t release code but released triples ● Rao, Pattabhi RK, and Sobha Lalitha Devi. "EventXtract-IL: Event Extraction from Newswires and Social Media Text in Indian Languages@ FIRE 2018-An Overview." FIRE (Working Notes) (2018): 282-290. ○ Event specific and domain specific IE ● Ro, Youngbin, Yukyung Lee, and Pilsung Kang. "Multiˆ2OIE: Multilingual Open Information Extraction Based on Multi-Head Attention with BERT." Findings of the Association for Computational Linguistics: EMNLP 2020. 2020. ○ Modelled triple extraction as sequence labelling problem through BERT embeddings (end-to-end) ○ Identify relations, and then their head-tail Research Gap: (1) The field of Open Information Extraction (OIE) from unstructured text in Indic languages has not been explored much. Moreover, the effectiveness of existing multilingual OIE techniques has to be evaluated on Indic languages. (2) There is a scarcity of annotated resources for automatic evaluation of automatically generated triples for Indic languages. (3) Construction of knowledge-graph using extracted triples from unstructured text in Indian languages needs to be explored as well. 49
  • 50. IndIE: our Indic OIE tool Raw Text Output Sentence segmented text, Triples for each sentence, Execution Time Off-the-shelf sentence segmentation and dependency parser by Stanford (a) XLM-roberta fine-tuned on chunk annotated data (b) Creating Merged-phrases Dependency Tree (MDT) 3: Chunk tags 3: Dependency tree Greedy algorithm based on hand-crafted rules (c) Triple extraction 2: Sentences 1: Shallow parsing 4: Passing MDT 5: Result accumulation 50
  • 53. Chunker Implementation 2/2 Transformer Tokenizer Pretrained Transformer Feed-forward Layer Tokens Token IDs Token-level predictions Ground-Truth Calculating Loss (a) Initial Embeddings (b) Taking average (c) Final Embeddings 53
  • 54. Results: Chunker Classification Layers First sub-word token embedding Last sub-word token embedding Average embedding of all sub-word tokens 1 82±10 (50±20) 89±0.5 (62±1.0) 91±0.0 (65±0.5) 2 86±1.8 (51±6.2) 89±0.5 (54±7.4) 90±0.5 (54±4.5) 3 79±14 (43±13) 82±11 (41±12) 90±0.5 (48±2.2) Table 3. A comparison of three approaches for solving the sub-word token embeddings for chunking task. Four different random seeds were used to calculate the mean and standard deviation for the given samples. All the experiments were run on the combined data of TDIL and UD. The numbers written outside round brackets represent the accuracy, whereas numbers inside round brackets represent the macro average. Model Hindi English Urdu Nepali Gujarati Bengali XLM 78% 60% 84% 65% 56% 66% CRF 67% 56% 71% 58% 53% 53% Table 4. A comparison of (fine-tuned) XLM chunker and CRF chunker on the languages which are removed from training-set. The numbers represent the accuracy obtained by each model when sentences from the given language are used only in the test-set. 54
  • 55. Results: Triple Evaluation (manual) Image options ArgOE M&K Multi2OIE PredPatt IndIE No information 17% 28% 71% 5% 4% Most Information 22% 44% 24% 29% 17% All information 61% 28% 5% 66% 79% Table 5. Percentage of sentences having no/most/all information in the image representation of their generated triples. The method which generates maximum triples with ‘All information’ is considered the best method. #Triple ArgOE M&K Multi2OIE PredPatt IndIE Total 45 180 50 40 272 Valid 38 142 10 39 252 Well-formed 32 75 9 36 240 Reasonable 26 69 6 32 227 Concrete 21 53 4 25 158 True 19 51 4 25 152 Table 6. Number of triples extracted by each OIE method for 106 Hindi sentences. 55
  • 56. Results: Triple Evaluation (automatic) ArgOE M&K Multi2OIE PredPatt IndIE Precision 0.26 0.14 0.12 0.37 0.62 Recall 0.06 0.16 0.03 0.08 0.62 F1-score 0.10 0.15 0.05 0.14 0.62 Table 7. Performance of different OIE methods on Hindi-BenchIE golden set consisting of 75 Hindi sentences. It is observed that IndIE outperforms other methods on all three metrics. 56
  • 57. Limitations ● Small size of evaluation datasets. ● The contextual embeddings (from transformers) are known to not take the syntactic structure of sentence into account. ● As a result, it has been observed that syntactic information is either absent in the transformer embeddings or it is not utilized while making the predictions. ● Therefore, there is a need to explore the possibility of incorporating syntactic (dependency) information in transformer embeddings. 57
  • 59. Dependency-aware Transformer (DaT) Embedding Transformer Tokenizer Pretrained Transformer Tokens Token IDs (a) Initial Embeddings (b) Taking average (c) Final Embeddings 59
  • 60. DaT Embedding: Graph 1 2 3 4 5 1 0 1 1 0 0 2 0 0 0 1 1 3 1 0 0 0 0 4 0 1 0 0 0 5 0 1 0 0 0 Adjacency Matrix A 1 2 3 4 5 1 2 0 0 0 0 2 0 3 0 0 0 3 0 0 1 0 0 4 0 0 0 1 0 5 0 0 0 0 1 Degree Matrix D 1 2 3 4 5 60
  • 61. DaT Embedding: Graph Convolution Network (GCN) 1 2 3 4 5 1 0 1 1 0 0 2 0 0 0 1 1 3 1 0 0 0 0 4 0 1 0 0 0 5 0 1 0 0 0 Adjacency Matrix A 1 2 3 4 5 1 2 0 0 0 0 2 0 3 0 0 0 3 0 0 1 0 0 4 0 0 0 1 0 5 0 0 0 0 1 Degree Matrix D 1 2 3 4 5 61
  • 62. DaT Embedding: GCN message passing 62
  • 63. 63
  • 64. Selected Literature ● Jie, Zhanming, Aldrian Obaja Muis, and Wei Lu. "Efficient dependency-guided named entity recognition." Thirty-First AAAI Conference on Artificial Intelligence. 2017. ● Marcheggiani, Diego, and Ivan Titov. "Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling." Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017. ● Zhang, Yuhao, Peng Qi, and Christopher D. Manning. "Graph Convolution over Pruned Dependency Trees Improves Relation Extraction." Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018. 64
  • 65. Selected Literature ● Jie, Zhanming, Aldrian Obaja Muis, and Wei Lu. "Efficient dependency-guided named entity recognition." Thirty-First AAAI Conference on Artificial Intelligence. 2017. ● Marcheggiani, Diego, and Ivan Titov. "Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling." Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017. ● Zhang, Yuhao, Peng Qi, and Christopher D. Manning. "Graph Convolution over Pruned Dependency Trees Improves Relation Extraction." Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018. Research Gap: Incorporating dependency structure of a sentence in generating dependency-aware contextual embeddings is an under-explored field in Indic-NLP. 65
  • 66. GCN variants 0) No GCN 1) 2.1) 2.2) 3.1) 66
  • 68. Evaluation Dataset ● Small Dataset ○ Fearspeech | #4782 | Labels: 'Fear speech':1 , 'Normal': 0 ○ Codalab | #5728 | 'hate’:1, offensive':1,'defamation:1, 'fake':1, 'non-hostile':0 ○ HASOC | #4665 | ‘non-hostile’: 0, ‘fake’:0, … , ‘offensive’:1 , ‘hate’: 1 ● Big Dataset ○ IIITD abusive language data | 7 lacs (multilingual) | 30k Hindi | not abusive: 0, abusive:1 68
  • 69. DaT Embedding Results 69 Epoch 1 Epoch 2 Epoch 3 Epoch 4 v0 71.2 +/- 2.9 (67.7-74.6) 75.9 +/- 1.2 (74.1-77.5) 77.4 +/- 0.6 (76.8-78.1) 79.0 +/- 0.7 (78.2-79.8) v1 72.4 +/- 1.9 (69.4-74.0) 76.6 +/- 1.2 (74.9-78.0) 78.8 +/- 1.0 (77.1-79.6) 80.3 +/- 1.1 (79.3-81.8) v2.2 74.2 +/- 1.1 (72.5-75.3) 77.2 +/- 0.9 (75.9-78.3) 79.3 +/- 0.8 (78.4-80.2) 80.5 +/- 0.7 (79.5-81.2) Epoch 1 Epoch 2 Epoch 3 Epoch 4 v0 71.2 +/- 2.5 (67.3-73.9) 76.2 +/- 1.2 (74.6-77.4) 77.9 +/- 0.6 (77.2-78.8) 79.8 +/- 0.6 (78.8-80.3) v1 72.6 +/- 1.6 (70.2-74.0) 76.5 +/- 1.0 (75.5-77.8) 78.5 +/- 1.1 (76.9-79.7) 79.9 +/- 1.1 (79.0-81.8) v2.2 74.2 +/- 1.3 (72.1-75.2) 76.6 +/- 1.0 (75.5-77.8) 79.0 +/- 1.0 (77.9-80.3) 80.0 +/- 1.2 (78.9-82.1) Table 8. Validation Accuracy on the Small dataset Table 9. Testing Accuracy on the Small dataset
  • 70. 70
  • 71. Limitation लौटे राम अयोध्या बने वह राजा । । nsubj punct obl nsubj xcomp punct root root राम अयोध्या लौटे । वह राजा बने । [PAD] [PAD] राम 1 1 अयोध्या 1 1 लौटे 1 1 1 1 । 1 1 वह 1 1 राजा 1 1 बने 1 1 1 1 । 1 1 [PAD] 1 [PAD] 1 71
  • 72. Future Plans ● End-to-End Pronoun Resolution in Hindi ● Indic-SpanBERT ● Long-context Question Answering 72
  • 74. Pronoun Resolution Every speaker had to present his paper . Barack Obama visited India. Modiji came to receive Obama. He was brave. He was Ashoka the great. If you want to hire them, then all candidates must be treated nicely. Ram went to forest with his wife Krishna was an avatar of Vishnu. CEO of Reliance, Mukesh Ambani, inaugurated the SOM building. 74
  • 75. Dataset Mujadia, Vandan, Palash Gupta, and Dipti Misra Sharma. "Coreference Annotation Scheme and Relation Types for Hindi." Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16). 2016. 75 Table 10. Corpus Details Table 11. Distribution of co-referential entities
  • 76. Coref chain 76 Unique mention ID 0: Intermediate token 1: End token Unique chain ID They modify the mention Source
  • 77. Selected Literature ● Lee, Kenton, et al. "End-to-end Neural Coreference Resolution." Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017. ● Joshi, Mandar, et al. "Spanbert: Improving pre-training by representing and predicting spans." Transactions of the Association for Computational Linguistics 8 (2020): 64-77. ● Dakwale, Praveen, Vandan Mujadia, and Dipti Misra Sharma. "A hybrid approach for anaphora resolution in hindi." Proceedings of the Sixth International Joint Conference on Natural Language Processing. 2013. ● Devi, Sobha Lalitha, Vijay Sundar Ram, and Pattabhi RK Rao. "A generic anaphora resolution engine for Indian languages." Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers. 2014. ● Sikdar, Utpal Kumar, Asif Ekbal, and Sriparna Saha. "A generalized framework for anaphora resolution in Indian languages." Knowledge-Based Systems 109 (2016): 147-159. 77
  • 78. Selected Literature ● Lee, Kenton, et al. "End-to-end Neural Coreference Resolution." Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017. ● Joshi, Mandar, et al. "Spanbert: Improving pre-training by representing and predicting spans." Transactions of the Association for Computational Linguistics 8 (2020): 64-77. ● Dakwale, Praveen, Vandan Mujadia, and Dipti Misra Sharma. "A hybrid approach for anaphora resolution in hindi." Proceedings of the Sixth International Joint Conference on Natural Language Processing. 2013. ● Devi, Sobha Lalitha, Vijay Sundar Ram, and Pattabhi RK Rao. "A generic anaphora resolution engine for Indian languages." Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers. 2014. ● Sikdar, Utpal Kumar, Asif Ekbal, and Sriparna Saha. "A generalized framework for anaphora resolution in Indian languages." Knowledge-Based Systems 109 (2016): 147-159. Research Gap: (1) A publicly available tool for coreference resolution in Indic languages is required to be built using the state-of-art coreference resolution techniques. (2) Pretraining of transformer models using different subword tokenization methods and pretraining tasks are to be investigated extensively. 78
  • 79. exp( ) exp( ) + exp( ) +exp( ) BERT Loss L1 Loss L2 Loss L3 = - Log ( ) Proposed Architecture: TRAINING PHASE Mention detector Pronoun detector Pronoun resolver P L A I N T E X T embeddings 79
  • 80. BERT Softmax Proposed Architecture: INFERENCE PHASE P L A I N T E X T embeddings 80
  • 81. Future Plans ● End-to-End Pronoun Resolution in Hindi ● Indic-SpanBERT ● Long-context Question Answering Ritwik: In terms of architectural choices, what are some of the promising directions for low-resource languages? Sebastian Ruder: (summarizing) There are three 1. Language specific tokenization 2. Model efficiency 3. New pretraining objectives 81 QnA Rb OIE PR Ch Improve CE
  • 82. Future Plans ● End-to-End Pronoun Resolution in Hindi ● Indic-SpanBERT ● Long-context Question Answering ○ One multilingual dataset called XQA by Liu et al. 2019. ○ We need to explore the applicability of our aforementioned methodologies. 82 QnA Rb OIE PR Ch Improve CE
  • 83. Timeline 1. Aug 2022 - Oct 2022 ▣ Incorporate the internal reviews for maternal chatbot, and submit the manuscript to a journal like ACM Health (average citation per article is 2) or IEEE Transactions on CSS (Impact score is 5). ▣ Finish the experimentation pipeline for DaT embeddings. Prepare the first draft. ▣ Data preprocessing for pronoun resolution. Today Dec 2023 Defence 2 3 4 1 83
  • 84. Timeline 2. Nov 2022 - Jan 2023 ▣ Incorporate internal reviews for DaT embeddings paper, and submit the draft to ACL Rolling Reviews (ARR). Target AACL. ▣ Build the multi-task learning pipeline for pronoun resolution. ▣ Prepare data and dataloader for Indic-SpanBERT pretraining. Explore the usability of bert and distilbert. Today Dec 2023 Defence 2 3 4 1 84
  • 85. Timeline 3. Feb 2023 - Mar 2023 ▣ Implement the baselines for pronoun resolution and design evaluation metrics. ▣ Explore the usability of OIE in long-context question answering. ▣ Use a combination of Answer Sentence Selection (AS2) and Machine Comprehension (MC) to solve long-context question answering. ▣ Evaluate Indic-SpanBERT on various downstream tasks in Indic languages. Document the work as a short paper. Today Dec 2023 Defence 3 2 4 1 85
  • 86. Timeline 4. May 2023 - Aug 2023 ▣ Prepare the draft for pronoun resolution, and long-context question answering. Collect internal reviews. Modify and submit in EMNLP. ▣ Begin a survey on work done in Indic NLP. Today Dec 2023 Defence 2 4 3 1 86
  • 87. Publications 1. Mishra, Ritwik, Simranjeet Singh, Rajiv Ratn Shah, Ponnurangam Kumaraguru, and Pushpak Bhattacharya. "IndIE: A Multilingual Open Information Extraction Tool For Indic Languages". 2022. [Submitted to a special issue of TALLIP] 2. Mishra, Ritwik, Simranjeet Singh, Jasmeet Kaur, Rajiv Ratn Shah, and Pushpendra Singh. "Exploring the Use of Chatbots for Supporting Maternal and Child Health in Resource-constrained Environments". 2022. [Draft ready. Under internal review] Miscellaneous ● Mishra, Ritwik, Ponnurangam Kumaraguru, Rajiv Ratn Shah, Aanshul Sadaria, Shashank Srikanth, Kanay Gupta, Himanshu Bhatia, and Pratik Jain. "Analyzing traffic violations through e-challan system in metropolitan cities (workshop paper)." In 2020 IEEE Sixth International Conference on Multimedia Big Data (BigMM), pp. 485-493. IEEE, 2020. 87
  • 88. Acknowledgements ● I would like to express my gratitude to pillars group members for their valuable guidance. ● I would like to thank Simranjeet, Samarth, Ajeet, and Jasmeet for being diligent co-authors. ● I would also like to thank Prof Pushpak Bhattacharyya, and CFILT lab members for providing me great insights and hosting me at IIT Bombay under Anveshan Setu program. ● I would also like to University Grant Commission (UGC) Junior Research Fellowship (JRF) / Senior Research Fellowship (SRF) for funding my PhD program 88