SlideShare a Scribd company logo
© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark
Automatic Question Answering
through BERT fine-tuning
© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark
Question Answering task
Q: _______________ ?
A: _______
Or
A: impossible
© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark
SQuAD:
The Stanford Question Answering Dataset
SQuAD v1.1: Single good answer
SQuAD v2: One ore more good answers and questions that are impossible to answer
© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark
SQuAD:
The Stanford Question Answering Dataset
Partial answer
query
answer
© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark
SQuAD:
The Stanford Question Answering Dataset
https://rajpurkar.github.io/SQuAD-explorer/
Bi-Directional Attention Flow
for machine comprehension (BiDAF), Seo et al. 2018
(refresher) BERT (Devlin et al. 18): Pre-training
INPUT
WordPieces
Embeddings
Sentence
Embeddings
Position
Embeddings
Learned during the
(pre)training process
MASK
EMASK
In pre-training 15% of the input tokens are
masked for the masked LM task
There is also a sentence similarity task
http://jalammar.github.io/illustrated-bert/
(refresher) BERT (Devlin et al. 18): Pre-training
http://jalammar.github.io/illustrated-bert/
(refresher) BERT (Devlin et al. 18): Pre-training
Training for Question Answering task: (Modified from: original paper)
Fine-tuning
Add small weight matrix W:
𝒉 𝐿
𝑖
𝑾 𝑆𝑃𝐴𝑁_𝑆𝑇𝐴𝑅𝑇_𝐸𝑁𝐷
BERT (Devlin et al. 18): Fine-Tuning for QA
impossible = start[CLS]+end[CLS]
Modified network in Gluon (Bert + Dense Layer)
max(𝒔𝒕𝒂𝒓𝒕) + max(𝒆𝒏𝒅) − (𝑠𝑡𝑎𝑟𝑡 𝐶𝐿𝑆 + 𝑒𝑛𝑑 𝐶𝐿𝑆 ) < 𝑖𝑚𝑝𝑜𝑠𝑠𝑖𝑏𝑙𝑒_𝑡ℎ𝑟𝑒𝑠ℎ𝑜𝑙𝑑
Question is impossible to answer if:
© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark
Lab
© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark
Go build!
• http://gluon-nlp.mxnet.io/
Get help:
• https://discuss.mxnet.io/

More Related Content

What's hot

Bert
BertBert
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Minh Pham
 
A Comprehensive Review of Large Language Models for.pptx
A Comprehensive Review of Large Language Models for.pptxA Comprehensive Review of Large Language Models for.pptx
A Comprehensive Review of Large Language Models for.pptx
SaiPragnaKancheti
 
GPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask LearnersGPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask Learners
Young Seok Kim
 
Using Text Embeddings for Information Retrieval
Using Text Embeddings for Information RetrievalUsing Text Embeddings for Information Retrieval
Using Text Embeddings for Information Retrieval
Bhaskar Mitra
 
Word2Vec
Word2VecWord2Vec
Word2Vec
hyunyoung Lee
 
Neural Language Generation Head to Toe
Neural Language Generation Head to Toe Neural Language Generation Head to Toe
Neural Language Generation Head to Toe
Hady Elsahar
 
A Panorama of Natural Language Processing
A Panorama of Natural Language ProcessingA Panorama of Natural Language Processing
A Panorama of Natural Language Processing
Ted Xiao
 
Word embedding
Word embedding Word embedding
Word embedding
ShivaniChoudhary74
 
Natural language processing and transformer models
Natural language processing and transformer modelsNatural language processing and transformer models
Natural language processing and transformer models
Ding Li
 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Young Seok Kim
 
Building, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionBuilding, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for Production
Sri Ambati
 
Transformers AI PPT.pptx
Transformers AI PPT.pptxTransformers AI PPT.pptx
Transformers AI PPT.pptx
RahulKumar854607
 
presentation.pdf
presentation.pdfpresentation.pdf
presentation.pdf
caa28steve
 
Question Answering - Application and Challenges
Question Answering - Application and ChallengesQuestion Answering - Application and Challenges
Question Answering - Application and Challenges
Jens Lehmann
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
Yuriy Guts
 
NLP using transformers
NLP using transformers NLP using transformers
NLP using transformers
Arvind Devaraj
 
An introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERTAn introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERT
Suman Debnath
 
Regulating Generative AI - LLMOps pipelines with Transparency
Regulating Generative AI - LLMOps pipelines with TransparencyRegulating Generative AI - LLMOps pipelines with Transparency
Regulating Generative AI - LLMOps pipelines with Transparency
Debmalya Biswas
 
Natural language processing (nlp)
Natural language processing (nlp)Natural language processing (nlp)
Natural language processing (nlp)
Kuppusamy P
 

What's hot (20)

Bert
BertBert
Bert
 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
 
A Comprehensive Review of Large Language Models for.pptx
A Comprehensive Review of Large Language Models for.pptxA Comprehensive Review of Large Language Models for.pptx
A Comprehensive Review of Large Language Models for.pptx
 
GPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask LearnersGPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask Learners
 
Using Text Embeddings for Information Retrieval
Using Text Embeddings for Information RetrievalUsing Text Embeddings for Information Retrieval
Using Text Embeddings for Information Retrieval
 
Word2Vec
Word2VecWord2Vec
Word2Vec
 
Neural Language Generation Head to Toe
Neural Language Generation Head to Toe Neural Language Generation Head to Toe
Neural Language Generation Head to Toe
 
A Panorama of Natural Language Processing
A Panorama of Natural Language ProcessingA Panorama of Natural Language Processing
A Panorama of Natural Language Processing
 
Word embedding
Word embedding Word embedding
Word embedding
 
Natural language processing and transformer models
Natural language processing and transformer modelsNatural language processing and transformer models
Natural language processing and transformer models
 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
 
Building, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionBuilding, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for Production
 
Transformers AI PPT.pptx
Transformers AI PPT.pptxTransformers AI PPT.pptx
Transformers AI PPT.pptx
 
presentation.pdf
presentation.pdfpresentation.pdf
presentation.pdf
 
Question Answering - Application and Challenges
Question Answering - Application and ChallengesQuestion Answering - Application and Challenges
Question Answering - Application and Challenges
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
 
NLP using transformers
NLP using transformers NLP using transformers
NLP using transformers
 
An introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERTAn introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERT
 
Regulating Generative AI - LLMOps pipelines with Transparency
Regulating Generative AI - LLMOps pipelines with TransparencyRegulating Generative AI - LLMOps pipelines with Transparency
Regulating Generative AI - LLMOps pipelines with Transparency
 
Natural language processing (nlp)
Natural language processing (nlp)Natural language processing (nlp)
Natural language processing (nlp)
 

Similar to Fine-tuning BERT for Question Answering

Racing with Artificial Intelligence
Racing with Artificial IntelligenceRacing with Artificial Intelligence
Racing with Artificial Intelligence
Daniel Zivkovic
 
Tools for building your Startup on AWS
Tools for building your Startup on AWSTools for building your Startup on AWS
Tools for building your Startup on AWS
Rob De Feo
 
Building Event-driven Architectures with Amazon EventBridge
Building Event-driven Architectures with Amazon EventBridge Building Event-driven Architectures with Amazon EventBridge
Building Event-driven Architectures with Amazon EventBridge
James Beswick
 
Reinforcement Learning with Sagemaker, DeepRacer and Robomaker
Reinforcement Learning with Sagemaker, DeepRacer and RobomakerReinforcement Learning with Sagemaker, DeepRacer and Robomaker
Reinforcement Learning with Sagemaker, DeepRacer and Robomaker
Alex Barbosa Coqueiro
 
re:Invent OPN306 AWS Lambda Powertools Lessons 10M downloads.pdf
re:Invent OPN306 AWS Lambda Powertools Lessons 10M downloads.pdfre:Invent OPN306 AWS Lambda Powertools Lessons 10M downloads.pdf
re:Invent OPN306 AWS Lambda Powertools Lessons 10M downloads.pdf
Heitor Lessa
 
AWS Startup Garage - Building your MVP on AWS
AWS Startup Garage - Building your MVP on AWSAWS Startup Garage - Building your MVP on AWS
AWS Startup Garage - Building your MVP on AWS
Cobus Bernard
 
Automatic-Labelling-and-Model-Tuning-with-Amazon-SageMaker
Automatic-Labelling-and-Model-Tuning-with-Amazon-SageMakerAutomatic-Labelling-and-Model-Tuning-with-Amazon-SageMaker
Automatic-Labelling-and-Model-Tuning-with-Amazon-SageMaker
Amazon Web Services
 
Automatic Labelling and Model Tuning with Amazon SageMaker - AWS Summit Sydney
Automatic Labelling and Model Tuning with Amazon SageMaker - AWS Summit SydneyAutomatic Labelling and Model Tuning with Amazon SageMaker - AWS Summit Sydney
Automatic Labelling and Model Tuning with Amazon SageMaker - AWS Summit Sydney
Amazon Web Services
 
How Pokémon’s SecOps team enables its business - SDD328 - AWS re:Inforce 2019
How Pokémon’s SecOps team enables its business - SDD328 - AWS re:Inforce 2019 How Pokémon’s SecOps team enables its business - SDD328 - AWS re:Inforce 2019
How Pokémon’s SecOps team enables its business - SDD328 - AWS re:Inforce 2019
Amazon Web Services
 
Drive digital transformation with AI
Drive digital transformation with AIDrive digital transformation with AI
Drive digital transformation with AI
Amazon Web Services
 
Revving up with Reinforcement Learning by Ricardo Sueiras
Revving up with Reinforcement Learning by Ricardo SueirasRevving up with Reinforcement Learning by Ricardo Sueiras
Revving up with Reinforcement Learning by Ricardo Sueiras
Alex Cachia
 
Startup Day Buenos Aires
Startup Day Buenos AiresStartup Day Buenos Aires
Startup Day Buenos Aires
Amazon Web Services LATAM
 
AWS Startup Day Santiago - Pitch essentials
AWS Startup Day Santiago - Pitch essentialsAWS Startup Day Santiago - Pitch essentials
AWS Startup Day Santiago - Pitch essentials
Amazon Web Services LATAM
 
딥러닝@EDM페스티발 누가누가 잘 노나? :: 김태웅 - AWS Community Day 2019
딥러닝@EDM페스티발 누가누가 잘 노나? :: 김태웅 - AWS Community Day 2019 딥러닝@EDM페스티발 누가누가 잘 노나? :: 김태웅 - AWS Community Day 2019
딥러닝@EDM페스티발 누가누가 잘 노나? :: 김태웅 - AWS Community Day 2019
AWSKRUG - AWS한국사용자모임
 
Introduction to the Well-Architected Framework and Tool - SVC212 - Santa Clar...
Introduction to the Well-Architected Framework and Tool - SVC212 - Santa Clar...Introduction to the Well-Architected Framework and Tool - SVC212 - Santa Clar...
Introduction to the Well-Architected Framework and Tool - SVC212 - Santa Clar...
Amazon Web Services
 
Transform with Cloud to drive your Future | AWS Summit Tel Aviv 2019
Transform with Cloud to drive your Future | AWS Summit Tel Aviv 2019Transform with Cloud to drive your Future | AWS Summit Tel Aviv 2019
Transform with Cloud to drive your Future | AWS Summit Tel Aviv 2019
Amazon Web Services
 
De un monolito a microservicios
De un monolito a microserviciosDe un monolito a microservicios
De un monolito a microservicios
Amazon Web Services
 
Rapid Prototyping with AWS - AWS Summit Sydney
Rapid Prototyping with AWS - AWS Summit SydneyRapid Prototyping with AWS - AWS Summit Sydney
Rapid Prototyping with AWS - AWS Summit Sydney
Amazon Web Services
 
AWS Startup Day Bogotá - The Pitch
AWS Startup Day Bogotá - The PitchAWS Startup Day Bogotá - The Pitch
AWS Startup Day Bogotá - The Pitch
Amazon Web Services LATAM
 
AWS Startup Day Guadalajara - Pitch essentials
AWS Startup Day Guadalajara - Pitch essentialsAWS Startup Day Guadalajara - Pitch essentials
AWS Startup Day Guadalajara - Pitch essentials
Amazon Web Services LATAM
 

Similar to Fine-tuning BERT for Question Answering (20)

Racing with Artificial Intelligence
Racing with Artificial IntelligenceRacing with Artificial Intelligence
Racing with Artificial Intelligence
 
Tools for building your Startup on AWS
Tools for building your Startup on AWSTools for building your Startup on AWS
Tools for building your Startup on AWS
 
Building Event-driven Architectures with Amazon EventBridge
Building Event-driven Architectures with Amazon EventBridge Building Event-driven Architectures with Amazon EventBridge
Building Event-driven Architectures with Amazon EventBridge
 
Reinforcement Learning with Sagemaker, DeepRacer and Robomaker
Reinforcement Learning with Sagemaker, DeepRacer and RobomakerReinforcement Learning with Sagemaker, DeepRacer and Robomaker
Reinforcement Learning with Sagemaker, DeepRacer and Robomaker
 
re:Invent OPN306 AWS Lambda Powertools Lessons 10M downloads.pdf
re:Invent OPN306 AWS Lambda Powertools Lessons 10M downloads.pdfre:Invent OPN306 AWS Lambda Powertools Lessons 10M downloads.pdf
re:Invent OPN306 AWS Lambda Powertools Lessons 10M downloads.pdf
 
AWS Startup Garage - Building your MVP on AWS
AWS Startup Garage - Building your MVP on AWSAWS Startup Garage - Building your MVP on AWS
AWS Startup Garage - Building your MVP on AWS
 
Automatic-Labelling-and-Model-Tuning-with-Amazon-SageMaker
Automatic-Labelling-and-Model-Tuning-with-Amazon-SageMakerAutomatic-Labelling-and-Model-Tuning-with-Amazon-SageMaker
Automatic-Labelling-and-Model-Tuning-with-Amazon-SageMaker
 
Automatic Labelling and Model Tuning with Amazon SageMaker - AWS Summit Sydney
Automatic Labelling and Model Tuning with Amazon SageMaker - AWS Summit SydneyAutomatic Labelling and Model Tuning with Amazon SageMaker - AWS Summit Sydney
Automatic Labelling and Model Tuning with Amazon SageMaker - AWS Summit Sydney
 
How Pokémon’s SecOps team enables its business - SDD328 - AWS re:Inforce 2019
How Pokémon’s SecOps team enables its business - SDD328 - AWS re:Inforce 2019 How Pokémon’s SecOps team enables its business - SDD328 - AWS re:Inforce 2019
How Pokémon’s SecOps team enables its business - SDD328 - AWS re:Inforce 2019
 
Drive digital transformation with AI
Drive digital transformation with AIDrive digital transformation with AI
Drive digital transformation with AI
 
Revving up with Reinforcement Learning by Ricardo Sueiras
Revving up with Reinforcement Learning by Ricardo SueirasRevving up with Reinforcement Learning by Ricardo Sueiras
Revving up with Reinforcement Learning by Ricardo Sueiras
 
Startup Day Buenos Aires
Startup Day Buenos AiresStartup Day Buenos Aires
Startup Day Buenos Aires
 
AWS Startup Day Santiago - Pitch essentials
AWS Startup Day Santiago - Pitch essentialsAWS Startup Day Santiago - Pitch essentials
AWS Startup Day Santiago - Pitch essentials
 
딥러닝@EDM페스티발 누가누가 잘 노나? :: 김태웅 - AWS Community Day 2019
딥러닝@EDM페스티발 누가누가 잘 노나? :: 김태웅 - AWS Community Day 2019 딥러닝@EDM페스티발 누가누가 잘 노나? :: 김태웅 - AWS Community Day 2019
딥러닝@EDM페스티발 누가누가 잘 노나? :: 김태웅 - AWS Community Day 2019
 
Introduction to the Well-Architected Framework and Tool - SVC212 - Santa Clar...
Introduction to the Well-Architected Framework and Tool - SVC212 - Santa Clar...Introduction to the Well-Architected Framework and Tool - SVC212 - Santa Clar...
Introduction to the Well-Architected Framework and Tool - SVC212 - Santa Clar...
 
Transform with Cloud to drive your Future | AWS Summit Tel Aviv 2019
Transform with Cloud to drive your Future | AWS Summit Tel Aviv 2019Transform with Cloud to drive your Future | AWS Summit Tel Aviv 2019
Transform with Cloud to drive your Future | AWS Summit Tel Aviv 2019
 
De un monolito a microservicios
De un monolito a microserviciosDe un monolito a microservicios
De un monolito a microservicios
 
Rapid Prototyping with AWS - AWS Summit Sydney
Rapid Prototyping with AWS - AWS Summit SydneyRapid Prototyping with AWS - AWS Summit Sydney
Rapid Prototyping with AWS - AWS Summit Sydney
 
AWS Startup Day Bogotá - The Pitch
AWS Startup Day Bogotá - The PitchAWS Startup Day Bogotá - The Pitch
AWS Startup Day Bogotá - The Pitch
 
AWS Startup Day Guadalajara - Pitch essentials
AWS Startup Day Guadalajara - Pitch essentialsAWS Startup Day Guadalajara - Pitch essentials
AWS Startup Day Guadalajara - Pitch essentials
 

More from Apache MXNet

Recent Advances in Natural Language Processing
Recent Advances in Natural Language ProcessingRecent Advances in Natural Language Processing
Recent Advances in Natural Language Processing
Apache MXNet
 
Introduction to GluonNLP
Introduction to GluonNLPIntroduction to GluonNLP
Introduction to GluonNLP
Apache MXNet
 
Introduction to object tracking with Deep Learning
Introduction to object tracking with Deep LearningIntroduction to object tracking with Deep Learning
Introduction to object tracking with Deep Learning
Apache MXNet
 
Introduction to GluonCV
Introduction to GluonCVIntroduction to GluonCV
Introduction to GluonCV
Apache MXNet
 
Introduction to Computer Vision
Introduction to Computer VisionIntroduction to Computer Vision
Introduction to Computer Vision
Apache MXNet
 
Image Segmentation: Approaches and Challenges
Image Segmentation: Approaches and ChallengesImage Segmentation: Approaches and Challenges
Image Segmentation: Approaches and Challenges
Apache MXNet
 
Introduction to Deep face detection and recognition
Introduction to Deep face detection and recognitionIntroduction to Deep face detection and recognition
Introduction to Deep face detection and recognition
Apache MXNet
 
Generative Adversarial Networks (GANs) using Apache MXNet
Generative Adversarial Networks (GANs) using Apache MXNetGenerative Adversarial Networks (GANs) using Apache MXNet
Generative Adversarial Networks (GANs) using Apache MXNet
Apache MXNet
 
Deep Learning With Apache MXNet On Video by Ben Taylor @ ziff.ai
Deep Learning With Apache MXNet On Video by Ben Taylor @ ziff.aiDeep Learning With Apache MXNet On Video by Ben Taylor @ ziff.ai
Deep Learning With Apache MXNet On Video by Ben Taylor @ ziff.ai
Apache MXNet
 
Using Java to deploy Deep Learning models with MXNet
Using Java to deploy Deep Learning models with MXNetUsing Java to deploy Deep Learning models with MXNet
Using Java to deploy Deep Learning models with MXNet
Apache MXNet
 
AI powered emotion recognition: From Inception to Production - Global AI Conf...
AI powered emotion recognition: From Inception to Production - Global AI Conf...AI powered emotion recognition: From Inception to Production - Global AI Conf...
AI powered emotion recognition: From Inception to Production - Global AI Conf...
Apache MXNet
 
MXNet Paris Workshop - Intro To MXNet
MXNet Paris Workshop - Intro To MXNetMXNet Paris Workshop - Intro To MXNet
MXNet Paris Workshop - Intro To MXNet
Apache MXNet
 
Apache MXNet ODSC West 2018
Apache MXNet ODSC West 2018Apache MXNet ODSC West 2018
Apache MXNet ODSC West 2018
Apache MXNet
 
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
Apache MXNet
 
Apache MXNet EcoSystem - ACNA2018
Apache MXNet EcoSystem - ACNA2018Apache MXNet EcoSystem - ACNA2018
Apache MXNet EcoSystem - ACNA2018
Apache MXNet
 
ONNX and Edge Deployments
ONNX and Edge DeploymentsONNX and Edge Deployments
ONNX and Edge Deployments
Apache MXNet
 
Distributed Inference with MXNet and Spark
Distributed Inference with MXNet and SparkDistributed Inference with MXNet and Spark
Distributed Inference with MXNet and Spark
Apache MXNet
 
Multivariate Time Series
Multivariate Time SeriesMultivariate Time Series
Multivariate Time Series
Apache MXNet
 
AI On the Edge: Model Compression
AI On the Edge: Model CompressionAI On the Edge: Model Compression
AI On the Edge: Model Compression
Apache MXNet
 
Building Content Recommendation Systems using MXNet Gluon
Building Content Recommendation Systems using MXNet GluonBuilding Content Recommendation Systems using MXNet Gluon
Building Content Recommendation Systems using MXNet Gluon
Apache MXNet
 

More from Apache MXNet (20)

Recent Advances in Natural Language Processing
Recent Advances in Natural Language ProcessingRecent Advances in Natural Language Processing
Recent Advances in Natural Language Processing
 
Introduction to GluonNLP
Introduction to GluonNLPIntroduction to GluonNLP
Introduction to GluonNLP
 
Introduction to object tracking with Deep Learning
Introduction to object tracking with Deep LearningIntroduction to object tracking with Deep Learning
Introduction to object tracking with Deep Learning
 
Introduction to GluonCV
Introduction to GluonCVIntroduction to GluonCV
Introduction to GluonCV
 
Introduction to Computer Vision
Introduction to Computer VisionIntroduction to Computer Vision
Introduction to Computer Vision
 
Image Segmentation: Approaches and Challenges
Image Segmentation: Approaches and ChallengesImage Segmentation: Approaches and Challenges
Image Segmentation: Approaches and Challenges
 
Introduction to Deep face detection and recognition
Introduction to Deep face detection and recognitionIntroduction to Deep face detection and recognition
Introduction to Deep face detection and recognition
 
Generative Adversarial Networks (GANs) using Apache MXNet
Generative Adversarial Networks (GANs) using Apache MXNetGenerative Adversarial Networks (GANs) using Apache MXNet
Generative Adversarial Networks (GANs) using Apache MXNet
 
Deep Learning With Apache MXNet On Video by Ben Taylor @ ziff.ai
Deep Learning With Apache MXNet On Video by Ben Taylor @ ziff.aiDeep Learning With Apache MXNet On Video by Ben Taylor @ ziff.ai
Deep Learning With Apache MXNet On Video by Ben Taylor @ ziff.ai
 
Using Java to deploy Deep Learning models with MXNet
Using Java to deploy Deep Learning models with MXNetUsing Java to deploy Deep Learning models with MXNet
Using Java to deploy Deep Learning models with MXNet
 
AI powered emotion recognition: From Inception to Production - Global AI Conf...
AI powered emotion recognition: From Inception to Production - Global AI Conf...AI powered emotion recognition: From Inception to Production - Global AI Conf...
AI powered emotion recognition: From Inception to Production - Global AI Conf...
 
MXNet Paris Workshop - Intro To MXNet
MXNet Paris Workshop - Intro To MXNetMXNet Paris Workshop - Intro To MXNet
MXNet Paris Workshop - Intro To MXNet
 
Apache MXNet ODSC West 2018
Apache MXNet ODSC West 2018Apache MXNet ODSC West 2018
Apache MXNet ODSC West 2018
 
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
 
Apache MXNet EcoSystem - ACNA2018
Apache MXNet EcoSystem - ACNA2018Apache MXNet EcoSystem - ACNA2018
Apache MXNet EcoSystem - ACNA2018
 
ONNX and Edge Deployments
ONNX and Edge DeploymentsONNX and Edge Deployments
ONNX and Edge Deployments
 
Distributed Inference with MXNet and Spark
Distributed Inference with MXNet and SparkDistributed Inference with MXNet and Spark
Distributed Inference with MXNet and Spark
 
Multivariate Time Series
Multivariate Time SeriesMultivariate Time Series
Multivariate Time Series
 
AI On the Edge: Model Compression
AI On the Edge: Model CompressionAI On the Edge: Model Compression
AI On the Edge: Model Compression
 
Building Content Recommendation Systems using MXNet Gluon
Building Content Recommendation Systems using MXNet GluonBuilding Content Recommendation Systems using MXNet Gluon
Building Content Recommendation Systems using MXNet Gluon
 

Recently uploaded

THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
Abdul Wali Khan University Mardan,kP,Pakistan
 
Immersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths ForwardImmersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths Forward
Leonel Morgado
 
Basics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different formsBasics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different forms
MaheshaNanjegowda
 
molar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptxmolar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptx
Anagha Prasad
 
aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
İsa Badur
 
Medical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptxMedical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptx
terusbelajar5
 
Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.
Aditi Bajpai
 
Compexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titrationCompexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titration
Vandana Devesh Sharma
 
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptxThe use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
MAGOTI ERNEST
 
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdfwaterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
LengamoLAppostilic
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
Sérgio Sacani
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
University of Maribor
 
8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf
by6843629
 
Equivariant neural networks and representation theory
Equivariant neural networks and representation theoryEquivariant neural networks and representation theory
Equivariant neural networks and representation theory
Daniel Tubbenhauer
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
Leonel Morgado
 
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills MN
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
Sérgio Sacani
 
Deep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless ReproducibilityDeep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless Reproducibility
University of Rennes, INSA Rennes, Inria/IRISA, CNRS
 
Thornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdfThornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdf
European Sustainable Phosphorus Platform
 
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Leonel Morgado
 

Recently uploaded (20)

THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
 
Immersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths ForwardImmersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths Forward
 
Basics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different formsBasics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different forms
 
molar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptxmolar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptx
 
aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
 
Medical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptxMedical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptx
 
Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.
 
Compexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titrationCompexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titration
 
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptxThe use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
 
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdfwaterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
 
8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf
 
Equivariant neural networks and representation theory
Equivariant neural networks and representation theoryEquivariant neural networks and representation theory
Equivariant neural networks and representation theory
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
 
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
 
Deep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless ReproducibilityDeep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless Reproducibility
 
Thornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdfThornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdf
 
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
 

Fine-tuning BERT for Question Answering

  • 1. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark Automatic Question Answering through BERT fine-tuning
  • 2. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark Question Answering task Q: _______________ ? A: _______ Or A: impossible
  • 3. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark SQuAD: The Stanford Question Answering Dataset SQuAD v1.1: Single good answer SQuAD v2: One ore more good answers and questions that are impossible to answer
  • 4. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark SQuAD: The Stanford Question Answering Dataset Partial answer query answer
  • 5. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark SQuAD: The Stanford Question Answering Dataset https://rajpurkar.github.io/SQuAD-explorer/
  • 6. Bi-Directional Attention Flow for machine comprehension (BiDAF), Seo et al. 2018
  • 7. (refresher) BERT (Devlin et al. 18): Pre-training INPUT WordPieces Embeddings Sentence Embeddings Position Embeddings Learned during the (pre)training process MASK EMASK In pre-training 15% of the input tokens are masked for the masked LM task There is also a sentence similarity task
  • 10. Training for Question Answering task: (Modified from: original paper) Fine-tuning Add small weight matrix W: 𝒉 𝐿 𝑖 𝑾 𝑆𝑃𝐴𝑁_𝑆𝑇𝐴𝑅𝑇_𝐸𝑁𝐷 BERT (Devlin et al. 18): Fine-Tuning for QA impossible = start[CLS]+end[CLS] Modified network in Gluon (Bert + Dense Layer) max(𝒔𝒕𝒂𝒓𝒕) + max(𝒆𝒏𝒅) − (𝑠𝑡𝑎𝑟𝑡 𝐶𝐿𝑆 + 𝑒𝑛𝑑 𝐶𝐿𝑆 ) < 𝑖𝑚𝑝𝑜𝑠𝑠𝑖𝑏𝑙𝑒_𝑡ℎ𝑟𝑒𝑠ℎ𝑜𝑙𝑑 Question is impossible to answer if:
  • 11. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark Lab
  • 12. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark Go build! • http://gluon-nlp.mxnet.io/ Get help: • https://discuss.mxnet.io/

Editor's Notes

  1. First call deck for a high level introduction to Apache MXNet.
  2. Task 1: Mask language model (MLM) Task 2: Next sentence prediction Note that the first token is always forced to be [CLS] — a placeholder that will be used later for prediction in downstream tasks.