2. Introduction
Confidential Property of App Orchid Inc Copyright : 2017 - 2018 2
Background
● Sophomore at Rutgers University
● Pursuing a Bachelor of Science in Computer
Science and Mathematics
● Very interested in Machine Learning.
My Task
● Implemented an ensemble of BERTs that
were trained on the SQuAD 2.0 Dataset.
● Went from knowing very little about
Machine Learning to learning PyTorch
and experimenting with a recent NLP
architecture.
3. SQuAD 2.0 Dataset
Question answering is one of
the most actively researched
downstream NLP tasks today.
The SQuAD 2.0 Dataset is a
commonly used benchmark
dataset for this task.
Example From the SQuAD 2.0 Dataset
Rajpurkar et al., 2018
6. My Task
Task: Train an ensemble of BERTs on the SQuAD
2.0 dataset.
The BERT and BiDAF code were provided by
Hugging Face and this GitHub repo, respectively.
Zhou et al., 2019
8. How the Ensemble Works
The ensemble algorithm takes the prediction
files produced by each of the BERT models
and by BiDAF.
Example question-answers pair from model0’s n-best predictions
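As a rough illustration of how n-best prediction files could be combined, here is a minimal sketch. It is not necessarily the algorithm used in this project: it assumes each file is a JSON object mapping a question ID to a list of candidates with "text" and "probability" fields, and it assumes a simple combination rule of summing probabilities per answer text. The function names load_nbest and ensemble_nbest are hypothetical.

```python
import json
from collections import defaultdict


def load_nbest(path):
    """Load one model's n-best predictions file (assumed JSON layout:
    {question_id: [{"text": str, "probability": float, ...}, ...]})."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def ensemble_nbest(model_predictions):
    """Pick one answer per question by summing, across models, the
    probability each model assigns to each candidate answer text.

    This summing rule is an assumption made for illustration only.
    An empty string stands for "no answer" (SQuAD 2.0 convention).
    """
    question_ids = set()
    for preds in model_predictions:
        question_ids.update(preds.keys())

    combined = {}
    for qid in sorted(question_ids):
        scores = defaultdict(float)
        for preds in model_predictions:
            # A model may be missing a question; treat that as no vote.
            for candidate in preds.get(qid, []):
                scores[candidate["text"]] += candidate["probability"]
        # Highest total score wins; fall back to "no answer" if no votes.
        combined[qid] = max(scores, key=scores.get) if scores else ""
    return combined


if __name__ == "__main__":
    # Tiny in-memory example with made-up values, standing in for files.
    model0 = {"q1": [{"text": "Paris", "probability": 0.6},
                     {"text": "", "probability": 0.4}]}
    model1 = {"q1": [{"text": "", "probability": 0.7},
                     {"text": "Paris", "probability": 0.3}]}
    print(ensemble_nbest([model0, model1]))
```

In practice, each entry of model_predictions would come from load_nbest called on one model's prediction file. Summing probabilities lets a confident model outweigh several uncertain ones; majority voting over each model's top answer would be a simpler alternative.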
11. Issues I encountered
● I had only about one and a half weeks to set up the right environment, train the
models, and create the ensemble.
○ I spent about 1.5 months understanding the code.
○ With more time, I could have run more experiments with different
hyperparameters and augmented data.
● GPU constraints: because these models are massive, I ran into a couple of
problems.
○ Encountered out-of-RAM issues when training the BERT large models
12. Plans for the Future
● I am now even more motivated to pursue
a career in Machine Learning.
● I plan to learn more about NLP and fill in
the gaps of what I learned these past 2
months.
● I plan to gain experience by pursuing a
research position at my university.
● I plan to create a couple of projects
using ML and reach a level where I can
implement research papers.
● I plan to learn Computer Vision.
Google Deep Dream