Transfer learning in NLP has become important for business. Contextual word embeddings such as ELMo and BERT provide context-dependent representations of words, addressing the shortcomings of earlier static embeddings. Shallow embeddings are easy to train and small, but they cannot represent different meanings of the same word and may fail on out-of-vocabulary words. Contextualized embeddings provide context-aware representations at the cost of more complex neural architectures, while transformer-based models such as BERT can be used either as embeddings or fine-tuned for a task, but are very large and require significant computing resources to train.
DSS 2019 Transfer Learning in NLP
1. Transfer learning in NLP
What has changed and why is it important for business?
Jakub Nowacki, PhD
Lead Machine Learning Engineer @ Sotrender
Trainer @ Sages
17. The pros and cons
Shallow embeddings
(Word2Vec, FastText etc.)
Pros:
• Easy to train
• Small
• A lot of existing models
Cons:
• Same embedding for different meanings of a word
• May have issues with inflection
• May have issues with out-of-vocabulary (OOV) words (see the sketch below)
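To make the static-meaning and OOV cons concrete, here is a minimal sketch assuming gensim's 4.x Word2Vec API; the toy corpus and hyperparameters are illustrative, not from the talk:

```python
from gensim.models import Word2Vec

sentences = [
    ["the", "bank", "approved", "the", "loan"],
    ["we", "sat", "on", "the", "river", "bank"],
]

# Training is cheap and the resulting model is small.
model = Word2Vec(sentences, vector_size=50, min_count=1, epochs=10)

# One static vector per word: "bank" gets the same embedding in both
# sentences, regardless of the financial or river meaning.
vector = model.wv["bank"]
print(vector.shape)  # (50,)

# Looking up a word not seen during training raises an error.
try:
    model.wv["riverbank"]
except KeyError:
    print("OOV word: no embedding available")
```

FastText mitigates the OOV issue by composing vectors from character n-grams, but the embedding for a given surface form is still the same in every context.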
Contextualized embeddings
(ELMo, Flair etc.)
Pros:
• Embedding based on the context
• Moderate size and training speed
• Existing models
• No OOV problem
Cons:
• Require an extra network architecture
• LSTMs are rather slow
• Should be used along with shallow embeddings (see the sketch below)
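A minimal sketch of the "stack contextual on top of shallow" pattern, assuming the flair library's API; "glove" and "news-forward" are standard flair model identifiers:

```python
from flair.data import Sentence
from flair.embeddings import WordEmbeddings, FlairEmbeddings, StackedEmbeddings

# Contextual character-level LSTM embeddings are typically stacked
# with shallow word embeddings, as noted in the cons above.
stacked = StackedEmbeddings([
    WordEmbeddings("glove"),          # shallow, static
    FlairEmbeddings("news-forward"),  # contextual, character-based: no OOV
])

sentence = Sentence("I deposited money at the bank")
stacked.embed(sentence)

# Each token now carries a context-dependent vector; "bank" here
# would embed differently than in "we sat on the river bank".
for token in sentence:
    print(token.text, token.embedding.shape)
```

Because the Flair part runs a character-level LSTM over the sentence, embedding is slower than a static table lookup, which is the speed trade-off listed above.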
Transformer-based models
(e.g. BERT)
Pros:
• Task-agnostic model
• Can be used as embeddings or fine-tuned (see the sketch below)
• Existing models
• Faster than LSTMs
• No OOV problem
Cons:
• Can be really large
• Hard to fine-tune and even harder to train from scratch (TPUs almost a must)
• Multilingual versions are very large
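A minimal sketch of the "use as embeddings" option, assuming the Hugging Face transformers API; "bert-base-uncased" is one of the publicly available pretrained models:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# Feature extraction: one context-dependent vector per subword token.
# WordPiece tokenization splits unknown words into known pieces,
# so there is no OOV problem.
inputs = tokenizer("I deposited money at the bank", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # (1, num_tokens, 768)
```

The alternative is fine-tuning: for example, AutoModelForSequenceClassification adds a classification head on top of the same pretrained encoder and the whole network is trained further on the target task, which is where the compute cost listed in the cons comes in.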
https://lilianweng.github.io/lil-log/2019/01/31/generalized-language-models.html