Transfer learning in NLP
What has changed and why is it important for business?
Jakub Nowacki, PhD
Lead Machine Learning Engineer @ Sotrender
Trainer @ Sages
Transfer Learning
https://medium.com/@pierre_guillou/understand-how-works-resnet-without-talking-about-residual-64698f157e0
😭😎
Embeddings (Word2Vec, FastText etc.)
https://towardsdatascience.com/word-embedding-with-word2vec-and-fasttext-a209c1d3e12c
So what is wrong with that?
[0.0, 0.0, …, 0.0]
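A minimal sketch of this out-of-vocabulary problem, assuming gensim (4.x parameter names) and a toy corpus made up for illustration: a plain Word2Vec model has no vector at all for an unseen word, while FastText can still compose one from character n-grams.

    import gensim

    # Tiny illustrative corpus; real models are trained on far more text.
    sentences = [["transfer", "learning", "in", "nlp"],
                 ["nlp", "models", "learn", "word", "embeddings"]]

    # Word2Vec keeps exactly one vector per vocabulary word.
    w2v = gensim.models.Word2Vec(sentences, vector_size=50, min_count=1)
    print("nlp" in w2v.wv)    # True: seen during training
    print("nlps" in w2v.wv)   # False: out of vocabulary, no vector at all

    # FastText builds word vectors from character n-grams, so even an
    # unseen or inflected form still gets a (subword-based) embedding.
    ft = gensim.models.FastText(sentences, vector_size=50, min_count=1)
    print(ft.wv["nlps"][:5])  # OOV word, but a vector is still produced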
Contextualized word-embeddings
http://jalammar.github.io/illustrated-bert/
Language model
https://medium.com/@plusepsilon/the-bidirectional-language-model-1f3961d1fb27
A statistical language model is a probability distribution over sequences of words. Given such a sequence, say of length m, it assigns a probability P(w_1, ..., w_m) to the whole sequence.
Wikipedia: https://en.wikipedia.org/wiki/Language_model
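In practice this joint probability is factored word by word, so the model only ever has to predict the next word from its preceding context:

P(w_1, ..., w_m) = P(w_1) · P(w_2 | w_1) · ... · P(w_m | w_1, ..., w_{m-1})

A bidirectional language model (as in the link above) additionally runs the same prediction task from right to left.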
ELMo
http://jalammar.github.io/illustrated-bert/
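A rough sketch of getting contextual vectors out of a pretrained ELMo, assuming the ElmoEmbedder API from the allennlp library of that period (around version 0.9); the sentences are made up for illustration:

    from allennlp.commands.elmo import ElmoEmbedder

    elmo = ElmoEmbedder()  # downloads the default pretrained English weights

    # The same surface word gets a different vector in each context.
    v1 = elmo.embed_sentence(["I", "deposited", "cash", "at", "the", "bank"])
    v2 = elmo.embed_sentence(["We", "sat", "on", "the", "river", "bank"])

    # embed_sentence returns an array of shape (3 layers, n_tokens, 1024);
    # compare the top-layer vectors of "bank" (position 5) in both sentences.
    print(v1[2, 5, :5])
    print(v2[2, 5, :5])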
LSTM vs Transformer
https://medium.com/inside-machine-learning/what-is-a-transformer-d07dd1fbec04
https://colah.github.io/posts/2015-08-Understanding-LSTMs/
BERT
http://jalammar.github.io/illustrated-bert/
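As a sketch only: with the Hugging Face transformers library and the public bert-base-uncased checkpoint, a pretrained BERT can be used off the shelf as a contextual feature extractor (it can also be fine-tuned end to end, as discussed later):

    import torch
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    inputs = tokenizer("Transfer learning is changing NLP.", return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # One contextual vector per (sub)word token, 768 dimensions for the base model.
    print(outputs.last_hidden_state.shape)  # torch.Size([1, n_tokens, 768])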
Transfer learning in NLP
http://jalammar.github.io/illustrated-bert/
NLP’s ImageNet moment
http://ruder.io/nlp-imagenet/
Cyberbullying
https://kidshelpline.com.au/teens/issues/cyberbullying
PolEval 2019 Cyberbullying
http://poleval.pl/tasks/task6
Precision = 0.5
Recall = 0.5522
F1-score = 0.5248 (balanced)
Accuracy = 0.866
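The balanced F1 score above is simply the harmonic mean of precision and recall, which a couple of lines of Python confirm:

    precision, recall = 0.5, 0.5522
    f1 = 2 * precision * recall / (precision + recall)
    print(round(f1, 4))  # 0.5248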
🏆
4th place (theoretically, since we didn’t take part)

Model architecture: Document → Word embeddings (FastText) + Flair embeddings (forward) + Flair embeddings (backward) → Stacked embeddings → BiLSTM (with dropouts) → Linear → Harmful?
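A rough sketch of how such a stacked-embedding setup can be assembled with the Flair library; the Polish model names ("pl", "pl-forward", "pl-backward") and the hyperparameters here are assumptions, not the exact configuration used for the submission:

    from flair.data import Sentence
    from flair.embeddings import (WordEmbeddings, FlairEmbeddings,
                                  DocumentRNNEmbeddings)

    # Stack FastText word embeddings with forward and backward Flair
    # character-LM embeddings, then pool them with a bidirectional LSTM.
    document_embeddings = DocumentRNNEmbeddings(
        [WordEmbeddings("pl"),             # Polish FastText vectors
         FlairEmbeddings("pl-forward"),    # forward character language model
         FlairEmbeddings("pl-backward")],  # backward character language model
        hidden_size=256,
        rnn_type="LSTM",
        bidirectional=True,
        dropout=0.5,
    )

    # Each tweet becomes a single document vector; Flair's TextClassifier
    # then adds a linear layer on top to predict harmful / non-harmful.
    sentence = Sentence("@anonymized_account Czyżby Madryt brał przykład z Warszawy?")
    document_embeddings.embed(sentence)
    print(sentence.get_embedding().shape)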
PolEval 2019 Cyberbullying
http://poleval.pl/tasks/task6
@anonymized_account Czyżby Madryt brał przykład z Warszawy?
[Translation: Could it be that Madrid is taking its cue from Warsaw?]
@anonymized_account @anonymized_account No to Skończmy k**** z tym wersalem w j****** szczujni
[Translation: Well then, let's f****** put an end to these niceties in that f****** hate-mongering outlet]
The pros and cons
Shallow embeddings (Word2Vec, FastText etc.)
Pros:
• Easy to train
• Small
• A lot of existing models
Cons:
• Same embedding for different meanings
• May have issues with inflection
• May have issues with out-of-vocabulary (OOV) words
Contextualized embeddings (ELMo, Flair etc.)
Pros:
• Embeddings based on the context
• Moderate size and training speed
• Existing models
• No OOV problem
Cons:
• Require an extra network architecture
• LSTMs are rather slow
• Should be used along with shallow embeddings
Transformer-based models (e.g. BERT)
Pros:
• Task-agnostic model
• Can be used as embeddings or fine-tuned (see the sketch after this list)
• Existing models
• Faster than LSTMs
• No OOV problem
Cons:
• Can be really large
• Hard to fine-tune and even harder to pre-train from scratch (TPUs almost a must)
• Multilingual versions are very large
https://lilianweng.github.io/lil-log/2019/01/31/generalized-language-models.html
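To illustrate the "used as embeddings or fine-tuned" point from the list above, here is a minimal fine-tuning sketch with the Hugging Face transformers library and the public multilingual BERT checkpoint; the two example texts and labels are placeholders, not real data:

    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-multilingual-cased", num_labels=2)

    # Placeholder mini-batch: two texts with binary harmful / non-harmful labels.
    batch = tokenizer(["example harmful tweet", "example harmless tweet"],
                      padding=True, return_tensors="pt")
    labels = torch.tensor([1, 0])

    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
    outputs = model(**batch, labels=labels)  # the model adds a classification head
    outputs.loss.backward()                  # one gradient step of fine-tuning
    optimizer.step()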
Thank you!
Questions?

DSS 2019 Transfer Learning in NLP