Transformers for NLP
An Introduction to Attention on Steroids
What is NLP?
Natural Language Processing (NLP) is a subfield of artificial intelligence primarily aimed at programming computers to process and analyze large amounts of natural language data.
NLP techniques: a timeline
● Symbolic NLP (1950s to 1990s)
● Statistical NLP (1990s to 2010s)
● Neural NLP (2010s to present)
Deep Learning for NLP
Deep Learning Models
Neural Networks: word representations (word2vec)
● Layered structure
● True targets vs. output predictions
● Weights and loss functions
● Optimizers
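The bullets above can be condensed into one minimal training step. The sketch below is a hypothetical toy setup (a single linear layer with mean-squared-error loss and plain gradient descent), not code from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: inputs X and the "true targets" y from the bullets above.
X = rng.normal(size=(8, 4))
y = rng.normal(size=(8, 1))

# Weights of a single linear layer (a one-layer "layered structure").
W = rng.normal(size=(4, 1))

lr = 0.1  # learning rate used by the optimizer (plain gradient descent)
losses = []
for _ in range(100):
    pred = X @ W                             # output predictions
    losses.append(np.mean((pred - y) ** 2))  # loss function (MSE)
    grad = 2 * X.T @ (pred - y) / len(X)     # gradient of the loss w.r.t. W
    W -= lr * grad                           # optimizer update step
```

Deep networks stack many such layers with nonlinearities in between, but the loop of predict, measure loss, and update weights is the same.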
RNN Encoder-Decoder: Language Generation and Translation
● Joint RNNs
● Creating context and using the final context of the first RNN
● Training using parallel corpora
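A minimal sketch of the encoder-decoder idea above: the first RNN reads the source, its final hidden state becomes the context, and the second (jointly trained) RNN decodes from that context. All names and sizes below are illustrative toy choices, not from the slides:

```python
import numpy as np

def rnn_step(x, h, Wx, Wh):
    """One vanilla RNN cell: new hidden state from input x and previous state h."""
    return np.tanh(x @ Wx + h @ Wh)

rng = np.random.default_rng(0)
d = 4  # toy hidden/input size
Wx_enc, Wh_enc = rng.normal(size=(d, d)), rng.normal(size=(d, d))
Wx_dec, Wh_dec = rng.normal(size=(d, d)), rng.normal(size=(d, d))

# Encoder: run the source sequence through the first RNN.
source = rng.normal(size=(5, d))  # 5 source "tokens" as vectors
h = np.zeros(d)
for x in source:
    h = rnn_step(x, h, Wx_enc, Wh_enc)

context = h  # the *final* context of the first RNN

# Decoder: the second RNN starts from that context and generates
# the target sequence step by step.
h_dec = context
outputs = []
for _ in range(3):
    h_dec = rnn_step(np.zeros(d), h_dec, Wx_dec, Wh_dec)
    outputs.append(h_dec)
```

Squeezing the entire source sentence into one fixed-size context vector is exactly the bottleneck that attention (next slide) relaxes.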
Image source: https://www.slideshare.net/darvind/nlp-using-transformers
Deep Learning Models
Attention Models: Translation and Image Recognition
● Pay selective attention to the input
● Utilize intermediate encoder states while decoding
● Perform alignment while translating
Transformer: Translation
Transformers in Translation
This architecture aims to solve sequence-to-sequence tasks while handling long-range dependencies with ease. It relies entirely on self-attention to compute representations of its input and output, without using sequence-aligned RNNs or convolutions.
Transformers in Translation
Three types of attention
● Within the encoder (self-attention over the source)
● Encoder-decoder (the decoder attends to the encoder's output)
● Within the decoder (masked self-attention over the target)
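One way to see the three attention types is by asking where the queries and the keys/values come from. The toy sketch below (shapes and names are illustrative assumptions, not the paper's full multi-head architecture) uses one shared attention function for all three:

```python
import numpy as np

def attend(queries, keys, values, mask=None):
    """Scaled dot-product attention; mask blocks disallowed positions."""
    scores = queries @ keys.T / np.sqrt(keys.shape[-1])
    if mask is not None:
        scores = np.where(mask, scores, -1e9)  # masked positions get ~zero weight
    e = np.exp(scores - scores.max(-1, keepdims=True))  # stable softmax
    weights = e / e.sum(-1, keepdims=True)
    return weights @ values

rng = np.random.default_rng(0)
src = rng.normal(size=(5, 8))  # 5 encoded source tokens
tgt = rng.normal(size=(3, 8))  # 3 decoded target tokens so far

# 1. Within the encoder: source attends to source, no mask.
enc = attend(src, src, src)

# 2. Encoder-decoder: target queries attend to source keys/values.
cross = attend(tgt, src, src)

# 3. Within the decoder: target attends to target, causally masked so
#    position i only sees positions <= i.
causal = np.tril(np.ones((3, 3), dtype=bool))
dec = attend(tgt, tgt, tgt, mask=causal)
```

The same function serves all three roles; only the sources of queries, keys, and values, and the mask, change.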
Transformers in Translation
Self-attention
● A richer representation that captures the interdependencies between words, compared to single-word embeddings
● Parallel computation across all positions
● Captures long-range dependencies
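The bullets above can be sketched in a few lines of NumPy. This is a single-head, toy version (learned projection matrices are randomly initialized here for illustration; it is not the paper's full architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d = 4, 8
X = rng.normal(size=(seq_len, d))  # one embedding per input token

# Learned projections map the *same* input to queries, keys, and values.
Wq = rng.normal(size=(d, d))
Wk = rng.normal(size=(d, d))
Wv = rng.normal(size=(d, d))
Q, K, V = X @ Wq, X @ Wk, X @ Wv

# Scaled dot-product attention: every position attends to every other,
# so the whole sequence is processed in parallel (no recurrence), and
# token 0 can attend to token seq_len-1 as easily as to its neighbor.
scores = Q @ K.T / np.sqrt(d)
e = np.exp(scores - scores.max(-1, keepdims=True))  # stable softmax
weights = e / e.sum(-1, keepdims=True)
out = weights @ V  # each row mixes information from all tokens
```

Each row of `out` is a context-dependent representation of one token, which is what makes it richer than a static single-word embedding.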
Transformers in Translation
(Architecture diagram.) Image source: https://arxiv.org/pdf/1706.03762.pdf
Transformers in Translation
Challenges
● The vanilla Transformer can only attend over fixed-length text segments: the text has to be split into chunks of a fixed size before being fed to the model as input
● This chunking of text causes context fragmentation: a token near a segment boundary loses the context on the other side
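The fragmentation problem above can be illustrated with a toy segmenter (the function name and segment size are illustrative assumptions):

```python
def chunk(tokens, max_len):
    """Split a token sequence into fixed-length segments, as a vanilla
    Transformer with a fixed context window requires."""
    return [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]

tokens = "the quick brown fox jumps over the lazy dog".split()
segments = chunk(tokens, 4)
# Each segment is processed independently, so "fox" at the end of the
# first segment cannot attend to "jumps" at the start of the second:
# the context is fragmented at every boundary.
```

Transformer-XL (next slide) addresses exactly this by letting each segment reuse hidden states from the previous one.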
Transformers in Translation
Future directions
● Transformer-XL for language modelling
● Google’s BERT
Thank you!