The Evolution: From RNNs to Transformers

The transformer architecture was introduced in 2017. Unlike RNNs, transformers use an attention mechanism, allowing them to consider the entire sentence or paragraph simultaneously rather than processing one word at a time.
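The attention step described above can be sketched in a few lines. This is a minimal, illustrative scaled dot-product self-attention in NumPy — the names, shapes, and random inputs are assumptions for the sketch, not from the video:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Every token's query is compared against every token's key at once,
    # which is what lets the model look at the whole sequence in parallel.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (seq_len, seq_len) similarities
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted mix of value vectors

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
x = rng.normal(size=(seq_len, d_model))  # toy "embeddings"
out = attention(x, x, x)                 # self-attention: Q = K = V = x
print(out.shape)                         # (4, 8)
```

An RNN would have to pass information through a hidden state one token at a time; here the `scores` matrix relates all token pairs in a single matrix multiply.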
Example: "As she said this, she looked down at her hands..."

Tokenization splits the text into tokens:

['As', 'she', 'said', 'this', ',', 'she', 'looked', 'down', 'at', 'her', 'hands', '...']

The tokens are then encoded and mapped to embeddings; the model condenses them into a context vector and produces output by autoregressive generation, predicting one token at a time and feeding each prediction back in as input.
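The tokenization step for the sample sentence can be sketched with a simple word-level tokenizer. Real LLMs use subword tokenizers such as BPE; the regex here is only an illustrative assumption that reproduces the slide's token list:

```python
import re

def tokenize(text):
    # Match whole words, the "..." ellipsis, or single punctuation marks.
    return re.findall(r"\w+|\.\.\.|[^\w\s]", text)

tokens = tokenize("As she said this, she looked down at her hands...")
print(tokens)
# ['As', 'she', 'said', 'this', ',', 'she', 'looked', 'down', 'at', 'her', 'hands', '...']
```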
Why Transformers Can Predict Text

Why Predicting Text is Possible: natural language is redundant. Its entropy is well below that of random character sequences, so much of a sentence can be inferred from the rest. This redundancy and predictability are what make next-word prediction feasible.
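The entropy idea can be made concrete with a small sketch. This unigram character-entropy estimate (sample text and function name are my own for illustration) comes out well below the roughly 4.7 bits of a uniform 26-letter alphabet, showing the redundancy the slide refers to:

```python
import math
from collections import Counter

def char_entropy(text):
    # Shannon entropy H = -sum(p * log2(p)) over character frequencies.
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

sample = "as she said this she looked down at her hands"
h = char_entropy(sample)
print(round(h, 2))  # unigram estimate; true entropy with context is lower still
```

This estimate ignores context; conditioning on preceding characters lowers the entropy further, which is exactly what a language model exploits.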
What is a Large Language Model?

A Large Language Model is a type of artificial intelligence that uses deep learning and vast datasets to understand, summarize, generate, and predict new content. LLMs are a subset of generative AI, specifically designed to create text-based content.

Examples: GPT-3, Megatron-Turing NLG 530B
Capabilities of LLMs

- Generation
- Summarization
- Translation
- Classification
- Chatbots
Real-World Applications

- Healthcare
- Retail
- Software Development
- Finance
- Marketing
How Do Large Language Models Work?

LLMs are trained using unsupervised learning, which means they learn from large datasets without labeled data. They find patterns and relationships in the data, enabling them to perform various tasks like text generation, summarization, and translation.

Once trained, they can be prompted in three ways:

1. Zero-Shot Learning
2. One-Shot Learning
3. Few-Shot Learning
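The difference between these prompting styles is just how many worked examples the prompt contains. A minimal sketch (the translation task, wording, and helper name are hypothetical, not from the video):

```python
# Same task, different numbers of worked examples in the prompt.
task = "Translate English to French:"
examples = [("cheese", "fromage"), ("dog", "chien"), ("house", "maison")]
query = "bread =>"

def build_prompt(n_shots):
    # n_shots = 0 -> zero-shot, 1 -> one-shot, >1 -> few-shot.
    shots = [f"{en} => {fr}" for en, fr in examples[:n_shots]]
    return "\n".join([task, *shots, query])

print(build_prompt(0))  # zero-shot: instruction and query only
print(build_prompt(1))  # one-shot: one worked example
print(build_prompt(3))  # few-shot: several worked examples
```

No weights change in any of the three cases; the model infers the task pattern purely from the prompt.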
Customization Techniques

1. Fine-Tuning
2. Prompt Tuning
3. Adapters
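The adapter idea can be sketched without any framework: freeze the large pretrained layer and train only a small bottleneck added on a residual path. All names, shapes, and initializations below are illustrative assumptions, not from any specific adapter library:

```python
import numpy as np

rng = np.random.default_rng(1)
d_model, d_bottleneck = 16, 4

W_frozen = rng.normal(size=(d_model, d_model))            # pretrained, frozen
W_down = rng.normal(size=(d_model, d_bottleneck)) * 0.01  # trainable (down-project)
W_up = np.zeros((d_bottleneck, d_model))                  # trainable (up-project)

def adapter_layer(x):
    h = x @ W_frozen                            # frozen pretrained transformation
    delta = np.maximum(0, h @ W_down) @ W_up    # small trainable bottleneck (ReLU)
    return h + delta                            # residual: zero-init W_up makes
                                                # the adapter a no-op at the start

x = rng.normal(size=(2, d_model))
y = adapter_layer(x)
print(y.shape)  # (2, 16)
```

Only the 16x4 and 4x16 adapter matrices would be updated during training, a small fraction of the frozen 16x16 layer — which is why adapters make customization cheap.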
Types of Large Language Models

1. Decoder-Only Models (e.g., GPT-3)
2. Encoder-Only Models (e.g., BERT)
3. Encoder-Decoder Models (e.g., T5)
Advantages and Challenges of Large Language Models

Advantages:
- Flexibility
- Extensibility
- Performance
- Accuracy
- Ease of Training
- Efficiency

Challenges:
- Operational Costs
- Development Costs
- Bias
- Ethical Concerns
- Explainability
- Hallucination

Large Language Models | How Large Language Models Work? | Introduction to LLM | Simplilearn