Large Language Model, Meta AI: LLaMA 2
© Rk Rahul
LLaMA - Overview
● LLaMA is a family of large language models (LLMs)
● LLaMA was trained in four model sizes: 7, 13, 33, and 65 billion parameters
● LLaMA was developed by Meta
● First released in February 2023
LLaMA 2 - Overview
● LLaMA 2 is a family of large language models (LLMs)
● LLaMA 2 is an auto-regressive language model
● First released on July 18, 2023, by Meta in partnership with Microsoft, as an open-source large language model
● LLaMA 2 pretrained models are trained on 2 trillion tokens and have double the context length of LLaMA 1
● Three model sizes were trained: 7, 13, and 70 billion parameters
● LLaMA 2 is available for free for research and commercial use
LLaMA 2 – Can Do
● Generate creative text in many formats, such as poems, code, scripts, musical pieces, emails, and letters
● Translate languages
● Write different kinds of creative content
● Answer questions in an informative way, even if they are open-ended, challenging, or strange
● Help with coding tasks
● Generate dialogue for chatbots and other conversational AI systems
LLaMA 2 - Improvements
● Increased training data: LLaMA 2 is trained on 40% more tokens than LLaMA 1
● Longer context length: the context window is doubled to 4k tokens
● Fine-tuning for dialogue: the fine-tuned versions (labelled LLaMA 2-Chat) are optimized for dialogue applications using Reinforcement Learning from Human Feedback (RLHF)
Fine-Tuning Process and LLaMA-2-Chat
LLaMA 2 Building Process
1. Pre-Training
2. Supervised Fine-Tuning
3. Reinforcement Learning from Human Feedback (RLHF)
4. Reward Model
LLaMA 2 Pre-Training
● The pretraining approach uses an optimized auto-regressive transformer, with several changes to improve performance
● Grouped-query attention (GQA) is used to improve inference scalability (see the sketch after this list)
● Trained on 2 trillion tokens of data for good performance
● The model uses the standard transformer architecture
● Pre-normalization using RMSNorm (Root Mean Square Layer Normalization)
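Grouped-query attention shares each key/value head across a group of query heads, which shrinks the KV cache at inference time. A minimal sketch of the idea, with simplified tensor shapes and no masking or caching; this is an illustration, not the actual LLaMA 2 implementation:

import torch
import torch.nn.functional as F

def grouped_query_attention(q, k, v):
    # q: (batch, n_q_heads, seq, head_dim); k, v: (batch, n_kv_heads, seq, head_dim)
    # with n_kv_heads < n_q_heads, so several query heads share one key/value head
    n_q_heads, n_kv_heads = q.shape[1], k.shape[1]
    repeat = n_q_heads // n_kv_heads
    k = k.repeat_interleave(repeat, dim=1)   # expand shared KV heads to match the query heads
    v = v.repeat_interleave(repeat, dim=1)
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1) @ v     # (batch, n_q_heads, seq, head_dim)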
LLaMA 2 Pre-Training Normalization
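Complementing the RMSNorm point above, a minimal PyTorch sketch of pre-normalization with RMSNorm; the epsilon value and the usage comment are assumptions, not taken from the slides:

import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root Mean Square Layer Normalization: rescales activations by their RMS,
    with a learned gain and no mean-centering or bias term."""
    def __init__(self, dim: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms_inv = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * rms_inv)

# Pre-normalization means each sub-layer normalizes its input rather than its output,
# e.g. x = x + attention(norm(x)) instead of x = norm(x + attention(x))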
LLaMA 2 - Pretraining Functionality
● Trained using the AdamW optimizer (β1 = 0.9, β2 = 0.95, eps = 10⁻⁵)
● Uses the SwiGLU activation function
● Uses a cosine learning rate schedule with a warmup of 2,000 steps, decaying to a final learning rate of 10% of the peak
● Weight decay of 0.1 and gradient clipping of 1.0
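A hedged sketch of how these optimizer settings could be wired up in PyTorch. Only the betas, eps, weight decay, warmup steps, and gradient-clipping value come from the slide; the peak learning rate and total step count are placeholders:

import math
import torch

def build_optimizer(model, peak_lr=3e-4, total_steps=500_000, warmup_steps=2000):
    # AdamW with the hyperparameters listed on the slide
    opt = torch.optim.AdamW(model.parameters(), lr=peak_lr,
                            betas=(0.9, 0.95), eps=1e-5, weight_decay=0.1)

    def cosine_with_warmup(step):
        if step < warmup_steps:                      # linear warmup
            return step / max(1, warmup_steps)
        progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
        return 0.1 + 0.45 * (1.0 + math.cos(math.pi * progress))  # cosine decay to 10% of peak

    sched = torch.optim.lr_scheduler.LambdaLR(opt, cosine_with_warmup)
    return opt, sched

# During training, clip gradients to 1.0 before each optimizer step:
# torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)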
LLaMA 2 - Training Hardware
● LLaMA 2 was pre-trained on Meta's Research Super Cluster (RSC) as well as internal production clusters
● Both clusters use NVIDIA A100 GPUs
● RSC uses NVIDIA Quantum InfiniBand, while the production clusters use RoCE (RDMA over Converged Ethernet)
LLaMA 2 - Supervised Fine-Tuning (SFT)
● SFT uses a next-token prediction objective that is nearly identical to the one used in pre-training (a training-step sketch follows below)
● Text is encoded with the same LLaMA 2 tokenizer used for pre-training
● Supervised fine-tuning uses a cosine learning rate schedule with an initial learning rate of 2 × 10⁻⁵, a weight decay of 0.1, a batch size of 64, and a sequence length of 4,096 tokens
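A minimal sketch of one SFT training step with the next-token prediction objective, assuming a model that maps token ids to logits; the model and data are placeholders, only the optimizer hyperparameters come from the slide:

import torch
import torch.nn.functional as F

def sft_step(model, optimizer, input_ids):
    # input_ids: (batch=64, seq_len=4096) token ids of concatenated prompt + answer text
    logits = model(input_ids)                     # (batch, seq_len, vocab_size)
    # next-token prediction: predict token t+1 from tokens up to t
    loss = F.cross_entropy(
        logits[:, :-1, :].reshape(-1, logits.size(-1)),
        input_ids[:, 1:].reshape(-1),
    )
    # (the Llama 2 paper also zeroes out the loss on the prompt tokens; omitted here)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5, weight_decay=0.1)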
LLaMA 2 - Tokenizer
● To encode text, the tokenizer first splits all numbers into individual digits. Because LLaMA 2 is a subword language model, it can learn to represent numbers using a small number of subwords
● LLaMA 2 uses a byte-pair encoding (BPE) tokenizer based on the SentencePiece implementation
● The total vocabulary size is 32k tokens
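A tiny illustration of the digit-splitting idea; this is not the actual SentencePiece BPE tokenizer, just a sketch of why a number like "2023" never needs its own vocabulary entry:

def split_number(token: str) -> list[str]:
    # numbers are never single vocabulary entries; they are encoded digit by digit,
    # so any number can be built from just the ten digit subwords
    return list(token) if token.isdigit() else [token]

print(split_number("2023"))   # ['2', '0', '2', '3']
print(split_number("llama"))  # ['llama']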
LLaMA 2 - Tokenizer
LLaMA 2 - RLHF
● Reinforcement learning from human feedback (RLHF) is a model training procedure that is applied to a fine-tuned language model to further align its behavior with human preferences and instruction following
● RLHF collects data that represents sampled human preferences: human annotators select which of two model outputs they prefer (a sketch of such a preference record follows below)
● Safety-focused data is also collected during RLHF
● This human feedback is subsequently used to train a reward model, which learns patterns in the preferences of the human annotators and can then automate preference decisions
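A hedged sketch of what one collected preference comparison could look like as a data record; the field names are illustrative, not the actual Llama 2 annotation schema:

from dataclasses import dataclass

@dataclass
class PreferenceExample:
    prompt: str        # prompt shown to the model and the annotator
    chosen: str        # the model output the annotator preferred
    rejected: str      # the other sampled model output
    is_safety: bool    # whether it came from the safety-focused collection

example = PreferenceExample(
    prompt="Explain what a reward model does.",
    chosen="A reward model scores a response for helpfulness and safety ...",
    rejected="idk",
    is_safety=False,
)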
LLaMA 2 - Reward Model
● The reward model is responsible for telling the language model what constitutes a good response, based on how helpful and safe that response is
● The reward model takes a model response and its corresponding prompt as inputs, and outputs a scalar score indicating the quality of the model's generation (a minimal sketch follows below)
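A minimal sketch of the idea: a scalar scoring head on top of a transformer backbone, trained with a binary ranking loss so the chosen response scores higher than the rejected one. The backbone, hidden size, and pooling choice are placeholders, not the actual LLaMA 2 implementation:

import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    def __init__(self, backbone: nn.Module, hidden_size: int):
        super().__init__()
        self.backbone = backbone                # transformer trunk without its LM head
        self.score = nn.Linear(hidden_size, 1)  # scalar reward head

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        hidden = self.backbone(input_ids)       # (batch, seq_len, hidden_size)
        return self.score(hidden[:, -1, :]).squeeze(-1)  # one scalar per prompt+response

def ranking_loss(reward_chosen: torch.Tensor, reward_rejected: torch.Tensor) -> torch.Tensor:
    # push the chosen response's score above the rejected response's score
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

The Llama 2 paper additionally adds a margin term inside the sigmoid that reflects how strongly the annotator preferred the chosen response; it is omitted here for brevity.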
LLaMA 2 - Model Evaluations
Reference
● Deep (Learning) Focus: https://cameronrwolfe.substack.com/p/llama-2-from-the-ground-up
● Meta AI: https://ai.meta.com/
● Research article: Llama 2: Open Foundation and Fine-Tuned Chat Models
Thanks!