Difference between a base LLM model and their variants. Learn the difference between a Base model and it's instruct and chat variants. Learn when to use which model and a final summary 🙂
2. Base model
- Trained on a diverse range of
texts, making minimal
assumptions about the structure
of the text it completes.
- Lacks specific context or
task-related biases.
- When using a base model, you
can input any text prompt, and it
will generate a continuation
based on its general language
understanding.
- Versatile but don’t specialize in
any particular task.
3. Instruct Variant
- Fine-tuned on
instruction-response pairs
during training.
- Designed to follow specific
instructions and generate
responses that adhere to those
instructions.
- For example, if you give an
instruct model an instruction
like “Write a recipe for chocolate
cake,” it will generate a response
that aligns with the given
instruction.
- Useful for tasks where precise
adherence to instructions
matters.
4. - Derived from base models by
training them on transcripts of
dialogues.
- Assume that the input text is part
of a conversation.
- Can use chat models for
interactive back-and-forth
conversations.
- For instance, you can provide
one side of a dialogue, and the
chat model will complete the
other side.
Chat Variant
5. - While these labels (base, chat,
instruct) help describe the
model’s intended use, they are
not strict boundaries.
- You can instruct chat models and
chat with instruct models.
- In practice, you can often switch
between them based on your
specific needs.
- Actual capabilities of a model
depend on how it was fine-tuned
and the data it was exposed to!
Notes