SlideShare a Scribd company logo
1 of 26
Download to read offline
NoCode, Data & AI
LLM Inside Bootcamp
Fundamentals of LLM
What is a large language model, how is it trained, how are
different from traditional machine learning models.
Rahul Xavier Singh Anant Corporation
Nocode Data & AI
To most , LLMs seem like
magic. In computing &
technology, LLMs show
great promise in bridging
the gap between human
computer interaction.
Our Customers
NoCode, Data & AI
LLM Inside Bootcamp
with Cassandra
Full day bootcamp to familiarize product managers, software
professionals, and data engineers to creating next generation
experts, assistants, and platforms powered by Generative AI
with Large Language Models (LLM, OpenAI, GPT)
Rahul Xavier Singh Anant Corporation
Nocode Data & AI
kono.io/bootcamp
Agenda
● I: Strategy & Theory
● II: LLM Design Patterns
● III: NoCode/Code LLM Stacks
● IV: Build a Custom ChatBot
with LLM your Data
Today’s Agenda
1. Fundamentals of ML
2. Transformers Architecture
3. How LLMs Work
4. LLMs other than ChatGPT/GPT
Fundamentals of
ML/Transformers
● History of LLMs (Large Language Models)
● What is Machine Learning / AI?
● Transformer Architecture
History of Large Language Models
1. Everything
before GPT-3
(2020) was trash.
2. ChatGPT made
GPT-3 popular.
3. Now everyone
wants in on the
party.
https://voicebot.ai/large-language-models
-history-timeline/
Most of the hype, growth
relating to LLMs have
happened in the last 6 months
( November 2022 till now , May
2023
Machine Learning in a Nutshell
https://www.avenga.com/magazine/machi
ne-learning-programming/
1. In machine learning, the
computer trains on your
data, and gives you the
most likely answer. The
better the data, the
better the algorithm.
2. Neural networks process
input data through layers
to predict outcomes
based on patterns and
relationships learned
during training.
What can Neural Neworks do?
https://thedatascientist.com/wp-content/uploads/2018/03/Deep-Neural-
Network-What-is-Deep-Learning-Edureka.png
1. Artificial neural networks (ANN)
can recognize patterns and
relationships in data.
2. They can classify and categorize
data accurately.
3. They can make predictions based
on input data.
4. Neural networks can be used for
image and speech recognition.
5. Deep neural network is an ANN
that has many layers and can do
more complex predictions.
6. They can be trained to improve
their accuracy over time.
https://www.analyticsvidhya.com/blog/202
1/05/convolutional-neural-networks-cnn/
What is the big deal about Transformers?
1. Because ANNs are implementations in matrix math -
and that relates to the Matrix of Leadership …
2. Transformers improve natural language processing,
enabling better chatbots and language translation
tools.
3. Transformers are a neural network architecture that
outperforms previous models on various NLP tasks.
4. Attention mechanisms in Transformers better model
long-term dependencies in sequential data.
5. Transformers are a hardware accelerator that
speeds up AI computations by several orders of
magnitude.
6. Transformers were invented by Elon Musk
The encoder-decoder structure of the Transformer
architecture
Taken from “Attention Is All You Need“
How LLMs Work &
What LLMs Do
● Transformers Decoder/Encoder
● What LLMs Do: Predict Words
● What LLMs Do: Narrow Possibilities
● What LLMs Do: Verse Jumping
● What LLMs Do: Document Construction
How does a Large Language Model Work?
1. The transformer architecture consists of two
components: the encoder and decoder.
2. The encoder processes the input sequence and
generates embeddings through self-attention
mechanisms.
3. The decoder takes the encoder's embeddings as
input and generates an output sequence, while
also using self-attention mechanisms to attend to
relevant parts of the input sequence.
4. Together, they enable the transformer to learn
complex patterns and relationships within
sequences, making it a powerful tool for natural
language processing and other sequence modeling
tasks.
The encoder-decoder structure of the Transformer
architecture
Taken from “Attention Is All You Need“
What LLMs Do: Predict Words
1. A language model uses deep learning
algorithms to learn patterns and
relationships in large sets of text data.
2. It is trained on a large corpus of text, such
as books, articles, and websites, to
recognize and understand the underlying
structure and meaning of language.
3. Once trained, the model can generate
new text based on the input it receives,
by predicting the most likely sequence of
words to follow.
4. The model uses a probabilistic approach
to generate text, allowing it to produce
diverse and creative responses to different
inputs.
5. LLMs have a wide range of applications,
including language translation, chatbots,
content creation, and more.
https://vectara.com/avoiding-hallucinations-in-llm-powered-applications/
What LLMs Do: Narrow Possibilities
1. A LLM is like a really
smart guesser that's
been trained on a lot
of text.
2. When you give it a
prompt, it starts
guessing what the
next word might be.
3. Instead of guessing
randomly, it predicts
the best possible
word.
4. As you add words to
your prompt, you are
narrowing down the
overall “document”
you get back.
What LLMs Do: Verse Jumping
1. It’s a simulator of the real world, but it isn’t a real
world. Each prompt is a portal to a a possible
realistic universe.
2. It contains probabilities of words or tokens from the
tokenverse strung together which we can call a
“Document”
3. As you give it more words, the universe of possible
“Documents” reduces.
https://now.tufts.edu/2022/05/31/exploring-shape-
our-universe-and-multiverse
What LLMs Do: Document Construction
1. Each model has a
“tokenverse” which it
picks words from.
GPT4 has 100k tokens.
2. Document A &
Document B are
possible path through
all of the tokens in the
tokenverse for a
particular model.
3. If you start with certain
words, a Prompt A’,
the possibility of
getting Document A
increases
4.
A’
B’
LLMs other than
ChatGPT/GPT
● Popular LLMs Available
● Popular Open Source LLMs Available
● Cloud Providers LLM Offerings
Popular Public LLMs Available Today
1. OpenAI: ChatGPT,
GPT3.5-Turbo,
Text-Davinci-003,
GPT4 (Waitlist)
2. Anthropic: Claude,
Claude-Instant
3. Cohere: Baseline,
allows training
https://vectara.com/top-large-language-models-llms-gpt-4-llama-g
ato-bloom-and-when-to-choose-one-over-the-other/
If you are starting out, just use GPT-3.5 Turbo.
It’s easy to get access to, and there are lots of
code examples on Github
Leaked @Google: “We Have No Moat…”
“We Have No Moat, And
Neither Does OpenAI"
https://lmsys.org/blog/2023-03-30-vicuna/
https://www.semianalysis.com/p/google-we-have-
no-moat-and-neither ● Meta LLaMa Open Sourced
● GPT Answers used to Train
● LoRA - Low rank adaptation
● Retraining models is hard
● Small models iterating
better
● Data quality scales better
● Battling open source means
failure
● Companies need users /
researchers
● Individuals can use
different licenses
● Be your customer
● Let open source do the
work
● OpenAI no different than
Google
Example Open LLM: Stanford Alpaca
https://crfm.stanford.edu/2023/03/13/alpaca.html
https://lmsys.org/blog/2023-03-30-vicuna/
Popular Open LLMs Available Today
Leaderboard
1. Vicuna-13b
2. Koala-13b
3. Oast-pythia-12b
Others to Look into
4. StableLM
5. Dolly
6. ChatGLM
https://chat.lmsys.org/
If you don’t want to send your data to a public
LLM, you can host your own open model, or use
Azure OpenAI, Amazon Bedrock
Cost of Fine Tuning: Alpaca/Vicuna
https://lmsys.org/blog/2023-03-30-vicuna/
Public Cloud Offerings of LLM
1. Azure OpenAI
2. Amazon Bedrock
3. NVidia NeMo
4. Google Vertex (batteries
not Included)
https://venturebeat.com/ai/amazon-launches-bedrock-for-generative
-ai-escalating-ai-cloud-wars/
Azure OpenAI is the most mature, and probably the
best. Amazon’s Bedrock offers managed hosting of
Claude, StableLM, etc. Google’s offering requires
work to get it to work.
25
Key Takeaways: History Foundations of LLM
Neural Networks : 1940s/50s
Transformers/Attention: 2017
GPT3: 2020, GPT3.5: 2022
Tensorflow : 2015/ Pytorch 2016
- People have been hacking away at
ML/AI since the 1940s. Until GPUs, TPUs,
Cloud Infrastructure, very few
companies could do “Deep Learning”
- Deep Learning enabled great stuff in
vision, speech, and starts to generative
AI. It wasn’t until the Transformers paper,
that things took off.
- LLMs are good at predicting the “next
word” or token from a tokenverse given
an input.
- The quality / characteristics of the
prompt given, narrows down a
Document from a multiverse of
documents.
TPUs / GPT: 2018, GPT2: 2019
Everything Else: 2023 Q1/Q2
26
Thank you and Dream Big.
Hire us
- Design Workshops
- Innovation Sprints
- Service Catalog
Anant.us
- Read our Playbook
- Join our Mailing List
- Read up on Data Platforms
- Watch our Videos
- Download Examples

More Related Content

What's hot

How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...ssuser4edc93
 
Large Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdfLarge Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdfDavid Rostcheck
 
Generative AI, WiDS 2023.pptx
Generative AI, WiDS 2023.pptxGenerative AI, WiDS 2023.pptx
Generative AI, WiDS 2023.pptxColleen Farrelly
 
Large Language Models Bootcamp
Large Language Models BootcampLarge Language Models Bootcamp
Large Language Models BootcampData Science Dojo
 
Transformers, LLMs, and the Possibility of AGI
Transformers, LLMs, and the Possibility of AGITransformers, LLMs, and the Possibility of AGI
Transformers, LLMs, and the Possibility of AGISynaptonIncorporated
 
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve OmohundroOpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve OmohundroNumenta
 
ChatGPT Evaluation for NLP
ChatGPT Evaluation for NLPChatGPT Evaluation for NLP
ChatGPT Evaluation for NLPXiachongFeng
 
LanGCHAIN Framework
LanGCHAIN FrameworkLanGCHAIN Framework
LanGCHAIN FrameworkKeymate.AI
 
A brief primer on OpenAI's GPT-3
A brief primer on OpenAI's GPT-3A brief primer on OpenAI's GPT-3
A brief primer on OpenAI's GPT-3Ishan Jain
 
An Introduction to Generative AI
An Introduction  to Generative AIAn Introduction  to Generative AI
An Introduction to Generative AICori Faklaris
 
How do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for DevelopersHow do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for DevelopersIvo Andreev
 
ChatGPT, Foundation Models and Web3.pptx
ChatGPT, Foundation Models and Web3.pptxChatGPT, Foundation Models and Web3.pptx
ChatGPT, Foundation Models and Web3.pptxJesus Rodriguez
 
Responsible Generative AI
Responsible Generative AIResponsible Generative AI
Responsible Generative AICMassociates
 
And then there were ... Large Language Models
And then there were ... Large Language ModelsAnd then there were ... Large Language Models
And then there were ... Large Language ModelsLeon Dohmen
 

What's hot (20)

How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...
 
Large Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdfLarge Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdf
 
Generative AI, WiDS 2023.pptx
Generative AI, WiDS 2023.pptxGenerative AI, WiDS 2023.pptx
Generative AI, WiDS 2023.pptx
 
Large Language Models Bootcamp
Large Language Models BootcampLarge Language Models Bootcamp
Large Language Models Bootcamp
 
Transformers, LLMs, and the Possibility of AGI
Transformers, LLMs, and the Possibility of AGITransformers, LLMs, and the Possibility of AGI
Transformers, LLMs, and the Possibility of AGI
 
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve OmohundroOpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
 
ChatGPT Evaluation for NLP
ChatGPT Evaluation for NLPChatGPT Evaluation for NLP
ChatGPT Evaluation for NLP
 
The-CxO-Guide-to.pdf
The-CxO-Guide-to.pdfThe-CxO-Guide-to.pdf
The-CxO-Guide-to.pdf
 
OpenAI Chatgpt.pptx
OpenAI Chatgpt.pptxOpenAI Chatgpt.pptx
OpenAI Chatgpt.pptx
 
LLMs Bootcamp
LLMs BootcampLLMs Bootcamp
LLMs Bootcamp
 
LanGCHAIN Framework
LanGCHAIN FrameworkLanGCHAIN Framework
LanGCHAIN Framework
 
A brief primer on OpenAI's GPT-3
A brief primer on OpenAI's GPT-3A brief primer on OpenAI's GPT-3
A brief primer on OpenAI's GPT-3
 
An Introduction to Generative AI
An Introduction  to Generative AIAn Introduction  to Generative AI
An Introduction to Generative AI
 
Introduction to ChatGPT
Introduction to ChatGPTIntroduction to ChatGPT
Introduction to ChatGPT
 
How do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for DevelopersHow do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for Developers
 
ChatGPT, Foundation Models and Web3.pptx
ChatGPT, Foundation Models and Web3.pptxChatGPT, Foundation Models and Web3.pptx
ChatGPT, Foundation Models and Web3.pptx
 
Carol Scott - Fast Track Your AI Journey.pdf
Carol Scott - Fast Track  Your AI Journey.pdfCarol Scott - Fast Track  Your AI Journey.pdf
Carol Scott - Fast Track Your AI Journey.pdf
 
Responsible Generative AI
Responsible Generative AIResponsible Generative AI
Responsible Generative AI
 
ChatGPT ChatBot
ChatGPT ChatBotChatGPT ChatBot
ChatGPT ChatBot
 
And then there were ... Large Language Models
And then there were ... Large Language ModelsAnd then there were ... Large Language Models
And then there were ... Large Language Models
 

Similar to LLMs Inside Bootcamp Fundamentals

Train foundation model for domain-specific language model
Train foundation model for domain-specific language modelTrain foundation model for domain-specific language model
Train foundation model for domain-specific language modelBenjaminlapid1
 
Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical Dhruv Gohil
 
LangChain Intro by KeyMate.AI
LangChain Intro by KeyMate.AILangChain Intro by KeyMate.AI
LangChain Intro by KeyMate.AIOzgurOscarOzkan
 
Technologies for startup
Technologies for startupTechnologies for startup
Technologies for startupDzung Nguyen
 
Introduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVAIntroduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVARobert McDermott
 
Introduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVAIntroduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVARobert McDermott
 
"Running Open-Source LLM models on Kubernetes", Volodymyr Tsap
"Running Open-Source LLM models on Kubernetes",  Volodymyr Tsap"Running Open-Source LLM models on Kubernetes",  Volodymyr Tsap
"Running Open-Source LLM models on Kubernetes", Volodymyr TsapFwdays
 
Introducing Langsmith_ Your All-in-One Solution for Debugging, Testing, Evalu...
Introducing Langsmith_ Your All-in-One Solution for Debugging, Testing, Evalu...Introducing Langsmith_ Your All-in-One Solution for Debugging, Testing, Evalu...
Introducing Langsmith_ Your All-in-One Solution for Debugging, Testing, Evalu...Bluebash LLC
 
The Guide to becoming a full stack developer in 2018
The Guide to becoming a full stack developer in 2018The Guide to becoming a full stack developer in 2018
The Guide to becoming a full stack developer in 2018Amit Ashwini
 
Customizing LLMs
Customizing LLMsCustomizing LLMs
Customizing LLMsJim Steele
 
Dmdh winter 2015 session #1
Dmdh winter 2015 session #1Dmdh winter 2015 session #1
Dmdh winter 2015 session #1sarahkh12
 
DMDS Winter 2015 Workshop 1 slides
DMDS Winter 2015 Workshop 1 slidesDMDS Winter 2015 Workshop 1 slides
DMDS Winter 2015 Workshop 1 slidesPaige Morgan
 
Google cloud Study Jam 2023.pptx
Google cloud Study Jam 2023.pptxGoogle cloud Study Jam 2023.pptx
Google cloud Study Jam 2023.pptxGDSCNiT
 
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...Daniel Zivkovic
 
Crafting Your Customized Legal Mastery: A Guide to Building Your Private LLM
Crafting Your Customized Legal Mastery: A Guide to Building Your Private LLMCrafting Your Customized Legal Mastery: A Guide to Building Your Private LLM
Crafting Your Customized Legal Mastery: A Guide to Building Your Private LLMChristopherTHyatt
 
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...Mihai Criveti
 
Deprecating the state machine: building conversational AI with the Rasa stack...
Deprecating the state machine: building conversational AI with the Rasa stack...Deprecating the state machine: building conversational AI with the Rasa stack...
Deprecating the state machine: building conversational AI with the Rasa stack...PyData
 
Deprecating the state machine: building conversational AI with the Rasa stack
Deprecating the state machine: building conversational AI with the Rasa stackDeprecating the state machine: building conversational AI with the Rasa stack
Deprecating the state machine: building conversational AI with the Rasa stackJustina Petraitytė
 
Build an LLM-powered application using LangChain.pdf
Build an LLM-powered application using LangChain.pdfBuild an LLM-powered application using LangChain.pdf
Build an LLM-powered application using LangChain.pdfStephenAmell4
 
Build an LLM-powered application using LangChain.pdf
Build an LLM-powered application using LangChain.pdfBuild an LLM-powered application using LangChain.pdf
Build an LLM-powered application using LangChain.pdfAnastasiaSteele10
 

Similar to LLMs Inside Bootcamp Fundamentals (20)

Train foundation model for domain-specific language model
Train foundation model for domain-specific language modelTrain foundation model for domain-specific language model
Train foundation model for domain-specific language model
 
Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical
 
LangChain Intro by KeyMate.AI
LangChain Intro by KeyMate.AILangChain Intro by KeyMate.AI
LangChain Intro by KeyMate.AI
 
Technologies for startup
Technologies for startupTechnologies for startup
Technologies for startup
 
Introduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVAIntroduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVA
 
Introduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVAIntroduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVA
 
"Running Open-Source LLM models on Kubernetes", Volodymyr Tsap
"Running Open-Source LLM models on Kubernetes",  Volodymyr Tsap"Running Open-Source LLM models on Kubernetes",  Volodymyr Tsap
"Running Open-Source LLM models on Kubernetes", Volodymyr Tsap
 
Introducing Langsmith_ Your All-in-One Solution for Debugging, Testing, Evalu...
Introducing Langsmith_ Your All-in-One Solution for Debugging, Testing, Evalu...Introducing Langsmith_ Your All-in-One Solution for Debugging, Testing, Evalu...
Introducing Langsmith_ Your All-in-One Solution for Debugging, Testing, Evalu...
 
The Guide to becoming a full stack developer in 2018
The Guide to becoming a full stack developer in 2018The Guide to becoming a full stack developer in 2018
The Guide to becoming a full stack developer in 2018
 
Customizing LLMs
Customizing LLMsCustomizing LLMs
Customizing LLMs
 
Dmdh winter 2015 session #1
Dmdh winter 2015 session #1Dmdh winter 2015 session #1
Dmdh winter 2015 session #1
 
DMDS Winter 2015 Workshop 1 slides
DMDS Winter 2015 Workshop 1 slidesDMDS Winter 2015 Workshop 1 slides
DMDS Winter 2015 Workshop 1 slides
 
Google cloud Study Jam 2023.pptx
Google cloud Study Jam 2023.pptxGoogle cloud Study Jam 2023.pptx
Google cloud Study Jam 2023.pptx
 
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...
 
Crafting Your Customized Legal Mastery: A Guide to Building Your Private LLM
Crafting Your Customized Legal Mastery: A Guide to Building Your Private LLMCrafting Your Customized Legal Mastery: A Guide to Building Your Private LLM
Crafting Your Customized Legal Mastery: A Guide to Building Your Private LLM
 
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
 
Deprecating the state machine: building conversational AI with the Rasa stack...
Deprecating the state machine: building conversational AI with the Rasa stack...Deprecating the state machine: building conversational AI with the Rasa stack...
Deprecating the state machine: building conversational AI with the Rasa stack...
 
Deprecating the state machine: building conversational AI with the Rasa stack
Deprecating the state machine: building conversational AI with the Rasa stackDeprecating the state machine: building conversational AI with the Rasa stack
Deprecating the state machine: building conversational AI with the Rasa stack
 
Build an LLM-powered application using LangChain.pdf
Build an LLM-powered application using LangChain.pdfBuild an LLM-powered application using LangChain.pdf
Build an LLM-powered application using LangChain.pdf
 
Build an LLM-powered application using LangChain.pdf
Build an LLM-powered application using LangChain.pdfBuild an LLM-powered application using LangChain.pdf
Build an LLM-powered application using LangChain.pdf
 

More from Anant Corporation

QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137Anant Corporation
 
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdfKono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdfAnant Corporation
 
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache PinotData Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache PinotAnant Corporation
 
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...Anant Corporation
 
Machine Learning Orchestration with Airflow
Machine Learning Orchestration with AirflowMachine Learning Orchestration with Airflow
Machine Learning Orchestration with AirflowAnant Corporation
 
Cassandra Lunch 130: Recap of Cassandra Forward Talks
Cassandra Lunch 130: Recap of Cassandra Forward TalksCassandra Lunch 130: Recap of Cassandra Forward Talks
Cassandra Lunch 130: Recap of Cassandra Forward TalksAnant Corporation
 
Data Engineer's Lunch 90: Migrating SQL Data with Arcion
Data Engineer's Lunch 90: Migrating SQL Data with ArcionData Engineer's Lunch 90: Migrating SQL Data with Arcion
Data Engineer's Lunch 90: Migrating SQL Data with ArcionAnant Corporation
 
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...Anant Corporation
 
Cassandra Lunch 129: What’s New: Apache Cassandra 4.1+ Features & Future
Cassandra Lunch 129: What’s New:  Apache Cassandra 4.1+ Features & FutureCassandra Lunch 129: What’s New:  Apache Cassandra 4.1+ Features & Future
Cassandra Lunch 129: What’s New: Apache Cassandra 4.1+ Features & FutureAnant Corporation
 
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...Anant Corporation
 
Data Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data StackData Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data StackAnant Corporation
 
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergData Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergAnant Corporation
 
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOpsApache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOpsAnant Corporation
 
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache CassandraApache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache CassandraAnant Corporation
 
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Anant Corporation
 
Data Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise ConsciousnessData Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise ConsciousnessAnant Corporation
 
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data Platforms
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data PlatformsData Engineer's Lunch #81: Reverse ETL Tools for Modern Data Platforms
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data PlatformsAnant Corporation
 
Data Engineer’s Lunch #67: Machine Learning - Feature Selection
Data Engineer’s Lunch #67: Machine Learning - Feature SelectionData Engineer’s Lunch #67: Machine Learning - Feature Selection
Data Engineer’s Lunch #67: Machine Learning - Feature SelectionAnant Corporation
 

More from Anant Corporation (20)

QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
 
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdfKono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
 
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache PinotData Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot
 
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
 
YugabyteDB Developer Tools
YugabyteDB Developer ToolsYugabyteDB Developer Tools
YugabyteDB Developer Tools
 
Machine Learning Orchestration with Airflow
Machine Learning Orchestration with AirflowMachine Learning Orchestration with Airflow
Machine Learning Orchestration with Airflow
 
Cassandra Lunch 130: Recap of Cassandra Forward Talks
Cassandra Lunch 130: Recap of Cassandra Forward TalksCassandra Lunch 130: Recap of Cassandra Forward Talks
Cassandra Lunch 130: Recap of Cassandra Forward Talks
 
Data Engineer's Lunch 90: Migrating SQL Data with Arcion
Data Engineer's Lunch 90: Migrating SQL Data with ArcionData Engineer's Lunch 90: Migrating SQL Data with Arcion
Data Engineer's Lunch 90: Migrating SQL Data with Arcion
 
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
 
Cassandra Lunch 129: What’s New: Apache Cassandra 4.1+ Features & Future
Cassandra Lunch 129: What’s New:  Apache Cassandra 4.1+ Features & FutureCassandra Lunch 129: What’s New:  Apache Cassandra 4.1+ Features & Future
Cassandra Lunch 129: What’s New: Apache Cassandra 4.1+ Features & Future
 
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
 
Data Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data StackData Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data Stack
 
CL 121
CL 121CL 121
CL 121
 
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergData Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
 
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOpsApache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
 
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache CassandraApache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
 
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
 
Data Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise ConsciousnessData Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise Consciousness
 
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data Platforms
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data PlatformsData Engineer's Lunch #81: Reverse ETL Tools for Modern Data Platforms
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data Platforms
 
Data Engineer’s Lunch #67: Machine Learning - Feature Selection
Data Engineer’s Lunch #67: Machine Learning - Feature SelectionData Engineer’s Lunch #67: Machine Learning - Feature Selection
Data Engineer’s Lunch #67: Machine Learning - Feature Selection
 

Recently uploaded

Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Hyundai Motor Group
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 

Recently uploaded (20)

Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 

LLMs Inside Bootcamp Fundamentals

  • 1. NoCode, Data & AI LLM Inside Bootcamp Fundamentals of LLM What is a large language model, how is it trained, how are different from traditional machine learning models. Rahul Xavier Singh Anant Corporation Nocode Data & AI
  • 2. To most , LLMs seem like magic. In computing & technology, LLMs show great promise in bridging the gap between human computer interaction.
  • 4. NoCode, Data & AI LLM Inside Bootcamp with Cassandra Full day bootcamp to familiarize product managers, software professionals, and data engineers to creating next generation experts, assistants, and platforms powered by Generative AI with Large Language Models (LLM, OpenAI, GPT) Rahul Xavier Singh Anant Corporation Nocode Data & AI kono.io/bootcamp
  • 5. Agenda ● I: Strategy & Theory ● II: LLM Design Patterns ● III: NoCode/Code LLM Stacks ● IV: Build a Custom ChatBot with LLM your Data
  • 6. Today’s Agenda 1. Fundamentals of ML 2. Transformers Architecture 3. How LLMs Work 4. LLMs other than ChatGPT/GPT
  • 7. Fundamentals of ML/Transformers ● History of LLMs (Large Language Models) ● What is Machine Learning / AI? ● Transformer Architecture
  • 8. History of Large Language Models 1. Everything before GPT-3 (2020) was trash. 2. ChatGPT made GPT-3 popular. 3. Now everyone wants in on the party. https://voicebot.ai/large-language-models -history-timeline/ Most of the hype, growth relating to LLMs have happened in the last 6 months ( November 2022 till now , May 2023
  • 9. Machine Learning in a Nutshell https://www.avenga.com/magazine/machi ne-learning-programming/ 1. In machine learning, the computer trains on your data, and gives you the most likely answer. The better the data, the better the algorithm. 2. Neural networks process input data through layers to predict outcomes based on patterns and relationships learned during training.
  • 10. What can Neural Neworks do? https://thedatascientist.com/wp-content/uploads/2018/03/Deep-Neural- Network-What-is-Deep-Learning-Edureka.png 1. Artificial neural networks (ANN) can recognize patterns and relationships in data. 2. They can classify and categorize data accurately. 3. They can make predictions based on input data. 4. Neural networks can be used for image and speech recognition. 5. Deep neural network is an ANN that has many layers and can do more complex predictions. 6. They can be trained to improve their accuracy over time. https://www.analyticsvidhya.com/blog/202 1/05/convolutional-neural-networks-cnn/
  • 11. What is the big deal about Transformers? 1. Because ANNs are implementations in matrix math - and that relates to the Matrix of Leadership … 2. Transformers improve natural language processing, enabling better chatbots and language translation tools. 3. Transformers are a neural network architecture that outperforms previous models on various NLP tasks. 4. Attention mechanisms in Transformers better model long-term dependencies in sequential data. 5. Transformers are a hardware accelerator that speeds up AI computations by several orders of magnitude. 6. Transformers were invented by Elon Musk The encoder-decoder structure of the Transformer architecture Taken from “Attention Is All You Need“
  • 12. How LLMs Work & What LLMs Do ● Transformers Decoder/Encoder ● What LLMs Do: Predict Words ● What LLMs Do: Narrow Possibilities ● What LLMs Do: Verse Jumping ● What LLMs Do: Document Construction
  • 13. How does a Large Language Model Work? 1. The transformer architecture consists of two components: the encoder and decoder. 2. The encoder processes the input sequence and generates embeddings through self-attention mechanisms. 3. The decoder takes the encoder's embeddings as input and generates an output sequence, while also using self-attention mechanisms to attend to relevant parts of the input sequence. 4. Together, they enable the transformer to learn complex patterns and relationships within sequences, making it a powerful tool for natural language processing and other sequence modeling tasks. The encoder-decoder structure of the Transformer architecture Taken from “Attention Is All You Need“
  • 14. What LLMs Do: Predict Words 1. A language model uses deep learning algorithms to learn patterns and relationships in large sets of text data. 2. It is trained on a large corpus of text, such as books, articles, and websites, to recognize and understand the underlying structure and meaning of language. 3. Once trained, the model can generate new text based on the input it receives, by predicting the most likely sequence of words to follow. 4. The model uses a probabilistic approach to generate text, allowing it to produce diverse and creative responses to different inputs. 5. LLMs have a wide range of applications, including language translation, chatbots, content creation, and more. https://vectara.com/avoiding-hallucinations-in-llm-powered-applications/
  • 15. What LLMs Do: Narrow Possibilities 1. A LLM is like a really smart guesser that's been trained on a lot of text. 2. When you give it a prompt, it starts guessing what the next word might be. 3. Instead of guessing randomly, it predicts the best possible word. 4. As you add words to your prompt, you are narrowing down the overall “document” you get back.
  • 16. What LLMs Do: Verse Jumping 1. It’s a simulator of the real world, but it isn’t a real world. Each prompt is a portal to a a possible realistic universe. 2. It contains probabilities of words or tokens from the tokenverse strung together which we can call a “Document” 3. As you give it more words, the universe of possible “Documents” reduces. https://now.tufts.edu/2022/05/31/exploring-shape- our-universe-and-multiverse
  • 17. What LLMs Do: Document Construction 1. Each model has a “tokenverse” which it picks words from. GPT4 has 100k tokens. 2. Document A & Document B are possible path through all of the tokens in the tokenverse for a particular model. 3. If you start with certain words, a Prompt A’, the possibility of getting Document A increases 4. A’ B’
  • 18. LLMs other than ChatGPT/GPT ● Popular LLMs Available ● Popular Open Source LLMs Available ● Cloud Providers LLM Offerings
  • 19. Popular Public LLMs Available Today 1. OpenAI: ChatGPT, GPT3.5-Turbo, Text-Davinci-003, GPT4 (Waitlist) 2. Anthropic: Claude, Claude-Instant 3. Cohere: Baseline, allows training https://vectara.com/top-large-language-models-llms-gpt-4-llama-g ato-bloom-and-when-to-choose-one-over-the-other/ If you are starting out, just use GPT-3.5 Turbo. It’s easy to get access to, and there are lots of code examples on Github
  • 20. Leaked @Google: “We Have No Moat…” “We Have No Moat, And Neither Does OpenAI" https://lmsys.org/blog/2023-03-30-vicuna/ https://www.semianalysis.com/p/google-we-have- no-moat-and-neither ● Meta LLaMa Open Sourced ● GPT Answers used to Train ● LoRA - Low rank adaptation ● Retraining models is hard ● Small models iterating better ● Data quality scales better ● Battling open source means failure ● Companies need users / researchers ● Individuals can use different licenses ● Be your customer ● Let open source do the work ● OpenAI no different than Google
  • 21. Example Open LLM: Stanford Alpaca https://crfm.stanford.edu/2023/03/13/alpaca.html https://lmsys.org/blog/2023-03-30-vicuna/
  • 22. Popular Open LLMs Available Today Leaderboard 1. Vicuna-13b 2. Koala-13b 3. Oast-pythia-12b Others to Look into 4. StableLM 5. Dolly 6. ChatGLM https://chat.lmsys.org/ If you don’t want to send your data to a public LLM, you can host your own open model, or use Azure OpenAI, Amazon Bedrock
  • 23. Cost of Fine Tuning: Alpaca/Vicuna https://lmsys.org/blog/2023-03-30-vicuna/
  • 24. Public Cloud Offerings of LLM 1. Azure OpenAI 2. Amazon Bedrock 3. NVidia NeMo 4. Google Vertex (batteries not Included) https://venturebeat.com/ai/amazon-launches-bedrock-for-generative -ai-escalating-ai-cloud-wars/ Azure OpenAI is the most mature, and probably the best. Amazon’s Bedrock offers managed hosting of Claude, StableLM, etc. Google’s offering requires work to get it to work.
  • 25. 25 Key Takeaways: History Foundations of LLM Neural Networks : 1940s/50s Transformers/Attention: 2017 GPT3: 2020, GPT3.5: 2022 Tensorflow : 2015/ Pytorch 2016 - People have been hacking away at ML/AI since the 1940s. Until GPUs, TPUs, Cloud Infrastructure, very few companies could do “Deep Learning” - Deep Learning enabled great stuff in vision, speech, and starts to generative AI. It wasn’t until the Transformers paper, that things took off. - LLMs are good at predicting the “next word” or token from a tokenverse given an input. - The quality / characteristics of the prompt given, narrows down a Document from a multiverse of documents. TPUs / GPT: 2018, GPT2: 2019 Everything Else: 2023 Q1/Q2
  • 26. 26 Thank you and Dream Big. Hire us - Design Workshops - Innovation Sprints - Service Catalog Anant.us - Read our Playbook - Join our Mailing List - Read up on Data Platforms - Watch our Videos - Download Examples