Practical GenAI: Understanding Large Language Models (LLMs)
10 Limitations of LLMs and mitigation options
Mihai Criveti, Principal Architect, CKA, RHCA III
September 10, 2023
1. Hallucinations
2. Performance
3. Inference Cost
4. Stale training data
5. Use with private data
6. Token limits / context window size
7. LLMs only support plain text
8. Lack of transparency / explainability
9. Ethical Concerns
10. Training and fine tuning costs
Introduction
Mihai Criveti, Principal Architect, Platform Engineering
• Responsible for large scale Cloud Native and AI Solutions
• Red Hat Certified Architect III, CKA/CKS/CKAD
• Drives the development of Inner Source Retrieval Augmented Generation platforms and solutions for
Generative AI at IBM that leverage WatsonX, vector databases, LangChain, HuggingFace and open source AI
models.
Abstract
10 Limitations of Large Language Models and ways to overcome them. Dealing with hallucinations, performance,
costs, stale training data, injecting private data, token limits and contextual memory, text conversion, lack of
transparency, ethical concerns and training costs.
1. Hallucinations
Because models are designed to produce coherent and fluent text, LLMs can ‘hallucinate’ and generate text that is
incorrect, but often seems plausible.
A lack of context, or of contextual understanding of the input prompt, is a key reason why LLMs hallucinate.
Why hallucinations occur
Lack of context or contextual understanding
• The input prompt is contradictory, or unclear
• The prompt does not provide sufficient examples of the desired output
• The model lacks the context to respond to the input, either in its dataset or in the prompt
Data Quality and Training Method
• The model itself has been trained on biased, noisy, old, low-quality or incorrect data
• For example, training data scraped from Twitter or various forums often contains large amounts of incorrect
information
Generation Method
• Models and their weights might be biased towards specific languages, words or data
Hallucination Workarounds
Workarounds include advanced prompt engineering
• Adding a prompt such as: If a question does not make any sense, or is not factually
coherent, explain why instead of answering something not correct.
• Provide examples using one-shot prompting or few-shot prompting (see the sketch after this list)
And forms of Retrieval Augmented Generation
• Context injection and grounding to use-case-specific sources
• More advanced methods such as Retrieval Augmented Generation using a Vector Database
• Internet or API retrieval connectors and ‘plugins’
Other workarounds
• Using a model that performs better at the given task, or fine-tuning the model
• Testing the quality of responses, and providing an alternative model / answer
• Reinforcement learning from human feedback (RLHF).
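To make the few-shot prompting bullet above concrete, here is a minimal sketch in plain Python; the generate() call at the end is a hypothetical stand-in for whichever model client you actually use:

# A minimal few-shot prompt: a couple of worked examples show the model the
# expected format and reduce the room for invented or off-topic answers.
EXAMPLES = [
    ("Classify the sentiment: 'The battery lasts two full days.'", "positive"),
    ("Classify the sentiment: 'The screen cracked within a week.'", "negative"),
]

def build_few_shot_prompt(question: str) -> str:
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in EXAMPLES)
    return f"{shots}\n\nQ: {question}\nA:"

prompt = build_few_shot_prompt("Classify the sentiment: 'Support never replied to me.'")
# answer = generate(prompt)  # hypothetical call to your model of choice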
Hallucination Workarounds: Prompting
LLAMA2 Prompt
“You are a helpful, respectful and honest assistant. Always answer as helpfully as
possible, while being safe. Your answers should not include any harmful, unethical,
racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses
are socially unbiased and positive in nature. If a question does not make any sense, or
is not factually coherent, explain why instead of answering something not correct. If
you don't know the answer to a question, please don't share false information.”
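A minimal sketch of how such a system prompt is typically wrapped in the LLAMA2 chat template (single turn; exact special-token handling depends on your tokenizer and serving stack):

# Wrap a system prompt and a user message in the Llama 2 chat format.
# Note: the BOS token (<s>) is usually added by the tokenizer, so it is
# omitted from the string here.
SYSTEM_PROMPT = (
    "You are a helpful, respectful and honest assistant. "
    "If a question does not make any sense, or is not factually coherent, "
    "explain why instead of answering something not correct. "
    "If you don't know the answer to a question, please don't share false information."
)

def llama2_prompt(user_message: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    return f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message} [/INST]"

print(llama2_prompt("Who won the 2030 World Cup?"))
# A well-behaved model should explain that it cannot know this,
# rather than inventing a result.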
Hallucination Workarounds: Retrieval Augmented Generation
Figure 1: RAG
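The flow in Figure 1 can be sketched in a few lines. This is only a toy: word overlap stands in for embedding similarity, and in practice you would use a real embedding model and a vector database; the generate() call is again a hypothetical stand-in:

# Toy Retrieval Augmented Generation loop: retrieve the most relevant chunks,
# then inject them into the prompt as grounding context.
DOCUMENTS = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday to Friday, 9am to 5pm GMT.",
    "Premium subscribers get priority email support.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_rag_prompt(question: str) -> str:
    context = "\n".join(f"- {c}" for c in retrieve(question, DOCUMENTS))
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

print(build_rag_prompt("How long do I have to return a product?"))
# answer = generate(build_rag_prompt(...))  # hypothetical model call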
2. Performance
Concerns
• Even the fastest models are slower than a dial-up modem, or a fast typist!
• They also suffer from latency, or time to first token.
• For most queries, expect 10–20 second response times, and even with streaming, you’ll end up waiting a few
seconds for the first token to be generated!
Workarounds
• Throw money & hardware at the problem: more GPUs
• Use smaller models
• Generate fewer tokens
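A minimal sketch of the “smaller model, fewer tokens” workarounds using Hugging Face transformers, with streaming so users see output sooner; the model name is only an example:

# Stream tokens as they are generated and bound the output length, so users
# start seeing text immediately instead of waiting for the full completion.
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_name = "gpt2"  # example only; substitute the model you actually serve
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Large language models are slow because", return_tensors="pt")
streamer = TextStreamer(tokenizer, skip_prompt=True)  # print tokens as they arrive

# max_new_tokens caps generation cost; fewer tokens means a faster response.
model.generate(**inputs, streamer=streamer, max_new_tokens=64)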
3. Inference Cost
Concerns
• LLMs are expensive to run!
• Some of the top 180B-parameter models may need as many as 5× A100 GPUs to run, while even quantized
versions of 70B LLAMA take up a whole GPU! And that’s one query at a time.
• The costs add up. For example, a dedicated A100 might cost as much as $20K a month with a cloud provider! A
brute force approach is going to be expensive.
Workarounds
• Use a quantized model - it trades off output quality for performance. 8-bit, 6-bit or even 4-bit quantization will
help you fit models into smaller, cheaper GPU vRAM, or use fewer GPUs.
• Use a smaller model: a quality, fine-tuned 13B model may perform well enough for tasks such as summarization.
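A minimal sketch of the quantization workaround using Hugging Face transformers with bitsandbytes; it assumes a CUDA GPU with the bitsandbytes and accelerate packages installed, and the model name is only an example (it may require access approval on the Hugging Face Hub):

# Load a model in 4-bit to fit it into a smaller / cheaper GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "meta-llama/Llama-2-13b-chat-hf"  # example 13B model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                  # trades some output quality for memory
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",                  # spread layers across available GPUs
)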
4. Stale training data
Concern
• Even top models haven’t been trained on ‘recent’ data, and have a cut-off date. Remember, a model doesn’t
‘have access to the internet’.
• While certain ‘plugins’ do offer ‘internet search’, it’s just a form of RAG, where ‘top 10 internet search query
results’ are fed into the prompt as context, for example.
Workarounds
• Using a more recent model
• Retraining the model
• Fine tuning
• Retrieval Augmented Generation
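As a sketch of the “internet search is just RAG” point above, the pattern is simply to paste the top results into the prompt as context; search_web() below is a hypothetical placeholder for whatever search API or connector you use:

# 'Internet search as RAG': feed the top search results into the prompt.
def search_web(query: str, top_k: int = 10) -> list[str]:
    """Hypothetical connector returning text snippets for the query."""
    raise NotImplementedError("plug in your search API here")

def prompt_with_search(question: str) -> str:
    snippets = search_web(question)
    context = "\n".join(f"- {s}" for s in snippets)
    return (
        "Using the search results below, answer the question and say which "
        f"result you used.\n\nSearch results:\n{context}\n\nQuestion: {question}"
    )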
5. Use with private data
LLMs haven’t been trained on your private data, and as such cannot answer questions based on your dataset, unless
that data is injected through fine tuning or some form of prompt engineering, including RAG.
6. Token limits / context window size
Concern
• Models are limited by the TOKEN_LIMIT, and most models can process, at best, a few pages of total input/output.
• This means you can’t just feed a model an entire document and ask for a summary, or ask it to extract facts from
the document.
Workaround
• You need to chunk documents into pages first, and perform multiple queries.
• Use a model with a larger token limit.
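A minimal sketch of the chunk-and-query workaround; the characters-per-token heuristic and the generate() helper are assumptions, so use a real tokenizer and your own model client in practice:

# Split a long document into token-bounded chunks, summarize each chunk, then
# summarize the partial summaries (a simple map-reduce pass).
def generate(prompt: str) -> str:
    """Hypothetical stand-in for whichever LLM client you use."""
    raise NotImplementedError("plug in your model call here")

def chunk_text(text: str, max_tokens: int = 1500) -> list[str]:
    max_chars = max_tokens * 4  # rough heuristic: ~4 characters per token
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def summarize_long_document(text: str) -> str:
    partials = [generate(f"Summarize:\n\n{chunk}") for chunk in chunk_text(text)]
    return generate("Combine these partial summaries into one:\n\n" + "\n\n".join(partials))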
7. LLMs only support plain text
Concern
• While this sounds obvious (from the name), it also means you can’t just feed a PDF file or Word document to an
LLM. You first need to convert that data to text, and chunk it to fit in the token limit, alongside your prompt and
some room for output.
• Conversion to text isn’t perfect. What happens to your images, tables, or metadata? It also means models can
only output text. Formatting the output as HTML, DOCX or other rich text formats requires a lot of heavy lifting
in your pipeline.
Mitigation
• Having a good data processing pipeline
• Multi-model approaches
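For example, the first stage of such a pipeline might extract plain text from a PDF before chunking it for the model; a sketch using the pypdf package (images and complex tables will not survive this step and need separate handling):

# Extract plain text from a PDF so it can be chunked and fed to the model.
from pypdf import PdfReader

def pdf_to_text(path: str) -> str:
    reader = PdfReader(path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)

text = pdf_to_text("report.pdf")  # example file name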
8. Lack of transparency / explainability
Concern
• Why did the model generate a particular answer? LLMs don’t explain their reasoning, and an answer may not
necessarily be correct.
• You can, however, display the source content that helped generate that answer.
Mitigation
• Content grounding
• Techniques such as RAG can help, as you are able to point at the ‘context’ that generated a particular answer,
and even display the context.
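A minimal sketch of surfacing the grounding context together with the answer; retrieve and generate are hypothetical stand-ins for your retriever and model client:

# Return the retrieved chunks alongside the answer so users can inspect the
# sources that grounded the response.
def answer_with_sources(question: str, retrieve, generate) -> dict:
    sources = retrieve(question)  # chunks used as grounding context
    numbered = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(sources))
    answer = generate(
        "Answer using only the numbered context below and cite it like [1].\n\n"
        f"{numbered}\n\nQuestion: {question}"
    )
    return {"answer": answer, "sources": sources}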
9. Ethical Concerns
Concerns
Potential bias, hate, abuse, harm and other ethical concerns: sometimes, answers generated by an LLM can be outright
harmful. Using the RAG pattern, in addition to HARM filters, can help mitigate some of these issues.
Mitigation
• Using open source models with known data lineage
• HARM filters
• Governance frameworks
• Content grounding
• Reinforcement learning from human feedback (RLHF)
10. Training and fine tuning costs
Concern
The “Training Hardware & Carbon Footprint” section of the LLAMA2 paper reports that a total of 3,311,616 GPU hours
was used to train the LLAMA2 family (7B, 13B, 34B and 70B)!
To put it in perspective, a 70B model like LLAMA2 might need ~2048 A100 GPUs for a month to train, adding up to a
$20–40M training cost, not to mention what it takes to download and store the data.
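As a rough sanity check on these figures, the per-hour rate below is an assumed, illustrative cloud price rather than a quoted one:

# Back-of-the-envelope compute cost for the published GPU-hour total.
total_gpu_hours = 3_311_616          # all Llama 2 sizes combined, per the paper
for rate_usd in (2, 4):              # assumed $/A100-hour; actual rates vary widely
    print(f"${rate_usd}/GPU-hour -> ${total_gpu_hours * rate_usd / 1e6:.1f}M for compute alone")
# End-to-end programme cost (experiments, failed runs, data collection, storage,
# staff) is substantially higher than the raw compute figure.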
Workaround
• Don’t train your own model: use a pre-trained model
• Open Source and Open Innovation: share learnings and training data, rather than having proprietary models.
Contact
This talk can be found on GitHub
• https://github.com/crivetimihai/overcome-llm-limitations
Social media
• https://twitter.com/CrivetiMihai - follow for more LLM content
• https://youtube.com/CrivetiMihai - more LLM videos to follow
• https://www.linkedin.com/in/crivetimihai/
  • 28. Contact This talk can be found on GitHub • https://github.com/crivetimihai/overcome-llm-limitations Social media • https://twitter.com/CrivetiMihai - follow for more LLM content • https://youtube.com/CrivetiMihai - more LLM videos to follow • https://www.linkedin.com/in/crivetimihai/ 18