H2O.ai Confidential
PASCAL PFEIFFER
Principal Data Scientist, H2O.ai
#4 KGM
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse
Beyond Current Limitations
How Can You Use LLMs?
“Everyone needs to use GenAI”
…, but where do I start?
Some Good Use-Cases
If you haven’t already started exploring
Retrieval Augmented Generation
● Answers are generated with additional context from an external knowledge database.
● LLM answers are grounded in the retrieved knowledge/facts.
● LLMs do not need to be retrained when new knowledge is added or modified.
● Tools: H2OGPT / LangChain
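The retrieval augmented generation flow can be sketched in a few lines. This is a minimal illustration, not how H2OGPT or LangChain implement it: the keyword-overlap scoring is a stand-in for a real embedding search, and the function names (`retrieve`, `build_prompt`) and the tiny knowledge base are assumptions made up for the example.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query: str, passage: str) -> int:
    """Count shared tokens; a crude stand-in for embedding similarity."""
    q = Counter(tokenize(query))
    p = Counter(tokenize(passage))
    return sum(min(q[t], p[t]) for t in q)


def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages that overlap most with the query."""
    ranked = sorted(passages, key=lambda p: overlap_score(query, p), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Ground the answer by placing the retrieved passages in the prompt."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{joined}\n"
        f"Question: {query}\nAnswer:"
    )


# Hypothetical knowledge base; adding a new fact only means appending here,
# the model itself is not retrained.
knowledge_base = [
    "LoRA adds low-rank adapter matrices to frozen model weights.",
    "The cafeteria opens at 8 am on weekdays.",
    "DPO trains directly on accepted and rejected answer pairs.",
]
question = "When does the cafeteria open?"
top = retrieve(question, knowledge_base, k=1)
prompt = build_prompt(question, top)
```

The resulting `prompt` would then be sent to whichever LLM is in use; that call is left out because it depends on the serving stack.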
Summarization
● Business-related summaries can be a disruptive use-case. Every meeting of a company can be recorded, transcribed, and then summarized using an LLM.
● Needs long-context models or iterative reduction techniques.
● Intrinsically grounded LLM output. Residual risk of misinterpretations.
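One way to read "iterative reduction" is: split the transcript into chunks that fit the context window, summarize each chunk, then summarize the joined summaries until the text is short enough. The sketch below assumes exactly that; `summarize` is a hypothetical callable standing in for an LLM call, and the character limits are illustrative rather than taken from the slides.

```python
from typing import Callable


def chunk_text(text: str, max_chars: int) -> list[str]:
    """Split text on whitespace into chunks of at most max_chars characters
    (a single word longer than max_chars becomes its own chunk)."""
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > max_chars and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def reduce_summarize(
    text: str,
    summarize: Callable[[str], str],
    max_chars: int = 200,
    max_rounds: int = 10,
) -> str:
    """Summarize chunk by chunk until the text fits into one call."""
    for _ in range(max_rounds):  # guard in case summaries stop shrinking
        if len(text) <= max_chars:
            break
        parts = chunk_text(text, max_chars)
        text = " ".join(summarize(part) for part in parts)
    return summarize(text)


def first_words(text: str) -> str:
    """Toy summarizer for demonstration only: keep the first five words."""
    return " ".join(text.split()[:5])


transcript = " ".join(["word"] * 200)
summary = reduce_summarize(transcript, first_words)
```

In practice `first_words` would be replaced by a prompt such as "Summarize the following meeting transcript" sent to the model.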
Creative Writing
● Promising area of interest for short
texts.
● Limited in creativity and novelty. Risk of
hallucination as it is challenging to
ground the generated text.
● LLM as a coworker/copilot.
Check out genai.h2o.ai for more inspiration.
Writing Helper
From simple spell and grammar checking up to professional rephrasing.
Function calling
Broad applications when LLMs are trained to call APIs. Usage of external tools (simple calculator or browsing the internet). Translate natural language to SQL queries.
Classification / Regression
Use LLMs for classical NLP use-cases such as classification and regression tasks. Can also be retrieval augmented.
Coding
Writing code will be a valuable task for LLMs, especially for small, fine-grained tasks. This will enable a lot more software engineers to be 10x, or the already 10x to be 100x.
Autonomous Agents
Fully automate repetitive workflows. Make use of function calling and natural language interaction. Requires LLM memory management.
… Much More
There is more.
The use-cases are arranged along a "Complexity" axis.
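Function calling, as described in the list of use-cases, usually means the model emits a structured request that the application executes. A minimal sketch follows, assuming the model returns a JSON object with `name` and `arguments` keys; that schema, the tool registry, and the hard-coded `model_output` string are all assumptions for illustration, not a specific vendor's API.

```python
import json


def add(a: float, b: float) -> float:
    """Simple calculator tool: addition."""
    return a + b


def multiply(a: float, b: float) -> float:
    """Simple calculator tool: multiplication."""
    return a * b


# Registry of tools the application is willing to run for the model.
TOOLS = {"add": add, "multiply": multiply}


def dispatch(model_output: str):
    """Parse the model's JSON tool call and execute the matching function."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


# Hypothetical model response to the question "What is 17 x 24?"
model_output = '{"name": "multiply", "arguments": {"a": 17, "b": 24}}'
result = dispatch(model_output)  # 408
```

Restricting execution to an explicit registry keeps the model from calling anything the application did not expose.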
“How do I …?”
“Summarize!”
“What are …?”
Technical Ways to use LLMs
Prompt engineering
Best practice:
● Be very detailed in your instructions.
● Use prompts with all relevant information.
● Ask for a quality answer.
● Refine your prompt.
Consider:
● Bias from prompt and from model.
● Limited to training knowledge.
● Hallucinations.
● Raw prompt engineering may struggle in certain use-cases.
Add context
Best practice:
● Retrieve relevant context.
● Context placement.
● Explore few-shot prompting.
Consider:
● Usually has a sweet spot for the amount of context (depends on the use case; start with 5 paragraphs of 512 characters each).
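The starting point of 5 paragraphs of 512 characters each, combined with few-shot examples, can be written as a small prompt builder. This is a sketch under assumptions: the function name, the `[n]` numbering of paragraphs, and placing context before the examples are illustrative choices, and the best placement is something to experiment with.

```python
def build_context_prompt(
    question: str,
    paragraphs: list[str],
    examples: tuple = (),
    max_paragraphs: int = 5,
    max_chars: int = 512,
) -> str:
    """Assemble a prompt from truncated context, few-shot pairs and the question."""
    # Keep at most max_paragraphs paragraphs, each cut to max_chars characters.
    context = [p[:max_chars] for p in paragraphs[:max_paragraphs]]
    lines = ["Use the context to answer the question.", "", "Context:"]
    lines += [f"[{i}] {p}" for i, p in enumerate(context, start=1)]
    # Few-shot examples show the model the expected answer format.
    for example_question, example_answer in examples:
        lines += ["", f"Question: {example_question}", f"Answer: {example_answer}"]
    lines += ["", f"Question: {question}", "Answer:"]
    return "\n".join(lines)


demo_prompt = build_context_prompt(
    "What is the refund period?",
    ["x" * 600] * 7,  # seven overly long paragraphs
    examples=[("Q1", "A1")],
)
```

Changing `max_paragraphs` and `max_chars` per use case is the knob for finding the sweet spot.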
Fine-tuning
Best practice:
● Build a proper evaluation first.
● Start with small models.
● Data is important.
● Use H2O LLM Studio to quickly iterate.
Consider:
● Needs training data.
● Beware of fine-tuning for factuality.
● Can be great for specific use-cases.
● No prompt engineering needed.
RLHF
Best practice:
● Model must be fine-tuned with in-distribution data first.
● PPO: requires a lot of high-quality data to train a good reward model.
● DPO: loss is calculated directly on accept/reject pairs.
Consider:
● Next token loss is replaced by a scalar.
● Very research oriented. Proven to be useful for chat alignment.
● PPO (with reward model) or DPO (without reward model).
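To make "loss is calculated directly on accept/reject pairs" concrete, here is the published DPO loss for a single pair, written with plain floats. The inputs are summed log-probabilities of the accepted and rejected answers under the model being trained (policy) and under a frozen reference model; `beta=0.1` is a commonly used value and an assumption here, not a recommendation from the slides.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """DPO loss for one accept/reject pair: -log(sigmoid(beta * margin))."""
    # How much more the policy likes each answer than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) == log(1 + exp(-x)); fine for moderate x in a sketch.
    return math.log1p(math.exp(-logits))


# Policy identical to the reference: no preference learned yet.
neutral = dpo_loss(-2.0, -2.0, -2.0, -2.0)
# Policy raised the accepted answer and lowered the rejected one.
improved = dpo_loss(-1.0, -5.0, -2.0, -2.0)
```

No separate reward model appears anywhere in the computation, which is the contrast with PPO drawn above.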
The approaches are listed in order of increasing complexity. Always max out the options above before moving on (or already have training data).
Automate Information Flow in Your Company
Diagram labels: Meetings; Logs & Documents; Demos & Presentations; LLM summarization; Business Intelligence; Expand Knowledge
Opportunities
● Identify work duplication
● Find new synergies
● Improve staff motivation by providing up-to-date information
● Personalized updates on any related tasks
● Discover trends with summaries on summaries
Transcript Summary
Let’s fine-tune our own summarization model.
Solve with Prompt Engineering
In this talk, Andrew Ng discusses how AI can empower businesses of all sizes. He compares the rise of AI to the rise of literacy, explaining that just as literacy was once
limited to a few, AI is currently concentrated in the hands of highly skilled engineers at big tech companies. Ng believes that society can benefit from enabling everyone
to contribute to AI development. He explains that AI projects have been expensive to build, requiring skilled engineers and significant investment. Large tech companies
have been successful in making these investments pay off because they have millions or billions of users to generate revenue from their AI systems. However, this model
does not work for smaller businesses that do not have the same scale or resources. Ng gives examples of how AI can be beneficial to small businesses, such as a pizza
store owner using AI to improve sales by analyzing data and spotting patterns. He emphasizes that AI can work with modest amounts of data and that the real challenge
is the cost of hiring an AI team for small businesses. Ng proposes that the solution lies in new AI development platforms that focus on providing data rather than writing
code. These platforms allow individuals to upload data and train AI systems without extensive coding knowledge. He gives an example of an inspector using a platform to
teach an AI system how to detect defects in fabric. Ng acknowledges that these platforms are still evolving but believes they will become more accessible in the future.
He concludes by stating that democratizing access to AI is crucial for spreading wealth across society. Ng envisions a future where every accountant, store manager,
buyer, and quality inspector can build their own AI systems. He compares the impact of democratizing AI to the impact of widespread literacy, emphasizing the exciting
possibilities it holds for the future.
Solve with Prompt Engineering
Sure! Here is a summary of Andrew Ng's talk "How AI could empower any business" in a single paragraph:
Andrew Ng discusses the current state of AI, where it is primarily controlled by a select few "high priests
and priestesses" in big tech companies, and how this limits the potential benefits of AI for society as a
whole. He argues that AI has the potential to create tremendous wealth, but only if it is democratized and
made accessible to everyone. To achieve this, he suggests that we need to shift the focus from asking
people to write lots of code to providing data, and highlights the emergence of new AI development
platforms that make this possible. He believes that empowering everyone to build AI systems will lead to
an incredibly exciting future, where small businesses and individuals can use AI to improve their operations
and create new opportunities.
Solve with Prompt Engineering
Sure! Here's the summary of Andrew Ng's talk "How AI could empower any business" in a single paragraph:
Andrew Ng discusses the potential of AI to empower any business, not just large tech companies. He
highlights the long-tail of AI projects that are valuable but not being addressed, and the need for a
democratic approach to AI development. Ng introduces a new way to build AI systems that focuses on
providing data rather than writing code, making it easier for small businesses and individuals to participate.
He illustrates the potential of these platforms with examples of T-shirt demand forecasting and quality
control, and emphasizes the importance of empowering everyone to build AI systems for themselves to
create a more equitable distribution of wealth.
What do I Tune First?
So many hyperparameters
Prompt
Maximize baseline score with prompt engineering first. Choose a maximum token length such that most or all samples fit.
Evaluate Often
Evaluate on sub-epochs to see if your model saturates quickly and overfits. Manually inspect the output and test some hand-crafted samples.
Learning Rate
Even with (Q)LoRA, learning rate still has a significant impact on the training. Start with 0.0001 and AdamW.
Modify Prompt
Spend quality time on what the model is seeing. Add context? Name, Title, Text. Experiment with order and delimiters.
LoRA
Start with LoRA rank 4, alpha 16, no dropout. Apply to all linear layers. Decrease/increase LoRA rank & alpha (2-4 * rank), add a bit of dropout.
Scale Up
Only very late, try to scale up to larger models or more (diverse) training data. Also try to scale down!
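To see what rank and alpha control, the following sketch applies a LoRA update to one linear layer using plain Python lists. In standard LoRA the frozen weight W is augmented by (alpha / rank) * B A, with B initialized to zero. The helper names and the toy matrix sizes are assumptions for the example; a real run would use a training framework such as H2O LLM Studio.

```python
def matmul(a: list[list[float]], b: list[list[float]]) -> list[list[float]]:
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha: float):
    """Return W + (alpha / rank) * B @ A for one linear layer.

    w: frozen weight, shape (out_features, in_features)
    a: trainable matrix A, shape (rank, in_features)
    b: trainable matrix B, shape (out_features, rank)
    """
    rank = len(a)
    scale = alpha / rank  # rank 4 with alpha 16 gives a scale of 4
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(in_features: int, out_features: int, rank: int) -> int:
    """Number of adapter parameters for one layer."""
    return rank * (in_features + out_features)


w = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # toy 2x3 frozen weight
a = [[1.0, 1.0, 1.0] for _ in range(4)]  # rank 4
b_init = [[0.0] * 4 for _ in range(2)]   # B starts at zero
b_trained = [[0.5] * 4 for _ in range(2)]

at_start = lora_effective_weight(w, a, b_init, alpha=16)
after_training = lora_effective_weight(w, a, b_trained, alpha=16)
```

For a 4096 x 4096 layer, `trainable_params(4096, 4096, 4)` gives 32,768 adapter parameters versus roughly 16.8 million in the full weight, which is why rank is cheap to increase or decrease during tuning.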
Beyond Current Limits
A glimpse
01 Hallucinations
In standard SFT training, models are rewarded for predicting the next token. Factuality makes up only a minuscule fraction of the training loss.
02 Lack of Reasoning
They can easily fool us with eloquent output.
They are only trained on text, and don’t have a deep understanding of nature or math.
They don’t have any deeper goals or desires.
…
Models inherit the bias from the training corpus.
Models may struggle in classes of applications, e.g. order inversion, advanced math.
Models only have the knowledge that they were trained on (knowledge cutoff).
Models are susceptible to prompt injection, “jailbreak” attacks, data poisoning attacks, …
Hallucinations
LLMs are not a database search engine. Hallucinations (factual
incorrectness or made up context that “sounds good”) can be
very hard to spot.
By the architecture of LLMs, hallucinations can never be 100%
avoided but there are a few options to mitigate them:
Provide Context
RAG, Knowledge Graphs. Ground the answers by prompting with high-level abstractions and concepts first.
Larger/Better Models
If a very low rate of hallucinations is important for your task, consider using the best and largest models out there.
Low Temperature
Lower temperature or even greedy decoding will produce more condensed answers, but they will likely be more factually correct.
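The effect of temperature can be seen directly on the next-token distribution: dividing the logits by a temperature below 1 sharpens the softmax, and greedy decoding is the limit where the most likely token is always taken. A small sketch with made-up logits for three candidate tokens:

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn logits into probabilities after scaling by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; use greedy_choice instead")
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_choice(logits: list[float]) -> int:
    """Greedy decoding: index of the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


logits = [2.0, 1.0, 0.1]  # illustrative values, not from a real model
probs_default = softmax_with_temperature(logits, 1.0)
probs_cold = softmax_with_temperature(logits, 0.2)
best = greedy_choice(logits)
```

At the lower temperature almost all probability mass sits on the top token, so sampling rarely picks the less likely continuations.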
Recommendations:
● Use in low-stakes applications,
combine with human oversight
● Source of inspiration, suggestions
● Copilots over autonomous agents
Andrej Karpathy, State of GPT
Reasoning
System 1 (98 % of time)
Fast, automatic, frequent, emotional, stereotypic, unconscious.
- determine that an object is at a greater distance than another
- localize the source of a specific sound
- complete the phrase "war and ..."
- display disgust when seeing a gruesome image
- solve 2+2=?
- read text on a billboard
- drive a car on an empty road
- understand simple sentences
System 2 (2 % of time)
Slow, effortful, infrequent, logical, calculating, conscious.
- direct your attention towards the clowns at the circus
- sustain a faster-than-normal walking rate
- count the number of A's in a certain text
- park into a tight parking space
- determine the price/quality ratio of two washing machines
- determine the validity of a complex logical reasoning
- solve 17 × 24
Do LLMs Have an Internal State?
LLMs are stochastic parrots
LLMs do not think, nor have consciousness.
But models can learn internal representations of states, as has been shown with Othello and Chess (GPT-4 can play at around Elo 1800).
Trick Models Into System 2
If the task is not solved intuitively, consider giving the model time (tokens) to think about the answer.
Chain-of-thought & Reflection
Instruct the model to go through its thoughts step by step. Let the model reflect on its own answer. While this has been proposed quite a few times, I personally haven’t seen it working well, and the model is very biased towards revoking the original answer.
Decomposition
Decompose one task into multiple smaller ones.
Step-back prompting
Retrieve background information and abstract the question first. This grounds the answer in fundamental concepts or principles.
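The three techniques can be expressed as thin wrappers around a model call. In the sketch below `ask` is a hypothetical callable that sends a prompt to an LLM and returns its text; the prompt wording and function names are assumptions for illustration, not tested prompts.

```python
from typing import Callable


def chain_of_thought_prompt(question: str) -> str:
    """Ask the model to spend tokens on reasoning before answering."""
    return (
        f"{question}\n"
        "Think through the problem step by step, then give the final answer "
        "on a new line starting with 'Answer:'."
    )


def step_back(question: str, ask: Callable[[str], str]) -> str:
    """Step-back prompting: abstract first, then answer with that grounding."""
    principles = ask(f"Which general principles are relevant to this question? {question}")
    return ask(f"Principles: {principles}\nUsing these principles, answer: {question}")


def decompose(sub_questions: list[str], ask: Callable[[str], str]) -> list[str]:
    """Decomposition: answer each smaller task separately and collect notes."""
    return [f"{sub} -> {ask(sub)}" for sub in sub_questions]


# Stand-in for a real model call so the sketch runs without a backend.
calls: list[str] = []


def fake_ask(prompt: str) -> str:
    calls.append(prompt)
    return "X"


final = step_back("Why does ice float on water?", fake_ask)
notes = decompose(["What is 17 x 20?", "What is 17 x 4?"], fake_ask)
```

With a real `ask`, the notes from `decompose` would typically be fed into one last prompt that combines them into the final answer.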
H2O.ai Confidential

More Related Content

What's hot

Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Mihai Criveti
 
LangChain Intro by KeyMate.AI
LangChain Intro by KeyMate.AILangChain Intro by KeyMate.AI
LangChain Intro by KeyMate.AI
OzgurOscarOzkan
 
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPTAutomate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Anant Corporation
 
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdf
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdfRetrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdf
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdf
Po-Chuan Chen
 

What's hot (20)

Unlocking the Power of Generative AI An Executive's Guide.pdf
Unlocking the Power of Generative AI An Executive's Guide.pdfUnlocking the Power of Generative AI An Executive's Guide.pdf
Unlocking the Power of Generative AI An Executive's Guide.pdf
 
Building, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionBuilding, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for Production
 
Using the power of Generative AI at scale
Using the power of Generative AI at scaleUsing the power of Generative AI at scale
Using the power of Generative AI at scale
 
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
 
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
 
LangChain Intro by KeyMate.AI
LangChain Intro by KeyMate.AILangChain Intro by KeyMate.AI
LangChain Intro by KeyMate.AI
 
How ChatGPT and AI-assisted coding changes software engineering profoundly
How ChatGPT and AI-assisted coding changes software engineering profoundlyHow ChatGPT and AI-assisted coding changes software engineering profoundly
How ChatGPT and AI-assisted coding changes software engineering profoundly
 
Customizing LLMs
Customizing LLMsCustomizing LLMs
Customizing LLMs
 
Generative Models and ChatGPT
Generative Models and ChatGPTGenerative Models and ChatGPT
Generative Models and ChatGPT
 
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPTAutomate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
 
Generative AI for the rest of us
Generative AI for the rest of usGenerative AI for the rest of us
Generative AI for the rest of us
 
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdf
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdfRetrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdf
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdf
 
How do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for DevelopersHow do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for Developers
 
Generative AI Use cases for Enterprise - Second Session
Generative AI Use cases for Enterprise - Second SessionGenerative AI Use cases for Enterprise - Second Session
Generative AI Use cases for Enterprise - Second Session
 
And then there were ... Large Language Models
And then there were ... Large Language ModelsAnd then there were ... Large Language Models
And then there were ... Large Language Models
 
LLMs Bootcamp
LLMs BootcampLLMs Bootcamp
LLMs Bootcamp
 
Leveraging Generative AI & Best practices
Leveraging Generative AI & Best practicesLeveraging Generative AI & Best practices
Leveraging Generative AI & Best practices
 
Exploring Opportunities in the Generative AI Value Chain.pdf
Exploring Opportunities in the Generative AI Value Chain.pdfExploring Opportunities in the Generative AI Value Chain.pdf
Exploring Opportunities in the Generative AI Value Chain.pdf
 
Generative AI: Past, Present, and Future – A Practitioner's Perspective
Generative AI: Past, Present, and Future – A Practitioner's PerspectiveGenerative AI: Past, Present, and Future – A Practitioner's Perspective
Generative AI: Past, Present, and Future – A Practitioner's Perspective
 
Generative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptxGenerative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptx
 

Similar to Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Current Limitations

Salesforce Architect Group, Frederick, United States July 2023 - Generative A...
Salesforce Architect Group, Frederick, United States July 2023 - Generative A...Salesforce Architect Group, Frederick, United States July 2023 - Generative A...
Salesforce Architect Group, Frederick, United States July 2023 - Generative A...
NadinaLisbon1
 
The Lean Startup EA edition
The Lean Startup EA editionThe Lean Startup EA edition
The Lean Startup EA edition
Eric Ries
 

Similar to Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Current Limitations (20)

BDW17 London - Abed Ajraou - First Utility - Putting Data Science in your Bus...
BDW17 London - Abed Ajraou - First Utility - Putting Data Science in your Bus...BDW17 London - Abed Ajraou - First Utility - Putting Data Science in your Bus...
BDW17 London - Abed Ajraou - First Utility - Putting Data Science in your Bus...
 
DataScientist Job : Between Myths and Reality.pdf
DataScientist Job : Between Myths and Reality.pdfDataScientist Job : Between Myths and Reality.pdf
DataScientist Job : Between Myths and Reality.pdf
 
Using the Emerging Technology Analysis Canvas
Using the Emerging Technology Analysis CanvasUsing the Emerging Technology Analysis Canvas
Using the Emerging Technology Analysis Canvas
 
Augment the Human
Augment the HumanAugment the Human
Augment the Human
 
How to add machine learning to your applications today
How to add machine learning to your applications todayHow to add machine learning to your applications today
How to add machine learning to your applications today
 
Salesforce Architect Group, Frederick, United States July 2023 - Generative A...
Salesforce Architect Group, Frederick, United States July 2023 - Generative A...Salesforce Architect Group, Frederick, United States July 2023 - Generative A...
Salesforce Architect Group, Frederick, United States July 2023 - Generative A...
 
MongoDB World 2018: Building Intelligent Apps with MongoDB & Google Cloud
MongoDB World 2018: Building Intelligent Apps with MongoDB & Google CloudMongoDB World 2018: Building Intelligent Apps with MongoDB & Google Cloud
MongoDB World 2018: Building Intelligent Apps with MongoDB & Google Cloud
 
Putting data science in your business a first utility feedback
Putting data science in your business a first utility feedbackPutting data science in your business a first utility feedback
Putting data science in your business a first utility feedback
 
[DSC Europe 22] Avoid mistakes building AI products - Karol Przystalski
[DSC Europe 22] Avoid mistakes building AI products - Karol Przystalski[DSC Europe 22] Avoid mistakes building AI products - Karol Przystalski
[DSC Europe 22] Avoid mistakes building AI products - Karol Przystalski
 
Managing machine learning
Managing machine learningManaging machine learning
Managing machine learning
 
Ai open powermeetupmarch25th_latest
Ai open powermeetupmarch25th_latestAi open powermeetupmarch25th_latest
Ai open powermeetupmarch25th_latest
 
How-to-Build-a-Career-in-AI.pdf
How-to-Build-a-Career-in-AI.pdfHow-to-Build-a-Career-in-AI.pdf
How-to-Build-a-Career-in-AI.pdf
 
Build a Career in AI
Build a Career in AIBuild a Career in AI
Build a Career in AI
 
Machine learning for Marketers
Machine learning for MarketersMachine learning for Marketers
Machine learning for Marketers
 
AI 2023.pdf
AI 2023.pdfAI 2023.pdf
AI 2023.pdf
 
My programming and machine learning linked in notes 2021 part 1
My programming and machine learning linked in notes   2021 part 1My programming and machine learning linked in notes   2021 part 1
My programming and machine learning linked in notes 2021 part 1
 
Bailing Out Your Business with Open Source
Bailing Out Your Business with Open SourceBailing Out Your Business with Open Source
Bailing Out Your Business with Open Source
 
Cinci ug-january2011-anti-patterns
Cinci ug-january2011-anti-patternsCinci ug-january2011-anti-patterns
Cinci ug-january2011-anti-patterns
 
SDD2017 - 03 Abed Ajraou - putting data science in your business a first uti...
SDD2017 - 03 Abed Ajraou  - putting data science in your business a first uti...SDD2017 - 03 Abed Ajraou  - putting data science in your business a first uti...
SDD2017 - 03 Abed Ajraou - putting data science in your business a first uti...
 
The Lean Startup EA edition
The Lean Startup EA editionThe Lean Startup EA edition
The Lean Startup EA edition
 

More from Sri Ambati

Automatic Model Documentation with H2O
Automatic Model Documentation with H2OAutomatic Model Documentation with H2O
Automatic Model Documentation with H2O
Sri Ambati
 

More from Sri Ambati (20)

H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek
 
LLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5thLLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5th
 
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
 
Risk Management for LLMs
Risk Management for LLMsRisk Management for LLMs
Risk Management for LLMs
 
Open-Source AI: Community is the Way
Open-Source AI: Community is the WayOpen-Source AI: Community is the Way
Open-Source AI: Community is the Way
 
Building Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2OBuilding Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2O
 
Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical
 
Cutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM PapersCutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM Papers
 
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
 
LLM Interpretability
LLM Interpretability LLM Interpretability
LLM Interpretability
 
Never Reply to an Email Again
Never Reply to an Email AgainNever Reply to an Email Again
Never Reply to an Email Again
 
Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)
 
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
 
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
 
AI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation JourneyAI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation Journey
 
ML Model Deployment and Scoring on the Edge with Automatic ML & DF
ML Model Deployment and Scoring on the Edge with Automatic ML & DFML Model Deployment and Scoring on the Edge with Automatic ML & DF
ML Model Deployment and Scoring on the Edge with Automatic ML & DF
 
Scaling & Managing Production Deployments with H2O ModelOps
Scaling & Managing Production Deployments with H2O ModelOpsScaling & Managing Production Deployments with H2O ModelOps
Scaling & Managing Production Deployments with H2O ModelOps
 
Automatic Model Documentation with H2O
Automatic Model Documentation with H2OAutomatic Model Documentation with H2O
Automatic Model Documentation with H2O
 
Your AI Transformation
Your AI Transformation Your AI Transformation
Your AI Transformation
 

Recently uploaded

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 

Recently uploaded (20)

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
presentation ICT roal in 21st century education

Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Current Limitations

  • 1. H2O.ai Confidential PASCAL PFEIFFER Principal Data Scientist, H2O.ai #4 KGM
  • 2. Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Current Limitations
  • 4. How Can You Use LLMs? “Everyone needs to use GenAI” …, but where do I start?
  • 5. Some Good Use-Cases: If you haven’t already started exploring
      Retrieval Augmented Generation
      ● Answers are generated with additional context from an external knowledge database.
      ● LLM answers are grounded in the retrieved knowledge/facts.
      ● LLMs do not need to be retrained when new knowledge is added or modified.
      ● H2OGPT / LangChain
      Summarization
      ● Business-related summaries can be a disruptive use-case. Every meeting of a company can be recorded, transcribed, and then summarized using an LLM.
      ● Needs long-context models or iterative reduction techniques.
      ● Intrinsically grounded LLM output. Residual risk of misinterpretations.
      Creative Writing
      ● Promising area of interest for short texts.
      ● Limited in creativity and novelty. Risk of hallucination, as it is challenging to ground the generated text.
      ● LLM as a coworker/copilot.
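The retrieval-augmented pattern on slide 5 can be sketched in a few lines. This is a minimal illustration, not how H2OGPT or LangChain implement it: the word-overlap scoring, the function names (`tokenize`, `retrieve`, `build_prompt`) and the sample knowledge base are all assumptions made for the example. A real system would typically use embeddings and a vector store.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, k=2):
    """Return the k documents that share the most words with the question."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, documents, k=2):
    """Place the retrieved context in front of the question to ground the answer."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )

# Hypothetical knowledge base; adding a new entry requires no retraining.
knowledge_base = [
    "The support desk is open from 9am to 5pm on weekdays.",
    "Invoices are sent on the first day of each month.",
    "The cafeteria serves lunch between noon and 2pm.",
]
prompt = build_prompt("When is the support desk open?", knowledge_base, k=1)
print(prompt)
```

The resulting prompt would then be sent to whichever LLM is in use; new knowledge only has to be appended to `knowledge_base`.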
  • 6. There is more (check out genai.h2o.ai for more inspiration)
      Complexity
      Writing Helper: From simple spell and grammar checking up to professional rephrasing.
      Function calling: Broad applications when LLMs are trained to call APIs. Usage of external tools (simple calculator or browsing the internet). Translate natural language to SQL queries.
      Classification / Regression: Use LLMs for classical NLP use-cases such as classification and regression tasks. Can also be retrieval augmented.
      Coding: Writing code will be a valuable task for LLMs, especially for small, fine-grained tasks. This will enable a lot more software engineers to be 10x, or the already 10x to be 100x.
      Autonomous Agents: Fully automate repetitive workflows. Make use of function calling and natural language interaction. Requires LLM memory management.
      … Much More
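To make the function-calling idea on slide 6 concrete, here is a small sketch of the application side: the model emits a structured tool call and the surrounding code executes it. The JSON shape, the tool registry `TOOLS` and the helper `run_tool_call` are hypothetical choices for this example; actual function-calling formats differ between model providers.

```python
import json

def add(a, b):
    return a + b

def multiply(a, b):
    return a * b

# Registry of tools the model is allowed to call (hypothetical).
TOOLS = {"add": add, "multiply": multiply}

def run_tool_call(model_output):
    """Parse a JSON tool call emitted by the model and execute the named tool."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])

# Stand-in for what a function-calling model might emit for "What is 17 x 24?".
model_output = '{"name": "multiply", "arguments": {"a": 17, "b": 24}}'
result = run_tool_call(model_output)
print(result)  # 408
```

Restricting execution to an explicit registry keeps the model from invoking anything the application did not intend to expose.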
  • 7. Technical Ways to use LLMs (“How do I …?” “Summarize!” “What are …?”)
      Increasing Complexity: Always max out the options above first (or already have training data).
      Prompt engineering
      ● Best practice: Be very detailed in your instructions. Use prompts with all relevant information. Ask for a quality answer. Refine your prompt.
      ● Consider: Bias from prompt and from model. Limited to training knowledge. Hallucinations. Raw prompt engineering may struggle in certain use-cases.
      Add context
      ● Best practice: Retrieve relevant context. Context placement. Explore few-shot prompting.
      ● Consider: Usually has a sweet spot for the amount of context (depends on use case, start with 5 paragraphs / 512 characters each).
      Fine-tuning
      ● Best practice: Build a proper evaluation first. Start with small models. Data is important. Use H2O LLM Studio to quickly iterate.
      ● Consider: Needs training data. Beware of fine-tuning for factuality. Can be great for specific use-cases. No prompt engineering needed.
      RLHF
      ● Best practice: Model must be fine-tuned with in-distribution data first. PPO: requires a lot of high-quality data to train a good reward model. DPO: loss is calculated directly on accept/reject pairs. Next token loss is replaced by a scalar.
      ● Consider: Very research oriented. Proven to be useful for chat alignment. PPO (with reward model) or DPO (without reward model).
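Slide 7 notes that DPO computes its loss directly on accept/reject pairs, without a reward model. The sketch below shows that objective for a single pair as it is commonly formulated: the negative log-sigmoid of the scaled difference in log-probability ratios between the trained policy and a frozen reference model. The function name, argument names and `beta=0.1` are illustrative assumptions, and the inputs are taken to be summed sequence log-probabilities.

```python
import math

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one accept/reject pair, given sequence log-probabilities."""
    # How much more the policy prefers the accepted answer than the reference does.
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    z = beta * margin
    # -log(sigmoid(z)) written as a numerically stable softplus(-z).
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))

# No preference shift relative to the reference gives log(2).
print(dpo_loss(-5.0, -5.0, -5.0, -5.0))
# Raising the accepted answer and lowering the rejected one reduces the loss.
print(dpo_loss(-4.0, -6.0, -5.0, -5.0))
```

The loss is a single scalar per pair, which is driven down as the policy shifts probability towards the accepted answer relative to the reference.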
  • 8. Automate Information Flow in Your Company
      Meetings, Logs & Documents, Demos & Presentations
      LLM summarization
      Business Intelligence
      Expand Knowledge
      Opportunities
      ● Identify work duplication
      ● Find new synergies
      ● Improve staff motivation by providing up-to-date information
      ● Personalized updates on any related tasks
      ● Discover trends with summaries on summaries
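The "summaries on summaries" idea from slide 8 can be sketched as an iterative reduction loop: split the text into windows, summarize each, join the results, and repeat until everything fits into one window. The `summarize_stub` function below is only a placeholder for an LLM call (it truncates the text), and the window and limit sizes are arbitrary assumptions. The loop only terminates if the summarizer actually shortens its input.

```python
def chunk(text, size):
    """Split text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def summarize_stub(text, limit=40):
    """Placeholder for an LLM call: keeps only the first `limit` characters."""
    return text[:limit]

def reduce_summaries(text, summarize, window=100):
    """Summarize window by window, then summarize the joined summaries."""
    while len(text) > window:
        parts = [summarize(piece) for piece in chunk(text, window)]
        text = " ".join(parts)
    return summarize(text)

transcript = "x" * 1000  # stand-in for a long meeting transcript
final_summary = reduce_summaries(transcript, summarize_stub)
print(len(final_summary))
```

In practice `summarize` would wrap a model call and `window` would be chosen to fit the model's context length.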
  • 9. Transcript / Summary: Let’s fine-tune our own summarization model.
  • 11. Solve with Prompt Engineering
      In this talk, Andrew Ng discusses how AI can empower businesses of all sizes. He compares the rise of AI to the rise of literacy, explaining that just as literacy was once limited to a few, AI is currently concentrated in the hands of highly skilled engineers at big tech companies. Ng believes that society can benefit from enabling everyone to contribute to AI development. He explains that AI projects have been expensive to build, requiring skilled engineers and significant investment. Large tech companies have been successful in making these investments pay off because they have millions or billions of users to generate revenue from their AI systems. However, this model does not work for smaller businesses that do not have the same scale or resources. Ng gives examples of how AI can be beneficial to small businesses, such as a pizza store owner using AI to improve sales by analyzing data and spotting patterns. He emphasizes that AI can work with modest amounts of data and that the real challenge is the cost of hiring an AI team for small businesses. Ng proposes that the solution lies in new AI development platforms that focus on providing data rather than writing code. These platforms allow individuals to upload data and train AI systems without extensive coding knowledge. He gives an example of an inspector using a platform to teach an AI system how to detect defects in fabric. Ng acknowledges that these platforms are still evolving but believes they will become more accessible in the future. He concludes by stating that democratizing access to AI is crucial for spreading wealth across society. Ng envisions a future where every accountant, store manager, buyer, and quality inspector can build their own AI systems. He compares the impact of democratizing AI to the impact of widespread literacy, emphasizing the exciting possibilities it holds for the future.
  • 12. Solve with Prompt Engineering
      Sure! Here is a summary of Andrew Ng's talk "How AI could empower any business" in a single paragraph: Andrew Ng discusses the current state of AI, where it is primarily controlled by a select few "high priests and priestesses" in big tech companies, and how this limits the potential benefits of AI for society as a whole. He argues that AI has the potential to create tremendous wealth, but only if it is democratized and made accessible to everyone. To achieve this, he suggests that we need to shift the focus from asking people to write lots of code to providing data, and highlights the emergence of new AI development platforms that make this possible. He believes that empowering everyone to build AI systems will lead to an incredibly exciting future, where small businesses and individuals can use AI to improve their operations and create new opportunities.
  • 13. Solve with Prompt Engineering
      Sure! Here's the summary of Andrew Ng's talk "How AI could empower any business" in a single paragraph: Andrew Ng discusses the potential of AI to empower any business, not just large tech companies. He highlights the long-tail of AI projects that are valuable but not being addressed, and the need for a democratic approach to AI development. Ng introduces a new way to build AI systems that focuses on providing data rather than writing code, making it easier for small businesses and individuals to participate. He illustrates the potential of these platforms with examples of T-shirt demand forecasting and quality control, and emphasizes the importance of empowering everyone to build AI systems for themselves to create a more equitable distribution of wealth.
  • 15. What do I Tune First? So many hyperparameters
      Prompt: Maximize the baseline score with prompt engineering first. Choose the maximum token length so that most or all samples fit.
      Evaluate Often: Evaluate on sub-epochs to see if your model saturates quickly and overfits. Manually inspect the output and test some hand-crafted samples.
      Learning Rate: Even with (Q)LoRA, the learning rate still has a significant impact on training. Start with 0.0001 and AdamW.
      Modify Prompt: Spend quality time on what the model is seeing. Add context? Name, Title, Text. Experiment with order and delimiters.
      LoRA: Start with LoRA rank 4, alpha 16, no dropout. Apply to all linear layers. Decrease/increase LoRA rank & alpha (2-4 * rank), add a bit of dropout.
      Scale Up: Only very late, try to scale up to larger models or more (diverse) training data. Also try to scale down!
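To give a feel for the LoRA starting point on slide 15 (rank 4, alpha 16), the sketch below computes how many parameters one adapter adds to a single linear layer and the scaling applied to its update. It assumes the standard LoRA formulation, where the update is a product of two low-rank matrices multiplied by alpha / rank. The hidden size of 4096 is a hypothetical value, not one taken from the talk.

```python
def lora_trainable_params(d_in, d_out, rank):
    """A LoRA adapter adds two matrices: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)

def lora_scaling(alpha, rank):
    """In standard LoRA, the low-rank update B @ A is multiplied by alpha / rank."""
    return alpha / rank

d = 4096  # hypothetical hidden size of one square linear layer
full = d * d
adapter = lora_trainable_params(d, d, rank=4)

print(full)                 # 16777216 weights in the frozen layer
print(adapter)              # 32768 trainable adapter weights
print(lora_scaling(16, 4))  # 4.0
```

With alpha set to 2-4 times the rank, as the slide suggests, the scaling factor stays between 2 and 4 when the rank is changed.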
  • 17. Beyond Current Limits: A glimpse
      01 Hallucinations
      ● In standard SFT training, models are rewarded for predicting the next token. Factuality makes up only a minuscule fraction of the training loss.
      ● They can easily fool us with eloquent output.
      02 Lack of Reasoning
      ● They are only trained on text and don’t have a deep understanding of nature or math.
      ● They don’t have any deeper goals or desires.
      …
      ● Models inherit the bias from the training corpus.
      ● Models may struggle in classes of applications, e.g. order inversion, advanced math.
      ● Models only have the knowledge that they were trained on (knowledge cutoff).
      ● Models are susceptible to prompt injection, “jailbreak” attacks, data poisoning attacks, …
  • 18. Hallucinations
      LLMs are not a database search engine. Hallucinations (factual incorrectness or made-up context that “sounds good”) can be very hard to spot. Due to the architecture of LLMs, hallucinations can never be avoided 100%, but there are a few options to mitigate them:
      ● Provide Context: RAG, Knowledge Graphs. Ground the answers by prompting with high-level abstractions and concepts first.
      ● Larger/Better Models: If a very low rate of hallucinations is important for your task, consider using the best and largest models out there.
      ● Low Temperature: Lower temperature or even greedy decoding will produce more condensed answers, but they will likely be more factually correct.
      Recommendations:
      ● Use in low-stakes applications, combine with human oversight
      ● Source of inspiration, suggestions
      ● Copilots over autonomous agents
      Source: Andrej Karpathy, State of GPT
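The effect of the low-temperature recommendation on slide 18 can be seen on a toy next-token distribution. The sketch below divides the logits by the temperature before the softmax, which concentrates probability on the most likely token as the temperature drops. Greedy decoding is the limiting case that always picks the top token. The logit values are made up for illustration.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy(logits):
    """Greedy decoding: index of the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
p_default = softmax_with_temperature(logits, 1.0)
p_low = softmax_with_temperature(logits, 0.2)

print([round(p, 3) for p in p_default])
print([round(p, 3) for p in p_low])
print(greedy(logits))
```

At the lower temperature nearly all probability mass sits on the first token, so sampling behaves almost like greedy decoding.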
  • 19. Reasoning
      System 1 (98 % of time): Fast, automatic, frequent, emotional, stereotypic, unconscious.
      - determine that an object is at a greater distance than another
      - localize the source of a specific sound
      - complete the phrase "war and ..."
      - display disgust when seeing a gruesome image
      - solve 2+2=?
      - read text on a billboard
      - drive a car on an empty road
      - understand simple sentences
      System 2 (2 % of time): Slow, effortful, infrequent, logical, calculating, conscious.
      - direct your attention towards the clowns at the circus
      - sustain a faster-than-normal walking rate
      - count the number of A's in a certain text
      - park into a tight parking space
      - determine the price/quality ratio of two washing machines
      - determine the validity of a complex logical reasoning
      - solve 17 × 24
  • 20. Do LLMs Have an Internal State?
      LLMs are stochastic parrots
      LLMs do not think, nor do they have consciousness. But models can learn internal representations of states, as has been shown with Othello and chess (GPT-4 can play at around Elo 1800).
  • 21. Trick Models Into System 2
      If the task is not solved intuitively … … consider giving the model time (tokens) to think about the answer.
      Chain-of-thought & Reflection: Instruct the model to go through its thoughts step by step. Let the model reflect on its own answer. While this has been proposed quite a few times, I personally haven’t seen it working well, and the model is very biased towards reworking the original answer.
      Decomposition: Decomposition of one task into multiple smaller ones.
      Step-back prompting: Retrieve background information and abstract the question first. This grounds the answer in fundamental concepts or principles.
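The three prompting techniques on slide 21 mostly differ in how the prompt is constructed, so they can be sketched as plain string builders. The wording of the templates and the function names are illustrative assumptions rather than prompts from the talk. Each returned string would be sent to the model in turn.

```python
def chain_of_thought(question):
    """Ask the model to spend tokens on reasoning before it answers."""
    return f"{question}\nLet's work through this step by step before giving the final answer."

def step_back(question):
    """First ask for the underlying principles, then for the actual answer."""
    return [
        f"What general concepts or principles are needed to answer: {question}",
        f"Using those principles, answer: {question}",
    ]

def decompose(subtasks):
    """Turn one large task into an ordered list of smaller prompts."""
    return [f"Step {i}: {task}" for i, task in enumerate(subtasks, start=1)]

print(chain_of_thought("What is 17 x 24?"))
print(step_back("Why does ice float on water?"))
print(decompose(["Extract the figures", "Compute the totals", "Write the summary"]))
```

For step-back prompting, the answer to the first prompt would normally be appended to the second one as context.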