Fact vs. Fiction:
Auto-detecting Hallucinations in
LLMs
Morena Bastiaansen, 16.04.2024
Let’s start with a
little game…
In the 1980s, Saddam Hussein
was given the key to the city of
Detroit after donating $250,000
to a local church. The church’s
pastor, Jacob Yasso, calls the
former Iraqi president “a very
generous, warm man who just
let too much power go to his
head”.
Justin Bieber's DNA was sent
into space aboard a SpaceX
Falcon 9 rocket in 2021 as part
of a promotional collaboration
between Bieber and a
technology company.
BS
True!
Hi, my name is
Morena 👋
● Data Scientist at GetYourGuide
● Co-organiser of MLOps Community
meetups in Berlin
Agenda
01 What? – What are hallucinations in LLMs?
02 Why? – Why do LLMs hallucinate?
03 Methods – Different hallucination detection methods
04 Summary
01 What are hallucinations in LLMs?
What are hallucinations in
LLMs?
● Generation of content that is
○ factually incorrect
○ nonsensical or unfaithful to the input context
● Intrinsic vs. extrinsic hallucination
○ Intrinsic: output contradicts the source; extrinsic: output cannot be verified against the source
○ Example: LLM summarizing the Wikipedia page about Paris
■ Intrinsic: “Paris has a population of 1 million residents”
■ Extrinsic: “Paris is home to the most successful soccer team in France”
What are hallucinations in
LLMs?
● Hallucinations can be harmful in many ways –
especially if it’s hard to verify the information
○ Major challenge for deploying LLMs in production
○ Potential harm to society
● The nature of these models makes them output false
content in a very convincing way
→ It is becoming increasingly important to be
able to detect hallucinations in a structured,
quantitative way
Why do LLMs
hallucinate?
02
Why do LLMs hallucinate?
● Contradicting or false information in training data
● Complexity/novelty of the task to perform
● Fundamental nature of the model:
○ LLMs are trained to predict tokens probabilistically
○ Text is broken down into tokens
○ The next token is predicted based on token and position embeddings
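The probabilistic next-token step can be illustrated with a toy softmax sampler (a sketch, not any real model's decoding loop; `logits` and `vocab` here are made-up inputs):

```python
import math
import random

def sample_next_token(logits, vocab, temperature=1.0):
    """Toy illustration of probabilistic next-token prediction.

    The model never 'knows' facts -- it samples from a softmax over
    logits, so a fluent but false continuation can simply be an
    unlucky draw. Higher temperature flattens the distribution and
    raises the chance of low-probability (potentially hallucinated)
    tokens.
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(vocab, weights=probs, k=1)[0]
```

With near-zero temperature the sampler collapses onto the most likely token; at temperature 1.0 every token retains some probability mass, which is one reason confident-sounding errors can never be fully ruled out.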
03 Hallucination detection methods
Hallucination detection methods

Uncertainty-based 🤔
● Leverage existing intrinsic uncertainty metrics
● E.g. G-Eval with probability normalization

Reference-based 📚
● Measure generation consistency against a provided reference
● E.g. RAGAS

Self-evaluation 🔍
● Prompt the LLM to evaluate its previous prediction
● E.g. Arize's Phoenix Evals

Consistency-based 🎯
● Stochastic sampling of responses
● E.g. SelfCheckGPT
Self-evaluation methods
● Prompt LLM to evaluate its previous prediction
● Evaluating response quality is an easier task than producing the
response
● E.g. text summarization task:
1. Prompt the LLM: "Summarize the following text: {}"
2. Break the response down into sentences
3. For each sentence, prompt the LLM: "Context: {} Sentence: {} Is the sentence supported by the context above? Answer Yes or No:"
● Can be used in combination with other methods to improve reliability
○ Combine with reference-based methods
○ Combine with consistency-based methods, using a sampling
approach
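The steps above can be sketched in a few lines of Python; `llm` is a placeholder callable (prompt in, completion out) and the period-based sentence splitter is deliberately naive (a real pipeline would use a proper sentence tokenizer and an actual model client):

```python
def self_evaluate(llm, context: str, response: str) -> float:
    """Self-evaluation sketch: ask the model, sentence by sentence,
    whether its own output is supported by the context.
    Returns the fraction of sentences judged unsupported."""
    sentences = [s.strip() for s in response.split(".") if s.strip()]
    if not sentences:
        return 0.0
    unsupported = 0
    for sentence in sentences:
        prompt = (
            f"Context: {context}\n"
            f"Sentence: {sentence}\n"
            "Is the sentence supported by the context above? Answer Yes or No:"
        )
        if llm(prompt).strip().lower().startswith("no"):
            unsupported += 1
    return unsupported / len(sentences)
```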
Reference-based methods
● Measure generation consistency against provided reference
● Availability of reference depends on use case
○ Open QA
■ lack of references
○ RAG
■ references available through retrieval
○ Text summarization
■ references readily available
1. Break the LLM response down into sentences
2. For each sentence, calculate a similarity score with the reference (e.g. BERTScore or ROUGE)
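A minimal version of this scoring loop, using a crude unigram-overlap score as a stand-in for BERTScore or ROUGE (a real setup would call an actual metric implementation):

```python
def sentence_support_scores(response_sentences, reference):
    """For each response sentence, compute the fraction of its tokens
    that also appear in the reference -- a crude unigram-precision
    proxy for ROUGE/BERTScore. Low scores flag candidate
    hallucinations."""
    ref_tokens = set(reference.lower().split())
    scores = []
    for sentence in response_sentences:
        tokens = sentence.lower().split()
        overlap = sum(1 for t in tokens if t in ref_tokens)
        scores.append(overlap / len(tokens) if tokens else 0.0)
    return scores
```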
Uncertainty-based methods
● Leverage existing intrinsic uncertainty metrics to determine parts of
output sequence that the system is least certain of
● Require access to token-level probability distributions
Example output with uncertain tokens: “Morena Bastiaansen is a Belgian writer and poet”
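In practice this can be as simple as thresholding token probabilities, assuming the serving stack exposes per-token log-probabilities (many APIs do, via a `logprobs`-style option; the threshold here is an arbitrary illustration):

```python
import math

def flag_uncertain_tokens(tokens, logprobs, threshold=0.1):
    """Flag tokens whose probability falls below `threshold`.
    `logprobs` are per-token log-probabilities from the model's
    decoding loop; low-probability spans (e.g. 'Belgian writer'
    in a fabricated biography) are hallucination candidates."""
    return [
        tok for tok, lp in zip(tokens, logprobs)
        if math.exp(lp) < threshold
    ]
```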
Consistency-based methods
“If an LLM has knowledge of a given concept, sampled responses are
likely to be similar and contain consistent facts”
● Example (SelfCheckGPT):
○ Let R refer to an LLM response drawn from a given prompt
○ Draw a further N stochastic LLM response samples {S¹, S², ..., Sⁿ, ..., Sᴺ} using the same prompt
○ For each sentence rᵢ in R, for each sampled answer Sⁿ, measure
the consistency using some kind of similarity/inconsistency
score
■ BERTScore
■ NLI contradiction score
■ Self-evaluation: Prompt LLM (“Is the sentence supported by
the context above? Answer Yes or No”)
■ …
○ Aggregate these scores to compute the hallucination score of
sentence rᵢ, H(i) such that H(i) ∈ [0.0,1.0], where H(i) → 0.0 if the
i-th sentence is grounded in valid information and H(i) → 1.0 if
the i-th sentence is hallucinated
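The aggregation step can be sketched as follows; `inconsistency` stands in for any of the scores listed (BERTScore-based, NLI contradiction, or a self-evaluation prompt), each assumed to map into [0, 1]:

```python
def hallucination_scores(response_sentences, samples, inconsistency):
    """SelfCheckGPT-style aggregation: for each sentence r_i of the
    main response R, average an inconsistency score against each of
    the N sampled responses. H(i) -> 0.0 if the samples consistently
    support the sentence, H(i) -> 1.0 if it is likely hallucinated."""
    return [
        sum(inconsistency(r_i, sample) for sample in samples) / len(samples)
        for r_i in response_sentences
    ]
```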
SelfCheckGPT: Demo
Hallucination detection methods

Uncertainty-based 🤔
✅ Useful for a variety of tasks
❌ Requires access to internal model states

Reference-based 📚
✅ Useful for text summarization, RAG
❌ Requires references to be readily available
❌ Challenging for open QA tasks/free text generation

Self-evaluation 🔍
✅ Simple and straightforward to use
✅ Useful for a variety of tasks
❌ Not very suitable for extrinsic hallucinations

Consistency-based 🎯
✅ Reference-free
✅ Works for black-box LLMs
❌ Can be challenging in real-time use cases
❌ Not useful for all tasks (e.g. free text generation)
Thank you!
Sources
● Manakul, P., Liusie, A., & Gales, M. J. F. (2023). SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models.
● Amatriain, X. (2024). Measuring and Mitigating Hallucinations in Large Language Models: A Multifaceted Approach.
● McKenna, N., Li, T., Cheng, L., Hosseini, M. J., Johnson, M., & Steedman, M. (2023). Sources of Hallucination by Large Language Models on Inference Tasks.
● Huang, L., Yu, W., Ma, W., Zhong, W., Feng, Z., Wang, H., Chen, Q., Peng, W., Feng, X., Qin, B., & Liu, T. (2023). A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions.
● Liu, T., Zhang, Y., Brockett, C., Mao, Y., Sui, Z., Chen, W., & Dolan, B. (2022). A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation.
● Yuan, W., Neubig, G., & Liu, P. (2021). BARTScore: Evaluating Generated Text as Text Generation.
● Xu, Z., Jain, S., & Kankanhalli, M. (2024). Hallucination is Inevitable: An Innate Limitation of Large Language Models.
● Ji, Z., Lee, N., Frieske, R., Yu, T., Su, D., Xu, Y., Ishii, E., Bang, Y., Chen, D., Chan, H., Dai, W., Madotto, A., & Fung, P. (2024). Survey of Hallucination in Natural Language Generation.
● Liu, Y., Iter, D., Xu, Y., Wang, S., Xu, R., & Zhu, C. (2023). G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment.