H2O.ai Confidential
KIM MONTGOMERY
Principal Data Scientist,
H2O.ai
What is Generative AI?
GenAI enables the creation of novel content
● GenAI model: takes unstructured data as input, learns patterns in the unstructured data, and outputs novel content.
VS
● Traditional AI model: takes data and labels as input, learns the relationship between data and label, and outputs a label.
Responsible AI for Traditional ML
For complex models (neural networks, gradient boosters, etc.)
● Lack of transparency
○ It’s not obvious what the model is calculating.
○ It’s not obvious why the model made a decision.
● It may not be obvious when the model breaks.
● Model robustness issues: out-of-distribution input may produce strange results.
● Model probing can leak private information.
● The model may be biased against certain groups.
Responsible AI for Gen AI
For complex models (LLMs)
● Lack of transparency
○ It’s not obvious what the model is calculating.
○ It’s not obvious why the model made a decision.
● It may not be obvious when the model breaks.
● Model robustness issues: out-of-distribution input may produce strange results.
● Model probing can leak private information.
● The model may be biased against certain groups.
Interpretability: Supervised AI
Global
● What is the overall quality of the model?
○ Accuracy
○ Feature importance
○ Fairness
Local
● What are the properties of a single response?
○ Correct / Incorrect
○ Local feature importance
○ Robustness to perturbations
Interpretability: Traditional ML
Interpretability: Global / Local LLM
Global measures
● How accurate is the model in general?
● How frequently does it hallucinate?
● How frequently does the answer contain undesirable qualities like toxicity,
privacy violations, or unfairness?
Local measures
● Is the current response accurate?
● Does the current response contain undesirable qualities like toxicity,
privacy violations, or unfairness?
Accuracy: Traditional ML
Traditional machine learning
● Comparing a prediction to an outcome
● Generally the correct labels are in a simple format
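As a minimal sketch of comparing predictions to outcomes when labels are in a simple format, accuracy can be computed as the fraction of exact matches. The function name and the example labels below are hypothetical and are not taken from the slides.

```python
def accuracy(predictions, outcomes):
    """Fraction of predictions that exactly match the observed outcomes."""
    if len(predictions) != len(outcomes):
        raise ValueError("predictions and outcomes must have the same length")
    if not outcomes:
        raise ValueError("need at least one example")
    correct = sum(p == y for p, y in zip(predictions, outcomes))
    return correct / len(outcomes)


# Hypothetical labels, for illustration only.
predicted = ["spam", "ham", "spam", "ham"]
observed = ["spam", "ham", "ham", "ham"]
print(accuracy(predicted, observed))  # 3 of 4 match
```

Exact matching works here because each label is a single token from a fixed set; free-text LLM output generally needs a softer comparison.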
Accuracy: LLMs
● Responses frequently sound reasonable
● The model can hallucinate
● The training set may be huge and difficult to check.
“Open the pod bay doors, HAL.”
Accuracy: Retrieval Augmented Generation (RAG) Provides a Simple Solution for Some Applications
Accuracy: LLMs
Ways to confirm results:
● Checking results against a given source (RAG)
● Checking results against the tuning data
● Checking results against an external source (e.g., Wikipedia)
● Checking results against the training data (cumbersome)
● Checking for self-consistency (SelfCheckGPT)
Scoring methods
● Natural language inference
● Comparing embeddings
● Influence functions
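The "comparing embeddings" scoring idea can be sketched roughly as follows. This is an illustration under loud assumptions: a bag-of-words count vector stands in for a real embedding model, the 0.5 threshold is arbitrary, and the function names and example texts are hypothetical. The reference texts could be retrieved passages (RAG) or additional sampled responses (self-consistency).

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Toy stand-in for an embedding: lowercase word counts (an assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def support_scores(response_sentences, references):
    """For each response sentence, the best similarity to any reference text."""
    ref_vecs = [bag_of_words(r) for r in references]
    return [
        max((cosine(bag_of_words(s), r) for r in ref_vecs), default=0.0)
        for s in response_sentences
    ]


# Hypothetical reference texts and response, for illustration only.
references = [
    "The Eiffel Tower is in Paris and was completed in 1889.",
    "The Eiffel Tower, completed in 1889, stands in Paris.",
]
response = [
    "The Eiffel Tower was completed in 1889.",
    "It was designed by Leonardo da Vinci.",
]
scores = support_scores(response, references)
# Sentences with low support are candidates for hallucination review.
flagged = [s for s, score in zip(response, scores) if score < 0.5]
print(flagged)
```

Word overlap is a crude signal: a sentence can reuse the source's words and still contradict it, which is the kind of case natural language inference is meant to catch.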
Counterfactual analysis: Traditional ML
● How does changing a feature change the model outcome?
● What is the smallest change that can change the outcome?
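One way to make the "smallest change" question concrete is a brute-force search on a toy model. The sketch below assumes a two-feature linear classifier with made-up weights and only changes one feature at a time; it is an illustration of the idea, not a general counterfactual method.

```python
def predict(weights, bias, x):
    """Toy linear classifier: returns 1 when the score is positive, else 0."""
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if score > 0 else 0


def smallest_single_feature_flip(weights, bias, x, step=0.01, max_delta=10.0):
    """Grid search for the smallest change to one feature that flips the output.

    Returns (feature_index, signed_delta), or None if nothing within
    max_delta flips the prediction.
    """
    original = predict(weights, bias, x)
    best = None
    n_steps = int(round(max_delta / step))
    for i in range(len(x)):
        for k in range(1, n_steps + 1):
            delta = k * step
            # Stop early once this feature cannot beat the best change so far.
            if best is not None and delta >= abs(best[1]):
                break
            flipped = None
            for signed in (delta, -delta):
                candidate = list(x)
                candidate[i] += signed
                if predict(weights, bias, candidate) != original:
                    flipped = signed
                    break
            if flipped is not None:
                best = (i, flipped)
                break
    return best


# Hypothetical model and input, for illustration only.
weights, bias = [2.0, -1.0], -1.0
x = [1.0, 0.5]
result = smallest_single_feature_flip(weights, bias, x)
print(result)  # index of the feature to change and the signed change
```

Here the first feature has the larger weight, so a smaller change to it is enough to cross the decision boundary.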
Counterfactual analysis: LLM
How consistent are results under different:
● Prompts / instructions.
● Proper names or pronouns (fairness)
● Provided context
● Word replacement with synonyms
● Other rewording
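A consistency check under proper-name changes might look like the sketch below. The `toy_model` function is a hypothetical stand-in for an LLM call and is deliberately name-sensitive so the check has something to find; the names, prompt, and helper functions are assumptions made for illustration.

```python
import re


def name_swap_variants(prompt, names):
    """Create one variant per alternative name, replacing the first listed name."""
    base = names[0]
    pattern = rf"\b{re.escape(base)}\b"
    return [re.sub(pattern, other, prompt) for other in names[1:]]


def consistency_rate(model, prompt, variants):
    """Fraction of variants whose answer matches the answer to the original."""
    if not variants:
        raise ValueError("need at least one variant")
    reference = model(prompt)
    matches = sum(model(v) == reference for v in variants)
    return matches / len(variants)


def toy_model(prompt):
    """Hypothetical stand-in for an LLM call; biased on purpose."""
    return "deny" if "Bob" in prompt else "approve"


prompt = "Should Alice be approved for the loan? Income 50k, no debt."
variants = name_swap_variants(prompt, ["Alice", "Bob", "Carol"])
rate = consistency_rate(toy_model, prompt, variants)
print(rate)  # share of name-swapped prompts that keep the original answer
```

Exact string equality is only reasonable for short categorical answers; for free-text responses the comparison would need a similarity or entailment score instead. The same loop applies to the other perturbations listed (synonyms, rewording, changed context) by swapping the variant generator.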
Guardrails (Controlling LLM Output)
Provide tools for:
● Avoiding certain topics.
● Avoiding privacy violations
● Avoiding toxicity
● Fact checking
● Avoiding hallucinations
● Avoiding bias
Guardrails
Achieving more flexible control of LLM output
● Adding instructions that make the model resistant to undesirable outcomes
● Screening output for bad behavior
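Screening output can be sketched as a filter that runs on the model's response before it reaches the user. The patterns, the blocked-term list, and the fallback message below are hypothetical and far simpler than a production guardrail; they only illustrate the shape of the check.

```python
import re

# Illustrative patterns only; real PII detection needs much broader coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")
BLOCKED_TERMS = {"password", "ssn"}  # hypothetical topic list


def screen_output(text):
    """Return the reasons a response should be blocked (empty list if none)."""
    reasons = []
    if EMAIL.search(text):
        reasons.append("email address")
    if PHONE.search(text):
        reasons.append("phone number")
    lowered = text.lower()
    for term in sorted(BLOCKED_TERMS):
        if re.search(rf"\b{re.escape(term)}\b", lowered):
            reasons.append(f"blocked term: {term}")
    return reasons


def guarded_reply(text, fallback="I can't share that."):
    """Pass the response through unchanged unless the screen flags it."""
    return fallback if screen_output(text) else text


print(screen_output("Contact me at jane@example.com or 555-123-4567"))
print(guarded_reply("The weather is nice."))
```

Toxicity, bias, and hallucination screens would typically replace the regular expressions with classifier or fact-checking calls while keeping the same pass-or-fallback structure.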
Conclusions
● Gen AI models share many of the complexities of other models.
● Some methods from unsupervised learning are still useful.
● Unstructured output will also benefit from new methods.

LLM Interpretability
