SlideShare a Scribd company logo
1 of 2
Download to read offline
Understanding and
Defending Against Prompt
Injection Attacks in AI
Systems

The Growing Threat of Prompt Injection Attacks
The National Institute of Standards and Technology (NIST) is keeping a close eye on the AI
landscape, and with good reason. As artificial intelligence (AI) becomes more widespread, so
does the discovery and exploitation of its vulnerabilities, especially in cybersecurity. One
particular vulnerability that has garnered attention is prompt injection, particularly targeting
generative AI systems.
In a comprehensive report titled “Adversarial Machine Learning: A Taxonomy and
Terminology of Attacks and Mitigations,” NIST outlines various tactics and cyberattacks
falling under adversarial machine learning (AML), including prompt injection. These tactics
aim to exploit the behavior of machine learning (ML) systems, particularly large language
models (LLMs), to bypass security measures and open avenues for exploitation.
Understanding Prompt Injection Attacks
Prompt injection, as defined by NIST, encompasses two primary attack types: direct and
indirect. In direct prompt injection, users input text prompts that induce unintended or
unauthorized actions by the LLM. On the other hand, indirect prompt injection involves
tampering with or poisoning the data inputs of an LLM.
An infamous example of direct prompt injection is the DAN (Do Anything Now) method,
initially used against ChatGPT. DAN involves roleplaying scenarios to evade moderation
filters. Despite efforts by ChatGPT’s developers to counter such tactics, users continually
find ways to circumvent filters, leading to the evolution of methods like DAN 12.0.
Indirect prompt injection relies on providing sources that an LLM would ingest, such as
documents, web pages, or audio files. These attacks range from seemingly harmless, like
inducing a chatbot to use “pirate talk,” to more malicious endeavors, such as coercing users
to reveal sensitive personal information.
Defending Against Prompt Injection Attacks
Combatting prompt injection attacks presents a significant challenge due to their covert
nature and evolving tactics. NIST recommends defensive strategies for mitigating these
threats. For direct prompt injection, creators of AI models should carefully curate training
datasets and train models to recognize and reject adversarial prompts.
Indirect prompt injection requires additional measures, such as human involvement through
reinforcement learning from human feedback (RLHF) to align models with desired human
values. Filtering out instructions from external sources and employing LLM moderators are
also suggested approaches. Additionally, interpretability-based solutions can help detect and
prevent anomalous inputs by analyzing the prediction trajectory of AI models.
As the cybersecurity landscape continues to evolve with the proliferation of generative AI,
understanding and addressing vulnerabilities like prompt injection is crucial. Organizations
like IBM Security are at the forefront, delivering AI cybersecurity solutions to bolster defense
mechanisms against emerging threats.

More Related Content

Similar to Understanding and Defending Against Prompt Injection Attacks in AI Systems

AI and Machine Learning in Cybersecurity.pdf
AI and Machine Learning in Cybersecurity.pdfAI and Machine Learning in Cybersecurity.pdf
AI and Machine Learning in Cybersecurity.pdfCiente
 
Classification of Malware Attacks Using Machine Learning In Decision Tree
Classification of Malware Attacks Using Machine Learning In Decision TreeClassification of Malware Attacks Using Machine Learning In Decision Tree
Classification of Malware Attacks Using Machine Learning In Decision TreeCSCJournals
 
THE INTEREST OF HYBRIDIZING EXPLAINABLE AI WITH RNN TO RESOLVE DDOS ATTACKS: ...
THE INTEREST OF HYBRIDIZING EXPLAINABLE AI WITH RNN TO RESOLVE DDOS ATTACKS: ...THE INTEREST OF HYBRIDIZING EXPLAINABLE AI WITH RNN TO RESOLVE DDOS ATTACKS: ...
THE INTEREST OF HYBRIDIZING EXPLAINABLE AI WITH RNN TO RESOLVE DDOS ATTACKS: ...IJNSA Journal
 
Automated Emerging Cyber Threat Identification and Profiling Based on Natural...
Automated Emerging Cyber Threat Identification and Profiling Based on Natural...Automated Emerging Cyber Threat Identification and Profiling Based on Natural...
Automated Emerging Cyber Threat Identification and Profiling Based on Natural...Shakas Technologies
 
Vulnerability in ai
 Vulnerability in ai Vulnerability in ai
Vulnerability in aiSrajalTiwari1
 
Information Security Awareness
Information Security AwarenessInformation Security Awareness
Information Security AwarenessDigit Oktavianto
 
Adversarial Attacks and Defenses in Malware Classification: A Survey
Adversarial Attacks and Defenses in Malware Classification: A SurveyAdversarial Attacks and Defenses in Malware Classification: A Survey
Adversarial Attacks and Defenses in Malware Classification: A SurveyCSCJournals
 
Healthcares Vulnerability to Ransomware AttacksResearch questio
Healthcares Vulnerability to Ransomware AttacksResearch questioHealthcares Vulnerability to Ransomware AttacksResearch questio
Healthcares Vulnerability to Ransomware AttacksResearch questioSusanaFurman449
 
Automatic Detection of Social Engineering Attacks Using Dialog
Automatic Detection of Social Engineering Attacks Using DialogAutomatic Detection of Social Engineering Attacks Using Dialog
Automatic Detection of Social Engineering Attacks Using Dialogiosrjce
 
[DSC Europe 23] Aleksandar Tomcic - Adversarial Attacks
[DSC Europe 23] Aleksandar Tomcic - Adversarial Attacks[DSC Europe 23] Aleksandar Tomcic - Adversarial Attacks
[DSC Europe 23] Aleksandar Tomcic - Adversarial AttacksDataScienceConferenc1
 
AN EXPERT SYSTEM AS AN AWARENESS TOOL TO PREVENT SOCIAL ENGINEERING ATTACKS I...
AN EXPERT SYSTEM AS AN AWARENESS TOOL TO PREVENT SOCIAL ENGINEERING ATTACKS I...AN EXPERT SYSTEM AS AN AWARENESS TOOL TO PREVENT SOCIAL ENGINEERING ATTACKS I...
AN EXPERT SYSTEM AS AN AWARENESS TOOL TO PREVENT SOCIAL ENGINEERING ATTACKS I...IJCI JOURNAL
 
Unleashing the Power of AI in Cybersecurity.pdf
Unleashing the Power of AI in Cybersecurity.pdfUnleashing the Power of AI in Cybersecurity.pdf
Unleashing the Power of AI in Cybersecurity.pdfcyberprosocial
 
Report on Human factor in the financial industry
Report on Human factor in the financial industryReport on Human factor in the financial industry
Report on Human factor in the financial industryChandrak Trivedi
 
Empowering Cyber Threat Intelligence with AI
Empowering Cyber Threat Intelligence with AIEmpowering Cyber Threat Intelligence with AI
Empowering Cyber Threat Intelligence with AIIJCI JOURNAL
 
Data security in AI systems
Data security in AI systemsData security in AI systems
Data security in AI systemsBenjaminlapid1
 
An Indistinguishability Model for Evaluating Diverse Classes of Phishing Atta...
An Indistinguishability Model for Evaluating Diverse Classes of Phishing Atta...An Indistinguishability Model for Evaluating Diverse Classes of Phishing Atta...
An Indistinguishability Model for Evaluating Diverse Classes of Phishing Atta...CSCJournals
 
An Indistinguishability Model for Evaluating Diverse Classes of Phishing Atta...
An Indistinguishability Model for Evaluating Diverse Classes of Phishing Atta...An Indistinguishability Model for Evaluating Diverse Classes of Phishing Atta...
An Indistinguishability Model for Evaluating Diverse Classes of Phishing Atta...CSCJournals
 

Similar to Understanding and Defending Against Prompt Injection Attacks in AI Systems (20)

AI and Machine Learning in Cybersecurity.pdf
AI and Machine Learning in Cybersecurity.pdfAI and Machine Learning in Cybersecurity.pdf
AI and Machine Learning in Cybersecurity.pdf
 
Classification of Malware Attacks Using Machine Learning In Decision Tree
Classification of Malware Attacks Using Machine Learning In Decision TreeClassification of Malware Attacks Using Machine Learning In Decision Tree
Classification of Malware Attacks Using Machine Learning In Decision Tree
 
THE INTEREST OF HYBRIDIZING EXPLAINABLE AI WITH RNN TO RESOLVE DDOS ATTACKS: ...
THE INTEREST OF HYBRIDIZING EXPLAINABLE AI WITH RNN TO RESOLVE DDOS ATTACKS: ...THE INTEREST OF HYBRIDIZING EXPLAINABLE AI WITH RNN TO RESOLVE DDOS ATTACKS: ...
THE INTEREST OF HYBRIDIZING EXPLAINABLE AI WITH RNN TO RESOLVE DDOS ATTACKS: ...
 
Automated Emerging Cyber Threat Identification and Profiling Based on Natural...
Automated Emerging Cyber Threat Identification and Profiling Based on Natural...Automated Emerging Cyber Threat Identification and Profiling Based on Natural...
Automated Emerging Cyber Threat Identification and Profiling Based on Natural...
 
Vulnerability in ai
 Vulnerability in ai Vulnerability in ai
Vulnerability in ai
 
Information Security Awareness
Information Security AwarenessInformation Security Awareness
Information Security Awareness
 
Adversarial Attacks and Defenses in Malware Classification: A Survey
Adversarial Attacks and Defenses in Malware Classification: A SurveyAdversarial Attacks and Defenses in Malware Classification: A Survey
Adversarial Attacks and Defenses in Malware Classification: A Survey
 
Healthcares Vulnerability to Ransomware AttacksResearch questio
Healthcares Vulnerability to Ransomware AttacksResearch questioHealthcares Vulnerability to Ransomware AttacksResearch questio
Healthcares Vulnerability to Ransomware AttacksResearch questio
 
M017657678
M017657678M017657678
M017657678
 
Automatic Detection of Social Engineering Attacks Using Dialog
Automatic Detection of Social Engineering Attacks Using DialogAutomatic Detection of Social Engineering Attacks Using Dialog
Automatic Detection of Social Engineering Attacks Using Dialog
 
[DSC Europe 23] Aleksandar Tomcic - Adversarial Attacks
[DSC Europe 23] Aleksandar Tomcic - Adversarial Attacks[DSC Europe 23] Aleksandar Tomcic - Adversarial Attacks
[DSC Europe 23] Aleksandar Tomcic - Adversarial Attacks
 
AN EXPERT SYSTEM AS AN AWARENESS TOOL TO PREVENT SOCIAL ENGINEERING ATTACKS I...
AN EXPERT SYSTEM AS AN AWARENESS TOOL TO PREVENT SOCIAL ENGINEERING ATTACKS I...AN EXPERT SYSTEM AS AN AWARENESS TOOL TO PREVENT SOCIAL ENGINEERING ATTACKS I...
AN EXPERT SYSTEM AS AN AWARENESS TOOL TO PREVENT SOCIAL ENGINEERING ATTACKS I...
 
Unleashing the Power of AI in Cybersecurity.pdf
Unleashing the Power of AI in Cybersecurity.pdfUnleashing the Power of AI in Cybersecurity.pdf
Unleashing the Power of AI in Cybersecurity.pdf
 
Report on Human factor in the financial industry
Report on Human factor in the financial industryReport on Human factor in the financial industry
Report on Human factor in the financial industry
 
Cyber terroristism
Cyber terroristismCyber terroristism
Cyber terroristism
 
Empowering Cyber Threat Intelligence with AI
Empowering Cyber Threat Intelligence with AIEmpowering Cyber Threat Intelligence with AI
Empowering Cyber Threat Intelligence with AI
 
Cyber terroristism
Cyber terroristismCyber terroristism
Cyber terroristism
 
Data security in AI systems
Data security in AI systemsData security in AI systems
Data security in AI systems
 
An Indistinguishability Model for Evaluating Diverse Classes of Phishing Atta...
An Indistinguishability Model for Evaluating Diverse Classes of Phishing Atta...An Indistinguishability Model for Evaluating Diverse Classes of Phishing Atta...
An Indistinguishability Model for Evaluating Diverse Classes of Phishing Atta...
 
An Indistinguishability Model for Evaluating Diverse Classes of Phishing Atta...
An Indistinguishability Model for Evaluating Diverse Classes of Phishing Atta...An Indistinguishability Model for Evaluating Diverse Classes of Phishing Atta...
An Indistinguishability Model for Evaluating Diverse Classes of Phishing Atta...
 

More from cyberprosocial

Mastering Hierarchical Clustering: A Comprehensive Guide
Mastering Hierarchical Clustering: A Comprehensive GuideMastering Hierarchical Clustering: A Comprehensive Guide
Mastering Hierarchical Clustering: A Comprehensive Guidecyberprosocial
 
Vulnerabilities in AI-as-a-Service Pose Threats to Data Security
Vulnerabilities in AI-as-a-Service Pose Threats to Data SecurityVulnerabilities in AI-as-a-Service Pose Threats to Data Security
Vulnerabilities in AI-as-a-Service Pose Threats to Data Securitycyberprosocial
 
Demystifying Penetration Testing: A Comprehensive Guide for Security Enhancement
Demystifying Penetration Testing: A Comprehensive Guide for Security EnhancementDemystifying Penetration Testing: A Comprehensive Guide for Security Enhancement
Demystifying Penetration Testing: A Comprehensive Guide for Security Enhancementcyberprosocial
 
Effective Cyber Security Technology Solutions for Modern Challenges
Effective Cyber Security Technology Solutions for Modern ChallengesEffective Cyber Security Technology Solutions for Modern Challenges
Effective Cyber Security Technology Solutions for Modern Challengescyberprosocial
 
Mastering Cybersecurity Risk Management: Strategies to Safeguard Your Digital...
Mastering Cybersecurity Risk Management: Strategies to Safeguard Your Digital...Mastering Cybersecurity Risk Management: Strategies to Safeguard Your Digital...
Mastering Cybersecurity Risk Management: Strategies to Safeguard Your Digital...cyberprosocial
 
The Looming Security Threat: AI-Powered Coding Tools
The Looming Security Threat: AI-Powered Coding ToolsThe Looming Security Threat: AI-Powered Coding Tools
The Looming Security Threat: AI-Powered Coding Toolscyberprosocial
 
Vulnerability in Ray AI Framework Exploited, Hundreds of Clusters Compromised
Vulnerability in Ray AI Framework Exploited, Hundreds of Clusters CompromisedVulnerability in Ray AI Framework Exploited, Hundreds of Clusters Compromised
Vulnerability in Ray AI Framework Exploited, Hundreds of Clusters Compromisedcyberprosocial
 
Understanding Decision Trees in Machine Learning: A Comprehensive Guide
Understanding Decision Trees in Machine Learning: A Comprehensive GuideUnderstanding Decision Trees in Machine Learning: A Comprehensive Guide
Understanding Decision Trees in Machine Learning: A Comprehensive Guidecyberprosocial
 
Demystifying Natural Language Processing: A Beginner’s Guide
Demystifying Natural Language Processing: A Beginner’s GuideDemystifying Natural Language Processing: A Beginner’s Guide
Demystifying Natural Language Processing: A Beginner’s Guidecyberprosocial
 
Revolutionizing Industries: A Deep Dive into the Technology in Robots
Revolutionizing Industries: A Deep Dive into the Technology in RobotsRevolutionizing Industries: A Deep Dive into the Technology in Robots
Revolutionizing Industries: A Deep Dive into the Technology in Robotscyberprosocial
 
Blockchain: Revolutionizing Industries and Transforming Transactions
Blockchain: Revolutionizing Industries and Transforming TransactionsBlockchain: Revolutionizing Industries and Transforming Transactions
Blockchain: Revolutionizing Industries and Transforming Transactionscyberprosocial
 
Cryptocurrency: Revolutionizing the Financial Landscape
Cryptocurrency: Revolutionizing the Financial LandscapeCryptocurrency: Revolutionizing the Financial Landscape
Cryptocurrency: Revolutionizing the Financial Landscapecyberprosocial
 
Artificial Intelligence: Shaping the Future of Technology
Artificial Intelligence: Shaping the Future of TechnologyArtificial Intelligence: Shaping the Future of Technology
Artificial Intelligence: Shaping the Future of Technologycyberprosocial
 
The Evolution of Cyber Threats: Past, Present, and Future Trends
The Evolution of Cyber Threats: Past, Present, and Future TrendsThe Evolution of Cyber Threats: Past, Present, and Future Trends
The Evolution of Cyber Threats: Past, Present, and Future Trendscyberprosocial
 
Explain the Role of Microservices in Cloud-native Architecture
Explain the Role of Microservices in Cloud-native ArchitectureExplain the Role of Microservices in Cloud-native Architecture
Explain the Role of Microservices in Cloud-native Architecturecyberprosocial
 
Unveiling the Shadows: A Comprehensive Guide to Malware Analysis for Ensuring...
Unveiling the Shadows: A Comprehensive Guide to Malware Analysis for Ensuring...Unveiling the Shadows: A Comprehensive Guide to Malware Analysis for Ensuring...
Unveiling the Shadows: A Comprehensive Guide to Malware Analysis for Ensuring...cyberprosocial
 
Unraveling the Web: The Crucial Role of Network Traffic Analysis
Unraveling the Web: The Crucial Role of Network Traffic AnalysisUnraveling the Web: The Crucial Role of Network Traffic Analysis
Unraveling the Web: The Crucial Role of Network Traffic Analysiscyberprosocial
 
Unlocking the Potential: A Comprehensive Guide to Understanding and Securing ...
Unlocking the Potential: A Comprehensive Guide to Understanding and Securing ...Unlocking the Potential: A Comprehensive Guide to Understanding and Securing ...
Unlocking the Potential: A Comprehensive Guide to Understanding and Securing ...cyberprosocial
 
Safeguarding the Digital Realm: Understanding CyberAttacks and Their Vital Co...
Safeguarding the Digital Realm: Understanding CyberAttacks and Their Vital Co...Safeguarding the Digital Realm: Understanding CyberAttacks and Their Vital Co...
Safeguarding the Digital Realm: Understanding CyberAttacks and Their Vital Co...cyberprosocial
 
Revolutionizing Cybersecurity: The Era of Distributed AI Systems
Revolutionizing Cybersecurity: The Era of Distributed AI SystemsRevolutionizing Cybersecurity: The Era of Distributed AI Systems
Revolutionizing Cybersecurity: The Era of Distributed AI Systemscyberprosocial
 

More from cyberprosocial (20)

Mastering Hierarchical Clustering: A Comprehensive Guide
Mastering Hierarchical Clustering: A Comprehensive GuideMastering Hierarchical Clustering: A Comprehensive Guide
Mastering Hierarchical Clustering: A Comprehensive Guide
 
Vulnerabilities in AI-as-a-Service Pose Threats to Data Security
Vulnerabilities in AI-as-a-Service Pose Threats to Data SecurityVulnerabilities in AI-as-a-Service Pose Threats to Data Security
Vulnerabilities in AI-as-a-Service Pose Threats to Data Security
 
Demystifying Penetration Testing: A Comprehensive Guide for Security Enhancement
Demystifying Penetration Testing: A Comprehensive Guide for Security EnhancementDemystifying Penetration Testing: A Comprehensive Guide for Security Enhancement
Demystifying Penetration Testing: A Comprehensive Guide for Security Enhancement
 
Effective Cyber Security Technology Solutions for Modern Challenges
Effective Cyber Security Technology Solutions for Modern ChallengesEffective Cyber Security Technology Solutions for Modern Challenges
Effective Cyber Security Technology Solutions for Modern Challenges
 
Mastering Cybersecurity Risk Management: Strategies to Safeguard Your Digital...
Mastering Cybersecurity Risk Management: Strategies to Safeguard Your Digital...Mastering Cybersecurity Risk Management: Strategies to Safeguard Your Digital...
Mastering Cybersecurity Risk Management: Strategies to Safeguard Your Digital...
 
The Looming Security Threat: AI-Powered Coding Tools
The Looming Security Threat: AI-Powered Coding ToolsThe Looming Security Threat: AI-Powered Coding Tools
The Looming Security Threat: AI-Powered Coding Tools
 
Vulnerability in Ray AI Framework Exploited, Hundreds of Clusters Compromised
Vulnerability in Ray AI Framework Exploited, Hundreds of Clusters CompromisedVulnerability in Ray AI Framework Exploited, Hundreds of Clusters Compromised
Vulnerability in Ray AI Framework Exploited, Hundreds of Clusters Compromised
 
Understanding Decision Trees in Machine Learning: A Comprehensive Guide
Understanding Decision Trees in Machine Learning: A Comprehensive GuideUnderstanding Decision Trees in Machine Learning: A Comprehensive Guide
Understanding Decision Trees in Machine Learning: A Comprehensive Guide
 
Demystifying Natural Language Processing: A Beginner’s Guide
Demystifying Natural Language Processing: A Beginner’s GuideDemystifying Natural Language Processing: A Beginner’s Guide
Demystifying Natural Language Processing: A Beginner’s Guide
 
Revolutionizing Industries: A Deep Dive into the Technology in Robots
Revolutionizing Industries: A Deep Dive into the Technology in RobotsRevolutionizing Industries: A Deep Dive into the Technology in Robots
Revolutionizing Industries: A Deep Dive into the Technology in Robots
 
Blockchain: Revolutionizing Industries and Transforming Transactions
Blockchain: Revolutionizing Industries and Transforming TransactionsBlockchain: Revolutionizing Industries and Transforming Transactions
Blockchain: Revolutionizing Industries and Transforming Transactions
 
Cryptocurrency: Revolutionizing the Financial Landscape
Cryptocurrency: Revolutionizing the Financial LandscapeCryptocurrency: Revolutionizing the Financial Landscape
Cryptocurrency: Revolutionizing the Financial Landscape
 
Artificial Intelligence: Shaping the Future of Technology
Artificial Intelligence: Shaping the Future of TechnologyArtificial Intelligence: Shaping the Future of Technology
Artificial Intelligence: Shaping the Future of Technology
 
The Evolution of Cyber Threats: Past, Present, and Future Trends
The Evolution of Cyber Threats: Past, Present, and Future TrendsThe Evolution of Cyber Threats: Past, Present, and Future Trends
The Evolution of Cyber Threats: Past, Present, and Future Trends
 
Explain the Role of Microservices in Cloud-native Architecture
Explain the Role of Microservices in Cloud-native ArchitectureExplain the Role of Microservices in Cloud-native Architecture
Explain the Role of Microservices in Cloud-native Architecture
 
Unveiling the Shadows: A Comprehensive Guide to Malware Analysis for Ensuring...
Unveiling the Shadows: A Comprehensive Guide to Malware Analysis for Ensuring...Unveiling the Shadows: A Comprehensive Guide to Malware Analysis for Ensuring...
Unveiling the Shadows: A Comprehensive Guide to Malware Analysis for Ensuring...
 
Unraveling the Web: The Crucial Role of Network Traffic Analysis
Unraveling the Web: The Crucial Role of Network Traffic AnalysisUnraveling the Web: The Crucial Role of Network Traffic Analysis
Unraveling the Web: The Crucial Role of Network Traffic Analysis
 
Unlocking the Potential: A Comprehensive Guide to Understanding and Securing ...
Unlocking the Potential: A Comprehensive Guide to Understanding and Securing ...Unlocking the Potential: A Comprehensive Guide to Understanding and Securing ...
Unlocking the Potential: A Comprehensive Guide to Understanding and Securing ...
 
Safeguarding the Digital Realm: Understanding CyberAttacks and Their Vital Co...
Safeguarding the Digital Realm: Understanding CyberAttacks and Their Vital Co...Safeguarding the Digital Realm: Understanding CyberAttacks and Their Vital Co...
Safeguarding the Digital Realm: Understanding CyberAttacks and Their Vital Co...
 
Revolutionizing Cybersecurity: The Era of Distributed AI Systems
Revolutionizing Cybersecurity: The Era of Distributed AI SystemsRevolutionizing Cybersecurity: The Era of Distributed AI Systems
Revolutionizing Cybersecurity: The Era of Distributed AI Systems
 

Recently uploaded

Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the ClassroomPooky Knightsmith
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Association for Project Management
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - Englishneillewis46
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentationcamerronhm
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jisc
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxJisc
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsMebane Rash
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.MaryamAhmad92
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and ModificationsMJDuyan
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17Celine George
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...Poonam Aher Patil
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptxMaritesTamaniVerdade
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structuredhanjurrannsibayan2
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxAreebaZafar22
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxmarlenawright1
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxDenish Jangid
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...pradhanghanshyam7136
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxVishalSingh1417
 

Recently uploaded (20)

Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptx
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
Spatium Project Simulation student brief
Spatium Project Simulation student briefSpatium Project Simulation student brief
Spatium Project Simulation student brief
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 

Understanding and Defending Against Prompt Injection Attacks in AI Systems

  • 1. Understanding and Defending Against Prompt Injection Attacks in AI Systems  The Growing Threat of Prompt Injection Attacks The National Institute of Standards and Technology (NIST) is keeping a close eye on the AI landscape, and with good reason. As artificial intelligence (AI) becomes more widespread, so does the discovery and exploitation of its vulnerabilities, especially in cybersecurity. One particular vulnerability that has garnered attention is prompt injection, particularly targeting generative AI systems. In a comprehensive report titled “Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations,” NIST outlines various tactics and cyberattacks falling under adversarial machine learning (AML), including prompt injection. These tactics aim to exploit the behavior of machine learning (ML) systems, particularly large language models (LLMs), to bypass security measures and open avenues for exploitation. Understanding Prompt Injection Attacks
  • 2. Prompt injection, as defined by NIST, encompasses two primary attack types: direct and indirect. In direct prompt injection, users input text prompts that induce unintended or unauthorized actions by the LLM. On the other hand, indirect prompt injection involves tampering with or poisoning the data inputs of an LLM. An infamous example of direct prompt injection is the DAN (Do Anything Now) method, initially used against ChatGPT. DAN involves roleplaying scenarios to evade moderation filters. Despite efforts by ChatGPT’s developers to counter such tactics, users continually find ways to circumvent filters, leading to the evolution of methods like DAN 12.0. Indirect prompt injection relies on providing sources that an LLM would ingest, such as documents, web pages, or audio files. These attacks range from seemingly harmless, like inducing a chatbot to use “pirate talk,” to more malicious endeavors, such as coercing users to reveal sensitive personal information. Defending Against Prompt Injection Attacks Combatting prompt injection attacks presents a significant challenge due to their covert nature and evolving tactics. NIST recommends defensive strategies for mitigating these threats. For direct prompt injection, creators of AI models should carefully curate training datasets and train models to recognize and reject adversarial prompts. Indirect prompt injection requires additional measures, such as human involvement through reinforcement learning from human feedback (RLHF) to align models with desired human values. Filtering out instructions from external sources and employing LLM moderators are also suggested approaches. Additionally, interpretability-based solutions can help detect and prevent anomalous inputs by analyzing the prediction trajectory of AI models. As the cybersecurity landscape continues to evolve with the proliferation of generative AI, understanding and addressing vulnerabilities like prompt injection is crucial. Organizations like IBM Security are at the forefront, delivering AI cybersecurity solutions to bolster defense mechanisms against emerging threats.