Artificial Intelligence Large Language Model (LLM) and Machine Learning (ML) Application Security Threats and Defenses: the OWASP Top Tens for LLM and ML, along with software development attack-prevention best practices.
OWASP Top 10 AI 2023 Attack Risks

For Large Language Models (LLM):
LLM01: Prompt Injection
LLM02: Insecure Output Handling
LLM03: Training Data Poisoning
LLM04: Model Denial of Service
LLM05: Supply Chain Vulnerabilities
LLM06: Sensitive Information Disclosure
LLM07: Insecure Plugin Design
LLM08: Excessive Agency
LLM09: Overreliance
LLM10: Model Theft

For Machine Learning (ML):
ML01: Adversarial Attack
ML02: Data Poisoning Attack
ML03: Model Inversion Attack
ML04: Membership Inference Attack
ML05: Model Stealing
ML06: Corrupted Packages
ML07: Transfer Learning Attack
ML08: Model Skewing
ML09: Output Integrity Attack
ML10: Neural Net Reprogramming
OWASP 2023: LLM09 Overreliance
Overreliance on LLMs can lead to serious consequences such as
misinformation, legal issues, and security vulnerabilities.
It occurs when an LLM is trusted to make critical decisions or generate
content without adequate oversight or validation.
ATTACK SCENARIOS
+ AI fed misleading info leading to disinformation
+ AI's code suggestions introduce security vulnerabilities
+ Developer unknowingly integrates malicious package suggested by AI
EXAMPLES
+ LLM provides incorrect information
+ LLM generates nonsensical text
+ LLM suggests insecure code
+ Inadequate risk communication from LLM providers
PREVENTION
+ Regular monitoring and review of LLM outputs
+ Cross-check LLM output with trusted sources
+ Enhance model with fine-tuning or embeddings
+ Implement automatic validation mechanisms (see the sketch below)
+ Break tasks into manageable subtasks
+ Clearly communicate LLM risks and limitations
+ Secure coding practices in development environments
DEFENSES
+ SSDF AppSec Coding Standards
+ SDLC continuous functional verification testing
+ D7 Monitoring & Alerting
+ Developer and User AI Risks Training
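A minimal sketch of one such automatic validation mechanism, assuming output can be decomposed into discrete claims; trusted_lookup and the sample facts are hypothetical placeholders, not a prescribed design:

```python
# Hypothetical sketch: route LLM claims through a trusted-source cross-check.
def trusted_lookup(claim: str) -> bool:
    """Placeholder for a query against a curated knowledge base or API."""
    return claim in {"Paris is the capital of France"}  # assumed trusted facts

def validate_llm_output(claims: list[str]) -> dict:
    verified = [c for c in claims if trusted_lookup(c)]
    unverified = [c for c in claims if not trusted_lookup(c)]
    # Unverified claims go to human review instead of straight to users.
    return {"verified": verified, "needs_review": unverified}

print(validate_llm_output(["Paris is the capital of France",
                           "The moon is made of cheese"]))
```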
Defenses
Baseline Cybersecurity: NIST 800-53 (maps to NIST CSF and ISO 27002)
Secure Software Development Framework (SSDF), NIST SP 800-218
With Host Platform Security Hardening
With Secure Software Development Life Cycle (SSDLC) practice
With AI Attack Hardening
AI – software programs whose objective is to learn and to appear to reason as humans do
ML – adaptive algorithms that learn without discrete (explicit) programming
Deep Learning – large language/data processing using "neural networks"
Each of these definitions carries much more detail.
ML
Supervised Learning – human-operator-facilitated classification and prediction/extrapolation
Reinforcement Learning – learning by trial and error
Unsupervised Learning – self-derived summary generalizations & feedback adjustments
AI applications
"Thinking": analytical analysis
Human natural-language I/O – text, images, audio
Health care AI-focused solutions:
Classification – diagnosis
Imaging analysis (microscopy/X-rays/etc.)
Records processing
https://pub.towardsai.net/large-language-models-and-gpt-4-architecture-and-openai-api-d8f1c070e0fc
LLMs
Model human communication languages
And machine/signal languages
Neural networks for parallel multifunctional processing
Parameters – data elements/digital meta-tags
Real-time processing analysis – predicting next words and topics via vector representations (see the toy sketch below)
https://blogs.nvidia.com/blog/2022/03/25/what-is-a-transformer-model/
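A toy illustration of that next-word prediction (the vocabulary and logit scores below are invented; a real transformer derives the scores from learned vector representations):

```python
# Illustrative only: next-token prediction as a probability distribution.
import numpy as np

vocab = ["the", "model", "predicts", "words", "topics"]
logits = np.array([1.2, 0.3, 2.5, 0.9, -0.4])  # hypothetical next-token scores

probs = np.exp(logits - logits.max())          # numerically stable softmax
probs /= probs.sum()

for token, p in sorted(zip(vocab, probs), key=lambda t: -t[1]):
    print(f"{token}: {p:.2f}")                 # "predicts" ranks most likely
```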
AI Solution
Transformer/Foundation Model – a neural network that adapts with ML, based on:
Reference data
Training
Adaptations
Processing & rendering user/consumer I/Os
For AI, all standard application security and data loss prevention considerations and protection strategies apply.
But due to human interactions and impact, special legal considerations also apply
for data analysis & sensitive information processing.
When considering AI-technology-enabled solutions,
there are some key considerations to keep in mind.
Contextual understanding
LLMs don't always get it right and are often unable to understand the context, leading to inappropriate or just plain wrong answers.
Humans struggle with multicultural context in communications (idioms, sarcasm).
Translating between languages is much more complicated than pure grammar and diction.
Language/slang changes over time, e.g.:
Calculator – is that a person or an electronic device? Depends on era context.
Punctuation – are message capitalizations abbreviations or aggressive "shouting"? Is a full stop/period an emotion?
Common sense
Common sense is difficult to quantify, but humans learn it from an early age. LLMs only understand what has been supplied through their training data, and this does not give them a true comprehension of the world they exist in.
An LLM is only as good as its training data.
Accuracy can never be guaranteed: "Garbage In, Garbage Out".
Bias
Any biases present in the training data can often be present in responses. This includes biases towards gender, race, geography, and culture.
https://www.makeuseof.com/what-are-large-langauge-models-how-do-they-work/
https://en.wikipedia.org/wiki/List_of_cognitive_biases
"Even with the best of intentions, what could possibly go wrong?"
No matter how altruistic and logical programmers think they are, all humans are still susceptible to human weaknesses.
These partialities and prejudices get embedded into the model design
and are affected by training/feedback users.
Anchoring bias
The tendency to rely too heavily—to "anchor"—on one trait or piece of information when making decisions (usually the first piece of information acquired on that subject)
Apophenia
The tendency to perceive meaningful connections between unrelated things.
Availability bias
The tendency to overestimate the likelihood of events with greater "availability" in memory, which can be influenced by how recent the memories are or how unusual or emotionally charged they may be.
Cognitive dissonance
Perception of contradictory information and the mental toll of it.
Confirmation bias
The tendency to search for, interpret, focus on and remember information in a way that confirms one's preconceptions.
Egocentric bias
The tendency to rely too heavily on one's own perspective and/or have a different perception of oneself relative to others.
Extension neglect
Ignoring the sample size when it is relevant to the determination.
False priors
Initial beliefs and knowledge which interfere with the unbiased evaluation of factual evidence and lead to incorrect conclusions.
The framing effect
The tendency to draw different conclusions from the same information, depending on how that information is presented.
Logical fallacy
The use of invalid or otherwise faulty reasoning in the construction of an argument that may appear to be well-reasoned if unnoticed.
Prospect theory
How individuals assess their loss and gain perspectives in an asymmetric manner.
Self-assessment
The tendency for unskilled individuals to overestimate their own ability and for experts to underestimate their own ability.
Truth judgment
Also called Belief bias, an effect where someone's evaluation of the logical strength of an argument is biased by the believability of the conclusion.
Illusory truth effect, the tendency to believe that a statement is true if it is easier to process, or if it has been stated multiple times, regardless of its actual veracity.
“Say something often enough and people will start to believe it.”
Attack Chain: https://atlas.mitre.org/
Real-World Epic Fail/Attack Use Cases
Reconnaissance – learn where and how AI is used by the target
Blogs, case studies, conference presentations, etc.
Web sites
Code repositories and contributions
Network/API probes
Resource Development
Acquire Public target artifacts (datasets, models)
Develop Poisoned training datasets
Establish Accounts on target systems for victim impersonation
Initial Access
Supply Chain access
Compromise Platforms
Compromise models (defense evasion)
ML Model Access
API Access and functional mapping
Full model access
Execution
User execution of unsafe ML artifacts
Malicious command injection/execution
Persistence
Backdoor model, data, command injection I/Os
Defense Evasion
Evade detection by ML security software (e.g. anti-malware)
Discovery
ML eco-system discovery ("doxing")
Collection
Exfiltration of ML artifacts
ML Attack Staging
Training proxy models
Creating adversarial data
Exfiltration
Stealing data (IP value) via ML I/O or IT vulnerabilities/compromises
Impact
Manipulate, corrupt, or destroy ML systems and/or data
Then identify your anticipated Baddies:
Types
Motivations (what they want)
Objectives that they would want to try exploiting through your application
“Threat Actors/Malicious Personas Library”
For each Baddie, review your designs against each of the STRIDE threats.
https://en.wikipedia.org/wiki/STRIDE_(security)
THREAT: LLM01 Prompt Injection
Direct – attacker-entered prompts
Indirect – hidden prompts reaching a victim user/process via:
a web page
a document file with embedded prompt injections that the LLM is asked to summarize
DEFENSE
D6 Input / Output (validation & sanitization)
D3 Access Control
Not just the UI, but the backend
External content
Trust-boundary segmented design (see the screening sketch below)
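A minimal sketch of the D6 screening plus trust-boundary idea for external content; the patterns and tag names are illustrative assumptions, and pattern matching alone is easy to evade, so it complements rather than replaces segmented design:

```python
# Hypothetical sketch: screen untrusted text before it reaches the LLM, and
# keep external content inside a clearly delimited, least-privilege context.
import re

SUSPICIOUS = [
    r"ignore (all |any )?(previous|prior) instructions",
    r"you are now",
    r"system prompt",
]

def screen_untrusted_text(text: str) -> str:
    """Flag likely injection phrases in external content."""
    findings = [p for p in SUSPICIOUS if re.search(p, text, re.IGNORECASE)]
    if findings:
        raise ValueError(f"possible prompt injection: {findings}")
    return text

def wrap_external_content(text: str) -> str:
    """Delimit external content so the model can be told to treat it as data."""
    return "<external_content>\n" + screen_untrusted_text(text) + "\n</external_content>"
```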
THREAT: ML01 Adversarial Attack
Biometric marker twins
Trojans
Manipulate the features of the network traffic, such as the source IP address, destination IP address, or payload, in such a way that they are not detected by the intrusion detection system (IDS).
For example, the attacker may hide their source IP address behind a proxy server or encrypt the payload of their network traffic.
DEFENSE
D6 Input/Output
SSDLC Security Functional/UAT Test
Adversarial training ("MisUse Cases")
Train with adversarial examples to reduce being misled (see the FGSM sketch below).
Robust models:
Use models that are designed to be robust against adversarial attacks, such as adversarial training or models that incorporate defense mechanisms.
Input validation
Checking the input data for anomalies, such as unexpected values or patterns, and rejecting inputs that are likely to be malicious.
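A minimal sketch of adversarial training with FGSM, assuming a PyTorch classifier; the model, data, and epsilon are placeholders, and libraries such as the Adversarial Robustness Toolbox (linked at the end) provide hardened implementations:

```python
# Sketch: train on a mix of clean and adversarial inputs to reduce being misled.
import torch
import torch.nn as nn

def fgsm_example(model, x, y, eps=0.1):
    """Craft a Fast Gradient Sign Method (FGSM) adversarial example."""
    x = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).detach()   # nudge input to raise the loss

def adversarial_training_step(model, optimizer, x, y, eps=0.1):
    """One training step over both clean and adversarial versions of a batch."""
    x_adv = fgsm_example(model, x, y, eps)
    optimizer.zero_grad()
    loss = (nn.functional.cross_entropy(model(x), y)
            + nn.functional.cross_entropy(model(x_adv), y)) / 2
    loss.backward()
    optimizer.step()
    return loss.item()
```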
THREAT: ML04 Membership Inference Attack
Sensitive Data Exposure by inferring information about the Training Data used
Attacker has some valid reference data
Queries the model for matches to tease out sensitive information
Uses results analysis to infer sensitive information about real persons
DEFENSE
SDLC AI Design (e.g. hardening prediction outputs – see the sketch below)
SSDLC Sensitive Data Analysis
SSDLC Design Threat Assessment
SSDLC D7 Monitoring & Alerting
SDLC Maintenance AI Functional Testing Monitoring
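A minimal sketch of the output-hardening design choice referenced above: truncate and round prediction confidences so they leak less membership signal. The top-k and rounding values are illustrative assumptions:

```python
# Sketch: return coarse predictions instead of full-precision probabilities.
import numpy as np

def harden_prediction(probs: np.ndarray, top_k: int = 1, decimals: int = 1):
    """Return only the top-k labels with coarsely rounded confidence scores."""
    order = np.argsort(probs)[::-1][:top_k]
    return [(int(i), round(float(probs[i]), decimals)) for i in order]

# A full probability vector like [0.9713, 0.0212, 0.0075] becomes [(0, 1.0)],
# giving a membership-inference attacker far less signal to analyze.
print(harden_prediction(np.array([0.9713, 0.0212, 0.0075])))
```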
THREAT: LLM04 Model Denial of Service
Prompt Stuffing/Brute Force attack
DEFENSE
D6 I/O input validation & rate limiting (see the token-bucket sketch below)
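A minimal sketch of that rate-limiting defense as a per-caller token bucket; the capacity and refill numbers are assumptions, and production systems would also cap prompt length and context size:

```python
# Sketch: per-API-key token bucket against prompt stuffing / brute force.
import time
from collections import defaultdict

CAPACITY, REFILL_PER_SEC = 10, 0.5          # assumed limits per caller
_buckets = defaultdict(lambda: {"tokens": CAPACITY, "last": time.monotonic()})

def allow_request(api_key: str) -> bool:
    b = _buckets[api_key]
    now = time.monotonic()
    b["tokens"] = min(CAPACITY, b["tokens"] + (now - b["last"]) * REFILL_PER_SEC)
    b["last"] = now
    if b["tokens"] >= 1:
        b["tokens"] -= 1
        return True
    return False                             # reject: likely stuffing/brute force
```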
THREAT: ML02 Data Poisoning Attack
Same as LLM03 Training Data Poisoning
Injecting malicious data into training set
Or compromising an insider (similar to mobile-phone SIM cloners)
DEFENSE
D3 Access Management
D5 Data – sensitive data encryption
D7 Monitoring & Alerting – suspicious activities, data category drifts (see the drift-check sketch below)
SSDLC Design Reviews – secure data stores design
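A minimal sketch of the drift-check idea: compare the label distribution of an incoming training batch against a trusted baseline and alert on large shifts. The threshold is an assumption:

```python
# Sketch: alert when a new training batch's label mix drifts from baseline.
from collections import Counter

def label_distribution(labels):
    counts = Counter(labels)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def drift_alert(baseline_labels, new_labels, threshold=0.15):
    base, new = label_distribution(baseline_labels), label_distribution(new_labels)
    # Total variation distance between the two label distributions.
    keys = set(base) | set(new)
    tvd = 0.5 * sum(abs(base.get(k, 0) - new.get(k, 0)) for k in keys)
    return tvd > threshold                  # True => investigate possible poisoning

print(drift_alert(["ok"] * 90 + ["fraud"] * 10, ["ok"] * 60 + ["fraud"] * 40))  # True
```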
THREAT: LLM03 Training Data Poisoning – same as ML02 Data Poisoning Attack
Poisons training data
DEFENSE
D3 Access Management
D5 Data
D7 Monitoring & Alerting
SSDLC Design Reviews
Training data poisoning analogies: cloned and multiplied defects (DNA inbreeding amplifying defects), a propaganda lie repeated until it becomes majority "common knowledge", groupthink/bandwagoning as a self-reinforcing feedback loop.
THREAT: ML08 Model Skewing
Manipulate training data
Attacker wants a high-risk loan application approved
Feeds back manipulated data indicating that high-risk loans have previously been approved
The feedback loop uses this information to modify the model's approval criteria
DEFENSE
D3 Access Management – only allow trusted admins
D5 Data – sensitive information protection, signing
D6 Input / Output
Data validation of outputs that are fed back into inputs (see the signed-feedback sketch below)
D7 Monitoring & Alerting
Deviations from designed access controls
Periodic auditing
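A minimal sketch of the signed-feedback validation referenced above: only authenticated, in-range feedback records may enter retraining. The HMAC key handling and field names are hypothetical:

```python
# Sketch: sign feedback records and sanity-check them before retraining.
import hashlib, hmac, json

SECRET = b"replace-with-managed-key"        # assumption: key from a secrets manager

def sign_record(record: dict) -> str:
    payload = json.dumps(record, sort_keys=True).encode()
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()

def accept_feedback(record: dict, signature: str) -> bool:
    if not hmac.compare_digest(sign_record(record), signature):
        return False                        # tampered or unsigned: reject
    # Approvals of very high-risk loans are exactly what ML08 attackers
    # inject, so route those to human audit instead of auto-retraining.
    return not (record.get("risk_score", 0) > 0.8 and record.get("approved"))
```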
Movies:
Minority Report (prediction modeling – ignoring instrumentation so all appears well),
Black Mirror "Nosedive" (harmful effects of social-ranking manipulations)
THREAT: LLM06 Sensitive Information Disclosure
Like an Access Management vulnerability
Legitimate users/consuming apps are provided with “illegal” information
Caused by poor design
DEFENSE
D3 Access Management
D6 Input / Output
SSDLC Security Design Review
SSDLC Security Functional Testing
THREAT: LLM05 Supply Chain Vulnerabilities – same as ML06 Corrupted Packages
Favorite of ransomware attackers – injecting Trojan malware into trusted
open source (too many examples to enumerate) or
compromised 3rd-party tools (SolarWinds Orion software)
DEFENSE
D2 Frameworks & Components – Enterprise Formulary and Binary Package Manager (Artifactory)
DevSecOps SBOM – know your dependencies (see the hash-pinning sketch below)
SSDLC SCA vulnerability testing (continuous) – JFrog Xray
SSDLC Security Code Reviews – minimized usage, safe use
With SAST for internally developed packages
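A minimal sketch of the hash-pinning side of "know your dependencies": verify a downloaded package against a pinned digest before install. The file name and digest are hypothetical placeholders; Artifactory/Xray perform this at enterprise scale:

```python
# Sketch: reject packages that are off-formulary or fail digest verification.
import hashlib, pathlib, sys

PINNED = {
    "example_pkg-1.2.3.whl": "aa11-replace-with-real-digest",  # assumed entry
}

def verify(path: str) -> None:
    digest = hashlib.sha256(pathlib.Path(path).read_bytes()).hexdigest()
    expected = PINNED.get(pathlib.Path(path).name)
    if expected is None or digest != expected:
        sys.exit(f"REJECT {path}: not on formulary or hash mismatch")
    print(f"OK {path}")
```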
THREAT: ML06 Corrupted Packages – same as LLM05 Supply Chain Vulnerabilities
3rd-party supply chain poisoning – inserting malware into trusted downloads
DEFENSE
D2 Frameworks & Components – Enterprise Formulary and Binary Package Manager (Artifactory)
DevSecOps SBOM – know your dependencies
SSDLC SCA vulnerability testing (continuous) – JFrog Xray
SSDLC Security Code Reviews – minimized usage, safe use
With SAST for internally developed packages
AppSec Defenders training – these risks and defensive strategies
THREAT: LLM07 Insecure Plugin Design
Allowing plugins that aren’t verified to be security hardened
E.g. promiscuous input handling
DEFENSE
D3 Access Management
D6 Input / Output (sanitization and validation – see the schema-validation sketch below)
SSDLC Design Threat Assessment
SSDLC AppSec Testing
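A minimal sketch of schema-validated plugin input using pydantic; the plugin name, fields, and bounds are hypothetical assumptions:

```python
# Sketch: strict, typed parameter schemas stop promiscuous input from
# reaching plugin internals.
from pydantic import BaseModel, Field, ValidationError

class SearchPluginInput(BaseModel):
    query: str = Field(min_length=1, max_length=200)   # bounded free text
    max_results: int = Field(default=5, ge=1, le=20)   # no unbounded fan-out

def run_plugin(raw: dict):
    try:
        params = SearchPluginInput(**raw)  # rejects wrong types / out-of-range values
    except ValidationError as e:
        return {"error": "invalid plugin input", "detail": str(e)}
    return {"query": params.query, "max_results": params.max_results}

print(run_plugin({"query": "quarterly report", "max_results": 3}))
print(run_plugin({"query": "", "max_results": 9999}))  # rejected
```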
THREAT: LLM08 Excessive Agency
Leveraging
intended functions (sending emails)
for malicious intentions (sending spam)
Trusting user access roles to be conscientious and benevolent
Especially “dev/admins”
DEFENSE
D3 Access Management (see the tool-allowlist sketch below)
D6 Input / Output
SSDLC Design Threat Assessment (roles & permissions)
D7 Monitoring & Alerting
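A minimal sketch of the tool-allowlist idea: an agent's tool calls outside a per-role allowlist, or beyond scope (e.g. bulk email), are denied or routed for human approval. Role names, tools, and the recipient limit are assumptions:

```python
# Sketch: constrain an LLM agent's "agency" with per-role tool allowlists.
ALLOWED_TOOLS = {
    "assistant": {"search_docs", "send_email"},
    "dev_admin": {"search_docs", "send_email", "run_query"},
}

def authorize_tool_call(role: str, tool: str, args: dict) -> str:
    if tool not in ALLOWED_TOOLS.get(role, set()):
        return "deny"                                    # not an intended function
    if tool == "send_email" and len(args.get("to", [])) > 10:
        return "needs_human_approval"                    # spam-scale use of email
    return "allow"

print(authorize_tool_call("assistant", "send_email", {"to": ["a@x", "b@x"]}))  # allow
print(authorize_tool_call("assistant", "run_query", {}))                       # deny
```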
THREAT: LLM09 Overreliance
Blindly trusted, unsupervised AI that can cause damage
Naive developers/integrators/users – carelessness
DEFENSE
SSDF AppSec Coding Standards
SDLC continuous functional verification testing
D7 Monitoring & Alerting
Developer and User AI Risks Training
THREAT: ML03 Model Inversion Attack
Reverse engineering the model
Example: an advertiser executed this attack by training their own bot-detection model and then using it to reverse the predictions of the bot-detection model used by an online advertising platform.
Via design vulnerabilities/promiscuous APIs
DEFENSE
SSDLC Access Control (hindering model access)
Design
Provide User transparency/insights
inputs and outputs,
explanation of prediction,
reveal internal representation
Logging
THREAT: ML07 Transfer Learning Attack
Maliciously modify training (e.g. of a pre-trained model) to leverage it for an attack
E.g. an attacker uses a manipulated image of themselves so the system identifies them as a legitimate user.
DEFENSE
D3 Access Management
D7 Monitoring & Alerting
THREAT: ML10 Neural Net Reprogramming
Bank processing handwritten cheques – beyond OCR, improving deterministic categorization
This can result in the model being reprogrammed to identify characters differently.
For example, the attacker could change the parameters so that the model identifies the character “5” as the character “2”, leading to incorrect amounts being processed.
The attacker can exploit this vulnerability by introducing forged cheques into the clearing process, which the model will process as valid due to the manipulated parameters.
This can result in significant financial loss to the bank.
DEFENSE
SSDLC
Solution Design with AI COE
Data Sensitivity of process data and Model categorization parameters
Security Threat Assessment
D3 Access Control
D5 Data (encryption, integrity signing – see the digest sketch below)
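A minimal sketch of the integrity-signing control referenced above: hash the deployed model file and verify before scoring, so silently reprogrammed parameters are detected. Paths and digest handling are hypothetical:

```python
# Sketch: detect tampered model parameters before each scoring run.
import hashlib, pathlib

def model_digest(path: str) -> str:
    return hashlib.sha256(pathlib.Path(path).read_bytes()).hexdigest()

# Assumption: the trusted digest is captured at release time and stored in a
# tamper-resistant location (not alongside the model file itself).
TRUSTED_DIGEST = "aa11-replace-with-release-digest"

def verify_before_inference(path: str = "cheque_model.bin") -> None:
    if model_digest(path) != TRUSTED_DIGEST:
        raise RuntimeError("model parameters changed: halt cheque processing")
```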
https://neptune.ai/blog/how-to-monitor-your-models-in-production-guide
https://github.com/Trusted-AI/adversarial-robustness-toolbox
https://www.qed42.com/insights/perspectives/biztech/complete-guide-testing-ai-and-ml-applications
Traditional deterministic applications can be handed off to DevOps for monitoring while awaiting feature updates; AI/ML models in production, by contrast, need continuous monitoring and functional verification (see the monitoring guide linked above).