BAE: BERT-based Adversarial Examples for Text Classification
신동진, 박희수, 황소현
Adversarial Example
Explaining and Harnessing Adversarial Examples, ICLR ‘15
Adversarial Attack in NLP
Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment, AAAI ‘20
Prior works
- Character level error, adding / deleting words
- Unnatural text
- Grammatical errors
- Rule-based synonym replacement
- TextFooler (AAAI ‘20)
- Synonym replacement
- Relies only on token-level similarity
- e.g. The restaurant service was poor
→ broke : inappropriate
→ terrible : context-aware
- need more context-awareness
⇒ use BERT-MLM (Masked Language Model)
Problem Definition
- Given…
- S: the input sentence
- C: target model
- y: ground-truth class of S
- Goal: generate S_adv
- Semantic similarity: S_adv ~ S
- Misclassification: C(S_adv) != y
- This film offers many delights and surprises ⇒ C(S) = Positive
→ This movie offers enough pleasures and surprises ⇒ C(S_adv) = Negative
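The two conditions above can be combined into a single check. This is a minimal sketch, not the authors' code: `classify` and `similarity` are hypothetical stand-ins for the target model C and a sentence-similarity scorer, and the threshold `min_sim` is an assumed parameter that the slides do not specify.

```python
from typing import Callable

def is_adversarial(s: str, s_adv: str, y: str,
                   classify: Callable[[str], str],
                   similarity: Callable[[str, str], float],
                   min_sim: float) -> bool:
    """True if s_adv stays close to s in meaning and is misclassified."""
    return similarity(s, s_adv) >= min_sim and classify(s_adv) != y

# Toy stand-ins for illustration only (not a real model or encoder).
def toy_classify(sentence: str) -> str:
    return "Positive" if "delights" in sentence else "Negative"

def toy_similarity(a: str, b: str) -> float:
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)  # Jaccard overlap of the word sets

s = "This film offers many delights and surprises"
s_adv = "This movie offers enough pleasures and surprises"
result = is_adversarial(s, s_adv, "Positive", toy_classify, toy_similarity, min_sim=0.3)
```

With these toy functions the perturbed sentence passes both conditions, while the unchanged sentence fails the misclassification condition.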
Threat Model
- Black-box attack
- Only the classification output is accessible
- No access to model weights, gradients, or training data
- Soft-label
- score of classification
↔ hard-label : 0/1 class inclusion
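The difference between the two label settings can be illustrated with a short sketch. The scores below are made-up numbers, and `hard_label` is a hypothetical helper. The point is only that a soft-label attacker can see partial progress that a hard-label attacker cannot.

```python
from typing import Dict

def hard_label(scores: Dict[str, float]) -> str:
    """Reduce soft-label scores to the single predicted class."""
    return max(scores, key=scores.get)

# Made-up classifier scores before and after one perturbation.
scores_before = {"Positive": 0.90, "Negative": 0.10}
scores_after = {"Positive": 0.60, "Negative": 0.40}

# Hard-label view: the prediction did not change, so no signal.
label_unchanged = hard_label(scores_before) == hard_label(scores_after)

# Soft-label view: the true-class score dropped, so the perturbation helped.
drop = scores_before["Positive"] - scores_after["Positive"]
```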
1. Token Importance
- How much token t_i influences the classification
- Replacing this token, or inserting another token next to it, is likely to change the classification
- Computed as I_i = P(C(S) = y) - P(C(S - {t_i}) = y)
This film offers many delights and surprises
I_i: 0.1 0.3 0.2 0.4 0.9 0.1 0.6
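The deletion-based importance score can be sketched as follows. `toy_positive_prob` is a hypothetical stand-in for the soft-label score P(C(S) = y) of the target model, and its word weights are invented for illustration, so the resulting numbers do not match the ones on the slide.

```python
from typing import Callable, List

# Hypothetical stand-in for the target model: returns P(C(S) = Positive).
# A real attack would query the black-box classifier here instead.
POSITIVE_WEIGHTS = {"delights": 0.3, "surprises": 0.2}

def toy_positive_prob(tokens: List[str]) -> float:
    score = 0.4 + sum(POSITIVE_WEIGHTS.get(t.lower(), 0.0) for t in tokens)
    return min(score, 1.0)

def token_importance(tokens: List[str],
                     prob_true_class: Callable[[List[str]], float]) -> List[float]:
    """I_i = P(C(S) = y) - P(C(S - {t_i}) = y) for every token t_i."""
    base = prob_true_class(tokens)
    scores = []
    for i in range(len(tokens)):
        reduced = tokens[:i] + tokens[i + 1:]  # S with t_i deleted
        scores.append(base - prob_true_class(reduced))
    return scores

tokens = "This film offers many delights and surprises".split()
importance = token_importance(tokens, toy_positive_prob)

# Token indices in decreasing order of importance (the order used for masking).
ranked = sorted(range(len(tokens)), key=lambda i: importance[i], reverse=True)
```

Each token costs one extra query to the target model, so a sentence with n tokens needs n + 1 queries for this step.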
2. Mask & predict top-K tokens
- Mask tokens in decreasing order of importance
- For BAE-I (insert), insert a mask at a position adjacent to the token
- Use BERT-MLM to predict the top K tokens for the mask
This film offers many ? and surprises
→ pleasures / twists / challenges
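How the masked inputs are built for the two operations can be sketched in plain Python. The helper names are hypothetical, and the sketch assumes that "adjacent" means either side of the token. The masked-language-model call itself is left out, since it depends on the library used.

```python
from typing import List

MASK = "[MASK]"  # BERT's mask token

def mask_for_replace(tokens: List[str], i: int) -> List[str]:
    """BAE-R: replace token i with a mask."""
    return tokens[:i] + [MASK] + tokens[i + 1:]

def mask_for_insert(tokens: List[str], i: int) -> List[List[str]]:
    """BAE-I: insert a mask to the left or to the right of token i."""
    left = tokens[:i] + [MASK] + tokens[i:]
    right = tokens[:i + 1] + [MASK] + tokens[i + 1:]
    return [left, right]

tokens = "This film offers many delights and surprises".split()
replaced = " ".join(mask_for_replace(tokens, 4))
inserted = [" ".join(t) for t in mask_for_insert(tokens, 4)]
```

Each masked sentence would then be passed to the masked language model, which returns its K most likely fillers for the mask position.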
3. Filter semantic similarity
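- Check whether the perturbed sentence still means the same as the original
- Uses a sentence similarity encoder based on the Universal Sentence Encoder (USE)
- Grammar check
- For BAE-R, check that the candidate has the same part of speech (POS) as the original token
This film offers many ? and surprises
→ pleasures / twists / challenges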
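The filtering step can be sketched as a cosine-similarity threshold plus a POS match. Everything concrete here is an assumption for illustration: the 2-dimensional vectors stand in for sentence embeddings of the perturbed sentences, the POS tags are written by hand, and the threshold of 0.8 is an arbitrary example value.

```python
import math
from typing import List, Tuple

def cosine(u: List[float], v: List[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def filter_candidates(original_vec: List[float], original_pos: str,
                      candidates: List[Tuple[str, List[float], str]],
                      threshold: float) -> List[Tuple[str, float]]:
    """Keep candidates that are similar enough and share the original POS."""
    kept = []
    for word, vec, pos in candidates:
        sim = cosine(original_vec, vec)
        if sim >= threshold and pos == original_pos:
            kept.append((word, sim))
    return kept

# Made-up embeddings: (candidate token, embedding of perturbed sentence, POS).
original_vec = [1.0, 0.0]
candidates = [
    ("pleasures", [0.9, 0.1], "NOUN"),
    ("twists", [0.5, 0.5], "NOUN"),
    ("challenges", [0.1, 0.9], "NOUN"),
]
kept = filter_candidates(original_vec, "NOUN", candidates, threshold=0.8)
```

The POS condition applies to replacement only. For insertion there is no original token to compare against, so a real implementation would skip that check for BAE-I.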
4. Apply perturbation
- Classify each generated sentence with the target model
- If several tokens change the prediction: choose the one with the highest USE score
- If no token changes the prediction: choose the one that lowers P(C(S_adv) = y) the most
- Repeat this process
This movie offers enough pleasures and surprises
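The selection rule for one position can be sketched as below. The dictionary keys and all numbers are hypothetical. In a real attack `pred` and `p_true` would come from querying the target model, and `use_sim` from the sentence encoder.

```python
from typing import Dict, List, Tuple

def choose_candidate(candidates: List[Dict], true_label: str) -> Tuple[Dict, bool]:
    """Return the chosen candidate and whether it already flips the prediction."""
    flipping = [c for c in candidates if c["pred"] != true_label]
    if flipping:
        # Several candidates succeed: keep the one closest in meaning.
        return max(flipping, key=lambda c: c["use_sim"]), True
    # None succeed: keep the one that lowers the true-class probability most.
    return min(candidates, key=lambda c: c["p_true"]), False

# Case 1: no candidate flips the prediction yet.
no_flip = [
    {"token": "pleasures", "pred": "Positive", "p_true": 0.70, "use_sim": 0.93},
    {"token": "twists", "pred": "Positive", "p_true": 0.55, "use_sim": 0.88},
]
best, attack_done = choose_candidate(no_flip, "Positive")

# Case 2: two candidates flip the prediction.
some_flip = [
    {"token": "enough", "pred": "Negative", "p_true": 0.40, "use_sim": 0.91},
    {"token": "few", "pred": "Negative", "p_true": 0.20, "use_sim": 0.80},
]
best_flip, flipped = choose_candidate(some_flip, "Positive")
```

When the returned flag is false, the attack keeps the chosen perturbation and moves on to the next most important token.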
Experiments
- Dataset
- Amazon, Yelp, IMDB (sentiment classification)
- MR (sentiment polarity)
- MPQA (opinion polarity)
- Subj (subjective / objective question)
- Model
- word-LSTM / word-CNN / BERT
- Attacks
- BAE-R (replace)
- BAE-I (insert)
- BAE-R/I (either replace or insert)
- BAE-R+I (replace, and then insert)
1. Automatic Evaluation
- Classification accuracy
- 31.0% → 4.0% : strong attack
- Semantic similarity (from USE)
- TextFooler: 0.747
- BAE-R+I: 0.848
⇒ semantically more similar
2. Effectiveness
- Compares attack success rates when the number of replace/insert operations is limited
- Stronger than TextFooler at the same perturbation ratio
- Saturates at 40-50% perturbation (the maximum accuracy drop is reached)
3. Qualitative Examples
- Replacement / Insertion
- TextFooler: complex synonym
- BAE
- more natural
- fewer perturbations
4. Human Evaluation
Human evaluation of the original and attacked sentences
- Sentiment accuracy & Naturalness
- R > R+I > TextFooler
✱ In the automatic evaluation, the USE score was R+I > R
⇒ a limitation of USE
5. Replace vs. Insert
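- A: (R/I) - R (examples broken by BAE-R/I but not by BAE-R)
- B: (R/I) - I (examples broken by BAE-R/I but not by BAE-I)
- A > B ⇒ cases where Insert is required outnumber cases where Replace is required
- C: (R/I) - R - I (examples broken by BAE-R/I but by neither BAE-R nor BAE-I)
- C's share is largest on Subj, and Subj is the most robust
⇒ Being able to use both Replace and Insert is what matters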
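The quantities A, B and C are plain set differences over the test examples each attack breaks. A small sketch with made-up example IDs (not results from the paper):

```python
# Hypothetical IDs of test examples that each attack variant broke.
broken_by_r = {1, 2, 3}
broken_by_i = {3, 4, 5, 6}
broken_by_r_or_i = {1, 2, 3, 4, 5, 6, 7}

a = broken_by_r_or_i - broken_by_r                # needs more than Replace alone
b = broken_by_r_or_i - broken_by_i                # needs more than Insert alone
c = broken_by_r_or_i - broken_by_r - broken_by_i  # needs both operations available

insert_matters_more = len(a) > len(b)
```

In this toy data, example 7 falls in C: it is broken only when the attack may choose between replacing and inserting at each step.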
The End
