introduce "Stealing Machine Learning Models via Prediction APIs"

2016.12.06
AISECjp #7
Presented by Isao Takaesu
Paper introduction
Stealing Machine Learning Models
via Prediction APIs
Part 1

About the speaker
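• Occupation: Web security engineer
• Affiliation: Mitsui Bussan Secure Directions (三井物産セキュアディレクション)
• Hobbies: building vulnerability scanners, machine learning
• Blog: http://www.mbsd.jp/blog/
• Black Hat Asia Arsenal, CODE BLUE / 2016
• Organizer of AISECjp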
高江洲勲
Paper
タカエスイサオ
AISECjp

The paper being introduced
Stealing Machine Learning Models via Prediction APIs
Authors : Florian Tramèr (EPFL)
Fan Zhang (Cornell University)
Ari Juels (Cornell Tech, Jacobs Institute)
Michael K Reiter (UNC Chapel Hill)
Thomas Ristenpart (Cornell Tech)
Post Date: 9 Sep 2016
Proceedings of USENIX Security 2016
Source : https://arxiv.org/abs/1609.02943

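Overview of the paper
Proposes "model extraction attacks", which replicate machine learning (ML) models.
[Figure: a data owner trains a model $f$ on its database (DB) and hosts it on an ML service; the extraction adversary sends queries $x_1, \dots, x_q$, observes $f(x_1), \dots, f(x_q)$, and builds a replica $\hat{f}$. Model types shown: LR, MLP, decision tree.]
The ML model is replicated with black-box access only.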

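Risks posed by model replication
• Avoiding query charges
When the business model charges per query to the ML model,
replication hurts revenue (fees collected < training cost).
• Information leakage from training data
Confidential information leaks from the training data
(which may include sensitive information) embedded in the model.
• Evading behavior-based detection
When the ML model is used for spam detection, malware detection, or network anomaly detection,
the attacker can evade these detection functions.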

Model replication methods
• Extraction with Confidence Values
When the ML model returns both the class and confidence values.
・Equation-Solving Attacks
・Decision Tree Path-Finding Attacks
・Online Model Extraction Attacks (against BigML, Amazon ML)
• Extraction Given Class Labels Only
When the ML model returns only the class.
・The Lowd-Meek attack
・The retraining approach

The method introduced in this talk
• Extraction with Confidence Values
When the ML model returns both the class and confidence values.
・Equation-Solving Attacks ⇐ this one
・Decision Tree Path-Finding Attacks
・Online Model Extraction Attacks (against BigML, Amazon ML)
• Extraction Given Class Labels Only
When the ML model returns only the class.
・The Lowd-Meek attack
・The retraining approach

Equation-Solving Attacks

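What are "Equation-Solving Attacks"?
Given the inputs $(x, y)$ to the ML model and its outputs $f(x, y)$,
the attacker recovers (replicates) the equation $f(x, y)$, which is unknown to the attacker.
Example: binary logistic regression
ML model: $f(x, y) = 1.4150971 + 3.3421481\,x + 3.0892439\,y$
Attacker: $f(x, y) = \text{?????}$, known only to have the form $f(x, y) = w_0 + w_1 x + w_2 y$
Using the values of $(x, y)$ and $f(x, y)$ available to the attacker, the attacker solves the equations
and identifies the unknown parameters $w_0, w_1, w_2$ (recovering the equation).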
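Written as a linear system, the idea on this slide looks as follows. This is only a sketch in notation introduced here: the superscripts $(i)$ index the attacker's queries, and three queries are assumed because there are three unknowns.

```latex
\begin{pmatrix}
1 & x^{(1)} & y^{(1)} \\
1 & x^{(2)} & y^{(2)} \\
1 & x^{(3)} & y^{(3)}
\end{pmatrix}
\begin{pmatrix} w_0 \\ w_1 \\ w_2 \end{pmatrix}
=
\begin{pmatrix}
f(x^{(1)}, y^{(1)}) \\
f(x^{(2)}, y^{(2)}) \\
f(x^{(3)}, y^{(3)})
\end{pmatrix}
```

The system has a unique solution when the three rows are linearly independent, that is, when the three query points do not lie on one line.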

Evaluation of "Equation-Solving Attacks"
• Replicating ML models
・Binary logistic regression
・Multiclass LR and Multilayer Perceptron
・Training Data Leakage for Kernel LR
・Model Inversion Attacks on Extracted Models

The "Equation-Solving Attack" verified in this talk
・Binary logistic regression ⇐ this one

Binary logistic regression

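What is "binary logistic regression"? (review)
Classifies data into two classes (c = 2) and gives the probability of belonging to a class.
Decision boundary: $f(x_1, x_2) = w_0 + w_1 x_1 + w_2 x_2$
[Figure: the line $f(x_1, x_2) = 0$ separates the positive region, $f(x_1, x_2) > 0$, from the negative region, $f(x_1, x_2) < 0$.]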

What is "binary logistic regression"?
Logistic function: $\sigma(a) = \dfrac{1}{1 + e^{-a}}$
[Figure: plot of the logistic function $\sigma(a)$ against $a$; the vertical axis runs from 0 to 1.]
Probability of "positive": $P(x_1, x_2) = \sigma(w_0 + w_1 x_1 + w_2 x_2)$
Probability of "negative": $1 - P(x_1, x_2)$
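As a minimal sketch of the model reviewed above, the following Python code evaluates the logistic function and returns a class label together with the probability of "positive". The weights are the trained values reported later in these slides; the function names `sigmoid` and `predict` and the 0.5 threshold are illustrative choices made here, not taken from the slides.

```python
import math
from typing import Tuple

# Weights of the trained model as reported later in the slides.
W0, W1, W2 = 1.415, 3.342, 3.089

def sigmoid(a: float) -> float:
    """Logistic function sigma(a) = 1 / (1 + e^-a)."""
    return 1.0 / (1.0 + math.exp(-a))

def predict(x1: float, x2: float) -> Tuple[str, float]:
    """Return (class label, P(positive)) for one input point."""
    p = sigmoid(W0 + W1 * x1 + W2 * x2)
    # Assumed threshold: P > 0.5 is equivalent to f(x1, x2) > 0.
    label = "positive" if p > 0.5 else "negative"
    return label, p

label, p = predict(-1.602, 0.638)
print(label, round(p, 3))
```

For the input (-1.602, 0.638) the slides' table lists "negative" with P = 0.123, which this sketch should reproduce up to rounding of the weights.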

Building the test model
Training data (ex2data1): red = positive, blue = negative
⇒ Find the decision boundary

Building the test model
Training result
Decision boundary: $f(x_1, x_2) = 1.415 + 3.342\,x_1 + 3.089\,x_2$
[Figure: the line $f(x_1, x_2) = 0$ separates the positive region, $f(x_1, x_2) > 0$, from the negative region, $f(x_1, x_2) < 0$.]

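How the test model is used
[Figure: a user sends inputs $(x_1, x_2), \dots, (x_{q1}, x_{q2})$ to the LR model (trained on the DB) and receives responses such as "P=0.055, neg", ..., "P=0.996, pos".]
The user submits the data (x1, x2) to be classified and receives
the classification result (c = pos or neg) and the probability (P) of belonging to the class.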

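How the test model is abused
[Figure: an adversary sends inputs $(x_1, x_2), \dots, (x_{q1}, x_{q2})$ to the LR model (trained on the DB) and receives responses such as "P=0.055, neg", ..., "P=0.996, pos".]
Using the input data (x1, x2) and the returned probabilities (P),
the adversary identifies the decision boundary:
$f(x_1, x_2) = 1.42 + 3.34\,x_1 + 3.09\,x_2$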

How is it done?

Step 1: Collect information
・Results of using the model
User input           Model output
x1       x2          Class      Probability (P)
-1.602   0.638       negative   0.123
-1.062  -0.536       negative   0.022
-1.539   0.361       negative   0.068
-0.282   1.086       positive   0.979
...      ...         ...        ...
Each row gives one equation in the unknowns $w_0, w_1, w_2$:
$f(x_1, x_2) = w_0 - 1.602\,w_1 + 0.638\,w_2$
$f(x_1, x_2) = w_0 - 1.062\,w_1 - 0.536\,w_2$
$f(x_1, x_2) = w_0 - 1.539\,w_1 + 0.361\,w_2$
$f(x_1, x_2) = w_0 - 0.282\,w_1 + 1.086\,w_2$
What is the target value $f(x_1, x_2)$?
⇒ Pass the probability (P) through the logit function: $\operatorname{logit}(P) = \log \dfrac{P}{1 - P}$
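The conversion from observed probabilities to equation targets can be sketched as follows. The (x1, x2, P) rows are the ones listed in the slides' table; the `logit` helper uses the natural logarithm, which is the inverse of the logistic function $\sigma$ defined earlier. The variable names are illustrative.

```python
import math

def logit(p: float) -> float:
    """Inverse of the logistic function, using the natural logarithm."""
    return math.log(p / (1.0 - p))

# (x1, x2, P) rows as listed in the slides' table.
observations = [
    (-1.602, 0.638, 0.123),
    (-1.062, -0.536, 0.022),
    (-1.539, 0.361, 0.068),
    (-0.282, 1.086, 0.979),
]

# Each observation yields one linear equation:
#   logit(P) = w0 + w1 * x1 + w2 * x2
equations = [((1.0, x1, x2), logit(p)) for x1, x2, p in observations]

for coeffs, target in equations:
    print(f"{target:+.3f} = w0 {coeffs[1]:+.3f}*w1 {coeffs[2]:+.3f}*w2")
```

With the natural logarithm, the four targets come out at roughly -1.96, -3.79, -2.62, and 3.84. The slides use -2.839, -5.467, -3.769, and 5.523 in Step 2. Those values are close to what a base-2 logarithm would give, which is an observation about the numbers rather than something the slides state.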

Step 2: Solve the equations (Equation-Solving)
・Equations obtained from the model's responses
$-2.839 = w_0 - 1.602\,w_1 + 0.638\,w_2$
$-5.467 = w_0 - 1.062\,w_1 - 0.536\,w_2$
$-3.769 = w_0 - 1.539\,w_1 + 0.361\,w_2$
$5.523 = w_0 - 0.282\,w_1 + 1.086\,w_2$
・Result of equation solving
Identified coefficients: $w_0 = 2.042$, $w_1 = 4.822$, $w_2 = 4.457$
Replicated function: $\hat{f}(x_1, x_2) = 2.042 + 4.822\,x_1 + 4.457\,x_2$
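The solving step can be sketched end to end in Python. Everything here is an assumption for illustration: `query` is a local stand-in for the prediction API that uses the trained weights from the slides, three of the slides' query points are reused, and Gaussian elimination is simply one way to solve the system (the slides do not say which solver was used). The targets are computed with the natural-log logit.

```python
import math

# Assumed "victim" model: the trained weights reported in the slides.
TRUE_W = (1.415, 3.342, 3.089)

def query(x1, x2):
    """Stand-in for the prediction API: returns P(positive) only."""
    a = TRUE_W[0] + TRUE_W[1] * x1 + TRUE_W[2] * x2
    return 1.0 / (1.0 + math.exp(-a))

def logit(p):
    """Inverse of the logistic function (natural logarithm)."""
    return math.log(p / (1.0 - p))

def solve_linear(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("queries are not linearly independent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

# Three unknowns (w0, w1, w2), so three independent queries are enough.
queries = [(-1.602, 0.638), (-1.062, -0.536), (-0.282, 1.086)]
rows = [[1.0, x1, x2] for x1, x2 in queries]
targets = [logit(query(x1, x2)) for x1, x2 in queries]

w0, w1, w2 = solve_linear(rows, targets)
print(w0, w1, w2)
```

Under these assumptions the recovered weights equal the stand-in model's weights up to floating-point error. With more than three queries, or with noisy probabilities, a least-squares fit would be a natural alternative to an exact solve.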

Results of the "Equation-Solving Attack"
・Original model
$f(x_1, x_2) = 1.415 + 3.342\,x_1 + 3.089\,x_2$
・Replicated model
$\hat{f}(x_1, x_2) = 2.042 + 4.822\,x_1 + 4.457\,x_2$
Does the replicated model classify correctly?

Comparison of the original and replicated models
User input           Original model               Replicated model
x1       x2          Class      Probability (P)   Class      Probability (P)
-1.602   0.638       negative   0.123             negative   0.055
-1.062  -0.536       negative   0.022             negative   0.004
-1.539   0.361       negative   0.068             negative   0.023
-0.282   1.086       positive   0.979             positive   0.996
 0.692   0.493       positive   0.995             positive   0.999
-0.234   1.638       positive   0.997             positive   0.999
 0.485  -1.064       negative   0.437             negative   0.410
 0.585  -1.008       positive   0.564             positive   0.591
 0.177  -0.729       negative   0.439             negative   0.412
...      ...         ...        ...               ...        ...
The classification results of the original and replicated models match exactly (n = 100).
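The label comparison can be reproduced with a short sketch. The two weight tuples are the original and replicated models as reported in the slides, and the nine points are the rows shown in the table. The helper names and the 0.5 threshold are illustrative assumptions.

```python
import math

ORIGINAL = (1.415, 3.342, 3.089)  # trained model from the slides
REPLICA = (2.042, 4.822, 4.457)   # extracted model from the slides

def probability(w, x1, x2):
    """P(positive) under a logistic regression with weights w."""
    return 1.0 / (1.0 + math.exp(-(w[0] + w[1] * x1 + w[2] * x2)))

def label(w, x1, x2):
    """Class label, assuming a 0.5 probability threshold."""
    return "positive" if probability(w, x1, x2) > 0.5 else "negative"

# The nine inputs listed in the comparison table.
points = [
    (-1.602, 0.638), (-1.062, -0.536), (-1.539, 0.361),
    (-0.282, 1.086), (0.692, 0.493), (-0.234, 1.638),
    (0.485, -1.064), (0.585, -1.008), (0.177, -0.729),
]

matches = sum(
    label(ORIGINAL, x1, x2) == label(REPLICA, x1, x2) for x1, x2 in points
)
print(f"{matches}/{len(points)} labels agree")
```

Each replicated coefficient is about 1.44 times the corresponding original coefficient, so the two models share nearly the same decision boundary $f(x_1, x_2) = 0$. That is consistent with the labels agreeing while the probabilities differ.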

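Countermeasures against "Equation-Solving Attacks"
・Rounding confidences
Rounding the confidence values returned by the model lowers the accuracy of the replica.
Example: P = 0.437401116 ⇒ P = 0.43
[Figure: "Effect of rounding on model extraction", quoted from the paper being introduced.]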
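A minimal sketch of why rounding helps: the attacker's equation targets are logits of the returned probabilities, so coarser probabilities shift the right-hand side of every equation. The probability value is the example from the slide; `rounded_confidence` is a hypothetical helper standing in for whatever a defended API would do.

```python
import math

def logit(p: float) -> float:
    """Inverse of the logistic function (natural logarithm)."""
    return math.log(p / (1.0 - p))

def rounded_confidence(p: float, digits: int) -> float:
    """Hypothetical defense: return P rounded to `digits` decimals."""
    return round(p, digits)

p_exact = 0.437401116  # example value from the slide

# Error introduced into the attacker's equation target at each precision.
errors = {}
for digits in (5, 3, 2):
    p_seen = rounded_confidence(p_exact, digits)
    errors[digits] = abs(logit(p_seen) - logit(p_exact))
    print(f"{digits} decimals: P={p_seen}  logit error={errors[digits]:.6f}")
```

Fewer decimals give a larger error in each target, and those errors carry over to the recovered weights. Python's `round` returns 0.44 for two decimals here, whereas the slide shows 0.43, which would correspond to truncation.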

Plan for next time (Equation-Solving Attacks)
・Binary logistic regression (✔)

Download “.PDF” version of this document:
≫ https://aisecjp.connpass.com/event/44600/
