Shap

SHapley Additive exPlanations
1. SHAP
2. Tree SHAP

1. SHAP
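- Idea
For a single observation, compute the change in the prediction attributable to a given feature several times, then average the results.
1) Estimate the effect of feature_j for the i-th observation
- Record how the prediction of model f changes depending on the value of feature_j (prediction with the feature minus prediction without it)
$$\hat{\phi}_{ij}^{(1)} = \hat{f}(x_i^{(1)}) - \hat{f}(x_i^{(1)} \setminus x_{ij}^{(1)})$$
2) Repeat step 1) K times and compute the weighted average
$$\hat{\phi}_{ij} = \sum_{k} w_k \, \hat{\phi}_{ij}^{(k)}$$
-> the effect of feature_j for the i-th observation (= SHAP value)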

1. SHAP
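- What effect does cat-forbidden (feature_j) have on the house-price prediction?
$$\hat{\phi}_{ij}^{(1)} = \hat{f}(x_i^{(1)}) - \hat{f}(x_i^{(1)} \setminus x_{ij}^{(1)})$$
- With so many coalition cases, how should the effect be estimated?
-> Methods such as Monte-Carlo and Kernel are being applied
$$\hat{f}_x(S) = E[f(x) \mid x_S]$$
- Coalition S = a subset of the features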
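To make the "too many coalitions" point concrete, here is a minimal sketch that enumerates every coalition S for one feature and applies the standard Shapley weighting. The helper names (`coalition_value`, `exact_shapley`) are hypothetical, and f_x(S) is approximated by fixing the features in S to the values of x and averaging the model over background rows, which assumes the features are independent. The loop visits 2^(p-1) coalitions per feature, so it is only feasible for small p.

```python
from itertools import combinations
from math import factorial

import numpy as np


def coalition_value(f, X, x, S):
    """Approximate f_x(S) = E[f(x) | x_S] (assumption: independent features).

    Features in S are fixed to the values of x; the remaining features keep
    their background values, and the model output is averaged over all rows.
    """
    Z = np.array(X, dtype=float)  # copy of the background data
    idx = list(S)
    if idx:
        Z[:, idx] = x[idx]
    return f(Z).mean()


def exact_shapley(f, X, x, j):
    """Exact Shapley value of feature j by enumerating all coalitions."""
    p = X.shape[1]
    others = [i for i in range(p) if i != j]
    phi = 0.0
    for size in range(p):  # coalition sizes 0 .. p-1
        weight = factorial(size) * factorial(p - size - 1) / factorial(p)
        for S in combinations(others, size):
            with_j = coalition_value(f, X, x, S + (j,))
            without_j = coalition_value(f, X, x, S)
            phi += weight * (with_j - without_j)
    return phi
```

For a linear model this reduces to w_j * (x_j - mean of feature j), which is a convenient sanity check.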

1. SHAP
- Monte-Carlo method
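ex. To find the area of a circle, fix a bounding square around it and sample points at random:
(fraction of points that fall inside the circle) * (area of the square) = area of the circle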
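The circle example can be sketched in a few lines. The function name, the sample count, and the fixed seed are illustrative choices, not part of the slides.

```python
import numpy as np


def circle_area_monte_carlo(radius=1.0, n_samples=100_000, seed=0):
    """Estimate the area of a circle by sampling points in its bounding square."""
    rng = np.random.default_rng(seed)
    # Uniform points in the square [-radius, radius] x [-radius, radius]
    points = rng.uniform(-radius, radius, size=(n_samples, 2))
    inside = (points ** 2).sum(axis=1) <= radius ** 2
    square_area = (2 * radius) ** 2
    # (fraction of points inside the circle) * (area of the square)
    return inside.mean() * square_area


estimate = circle_area_monte_carlo()
print(estimate, np.pi)  # the estimate should land close to pi for radius 1
```

The estimate gets closer to the true area as the number of samples grows, which is the same idea the SHAP sampling algorithm relies on.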

1. SHAP
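- Algorithm (Monte-Carlo method)
1) Require
X = data (n x p) / x = observation of interest / f = ML model / K = number of iterations
2) Iterations
For all $j \in \{1, \ldots, p\}$
For all $k \in \{1, \ldots, K\}$
- Randomly sample an instance z from the data
- Randomly select one of the permutations of the p features (= ordering method)
- Order x and z according to that permutation
$$x_o = (x_1, \ldots, x_j, \ldots, x_p), \quad z_o = (z_1, \ldots, z_j, \ldots, z_p)$$
- Construct two new instances
$$x^{*}_{+j} = (x_1, \ldots, x_{j-1}, x_j, z_{j+1}, \ldots, z_p)$$
$$x^{*}_{-j} = (x_1, \ldots, x_{j-1}, z_j, z_{j+1}, \ldots, z_p)$$
-> the ordering defines the coalition
- Calculate difference
$$\hat{\phi}_{ij}^{(k)} = \hat{f}(x^{*}_{+j}) - \hat{f}(x^{*}_{-j})$$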
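The sampling loop above can be sketched as follows. This is a minimal illustration under stated assumptions: `shapley_monte_carlo` is a hypothetical helper, `f` is assumed to accept a 2-D array and return a 1-D array of predictions, and the K differences are combined with a plain mean (equal weights w_k = 1/K) rather than any other weighting.

```python
import numpy as np


def shapley_monte_carlo(f, X, x, j, K=1000, rng=None):
    """Monte-Carlo estimate of the effect of feature j for observation x."""
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    x = np.asarray(x, dtype=float)
    n, p = X.shape
    diffs = np.empty(K)
    for k in range(K):
        z = X[rng.integers(n)]            # random instance from the data
        order = rng.permutation(p)        # random ordering of the features
        pos = int(np.where(order == j)[0][0])
        # Features before j in the ordering come from x, the rest from z.
        x_plus = z.copy()
        x_plus[order[:pos + 1]] = x[order[:pos + 1]]   # j taken from x
        x_minus = z.copy()
        x_minus[order[:pos]] = x[order[:pos]]          # j taken from z
        diffs[k] = f(x_plus[None, :])[0] - f(x_minus[None, :])[0]
    return diffs.mean()  # assumption: equal weights for all K draws
```

With a linear model `f = lambda A: A @ w`, each difference equals w_j * (x_j - z_j), so the estimate converges to w_j times the gap between x_j and the average of feature j in the data.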

2. Tree SHAP
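- Tree SHAP algorithms, designed to better explain the output of ensemble tree models
1) At a leaf node
return w*v
2) At an internal node, if the split feature
- is in coalition S: follow only one of the left/right child nodes, chosen by the threshold
- is not in S: evaluate both the left and right child nodes, each with a reduced weight
[Figure: Tree 1 with split features f1, f2, f3 (thresholds < 3, < 6, < 1) and leaf scores 0.9, 0.1, 0.3, 0.2; worked example with coalition S = {f1, f2} and x = {2, 4, 6} for computing $\hat{\phi}_{ij}^{(1)}$]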
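The two rules above can be sketched as a short recursion. Everything about the tree below is an assumption made for illustration: the layout (f1 at the root, f2 and f3 beneath it), the `cover` counts, and the `Node` class are hypothetical, since the slide figure did not survive extraction. The weight reduction is taken to be proportional to the share of training samples that reach each child.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: float = 0.0              # leaf score v (used only at leaves)
    feature: Optional[int] = None   # split feature index, None for a leaf
    threshold: float = 0.0
    left: Optional["Node"] = None   # taken when x[feature] < threshold
    right: Optional["Node"] = None
    cover: float = 1.0              # assumed number of training samples here


def expected_value(node, x, S, w=1.0):
    """Estimate E[f(x) | x_S] for one tree by recursive descent."""
    if node.feature is None:
        return w * node.value                     # leaf: return w * v
    if node.feature in S:
        # Split feature is in the coalition: follow the single matching child.
        child = node.left if x[node.feature] < node.threshold else node.right
        return expected_value(child, x, S, w)
    # Otherwise visit both children, scaling the weight by their coverage.
    return (expected_value(node.left, x, S, w * node.left.cover / node.cover)
            + expected_value(node.right, x, S, w * node.right.cover / node.cover))


# Hypothetical tree: indices 0, 1, 2 stand for f1, f2, f3.
tree = Node(feature=0, threshold=3, cover=10,
            left=Node(feature=1, threshold=6, cover=6,
                      left=Node(value=0.9, cover=3), right=Node(value=0.1, cover=3)),
            right=Node(feature=2, threshold=1, cover=4,
                       left=Node(value=0.3, cover=2), right=Node(value=0.2, cover=2)))

x = [2, 4, 6]
print(expected_value(tree, x, S={0, 1}))  # only one path is followed
print(expected_value(tree, x, S=set()))   # every leaf contributes, weighted
```

Calling this for every coalition is still exponential; the point of Tree SHAP is to track all coalitions at once during a single traversal.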

2. Tree SHAP
- Weight update
m = the path of unique features we have split on so far
The weights stored along m are updated continuously as the tree is traversed (EXTEND, UNWIND)

3. Output
1) How the contributions are composed for each instance, plus a visualization across all instances / force_plot (note: error in Jupyter)
2) How each feature affects the model output overall / summary_plot, with features sorted by the sum of SHAP value magnitudes
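The ordering used in the summary view can be sketched without the plotting library. This assumes a precomputed matrix of SHAP values (rows = instances, columns = features); the numbers and the helper name `rank_features` are made up for illustration, and the ranking uses the sum of absolute SHAP values per feature.

```python
import numpy as np


def rank_features(shap_values, feature_names):
    """Rank features by the sum of |SHAP value| over all instances."""
    importance = np.abs(shap_values).sum(axis=0)
    order = np.argsort(importance)[::-1]  # largest total magnitude first
    return [(feature_names[i], float(importance[i])) for i in order]


# Made-up SHAP values for 3 instances and 3 features.
shap_values = np.array([[0.5, -0.1, 0.0],
                        [-0.7, 0.2, 0.1],
                        [0.4, -0.3, 0.0]])
ranking = rank_features(shap_values, ["f1", "f2", "f3"])
print(ranking)
```

Using magnitudes rather than signed sums keeps a feature with large positive and negative effects from cancelling itself out in the ranking.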

- Reference
http://eehoeskrap.tistory.com/14
https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2018/1950-2018.pdf
https://christophm.github.io/interpretable-ml-book/shapley.html
https://indico.cern.ch/event/736010/contributions/3035968/attachments/1667834/2674455/14.06.18.pdf
https://dreamgonfly.github.io/2017/11/05/LIME.html
https://arxiv.org/pdf/1802.03888.pdf
