White Box in Computer Vision

Speaker Introduction
⁃ 2013.3~2019.8 B.S. in Statistics and Information Science, University of Suwon
⁃ 2018.7~2019.6 AI Researcher, Seoul National University Bundang Hospital
⁃ 2019.6~2019.9 Data Scientist, DACON
⁃ 2019.9~present XAI Researcher, ModuLabs A.I. College
⁃ 2019.9~present DNA Team Manager
⁃ Yes, I'm currently unemployed.
Table of Contents
- Project Introduction
- XAI in Computer Vision
- Implementation Results
- Future Plans

Project Introduction
White Box
The Difference Between Then and Now
- Then: data + rules → results
- Now: data + results → rules ("I'll figure those out myself." - the AI model)
Problems
From as few as five million parameters to as many as 155 million.
Canziani, A., Paszke, A., & Culurciello, E. (2016). An analysis of deep neural network models for practical applications. arXiv preprint arXiv:1605.07678.
The U.S. Defense Advanced Research Projects Agency (DARPA): the start of serious work on this problem.
Gunning, D. (2017). Explainable artificial intelligence (XAI). Defense Advanced Research Projects Agency (DARPA), nd Web, 2.
Why Interpretation Is Needed
1. The model's results need verification
The AI model labels a photo as "Gorillas."
https://twitter.com/jackyalcine/
The same kind of model scores two candidates with identical credentials (University A, TOEIC 900, Certificate B) 90 points and 80 points.
https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G
Why Interpretation Is Needed
2. Use as a debugging tool
Explainable AI (XAI): Interpretability for Deep Learning – SI Analytics
Why Interpretation Is Needed
3. New discoveries
AI model: "Yep, it's a patient. Why do you ask?" "Oh!"
Kim, T., Heo, J., Jang, D. K., Sunwoo, L., Kim, J., Lee, K. J., ... & Oh, C. W. (2019). Machine learning for detecting moyamoya disease in plain skull radiography using a convolutional neural network. EBioMedicine, 40, 636-642.
Why Interpretation Is Needed
4. The right to an explanation
"Are you a shill?" (a scene from the movie Tazza, 2006)
Why I Started This Project
At the second-round judging of a competition I entered last year, my table of contents was:
- Topic
- Data
- Preprocessing
- Modeling
- Conclusion
"I worked hard and pushed the score up."
Judge: "So, which variables turned out to be important?"
"Well, that's..."
Let's Build an XAI Guide
Part 1: Computer Vision
Part 2: Tabular
Part 3: NLP
Module / Web Serving / Research
Starting with White Box
Problems in Computer Vision
Classification (CAT / DOG), Object Detection, Segmentation
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016, August). "Why should I trust you?" Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1135-1144).
How a Deep Learning Model Learns
Gunning, D. (2017). Explainable artificial intelligence (XAI). Defense Advanced Research Projects Agency (DARPA), nd Web, 2.
How to Interpret a Deep Learning Model
Explainable AI (XAI): Interpretability for Deep Learning – SI Analytics
Attribution Methods
Adebayo, J., Gilmer, J., Muelly, M., Goodfellow, I., Hardt, M., & Kim, B. (2018). Sanity checks for saliency maps. In Advances in Neural Information Processing Systems (pp. 9505-9515).
Saliency maps
1. Vanilla Backpropagation
2. Input x Backpropagation
3. DeconvNet
4. Guided Backpropagation
5. Integrated Gradients
6. Grad-CAM
7. Guided Grad-CAM
8. SmoothGrad
Attribution Methods
1. Vanilla Backpropagation

M(x) = ∂f(x)/∂x

where f(x) is the output of the target class and x is an input image.
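As a minimal sketch of the idea (not the project's actual PyTorch code), the saliency map is just the gradient of the target score with respect to each input element. Here a toy quadratic function stands in for the network, and the gradient is approximated by finite differences; a real implementation would use autograd:

```python
# Toy sketch of a vanilla-backpropagation saliency map M(x) = df/dx.
# `target_score` is a hypothetical stand-in for the network's output
# for the target class, NOT the repository's API.

def target_score(x):
    # stand-in for the target-class score f(x)
    return sum(v * v for v in x)

def vanilla_saliency(f, x, eps=1e-6):
    """Approximate M(x)_i = d f(x) / d x_i by central differences."""
    grads = []
    for i in range(len(x)):
        xp = list(x); xm = list(x)
        xp[i] += eps; xm[i] -= eps
        grads.append((f(xp) - f(xm)) / (2 * eps))
    return grads

saliency = vanilla_saliency(target_score, [1.0, -2.0, 0.5])
# d(sum x^2)/dx = 2x, so saliency is approximately [2.0, -4.0, 1.0]
```

In PyTorch the same quantity would come from backpropagating the target logit to the input tensor.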
Attribution Methods
2. Input × Backpropagation (the vanilla gradient multiplied element-wise by the input, x ⊙ ∂f(x)/∂x)
Attribution Methods
3. DeconvNet
Zeiler, M. D., & Fergus, R. (2014, September). Visualizing and understanding convolutional networks. In European Conference on Computer Vision (pp. 818-833). Springer, Cham.
Attribution Methods
4. Guided Backpropagation
Springenberg, J. T., Dosovitskiy, A., Brox, T., & Riedmiller, M. (2014). Striving for simplicity: The all convolutional net. arXiv preprint arXiv:1412.6806.
Attribution Methods
5. Integrated Gradients
(figure: the attribution is built by summing saliency maps M(x1) + M(x2) + M(x3) + M(x4) + M(x5) computed at points along the path from a baseline to the input)
Sundararajan, M., Taly, A., & Yan, Q. (2017, August). Axiomatic attribution for deep networks. In Proceedings of the 34th International Conference on Machine Learning (Volume 70, pp. 3319-3328). JMLR.org.
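A small sketch of the averaging that the figure depicts, again on a toy quadratic score rather than a real network (`target_score` is a hypothetical stand-in): gradients are averaged along the straight path from a baseline to the input, then scaled by (input - baseline). A useful sanity check is the completeness axiom: the attributions sum to f(x) - f(baseline).

```python
def grad(f, x, eps=1e-6):
    """Central-difference gradient of a scalar function (autograd stand-in)."""
    g = []
    for i in range(len(x)):
        xp = list(x); xm = list(x)
        xp[i] += eps; xm[i] -= eps
        g.append((f(xp) - f(xm)) / (2 * eps))
    return g

def integrated_gradients(f, x, baseline=None, steps=50):
    baseline = baseline or [0.0] * len(x)
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps                      # midpoint rule
        point = [b + alpha * (xi - b) for b, xi in zip(baseline, x)]
        for i, gi in enumerate(grad(f, point)):
            total[i] += gi / steps                     # average the gradients
    # scale the averaged gradient by (x - baseline)
    return [(xi - b) * t for xi, b, t in zip(x, baseline, total)]

def target_score(x):                                   # toy stand-in model
    return sum(v * v for v in x)

attr = integrated_gradients(target_score, [1.0, -2.0, 0.5])
# completeness: sum(attr) should equal f(x) - f(baseline)
```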
Attribution Methods
6. Grad-CAM (Gradient-weighted Class Activation Mapping)
Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision (pp. 618-626).
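To make the computation concrete, here is an illustrative Grad-CAM pass on hand-made numbers (all values are invented for the example, not taken from the project): the weight for each feature map is the spatial average of the class-score gradient with respect to that map, and the map itself is the ReLU of the weighted sum.

```python
# Two made-up 2x2 feature maps from the "last conv layer".
feature_maps = [
    [[1.0, 0.0], [0.5, 2.0]],      # A^1
    [[0.0, 1.0], [1.0, 0.0]],      # A^2
]
# Made-up gradients of the target-class score w.r.t. each map.
gradients = [
    [[0.2, 0.2], [0.2, 0.2]],      # dS/dA^1
    [[-0.4, -0.4], [-0.4, -0.4]],  # dS/dA^2
]

def grad_cam(maps, grads):
    h, w = len(maps[0]), len(maps[0][0])
    # alpha_k: global-average-pool the gradients of each channel
    alphas = [sum(sum(row) for row in g) / (h * w) for g in grads]
    cam = [[0.0] * w for _ in range(h)]
    for a, fmap in zip(alphas, maps):
        for i in range(h):
            for j in range(w):
                cam[i][j] += a * fmap[i][j]
    # ReLU: keep only regions that contribute positively to the class
    return [[max(v, 0.0) for v in row] for row in cam]

cam = grad_cam(feature_maps, gradients)
```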
Attribution Methods
7. Guided Grad CAM
Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision (pp. 618-626).
Attribution Methods
8. SmoothGrad

SmoothGrad:     M_c(x) = (1/n) * sum_{i=1}^{n} M_c(x + N(0, σ²))
SmoothGrad SQ:  M_c(x) = (1/n) * sum_{i=1}^{n} (M_c(x + N(0, σ²)))²
SmoothGrad VAR: M_c(x) = Var(M_c(x + N(0, σ²)))

Smilkov, D., Thorat, N., Kim, B., Viégas, F., & Wattenberg, M. (2017). SmoothGrad: removing noise by adding noise. arXiv preprint arXiv:1706.03825.
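The three variants above can be sketched in a few lines over a toy saliency function (here `M` is the analytic gradient of a made-up quadratic score, standing in for a real per-example saliency map):

```python
import random

def M(x):
    # toy saliency map: gradient of sum(x^2), i.e. 2x
    return [2 * v for v in x]

def smoothgrad(x, n=50, sigma=0.1, mode="plain", seed=0):
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        noisy = [v + rng.gauss(0.0, sigma) for v in x]  # x + N(0, sigma^2)
        samples.append(M(noisy))
    dim = len(x)
    mean = [sum(s[i] for s in samples) / n for i in range(dim)]
    if mode == "plain":                 # average of the noisy maps
        return mean
    sq = [sum(s[i] ** 2 for s in samples) / n for i in range(dim)]
    if mode == "sq":                    # average of the squared maps
        return sq
    # "var": Var = E[M^2] - (E[M])^2
    return [sq[i] - mean[i] ** 2 for i in range(dim)]

plain = smoothgrad([1.0, -2.0], mode="plain")
var = smoothgrad([1.0, -2.0], mode="var")
```

With small noise, the plain variant stays close to the vanilla gradient [2.0, -4.0] while averaging out pixel-level jitter.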
"While training, focus more on the influential regions."
1. Class Activation Map (CAM)
2. Residual Attention Network (RAN)
3. Convolutional Block Attention Module (CBAM)
Attention Methods
1. Class Activation Map (CAM)
Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., & Torralba, A. (2016). Learning deep features for discriminative localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2921-2929).
Attention Methods
Features are compressed by averaging (global average pooling).
Saliency maps can be extracted as well.
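A minimal CAM sketch with made-up numbers (the values and shapes are invented for illustration): the network ends in global average pooling (GAP) followed by a fully connected layer, and the class activation map applies the same FC weights per spatial location, before pooling. By linearity, pooling the CAM recovers the class score.

```python
feature_maps = [                     # two 2x2 maps from the last conv layer
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 2.0], [0.0, 0.0]],
]
fc_weights_for_class = [0.5, 1.0]    # one FC weight per channel for the target class

def gap(fmap):
    """Global average pooling: compress a map to its mean."""
    vals = [v for row in fmap for v in row]
    return sum(vals) / len(vals)

# prediction path: GAP each map, then take the weighted sum (class score)
score = sum(w * gap(f) for w, f in zip(fc_weights_for_class, feature_maps))

# CAM path: apply the same weights at each spatial location instead
h, w = 2, 2
cam = [[sum(wt * fm[i][j] for wt, fm in zip(fc_weights_for_class, feature_maps))
        for j in range(w)] for i in range(h)]
# gap(cam) == score, since GAP and the weighted sum commute
```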
2. Residual Attention Network (RAN)
Wang, F., Jiang, M., Qian, C., Yang, S., Li, C., Zhang, H., ... & Tang, X. (2017). Residual attention network for image classification. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition (pp. 3156-3164).
Attention Methods
Built in module form
The attention mask passes through a sigmoid before it is applied
Each block is a residual block
3. Convolutional Block Attention Module (CBAM)
Woo, S., Park, J., Lee, J. Y., & So Kweon, I. (2018). Cbam: Convolutional block attention module. In Proceedings of the European Conference on Computer
Vision (ECCV) (pp. 3-19).
Attention Methods
What (channel attention) / Where (spatial attention)
1. Coherence
2. Selectivity
3. Remove and Retrain (ROAR) / Keep and Retrain (KAR)
Evaluation
Adebayo, J., Gilmer, J., Muelly, M., Goodfellow, I., Hardt, M., & Kim, B. (2018). Sanity checks for saliency maps. In Advances in Neural Information Processing Systems (pp. 9505-9515).
Evaluation
1. Coherence
Which method produces the clearest, sharpest map?
Evaluation
2. Selectivity
Which method makes accuracy drop the fastest?
Important pixels are removed one at a time.
The larger the area above the curve, the better.
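The loop just described can be sketched as follows, with a made-up stand-in `score` function in place of the model's class probability (names and values are illustrative only): at each step the currently most-salient pixel is zeroed out and the score is re-recorded, so a good saliency method produces a steep drop.

```python
def score(x):
    # hypothetical stand-in for the class probability of the true class
    return sum(x) / len(x)

def selectivity_curve(x, saliency, iterations=3):
    x = list(x)
    curve = [score(x)]
    # visit pixels from most to least salient
    order = sorted(range(len(x)), key=lambda i: saliency[i], reverse=True)
    for i in order[:iterations]:
        x[i] = 0.0                  # remove the most important remaining pixel
        curve.append(score(x))
    return curve

curve = selectivity_curve([0.9, 0.8, 0.1, 0.2], saliency=[0.9, 0.8, 0.1, 0.2])
# here saliency matches true importance, so the score drops monotonically
```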
Hooker, S., Erhan, D., Kindermans, P. J., & Kim, B. (2018). Evaluating feature importance estimates. arXiv preprint arXiv:1806.10758.
Evaluation
3. Remove and Retrain (ROAR) / Keep and Retrain (KAR)
"To know for sure, you have to retrain."
Extract the saliency map, remove the most important regions at a given ratio, then retrain and evaluate.
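The masking step before retraining can be sketched on a flat "image" (a hypothetical example, not the repository's code): ROAR replaces the top `ratio` fraction of pixels, ranked by saliency, with an uninformative value such as the mean; KAR keeps those pixels and removes the rest.

```python
def mask_pixels(image, saliency, ratio, keep=False):
    """ROAR (keep=False) or KAR (keep=True) masking of a flat image."""
    n_top = int(len(image) * ratio)
    order = sorted(range(len(image)), key=lambda i: saliency[i], reverse=True)
    # ROAR masks the top pixels; KAR masks everything except the top pixels
    targets = order[n_top:] if keep else order[:n_top]
    mean_value = sum(image) / len(image)   # uninformative replacement value
    out = list(image)
    for i in targets:
        out[i] = mean_value
    return out

image = [0.1, 0.9, 0.4, 0.6, 0.2]
saliency = [0.9, 0.1, 0.8, 0.2, 0.3]
roar = mask_pixels(image, saliency, ratio=0.4)             # remove top 40%
kar = mask_pixels(image, saliency, ratio=0.4, keep=True)   # keep top 40%
```

A model is then retrained on the masked dataset; if accuracy stays high after ROAR, the saliency method did not really find the important pixels.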
Managing the Implementation Code
https://github.com/TooTouch/WhiteBox-Part1
⁃ Framework: PyTorch
⁃ Requirements
⁃ pytorch >= 1.2.0
⁃ torchvision >= 0.4.0
⁃ OS: Ubuntu 18.04
⁃ CPU: Intel i7-8700K
⁃ GPU: GTX 1080Ti
⁃ RAM: 64GB
Environment
Training Plan
⁃ Datasets: 2
⁃ Models: 1 + 3 (attention methods)
⁃ Attribution methods: 7 + 2 (Random / Conv Output) + 3 (ensembles)
⁃ ROAR / KAR [ratio 0.1~0.9]: 9 runs × 2
Number of models that must be trained:
((1 × 9 × 18) + (3 × 1 × 18) + 4) × 2 = 440
(factors labeled on the slide: base model, attribution methods, attention methods, ROAR/KAR models, datasets)
My electricity bill...
Somebody please give me a graphics card...
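The count can be reproduced with the slide's own grouping; how each factor maps to the labels is my reading of the slide, so treat the comments as an interpretation:

```python
base_models = 1
attention_models = 3
ratios = 9            # ROAR/KAR removal ratios 0.1 .. 0.9
runs_per_ratio = 18   # 9 attribution methods x (ROAR, KAR)
initial_models = 4    # base model + 3 attention models, trained once
datasets = 2          # MNIST, CIFAR10

total = ((base_models * ratios * runs_per_ratio)
         + (attention_models * 1 * runs_per_ratio)
         + initial_models) * datasets
# total == 440
```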
Project File Structure
D:.
│ dataload.py
│ main.py
│ models.py
│ utils.py
│ visualization.py
│
├─attention_methods
│ │ cam.py
│ │ cbam.py
│ │ ran.py
│ │ warn.py
│ │ __init__.py
│
└─ saliency
│ attribution_methods.py
│ ensembles.py
│ evaluation_methods.py
└─ __init__.py
Data loading, preprocessing, building dataloaders
Training and evaluation
Models
Utility functions (train, test, scaling, get samples, resize, etc.)
Visualization (coherence, train logs, saliency maps, etc.)
Class Activation Map (CAM)
Convolutional Block Attention Module (CBAM)
Residual Attention Network (RAN)
Wide Attention Residual Network (WARN)
Grad-CAM, Guided Backprop, Integrated Gradients etc
Smooth Grad, Smooth VAR Grad, Smooth Square Grad
Selectivity, ROAR, KAR
Make an environment where experiments are easy to run.
pip install tootorch
Datasets Used
MNIST (28 × 28), CIFAR10 (32 × 32)
Models Used
Simple CNN Model
- 3 convolution layer network (Simple CNN)
Attention Modules
- Convolutional Block Attention Module (CBAM): a CBAM block follows each convolution layer
Attention Models
- Class Activation Map (CAM): global average pooling (GAP) followed by a fully connected layer with 10 outputs
- Residual Attention Network (RAN): note that the original paper's experiments use 224 × 224 images
Parameter Settings
Details
- Epochs
- MNIST models train 30 epochs.
- CIFAR10 models train 100 epochs.
- Optimizer: SGD(learning rate=0.01)
- Batch size: 128
- Loss function: cross entropy
python main.py --train --target=['mnist','cifar10'] --attention=['CAM','CBAM','RAN','WARN']
Model Performance
MNIST

Model             | # Params   | 0     | 1     | 2     | 3     | 4     | 5     | 6     | 7     | 8     | 9     | Total
Simple CNN        | 1,284,042  | 0.998 | 0.995 | 0.995 | 0.995 | 0.993 | 0.990 | 0.986 | 0.989 | 0.996 | 0.985 | 0.992
Simple CNN + CAM  | 1,285,332  | 0.994 | 0.995 | 0.989 | 0.995 | 0.988 | 0.988 | 0.993 | 0.981 | 0.986 | 0.977 | 0.988
Simple CNN + CBAM | 1,288,561  | 0.998 | 0.995 | 0.992 | 0.996 | 0.990 | 0.990 | 0.990 | 0.991 | 0.995 | 0.989 | 0.993
RAN               | 27,987,466 | 0.997 | 0.998 | 0.996 | 0.995 | 0.989 | 0.991 | 0.996 | 0.988 | 0.994 | 0.990 | 0.994

Train History / Validation History
Model Performance
CIFAR10

Model             | # Params   | Airplane | Automobile | Bird  | Cat   | Deer  | Dog   | Frog  | Horse | Ship  | Truck | Total
Simple CNN        | 2,202,122  | 0.872 | 0.905 | 0.692 | 0.731 | 0.843 | 0.660 | 0.904 | 0.864 | 0.860 | 0.916 | 0.825
Simple CNN + CAM  | 2,203,412  | 0.760 | 0.896 | 0.585 | 0.477 | 0.752 | 0.804 | 0.769 | 0.711 | 0.837 | 0.862 | 0.745
Simple CNN + CBAM | 2,206,641  | 0.858 | 0.945 | 0.749 | 0.685 | 0.790 | 0.761 | 0.826 | 0.798 | 0.873 | 0.896 | 0.818
RAN               | 27,990,666 | 0.843 | 0.882 | 0.758 | 0.701 | 0.776 | 0.586 | 0.916 | 0.844 | 0.924 | 0.873 | 0.810

Train History / Validation History
Saliency Maps
Deeper layers learn the more important regions.
Saliency maps of layers using Grad-CAM
Evaluation – Coherence
Attention Model - Simple CNN vs Simple CNN+CBAM
Top : SimpleCNN / Bottom : SimpleCNN + CBAM
Saliency maps by layers : CIFAR10
Attention Model - RAN
Saliency maps of RAN by layers : CIFAR10
(outputs of stage 1, stage 2, and stage 3)
Evaluation – Coherence
Evaluation – Selectivity
* iteration : 50
python main.py --eval=selectivity --target=['mnist','cifar10'] --method=['VGB','IB','DeconvNet','IG','GB','GC','GBGC']
Evaluation – ROAR / KAR
python main.py --eval=['ROAR','KAR'] --target=['mnist','cifar10'] --attention=['CAM','CBAM','RAN','WARN'] --method=['VGB','IB','DeconvNet','IG','GB','GC','GBGC','CO','RANDOM']
White Box Guide Book
Part 2 - Tabular