4. Visual Analytics?
Introduction
Data Visualization
Interpretation
Human
Typical (interactive) visualization
- focuses on visualizing the given data as it is.
- but big data typically cannot be visualized as is, because of limited screen space and human perception
(for a large number of data items and features with lots of noise, …).
Interaction
6. Visual Analytics for Deep Learning
Introduction
Visual analytics for deep learning?
7. Visual Analytics for Deep Learning
Introduction
Many visual analytics tools for deep learning
https://medium.com/multiple-views-visualization-research-explained/visualization-in-deep-learning-b29f0ec4f136
8. Visual Analytics in DL: for interpretability, diagnosis, refinement of models
Introduction
GAN Lab, TensorBoard, CNNVis, RNNVis, LSTMVis, ActiVis
9. Formalized workflow of deep learning model development
Introduction
Data preprocessing → Feature engineering → Model architecture selection → Model parameter selection → Model evaluation → Service deployment
Data preprocessing
Properly handling:
§ Imbalanced data
§ Outliers
§ Missing values
§ High cardinality features
§ Highly correlated features
§ Target leakage
§ Inconsistent feature definition
§ Data that doesn't fit local memory
Feature engineering
Selecting the right preprocessing for:
§ Numbers
§ Classes
§ Dates
§ Lists
§ Nested fields
Multiple options per column, 100s of columns in table
Model architecture selection
Selecting the best model architecture from dozens available
§ Linear
§ Feed forward
§ Decision tree
§ Residual nets
Keeping up with the onslaught of newest state of the art
Model parameter selection
For each architecture, selecting the right values for each hyperparameter
§ Learning rate
§ Regularization
§ Layers
§ Hidden nodes
§ Activation function
Potentially more than a dozen values to set
Model evaluation
Evaluating model at
§ Dataset-level
§ Feature-level
§ Prediction-level
Ensuring behavior is fully understood before deployment
Service deployment
Deploying service
10. Problems in developing deep learning model
Introduction
Data preprocessing → Feature engineering → Model architecture selection → Model parameter selection → Model evaluation → Service deployment
Tedious episodes of trial and error!
11. Problems in developing deep learning model => AutoML as a solution
Introduction
Black-box optimization
The number of hyperparameter combinations is infinite!
A large amount of computation, time, and human resources is needed :(
Data preprocessing → Feature engineering → Model architecture selection → Model parameter selection → Model evaluation → Service deployment
=> AutoML: automatic & systematic approach
12. Problems in developing deep learning model => AutoML as a solution
Introduction
NSML AutoML
Black-box optimization
Data preprocessing → Feature engineering → Model architecture selection → Model parameter selection → Model evaluation → Service deployment
13. Visual Analytics for AutoML
Problem in context
Visual analytics does seem to let us analyze models
and, to some extent, interpret them...
But AutoML usually produces hundreds or thousands of models.
How can we represent and analyze them?
14. Visual Analytics for AutoML
Problem in context
No visual analytics system for AutoML
15. Challenges
- Too many models to be shown (n > 100, 1000, 10000, ...)
- High dimensionality and complexity of hyperparameter space
DL models  Batch_size  Learning_rate  Num_epoch  Layer_depth  Activation_function  …  Test/acc.
Model 0    100         0.001          74         3            relu                 …  0.9231
Model 1    100         0.001231       68         6            sigmoid              …  0.8951
Model 2    1000        0.00125        48         9            tanh                 …  0.5789
Model 3    500         0.00534        24         128          relu                 …  0.9483
Model 4    500         0.01541        24         128          sigmoid              …  0.832
Model 5    500         0.05929        24         32           tanh                 …  0.748
…          …           …              …          …            …                    …  …
Example results of hyperparameter optimization
Problem in context
Hyperparameter configuration space
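16. More Challenges
Problem in context
- Even after building tens of thousands of models, there is no guarantee that the optimal model has been found (infinite search space)
- We cannot build tens or hundreds of thousands of models every time
=> Time to explore the whole space is limited, and so are GPUs
=> Look at the results of a few hundred trials, then run another few hundred trials based on those results, and repeat
- There is no single right answer for the configuration of the AutoML algorithm either
=> Requires inspecting the execution results, diagnosing problems, and correcting them
=> Optimization does not end with a single trial (open-ended task)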
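The repeated "search a few hundred models, inspect, search again" loop on this slide can be sketched roughly as follows. This is a minimal illustration, not the method of any particular AutoML system: `toy_accuracy` is a hypothetical stand-in for training and evaluating a model, the single tuned hyperparameter (learning rate) and the 20% keep fraction are assumptions, and the automatic `narrow_range` step stands in for the decision a human analyst would make after inspecting the results.

```python
import math
import random
from typing import List, Tuple

Trial = Tuple[float, float]  # (log10 of learning rate, accuracy)

def toy_accuracy(learning_rate: float) -> float:
    """Hypothetical stand-in for training a model; best near learning_rate = 1e-2."""
    distance = abs(math.log10(learning_rate) + 2.0)
    return max(0.0, 1.0 - distance / 4.0)

def run_round(rng: random.Random, low_exp: float, high_exp: float, n_trials: int) -> List[Trial]:
    """One round of random search, sampling the learning rate log-uniformly."""
    trials = []
    for _ in range(n_trials):
        exponent = rng.uniform(low_exp, high_exp)
        trials.append((exponent, toy_accuracy(10.0 ** exponent)))
    return trials

def narrow_range(trials: List[Trial], keep_fraction: float = 0.2) -> Tuple[float, float]:
    """Return the exponent range covered by the best-performing trials."""
    ranked = sorted(trials, key=lambda t: t[1], reverse=True)
    top = ranked[: max(1, int(len(ranked) * keep_fraction))]
    exponents = [exponent for exponent, _ in top]
    return min(exponents), max(exponents)

rng = random.Random(0)
low_exp, high_exp = -6.0, 0.0  # initial search range: 1e-6 .. 1
history = []                   # (range searched, best accuracy) per round
for round_idx in range(3):
    trials = run_round(rng, low_exp, high_exp, n_trials=200)
    history.append((low_exp, high_exp, max(acc for _, acc in trials)))
    # Use this round's results to decide where the next round searches.
    low_exp, high_exp = narrow_range(trials)

for searched_low, searched_high, best in history:
    print(f"searched 1e{searched_low:.2f}..1e{searched_high:.2f}, best acc {best:.4f}")
```

Each round searches only inside the range suggested by the previous one, so the budget per round stays fixed while the search focuses. In an open-ended setting, the stopping point and the narrowing rule are exactly the parts a human would steer.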
17. Design Goals
Design Goals
Hyperparameter optimization process through visual analytics
Design Challenges
- How to effectively visualize the results of hyperparameter optimization?
- How can visual analytics support the open-ended hyperparameter optimization task?
Design Goals
- Show an overview of results with effective visual interfaces
- Enable switching from the overview to a detailed analysis view through coordinated visual components
- Steer the open-ended tuning task with a human-in-the-loop approach
18. Interaction flow design
- Show an overview and provide an environment for analyzing the results from multiple perspectives
- Provide an environment that also helps refine models based on insights gained from the result analysis
Design Goals
A. Visual exploration of overall optimization results
A → B: Switch overview to details
B. Hyperparameter-level, model-level, and method-level analysis
B → C: Action
C. User-driven model refinement
- Parallel coordinates plot
  - effective visualization for high-dimensional data
- Hyperparameter-level:
  - to find effective hyperparameters
  - to find the effective range of each hyperparameter
- Model-level:
  - to validate model generalizability
  - to analyze the loss function value over time/iterations
- Method-level:
  - to validate/diagnose the algorithm configurations
  - to compare the performance of algorithms
- Support interactive tuning process
  - easy access to the AutoML system with the gained insights
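As a rough illustration of how a parallel coordinates plot supports hyperparameter-level analysis, the sketch below normalizes each model onto per-axis positions (the vertices of its polyline) and then "brushes" the accuracy axis to read off an effective learning-rate range. The rows are the numeric columns of the example table from slide 15. The lowercase field names, the helper functions, and the brush threshold of 0.89 are assumptions made for illustration, and the slides do not describe how HyperTendril itself implements this.

```python
from typing import Dict, List, Tuple

Row = Dict[str, float]

# Numeric columns of the example hyperparameter optimization results (slide 15).
MODELS: List[Row] = [
    {"batch_size": 100, "learning_rate": 0.001, "num_epoch": 74, "layer_depth": 3, "test_acc": 0.9231},
    {"batch_size": 100, "learning_rate": 0.001231, "num_epoch": 68, "layer_depth": 6, "test_acc": 0.8951},
    {"batch_size": 1000, "learning_rate": 0.00125, "num_epoch": 48, "layer_depth": 9, "test_acc": 0.5789},
    {"batch_size": 500, "learning_rate": 0.00534, "num_epoch": 24, "layer_depth": 128, "test_acc": 0.9483},
    {"batch_size": 500, "learning_rate": 0.01541, "num_epoch": 24, "layer_depth": 128, "test_acc": 0.832},
    {"batch_size": 500, "learning_rate": 0.05929, "num_epoch": 24, "layer_depth": 32, "test_acc": 0.748},
]
AXES = ["batch_size", "learning_rate", "num_epoch", "layer_depth", "test_acc"]

def axis_extents(rows: List[Row], axes: List[str]) -> Dict[str, Tuple[float, float]]:
    """Minimum and maximum of every axis, used to scale values onto the axis."""
    return {a: (min(r[a] for r in rows), max(r[a] for r in rows)) for a in axes}

def to_polyline(row: Row, extents: Dict[str, Tuple[float, float]], axes: List[str]) -> List[float]:
    """One model becomes one polyline: a position in [0, 1] on each axis."""
    points = []
    for a in axes:
        lo, hi = extents[a]
        points.append(0.5 if hi == lo else (row[a] - lo) / (hi - lo))
    return points

def brush(rows: List[Row], axis: str, lo: float, hi: float) -> List[Row]:
    """Keep only the models whose value on `axis` falls inside the brushed interval."""
    return [r for r in rows if lo <= r[axis] <= hi]

extents = axis_extents(MODELS, AXES)
polylines = [to_polyline(m, extents, AXES) for m in MODELS]

# Brush the accuracy axis, then read the learning-rate range of the surviving models.
selected = brush(MODELS, "test_acc", 0.89, 1.0)
effective_lr = (
    min(r["learning_rate"] for r in selected),
    max(r["learning_rate"] for r in selected),
)
print(len(selected), effective_lr)
```

Linear scaling is used on every axis to keep the sketch short. A real plot would probably use a logarithmic scale for the learning rate and a categorical axis for the activation function.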
19. D3.js
HyperTendril
- Domain-specific language for data visualization
- More flexible than higher-level libraries for building custom visual components
- Many examples, documents, and tutorials
https://d3js.org/
21. Interaction flow of HyperTendril Visual Analytics
HyperTendril
A. Visual exploration of overall optimization results
A → B: Switch overview to details
B. Hyperparameter-level, model-level, and method-level analysis
B → C: Action
C. User-driven model refinement
22. User study
HyperTendril
- To understand usage behaviors of visual analytics
  - Log collection with Google Analytics
  - User feedback through UX interviews
- Findings: usage behaviors and the volume of interactions vary with users' tasks and purposes.
Click-stream analysis with representative users and their sessions
Interaction patterns are quite different!
23. Lessons learned
Discussion & Conclusion
- Each user behaves differently, but users can be categorized:
  - Fine-tuner
  - Service-oriented tuner
  - Research-oriented tuner
=> Visual analytics should have an extensible design to satisfy various types of users.
Knowledge generation loop with HyperTendril
Complexity & volume of interactions
24. Conclusion & Future work
Discussion & Conclusion
Future work
- Interactive hyperparameter optimization in real time
- Visual analytics for multi-metric model comparison
  - including latency, classification performance (e.g., confusion matrix), etc.
- Visual analytics for Neural Architecture Search (NAS)
  - for automating the design of artificial neural networks
Conclusion
- Defining the problem to solve (which tasks visual analytics can support) is important.
- An extensible design should be considered when developing visual analytics.
26. Issues in development
HyperTendril
- Performance Issues
- The browser crashed when drawing line charts with numerous data points => reservoir sampling
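A minimal sketch of reservoir sampling (Algorithm R) for thinning a long loss curve before drawing it. It keeps a fixed number of points, each with equal probability, in a single pass and without knowing the curve length in advance. The function name, the `(step, value)` point layout, and the final sort by step are assumptions for illustration. The slides do not show how HyperTendril applies the technique, and a browser-side version would be written in JavaScript rather than Python.

```python
import random
from typing import Iterable, List, Optional, Tuple

Point = Tuple[float, float]  # (step, value), e.g. (iteration, loss)

def reservoir_sample(points: Iterable[Point], k: int, seed: Optional[int] = None) -> List[Point]:
    """Uniformly sample up to k points from a stream in one pass (Algorithm R)."""
    rng = random.Random(seed)
    reservoir: List[Point] = []
    for i, point in enumerate(points):
        if i < k:
            # Fill the reservoir with the first k points.
            reservoir.append(point)
        else:
            # Point i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = point
    # Restore step order so the sample can be drawn as a line.
    reservoir.sort(key=lambda p: p[0])
    return reservoir

# Usage: thin a 100,000-point synthetic loss curve down to 500 points.
loss_curve = [(float(step), 1.0 / (step + 1)) for step in range(100_000)]
sampled = reservoir_sample(loss_curve, k=500, seed=0)
print(len(sampled), sampled[0][0] <= sampled[-1][0])
```

Uniform sampling keeps the overall shape of the curve but can drop narrow spikes. If spikes matter, a min/max-preserving downsampling scheme may be a better fit.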
27. References
HyperTendril
- Knowledge Generation Models for Visual Analytics [1]
- Visual Analytics in Deep Learning [2]
- TensorBoard graph visualizer [3]
- CNNVis [4]
- GAN Lab [5]
- RNNVis [6]
- LSTMVis [7]
- ActiVis [8]
- D3.js [9]
- Reservoir sampling [10]
[1] Sacha, Dominik, et al. "Knowledge generation model for visual analytics." IEEE Transactions on Visualization and Computer Graphics 20.12 (2014): 1604-1613.
[2] Hohman, Fred Matthew, et al. "Visual analytics in deep learning: An interrogative survey for the next frontiers." IEEE Transactions on Visualization and Computer Graphics (2018).
[3] Girija, Sanjay Surendranath. "TensorFlow: Large-scale machine learning on heterogeneous distributed systems." Software available from tensorflow.org (2016).
[4] Liu, Mengchen, et al. "Towards better analysis of deep convolutional neural networks." IEEE Transactions on Visualization and Computer Graphics 23.1 (2017): 91-100.
[5] Kahng, Minsuk, et al. "GAN Lab: Understanding complex deep generative models using interactive visual experimentation." IEEE Transactions on Visualization and Computer Graphics 25.1 (2019): 310-320.
[6] Karpathy, Andrej, Justin Johnson, and Li Fei-Fei. "Visualizing and understanding recurrent networks." arXiv preprint arXiv:1506.02078 (2015).
[7] Strobelt, Hendrik, et al. "LSTMVis: A tool for visual analysis of hidden state dynamics in recurrent neural networks." IEEE Transactions on Visualization and Computer Graphics 24.1 (2018): 667-676.
[8] Kahng, Minsuk, et al. "ActiVis: Visual exploration of industry-scale deep neural network models." IEEE Transactions on Visualization and Computer Graphics 24.1 (2018): 88-97.
[9] D3.js. https://d3js.org/
[10] Vitter, J. S. "Random sampling with a reservoir." ACM Transactions on Mathematical Software (TOMS) 11.1 (1985): 37-57.