Task-Agnostic Vision Transformer for
Distributed Learning of Image Processing
Boah Kim*, Jeongsol Kim*, and Jong Chul Ye
IEEE Transactions on Image Processing
Research objectives
Distributed learning
• To train a single network on multiple devices using local data
• Ex. Federated learning (FL), Split learning (SL)
[1] https://proandroiddev.com/federated-learning-e79e054c33ef [2] Singh, Abhishek, et al. arXiv preprint arXiv:1909.09145 (2019).
• Federated learning: parallel communication between the server and each client to aggregate locally trained weights (a minimal aggregation sketch follows)
• Split learning: decomposition of a single network across the clients and the server
• Both usually consider a common task such as classification
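Since FL-style weight aggregation also reappears in the training scheme later, here is a minimal FedAvg-style sketch for reference; the function name `fedavg` and the per-client sample counts are illustrative, not from the paper.

```python
import copy

def fedavg(client_states, client_sizes):
    """Weighted average of client state dicts (minimal FedAvg-style).

    client_states: list of model.state_dict() from the clients.
    client_sizes: number of local training samples per client.
    Assumes all entries are floating-point tensors.
    """
    total = float(sum(client_sizes))
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = sum(
            state[key] * (n / total)
            for state, n in zip(client_states, client_sizes)
        )
    return avg
```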
Research goal
• To process various tasks across the clients without sharing local data
• To propose a systematic way for clients to synergistically learn multiple image processing tasks
Background: Multi-task learning (MTL)
[1] https://pyimagesearch.com/2022/08/17/multi-task-learning-and-hydranets-with-pytorch/ [2] Kendall et al., 2017
• To enhance the generalization of a model on one task by learning shared representations of related tasks (see the sketch below)
• To improve computational efficiency and reduce overfitting
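In these models the "shared representation" is typically realized as hard parameter sharing: one trunk shared by all tasks, plus one small head per task. A minimal sketch, with illustrative layer sizes and names rather than any specific model from [1] or [2]:

```python
import torch.nn as nn

class HardSharingMTL(nn.Module):
    """Hard parameter sharing: one shared trunk, one head per task."""

    def __init__(self, num_tasks: int, dim: int = 32):
        super().__init__()
        self.trunk = nn.Sequential(              # shared across all tasks
            nn.Conv2d(3, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(),
        )
        self.heads = nn.ModuleList(              # one output head per task
            [nn.Conv2d(dim, 3, 3, padding=1) for _ in range(num_tasks)]
        )

    def forward(self, x, task_id: int):
        return self.heads[task_id](self.trunk(x))
```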
Unlike existing MTL models that learn similar tasks, our model learns multiple different image processing tasks.
Background: Image processing using Transformer
[1] Vaswani, Ashish, et al. NeurIPS (2017). [2] Dosovitskiy, Alexey, et al. ICLR 2021. [3] Chen, Hanting, et al. CVPR 2021.
• Transformer [1] (NLP): a network that solves sequence-to-sequence tasks by capturing long-range dependencies via self-attention (sketched below)
• Vision Transformer (ViT) [2] (CV): an encoder-only architecture for image recognition tasks
• Image Processing Transformer (IPT) [3]: CNN heads/tails & a Transformer body for low-level vision tasks
→ Our goal: a Transformer-based distributed learning framework that does not require centralized data
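For context, the self-attention that gives these models their long-range receptive field takes only a few lines. Below is a single-head, scaled dot-product version; the variable names are ours:

```python
import torch

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention.

    x: (batch, tokens, dim); wq/wk/wv: (dim, dim) projection matrices.
    Every output token is a weighted mixture of all input tokens,
    which is what "long-range dependency" means here.
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = (q @ k.transpose(-2, -1)) / (k.shape[-1] ** 0.5)
    return scores.softmax(dim=-1) @ v
```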
Proposed method
Task-Agnostic Vision Transformer (TAViT)
• Subscription-based service model → clients subscribe to a task-agnostic Transformer at the server
- Clients: CNN heads/tails tailored to their own tasks
- Server: encoder-only Transformer that learns global attention over the image features (composition sketched below)
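A minimal sketch of how the three parts compose; class names, layer sizes, and depths are illustrative, not the paper's exact architecture:

```python
import torch.nn as nn

class ClientHead(nn.Module):
    """Client-side CNN head: degraded image -> feature map."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1),
        )
    def forward(self, x):
        return self.conv(x)

class ServerBody(nn.Module):
    """Server-side task-agnostic Transformer encoder over feature tokens."""
    def __init__(self, dim: int = 64, depth: int = 4, heads: int = 8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
    def forward(self, feat):                      # feat: (B, C, H, W)
        b, c, h, w = feat.shape
        tokens = feat.flatten(2).transpose(1, 2)  # (B, H*W, C)
        tokens = self.encoder(tokens)             # global self-attention
        return tokens.transpose(1, 2).view(b, c, h, w)

class ClientTail(nn.Module):
    """Client-side CNN tail: self-attended features -> restored image."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.conv = nn.Conv2d(dim, 3, 3, padding=1)
    def forward(self, feat):
        return self.conv(feat)
```

A full forward pass is then `tail(body(head(x)))`, with only intermediate features crossing the client-server boundary.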
Training scheme
• Task-specific learning: train the client-side task-specific head and tail networks
• Task-agnostic learning: train the server-side task-agnostic body network
→ Considering the clients and the server as two players leads to an alternating training strategy (sketched below)
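A schematic of the two-player alternation; `train_clients` and `train_server` are hypothetical callbacks standing in for the two phases detailed on the next slides:

```python
def alternate_training(train_clients, train_server, num_cycles: int = 3):
    """Alternate between the two players for a fixed number of cycles.

    train_clients / train_server: caller-supplied callbacks (hypothetical
    here); the default of three cycles matches the comparison experiment
    reported later.
    """
    for _ in range(num_cycles):
        train_clients()  # task-specific phase: heads/tails learn, body fixed
        train_server()   # task-agnostic phase: body learns, heads/tails fixed
```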
Task-specific learning
• Clients train their own heads/tails in parallel with the fixed body, using locally stored datasets
• Optimization: back-propagation through the frozen body to each client's head and tail
• When there are multiple clients for the same task, their heads/tails are aggregated via federated learning (a single-step sketch follows)
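A single-machine sketch of one task-specific update; in the real system the body runs on the server, so only intermediate features and their gradients would cross the network. The L1 loss and all names here are illustrative assumptions:

```python
import torch.nn.functional as F

def client_step(head, tail, body, batch, opt):
    """One task-specific step: only head/tail parameters are updated.

    opt is assumed to hold the head and tail parameters. The body is
    frozen, but gradients still flow through it back to the head.
    """
    x, target = batch
    body.requires_grad_(False)       # fix the Transformer body
    out = tail(body(head(x)))        # split forward pass
    loss = F.l1_loss(out, target)    # loss choice is illustrative
    opt.zero_grad()
    loss.backward()                  # grads pass through the frozen body
    opt.step()
    return loss.item()
```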
Task-agnostic learning
• The server trains the Transformer body with the fixed head & tail of a randomly chosen client at each iteration (sketched below)
• Optimization: back-propagation with the selected client's head and tail frozen
• Goal: learn a global embedding representation → provide task-agnostic self-attended features for various image processing tasks
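The mirror image of the client phase, with the same caveats; `clients` is a hypothetical list of (head, tail, loader) triples:

```python
import random
import torch.nn.functional as F

def server_step(body, clients, opt):
    """One task-agnostic step: only the body parameters are updated.

    opt is assumed to hold only the body parameters; the randomly
    chosen client's head and tail are frozen for this iteration.
    """
    head, tail, loader = random.choice(clients)
    x, target = next(iter(loader))
    head.requires_grad_(False)
    tail.requires_grad_(False)
    loss = F.l1_loss(tail(body(head(x))), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```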
Experimental results
Multi-task distributed learning
• 1 server + 5 clients (2 deblocking, 1 denoising, 1 deraining, 1 deblurring)
Comparison to distributed learning strategies
• To compare TAViT trained for three cycles with SL and FL
Comparison to learning each separate task
• To compare TAViT with models independently trained on each individual task
Comparison to task-specific models
• To evaluate TAViT against several representative task-specific methods on benchmark datasets
Conclusion
• Propose a novel multi-task distributed learning method, called TAViT: task-specific CNN heads/tails placed on the clients + a task-agnostic Transformer body placed on the server
- Trained by an alternating scheme between task-specific learning & task-agnostic learning
• Experimental results show the success of the task-agnostic learning of the Transformer body and its synergistic improvement with the task-specific heads and tails
• Through TAViT, clients can train their own task-specific networks in parallel using local data
Thank you.