Federated learning trains a model on a centralized server using datasets distributed over a massive number of edge devices. Since federated learning does not send local data from the edge devices to the server, it preserves data privacy: the local models, rather than the local data, are transferred from the edge devices. However, communication cost is frequently a problem in federated learning. This paper proposes a novel method to reduce the communication cost required for federated learning by transferring only the most updated parameters of neural network models. The proposed method allows adjusting the selection criterion for updated parameters to trade off the reduction in communication cost against the loss of model accuracy. We evaluated the proposed method using diverse models and datasets and found that it achieves performance comparable to transferring the original models. As a result, the proposed method reduced the required communication cost by around 90% compared to the conventional method for VGG16. Furthermore, we found that the proposed method can reduce the communication cost of a large model more than that of a small model, due to the different thresholds of updated parameters in each model architecture.
Sparse Communication for Federated Learning
IEEE International Conference on Fog and Edge Computing 2022, Taormina, Italy, May 18, 2022
Kundjanasith Thonglek¹, Keichi Takahashi², Kohei Ichikawa¹, Chawanat Nakasan³, Pattara Leelaprute⁴, and Hajimu Iida¹
¹ Nara Institute of Science and Technology, Nara, Japan
² Tohoku University, Sendai, Japan
³ Kanazawa University, Ishikawa, Japan
⁴ Kasetsart University, Bangkok, Thailand
Deployment approaches for AI applications
[Figure: side-by-side comparison of cloud-based and edge-based AI deployment, showing the flow of input data and inference results]

Cloud-based AI
➢ Pros: the model is trained using data from all edge devices
➢ Cons: longer response time; poor data privacy

Edge-based AI
➢ Pros: shorter response time; better data privacy
➢ Cons: edge devices cannot share their data with other devices
Federated learning

[Figure: each client performs local training on its own data; the server aggregates the local models via Federated Averaging (FedAVG) and updates the existing global model]

➢ Pros: the model is trained using the local data of all edge devices; shorter response time and better data privacy
➢ Cons: large network bandwidth consumption, because the models need to be exchanged between the server and clients
Federated averaging (FedAVG)

[Figure: the server sends the global model to Client 1 and Client 2; each client trains a local model on its data; the server averages the local models into a new global model]

➢ The number of selected clients in each round is C · N, where C is the fraction of selected clients and N is the total number of clients.
➢ w_G denotes the weights of the global model on the server.
➢ w_k denotes the weights of the local model on client k.
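As a reference, a minimal sketch of the FedAVG aggregation step in plain NumPy (the function and variable names are illustrative, not from the paper):

```python
import numpy as np

def fedavg(local_weights):
    """Average the weights of the local models, layer by layer.

    local_weights: list of models, each a list of NumPy arrays
    (one array per layer). Returns the new global model.
    """
    num_clients = len(local_weights)
    num_layers = len(local_weights[0])
    return [
        sum(w[layer] for w in local_weights) / num_clients
        for layer in range(num_layers)
    ]

# Two clients, each holding a single 2x2 layer
client_a = [np.array([[1.0, 2.0], [3.0, 4.0]])]
client_b = [np.array([[3.0, 4.0], [5.0, 6.0]])]
global_model = fedavg([client_a, client_b])
print(global_model[0])  # [[2. 3.] [4. 5.]]
```

This is the unweighted average; the FedAVG paper weights each client by its number of local samples, which would add one multiplication per client.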
Downside #1: Whole models are exchanged

Since the whole models are exchanged between the server and clients, transferring parameters that were not updated wastes network bandwidth.

[Figure: Clients 1–3 each perform local training and send their whole models to the server, including parameters that were not updated]
Downside #2: Only a subset of clients participate

Since only a subset of clients participate in each round, the server misses local updates that could have been obtained from the excluded clients.

[Figure: Clients 1–3 each perform local training, but only some of their models are uploaded to the server; the rest are not uploaded]
Model sparsification

Model sparsification omits some parameters of a dense model to build a sparse model while keeping the same model architecture. Parameters that change significantly during training are expected to have a large impact on the model performance.

[Figure: a dense model combined with a mask yields a sparse model]

The proposed method exchanges only the most updated parameters of the model between the server and clients.
Basic idea behind the proposed method

Sparsify the models exchanged between the server and clients in both directions.

[Figure: each of Clients 1..N sparsifies its local model before uploading it; the server aggregates the sparse local models, sparsifies the global model, and sends a sparse global model back to each client]
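To make the round structure concrete, here is a toy NumPy sketch of one communication round with sparsification in both directions. The random noise stands in for local training, and all names are illustrative; this is a sketch of the idea, not the paper's implementation:

```python
import numpy as np

def sparsify(prev, new, q):
    """Keep only the most updated parameters (top 1 - q fraction)."""
    update = np.abs(new - prev)
    mask = update > np.quantile(update, q)
    return mask, new[mask]

def apply_update(dense, mask, values):
    """Replace the masked positions of a dense model with new values."""
    out = dense.copy()
    out[mask] = values
    return out

rng = np.random.default_rng(0)
n_clients, shape, q = 3, (4, 4), 0.7
global_w = rng.normal(size=shape)
local_w = [global_w.copy() for _ in range(n_clients)]
server_locals = [w.copy() for w in local_w]  # server-side copies

# --- one communication round ---
masks = []
for k in range(n_clients):
    trained = local_w[k] + rng.normal(scale=0.1, size=shape)  # stand-in for local training
    mask, vals = sparsify(local_w[k], trained, q)             # uplink: sparse local model
    masks.append(mask)
    server_locals[k] = apply_update(server_locals[k], mask, vals)
    local_w[k] = trained
new_global = sum(server_locals) / n_clients                   # FedAVG aggregation
for k in range(n_clients):
    vals = new_global[masks[k]]                               # downlink: reuse client k's mask
    local_w[k] = apply_update(local_w[k], masks[k], vals)
```

Only the masked values (plus the mask) cross the network in either direction, which is the source of the communication savings.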
Overview of the proposed method (Uplink)

[Figure: Clients 1 and 2 each train a local model (w_k) on their data and apply local model sparsification; the sparse local models are uploaded to the server, which applies the model update to its copies of the local models and aggregates them into a new global model (w_G)]
Overview of the proposed method (Downlink)

[Figure: the server applies global model sparsification, reusing the mask of each client's sparse local model, and sends a per-client sparse global model; each client applies the model update to its local model, trains on its data, and applies local model sparsification again]
Model update

Updates a dense model using the sparse updates sent from the server or clients: the parameters of the dense model at the masked positions are replaced with the values from the sparse model.

Original dense model:
1 2 3
4 5 6
7 8 9

Sparse model (values at the masked positions only):
2 . .
. . .
6 . 10

Updated dense model:
2 2 3
4 5 6
6 8 10
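The model-update step can be sketched as follows (a minimal NumPy version, assuming the sparse model is transmitted as a boolean mask plus the values at the masked positions; the names are illustrative):

```python
import numpy as np

def apply_sparse_update(dense, mask, values):
    """Replace the masked positions of a dense weight matrix
    with the values received from the sparse model."""
    updated = dense.copy()
    updated[mask] = values
    return updated

dense = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
mask = np.array([[True,  False, False],
                 [False, False, False],
                 [True,  False, True]])
values = np.array([2, 6, 10])  # new values, in row-major order of the mask
print(apply_sparse_update(dense, mask, values))
# [[ 2  2  3]
#  [ 4  5  6]
#  [ 6  8 10]]
```

With boolean-mask assignment, NumPy fills the selected positions in row-major order, so the values must be sent in that same order.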
Local model sparsification

Extracts the most updated parameters to construct the sparse local model.

Previous local model (w_t):
1 2 3
4 5 6
7 8 9

Trained local model (w_{t+1}):
2 2 1
3 5 9
1 4 3

Absolute update |w_{t+1} − w_t|:
1 0 2
1 0 3
6 4 6

The parameter Q is supplied by the user to adjust the trade-off between communication cost and model accuracy (e.g., if Q is set to 0.7, the top 30% of the parameters are selected for transfer). The 0.7 quantile of (0, 0, 1, 1, 2, 3, 4, 6, 6) is 3.6, so the computed mask selects the positions whose update exceeds 3.6:

. . .
. . .
6 4 6

Masking the trained local model at these positions yields the sparse local model:

. . .
. . .
1 4 3
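The sparsification step can be sketched as follows, reproducing the worked example on this slide (a minimal NumPy version; the function name is illustrative):

```python
import numpy as np

def sparsify_local_model(prev_weights, new_weights, q):
    """Select the top (1 - q) fraction of the most updated
    parameters, e.g. q = 0.7 keeps the top 30%."""
    update = np.abs(new_weights - prev_weights)
    threshold = np.quantile(update, q)  # linear-interpolation quantile
    mask = update > threshold
    return mask, new_weights[mask]      # positions + their new values

prev = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
new = np.array([[2, 2, 1], [3, 5, 9], [1, 4, 3]])
mask, values = sparsify_local_model(prev, new, q=0.7)
print(values)  # [1 4 3]
```

Only `mask` and `values` need to be uploaded; for a 90% reduction (Q near 0.9), the mask itself can be sent as a bitmap, which is small compared to the float parameters it replaces.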
Global model sparsification

Reuses the local model's mask to construct the sparse global model: the masked parameters have not yet converged on the client side, so those parameters still need to be updated.

Sparse local model (received from the client; bottom row selected):
. . .
. . .
1 4 3

Global model (w_G):
1 2 3
4 5 6
7 8 9

Sparse global model (same mask applied to w_G):
. . .
. . .
7 8 9
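Reusing the client's mask on the server side is then a one-line masking operation (a minimal sketch; the names are illustrative):

```python
import numpy as np

def sparsify_global_model(global_weights, client_mask):
    """Extract the global-model values at the positions the
    client reported as most updated (its sparsification mask)."""
    return global_weights[client_mask]

global_w = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Mask from the client's local sparsification (bottom row selected)
client_mask = np.array([[False, False, False],
                        [False, False, False],
                        [True,  True,  True]])
print(sparsify_global_model(global_w, client_mask))  # [7 8 9]
```

Because the client already holds the mask it sent, the downlink only needs to carry the masked values, not the mask itself.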
Proposed method vs conventional method

Downlink communication is the main difference between the proposed and conventional methods.

[Figure: in the conventional method, the server sends the same global model to Clients 1..N; in the proposed method, global model sparsification is applied per client using each client's mask, so each client receives a different sparse global model]
Experimental environment

➢ Models:
1. VGG16 (553.43 MB)
2. ResNet152 (243.21 MB)
3. DenseNet201 (89.92 MB)
4. MobileNet (17.02 MB)
➢ Datasets:
1. CIFAR-10
2. CIFAR-100
3. MNIST
4. FMNIST

Configuration                    | Value
# of communication rounds (R)    | 10
# of clients (N)                 | 10
# of local epochs (E)            | 5
Local batch size (B)             | 8

Although large models can generally achieve higher accuracy than small models, not all edge devices can deploy large models due to resource constraints. Thus, we evaluated the proposed method using models of different scales.
Comparison to the conventional method

Since the conventional method sends the same global model to all clients, it is unable to build highly accurate local models, which also leads to a decrease in the accuracy of the global model.

[Figure: local and global model accuracy for the conventional method and the proposed method]

➢ The global model accuracy of the proposed method is higher than that of the conventional method.
➢ The variance of local model accuracy in the proposed method is lower than that of the conventional method.
Existing methods and their hyperparameters

Method name            | Hyperparameter | Description
FedAVG                 | C              | Fraction of clients selected in each communication round
FedPSO                 | N/A            | Does not have a hyperparameter to control communication cost
FedMMD                 | 𝜆              | Coefficient of the MMD loss between the global and local models
Low-rank approximation | K              | Rank of the low-rank matrix to be converted
Random mask            | M              | Size of the random mask used to generate a random pattern
Quantization           | B              | Number of bits used for quantization
Comparison to the existing methods

The proposed method outperforms the existing methods in terms of both the communication cost and the accuracy of the global model.
Results for different models

[Figure: results for VGG16 (553 MB), ResNet152 (243 MB), DenseNet201 (90 MB), and MobileNet (17 MB)]

➢ The reductions in communication cost from FedPSO, FedMMD, low-rank approximation, random mask, and quantization are almost identical across all model architectures.
➢ The reductions in communication cost for FedAVG and the proposed method depend on the size of each model architecture (larger models are compressed more than smaller models).
Results for different datasets

[Figure: results on CIFAR-10, CIFAR-100, MNIST, and FMNIST]

➢ The proposed method is evaluated on four image classification datasets.
➢ Q is varied from 0.1 to 0.9 at intervals of 0.1, and from 0.91 to 0.99 at intervals of 0.01.
➢ The proposed method produces consistent results across all datasets.
➢ It is able to reduce the amount of data transfer more for larger models than for smaller models without a significant loss of accuracy.
Why are larger models amenable to compression?

➢ In larger models (e.g., VGG16), small parameter updates are more frequent than in smaller models (e.g., MobileNet).
  ○ Small parameter updates have a smaller impact on the model performance.
➢ Large models receive more low-impact updates than small models.
  ○ The proposed method drops those low-impact updates in large models without a significant loss of accuracy.

[Figure: frequency of updated values per communication round on a client]
Conclusion

➢ We proposed a novel method to reduce the communication cost of federated learning by sparsifying local and global models in both uplink and downlink communication.
➢ The proposed method works by exchanging only the most updated parameters of the neural network models.
➢ Diverse models and datasets were used to evaluate the proposed method in terms of model accuracy and communication cost.
  ○ The proposed method achieved a reduction in communication cost of approximately 90% (for VGG16).
Future work

➢ Other neural network architectures should be investigated to further reduce the required communication cost.
➢ The parameter update behavior of other neural network models should be observed during the local training procedure.
➢ A larger number of edge devices should be used to evaluate the efficiency of the proposed method.
Q & A
Thank you for your attention
Email: thonglek.kundjanasith.ti7@is.naist.jp