Federated learning trains a model on a centralized server using datasets distributed over a large number of edge devices. It preserves data privacy because local data never leaves the edge devices. Existing federated learning algorithms assume that all deployed models share the same structure; however, distributing the same model to every edge device is often infeasible due to hardware limitations such as computing performance and storage capacity. This paper proposes a novel federated learning algorithm that aggregates information from multiple heterogeneous models. The proposed method combines the outputs of the individual models with a weighted average ensemble, whose weights are tuned using black-box optimization methods. We evaluated the proposed method on diverse models and datasets and found that it achieves accuracy comparable to conventional training on centralized datasets. Furthermore, we compared six optimization methods for tuning the ensemble weights and found that the Tree Parzen Estimator (TPE) achieves the highest accuracy among the alternatives.
1. Federated Learning of Neural Network Models with Heterogeneous Structures
Kundjanasith Thonglek¹, Keichi Takahashi¹, Kohei Ichikawa¹, Chawanat Nakasan², Hajimu Iida¹
¹ Nara Institute of Science and Technology, Nara, Japan
² Kanazawa University, Ishikawa, Japan
IEEE International Conference on Machine Learning and Applications 2020, December 14-17, 2020
2. Edge intelligence

Cloud-based AI
➢ Pros: Gather data from edge devices; update the model frequently
➢ Cons: Longer response time; lack of data privacy

Edge-based AI
➢ Pros: Shorter response time; better data privacy
➢ Cons: Cannot update the model from data collected at the edge

[Figure: input data and inference results flowing between edge devices and a cloud-hosted model vs. edge-hosted models]
3. Federated learning

[Figure: edge devices train local models on their local data and send updated parameters to the server; a federated learning algorithm aggregates them into a global model]

Federated learning algorithms require the models to have homogeneous structures.

Ref: J. Konecný, H. B. McMahan, F. X. Yu, P. Richtárik, A. T. Suresh, and D. Bacon, "Federated Learning: Strategies for Improving Communication Efficiency," NIPS Workshop on Private Multi-Party Machine Learning, 2016.
4. Heterogeneity of edge devices

Edge devices differ in:
➢ Network bandwidth
➢ Power consumption
➢ Storage capacity
➢ Computing resources (e.g., TPU: Tensor Processing Unit, GPU: Graphics Processing Unit)
5. Federated learning for heterogeneous models

[Figure: edge devices run models with heterogeneous structures, e.g., MobileNet (17.02 MB) and VGG16 (553.43 MB); the updated parameters of each model are sent to the proposed method on the centralized server]
6. Proposed method

[Figure: on the centralized server, a federated learning algorithm (FedAvg) aggregates the updated parameters of each model group separately (all Model 1 replicas, all Model 2 replicas); the aggregated models are then combined by weighted average ensembling]
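The per-group FedAvg step above is a dataset-size-weighted average of client parameters. A minimal NumPy sketch, with hypothetical clients and layer shapes chosen purely for illustration:

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """FedAvg aggregation: average each layer's parameters across
    clients, weighting each client by its local dataset size."""
    total = sum(client_sizes)
    n_layers = len(client_weights[0])
    return [
        sum(w[layer] * (n / total) for w, n in zip(client_weights, client_sizes))
        for layer in range(n_layers)
    ]

# Two hypothetical clients, each holding one weight matrix and one bias vector
c1 = [np.ones((2, 2)), np.zeros(2)]
c2 = [3 * np.ones((2, 2)), np.ones(2)]
global_params = fedavg([c1, c2], client_sizes=[100, 300])
print(global_params[0])  # every entry is 0.25*1 + 0.75*3 = 2.5
```

This sketch only covers the homogeneous aggregation within one model group; the ensembling across groups happens afterwards on the server.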
7. Weighted average ensembling

y = Σᵢ₌₁..ᴺ αᵢ ⊙ mᵢ(x)

where
➢ y is the final output vector
➢ x is the input data
➢ N is the number of models
➢ mᵢ(x) is the prediction of x using the i-th model
➢ αᵢ is the weight vector for the i-th model (⊙ denotes elementwise multiplication)

The weight vectors αᵢ are determined by applying optimization algorithms.
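The weighted average ensemble above can be sketched in a few lines of NumPy. The weight values and class counts here are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def weighted_ensemble(predictions, alphas):
    """Combine per-model class-probability vectors using a per-model,
    per-class weight vector, then renormalize so the result is again
    a probability distribution."""
    combined = sum(a * p for a, p in zip(alphas, predictions))
    return combined / combined.sum()

# Two hypothetical models predicting over 3 classes
m1 = np.array([0.7, 0.2, 0.1])
m2 = np.array([0.1, 0.6, 0.3])
alphas = [np.array([0.5, 0.5, 0.5]),   # weight vector for model 1
          np.array([0.5, 0.5, 0.5])]   # weight vector for model 2
y = weighted_ensemble([m1, m2], alphas)
print(y)  # approximately [0.4, 0.4, 0.2]
```

Because each αᵢ is a vector, a model can be trusted more for some output classes than for others, which is what the optimization step tunes.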
8. Optimization algorithms

1. Grid Search (GS)
Ref: F. Xue, D. Wei, Z. Wang, T. Li, Y. Hu, and H. Huang, "Grid searching method in spherical coordinate for PD location in a substation," International Conference on Condition Monitoring and Diagnosis, Sep. 2018.

2. Random Search (RS)
Ref: Y. Shang and J. Chu, "A method based on random search algorithm for unequal circle packing problem," International Conference on Information Science and Cloud Computing Companion, Dec. 2013.

3. Particle Swarm Optimization (PSO)
Ref: M. Kirschenbaum and D. W. Palmer, "Perceptualization of particle swarm optimization," Proceedings of the Swarm/Human Blended Intelligence Workshop, Sep. 2015.

4. Bayesian Search (BS)
Ref: A. Garcia, E. Campos, and C. Li, "Distributed on-line Bayesian search," International Conference on Collaborative Computing: Networking, Applications and Worksharing, Dec. 2005.

5. Sequential Model-based Algorithm Configuration (SMAC)
Ref: F. Hutter, H. Hoos, and K. Leyton-Brown, "Sequential model-based optimization for general algorithm configuration," International Conference on Learning and Intelligent Optimization, Jan. 2011.

6. Tree Parzen Estimator (TPE)
Ref: M. Zhao and J. Li, "Tuning the hyper-parameters of CMA-ES with tree-structured Parzen estimators," International Conference on Advanced Computational Intelligence, Mar. 2018.
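Each of these black-box optimizers tunes the ensemble weight vectors by maximizing accuracy on a validation set. A minimal sketch of the simplest of the six, random search, using synthetic predictions and labels (all data here is fabricated for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic validation set: class-probability predictions of 2
# hypothetical models over 3 classes for 200 samples, plus labels.
n_samples, n_classes = 200, 3
labels = rng.integers(0, n_classes, n_samples)
preds = [rng.dirichlet(np.ones(n_classes), n_samples) for _ in range(2)]

def accuracy(alphas):
    """Validation accuracy of the weighted average ensemble."""
    combined = sum(a * p for a, p in zip(alphas, preds))
    return np.mean(combined.argmax(axis=1) == labels)

# Random search: sample candidate weight vectors in [0, 1]^C for each
# model and keep the best-scoring candidate.
best_alphas, best_acc = None, -1.0
for _ in range(100):
    candidate = [rng.uniform(0, 1, n_classes) for _ in preds]
    acc = accuracy(candidate)
    if acc > best_acc:
        best_alphas, best_acc = candidate, acc

print(f"best validation accuracy: {best_acc:.3f}")
```

The other five methods plug into the same loop by replacing how candidates are proposed; model-based methods such as SMAC and TPE use the history of evaluated candidates to propose the next one.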
9. Experimental setup

Setup | Total # of devices | Models | # of devices per model | # of images per model
A | 1,200 | MobileNet, DenseNet169 | 600 | 24,000
B | 1,200 | MobileNet, DenseNet169, ResNet50 | 400 | 16,000
C | 1,200 | MobileNet, DenseNet169, ResNet50, VGG16 | 300 | 12,000
11. Datasets

Name | # of images | # of output classes
CIFAR-10 [1] | 70,000 | 10
CIFAR-100 [2] | 70,000 | 100
ImageNet [3] | 100,000 | 1,000
R-Cellular [4] | 73,000 | 1,108

References:
[1] https://www.cs.toronto.edu/~kriz/cifar.html#CIFAR-10
[2] https://www.cs.toronto.edu/~kriz/cifar.html#CIFAR-100
[3] http://image-net.org/download
[4] https://www.kaggle.com/c/recursion-cellular-image-classification
12. Optimized weights

[Figure: optimized per-class ensemble weights; Experimental Setup C, CIFAR-10 dataset, TPE optimization]
13. Improvement of accuracy

[Figure: accuracy improvement from weight optimization; Experimental Setup A, R-Cellular dataset; accuracy before optimizing the weights: 63.19%]
14. Accuracy & runtime

[Figure: accuracy and runtime comparison of the optimization algorithms; Experimental Setup A, R-Cellular dataset]
15. Conclusion

➢ We proposed a novel method to ensemble neural network models with heterogeneous structures for federated learning
➢ We compared six optimization algorithms for tuning the weights for the output classes of each model and found that TPE achieved the highest accuracy
➢ Future work: we will investigate other parameter optimization methods and federated learning algorithms
16. Q&A
Thank you for your attention
Email: thonglek.kundjanasith.ti7@is.naist.jp