1. MULTILAYER FEDERATED LEARNING
Kalp Pawar (BT18CSE0037)
Rajat Adkine (BT18CSE071)
Apurv Chandel (BT18CSE100)
Under the guidance of:
Dr. Tausif Diwan
Assistant Professor
Indian Institute of Information Technology, Nagpur
Final Year Project
for
Indian Institute of Information Technology, Nagpur
2. Table of Contents
▪ Introduction
▪ Problem Statement
▪ Proposed Methodology & Algorithm
▪ Result and Conclusion
▪ Thesis Status
▪ Future Scope
4. FEDERATED LEARNING
Federated learning is a machine learning technique
that trains an algorithm across multiple
decentralized edge devices or servers holding local
data samples, without exchanging them.
5. Applications
• Google makes extensive use of federated
learning in the Gboard mobile keyboard, as
well as in features on Pixel phones and in
Android Messages.
• Apple is using cross-device FL in iOS 13,
for applications like the QuickType
keyboard and the vocal classifier for “Hey
Siri”.
• doc.ai is developing cross-device FL
solutions for medical research, and Snips
has explored cross-device FL for hotword
detection.
6. Dataset and Model
We used the MNIST dataset to test our model. It contains 42,000 images of
handwritten digits, organized into folders named after their respective digits.
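As a concrete illustration, a folder-per-digit layout like this can be loaded directly with Keras. This is a minimal sketch, assuming the images sit under data/mnist/0 ... data/mnist/9 as 28x28 grayscale files; the path, image size, and the small CNN are illustrative assumptions, not the project's exact setup.

import tensorflow as tf

# Each subfolder name (0-9) becomes the class label for the images inside it.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/mnist",            # assumed root: one subfolder per digit
    labels="inferred",
    color_mode="grayscale",
    image_size=(28, 28),
    batch_size=32,
)

# A small CNN is enough for MNIST-scale experiments.
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255, input_shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])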
8. Federated Architecture
• The global model is shared with
multiple clients.
• Those clients train the machine
learning model on their local data.
• The local results are aggregated
using algorithms like FedAvg,
FedProx, etc.; a minimal FedAvg
sketch follows below.
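FedAvg takes a data-size-weighted average of the client weights. The NumPy layout below (one list of per-layer arrays per client) is an illustrative assumption, not the project's exact implementation.

import numpy as np

def fedavg(client_weights, client_sizes):
    # client_weights: one list of np.ndarrays (one per layer) per client.
    # client_sizes:   number of local training samples per client.
    total = float(sum(client_sizes))
    n_layers = len(client_weights[0])
    return [sum(w[l] * (n / total)
                for w, n in zip(client_weights, client_sizes))
            for l in range(n_layers)]

# Example: two clients, a one-layer "model", client 0 has twice the data.
w = fedavg([[np.ones(3)], [np.zeros(3)]], client_sizes=[200, 100])
print(w)  # [array([0.667, 0.667, 0.667])] (approximately)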
9. CHALLENGES IN FEDERATED LEARNING
• Privacy is a first-order concern in FL, even if the experiments are simulations
running on a single machine using public data.
• Efficient aggregation of the clients' local model updates is needed to obtain
better model predictions; a toy privacy-hardened aggregation sketch follows below.
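One standard way to address both concerns at the aggregation step is differential privacy: clip each client update to bound its influence, then add calibrated Gaussian noise to the average. The clip norm and noise multiplier below are illustrative assumptions, and this toy sketch omits the formal privacy accounting.

import numpy as np

def dp_average(updates, clip_norm=1.0, noise_multiplier=1.0, seed=None):
    rng = np.random.default_rng(seed)
    # Clip each flattened client update so no single client dominates.
    clipped = [u * min(1.0, clip_norm / (np.linalg.norm(u) + 1e-12))
               for u in updates]
    mean = np.mean(clipped, axis=0)
    # Gaussian noise scaled to the sensitivity of the average.
    noise = rng.normal(0.0, noise_multiplier * clip_norm / len(updates),
                       size=mean.shape)
    return mean + noise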
10. Federated Learning
▪ Cross-Device:
▪ Federated learning across a large population of edge devices such as phones and IoT devices.
▪ Learning is done remotely on each device; the central model is updated using an appropriate federation technique.
▪ Operates at a very large scale and may involve millions of devices in the federation.
▪ Devices can go offline at any time.
▪ Cross-Silo:
▪ Multi-institutional, cross-organizational federated learning.
▪ Data is partitioned into silos, each with an associated training node.
▪ Data is siloed across organizations, such as hospitals, banks, or wearable consumer
data held separately by various fitness-app companies.
11. Federated Learning
▪ Difference:
▪ Cross-device focuses on mobile and IoT devices, while cross-silo focuses on devices within
organizations; neither focuses on both.
▪ Cross-device can support a larger number of clients, but the model may take days (1-10
days) to converge.
▪ Cross-silo is relatively more reliable than cross-device federated learning.
▪ Cross-device provides better privacy but may result in decreased efficiency.
12. PROBLEM STATEMENT
● Most models available today focus on efficiency or functionality; very few focus on
privacy.
● One model that does provide strong privacy is cross-device federated learning, but it is not
very efficient compared to other models.
● We propose a model that combines the advantages of cross-device and cross-silo federated
learning to improve both privacy and efficiency.
13. Proposed Methodology
Client selection: The server samples from the clients that meet the
eligibility criteria. For example, client devices that are plugged into a
charger and have a good connection speed are eligible to train
our model.
Broadcast: The selected clients then receive the training model and weights
from the server (e.g., a TensorFlow graph).
Client computation: Each client performs local computation and
training to update the model (e.g., SGD in Federated Averaging).
Aggregation: The server then aggregates the clients' updates.
Various aggregation methods such as FedAvg and FedNova can be
applied for accuracy, and encryption and differential privacy can be
applied to increase privacy.
Model update: The server locally updates the shared model based
on the aggregated update computed from the clients that
participated in the current round. (A sketch of one such round follows below.)
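A minimal sketch tying the five steps above into a single round; the eligibility predicate, the local_train stub, and the client object shape are illustrative assumptions rather than the project's exact interfaces.

import numpy as np

def run_round(global_weights, clients, local_train, n_select=10, seed=None):
    rng = np.random.default_rng(seed)
    # 1. Client selection: keep only eligible (charging, good network) clients.
    eligible = [c for c in clients if c.is_eligible()]
    idx = rng.permutation(len(eligible))[:n_select]
    chosen = [eligible[i] for i in idx]
    # 2-3. Broadcast the current weights; each client trains locally.
    results = [local_train(global_weights, c) for c in chosen]
    sizes = [c.num_samples for c in chosen]
    # 4-5. Aggregate with a FedAvg-style weighted average and return
    # the updated shared model.
    total = float(sum(sizes))
    return [sum(w[l] * (n / total) for w, n in zip(results, sizes))
            for l in range(len(global_weights))]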
14. Algorithm
Server
▪ Result: Creates tasks for the clients and distributes
them to the managing nodes
1. for each round t = 0, 1, 2, ... do
2.   S <- (sample a random set of managing nodes)
3.   for each node k in S do
4.     pass data to managing node k
5.   end
6. end
Management Node
▪ Result: Returns the accumulated data from its group of
clients
1. S <- (sample a random set of clients)
2. for each client k in S in the graph (decentralized nodes) do
3.   updateWeightUsingFedAvg(ClientTraining(k, Wt))
4. end
5. return the aggregated weights W to the server
15. Algorithm
Client device
▪ Result: Returns the updated weights
1. for each local epoch i = 1, 2, ... do
2.   for each batch B do
3.     train on B using the broadcast model
4.   end
5. end
6. return W to the server
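Putting the three roles from slides 14-15 together, here is a minimal single-machine sketch of the multilayer scheme: the server samples management nodes, each management node runs FedAvg over a random subset of its clients, and the server averages the management-node results. The toy local-training step and all names are illustrative assumptions, not the exact implementation.

import numpy as np

rng = np.random.default_rng(0)

def client_training(weights, client_grad, lr=0.1):
    # Stand-in for the client's local epochs (slide 15): one gradient step.
    return [w - lr * g for w, g in zip(weights, client_grad)]

def management_node(weights, clients, n_sample=3):
    # S <- sample a random set of clients, then FedAvg their returned weights.
    idx = rng.permutation(len(clients))[:n_sample]
    results = [client_training(weights, clients[i]) for i in idx]
    return [np.mean([r[l] for r in results], axis=0)
            for l in range(len(weights))]

def server(weights, node_groups, rounds=5, n_nodes=2):
    for t in range(rounds):
        # Sample a random set of managing nodes and pass them the weights.
        idx = rng.permutation(len(node_groups))[:n_nodes]
        node_results = [management_node(weights, node_groups[i]) for i in idx]
        # Aggregate the accumulated management-node weights.
        weights = [np.mean([r[l] for r in node_results], axis=0)
                   for l in range(len(weights))]
    return weights

# Tiny demo: 4 management nodes, 5 clients each, a one-layer "model".
groups = [[[rng.normal(size=3)] for _ in range(5)] for _ in range(4)]
print(server([np.zeros(3)], groups))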
16. Result and Conclusion
The accuracy of the model remains intact, the model is more robust and error tolerant, and we have also introduced more privacy.
Further improvements could be made with the help of fine-grained training models that use blockchains for additional
security and monitoring.
17. Result and Conclusion
▪ More work is required to evaluate its accuracy in different situations and against
different models.
▪ We were able to experiment and confirm that this setting significantly reduces the number of
failures caused by network issues and uncertain client availability, and reduces the
overhead of obtaining the right kind of data by selectively filtering devices on non-PII data.
▪ Moving forward, we need to analyze the performance of this approach on real-world datasets and
compare it with various other federated learning approaches.