Federated Learning, a Brief Overview

*https://en.wikipedia.org/wiki/File:Centralized_federated_learning_protocol.png
Machine Learning
Machine Learning is the “field of study that gives computers the
capability to learn without being explicitly programmed.” (Arthur Samuel,
1959)
Traditional programming relies on explicit rules. The function below maps the inputs X = 1, 2, 3 to the outputs Y = 1, 4, 9 because we wrote the rule ourselves:

def square(x):
    return x * x

square(4)  # 16

Traditional Programming: Rules + Data → Output (explicit instructions).
Machine Learning: Data + Output → Rules (algorithms implicitly learn the rules from examples).

In machine learning, we instead supply input/output examples and let the algorithm learn the mapping:

X   Y
1   1
2   4
3   9
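To make "algorithms implicitly learn the rules from examples" concrete, here is a minimal sketch (not from the slides; the degree-2 polynomial model is an illustrative assumption) that recovers the squaring rule from the (X, Y) examples above:

import numpy as np

# Input/output examples from the table above.
X = np.array([1.0, 2.0, 3.0])
Y = np.array([1.0, 4.0, 9.0])

# Fit y = a*x^2 + b*x + c to the examples: the rule is learned
# from data instead of being written by hand.
a, b, c = np.polyfit(X, Y, deg=2)
print(round(a, 3), round(b, 3), round(c, 3))  # 1.0 0.0 0.0  ->  y = x^2

# The learned rule generalizes to an unseen input.
print(np.polyval([a, b, c], 4))               # ~16.0, matching square(4)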
Why Machine Learning?

Simple problem, simple data → Traditional Programming. Computing a factorial, for example, is easy to code explicitly:

def factorial(x):
    result = 1
    for i in range(1, x + 1):
        result = result * i
    return result

factorial(4)  # 1 * 2 * 3 * 4 = 24

Complex problem → Machine Learning. Consider: is this image a dog or a cat?

def function(image):
    ...  # very complex to code this function with traditional programming

Writing explicit rules for such a function is impractical. A machine learning algorithm instead forms a hypothesis from labeled examples (Cat, Dog, Cat, …) and uses it to predict labels for new images.
Why is ML such a hot topic these days?

1. Data
2. Computing power
How ML works at scale.
Data, a blessing as well as a curse.

➔ The more data, the more accurate the ML model; but handling and training on huge amounts of data is a problem.
➔ Compute/processing power is increasing, but dataset sizes are growing much more rapidly than computing power.
➔ This results in longer training times, and large ML models may even cause out-of-memory errors.
➔ Many of today's datasets cannot be handled on a single machine because of their humongous size.

So, what is the solution?
Solution: Distributed ML.
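As a hedged illustration of the idea (not from the slides; the four-worker linear-regression setup is an assumption), data-parallel distributed ML shards the data across workers, computes gradients locally, and averages them on each step:

import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: y = 3x + noise. In distributed ML the data is centrally
# stored and sharded uniformly at random across workers.
x = rng.normal(size=1000)
y = 3.0 * x + 0.1 * rng.normal(size=1000)

num_workers = 4
shards = np.array_split(rng.permutation(1000), num_workers)

w, lr = 0.0, 0.1
for step in range(100):
    # Each worker computes the gradient of the squared error on its shard.
    grads = [np.mean(2 * (w * x[i] - y[i]) * x[i]) for i in shards]
    # A parameter server averages the gradients and updates the model.
    w -= lr * np.mean(grads)

print(w)  # ~3.0, but every step needed an all-to-one communication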
Drawbacks of Distributed ML.
➔ Data Collection:
◆ Privacy.
◆ Security.
◆ Integrity.
◆ Storage.
➔ Regulations: GDPR (Europe), CCPA (California), PIPEDA (Canada), LGPD (Brazil), PDPL
(Argentina), KVKK (Turkey), POPI (South Africa), FSS (Russia), CDPR (China), PDPB (India), PIPA
(Korea), APPI (Japan), PDP (Indonesia), PDPA (Singapore), APP (Australia), and other
regulations protect sensitive data from being moved. In fact, those regulations sometimes
even prevent single organizations from combining their own users’ data for artificial
intelligence training because those users live in different parts of the world, and their data is
governed by different data protection regulations.
➔ Communication overhead.
➔ Resource utilization and load balancing.
➔ …
Solution: Federated Learning
Federated Learning (FL) [1] is a distributed
machine learning approach in which large
decentralized datasets, residing on edge devices
like mobile phones and IoT devices, are used to
train a Machine Learning (ML) model.
Some important standard terms in FL:
1. Server: a computational device that orchestrates the whole FL process and is responsible for weight aggregation.
2. Client: a device that has some computational resources and local data associated with it, e.g., mobile phones, IoT devices, personal computers, etc.
3. Round: a round (or communication round) is one round trip of the model weights from the server to the clients and back to the server.
*https://blog.ml.cmu.edu/wp-content/uploads/2019/11/Screen-
Shot-2019-11-12-at-10.42.34-AM-1024x506.png
Distributed ML vs Federated Learning

Distributed ML:
1. Data is centrally stored (e.g., in a data center).
2. The main goal is just to train faster.
3. We control how data is distributed across workers: usually, it is distributed uniformly at random.

Federated Learning:
1. Data is naturally distributed and generated locally.
2. Data never leaves its place of origin.
3. Data is not independent and identically distributed (non-i.i.d.), and it is imbalanced.
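As a hedged sketch of point 3 (the label-skew scheme, five clients, and two classes per client are illustrative assumptions), non-i.i.d. federated data is often simulated by restricting each client to a few label classes:

import numpy as np

rng = np.random.default_rng(0)

# Toy labeled dataset: 1000 samples across 10 classes.
labels = rng.integers(0, 10, size=1000)

# Label-skew partition: each client holds samples from only two classes,
# so client datasets are non-i.i.d. and imbalanced, unlike the uniform
# random sharding typical of distributed ML.
clients = {}
for cid in range(5):
    classes = rng.choice(10, size=2, replace=False)
    clients[cid] = np.where(np.isin(labels, classes))[0]

for cid, idx in clients.items():
    print(cid, np.bincount(labels[idx], minlength=10))  # per-class counts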
Federated Learning Working

1. Model selection
2. Model broadcast
3. Local training
4. Model upload and averaging
5. Broadcast of the updated model
Federated Averaging (FedAvg)
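A minimal sketch of the FedAvg loop [1], following steps 2-5 above. The least-squares local objective, the client sampling fraction, and the toy data are illustrative assumptions, not a definitive implementation:

import numpy as np

rng = np.random.default_rng(0)

def local_training(weights, client_data, lr=0.1, epochs=5):
    # Stand-in for a client's local SGD: full-batch gradient steps
    # on the client's least-squares objective.
    w = weights.copy()
    x, y = client_data
    for _ in range(epochs):
        grad = 2 * np.mean((x @ w - y)[:, None] * x, axis=0)
        w -= lr * grad
    return w

# Toy federated dataset: 10 clients with imbalanced local (x, y) data.
d = 3
true_w = np.array([1.0, -2.0, 0.5])
clients = []
for _ in range(10):
    x = rng.normal(size=(int(rng.integers(20, 100)), d))
    clients.append((x, x @ true_w))

w_global = np.zeros(d)                        # 1. model selection
for rnd in range(30):                         # communication rounds
    chosen = rng.choice(len(clients), size=5, replace=False)
    updates, sizes = [], []
    for cid in chosen:                        # 2. broadcast, 3. local training
        updates.append(local_training(w_global, clients[cid]))
        sizes.append(len(clients[cid][1]))
    # 4./5. upload and average, weighted by local dataset size
    w_global = np.average(updates, axis=0, weights=sizes)

print(w_global)  # approaches true_w = [1.0, -2.0, 0.5]

Weighting the average by each client's dataset size is what distinguishes FedAvg's aggregation from a plain mean: clients with more local data pull the global model harder.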
Types of Federated Learning

1. Based on the type of participating devices

Cross-Device FL:
1. Massive number of clients (up to billions)
2. Small dataset per client
3. Limited availability and reliability
4. Some parties may be malicious

Cross-Silo FL:
1. 2-100 clients
2. Medium to large dataset per client
3. Reliable clients, almost always available
4. Parties are typically honest
Types of Federated Learning

2. Based on the type of data partitioning

a. Horizontal Federated Learning (HFL): clients share the same feature space but hold different samples (e.g., two regional banks recording the same customer attributes for mostly different customers).

b. Vertical Federated Learning (VFL): clients hold different features of the same samples (e.g., a bank and an e-commerce platform serving overlapping customers but recording different attributes).
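As a hedged illustration (the 6x4 matrix and the split points are assumptions), HFL and VFL correspond to row and column splits of the same data matrix:

import numpy as np

# Full (hypothetical) dataset: 6 samples (rows) x 4 features (columns).
data = np.arange(24).reshape(6, 4)

# HFL: same feature space, different samples -> split by rows.
hfl_client_a, hfl_client_b = data[:3, :], data[3:, :]

# VFL: same samples, different features -> split by columns.
vfl_client_a, vfl_client_b = data[:, :2], data[:, 2:]

print(hfl_client_a.shape, hfl_client_b.shape)  # (3, 4) (3, 4)
print(vfl_client_a.shape, vfl_client_b.shape)  # (6, 2) (6, 2)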
Future of Federated Learning
Research Areas in FL
➔ Communication Cost: FL consumes very high bandwidth.
➔ Statistical Heterogeneity: different clients may have different data distributions and sizes.
➔ System Heterogeneity: different clients have different system resources.
➔ Model Convergence: the ML model may fail to train properly.
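To make the communication-cost concern concrete, here is a hedged sketch of one common mitigation, top-k gradient sparsification; the technique choice and the value of k are illustrative assumptions, not something the slides prescribe:

import numpy as np

def top_k_sparsify(update, k):
    # Keep only the k largest-magnitude entries of a model update;
    # the client then transmits k (index, value) pairs instead of
    # the full dense vector, cutting upload bandwidth.
    idx = np.argpartition(np.abs(update), -k)[-k:]
    sparse = np.zeros_like(update)
    sparse[idx] = update[idx]
    return sparse

update = np.random.default_rng(0).normal(size=1_000_000)
compressed = top_k_sparsify(update, k=10_000)   # ~1% of the entries
print(np.count_nonzero(compressed))             # 10000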
References

1. H. B. McMahan, E. Moore, D. Ramage, S. Hampson, B. Agüera y Arcas, Communication-efficient learning of deep networks from decentralized data, AISTATS, 2017.
2. M. U. Yaseen, A. Anjum, O. Rana, N. Antonopoulos, Deep learning hyper-parameter optimization for video analytics in clouds, IEEE Transactions on Systems, Man, and Cybernetics: Systems 49 (1) (2019) 253–264. doi:10.1109/TSMC.2018.2840341.
3. T. Yu, H. Zhu, Hyper-parameter optimization: A review of algorithms and applications, CoRR (2020) arXiv:2003.05689. URL http://arxiv.org/abs/2003.05689
4. K. Murphy, Machine Learning: A Probabilistic Perspective, Adaptive Computation and Machine Learning series, MIT Press, 2012. URL https://books.google.co.kr/books?id=NZP6AQAAQBAJ
Thank you
