Vertical Federated Learning
- Presented by Afsana Khan
Federated Learning
“Federated learning is a machine learning setting where multiple entities (clients) collaborate in solving a
machine learning problem, under the coordination of a central server or service provider. Each client’s
raw data is stored locally and not exchanged or transferred; instead focused updates intended for
immediate aggregation are used to achieve the learning objective.”
Kairouz et al., Advances and open problems in federated learning, 2019
Taxonomy of Federated Learning
Data Partitioning
Use Case for Vertical Federated Learning
Steps in VFL
● Secure Data Alignment
● Secure Model Training
● Secure Evaluation
Step 1 - Secure Data Alignment
Step 1 - Secure Data Alignment
Monica Scannapieco, et al., 2007. Privacy Preserving Schema and Data Matching. https://doi.org/10.1145/1247480.1247553
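As a toy illustration of aligning records without revealing raw identifiers, the sketch below intersects keyed hashes of record IDs. Everything here is an assumption for illustration (the shared key, the hashed_ids helper, and the IDs are hypothetical); production systems use proper private set intersection protocols, which leak less than this simplification.

```python
# Minimal sketch of privacy-preserving ID alignment via keyed hashing.
# Hypothetical setup: both parties share a secret key out of band and
# compare HMACs of their record IDs instead of the raw IDs.
import hmac, hashlib

def hashed_ids(ids, key: bytes):
    """Map each raw ID to an HMAC-SHA256 digest."""
    return {hmac.new(key, i.encode(), hashlib.sha256).hexdigest(): i for i in ids}

key = b"shared-secret-key"          # agreed upon out of band (assumption)
party_a = hashed_ids(["u1", "u2", "u3", "u5"], key)
party_b = hashed_ids(["u2", "u3", "u4", "u5"], key)

# Each party reveals only digests; the intersection gives the aligned rows.
common = party_a.keys() & party_b.keys()
aligned = sorted(party_a[h] for h in common)   # each side recovers its own IDs
print(aligned)  # ['u2', 'u3', 'u5']
```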
Secure Model Training in VFL
Yang, et al., Federated Machine Learning: Concept and Applications
•Step 1: Collaborator C creates encryption key pairs and sends the public key to A and B;
•Step 2: A and B encrypt and exchange the intermediate results for gradient and loss calculations;
•Step 3: A and B compute encrypted gradients and add an additional mask, respectively; B also computes the encrypted loss; A and B send the encrypted values to C;
•Step 4: C decrypts and sends the decrypted gradients and loss back to A and B; A and B unmask the gradients and update the model parameters accordingly.
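To make the four steps concrete, here is a minimal single-sample sketch using the open-source `phe` Paillier library (additively homomorphic encryption). The party roles follow the steps above, but the feature values, weights, and learning rate are illustrative assumptions; a real implementation would batch over samples and also carry out B's loss computation.

```python
from phe import paillier  # pip install phe
import random

# Step 1: collaborator C creates the key pair and shares the public key.
pub, priv = paillier.generate_paillier_keypair(n_length=1024)

# Party A holds features x_a with weights w_a; party B holds x_b, w_b and label y.
x_a, w_a = [1.0, 2.0], [0.1, 0.2]
x_b, w_b, y = [3.0], [0.3], 2.0

# Step 2: A and B encrypt and exchange intermediate results; the encrypted
# residual [[u_a + u_b - y]] drives both gradient and loss calculations.
u_a = sum(w * x for w, x in zip(w_a, x_a))
u_b = sum(w * x for w, x in zip(w_b, x_b))
enc_residual = pub.encrypt(u_a) + pub.encrypt(u_b - y)

# Step 3: A computes its encrypted gradient [[residual * x]] and adds a
# random mask before sending to C (B does the same on its side).
mask = [random.random() for _ in x_a]
enc_grad_a = [enc_residual * x + m for x, m in zip(x_a, mask)]

# Step 4: C decrypts and returns the masked gradient; A unmasks and updates.
lr = 0.01
grad_a = [priv.decrypt(g) - m for g, m in zip(enc_grad_a, mask)]
w_a = [w - lr * g for w, g in zip(w_a, grad_a)]
print(w_a)
```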
Vertical Federated Linear Regression
Yang, et al., Federated Machine Learning: Concept and Applications
Vertical Federated Linear Regression
Yang, et al., Federated Machine Learning: Concept and Applications
Secure Evaluation in VFL
Yang, et al., Federated Machine Learning: Concept and Applications
Is the evaluation secure enough? Can C infer the raw data of A and B?
Possible solution: Secure Multiparty Computation (SMC), illustrated below.
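As a minimal illustration of the SMC idea (a toy example, not taken from the cited work), the sketch below uses additive secret sharing: each private value is split into random shares that individually reveal nothing, yet the parties can still reconstruct an aggregate.

```python
# Toy additive secret sharing: shares look random in isolation, but
# sums of shares reconstruct sums of secrets.
import random

P = 2**61 - 1  # arithmetic is done modulo a public prime

def share(secret, n=3):
    """Split `secret` into n additive shares modulo P."""
    shares = [random.randrange(P) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

# A and B secret-share their private inputs among three compute parties.
a_shares, b_shares = share(42), share(100)

# Each compute party adds the shares it holds, never seeing 42 or 100.
sum_shares = [(sa + sb) % P for sa, sb in zip(a_shares, b_shares)]
print(sum(sum_shares) % P)  # 142: the sum is revealed, the inputs are not
```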
Do we really need a coordinator?
(Yang et al., Parallel Distributed Logistic Regression for Vertical Federated Learning without Third-Party Coordinator)
Existing Vertical Federated Learning Algorithms
•Linear regression
(Gascón, et al., Privacy-preserving distributed linear regression on high-dimensional data. Proceedings on Privacy Enhancing Technologies, 2017(4):345-364, 2017)
•Association rule mining
(Vaidya, Clifton, Privacy preserving association rule mining in vertically partitioned data. In Proceedings of the eighth ACM
SIGKDD international conference on Knowledge discovery and data mining, pages 639-644. ACM, 2002.)
•K-means clustering
(Vaidya, Clifton. Privacy-preserving k-means clustering over vertically partitioned data. In Proceedings of the ninth ACM SIGKDD
international conference on Knowledge discovery and data mining, pages 206-215, 2003.)
•Logistic regression
(Hardy et al., Private federated learning on vertically partitioned data via entity resolution and additively homomorphic
encryption, arXiv:1711.10677, 2017.)
•Random forest
(Liu, et al., Federated forest. arXiv:1905.10053, 2019.)
•XGBoost
(Cheng, et al., SecureBoost: A lossless federated learning framework. arXiv:1901.08755, 2019.)
Vertical Federated Algorithms
Vertical Federated Algorithms
(Liu, et al., A Communication-Efficient Collaborative Learning Framework for Distributed Features, arXiv:1912.11187)
Structured Literature Review on VFL
Structured Literature Review on VFL (Results)
A Khan, et al., Vertical Federated Learning: A Structured Literature Review
Categorization of Vertical Federated Learning Literature
A Khan, et al., Vertical Federated Learning: A Structured Literature Review
Improving Communication Overhead in VFL
A Khan, et al., Vertical Federated Learning: A Structured Literature Review
Improving Communication Overhead in VFL
A Khan, et al., Communication-Efficient Vertical Federated Learning
Improving Communication Overhead in VFL
A Khan, et al., Communication-Efficient Vertical Federated Learning
Improving Communication Overhead in VFL
A Khan, et al., Communication-Efficient Vertical Federated Learning
Feature Extraction Methods
● Principal Component Analysis (sketched below)
● Undercomplete Autoencoder
Datasets
Evaluation Metrics
● Accuracy
● F1-Score
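A minimal sketch of the PCA option: one party compresses its local feature block before transmission, trading a little information for fewer transmitted values. The shapes, component count, and random data are assumptions for illustration, not the paper's actual setup.

```python
# Sketch: a party compresses its local feature block with PCA before
# sending it to the party that trains on the joint representation.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
local_features = rng.normal(size=(7000, 20))   # one party's 20 raw features

pca = PCA(n_components=5)                      # keep 5 principal components
compressed = pca.fit_transform(local_features) # shape (7000, 5): 4x fewer values
print(compressed.shape, f"variance kept: {pca.explained_variance_ratio_.sum():.2f}")
```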
Improving Communication Overhead in VFL
A Khan, et al., Communication-Efficient Vertical Federated Learning
Feature Extraction Methods
● Principal Component Analysis
● Undercomplete Autoencoder
Datasets
Evaluation Metrics
● Accuracy
● F1-Score
Improving Communication Overhead in VFL
A Khan, et al., Communication-Efficient Vertical Federated Learning
Undercomplete Autoencoder
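A minimal PyTorch sketch of an undercomplete autoencoder: the narrow bottleneck forces a compressed code that a party can transmit instead of its raw features. Layer sizes, dimensions, and training length are illustrative assumptions, not the cited paper's configuration.

```python
# Undercomplete autoencoder: reconstruct the input through a bottleneck,
# then send only the bottleneck code instead of the raw features.
import torch
import torch.nn as nn

class UndercompleteAE(nn.Module):
    def __init__(self, in_dim=20, bottleneck=5):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 12), nn.ReLU(),
                                     nn.Linear(12, bottleneck))
        self.decoder = nn.Sequential(nn.Linear(bottleneck, 12), nn.ReLU(),
                                     nn.Linear(12, in_dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = UndercompleteAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(7000, 20)                       # a party's local features

for _ in range(50):                             # train to reconstruct the input
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), x)
    loss.backward()
    opt.step()

with torch.no_grad():
    embedding = model.encoder(x)                # 5-dim code sent instead of 20 features
print(embedding.shape)
```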
Improving Communication Overhead in VFL
A Khan, et al., Communication-Efficient Vertical Federated Learning
Improving Communication Overhead in VFL
A Khan, et al., Communication-Efficient Vertical Federated Learning
Improving Communication Overhead in VFL
A Khan, et al., Communication-Efficient Vertical Federated Learning
Business Aspect of VFL
Motivation?
Incentive/Reward Allocation to Parties in VFL
● What is the contribution of the parties?
● What do they bring to the table?
● How to reward parties with incentive fairly?
● How to explain the allocated incentives to the parties?
Existing Approaches in FL for Incentive Allocation
Incentive Allocation in FL:
● Game Theory (e.g., Shapley Value, Stackelberg Game)
● Auction Theory
● Contract Theory
Only Shapley values have been explored so far for VFL settings! (A toy computation follows below.)
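To ground the Shapley-value approach, here is a toy exact computation of each party's contribution. The characteristic function v(S) below is filled with made-up utility scores; in a real VFL pipeline, v(S) would be the model performance obtained when only the parties in coalition S participate.

```python
# Exact Shapley values over all coalitions of three parties.
from itertools import combinations
from math import factorial

parties = ["A", "B", "C"]
v = {frozenset(): 0.0, frozenset("A"): 0.4, frozenset("B"): 0.3,
     frozenset("C"): 0.1, frozenset("AB"): 0.8, frozenset("AC"): 0.55,
     frozenset("BC"): 0.45, frozenset("ABC"): 0.99}  # illustrative scores

n = len(parties)

def shapley(i):
    """Average marginal contribution of party i over all coalition orders."""
    total = 0.0
    others = [p for p in parties if p != i]
    for r in range(n):
        for S in combinations(others, r):
            S = frozenset(S)
            weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            total += weight * (v[S | {i}] - v[S])
    return total

for p in parties:
    print(p, round(shapley(p), 3))   # the three values sum to v(ABC)
```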
Designing a Pipeline for Fair Incentive Allocation in VFL
Client Selection → Contribution Measurement → Incentive Allocation → Explanation
Open Challenges in VFL
● Communication Overhead
● Asynchronism
● Data Scarcity
● Data Redundancy
● Defense Mechanisms for Backdoor Attacks
● High Dimensions
● Fairness: Model Fairness, Collaborative Fairness
● Explainability
DEMO
Vanilla VFL
EXPERIMENT WITH SYNTHETIC DATASET
Y = 2.0*x1 + 5.0*x2 + 3.0*x3 + 4.0*x4 + 1.0*x5 + 6.0*x6
Independent Variables (Features): x1, x2, x3, x4, x5, x6
Dependent Variable (Target): Y
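A sketch of how the synthetic dataset could be generated. The slides do not state the feature distribution, noise level, or random seed, so this sketch assumes standard-normal features and a noiseless target.

```python
# 10,000 samples with the linear target above, split 7000/3000.
import numpy as np

rng = np.random.default_rng(42)                 # seed is an arbitrary choice
X = rng.normal(size=(10_000, 6))                # features x1..x6
true_w = np.array([2.0, 5.0, 3.0, 4.0, 1.0, 6.0])
Y = X @ true_w                                  # Y = 2*x1 + 5*x2 + ... + 6*x6

X_train, X_test = X[:7000], X[7000:]
Y_train, Y_test = Y[:7000], Y[7000:]
```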
Centralized Linear Regression
Linear Regression Model
Features: X = {x1, …, x6}
Number of training samples: 7000
Number of testing samples: 3000
Learning rate: 0.01
Epochs: 50
R2_Score: 0.99
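The centralized baseline can be reproduced roughly as below, continuing from the data sketch above with the stated learning rate and epoch count. The slides do not specify the optimizer, so plain per-sample SGD on the squared error is an assumption.

```python
# Centralized baseline: one model sees all six features.
w = np.zeros(6)
lr, epochs = 0.01, 50

for _ in range(epochs):
    for x_i, y_i in zip(X_train, Y_train):
        residual = x_i @ w - y_i      # scalar prediction error
        w -= lr * residual * x_i      # SGD step on the squared error

pred = X_test @ w
r2 = 1 - np.sum((Y_test - pred) ** 2) / np.sum((Y_test - Y_test.mean()) ** 2)
print(w.round(3), round(r2, 2))       # weights near [2, 5, 3, 4, 1, 6]
```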
Vertical Partitioning of the Dataset
Full dataset: Features X = {x1, …, x6}, Target: Y, 7000 training samples, 3000 testing samples
Client1: X = (x1, x2), 7000 training samples, 2 features
Client2: X = (x3, x4), 7000 training samples, 2 features
Client3: X = (x5, x6), 7000 training samples, 2 features
Conventional Machine Learning
Target: Y
Client1: Linear Regression Model, X = (x1, x2), 7000 training samples, 3000 testing samples, 2 features
Client2: Linear Regression Model, X = (x3, x4), 7000 training samples, 3000 testing samples, 2 features
Client3: Linear Regression Model, X = (x5, x6), 7000 training samples, 3000 testing samples, 2 features
R2_Score: 0.3054
Vertical Federated Linear Regression
Guest Party (client with labels):
● Complete a forward propagation using local data
● Receive forward outputs or intermediate results from the Host Party
● Calculate the loss from the loss function
● Send the loss to the Host Party
● Compute gradients
● Update the local model
Host Party:
● Complete a forward propagation using local data
● Send intermediate results to the Guest Party
● Receive the loss computed by the Guest Party
● Compute gradients
● Update the local model
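Below is a plaintext sketch of this exchange for the three-client demo (two features per client), with encryption omitted for clarity; a secure variant would protect the exchanged partial outputs as in the earlier Paillier sketch. The learning rate and number of rounds are illustrative, not the demo's exact settings.

```python
# One guest (label holder) and two hosts jointly fit a linear model:
# only partial outputs and the residual signal cross party boundaries.
import numpy as np

rng = np.random.default_rng(0)
n = 7000
blocks = [rng.normal(size=(n, 2)) for _ in range(3)]   # (x1,x2) | (x3,x4) | (x5,x6)
true_w = [np.array([2.0, 5.0]), np.array([3.0, 4.0]), np.array([1.0, 6.0])]
y = sum(Xb @ wb for Xb, wb in zip(blocks, true_w))     # labels live at the guest

weights = [np.zeros(2) for _ in range(3)]              # each client's local model
lr, rounds = 0.05, 500

for _ in range(rounds):
    # forward pass: every client sends its partial output to the guest
    partials = [Xb @ wb for Xb, wb in zip(blocks, weights)]
    residual = sum(partials) - y                       # guest computes the loss signal
    for k in range(3):                                 # each client updates locally
        weights[k] -= lr * blocks[k].T @ residual / n

print([w.round(2) for w in weights])  # approaches [2, 5], [3, 4], [1, 6]
```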
Comparison of Weights After Convergence
                                      w1     w2     w3     w4     w5     w6
Actual weights                        2.0    5.0    3.0    4.0    1.0    6.0
After convergence (Centralized)       2.01   4.91   3.006  3.996  1.03   5.897
After convergence (Vertical FL)       1.95   4.87   2.90   3.88   1.06   5.91
Evaluation of Model in VFL
Client1: Guest. Logistic Regression Model, contains labels Y. X = (x1, x2), 7000 training samples, 3000 testing samples, 2 features
Client2: Host. Logistic Regression Model, does not contain labels. X = (x3, x4, x5), 7000 training samples, 3000 testing samples, 3 features
Client3: Host. Logistic Regression Model, does not contain labels. X = (x6), 7000 training samples, 3000 testing samples, 1 feature
Final prediction: Client1 Output + Client2 Output + Client3 Output
R2_Score: 0.99
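The final prediction is just the sum of the partial outputs, as sketched below; for the logistic model, a sigmoid over the summed logits gives the class probability. The numbers are made up for illustration.

```python
# Evaluation step: each client scores its own feature block, only the
# partial outputs travel, and the guest combines them.
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Illustrative partial outputs for one test sample (w_k @ x_k per client).
client_outputs = [0.8, -0.3, 1.1]     # guest, host2, host3

logit = sum(client_outputs)           # Client1 Output + Client2 Output + Client3 Output
prob = sigmoid(logit)                 # final class probability at the guest
print(round(prob, 3))
```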
Frameworks
Thank You!!
