1. Federated Learning
Benjamin P. Geisler, M.D., M.P.H., F.A.C.P., M.R.C.P. (London), F.H.M.
Massachusetts General Hospital/Harvard Medical School, Boston, MA
and Ludwig-Maximilian University of Munich, Germany
3. Outline
• Why Is Federated Learning Important?
• What Is Federated Learning?
• What Are Alternatives to Federated Learning?
• What Is Centralized/Decentralized/Distributed Learning?
• What Are Potential Pitfalls of Federated Learning?
• What Could Be Interesting Use Cases for Federated Learning?
• Non-Medical
• Medical
• Conclusions, Q&A
31. Interesting FL Use Cases – Medical
1. Connecting hospitals
2. Nordic health registries
3. Connecting registries to hospitals
32. My Conclusions
1. Promises are better data privacy & availability to researchers
2. Therefore: FL = key technology for medical research?
3. Challenges to FL are great, but that also makes it a great research topic!
Editor's Notes
Well, thank you for that kind introduction, Christoffer.
Alright, these are my financial conflicts of interest. And I’ll tell you a little bit about my perspective as someone in between medicine, writing and publishing, and consulting for industry in a minute.
Here’s the outline of what I wanna go through with you today. This should be understandable without having a Ph.D. in computer science. I think that we should be able to do this in less than 30 minutes so that we have time for a discussion afterwards. First, I’d like to point out the relevance. Then, we’re gonna spend the meat of the time on what federated learning IS. To do that, it might be good to also explain the alternatives to federated learning, so centralized learning, peer-to-peer decentralized learning and so forth. Then, I’m gonna list some problems and challenges with federated learning. And then I will mention some applications FL could be used for, both non-medical and medical.
So, why is it important? Well, we’re here in a hospital and you all work with health data. And even though there might be greater trust in society in Norway, there are bad actors around the world.
There have been cases where folks have tried to outright penetrate supposedly safe network infrastructure in hospitals, and there is also something called reidentification attacks. That’s when you have de-identified data and, possibly with additional data, it is somehow possible to identify who these patients are. And FL here is the superhero who is supposed to protect us, except we don’t know if that is really going to be the case. Do you know what a hackathon or a datathon is? That’s when you get together over a weekend and try to solve a problem and create a prototype in just a weekend. With a datathon, you do just that, only with data, and you create models by forming groups of clinicians AND data scientists. We did a datathon in Bergen in September with data from a local hospital. So anyways, we thought about doing a hackathon with white-hat hackers (or maybe even black-hats if we use synthetic data) to see if there is any kind of way to overcome the data security that FL promises.
FL is already a big market, but it’s supposed to grow massively over the next few years. Much of that growth, apart maybe from pharma companies, will happen outside of medicine. As usual, we might be a few years behind here.
So let’s talk about what FL IS. Does anyone here have a good, meaning very simple and short, definition? I think the best way to describe it to a non-technical person is that instead of sending the data around, you send the model around. The model is then trained where the data is; the data never leaves the premises. And you can compare the models across sites later and then iterate on them.
I thought it’d be better if we talk about other ways to train models and THEN get to FL.
This is the main way we currently use our data. So you have your data sources, let’s say three hospitals. And they all have their individual data – either a single dataset or an electronic data warehouse, some kind of database – and in order to utilize them all, you send in the data.
And then you run the model with those data in a central location. You can either run a shallow learning model, say some kind of regression …
or you can run a deep learning model, say a neural network with many layers.
And then you describe your model and your results in some kind of abstract fashion and present them. So going forward, if I use a solid line, that means it’s an actual data flow – and if it’s a dashed line, that means it’s only model parameters.
Of course you can attempt to make this more secure and put it into a cloud. That most likely means that the data cannot be physically removed. Every system is hackable though, and all you can do is minimize the possibility of a hack.
There’s also this idea of a secure cloud in the EU. So you’d log on to a system, and data from all over Europe is there – uploaded from various health centers etc.
So you don’t have a central infrastructure where the data gets sent and computed. So you either need the analytical infrastructure in each center,
Or you do peer-to-peer infrastructures and you exchange model parameters – for example, you train the model in one center, and then you validate it in other centers (a tiny sketch of that pattern follows).
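To make that concrete, here is a minimal sketch of the "train at one center, validate at the others" pattern. It is purely illustrative and not from the talk: the site names and the data-loading helper are hypothetical, and only the fitted model object would ever leave a site.

```python
# Hypothetical sketch: one center trains, the others only validate.
# Only the fitted model travels; the patient-level data stays put.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def load_local_data(site, seed):
    """Stand-in for each site reading its own records; random data
    is used here only so the sketch actually runs."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(500, 10))
    y = (X[:, 0] + rng.normal(size=500) > 0).astype(int)
    return X, y

# Training happens at a single center ...
X_train, y_train = load_local_data("hospital_A", seed=1)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# ... and only the model is shipped onward for external validation.
for i, site in enumerate(["hospital_B", "hospital_C"], start=2):
    X_val, y_val = load_local_data(site, seed=i)
    auc = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])
    print(f"{site}: external validation AUC = {auc:.2f}")
```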
The only central thing here might be that you present the data somehow.
That just means that you have a central hub that sends out data and models and lets the spokes do some of the computing. Maybe less relevant for health purposes.
Federated learning, as we said earlier, means that all the data stays in the hospitals, so you don’t share private health information with a central resource or with other centers. All you send around is model parameters. There are two main sorts. One is where you have a central server that aggregates models and basically co-ordinates…
… or you have a complete peer-to-peer infrastructure.
So on the left here you have the central model with an aggregation server. You first distribute a global model. Then you train with the local data. Then you send back the updated model and aggregate it. And you repeat that until you have a good model. In the peer-to-peer model, you initially synchronize a model, train, then exchange, and then also do the aggregation in a decentralized fashion.
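A toy sketch of that loop with a central aggregation server, assuming a simple linear model and made-up local datasets (none of this is from the slides), could look like this:

```python
# Hypothetical federated-averaging round: the server broadcasts the
# global weights, each hospital trains on its own data, and the server
# averages the returned weights, weighted by local sample size.
import numpy as np

rng = np.random.default_rng(0)

# Three "hospitals", each holding private (X, y) that never leaves the site.
sites = [(rng.normal(size=(n, 5)), rng.normal(size=n)) for n in (200, 350, 120)]

def local_update(weights, X, y, lr=0.01, epochs=5):
    """Local training step (here: least squares via gradient descent).
    Only the updated weights and the sample count are returned."""
    w = weights.copy()
    for _ in range(epochs):
        w -= lr * X.T @ (X @ w - y) / len(y)
    return w, len(y)

global_w = np.zeros(5)
for _ in range(10):                      # federated rounds
    updates = [local_update(global_w, X, y) for X, y in sites]
    total = sum(n for _, n in updates)
    global_w = sum(w * n for w, n in updates) / total
print("aggregated model weights:", global_w)
```

Weighting by sample size, rather than taking a plain mean of the site models, is what keeps large and small hospitals contributing in proportion to their data.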
A middle ground, here on the left, is if you have sort of a hierarchical way to aggregate, for example on a regional or a country level. And then there are multiple computation plans, so sequential, with an aggregation server, and peer-to-peer, but maybe I’ll skip this.
So in the beginning, there were more shallow learning models.
For example, in 2012 there was a paper out of San Diego on a federated learning model that was doing logistic regression.
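Purely as an illustration (not necessarily the algorithm used in that paper), here is a hypothetical sketch of federated logistic regression in which each site only ever shares its local gradient with a coordinator:

```python
# Hypothetical federated logistic regression via shared gradients:
# each hospital computes the gradient of the log-likelihood on its own
# data; only those few numbers cross institutional boundaries.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_gradient(beta, X, y):
    """Gradient of the negative log-likelihood on one site's data."""
    return X.T @ (sigmoid(X @ beta) - y)

rng = np.random.default_rng(1)
true_beta = np.array([1.0, -0.5, 0.2, 0.0])
site_data = []
for n in (300, 400, 250):                      # three hospitals, synthetic data
    X = rng.normal(size=(n, 4))
    y = (rng.uniform(size=n) < sigmoid(X @ true_beta)).astype(float)
    site_data.append((X, y))

beta = np.zeros(4)
n_total = sum(len(y) for _, y in site_data)
for _ in range(500):                           # gradient-descent iterations
    grad = sum(local_gradient(beta, X, y) for X, y in site_data) / n_total
    beta -= 0.5 * grad
print("federated coefficient estimates:", beta)
```

Because the summed site gradients equal the gradient on the pooled data, this converges to the same coefficients a pooled analysis would give, without any patient-level data ever leaving a hospital.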