Standard machine learning paradigms such as supervised and unsupervised learning assume that the data are available as a single tabular dataset. In practical applications the data may be spread across various distributed databases. For computational or confidentiality reasons it can be undesirable to copy this federated data to a central server and database. The computational reasons relate to the cost of keeping the central database up to date when communication is expensive and the federated data volatile. The most important reason, however, is the confidentiality of the data, for example due to company confidentiality or privacy/personal data protection (GDPR). Under such constraints, the goal is to learn a global machine learning model without moving the data to a central database. Additionally, there may be explicit confidentiality requirements with respect to information leakage between the participating database parties. The learning typically proceeds in the following steps: 1) a client initiates the learning process with an initial global model; 2) each database party updates that model with its own data (a few iterations); 3) the client updates the global model by aggregating the local updates; 4) the whole process repeats until convergence or for a fixed number of iterations. Federated variants of various algorithms have been proposed that achieve performance similar to their centralised counterparts. A related concept is distributed learning, where the goal is usually to gain efficiency by employing several distributed computers, or a cluster; in that case a central database is deliberately distributed to enable parallel processing.
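The four-step loop above can be sketched as a small simulation of federated averaging. This is a minimal illustration, not any particular library's API: the data generation, the linear model, and all function names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(weights, X, y, lr=0.1, steps=5):
    """Step 2: a few local gradient-descent iterations on one party's data."""
    w = weights.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of the mean squared error
        w -= lr * grad
    return w

# Simulated federated data: three parties, each holding its own samples
# from the same underlying linear model (illustrative only).
true_w = np.array([2.0, -1.0])
parties = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + 0.01 * rng.normal(size=50)
    parties.append((X, y))

# Step 1: the client initiates the process with an initial global model.
global_w = np.zeros(2)

# Step 4: repeat for a fixed number of rounds (or until convergence).
for _ in range(20):
    # Step 2: every party updates the model with its own data; only the
    # model parameters leave the party, never the raw data.
    local_ws = [local_update(global_w, X, y) for X, y in parties]
    # Step 3: the client updates the global model from the local updates,
    # here by plain averaging.
    global_w = np.mean(local_ws, axis=0)

print(global_w)  # approaches the underlying model [2, -1]
```

Note that only model parameters are exchanged in this sketch; protecting against indirect information leakage through those parameters requires additional techniques (e.g. secure aggregation or differential privacy), which are out of scope here.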


  • TNO offers several innovations related to federated learning. Projects related to federated learning are shown below.