SELECTED - Enabling Privacy-Preserving Analyses on Healthcare Data

Description

Enabling Privacy-Preserving Analyses on Healthcare Data

Problem Context

Major banks face significant challenges in detecting money laundering. The sums being laundered are potentially huge and hard for a single bank to detect, because criminals often route transactions through different banks, nationally and internationally, and through cryptocurrency exchanges. Privacy regulations and data confidentiality requirements prevent banks from readily sharing data, which makes collaboration difficult. One approach would be to run Anti-Money Laundering (AML) analyses on a central data silo holding pseudonymized transaction data from multiple banks. However, such an approach raises privacy concerns and legal obstacles. It is also unlikely that many foreign or international banks would join such a scheme in the short term. A decentralized solution is therefore preferable.

Solution

The goal of this joint research project is to develop pilots and proof-of-concept solutions for collaborative monitoring and detection of money laundering using Privacy Enhancing Technologies, such as Multi-Party Computation (MPC), synthetic data and Federated Learning. Furthermore, we investigate how to transition this AML capability into daily operations. MPC consists of cryptographic techniques that allow organizations to jointly analyze their data without sharing or revealing this data to anyone. This program can be seen as a RegTech approach and contributes to a solution for the (legal) data confidentiality issues faced in data sharing initiatives. The consortium is open to other (international) financial institutions that want to join in collaboratively fighting money laundering without sharing sensitive data.
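To make the idea of jointly analyzing data without revealing it more concrete, the following is a minimal sketch of additive secret sharing, one common building block of MPC. It is not the protocol used in this project; the modulus, the function names and the example inputs are assumptions chosen for illustration, and all parties are simulated in a single process.

```python
import secrets

# Assumption: a public modulus agreed on by all parties in advance.
MODULUS = 2**61 - 1


def share(value, n_parties, modulus=MODULUS):
    """Split value into n_parties random shares that sum to value mod modulus."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]


def reconstruct(shares, modulus=MODULUS):
    """Recombine shares; any proper subset of the shares reveals nothing."""
    return sum(shares) % modulus


def secure_sum(private_inputs, modulus=MODULUS):
    """Simulate parties computing the sum of their inputs via shares only."""
    n = len(private_inputs)
    # distributed[i][j] is the share that party i sends to party j.
    distributed = [share(x, n, modulus) for x in private_inputs]
    # Each party j locally adds up the shares it received.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % modulus for j in range(n)
    ]
    # Only the partial sums are published; together they yield the total.
    return reconstruct(partial_sums, modulus)


if __name__ == "__main__":
    # Hypothetical private inputs held by three organizations.
    print(secure_sum([120, 45, 300]))
```

In this sketch each party only ever sees random-looking shares of the other parties' inputs, yet the published partial sums combine to the exact total. Real MPC frameworks add network communication, support for multiplication and protection against misbehaving parties.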

Results

TNO has shown the success of MPC technology across a variety of methods for fraud detection. In 2017, TNO showed that MPC can feasibly be used to establish secure, multiparty PageRank. Not only can the secure, multiparty PageRank detect transactional fraud in a realistic setting (with millions of nodes per party), it also safeguards the privacy of bank customers. In collaboration with ABN AMRO and Rabobank, TNO furthermore built a first proof-of-concept of secure risk propagation using synthetic data that includes money laundering patterns. The proof-of-concept shows that it is possible to securely perform the risk propagation algorithm among different banks and demonstrates the value of collaborative transaction analysis.
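For readers unfamiliar with PageRank, the sketch below shows the plain, single-party algorithm as a power iteration over a directed graph. It is only an illustration of the underlying computation: the secure multiparty variant mentioned above would run such a computation under MPC across the parties' graphs, and the details of that protocol are not reproduced here. The edge list, damping factor and iteration count are assumptions.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Plaintext PageRank by power iteration on a list of (src, dst) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Start from a uniform distribution over all nodes.
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Every node receives the same teleportation share.
        new_rank = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for src in nodes:
            # A node without outgoing edges spreads its rank over all nodes.
            targets = out_links[src] or nodes
            portion = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += portion
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical transaction graph: an edge means "src paid dst".
    example_edges = [("a", "c"), ("b", "c"), ("c", "a")]
    for node, score in pagerank(example_edges).items():
        print(node, round(score, 3))
```

Nodes that receive flows from many or from highly ranked nodes end up with a higher score, which is what makes this kind of graph measure useful as a feature in transaction analysis.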

The next step for the latter proof-of-concept is to apply the secure risk propagation on real transaction data. Some challenges will also have to be addressed. First, the computational scalability of the solution will need to be evaluated and compared with the theoretical hypothesis. Second, MPC protects the confidentiality of the input data and any intermediate data during the computation. However, for application on real transaction data, it is important to investigate how much information can be deduced from the resulting risk scores after running the protocol.

Affiliations

This project is part of TNO’s Appl.AI programme and it is partly financed from the kickstart fund that the NL AIC received from the government for research and development of AI applications. SELECTED stands for ‘Secure Learning for oncology on vertically partitioned data’. It is being developed by TNO. This is done in collaboration with the parallel project LANCELOT, partly funded by Holland High Tech, in collaboration with the Netherlands Comprehensive Cancer Organisation (IKNL) and Janssen.

Resources

SELECTED on NLAIC website

Open source code of Multi-Party Computation solution for Kaplan-Meier

Open source code of Multi-Party Computation solution for logistic regression (and other machine learning protocols)

Short paper on the use of Multi-Party Computation for Kaplan-Meier and its application in the medical domain.

Scientific paper on training the Cox Proportional Hazard Model on distributed data sets.

Associated project LANCELOT

Contact

Daniël Worm, Sr consultant, TNO, e-mail: daniel.worm@tno.nl