Table 1 Glossary of terms encountered in the article alongside conceptual examples

From: Secure, privacy-preserving and federated machine learning in medical imaging

Attack vectors

Attacks against the dataset

| Method | Description | Example |
| --- | --- | --- |
| Re-identification attack | Determining an individual’s identity despite anonymization, based on other information present in the dataset (see the linkage sketch after this table). | Exploiting similarities to other datasets in which the same individual is contained (linkage). |
| Dataset reconstruction attack | Deriving an individual’s characteristics from the results of computations performed on a dataset, without having access to the dataset itself (synonyms: feature re-derivation, attribute inference). | Using multiple aggregate statistics to derive data points corresponding to a single individual. |
| Tracing attack | Determining whether or not an individual is present in a dataset, without necessarily determining their exact identity (synonym: membership inference; see the set-differencing sketch after this table). | Exploiting repeated, slightly varying dataset queries to ‘distil’ individual information (set differencing). |
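The re-identification (linkage) row can be made concrete with a small sketch. The datasets, field names and quasi-identifiers below are all fabricated for illustration; the point is only that a join on attributes shared with an identified dataset re-attaches names to an ‘anonymized’ release.

```python
# Toy linkage attack: re-identifying "anonymized" records by joining on
# quasi-identifiers shared with a second, identified dataset.
# All records and field names here are fabricated for illustration.
import pandas as pd

# "Anonymized" medical release: names removed, quasi-identifiers kept.
medical = pd.DataFrame({
    "zip": ["10115", "10117", "10119"],
    "birth_year": [1962, 1978, 1962],
    "sex": ["F", "M", "F"],
    "diagnosis": ["diabetes", "asthma", "hypertension"],
})

# Public, identified dataset (e.g. a voter roll) sharing those attributes.
voters = pd.DataFrame({
    "name": ["A. Schmidt", "B. Meyer"],
    "zip": ["10115", "10117"],
    "birth_year": [1962, 1978],
    "sex": ["F", "M"],
})

# The join re-attaches identities to the "anonymized" diagnoses.
linked = medical.merge(voters, on=["zip", "birth_year", "sex"])
print(linked[["name", "diagnosis"]])
```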
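The tracing and dataset-reconstruction rows exploit the same weakness of aggregate-only access, illustrated here by set differencing: two aggregate queries whose cohorts differ by exactly one person leak that person’s record. The data and query interface are fabricated.

```python
# Toy set-differencing attack: two aggregate queries whose underlying
# cohorts differ by exactly one person leak that person's value.
values = {"alice": 4, "bob": 7, "carol": 5, "dave": 9}  # hidden raw data

def query_sum(names):
    """Aggregate-only interface: returns a sum, never raw records."""
    return sum(values[n] for n in names)

everyone = query_sum(["alice", "bob", "carol", "dave"])
all_but_dave = query_sum(["alice", "bob", "carol"])

# The difference of the two aggregates is exactly dave's record.
print("dave's value:", everyone - all_but_dave)  # -> 9
```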
Attacks against the algorithm

| Method | Description | Example |
| --- | --- | --- |
| Adversarial attack | Manipulation of the input to an algorithm with the goal of altering its output, most often in a way that makes the manipulation of the input data impossible for humans to detect (see the evasion sketch after this table). | Compromising the computation result by introducing malicious training examples (model poisoning). |
| Model-inversion/reconstruction attack | Derivation of information about the training dataset stored within the algorithm’s weights by observing the algorithm’s behaviour (see the inversion sketch after this table). | Using generative algorithms to recreate parts of the training data based on algorithm parameters. |
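As a concrete, simplified instance of an adversarial attack, the sketch below mounts a test-time evasion attack in the spirit of the fast gradient sign method against a toy logistic regression; note that the table’s example, model poisoning, is a training-time variant instead. Model, data and step size are fabricated.

```python
# Minimal sketch of a test-time adversarial (evasion) attack in the spirit
# of the fast gradient sign method, applied to a toy logistic regression.
import numpy as np

rng = np.random.default_rng(0)
w, b = rng.normal(size=20), 0.1            # a "trained" linear classifier
x = rng.normal(size=20)                    # a clean input

def predict(x):
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))   # P(class 1)

# Gradient of the logistic loss for true label y=1 w.r.t. the input is
# (p - 1) * w; stepping along its sign pushes the prediction toward 0.
p = predict(x)
grad_x = (p - 1.0) * w
x_adv = x + 0.25 * np.sign(grad_x)         # small, hard-to-spot perturbation

print(f"clean prediction:       {predict(x):.3f}")
print(f"adversarial prediction: {predict(x_adv):.3f}")
```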
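For model inversion, a minimal white-box sketch: gradient ascent on the input recovers an input the trained model considers prototypical of a class, leaking structure of the (here fabricated) training data from the weights alone. This is a simplified stand-in for the generative approaches named in the table.

```python
# Minimal model-inversion-style attack: with white-box access to a trained
# classifier, gradient ascent on the *input* recovers a class-prototypical
# input, leaking training-set structure from the weights alone.
import numpy as np

rng = np.random.default_rng(1)

# Fabricated "training data": class 1 clusters around a secret template.
secret_template = rng.normal(size=16)
X = np.vstack([rng.normal(size=(100, 16)),                     # class 0
               secret_template + 0.1 * rng.normal(size=(100, 16))])  # class 1
y = np.array([0] * 100 + [1] * 100)

# Train a tiny logistic regression by gradient descent.
w, b = np.zeros(16), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    w -= 0.1 * (X.T @ (p - y)) / len(y)
    b -= 0.1 * np.mean(p - y)

# Inversion: start from a blank input, ascend the class-1 score w.r.t. it.
x = np.zeros(16)
for _ in range(200):
    p = 1 / (1 + np.exp(-(w @ x + b)))
    x += 0.1 * (1 - p) * w          # gradient of log P(class 1) w.r.t. x
x /= np.linalg.norm(x)

# The recovered input points toward the secret class template.
cos = x @ secret_template / np.linalg.norm(secret_template)
print(f"cosine similarity with secret template: {cos:.2f}")
```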
Secure and private AI terminology

| Method | Description | Example |
| --- | --- | --- |
| Secure-by-default implementation (synonym: private by design) | Systems designed from the ground up with privacy in mind, which at best require no specialized data handling. | |
| Anonymization | Removal of personally identifiable information from a dataset. | Removing information related to age, gender and so on. |
| Pseudonymization | Replacement of personally identifiable information in a dataset with a dummy/synthetic entry, with the linkage record (look-up table) stored separately (see the sketch after this table). | Replacing names with randomly generated text. |
| Secure AI | Techniques concerned with protecting the AI algorithms themselves. | Algorithm encryption. |
| Privacy-preserving AI | Techniques for protecting the input and output data. | Data encryption, decentralized storage. |
| Federated machine learning | Machine learning systems that distribute the algorithm to where the data is instead of gathering the data where the algorithm is (decentralized/distributed computation; see the federated-averaging sketch after this table). | Training of algorithms on hospital computer systems instead of on cloud servers. |
| Differential privacy | Modification or perturbation of a dataset to obfuscate individual data points while retaining the ability to interact with the data within a certain scope (privacy budget) and to perform statistical analysis. Can also be applied to algorithms (see the Laplace-mechanism sketch after this table). | Random shuffling of data to remove the association between individuals and their data entries. |
| Homomorphic encryption | Cryptographic technique that preserves the ability to perform mathematical operations on encrypted data as if it were unencrypted (plain text) (see the toy Paillier sketch after this table). | Performing neural network computations on encrypted data without first decrypting it. |
| Secure (multi-party) computation | Collection of techniques and protocols enabling two or more parties to split data among themselves and perform joint computations in a way that prevents any single party from gaining knowledge of the data while preserving the computational result (see the secret-sharing sketch after this table). | Determining which patients two hospitals have in common without revealing their respective patient lists (private set intersection). |
| Hardware security implementation | Collection of techniques whereby specialized computer hardware provides guarantees of privacy or security. | Secure storage or processing enclaves in mobile phones or computers. |
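A minimal pseudonymization sketch, with fabricated records: names are swapped for random tokens while the linkage record is kept in a separate structure that would, in practice, live in access-controlled storage.

```python
# Minimal pseudonymization sketch: replace names with random tokens and keep
# the linkage record (look-up table) in separate, access-controlled storage.
import secrets

records = [
    {"name": "Alice Example", "finding": "normal"},
    {"name": "Bob Example", "finding": "nodule"},
]

lookup = {}  # pseudonym -> real identity; must be stored separately
for record in records:
    pseudonym = "PAT-" + secrets.token_hex(4)
    lookup[pseudonym] = record.pop("name")
    record["pseudonym"] = pseudonym

print(records)   # shareable, pseudonymized data
print(lookup)    # the separately guarded linkage table
```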
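Federated machine learning can be sketched with plain federated averaging: each simulated ‘hospital’ trains on its private data and ships only model parameters to a coordinator, which averages them weighted by local dataset size. Everything below (data, model, learning rate, round count) is fabricated for illustration.

```python
# Minimal federated-averaging sketch: raw data never leaves the clients;
# only model parameters travel to the server for weighted averaging.
import numpy as np

rng = np.random.default_rng(0)

def local_update(w, X, y, lr=0.1, steps=50):
    """One client's local linear-regression training; data stays on-site."""
    w = w.copy()
    for _ in range(steps):
        w -= lr * 2 * X.T @ (X @ w - y) / len(y)   # gradient of MSE
    return w

# Three "hospitals" with private datasets drawn around a shared ground truth.
true_w = np.array([1.5, -2.0, 0.5])
clients = []
for n in (40, 80, 120):
    X = rng.normal(size=(n, 3))
    y = X @ true_w + 0.1 * rng.normal(size=n)
    clients.append((X, y))

w_global = np.zeros(3)
for _ in range(10):
    local_models = [local_update(w_global, X, y) for X, y in clients]
    sizes = np.array([len(y) for _, y in clients])
    # Federated averaging: size-weighted mean of the client models.
    w_global = np.average(local_models, axis=0, weights=sizes)

print("recovered weights:", np.round(w_global, 2))  # close to true_w
```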
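For differential privacy, the sketch below uses the standard Laplace mechanism on a counting query rather than the shuffling example given in the table: a count has sensitivity 1 (one person changes it by at most 1), so Laplace noise with scale 1/ε yields an ε-differentially-private answer, and repeated queries spend the privacy budget additively. The cohort and ε values are fabricated.

```python
# Minimal differential-privacy sketch: the Laplace mechanism on a count.
import numpy as np

rng = np.random.default_rng(0)
has_condition = rng.random(1000) < 0.3   # fabricated cohort of 1,000 patients

def dp_count(mask, epsilon):
    """epsilon-differentially-private count via the Laplace mechanism."""
    sensitivity = 1.0                     # adding/removing one person
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return mask.sum() + noise

print("true count:        ", int(has_condition.sum()))
print("DP count (eps=1.0):", round(dp_count(has_condition, 1.0), 1))
print("DP count (eps=0.1):", round(dp_count(has_condition, 0.1), 1))
```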
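Homomorphic encryption can be illustrated with textbook additively homomorphic Paillier, here with deliberately tiny and insecure parameters; real systems use vetted libraries and keys thousands of bits long. The sketch shows a sum computed on ciphertexts alone.

```python
# Toy sketch of additively homomorphic encryption (textbook Paillier with
# tiny, insecure parameters): sums can be computed on ciphertexts alone.
from math import gcd
import random

# Tiny demo primes -- far too small for any real security.
p, q = 293, 433
n = p * q
n2 = n * n
g = n + 1
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p-1, q-1)
mu = pow(lam, -1, n)                           # valid because g = n + 1

def encrypt(m):
    r = random.randrange(1, n)
    while gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    L = (pow(c, lam, n2) - 1) // n
    return (L * mu) % n

a, b = encrypt(20), encrypt(22)
# Multiplying ciphertexts adds the underlying plaintexts.
print(decrypt((a * b) % n2))   # -> 42
```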
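Finally, a minimal secure multi-party computation sketch using additive secret sharing, a simpler primitive than the private set intersection named in the table: three parties jointly compute a sum while no single party ever sees another’s input.

```python
# Minimal secure-multi-party-computation sketch: additive secret sharing.
# Three hospitals jointly compute a sum of patient counts; each value is
# split into random shares, so no single party sees another's input.
import random

P = 2**61 - 1   # all arithmetic is modulo a public prime

def share(value, n_parties):
    """Split a value into n additive shares that sum to it modulo P."""
    shares = [random.randrange(P) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

secret_inputs = [120, 45, 310]          # each hospital's private count
all_shares = [share(v, 3) for v in secret_inputs]

# Party i receives the i-th share of every input and sums locally;
# individual shares are uniformly random and reveal nothing on their own.
partial_sums = [sum(s[i] for s in all_shares) % P for i in range(3)]

# Only the recombined total is revealed: 120 + 45 + 310 = 475.
print(sum(partial_sums) % P)
```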