Figure 1: Matching accuracy on simulated data under various settings.

From: Probabilistic record linkage of de-identified research datasets with discrepancies using diagnosis codes

(a) Impact of the discordance between the two datasets. (b) Impact of rare codes. (c) Impact of the proportion of overlapping patients between the two datasets. The figure shows matching performance across the simulation scenarios in terms of True Positive Rate (TPR) and Positive Predictive Value (PPV). F-S refers to the Fellegi-Sunter method, while ludic denotes our proposed Bayesian approach.
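
For readers unfamiliar with the two metrics, the following is a minimal sketch of how TPR and PPV could be computed for a set of proposed matches. It uses the standard definitions (TPR = true matches found / all true matches; PPV = true matches found / all proposed matches). The function name `matching_accuracy`, the pair representation, and the toy identifiers are assumptions made for illustration; they are not taken from the paper or from the ludic software.

```python
def matching_accuracy(predicted_pairs, true_pairs):
    """Return (TPR, PPV) for a set of proposed record matches.

    Each pair is a hypothetical (id_in_dataset_A, id_in_dataset_B) tuple.
    """
    predicted = set(predicted_pairs)
    truth = set(true_pairs)
    true_positives = len(predicted & truth)

    # TPR: share of the true matches that were recovered.
    tpr = true_positives / len(truth) if truth else float("nan")
    # PPV: share of the proposed matches that are correct.
    ppv = true_positives / len(predicted) if predicted else float("nan")
    return tpr, ppv


# Toy example (made-up identifiers): 4 true matches, 3 proposed, 2 correct.
true_pairs = [("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")]
predicted_pairs = [("a1", "b1"), ("a2", "b2"), ("a3", "b9")]

tpr, ppv = matching_accuracy(predicted_pairs, true_pairs)
print(f"TPR = {tpr:.2f}, PPV = {ppv:.2f}")
```

In this toy case two of the four true matches are recovered and two of the three proposed matches are correct, which illustrates why the two metrics can diverge: a conservative method may propose few matches (high PPV) while missing many true ones (low TPR).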