Combining callers improves the detection of copy number variants from whole-genome sequencing

Copy Number Variants (CNVs) are deletions, duplications or insertions larger than 50 base pairs. They account for a large percentage of the normal genome variation and play major roles in human pathology. While array-based approaches have long been used to detect them in clinical practice, whole-genome sequencing (WGS) bears the promise to allow concomitant exploration of CNVs and smaller variants. However, accurately calling CNVs from WGS remains a difficult computational task, for which a consensus is still lacking. In this paper, we explore practical calling options to reach the best compromise between sensitivity and sensibility. We show that callers based on different signal (paired-end reads, split reads, coverage depth) yield complementary results. We suggest approaches combining four selected callers (Manta, Delly, ERDS, CNVnator) and a regenotyping tool (SV2), and show that this is applicable in everyday practice in terms of computation time and further interpretation. We demonstrate the superiority of these approaches over array-based Comparative Genomic Hybridization (aCGH), specifically regarding the lack of resolution in breakpoint definition and the detection of potentially relevant CNVs. Finally, we confirm our results on the NA12878 benchmark genome, as well as one clinically validated sample. In conclusion, we suggest that WGS constitutes a timely and economically valid alternative to the combination of aCGH and whole-exome sequencing.


Supplementary Figure S5 Overlap of callers
Repartition of overlap configurations per caller and type of call. Generally speaking, callers using the same signal show higher overlap than others. Deletions calls by ERDS are an exception and the most frequent category is called by all four callers, then both paired-end callers together with ERDS. Delly and CNVnator show high amounts of uniquely called deletions and gains, which can be put in relation with their lower fraction of supported calls.

Supplementary Figure S6 Overlap of callers per inspection category
Repartition of overlap configurations per inspection category and type of call. With higher call certainty came more frequent overlaps. In both true positive deletions and gains, the most frequent category was calls detected by all four callers. In the false positive calls, unique callers and pairs are the most frequent. Bars are coloured when the call is covered by a unique caller.

Supplementary Figure S7 Overlap of inspected WGS calls with mappability tracks
Calls were intersected with several mappability tracks: the DAC blacklisted regions (regions of the human genome with anomalous, unstructured, high read counts), Duke excluded regions (problematic regions for short sequence tag signal detection), a set of alternative scaffolds lifted over from hg38, and the Repeat Masker track (interspersed repeats and low complexity DNA sequences). A. False positives are not more frequently in DAC blacklisted regions. B. Deletions show a small enrichment for Duke excluded regions. C. This trend is more present for alternative scaffolds, where both deletions and gains are slightly enriched. D. There is no obvious difference in the Repeat Masker content of the calls categories, which is expected since repeats are located all over the genome, including in introns of coding genes.

Supplementary Figure S8 Contingency values per caller and type of calls
For each caller and each filter described in the method section, several contingency values were calculated. The sensitivity (Se) is the ability of the filter to detect true calls (fraction of true positive calls detected). The specificity (Sp) is the fraction of false positive calls that are indeed not detected. The predictive positive value (PPV) is the fraction of true positive calls in the total number of calls and was compared to the pre-filter PPV as a ratio (PPV ratio). The accuracy is the fraction of accurate judgments, i.e. the amount of true positive and true negative calls amongst all calls. "Intrinsic" refers to filtering with intrinsic calls properties (adjusted p-value for CNVnator, paired-end and split-read support fraction for Delly). "Delamp" stands for the removal of calls intersecting opposite calls with 75% reciprocal overlap (Supplementary Figure S9). Intersect 0.5 and 0.75 correspond to the intersection of similar caller type calls with respectively 50 and 75% of reciprocal overlap. Values are compared with the values that would be obtained without filtering ("no filter").

Supplementary Figure S9
False positive calls filtered out by the "delamp" filter A. Proportion of deletions and gains that show an overlap with opposite call, in the Delly and Manta datasets. The problem is marginal for deletions but more prominent for gains, especially for Manta. B. Deletion-gain call at a threshold of 90%. The bottom track is the Repeat Masker track, showing occurrences of the MLT1A1 repeat at both ends of the call. It could then just be a mapping error with reads aligning to both occurrences of the repeated element. C. Deletiongain call at a threshold of 75%, in a region flagged by many overlapping deletions and gains. D. Deletion-gain call at a threshold of 75%, which could doubtfully be a more complex event but is not supported by any coverage depth change.

Supplementary Figure S10 Contingency values for combinations of tools
Sensibility, specificity (A), accuracy and positive predictive value (B) for single callers and combined approaches described in the Methods section. The positive and negative sets include all inspected calls from the four callers, which explains the always imperfect sensitivity. SV2 shows high specificity, but really low sensitivity for gains. The threshold of reciprocal overlap for intersections increases specificity while preserving sensitivity, hence 75% was deemed a good option.

Supplementary Figure S11 Deletions from the intersection-union approach, not confirmed by qPCR
Despite showing clear support in IGV and being detected by all four callers, those two loci could not be confirmed by qPCR. They were found recurrently in the cohort, and overlapped with gnomAD SVs with frequency of 25.7% and 35.3%, hence they could be alternative loci.

Supplementary Figure S12 ACGH probes in intersection-union calls
A. Number of aCGH probes in the intersection-union call. The calls yielded by the intersectionunion approach are not only located in probes-devoid regions, as evidenced by the numbers of probes they contain, which increases with their size. B. Distribution of the amounts of probes in inspection categories. No obvious bias can be detected between the true and false positive calls.

Supplementary Figure S13 Insertion call in patient 3430
In this patient, a gain was called by the intersection-union approach. It is detected by both coverage-based callers, and had previously been detected by aCGH, but characterized as VUS. Indeed, this patient with mirror-image polydactyly of the hands and feet showed no phenotypic overlap with a patient carrying a similar duplication. Coverage-based callers, like aCGH, were not sufficient to provide a molecular explanation to the phenotype. Having access to pairedreads, however, allowed to identify a fusion between intron 1 of SHH on chromosome 7, and intron 8 of KDM4C on the other hand, which could lead to SHH ectopic expression (Elsner, Mensah et al, in press).

Supplementary Table S1
Average number of calls per caller and patient, for each size group and CNV type, for the ten patients of the training group.
Calls below 50 base pairs do not strictly fulfill the definition of CNV and were not considered in the total of counts. 1278 deletions (585 from 1 to 5kb, 205 from 5 to 50kb, 488 from 50 to 200kb) and 748 gains (328 from 1 to 5kb, 420 from 5 to 50kb) from the four best callers (CNVnator, ERDS, Delly and Manta) were visually inspected in IGV as described in the methods section. True positive calls objectified by the presence of a coverage drop, paired-end abnormal signal or split-reads were separated in two categories: "true, 2-" when the signal was present in a maximum of two alleles, and "true, 3+" for supposed polymorphism with allele count above two in the trio. Calls were labeled as shared when the IGV profile appeared similar in the index and both parents, either heterozygous ("shared, het"), homozygous ("shared, hom"), or uncharacterized ("shared"). The true fraction (true calls among all calls) and supported fraction (true and shared calls among all calls) are reported. The total counts gathered for filters evaluation is indicated. For each caller and each filter described in the method section, several contingency values were calculated. The sensitivity (Se) is the ability of the filter to detect true calls (fraction of true positive calls detected). The specificity (Sp) is the fraction of false positive calls that are indeed not detected. The positive predictive value (PPV) is the fraction of true positive calls in the total number of calls and was compared to the pre-filter PPV as a ratio (PPV ratio). The accuracy is the fraction of accurate judgments, i.e. the amount of true positive and true negative calls amongst all calls. "Intrinsic" refers to filtering with intrinsic calls properties (adjusted p-value for CNVnator, paired-end and split-read support fraction for Delly). "Delamp" stands for the removal of calls intersecting opposite calls with 75% reciprocal overlap (Supplementary Figure S9). Intersect 0.5 and 0.75 correspond to the intersection of similar caller type calls with respectively 50 and 75% of reciprocal overlap. Values are compared with the values that would be obtained without filtering ("no filter").

Supplementary Table S5 Contingency values for pipeline options
For single callers and pipeline steps as described in the methods section, several contingency values were calculated. "Delamp" refers to the removal of calls designed as both gains and deletions by a single caller. Sensitivity (Se), specificity (Sp), positive predictive value (PPV), ratio to positive predictive value without filters (PPV ratio) and accuracy are reported, as described in Supplementary Table S4. Average number of calls per patient in the training group, for each suggested pipeline approach, per size category. The repartition of calls per patient is plotted in Figure 4B.

Supplementary Table S9 Quantification of intersection-union calls in regions targeted by aCGH
Calls issued from the intersection-union approach for the training patients were intersected with windows of respectively 1 and 10kb around the coordinates of aCGH probes. On average, more than 250 deletions and 50 gains above 5kb are detected in close proximity to those probes, hence the additional calls detected by aCGH are not limited to regions absent from the aCGH design. or split-reads were separated in "true, 2-" when the signal was present in at most two alleles, and "true, 3+" for supposed polymorphism with allele count above two in the trio. Calls were labeled as shared when the IGV profile appeared similar in the index and both parents, either heterozygous ("shared, het"), homozygous ("shared, hom"), or uncharacterized ("shared, alt").
All calls above 200kb appeared as false positives, opposite calls or doubtful calls.