Evaluation of whole genome amplification and bioinformatic methods for the characterization of Leishmania genomes at a single cell level

Here, we report a pilot study paving the way for further single cell genomics studies in Leishmania. First, the performances of two commercially available kits for Whole Genome Amplification (WGA), PicoPLEX and RepliG were compared on small amounts of Leishmania donovani DNA, testing their ability to preserve specific genetic variations, including aneuploidy levels and SNPs. We show here that the choice of WGA method should be determined by the planned downstream genetic analysis, PicoPLEX and RepliG performing better for aneuploidy and SNP calling, respectively. This comparison allowed us to evaluate and optimize corresponding bio-informatic methods. As PicoPLEX was shown to be the preferred method for studying single cell aneuploidy, this method was applied in a second step, on single cells of L. braziliensis, which were sorted by fluorescence activated cell sorting (FACS). Even sequencing depth was achieved in 28 single cells, allowing accurate somy estimation. A dominant karyotype with three aneuploid chromosomes was observed in 25 cells, while two different minor karyotypes were observed in the other cells. Our method thus allowed the detection of aneuploidy mosaicism, and provides a solid basis which can be further refined to concur with higher-throughput single cell genomic methods.

Leishmania are unicellular protozoan parasites belonging to the family Trypanosomatidae 1 and causing a spectrum of diseases in tropical and sub-tropical regions, with an incidence estimated at 1.6 million cases per year 2 . The parasite has a dimorphic life cycle: extracellular flagellated promastigotes in the sand fly vector and intracellular amastigotes in macrophages of the vertebrate host. Leishmania have unique genetic and molecular properties that distinguish them from other unicellular and multicellular eukaryotes. Among others, there is no transcription regulation at initiation through promoters. Instead, genes are organized in long arrays of polycistronic units that are transcribed together, then trans-spliced and polyadenylated, and their expression is regulated post-transcriptionally 3 . In this context, gene dosage represents a straightforward strategy for the parasite to modify the expression of genes of interest 4 . This can occur through different mechanisms, like expansion/ contraction of tandem arrays, episomal amplifications and aneuploidy 5 .
We previously sequenced the genomes of 204 cultivated clinical isolates of L. donovani and found aneuploidy in all of them, often affecting half of the 36 chromosomes 6 . Experimental evolution studies suggest that aneuploidy changes constitute an adaptive mechanism to environmental changes like those occurring during the life cycle 7 or those associated with drug pressure 8 . In contrast to many organisms where it can be deleterious, aneuploidy thus appears to be crucial in Leishmania. Analyses of aneuploidy at individual promastigote cell level

Results
Genome coverage and base accuracy. In a first part of the study, we used a series of samples with decreasing numbers of cells (from 1,000 to 10) of L. donovani BPK275 to compare the performance of RepliG and PicoPLEX WGA kits and optimize the downstream bioinformatic protocols. Illumina sequencing of the RepliG and PicoPLEX samples returned 222 million and 111 million 100 bp paired-end reads respectively. The average depth for RepliG was significantly higher than that of PicoPLEX. Specifically, for the five RepliG samples, the total number of reads was 222 million and 96.2% of the reads were mapped to the L. donovani BPK282 reference, resulting in average depth of 17.1 ± 14.7. For the five PicoPLEX samples, the total number of reads was 111 million and 41.1% of the reads were mapped, resulting in average depth 3.4 ± 2.4. (Supplementary data 1, Table S1). As somy calculation for RepliG_10 sample was not successful due to uneven sequencing depth resulting in chromosomes with almost no coverage, this sample was omitted in downstream processing.
In order to make an unbiased comparison on the genome coverage provided by both methods, all sequencing data sets were first subsampled to the number of reads from the sample with the lowest sequencing yield (i.e. the PicoPLEX sample starting from 10 cells, containing almost 16 million reads). When comparing the fraction of the genome covered by at least one read between RepliG and PicoPLEX, the former one was only slightly higher: the average ratio of those genome covering fraction for 1,000, 100, 50 and 25 cells was 1.07 ± 0.05 (standard error of the ratios, Supplementary Fig. S1). When considering the fraction of the genome covered by at least ten reads, this ratio RepliG /PicoPLEX increased to 1.53 ± 0.41 for these same samples. Overall a better genome coverage was thus achieved with RepliG than with PicoPLEX, as visualized in the Manhattan depth plots (Supplementary data 2). This higher genome coverage is due to the fact that on average a higher percentage of reads is mapping back to the reference genome for RepliG (96.2%) compared to PicoPLEX (41.1%). Main reason for this large difference in mapping percentage is the presence of a PicoPLEX related adapter sequence (between 16 and 41% of the reads), which could not be efficiently trimmed off. As those reads will not map to the reference, or will be filtered out due to a low mapping quality score, they will not have an impact on the further results in this work. However, when calculating the normalized standard deviation on the read depth -an indicator for evenness of coverage -PicoPLEX shows a lower variation (Supplementary data 1, Table S1): when comparing RepliG with PicoPLEX, the average ratio of those normalized standard deviations for 1,000, 100, 50 and 25 cells was 1.79 ± 0.16 (standard error). This trend can be confirmed by the lower Read Count Variation of PicoPLEX (see Methods). With the exception of the 1,000 cells sample where both methods show a similar value, the Read Count Variation is dramatically higher for RepliG compared to PicoPLEX, indicating a more even read coverage using the latter one.
The third sequence feature comparing both WGA methods was the base accuracy, which was measured by the allele frequency difference between sequence derived from a given number of cells and the control bulk DNA sample (the BPK275 control). Despite the fact that this strain is different from the BPK282 strain used to construct the reference genome sequence, the high genetic similarity between both strains (only 43 SNPs) ascertains that SNPs will not affect the base accuracy interpretation. Furthermore, each dataset was subsampled to prevent the influence of sequencing depth on the SNP prediction performance. Overall, the base accuracy was higher in RepliG samples than in PicoPLEX ones (Fig. 1). The base error rates were 16.1, 9.5, 5.5, 5.1 and 4.4 fold higher in PicoPLEX than RepliG at cell numbers of 1,000, 100, 50 and 25, respectively. GC bias and read depth. Both WGA methods here used were reported to be highly affected by GC content in terms of read depth distribution [15][16][17] . The GC content across the different chromosomes is expected to be uniform, but we found a strong negative correlation between the GC content and length of chromosomes in L. donovani BPK282 (r 2 = 0.586, p value: 5.47e−8, Supplementary Fig. S2 A-B) and L. braziliensis M2904 (r 2 = 0.566, p value 1.89e−7, Supplementary Fig. S2 C-D). This correlation was absent in the genomes of Plasmodium, Cryptosporidium, Trypanosoma cruzi, Giardia species deposited in EupathDB (https ://eupat hdb.org/eupat hdb/). Lowess (locally weighted scatterplot smoothing) curves were calculated for the different samples to assess the effect of GC bias on read depth. Normalized depth only moderately depends on the GC content in the BPK275 control, resulting in a straight line with minimal slope (ANOVA on regression slope p value < 1e−15, Tukey's post-hoc between BPK275 control and the closest regression line (PicoPLEX_1000) ) value < 1e−06). However,

Somy determination.
Boxplots were used to visualize somy values and their variation in the different samples, with and without GC correction, as described in the Methods section. The somy values of the BPK275 control had a negligible depth variability ( Supplementary Fig. 3A) and the corresponding somy values corrected for GC bias were nearly identical to the values without GC correction for this control ( Supplementary Fig. S4 A). For the RepliG samples, the boxplots suggest a high variability on the somy values ( Supplementary Fig. 3 B). GC correction brings the somy values closer to those of the BPK275 control ( Supplementary Fig. 4B). However, somy values were still imprecise, as characterized by the average somy deviation (ASD, ranging from 0.420 to 0.637, average = 0.57) and the somy difference count (SDC, ranging from 13 to 21, average = 17.80) as shown in Table 1.
In contrast, the PicoPLEX samples showed smaller variation in somy values compared to RepliG (Supplementary Fig. 3C). After GC bias correction for these samples, somy values became closer to those of the control (Supplementary Fig. 4C), which is reflected in an ASD value ranging between 0.299 and 0.418 (average = 0.28), and a SDC value ranging from 2 to 11, average = 6.00 as shown in Table 1. The graphical summary of the ASD for each sample with and without GC bias correction is given in Fig. 3      www.nature.com/scientificreports/ quate than RepliG to accurately predict aneuploidy in single cells. To test this in real-life single cell applications, we applied this method to a line derived from another strain, L. braziliensis PER094, which is characterized by a much higher degree of heterozygosity with 57,402 heterozygous SNPs than the previously used L. donovani BPK275 with only 43 heterozygous SNPs. This increased number of heterozygous SNPs is essential for allele frequency analyses (see below). Sorting of individual cells of the line PER094 GFP was made by FACS using a GFP fluorescent L. braziliensis strain, as described in the Methods section. Two batches of cells were generated, respectively called PER094a (25 cells) and b (22 cells), each of them amplified and sequenced independently. For the batch of 25 PER094a cells, the total number of 151 bp paired-end reads was 183.2 million, of which 13.3% could be mapped to the L. braziliensis M2904 reference genome, resulting in average depth 2.46 ± 4.62. Similarly, for the set of 22 PER094b PicoPLEX samples, 188.9 million 151 bp paired-end reads were obtained of which 25.0% of the reads could be mapped to the reference genome, resulting in average depth of 9.4 ± 10.5. All the mapping statistics of total number of reads, mapped reads and the read depth statistics for both sets of PER094 samples were given in Supplementary data 1 (Table S2). General depth trends for PER094a and b samples could be seen in Manhattan depth plots (Supplementary data 3 and 4). For some samples, mapping percentages are very low, which can be explained by a combination of three factors: 1) presence of a PicoPLEX specific adapter which hampers proper alignment to the reference genome (fraction of reads with adapter varying between 33.80% and 56.87%), 2) Low Phred quality score of the sequencing reads, where around 40% of the samples shows a significant drop in Phred quality score toward the 3′ end of the read. 3) contamination of the FACS instrument with human DNA, which causes between 2.42% and 48.63% (average of 17.59%) of the sequencing reads to be of human origin.
Similar to the BPK275 control, the Lowess approach was used to assess the effect of GC bias on the read depth. It only showed a very limited effect for undiluted and unamplified DNA of PER094 (further called PER094 control) (Fig. 4A). Remarkably, the PER094 control Lowess fit showed an opposite trend compared to the BPK275 control, the former showing a positive trend with increasing GC content. However, for L. braziliensis single cells the GC content had a clear impact on the read depth, represented by the negative slopes as shown in Fig. 4B,C for two single cells. Where the normalized depth histogram shows a normal distribution for the even depth case, the depth histogram of the uneven samples shows a peak at a normalized depth of zero, combined with a long tail distribution towards higher normalized depths, as can be seen in the dot plot and the right side depth histogram (Fig. 4). This visual inspection lead to an ad-hoc cut-off of normalized standard deviation of read depth of 0.31, to distinguish between even and uneven depth samples. The Lowess curves of all the single cell samples were shown in Supplementary Fig. 5A,B. Curves from samples with even depth are clearly distinguishable from the ones with uneven depth. Moreover, the trend of those even depth curves is comparable to the trend observed in the L. donovani BPK275 PicoPLEX samples.
In a next stage, boxplots were used to visualize the technical variability in the sequencing depth and somy estimates for all samples ( Supplementary Figs. 6 and 7). Variability in somy values for the PER094 control, was very limited. For the single cells, variability was higher but this was more pronounced for samples with uneven depth while somy values could be accurately determined for samples with even depth: a single dominant karyotype (called kar1) was observed in 25 single cells and it was characterized by the same trisomy of chromosomes 11 and 25 and tetrasomy of chromosome 31, while all other chromosomes were disomic. This dominant karyotype matched perfectly with the 'average' karyotype observed in the PER094 control. In a few other single cells with even depth, divergent karyotypes were encountered: (i) kar2, with a disomic (instead of trisomic) chromosome 25 in PER094b-sc2 and -sc9 and (ii) kar3, with monosomic chromosome 1 (instead of disomic) in PER094a-sc2 (Supplementary Figs. 6, 7 and 8).
In a last step, we estimated the impact of GC bias correction on somy estimation. Among the even depth samples of PER094a and PER094b batches, GC correction could significantly lower the ASD value from 0.22 to 0.17 for PER094a cells (p value 6.08E−05) and from 0.27 to 0.16 for PER094b cells (p value 9.01E−05). The same trend was observed for the SDC value (2.14 and 4.4 to 1.0 and 1.6 for PER094a and PER094b cells respectively). (Table 2, Fig. 5A and B). In contrast, values observed in uneven depth samples were much higher, and the effect of GC correction was negligible. The overall impact of GC correction on these samples was illustrated in Supplementary data 5 and 6.
Analysis of somy variation with allele frequency distribution. Somy of a given chromosome can also be predicted by the allele frequency distribution, if heterozygosity is sufficiently high 18 , which is the case for PER094. For chromosome 1, we compared the allele frequency patterns of PER094a_sc2 and PER094b_sc15, respectively shown to be monosomic and disomic by methods based on read depth (see above). We observed a typical normally distributed allele frequency of a disomic chromosome shown in PER094 control. In PER094a_ sc2, however, we observed a skewed allele distribution for chromosome 1 with clear allele frequency shift toward 0 or 1 ( Supplementary Fig. 9A): this was not observed for other chromosomes of that cell, hence this decrease of heterozygosity fits with monosomy of chromosome 1 in PER094a_sc2. In contrast, the chromosome 1 of PER094b_sc15 shows a disomic allele frequency pattern, leveled due to allele frequency dropout, common to single cell samples ( Supplementary Fig. 9B). In the case of chromosome 25, the distribution of allele frequency is clearly bi-modal in PER094 control, which is a typical alternative allele frequency pattern of a trisomic chromosome in the bulk sequence ( Supplementary Fig. 9C). Similarly, a pattern with two overlapping peaks was detected in chromosome 25 of PER094b_sc15, which is compatible with trisomy ( Supplementary Fig. 9D). In contrast, the chromosome 25 of PER094b_sc9 ( Supplementary Fig. 9C) showed a flat distribution similar to disomic chromosome 1 of PER094b_sc15 ( Supplementary Fig. 9B).
The atypical allele frequency distribution shown above could be due to allele dropout. To verify this, we specifically measured the average pairwise base differences among a subset of PER094b PicoPLEX samples www.nature.com/scientificreports/ (PER094b_sc20, sc15, sc2, sc13, sc9, sc5, sc16). When sites with zero depth were excluded, the average pairwise base difference at 26,959 heterozygous SNP sites was 0.303 ± 0.020 and the corresponding pairwise difference between two bulk PER094 controls was 0.020. When missing depth was treated as a base mismatch, the average pairwise base difference at 57,402 heterozygous SNP sites was 0.349 ± 0.012 and the corresponding pairwise difference between two bulk PER094 controls was 0.019. This shows that allele dropout was quite high in PicoPLEX single cell samples.

Discussion
In the present paper, we report the first study -to our knowledge -of the genomes of single Leishmania cells.
Using L. donovani as model, we showed in a first step that the choice of WGA method should be determined by the planned downstream genetic analysis and that PicoPLEX and RepliG are more suitable for aneuploidy and SNP calling, respectively, as expected from previous work 11 . Given our interest in aneuploidy 6,7,19 and mosaicism 20 , we evaluated and developed bioinformatic methods to assess the somy for each chromosome when only limited sequencing data is available, thereby taking into account and correcting for the GC-bias. In a second step, we used the PicoPLEX approach to sequence the genome of FACS-sorted single cells and determined their aneuploidy by the computational pipeline optimized in the first step. For this experiment, we used L. braziliensis as model, as its higher heterozygosity should permit to study allele frequency distribution, which is an complementary inferential method for somy estimation often applied in bulk sequencing 18 . We successfully called somy of all 35 chromosomes in 28 out of the 47 single cells, detected aneuploidy mosaicism by read-depth based methods and subsequently validated monosomy and trisomy of some chromosomes based on their signatory somy specific allele frequency distribution. Bioinformatic pipelines were optimized to handle high technical read depth variation observed throughout the genome because of the required amplification process of WGA. We have developed and validated on low DNA amounts depth normalization methods. We tested a various range of window sizes and percentiles to quantify sequencing depth, greatly enhancing the sensitivity and specificity for somy detection. As indicated within the results, there is large variation on the sequencing coverage and evenness of sequencing depth between different samples or cells, but window of 5 kb seems to be the optimal value when considering all samples discussed in the manuscript. It also appeared critical to select the lower and upper depth bin cut offs for somy estimation; accordingly, these parameters must be optimized for each data set separately. In general we found that the combination of higher read depth and even read distribution always led to improved somy estimation based on depth and  Table 2. Estimation of somy deviation in PER094a and b single cell samples, without and with GC bias correction. ASD and SDC between a single cell sample and BPK094 control and corresponding standard deviation (see Table 1  www.nature.com/scientificreports/ read allele frequency. Samples with even but low depth could still lead to accurate somy estimation. However, samples with higher intra-chromosomal read depth variation, regardless of its read depth size, always led to very poor somy accuracy. Therefore, based on those observations we advocate that quality filtering should primarily be focused on selecting those single cell sequencing data that have an even read depth, and only as a second and less important criterion look at the sequencing depth. Previous Leishmania studies based on bulk sequencing have shown some correlation between read coverage depth and chromosome length 7,18,21 . In a previous study where we used SureSelect, a genome capture method, we observed the tendency that the shorter chromosomes had lower depth which resulted in more non-integer somy values compared to longer chromosomes in the sequencing results 19 . This bias was corrected using sequencing depth normalization using chromosomes with similar lengths. The observation of non-integer somy values in smaller chromosomes could be explained by a higher mosaicism among this size-class of chromosomes 18 . However, present study provides an alternative answer to these observations, as we found in reference genomes of both L. donovani and L. braziliensis a negative correlation between GC content and chromosome length. Interestingly, this appeared to be a Leishmania-specific feature as this GC-bias was not found in other tested genomes. The detection of this GC-bias has an impact for single-cell genomics of Leishmania, because WGA methods are reported to be particularly sensitive to GC content [15][16][17] . In our analysis, the slopes of the Lowess curves, used to represent the link between normalized depth and GC content were bigger in L. donovani RepliG samples than in PicoPLEX ones (Fig. 2). Accordingly, GC correction appears to be important for single cell genomics of Leishmania, but also for other species where the GC content is varying over the chromosomes like e.g. Leptomonas pyrrhocoris (data not shown).
In this study we sorted cells using FACS and we amplified and sequenced separately single genomes. Major disadvantages, previously reported to be associated with a FACS-based approach, are the high risk of contamination with foreign DNA during collection process and the high cost of WGA of each sample. First, the size of the microbial genome is much smaller than mammalian genomes and even small fragments of this DNA will efficiently compete with the microbial DNA during WGA. This was clearly illustrated for bacteria 22 and could represent an issue for Leishmania genome which is 100 fold smaller than human genome. To reduce the contamination risk, we cleaned and sterilized the FACS instrument, but treatment with a DNA removing agent was impossible. Second, the high cost of WGA in FACS-based approaches is related to the volume of sample collection (5 µl here) and reaction (75 µl here) and this may also have an effect on the efficiency of the amplification of minute amounts of DNA in the sample 22 . In our experiments, we tested the amplified Leishmania genomes for presence of human DNA by qPCR of only one human gene, RPL30 (see Supplementary methods). The presence of this gene was detected only in few samples, and these were excluded from further studies. However, different degree of contamination with human DNA was still detected by WGS, but this was not found to be a critical factor for the somy accuracy. Indeed, a large proportion of unmapped reads did not necessarily lead to the lower somy calling accuracy. Reciprocally, a large proportion of mapped reads did not necessarily lead to higher somy calling accuracy, the latter mainly because this type of samples often showed an uneven genome coverage. Using automated, microfluidics-or droplet-based single cell sorting and sequencing library preparation platforms would greatly reduce this risk and increase reproducibility of single cell whole genome analysis. Among the additional advantages of these methods, we mention: 1) the volume of the reaction (a few nanoliters), decreasing risk of contamination and cost and increasing reaction's efficacy; 2) the use of chips, which decreases the number of operations and the risk of contamination; 3) pre-staining or constructing parasites harboring a fluorescent marker is not needed, thus most types of unprocessed cells can be analyzed without lengthy preparation; 4) the high-throughput character of analyses and 5) the possibility -with some platforms-to check the number of cells present in each sample and sequence only the ones containing only one cell, hereby also reducing costs and ensuring that only individual cells are analyzed.
Despite the low number of L. braziliensis cells analysed here (28 with even depth), mosaicism could be detected, with 3 different karyotypes, all aneuploid: one dominant karyotype (kar1) in 25 cells and two others (kar2 and kar3) each one encountered in 2 and 1 cells respectively. Interestingly, kar2 and kar3 only differed slightly from kar1, by the somy of 1 chromosome (chr 1 and 25) respectively. Chromosome 31, shown to be tetrasomic in bulk sequences of all Leishmania species studied so far 6,18,21 was tetrasomic too in all 28 cells analyzed here. Probably because of allele drop-out, rather high in PicoPLEX samples, allele frequency distribution curves of individual chromosomes (expected to be monomodal, bi-modal and tri-modal for disomic, trisomic and tetrasomic chromosomes) were atypical in single cells. However, we could detect the allele frequency signatures of monosomic chromosome 1 and trisomic chromosome 25 and thus validate the read-depth based somy calling of these chromosomes. As such, our single cell sequencing data confirm the hypothesis of mosaic aneuploidy which was derived based on FISH data.
This study paves the way for further single cell genomics studies in Leishmania. The FACS-based approach described here is of interest for in-depth analysis of genomes in a small number of cells (for instance a plate of 96 cells), while different WGA methods should be used depending on the planned downstream genetic analysis (SNP, indel, CNV or aneuploidy). High-throughput analyses of single cells are needed to investigate the extent and dynamics of aneuploidy mosaicism in Leishmania, in both stable and variable experimental conditions. Therefore, microfluidics-and droplet-based platforms represent a promising alternative and several options exist (see recent review 23 ). www.nature.com/scientificreports/ land Biolabs), 2 µl of parasites were put in PCR tubes and flash-frozen in liquid nitrogen. For RepliG (Qiagen), 2 µl of parasites + 2 µl of PBS were put in PCR tubes and flash frozen in liquid nitrogen. PicoPLEX and RepliG samples were then further processed according to manufacturer's instructions. The average size of the DNA fragments after amplification was as expected, and differed between RepliG and PicoPLEX, being 2-100 kb and 0.1-1 kb, respectively. DNA of each sample was purified and concentrated with Genomic DNA Clean and Concentrator-25 (Zymo Research)  Amplification-free Illumina libraries were prepared 25 , for PER094 samples 150 bp paired-end reads were generated on the Illumina HiSeq X, whilst for BPK275 samples 10% PhiX DNA was added to the library pool to increase complexity then 100 bp paired-end reads were generated on the Illumina HiSeq 2,500 following the manufacturer's standard sequencing protocols 26 . DNA read mapping, SNP calling. Paired-end reads from the RepliG and PicoPLEX samples of L. donovani and from BPK275 control were mapped to the improved reference L. donovani genome LdBPKv2 7 (available at ftp://ftp.sange r.ac.uk/pub/proje ct/patho gens/Leish mania /donov ani/LdBPK PAC20 16bet a/) using Smalt v7.4 (https ://www.sange r.ac.uk/scien ce/tools /smalt -0). Similarly for L. braziliensis, paired-end reads from 22 PER094b and 25 PER094a FACS-sorted single cells and the PER094 control were mapped to the L. braziliensis M2904 reference genome, which was recently improved based on PacBio SMRT sequencing. The parameters used in Smalt were described in the previous studies 7,19 , which -apart from the default settings -implies activating exhaustive search for optimal alignments (-x), random mapping of multiple hit reads (-r 3), and requiring at least 80% of the nucleotides in the read being a perfect match (-y 0.80). Picard v1.85 (https ://broad insti tute.githu b.io/picar d/) was used for marking duplicated reads. For calculation of the genome coverage for the different samples, samtools was used to subsample all data to the number of reads obtained for the sample with the lowest yield. To assess the evenness of genome coverage, the variation on the genome coverage within 5 kb windows was calculated using the normalized standard deviation (also called coefficient of variation), i.e. the standard deviation of the sequencing depth within 5 kb windows divided by the average sequencing depth calculated over all 5 kb windows. The read count variation was calculated as specified using 5 kb windows 27 .

Materials and methods
For SNP calling for the evaluation of the accuracy of sequences derived from amplified and bulk DNA, we used the population SNP calling mode of UnifiedGenotyper in Genome Analysis Toolkit v3.4, with the SNP cut off (GATK QUAL score) (i) 1,500 for BPK275 samples and (ii) 2000 for PER094 samples (GATK: https ://softw are.broad insti tute.org/gatk/) 28 . The SNP cut off for L. braziliensis was higher since more samples were used to call SNPs. The main goal of the SNP analysis in this study was to evaluate base accuracy and the base recovery rate, i.e. how many bases would be covered by at least one read. Therefore, no lower nor higher depth cut off was imposed in the population SNP calling. In L. donovani chr22, the base positions between position 736,224 and www.nature.com/scientificreports/ the end of that chromosome were excluded from the SNP calling as this region might be an erroneous region in the genome reference sequence, since many SNPs are also detected in other BPK275 related whole genome sequencing data (data not shown) .

GC content and GC bias correction.
With WGA methods like RepliG and PicoPLEX, normalized depth is reported to be affected by GC content bias [15][16][17] . Previous Leishmania studies based on bulk cell populations have suggested a correlation between read coverage depth and chromosome length 7,21 . However, the mechanism of depth bias was not fully understood. Neighboring chromosome normalization 7 was used to assess this effect. However, this will not be accurate for samples in which the variability of chromosome copy number is irregular and skewed. First the GC content was measured in 5 kb windows across each chromosome. To evaluate the impact of GC content bias on the current data sets, we selected the long disomic chromosomes 28, 29, 30, 32 and 34 of L. donovani and L. braziliensis. These disomic chromosomes were used to avoid the depth difference due to aneuploidy. For each sample, we fitted a Lowess (Locally Weighted Scatterplot Smoothing) curve in terms of the GC content and normalized depth, using a Python package statsmodels v0.9.0 (https ://githu b.com/stats model s/ stats model s). Depths greater than the 95 percentile were marked as outliers and removed.
In the next step this Lowess curve is used to correct the somy value for each chromosome separately. Based on the average GC content of a chromosome, its somy value is divided by the value of the Lowess curve for that specific GC percentage using a lookup hash table approach. In general, the Lowess curves properly represented the relationship between the GC content and depth within a GC content range between 55 and 65% for most of the high-quality samples. As the average GC content of each chromosome was between 56.2% and 61.5% for both L. braziliensis and L. donovani genomes, accurate GC correction could be guaranteed. Finally, somy values were renormalized using a median somy value after the initial GC correction to have the median somy value over all chromosomes close to 2. The method used here was analogous to the GC bias correction methods described in the previous studies 29,30 . Somy values with and without the GC correction were visualized with Gnuplot 31 .
Somy estimation. WGA methods are known to produce highly variable depth coverage 15,17 . To mitigate skewed, uneven read distribution 7 , the chromosome median depth was calculated as a trimmed median of mean depths for 5000 bp bins, where bins with a depth less than the 10th percentile or greater than 90th percentile were removed. Somy values are visualized using boxplots as implemented using Matplotlib 32 . To assess depth across all the chromosomes, Manhattan plots across all the chromosomes were created based on median depth of 5000 bp windows. The upper limit of a Manhattan plot was set to be twice the value of a 98 percentile to focus on the informative range.
Somy range. The range of monosomy, disomy, trisomy, tetrasomy, and pentasomy was defined to be the full cell-normalized chromosome depth or S-value: S < 1.5, 1.5 ≤ S < 2.5, 2.5 ≤ S < 3.5, 3.5 ≤ S < 4.5, and 4.5 ≤ S < 5.5, respectively 7 . However, single cell sequencing depth can be highly variable among cells unlike in bulk sequencing. To overcome this technical depth variability among individual cells and to characterize somy variability clearly, we defined the Average Somy Deviation (ASD): this is defined as the average difference between the calculated somy value and the true (integer) somy value, with the latter one calculated based on the bulk control sample. Similarly, the somy difference count (SDC) is defined as the number of chromosomes where the absolute difference between the predicted somy value and the true somy value is greater than 0.5 .
When defining a new karyotype (i.e. a cell having for at least one chromosome a somy value deviating from the somy value based on the bulk genome sequencing), a more stringent criterion is applied. First, the median somy value per chromosome is calculated over all cells with an even depth. Next, for each chromosome for each cell the absolute difference with this median value is calculated. If this absolute difference is higher than one, the cell is defined as having an aberrant karyotype.
Visualization of pairwise allele frequency difference. Alternative base allele frequencies (allele frequencies) of variable sites were extracted from the GATK SNP vcf files. To quantify base recovery rate, a gap was considered to be homozygous mismatch of allele difference of one, instead of discarding missing sites or imputing the bases. For L. braziliensis single cells, we were particularly interested in identifying allele frequency shifts due amplification artifacts. Therefore we first identified high quality heterozygous sites in the PER094b and PER094a bulk samples, and only assessed allele frequency differences. We did not take this approach for the L. donovani samples since there are not enough heterozygous SNPs in BPK275. An allele frequency distribution of two samples was visualized with an allele frequency dot plot where the x-and y-coordinates represented their alternative allele frequencies, and their corresponding histogram was also given along each axis ( Supplementary  Fig. 9).

Data availability
Sequencing data are available in ENA (European Nucleotide Archive) study PRJEB8793. www.nature.com/scientificreports/ Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creat iveco mmons .org/licen ses/by/4.0/.