Comparison of Two Massively Parallel Sequencing Platforms using 83 Single Nucleotide Polymorphisms for Human Identification

The potential of Massively Parallel Sequencing (MPS) technology to vastly expand the capabilities of human identification led to the emergence of different MPS platforms that use forensically relevant genetic markers. Two of the MPS platforms that are currently available are the MiSeq® FGx™ Forensic Genomics System (Illumina) and the HID-Ion Personal Genome Machine (PGM)™ (Thermo Fisher Scientific). These are coupled with the ForenSeq™ DNA Signature Prep kit (Illumina) and the HID-Ion AmpliSeq™ Identity Panel (Thermo Fisher Scientific), respectively. In this study, we compared the genotyping performance of the two MPS systems based on 83 SNP markers that are present in both MPS marker panels. Results show that MiSeq® FGx™ has greater sample-to-sample variation than the HID-Ion PGM™ in terms of read counts for all the 83 SNP markers. Allele coverage ratio (ACR) values show generally balanced heterozygous reads for both platforms. Two and four SNP markers from the MiSeq® FGx™ and HID-Ion PGM™, respectively, have average ACR values lower than the recommended value of 0.67. Comparison of genotype calls showed 99.7% concordance between the two platforms.

are common for both platforms. These 83 SNP markers, which are spread across the 22 autosomes, have high heterozygosity and low Fixation Index (Fst), giving them a high combined discrimination power 15,16. In addition, these 83 SNPs have relatively small amplicon sizes ranging from 40 to 135 bp, which increases the likelihood of successful DNA profiling of degraded DNA 17,18.
Since the HID-Ion PGM™ and MiSeq® FGx™ systems use different approaches to sequencing, it is necessary to assess the reliability and consistency of their genotyping results. Concordance of the shared 83 SNP markers will allow merging of data generated by the two systems and enable expansion of existing databases. Using the overlapping 83 SNPs, we performed concordance analysis and a parallel evaluation of coverage per SNP locus and heterozygote balance of genotype calls on 143 blood samples that were blotted on FTA™ paper (Whatman). FTA™ paper is used in forensic genetics research and biobanking because it allows for easier and longer storage of DNA from samples such as blood 19. Recently, Kampmann and co-workers demonstrated the utility of FTA samples for MPS 20, thus opening the doors for laboratories with archived samples to adopt MPS technology.

Results and Discussion
Concordance Evaluation. The two MPS systems initially showed more than 43% non-concordance at 28 of the 83 SNPs analyzed (Fig. 1, gray circle with red dot) because the two platforms use different nomenclature in reporting SNP genotypes. Concordance was achieved after comparing the reverse complement of the genotypes from the MiSeq® FGx™ marker panel with the corresponding genotypes in the HID-Ion PGM™ marker panel. Overall concordance analysis of the 143 samples showed an average of 99.70% concordance and a non-concordance range of 0 to 9% across all 83 SNPs (Fig. 1). Non-concordance was caused mainly by zero or low coverage reads (Figs 2 and 3) and extreme allele imbalance (Table 1). Multiple samples exhibited non-concordance at SNPs rs1736442 (9%), rs1031825 (6%), and rs10776839 (5%) (Table 1). SNPs rs1736442 and rs1031825 showed low average coverage of 58 and 54 reads (Fig. 3), respectively, when typed with the MiSeq® FGx™ platform. In such cases, the risk of allele dropout is higher because of the low number of allele reads 1.
SNPs rs10776839 and rs2040411 gave very low, imbalanced average allele coverage ratio (ACR) values of 0.186 and 0.097, respectively, for the samples listed in Table 1 when typed with the HID-Ion PGM™ system. Notably, SNP rs10776839 is among the poorly performing SNPs identified for the Ion Torrent™ HID SNP assay because of inconsistent allele balance among the samples typed 1.
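The strand issue described above can be illustrated with a minimal Python sketch. It assumes that each genotype is a two-letter string of single-base alleles, so that the reverse complement of an allele is simply its complementary base. The function names and the example calls are hypothetical and are not taken from either vendor's software or from the study data.

```python
# Complementary bases; for a single-base allele the reverse complement
# is just the complement.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement_genotype(genotype):
    """Return the genotype as read on the opposite strand, e.g. 'AG' -> 'TC'."""
    return "".join(COMPLEMENT[base] for base in genotype)


def normalize(genotype):
    """Sort the alleles so that 'GA' and 'AG' compare as equal."""
    return "".join(sorted(genotype))


def is_concordant(call_a, call_b, opposite_strand=False):
    """Compare two genotype calls, optionally flipping call_a to the other strand."""
    if opposite_strand:
        call_a = complement_genotype(call_a)
    return normalize(call_a) == normalize(call_b)


def concordance_rate(pairs, opposite_strand=False):
    """Fraction of (call_a, call_b) pairs that agree."""
    matches = sum(is_concordant(a, b, opposite_strand) for a, b in pairs)
    return matches / len(pairs)


# Hypothetical calls for one SNP that the two panels report on opposite strands.
pairs = [("AG", "TC"), ("AA", "TT"), ("GG", "CC"), ("AG", "CC")]
print(concordance_rate(pairs))                        # 0.0
print(concordance_rate(pairs, opposite_strand=True))  # 0.75
```

In this sketch the strand flag is supplied per SNP by the caller. A/T and C/G SNPs read the same on both strands as heterozygotes, so the reporting strand for such markers cannot be inferred from the calls alone and would have to come from the panel documentation.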
Coverage Analysis. Sequencing coverage directly affects the sensitivity and SNP genotyping accuracy of MPS systems applied to forensic typing. For SNP detection, the actual coverage per SNP locus (referred to as 'SNP coverage' in this paper) was the parameter used for evaluation 21. For the HID-Ion PGM™ and MiSeq® FGx™, variation in read counts could arise from various factors during library preparation. SNP coverage for the HID-Ion is affected by the number of wells in the sequencing chip that are occupied by Ion Sphere™ Particles (ISP) with monoclonally amplified SNP target that were successfully read 21. On the other hand, SNP coverage for the MiSeq® FGx™ is affected by PCR amplification efficiency, purification, and bead-based library normalization 11. Increasing the number of markers multiplexed in one sequencing reaction could also increase variation in coverage of the SNP markers 11. Comparison of the markers' SNP coverage between the two MPS systems (Figs 2 and 3) showed that the MiSeq® FGx™ achieved higher SNP coverage in the majority of the markers; however, it also showed higher variation in SNP coverage distribution across samples than the HID-Ion PGM™, with more read outliers observed in the univariate statistical evaluation. In the MiSeq® FGx™, SNP rs338882 (average ACR = 0.51) gave 11 extreme outliers with SNP coverage values up to 8,740 reads away from the average SNP coverage (618 reads) of the 143 samples (Fig. 3). This SNP, however, showed 100% concordance between platforms.
Allele Coverage Ratio. The overall average allele coverage ratio (ACR) of heterozygous SNPs is 0.89 and 0.88 for the HID-Ion PGM™ and MiSeq® FGx™ platforms, respectively (Fig. 4). This means that coverage of heterozygous SNPs on the two platforms is generally balanced, approximating the ideal ACR value of 1.0 (50:50 allele ratio).
SNPs rs214955, rs430046, rs876724, and rs917118 in the HID-Ion PGM™, and SNPs rs338882 and rs6955448 in the MiSeq® FGx™ (Fig. 4), gave average ACR values of less than 0.67, the minimum threshold recommended by Eduardoff et al. for balanced heterozygote SNPs 21. This, however, did not affect the concordance of these SNPs between platforms.
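As a rough illustration of the ACR check described above, the following Python sketch computes the ratio for one heterozygous call and compares it with the 0.67 threshold. It assumes that ACR is the read count of the less-covered allele divided by that of the more-covered allele, which is consistent with an ideal value of 1.0 for a 50:50 ratio but is not spelled out in the text. The function names and read counts are hypothetical.

```python
ACR_THRESHOLD = 0.67  # minimum value for balanced heterozygotes cited in the text


def allele_coverage_ratio(reads_allele_1, reads_allele_2):
    """Return lower read count / higher read count for a heterozygous call.

    Assumed definition: 1.0 means a perfect 50:50 balance and values near 0
    mean that one allele dominates the reads.
    """
    high = max(reads_allele_1, reads_allele_2)
    low = min(reads_allele_1, reads_allele_2)
    if high == 0:
        raise ValueError("no reads observed for either allele")
    return low / high


def is_balanced(acr, threshold=ACR_THRESHOLD):
    """True when the ACR meets or exceeds the threshold."""
    return acr >= threshold


# Hypothetical read counts (allele 1, allele 2) for three heterozygous calls.
for reads in [(100, 100), (80, 100), (51, 100)]:
    acr = allele_coverage_ratio(*reads)
    print(reads, round(acr, 2), is_balanced(acr))
```

Averaging such per-sample ratios for each SNP would give a per-marker value comparable to the average ACR figures reported here, under the same assumed definition.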

Conclusion
This study highlights the need to include information on sequence nomenclature when reporting MPS data.
Genotyping data generated using the HID-Ion PGM™ and the MiSeq® FGx™ Forensic Genomics System were highly concordant, and SNP data may be pooled to provide a more comprehensive database of forensically relevant SNPs. Further work is needed to address the quality of MPS data from SNPs rs10776839, rs1031825, and rs1736442 (with greater than 4.8% non-concordance between platforms) and from SNP rs338882 in the MiSeq® FGx™ marker panel (with observed imbalance in heterozygous SNPs and large sample-to-sample variation in coverage).

Methods
The study was implemented at the DNA Analysis Laboratory, Natural Sciences Research Institute, University of the Philippines Diliman. Laboratory work involving the use of MPS machines was conducted at Illumina Headquarters in Singapore and at the Philippine Genome Center, University of the Philippines Diliman. Steps in processing the samples using the HID-Ion PGM™ and MiSeq® FGx™ ForenSeq™ Genomics System were performed following the manufacturers' protocols 22-24.
Samples. Archived blood DNA samples on FTA™ paper from 143 unrelated Filipino male individuals were processed using the MiSeq® FGx™ Forensic Genomics System and the HID-Ion PGM™ following the manufacturers' protocols. This study was approved by the University of the Philippines Manila Research Ethics Board (UPMREB No. 2014-499-01). All procedures were performed in accordance with the approved guidelines of UPMREB. Informed consent was obtained from each volunteer before sample collection.
Massively Parallel Sequencing using HID-Ion PGM™. Library amplification was performed in the Veriti® Thermal Cycler (Applied Biosystems) using reagents from the HID Ion AmpliSeq™ Identity Panel kit 24. The amplification reaction contained 5X Ion AmpliSeq™ HiFi Master Mix (4 µL) and 2X HID-Ion AmpliSeq™ Identity SNP-124 Panel (10 µL