Population genetics of 30 insertion/deletion polymorphisms in the Bahraini population

This paper evaluates the forensic utility of 30 insertion-deletion polymorphism (indel) markers in a sample from the Bahraini population using the Qiagen Investigator DIPplex Kit. Allele frequencies and forensic statistical parameters of the 30 indels were investigated in 293 unrelated individuals from different governorates of the Kingdom of Bahrain. None of the markers showed significant deviation from Hardy-Weinberg equilibrium except for the HLD88 locus, and no linkage disequilibrium was detected between any pair of the indel loci, suggesting that these markers are independent and that their allele frequencies can be used to calculate match probabilities in the Bahraini population. The high combined power of discrimination (CPD = 0.9999999999998110) and the low combined match probability (CPM = 1.89 × 10−13) indicate that these markers are informative and can be successfully used for human identification in forensic and paternity casework. Genetic distances and relatedness were displayed through multidimensional scaling plots and a phylogenetic tree using various populations in the region. Our study showed that the Bahraini population clustered with neighboring countries such as Kuwait and the Emirates, which indicates that these geographically close regions share similar allele frequencies and are more genetically related than the other reference populations studied.


Scientific Reports (2021) 11:6843, https://doi.org/10.1038/s41598-021-86386-w

author and presented at the General Directorate of Criminal Investigation and Forensic Science, Kingdom of Bahrain, to deliver their samples for the research after obtaining informed consent. The participants ranged in age from 20 to 70 years. Ethical review for conducting tests was obtained and approved by the Research and Research Ethics Committee (RREC) (E007-PI-10/17) in the Arabian Gulf University, Manama, Kingdom of Bahrain. All participants gave informed consent prior to their contribution. All research was performed in accordance with relevant guidelines and regulations. In each case, males who declared their ancestry (to the level of paternal grandfather) from four different geographical subdivisions of the country (Capital Governorate, Muharraq Governorate, Northern Governorate and Southern Governorate) were sampled.

DNA processing. Genomic DNA was extracted using the QIAsymphony SP instrument (Qiagen, Germany) following the magnetic-bead principle. Subsequently, the extracted DNA was quantified using the Quantifiler HP DNA Quantification kit (Thermo Fisher Scientific Company, Carlsbad, USA) on the 7500 Real-Time PCR System (Thermo Fisher Scientific Company, Carlsbad, USA) according to the manufacturer's recommendations.
About 0.5 ng of the extracted DNA was amplified using the Investigator DIPplex kit (Qiagen, Germany) in full-volume reactions (10.5 µl) for 30 cycles, following the manufacturer's protocol. Reactions were set up in a MicroAmp Optical 96-Well Reaction Plate (Thermo Fisher Scientific Company, Carlsbad, USA) together with the provided positive control (9948) and nuclease-free water as a negative control, and were run in a Veriti thermal cycler (Thermo Fisher Scientific Company, Carlsbad, USA) using the PCR thermal cycling conditions given in the DIPplex manufacturer's protocol.
The PCR products (1 µl) were separated by capillary electrophoresis on an ABI 3500xl Genetic Analyzer (Thermo Fisher Scientific Company, Carlsbad, USA) with reference to the BTO size standard (Qiagen, Germany), in a total of 12 µl of master mix consisting of BTO size standard and Hi-Di formamide (Thermo Fisher Scientific, Inc., Waltham, MA, USA). GeneMapper ID-X Software v1.4 (Thermo Fisher Scientific, Inc., Waltham, MA, USA) was used for genotype assignment in combination with the Investigator DIPplex Template Files and Qiagen DIPSorter software (Qiagen, Germany). Experiments were performed in the Biology and DNA Forensic Laboratory, Ministry of Interior, Kingdom of Bahrain, which is accredited with Collaborative Testing Services (CTS).
Arlequin statistical software v3.5 9 was used to test each locus for Hardy-Weinberg equilibrium (HWE) and to test for linkage disequilibrium (LD) between all pairs of the 30 indels, and p values were corrected by the Bonferroni method 10 .
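The testing and correction steps described here can be illustrated with a minimal Python sketch. It is not the procedure implemented in Arlequin: a simple chi-square goodness-of-fit test is swapped in for illustration only, the genotype counts are hypothetical, and the significance level of 0.05 is an assumption. Under that assumption, dividing 0.05 by the 435 pairwise tests among 30 loci gives a value consistent with the LD threshold quoted in the Results (P < 0.000115).

```python
import math
from itertools import combinations


def hwe_chi_square(n_dd, n_di, n_ii):
    """Chi-square goodness-of-fit test of HWE for one biallelic indel locus.

    n_dd, n_di, n_ii are observed counts of deletion homozygotes,
    heterozygotes and insertion homozygotes. Returns (chi2, p_value)
    for 1 degree of freedom. Assumes the locus is polymorphic, so that
    no expected count is zero.
    """
    n = n_dd + n_di + n_ii
    p = (2 * n_dd + n_di) / (2 * n)  # deletion allele frequency
    q = 1.0 - p                      # insertion allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_dd, n_di, n_ii)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


N_LOCI = 30
ALPHA = 0.05  # assumed nominal significance level (not stated in the text)

# Number of locus pairs tested for LD and the Bonferroni-corrected threshold.
n_pairs = len(list(combinations(range(N_LOCI), 2)))
ld_threshold = ALPHA / n_pairs

# Hypothetical genotype counts for one locus in 293 individuals.
chi2, p_value = hwe_chi_square(70, 140, 83)
print(f"pairs tested: {n_pairs}, Bonferroni threshold: {ld_threshold:.6f}")
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")
```

A locus would be flagged only if its p value fell below the corrected threshold; the same division by the number of tests applies to the per-locus HWE tests.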
Interpopulation pairwise genetic distances based on Fst were calculated with POPTREE2 software 16 from the allele frequencies of the Bahraini population and of populations taken from the literature: Kuwait 11 , UAE 12 , Iraq 13 , Iran 14 , Turkey 13 , Slovenia 13 , Lithuania 13 , Bangladesh 15 , Indonesia 15 , and Japan 15 . The Fst distances were then represented by nonmetric multidimensional scaling (NM-MDS) analysis using IBM SPSS Statistics v21.0 software to investigate the population structure between the Bahraini population and the abovementioned populations.
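To make the idea of an Fst-based pairwise distance concrete, the following Python sketch computes a simple heterozygosity-based Fst, (Ht − Hs)/Ht, summed over biallelic loci. This is an illustrative estimator and not necessarily the one implemented in POPTREE2; the population names and deletion-allele frequencies are hypothetical placeholders, not data from this study.

```python
def pairwise_fst(freqs_a, freqs_b):
    """Multi-locus Fst between two populations from deletion-allele frequencies.

    Uses Fst = (Ht - Hs) / Ht with numerator and denominator summed over
    loci, assuming equal population weights and biallelic markers.
    """
    numerator = 0.0
    denominator = 0.0
    for pa, pb in zip(freqs_a, freqs_b):
        p_bar = (pa + pb) / 2.0
        h_total = 2.0 * p_bar * (1.0 - p_bar)                    # Ht
        h_within = (2.0 * pa * (1.0 - pa) + 2.0 * pb * (1.0 - pb)) / 2.0  # Hs
        numerator += h_total - h_within
        denominator += h_total
    return numerator / denominator if denominator > 0 else 0.0


# Hypothetical deletion-allele frequencies at three loci.
populations = {
    "PopA": [0.45, 0.52, 0.30],
    "PopB": [0.47, 0.50, 0.33],
    "PopC": [0.70, 0.25, 0.60],
}

# Symmetric matrix of pairwise Fst values, suitable as input to MDS or
# neighbor-joining software.
names = list(populations)
fst_matrix = {
    (a, b): pairwise_fst(populations[a], populations[b])
    for a in names
    for b in names
}

for a in names:
    print(a, [round(fst_matrix[(a, b)], 4) for b in names])
```

Populations with similar allele frequencies (PopA and PopB here) give values near zero, while a divergent population (PopC) gives larger values, which is the pattern an MDS plot then displays as distance.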

Ethics approval. Ethical review for conducting tests was obtained and approved by the Research and Research Ethics Committee (RREC) (E007-PI-10/17) in the Arabian Gulf University, Manama, Kingdom of Bahrain.
Consent to participate. All participants provided informed consent prior to contributing their buccal swab samples.

Consent for publication.
All authors/participants provided consent for publication. All figures were generated with the software indicated in the materials and methods.

Results
Allele frequencies, forensic parameters and efficiency. Allele frequencies and forensic efficiency parameters for the 30 indel loci in the Bahraini population are shown in Table 1. The genotypes are available in supplementary Table 1S.
No locus deviated from Hardy-Weinberg equilibrium (HWE) after applying the Bonferroni-corrected threshold of P < 0.00017, except for HLD88, which still deviated after the correction. The expected heterozygosities (He) ranged from 0.413 to 0.501 with a mean value of 0.481. The observed heterozygosities (Hobs) ranged from 0.332 (HLD97) to 0.534 (HLD6 and HLD125) with a mean of 0.450. Values for the polymorphic information content (PIC) ranged between 0.328 and 0.375. All markers were highly polymorphic and informative for forensic application in the Bahraini population sample. To determine the forensic efficiency, we evaluated the power of discrimination (PD), power of exclusion (PE) and matching probability (MP). The combined power of discrimination (CPD) and the combined power of exclusion (CPE) for the 30 indel markers were 0.9999999999998110 and 0.99276, respectively. The combined MP was 1.89 × 10−13 for Bahrainis, allowing a reliable level of discrimination in forensic cases. Regarding the allele frequencies, given as deletion and insertion frequencies in Table 1, the deletion frequencies (DIP−) ranged from 0.291 (HLD64) to 0.658 (HLD77) with a mean above 0.4. Insertion frequencies (DIP+) ranged from 0.342 (HLD77) to 0.709 (HLD64). Linkage disequilibrium tests (P < 0.000115 after Bonferroni correction) revealed no allelic association in any pairwise combination of the 30 indels, indicating the independence of the 30 indel markers, as shown in Table 2S.
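The relationships among these parameters can be sketched in Python for a biallelic marker. This is a minimal illustration under stated assumptions: genotype frequencies are taken as their Hardy-Weinberg expectations (forensic software typically uses observed genotype frequencies), no small-sample correction is applied, and the function names are hypothetical. With the reported HLD64 deletion frequency of 0.291, the sketch gives values of about 0.413 for He and 0.328 for PIC, consistent with the lower ends of the reported ranges, and a marker with equal allele frequencies gives the biallelic PIC maximum of 0.375.

```python
import math


def biallelic_stats(p_del):
    """Per-locus parameters for a biallelic indel from its deletion frequency.

    Assumes Hardy-Weinberg genotype proportions (an assumption of this
    sketch, not necessarily how the published values were computed).
    """
    q = 1.0 - p_del
    het = 2.0 * p_del * q                          # expected heterozygosity
    pic = 1.0 - (p_del**2 + q**2) - 2.0 * p_del**2 * q**2
    genotype_freqs = (p_del**2, het, q**2)
    mp = sum(g * g for g in genotype_freqs)        # matching probability
    return {"He": het, "PIC": pic, "MP": mp, "PD": 1.0 - mp}


def combined_match_probability(mps):
    """Product rule over independent loci."""
    return math.prod(mps)


def combined_power_of_exclusion(pes):
    """CPE = 1 - product of (1 - PE_i) over independent loci."""
    return 1.0 - math.prod(1.0 - pe for pe in pes)


# HLD64 deletion frequency as reported in the text.
hld64 = biallelic_stats(0.291)
print({k: round(v, 3) for k, v in hld64.items()})

# CPD follows from the reported combined matching probability.
cpm_reported = 1.89e-13
cpd = 1.0 - cpm_reported
print(f"CPD = {cpd:.16f}")

# Hypothetical best case: 30 loci, each with equal allele frequencies.
best_case_cpm = combined_match_probability(
    [biallelic_stats(0.5)["MP"]] * 30
)
print(f"best-case CPM for 30 biallelic loci: {best_case_cpm:.2e}")
```

The product rule used for the combined values is valid only if the loci are independent, which is why the absence of linkage disequilibrium matters for the combined statistics.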
Interpopulation diversity. Determining the genetic structure of populations is becoming increasingly important in genetic studies 19 . To reveal genetic similarities and divergences between the Bahraini population and other previously reported populations, we constructed a phylogenetic tree (Fig. 1) from allele frequency data (deletion and insertion values for each marker) using the neighbor-joining (NJ) method in MEGA X: Molecular Evolutionary Genetics Analysis. We also used the matrix of Fst genetic distances to generate a multidimensional scaling (MDS) plot (Fig. 2).
We used 10 different populations along with the population of Bahrain: Kuwait 11 , UAE 12 , Iraq 13 , Iran 14 , Turkey 13 , Slovenia 13 , Lithuania 13 , Bangladesh 15 , Indonesia 15 , and Japan 15 . Fst values for the allele frequency distribution between the Bahraini population and the published groups are shown in Table 2.
The Bahraini and Kuwaiti populations, along with the Emirati population, were more closely related to one another than to the other populations. The rest of the populations stood distant in genetic association with the     Figure 2 with good accordance to their geographic region.

Discussion
The forensic utility of 30 insertion-deletion polymorphism (indel) markers in a sample from the Bahraini population was successfully evaluated in this paper using the Qiagen Investigator DIPplex Kit. The deviation from HWE at the HLD88 locus could be a result of the high diversity of the studied population or of the high polymorphism of the HLD88 locus, which is also supported by the PD and MP parameters. Earlier studies of autosomal STRs 20 indicated that the Bahraini population structure reflects a high level of endogamy, which accounts for 20-50% of all marriages, compared with other populations in the region 21 . Another possible explanation is the Wahlund effect within the communities, that is, an excess of homozygotes due to population substructure 22 .
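The Wahlund effect mentioned above can be illustrated numerically. The Python sketch below pools hypothetical, equally sized subpopulations that are each in Hardy-Weinberg equilibrium but differ in allele frequency, and shows the resulting heterozygote deficit in the pooled sample. It also computes a rough fixation index, F = 1 − Hobs/He, from the mean heterozygosities reported in the Results; because those are means across loci, this ratio is only an approximate summary, and the function names are illustrative.

```python
def wahlund_deficit(sub_freqs):
    """Heterozygote deficit from pooling equally sized subpopulations.

    Each subpopulation is assumed to be in HWE at its own deletion-allele
    frequency. Returns (expected_het, observed_het, deficit) for the pool.
    """
    k = len(sub_freqs)
    p_bar = sum(sub_freqs) / k
    h_expected = 2.0 * p_bar * (1.0 - p_bar)       # HWE expectation for the pool
    h_observed = sum(2.0 * p * (1.0 - p) for p in sub_freqs) / k
    return h_expected, h_observed, h_expected - h_observed


def fixation_index(h_obs, h_exp):
    """F = 1 - Hobs/He; positive values indicate a heterozygote deficit."""
    return 1.0 - h_obs / h_exp


# Two hypothetical subpopulations with deletion frequencies 0.3 and 0.7.
h_exp, h_obs, deficit = wahlund_deficit([0.3, 0.7])
print(f"expected {h_exp:.2f}, observed {h_obs:.2f}, deficit {deficit:.2f}")

# Mean heterozygosities reported in the Results (Hobs = 0.450, He = 0.481).
f_overall = fixation_index(0.450, 0.481)
print(f"approximate F from reported means: {f_overall:.3f}")
```

In the two-subpopulation example the deficit equals twice the variance of the allele frequency among subpopulations, which is the usual statement of the Wahlund effect. Endogamy would produce a heterozygote deficit of the same sign, so the two explanations cannot be separated from heterozygosity alone.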
As for LD, after applying the Bonferroni correction, the possible combinations of loci located on the same chromosome showed only minor indications of departure from independence. Therefore, the studied indels at different loci can be treated as independent for the calculation of matching probabilities. We compared the Bahraini population data with those of other populations according to the available data, using the accessible loci. Regarding interpopulation diversity, the phylogenetic tree was constructed from the data of the 11 populations, and it was consistent with other population data from the region based on the Fst values obtained.
Fst values were obtained for the different populations in order to measure population differentiation due to genetic structure. The Bahraini population shows comparable results to its neighboring countries (Kuwait and UAE) based on the 30 indel markers, which indicates that these populations have more gene flow among them than with more distant populations, resulting in a similar pattern of allele frequency distribution. Once more studies of Arab populations in the region become available, it may become possible to develop a greater understanding of the genetic associations between the different populations of the Arabian Peninsula.
This study expands the population database relevant for the application of genetic markers in forensic studies and can complement STR population genetic studies in many challenging forensic cases. To conclude, this is the first study to report the allele frequencies and forensic statistical parameters of the Bahraini population for the 30 insertion and deletion polymorphisms included in the Investigator DIPplex Kit. Interpopulation comparisons showed that differences were high among populations worldwide, which suggests that the DIPplex Kit might perform well in intercontinental forensic population analysis. With their straightforward genotyping procedure, low mutation rate and high level of information, the 30 indel markers show great potential in forensic investigations, especially in cases where degraded or low-quality samples gave partial or null profiles with conventional STR markers, or in paternity cases where an additional set of markers is needed to increase the power of the evidence.

Data availability
The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request. The full dataset associated with this article is available in Table 1S, and the linkage disequilibrium results are available in Table 2S.

Code availability
All the software applications are mentioned within the text.