Comprehensive genome-wide identification and transferability of chromosome-specific highly variable microsatellite markers from citrus species

Citrus species are among the most important and widely consumed fruits in the world owing to their vitamin C, essential oil glands, and flavonoids. Highly variable simple sequence repeat (SSR) markers are among the most informative and versatile molecular markers used in genetic research on perennial trees. A genome-wide SSR survey of Citrus sinensis and Citrus maxima identified perfect SSRs spanning nine chromosomes. We categorized all SSR motifs into three major classes based on their tract lengths. We designed class I SSR primers for the C. sinensis and C. maxima genomes and validated them through electronic polymerase chain reaction (ePCR); 83.89% of the SSRs in C. sinensis and 78.52% in C. maxima produced a single amplicon. We then selected extremely variable SSRs (> 40 nt) from the ePCR-verified class I SSRs and validated them in silico across seven draft citrus genomes, which provided a subset of 84.74% (C. sinensis) and 77.53% (C. maxima) highly polymorphic SSRs. Of these, 129 primers were validated on 24 citrus genotypes in wet-lab experiments, and 127 (98.45%) of these highly variable SSRs (HvSSRs) were polymorphic. The utility of the developed HvSSRs was demonstrated by analysing the genetic diversity of 181 citrus genotypes with 17 HvSSRs spanning the nine citrus chromosomes; the genotypes were divided into 11 main groups. These chromosome-specific SSRs will serve as a powerful genomic tool for future QTL mapping, molecular breeding, investigation of population genetic diversity, comparative mapping, and evolutionary studies among citrus and related genera and species.

Across the whole genomes of C. sinensis and C. maxima (Supplementary Fig. 3A,B), the intra-chromosomal distribution of SSR motif types showed that mononucleotide repeats were the most frequent and hexanucleotide repeats the least frequent. The distribution of the three major classes of perfect SSRs is presented in Table 5. After excluding mononucleotides, 31,678 SSRs in C. sinensis and 83,605 SSRs in C. maxima were selected for classification. The maximum number of motifs in C. sinensis and in C. maxima (19, ... nine chromosomes, respectively. The overall distribution of the three major SSR classes on each chromosome of C. sinensis revealed that Chr-5 and Chr-2 had the highest numbers of all three classes, followed by Chr-9 for class I and class III and by Chr-1 for class II. In C. maxima, Chr-5 had the highest number of all three classes of SSRs, followed by Chr-2 (Table 5). The distribution of class I SSRs in C. sinensis and C. maxima by number of repeat units, for dinucleotides to hexanucleotides on each chromosome, was also studied (Table 5) and is depicted in Fig. 1A,B. The Circos graph (Fig. 1A) represents all three classes on each chromosome (class III, inner ring; class II, middle ring; class I, outer ring). The numbers of dinucleotide to hexanucleotide SSR motifs decrease from the inner to the outer rings of the Circos graph (Fig. 1B). Within the chromosomes of both genomes (C. sinensis and C. maxima), dinucleotide motifs were the most numerous (930 and 2010), followed by trinucleotides (713 and 986), and this pattern was consistent across all chromosomes. On all chromosomes, the frequency of SSR markers decreased as the number of nucleotides in the repeat motif increased, except for hexanucleotides (Table 5).
The physical start positions of 321 HvSSRCS markers in C. sinensis and 1206 HvSSRCM markers in C. maxima on the nine chromosomes were determined, and these markers were used to construct a saturated physical map (Fig. 2A,B). The map showed that Chr-5 had the largest number of SSR markers (59 in C. sinensis and 210 in C. maxima), followed by Chr-2 (48 in C. sinensis and 206 in C. maxima).

ePCR validation of the identified HvSSRs across the seven citrus species.
A total of 2004 and 3492 class I SSR primers were tested in silico on the genomes of C. sinensis and C. maxima, respectively, to determine SSR amplification, specificity, and efficiency. The SSR markers produced from one to more than three alleles in both genomes, and roughly equal proportions of SSR markers were validated across the nine chromosomes (Supplementary Table S3A). A total of 1343 (83.89%) in C. sinensis and 2601 (78.52%) in ... (54), C. medica (78), C. ichangensis (54), Atalantia buxifolia (30), and Fortunella hindsii (75) SSRs showed single-locus amplification (Supplementary Table S3A). Two hundred and twenty-one (81.25%) of these ... Table S3A). We selected a subset of 272 SSRs from C. sinensis and 935 from C. maxima to validate these chromosome-specific SSRs across seven citrus species. To calculate the marker parameters, the amplicons obtained by ePCR for these SSR markers across the seven citrus genomes were used (Supplementary Table S3B

Development of SSR-based physical map in citrus.
A high-density physical map was generated for the nine chromosomes from the physical positions of 321 HvSSRCSs in C. sinensis and 1206 HvSSRCMs in C. maxima (Supplementary Table S2, Fig. 2A,B). Chr-5 (59 and 210 markers) and Chr-2 (48 and 206) carried the largest numbers of markers in the two species, respectively, followed by Chr-3 (46) in C. sinensis and Chr-1 (145) in C. maxima (Supplementary Fig. 4A,B). Notably, the scatter plots show the physical distance (Mb), the intervals between SSR markers, and the tract length of each SSR on each chromosome (Supplementary Fig. 4A ... Table S5, Table S6). The primer sets showed clear amplification with well-resolved fragments. The HvSSRs could discriminate between the different citrus germplasm. However, some HvSSR markers did not clearly discriminate among accessions of a few sub-groups that arose through spontaneous mutation, such as sweet oranges, grapefruits, and some mandarins. Total primers, total polymorphic primers, average number of alleles, number of different alleles, major allele frequency, number of effective alleles, Shannon's information index, observed heterozygosity, expected heterozygosity, unbiased expected heterozygosity, and polymorphic information content were calculated to determine the genetic diversity within the whole germplasm (181 accessions) (Table 7). Based on taxonomic classification, the germplasm was divided into 11 sub-groups following the UCR Citrus Variety Collection. The observed heterozygosity in the population as a whole was 0.69. Among the groupings considered to be true Citrus species, citrons (excluding trifoliate hybrids) had the lowest observed heterozygosity. Grapefruit showed the highest observed heterozygosity of all eleven taxonomic groups, at 0.92.
divided into three main groups: group 1, which had 7 species, group 2, which contained 16 species, and group 3, which contained one citrus species (Fig. 4A). Principal coordinates (PCoA) 1 and 2 accounted for 13.81% and 10.56% of the variation, respectively, and together for 24.37% of the total variation among the 24 citrus species (Fig. 4B).
In the neighbour-joining tree, all 181 citrus accessions clustered into three major groups: group 1 comprised 24 genotypes (pummelo and sour orange), group 2 comprised 85 genotypes (sweet orange, mandarin, and grapefruit), and group 3 comprised 72 genotypes (limes, lemons, trifoliate hybrids, Fortunella, and other citrus-related species) (Fig. 5A). The PCoA likewise allocated the 181 accessions to three distinct groups (Fig. 5B). Principal coordinates 1 and 2 explained 11.25% and 7.36% of the total variance among the genotypes, respectively, and together 18.61% of the overall variation. Notably, PCoA 1 distinguished between wild and cultivated groups for the three clusters.

Population structure.
Seventeen HvSSRs were used to estimate population structure among the 181 diverse citrus genotypes (Fig. 6). The genotypes were grouped into four sets: mandarins/lemons/trifoliates (blue); mandarins/sweet orange (yellow); sweet orange/grapefruit (green); and pummelos, citrons, trifoliates, sour orange, limes, and a kumquat/papeda group (red). Mandarins and pummelos are true citrus species, whereas citrons, Fortunella, and trifoliate hybrids are not classified as separate, although they are related genera. The other citrus species showed admixture between two or more of these four populations and were evidently hybrids between naturally existing types (Fig. 6). An intriguing outcome is that mandarins #30-35 were separated from the other mandarins, possibly because of their different geographic origins (Fig. 6). The sweet oranges (#37-45 and #47-55) have a genetic makeup derived mostly from mandarin with a small contribution from pummelo, whereas #46, #70, and #71 have genetic compositions derived mostly from pummelo with some contribution from mandarin.
GO classification of genic SSRs.
The potential functions of SSR loci were assigned using BLASTX. This analysis showed that 95% of all SSR loci had no substantial similarity to known protein-coding sequences, whereas 5% had functional protein-coding sequences annotated in the public non-redundant protein database. Significant gene annotations were found for 104 loci in C. sinensis and 387 loci in C. maxima. In C. sinensis, most annotated SSR loci were involved in biological processes including oxidation-reduction (27%), metabolic processes (15%), and carbohydrate metabolism (6%). In C. maxima, metabolic processes (38%) predominated, followed by biosynthetic processes (15%) and responses to stress (15%), compared with metabolic processes (15%) in C. sinensis. Among the molecular function categories, C. maxima showed higher proportions of ATP binding and protein binding than C. sinensis, whereas protein kinase activity was comparable (10%) in both species. The representation of these categories differed from that of the whole gene set in both C. sinensis and C. maxima (Supplementary Fig. 5A,B). Similar percentages of oxidoreductase, cellular, and catalytic activity were found in both species (Supplementary Fig. 5A,B).

Discussion
HvSSR markers have been extensively used for genomic studies, linkage/trait/QTL mapping, DNA fingerprinting, gene tagging, population genetics, conservation biology, and ideotype/molecular breeding in citrus. However, the limited availability of chromosome-specific highly variable SSR markers has impeded trait identification and mapping in citrus species because of high cross-compatibility. Kijas et al. 46 developed the first citrus SSR markers for citrus improvement. The draft genome sequences of various citrus species, viz., C. sinensis, C. clementina, P. trifoliata, and C. limon, were used in numerous projects from 2006 to 2020 to identify a considerable number of genome-wide SSR markers 9,24,47-54 . The applicability of these markers and their efficiency for genetic mapping, genetic diversity, and population structure analyses were investigated in additional citrus species. Additionally, Biswas et al. 24 reported a success rate of 56.21%, Biswas et al. 9 65.0%, and Barkley et al. 37 ... 26 . Regarding the intra-chromosomal distribution of SSR motifs in C. sinensis and C. maxima, mononucleotides were the most abundant SSR type, followed by dinucleotides, in agreement with prior findings 24 .
When the distribution of SSR types across chromosomes was examined, class III SSRs were the most prevalent, followed by class II and class I. The frequency of SSRs and the number of repeats on each chromosome were correlated, consistent with observations from other plant species, i.e., pepper 59 , globe artichoke 55 , eggplant 26 , and pomegranate 17 . Additionally, class I SSRs on each chromosome in both citrus species were dominated by mononucleotide repeats, followed by dinucleotide repeats. These trends were also seen in the distribution of the three main classes of SSRs over the whole genome. In contrast, Biswas et al. 9 reported different nucleotide repeats in class I compared with our study, while class II repeats were similar in both citrus species.

Chromosome-specific hypervariable SSR marker design and distribution.
Class I SSR markers were used to design primers in C. sinensis and C. maxima (2004 and 3492 primers, respectively). In both species the largest number of primers was specific to Chr-5 (257 and 558, respectively), while Chr-6 (113) and Chr-8 (217) had the fewest primers, respectively. According to Portis et al. 26 and Patil et al. 17 , the number of SSRs and chromosome length were correlated in eggplant and pomegranate, respectively. In our study, the same result was observed in C. sinensis, whereas in C. maxima there was no correlation between chromosome length and the number of SSR markers. The present study found that, at the whole-genome level, the number of markers on each chromosome decreases as tract length increases. Similarly, the number of SSR markers decreased as tract length increased in pomegranate (Patil et al. 17 ) and in the eggplant genome (Portis et al. 26 ). Across the whole genomes of C. sinensis and C. maxima, mononucleotides (A and T) predominated on each chromosome, followed by dinucleotides (AT). In contrast, Portis et al. 26 and Patil et al. 17 reported that dinucleotides predominated, followed by trinucleotides, within individual chromosomes of the eggplant and pomegranate genomes, respectively.

Development of high-density physical map using HvSSRs.
A high-density, high-resolution genetic linkage map with consistent genomic locations and maximum coverage is essential for the mapping of genes and QTLs, and is readily achieved with recent developments in sequencing and genotyping technology 60,61 . In citrus, there are currently few reports on the creation of HvSSR-based physical maps. Here, using 321 and 1206 HvSSR markers, we constructed a saturated physical map of C. sinensis and C. maxima, respectively. In both citrus species, Chr-5 had the highest number of markers, followed by Chr-2. This demonstrated a relationship between the number of markers and chromosomal length. The SSR distribution pattern revealed that each chromosome carried approximately equal numbers of markers, with markers slightly sparser towards the middle of chromosomes than at the distal ends. A highly saturated physical map may be used as a reference map for analysing genotyping data from various breeding populations and genotypes, speeding up the mapping and breeding of distinct citrus traits. Zhao et al. 62 reported the utility of HvSSR markers placed on a physical map for fine mapping of reported QTLs, and such markers have been used in many other crops to estimate synteny and collinearity 17,63,64 . We therefore expect that the data obtained here will be useful to citrus scientists for citrus breeding.
... approach has been used for in silico confirmation of molecular markers 17 . The present investigation provided 321 and 1206 ePCR-validated SSR markers (> 40 bp) across the nine chromosomes of C. sinensis and C. maxima, respectively. Among these, 272 and 935 SSR loci yielded a single ePCR product in the C. sinensis and C. maxima genomes, respectively. Out of these 272 primer pairs, in C. maxima (198), C. clementina (54), C. medica (78), C. ichangensis (54), Atalantia buxifolia (30), and Fortunella hindsii (75) ... 36 .
SSR polymorphism and tract length were found to be directly proportional 13,65 , and our results were similar to these findings.

Genetic diversity analysis of 181 citrus germplasm.
The genetic diversity assessment of 181 citrus germplasm accessions with 17 SSRs demonstrated the usefulness of the novel HvSSRs for citrus genetic improvement. Our findings revealed a significant amount of genetic variation among the 181 citrus genotypes. Within some cultivar groups that arose through spontaneous mutation, however, few molecular polymorphisms are found 37,69,70 ; these 17 HvSSR markers were therefore unable to discriminate some clonally propagated varieties. According to Barkley et al. 37 , the lowest heterozygosity was seen in citrons (excluding trifoliate hybrids), which agrees with our findings, although the value we observed was 2.25-fold higher. Grapefruit, an apparent hybrid of pummelo and sweet orange 68 , had the greatest observed heterozygosity across all taxonomic groupings, a 1.64-fold increase. Several of the groupings assumed to be hybrids of the naturally existing types of citrus showed a larger share of heterozygous loci than the groups categorised as citrus ancestors or relatives. Limes, reportedly tri-hybrids of Citrus medica (citron), Citrus maxima (pummelo), and Microcitrus 71 , or apparent hybrids of citrons with papedas as the maternal parent 68,72 , showed the maximum observed heterozygosity of all eleven systematic groups at 0.66, which is almost identical to Barkley et al. 37 . The sweet oranges, on the other hand, long believed to be a backcross between a pummelo and a mandarin (1:3 ratio) 68 , showed the minimum heterozygosity (0.65) among the natural and developed hybrid sets. Among the ancestral species, the pummelo had the greatest frequency of heterozygotes (0.74), a 1.77-fold increase over previous research.
Of all the taxonomic groupings, limes had the greatest observed heterozygosity (0.66). However, among the hybrid groups, sweet oranges exhibited one of the lowest heterozygosities (0.65). In the last cluster, there are six admixture groups. The citrons, kumquats, trifoliate hybrids, and citrus-related species all grouped together with the lemons, limes, and their hybrids. Lemons are believed to be natural hybrids between citrons and limes or between citrons and sour oranges 39,68,73,74 . Rangpurs were initially classified as mandarin-limes; they belong to C. reticulata introgressed with a few genes from C. medica 37,39,68 .

Phylogenetic analysis of all accessions.
In the present study, 181 diverse citrus accessions were used for phylogenetic analysis. Pummelos and pummelo hybrids, sour oranges, and a few sour orange hybrids make up the first major group. According to Scora 75 , the pummelo is regarded as a true citrus species that was used in hybridization to produce bitter grapefruits and oranges 71,75 . The pummelos were very similar to one another: they grouped together with very short branch lengths between accessions.

Wet-lab validation of HvSSRs on a core set of citrus genotypes (24).
The mandarins, sweet oranges, and grapefruits made up the next major group but did not form a well-defined clade. The sweet orange and mandarin groupings dispersed into several smaller clusters. When hybrid and non-hybrid accessions were analysed together, Federici et al. 76 found that the C. reticulata group did not constitute a coherent cluster. Mandarins formed a distinct monophyletic group when hybrids were removed from the genotypic data (Fig. 5). C. reticulata is regarded as a true citrus species. Earlier investigations reported that C. sinensis is presumed to be a natural hybrid that inherited most of its genome from C. reticulata, the presumed female parent because the chloroplast genome was recovered from mandarin, and a small segment of its genome from C. maxima 71,72,75,77,78 . Among the hybrid varieties, however, the sweet oranges, which were previously believed to be a backcross of mandarin 68 ... Pummelo and sweet orange were thought to be the parents of grapefruit 71,72 . This research included InDel-SSR markers 79 , SNP markers 74 , and DNA fingerprinting analysis 73 . That grapefruit is a hybrid of Citrus maxima and Citrus sinensis was confirmed by whole-genome sequencing 48,68,80,81 . Hybrids were developed in the past through natural and man-made crosses between C. sinensis (oranges), C. reticulata (admixed mandarins), and C. paradisi (grapefruits). Tangors were developed from C. reticulata (mandarins) and C. sinensis (sweet oranges), tangelos from C. paradisi (grapefruit) and C. reticulata (mandarin), and orangelos from C. sinensis (sweet orange) and C. paradisi (grapefruit). These hybrids were considered small citrus varieties 81 . The dendrogram showed that Fortunella and Atalantia buxifolia are not far from accessions in the genus Citrus.

Structure analysis.
The relationships among citrus species and the origins of their hybrids are better understood in light of the STRUCTURE analysis of the HvSSR data, and the two analyses support one another. The neighbour-joining tree is a distance-based approach that determines the percentage of shared alleles between accessions and then plots these distance relationships as a tree. STRUCTURE seeks to identify population structure in which each population is in linkage equilibrium and Hardy-Weinberg equilibrium. It does this by using a Bayesian clustering technique to probabilistically assign individuals to populations based on their genotypes.
For the 181 accessions, population structure was examined using STRUCTURE. Each individual is allocated to one population, or to several populations if its genotype suggests admixture. Most genetic marker systems can be used with this technique to estimate population structure, provided that the markers are not tightly linked 7,37,39,68 . It makes no assumptions about the specific mutation process 37 . According to Scora 75 and Barrett and Rhodes 71 , there are just a few naturally occurring types of citrus (citron, pummelo, and mandarin). These studies also provide further evidence for the ancestry of most other citrus species, which are thought to be hybrids descended from these species. The trifoliate hybrids, kumquats, and citrons did not cluster as separate populations despite several runs of the analysis. This could result from the small number of such genotypes in the genotypic data and the substantial admixture that most of them exhibit. Additional molecular markers will probably be required to resolve trifoliate hybrids, citrons, and kumquats as distinct populations.
Gene annotation.
This is to be expected, as the majority of SSRs are located in the intergenic regions of both genomes (C. sinensis and C. maxima). However, only 9% of the SSRs showed notable Gene Ontology (GO) hits. SSR loci with GO terms are excellent candidates for use as DNA markers in association analysis 24,82 . Functionally defined SSR markers may make it easier to choose potential gene-based markers for validating functional annotation and for establishing marker-phenotype associations. For trait association analysis, marker-assisted selection, construction of transcript-based maps, comparative mapping, and evolutionary research, functional markers may offer benefits over anonymous markers 24,82 .

Conclusion
New molecular breeding techniques aim to overcome the limits of conventional breeding in citrus species in order to obtain new varieties with improved horticultural traits and resistance to biotic and abiotic stress. Earlier studies in citrus identified two classes of SSRs based on a maximum tract length of 20 nt. In this study, we described SSRs representing the nine chromosomes of the C. sinensis and C. maxima genomes and raised the tract length threshold to > 40 nt (extremely variable SSRs: 321 from C. sinensis and 1206 from C. maxima), because polymorphism increases with tract length. C. sinensis and C. maxima yielded a total of 108,833 and 129,321 perfect SSRs, respectively.
Through ePCR, we first evaluated the in silico amplification of 321 HvSSRs from C. sinensis and 1206 HvSSRs from C. maxima, and found 272 SSRs in C. sinensis and 935 in C. maxima that amplify a single locus in each species. Applying ePCR to seven citrus genome assemblies revealed 221 C. sinensis and 701 C. maxima SSRs to be polymorphic. In wet-lab validation, 129 HvSSRs were tested, of which 98.45% were polymorphic. The 181 genotypes were divided into 11 main groups using 17 HvSSRs. However, the genotypes were genetically dissimilar due to genetic admixture. In general, the SSR loci used in this study showed high levels of polymorphism (98.45%), which confirmed the high genetic diversity among citrus genotypes. The diverse genotypes of the present study may be selected for cross breeding and for developing mapping populations in citrus breeding programmes targeting horticultural traits and resistance to biotic and abiotic stress.

Materials and methods
The present study was conducted at Punjab Agricultural University (PAU), Ludhiana, India, from 2020 to 2022 in accordance with relevant institutional guidelines and legislation. The necessary permission for the collection of plant material was obtained from the institute.
In silico evaluation of designed SSR markers.
The Genome-wide microsatellite analyzing tool package (GMATA) software 87 was used to run an in silico ePCR amplification 88 to evaluate the amplification efficiency of the newly designed SSR primers (class I, > 30 nt) and to map the proposed markers to the genomic sequences of the nine chromosomes of C. sinensis and C. maxima. The ePCR settings were: margin 3,000; no gaps in the primer sequence; no mismatches in the primer sequence; amplicon size range 100-1,000; word size (-w) 12; and contiguous word (-f) 1.
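As a rough illustration of what an exact-match ePCR check does, the following Python sketch searches a template for a forward primer and for the reverse complement of a reverse primer, allowing no gaps or mismatches, and keeps products of 100-1,000 bp as in the settings listed above. It is not GMATA or e-PCR itself: the function names are hypothetical, the margin and word-size options are not modelled, and only one primer orientation is searched.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_all(text, pattern):
    """Yield every start index of pattern in text, including overlapping hits."""
    start = text.find(pattern)
    while start != -1:
        yield start
        start = text.find(pattern, start + 1)


def in_silico_pcr(template, forward, reverse, min_size=100, max_size=1000):
    """List (start, amplicon_size) for exact primer matches on one strand.

    The forward primer must match the template as given; the reverse primer
    must match the opposite strand, i.e. its reverse complement is searched.
    """
    template = template.upper()
    reverse_site = reverse_complement(reverse)
    reverse_starts = list(find_all(template, reverse_site))
    amplicons = []
    for f_start in find_all(template, forward.upper()):
        for r_start in reverse_starts:
            size = r_start + len(reverse_site) - f_start
            if min_size <= size <= max_size:
                amplicons.append((f_start, size))
    return amplicons
```

Under these assumptions, a primer pair would be counted as single-locus when the returned list holds exactly one amplicon, and as multi-locus when it holds more.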
The marker mapping information was processed from the ePCR results. The output file (.emap) contained information on marker amplification patterns, such as amplicon sizes, physical chromosomal positions, and whether markers mapped to unique or multiple loci. Subsequently, extremely variable SSRs (class I, > 40 nt) were tested on the nine chromosomes of C. sinensis and C. maxima to identify SSRs producing one amplicon. Finally, all identified single-locus SSR primers of the C. sinensis and C. maxima chromosomes were evaluated across five draft genome sequences of citrus species (C. clementina, C. medica, C. ichangensis, Atalantia buxifolia, and Fortunella hindsii) along with C. sinensis and C. maxima. The amplicon sizes obtained with GMATA for the highly variable SSRs across the seven citrus genomes were used to estimate various SSR marker parameters, viz., total primers (TP), total polymorphic primers (TPP), average number of alleles (N), number of different alleles (Na), major allele frequency (MAF), number of effective alleles (Ne), Shannon's information index (I), observed heterozygosity (Ho), expected heterozygosity (He), unbiased expected heterozygosity (uHe), and polymorphic information content (PIC), using GenAlEx v. 6.5 software 89 .
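To make the marker parameters concrete, the sketch below computes several of them for a single locus from diploid genotypes recorded as pairs of amplicon sizes. The calculations in the study were done in GenAlEx; the function name is hypothetical, and the formulas used here (He = 1 - sum of squared allele frequencies, Ne = 1 / sum of squared frequencies, uHe = 2N/(2N - 1) x He, Shannon's I with natural logarithms, and the Botstein-style PIC) are the commonly used definitions, assumed rather than taken from the text.

```python
import math
from collections import Counter


def locus_statistics(genotypes):
    """Summarise one SSR locus from diploid genotypes given as (allele, allele) pairs.

    Requires at least one genotype; alleles can be any hashable label,
    for example amplicon sizes in bp.
    """
    n = len(genotypes)
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [count / (2 * n) for count in allele_counts.values()]
    homozygosity = sum(p * p for p in freqs)
    he = 1.0 - homozygosity
    # PIC subtracts 2 * p_i^2 * p_j^2 for every pair of distinct alleles
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return {
        "Na": len(freqs),                                  # number of different alleles
        "MAF": max(freqs),                                 # major allele frequency
        "Ne": 1.0 / homozygosity,                          # effective number of alleles
        "I": -sum(p * math.log(p) for p in freqs),         # Shannon's information index
        "Ho": sum(a != b for a, b in genotypes) / n,       # observed heterozygosity
        "He": he,                                          # expected heterozygosity
        "uHe": he * 2 * n / (2 * n - 1),                   # unbiased expected heterozygosity
        "PIC": he - cross,                                 # polymorphic information content
    }
```

For four individuals carrying two equally frequent alleles, two of them heterozygous, this gives Ho = He = 0.5, Ne = 2, and PIC = 0.375.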
Construction of a highly saturated SSR-based physical map.
The start and end positions of all SSR loci on each chromosome of both species, as well as their major classes (classes I, II, and III), were obtained with Krait software. Circos software (http://www.circos.ca) was used to create a circular graph showing the chromosome-wise distribution of the different SSR markers 90 . Chromosome-wise scatter plots of the physical positions and tract lengths of the hypervariable SSR markers (class I, > 40 nt) were created in Microsoft Excel, and MapChart v 2.2 software 91 was used to draw the high-density SSR-based physical map of every chromosome of both citrus species from the physical locations of the hypervariable SSRs.
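The idea of locating perfect SSRs and assigning them to tract-length classes can be sketched in a few lines of Python. This is not Krait: the function names are hypothetical, the minimum repeat numbers are illustrative assumptions, and only the class I threshold (> 30 nt) comes from the text, while the 20 nt boundary between classes II and III is assumed for the example.

```python
import re

# Assumed minimum repeat counts per motif length (di- to hexanucleotide)
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. not 'ATAT')."""
    k = len(motif)
    return not any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k))


def classify(tract_length):
    """Assign a class by tract length; the II/III boundary is an assumption."""
    if tract_length > 30:
        return "I"
    if tract_length >= 20:
        return "II"
    return "III"


def find_perfect_ssrs(sequence):
    """Return sorted (start, motif, repeats, tract_length, class) tuples."""
    sequence = sequence.upper()
    hits = []
    for k, min_repeats in MIN_REPEATS.items():
        # One motif of length k followed by at least (min_repeats - 1) exact copies
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if is_primitive(motif):
                length = match.end() - match.start()
                hits.append((match.start(), motif, length // k, length, classify(length)))
    return sorted(hits)
```

The start positions and tract lengths produced this way are the kind of values that would then be passed to a plotting tool for the chromosome-wise maps.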
Experimental validation of SSR markers.
A total of 181 diverse citrus germplasm accessions were used to validate the newly designed HvSSRs (Table 9). Plants were grown in the orchard of the Department of Fruit Science, Punjab Agricultural University, Ludhiana, India. A modified CTAB procedure 92 was used to extract genomic DNA from healthy leaf samples of all citrus accessions (Table 9). The extracted DNA was quantified by 0.8% agarose gel electrophoresis and on a Thermo Scientific NanoDrop 1000 spectrophotometer and normalized to 30 ng/µl for polymerase chain reaction. For wet-lab validation, a total of 129 chromosome-wise hypervariable HvSSRCS and HvSSRCM primers (68 from C. sinensis and 61 from C. maxima) were synthesized and first screened for PCR amplification and transferability on a subset of 24 citrus germplasm accessions (marked with * in Table 9), which represent most of the citrus species and closely related genera among the 181 accessions. Subsequently, 2 markers from each chromosome were selected randomly from both species for genetic diversity analysis in the 181 citrus accessions.
PCR amplification for wet-lab validation was carried out in a final reaction volume of 10 µL (2.5 mM Taq buffer, 1.5 mM MgCl2, 0.2 mM deoxynucleotide triphosphates (dNTPs), 0.4 µM primer, and 1.0 U of Taq DNA polymerase) on a Thermo Scientific ABI thermocycler. The thermal profile consisted of initial denaturation at 94 °C for 5 min, followed by 35 cycles of denaturation at 94 °C for 1 min, annealing at a primer-specific temperature for 1.30 min, and extension at 72 °C for 1.30 min, with a final extension at 72 °C for 7 min. PCR products were separated on 3.5% molecular-grade agarose gel (VWR, Life Science, India) and visualized under UV light in a gel documentation system, and the amplicons were scored on an Alpha Innotech Alpha Imager HP System (SYNGENE, G:Box, USA). For every primer, amplified DNA fragments were scored as '1' for presence or '0' for absence, and the size (bp) of each fragment was recorded in all genotypes studied.
Population structure and phylogeny analysis.
Genotypic data were generated first from the 24 initially tested germplasm accessions and then from all 181 accessions, and were used to assess the genetic variability parameters (TP, TPP, N, Na, I, MAF, Ne, Ho, He, uHe, and PIC) with GenAlEx v. 6.5 software 89 . To study the genetic relatedness among genotypes, a dendrogram was generated from the distance matrix by unweighted pair group method with arithmetic mean (UPGMA) cluster analysis, and principal coordinate analysis (PCoA) was performed, in DARwin v. 6.0.021 software 93 .
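The UPGMA step can be illustrated with a small pure-Python sketch that starts from presence/absence ('1'/'0') band profiles. The clustering in the study was done in DARwin; the function names here are hypothetical, and the Jaccard dissimilarity is an assumed choice for the example, since the dissimilarity index used is not stated.

```python
def jaccard_distance(profile_a, profile_b):
    """Jaccard dissimilarity between two 0/1 band profiles of equal length."""
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return 1.0 - shared / either if either else 0.0


def upgma(labels, matrix):
    """Cluster by UPGMA; returns merge steps as (members_a, members_b, height)."""
    clusters = {i: (label,) for i, label in enumerate(labels)}  # id -> member labels
    dist = {(i, j): matrix[i][j] for i in clusters for j in clusters if i < j}
    next_id = len(labels)
    merges = []
    while len(clusters) > 1:
        # Join the closest pair of clusters; node height is half their distance
        (i, j), d = min(dist.items(), key=lambda item: item[1])
        a, b = clusters.pop(i), clusters.pop(j)
        merges.append((a, b, d / 2))
        # Size-weighted average distance from the merged cluster to each remaining one
        new_row = {}
        for k in clusters:
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            new_row[k] = (len(a) * d_ik + len(b) * d_jk) / (len(a) + len(b))
        dist = {pair: v for pair, v in dist.items() if i not in pair and j not in pair}
        for k, v in new_row.items():
            dist[(k, next_id)] = v
        clusters[next_id] = a + b
        next_id += 1
    return merges
```

The list of merges, read in order, describes the dendrogram from the tips towards the root; accessions joined at low heights share most of their bands.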
STRUCTURE v.2.3.4 41 was used to estimate population structure using Bayesian clustering. The admixture ancestry and correlated allele frequency models were used for K values (number of subpopulations) ranging from 1 to 10; for each K, five separate runs were performed with a burn-in length of 100,000 and 100,000 MCMC repetitions. The optimal K was determined by the delta K estimation method 42 using STRUCTURE Harvester 43 . Citrus germplasm was divided into sub-populations according to the probability of cluster assignment (Q); a Q value of 0.50 was used to allocate citrus accessions to each group.
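The delta K criterion and the Q-based assignment can be sketched as follows. This is only an illustration and not STRUCTURE Harvester: the function names and input layout are hypothetical, delta K is computed here from the run means as |mean L(K+1) - 2 mean L(K) + mean L(K-1)| divided by the standard deviation of L(K), and treating Q >= 0.50 (inclusive) as the assignment rule is an assumption.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute delta K for each interior K.

    lnp_by_k maps K to the list of log-likelihood values, one per replicate run.
    K values without both neighbours, or with zero variance, are skipped.
    """
    ks = sorted(lnp_by_k)
    means = {k: mean(lnp_by_k[k]) for k in ks}
    result = {}
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        if k - prev_k != 1 or next_k - k != 1:
            continue  # second-order difference needs consecutive K values
        sd = stdev(lnp_by_k[k])
        if sd > 0:
            result[k] = abs(means[next_k] - 2 * means[k] + means[prev_k]) / sd
    return result


def assign_cluster(q_row, threshold=0.50):
    """Return the index of the cluster with the largest Q, or None if admixed."""
    best = max(range(len(q_row)), key=q_row.__getitem__)
    return best if q_row[best] >= threshold else None
```

The K with the largest delta K would be taken as the most likely number of subpopulations, and accessions for which `assign_cluster` returns None would be reported as admixed.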
Functional gene annotation.
BLASTX was used to search the flanking regions against the GenBank non-redundant protein database to assign probable functions to the discovered SSR markers. The best-matching sequences with P < 0.001 were used to assign putative functions to each locus, and the putative functions were saved in a text file. Blast2GO analysis was used to functionally annotate the SSR loci. ... Table S2.