The complete plastome sequences of invasive weed Parthenium hysterophorus: genome organization, evolutionary significance, structural features, and comparative analysis

Parthenium hysterophorus, a globally widespread weed, poses a significant threat to agricultural ecosystems due to its invasive nature. In this study, we investigated the chloroplast genome of P. hysterophorus. Our analysis revealed that the chloroplast genome spans 151,881 base pairs (bp). It exhibits the typical quadripartite structure of chloroplast genomes, comprising inverted repeat (IR) regions of 25,085 bp, a small single-copy (SSC) region of 18,052 bp, and a large single-copy (LSC) region of 83,588 bp. A total of 129 unique genes were identified in the P. hysterophorus chloroplast genome, including 85 protein-coding genes, 36 tRNA genes, and eight rRNA genes. Comparative analysis of the P. hysterophorus plastome with those of related species from the tribe Heliantheae revealed both conserved structures and notable variations. While many structural elements were shared among the species, we identified a rearrangement in the large single-copy region of P. hysterophorus. Our study also highlighted notable divergence in several genes, namely matK, ndhF, clpP, rps16, ndhA, rps3, and ndhD. Phylogenetic analysis based on the 72 shared genes placed P. hysterophorus in a distinct clade alongside P. argentatum. The divergence time between the genus Parthenium and Helianthus (sunflowers) was estimated at approximately 15.1 million years ago (Mya). These findings provide insights into the evolutionary history and genetic relationships of P. hysterophorus, shedding light on its divergence and adaptation over time.

The circular map represents the entire structure of the P. hysterophorus plastome, which spans 151,881 bp in length. It contains a duplicated region known as the inverted repeats (IR), each accounting for 25,085 bp. The IR regions are positioned on opposite sides of the genome and are separated by two distinct regions: a small single-copy (SSC) region measuring 18,052 bp and a large single-copy (LSC) region spanning 83,588 bp (Fig. 1 and Table 1). The overall G + C content of the entire chloroplast genome is 37.6%. The GC content of the rRNA genes is greater (55.3%) than that of other parts of the plastome. Among the other studied species, P. argentatum has the largest genome (152,803 bp), with 76,636 bp of protein-coding regions. H. annuus has the smallest plastome (151,104 bp) among all species (Table 1). In P. hysterophorus, there are 129 genes, including 85 protein-coding genes, eight rRNA genes, and 36 tRNA genes. The chloroplast (cp) genome contains various protein-coding genes, including 15 genes associated with photosystem II (psbA, B, C, D, E, F, H, I, J, K, L, M, T, Z), nine genes encoding large ribosomal proteins (rpl2, 14, 16, 20, 22, 23, 32, 33, 36), 11 genes for small ribosomal proteins (rps2, 3, 4, 7, 8, 11, 12, 14, 15, 18, 19), five genes related to photosystem I (psaA, B, C, I, J), and six genes responsible for ATP synthesis and the electron transport chain (atpA, B, E, F, H, I). Notably, the psbL gene is not found in the plastome. Similarly, 17 protein-coding genes contained introns, of which three genes (clpP, rps12, ycf3) comprised two introns, while

Structural comparison for P. hysterophorus plastome with related species
Structural analysis of the P. hysterophorus plastome revealed that, like most other Asteraceae plastomes, it exhibits a high level of sequence similarity and structural conservation (Fig. 2). The analysis provided evidence for a rearrangement in the LSC region of the genome. This rearrangement involves two inversions: a relatively large inversion spanning approximately 22.8 kilobases (kb) and a smaller inversion of around 3.3 kb nested within the larger one. Notably, this rearrangement is observed across all species belonging to the subfamilies Cichorioideae, Carduoideae, Mutisioideae, and Asteroideae (Fig. 2). Synteny visualizations were used to identify similarities and differences among these genomes. The results demonstrated that P. hysterophorus is closely related to all the related species from Heliantheae and exhibits high levels of synteny and similarity (Fig. 2). However, a comparison with Arabidopsis, a model plastome, confirmed the large inversion in the LSC region, as reported above (Fig. S2).
The availability of multiple complete Asteraceae plastomes offers a valuable opportunity to investigate sequence variation within the family at the genome level. Using the VISTA program and the annotation of P. hysterophorus as a reference, we aligned and plotted the plastomes of 12 Asteraceae species (Fig. 3). The overall alignment of these genomes is predominantly conserved, with a limited number of divergent regions. As observed in other angiosperms, the coding regions are more conserved than their non-coding counterparts. Several regions show more divergence, including trnH-psbA, matK, rps16-trnE, trnR-psbD, ndhC-trnV, ycf3-trnS, clpP, petB, ycf1, rpoA, rpl32, and ndhF, and the divergence is most pronounced in A. artemisiifolia. Similarly, in P. argentatum, sequence divergence is higher from psbI-trnC to trnE-rpoB. The trnT, trnS, and psaA-ndhJ regions also showed more divergence. In X. sibiricum and S. calendulacea, trnV exhibited greater divergence than in other species (Fig. 4). On the other hand, A. artemisiifolia, H. annuus, A. anchusifolia, I. heterophylla, and T. diversifolia showed more divergence in the accD-psaI and trnL-ycf2 regions (Fig. 4). In S. calendulacea, the trnN-ndhF region is missing, while in P. argentatum, trnN showed significant divergence. A. artemisiifolia, H. annuus, A. anchusifolia, I. heterophylla, and T. diversifolia also showed high divergence in the ycf2 gene compared to P. hysterophorus. In a pairwise sequence divergence analysis, P. hysterophorus exhibited the highest divergence (0.07) with S. calendulacea, followed by X. sibiricum (0.02), and showed the lowest divergence with the previously sequenced P. hysterophorus (0.00007), followed by P. argentatum (0.018) (Table S1).
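The pairwise and windowed divergence values discussed above can be illustrated with a small sketch. It computes an uncorrected p-distance (the proportion of differing sites) between two aligned sequences, both overall and in sliding windows. The 200 bp window and 100 bp step follow the caption of Fig. 4; the choice of p-distance as the metric, the function names, and the handling of gaps are assumptions for illustration, since the text does not state which software or distance measure was used.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (an assumption of this sketch, not a detail taken from the paper).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differing += 1
    return differing / compared if compared else 0.0


def sliding_p_distance(seq_a: str, seq_b: str, window: int = 200, step: int = 100):
    """Return (window_start, p_distance) for every full window of the alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (start, p_distance(seq_a[start:start + window], seq_b[start:start + window]))
        for start in range(0, len(seq_a) - window + 1, step)
    ]
```

Windows with high values in such a scan correspond to the divergent regions listed above, while conserved regions give values near zero.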

Contraction and expansion of IRs
The borders of LSC-IRb and SSC-IRa in the plastome of P. hysterophorus were compared to 11 other closely related species, including A. anchusifolia, A. artemisiifolia, E. angustifolia, H. annuus, I. heterophylla, P. argentatum, P. hysterophorus, T. diversifolia, S. integrifolium, S. calendulacea, and X. sibiricum. All species had an intact

Repeat sequence analysis
Different types of repeats were examined in P. hysterophorus and compared with those of related species. The results showed that P. hysterophorus contains a total of 16 palindromic repeats, 14 forward repeats, 18 reverse repeats, and 45 tandem repeats. In P. argentatum, the corresponding numbers were 11, 21, 16, and 52, respectively. Among the related plastomes, the highest numbers of tandem (52) and reverse (28) repeats were found in A. artemisiifolia (Fig. 7). Among the other species, X. sibiricum and S. calendulacea possess the highest numbers of palindromic and forward repeats, i.e. (22, 22) and (20, 26), respectively, while A. artemisiifolia has the fewest palindromic repeats (4). A. artemisiifolia has the most reverse repeats (28), followed by S. integrifolium (27). On the other hand, X. sibiricum and S. calendulacea have the fewest reverse repeats, 8 and 4, respectively. In the case of tandem repeats, A. artemisiifolia has the highest number (62). With respect to repeat length, most palindromic, forward, and reverse repeats were 21-30 bp long, while the majority of tandem repeats were 11-20 bp long in all plastomes. In A. artemisiifolia, about 16 and 12 repeats were 61-70 and 71-80 bp in length, respectively (Fig. 7).

Simple sequence repeats (SSRs) analysis
In the P. hysterophorus plastome, a total of 40 SSRs were detected, all of them mononucleotide repeats.
The highest numbers of SSRs were observed in I. heterophylla (47), with 43 mononucleotides, two dinucleotides, and two trinucleotides, and in X. sibiricum (46), with 45 mononucleotides and one dinucleotide. About 45 SSRs were observed in S. integrifolium, followed by A. anchusifolia (44) and T. diversifolia (42). No tetra- or pentanucleotide SSRs were detected in any plastome. In P. hysterophorus, the mononucleotide SSRs consisted of A (47.5%) and T (52.5%) motifs (Fig. 8). The highest C motif proportion (45%) was observed in X. sibiricum, only one C motif was observed in each of five species (S. integrifolium, H. annuus, A. anchusifolia, I. heterophylla, and T. diversifolia), and no C motif was detected in the plastomes of the Parthenium species. Only one dinucleotide SSR with an AT motif was observed in X. sibiricum; one TA motif was observed in each of S. calendulacea, H. annuus, and A. anchusifolia, and two TA motifs were observed in the I. heterophylla plastome. Similarly, two trinucleotide motifs (GAA) were observed in T. diversifolia.
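As an illustration of how perfect SSRs of the kind counted above can be located, the following sketch scans a sequence for mono-, di-, and trinucleotide repeats with a regular expression. The function name and the minimum repeat counts (10, 5, and 4 units) are hypothetical choices for this example; the paper does not report the tool or the exact thresholds used beyond a minimum SSR length of ten base pairs.

```python
import re


def _is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'AA')."""
    k = len(motif)
    return not any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k))


def find_ssrs(sequence: str, min_repeats=None):
    """Return (motif, repeat_count, start) for each perfect SSR, sorted by start.

    min_repeats maps motif length to the minimum number of repeat units.
    The default thresholds are assumptions made for this sketch.
    """
    if min_repeats is None:
        min_repeats = {1: 10, 2: 5, 3: 4}
    seq = sequence.upper()
    hits = []
    for unit_len, n_min in min_repeats.items():
        # A captured unit followed by at least (n_min - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n_min - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if not _is_primitive(motif):
                continue  # e.g. 'AA' repeats are already reported as 'A' repeats
            hits.append((motif, len(match.group(0)) // unit_len, match.start()))
    return sorted(hits, key=lambda hit: hit[2])
```

Counting the motifs returned by such a scan gives per-species tallies like those summarized in Fig. 8.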

Phylogenetic analysis
This study conducted a comprehensive analysis to determine the phylogenetic position of P. hysterophorus within the Asteraceae family, specifically the Heliantheae tribe, represented here by 75 members of 11 genera. The sequences of 72 genes shared among these members were aligned. Two widely used methods, maximum likelihood (ML) and Bayesian inference (BI), were employed to infer the evolutionary relationships. In the ML analysis, 40 of the 72 nodes had a bootstrap value of 95% or higher, indicating robust support for their placement (Fig. 9). In the phylogenetic trees constructed from the 72 shared gene sequences, P. hysterophorus formed a distinct clade with P. argentatum. Both bootstrap analysis and Bayesian inference consistently supported this clustering. Analysis of multiple data sets revealed that P. hysterophorus shares a close evolutionary relationship with I. heterophylla and the genus Helianthus. Similarly, T. diversifolia was found to be closely related to the genus Aldama. Additionally, the genera Echinacea, Xanthium, and Ambrosia clustered together with strong statistical support, indicating their shared evolutionary history. Conversely, the genera Eclipta, Sphagneticola, and Silphium formed a distinct clade at the base of the phylogenetic tree. Using the Bayesian approach implemented in BEAST, the divergence time between Parthenium and Helianthus was estimated at approximately 15.1 million years ago (Mya), with a 95% highest posterior density (HPD) interval of 11.2-22.25 Mya (Fig. 10). This analysis also suggested that the Heliantheae tribe diverged around 22-26 Mya, during the early Miocene. The TimeTree web tool was used to further verify these results (Fig. S3), yielding similar estimates and supporting the findings derived from the maximum likelihood (ML) and maximum parsimony (MP) methods.

Discussion
In the present study, the complete plastome of P. hysterophorus was analyzed, revealing a length of approximately 151.8 kilobase pairs (kbp) (Table 1 and Fig. 1). Like other angiosperms, the P. hysterophorus genome displayed a characteristic quadripartite structure (Fig. 1). In terms of gene content, the P. hysterophorus chloroplast genome was found to encode 129 genes, comprising 85 protein-coding genes, eight ribosomal RNA genes, and 36 transfer RNA genes. Additionally, the genome exhibited 40 microsatellites scattered throughout its sequence. The study also identified various types of repeats in the P. hysterophorus chloroplast genome: 14 forward, 45 tandem, 18 reverse, and 16 palindromic repeats were detected (Fig. 7). The findings on gene content and repetitive elements in the chloroplast genome of P. hysterophorus align with previously reported observations in other members of the Asteraceae family, including P. argentatum 23, Helianthus annuus 21, and Helianthus giganteus 33, as well as other related species 34. The protein-coding gene rps12 is unevenly distributed within the genome. Specifically, its 5' terminal exon is situated in the large single-copy (LSC) region, while two copies of the 3' terminal exon and intron are found within the inverted repeats (IRs). This distribution pattern of rps12 is consistent with observations made in other angiosperm plastomes 34,35. Hence, the positioning of rps12 exons and introns in different regions of the chloroplast genome is a feature shared among various flowering plant species.
In the chloroplast genome of P. hysterophorus, we found fifteen genes with introns. Thirteen genes had a single intron, while ycf3, clpP, and rps12 had two introns each. The longest intron was observed in the rpoC1 gene, spanning 1,636 base pairs, followed by the ndhB gene with an intron length of 776 base pairs. These introns are important for regulating gene expression. Recent studies indicate that strategically positioned introns can boost the expression of introduced genes 36. Thus, introns can serve as valuable tools for improving the efficiency of genetic transformation. Interestingly, it has been noted that genes such as ycf1, ycf2 37,38, rpl23 39, and accD 40,41 are often absent in plant genomes. However, these genes were detected in the reported P. hysterophorus plastomes, consistent with findings in other members of the Asteraceae family 41,42.
We identified 93 repeat sequences in the chloroplast (cp) genome of P. hysterophorus. These repeats consist of reverse, forward, tandem, and palindromic sequences. Repeat sequences are highly valuable in studying the evolutionary relationships of species 43,44. They also play a significant role in genome rearrangements 44. Previous investigations of various plastomes have demonstrated the role of repeat sequences in causing insertions and substitutions 45,46. In the case of P. hysterophorus, the identified repeats were relatively short, ranging from 11 to 20 base pairs. Similar results have been reported in plastomes of other plant species from the Asteraceae family 21,23,34. However, longer repeats have been observed in other plant families, such as a 132-base pair repeat in Poaceae and a 287-base pair repeat in Fabaceae 47. Longer repeats can contribute significantly to sequence variation and rearrangement within the genome through mechanisms such as slipped strand mispairing and improper recombination, as discussed earlier 21,48. These repeats have been identified as significant hotspots for genome reconfiguration 48. Beyond their impact on genomic stability, they also serve as resources for developing genetic markers for phylogenetic and population studies of P. hysterophorus and its closely related species. We analyzed perfect simple sequence repeats (SSRs) within the plastome of P. hysterophorus and compared them with those of ten closely related species belonging to the Heliantheae tribe.
SSRs are regions of DNA that tend to mutate at a higher rate due to the slipping of DNA strands. These regions exhibit significant variation in the number of repeat units within the chloroplast genome, making them valuable molecular markers for studying plant population genetics, evolution, and ecology 49. In our study, we focused on identifying SSRs that were ten base pairs or longer, as these have been suggested to be more susceptible to slipped strand mispairing, which is considered the primary mechanism for generating SSR polymorphisms 50,51. Our investigation revealed 40 SSRs in the plastome of P. hysterophorus, all of which were mononucleotide SSRs. Furthermore, 37, 38, 45, and 46 SSRs were identified in the plastomes of P. argentatum, E. angustifolia, S. integrifolium, and X. sibiricum, respectively. These findings align with previous research indicating that chloroplast genome SSRs are predominantly composed of mononucleotide repeats of 'A' or 'T' 52,53. Our findings are also in line with previous studies that have consistently highlighted the prevalence of polythymine (polyT) or polyadenine (polyA) repeats in plastomes. These SSRs have been observed to be more abundant than tandem cytosine (C) and guanine (G) repeats, which are relatively uncommon 54,55. The presence of polyT or polyA repeats contributes significantly to the overall composition of the P. hysterophorus plastome. This observation is consistent with earlier investigations across different species, which indicate a high proportion of 'AT' base pairs 23,56. Such 'AT'-rich regions have also been reported in previous studies, emphasizing the correlation between repetitive patterns and the prevalence of 'AT' base pairs in plastomes.
According to genome synteny and comparison analysis, the plastome of P. hysterophorus shows significant sequence similarity with other species belonging to the Heliantheae tribe (Fig). This analysis also confirms the presence of a rearrangement in the large single-copy (LSC) region, involving a double inversion spanning 25 kb, which has been previously reported in other members of the Asteraceae and a few other families 23,57-59. We identified substantial sequence congruence between P. hysterophorus and its closely related species. Nevertheless, our sequence analysis also revealed noteworthy divergence within specific genomic regions, resulting in relatively lower identity between the species in these regions. Furthermore, consistent with previous findings on plastomes of related species 35,46,60,61, the LSC and SSC regions exhibited lower similarity than the two inverted repeat (IR) regions in all the studied plastomes. This suggests that the IR regions are more conserved across these species.
The expansion and contraction at the borders of the inverted repeats (IRs) are major factors contributing to size variation among plastomes and play a crucial role in evolution 63-65. To investigate these variations, the two IRs and two single-copy regions of the plastomes of P. argentatum, A. anchusifolia, A. artemisiifolia, E. angustifolia, H. annuus, I. heterophylla, T. diversifolia, S. integrifolium, S. calendulacea, and X. sibiricum were compared with those of P. hysterophorus. Notably, no significant differences were observed in the length of the IRs among these plastomes. However, certain genes at the junctions of the IRs and single-copy regions, such as rps19, ycf1, and rpl2, exhibited slight variations (Fig. 6).
Previous studies have extensively used plastid genes to support the monophyly of Asteraceae 66. These studies have also identified 45 tribes within the family, organized into 13 subfamilies 1,67. Plastid sequences have been crucial in determining the relationships between Asteraceae subfamilies and most tribes 68,69. However, some uncertainties still exist in these relationships. The use of plastomes in phylogenetic studies and molecular systematics has greatly improved the understanding of complex evolutionary relationships among angiosperms 34,68-71. Consequently, in this study, we used 72 shared protein-coding genes from 75 representatives of 11 genera to establish the phylogenetic position of P. hysterophorus within the tribe Heliantheae. Both Bayesian inference (BI) and maximum likelihood (ML) methods were employed for the phylogenetic analysis (Fig. 9). The results revealed that P. hysterophorus and P. argentatum are closely related, which was strongly supported by a 100% bootstrap value and by Bayesian inference. This close relationship agrees with the phylogenetic analyses reported in ref. 72. Additionally, the position of P. hysterophorus within Heliantheae, as confirmed by this study, aligns with the previously published phylogeny 72,73. According to the Bayesian approach implemented in BEAST, the estimated divergence time between Parthenium and Helianthus is approximately 15.1 million years ago (Fig. 10). Furthermore, the tree generated by BEAST exhibited a topology consistent with that produced by the maximum likelihood (ML) analysis. These findings were also corroborated by the transcriptome-based study in ref. 72. Our findings align with the results obtained from TimeTree, which indicated that the adjusted divergence time between Parthenium and Helianthus was approximately 15.0 million years ago (Mya) (Fig. 10

Chloroplast DNA extraction, sequencing, and assembly
High-quality DNA was extracted from young, immature leaves of P. hysterophorus. The leaves were first ground into a fine powder in liquid nitrogen so that the DNA would be released from the cells effectively. DNA was then isolated with the DNeasy Plant Mini Kit from Qiagen (Valencia, CA, USA), following the kit's protocol. The chloroplast DNA was sequenced on an Illumina HiSeq-2000 platform at Macrogen (Seoul, Korea), generating approximately 475,610,881 raw reads for P. hysterophorus. To ensure the reliability and accuracy of the analysis, low-quality sequences were filtered out: reads with a Phred score of less than 30 were removed, so that only high-quality sequences were retained for further analysis. The plastomes were assembled using two different methods. First, we used the GetOrganelle v1.7.5 pipeline 74, a tool specifically designed for organelle genome assembly. Additionally, we used SPAdes version 3.10.1 (http://bioinf.spbau.ru/spades) as an assembler to improve the accuracy and reliability of the assembly.
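The quality-filtering step described above can be sketched as follows. The example decodes FASTQ quality strings and keeps reads whose mean Phred score is at least 30. The Phred+33 encoding, the use of a per-read mean (rather than a per-base cutoff or trimming), and the function names are assumptions for illustration; the text gives only the threshold of 30 and does not name the filtering tool.

```python
def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string (Phred+33 encoding assumed)."""
    if not quality:
        return 0.0
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def filter_reads(fastq_lines, min_mean_quality: float = 30.0):
    """Return (header, sequence, quality) for reads passing the quality threshold.

    fastq_lines is a list of newline-stripped lines in standard four-line
    FASTQ records: header, sequence, separator, quality.
    """
    kept = []
    for i in range(0, len(fastq_lines) - 3, 4):
        header, sequence, _separator, quality = fastq_lines[i:i + 4]
        if mean_phred(quality) >= min_mean_quality:
            kept.append((header, sequence, quality))
    return kept
```

In practice a dedicated read-trimming tool would handle adapters and paired reads as well; the sketch only shows how a Phred threshold translates into a keep-or-discard decision.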

Genome annotation
The annotation of the plastomes involved several steps using established tools and software. CpGAVAS2 75 and DOGMA (http://dogma.ccbb.utexas.edu/, China) 76, widely recognized online tools for genome annotation, were used to carry out the initial annotation. Additionally, tRNAscan-SE 77 was employed to identify tRNA genes within the plastomes. To ensure the accuracy of the annotations, the plastomes were compared with reference genomes using Geneious Pro v.10.2.3 78 and tRNAscan-SE (v.1.21) 77. This step allowed for the identification of start and stop codons, the determination of intron boundaries, and manual corrections where necessary. To visualize the structural features of the plastomes, chloroplot, a tool developed by 79, was used. Furthermore, the genomic

Conclusion
In this study, we sequenced and analyzed the complete chloroplast genome of P. hysterophorus and compared it to those of related species in the Asteraceae family. Our analysis revealed that the chloroplast genome of P. hysterophorus has a total length of 151,881 bp. Both structural similarities and notable variations were found when comparing the P. hysterophorus plastome to those of related species. Moreover, several genes, including matK, ndhF, clpP, rps16, ndhA, rps3, and ndhD, showed significant divergence in our analysis. The analysis provided evidence for a rearrangement (inversions) in the LSC region of the plastome. The phylogenetic analysis revealed that P. hysterophorus shares a close evolutionary relationship with I. heterophylla and the genus Helianthus. The divergence time between Parthenium and Helianthus was estimated at approximately 15.1 million years ago (Mya). Our findings provide insights into the genetic characteristics and evolutionary history of P. hysterophorus. This study contributes to our understanding of plastomes in the Asteraceae family and can serve as a resource for further research on P. hysterophorus and related species.

Figure 1. Legend (gene functional groups): photosystem I; photosystem II; cytochrome b/f complex; ATP synthesis; NADH dehydrogenase; RubisCO large subunit; RNA polymerase; small ribosomal protein; large ribosomal protein; clpP, matK, infA; hypothetical reading frame; transfer RNA; ribosomal RNA; other.

Figure 4. Sliding window analysis of nucleotide variability among P. hysterophorus and related plastomes (window length: 200 bp; step size: 100 bp). (A) Nucleotide variability between P. hysterophorus and P. argentatum. (B) Nucleotide variability among P. hysterophorus and eleven related plastomes from Heliantheae. (C) Heatmap showing pairwise sequence distance of 66 genes from P. hysterophorus and related plastomes from Heliantheae.

Figure 5. Summary of genes lost across P. hysterophorus and related species plastomes. Blue indicates missing genes, green indicates single-copy genes, and red indicates genes duplicated in the plastomes.

Figure 6. Distances between adjacent genes and junctions of the small single-copy (SSC), large single-copy (LSC), and two inverted repeat (IR) regions among P. hysterophorus and related plastomes. Boxes above and below the primary line indicate the adjacent border genes. The figure is not to scale with respect to sequence length and only shows relative changes at or near the IR/SC borders.

Figure 7. Repetitive sequences in P. hysterophorus and eleven related plastomes. (A) Total number of repetitive sequences. (B) Lengthwise frequency of palindromic repeats. (C) Lengthwise frequency of forward repeats. (D) Lengthwise frequency of reverse repeats. (E) Lengthwise frequency of tandem repeats.

Figure 8. Analysis of the simple sequence repeats (SSRs) in P. hysterophorus and eleven related plastomes. (A) Total number of SSRs in the genomes. (B) Frequency of the simple sequence repeat motifs in the chloroplast genomes of P. hysterophorus and eleven related plastomes.

Figure 9. Phylogenetic trees constructed from 72 commonly shared genes among 75 members of the Heliantheae tribe, representing 11 different genera, using two methods: Bayesian inference (BI) and maximum likelihood (ML). Numbers above the branches are the posterior probabilities of BI and the bootstrap values of ML. The dot represents the position of P. hysterophorus.

Figure 10. Divergence time estimates of P. hysterophorus based on 72 commonly shared genes among 75 members of the Heliantheae tribe, representing 11 different genera. The GTR + G substitution model was used with four rate categories, and a Yule tree speciation model was applied with a lognormal relaxed clock model in BEAST. The 95% highest posterior density credibility intervals are shown for the node ages in circles (Mya). Numbers indicate date estimates for different nodes. A geological time scale is shown at the bottom of the figure.
and Fig. S3). These results are in line with previous reports on the estimation of the divergence time of the Heliantheae tribe (Fig. S3).

copy of the rps19 gene across the LSC/IRb (JLB) border. The rpl22 gene is located in the LSC region in all species. The rps19 gene spans the JLB junction, with 187 bp on the LSC side and 92 bp in the IRb region in P. hysterophorus; 184 bp on the LSC side and 95 bp in the IRb region in P. argentatum, X. sibiricum, and S. integrifolium; 177 bp in the LSC and 102 bp in the IRb region in H. annuus and I. heterophylla; 179 bp in the LSC and 100 bp in the IRb region in E. angustifolia; and 182 bp in the LSC and 97 bp in the IRb region in S. calendulacea. However, in A. artemisiifolia, the rps19 gene lies in the LSC region, 63 bp away from the JLB junction (Fig. 6). The rpl2 gene lies in the IRb region, close to the JLB border. The ycf1 gene spans the JSB border except in P. argentatum, where it is located in the SSC region, and in A. artemisiifolia, where it spans the JLA border. In S. calendulacea, 597 bp of the ycf1 gene lie in the IRb region and only 5 bp in the SSC region. Similarly, the ndhF gene is located close to the JSA border on the SSC side except in P. argentatum, while in S. calendulacea it is located near the JSB border in the SSC region. The trnH gene occurs intact at the JLA junction toward the LSC region. The trnN gene only occurs in S. calendulacea in the IRa region.
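The rps19 border positions reported above can be cross-checked with a short worked example. The sketch tabulates the number of base pairs of rps19 on each side of the JLB junction for the species where the gene spans the border, and sums the two parts per species. The variable names are illustrative, and the assignment of values to species follows our reading of the sentence above.

```python
# (bp on the LSC side, bp in the IRb region) of rps19 at the JLB junction,
# as read from the text; A. artemisiifolia is omitted because its rps19
# lies entirely within the LSC region.
rps19_at_jlb = {
    "P. hysterophorus": (187, 92),
    "P. argentatum": (184, 95),
    "X. sibiricum": (184, 95),
    "S. integrifolium": (184, 95),
    "H. annuus": (177, 102),
    "I. heterophylla": (177, 102),
    "E. angustifolia": (179, 100),
    "S. calendulacea": (182, 97),
}

# Total span of rps19 across the junction for each species.
rps19_total = {species: lsc + irb for species, (lsc, irb) in rps19_at_jlb.items()}

# Shift of the junction relative to P. hysterophorus, measured on the LSC side.
reference_lsc = rps19_at_jlb["P. hysterophorus"][0]
junction_shift = {
    species: reference_lsc - lsc for species, (lsc, _irb) in rps19_at_jlb.items()
}
```

Every pair sums to the same total, so the differences among species reflect a shift of the IR boundary by a few base pairs within a gene of constant length rather than a change in the gene itself.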