Introduction

Airborne fungi and their spores are the most common bio-aerosols inhaled by humans. They cause numerous health problems in humans, notably allergic rhinitis and asthma1. Alternaria alternata is a ubiquitous, saprophytic, airborne fungus commonly found in the environment. A. alternata plays a major role in the development of asthma in sensitized individuals2, and another study reported that approximately 80% of asthmatic patients are sensitized to A. alternata3. Several life-threatening exacerbations of asthma have also been strongly linked to A. alternata exposure4. Furthermore, A. alternata is strongly associated with the development of Type I hypersensitivity, triggering an IgE response against allergens and leading to the release of pharmacological mediators, such as histamine, from IgE-sensitized mast cells, resulting in acute inflammatory reactions like asthma or rhinitis. The Global Asthma and Allergy European Network and the National Health & Nutrition Examination Survey have reported the occurrence of allergic rhinitis in the United States and around Europe at 12.9% and 8.9%, respectively5.

Alternaria spores are the most common fungal allergens and are found in humid climates as well as in arid regions4. Spore concentrations are higher in outdoor environments, but the spores may also enter indoor environments through infiltration or ventilation, and they can be carried in by indoor occupants6. Because of their large size (23–34 µm × 7–10 µm), the spores are frequently found in indoor dust particles and cannot be eliminated through normal filtration methods. Although Alternaria spores are more common outdoors, the indoor environment is a secondary source of exposure, because invading spores colonize building materials and indoor spaces7. To date, 17 potential allergens of A. alternata have been reported on the International Union of Immunological Societies (IUIS) website and in the Allergome database8. Among these 17 allergens, Alt a 1 is considered the major allergen, and it was found in more than 85% of Alternaria-induced allergic patients9.

Allergens are a class of proteins that can elicit powerful T helper lymphocyte type 2 (Th2) responses, culminating in the excessive IgE antibody production that underlies allergic diseases. Antigen recognition and uptake by innate immune cells is the primary defense step and, in allergic reactions, leads to excessive IgE antibody production, which sensitizes and triggers mast cells10. Type I hypersensitivity reactions are classified into immediate and late-phase reactions. The immediate reaction occurs within a few minutes (5–30 min) and subsides within 60 min, resulting in the release of inflammatory mediators. The late-phase reaction occurs in 3 to 4 h (starting in 2–8 h and lasting 2–3 days) and is a cell-driven process leading to cellular infiltration and mediator release11.

Asthma and allergic inflammation result from dysregulated Th2-like airway inflammatory responses to the environment12. These responses are mediated by CD4+ T cells polarized towards the Th2 phenotype, which help B cells express IgE13. Th2 cells also interact with other cells, such as eosinophils through IL-5, smooth muscle cells through IL-9, epithelial cells and keratinocytes through IL-13, and epithelial cells through IL-31, driving the pathogenic characteristics of asthma, such as increased IgE, airway hyperresponsiveness, excessive mucus production, airway remodeling, and airway eosinophilia14. The interaction between the cytokines IL-5 and IL-13 and their receptors on the B-cell surface is the first signal for antibody class switching. The second signal involves an interaction between CD40 on the B cell and CD40L on the T cell, leading to the production of IgE antibodies. Transcription factors such as BSAP (B cell-specific activator protein), NF-κB (nuclear factor kappa B), and STAT6 (signal transducer and activator of transcription 6) have been identified as binding the promoter sequences of various genes associated with allergic reactions15. Upregulation of mRNAs for chemokines (eotaxin, MIP-1α, MIP-2) and chemokine receptors (CCR-1, CCR-2, and CCR-5) was observed in lung allergy patients infected with A. alternata spores16. Expression of CCR-3 in the lungs and secretion of Th2 cytokines (IL-4, IL-5, and IL-13) in the BAL (bronchoalveolar lavage) were also observed.

A. alternata is a major fungal pathogen with harmful effects on human and plant health. In the present study, we identified potential allergens among the hypothetical proteins (HPs) of A. alternata. Sequence- and structure-based computational approaches were used to characterize the functions of the HPs. Decoding the complete proteome aids in the identification of potential virulence factors and allergenic proteins of the pathogen and thereby paves the way for drug and vaccine discovery. Numerous reports in the literature describe the use of bioinformatics-based approaches for the design and development of drugs and highly effective vaccines to treat infections caused by bacterial pathogens17,18,19,20,21,22.

Materials and methods

Sequence retrieval and dataset analysis

The complete HP sequences of A. alternata were retrieved from the NCBI database in FASTA format using their primary accession numbers. All 11,227 HP sequences were subjected to computational prediction to identify potential allergens. The identified allergens were then functionally annotated using a well-optimized series of bioinformatics tools.
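As a minimal sketch of how retrieved FASTA records might be loaded for the downstream predictions, the function below maps each accession to its sequence. The function name `parse_fasta` and the two example records are hypothetical and are not taken from the study's dataset.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each record's accession (first word of the header) to its sequence."""
    records: Dict[str, str] = {}
    accession = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            words = line[1:].split()
            accession = words[0] if words else ""
            records[accession] = ""
        elif accession is not None:
            # Sequence lines may be wrapped, so append to the current record.
            records[accession] += line
    return records


# Hypothetical two-record input; identifiers and residues are invented.
example = """>HP_0001 hypothetical protein
MKTAYIAK
QRQISFVK
>HP_0002 hypothetical protein
MSDNE
""".splitlines()

sequences = parse_fasta(example)
print({name: len(seq) for name, seq in sequences.items()})
```

In practice a maintained parser would normally be preferred; the sketch only shows the record structure that the later steps rely on.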

Allergen prediction: AlgPred and SDAP

AlgPred, a web-based allergen prediction tool (http://www.imtech.res.in/raghava/algpred), was used to predict the possible allergenic proteins among the 11,227 HPs of A. alternata23. AlgPred also predicts potential IgE epitopes in the submitted HPs. The tool uses several approaches to predict allergenic proteins, including motif-based techniques, machine learning, and a hybrid approach24. A protein predicted to be an allergen by most of the approaches has a high probability of being allergenic. The possible allergenic proteins predicted by AlgPred were submitted to the Structural Database of Allergenic Proteins (SDAP; http://fermi.utmb.edu/sdap/) to investigate cross-reactivity with known allergens25. To detect distantly related sequences, the physical–chemical amino acid descriptors E1–E5 were used to locate sequences with similar chemical properties. With these descriptors, the similarity between two sequences is measured by the property distance function PD. Each amino acid is represented as a vector, and these vectors were generated by metric multidimensional scaling of 237 physical–chemical properties of the 20 naturally occurring amino acids.
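The property distance idea can be illustrated with a minimal sketch. It assumes two equal-length, ungapped peptides and averages the Euclidean distance between their per-residue descriptor vectors; any weighting that SDAP applies to the individual descriptors is not stated here, so this unweighted form is an assumption. The vectors in `TOY_DESCRIPTORS` are invented placeholders, not the published E1–E5 values.

```python
import math
from typing import Dict, Sequence

# Toy 5-component descriptor vectors. These numbers are invented for
# illustration only and are NOT the published E1-E5 descriptors.
TOY_DESCRIPTORS: Dict[str, Sequence[float]] = {
    "A": (0.0, 0.1, 0.2, 0.0, 0.1),
    "L": (0.3, 0.1, 0.2, 0.0, 0.1),
    "K": (0.0, 0.5, 0.2, 0.4, 0.1),
}


def property_distance(seq_a: str, seq_b: str,
                      descriptors: Dict[str, Sequence[float]] = TOY_DESCRIPTORS) -> float:
    """Mean per-position Euclidean distance between descriptor vectors.

    Assumption: unweighted distance over ungapped, equal-length peptides.
    A value of 0 means the two peptides have identical descriptor profiles.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("peptides must be non-empty and of equal length")
    total = 0.0
    for res_a, res_b in zip(seq_a, seq_b):
        total += math.dist(descriptors[res_a], descriptors[res_b])
    return total / len(seq_a)


print(property_distance("ALK", "ALK"))  # identical peptides give zero distance
print(property_distance("ALK", "LLK"))  # one substitution gives a small distance
```

Smaller distances indicate peptides with more similar physical–chemical profiles, which is how distantly related sequences can be located even when their residue identity is low.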

Physicochemical properties and sub-cellular localization

The ProtParam tool on the Expasy server (http://web.expasy.org/protparam/) and PDB Goodies were used to compute the physicochemical properties of the HPs26,27. Molecular weight, aliphatic index, isoelectric point, instability index, extinction coefficient, and grand average of hydropathicity (GRAVY) were calculated for the 10 selected HPs (Table 1). The WoLF PSORT28 and CELLO29 servers were used to predict the subcellular localization of the potential allergens. WoLF PSORT converts the given sequences into numerical localization features based on sorting signals, amino acid composition, and functional motifs. A simple k-nearest neighbor classifier is then used to predict the subcellular location. CELLO uses a two-level SVM (Support Vector Machine) classifier and a homology search method to annotate the subcellular localization of HPs (Table 2).

Table 1 Physico-chemical properties predicted for the potential allergens of A. alternata.
Table 2 Sub-cellular localization annotation of hypothetical proteins of A. alternata.
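Two of the properties listed for Table 1 can be computed directly from a sequence, and the sketch below shows how. GRAVY is taken as the mean Kyte-Doolittle hydropathy value of all residues, and the aliphatic index follows the usual formula X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), with X in mole percent. The function names are hypothetical and the test peptides are invented; this is an illustration, not the ProtParam implementation.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[res] for res in sequence) / len(sequence)


def aliphatic_index(sequence: str) -> float:
    """Relative volume occupied by aliphatic side chains (Ala, Val, Ile, Leu)."""
    n = len(sequence)

    def mole_percent(res: str) -> float:
        return 100.0 * sequence.count(res) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Invented test peptide, used only to exercise the functions.
peptide = "AVIL"
print(round(gravy(peptide), 3), round(aliphatic_index(peptide), 1))
```

A positive GRAVY value points to a hydrophobic protein and a negative value to a hydrophilic one, while a higher aliphatic index is generally read as a sign of greater thermostability.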

Sequence-based functional annotation

The allergens identified in A. alternata were analyzed with CDD (Conserved Domain Database), InterPro, and Pfam to characterize the functional domains in the HP sequences30,31,32. CDD inspects the functional characteristics of protein sequences using the heuristic BLAST algorithm, searching against a complete collection of domains to identify structural and functional domains33. InterProScan combines multiple resources for motif discovery and predicts protein domains, families, and functional sites. Protein sequence motifs are signatures of protein families that are often used to predict protein function; in metabolic enzymes especially, these motifs are associated with catalytic functions. Pfam families are defined by multiple sequence alignments and profile hidden Markov models (HMMs) built from family-representative sequences. Pfam uses the HMM algorithm to search the target sequence against the UniProt Knowledgebase (UniProtKB) to predict family relationships34.

Structure modelling and validation

Protein structural folds are more highly conserved than sequences. Thus, structure-based functional annotation of the HPs is considered more reliable than sequence-based function assignment. The three-dimensional structures of the predicted allergens were modelled using Phyre2 (comparative homology modelling)35 and Robetta (de novo)36. In the absence of structural homologs in the repositories, Robetta builds the three-dimensional structure of the target protein by the de novo fragment insertion method. A Monte Carlo local structure search algorithm was used for energy minimization and optimization. Both knowledge-based and physically derived scoring terms were used to score the quality of the generated models. The PROCHECK program was used to validate the reliability of the generated structures by analyzing the overall structure and residue-by-residue geometry of the proteins37. ProQ, a neural network method, was also used to predict the quality of the predicted structures38. Models showing high LG and MaxSub scores were selected for function prediction studies.
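Residue-by-residue geometry checks of this kind rest on the backbone dihedral angles phi and psi that are plotted in a Ramachandran plot. As a minimal sketch, the function below computes a dihedral angle from four atom positions; phi uses the atoms C(i-1), N, CA, C and psi uses N, CA, C, N(i+1). The function name and the coordinates are hypothetical, and the classification of angles into favored or disallowed regions, as done by PROCHECK, is not reproduced here.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0: Vec, p1: Vec, p2: Vec, p3: Vec) -> float:
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


# Invented coordinates: a planar zigzag (trans) and a planar U shape (cis).
trans = dihedral_degrees((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, -1.0, 0.0))
cis = dihedral_degrees((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
print(abs(trans), abs(cis))
```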

Structure-based functional prediction

The predicted structures of the HPs were then used as similarity search queries in the ProFunc and DALI servers for structure-based function prediction39,40. ProFunc uses secondary structure elements (SSEs), the SURFNET algorithm, residue conservation, and nest analysis on the query structure to identify functional motifs similar to, or closely associated with, those of experimentally annotated proteins. DALI uses a weighted sum of similarities of intra-molecular distances to identify proteins in the PDB that are structurally similar to the input structure. The list of structural neighbors is sorted by the pairwise structural similarity score (Z-score). A higher Z-score implies that the structures agree more closely in architectural details.
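The ranking step can be sketched in a few lines. The hit names and values below are invented placeholders, and the default cutoff `min_z=2.0` is an assumption chosen for illustration rather than a threshold reported in this study.

```python
from typing import List, NamedTuple


class StructuralHit(NamedTuple):
    name: str       # label of the structural neighbor (hypothetical here)
    z_score: float  # pairwise structural similarity score
    rmsd: float     # root-mean-square deviation in angstroms


def rank_hits(hits: List[StructuralHit], min_z: float = 2.0) -> List[StructuralHit]:
    """Drop hits below the assumed Z-score cutoff and sort the rest, best first."""
    kept = [hit for hit in hits if hit.z_score >= min_z]
    return sorted(kept, key=lambda hit: hit.z_score, reverse=True)


# Invented hits used only to demonstrate the sorting.
hits = [
    StructuralHit("hit_b", 12.6, 2.9),
    StructuralHit("hit_c", 1.4, 4.0),
    StructuralHit("hit_a", 39.0, 0.9),
]
ranked = rank_hits(hits)
for hit in ranked:
    print(hit.name, hit.z_score, hit.rmsd)
```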

Results and discussion

Advances in computational biology have produced several models, namely the Hidden Markov Model (HMM), the Neural Network (NN) model, and the Support Vector Machine (SVM), to decode biological phenomena at the system level. These models and their associated methods are efficient and accurate in annotating the functional properties of proteins. We used the models and methods described above to identify potential allergens among the HPs of A. alternata. The functions of the selected proteins were then annotated based on their sequence and structural information. A total of 11,227 HPs of A. alternata were retrieved from the NCBI database and evaluated for allergenicity using bioinformatics approaches.

Allergenic prediction

Computational prediction of allergenic proteins is an important step in the development of effective vaccines and therapeutics in the pharmaceutical industry. The FAO/WHO (FAO: Food and Agriculture Organization of the United Nations; WHO: World Health Organization) Codex Alimentarius Commission guidelines (2003) recommend various tests for examining the allergenic behavior of proteins, including the origin of the gene, sequence similarity with known allergens, protein stability, and the binding mechanism of IgE epitopes. AlgPred designates a protein as a potential allergen when its sequence has more than 35% sequence similarity (over 80 amino acids) with a known allergen (Table 3). Based on the AlgPred results, 29 HPs were predicted as potential allergens.

Table 3 Prediction of potential allergens among hypothetical proteins in Alternaria alternata using AlgPred tool.

The bioinformatics part of the 2001 guidelines states that a protein is potentially allergenic if it shares either at least six contiguous amino acids or a minimum of 35% sequence similarity over a window of 80 amino acids with a known allergenic protein. The 29 protein sequences predicted as allergens by AlgPred were further analyzed with SDAP. SDAP confirmed 10 protein sequences as allergens, and the remaining 19 protein sequences, which did not fulfill the SDAP criteria, were excluded from the study (Table 4). Among the 10 protein sequences, A0A177DEP8 and A0A4Q4N5B7 showed high sequence similarity (96.25%) with the allergen Alt a 12 of A. alternata. Alt a 1 to Alt a 12 are the well-known allergens of A. alternata. Alt a 12 is the large ribosomal protein P1, which plays a distinct role in protein synthesis7. A0A4Q4NGZ8 and A0A4Q4NJR8 showed 50% sequence similarity with allergens from Penicillium crustosum (Pen cr 26.0101) and Cupressus arizonica (Cup a 3), respectively. In general, P. crustosum is a food spoilage microorganism that also produces mycotoxins, and Pen cr 26 is a ribosomal protein P141. The Cupressaceae family is a relevant cause of respiratory allergy, including rhino-conjunctivitis, hay fever, and asthma, in sensitized individuals42. Cup a 3, a major allergen in this family, is reactive in more than 90% of Cupressaceae-allergic patients43. A0A177D895 and A0A177DB16 showed high sequence similarity (47.50%) with allergens from Triticum aestivum (Tri a 18) and Gallus domesticus (Gal d). Tri a 18 is a minor allergen for patients with bakers’ asthma44. A0A4Q4NI20 and A0A4Q4N975 showed 43.75% and 41.75% sequence similarity with allergens from Musa acuminata (Mus a 2.0101) and Hevea brasiliensis (Hev b 5), respectively. Mus a 2 from banana is a class 1 chitinase belonging to pathogenesis-related protein family 3, and it provoked a positive skin prick test in 50% of banana-allergic patients45. Hev b 5 has been identified as a major latex allergen, observed particularly among healthcare workers. The allergic reactions range from rhinitis to asthma, conjunctivitis, urticaria, anaphylactic shock, and occasionally death46.
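The two sequence criteria (a shared stretch of six contiguous amino acids, or at least 35% identity over an 80-residue window) can be sketched as follows. This is a simplified illustration: it compares ungapped windows only, whereas a real screen would normally rely on a gapped local alignment, and all function names and test sequences are hypothetical. Under this reading, a reported value such as 96.25% corresponds to 77 matching positions out of 80.

```python
def shares_six_mer(query: str, allergen: str, k: int = 6) -> bool:
    """True if the two sequences share an exact stretch of k contiguous residues."""
    kmers = {allergen[i:i + k] for i in range(len(allergen) - k + 1)}
    return any(query[i:i + k] in kmers for i in range(len(query) - k + 1))


def best_window_identity(query: str, allergen: str, window: int = 80) -> float:
    """Best percent identity between any ungapped pair of windows.

    Simplification: no gaps are allowed. Returns 0.0 if either sequence
    is shorter than the window.
    """
    best = 0.0
    for i in range(len(query) - window + 1):
        q = query[i:i + window]
        for j in range(len(allergen) - window + 1):
            a = allergen[j:j + window]
            matches = sum(x == y for x, y in zip(q, a))
            best = max(best, 100.0 * matches / window)
    return best


def flagged_as_potential_allergen(query: str, allergen: str) -> bool:
    """Apply either criterion: a shared 6-mer or >= 35% identity over 80 residues."""
    return shares_six_mer(query, allergen) or best_window_identity(query, allergen) >= 35.0


# Invented 80-residue "allergen" and a query with three substitutions.
allergen = "ACDEFGHIKLMNPQRSTVWY" * 4
query = "WWW" + allergen[3:]
print(best_window_identity(query, allergen))  # 77 of 80 positions match
print(flagged_as_potential_allergen("G" * 80, allergen))
```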

Table 4 Screening of potential allergens in SDAP, representing its percentage identity with an allergen over a window of 80 amino acids.

A0A4Q4NJR8

The functional annotations of the potential allergens are listed in Table 5. The sequence-based analysis suggests that the HP (A0A4Q4NJR8) is localized in the extracellular region and may belong to the thaumatin-like protein family. A BLASTP search showed that the HP is related to the thaumatin-like food allergen from Malus domestica, which is associated with IgE-mediated symptoms in apple-allergic individuals47. The apple protein, whose amino-terminal sequence shares about 50% identity with pathogenesis-related protein-5 family members, was the first thaumatin-like protein described as an allergen48. The family and conserved domain databases strongly suggest that the HP is a thaumatin-like protein. Thaumatin-like proteins are also involved in host defense mechanisms and a wide range of developmental processes in fungi, plants, and animals49. HHpred also suggests high similarity with thaumatin I from Thaumatococcus daniellii. A motif search using MotifFinder suggests that the HP sequence possesses a motif characteristic of the thaumatin protein family. Large-type thaumatin-like proteins usually have 16 cysteine residues at conserved positions, and this characteristic feature was also observed in our HP50. These residues can form up to 8 disulfide bridges, which are highly conserved in thaumatin-like proteins. In the STRING database, the HP showed the maximum score with Setosphaeria turcica, and the result revealed several interaction partners, such as glycoside hydrolase family 2 protein, glycoside hydrolase family 12 protein, alpha glucuronidase, Arf family, and glycoside hydrolase family 30 protein. Based on these observations, we suggest that the HP may function as a thaumatin-like protein.

Table 5 Sequence and structure-based functional annotation of potential allergens from the HPs of A. alternata.

The three-dimensional structure of the HP was predicted by Phyre2, with 69% sequence homology and 38% identity to the template (PDB ID: 2AHN). Model validation with the Ramachandran plot showed 84.9% of amino acid residues in the favored region and 0.5% in the disallowed region. The LG score of the HP model is −0.835, showing that the predicted model is extremely reliable. Structure comparison and analysis revealed that the HP contains a lectin-like β barrel (Domain I), several loops (Domain II), and two beta sheets (Domain III), and all three domains are stabilized by at least one disulfide bridge formed by cysteine residues with a conserved spatial distribution throughout the protein49. Superimposition of the HP model on other thaumatin-like proteins gave RMSD values of 0.660 Å (PDB ID: 3ZS3), 0.869 Å (PDB ID: 1DU5), and 0.000 Å (PDB ID: 2AHN), showing that the HP adopts a fold similar to that of 2AHN and indicating a closely related function. The SuSPect tool embedded in Phyre2 identified Cys290 as the residue with the highest mutational sensitivity, meaning that its mutation is likely to have a functional or phenotypic effect on the protein. Further, Pocket-Finder analysis identified Tyr263, Asp265, Asp266, Ile268, Gln269, Arg270, Pro271, and Asn283 as residues that play a major role in the active site.
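For reference, the RMSD values quoted for superimposed structures are root-mean-square deviations over paired atom positions. The sketch below assumes that the two coordinate sets are already superimposed and paired atom by atom; the optimal superposition step itself is not shown. The function name and coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two paired, pre-superimposed atom sets."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = 0.0
    for a, b in zip(coords_a, coords_b):
        squared += (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2 + (a[2] - b[2]) ** 2
    return math.sqrt(squared / len(coords_a))


# Invented coordinates (angstroms): a structure and a copy shifted by (3, 4, 0).
model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
shifted = [(x + 3.0, y + 4.0, z) for x, y, z in model]

print(rmsd(model, model))    # a structure compared with itself gives 0.0
print(rmsd(model, shifted))  # every atom is displaced by 5 angstroms
```

A structure compared with itself gives an RMSD of exactly zero, and lower values indicate closer agreement between the paired atoms.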

The DALI server showed high structural similarity between the HP and thaumatin-like proteins. We found significant matches with thaumatin-like protein (Z score: 39.0), laminaripentaose-producing beta-1,3-glucanase (Z score: 14.9), beta-1,3-glucanase (Z score: 12.6), etc. The aligned residues are usually in the range of 189–412, with RMSD values from 0.9 Å to 2.9 Å. We also observed a close structural similarity with the beta-1,3-glucanase enzyme. Furthermore, the ProFunc server revealed close similarity of the HP to the structure of the allergenic and antifungal banana fruit thaumatin-like protein. Extensive sequence and structural analysis thus strongly suggests that the HP could function as a thaumatin-like protein.

A0A177DEP8, A0A4Q4NGZ8 and A0A4Q4N5B7

The sequence-based analysis (InterPro, CD search, and Pfam) strongly suggests that these HPs are ribosomal protein P1, in the subfamily representing the eukaryotic large ribosomal protein P1. HHpred analysis also showed high similarity with 60S acidic ribosomal protein P1. WoLF PSORT and CELLO predicted that these HPs are localized in the cytoplasm. The acidic ribosomal P proteins are small molecules (10–11 kDa) that form lateral stalk structures in the active site region of the large ribosomal subunit and play an important role in the elongation phase of translation51. Based on sequence homology, the ribosomal P proteins are classified into two types (P1 and P2) in mammals, yeast, and protozoans, whereas a third distinct group (P3) is observed in plants52. Furthermore, these HPs contain a structural motif that is found in the family of 60S acidic ribosomal proteins. The functional partnerships of the three HPs were predicted using the STRING database. A0A177DEP8 and A0A4Q4N5B7 showed the maximum score with Mycosphaerella pini and predicted interactions with zinc-binding ribosomal protein S27e-like protein, 60S acidic ribosomal protein P0, and 40S ribosomal protein S21. Similarly, A0A4Q4NGZ8 showed the maximum score with Parastagonospora nodorum and predicted functional interactions with the eukaryotic ribosomal protein P1/P2 family, 60S acidic ribosomal protein P0, and the universal ribosomal protein uS4 family. Based on these findings, we suggest that these three HPs may function as 60S acidic ribosomal proteins.

Three-dimensional structures of the HPs (A0A177DEP8, A0A4Q4NGZ8, and A0A4Q4N5B7) were predicted using the Phyre2 server. These HPs showed high sequence similarity with the structure of human ribosomal protein P1/P2 (PDB ID: 2LBF), which was used as the template to predict the models. The predicted models were validated using the Ramachandran plot, which showed 82.1%, 84.4%, and 86.4% of amino acids in the favored region, respectively, and no residues in the disallowed region except for A0A4Q4N5B7 (0.7%). The model quality was validated using ProQ, which showed an LG score of −0.835, confirming the structural quality. Likewise, structural superimposition of the predicted models on the template structure showed low RMS deviations of 0.14 Å, 0.14 Å, and 0.15 Å, respectively, confirming the reliability of the predicted models. DALI analysis identified similar structures that belong to 60S acidic ribosomal protein P1, and ProFunc gave the same result. Active site prediction shows that Trp43, Leu46, Phe47, Ala50, Leu51, Lys55, Asp58, Leu59, Asn62, and Val63 are the important active-site residues of A0A177DEP8 and A0A4Q4N5B7. The A0A4Q4NGZ8 active site may contain Met1, Ser2, Glu9, Gln10, Ala13, Trp47, Leu50, Phe51, Ala54, Leu55, Lys58, Glu62, Val63, Leu64, Thr65, Ala66, Val67, Thr68, Ala69, and Ala70. In addition, an earlier study reports that acidic ribosomal protein P1 from A. alternata is considered a major allergen and plays a role in fungal allergy and autoimmune disease. Moreover, A. alternata is categorized as a rich source of mold allergens, which are deposited in the WHO/IUIS database53. The present investigation strongly suggests that these three HPs may act as 60S acidic ribosomal protein P1 and can be classified as allergens with virulence properties.

A0A177D895, A0A177DB16 and A0A4Q4NI20

The sequence-based analysis, including InterPro, Pfam, and CD search, revealed that the HPs A0A177D895, A0A177DB16, and A0A4Q4NI20 may act as chitin recognition proteins or ChtBD1_1 domain-containing proteins. HHpred analysis also showed maximum similarity with cysteine-rich, chitin-binding proteins. Furthermore, these HPs contain a structural motif that is found in chitin recognition proteins. Together, the sequence-based analysis suggests that these HPs function as chitin-binding proteins. WoLF PSORT and CELLO predict that these HPs are located in the extracellular region and are insoluble. Chitin-binding proteins (CBPs) are involved in various biological processes, such as hydrophobic surface sensing, binding to chitin, antimicrobial activities, and increasing chitinolytic activity54,55,56. Chitin is commonly found in the exoskeletons of arthropods, nematodes, protozoa, insects, and mollusks, and in fungal cell walls. Based on their chitin-binding properties and amino acid similarity, the carbohydrate-binding modules are classified into several families, including 1, 2, 12, 14, 18, 19, and 3357. Chitin-binding proteins mainly take part in chitin degradation, and their action varies between fungi and other organisms. In addition to occurring as discrete domains within enzymes, chitin-binding modules also exist as independent, non-catalytic proteins. Such non-catalytic CBPs are mostly found in families 14, 18, and 3358.

Because no appropriate template was available, the three-dimensional structures of these HPs were predicted using the Robetta server. The quality and accuracy of the structures were validated using the Ramachandran plot, which showed 87.6%, 88.1%, and 87.7% of residues in the favored region, respectively, and no residues in the disallowed region except for A0A177DB16 (0.2%), suggesting good quality of the predicted models. The LG score of −0.835 implies that the predicted models are valid with high confidence. ProFunc and DALI analysis revealed that A0A177D895 may act as a chitin recognition protein involved in a variety of biological reactions. In contrast, the other two proteins, A0A177DB16 and A0A4Q4NI20, showed no significant function, owing to their high structural and sequence variation compared with A0A177D895, and no similar hits were obtained for them from ProFunc and DALI analysis. Chitin, chitinases, and chitin-binding proteins commonly produce allergenic inflammation as well as wound-inducible activity. In addition, chitin-binding proteins are also classified as pathogenesis-related proteins, which include prohevein and other wound-inducible proteins. Prohevein is a cysteine-rich protein and one of the major IgE-binding allergens in natural rubber latex that affect healthcare workers. Earlier studies reported that the hevein protein has significant similarity (about 71%) with chitin-binding proteins, which may explain the reactions of latex-allergic patients59,60. Hence, the present investigation concludes that these HPs act as chitin-binding proteins and can induce allergenic reactions in humans as well as asthmatic inflammation.

A0A177DU49 and A0A4Q4NRZ2

The HPs A0A177DU49 and A0A4Q4NRZ2 are localized in the nucleus. BLASTP sequence analysis suggested that they act as the 20S-pre-rRNA D-site endonuclease Nin One Binding (NOB1). Furthermore, sequence-based functional prediction clearly indicates that the HPs are Nin One Binding (NOB1) proteins, and the virulence prediction indicates that the HPs are involved in cellular processes. The 20S pre-rRNA is converted into the mature 18S rRNA in the cytoplasm by the action of NOB1 endonuclease at site D61. NOB1 contains a PilT N-terminus (PIN) domain, common to many other exonucleases and endonucleases, and a zinc ribbon domain. In general, PIN domain proteins have been shown to possess endonucleolytic activity62. The UniProt molecular function annotation suggests that the HPs possess endoribonuclease activity. In the STRING database, the HP showed the maximum score with Pyrenophora tritici-repentis, and the result revealed several interaction partners, such as bystin, pre-rRNA processing protein pno1, serine/threonine-protein kinase RIO2/RIO3, low-temperature viability protein Ltv1, periodic tryptophan protein 2, U3 small nucleolar ribonucleoprotein IMP4, rRNA biogenesis protein RRP5, and GTP binding protein Bms1. MEME suite analysis suggests the presence of three significant motifs in the sequences of both HPs, namely 68ʹ-CHACFNIDFQMDKQFCKRC, 471ʹ-CNNDSPARYDAYAAFCKKKGAH AVGLMQD, and 515ʹ-HPWEKMGDKY. The active site region of the HPs is observed to comprise Glu8, Ile10, Gly11, Glu12, Gly13, Thr14, Tyr15, Val18, Lys20, Ala31, Lys33, Val64, Phe80, Glu81, Phe82, Leu83, His84, Gln85, Asp86, Lys88, Lys89, His125, Asp127, Lys129, Pro130, Gln131, Asn132, Leu134, Ala144, Asp145, Ala149, Val154, Thr158, Glu162, Val163, Val164, Thr165, Trp167, Tyr168, and Leu298.

The two HPs showed 100% sequence identity with each other; therefore, A0A177DU49 alone was taken for structure prediction. Because no reliable template was available, the structure of the HP was predicted with an ab initio algorithm using the Robetta server. Model validation with the Ramachandran plot showed 91.5% of amino acid residues in the favored region and 0.2% in the disallowed region, showing high fold similarity with the template. The secondary structure prediction shows that the HPs consist of numerous alpha-helices connected by loops. Structure similarity analysis using the DALI server shows that the model is similar to pre-18S ribosomal RNA (Z score = 29.3, RMSD = 1.5 Å), putative toxin VAPC6 (Z score = 11.7, RMSD = 4.2 Å), and ribonuclease VAPC30 (Z score = 9.4, RMSD = 3.1 Å), etc. Moreover, structure-based function prediction using ProFunc showed that the protein may act as the endonuclease Nob1. Both sequence- and structure-based analyses indicate that these HPs function as Nin One Binding proteins.

A0A4Q4N975

A0A4Q4N975 is predicted to be localized in the mitochondria by WoLF PSORT and in the extracellular region by CELLO. No transmembrane helix is present in the HP sequence. The motif and domain analysis suggests that the HP is a glycosyl hydrolase. Members of this glycosyl hydrolase family of enzymes have been identified in bacteria, fungi, and plants, and they play key roles in different aspects of life, ranging from developmental processes to host–pathogen interactions63. A sequence similarity search also suggests that this HP belongs to the glycoside hydrolase family 17 proteins. The predicted partners of the HP are endo-beta-1,3-glucanase and class III chitinase (which belongs to glycosyl hydrolase family 18).

Because no reliable template was available in the PDB, the Robetta server was used to predict the model. The predicted model shows 86.0% of amino acid residues in the allowed region and 1.0% of residues in the disallowed region of the Ramachandran plot. The server was not able to completely predict the secondary structural elements of the HP; hence, ProFunc was unable to predict the HP function. Structure similarity analysis using the DALI server shows that the model is similar to 6FCG (Z score = 43.2, RMSD = 1.1 Å), 4WTP (Z score = 28.2, RMSD = 2.1 Å), and 3UR8 (Z score = 24.1, RMSD = 2.5 Å). Based on the sequence and structural analysis, the HP may function as a glycosyl hydrolase.

Conclusion

In the last decade, an enormous effort has been made to characterize the hypothetical proteins present in genomes. Functional assignments help us to understand molecular biology at the system level and to identify potential drug targets that can act specifically on pathogens to combat pathogenicity. In the present study, computational analysis was performed to assess the allergenicity of hypothetical proteins in A. alternata. Based on the analysis, 10 proteins were predicted as potential allergens. Furthermore, we characterized the functions of these HPs with a high level of confidence using various bioinformatics approaches. The predicted functions of the HPs are chitin binding, ribosomal protein P1, thaumatin-like protein, glycosyl hydrolase, and Nob1 Zn-binding protein. The physicochemical properties of the proteins help in the characterization of protein function, whereas the subcellular localization of the proteins plays a pivotal role in differentiating vaccine targets from drug targets. This study provides a basic understanding of the potential allergens and could aid in the development of novel therapeutics against A. alternata and other associated fungal allergic infections.