In silico studies evidenced the role of structurally diverse plant secondary metabolites in reducing SARS-CoV-2 pathogenesis

Plants are endowed with a large pool of structurally diverse small molecules known as secondary metabolites. The present study aims to virtually screen these plant secondary metabolites (PSM) for their possible anti-SARS-CoV-2 properties targeting four proteins/ enzymes which govern viral pathogenesis. Results of molecular docking with 4,704 ligands against four target proteins, and data analysis revealed a unique pattern of structurally similar PSM interacting with the target proteins. Among the top-ranked PSM which recorded lower binding energy (BE), > 50% were triterpenoids which interacted strongly with viral spike protein—receptor binding domain, > 32% molecules which showed better interaction with the active site of human transmembrane serine protease were belongs to flavonoids and their glycosides, > 16% of flavonol glycosides and > 16% anthocyanidins recorded lower BE against active site of viral main protease and > 13% flavonol glycoside strongly interacted with active site of viral RNA-dependent RNA polymerase. The primary concern about these PSM is their bioavailability. However, several PSM recorded higher bioavailability score and found fulfilling most of the drug-likeness characters as per Lipinski's rule (Coagulin K, Kamalachalcone C, Ginkgetin, Isoginkgetin, 3,3′-Biplumbagin, Chrysophanein, Aromoline, etc.). Natural occurrence, bio-transformation, bioavailability of selected PSM and their interaction with the target site of selected proteins were discussed in detail. Present study provides a platform for researchers to explore the possible use of selected PSM to prevent/ cure the COVID-19 by subjecting them for thorough in vitro and in vivo evaluation for the capabilities to interfering with the process of viral host cell recognition, entry and replication.


Results and discussion
The PSM library contains 4,704 molecules collected from 203 plant species belong to diverse plant families (Supplementary File 1). Over 22,000 docking reactions were run (which include replications, to confirm the activity of several molecules with BE) using 4,704 ligands against four selected target proteins/ enzymes involved in host cell recognition, entry and replication of SARS-CoV-2. Upon molecular docking, a wide range of BE was obtained for all four target proteins. The obtained results were arranged in ascending order of BE. For the ease of the study, we selected the top 268 molecules against M pro and 250 molecules for all other target proteins for further analysis (Supplementary File 2) viz., structural similarity and activity relationship, and physicochemical characterization to evaluate their drug-likeness. Binding energy, physicochemical characters and bioavailability score (BAS) of ten top ranked molecules against four target proteins are compiled in Table 1.

SARS-CoV-2 spike protein.
Spike protein is a class I fusion protein present within the envelope as a homotrimer and consists of three S1-S2 heterodimers. The receptor-binding domain (RBD) is located on the head of S1 17 and binds with the cellular receptor hACE2 18 . Any PSM interacts with these selected Amino acid residues (AARs) of spike protein by forming multiple H bonds and other interactions with lower BE may interfere with the spike protein -hACE2 interactions, thereby preventing the early recognition of host cell by SARS-CoV-2.
Interestingly, the large pool of PSM found interacting with the exposed surface of spike protein was found belonging to class triterpenoids. More than 50% of the PSM among top 250 molecules belong to triterpenoids and their derivatives, largely include triterpenes and sterols. Here, sterol lactones alone represent > 14% of total PSM (Fig. 1a). Few biflavonoids, flavonoid glycoside and hydrolysable tannins were also showed promising results.
Coagulins from Withania coagulans (Stocks) Dunal were recorded lower BE with target AARs of the spike protein. Similar observations were made with structurally similar triterpene and steroid, i.e., steroidal lactones, steroidal saponins, steroidal glycoalkaloids, triterpene glycosides, triterpene saponins, and triterpene sterols.  Withanolides are naturally occurring C 28 -Steroidal lactone triterpenoids build on an intact or rearranged estrogen framework [19][20][21]   Saponins are naturally occurring plant glycosides found in a wide range of plant species. They are high molecular weight amphiphilic compounds having triterpenoids and steroid aglycon as lipophilic moiety and sugars as hydrophilic moiety 22 . Another class of basic steroidal saponins contains nitrogen analogues of steroid sapogenins as aglycones. The bio-transformation of saponins mainly occurs at intestine with the aid of gut microbes leading to the generation of the rare low molecular weight saponins containing no or lower number of sugar moiety 23 . These hydrolyzed products are higher in bioavailability and bioactivity compared to their parental compounds [24][25][26] . In our study, Graecunin E (from Trigonella foenum-graecum L.) 27 recorded a least of − 8.6 BE and found interacting with THR415 and GLN493 AARs of spike protein (Fig. 1b, Supplementary File 2). Other Graecunin related compounds, Trigofoenoside E1 (BE − 8.2), Uttronin B (BE -8.3), Stigmasteryl glucoside (BE − 7.9) and Yuccagenin (BE − 7.6) also showed promising results. It was noticed that BE of abovementioned molecules was related to their number of the sugar moiety. As the number of sugar moiety reduces, BE also found reducing. However, with the loss of sugar moiety, their bioavailability was increasing as observed between Greacunin E and Yuccagenin (aglycon form) (Fig. 1b, Supplementary Files 2 and 3). Also, Yuccagenin was found forming H bonds with GLY496 and ASN501 AARs of spike protein which is crucial to interact with AARs of hACE2 and found fulfilling all drug-likeness characters as per Lipinski's rule (Fig. 1b).
Bismahanine, a carbazole alkaloid isolated from leaves of Murraya koenigii (L.) Spreng 28,29 and some other Rutaceae members 30 . Bismahanin and related compounds are reported for their broad biological activities such as anti-oxidant, anti-diabetic, anti-inflammatory, anti-microbial, anti-cancerous, anti-viral, etc. [31][32][33] . In our studies, physicochemical properties of Bismahanin revealed it as a low polar, non-soluble, high MLOGP value and high molecular weight PSM according to Lipinski's rule of drug-likeness, which reduces its bioavailability (BAS 0.17). The molecule interacted with residue GLU406 through H bonds and showed Van der Waals interaction with other residues (TYR495, LEU455, ASN501 and GLY502) of spike protein which are not reported to involve in interaction with AARs of hACE2 (Fig. 1b, Supplementary File 2). Pseudojervine, a steroidal alkaloid was first isolated from the rhizome of Veratrum album L. by Wright and Luff 34 which is regularly used in Chinese traditional medicine. In the present study, Pseudojervin recorded − 8.7 BE and found interacting with AARs GLN493, GLY496 and SER494 of spike protein through H bonds. Also, it passed all the Lipinski's rule except molecular weight (587.74 g/mol) and recorded 0.55 BAS (Fig. 1b, Supplementary Files 2 and 3).
Tannic acid is a high molecular weight polyphenolic compound, highly soluble in water (low lipophilicity). Its GI absorption is almost nil in its original form. However, it is hypothesized that bio-transformed products may enter into plasma and exert biological activities which still need a thorough study. In our study, Tannic acid recorded lower BE of − 8.9, and some of the structurally related hydrolysable tannins such as Strictinin (BE − 8. Arecatannin A3 (a condensed tannins) contains epicatechin-epicatechin-epicatechin-epicatechin-catechin as their basic structure and in its original form it is poorly bioavailable because of their unfavourable physicochemical properties (Supplementary File 3). But, its bio-transformed products, especially monomers, dimers or trimers, may show a varied degree of bioavailability and bioactivity. Their monomers catechin and epicatechin are reported as bioavailable better than their parental molecules 35 . However, their dimer and trimers are poorly bio-available 36 . In our studies, none of the catechin or epicatechin, and their derivatives were appeared in top 250 PSM rank indicating their inability to bind an open surface of the spike protein. Some Arecatannin related molecules such as Arecatannin B1 (BE − 8.1), Proanthocyanidine A-6 (BE − 7.9) and Proanthocyanidine A1 (BE − 7.6) also found interacting with spike protein (Supplementary File 2).
Kamalachalcone C is a dimeric chalcone first isolated from Mallotus philippensis (Lam.) Müll. Arg. by Furusawa et al. 37 . It recorded − 8.8 BE and was found interacting with AARs GLN493, GLN492, ARG403 and GLU406 of the spike protein. Bioavailability score of 0.55 was recorded by Kamalachalcone C, indicating it as a potent drug candidate (Fig. 1b, Supplementary File 2). Amentoflavone, a biflavone also recorded a lower BE of − 8.7 and showed H bond formation with AARs GLN493, SER494 and GLY496 of SARS-CoV-2 spike protein (Fig. 1b).
Recent studies also reported several PSM such as Pavetannin-C1 (BE − 11. Human transmembrane serine protease (TMPRSS2). TMPRSS2 plays a key role in priming spike protein of SARS-CoV-2. The cellular protease cleaves at S1/S2 and S′2 sites thereby facilitating the fusion of viral and host cell membrane 38 . Hence, reducing the protease activity of TMPRSS2 using PSM is considered as another potential way to manage COVID-19. In our study, structural similarity and BE correlation analysis indicated that a large number of flavonoid glucoside (> 32%) were found interacting with the target site of TMPRSS2, followed by ellagitannins and triterpenoids. Interestingly, two triterpenoid saponins (Liquorice and Glycyrrhizic acid) and a stilbenoid (Cis-Miyabenol C) recorded the lowest BE (Fig. 2a). The data obtained from this study indicated the possibilities of developing flavonoid glycoside or triterpenoid based drug molecules targeting human TMPRSS2 which process SARS-CoV-2 spike protein and facilitate the entry of the virus into the host cell.
In our study, a large number of flavonoid glycosides were found efficiently binding to the target site of protein with lower BE. Similarly, in most of the earlier in vitro studies, flavonoid glycosides are proved to be better candidates for enzyme inhibition activity related to health benefits. However, significant concern about flavonoid glycosides is their bioavailability. Upon consumption, at the small intestine, they absorbed mostly in their aglycon form at the cellular level. Absorption of glycosidic form mainly depends on their number of sugar moiety. Through many research studies, it was confirmed that flavonoid glycosides are converted to aglycones by gut www.nature.com/scientificreports/ microbes and decomposed to yield two different phenolic products. Whereas, catechin and procyanidins are biotransformed into 5-(hydroxyphenyl)-γ-valerolactone found in blood plasma and urine as sulfate and glucuronide metabolites 39 . However, several earlier studies supported the hypothesis that glycosidic forms are more bioavailable and resistant to microbial degradation. Also, they can deliver aglycone moiety better when administrated as glycosidic form. Participation of active uptake of flavonoid glycosides by enterocytes was researched in detail. During this process, flavonoid glycosides are converted into aglycones by membrane-bound beta-glucosidase (reviewed by Kumar and Pandey) 40 . Here, Agatisflavone, a bioflavonoid recorded a BE of − 8.9 and found interacting with catalytic site AARs SER441 of TMPRSS2 forming hydrogen bonds. Nirurin, a prenylated flavonone glycoside found in Phyllanthus niruri L. recorded a BE of − 8.9 and showed H bond formation with HIS296 and SER441 of TMPRSS2 catalytic site and with VAL290, SER436, GLY439 and CYS465 in close vicinity of the catalytic site, presenting itself as a strong irreversible inhibitor of TMPRSS2 (Fig. 2b, Supplementary File 2). Following to this, Naringin, a flavanone-7-O-glycoside recorded BE of − 8.3 and showed similar interaction with AARs of TMPRSS2 (Supplementary File 2). All these molecules are low in their BAS and their physicochemical characters are not favourable as per Lipinski's rule (Supplementary File 3). Though the flavonoids are dominating the group in numbers, the least BE was recorded by two triterpenoid saponins, i.e., Liquorice (− 9.7) and Glycyrrhizic acid (− 9.5) followed by a stilbenoid Cis-Miyabenol (− 9.4).
Here, Liquorice was found forming H bond with SER441 of TMPRSS2 catalytic site AAR through glucose moiety. Similarly, Glycyrrhizic acid also showed H bond formation with HIS296 of TMPRSS2 through oxygen involved in a glycosidic bond. Liquorice and Glycyrrhizic acid are isomers, triterpenoid glycosides obtained from roots of Glycyrrhiza glabra L. (Liquorice). Roots of this plant are traditionally used to alleviate jaundice, gastritis and bronchitis. Gancao, a Chinese herbal decoction of dried plant roots and stem, contains 3.63-13.06% Glycyrrhizin and well known for their therapeutic properties, including antiviral 41 . Glycyrrhizic acid was reported as an active component from the roots of G. glabra which inhibited the growth and cytopathology of both DNA and RNA virus viz., Herpes simplex type I, Newcastle disease virus, Vesicular stomatitis virus and Polio type I virus without affecting host cell activity and replication. Research regarding the bioavailability of Glycyrrhizic acid revealed that after oral administration, only bio-transformed Glycyrrhetic acid was detected in plasma at higher concentration 42,43 . It is because of the complete biotransformation of Glycyrrhizic acid to Glycyrrhetic acid by the activity of gut microbes. Hence, to know the capability of aglycon form, i.e., Glycyrrhetic acid (Enoxolone, not present in main PSM library) was docked and found it also form H bond with SER441 of TMPRSS2, but the BE was reduced to − 7.7 (Fig. 2b).
Cis-Miyabenol C, a stilbenoid (resveratrol timer) found in Foeniculum vulgare Mill. (fennel). Stilbenoids are characterized by two phenyl group linked by a transethane bond and reported to exhibit a wide range of biological activities and pharmacological properties 44 . In the present study, Cis-Miyabenol C recorded a lower BE of − 9.4 and found interacting with catalytic site AAR of TMPRSS2, i.e., ASP345 through H bond (Fig. 2b, Supplementary File 2). This compound was low in GI absorption, highly lipophilic, insoluble in water and found violating Lipinski's rule of drug-likeness. The low bioavailability of stilbenoids is mainly due to their rapid, extensive metabolism in the intestine and liver during and after absorption giving rise to a lower level of the free parent compound 45 .
Chrysophenols (Anthraquinones) are anthracene derivatives, and structurally they are tricyclic aromatic quinones with two ketone group attached to the central benzene ring. Here, two anthraquinone glycosides, Chrysophanein and Rhein-8-glycoside recorded lower BE of − 8.9 and − 8.9, respectively, which was lower than their structurally related flavonoid glycosides, Baicalein 6-glucoside (BE − 8.7) and Luteolin 3′-xyloside (BE − 8.5). Chrysophaenein showed H bond with SER441, an AAR of the catalytic site of TMPRSS2 and VAL280 and GLY439 in the catalytic pocket. Chrysophaenein passes all the parameters of Lipinski's rule with 0.55 BAS (Fig. 2b, Supplementary Files 2 and 3) proving itself as a better drug candidate to inhibit the activity of TMPRSS2.
Plumbagin is a naphthoquinone derivative from the roots of Plumbago zeylanica L. and well studied for its anti-cancerous property 46,47 . In our study, 3,3′-Biplumbagin recorded BE of − 8.9 and found interacting with HIS296 and SER441 of the catalytic site of TMPRSS2 and with VAL280 and GLY439 AARs in the close vicinity, showing its strong affinity towards the catalytic domain of TMPRSS2. Also, it recorded higher BAS of 0.55 and found fulfilling all necessary physicochemical characters for a drug-like molecule as per Lipinski's rule (Fig. 2b,  Supplementary File 2 and 3).
Quinic acid is a carboxylated cyclohexanepolyol that is found in several plants like coffee, tomato, carrot, etc. and exists either in free form or as esters 48 . Quinic acid is a starting compound used to synthesise "Tamiflu", an anti-viral drug used to treat Influenza A and B virus 49 . Derivatives of Quinic acid were reported to be anti-viral in nature against Human immunodeficiency virus (HIV), Hepatitis B virus (HBV), Herpes simplex virus 1 and Dengue virus [50][51][52][53] . 3-Caffeoyl-5-feruloylquinic acid, in our study, recorded − 9.0 BE and found interacting with SER441 of TMPRSS2 through H bonding. Following this, structurally similar 3-O-Caffeoyloleanolic acid also recorded BE of − 8.7. Both the molecules recorded low BAS of 0.11 and violated Lipinski's rule of drug-likeness (Fig. 2b, Supplementary File 2). Studies of Gonthier et al. 54 showed that chlorogenic acids, the esters of caffic and quinic acid are poorly bioavailable. However, their microbial bio-transformed metabolites such as m-Coumaric acid and derivatives of phenylproponic, benzoic and hippuric acid were found in higher concentration after oral administration of chlorogenic acids. The anti-viral activity of bio-transformed molecules is yet to be studied.
In the present study, group of bulky Ellagitannins such as Terflavin A, Terflavin B, Punicalin, Strictinin, Pedunculagin, Punicafolin, Tellimagradin I, Tercatain, Emblicanin A, Phyllanemblinin B, etc. recorded lower binding efficiency indicating their capability to inhibit the activity of TMPRSS2 enzyme. Similarly, Granatin B, an ellagitannin commonly found in the pericarp of Punica granatum L. recorded lower BE of − 9.1 followed by Granatin A (− 8.9). Granatin B recorded H bond formation with HIS296 of TMPRSS2. Bioavailability score of Granatin B was 0.11 and found violating Lipinski's rule (Fig. 2b, Supplementary Files 2 and 3). However, they are all high molecular weight, and according to previous studies, they are poorly/ not bioavailable. Further, they hydrolyze into Ellagic acid and their complex derivatives in the intestine and can cross the gastrointestinal barrier  www.nature.com/scientificreports/ into the bloodstream 55 . At the same time, gut microflora are reported to convert Ellagitannins to Urolithins, an anti-cancerous compounds 56,57 . Bioavailability of Ellagitannins metabolites was successfully proved in humans in the form of Ellagic acid in blood plasma 58 . Ellagic acid consists of a hydrophilic domain made up of four phenolic groups and two lactones, and a lipophilic domain made up of four rings. These domains, particularly the hydrophilic, which can H bond and accept an electron, thus determining a structural activity relationship.

In our experiments, Ellagitannins derivatives 3′-O-Methyl ellagic acid 4-xyloside recorded BE of − 8.1 against TMPRSS2 (Supplementary File 2), indicating the possibilities of Ellagitannin metabolized products function as TMPRSS2 inhibitors.
Molecules such as Geniposide (BE − 14.6), Cytidine-5′-diphosphocholine (BE − 13.9), Durumolide K (BE − 13.2) and 5′-methoxyhydnocarpin D (BE − 13.5) were reported to be effective molecules of plant origin which can bind to active site of human TMPRSS2 and interfere with the viral spike protein priming activity (Supplementary File 4). However, with respect to TMPRSS2 activity inhibition, competitive inhibitors are preferred over uncompetitive/noncompetitive inhibitors as TMPRSS2 activity is an essential component of cellular structure and function.

SARS-CoV-2 Main Protease (M pro ).
The key enzyme in proteolytic processing of SARS-CoV-2 replication is M pro . It is initially released by the auto-cleavage of pp1a and pp1ab. Then M pro , in turn, cleaves pp1a and pp1ab to release functional proteins necessary for viral replication 59 . Any PSM binding to the AARs of the catalytic site or pocket with H bonds and other interactions with lower BE may interfere with the viral replication process in host cell, thereby reducing the severity of the COVID-19.
When BE and canonical SMILES structural similarity of top 250 PSM were analyzed, it was observed that molecules from two major groups, i.e., Flavonol glycosides and Anthocyanidine were dominating with > 16% and > 16% PSM, respectively. Other flavonoids and triterpenes also recorded promising results (Fig. 3a). But, the least BE was recorded by Hypericin, a naphthodianthrone and Amentoflavone, a biflavonoid.
Hypericin, a naturally occurring chromophore "Naphthodianthrone" compound, derivative of anthraquinone found in common St. Johnswort (Hypericum species) and in some fungi. Hypericum perforatum L., a source of Hypericin has been used as folk medicine. Also, Hypericin is reported as anti-depressive, anti-tumor, antiviral, antineoplastic, etc. 60 . Here, Hypericin recorded least BE of − 10.4. Further, it was found forming H bond with GLU166 residue in catalytic sites of M pro (Fig. 3b, Supplementary File 2). Physicochemical parameters of Hypericin showed that it is poorly soluble in water and violates Lipinski's rule of drug-likeness (Fig. 3b, Supplementary File 2) with 0.17 BAS. It is a naturally occurring photosensitizer reported to accumulate in tumour cells and upon illuminating release Reactive oxygen species (ROS) killing the cancerous cells 61 . Recent in silico studies revealed its potential to bind SARS-CoV-2 spike protein 62 . Additionally, its anti-viral properties against Bronchitis virus 63 , Hepatitis C virus 64 and Human Coronavirus Oc43 65 represent it as a potential anti-viral drug material.
Sotetsuflavone, a biflavonoid from Dacrydium balansae Brongn. & Gris (Gymnosperm) was found to be the most potent inhibitor of Dengue virus NS5 RdRp with IC 50 of 0.16 µM, among 23 biflavonids screened. Further, their enzyme inhibitory activity was related to the number and position of methyl groups on the biflavonoid moiety and the degree of oxygenation on flavonoid monomers 66 . In our studies, Amentoflavone recorded least of − 9.7 BE against M pro and found interacting with target AAR by forming H bonds with GLU166 and other residues in the vicinity of catalytic site (Fig. 3b, Supplementary File 2). Similarly, other biflavonoids, both Ginkgetin and Isoginkgetin (derivatives of Amentoflavone from Ginkgo biloba L.) were recorded BE − 9.5 followed by Agatisflavone (BE − 9.3).
Flavonol glycosides are the most influential group of PSM found interacting with the target site of M pro . Flavonols are the class of flavonoids that has 3-hydroxyflavone backbone and present in a large group of plants. Lower BE of − 9.6 was recorded by Quercetin 3,5-diglucoside followed by Myricetin 3-rutinoside (BE − 9.4), Rutin (BE − 9.3), Kempferol (BE − 9.2), Myricetin 3-rhamnoside (BE − 9.2), and Robinetin 3-rutinoside (BE − 9.1) (Fig. 3b, Supplementary File 2). It was observed that flavonol with two glucose moieties recorded lower BE when compared to flavonol with one or three glucose moieties. With regards to bioavailability, Graefe et al. 67 reported the heights of 2. Proanthocyanidins are a class of oligomeric flavonoids (condensed tannins) found in a variety of plants. They are formed by the condensation of catechin and epicatechin units. Studies on Proanthocyanidins reveal its poor bioavailability, its monomers viz., ( +)-catechin and (−)-epicatechin have been previously reported to cross GI track in its monomeric form or as conjugated metabolites 35 . Further, its dimers are also reported to present in plasma and urine after consuming food and beverages rich in Proanthocyanidins 68,69 . In our studies, Proanthocanidin A2 and Proanthocanidin B2 recorded least of − 9.4 BE followed by Procyanidine B1 (BE − 9.     (Supplementary File 2). Glycosidic anthocyanidin (anthocyanins), water-soluble pigments found in many fruits like grapes, bilberry, raspberry, blackberry, cherry, blueberry, etc. Here, Cyanidine-3,5, diglucoside recorded − 9.4 BE followed by Penodine 3,5 diglucoside (BE − 9.3). Further, H bond formation between Cyanidine-3,5, diglucoside and AARs GLY143 and SER144 of the catalytic site of M pro was observed. It recorded a BAS of 0.17 and didn't fulfill the criteria of a drug-like molecule as per Lipinski's rule (Fig. 3b, Supplementary  File 2). Bioavailability of Cyanidine-3-glucoside was studied in human trials by Czank et al. 70 following the isotopically labeled compound. They found radioactivity in plasma, urine and breathe, indicating the absorption of the labeled compound. However, the study didn't reveal the existence of bioactive metabolites; it may be in degraded forms of the parental compound. Terflavin B, an ellagitannin (found in Terminalia chebula Retz. and T. catappa L.) recorded BE of − 9.7. Also, it was found forming H bonds with AARs HIS41, GLY143, SER144, CYS145 and GLU166 present in the catalytic site of SARS-CoV-2 M pro . It is water-soluble, low in GI absorption and violates Lipinski's rule (Fig. 3b, Supplementary File 2). Aqueous extracts of T. chebula were reported to be inhibitory to Hepatites B virus infection in Hep G 2.2.15 cells 71 . Not much research studies on their bioavailability and anti-viral properties have been done with purified Terflavin B. Vescalagin, and Castalagin are ellagitannins found in oak and chestnut wood 72,73 . They are water-soluble, high oxidizable and astringent 74 . Here, Vescalagin recorded − 9.6 BE and found interacted with GLU166 residue of M pro catalytic site through H bond, and Castalagin showed − 8.8 BE (Fig. 3b, Supplementary  File 2). Vilhelmova et al. 75 demonstrated the anti-viral activity of Vescalagin and Castalagin against Herpes simplex virus type I and II. Also, these PSM synergistically inhibited the multiplication of test virus along with anti-viral compound Acyclovir.
Mudanpioside J is a monoterpene glucoside identified in Paeonia delavayi Franch. (Chinese medicinal plant), which was found inhibitory to Influenza virus neuraminidase 76 . In our studies, Mudanpioside J recorded BE of − 9.6 and showed strong interaction with GLY143, SER144 and CYS145 residues of M pro catalytic site through H bonds (Fig. 3b, Supplementary File 2). Studies on Pharmacokinetics properties of monoterpene glycosides are limited. Paeoniflorin and Albiflorin, Mudanpioside J related compounds were reported as low in oral bioavailability due to their poor membrane permeability and gut microbes-induced metabolism 77,78 . SwissADME analysis of Mudanpioside J also indicated its P-glycoprotein substrate nature and violation of Lipinski's rule with 0.17 BAS (Supplementary File 3).
Lignans were reported for their wide biological activities, including anti-viral against Hepatitis B, Hepatitis C, Herpes simplex virus type 1 and 2, Epstein-Barr virus and Cytomegalovirus 79,80 . Justalakonin is an Arylnaphthalene lignan glycoside isolated from Justicia purpurea L. In the present study, it recorded BE of − 9.4 and showed interaction with AARs GLY143 and SER144 of the catalytic site of M pro through H bonding (Fig. 3b

SARS-CoV-2 RNA-dependent RNA polymerase (RdRp). The central component of coronaviral rep-
lication/ transcription machinery is RNA-dependent RNA polymerase (RdRp, also named nsp12) that constructs copies of its RNA genome playing the key role in replication and transcription of SARS-CoV-2 in the host cell 81,82 . Because of its high sequence and structural conservation, it remains the target of choice for the prophylactic or curative treatment of several viral diseases. Studies on structural activity relationship and BE of PSM revealed that a large number of flavonol glycoside (> 13%) followed by hydrolysable tannins, anthocyanins and triterpenes were found interacting with target site of RdRp with lower BE (Fig. 4a). Though the number of flavonol glycosides were high, the lowest BE was recorded by Erodictyol-7-O-glycoside and Narirutin belong to group flavanon glycosides. Interestingly, none of the PSM analyzed was found interacting with VAL557 of RdRp. However, they found forming H bond and another type of interactions with AARs in the catalytic pocket probably sterically hinders the substrate interaction with the catalytic site, thereby reducing the RdRp activity.
Eriodictyol-7-O-rutinosideis a flavanone glycoside commonly found in lemon 83 , also called as lemon or citrus flavonoid. It recorded − 9.9 BE and found interacting with ARG553, ALA554, THR556, ASP618, TRY619 and ASP623 in the active site of RdRp. Both glycone and aglycone moiety of this molecule were found involved in H bond formation with the target site of RdRp. Similarly, Narirutin, another flavanone glycoside which naturally presents in sweet oranges recorded lower BE of − 9.7 and a similar pattern of H bond formation with a catalytic pocket of RdRp was observed. Both molecules recorded BAS of 0.17, indicating their low bioavailability and found violating Lipinski's rule (Fig. 4b, Supplementary File 2). Structurally similar compounds, Nirurin from P. niruri and Naringin from grapefruits also showed promising results with BE − 9.0 and − 8.9, respectively. Myricetin 3-rutinoside is a flavonol glycoside isolated from Chrysobalanus icaco L. 84 and other related plant species recorded − 9.5 BE. This molecule was found forming a large number of H bond with AARs ASP452, TYR456, ARG553, ALA554, ARG555, CYS622, ASP623 and SER682 showing its capability of strongly interacting with the target protein. Similarly, Kempferol (flavonol glycoside) also recorded − 9.5 BE. For both the compounds BAS was found to 0.11 (Fig. 4b, Supplementary File 2 and 3). Rhoifolin and Isorhoifolin are flavone glycoside reported from Rhus succedanea L. 85 87 and reported as an anti-inflammatory and anti-proliferative 88,89 . Here, Rotundioside B recorded − 9.5 BE and formed H bond with ASP452, SER501, ALA554, ASP623, ARG624, ASN691, SER759 and ASP760 of RdRp catalytic pocket proving itself as a strong contender of an anti-viral drug. However, this molecule recorded poor BAS of 0.11 and violated Lipinski's rule of drug-likeness (Fig. 4b, Supplementary File 2). Similar to this, Ginsenoside Ro, a triterpene saponin from roots of Panax ginseng C.A. Meyer. and related plants recorded − 9.4 BE. Trigofoenoside G, a steroidal saponin from plants and seeds of T. foenum-graecum was found interacting with RdRp with − 9.3 BE and forms H bonds with ASP452, LYS500, ALA554, ALA558, LYS621 and ALA685 of catalytic pocket (Fig. 4b,  Supplementary File 2). Following to this, Trigofoenoside F and Trigofoenoside A recorded − 8.9 and − 8.4 BE, respectively (Supplementary File 2).
Hippomanin A, an ellagitannin found in Hippomane mancinella L. is known for its toxic properties. It causes oropharyngeal and gastrointestinal tract lesions, hypotension and bradycardia 90,91 . Here, Hippomannin A recorded − 9.6 BE and forms H bond with ARG553, TYR619, ARG624, THR680, SER682 and ASP760 with target site of RdRp. A structurally similar molecule, Tellimagrandin I found widely in fruits, nuts and vegetables. Tellimargrandin was reported for its wide spectra of biological activities (reviewed by Zheng et al. 92 ). Here, it recorded − 9.5 BE followed by Punigluconin (BE − 9.3). Other hydrolysable tannins viz., Emblicanin A and Phyllanemblinin C found in fruits of Indian Gooseberry (Emblica officinalis Gaertn.) were recorded lower BE of − 9.4 and − 9.3, respectively. As the above discussed hydrolysable tannins are highly water-soluble and larger in molecular weight, their bioavailability in their original form is a major concern. Some of the molecules, structure similar to their bio-transformed metabolites with higher bioavailability, like Ellagic acid (BE − 8. Recent studies also reported several PSM such as Cyanidin 3-(6″-manlonylglycoside) (BE − 11.5), Caftaric acid (BE − 10.6) and Chrysophanol 8-(6-galloylglucoside) (BE − 9.9) with lowest binding energy as a potential molecules to reduce the pathogenicity of SARS-CoV-2 by suppressing the activity of RdRp thereby reducing the viral multiplication capability inside host cell (Supplementary File 4).

Conclusion
From the obtained results, it could be concluded that virtual screening of large number of PSM through molecular docking is a promising preliminary step towards developing an effective drug against a desired target protein/ enzyme by understanding their structure-activity relationship. Here, more than the BE of a PSM, its bioavailability also plays a crucial role in determining its biological activity under in vivo environmental conditions. Triterpenoid based PSM structures (Coagulins, Withanolides, Pseudojervine, Kamalachalcone, etc.) are hypothesized as potent drug molecules which can be used to block surface AARs of spike protein which interacts with hACE2, thereby preventing host cell recognition by SARS-CoV-2. In the case of TMPRSS2, M pro and RdRp, molecules belong to flavonoid glycosides, biflavonoids, ellagitannins, anthocyanidins, triterpens, etc. (Table 1) can be explored. Though the large numbers of PSM were found violating Lipinski's rule and recorded lower BAS, they can't be ignored. Because several bio-transformed structure of these PSM are highly bioavailable and they may retain structural moiety of the parental compound. These bio-transformed molecules may further interact with the target site of a protein and exert similar results as observed in molecular docking studies. In our study, most of the potential anti-SARS-CoV-2 PSM were well studied for human consumption to manage various diseases and disorders previously. Also their controlled administration was proved to be non-toxic to humans. By considering the above facts, the possibilities of using these molecules along with the existing best practices to be explored immediately. Further, to streamline the large pools of PSM, they can be subjected for several in vitro and in vivo studies.

Materials and methods
Preparation of plant secondary metabolites library. To prepare PSM library, an extensive literature survey was conducted on selected plants and the general and species-specific PSM including Alkaloids, Phenolics and Terpenoids were listed. The 3D and 2D structures (SDF Files), and canonical SMILES of the selected PSM were retrieved from online databases such as PubChem (https ://pubch em.ncbi.nlm.nih.gov) and Chem-Spider (https ://www.chems pider .com). The 2D structures were converted into 3D coordinates, and geometries were optimized by using Marvin Sketch (https ://www.chema xon.com/produ cts/marvi n/marvi nsket ch). As several PSM are present in multiple plant species, an approximate 6% duplication was allowed in the main PSM library. Additionally, PSM isomers were considered as separate ligands. All the files were coded and used for further studies.
Target proteins. In the present study, we selected four target proteins, one from human (human transmembrane serine protease 2, TMPRSS2) and three from SARS-CoV-2 (spike protein, M pro and RdRp). These Scientific Reports | (2020) 10:20584 | https://doi.org/10.1038/s41598-020-77602-0 www.nature.com/scientificreports/ selected enzymes/ proteins are well studied for their involvement in host cell recognition, membrane fusion, and viral replication in host cell, which are critical stages in determining viral pathogenicity. The crystal structure of SARS-CoV-2 spike receptor-binding domain bound with hACE2 (PDB ID: 6M0J) (2.45 Å) was retrieved from Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank (PDB) (RCSB, http:// www.rcsb.org) 93 and hACE2 was removed, processed and used for docking studies. The 14 amino acid residues (AARs) (THR415, ASN439, TYR449, TYR453, LEU455, PHE486, ASN487, TYR489, GLN493, GLN498,  THR500, ASN501, GLY502 and TYR505) of spike protein that are key for binding hACE2 94 were considered as active sites for molecular docking process. The TMPRSS2 sequence (NP_001128571.1) was retrieved from the National Center for Biotechnology Information (NCBI) protein database. The structure of TMPRSS2 was generated by using the SWISS-MODEL online server 95 . The structures were marked, superimposed and visualized by using Chimera 96 . Model 1 of our work and O15393 present in UniProt were found 100% similar. Three amino acid residues (HIS296, ASP345 and SER441) of catalytic site 97 were considered as key residues of TMPRSS2 in the molecular docking process. The crystal structure of SARS-CoV-2 M pro (PDB ID: 6LU7) (2.16 Å) 98 was retrieved from RCSB-PDB and used for docking studies after processing. M pro has three domains, and the active site is located between domain I and II. Here, CYS145-HIS41/SER144-HIS163 can act as a nucleophilic agent and GLY143 and GLU166 can form hydrogen bonds with "CO-NH-Cα-CO-NH-Cα" structure of the backbone of the substrate protein. These six residues were considered as the critical residues of M pro in the molecular docking process 99 . The SARS-CoV-2 RdRp protein sequence (YP_009725307.1) was retrieved from the NCBI protein database. The structure of RdRp was generated by using the SWISS-MODEL online server 95 . The structure was marked, superimposed and visualized by using Chimera 96 . The amino acid residue VAL557 in motif F 82 was considered as a critical residue of RdRp in the molecular docking process. Removal of water molecules, metal ions, cofactors, and addition of charges and hydrogen atoms were done by UCSF Chimera tool 96

Molecular docking.
The ligands were energy minimized by conjugate gradients optimization algorithm with total numbers of 200 steps performed as a default universal force field (UFF) parameters 100 . The capability of ligands to interact with the target site of selected proteins was studied following computational ligand-target docking approach. Molecular docking was carried out using PyRx, AutoDock Vina option based on scoring functions 101,102 . The least binding energy (BE, kcal/mol) conformation was considered as the most favourable docking pose. The interactions between ligand and protein were analyzed using PyMOL 103 and Discovery Studio 3.5 (Accelrys Software Inc., San Diego, CA, USA).
Structural activity relationship analysis. According to their BE, all the ligands were assigned ranking and aligned in ascending order. For the ease of the study, top 250 molecules (268 for M pro ) were subjected to ligand structure similarity analysis (based on canonical SMILES) and BE using Data Warrior software (Version 5.2.1).
Physicochemical properties and bioavailability of PSM. The drug-likeness and the physicochemical properties were studied using SwissADME (www.swiss adme.ch). The canonical SMILES of the selected PSM were subjected to SwissADME analysis. The PSM were analyzed for their drug-likeness properties following Lipinski's rule of five 104 . The bioavailability radar charts obtained were analyzed for their drug like properties, i.e., lipophilicity (XLOGP3 between − 0.7 and + 5.0), Molecular weight (between 150 and 500 g/mol), Topological polar surface area (between 20 and 130 Å 2 ), Solubility (log S not higher than 6), Flexibility (no more than 9 rotatable bonds) and Saturation (fraction of carbons in the sp 3 hybridization not less than 0.25) 105 . The bioavailability score (BAS) for selected PSM was also recorded 106 .

Data availability
The supporting data related to the manuscript is provided as Supplementary Files.