Introduction

Historically, microorganisms have provided the source for the majority of the drugs in use today.1 Among these, 45% are produced by actinomycetes, 38% by fungi and 17% by unicellular bacteria.2 Since the advent of bacterial genome sequencing in the mid-1990s, it has become apparent that marine actinomycetes have an unrivalled capacity to synthesize bioactive secondary metabolites with a wide spectrum of bioactivities.3, 4, 5, 6, 7, 8, 9 For example, the genome scanning of the deep-sea actinomycete ‘Verrucosispora maris’ has resulted in the discovery of >20 biosynthetic gene clusters.10 Moreover, the application of species richness estimates to actinobacterial diversity data has predicted a value as high as 1353 taxa in the deep sea, with 90% of these taxa representing novel species and genera. Improved systematics increasingly provides a roadmap to biosynthetic gene clusters and thence to products. As new chemical entities are likely to be discovered from novel actinobacteria, marine actinomycetes are a likely target for improved technological platforms in the search and discovery of novel bioactive compounds.

Many microbial natural products that have reached the market without any chemical modifications are a testimony to the remarkable ability of microorganisms to produce drug-like small molecules.11, 12, 13, 14 Although still in clinical trials, a prime example of this is salinosporamide A (NPI-0052), a novel anticancer agent found in the exploration of new marine environments.15 Natural products, including drugs, are known to occupy a larger and more diverse chemical space than combinatorial chemicals16 and completely novel chemical skeletons continue to be discovered among microbial natural products.17 In 2008, over 1000 marine natural products were reported.18 However, out of the 19 microbial-derived drugs reported in 2008, no natural products from marine microbes were present, signifying the novelty of their systematic exploration.19 Currently, >30 compounds of marine microbial origin are in clinical or preclinical studies for the treatment of different types of cancer,20, 21, 22, 23 clearly demonstrating that marine microorganisms have become an essential resource in the discovery of new antibiotic leads.24 Quoting Newman and Hill,7 ‘The search for bioactive metabolites from marine microbes has only just begun’.

The evolution of marine microbial natural product collections and development of high-throughput screening methods have attracted researchers to the use of natural product libraries in drug discovery.25, 26, 27, 28 These libraries include subsections of crude extracts, pre-fractionated extracts and purified natural products.29 A research group in Ireland has developed a two-dimensional chromatographic strategy that includes an automated HPLC-MS fractionation protocol to generate purified marine natural product libraries that are accurately characterized by mass during production to expedite dereplication of known compounds and identification of novel chemotypes.30 Although purified natural product libraries are suitable for high-throughput screening, such as single protein assays, the crude extract library and the strain library are also very important for the efficient drug discovery process because they are both cheap and easy to obtain. However, diversification and dereplication of strains and products are central to the construction of a high quality microbial natural product library (MNPL) (Figure 1), saving time and resources for future isolation and purification processes.

Figure 1 The flow chart of construction of MNPL and evaluation system.

Biodiversity of marine microbial strains

Recently, large pharmaceutical companies have addressed the premise that the understanding of the chemical diversity of microbial secondary metabolites relies heavily on a good understanding of microbial diversity itself.31 They hypothesize that maximizing biological diversity is the key strategy to maximizing chemical diversity. Some companies, such as Cubist Pharmaceuticals, focus on mining cryptic pathways and on combinatorial biosynthesis to generate new secondary metabolites related to existing pharmacophores.32 The integrated approaches for maximizing the diversity of microbes in drug discovery programs have been reviewed recently, with the selective isolation of novel microorganisms and metagenomic approaches for as yet uncultured microorganisms identified as keys to diversifying microbial resources and gene sources.11, 13, 33, 34, 35

For a culture-dependent bioprospecting strategy, actinobacterial systematics is a guide to capturing biodiversity successfully. In 2009, Li's group isolated at least 60 novel species/genera of marine actinomycetes from saline habitats, polluted soils, deep-sea sediments or as symbionts with other organisms.36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95 These types of novel environments are important for a natural product library that aims to feature novel active compounds. Goodfellow's group has focused on the isolation of novel actinomycetes. They have isolated 10 novel genera of indigenous marine actinobacteria from acidic and alkaline ecosystems, desert biomes, littoral sediments, hyper-arid deserts and marine habitats.96 Indeed, many novel bioactive compounds such as Abyssomicin, Proximicin and Caboxamycin have been isolated from these new actinomycetes.97, 98, 99 Samples from anaerobic habitats, freshwater habitats, high and low temperature environments and low nutrient sites are also important sources for isolating novel actinomycetes.100

Although a large diversity of microorganisms is continuously isolated using traditional methods, the majority of microorganisms from the environment are unculturable in the laboratory.101 In recent years, many innovative techniques have been developed to efficiently isolate novel microorganisms from the marine biosphere.102, 103, 104, 105, 106, 107 First, natural substrates containing living cells as the source of nutrients and potential signaling agents can be used to stimulate the microorganisms that are difficult to culture. This method uses membrane systems that allow the diffusion of small molecules but prevent contamination by undesired bacteria. This strategy has been described for the isolation of marine bacteria using diffusion chambers incubated in an aquarium in the presence of other microorganisms.101 Using this approach, a large number of diverse colonies can be recovered, although new phylotypes may not necessarily be obtained. The addition of factors such as pyruvate, cyclic AMP and homoserine lactones has also been shown to increase the number of microorganisms recovered.13 Second, a selected group of strains can be cultivated using oligotrophic isolation media, such as seawater-based media for marine organisms, which allow only the growth of selectively adapted microbes while inhibiting the majority of the natural population. Third, high-throughput methods in which conventional Petri dishes are replaced by microtiter plates have been developed to access the untapped microbial diversity.
Connon and Giovannoni have developed a high-throughput method using low nutrient medium to isolate and cultivate many marine microorganisms,104, 106 including many uncultured bacteria from bacterioplankton communities.108, 109 Another high-throughput technique involving encapsulation of single cells using gel microdroplets was developed by Diversa Corporation (San Diego, CA, USA).110 This approach combines cultivation of the encapsulated cells under low nutrient flux conditions with flow cytometry to detect microdroplets containing microcolonies. Although high dilution and high-throughput methods have been used to cultivate more microorganisms, it remains unclear whether the microorganisms obtained are able to produce novel secondary metabolites. Nevertheless, these methods are clearly valuable and merit further exploration.

Biodiversity of marine microbial gene resources

Metagenomics is a promising route to uncultured microorganisms. The analysis of DNA isolated from environmental samples has proved very useful for studying these bacteria, which may be a new source of novel antibiotics.111, 112, 113 For example, metagenomes of sponge microbial communities have been shown to contain genes and gene clusters typical for the biosynthesis of biologically active natural products.114 Heterologous expression approaches have also led to the isolation of secondary metabolism gene clusters from uncultured microbial symbionts of marine invertebrates and from soil metagenomic libraries. In the metagenomic approach, isolated DNA is ligated into bacterial artificial chromosome vectors, low copy plasmids that can contain large DNA inserts of up to 300 kb.115 The bacterial artificial chromosome vectors are then transformed into host microorganisms such as Escherichia coli, Streptomyces lividans and Pseudomonas putida.116 The resulting clones can then be screened for biological activity or alternatively probed for sequences of interest.117 For example, Gillespie et al.118 have isolated the antibiotics turbomycin A and B from a metagenomic library of soil microbial DNA. Many other biologically active compounds such as amino acids and fatty acids have been obtained by Singh and Pelaez.31 We have constructed a metagenomic library of about 30 000 clones, of which 16 clones were identified as having interesting biological activity after screening for lipase activity (unpublished data). It is noteworthy that metagenomic approaches will not only stimulate research on marine microbial diversity but will also provide an opportunity for the genetic conservation of marine microorganisms.112 Undoubtedly, metagenome analysis technology combined with high-throughput screening will bring innovation to drug discovery.

Biodiversity of microbial products

Although the biosynthetic and regulatory crosstalk of secondary metabolite biosynthesis is complex within and between microorganisms, all levels can be influenced by imitating natural environmental changes. Use of this method to release nature's chemical diversity has been termed ‘one strain, many active compounds’.119 Bode et al.119 used the systematic alteration of easily accessible cultivation parameters to increase the number of secondary metabolites from a single organism. Very small changes in cultivation conditions resulted in a complete shift in the metabolic profiles of the microorganisms. For example, in the presence of different supplements, various acyl and phenyl α-L-rhamnopyranosides were produced by Streptomyces griseoviridis.120 Bills et al.121 have used bacterial micro-fermentators for fungal growth in nutritional arrays, and the results indicate that the protocols can be used to pre-select strains and their growth conditions for scaling up.

Development and testing of new culture media for the maximum expression of secondary metabolites is as important as genome-guided chemical diversity in the construction of an MNPL. Secondary metabolite production in microbes is strongly influenced by nutritional factors and growth conditions. However, without prior knowledge of the preferred growth conditions for a given microorganism, random assignment of media to strains may generate an inefficient redundancy of metabolites or strains lacking the relevant levels of secondary metabolites. The ‘one strain, many active compounds’ approach can be used together with ‘fingerprint’ methods (HPLC and nuclear magnetic resonance) for the optimization and selection of culture media for high-throughput fermentation of novel strains. Tormo et al.122 developed a method for the selection of production media for actinomycete strains based on their metabolite HPLC profiles, and three media types that yielded the highest metabolite diversity and least overlapping HPLC profiles were selected for large-scale fermentation. Researchers at Merck use eight different types of media for the cultivation of strains to build their product library, from which new antimicrobial compounds such as platensimycin, platencin, philipimycin, fluvirucins, lucensimycins and okilactomycin have been isolated.123, 124, 125, 126, 127, 128 We used 10 different media (Table 1) for the cultivation of novel actinomycetes.129, 130, 131 This approach gave a higher positive hit rate (70%) than the historical hit rate (1%) with the same screening methods (Table 2). To date, 13 novel active compounds have been isolated from the microbial natural products library, including some from fungi.132, 133
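
The media-selection idea described above (keeping the few media whose HPLC profiles are most diverse and overlap least) can be illustrated with a minimal sketch. It assumes that each medium's HPLC profile has already been reduced to a set of peak identifiers, for example binned retention times. The greedy strategy, the function name select_media and the toy peak sets are illustrative assumptions, not the published procedure of Tormo et al.

```python
from typing import Dict, List, Set


def select_media(profiles: Dict[str, Set[str]], k: int = 3) -> List[str]:
    """Greedily pick up to k media whose combined peak sets cover the most distinct peaks."""
    chosen: List[str] = []
    covered: Set[str] = set()
    candidates = dict(profiles)
    while candidates and len(chosen) < k:
        # Gain = number of peaks this medium adds beyond those already covered.
        # Iterating in sorted order makes ties resolve to the first medium name.
        best = max(sorted(candidates), key=lambda m: len(candidates[m] - covered))
        chosen.append(best)
        covered |= candidates.pop(best)
    return chosen


# Hypothetical peak sets for one strain grown in four media (not real data).
profiles = {
    "M1": {"p1", "p2", "p3"},
    "M2": {"p1", "p2"},
    "M3": {"p4", "p5"},
    "M4": {"p3", "p6"},
}
print(select_media(profiles, k=3))
```

In this toy case medium M2 is dropped because every peak it yields is already produced in M1, which is the kind of redundancy that profile-based selection is meant to avoid.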

Table 1 The fingerprint characteristics of 10 culture media for MS098
Table 2 Hits from our library by high-throughput screening model

Dereplication of strains and crude extracts

Discrimination between previously tested or recovered microorganisms (dereplication) is a primary issue for a high-quality MNPL. It is not practical to characterize the large numbers of unknown strains within a library using polyphasic taxonomic approaches. Therefore, tools such as color-grouping, rep-PCR, single strand conformation polymorphism and analytical chemistry (FTIR, MALDI-TOF, pyrolysis mass spectrometry) have been developed to estimate diversity and dereplicate strains. It is noteworthy that strains that are similar according to sequence analysis but unique according to REP analysis can produce different surfactant mixtures under the same growth conditions.134 Thus, the 16S rRNA gene databases commonly used for determining phylogenetic relationships may miss diversity in microbial products (for example, biosurfactants and antibiotics) that are made by closely related isolates. Once active strains have been obtained, the rep-PCR target bands can be amplified with genus-specific primers to dereplicate them. Using this method, we isolated a novel bioactive compound (3304X) from the marine fungus Aspergillus fumigatus MF330 (data not shown).
Pyrolysis mass spectrometry is a whole-cell fingerprinting technique that enables the rapid and reproducible sorting of microorganisms using small samples in a fully automated system.135 It has been successfully used to distinguish nitrile-hydrolysing strains of actinomycetes, revealing a significant variation within pyrogroups containing strains with the same genotypic characteristics, thus demonstrating its discriminatory capacity at the infraspecies level.136 Intact-cell MALDI-TOF mass spectrometry has been used for the rapid clustering of 456 strains based on their proteomes, resolving 11 separate groups and permitting the rapid identification of isolates for dereplication and the selection of rare species.137 In addition, MALDI-TOF MS can also be used for the dereplication of complex microorganism communities based on the mass spectra of constantly expressed, highly abundant proteins, such as ribosomal proteins of 2000–20 000 Da.138 The color-grouping procedure for establishing taxon richness is limited in its ability to distinguish between different pigments. Unfortunately, this means that it is not possible to generate cumulative databases to compare the results of independent studies on indigenous populations of streptomycetes. Recently, a computer-assisted color-grouping method that corresponds with rep-PCR data has been developed for dereplicating large numbers of alkaliphilic streptomycetes.139 This computer-assisted numerical analysis method is a cheap and reliable alternative to molecular and chemical dereplication methods because it offers a minimal taxon description for large numbers of isolates.
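
As a rough illustration of how whole-cell mass spectra can be used to group strains for dereplication, the sketch below bins peak masses in the 2000–20 000 Da window mentioned above, compares strains by the Jaccard similarity of their binned peaks, and merges strains whose similarity exceeds a threshold. The bin width, the threshold, the single-linkage grouping and the peak lists are all assumptions made for illustration and do not reproduce the cited studies.

```python
from typing import Dict, List, Set


def binned(peaks: List[float], width: float = 5.0) -> Set[int]:
    """Map peak masses (Da) within 2000-20000 Da to integer bins of the given width."""
    # Note: two nearly identical masses can fall on either side of a bin edge.
    return {int(mz // width) for mz in peaks if 2000.0 <= mz <= 20000.0}


def jaccard(a: Set[int], b: Set[int]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_strains(spectra: Dict[str, List[float]], threshold: float = 0.6) -> List[List[str]]:
    """Single-linkage grouping: strains sharing enough binned peaks end up together."""
    names = sorted(spectra)
    bins = {n: binned(spectra[n]) for n in names}
    parent = {n: n for n in names}

    def find(n: str) -> str:
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if jaccard(bins[a], bins[b]) >= threshold:
                parent[find(b)] = find(a)

    groups: Dict[str, List[str]] = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())


# Hypothetical peak lists (Da); not measured spectra.
spectra = {
    "strain_A": [4365.2, 5096.8, 6255.4, 7274.1],
    "strain_B": [4365.9, 5097.3, 6256.0, 7273.5],
    "strain_C": [3120.0, 4800.5, 6255.7, 9050.2],
}
print(group_strains(spectra))
```

One representative per group could then be carried forward for fermentation, with singleton groups flagged as candidates for rare taxa.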

Another issue is the dereplication of chemical compounds from the library. Tandem analytical techniques such as MS/MS/MS, GC-EI/MS, HPLC-SPE-NMR, LC-MS-MS and LC-NMR have also been developed for dereplication of natural products.140, 141, 142, 143, 144, 145 However, these methods can lead to false compound identifications because there are uncertainties in the observed pseudomolecular ions such as MH+, MNa+, MK+, MNH4+, MCH4CN+ etc.146 To speed up dereplication, Bitzer and Bradshaw147, 148 have used a reference library-supported analytical technique for the dereplication of crude extracts and pre-fractionated samples. However, this kind of approach is not always viable because the databases needed are not widely available. Recently, Munro's group has developed a new technology using HPLC profiling with biological evaluation followed by capillary probe NMR spectroscopy/ESMS/UV combined with NMR database (AntiMarin) evaluation, which reduces the crude extract requirement for dereplication to submilligram quantities.149 Using this method, they isolated a new peptaibol, chrysaibol, from a New Zealand isolate of the mycoparasitic fungus Sepedonium chrysospermum.150 It is worth noting that the method developed by Munro et al. will be effective for crude extract evaluation, isolation and dereplication.
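
The ambiguity caused by pseudomolecular ions can be made concrete with a short sketch. For one observed m/z value it lists the neutral mass implied by each adduct and checks which entries of a reference list fall within a ppm tolerance. The adduct shifts are standard monoisotopic values for singly charged positive ions; the reference compounds, their masses, the tolerance and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Monoisotopic mass shifts (Da) for singly charged positive-mode adducts.
ADDUCT_SHIFTS: Dict[str, float] = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033823,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
    "[M+CH3CN+H]+": 42.033823,
}


def candidate_neutral_masses(mz: float) -> Dict[str, float]:
    """Neutral mass implied by the observed m/z under each adduct assumption."""
    return {adduct: mz - shift for adduct, shift in ADDUCT_SHIFTS.items()}


def match_reference(mz: float, reference: Dict[str, float],
                    tol_ppm: float = 5.0) -> List[Tuple[str, str]]:
    """Return (compound, adduct) pairs whose mass agrees within tol_ppm."""
    hits = []
    for adduct, neutral in candidate_neutral_masses(mz).items():
        for name, mass in reference.items():
            if abs(neutral - mass) / mass * 1e6 <= tol_ppm:
                hits.append((name, adduct))
    return sorted(hits)


# Hypothetical reference entries (neutral monoisotopic masses in Da).
reference = {"compound_A": 500.000000, "compound_B": 478.018058}
print(match_reference(501.007276, reference))
```

Here a single ion is consistent with compound_A as a protonated species and with compound_B as a sodium adduct, which is why mass alone is usually combined with UV, NMR or retention-time data before a compound is declared known.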

Last but not least is the dereplication of positive hits through the development of target biosensor organisms (Candida albicans and Staphylococcus aureus) with differential whole-cell sensitivity. Cubist Pharmaceuticals constructed a multi-drug resistant E. coli strain, which carries resistance markers for 17 of the most frequently produced antibiotics.32 Thus, a comparison of extract activities against sensitive and resistant E. coli strains allows researchers to rapidly prioritize extracts whose activities cannot be accounted for by any of these 17 antibiotics. We also used this strain to dereplicate the active extracts from our natural product library. In addition, we have developed a streptomycin- and rifampicin-resistant Mycobacterium smegmatis strain for the screening of novel antituberculous compounds. Encouragingly, some extracts, while not active against the sensitive strain, are active against the resistant strain, leading to the rapid discovery of novel and specific active compounds.
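
The sensitive-versus-resistant comparison amounts to a simple decision table, sketched below. The boolean activity calls, the category labels and the extract names are illustrative assumptions; in practice the calls would come from inhibition measurements with an assay-specific cut-off.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExtractResult:
    name: str
    active_on_sensitive: bool  # inhibits the antibiotic-sensitive strain
    active_on_resistant: bool  # inhibits the multi-drug resistant strain


def classify(result: ExtractResult) -> str:
    """Label an extract from its activity against the paired strains."""
    if result.active_on_sensitive and result.active_on_resistant:
        # Activity is not explained by any antibiotic the strain resists.
        return "prioritize"
    if result.active_on_sensitive:
        return "likely known antibiotic"
    if result.active_on_resistant:
        return "resistant-strain specific"
    return "inactive"


def prioritized(results: List[ExtractResult]) -> List[str]:
    return [r.name for r in results if classify(r) == "prioritize"]


# Hypothetical screening calls (not real data).
results = [
    ExtractResult("extract_01", True, True),
    ExtractResult("extract_02", True, False),
    ExtractResult("extract_03", False, True),
    ExtractResult("extract_04", False, False),
]
for r in results:
    print(r.name, classify(r))
```

The third category corresponds to the case noted above, in which an extract is active only against the resistant strain and may therefore merit separate follow-up.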

Evaluation of the microbial strain and product library

Recent advances in molecular techniques have led to the development of databases that describe microbial diversity at the genetic level based on 16S rRNA sequence diversity.151, 152, 153 This mass of information and the highly conserved nature of the 16S rRNA gene can be used to identify and evaluate the diversity of the MNPL, and to dereplicate genetically similar microorganisms.154 In addition to general diversity, specific primers can be used to evaluate samples and target actinobacteria that have not been described earlier.155 With specific primers, Stach et al.156 could detect an actinobacterial diversity at least one order of magnitude higher than that obtained with current culture-based techniques. In addition, denaturing gradient gel electrophoresis is another molecular tool used to detect the phylogenetic ‘fingerprint’ of diverse marine microorganisms.157, 158, 159 It is worth noting that the addition of new 16S rRNA gene sequence information and the changes in phylogenetic positions of some taxa influence decisions about which 16S rRNA nucleotides to define as taxon specific. On the basis of 10 years of development in the identification of actinomycetes, Zhi et al.160 redefined the higher ranks of the class Actinobacteria, with the proposal of two new suborders, four new families and emended descriptions of the existing higher taxa.

The quality of the microbial products in the library in terms of their physicochemical properties is critical for their development into usable drugs. Lipinski's rule of five, focusing on the MW and structure complexity, has been used as a rule of thumb to indicate whether a molecule is likely to be orally bioavailable (bioactive).161 However, various authors, including Lipinski, have pointed out that many antibacterial compounds are exceptions to these rules, primarily because of their higher MW and polarity. For example, daptomycin and cyclosporine A have MWs of 1620 and 1202, respectively, both of which are larger than Lipinski's ideal value of 500.162, 163 Leeson and Davis164 also showed a deviation in anti-infective drugs toward higher MW and increased polarity. Another example is the result from Ganesan's group that some 24 unique natural products in the 1970–2006 period violated Lipinski's rules.19 Of particular interest is O'Shea's finding that the unique physicochemical property space required for antibacterial active compounds, and especially Gram-negative antibacterials, must be taken into account during high-throughput screening when identifying hits with whole-cell activity.165 These physicochemical properties are paramount in the uptake of compounds through the cell membrane and therefore in the druggability of the compound.
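
For reference, the rule of five is commonly stated as four thresholds: MW of at most 500, calculated logP of at most 5, no more than 5 hydrogen-bond donors and no more than 10 hydrogen-bond acceptors. The sketch below simply counts how many of these a compound exceeds. The class and function names are hypothetical, and the descriptor values are placeholders rather than measured properties of any real compound; only the MW of 1620 echoes the figure quoted above.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mw: float              # molecular weight (Da)
    logp: float            # calculated octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five thresholds are exceeded."""
    return sum([
        d.mw > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])


# Placeholder descriptor sets for illustration only.
small_molecule = Descriptors(mw=350.4, logp=2.1, h_bond_donors=2, h_bond_acceptors=5)
large_polar = Descriptors(mw=1620.0, logp=-1.0, h_bond_donors=20, h_bond_acceptors=25)

print(lipinski_violations(small_molecule))
print(lipinski_violations(large_polar))
```

A library-level summary of such counts would show how far a natural product collection departs from conventional oral drug space, which, as noted above, is expected for many antibacterial compounds and should not by itself be used to discard hits.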

Evaluating the potential of a library of compounds and identifying those with new activities and/or new structures is the basis of quality evaluation. Targeted high-throughput screening methods are important for the speed and accuracy of identification of novel antimicrobials; for example, in the case of platensimycin, paltemycin and MAC13243.123, 124, 166, 167 We have focused on systematic biology to develop high-throughput synergy screening for synergistic antifungal compounds (Figure 2).12 In our group, special target screening (pABA synthesis inhibition model; 14-3-3 protein binding model and AHLs quorum-sensing inhibition model) and architecture-based screening (enediyne synthesis model) are used to give a comprehensive activity evaluation of the compounds in the MNPL. From these evaluation models, many crude extracts or purified compounds were obtained as positive hits (Table 2). In addition to serving evaluation purposes, these screening assays also provide mode-of-action hypotheses for the crude extracts.

Figure 2 Schematic representation of high-throughput methods applied to the process of drug discovery from natural resources. A full color version of this figure is available at The Journal of Antibiotics journal online.

Novel gene clusters such as polyketide synthases (PKSs) and nonribosomal peptide synthetases (NRPSs) in the library indicate the likelihood of novel compounds being produced. Ayuso-Sacido and Genilloud168 designed new PCR primers specifically targeted to amplify NRPS and PKS-I gene sequences from actinomycetes. By using primers designed by Ayuso-Sacido and Courtois, Goodfellow's group identified five strains with NRPS and PKS gene clusters from 38 marine actinomycetes.96 By using Ayuso-Sacido's primers, we have made a primary evaluation of the MNPL, in which 14 strains with PKSs and NRPSs were identified from the randomly selected strain library and 27 PKSs and NRPSs were identified from the metagenomic library (Table 3). The Mohanty group has developed NRPS-PKS software and organized the sequence information on various experimentally characterized NRPS and PKS gene clusters in the form of searchable computerized databases.169 This approach may facilitate the evaluation of the presence of NRPS and PKS gene clusters in the strain library. Recently, genome mining has been used for the successful induction of a silent metabolic pathway in the important model organism Aspergillus nidulans, which led to the discovery of novel PKS-NRPS hybrid metabolites.170 This method can also align strains based on genus and provides information on the metabolites and functional compounds, as many metabolites of actinomycetes are produced by PKSs and NRPSs.135 Merck researchers have developed a new fingerprinting approach based on the restriction analysis of these PKS and NRPS-amplified sequences, and observed a good relationship between the presence of PKS-I, PKS-II and NRPS sequences and the antimicrobial activities in Streptomyces.135

Table 3 The PKS and NRPS screening results

In summary, the combined results of molecular evaluation techniques (16S universal and specific primers, PKS and NRPS gene clusters), physicochemical properties of compounds, and activities in vivo and in whole-cell assays determine the novelty and quality of the marine microbial strain and product library.

Conclusion

This review expresses our long-term interest in the construction of a high-quality microbial natural product library in the facilitation of high-throughput screening programs for the drug discovery process. In the initial phase, it is hard to say whether the active compounds obtained from our natural product library are novel; therefore, great care must be taken to dereplicate known compounds before comparing with the dictionary of the natural product library, to identify which compounds are truly novel. Although tools for the evaluation of the microbial natural product library have been developed and can be directly applied to maintain the quality of the library, the initial biodiversity of the strains used for the construction of the library is of ultimate importance to reduce the redundancy of chemical compounds. Selection of novel marine actinomycetes combined with the employment of a high-throughput fermentation method maximizes the biodiversity foundation of the library.