Introduction

Various strategies have been used to study the relationship between ecosystem functioning and the structure of microbial communities. Major goals of these efforts are to attribute key functions to specific community members and, in view of ecosystem stability, to reveal cooperation between community members and functional redundancies. Besides measuring enzyme activities, respiration rates and metabolite concentrations (Landi et al., 2000; Moreno et al., 2001), nucleic acids (Eilers et al., 2000) and lipids (Fredrickson et al., 1995) are often used as markers for microbial identity and, to a certain extent, for metabolic potential and its expression in environmental samples (Kanagawa, 2003; Fields et al., 2006). Structural data obtained from lipids are based on the occurrence of fatty acids or quinones that are specific for certain taxa. However, not all known bacteria have specific markers. Using phospholipid fatty acids (PLFA), a metabolic function is only inferred from the presence of microorganisms known for that function. In comparison to lipids, the amplification of structural and functional genes is a good indicator of the presence of microorganisms and their metabolic potential. Meanwhile, DNA and RNA from environmental samples can be quantified in one step using microarrays with several thousand probes for known genes and pathways involved in biodegradation and metal resistance (Zhou, 2003; Rhee et al., 2004). Although the analysis of thousands of genes from environmental samples is already possible, the need to select probes in advance restricts the analysis to genes that are already known. Gene probes are also used in a combination of fluorescence in situ hybridization (FISH) and microautoradiography (MAR) to assign structural and functional genes to single cells of microbial communities (Wagner et al., 2006).
In liquid samples, population dynamics can be followed by multiparametric flow cytometry and phylogenetic identification of subcommunities by 16S ribosomal RNA (16S rRNA) gene sequencing (Kleinsteuber et al., 2006).

Another strategy to assign activity to microorganisms is the use of substrates labeled with stable isotopes. The stable isotopes are predominantly incorporated into active biomass. Their enrichment in PLFA (Boschker et al., 1998) and DNA (Radajewski et al., 2000) of active microorganisms allows these to be differentiated from inactive community members. However, intermediates of the added labeled substrate can also be used by other species that are not involved in the pathway, which may lead to misleading results (Manefield et al., 2006).

In comparison to lipids and nucleic acids, proteins are promising alternative markers since they reflect the actual functionality with respect to metabolic reactions and regulatory cascades, and give more direct information about microbial activity than functional genes and even the corresponding messenger RNAs (mRNAs; Wilmes and Bond, 2006). In addition, proteins have the potential to reveal the identity of the active microorganisms via database analysis based on the level of homology to other species. Furthermore, PCR-based techniques for nucleic acid detection are often biased by their preference for some templates and are thus not ideal for quantitative analysis (Kanagawa, 2003). Finally, regulation of metabolic processes can take place at the post-transcriptional level, uncoupling the quantities of mRNA transcripts from actual activity. Thus, the presence of specific proteins in environmental samples is a potentially reliable indicator of microbial function.

Meanwhile, a variety of powerful methods make it possible to visualize and identify the proteome of pure microbial cultures, and proteomes of microbial communities are increasingly being analyzed. Proteomics has become a useful complement to functional genomics. The availability of numerous genomes triggered this development by facilitating the identification of proteins, although the available genome data are still likely to represent only a tiny fraction of the genetic information present in environmental samples. The most widely used approach in proteomics is the separation of proteins by 2D-electrophoresis (2-DE), which combines isoelectric focusing (IEF) in the first dimension and sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) in the second dimension (Klose, 1975; O'Farrell, 1975; Görg et al., 1988). This separation is followed by in-gel digestion of selected spots and protein identification by mass spectrometry (MS). In the case of proteins from fully sequenced organisms, identification is achieved by database comparison of the peptide mass fingerprint (matrix-assisted laser desorption ionization time-of-flight MS (MALDI-ToF-MS)) or of fragmentation data of the peptides (electrospray ionization source tandem mass spectrometry (ESI-MS/MS) or MALDI-ToF/ToF-MS). Peptides from unsequenced species can be identified in two ways: either the peptides and their MS/MS fragments resemble those of known proteins (in the case of high homology), or the peptides are sequenced by MS and exhibit sufficient homology to published sequences (in the case of lower homology). Owing to the ever-increasing number of fully sequenced genomes, there is a fairly good chance that homologies can be found. A prerequisite for this procedure is high-quality MS data, which requires pure samples containing enough copies of every protein.

The preparation of protein extracts is thus of paramount importance. This is the main reason why proteomics has been restricted mostly to microbial isolates in culture. In fact, its application to soil requires considerably improved protocols for protein extraction and sample preparation (Nannipieri, 2006; Ogunseitan, 2006), since these steps are critical for achieving high resolution and sensitivity in 2-DE. The extraction of water-soluble proteins from soil has been proposed recently (Schulze et al., 2005). In this approach, minerals were dissolved by hydrofluoric acid and the extracted proteins identified by MS coupled to liquid chromatography. Methods based on freeze-thawing or bead beating of soil for extracting intra- or extracellular proteins have also been proposed (Singleton et al., 2003). The greatest difficulty lies in the separation of proteins from humic acids. Another environmental medium of emerging importance is groundwater, owing to the increasing need for high-quality drinking water. Here, the challenge lies in the low abundance of biomass. To date, several authors have presented metaproteome data from different environmental systems: soil particles (Schulze et al., 2005), activated sludge (Wilmes and Bond, 2004), biofilms (Ram et al., 2005) and seawater (Kan et al., 2005). Additionally, metaproteomics has been used to determine the stress response of mixed cultures after exposure to cadmium (Lacerda et al., 2007).

The objectives of this paper were (i) to develop an extraction protocol that allows the analysis of the protein composition of environmental samples by a combination of SDS-PAGE or 2-DE and liquid chromatography linked online to MS/MS via an electrospray ionization source (LC-ESI-MS/MS) and (ii) to use the metaproteomic information to attribute metabolic functions to community members, that is, to link structural and functional aspects of the microbial diversity of these ecosystems. The method was applied to soil microcosms that had been enriched in inoculated chlorophenoxy acid-degrading bacteria by incubation with 2,4-dichlorophenoxy acetic acid (2,4-D) and to groundwater from the chlorobenzene-contaminated aquifer of Bitterfeld (Saxony-Anhalt, Germany), where chlorobenzene is aerobically degraded by a microbial community whose composition is known but whose members have unknown individual roles. Both environments are examples of ecosystems contaminated with organic compounds, and they were chosen to demonstrate that metaproteomics may be used to indicate biodegradation processes.

Experimental procedures

Enrichment of chlorophenoxy acid-degrading bacteria in soil percolation experiments

Two columns (Table 1) of 250 ml volume were filled with 220 g compost soil from Leipzig (Germany), which is rich in humic organic matter. The first column was inoculated with 50 ml of a mixture of Cupriavidus necator JMP134, Rhodoferax sp P230 and Sphingomonas herbicidovorans B488 (30 mg dry weight) previously grown on succinate as carbon and energy source. In the second column, no bacteria were added and only the indigenous microorganisms were present as starter culture. The columns were percolated with minimal medium (Muller and Babel, 1986), containing 2 mM 2,4-D at a flow rate of 3 ml h−1. After 3 days, the flow was increased to 6 ml h−1 and after 5 days to 12 ml h−1. After 7 days, the flow rate was decreased to 6 ml h−1 and remained the same until the end of the experiment after 22 days. The concentration of 2,4-D was determined by HPLC separation and UV-detection as described previously (Oh and Tuovinen, 1990). Five grams of soil were sampled after homogenization of the column content.

Table 1 Experimental conditions used for soil percolation experiments and in the groundwater treatment reactor

Treatment of groundwater samples

The groundwater (Table 1) originated from the effluent of a field reactor (0.6 m diameter × 12 m length) filled with lignite-containing glacial sandy material (porosity 0.2) of the quaternary aquifer at Bitterfeld, Germany. The reactor was continuously recharged with anoxic, contaminated groundwater (15 mg l−1 to 20 mg l−1 chlorobenzene) from the site at an average linear velocity of 1.4 m day−1. During the sampling period, a short pulse of pure oxygen gas was sparged daily into the interstitial voids of the reactor material. Upon gas injection, the oxygen gas dissolved creating intermittently aerobic conditions for several hours throughout the full length of the reactor. Cellular biomass (motile and sheared-off cells from biofilms covering the reactor filling) of 1 l reactor effluent was obtained by centrifugation (16 000g, 10 min, 4°C). The cells were collected on filters, stained with 4′,6-diamidino-2-phenylindole (DAPI) and counted using fluorescence microscopy (Vogt et al., 2004).

Protein extraction and purification

The original compost soil was used to develop the extraction and purification method applied later on also for groundwater samples with low protein contents. This led to the following protocol: 5 g of the original compost soil or the soils from percolation experiments were treated with 10 ml 0.1 M NaOH for 30 min (Figure 1). The suspension was centrifuged 10 min at 16 000g and 20°C. About 6 ml of supernatant were mixed with 16 ml liquid phenol (10 g phenol and 1 ml water) and 10 ml water and shaken for 1 h at 20°C. Afterwards, the phases were separated by centrifugation (10 min at 14 000g). About 15 ml of the lower phenol phase were collected and washed by mixing with 15 ml water, followed by 5 min shaking and subsequent centrifugation. The proteins in the phenol phase (15 ml) were precipitated with the 5-fold volume of 0.1 M ammonium acetate in methanol at −18°C, overnight. Then, the sample was centrifuged (10 min, 16 000g, 0°C), the pellet was suspended by ultrasonication in 10 ml 0.1 M ammonium acetate in methanol, incubated 15 min at −18°C and centrifuged again. The pellet was successively washed in 2 ml 0.1 M ammonium acetate in methanol, 2 ml 80% acetone, 2 ml 70% ethanol, each washing step including 15 min incubation and subsequent centrifugation.
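The precipitation step scales with the volume of phenol phase that is collected. As a minimal sketch only (the function name and its default argument are illustrative and not part of the published protocol), the 5-fold volume of 0.1 M ammonium acetate in methanol can be computed as follows:

```python
def precipitant_volume_ml(phenol_phase_ml, fold=5.0):
    """Return the volume (ml) of 0.1 M ammonium acetate in methanol
    needed to precipitate proteins from a given phenol phase volume.

    The 5-fold ratio is taken from the protocol text; the helper itself
    is a hypothetical convenience function.
    """
    if phenol_phase_ml <= 0:
        raise ValueError("phenol phase volume must be positive")
    return phenol_phase_ml * fold


# Soil extract: 15 ml of phenol phase are collected.
soil_precipitant_ml = precipitant_volume_ml(15.0)
print(f"Soil extract: {soil_precipitant_ml:.0f} ml precipitant")
```

For the 15 ml phenol phase of the soil extract this gives 75 ml of precipitant, so the precipitation vessel has to hold about 90 ml in total.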

Figure 1

Scheme for extraction and purification of proteins from soil.

The cell pellet from the groundwater was incubated with 400 μl 0.1 M NaOH (1400 min−1, 20°C). The sample was centrifuged (10 000g, 10 min, 4°C) and 400 μl supernatant was mixed with 900 μl liquid phenol and 400 μl distilled water. The phases were separated by centrifugation (10 min at 16 000g) and protein in the phenol phase (1 ml) was precipitated with the 5-fold volume 0.1 M ammonium acetate in methanol at −18°C, overnight. Afterwards, the pellet was resuspended and washed as described above.

SDS-PAGE and 2D-PAGE of extracted proteins

Extracted proteins were separated by SDS-PAGE (Laemmli, 1970) or 2D-polyacrylamide gel electrophoresis (2D-PAGE; Klose, 1975; O'Farrell, 1975; Görg et al., 1988). Before electrophoresis, the protein pellets were dissolved in 10 μl deionized water by sonication (sonication bath, 5 min). For SDS-PAGE, 2 μl to 5 μl solubilized proteins were mixed with sample buffer (Laemmli, 1970), incubated 5 min at 90°C and loaded on SDS gels (4% stacking gel, 12% separating gel). For 2D-PAGE, 8 μl of solubilized protein were mixed with 50 μl IEF sample buffer (8 M urea, 2 triton X100, 2% CHAPS, 0.28 dithiothreitol (DTT), 0.5% IPG-buffer pH 3-10 NL and bromophenol blue) and treated as described previously (Benndorf et al., 2006). After electrophoresis, gels were stained with colloidal Coomassie brilliant blue and dried in a stream of unheated air.

Identification of proteins

For identification of proteins from SDS-PAGE of soil or groundwater samples, the complete lanes were divided into bands and digested overnight with trypsin (Santos et al., 2004). For identification of proteins from 2D gel of groundwater, the most abundant spots were excised and digested overnight. The extracted peptides were separated by reversed-phase nano-LC (LC1100 series, Agilent Technologies, Palo Alto, CA, USA; column: Zorbax 300SB-C18, 3.5 μm, 150 × 0.075 mm; eluent: 0.1% formic acid, 0% to 60% acetonitrile) and analyzed by MS/MS (LC/MSD TRAP XCT mass spectrometer, Agilent Technologies). Database searches were carried out with MS/MS ion search (MASCOT, http://www.matrixscience.com) against NCBInr. Alternatively, the mass spectra were searched against the NCBInr using the software Spectrum Mill (Agilent Technologies).
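Database search engines compare measured peptide spectra with peptides predicted from sequence entries by an in silico digest. The following sketch illustrates the commonly used simplified trypsin rule (cleavage C-terminal to lysine or arginine, but not before proline). It is not the algorithm implemented in MASCOT or Spectrum Mill, and the example sequence is made up for illustration.

```python
def tryptic_digest(sequence, min_length=1):
    """Split a protein sequence into peptides using a simplified trypsin
    rule: cleave after K or R unless the next residue is P.

    Missed cleavages and modifications are ignored in this sketch.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Keep the C-terminal peptide if the sequence does not end in K or R.
    if start < len(sequence):
        peptides.append(sequence[start:])
    return [p for p in peptides if len(p) >= min_length]


# Hypothetical sequence; note that K followed by P is not cleaved.
example = "MKWVTFISLLRAKPGR"
print(tryptic_digest(example))
```

Real search engines additionally allow for missed cleavages and variable modifications, which is one reason why short or incomplete sequence coverage can still lead to an identification.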

Results and discussion

Using proteins as biomarkers requires their extraction and purification from environmental samples. Our objective was thus to develop an extraction protocol for a wide range of samples, from soil, which is rich in humic organic matter, to groundwater, which contains only low amounts of protein. Since organic soil is the most complex of these matrices, an extraction and purification protocol suitable for compost soil was developed first and later adapted to aqueous samples. There are basically two different strategies for protein extraction from soil. Whereas some groups used buffers containing salt and detergents to release proteins from the organic and inorganic components of the soil matrix (Rigou et al., 2006), others extracted proteins by dissolving the soil mineral matrix in concentrated HF (Schulze et al., 2005). More rigorous conditions extract more proteins, but also more contaminating humic compounds, which makes further purification necessary. Our protocol consists of unselective extraction, protein purification and precipitation (Figure 1). An initial extraction step with 0.1 M NaOH is widely used to extract humic compounds from soil (isolation of International Humic Substances Society (IHSS) soil fulvic and humic acids, http://www.ihss.gatech.edu/soilhafa.html). This treatment has the side effect of efficiently lysing bacteria (often in combination with SDS) for subsequent extraction of protein (Guerlava et al., 1998). The dark-brown color of NaOH extracts from untreated compost soil indicated that, besides intracellular proteins of soil microorganisms, high amounts of humic organic matter were eluted. We attempted to remove this material by a two-phase extraction with water and liquid phenol, which had been used to purify proteins from lysates of olive leaves (Wang et al., 2003) and fungal spores (Benndorf, unpublished results).
Phenol is also used to remove proteins in DNA and RNA purification protocols (Kirby, 1956), because its mild hydrophobicity makes it a better solvent for proteins than water. After phase separation, most of the brown color was found in the water phase, which was separated from a clear, light-yellow phenol phase by an opaque, creamy interphase. After precipitation of the contents of all phases, nearly the entire pellet obtained from the phenol phase was soluble in sample buffer for SDS-PAGE, whereas the pellets from the water phase and the interphase contained much insoluble dark-brown material. During SDS-PAGE of the soluble material of all phases, some material precipitated in the sample slots and produced a brown smear over the entire length of the lanes (not shown). A subsequent Coomassie stain also detected the brown substances, thus confirming that this dye visualizes humic compounds (Aoyama, 2006). The phenol phase contained significantly less brown smear, showing that most humic compounds were successfully removed. These findings correspond to the data of Simonart et al. (1967), who used a similar procedure for enrichment of bulk soil proteins. However, even in the lane of the phenol phase no defined band was visible. A likely reason is the lack of dominant organisms and enzymes in the highly complex and heterogeneous control soil (Singleton et al., 2003). Another possibility is that our procedure also extracted large amounts of extracellular protein from soil particles (Nannipieri, 2006), which is presumably also heterogeneous and prevented the detection of bands. To overcome this problem and to introduce target proteins for metaproteome analysis, we enriched the soil with autochthonous chlorophenoxy acid-degrading bacteria by percolating the soil with 2,4-D and inoculated another 2,4-D-percolated soil column with the 2,4-D-degrading bacteria C. necator JMP134, Rhodoferax sp P230 and S. herbicidovorans B488.
We opted against simply spiking soil with either protein or massive amounts of known bacteria in order to approximate the natural situation, where a bacterial community has time to establish tight interactions with its habitat, including physicochemical interactions with the soil matrix. After 22 days, 2,4-D (2 mM in the influent) was completely degraded during passage through the inoculated column, whereas 33 μM 2,4-D remained in the eluate of the column with autochthonous bacteria, showing that 2,4-D-degrading communities had established in both columns. Protein extracts from both columns were separated by SDS-PAGE, and some bands were detected against a blue background (Figure 2). The band patterns were similar except for bands at 29 and 12 kDa that were only present in the sample that had been inoculated with lab cultures (Figure 2, lane 2). The separation of both samples by 2-DE failed, although phenol extraction is one of the most powerful purification steps in sample preparation. Other purification strategies, for example trichloroacetic acid (TCA) precipitation and acetone washes, were also tried. In most cases, much brown material accumulated in the acidic half of the IPG strip. An alternative way to remove that material would be liquid IEF. Finally, each lane of the SDS-PAGE from soil samples was cut into 23 pieces and submitted to protein identification by nanoLC-ESI-MS/MS. 2,4-Dichlorophenoxyacetate dioxygenase and outer membrane protein (porin) were identified in both samples, whereas catechol 1,2-dioxygenase, an enzyme found in the metabolism of chlorinated catechol intermediates, and molecular chaperone GroEL were only identified in the sample from the column inoculated with lab cultures. The presence of both metabolic enzymes known to be involved in the catabolism of 2,4-D corresponds well with the observed degradation of 2,4-D in both columns.
The failure to identify catechol 1,2-dioxygenase in the column with autochthonous bacteria may be due to low gene expression or insufficient sensitivity of the MS analysis, which would be consistent with the lower degradation rate observed in that column. Unfortunately, the database search failed to identify the proteins at 29 and 12 kDa that were only present in the extract from the inoculated column. The appearance of distinguishable amounts of an outer membrane porin and the molecular chaperone GroEL indicates that percolation with 2,4-D increased the microbial biomass, presumably of species involved in degradation of 2,4-D. Porins are involved in the transport of small molecules, and chaperones in maintaining the integrity of proteins. All four proteins, regardless of the column of origin, matched best (hits with the highest Mowse score in MASCOT) with entries of C. necator JMP134, a well-described 2,4-D-degrading species. For the column without specific inoculation, it appears that an organism identical or related to C. necator JMP134 that was present in the compost soil was enriched upon growth on 2,4-D. For the inoculated column, it may indicate that this autochthonous strain and/or its inoculated relative outcompeted Rhodoferax sp P230 and S. herbicidovorans B488 under the applied conditions.

Figure 2

Proteins extracted from soil columns percolated with 2,4-dichlorophenoxy acetic acid (2,4-D). Lane 1: standard proteins; lane 2: soil from column augmented with C. necator JMP134, Rhodoferax sp P230 and S. herbicidovorans B488; and lane 3: soil from column without inoculation. Phenol phases were precipitated and washed with organic solvents before sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) and staining with Coomassie brilliant blue G250.

Assuming that the entire 1.285 g of 2,4-D supplied to the column during the 22 days was metabolized, converted into biomass (assumed yield: 0.2 g dry mass/g of 2,4-D; protein content: 0.5 g protein/g of dry mass) and retained as such, 1 g of soil should contain at least 0.58 mg protein from 2,4-D degraders, which is 2 to 10 times more than the typical protein content of natural soils (Vance et al., 1987). It thus appears that this relatively high amount of protein enabled the identification of at least some proteins. To calculate the extraction efficiency, a protein determination after phenol extraction and subsequent precipitation with organic solvent was not possible, since the precipitates were only soluble in SDS sample buffer or IEF sample buffer containing high concentrations of detergents and reducing reagents. On the basis of a comparison of the intensity of the complete lane on SDS gels with a bovine serum albumin (BSA) standard on the same gel, a rough estimate of the extraction yield indicated that only 31 μg protein were extracted from 5 g soil that, in theory, could have contained 2.5 mg protein. This corresponds to an extraction efficiency of about 1% and illustrates the difficulties of quantitative protein extraction from soil. There may be some potential for improvement by starting extractions with higher amounts of soil, increasing the ionic strength during the elution step or adding metal ion-complexing compounds to minimize the interactions between proteins and humic acids (Criquet et al., 2002). However, elution of soil organic matter with NaOH is a very strong and effective procedure (80% of SOM; Stevenson, 1982) that probably also minimizes the interactions between proteins and humic compounds. Therefore, the potential effect of increasing the ionic strength or adding complexing compounds appears to be limited.
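The estimate above can be retraced with a few lines of arithmetic. The sketch below uses only the values quoted in the text (the 220 g of soil per column is taken from the experimental procedures); the variable and function names are illustrative.

```python
# Values quoted in the text.
SUBSTRATE_G = 1.285            # 2,4-D supplied over 22 days (g)
BIOMASS_YIELD = 0.2            # g dry mass per g 2,4-D (assumed in the text)
PROTEIN_FRACTION = 0.5         # g protein per g dry mass (assumed in the text)
COLUMN_SOIL_G = 220.0          # soil per column (g)
SAMPLE_SOIL_G = 5.0            # soil used per extraction (g)
EXTRACTED_UG = 31.0            # protein recovered, estimated against BSA (ug)
THEORETICAL_MG_QUOTED = 2.5    # theoretical protein in 5 g soil as quoted (mg)


def expected_protein_mg_per_g_soil():
    """Upper-bound protein content of 2,4-D degraders per gram of soil."""
    protein_mg = SUBSTRATE_G * BIOMASS_YIELD * PROTEIN_FRACTION * 1000.0
    return protein_mg / COLUMN_SOIL_G


def efficiency_percent(extracted_ug, theoretical_mg):
    """Extraction efficiency as a percentage of the theoretical amount."""
    return extracted_ug / (theoretical_mg * 1000.0) * 100.0


per_g = expected_protein_mg_per_g_soil()
eff_quoted = efficiency_percent(EXTRACTED_UG, THEORETICAL_MG_QUOTED)
eff_from_per_g = efficiency_percent(EXTRACTED_UG, per_g * SAMPLE_SOIL_G)

print(f"Expected protein: {per_g:.2f} mg per g soil")
print(f"Efficiency (2.5 mg quoted): {eff_quoted:.2f} %")
print(f"Efficiency (5 g x per-gram value): {eff_from_per_g:.2f} %")
```

Both routes give an efficiency close to 1%, so the conclusion does not depend on whether the rounded 2.5 mg or the per-gram value multiplied by 5 g is used as the theoretical amount.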

The protocol was also applied to a chlorobenzene-contaminated groundwater sample containing about 10^7 to 10^8 cells l−1. Besides cells, the centrifugation pellet of 1 l groundwater also contained particles from the reactor material consisting of sand and lignite microparticles, as could be seen from its dark color. Proteins were extracted with 0.1 M NaOH and purified by phenol extraction as described above. One-third of the final precipitate was separated by SDS-PAGE (Figure 3). About 20 Coomassie blue-stained bands were distinguished. The remaining two-thirds of the sample were separated by 2-DE, in which about 100 spots were detected (Figure 4). The results show that the extraction protocol worked well with the groundwater samples, which were low in humic organic matter. Nineteen bands from 1D gels and the 50 most abundant spots from 2D gels were excised and used for protein identification. Table 2 summarizes 29 proteins that were identified in bands from the 1D gel and Table 3 summarizes 26 proteins identified in the 2D gel. The proteins belong to several functional groups and include enzymes as well as chaperones. For instance, in both gels together, all enzymes of the chlorobenzene degradation pathway via 3-chlorocatechol to 3-oxoadipate (Supplementary Figure 1) were detected (van der Meer et al., 1998). The best matches of all catabolic enzymes were achieved with Proteobacteria sequences, including the taxa Acidovorax and Pseudomonas that had previously been identified at that site (Alfreider et al., 2002; Balcke et al., 2004), of which members of the genus Acidovorax play a key role in the catabolism of chlorobenzene under oxygen limitation (Nestler et al., 2007). This hypothesis remains to be verified by 16S rRNA FISH. Some of the catabolic enzymes, for example, chlorobenzene dioxygenase, chloromuconate cycloisomerase, dihydrodiol dehydrogenase and chlorocatechol 1,2-dioxygenase, appeared in more than one spot in the 2D gel.
This could be due to some degree of proteolysis during extraction, posttranslational modification or isoenzymes originating from different genes, as also described for the metaproteome of a natural acidophilic biofilm (Lo et al., 2007). Proteolysis is unlikely because the treatment with NaOH and the subsequent treatment with phenol and organic solvents largely inactivate proteases. A clear distinction between posttranslational modification and isoenzymes of different genetic origin cannot be drawn yet, although significant differences in mobility on the gel may indicate distinct isoenzymes. If so, this would indicate that we resolved a metaproteome comprising different species. Indeed, two unique peptides with amino acid substitutions matching a homologous region confirmed that at least two chlorocatechol 1,2-dioxygenases of different genetic origins were present. One reason for the occurrence of several chlorocatechol 1,2-dioxygenases could be that the enzymes dominating in a bacterial community are those that correspond best to the varying availability of substrates such as 3-chlorocatechol or oxygen in the aquifer (Kukor and Olsen, 1996; Krooneman et al., 1998; Nestler et al., 2007). The alignment of partial sequences of MASCOT-identified elongation factors Tu (Figure 5) confirmed the presence of isoproteins of different genetic origins by unique peptides (grey background) with amino acid substitutions (black background) matching the homologous regions. Both results show that our protein extract represented a part of the metaproteome of the Bitterfeld aquifer.
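The reasoning behind the unique peptides can be illustrated with a short sketch: two peptides that map to the same region of homologous proteins but differ in individual residues cannot stem from a single gene product. The peptide sequences below are hypothetical and were not measured in this study; the function name is likewise illustrative.

```python
def substitutions(peptide_a, peptide_b):
    """List amino acid differences between two aligned peptides of equal
    length as (1-based position, residue in a, residue in b)."""
    if len(peptide_a) != len(peptide_b):
        raise ValueError("peptides must be aligned to the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(peptide_a, peptide_b), start=1)
        if a != b
    ]


# Hypothetical peptides covering the same region of two homologous proteins.
variant_1 = "LLDEGQAGENVGVLLR"
variant_2 = "LLDQGQAGENIGVLLR"

diffs = substitutions(variant_1, variant_2)
print(diffs)
```

Any non-empty result for peptides covering the same region points to at least two sequence variants, provided that both peptide identifications are themselves reliable.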

Figure 3

Proteins extracted from groundwater separated by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). Lane 1: standard proteins; and lane 2: proteins extracted from ground water. Phenol phase was precipitated and washed with organic solvents before SDS-PAGE and staining with Coomassie brilliant blue G250. Arrows point towards gel slices cut from gel for protein identification.

Figure 4

Proteins extracted from groundwater separated by 2D-polyacrylamide gel electrophoresis (2D-PAGE). The phenol phase was precipitated and washed with organic solvents before 2D-PAGE and staining with Coomassie brilliant blue G250. Numbers mark identified proteins (refer to Table 3 for identifications). The isoelectric point and molecular weight scales were estimated from the values of the identified proteins.

Table 2 Identification of proteins extracted from groundwater and separated by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE)
Table 3 Identification of proteins extracted from groundwater and separated by 2D-polyacrylamide gel electrophoresis (2D-PAGE)
Figure 5

Partial alignment of amino acid sequences of elongation factors Tu of selected species. From MASCOT result list, the first species and the corresponding sequence of all hits with at least non-redundant peptides were selected. Marked amino acids were covered by the peptides measured with electrospray ionization source tandem mass spectrometry (ESI-MS/MS). Bold letters: peptides that match to more than 50% of the species. Underlined letters: peptides that match to less than 50% of the species. Grey background: peptides that match to the same region of the proteins, but differ in at least one amino acid marked with black background. Numbers above the alignment correspond with amino acid residues of selected proteins.

The existing doubts about the metaproteomic nature of our results are a major challenge for the further development of the approach. The peptides of proteins identified by database search with MASCOT very often cover only small portions of the protein sequences (Supplementary Tables 1–3), which complicates the discrimination between isoenzymes and their taxonomic classification. However, the indirect, homology-based approach is necessary in cases where metagenomic information is not available. This is the typical situation for samples from soil and contaminated sites. Examples are increasingly being reported in which the availability of metagenomic data facilitated metaproteomic analysis, such as that of acidic mine drainage by LC-MS/MS (Ram et al., 2005). MS-based de novo sequencing would improve metaproteome analyses and has been demonstrated as a way to extract sequence data from proteins of different sources (Wilmes and Bond, 2004; Schulze et al., 2005; Lacerda et al., 2007). Recently, there have been substantial developments in instrumentation (CID-MALDI-ToF/ToF; Samyn et al., 2006) and sample preparation methodology (for example by derivatization; Chen et al., 2004). Sequences obtained in this way can be compared with known genomes, which will reveal similarities to known functional proteins and thus point to the function fulfilled by the protein. This approach will also yield information about proteins that have not been observed before.

Conclusion

Alkaline extraction of protein from environmental samples followed by purification with liquid phenol extraction resulted in samples that could be separated on 1D or 2D gels. This is the first report on the MS identification of intracellular proteins from microbial communities inhabiting soil and groundwater.

The enzymes identified in a groundwater sample mirrored the observed metabolism of chlorobenzene and thus represent a part of the functional metaproteome of this environment. Isoforms of chlorocatechol 1,2-dioxygenase and of elongation factor Tu indicate that at least two species are involved in the biodegradation of chlorobenzene.

However, there is an urgent need for improved extraction and fractionation schemes for samples of high complexity, and in particular, high humic matter contents such as soil samples (Wilmes and Bond, 2006). On the basis of the proteins extracted from microcosms, the autochthonous community established on 2,4-D appears to be similar to the community established after inoculation with laboratory strains and shows the potential of autochthonous bacteria to compete with laboratory strains.

Finally, increasing availability of genome data from environmental samples and improved de novo sequencing strategies will facilitate the analysis of environmental metaproteomes and their functional interpretation.