Introduction

The coffee plant is native to tropical Africa and southern Asia. It was introduced to the Americas in the late 18th century, where several of the largest producing countries are now located, including Brazil, Colombia, Peru, Guatemala, Honduras, and Mexico. Colombia is the third-largest coffee producer in the world after Brazil and Vietnam1. Colombian coffee comes almost exclusively from Arabica cultivars and is often regarded as some of the highest-quality coffee in the world. This is mainly due to the rich volcanic-ash soil, abundant annual rainfall (200 mm per month on average), and high altitudes (1,800 to 2,000 m) where the coffee trees are grown2.

In the international market, coffee is classified into two main categories according to the postharvest processing technology used to remove the outer layers adhered to the fruits: ‘natural coffee’, produced from coffee beans processed on the farm by the simple method of sun-drying, known as dry processing; and ‘washed coffee’, produced from coffee beans that undergo a relatively complex series of steps, including depulping, fermentation, and sun-drying, known as wet processing3,4. The abundant rainfall and warm temperatures in the regions of Colombia where coffee is grown cause an immediate, undesirable fermentation after harvest4. The most practical way to avoid this unwanted fermentation is the wet processing method, whereby fermentation can be controlled in terms of time, temperature, and exchange of water, and the spontaneous development of microorganisms can be better managed to minimize any adverse impacts on coffee quality4.

Due to the high quality of the ‘washed coffee’ produced in Colombia, the fermentation process is now a subject of worldwide interest3,5. Countries such as Brazil, Guatemala, Honduras, and Ecuador annually export tons of coffee beans classified as ‘washed coffee’. Thus, studies have been devoted to uncovering the microbial diversity and chemical changes involved in spontaneous coffee fermentation worldwide6,7,8. Recent microbiome studies have shown a rich microbial diversity, comprising more than 80 genera, in coffee fermentations conducted in Brazil and Ecuador9,10. The complex microbial activity produces ethanol, lactic acid, and a range of minor compounds, such as esters, higher alcohols, aldehydes, and ketones, which are speculated to diffuse into the beans and affect the final composition of the coffee beverage11,12,13,14,15.

Here we conducted a comprehensive study of the microbial structure during spontaneous coffee-bean fermentation in Colombia. By sampling the fermentation process at different times, we were able to describe the composition, distribution, and dynamics of naturally occurring microbial communities. Our study describes, for the first time, the great diversity of bacteria and yeasts harboured in the most traditional method of coffee fermentation in the world. These core microbiomes have strong influence on the metabolites produced during the fermentation and may impact the chemical composition of Colombian coffees.

Results and Discussion

We obtained a total of 297,126 and 129,126 300-bp quality-filtered 16S and 18S rRNA gene sequences from temporal sampling, respectively. The alpha rarefaction curves of each temporal analysis suggest that most of the microbial diversity has been sampled (Supplemental Fig. S1). Estimated richness (ACE and Chao) and diversity (Shannon and Simpson) were calculated to evaluate the alpha and beta diversity of microbial communities through the fermentative process (Table 1). Higher ACE and Chao indices indicate higher microbial richness; the richness of the microbiota therefore increased during the fermentation process. On the other hand, diversity depends not only on richness but also on evenness. Thus, owing to the gradual increase in read sequences assigned to LAB species and the resulting decrease in evenness, the Simpson and Shannon indices decreased in the course of the fermentation process.
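The contrast between richness and evenness described above can be illustrated with a minimal sketch. The OTU counts below are hypothetical and are not data from this study; the Simpson index is assumed here to be expressed in its 1 − D form, and Chao1 in its bias-corrected form, which may differ from the exact estimators applied by the software used for Table 1.

```python
import math


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson_diversity(counts):
    """Simpson diversity as 1 - sum(p_i^2); lower values mean stronger dominance."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical OTU read counts (illustrative only, not taken from Table 1).
early = [30, 25, 20, 15, 10]                 # few OTUs, evenly distributed
late = [900, 40, 20, 10, 5, 2, 2, 1, 1, 1]   # more OTUs, one dominant group

for label, counts in (("early", early), ("late", late)):
    print(label, chao1(counts), round(shannon(counts), 3),
          round(simpson_diversity(counts), 3))
```

In this toy case the later sample has more OTUs (higher Chao1) but lower Shannon and Simpson values, because a single group accounts for most of the reads.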

Table 1 Microbial community richness and diversity indices of the 16S and 18S rRNA gene sequences for clustering at 97% sequence similarity from Colombian coffee beans fermentation.

The sequences corresponded to 173 different microbial genera after searching in the SILVA database. The complete list of bacteria and fungi at the genus level is shown in the supplemental material (Table S1). This represents a greater diversity than that found in coffee fermentations conducted in Brazil and Ecuador using the Illumina-based amplicon sequencing approach9,10. The more microorganisms participate in a community, the more complex the interactions are. As in other well-studied fermentation models, microbial interactions emerge as yeast-bacteria, bacteria-bacteria and yeast-yeast, and also as interactions of filamentous fungi with other species16,17. Here, Leuconostoc in the prokaryote group and Pichia nakasei in the eukaryotes were found to govern the fermentation process (Figs 1 and 2). LAB of the genus Leuconostoc are commonly found in association with yeast in domestic applications (e.g., wine, cocoa and kefir) and in nature in over-ripened or faulty fruits18. The complex nature of this interaction is highlighted by the observations that (i) the autolysis of yeasts releases nutrients, such as amino acids, polysaccharides and riboflavin, that are favourable for bacterial growth16,19, and that (ii) the acidification of the fermentation media by LAB creates a favourable environment for yeast development4. These positive interactions have been shown to promote desired sensory attributes in wine, sourdough and yogurt. However, information about these mechanisms in coffee fermentation is scarce.

Figure 1

Relative abundance of bacteria at phyla level during Colombian spontaneous coffee beans fermentation process. (A) Microbial groups with relative prevalence ≥0.1%. (B) Microbial groups with relative prevalence <0.1%. Identification and distribution at genus level are shown in the supplemental material (Table S1).

Figure 2

Relative abundance of fungi during Colombian spontaneous coffee beans fermentation process.

The least-found sequences were commonly associated with microbial groups originating from native soil (Rhizobium, Methylobacterium, Pseudomonas, Bacillus, Burkholderia, Agrobacterium, Azospirillum, Streptomyces, Planctomyces, and Phenylobacterium), the water source used in the fermentation (Acinetobacter and Polaromonas), the air surrounding the fermentation tank (Pedobacter, Stenotrophomonas, Sphingomonas and Microbacterium), birds and domestic animals (Enterobacter, Micrococcus, Bacillus, Turicibacter, Corynebacterium, and Brevibacterium), insects (Acetobacter, Gluconobacter, Gluconacetobacter, Luteimonas, and Alcaligenaceae), and human-related sources (Sphingobium, Corynebacterium, Malassezia, Candida, Tremellaceae, Enterococcus, Paracoccus, and Caulobacteraceae)20,21,22,23,24,25,26,27,28,29,30, of which 54 genera had not been previously reported in the coffee fermentation process (Table S1). Some of these microbial groups harbor a remarkable fermentation profile that includes production of aromatic compounds, enzymes, and organic acids25,26. Although these groups are present in low proportions, this wide diversity indicates a microbial activity specific to geographical region and niche, and may impart flavours that yield clues to the terroir of Colombian coffees. In addition, this complex ecosystem can be used as a source of microbial and biosynthetic diversity for natural products discovery, such as synthesis of antibiotics, pigments and amino acids, bioremediation of dye- and hydrocarbon-contaminated sites, and accumulation of silver nanoparticles (Supplemental Table S2).

Relative abundance of bacterial groups presenting >0.1% of read sequences is shown in Fig. 1A. The beginning of the fermentation process was characterised by the high prevalence of LAB and bacteria belonging to the families Enterobacteriaceae and Acetobacteraceae. The Enterobacteriaceae family was represented by the genera Erwinia, Klebsiella, Pantoea, Serratia, Enterobacter, and Citrobacter, while the genera Gluconobacter, Acetobacter, Gluconacetobacter, Roseomonas, and Roseococcus represented the Acetobacteraceae family (Supplemental Table S1). Enterobacteriaceae are mainly associated with human contact and formation of off-flavour metabolites, such as 3-isopropyl-2-methoxy-5-methylpyrazine, 2,3-butanediol, and butyric acid27,28,29. Acetobacteraceae are commonly reported in commensal and symbiotic relationships with insects that feed on sugar-rich diets, such as bees, Drosophila, and ants30. As the coffee bean pulp has a high sugar content, Acetobacteraceae can reach fermentation sites using insects as dispersion vectors31. The main metabolism of Acetobacteraceae members results in the production of acetic acid, a volatile organic acid that can significantly decrease coffee beverage quality when found in concentrations higher than 1 g/L32. However, members of the Acetobacteraceae and Enterobacteriaceae families decreased dramatically after 12 h of fermentation, being overtaken by LAB (Fig. 1A).
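The way read counts translate into the relative abundances plotted in Fig. 1 can be sketched as follows. The counts are hypothetical, the function names are illustrative, and the 0.1% cut-off is applied as "≥" following the Fig. 1 caption; this is an assumption-laden sketch, not the procedure used to build the figure.

```python
def relative_abundance(read_counts):
    """Convert raw read counts per taxon into percentages of all reads."""
    total = sum(read_counts.values())
    return {taxon: 100.0 * n / total for taxon, n in read_counts.items()}


def split_by_threshold(abundance, threshold=0.1):
    """Separate taxa at or above the threshold (%) from those below it."""
    major = {t: a for t, a in abundance.items() if a >= threshold}
    minor = {t: a for t, a in abundance.items() if a < threshold}
    return major, minor


# Hypothetical read counts for a single sampling time (illustrative only).
counts = {
    "Leuconostoc": 8400,
    "Lactococcus": 900,
    "Gluconobacter": 650,
    "Erwinia": 45,
    "Polaromonas": 5,
}

abundance = relative_abundance(counts)
major, minor = split_by_threshold(abundance)
print(sorted(major))   # taxa that would appear in a panel like Fig. 1A
print(sorted(minor))   # taxa that would appear in a panel like Fig. 1B
```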

Within the LAB group, Leuconostoc dominated the process continuously, with a prevalence peak of 84% observed at 24 h (Fig. 1A). Leuconostoc was also a dominant microbial group in coffee fermentations in Brazil, Mexico, Ecuador, Taiwan, and India6,9,10,33,34. This confirms Leuconostoc as a core bacterium in coffee-bean fermentation worldwide. In addition, LAB belonging to the families Streptococcaceae (Lactococcus) and Lactobacillaceae (Lactobacillus and Pediococcus) were detected in significant proportions during the initial and final fermentation phases, respectively (Fig. 1A). The presence and dominance of LAB indicate that these microorganisms are adapted to the environmental conditions and stress factors that coffee fermentation imposes, such as low pH, availability of sugars, and competition with other microorganisms32. These lactic acid-producing bacteria contribute to the removal of mucilage from the coffee pulp and to the inhibition of the growth of pathogens, spoilage microbes, and toxin-producing fungi35. Other LAB members found in low proportions in this study, such as Weissella and Fructobacillus, were reported in significant proportions in coffee fermentations in Brazil and Mexico6,36. These LAB genera possess specific metabolic traits, such as improved fructose consumption and extracellular polysaccharide production, which may induce metabolite changes in the Colombian process37. Further investigations are necessary to compare microbial diversity and activity between different coffee-producing regions.

Pichia nakasei was the dominant group within eukaryotes, followed by Candida sp., Dipodascus tetrasporeus, and Malassezia sp. (Fig. 2). Pichia has been reported as a dominant yeast in coffee-bean fermentations conducted in Brazil, Tanzania, and China7,8,38,39; however, this is the first study to report the dominance of P. nakasei. This species has been isolated from fruit or fruit products and classified into the P. kluyveri clade40. Other fungal groups were found at specific times of the fermentation process, including Schwanniomyces, Bensingtonia, Candida vanderwaltii, Martiniozyma asiatica, Physciaceae, Tremellaceae, and Lobulomycetales (Fig. 2). Because coffee fermentation takes place in an open tank, the process is susceptible to contamination from human contact, air, insects, and dead leaves41,42. In addition, some extremophile yeasts, such as Dipodascus tetrasporeus and Lobulomycetales, were reported for the first time in the coffee fermentation process. These yeast species are generally associated with deep Pacific Ocean sediment and subglacial sediments of the Andean Mountains43,44,45. The coffee farm sampled in this study is located 161 km from the Colombian coastal region, which may facilitate the migration of these species into the coffee farm environment.

The changes in major non-volatile (sugars, organic acids and ethanol) and volatile (acetaldehyde, ethyl acetate and hexyl acetate) metabolites were quantified over the course of the fermentation (Fig. 3). The initial composition of the Colombian coffee pulp was 0.82 g/L glucose, 1.16 g/L fructose, 0.20 g/L lactic acid, pH 5.2 and 5.3 °Brix. The Brix and sugar concentration increased during the initial 12 h of fermentation (Fig. 3A,B). This observation can be a result of the action of hydrolytic enzymes produced by microbial metabolism, which promote the breakdown of pectin, cellulose, sucrose, and other complex carbohydrates of the coffee pulp into monomers of glucose and fructose6,46. After this increase, both glucose and fructose were partially consumed until 18 h, followed by a stabilization of the concentration until the end of the fermentative process. Residual contents of 0.98 and 1.52 g/L were observed for glucose and fructose, respectively. The presence of residual sugars is a clear characteristic of spontaneous coffee fermentations, as previously observed in Brazil, Ecuador, India, and Mexico6,10,47,48.

Figure 3

Course of sugar consumption, metabolite formation, pH, temperature and Brix during Colombian spontaneous coffee beans fermentation process.

No significant changes were observed in the contents of ethanol, citric acid, acetic acid, and succinic acid (Fig. 3D). On the other hand, lactic acid showed a significant increase, reaching a maximum concentration of 0.37 g/L at the end of the fermentation and causing the pH to fall from 5.2 to 4.2 (Fig. 3B). The steady increase in lactic acid content is correlated with the growth of LAB, which display a fermentative metabolism that usually has lactic acid as the main metabolic end product37. This is also an important end-metabolite associated with coffee fermentation, which assists in the coffee-pulp acidification process without interfering with the final product quality32,36,49,50.

The major volatiles measured by GC were acetaldehyde, ethyl acetate, and hexyl acetate (Fig. 3C). Among these, acetaldehyde showed the greatest variation through the fermentation, ranging from 0.99 to 3.33 µmol/L. Acetaldehyde is predominantly produced by yeasts of the genus Pichia, which have low activity of alcohol dehydrogenase, the enzyme responsible for the conversion of acetaldehyde into ethanol51. This aroma-active molecule is widely known to contribute to fruity sensory notes in alcoholic beverages52. The direct relationship between acetaldehyde and coffee quality is, however, not yet known. It is possible to speculate that an intense diffusion during fermentation may have a complementary function in the development of fruity notes of Colombian coffees. Further studies on this relationship are needed.

A total of 20 minor volatile metabolites were detected through the fermentative process by GC–MS, including 5 organic acids, 3 alcohols, 1 aldehyde, 4 esters, 2 ethers, 1 furan, 2 hydrocarbons, 1 phenol, and 2 terpenes (Table 2). The main metabolites found at the beginning of the fermentation process were 1,2-benzenedicarboxylic acid, 9,12-octadecadienoic acid, and 1-pentadecanal. However, the concentration of these compounds was significantly reduced over the fermentation period, possibly due to evaporation or microbial use as precursors in metabolic routes3. On the other hand, the content of 1-hexanol, 2-heptanol, nonanal, isoamyl acetate, 2,2,6-trimethyloctane, 2-hydroxy-benzoic acid, linalool, and limonene, which can be formed from microbial activity, increased gradually through the fermentation process. In this respect, LAB and yeasts have a pivotal influence through the generation of different aroma-influencing molecules via central carbon and nitrogen metabolism3. Some microbial groups reported in this study, including Pichia nakasei, Candida sp., Lactobacillus, Lactococcus, Leuconostoc, and Oenococcus, are often characterised as important fermenters in the production of wine, cheese, yogurt, dough, and beer due to the production of low-molecular-weight flavour compounds53,54,55,56,57,58. For coffee, however, more studies are necessary to evaluate the direct impact of these compounds on the final product quality.

Table 2 Concentration of volatile compounds (area × 10⁵) formed during Colombian spontaneous coffee beans fermentation process.

The chemical composition of fresh (unfermented) and fermented coffee beans was analysed by HPLC and GC-MS (Table 3). The Colombian coffee beans contained 7.28 g/L glucose, 5.26 g/L fructose, 0.18 g/L ethanol and 2.76 g/L organic acids. The coffee bean composition remained unchanged after the fermentation and drying processes. This indicates that microbial activity and the drying process do not interfere with the composition of major compounds inside the coffee beans. The concentrations of major organic acids and sugars in the fermented coffee beans are similar to those found in Brazil and Ecuador10,36, and these compounds are considered key precursors of coffee-impacting molecules generated during roasting5.

Table 3 HPLC and GC-MS analysis of fresh (unfermented) and fermented coffee beans from Colombian process.

A total of 15 volatiles were detected by GC-MS in fresh (unfermented) and fermented coffee beans, including organic acids, alcohols, aldehydes, alkanes, terpenes, and aromatics (Table 3). 3-Methylbutanoic acid, benzaldehyde, benzeneacetaldehyde, 5-isobutylnonane, and 4,6-dimethyldodecane were the compounds found in high concentrations in both fresh and fermented coffee beans. Interestingly, most of these compounds increased significantly after the fermentation process. In particular, the increase in aldehydes (e.g., acetaldehyde, benzaldehyde, benzeneacetaldehyde, nonanal, octanal) and 1-octen-3-ol content can be attributed to the dominance of P. nakasei, Candida sp., Leuconostoc, Lactobacillus, and Lactococcus (Figs 1 and 2). As these volatile compounds have low olfactory thresholds, their incorporation into the chemical composition of the beans can impart desirable and pleasant floral, fruity and citric sensory notes to the final beverage. On the other hand, some compounds, such as styrene and limonene, may have originated from seed germination during the processing3,59.

In summary, the Colombian spontaneous coffee-bean fermentation process was characterised by the dominance of Leuconostoc and Pichia nakasei. The metabolic activity of these microbial groups resulted mainly in the production of lactic acid and acetaldehyde. In addition, 170 other microbial groups were present, contributing to the formation of a complex array of metabolites. Interestingly, 56 fungal and bacterial genera were reported for the first time in the coffee fermentation process. These microorganisms are mainly associated with local environments and migration from proximal biomes, indicating the formation of a niche-specific microbial community and metabolic activity in the Colombian spontaneous coffee fermentation process. Our data contribute to a better understanding of microbiome composition and open perspectives on its application in the enhancement of the coffee fermentation process, such as production of high-quality coffee beans, selection of specific microbial groups for flavour modulation, and identification of potential candidates for biological markers of Colombian coffees.

Methods

Fermentation

Coffee-bean fermentation samples were collected from a coffee farm located in Buesaco, Nariño, Colombia (1°23′05″N, 77°09′23″W). The coffee farm is at 1,959 m altitude and produces ‘washed coffee’ according to the traditional method of fermentation practiced in Colombia. A total of 10 kg of freshly harvested coffee cherries (Coffea arabica L.) were mechanically pulped and placed in cement tanks. Approximately 4 L of water was added and a natural fermentation was allowed to occur for 48 h. Samples of 10 mL of the liquid fraction of the fermenting coffee-bean mass were collected in triplicate at 0, 6, 12, 18, 24, 36, and 48 h. The collected samples were mixed and frozen in sterilized Falcon tubes at −20 °C for further analysis.

Total DNA extraction and high-throughput sequencing

Samples of 1 mL of the liquid fraction of the fermenting coffee-bean mass at 0, 6, 12, 18, 24, 36, and 48 h were used for total DNA extraction. The samples were centrifuged (12,000 × g, 1 min) and the supernatant removed. The pellets were suspended in 500 µL of Tris-EDTA, homogenized with 10 µL of lysozyme solution (Sigma-Aldrich, Arklow, Ireland), and incubated at 30 °C for 60 min. Then, 50 µL of SDS 10% (w/v) and 10 μL of proteinase K (Sigma-Aldrich) were added, followed by homogenization and incubation at 60 °C for 60 min. A volume of 150 µL of phenol/chloroform (25:24; Sigma-Aldrich) was added, homogenized by inversion and centrifuged (12,000 × g, 5 min). Finally, the supernatant was removed and the DNA precipitated with absolute ethanol (Sigma-Aldrich). Pellets were washed with 80% ethanol, dried and suspended in Milli-Q® ultrapure water (Merck, Darmstadt, Germany). Extracted DNA quality was checked on a 0.8% (w/v) agarose gel and quantified with the Nanodrop 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). Twenty ng of the extracted DNA, containing complementary adaptors for Illumina platform60, were amplified using degenerate primers for the hypervariable V4 region of both 16S (515F and 806R) and 18S (5287F and 706R) rRNA genes. PCR conditions for the generation of bar-coded amplicons were as follows: initial denaturation at 95 °C for 3 min; followed by 18 cycles of denaturation at 95 °C for 30 s; annealing at 50 °C for; extension at 68 °C for 60 s, and a final extension at 68 °C for 10 min. The PCR products were quantified using the Qubit dsDNA HS kit (Invitrogen, Carlsbad, CA, USA). Raw sequences obtained were analysed using the standard parameters of QIIME (Quantitative Insights into Microbial Ecology) software, version 1.9.061. Low-quality reads, short sequences (<100 bp), and sequences containing more than one ambiguous base (N) were removed. Then, the high-quality reads obtained were aligned against the SILVA database62 using the UCLUST method63. The taxonomic allocation and generation of the operational taxonomic units (OTUs) were performed using a 97% sequence identity as cut-off. The ACE, Chao 1, Shannon, and Simpson indices were used to analyse the richness and biodiversity of the microbial communities.
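The length and ambiguity criteria stated above can be expressed as a simple filter. This is a minimal sketch of those two rules only, with hypothetical read identifiers and sequences; it does not reproduce the QIIME implementation, and the base-quality filtering step is not shown.

```python
def passes_filter(seq, min_length=100, max_ambiguous=1):
    """Keep reads of at least min_length bp with at most max_ambiguous N bases."""
    seq = seq.upper()
    return len(seq) >= min_length and seq.count("N") <= max_ambiguous


def filter_reads(reads):
    """Return only the reads (id -> sequence) that satisfy both criteria."""
    return {read_id: seq for read_id, seq in reads.items() if passes_filter(seq)}


# Hypothetical reads (illustrative only).
reads = {
    "read_ok": "ACGT" * 30,             # 120 bp, no N: kept
    "read_short": "ACGT" * 20,          # 80 bp: removed (<100 bp)
    "read_two_n": "ACGT" * 30 + "NN",   # two ambiguous bases: removed
    "read_one_n": "ACGT" * 30 + "N",    # one ambiguous base: kept
}

kept = filter_reads(reads)
print(sorted(kept))
```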

Major compounds quantified by High Performance Liquid Chromatography (HPLC) and Gas Chromatography (GC)

Every 6 h, the concentration of sugars (glucose and fructose), organic acids (citric, lactic, acetic and succinic acids), and ethanol in the liquid fraction of the fermenting coffee beans was determined using high-performance liquid chromatography (HPLC), according to Carvalho Neto et al.47. Samples (1 mL) of the fermenting coffee pulp-bean mass (liquid fraction) were centrifuged at 6,000 × g (centrifuge model A14; Jouan SA, Saint-Herblain, France). Then, the supernatant was diluted at a 1:5 ratio with distilled water and filtered using a 0.22-µm pore size filter (Filtrilo, Colombo, Brazil). Filtered samples were injected into an HPLC system equipped with a Hi‐Plex H column (300 × 7.7 mm; Agilent, Santa Clara, USA) and a refractive index (RI) detector (model HPG1362A; Hewlett-Packard Company, São Paulo, Brazil). The column was eluted in isocratic mode with a mobile phase of 5 mM H2SO4 at 60 °C and a flow rate of 0.6 mL/min. Major volatile compounds were quantified by gas chromatography (GC). Samples of 5 mL of the fermenting coffee pulp-bean mass (liquid fraction) plus NaCl 5% (w/v) were placed in hermetically sealed vials (20 mL). In order to achieve headspace equilibrium, samples were heated (65 °C) and agitated (150 rpm) on a hot plate with a magnetic stirrer (IKA®, Campinas, Brazil) for 10 min. The volatiles present in the headspace were manually extracted with a 1.0 mL syringe (Hamilton®, Reno, NV, USA) and injected into a gas chromatograph (GC) (Shimadzu model 17 A, Tokyo, Japan) equipped with an HP‐5 capillary column (30 m × 0.32 mm × 0.25 µm). The working conditions within the GC were: column at a range of 40–150 °C (rate of 20 °C/min), injector at 250 °C, and detector at 250 °C. Nitrogen was the carrier gas used, at a flow rate of 1.5 mL/min, column pressure of 50 kPa, and split ratio of 1:5. The volatile compounds were identified by comparing the peak retention times against standards. For quantification, standard solutions of ethanol were prepared at several concentration levels (1, 10, 20, 50, 100 and 1000 μmol/L) and used to construct calibration curves. The quantification of volatile compounds was expressed as ethanol equivalent.
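The ethanol-equivalent quantification can be sketched as a linear calibration. The standard concentrations are those listed above; the detector peak areas, the assumption of a linear response, and the function names are hypothetical and serve only to illustrate the calculation.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def ethanol_equivalent(peak_area, slope, intercept):
    """Convert a sample peak area into an ethanol-equivalent concentration."""
    return (peak_area - intercept) / slope


# Ethanol standards in µmol/L, as listed in the text.
standards = [1, 10, 20, 50, 100, 1000]
# Hypothetical peak areas assuming a perfectly linear detector response.
areas = [250.0 * c + 40.0 for c in standards]

slope, intercept = fit_line(standards, areas)

# Hypothetical sample peak area for a volatile compound.
sample_area = 872.5
print(round(ethanol_equivalent(sample_area, slope, intercept), 2))
```

With real standards, the residuals of the fit would need to be inspected before applying the curve, particularly because the highest standard (1000 µmol/L) lies far from the others and strongly influences the slope.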

Minor compounds identified by gas chromatography coupled to mass spectrometry (GC–MS)

The volatile compounds from the fermenting coffee pulp-bean mass (liquid fraction) were extracted by solid-phase microextraction (SPME), using a DVB/CAR/PDMS fibre (Supelco Co., Bellefonte, PA USA), and injected into a gas chromatograph coupled to a mass spectrometer connected to an autosampler (GCMS2010 Plus, TQ8040, AO 5000; Shimadzu, Tokyo, Japan). The sample was prepared by diluting 2 mL at a 1:1 ratio with distilled water plus NaCl 5% (w/v) and placed in hermetically sealed vials (20 mL). The SPME fibre was exposed for 30 min at 60 °C. The compounds were thermally desorbed at 260 °C and directly introduced into the gas chromatograph. The GC was equipped with a capillary column (model SH-Rtx-5MS; 30 m × 0.25 mm × 0.25 µm). The temperatures within the GC were as follows: column oven at 60 °C, injection at 260 °C, and detector at 250 °C. Helium was the carrier gas used, at a flow rate of 1 mL/min, column pressure of 57.4 kPa, and split ratio of 1:20. The mass range was 30–250 (m/z), at an ion source temperature of 250 °C. Volatiles were identified by comparing each mass spectrum either with the spectra from authentic compounds or with spectra in reference libraries. The relative abundance of each volatile compound present in the headspace was expressed as peak area × 10⁵.

Chemical composition of fermented coffee beans

After the end of the fermentation, coffee beans were dried at 45 °C for 72 h in a drying oven with air circulation (Thoth, Piracicaba, Brazil) until a moisture content of 11% was reached. Then, the beans were ground using an electric coffee grinder. Dried, ground coffee beans were analysed by HPLC and GC-MS according to the procedures described above. Unfermented coffee beans were included as a control. For HPLC analysis, a cold extraction was done by adding 0.2 g of ground beans to 1 mL of distilled water. Samples were centrifuged, filtered using a 0.22-µm pore size filter (Filtrilo, Colombo, Brazil), and injected into the HPLC system.

Statistical analysis

Statistical significance was calculated using post-hoc comparison of means using Duncan’s test. Analyses were performed using the SAS programme, version 7.0 (Statistical Analysis System, Cary, NC, USA). Level of significance was established using a two-sided p-value < 0.05.