Antimicrobial resistance genes in raw milk for human consumption

The increasing prevalence of antimicrobial resistance (AMR) is a significant threat to global health. More and more multi-drug-resistant bacterial strains cause life-threatening infections and the death of thousands of people each year. Beyond disease control animals are often given antibiotics for growth promotion or increased feed efficiency, which further increase the chance of the development of multi-resistant strains. After the consumption of unprocessed animal products, these strains may meet the human bacteriota. Among the foodborne and the human populations, antimicrobial resistance genes (ARGs) may be shared by horizontal gene transfer. This study aims to test the presence of antimicrobial resistance genes in milk metagenome, investigate their genetic position and their linkage to mobile genetic elements. We have analyzed raw milk samples from public markets sold for human consumption. The milk samples contained genetic material from various bacterial species and the in-depth analysis uncovered the presence of several antimicrobial resistance genes. The samples contained complete ARGs influencing the effectiveness of acridine dye, cephalosporin, cephamycin, fluoroquinolone, penam, peptide antibiotics and tetracycline. One of the ARGs, PC1 beta-lactamase may also be a mobile element that facilitates the transfer of resistance genes to other bacteria, e.g. to the ones living in the human gut.

contains heat treatment steps that kill the majority of bacteria. Thus the role of active DNA-export mechanisms between the intestinal and the nutriment's bacteriome is lower 13 .
Raw milk is a product sold unprocessed; thus the presence or the grade of heat-treatment steps are upon the decision of consumers. In addition to this, the consumption of non-heat-treated raw milk justified by its favourable health effects is nowadays commonly set as a trend in the developed countries 14,15 .
To our best knowledge, no previous study has investigated the possible presence of ARGs in raw milk, and we have found no data on the raw milk's resistome. This study aims to test the presence of ARGs in milk metagenome, investigate their genetic position and their linkage to mobile genetic elements 16,17 .

Metagenome.
After DNA extraction and sequencing (see Methods), from sample A 17,773,004 while from sample B 8,425,326 paired-end reads were recovered. By the quality filtering, 0.20% and 0.80% of the reads were discarded from sample A and B, respectively. The reads were aligned to the host (Bos taurus) genome. As expected, most of the genetic material originated from the milking cow, from sample A 96.41% and from sample B 97.01% of the cleaned reads were filtered out due to host origin.
Of the reads, not aligning to the cow genome, we were able to classify 42.11% in sample A and 52.96% in sample B to known taxa. 185,982 reads of sample A and 11,437 reads of sample B were identified to belong to the kingdom of Bacteria. In sample A 93.54% of the reads were classified as Gram-positive bacteria, while in sample B this proportion was only 40.54%. The detailed composition of the core bacteriomes at class level is shown in Fig. 1.

ARGs and MGEs.
Reads with overlapping pieces were assembled into longer DNA contigs by the metaSPAdes tool. The assembled contigs having a shorter length than 162 bp (sample A: 0.68%, sample B: 0.33% of all contigs) were excluded. The remaining contig's median length was 268 (IQR: 126.5) and 244 (IQR: 42) in sample A and B, respectively. Then, the contigs having any open reading frame (ORF) matched with an ARG in the Comprehensive Antibiotic Resistance Database (CARD) were collected. The detected ARGs and particular properties are presented in Table 1.
The identified ARGs were classified with the Resistance Gene Identifier (RGI) tool according to the ratio of their coverage in the samples and to the identity between the contigs assembled from the sequenced reads and the CARD ARG reference sequences. In Table 1 we list each ARG with perfect or strict hits predicted by RGI. We were able to identify three perfect ARG matches in sample A, mepR, mgrA and Staphylococcus aureus norA. According to the taxonomical classification of the contigs harbouring these ORFs their most likely origin is bacteria from Staphylococcus genus. The MGE analysis showed that none of these ORFs is mobile.
In the bacterial genome, ARGs may be located on chromosomes or on plasmids, the latter ones being more likely to translocate between bacteria. With the PlasFlow tool, we identified contigs harbouring chloramphenicol acetyltransferase, PC1 beta-lactamase (blaZ) and tet (38) ARGs that may be encoded on plasmids. The results of  www.nature.com/scientificreports www.nature.com/scientificreports/ MGE domain coexisting analysis showed that PC1 beta-lactamase (blaZ) ARG might be mobile since the contig had a phage integrase ORF within the distance of 10 ORFs.
There were no perfect matches in sample B that is not surprising since its overall bacterial nucleic acid content was less than 10% of that of sample A. The sequence coverages of the strict matches in this sample ranged between 5.71% and 24.83%, with the mean of 13.31%. The identity between ORFs and CARD ARG reference sequences was 100.00% in each detected ARG. Contigs containing ARGs were classified and genera Acinetobacter (50%) and Delftia (50%) were identified. None of the identified ARGs could be related to any mobile genetic elements.
The detected ARGs in both samples were matched to their corresponding antibiotics. Since one antibiotic compound may be related to more than one ORFs, we decided to select those to which we could link the ORFs with the broadest coverage and the highest identity to the reference ARG sequence. The maximal coverage and identity of detected ORFs are shown in Fig. 2. In sample A ARGs known to be decreasing the effectiveness of acridine dye, cephalosporin, fluoroquinolone, penam and peptide antibiotics were found in full length and with 100% identity. There were two other ARGs identified in sample A in full length and with identity above 99% that encoded resistance against further antibiotics (cephamycin and tetracycline).

Discussion
Antimicrobial resistance (AMR) is a natural feature of microorganisms that have originally occurred as a means of defence in the rivalry amongst the members of the microbiotas. The ubiquity of antimicrobial resistance genes (ARGs) is beyond question. Genes against antibiotics are present both in non-pathogenic and pathogenic bacteria. With the extended agricultural and clinical use of antibiotics, the number of ARGs are on the rise, and the growing number and spread of multi-resistant bacteria strains pose a global threat to global health. According to the CDC's Antibiotic Resistance Threats in the United States, 2019 report 3 , more than 2.8 million antibiotic-resistant infections occur in the U.S. each year, and more than 35,000 people die as a result. In addition, 223,900 cases of Clostridioides difficile occurred in 2017 and at least 12,800 people died. Dedicated prevention and infection control efforts in the U.S. and around the world are working to reduce the number of infections and deaths caused by antibiotic-resistant germs. However, the number of people facing antibiotic resistance is still too high. The AR Threats Report warns that not only people but also animals carry bacteria in their guts which may include antibiotic-resistant bacteria either with intrinsic or with acquired ARGs 12 . Beyond disease control, animals may be given antibiotics for growth promotion or increased feed efficiency. Since bacteria are exposed to low doses of the drugs over a long period, this inappropriate antibiotic use can lead to the development of resistant bacteria 3 . The CDC report notes that when animals are slaughtered and processed for food, resistant germs in the animal gut can contaminate meat or other animal products, but do not mention the possible contamination of milk.
Detected ARGs in raw milk (Table 1) can be transferred from non-pathogens to pathogens via HGT. The over-expression of such genes, e.g. norA (regulated by mgrA) and mepA (regulated by mepR) coding multidrug efflux pumps confer resistance to fluoroquinolones (including norfloxacin or ciprofloxacin) and even disinfectants [18][19][20][21] . Ciprofloxacin is a broad-spectrum antibiotic used to treat a variety of bacterial infections, including intra-abdominal, respiratory tract, skin, urinary tract, and bone and joint infections. Norfloxacin might be used for uncomplicated urinary tract infections (including cystitis) or the prevention of spontaneous bacterial peritonitis in cirrhotic patients, among others. MepA was also shown to result in resistance to tigecycline 22 , an antibiotic www.nature.com/scientificreports www.nature.com/scientificreports/ that was developed to tackle complicated infections caused by multiresistant bacteria such as Staphylococcus aureus, Acinetobacter baumannii, and E. coli.
The two samples differ both in the composition of core bacteriome and the ARG abundance. In sample A the bacteriome was dominated by Gram-positive bacteria. Furthermore, most of the contigs harbouring ARG were classified taxonomically belonging to Gram-positive bacteria. In sample B, the Gram-negative bacteria governed the bacteriome. So the lower ARG abundance in sample B might come from the lower proportion of Gram-positives. Nevertheless, in sample B, not just the number of detected ARGs was lower, but the maximal coverage of the ARGs as well. One may find the reason for this phenomenon, the lower sequencing depth of sample B. The identity of these ORFs with the reference ARGs are very high so we may assume the assembled ORFs originated from ARGs. Accordingly, the possible reason of the lower coverage of ARGs may be caused by the insufficient read counts for assembly the complete ORFs. One possible argumentation of the ARGs' difference between sample A and B may be derived from the fact that health issues (e.g. mastitis) are relatively more common in large scale farms. Since the use of antibiotics is more permissive in veterinary practice -compared to human medicine -in the treatment of bacterial infections, it places a selective pressure on the bacteria of herds, what might increase the frequency and the diversity of ARGs.
Our results show that indeed ARGs can be present in raw milk. However, it should be the subject of further research to identify how resistant bacterial DNA gets into the milk, is it already there in the cow's udder or does it only mixed into the milk as contamination during or after milking.
At raw milk's environment of origin (dairy farms), the use of antimicrobial agents is widespread. Consequently, the microbiome of this product may show relatively high levels of resistance. Without heat-treatment, bacteria that are present in raw milk are not hindered from further multiplication what results in the amplification of their resistance genes either. Such a rise in the number of ARGs may increase the risk of horizontal gene transfer (HGT) events. This risk may even be higher in case of mobile ARGs (e.g. blaZ, which was detected on a plasmid and near to a phage integrase ORF).
Beyond human intervention, there are natural mechanisms that limit ARG-transfer 11 . First of all, donor and recipient populations need to be present at the same physical space 23 and reach a specific critical density to ensure proper connectivity for a successful gene transfer event. Chances for a series of HGT events amongst two physically distant populations are relatively low except for the case when there is positive selection driven by any factors (e.g. selection by antibiotics). The second factor arises from the fact that genes encoding resistance against the same compounds may limit each other's spread. A population owning genes against a particular antibiotic is not under selective pressure to gain any other ARGs with the same effect. As a conclusion of earlier evolutionary steps, possession of resistance determinants of the same substrate profiles are possible. However, in a population where the distribution of these genes is stable, the chances of new recruitments are lower. Tertiary, acquisition of resistance genes sets metabolic costs deriving from the transfer and integration mechanisms needed. These costs vary by each ARG, and only affordable genes are spread 11 .
Even though the bacterial compositions of milk are affected by the heat treatment 24,25 , the question may arise whether the ARG content of raw and pasteurized milk are different? In water, DNA degradation starts by 90 °C 26 . The HTST pasteurization (high temperature/short time) is performed at 72 °C for 15-40 seconds, while ultra-pasteurization (UHT) is at 135 °C for 1-2 seconds. Summarizing this information, one may conclude that the resistomes do not differ significantly in HTST and raw milk. On the other hand in UHT milk some DNA degradation might be suspected. Nevertheless, some aspects are broadening the picture, that are worth taking into consideration. First of all, in raw milk, the members of the bacteriota remain viable and may multiply depending on the storing temperature. The proliferation of bacterial cells increases the amount of the sample's extractable bacterial DNA content what appears in the results of the sequencing as raised bacterial read rates. Consequently, after the assembly of the reads, the likelihood of having contigs containing ARGs is higher. Pasteurization kills 99.99% of bacteria; thus, their multiplication has a low significance. Secondly, the bacteriome of milk consumers (humans and animals) may gain the ARGs of the milk-resistome by transformation and transduction only 13 , as pasteurization decreases the number of viable bacteria. In contrast, raw milk's higher viable bacterial cell count facilitates conjugation to the consumers' bacteriome while the above-mentioned horizontal gene transfer mechanisms 13 are also kept. Of course, this phenomenon rather has an impact on the risk of HGT than on the resistome of raw or pasteurized milk.
Nevertheless, heat-treatment of raw milk seems to be an advantageous and a more than considerable step that besides inhibiting the amplification of genes having a potential risk, makes active gene transfer mechanisms lose their significance. On the other hand, even though it reduces the number of multiplication cycles, after the lysis of cells free DNA fragments appear in the sample that may still be uptaken by newly arriving bacteria.
However, the interpretation of resistome studies is yet to be deepened. The combination of next-generation sequencing, metagenomic and computational methods provides valuable data on the presence of ARGs. Moreover, it makes it possible to find genes in full coverage and length, and to identify their taxonomical classes of origin and their exact sequential surroundings. Synteny with mobile genetic elements is a fact to be taken into consideration when examining the risks meant by an ARG. Thus, the combination of methods mentioned above serves as a core component of today's necessarily expanded antimicrobial resistance research.
As a means of evolutionary pressure, the use of antibiotics selects bacterial strains that have antimicrobial resistance genes. Moreover, in the production animal sector, the application of such compounds increases not only the number of antibiotic-resistant bacterial strains but also the frequency of their appearance. After the consumption of animal products, these strains may meet the human microbiota, and the circumstances may be appropriate for the horizontal gene transfer derived spread of antimicrobial resistance genes amongst these populations. This phenomenon unfolds a possible source of acquisition of human pathogens' antimicrobial resistance other than the direct presence of antibiotic residuals in animal products.
DNA extraction and metagenomics library preparation. Before the laboratory procedures, the milk samples were stored frozen. 120 mL of raw milk was centrifuged at 10.000 g for 10 min. Total DNA was extracted from the pellet using the ZR Fecal DNA Kit from Zymo Research. Paired-end fragment reads (2 × 150 nucleotides) were generated using the TG NextSeq 500/550 Mid Output Kits v2 sequencing kit with an Illumina NextSeq sequencer. Primary data analysis (base-calling) was carried out with bcl2fastq software (v.2.17.1.14, Illumina).
Bioinformatic analysis. Quality based filtering and trimming was performed by Adapterremoval 27 , using 15 as a quality threshold. Only reads longer than 50 bp were retained. Bos taurus genome (ARS-UCD1.2) sequences as host contaminants were filtered out by Bowtie2 28 with very-sensitive-local setting minimizing the false positive match level 29 . The remaining reads were taxonomically classified using Kraken2 (k = 35) 30 with the NCBI non-redundant nucleotide database 31 . The taxon classification data was managed in R 32 using functions of package phyloseq 33 and microbiome 34 . For further analysis, the reads assigned to Bacteria was used only 35 . Core bacteria was defined as the relative abundance of agglomerated counts at class level above 0.1% at least one of the samples. By metaSPAdes 36 the preprocessed reads were assembled to contigs, with the automatically estimated maximal k = 55. From these contigs having a shorter length than the shortest resistance gene of the Comprehensive Antibiotic Resistance Database (CARD) were discarded 37,38 . The ARG content of filtered contigs was analyzed with Resistance Gene Identifier (RGI) v5.1.0 and CARD v.3.0.6 37,38 . Contigs harbouring ARG identified by RGI with perfect or strict cut-off were preserved and classified by Kraken2 on the same way as was described above. The plasmid origin probability of the contigs was estimated by PlasFlow v.1.1 39 . To identify possible further mobile genetic element (MGE) homologs the predicted protein sequences of contigs were scanned by HMMER 40 against data of PFAM v32 41 and TnpPred 42 . Following Saenz et al. 35 from the hits with lower than E 10 −5 the best was assigned to each predicted protein within the distance of 10 ORFs. The MGE domains coexisting with ARGs were categorized as phage integrase, resolvase, transposase or transposon.

Data availability
All data are publicly available and can be accessed through the PRJNA591315 from the NCBI Sequence Read Archive (SRA).