Introduction

Livestock selection for the improved production of milk, meat, skin and fiber has influenced the evolution of human society and animal breeds1,2. Traditional selection based on ancestral performance is currently being supplanted by “genomic selection” based principally on the summation of genome-wide marker or haplotype associations with ancestral performance parameters3. While both industrial processes have provided considerable genetic gain, application of either methodology excludes rare alleles or new mutations with significant phenotypic impact from the future gene pool.

Haldane4 estimated the mutation rate for hemophilia in humans to be approximately 10−5 and subsequent studies have confirmed the rate of new mutations per gene to range from 10−6 to 10−4 per generation5,6,7. Due to the comparable sizes of human and cow genomes8,9, we hypothesized that if novel mutations were to arise in the bovine at a comparable rate, rare alleles for extreme phenotypes should be detectable provided that the sample size is above one million and an appropriate screening protocol for the rare trait is available. We also hypothesized that variations in genetically and biochemically complex milk components, such as total fat and protein content, could be proxies for individually refined milk compositions, including variations in individual milk fatty acids.
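The arithmetic behind this sample-size argument can be sketched as follows. This is an illustration only: the per-gene rates are those quoted above, whereas the diploid factor of two gene copies per animal and the function name are assumptions introduced here. Only a fraction of such mutations would be expected to produce an extreme, screenable phenotype.

```python
def expected_new_mutations(n_animals, rate_per_gene, copies_per_animal=2):
    """Expected number of de novo mutations in one gene across a cohort.

    Assumes each animal carries `copies_per_animal` copies of the gene and
    that mutations arise independently at `rate_per_gene` per copy per
    generation (a simplification made for this sketch).
    """
    return n_animals * copies_per_animal * rate_per_gene


# Range of per-gene rates quoted in the text, for one million animals.
for rate in (1e-6, 1e-5, 1e-4):
    print(rate, expected_new_mutations(1_000_000, rate))
```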

We have tested our hypothesis using a large and well-characterized bovine population. Our aim was to discover new alleles that are responsible for milks with extremely low or high milk fat or protein content.

Results

We screened the lactation records of 2.5 million cows and identified 29 individuals with milks deviating by more than 3.5 standard deviations (SD), over repeated milk tests and lactations, from the breed- and age-adjusted means for protein or fat content. Of these, Holstein-Friesian cow ‘363’ produced milk with average fat content of 2.81% ± 0.10% (mean ± SD), 3.9 SD below the breed mean of 4.52%. Milk protein concentration of cow 363 was 1 SD lower than the population group mean, while mean seasonal milk production of 6223 ± 925 L was 3.0 SD above the breed average. At selection, cow 363 was 8 years old and had completed five lactations, indicating normal fertility and survival rate. Information provided by her breeder and lifetime owner confirmed normal history and health status and the cow showed no abnormalities on inspection (Fig. 1b). The 74,618 paternal half-sisters of cow 363, farmed throughout New Zealand, produced milks with an average fat content of 4.4 ± 0.7%. Milks from cow 363's deceased dam and her maternal half-sister contained 3.98 ± 0.15% and 5.60 ± 0.3% fat, respectively.

Figure 1

An outlier cow with a heritable mutation responsible for low milk fat content.

(a) Pedigree of cow 363 showing segregation of the low milk fat phenotype. Founder cow 363 is indicated by the large filled circle. Filled symbols represent affected females or carrier males, open symbols represent wild-type individuals. Numbers inside symbols indicate average milk fat content in percent, determined during the first lactations of the founder's filial generations, or during the lactation lifespan for all other animals. For offspring from bull sons, the milk fat averages of their phenotypic daughter groups are stated together with the number of animals in each group. Hash symbols indicate unavailable DNA samples and the asterisks denote offspring generated by embryo transfer. (b) Founder cow 363 during the dry phase at 13 years of age. (c) Segregation of milk fatty acid composition in the F1 generation. Milk contents of saturated (SFA, blue), monounsaturated (MUFA, red) and polyunsaturated fatty acids (PUFA, black) for three individual mutant and wild-type daughters of cow 363 and for three unrelated, breed-matched control cows in the same herd are indicated by open symbols. Means are indicated by bold horizontal bars and P values (two-tailed Student's t-test) for differences in each fatty acid group are stated between genotype groups. Differences between wild-type daughters and control cows were not significant (P(SFA) = 0.96, P(MUFA) = 0.96, P(PUFA) = 0.93).

Detailed milk composition analysis revealed a substantial improvement in the ratio of saturated : unsaturated milk fat. Saturated fat content was reduced by 3–4 SD, while mono- and polyunsaturated fatty acids were 2–3 SD above the New Zealand average10 (Supplementary Table 1). In contrast, content and composition of casein and whey proteins were within 1 SD of the national breed average (Supplementary Table 2). The extreme milk fat content and milk fat composition phenotypes remained unchanged after cow 363 was relocated from her home farm to our research farm, indicating a negligible influence of environmental factors.

To assess the genetic basis of the extreme milk fat phenotypes, we generated four additional sons and seven daughters of cow 363 (Fig. 1a). Three daughters produced milks with an average fat content of 2.62% ± 0.09% and fatty acid compositions similar to cow 363; in contrast, average fat content (4.14% ± 0.37%) and fatty acid composition of milks from her other four daughters were similar to unrelated control cows (Fig. 1c, Supplementary Table 3). When unrelated Holstein-Friesian dams were bred with semen from the five sons of cow 363, 18 of the 50 daughters sired by three of the sons produced milks with fat content similar to that of founder cow 363. Milks from their paternal half-sisters and from the 42 daughters of the other two bull sons of cow 363 were similar to the breed average (Fig. 1a). As with the founder, no other phenotypes were apparent in her offspring. Taken together, these results suggested that a heterozygous genetic variation with a dominant effect was responsible for the milk fat phenotypes of cow 363.
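Whether these counts are compatible with the 1:1 segregation expected for a heterozygous dominant variant can be checked with an exact binomial test. The sketch below is not part of the reported analysis; the choice of test, the function name and the 1:1 null expectation are assumptions made here for illustration, and only the counts (3 of 7 daughters, 18 of 50 granddaughters from the three transmitting sons) come from the text.

```python
from math import comb


def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial P value.

    Sums the probabilities of all outcomes that are no more likely than
    the observed count k under Binomial(n, p).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(x for x in pmf if x <= cutoff))


# Affected animals among offspring of a presumed heterozygous parent.
print(binom_two_sided_p(3, 7))    # daughters of cow 363
print(binom_two_sided_p(18, 50))  # daughters of the three transmitting sons
```

A P value above the chosen significance level would mean the observed ratio does not contradict 1:1 transmission; with samples this small the test has limited power.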

Genotyping of 56,000 single-nucleotide markers in the pedigree of cow 363 and genome-wide association analysis identified a single locus associated at a genome-wide level of significance [p-value ≤ 1.27 × 10−6] (Fig. 2). This locus harbors the DGAT1 gene encoding diacylglycerol O-acyltransferase 1, which catalyzes the terminal reaction in triglyceride synthesis11,12,13. Sequencing of the DGAT1 locus of cow 363 revealed a heterozygous, non-synonymous A>C transversion in exon 16 (Fig. 3a). This variation was present in all females producing milks with less than 3% fat and in the three bulls producing offspring with reduced milk fat content but was absent in all other offspring of cow 363, her sire and her grandsires. Moreover, the variant was absent in 185 bulls that have sired the majority of the New Zealand dairy cows. As historical records showed that the milk fat phenotype of cow 363's deceased dam was typical of the breed, the mutation has likely occurred de novo. Cow 363 was homozygous for the common Ala 232 variant of DGAT114,15.

Figure 2

Association mapping of the mutation responsible for low milk fat content.

Genome-wide association of milk fat content. The chromosomal position of SNP markers (x axis) is plotted against -log10 GWAS P-value (y axis). The threshold for genome-wide significance (P ≤ 1.27 × 10−6) is indicated by a horizontal line. Markers on chromosome 14 showing significant association with milk fat content are identified.

Figure 3

A de novo mutation in bovine DGAT1 gene induces skipping of exon 16.

(a) Structure of the bovine DGAT1 gene (Genbank accession AY065621.1) indicating the new mutation at position 8078 in exon 16. Small boxes represent untranslated regions, large boxes represent coding regions connected by a line representing introns. The Ala232 allele in the wild-type and mutant genes is indicated below exon 8. Sequence electropherograms from a wild-type cow (top) and founder cow 363 (bottom) indicate the 8078A>C transversion (arrow). The 5′-end of intron 16 is indicated by lower-case sequence. The putative wild-type and mutant splice enhancer motifs are underlined and the conceptual translation of wild-type and mutant mRNAs containing exon 16 is provided below the genomic DNA sequence. (b) Skipping of exon 16. RT-PCR products were obtained from liver biopsies of 18-month-old wild-type (AA) and heterozygous mutant (AC) cows and from liver autopsy samples from a 10-day-old homozygous mutant (CC) cow. Agarose gel electrophoresis reveals transcripts lacking the 63-nucleotide exon 16 in mutant livers (lanes 1-3, lower band of 137 bp). A small amount of full-length transcripts can be detected in a homozygous mutant with a PCR primer binding to exon 16 (lane 7, lower band of 106 bp). Lane 4, molecular weight markers of 200 and 100 bp. (c) Presence of full-length transcripts in homozygous mutants. Transcripts containing exon 16 were quantified by RT-qPCR. Data indicate mean relative abundance ± s.e.m. of transcripts obtained with primers binding to regions corresponding to exons 15 and 16 (normalized to transcripts containing exons 4 and 5), averaged from 2 independent liver samples obtained from the number of cows indicated (n) at equivalent time points in 2 lactation seasons (only one sample was obtained from each CC homozygote). P(AA-CC) = 1.85 × 10−5, P(AA-AC) = 1.54 × 10−4, P(AC-CC) = 1.01 × 10−5, two-tailed Student's t-test.

RT-PCR amplification and cDNA sequencing revealed that the mutation disrupts splicing and causes exon 16 skipping. In liver and mammary gland biopsies from heterozygous animals, transcripts lacking the 63-nucleotide exon 16 were as abundant as full-length transcripts (Fig. 3b). In contrast, only low levels of full-length transcripts were detectable in liver biopsies from homozygous mutants (Fig. 3b,c). Analysis of the wild-type sequence of exon 16 identified the putative consensus exonic splicing enhancer16,17 motif 5′-ATGATG overlapping the mutation (Fig. 3a).

To assess the effect of the mutation on enzyme activity, we expressed wild-type and mutant DGAT1 cDNAs in a bakers' yeast strain lacking endogenous diacylglycerol acyltransferase activity18. In contrast to the full-length variants, the mutant enzyme lacking the 21 amino acids encoded by exon 16 was unable to transfer oleic acid from CoA to diacylglycerol (Fig. 4).

Figure 4

The mutation abolishes the diacylglycerol acyltransferase activity of Dgat1.

Thin-layer chromatography (a) and quantification by densitometry (b) of triglycerides produced by recombinant Dgat1 wild-type (Lys232 and Ala232) and mutant (Ala232-ΔE16) enzymes, compared to vector-only control (pYES2). Extracts were adjusted to contain equivalent levels of recombinant DGAT1 mRNA or yeast TDH1 mRNA (for pYES2). Panel a shows results from a typical experiment, while data in panel b show mean ± s.e.m. of triglyceride levels produced by at least seven extracts prepared from at least two independent yeast transformations for each plasmid. P-values determined by two-tailed Student's t-test are indicated. As previously reported15, the full-length Ala232 variant showed slightly lower activity than the ancestral Lys232 enzyme.

These results show that a novel single nucleotide substitution in an exonic splicing enhancer of the DGAT1 gene induces exon 16 skipping and results in enzymatically inactive diacylglycerol O-acyltransferase 1. The mutation also reveals a link between triglyceride synthesis capacity and milk fat saturation and provides an important clue to the role of DGAT1 in milk fat saturation19.

Cows heterozygous for the new mutation thrived in a pastoral environment and displayed normal growth, survival and fertility rates. In contrast, homozygous calves exhibited scouring (non-bloody watery diarrhea), slow growth, sensitivity to dietary fats, reduced levels of serum cholesterol, free fatty acids and triglycerides and flattened intestinal microvilli. Scouring severity and incidence were reduced on iso-energetic low-fat diets and growth rates improved by regular parenteral administration of essential lipids. These phenotypes overlap in part with the single reported case of loss of DGAT1 function in humans20, but differ markedly from the normal survival and tolerance of high-fat diets of DGAT1 knockout mice13,21.

Our observations indicate that the mutant cows may provide valuable new insights into the role of DGAT1 in intestinal lipid absorption, postabsorptive fatty acid re-esterification and enteroendocrine peptide release22,23,24,25. In light of ongoing efforts targeting DGAT1 to treat obesity and diabetes26,27,28,29, it will also be interesting to investigate over the longer lifespan of cows any novel effects of DGAT1 deficiency and whether the pleiotropic effects of DGAT1 deficiency observed in mice12 will manifest in a large mammal.

In summary, our findings demonstrate that quality phenotypes for large populations, under long-term trait-specific selective pressure, can be screened to identify new and rare mutations responsible for extreme traits and that these mutations can be employed to establish new animal lines. Combined with genome sequencing and genomic selection for the integration of these traits into the wider population, this approach should result in the rapid identification of animal lines producing foods with improved nutritional qualities. Thus screening of large, well-characterized populations appears to be an approach able to “rescue” new phenotypes generated by spontaneous mutation from the extinction predicted by a mathematical theory of long standing4.

Methods

Population screen

New Zealand's National Herd Testing database records milk volume, protein and fat content and somatic cell count determined by Fourier transform infrared spectroscopy (FTIR, http://www.foss.dk) for 70–80% of dairy cows 3–4 times per season. We retrieved statistical outlier cows with low or high milk fat percentage, high milk protein percentage, or high milk protein : fat content ratio in multiple herd tests over at least two lactation seasons. Candidates were compared to herd mates to exclude cows with extreme milks due to environmental factors and we biased selection towards de novo mutations by excluding candidates with paternal half-sibs or maternal ancestors showing similar, albeit less extreme, phenotypes. Candidates were inspected for gross phenotypic abnormalities and their owners interviewed to reveal possible environmental contributions to the extreme phenotypes. The owners were informed of the aims and potential contribution of their animal to the program before the cows were purchased.
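The outlier criterion can be illustrated with a simple standard-score filter. This is a minimal sketch, not the database query that was actually run: the function names and the per-season input format are hypothetical, the 3.5 SD threshold and two-season requirement follow the Results and the paragraph above, and the group SD of 0.44 used in the example is back-calculated from the figures reported for cow 363 (2.81% being 3.9 SD below 4.52%).

```python
def z_score(value, group_mean, group_sd):
    """Deviation of a value from its reference group, in SD units."""
    return (value - group_mean) / group_sd


def flag_outlier(seasonal_means, group_mean, group_sd,
                 threshold=3.5, min_seasons=2):
    """Flag a cow that is extreme in the same direction in enough seasons.

    `seasonal_means` holds one adjusted trait mean per lactation season
    (hypothetical input format for this sketch).
    """
    z = [z_score(v, group_mean, group_sd) for v in seasonal_means]
    low = sum(s <= -threshold for s in z)
    high = sum(s >= threshold for s in z)
    return max(low, high) >= min_seasons


# Illustrative values only; the per-season figures are invented.
print(flag_outlier([2.80, 2.90, 2.75], group_mean=4.52, group_sd=0.44))
print(flag_outlier([4.40, 4.60], group_mean=4.52, group_sd=0.44))
```

Requiring the same direction of deviation across seasons mirrors the intent of excluding transient, environmentally driven extremes.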

Animal procedures

Animals were managed under standard New Zealand seasonal farming routines on mixed ryegrass/white clover pasture and milked twice daily. Adult females were artificially inseminated for calving in July and August. Experimental samples were collected from September to February before cows were dried off in April or May. Lactating females from the founder pedigree were farmed together with unrelated control cows.

In vitro fertilization and embryo implantation into surrogate dams were performed according to routine industry protocols (Artificial Breeding Services, Hamilton, New Zealand). Liver and mammary gland tissues were collected post milking by needle biopsy30, or at slaughter. Samples were snap-frozen in liquid nitrogen and stored at −86°C until use. Animals were monitored after biopsy as described30. Procedures were approved by the AgResearch Ruakura Animal Ethics Committee.

Milk composition analysis

Milk samples were collected during routine morning and evening milking and combined into a representative daily sample. Milk fat and protein content was determined by FTIR. Fatty acid composition was determined by gas chromatography as described31 and casein and whey protein content was determined by HPLC and SDS-PAGE32.

Genetic mapping

The 92 granddaughters of cow 363 indicated in figure 1a (excluding the single cow with 4.16% milk fat content) were genotyped (BovineSNP50 Genotyping Bead Chip, Illumina) by GeneSeek, Inc. (Lincoln, NE, U.S.A.). SNPs with > 5% missing data or minor allele frequency < 5% or that did not map to a chromosome were removed, leaving 39,438 SNPs for analysis. The animals were analyzed for SNP associations utilizing a mixed linear model in Tassel 2.133: milk fat content (in %) = genotype + u + e, where u is a random effect accounting for relatedness and has covariance matrix proportional to the kinship matrix. The kinship matrix was estimated using PLINK34. A Bonferroni correction was used to adjust for multiple testing. The 0.05-level multiple-testing corrected significance threshold is 0.05/39438 = 1.27 × 10−6, which was met by six markers.
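The marker filter and the Bonferroni threshold described above reduce to a few lines of arithmetic. The sketch below is illustrative only; the function names and argument layout are assumptions, and the actual filtering and association testing were done with the cited software rather than with this code.

```python
def passes_qc(missing_rate, maf, chromosome,
              max_missing=0.05, min_maf=0.05):
    """Keep a SNP unless it has > 5% missing data, MAF < 5%, or no
    chromosome assignment (represented here as None)."""
    return (chromosome is not None
            and missing_rate <= max_missing
            and maf >= min_maf)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


threshold = bonferroni_threshold(0.05, 39438)
print(f"{threshold:.2e}")  # 1.27e-06

# A marker is declared significant when its P value is at or below the threshold.
print(passes_qc(missing_rate=0.01, maf=0.20, chromosome=14))
```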

Sequence analysis

The sequence of the DGAT1 locus from the founder and one of her low-fat daughters was determined by PCR and Sanger sequencing and compared to sequences from normal cows, to GenBank record AY065621.1 and to bovine reference genome build 4.1, by visual inspection of chromatograms. All sequence variants were confirmed by independent amplification, sequencing and genotyping reactions.

Genotyping of the 8078A>C mutation

Animals from the 363 pedigree, 185 sires frequently used in the New Zealand dairy population and 80 sires and 1595 cows representing a crossbreed herd35, were genotyped for the mutation by the Australian Genome Research Facility using a custom-designed iPLEX™ Gold assay (SEQUENOM) with primers detailed in supplementary file 2.

Identification of splicing regulatory motifs

The effect of the 8078A>C substitution was analyzed against a set of functionally validated hexamer motifs that are statistically overrepresented in exons16,36,37 by querying the RESCUE-ESE web server (http://genes.mit.edu/burgelab/rescue-ese/) with wild-type and mutant bovine DGAT1 sequences.
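The underlying idea, comparing the hexamers that overlap the substituted base before and after the change, can be sketched without the web server. Everything below is illustrative: the toy sequence is hypothetical and is not bovine DGAT1, the position of the substituted A within the motif is chosen arbitrarily, and the motif set contains only the one hexamer named in the Results (the real set is the one served by RESCUE-ESE).

```python
def hexamers_overlapping(seq, pos, k=6):
    """Return (start, k-mer) for every window of length k covering the
    0-based index `pos`."""
    first = max(0, pos - k + 1)
    last = min(pos, len(seq) - k)
    return [(i, seq[i:i + k]) for i in range(first, last + 1)]


def motif_hits(seq, pos, motifs):
    """Overlapping k-mers that belong to the supplied motif set."""
    return [(i, h) for i, h in hexamers_overlapping(seq, pos) if h in motifs]


ESE_MOTIFS = {"ATGATG"}       # single motif named in the text; not the full set
wild_type = "GGCATGATGCTG"    # hypothetical toy sequence, NOT bovine DGAT1
pos = 6                       # arbitrary A within the motif, for illustration
mutant = wild_type[:pos] + "C" + wild_type[pos + 1:]

print(motif_hits(wild_type, pos, ESE_MOTIFS))  # [(3, 'ATGATG')]
print(motif_hits(mutant, pos, ESE_MOTIFS))     # []
```

A motif present in the wild-type windows but absent from the mutant windows is the pattern that would suggest disruption of a putative enhancer.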

Analysis of DGAT1 mRNA splicing

Total RNA was prepared from liver samples using RNeasy columns (QIAGEN) and on-column DNase treatment and reverse transcribed using the SuperScript III First-Strand Synthesis System (Invitrogen). Presence of DGAT1 exon 16 was determined by gel electrophoresis and sequencing of the PCR products obtained with primer pairs to exons 15 and 16, or to exons 15 and 17. Exon 16 skipping was quantified by qPCR on a Lightcycler 480 using the Lightcycler 480 Probes Master System and universal probes (Roche). Briefly, exon 16-containing transcripts were PCR-amplified using primers to exons 15 and 16 with fluorescent probe #44 and normalized to DGAT1 transcripts quantified using primers to DGAT1 exons 4 and 5 with probe #98. Detailed parameters are provided in the Supplementary MIQE File. Statistical significance of expression differences was determined using Student's t-test (two-sided, unequal variance).
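For readers unfamiliar with this type of normalization, the calculation can be sketched as below. This is an assumption-laden illustration, not the procedure documented in the Supplementary MIQE File: it uses the common 2^(-ΔCq) form with an assumed amplification efficiency of 2 for both assays, and all function names are hypothetical. The unequal-variance t statistic and its Welch-Satterthwaite degrees of freedom are standard formulas; converting them to a P value would additionally require a t-distribution function.

```python
from statistics import mean, variance


def relative_abundance(cq_target, cq_reference, efficiency=2.0):
    """Target abundance relative to the reference assay, assuming equal
    amplification efficiency for both assays (assumption of this sketch)."""
    return efficiency ** (cq_reference - cq_target)


def welch_t(a, b):
    """Unequal-variance t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Invented Cq values: exon 15/16 assay one cycle later than exon 4/5 assay.
print(relative_abundance(cq_target=26.0, cq_reference=25.0))  # 0.5
print(welch_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```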

Yeast expression and diacylglycerol acyltransferase assay

Yeast expression vectors encoding wild-type (Lys232 and Ala232) or mutant (Ala232-ΔE16) DGAT1 were constructed by cloning bovine cDNAs into pYES2 (Invitrogen). Plasmids and vector were transformed into triglyceride-negative strain H124618 and transgene expression was induced using galactose as described (pYES2 Manual, Invitrogen). Twenty OD600 units of culture were harvested and 1 OD600 unit was retained for mRNA quantification. Cells were sedimented, washed with water and disrupted by vigorous vortexing in 50 μL of 10 mM Tris-HCl (pH 7.9), 10 mM MgCl2, 1 mM EDTA, 5% glycerol, 1 mM DTT, 0.3 M ammonium sulfate, Complete protease inhibitor cocktail (Roche), 0.8 mM Pefabloc SC (Roche) and 600 mg of glass beads (diameter 450–600 μm, Sigma-Aldrich). The cell lysate was recovered in 500 μL disruption buffer and cleared by centrifugation at 12,000 × g for 10 min. Protein concentration was determined using the DC protein assay (Bio-Rad).

Diacylglycerol acyltransferase activity in 50 μg of clarified yeast lysate was determined as described15. TLC plates were exposed to a phosphor imaging screen and scanned (Pharos FX+, Bio-Rad). DGAT1 cDNA was quantified by qPCR using primers to exons 4 and 5 (see above), while endogenous TDH1 transcripts were quantified using fluorescent probe #82 and primers detailed in supplementary file 2. DGAT1 expression levels were computed as the ratio of mean quantification cycle (Cq) for the bovine DGAT1 reaction to the mean Cq for the yeast TDH1 reaction. Statistical significance of gene expression differences was determined using Student's t-test (two-sided, unequal variance).