Multi-omics dataset of bovine mammary epithelial cells stimulated by ten different essential amino acids

Application of high-throughput sequencing and screening help to detect the transcriptional and metabolic discrepancies in organs provided with various levels of nutrients. The influences of individual essential amino acid (EAA) administration on transcriptomic and metabolomic profilings of bovine mammary epithelial cells (BMECs) were systematically investigated. A RNA sequencing and liquid chromatography-tandem mass spectrometry generated a comprehensive comparison of transcriptomics, non-targeted metabolomics and targeted amino acids profilings of BMECs with individual EAA stimulation by turn. The sequencing data and raw LC-MS/MS data of samples were presented in the databases of Gene Expression Omnibus, MetaboLights and Figshare for efficient reuse, including exploring the divergences in metabolisms between different EAAs and screening valuable genes and metabolites regulating casein synthesis.


Background & Summary
Milk protein from mammary gland of dairy cows is a valuable protein source available for human consumption 1 .However, insufficient utilization of genetic potential restricts dairy's competitiveness and weakens efforts to decline investments into food production 2 .Although it varies greatly, the efficiency of dietary nitrogen conversion into milk is only about 25% in lactating cows, which results in a huge pecuniary loss and a series of environmental pollutions 3 .
Given that, nitrogen intake has a significant correlation to nitrogen excretion 4 .Dairy cows are likely to be supplied protein that exceed their requirements resulting from unclear definitions of amino acid (AA) needs.The bovine mammary gland is a forceful milk protein synthesizing factory.Present diet formulation is according to the single limiting nutrient theory proposed by von Liebig, which suggested that AA transfer from lumen to milk follows a stationary efficiency until needs are satisfied 5 .However, the observations that several AAs regulated cell signaling pathways 5,6 and the changeable AA transport activity of mammary gland indicate the mutable efficiency of AA utilization that violates the supposition raised by Mitchell and Block 7 .Therefore, more data about the individual AA metabolism in mammary gland is required to improve the predictions of AAs requirements for dairy cows.
When dairy cows were fed an average diet of 17.8% crude protein (CP), only 25% of dietary nitrogen is recovered in milk 3 , but this efficiency increased to 30% under a grass-based diet with a well-balanced supplement of histidine, lysine, methionine, and leucine to a 15% CP diet 8 .Similarly, Zhao et al. reported that lactating cows fed 12% CP diet supplemented with isoleucine, leucine, methionine and threonine had similar milk productions relative to those fed 16% CP diet 9 .These results indicated that a reasonable strategy to elevate postabsorptive nitrogen efficiency is to reduce present percentages of dietary nitrogen and add specific essential AAs (EAA), which requires a better understanding and comparison of different EAA conversions in mammary gland.
Similar to the low-protein diet supplemented with EAAs in dairy cows, single EAA addition to the standard medium devoid of total EAAs had been used to explore the influences of single EAA on milk protein synthesis in vitro 1,5,10 .Gao et al. found that cultivation with standard medium devoid of all AA and supplemented with histidine, lysine, methionine, and leucine for 6-h had different effects on β-casein synthesis 11 , respectively.Appuhamy et al. reported that media lack of total EAA and individually supplemented with arginine, isoleucine, leucine, methionine, or threonine exerted various influences on mTOR signaling pathways that regulate the rates of translation initiation and elongation 5 .These results indicated that different individual EAA addition under low total AA/EAA settings had diverse effects on casein synthesis.Besides, additions of several EAAs, such as arginine, leucine and branched-chain AAs were also used to improve the health status of humans with preeclampsia, sarcopenia and cirrhosis [12][13][14] , respectively.However, previous studies usually focused on few EAAs, and information about the systematic comparison of mammary metabolisms between all individual EAAs is relatively limited.Therefore, transcriptomic and metabolomic profilings as well as the AA concentrations of bovine mammary epithelial cells (BMECs) came from 72 samples with various single EAA availability (0 or standard concentration in medium for each EAA) were systematically compared in this study based on TruSeq Stranded mRNA LTSample Prep Kit (Illumina) and liquid chromatography-tandem mass spectrometry (LC-MS/MS).All of data were deposited in Gene Expression Omnibus 15,16 , MetaboLights [17][18][19] and Figshare 20,21 repositories.
The public availability of this dataset helps to promote researches in biochemistry of EAA regulating milk protein synthesis, transcriptomic and metabolomic comparisons between different studies or explorations into repetitiveness of data analysis in multi-omics.Given that this study has a horizontal design, this dataset can be used in multiple perspectives.First, these data presented a landscape of transcriptomic and metabolomic profiles in BMECs with different individual AA supplies, which contributes to investigate the discrepancies between individual EAA metabolisms.Second, it helps researchers to study the pathways by which EAAs mediate BMECs function and milk protein synthesis.Third, data can be applied for optimizing mammary mathematical models to predict the responses of milk protein synthesis to EAA administrations and contributing to screening key genes and metabolites with potential breeding values.

Methods
Study design.Experimental protocols were ratified by Welfare and Health Committee of Qingdao Agricultural University.Primary BMECs were isolated as described previously 22 .Briefly, mammary gland tissues from healthy dairy cows were cut into 1mm 3 pieces in D-Hanks solution (Solarbio, Beijing, China) containing 1% penicillin-streptomycin (Sigma-Aldrich, MO, USA).Following washing with PBS buffer three times, the pieces were incubated at 37 °C and 5% CO 2 .The tissues were removed when the cells isolated from tissues reached 80% confluence.BMECs and fibroblasts were divided according to their different sensitivity to 0.15% trypsin plus 0.02% EDTA (Sigma-Aldrich, MO, USA).BMECs (6 passages) were grown in DMEM-F12 medium (Gibco, NY, USA) including 1% L-glutamine and 10% (v/v) fetal bovine serum (Gibco, NY, USA) as well as 5 μg/mL insulin and 1% penicillin-streptomycin (both from Sigma-Aldrich, MO, USA) at 37 °C and 5% CO 2 .
The experimental design was shown in Fig. 1.For exploring the transcriptomic and metabolomic responses to diverse EAA stimuli, 0.5 × 10 7 BMECs were seeded in a 12-well plate.After nearly 48-h cultivation, the cells were serum-starved overnight when they reached a 90% confluence and had a aggregates-like formation, and then subsequently assigned to 1 of 12 treatment media (n = 6).Complete DMEM-F12 medium (Gibco, NY, USA) containing 0.70 L-arginine, 0.15 L-histidine, 0.42 L-isoleucine, 0.45 L-leucine, 0.5 L-lysine, 0.12 L-methionine, 0.22 L-phenylalanine, 0.45 L-threonine, 0.04 L-tryptophan, and 0.45 L-valine (all in mmol/L) serves as the positive control (POS) treatment, while DMEM-F12 medium without total EAA served as the negative control (NEG) treatment.Ten treatments were NEG individually supplemented with L-arginine, L-histidine, L-isoleucine, L-leucine, L-lysine, L-methionine, L-phenylalanine, L-threonine, L-tryptophan or Fig. 1 The experimental design and workflow to acquire data of transcriptomics, non-targeted metabolomics and targeted amino acids profilings in BMECs.Complete DMEM-F12 medium used for BMECs was the positive control (POS) treatment, while DMEM-F12 medium without total essential amino acid served as the negative control (NEG) treatment.Ten treatments were NEG individually supplemented with L-arginine, L-histidine, L-isoleucine, L-leucine, L-lysine, L-methionine, L-phenylalanine, L-threonine, L-tryptophan or L-valine, respectively (n = 6).Individual EAA was supplemented to achieve a concentration equal to that in POS.
L-valine (Sigma-Aldrich, MO, USA), respectively.Individual EAA was supplemented to achieve a concentration equal to that in the POS treatment.After 6-h treatment, cell samples were collected simultaneously and kept at −80 °C for subsequent analysis.

Sample preparation.
Total RNA of treated cells were collected with TRIzol reagent (Invitrogen, Carlsbad, CA, USA).The quantity and quality (optical density at 260/280 nm = 1.8-2.0) of RNA were measured using a biophotometer (Eppendorf, Hamburg, Germany) and agarose gel electrophoresis for analyzing 28 S and 18 S rRNA subunits.
For non-targeted and targeted metabolomics, the metabolites of 1 × 10 6 cells were taken from each treatment using 1 mL of reagent including methanol, acetonitrile and water (2:2:1, v/v/v).The BMECs solutions were then vortexed for 1 min, following by ultrasonicating for 0.5-h at 4 °C and for 1-h at −20 °C to precipitate protein.
A vacuum centrifuge was adopted to dry the supernatant of solution after a centrifugation with 14,000 rcf for 20 min, and then maintained at −80 °C until latter use.At last, the precipitation after drying was dissolved in 0.1 mL acetonitrile/water (1:1, v/v) and adequately vortexed, and then followed a 20 min centrifugation with 14,000 rcf to obtain supernatants for subsequent LC-MS/MS analysis.1. Statistics analysis of clean reads mapping onto reference genome.Complete DMEM-F12 medium used for BMECs was the positive control (POS) treatment, while DMEM-F12 medium without total essential amino acid served as the negative control (NEG) treatment.Ten treatments were NEG individually supplemented with L-arginine, L-histidine, L-isoleucine, L-leucine, L-lysine, L-methionine, L-phenylalanine, L-threonine, L-tryptophan or L-valine, respectively (n = 6).Individual EAA was supplemented to achieve a concentration equal to that in POS.After 6-h treatment, total RNA was extracted for latter RNA-seq.
total RNa sequencing.Poly-T oligo-attached magnetic beads were applied to purify mRNA.The synthesis of first-strand cDNA was conducted by random hexamer primer and M-MuLV Reverse Transcriptase(RNase H-), and the second-strand cDNA was generated with DNA Polymerase I and RNase H. Adaptor having hairpin loop structure were ligated after the adenylation of 3' ends of DNA fragments.AMPure XP system was used to purify PCR products.The quality of library was assessed by the Bioanalyzer 2100 system (Agilent Technologies, CA, USA).The library establishments were then sequenced using an Illumina sequencing platform (HiSeqTM 2500) and 150 bp paired-end reads were producted.
Chromatography.BMECs samples were dissociated on an UHPLC (Vanquish UHPLC, Thermo) along with a Orbitrap.The mobile phase included A = ammonium acetate (25 mM) and ammonium hydroxide (25 mM) in water as well as B = acetonitrile.The process was 98% acetonitrile for 90 seconds and was decreased to 2% during 10.5 min linearly, and maintained for 2 min following by increasing to 98% in 6 seconds.Continuous analysis of samples was conducted arbitrarily to reduce the influences of instrument detection signal fluctuations.Quality control (QC) samples that were composed of aliquots from total samples were operated 3 times before the queue to monitor the column condition and every 6 inserts after that to evaluate discrepancies.BMECs samples for targeted metabolomics of AA profiling were separated on an Agilent 1290 Infinity LC UHPLC System (Agilent Technologies, CA, USA) using a HILIC column.The mobile phase of composition A includes water, ammonium formate (25 mM) and formic acid (0.1%), and B was 0.1% formic acid in acetonitrile.The process of elution was: 85% B during 0-1 min; B reduced from 85% to 50% (1-11 min) in a linear manner, B was kept at 40% (11.1-12 min), B elevated from 40% to 75% (12-12.1 min), and finally B was kept at 75% (12.1-19 min).Continuous analysis of samples was conducted arbitrarily to reduce the influences of instrument detection measuring undulations.QC samples were operated 3 times before the queue to monitor the column condition and every 6 inserts after that to evaluate discrepancies.Chromatographic retention time was corrected by the blend of standard AA metabolites.
Mass spectrometry.ESI positive and negative ion modes were adopted in non-targeted metabolomics.
BMECs samples were dissociated through UHPLC and underwent mass spectrometry using a Thermo Scientific Orbitrap Exploris 480 (Thermo Fisher Scientific).The ESI source settings were: source temperature, 600 °C; ion source gas 1 (nitrogen), 60; ion source gas 2 (nitrogen), 60; curtain gas, 30; ion spray voltage floating, ±5,500 V.In MS only acquisition, 80-1,200 Da m/z was got by the system coupled with the resolution of 60,000 and the accumulation time of 100 ms.In auto MS/MS acquisition, the system was adjusted to get over the m/z from 70 to 1,200 Da and the resolution was adjusted as 30,000.The accumulation time was adjusted as 50 ms, and exclude time within 4 s.

Data analysis.
The QC and reads statistics in transcriptomics were measured by Trimmomatic 23 .Subsequent analyses were conducted with high-quality clean data.Hisat2 was used for the mapping of Bos taurus clean reads to the corresponding reference genome 24 .StringTie (v1.3.3b) was adopted to assemble the mapped reads of each sample 25 .Each transcript's expression was quantified using the amount of Fragments Per Kilobase of transcript sequence per Millions base pairs sequenced (FPKM) method 26 , and the read counts as well as FPKM value were  and c).Complete DMEM-F12 medium used for BMECs was the positive control (POS) treatment, while DMEM-F12 medium without total essential amino acid served as the negative control (NEG) treatment.Ten treatments were NEG individually supplemented with L-arginine, L-histidine, L-isoleucine, L-leucine, L-lysine, L-methionine, L-phenylalanine, L-threonine, L-tryptophan or L-valine, respectively (n = 6).Individual EAA was supplemented to achieve a concentration equal to that in POS.Quality control (QC) samples in metabolomics that were composed of aliquots from all samples were operated 3 times before the queue to monitor the column condition and every 6 inserts after that to evaluate discrepancies.The distinctive cluster of QC samples indicated that there was no significant variation induced by non-biology in this experiment.
computed by cufflinks and htseq-count 27 , respectively.The two expression profilings between each treatment and control group were quantified by DESeq R package using nbinom test 28 .Significant differentially expressed genes (DEG) were discriminated when the unigenes have P < 0.05 and |log2(fold-change)| > 1.
The raw data files of non-targeted metabolomics were transformed to the format of mzML by ProteoWizard 29 .XCMS program was applied for peak alignment, retention time correction, and peak area extraction 30 .The following settings for peak picking were applied: centWave m/z = 25 ppm, peakwidth = c (10, 60), prefilter = c (10, 100).Bw = 5, mzwid = 0.025, minfrac = 0.5 were applied for peak grouping.Metabolite identification was conducted by MS/MS spectra using an in-house database.To make the metabolomics data reproducible, the relative standard derivation (RSD) of the peaks in the QC samples larger than 30% were filtered out.After normalized to total peak intensity, the processed data were uploaded into SIMCA-P (version 14.1, Umetrics, Umea, Sweden), where it was subjected to multivariate data analysis, including Pareto-scaled principal component analysis (PCA) and orthogonal partial least-squares discriminant analysis (OPLS-DA).Response permutation testing along with 7-fold cross-validation were used to assess the robustness of model.The variable importance in the projection (VIP) value of each variable in the OPLS-DA model was calculated to indicate its contribution to the classification.Significance was measured with an unpaired Student's t test.Compounds with VIP value > 1 and P < 0.05 were considered as differentially expressed metabolites (DEM).
For targeted metabolomics of AA profiling in BMECs samples, multiQuant software was adopted to extract the chromatographic peak area and retention time.AA standards were used for retention time correction and metabolites identification 31 .Compounds having coefficient of variation under 30% cross samples were identified as reproducible measurements.For DEM identification, statistical analyses between two groups for each EAA administration (NEG vs. individual EAA treatment or POS vs. individual EAA treatment) were conducted with fold changes and P-values that were obtained according to a Student's t test.Metabolites having P-values < 0.05 and VIP > 1 were considered as DEM.

Data Records
The sequencing data of BMECs with different EAA administrations were presented in Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo) with number of GSE232591 16 .Correlation coefficient matrix among the samples used for transcriptomics was uploaded to Figshare 20 .Raw LC-MS/MS data files for non-targeted and targeted metabolomics in BMECs were uploaded to the MetaboLights 17 database (http://www.ebi.ac.uk/metabolights) under MTBLS7789 18 and MTBLS3956 19 , respectively.Metabolic annotation and DEM analysis were present in Figshare 21 .

technical Validation
Quality of the sequencing data was evaluated according to the sequence quality, GC content, presence of adaptors and overrepresented k-mers with FastQC.Samples subjected to routine data cleaning to guarantee that no base was called with a Phred quality below 20.Statistical robustness was supported by the 6 biological replicates of each treatment.A number of raw reads came from 12 treatments, ranging from 40,612,706 to 53,705,974 (Table 1).A total of 40,253,216 to 52,418,068 clean reads were kept after trimming and the overall mapping efficiency ranged from 95.92 to 97.71%.These results suggested that the sequencing data has a high quality for subsequent analysis.Correlation of gene expressions between samples is an important parameter to check the experimental reliability.Our data deposited at Figshare 20 showed that the pearson correlation coefficient (r) with a square value between samples were all greater than 0.85, which was a prerequisite for subsequent differential expression analysis.
Various methods were used to improve the data quality in mass spectrometry experiments.First, sample extraction, LC-MS derivatization, and MS run followed the randomization sequences.Second, QCs that were made up of different samples were inserted during the measurement process to assess the reliability of data and system stability.Stability of measurement system was also examined using PCA analysis.PCA is a method that gives a overview of the data regarding questions about high variance as well as sample clusters and outliers 32 .QC samples cluster distinctively in Fig. 2, indicating that there was no significant variation induced by non-biology in this experiment.

Fig. 2
Fig. 2 Principal component analysis of the datasets obtained from transcriptomics (a) and metabolomics (band c).Complete DMEM-F12 medium used for BMECs was the positive control (POS) treatment, while DMEM-F12 medium without total essential amino acid served as the negative control (NEG) treatment.Ten treatments were NEG individually supplemented with L-arginine, L-histidine, L-isoleucine, L-leucine, L-lysine, L-methionine, L-phenylalanine, L-threonine, L-tryptophan or L-valine, respectively (n = 6).Individual EAA was supplemented to achieve a concentration equal to that in POS.Quality control (QC) samples in metabolomics that were composed of aliquots from all samples were operated 3 times before the queue to monitor the column condition and every 6 inserts after that to evaluate discrepancies.The distinctive cluster of QC samples indicated that there was no significant variation induced by non-biology in this experiment.