Cost-effective generation of precise label-free quantitative proteomes in high-throughput by microLC and data-independent acquisition

Quantitative proteomics is key for basic research, but needs improvements to satisfy an increasing demand for large sample series in diagnostics, academia and industry. A switch from nanoflow-rate to microflow-rate chromatography can improve throughput and reduce costs. However, concerns about undersampling and coverage have so far hampered its broad application. We used a QTOF mass spectrometer of the penultimate generation (TripleTOF5600), converted a nanoLC system into a microflow platform, and adapted a SWATH regime for large sample series by implementing retention time and batch correction strategies. From 3 µg to 5 µg of unfractionated tryptic digests that are obtained from proteomics-typical amounts of starting material, microLC-SWATH-MS quantifies up to 4000 human or 1750 yeast proteins in an hour or less. In the acquisition of 750 yeast proteomes, retention times varied between 2% and 5%, and the typical peptide was quantified with 5–8% signal variation in replicates, and below 20% in samples acquired over a five-month period. Providing precise quantities without being dependent on the latest hardware, our study demonstrates that the combination of microflow chromatography and data-independent acquisition strategies has the potential to overcome current bottlenecks in academia and industry, enabling the cost-effective generation of precise quantitative proteomes at large scale.


Suppl. Fig. 3. Yeast protein quantification from microLC-SWATH using 29 x 16 m/z isolation windows.
A yeast tryptic digest was analysed using microLC (25 cm x 0.3 mm Triart-C18, 3 µl/min, 60 min gradient) coupled to a TripleTOF5600 MS operating in SWATH mode by repeated injection of 10 µg yeast proteome digest (9x). Isolation windows were chosen as 29x16 m/z, with 40 ms accumulation time and a mass range of 400-850 m/z. Data was processed with Spectronaut v8.0 using SWATH libraries generated a) by sample fractionation (frac) or b) sample exhaustion (exh), c) using a publicly accessible spectral library (Biognosys library, Spectronaut repository), or with a library generated by DIA-Umpire. Library a) allowed quantification of 1422 proteins, while 1157 proteins could be quantified using library b). The public library (c) quantified 1118 proteins, and DIA-Umpire, which uses correlation patterns to create a library directly out of the SWATH data, 890 proteins. Spectral libraries (except the one obtained from Spectronaut repository) were generated according to Schubert et al. 1

Suppl. Fig. 4. Yeast peptide quantification from microLC-SWATH using 34 x 25 m/z isolation windows.
A yeast tryptic digest was analysed using microLC (25 cm x 0.3 mm Triart-C18, 3 µl/min, 60 min gradient) coupled to a TripleTOF5600 MS operating in SWATH mode by repeated injection of 10 µg yeast proteome digest (9x). Isolation windows were chosen as 34x25 m/z, with 100 ms accumulation time and a mass range of 400-1250 m/z. Data was processed with Spectronaut v8.0 using SWATH libraries generated a) by sample fractionation (frac) or b) sample exhaustion (exh), c) using a publicly accessible spectral library (Biognosys library, Spectronaut repository), or with a library generated by DIA-Umpire. Library a) allowed quantification of 8824 peptides, while 6866 peptides could be quantified using library b). The public library (c) quantified 8283 peptides, and DIA-Umpire, which uses correlation patterns to create a library directly out of the SWATH data, 6132 peptides. Spectral libraries (except the one obtained from Spectronaut repository) were generated according to Schubert et al. 1
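The two acquisition schemes described in the captions above (29 x 16 m/z at 40 ms, and 34 x 25 m/z at 100 ms) trade window width against cycle time. The following is a minimal sketch of that arithmetic, not the instrument method itself. It assumes contiguous, non-overlapping windows starting at the lower mass limit, and it ignores the MS1 survey scan and any instrument overhead, so the real cycle times would be somewhat longer. The class name `SwathScheme` and the 20 s chromatographic peak width are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwathScheme:
    """Parameters of a fixed-width SWATH window scheme (illustrative only)."""
    n_windows: int
    width_mz: float
    accumulation_ms: float
    mz_start: float
    mz_end: float

    def windows(self):
        # Assumption: contiguous, non-overlapping windows from mz_start upwards.
        return [
            (self.mz_start + i * self.width_mz,
             self.mz_start + (i + 1) * self.width_mz)
            for i in range(self.n_windows)
        ]

    def nominal_span_mz(self):
        # Total m/z width covered if the windows do not overlap.
        return self.n_windows * self.width_mz

    def ms2_cycle_s(self):
        # MS2 time per cycle; the MS1 scan and overhead are deliberately ignored.
        return self.n_windows * self.accumulation_ms / 1000.0

    def points_per_peak(self, peak_width_s):
        # Approximate number of MS2 data points across one chromatographic peak.
        return peak_width_s / self.ms2_cycle_s()


# Values taken from the figure captions.
narrow = SwathScheme(29, 16.0, 40.0, 400.0, 850.0)
wide = SwathScheme(34, 25.0, 100.0, 400.0, 1250.0)

PEAK_WIDTH_S = 20.0  # hypothetical peak width, not a value from the study

for name, scheme in (("29 x 16 m/z", narrow), ("34 x 25 m/z", wide)):
    print(
        f"{name}: nominal span {scheme.nominal_span_mz():.0f} m/z "
        f"(stated range {scheme.mz_end - scheme.mz_start:.0f} m/z), "
        f"MS2 cycle {scheme.ms2_cycle_s():.2f} s, "
        f"about {scheme.points_per_peak(PEAK_WIDTH_S):.1f} points per "
        f"{PEAK_WIDTH_S:.0f} s peak"
    )
```

Under these assumptions the 34 x 25 m/z scheme tiles the stated 400-1250 m/z range exactly, whereas 29 windows of 16 m/z nominally span 464 m/z against a stated range of 450 m/z, which suggests overlapping or slightly extended windows that the captions do not specify.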

Suppl. Fig. 5. Yeast peptide quantification from microLC-SWATH using 29 x 16 m/z isolation windows.
A yeast tryptic digest was analysed using microLC (25 cm x 0.3 mm Triart-C18, 3 µl/min, 60 min gradient) coupled to a TripleTOF5600 MS operating in SWATH mode by repeated injection of 10 µg yeast proteome digest (9x). Isolation windows were chosen as 29x16 m/z, with 40 ms accumulation time and a mass range of 400-850 m/z. Data was processed with Spectronaut v8.0 using SWATH libraries generated a) by sample fractionation (frac) or b) sample exhaustion (exh), c) using a publicly accessible spectral library (Biognosys library, Spectronaut repository), or with a library generated by DIA-Umpire. Library a) allowed quantification of 6598 peptides, while 5673 peptides could be quantified using library b). The public library (c) quantified 6518 peptides, and DIA-Umpire, which uses correlation patterns to create a library directly out of the SWATH data, 4841 peptides. Spectral libraries (except the one obtained from Spectronaut repository) were generated according to Schubert et al. 1

Suppl. Fig. 6. Technical variability of yeast peptide quantification is low in microLC-SWATH-MS irrespective of data extraction
Fold change variability of 765 peptide precursors present in all data sets was compared throughout nine replicates. Median coefficients of variation are 7.3 % and 8.4 % for libraries generated using the fractionation and exhaustion approaches, respectively, 8.8 % for an unrelated yeast library, and 7 % for a library generated by DIA-Umpire. Spectral libraries (except the one obtained from Spectronaut repository) were generated according to Schubert et al. 1

Suppl. Fig. 7. Human peptide quantification from microLC-SWATH data.
A tryptic digest of a whole-cell protein extract from human K562 cells was analysed using microLC (25 cm x 0.3 mm Triart-C18, 3 µl/min, 60 min gradient) coupled to a TripleTOF5600 MS operating in SWATH mode by repeated injection of 3 µg digest (6x). Data was processed with Spectronaut v8.0 using a SWATH library obtained from the SWATHAtlas repository 2 (10k library), or using SWATH libraries generated by repeated analysis of HEK293 or HeLa cell extracts (Spectronaut repository). Data analysis using a rich library allows quantification of 20508 peptides, while 9256 peptides can be quantified using a HEK293 and 12272 using a HeLa library, respectively.

Suppl. Fig. 8. Technical variability of human peptide quantification is low in microLC-SWATH-MS irrespective of data extraction
Fold change variability of 6722 peptide precursors present in all data sets was compared throughout the nine replicates. Median coefficients of variation of signal intensities are between 5.8 % and 6.5 % for all libraries.

Suppl. Fig. 9. Direct correlation of peptide intensities between two human samples reveals no negative bias towards low intensity peptides
Correlation of peptide intensities of two human samples is linear (R² = 0.98), and correlation is equally good for high and low intensity peptides over five orders of magnitude.

Suppl. Fig. 10. Coefficients of variation show little bias towards low intensity peptides
Coefficients of variation of peptide intensities of six human samples are low (mean 3-10 %) for peptides with intensities >1000 cps, and only marginally elevated for lower-intensity peptides (mean 8-12 %). Inset: Coefficients of variation of peptide intensities of six human samples.

Suppl. Fig. 11. Library size is associated with quantification precision.
Protein identifications in a spectral library generated by sample fractionation (green; 3896 proteins) or using the DIA-Umpire approach (violet; 853 proteins) were plotted against absolute protein abundances (log scale) reported by two independent datasets (Kulak et al. DIA-Umpire library indeed captures highly abundant proteins.

Suppl. Fig. 12. Total precursor intensities before and after batch correction
Combined intensities of all precursors vary with experimental batches, and are evened out by batch correction.

Suppl. Fig. 13. Protein fold change standard deviation is reduced by batch correction
Protein fold change was calculated from the three most abundant peptides of the representative ZRP1_YEAST protein quantified in two yeast strains in 9 replicates (3 batches). After batch correction, variability is considerably reduced.
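The effect described in the last two captions can be illustrated with a small sketch. It is not the correction procedure used in the study, which these captions do not specify. As a stand-in, each precursor is rescaled so that its per-batch median matches its overall median. The protein quantity is taken as the sum of the three most intense peptides, and variability is expressed as the coefficient of variation of that quantity rather than as a fold change between strains. All peptide names, intensities and batch effects below are invented for illustration.

```python
from statistics import mean, median, stdev


def cv_percent(values):
    """Coefficient of variation (sample SD / mean) in percent."""
    return 100.0 * stdev(values) / mean(values)


def top3_quantity(peptide_intensities):
    """Protein quantity as the sum of the three most intense peptides."""
    return sum(sorted(peptide_intensities, reverse=True)[:3])


def batch_correct(samples):
    """Median-scaling batch correction (an assumed stand-in method).

    `samples` is a list of (batch_label, {precursor: intensity}) pairs.
    Each precursor is scaled per batch so that its batch median equals
    its median over all samples. The input is left unmodified.
    """
    precursors = list(samples[0][1])
    batches = {batch for batch, _ in samples}
    corrected = [(batch, dict(values)) for batch, values in samples]
    for p in precursors:
        overall = median(values[p] for _, values in samples)
        for b in batches:
            batch_med = median(v[p] for batch, v in samples if batch == b)
            factor = overall / batch_med
            for batch, values in corrected:
                if batch == b:
                    values[p] *= factor
    return corrected


# Invented toy data: one protein, four peptides, 3 batches x 3 replicates.
base = {"pepA": 1000.0, "pepB": 800.0, "pepC": 500.0, "pepD": 100.0}
batch_effect = {"batch1": 1.0, "batch2": 1.3, "batch3": 0.7}
replicate_jitter = [0.97, 1.00, 1.03]

samples = [
    (batch, {p: v * effect * jitter for p, v in base.items()})
    for batch, effect in batch_effect.items()
    for jitter in replicate_jitter
]

raw = [top3_quantity(values.values()) for _, values in samples]
cor = [top3_quantity(values.values()) for _, values in batch_correct(samples)]

print(f"CV before correction: {cv_percent(raw):.1f} %")
print(f"CV after correction:  {cv_percent(cor):.1f} %")
```

In this toy case the multiplicative batch effect dominates the raw variability, so removing it leaves only the replicate jitter. Real data would also need handling of missing values and of batches that are confounded with the biological condition, which this sketch does not attempt.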

Supplementary Tables
Suppl.