Reusing a prepaid health plan’s fecal immunochemical tests for microbiome associations with colorectal adenoma

An altered colonic microbiota probably increases colorectal adenoma (CRA) and cancer (CRC) risk, but large, unbiased fecal collections are needed to examine the relationship of gut microbiota diversity and composition to colorectal carcinogenesis. This study assessed whether fecal immunochemical tests (FITs) from CRA/CRC screening may fulfill this requirement. Using FIT, self-collected by members of Kaiser Permanente Hawaii (KPH), as well as interspersed quality control (QC) specimens, DNA was extracted and amplified to generate 16S rRNA microbiome profiles rarified at 10,000 reads. CRA/CRC were diagnosed by colonoscopy and histopathology. Covariates were from electronic KPH records. Of 921 participants’ FIT devices, 538 (58%) yielded at least 10,000 rRNA reads and 1016 species-level variants mapped to 46 genera. Of the 538 evaluable participants, 63 (11.7%) were FIT-negative per protocol, and they were considered negative for CRA/CRC. Of the 475 FIT + participants, colonoscopy and pathologic review revealed that 8 (1.7%) had CRC, 71 (14.9%) had high-risk CRA, 107 (22.5%) had low-risk CRA, and 289 (60.8%) did not have CRA/CRC. Men were 2.27-fold [95% confidence interval (CI) 1.32–3.91] more likely than women to be FIT+ . Men also had 1.96-fold (CI 1.24–3.07) higher odds of low-risk CRA, with similar trends for high-risk CRA and CRC. CRA/CRC were not associated with overweight, obesity, diabetes, or antibiotic prescriptions in this study. QC analysis across 24 batches of FIT devices revealed QC outliers in four batches. With or without exclusion of the four QC-outlier batches, as well as lenient (1000-read) rarefaction, CRA/CRC had no consistent, statistically significant associations with fecal microbiome alpha diversity, beta diversity or genera relative abundance. CRA/CRC had expected associations with male sex but not with microbiome metrics. Fecal microbiome profiling using DNA extracted from at-home collected, re-used FIT devices is feasible, albeit with substantial challenges. Using FITs for prospective microbiome studies of CRA/CRC risk should consider the impact of the current findings on statistical power and requisite sample sizes.


Results
Study population. As shown in Fig. 1, FIT devices were provided by 980 KPH participants, kept frozen, and subsequently subjected to DNA extraction, 16S rRNA amplification, and sequencing. Of these, 383 had fewer than 10,000 16S rRNA sequences (our rarefaction criterion), and 59 other participants were outside the per-protocol age range, leaving 538 participants for association analyses. As per protocol, 63 (11.7%) of the 538 participants were FIT-negative, and they were considered negative for CRA/CRC. Of the 475 FIT + participants, colonoscopy and pathologic review revealed that 8 (1.7%) had CRC, 71 (14.9%) had high-risk CRA, 107 (22.5%) had low-risk CRA, and 289 (60.8%) did not have CRA/CRC.
Associations of microbiome composition with FIT and CRA/CRC status. After rarefaction at 10 K reads, 538 devices (61%) yielded 1016 species-level amplicon sequence variants (ASVs) mapped to 46 genera. Table 2 shows the association between microbiome alpha diversities (number of species, Chao1, Shannon index, PD-whole tree) and FIT status. Overall, the number of species, chao1 index and Shannon index were significantly lower in FIT + compared to FIT− (P < 0.05). The FIT + group also had lower PD-whole tree, but this was not statistically significant (P = 0.12). When adjusted for age, gender and race/ethnicity, there was no significant www.nature.com/scientificreports/ association between any of the alpha diversities and FIT status (linear regression, all P > 0.05). We also compared the high risk CRA/CRC group with the rest of the participants, and there was no significant difference between groups for any of the alpha diversities. There was a trend of higher alpha diversities (number of species, Chao1 and Shannon index) for CRA + compared to CRA-with a borderline significance (0.05 < P < 0.1). For PD whole tree, the trend was the same, but not significant. The same trend existed for high risk CRA/CRC, however, the p-values were not significant, possibly due to the small sample size. Beta diversity was calculated using the rarefied microbiome data. Figure 2 shows the principal coordinate plots for three beta diversity measurements: Bray Curtis, weighted UniFrac (W.UniFrac) and unweighted Uni-Frac (U.UniFrac). Microbiome composition did not separate by disease category. We used MiRKAT to statistically assess the association between microbiome beta diversity and clinical factors. CRA + status was marginally associated with Bray-Curtis dissimilarity (P = 0.038) but not with U.UniFrac (P = 0.904) or W.UniFrac (P = 0.594), Omnibus (P = 0.101). Likewise, FIT + status was marginally associated with Bray-Curtis (P = 0.071) and U.UniFrac (P = 0.067) but not with W.UniFrac (P = 0.409), Omnibus (P = 0.167). When adjusted for BMI, age, gender and race, there were no significant associations between microbiome and any status of interest.
Individual taxa analyses. We evaluated the relative abundance of individual taxa at the genus level. Via unadjusted linear regression, three genera were significant: Blautia was significantly reduced in FIT + participants, and Roseburia was significantly increased in CRA + participants. Also, the relative abundance of Escherichia/Shigella was increased with high-risk CRA/CRC. However, none of these associations was significant at FDR 0.05 level. Figure 3 shows the beta coefficients of the associations ordered by effect size. Repeating these linear regressions after adjusting for age, gender and race, no statistically significant associations were observed at FDR 0.05 level. provided FIT devices, 102 were out of range on age. Of the remaining 878 FIT-based fecal specimens, 538 (61.2%) yielded more than 10,000 microbiome sequences, a relatively stringent rarefaction criterion that we employed for our primary analyses. With less stringent rarefaction, 650 (74.0%) specimens yielded more than 5000 reads and 811 (92.4%) specimens had more than 1000 reads. As shown in Table S1, there was no association between library size and age, gender, race, FIT status or CRA/CRC diagnosis.
Batch effects of patient specimens and quality control (QC) samples. All FIT-based fecal specimens, together with 72 QC samples, were assigned and processed through 24 different batches for microbiome sequencing. We assessed the possibility of batch effects by first investigating systematic differences in the microbiome data. As shown in Figure S1 and Figure S2, there was no difference across the batches on sequencing depth (one-way ANOVA, p-value = 0.36). Some cross-batch differences were found in estimates of microbiome alpha diversity (one-way ANOVA, p-value chao1 = 0.02; p-value shannon = 0.12; p-value PD.whole.tree = 0.10; p-value observed.species = 0.02) and beta-diversity (PERMANOVA, p-value bray_curtis = 0.001; p-value w.unifrac = 0.001; p-value u.unifrac = 0.004). On the other hand, there were no associations between the batches and participants' clinical characteristics including FIT status (permutation-based chi-square test, p-value = 0.76), CRA-vs CRA + (permutation-based chi-square test, p-value = 0.04), and high-risk CRA/CRC vs all others (permutation-based chisquare test, p-value = 0.74). Because the participants were generally balanced across batches, we did not adjust for batch in our primary analyses, which may have generated false positive discoveries as well as reducing statistical power.
The 72 QC samples were of three types: Artificial Colony, Robogut A and Blank. As each batch contained at most one QC sample of each type, no statistical analysis was performed. Rather, all QC richness and alpha diversity estimates across the 24 batches are presented in Fig. 4. Principal coordinate plots of beta diversity estimates for the QC samples are shown in Figure S3. Batches 11 and 18 had elevated alpha diversity (and beta diversity outliers) in the Blank QCs, and batches 23 and 24 had elevated alpha diversity (and beta diversity outliers) in the Artificial Colony QCs, suggesting that these batches might be systematically different than others. Thus, we conducted the following sensitivity analyses.  (Table 2), Chao1 and Shannon estimates of alpha diversity were higher in FIT+ than FIT−, whereas Shannon was higher in CRA + than CRA-neg participants (Table S2). In beta diversity analyses by MiRKAT, FIT status was associated with Bray-Curtis dissimilarity (P = 0.02) but not with W.UniFrac (P = 0.35) or U.UniFrac (P = 0.39); Omnibus test P = 0.059. Likewise CRA + status was associated with Bray-Curtis dissimilarity (P = 0.012) but not with

Discussion
This is the first microbiome study of FIT devices after they have been used for at-home fecal sampling, postal shipping, and laboratory-based automated detection of heme for primary CRA/CRC screening in a large, fully integrated health plan. As per standard clinical and institutional practice, FIT+ (that is, heme+) participants were referred for colonoscopy to identify large-bowel pathology, particularly CRA and CRC. Per protocol, we enriched the study population by including all FIT+ and a small, random sample of FIT-negatives. With a mean age of 62, most participants in our comprehensive pre-paid health program had been screened for CRA/CRC previously. Thus, among the FIT+ we found the expected proportions with CRC (1.7%), high-risk CRA (14.9%), www.nature.com/scientificreports/ and low-risk CRA (22.5%). Also as expected, we found that men had an approximately twofold higher risk of CRA/CRC than women. We found virtually no microbiome associations with CRA/CRC, in contrast to published studies based on feces collected specifically for microbiome and related molecular analyses [6][7][8][9][10][11][12][13][14][15][16]21 . The reasons for this discrepancy may be many. Foremost, we would postulate that the quantity of feces obtained by many participants, while adequate for heme detection, may have been inadequate to generate 16S rRNA-based amplicons representative of the fecal microbiota. Although we could not quantify total feces in each of the 980 FIT collected routinely by our members, we found that 61% of them yielded satisfactory microbiome profiles with stringent (10,000-read) rarefaction and 92% had satisfactory profiles with more lenient (1000-read) rarefaction. While likely useful for some research applications, these yields are lower than the excellent microbiome metrics obtained with FIT collected by clinical and research staff [19][20][21] .
Second, our specimens may have degraded, as our participants did not immediately freeze or chemically stabilize their FIT devices at home, although we note that upon receipt they were expeditiously tested for heme, then frozen immediately at −20 °C, and maintained thereafter at or below −70 °C until DNA extraction, amplification, and sequencing. Sinha and colleagues reported that microbiome metrics were highly stable in FIT stored at room temperature for four days 22 . Consistent with that, rarefaction at 10,000 reads in the current study yielded the expected distributions of alpha diversity (1016 species-level ASVs) and taxonomy (46 genera).
Third, true associations might have been attenuated by contamination as suggested by our blank-QC FIT devices. Consistent with Rounge and colleagues' findings of possible cross-contamination by the FIT detection system 23 , the contamination in our specimens was at low level and in only two of 24 batches. Moreover, our positive control QCs performed as expected in nearly every batch (24 of 24 with Robogut A, and 22 of 24 with Artificial Colony). Sensitivity analyses still yielded no CRA associations when the affected batches were excluded.
Fourth, we had no statistical power to replicate previously reported microbiome associations with CRC 8-11 , and we may have had insufficient power to detect generally weaker associations with CRA 7,12-16 . And lastly, researchers and journals may have been reluctant to publish null fecal microbiome associations like ours.
Beyond our failure to identify FIT-based fecal microbiome associations with CRA, our study had several noteworthy strengths and a few additional weaknesses. KPH is an integrated health system, with (pre-COVID) 80-90% participation rate in annual FIT screening. Thus, our participants are highly representative of the underlying population with respect to age, sex, and race. KPH's comprehensive EMR afforded assessment of demographics, diabetes, overweight/obesity, and use of antibiotics as potential confounders. And KPH provided colonoscopic and histopathologic ascertainment, at the expected rates, of CRC, high-risk CRA, and low-risk CRA, despite possible underascertainment due to incomplete colonoscopies. We had no participants at very high CRC risk, such as inflammatory bowel disease or hereditary polyposis, as such patients are monitored by colonoscopy, not with FIT screening. We classified our FIT-negatives as negative for CRA/CRC, which is justified by Kaiser Permanente Northern California's report that a single FIT had 97.9% negative predictive value for CRA 24,25 . We also classified as CRA-negative participants who were found to have trivially small polyps or rare extraneous conditions. Clinical and laboratory staff were masked to each other's data at all steps.

Conclusions
We demonstrated that bar-coded FIT devices can be systematically frozen immediately after screening for fecal heme, and that fecal DNA in these used devices can be extracted, amplified, and sequenced to generate microbiome profiles for association analyses and other research applications. We found no clear associations with CRA/ CRC with either stringent (10,000-read) or lenient (1000-read) rarefaction. These findings should be considered for the impact on statistical power and sample sizes of future prospective studies designed to understand the relationship of the fecal microbiota to CRA/CRC. For broad, large-scale success in reducing CRC-related morbidity and mortality, technical issues with fecal screening (including quantity, stabilization, shipping, and storage); the potential adverse impact on population-level participation rate; and the value and costs of cuttingedge laboratory methods such as metagenomics, meta-transcriptomics, and metabolomics must be resolved.

Field site and CRA/CRC screening. Kaiser Permanente Hawaii (KPH) is a fully integrated health plan
that encourages all members age 50-75 (excluding those with previous CRA or CRC, who are followed by colonoscopy) to participate in annual CRA/CRC screening 2 ; approximately 75% of these members provide a FIT for screening each year. Thus, members with previous CRA or CRC were excluded from the current study, but those with previous FIT screening (irrespective of FIT result) were eligible. Colonoscopy is not used for routine screening. The first stage of screening entails self-collection, at home, of feces using a licensed, commercially available, barcode labeled FIT device (Polymedco OC-Auto Micro FOB Test, Cortland Manor NY). The device generally obtains 10-50 mg of feces in a sealed vial containing 2 mL of proprietary solution. Within 3 days, it is sent at ambient temperature in a prepaid mailer through the US Postal Service to the KPH reference laboratory (Moanalua Medical Center) where, on the next business day, immunochemical testing for human hemoglobin is performed with a dedicated robotic instrument system (Polymedco Auto Sensor Diana) that is integrated with the laboratory's information system. The positive cutoff with the Polymedco process and equipment is 100 ng hemoglobin / mL (stool or diluent), yielding an analytic sensitivity of 96.11%, specificity 99.33% (accessdata.fda. gov/cdrh_docs/reviews/K041408.pdf). Positive (FIT+) as well as negative results thus flow electronically from the instrument into the KPH electronic medical record (EMR). Patients with a FIT + result are referred to KPH gastroenterology for second-stage screening by colonoscopy, which is usually completed within a few weeks. Descriptive colonoscopy results are added to the EMR on the same day. Gross and histopathologic diagnoses of biopsies and excised lesions are added within a week. The KPH primary care provider coordinates follow-up and Outcome and covariate classification. The current project employed a hierarchy of histopathologic diagnoses: first, CRC (excluding in-situ) if present; else high-risk CRA (at least one adenoma with diameter ≥ 1 cm or with villous histology); else low-risk CRA (< 1 cm diameter and no villous histology); else all other (including miscellaneous and benign). BMI, calculated from weight and height (Kg/M 2 ), was classified as healthy (18.5-24.99), overweight (25-29.99), or obese (≥ 30). Race and ethnicity (self-declared), Charlson morbidity index, and medical diagnoses were ascertained from KPH EMR data. Races were grouped into four categories: White, Asian, multiple, and other/ambiguous race. Charlson morbidity index was classified into three categories: = 0; ≥ 1 and ≤ 3; ≥ 4. Using the KPH electronic pharmacy records, we assessed oral and parenteral medications prescribed to our study participants during the 365 days prior to FIT collection. The medications included antibiotics (specifically, cephalosporins, fluroquinolones, macrolides, penicillins, tetracyclines, and aminoglycosides), cardiovascular drugs (specifically, statins, fibrates, beta blockers, calcium channel blockers, and other antihypertensives), corticosteroids, and proton pump inhibitors. Antibiotics and cardiovascular drugs were categorized by cumulative length of prescription (none = 0, < median = 1, ≥ median = 2). Corticosteroids and proton pump inhibitors were categorized as none vs any.
Collection and shipping of specimens. Over the course of six months, all FIT + devices, plus four randomly selected FIT-negative devices per week (N = 96 total FIT-negatives), plus 24 blank FIT devices (run through detection system with no feces) were immediately frozen at −20 °C, then shipped on dry ice by overnight courier in approximately equal sized batches to the National Cancer Institute (NCI) repository in accordance with International Airline Transport Association (IATA) regulations. KPH staff excluded no specimens, as they knew only the FIT result, not the indication for testing or other information. The NCI repository organized the specimens into 24 batches, each including a blank FIT, an Artificial Colony specimen, and a Robogut A specimen 26 . Processing of specimens; generation and editing of 16S rRNA sequence data. At the Institute for Genome Sciences, University of Maryland School of Medicine, all 2 mL proprietary solution plus feces was suctioned from each FIT device. This was mixed with 350 µL of lysis buffer composed of 0.05 M potassium phosphate buffer containing 50 µL lyzosyme (10 mg/mL), 6 µL of mutanolysin (25,000 U/ml; Sigma-Aldrich) and 3 µL of lysostaphin (4000 U/mL in 98sodium acetate; Sigma-Aldrich, St. Louis, MO). The mixture was incubated for 1 h at 37 °C, following which 10 µL proteinase K (20 mg/ml), 100 µL 10% SDS, and 20 µL RNase A (20 mg/ml) will be added. This mixture was incubated for 1 h at 55 °C. To further lyse microbial cells, Lysing Matrix B 2 ml beads [MP Biomedicals (Santa Ana, CA)] was added, following which mechanical disruption (bead beating) was performed on the mixture using a FastPrep instrument (MP Biomedicals, Solon, OH) set at 6.0 m/s for 30 s. The lysate was processed using the QIAsymphony SP protocol Pathogen complex 400 (Qiagen, Gaithesburg, MD) according to the manufacturer's recommendation. The DNA was eluted into 100 µL of storage buffer [QIAsymphony reagent buffer AVE (0.04% sodium azide), Qiagen], pH 8.0. PCR inhibitors were removed from the extracted DNA using the Zymo-Spin IV Spin Filter column according to the manufacturer's recommendations (Irvine, CA). DNA was quantified by Quant-iT PicoGreen (Molecular Probes, Inc., Eugene, OR) in a SpectraMax M5 microplate reader (Molecular Devices, Sunnyvale, CA).
A region of approximately 469 bp encompassing the V3 and V4 hypervariable regions of the 16S rRNA gene was targeted for sequencing. This region provides ample information for taxonomic classification of microbial communities from specimens associated with human microbiome studies and was used by the Human Microbiome Project 27 . Fusion dual barcoded primers 319F (5' ACT CCT ACG GGA GGC AGC AG-3') and 806R (5'-GGA CTA CHVGGG TWT CTAAT-3') were used to amplify the V3-V4 region of bacterial 16S rRNA genes. The amplicons were pooled in equimolar concentration and sequenced on an Illumina MiSeq Instrument using the 300 bp paired-end protocol. The sequenced reads were processed using the following steps: (1) removal of primer sequence, (2) truncation of reads not having an average quality of 20 over a 30 bp sliding window based on the phred algorithm 28,29 implemented previously 30,31 , (3) removal of trimmed reads having less than 75% of their original length, and (4) removal of the mate of reads that were discarded for having less than 75% original length. The Quantitative Insights Into Microbial Ecology (QIIME pipeline, version 1.6.0) 32 was used for all further sequence processing steps, including quality trimming and demultiplexing. Quality trimming in QIIME was performed using the following criteria: (1) truncate sequence before 3 consecutive low quality bases and re-evaluate for length, (2) no ambiguous base calls, and (3) minimum sequence length of 150 bp after trimming, (4) remove sequences with less than 60% identity to a pre-built Greengenes database of 16S rRNA gene sequences (Oct, 2012 version) 33 . Further data processing included denoising by clustering similar sequences with less than 3% dissimilarity using USEARCH 34 and de novo chimera detection and removal in UCHIME v5.1 35 . Paired reads were stitched together with "N" between each sequence and processed as one sequence in the analysis.

Statistical analyses.
To calculate microbiome alpha and beta diversities, we first rarefied all samples to 10,000 read counts so that all samples were comparable. Samples with sequencing depth less than 10,000 were removed. Analysis result at other rarefying level (5000 and 1000) is quantitively similar. Microbiome alpha diversities (Number of Species, Chao1, Shannon index and Phylogenetic diversity) and beta diversities (Bray-Curtis dissimilarity, weighted UniFrac distance and Unifrac distance) were calculated using phyloseq package in R (3.6.2) 38 Associations between microbiome alpha diversities and disease diagnosis, and between demographic variables and disease diagnosis, were assessed through linear regression models or fisher exact test for continuous or categorical outcomes respectively. MiRKAT 39 was used to assess the association between microbiome beta diversities and disease diagnosis. All statistical analysis was conducted in R (3.6.2). For individual taxa analyses, we focused on the genus-level. We excluded rare amplicon sequence variants (ASVs, present in < 10% of specimens) and excluded specimens that had fewer than 1000 read counts. Wilcoxon rank-sum tests were used to compare FIT + vs FIT-negative and CRA + vs CRA-negative on relative abundance for each genus. In addition, linear regression with each genus' relative abundance as the dependent variable was used to compare these FIT and CRA groups adjusted for age, sex, and race. Significance was declared at false discovery rate (FDR) of 0.05.
Ethics approval and consent to participate. Prior to implementation, the project was reviewed and approved by the KPH Institutional Review Board. Only authorized KPH staff had access to personally identifiable data. For statistical comparisons to the fecal microbiome profiles, KPH provided to NCI a limited dataset that included demographics (sex, age, race/ethnicity); length of membership in KPH; FIT result; date of FIT; colonoscopy diagnosis and date; types of gastrointestinal surgery and dates; most recent height and weight; presence / absence of type 2 diabetes; medication prescriptions within 365 days of FIT; history of Crohns disease or ulcerative colitis, and other common, major clinical diagnoses. Based on analysis of coded data and specimens previously collected for clinical care, the NIH Office of Human Research Subjects Protection issued the determination (#12694) that the proposed project was not human subjects research as defined by 45 CFR 46 and thus was exempt from NIH Institutional Review Board review.

Data availability
The data have been posted to the Sequence Read Archive (SRA) with Bioproject ID PRJNA673212: http:// www. ncbi. nlm. nih. gov/ biopr oject/ 673212. Artificial Colony and Robogut A quality control specimens may be available from Dr. Sinha (sinhar@exchange.nih.gov). Specimens from the primary FIT devices are no longer available.