Novel, rare and common pathogenic variants in the CFTR gene screened by high-throughput sequencing technology and predicted by in silico tools

Cystic fibrosis (CF) is caused by ~300 pathogenic CFTR variants, and this heterogeneity challenges molecular diagnosis and precision medicine approaches in CF. Our objective was to identify CFTR variants through high-throughput sequencing (HTS) and to predict the pathogenicity of novel variants using in silico tools. Two guidelines were followed to deduce pathogenicity. Genomic DNA from 169 CF patients was submitted to targeted gene sequencing, and we identified 63 variants (three patients had three variants). The most frequent alleles were F508del (n = 192), G542* (n = 26), N1303K (n = 11), R1162* and R334W (n = 9). The screened variants were classified as follows: 41 pathogenic variants [classified as (I) n = 23, (II) n = 6, (III) n = 1, (IV) n = 6, (IV/V) n = 1 and (VI) n = 4]; 14 variants of uncertain significance; and seven novel variants. For the novel variants, we suggest classifying the 6b-16 exon duplication, G646* and 3557delA as Class I. The predictors were concordant in rating L935Q, cDNA.5808T>A and I1427I as likely pathogenic, whereas Y325F presented two discordant results among the predictors. HTS and in silico analysis can identify pathogenic CFTR variants and will open the door to integration of precision medicine into routine clinical practice in the near future.

(ii) hybridization of oligonucleotides: the pool of upstream and downstream oligonucleotides specific to the regions of interest was hybridized in a Veriti 96-Well Thermal Cycler (Applied Biosystems, Waltham, Massachusetts, USA).
(iii) removal of the oligonucleotides in suspension: the oligonucleotides in suspension were separated and removed from the other components of the solution with magnetic beads with genomic DNA affinity. This is a critical step in library construction because DNA can be lost: the beads are sensitive to temperature and dryness and may release the DNA, which is then removed along with the oligonucleotides in suspension.
(v) amplification of DNA libraries: each sample was identified by the combination of indexes (i7 and i5) incorporated into the amplification sequences. The indexes must not be contaminated, and each combination must correspond to a single sample. Amplification was performed in 29 cycles, according to the number of amplicons, following the manufacturer's protocol.
(vi) DNA library clean-up: as in step (iii), magnetic beads were used to separate the PCR products from the other components of the reaction. After clean-up, the product was run on a 4% agarose gel by electrophoresis to identify the amplified fragments of ~350 bp.
(vii) DNA library normalization: the libraries were normalized to a similar concentration for all samples in order to reduce the likelihood of uneven sequencing among the products. Normalization was performed with beads that bind DNA until they reach saturation. This is a critical step because it relies on beads and is subject to handling errors.
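The requirement in step (v) that each i7/i5 index combination identifies a single sample can be checked before a run. The following is a minimal sketch, not part of the original protocol: the function name `find_index_collisions`, the sample IDs and the index labels are hypothetical placeholders, and the sample sheet is assumed to be a simple list of (sample, i7, i5) tuples.

```python
from collections import defaultdict


def find_index_collisions(sample_sheet):
    """Return {(i7, i5): [samples]} for every index pair shared by more than one sample."""
    by_pair = defaultdict(list)
    for sample, i7, i5 in sample_sheet:
        by_pair[(i7, i5)].append(sample)
    # Keep only the pairs that were assigned to two or more samples.
    return {pair: names for pair, names in by_pair.items() if len(names) > 1}


# Hypothetical sample sheet: (sample ID, i7 index label, i5 index label).
sheet = [
    ("S01", "i7-01", "i5-02"),
    ("S02", "i7-02", "i5-02"),  # shares i5 with S01, but the pair is still unique
    ("S03", "i7-01", "i5-03"),  # shares i7 with S01, but the pair is still unique
    ("S04", "i7-01", "i5-02"),  # deliberately repeats the pair used by S01
]

collisions = find_index_collisions(sheet)
for pair, names in collisions.items():
    print(f"index pair {pair} is shared by samples {names}")
```

An empty result means every sample carries a unique combination. Note that a single index may be reused across samples as long as the pair as a whole is not repeated.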

Part III. Evaluation of the variants in the CFTR gene by Sanger sequencing
The CFTR gene is composed of 27 exons; because exon 13 is large, it was amplified in two fragments. The primer sequences, annealing temperatures, buffers and amplified fragment sizes are described in Table 1 of the Online supplement. CFTR exon sequencing, including exon/intron boundaries, was performed as previously described.
Table 1 abbreviations: T, temperature; S, sense primer; AS, antisense primer; bp, base pairs. *, addition of 5% DMSO (dimethyl sulfoxide) to the reaction.

Part IV. Computational methods (in silico) to classify pathogenicity
Predictive methods were selected according to their approach and algorithm so that they would complement one another and best identify the likely degree of pathogenicity of the identified CFTR variants. In this study, the predictors were applied to three distinct groups: (i) variants previously described as pathogenic, in order to validate the predictors; (ii) variants of uncertain significance, in order to identify a possible association with pathogenicity and as a cause of CF; and (iii) variants not yet described in the literature, with the aim of characterizing their pathogenic potential. The following predictors were applied to the variants identified in CFTR:
Combined Annotation Dependent Depletion (CADD) is a tool that provides an analysis based on multiple sources of information and metrics. This method integrates the analysis of evolutionary conservation, allelic diversity, variant annotation, functional genomic data, transcription information and causal variants within individual genome sequences. The output is a numerical score: the higher the raw C score, the more likely the variant is predicted to be deleterious. A score greater than or equal to 10 indicates that the variant is among the 10% most deleterious substitutions, a score greater than or equal to 20 indicates the 1% most deleterious, and so on [36].
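The thresholds quoted above (10 for the top 10%, 20 for the top 1%, "and so on") are consistent with a PHRED-like scale, in which a scaled score s corresponds to the top 10^(-s/10) fraction of ranked substitutions. The sketch below works through that arithmetic under this assumption; the function names `top_fraction` and `scaled_score_from_rank` are illustrative and do not belong to the predictor's software.

```python
import math


def top_fraction(scaled_score):
    """Fraction of substitutions ranked at least as deleterious, assuming a PHRED-like scale."""
    return 10 ** (-scaled_score / 10)


def scaled_score_from_rank(rank, total):
    """Inverse mapping: scaled score for a variant ranked `rank` out of `total` (rank 1 = most deleterious)."""
    return -10 * math.log10(rank / total)


# Thresholds from the text, plus the next step implied by "and so on".
for score in (10, 20, 30):
    print(f"score >= {score}: top {top_fraction(score) * 100:g}% most deleterious")
```

Under the same assumption, a variant ranked in the top 1 in 1,000 would receive a score of about 30, which is how the scale continues beyond the two thresholds given in the text.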