Introduction

Chronic lymphocytic leukemia (CLL) displays very heterogeneous clinical behavior; prognostic and predictive markers therefore play an important role in disease management. To date, the key decision-making biomarkers in CLL are TP53 gene defects: chromosomal aberrations of 17p13, in particular deletions spanning the TP53 locus, and TP53 gene mutations, both of which are associated with adverse disease outcome due to resistance to chemoimmunotherapy [1,2,3,4].

Early studies using fluorescence in situ hybridization (FISH) to detect cytogenetic aberrations revealed that CLL patients carrying del(17p) have a significantly shorter overall survival compared to patients harboring other recurrent cytogenetic abnormalities, i.e., del(11q), trisomy 12, or del(13q) [5]. Inactivation of the TP53 locus due to del(17p) is frequently associated with mutation(s) on the second TP53 allele. However, TP53 mutations also occur in the absence of del(17p) in about 5% of untreated patients and are associated with a poor outcome, similar to the disease course observed in del(17p) CLL patients [6, 7]. More specifically, approximately 90% of patients with del(17p) carry a TP53 mutation; conversely, only 60–70% of patients with TP53 mutation also harbor del(17p), as detected by FISH [8,9,10,11,12].

The clinical utility of TP53 mutation analysis in CLL has been well documented by many studies [7,8,9, 11, 13], including findings from prospective clinical trials [6, 14, 15] clearly showing that patients carrying TP53 defects are resistant to chemoimmunotherapy. In this context, the advent of novel treatment options inhibiting B-cell signaling and anti-apoptotic BCL2 that proved efficacious in patients harboring TP53 gene disruption [16,17,18] has brought an urgent need for accurate assessment of the TP53 gene status in routine clinical practice with the aim of identifying those patients who would not benefit from chemoimmunotherapy, and hence should be considered for targeted agents.

TP53 gene assessment should always be performed prior to initiation of the first and every subsequent line of treatment [19]. That said, a few situations exist where TP53 mutational analysis may not be required, e.g., when the use of p53-independent drugs is not possible due to either patient fitness or limited market access, or when the presence of a TP53 alteration has already been documented.

The recent introduction of high-throughput next-generation sequencing (NGS) has led to the identification of TP53 mutations with a low variant allelic frequency (VAF)—usually below the detection limit of conventional Sanger sequencing—that may be positively selected with the use of chemotherapy, ultimately leading to the expansion of an initially minor TP53 mutant subclone into a prevalent refractory clone [20,21,22,23,24].

Taken together, the recent therapeutic and technological advances necessitate an update of the previously published ERIC recommendations for TP53 mutation analysis in CLL [19], including assessment of the current methodological approaches as well as recommendations for the interpretation of the findings and the accurate reporting of results. An overview of the updated recommendations is provided in Table 1.

Table 1 Overview of ERIC recommendations for TP53 analysis

Procedure description

Material for TP53 mutation analysis

For most CLL patients, peripheral blood (PB) is an appropriate starting material for TP53 mutation analysis. Nevertheless, an important factor influencing the result is the cancer cell fraction (CCF), and this is particularly relevant in cases with a low lymphocyte count (<10 × 10⁹/L and/or <60–70% lymphocytes in PB). This is usually evidenced in patients with predominant lymphadenopathy and few circulating clonal cells, i.e., small lymphocytic lymphoma. In such cases, material enriched with tumor cells such as bone marrow (BM) or lymph node biopsies may be an alternative option.

PB or BM should be collected in tubes containing an anticoagulant, such as EDTA or heparin, followed by mononuclear cell separation by density gradient centrifugation to enrich the lymphocyte fraction. The use of mononuclear cells might be insufficient when the specimen analyzed contains less than 60–70% lymphocytes and could lead to a false-negative result when using Sanger sequencing (Supplementary Fig. S1). In such instances, selection of CD19+ cells using enrichment techniques such as RosetteSep or MACS should be performed to yield a higher CCF. Alternatively, ultra-deep NGS, which has a much greater sensitivity level, can be performed and the VAFs corrected with respect to the CCF. Regarding tissue material, fresh/frozen material is strongly preferred. Formalin-fixed, paraffin-embedded (FFPE) tissues are recommended only when no alternative sample is available as the fixation and embedding processes may hamper the analysis, since: (i) FFPE material often contains highly degraded DNA fragments, therefore shorter amplicons are required for sequencing; (ii) the process of tissue fixation damages DNA through cross-linking, thus reducing the number of intact DNA molecules added into the PCR [25]; and (iii) DNA can be chemically modified, leading to artefactual sequencing results (particularly deamination and oxidation artifacts) [26,27,28]. Therefore, any variants detected in DNA samples from FFPE material should be confirmed by independent PCR and carefully verified using the recommended databases (described below) before interpreting and reporting them as mutations.

Finally, when considering the type of nucleic acid to analyze, genomic DNA is highly recommended. Analyzing RNA may result in truncating or splice site variants being missed due to nonsense-mediated RNA decay [29]. In addition, using whole-genome amplification for diagnostic purposes is discouraged as it may introduce a bias in allelic frequencies and could lead to allelic drop-out.

Region of interest

At a minimum, the sequenced region of the TP53 gene must include exons 4–10, which correspond to the DNA-binding domain (codons 100–300) and the oligomerization domain (codons 323–356). Exon 10 should be included because, as documented by recent studies, mutations are as frequent in exon 10 as in exon 9, or even more frequent [30] (Fig. 1). Optimally, exons 2–11 should be analyzed to cover the entire coding region [30]. TP53 gene profiling studies by NGS, which usually also include exons 2, 3, and 11, have shown that variants can occur in these exons as well, although their frequency is low (T. Soussi, unpublished results; Fig. 1). As each exon is flanked by a splice donor and a splice acceptor site, sequencing of the +2/−2 intronic nucleotides is required to detect variants that may impair splicing and result in inactive proteins.
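The exon requirements above can be expressed as a simple design check. The following is a minimal sketch, assuming an assay design is summarized as a collection of covered exon numbers; the function names are hypothetical, and the check does not verify that the +2/−2 splice-site nucleotides are included.

```python
# Exon sets taken from the text: exons 4-10 are the minimum, exons 2-11 the optimum.
MINIMUM_EXONS = set(range(4, 11))
OPTIMAL_EXONS = set(range(2, 12))


def classify_panel(covered_exons):
    """Return 'optimal', 'minimum', or 'insufficient' for the covered TP53 exons."""
    covered = set(covered_exons)
    if OPTIMAL_EXONS <= covered:
        return "optimal"
    if MINIMUM_EXONS <= covered:
        return "minimum"
    return "insufficient"


def missing_exons(covered_exons, required=MINIMUM_EXONS):
    """List the required exons that the assay design does not cover."""
    return sorted(set(required) - set(covered_exons))
```

For example, a design restricted to exons 5–8 would be classified as insufficient, with exons 4, 9, and 10 reported as missing.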

Fig. 1 Frequency of TP53 variants detected in individual exons. Data are retrieved from the last version of the UMD_TP53 database (http://p53.fr/) and include somatic and germline mutations detected by next-generation sequencing of exons 2–11

Sanger sequencing

Primer sequences, as well as the protocol for performing the PCR, are available on the International Agency for Research on Cancer (IARC) TP53 website (http://p53.iarc.fr/ProtocolsAndTools.aspx). This PCR protocol is adaptable and can be modified based on local experience. Bidirectional sequencing analysis is the only acceptable strategy, and the chromatograms generated by Sanger sequencing should be carefully scrutinized to ensure that somatic variants present at lower allelic frequencies are not overlooked; adjusting software settings to detect germline homozygous and heterozygous variants is not sufficient. The ERIC TP53 Network provides the opportunity to analyze Sanger sequencing data via a web-based tool called GLASS [31]. This software was purpose-built to assist with the assessment of somatic gene variations and provides a standardized variant output as recommended by the Human Genome Variation Society (HGVS). GLASS was specifically developed to support ERIC TP53 Network activities and is freely accessible at http://bat.infspire.org/genomepd/glass/ or via the ERIC website (http://www.ericll.org/guidance-toolstp53/).

Finally, although the relevance of pre-screening methods, such as denaturing high-performance liquid chromatography and high-resolution melting analysis, is decreasing, they remain a viable and cost-effective option. That notwithstanding, aberrant screening results must always be confirmed by Sanger sequencing in an independent PCR in order to identify the specific variant.

Next-generation sequencing

Targeted NGS can be used for the analysis of the TP53 gene as a standalone assay or as part of a gene panel investigating several genes. Numerous commercially available ready-to-use analytical kits include the TP53 gene, and ERIC is conducting a multi-center collaborative effort to assess and compare various pre-designed and custom gene panel technologies. Previous studies exploring the inter-reproducibility of targeted NGS and Sanger sequencing for TP53 analysis demonstrated very good correlation of the results, specifically showing that all variants detected by Sanger sequencing are also detectable by NGS [22, 23, 32,33,34,35]. A recent study also showed an excellent correlation between the results obtained from two different NGS platforms, namely, the Ion PGM (ThermoFisher) and the MiSeq (Illumina) [33]. In addition, NGS is capable of detecting variants below the sensitivity threshold of Sanger sequencing, even VAFs as low as <1% [20, 22, 23]. Due to the low detection limit of NGS, multiple subclonal mutations within the TP53 gene (i.e., convergent mutations) may be detected in some patients [20, 35].

To ensure the maximum applicability and reliability of NGS, several important issues need to be addressed when establishing the methodology, as erroneous results can arise for various reasons (Table 2).

Table 2 Types of NGS errors and their sources

DNA input and quality

Low input and/or degraded DNA may result in false-negative results due to a sampling effect, and may also produce false-positive results as amplified errors might constitute a significant proportion of the final sequencing library [36]. The initial amount of DNA should always be calculated with respect to the required limit of detection (LOD), keeping in mind that a human cell (two alleles) contains approximately 6 pg of DNA. For reliable detection, the DNA input must ensure that the sample contains a sufficient number of variant molecules and that the variants can be distinguished from background noise. For instance, at least 10 ng corresponding to approximately 1500 cells or 3000 alleles should be used to detect variants present at 1% VAF. This is also relevant for techniques which require the starting amount of DNA to be distributed amongst individual nano-scale PCRs, e.g., the Fluidigm Access Array, RainDance Technology, or Wafergen. Although DNA isolated from PB and BM is usually of good quality, testing the integrity of the DNA by agarose electrophoresis or specialized automated electrophoresis devices is recommended (and often required) for NGS. Special attention is required when considering the quality and quantity of DNA obtained from FFPE samples due to the increased risk of false-positive as well as false-negative results.
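The relationship between DNA input and the number of variant molecules available for detection can be illustrated with a short calculation. This is a sketch based only on the approximate figure of 6 pg of DNA per diploid cell given above; the parameter `min_variant_alleles` is a hypothetical, laboratory-defined value and not a recommendation from this document.

```python
PG_PER_DIPLOID_CELL = 6.0  # approximate DNA content of a human cell (two alleles)


def genome_equivalents(input_ng):
    """Approximate number of cells represented by a DNA input given in ng."""
    return input_ng * 1000.0 / PG_PER_DIPLOID_CELL


def expected_variant_alleles(input_ng, vaf):
    """Expected number of variant allele copies in the input (two alleles per cell)."""
    return 2.0 * genome_equivalents(input_ng) * vaf


def minimum_input_ng(vaf, min_variant_alleles):
    """DNA input (ng) needed to expect at least `min_variant_alleles` variant copies."""
    cells_needed = min_variant_alleles / (2.0 * vaf)
    return cells_needed * PG_PER_DIPLOID_CELL / 1000.0
```

Under these assumptions, 10 ng corresponds to roughly 1,700 cells, the same order of magnitude as the approximately 1500 cells quoted above, and a variant at 1% VAF would be represented by only about 30 template molecules.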

Library preparation

Both amplicon-based and capture-based approaches are applicable. From a practical perspective, amplicon-based library preparations require much smaller quantities of input DNA and the workflow tends to be simpler and less time-intensive and labor-intensive compared to capture-based methodologies. On the other hand, hybridization capture-based approaches demonstrate better uniformity of coverage and generate fewer false-negative as well as false-positive calls as compared to amplicon-based techniques. When designing in-house primers for amplicon-based libraries, it is important to check the primer positions against potential single-nucleotide polymorphisms (SNP) and ensure that the primers can efficiently read across splice junctions. In order to establish an NGS assay with high detection sensitivity, proofreading polymerases with low error-rates are recommended. Incorporating unique molecular identifiers into the library preparation helps to distinguish errors introduced artificially during the process from true low-frequency variants and also allows for more accurate quantification (especially with PCR-based protocols) [37, 38]. Additional benchmarking studies are required to establish standard analytical methods that must then be checked for accuracy and reproducibility.

Sequencing and coverage

The required coverage should be set to ensure that the call is statistically above the background noise. Generally, the minimal coverage should not be less than 100 reads at any position within the regions of interest, and at least 10 variant reads are needed for reliable variant calling. The frequently reported mean or median coverage of a diagnostic panel is non-informative because uncovered regions cannot be deduced from an average value; it is therefore a vital requirement that ≥99% of the region of interest reaches the minimum coverage. Of note, the number of reads does not necessarily reflect the actual number of unique template gDNA molecules, as many reads will be duplicates generated during PCR amplification. When employing longer reads, a confident overlap (>60–70%) between the paired reads is recommended in order to avoid the introduction of false-positive results. Calling variants found in unbalanced regions with forward-reverse ratios of less than 10% (i.e., strand bias) should be avoided.
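The coverage criteria above lend themselves to a simple per-variant and per-region check. The sketch below is illustrative only: the function names are hypothetical, and the forward-reverse ratio is interpreted here, as an assumption, as the share of variant-supporting reads that come from the less represented strand.

```python
MIN_DEPTH = 100          # minimal coverage at any position in the region of interest
MIN_VARIANT_READS = 10   # minimal number of variant-supporting reads
MIN_STRAND_RATIO = 0.10  # below this, the call is treated as strand-biased (assumed definition)


def fraction_covered(depths, min_depth=MIN_DEPTH):
    """Fraction of region-of-interest positions with depth >= min_depth."""
    depths = list(depths)
    if not depths:
        return 0.0
    return sum(d >= min_depth for d in depths) / len(depths)


def passes_read_filters(depth, fwd_variant_reads, rev_variant_reads):
    """Apply the depth, variant-read, and strand-balance criteria to one call."""
    variant_reads = fwd_variant_reads + rev_variant_reads
    if depth < MIN_DEPTH or variant_reads < MIN_VARIANT_READS:
        return False
    minor_strand = min(fwd_variant_reads, rev_variant_reads)
    return minor_strand / variant_reads >= MIN_STRAND_RATIO
```

A run would then be acceptable for reporting only if `fraction_covered` is at least 0.99 over the region of interest, and individual calls would additionally need to pass `passes_read_filters` before manual review.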

Data analysis

Multiple commercial, as well as free, software tools are available to analyze NGS data and, as the bioinformatics field is continuously evolving, no single tool is currently preferentially recommended. That said, it is of utmost importance to use a pipeline that has been optimized, and validated, for the detection of low abundance variants that must be distinguished from background error noise. Another issue concerns the accurate identification of insertions and deletions (indels), which may be missed during the alignment process, especially in the case of complex indels. Numerous indel-calling tools have been developed that often vary in the manner by which they detect indel breakpoints. Performance evaluations of indel-calling software have revealed limitations in detection; consequently, manual inspection of the data is always recommended and is particularly required for indel variants and variants close to the detection limit.

Limit of detection

LOD refers to the lowest VAF that is reproducibly detectable by the particular method under specific well-defined conditions. The LOD is a function of both the initial DNA input and the coverage achieved. The NGS assay should be established, and validated, to at least reliably identify variants detectable by Sanger sequencing and avoid false-positive calls with VAF above the Sanger sequencing detection limit (e.g., minimum LOD is 10% VAF). LOD should be set by taking into account non-uniformity of coverage across the analyzed sequence and an inconsistent error distribution. The occurrence of sequencing errors varies depending on the nucleotide position and composition and is also platform-dependent, with C:G>T:A being the most frequent using Illumina platforms [39]. The error rate is also influenced by the specific sequence context (e.g., homopolymers are more prone to erroneous variant calling). The issue of detection limit and how it can influence the interpretation of findings is discussed in the following section.
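The dependence of the LOD on coverage can be illustrated with a simple sampling model. The sketch below assumes, as a deliberate simplification, that reads are independent draws from the template pool (a binomial model); it ignores PCR duplicates, sequencing errors, and limited DNA input, so it gives an optimistic upper bound rather than a validated LOD. The function name and the default of 10 variant reads (taken from the coverage criteria above) are illustrative.

```python
from math import comb


def detection_probability(depth, vaf, min_variant_reads=10):
    """P(at least `min_variant_reads` variant reads) under a binomial sampling model."""
    if depth < min_variant_reads:
        return 0.0
    # Probability of observing fewer than the required number of variant reads.
    p_below = sum(
        comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(min_variant_reads)
    )
    return max(0.0, 1.0 - p_below)
```

Under this model, a variant at 10% VAF covered by exactly 100 reads yields 10 or more variant reads only slightly more than half of the time, which illustrates why coverage well above the minimum is needed for reproducible detection near the reporting threshold.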

Clinical reporting and interpretation of the results

Variant description

Detected variants should be described using the nomenclature devised by the HGVS (http://varnomen.hgvs.org/) [40]. Several software programs are available to ensure adherence to standardized nomenclature (e.g., Mutalyzer; https://www.mutalyzer.nl/). Variants should be described at both cDNA and protein level, and the reference sequence number and version including the transcript and protein variant should be stated (see Supplementary Material). To standardize the output, the preferred coding DNA reference sequence is the stable Locus Reference Genomic sequence (LRG; http://ftp.ebi.ac.uk/pub/databases/lrgex/LRG_321.xml) [30]. Transcript and protein variants 1 should be used (LRG_321t1, LRG_321p1). Special attention is warranted when annotating variants detected by NGS, especially since many bioinformatics pipelines do not fulfill the requirements for correct variant description according to the HGVS nomenclature. More specifically: (i) insertions and deletions are often not handled accurately; (ii) duplications are often misinterpreted as insertions; (iii) varying reference sequences for TP53 within the same output are used; and (iv) the 3′ rule is not always implemented correctly. This is of particular importance for TP53 and other genes that are oriented in the reverse direction on the chromosome. In such situations, the alignment and variant calling steps may introduce errors if aligning to the 3′ end with respect to the chromosome position rather than the coding sequence orientation.
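The 3′ rule mentioned in point (iv) can be illustrated on a toy sequence. The sketch below assumes the reference is supplied in coding (transcript) orientation, which for a reverse-strand gene such as TP53 is the reverse complement of the chromosomal forward strand. The function names are hypothetical, the coordinates refer to the toy string rather than to LRG_321t1, and dedicated tools such as Mutalyzer should be used in practice.

```python
def shift_deletion_3prime(coding_seq, start, end):
    """Shift a deletion of coding_seq[start:end] (0-based, end-exclusive)
    to its most 3' equivalent position on the coding strand."""
    if not 0 <= start < end <= len(coding_seq):
        raise ValueError("deletion must lie within the sequence")
    # Moving the deleted window one base to the right gives the same product
    # whenever the first deleted base equals the base just after the window.
    while end < len(coding_seq) and coding_seq[start] == coding_seq[end]:
        start += 1
        end += 1
    return start, end


def describe_deletion(coding_seq, start, end):
    """Return a c.-style deletion description using 1-based coordinates."""
    s, e = shift_deletion_3prime(coding_seq, start, end)
    return f"c.{s + 1}del" if e - s == 1 else f"c.{s + 1}_{e}del"
```

For the toy sequence GATTTC, deleting any one of the three T bases produces the same product, and the description is normalized to the last T. Applying the same shift along the chromosomal forward strand of a reverse-strand gene would move the variant in the opposite direction with respect to the transcript, which is the error described above.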

Interpretation

Databases

The detected variant should be checked using locus-specific databases, i.e., either the IARC TP53 database (http://p53.iarc.fr/TP53GeneVariations.aspx) [41] or the TP53 website (UMD database; http://p53.fr/) [42]. These databases compile data from peer-reviewed literature as well as general databases, and provide information about: (i) the functional impact of all possible single-nucleotide exchanges within the coding region; (ii) the variant frequencies noted in both the somatic and the germline context; and (iii) additional relevant information, including links to other resources. The TP53 website also provides a web-service tool called Seshat that is capable of managing files generated from NGS both in the vcf and bam formats. Seshat helps the user to: (i) check the variant nomenclature for consistency and generate a full description of each variant formatted according to HGVS; (ii) assess the pathogenicity of each variant according to general prediction algorithms and algorithms developed specifically for analyzing the TP53 gene; and (iii) obtain functional and structural data for each TP53 variant. Finally, variants can also be checked using the COSMIC (http://cancer.sanger.ac.uk/cosmic) or ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/) databases; however, these databases are only recommended as a complementary analysis to the locus-specific databases.

Polymorphisms and neutral variants

In general, it is not recommended to include common polymorphisms and benign variants in the report to physicians. If, however, the local practice requires that these variants are detailed in the clinical report, it should be clearly indicated that the detected variant is not clinically relevant.

According to the IARC database, there are six validated exonic polymorphisms within the TP53 gene; two are synonymous (c.108G>A: p.Pro36= and c.639A>G: p.Arg213=) and four are nonsynonymous (c.91G>A: p.Val31Ile; c.139C>T: p.Pro47Ser; c.215C>G: p.Pro72Arg, and c.1096T>G: p.Ser366Ala). The most frequent polymorphism is c.215C>G: p.Pro72Arg, where the ancestral allele C coding for proline is less frequent in the general population than the allele G [43] with latitude-dependent variations. Although the two alleles were reported to have different capabilities in inducing apoptosis and G1 arrest [44], studies analyzing the clinical impact of p.Pro72Arg and its associations with TP53 mutations in CLL reported inconclusive results [45,46,47,48]. Reporting of the p.Pro72Arg status is therefore not recommended due to a lack of convincing evidence with regard to prognostic or clinical relevance.
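The six validated exonic polymorphisms listed above can be kept as a small lookup table so that they are separated from variants requiring interpretation. This is a minimal sketch with hypothetical function names; it matches exact cDNA descriptions only and does not replace checking the locus-specific databases.

```python
# Validated exonic polymorphisms as listed in the text (cDNA change -> protein change).
KNOWN_EXONIC_POLYMORPHISMS = {
    "c.108G>A": "p.Pro36=",
    "c.639A>G": "p.Arg213=",
    "c.91G>A": "p.Val31Ile",
    "c.139C>T": "p.Pro47Ser",
    "c.215C>G": "p.Pro72Arg",
    "c.1096T>G": "p.Ser366Ala",
}


def is_validated_polymorphism(cdna_change):
    """True if the cDNA description matches one of the six listed polymorphisms."""
    return cdna_change.strip() in KNOWN_EXONIC_POLYMORPHISMS


def split_variants(cdna_changes):
    """Separate listed polymorphisms from variants that need interpretation."""
    polymorphisms, to_interpret = [], []
    for change in cdna_changes:
        if is_validated_polymorphism(change):
            polymorphisms.append(change)
        else:
            to_interpret.append(change)
    return polymorphisms, to_interpret
```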

Using dbSNP for filtering out polymorphisms and neutral variants is strongly discouraged as many variants listed in dbSNP exhibit loss of function and are frequently observed in cancer patients despite not being reported as pathogenic in ClinVar [49]. More specifically, of the 100 most frequent deleterious somatic variants described in the IARC database, 65 are present in dbSNP147 and only 34 are described as being pathogenic [41]. Using the data set collected within the context of the Genome Aggregation Database (gnomAD) is more accurate; however, it should be noted that several pathological variants are also listed in this database (http://gnomad.broadinstitute.org/, originally Exome Aggregation Consortium [43]).

Variants with preserved activity

If a rare variant or a variant with preserved functionality is detected, it is recommended to repeat the entire analysis, starting from the PCR step, so as to exclude analytical errors. If the variant is verified and the VAF is approximately 50%, suggesting a germline origin, it is advisable to verify the germline or somatic nature of the variant by testing patient-matched germline DNA, obtained from CD3+ cells, saliva, a buccal swab or a skin biopsy (it is advised to rule out the contamination with CLL cells by flow cytometry or by testing the patient-specific IGHV rearrangement). Variants that have preserved transactivation capabilities are often found as germline and the carriers do not show any personal or family cancer-history associated with Li-Fraumeni or another cancer-predisposing syndrome. Specific examples of variants that should be considered with caution and are often inaccurately reported are c.704A>G: p.Asn235Ser or c.847C>T: p.Arg283Cys. If the somatic origin of such a variant is confirmed, the variant should be reported to the clinician clearly stating that a variant of unknown significance was found. In the case that the variant is of germline origin, reporting should follow the recommendations of The American College of Medical Genetics and Genomics [50, 51] (recommendations of The European Society of Human Genetics are currently under preparation).

Intronic variants

Variants affecting splice sites (+2/−2 intronic nucleotides) are considered pathogenic as they lead to aberrant mRNA splicing. Pathogenicity of intronic variants outside the donor and acceptor sequence is largely unexplored, and therefore they should not be reported unless their functional impact is proven at the RNA or protein level by documenting the presence of aberrantly spliced transcripts or shortened protein products. As these methods are not usually accessible in diagnostic labs, reporting of intronic variants with the exception of splice sites is not recommended within clinical routine.

Synonymous variants

If a synonymous variant is detected, it is important to check its predicted effect on splicing [52] via the IARC database or the TP53 website. For instance, synonymous variants in codon 125 (c.375G>A and c.375G>T) have been found in various cancers and Li-Fraumeni families and shown to affect the splicing of exon 4 [53], therefore they are classified as pathogenic.

Indel variants

Insertions and deletions leading to the formation of a premature stop codon (frameshift variants) as well as in-frame indels within the DNA-binding domain are considered likely pathogenic.

Clinical reporting of subclonal variants with low variant allele frequency detected by NGS

The term “subclonal” is generally used to describe variants that are not present in the entire tumor population, as opposed to “clonal” [21]. Terms such as “minor subclone”, “low-burden”, or “low-level” variants refer to variants with allelic burdens below the detection limit of Sanger sequencing, i.e., <10% VAF. Of note, caution is necessary when interpreting VAFs, as their calculation does not take into consideration the CCF and the presence of genomic copy number aberrations. Therefore, it is important to bear in mind that a 5% VAF could be clonal if the CCF is only 10% and no del(17p) or copy-neutral loss of heterozygosity is present.
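The relationship between VAF, CCF, and copy number can be made explicit with a short calculation. The sketch below assumes a simplified model in which all cells without the mutation carry two TP53 copies and every mutated cell carries `mutant_copies` mutant alleles out of `total_copies` TP53 copies; the function and parameter names are hypothetical.

```python
def mutated_cell_fraction(vaf, mutant_copies=1, total_copies=2):
    """Fraction of all cells in the sample that carry the mutation.

    Solves VAF = m*a / (m*n + (1 - m)*2) for m, where a is the number of
    mutant copies and n the total number of TP53 copies per mutated cell.
    """
    return 2.0 * vaf / (mutant_copies - vaf * (total_copies - 2))


def fraction_of_tumor_cells(vaf, ccf, mutant_copies=1, total_copies=2):
    """Fraction of tumor cells carrying the mutation, given the CCF of the sample."""
    if not 0.0 < ccf <= 1.0:
        raise ValueError("ccf must be in the interval (0, 1]")
    fraction = mutated_cell_fraction(vaf, mutant_copies, total_copies) / ccf
    return min(1.0, fraction)
```

With the defaults (heterozygous mutation, no deletion), a 5% VAF at a CCF of 10% corresponds to all tumor cells carrying the mutation, as in the example above, whereas the same VAF at a CCF of 90% corresponds to about 11% of tumor cells. Setting `total_copies=1` models a mutation accompanied by del(17p), and `mutant_copies=2` models copy-neutral loss of heterozygosity.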

Several publications have suggested that TP53 mutations within minor clones are clinically relevant, which is particularly important considering that administration of therapeutic regimens based on DNA-damaging agents represents a risk for the selection of these low-level TP53-mutated subclones [20,21,22,23, 33, 54]. However, the extent of the risk posed by minor subclones harboring TP53 mutations has not been conclusively defined, and the current evidence on the poor outcome of TP53-mutated patients treated with chemoimmunotherapy in clinical trials is based on data obtained using Sanger sequencing only. Therefore, currently, the presence of minor subclonal mutations should not impact clinical decision-making. Based on current knowledge, the recommended threshold for reporting of mutations detected by NGS should reflect the Sanger-like threshold of approximately 10% VAF. That said, bearing in mind that the 10% threshold is arbitrary, variants with 5–10% VAF can also be reported, provided that the report always states that the clinical significance of TP53 mutations with VAF 5–10% is currently unknown, since data from prospective clinical studies addressing this issue are lacking. Importantly, NGS technology should be validated to a LOD above which there are no false positives (minimum 10% VAF). Confirmation of mutations detected at the level near the validated LOD is desirable either by Sanger sequencing or, in the case of minor clone variants, by digital PCR, independent NGS run or allele-specific PCR.
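The reporting thresholds described above can be summarized as a small decision function. This is a sketch, not a prescribed algorithm: the category labels and the `validated_lod` parameter are hypothetical, and treating variants below 5% VAF as not reported is an assumption inferred from the recommendation that minor subclonal mutations should not currently affect clinical decisions.

```python
REPORT_THRESHOLD = 0.10    # Sanger-like threshold of approximately 10% VAF
OPTIONAL_THRESHOLD = 0.05  # variants with 5-10% VAF may be reported with a caveat


def reporting_category(vaf, validated_lod=REPORT_THRESHOLD):
    """Assign a reporting category to a variant based on its VAF.

    `validated_lod` is the laboratory's own validated limit of detection;
    variants below it are never reported in this sketch.
    """
    if vaf < validated_lod:
        return "not reported: below validated LOD"
    if vaf >= REPORT_THRESHOLD:
        return "report"
    if vaf >= OPTIONAL_THRESHOLD:
        return "report with caveat: clinical significance unknown"
    # Assumption: variants below 5% VAF are not included in the clinical report.
    return "not reported: below reporting threshold"
```

A laboratory whose assay is validated only down to 10% VAF would therefore never reach the caveat category, while a laboratory with a validated LOD of 5% could report a 7% VAF variant together with the statement on unknown clinical significance.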

Report form

In addition to the obligatory standard medical report content (e.g., patient and lab identifiers, date of sampling, type of material), the report should always contain the following information: (i) the type of analysis and description of the method: methodology used, exons analyzed, LOD, and coverage in the case of NGS (median and ≥99% minimum); (ii) results and interpretation: description of the identified variant(s) according to the HGVS nomenclature, reference sequence used, type of variant (missense/truncating etc.), effect according to the TP53 locus-specific database, frequency, and any known association with cancer; (iii) conclusion: clinical consequence of the variant and summary of the finding in the context of the current knowledge; and (iv) other optional data: VAF of the detected variant if available (estimations from Sanger sequence traces can also be informative), comparison with a previously tested sample from the same patient and, if evidenced, description of clonal evolution.

All labs issuing clinical reports of their results must have accreditation according to their national authorities. ERIC is also regularly conducting TP53 mutational Analysis Certification to confirm the reliability and reproducibility of the results provided by participating labs. Examples of report forms for both Sanger sequencing and NGS are provided in the Supplementary Material and a template report form can be found on the ERIC website (http://www.ericll.org/).

Publishing and scientific reporting in the databases

It is important to distinguish between clinical reporting and reporting variants for research purposes in scientific journals. Data from publications are transferred to databases, and these databases then serve as the source of information for general use [42, 49]. For this reason, in order to prevent incorrect entries, it is essential to follow specific rules in addition to all above-mentioned basic procedures: (i) using consistent sample and patient identifiers if the data are repeatedly published, as inconsistent identification leads to redundancy in mutation databases; (ii) including the genomic coordinate and reference genome in the variant description to avoid ambiguities; (iii) listing all variants that are found in the patient including synonymous and other benign variants [55]. It is recommended to include the complete list of variants in the Supplementary Material, with appropriate description of their clinical significance. Note that if more than one variant in a patient is found, all variants should be listed. Centers following ERIC recommendations are kindly asked to mention ERIC in the “Material and methods” section of their studies and refer to this manuscript.

Concluding remarks

In CLL, inactivation of the TP53 gene by deletion and/or mutation is strongly associated with adverse prognosis and refractoriness to chemoimmunotherapy. Detection of del(17p) and TP53 gene mutations has become an integral part of routine diagnostics and should always be performed before any treatment decision. Analysis of TP53 exons 4–10 is a minimal requirement; however, ideally, the entire coding sequence, i.e., exons 2–11, should be analyzed, and this can be performed by either bidirectional Sanger sequencing or NGS. NGS also allows the parallel analysis of multiple genes and is capable of identifying variants undetectable by Sanger sequencing. That notwithstanding, NGS currently faces certain technical limitations and may lead to problems with data interpretation. The clinical importance of mutations within minor clones remains an unresolved issue and there is currently not enough evidence for making therapeutic decisions based on the presence of mutations undetectable by Sanger sequencing. To assist the community with the implementation of TP53 mutational analysis in a harmonized manner, ERIC created the TP53 Network with the following objectives: regular certification of laboratories for TP53 mutation status assessment (both for Sanger and NGS), the organization of educational events, and regular updating of recommendations for TP53 analysis. The Network also provides tools, accessible via the ERIC web page (http://www.ericll.org/), that help laboratories achieve reliable and comparable results.