In recent years, determination of the mutational status of rearranged immunoglobulin heavy chain variable (IGHV) genes in large series of patients with chronic lymphocytic leukaemia (CLL) has shown strong and independent prognostic value.1, 2, 3, 4, 5, 6, 7, 8 IGHV mutational status is thus being considered one of the most important prognostic factors to stratify patients in clinical trials.9 For this reason, recommendations on how to perform and interpret IGHV mutational analysis in CLL are strongly needed to set standards that may allow comparisons between different patient series, thus avoiding discrepancies.

As part of an ERIC (European Research Initiative on CLL) effort, several investigators involved in CLL clinical research from different European countries, including France, Germany, Great Britain, Greece, Italy, Spain and Sweden, gathered to critically discuss the current literature as well as their own laboratory experience on this issue. This resulted in a critical analysis of the pros and cons of several technical aspects, allowing the authors to reach a consensus on the minimum requirements for a reliable and reproducible analysis of the rearranged IGHV sequences in CLL (Table 1a).

Table 1 Recommendations for reproducible IGHV mutational analysis

The following summary provides a guide to the analysis of IGHV gene mutational profiles in cases of CLL:

Anticoagulant: Biological material (e.g. peripheral blood and bone marrow aspirates) should be drawn in tubes containing either ethylenediaminetetraacetic acid (EDTA) or trisodium citrate/pyridoxal 5′-phosphate/Tris (CPT). The use of heparin-containing tubes may not be advisable based on previous literature that shows a potential inhibition of both DNA polymerase10 and reverse transcriptase,11 through an unknown mechanism.12 That notwithstanding, in our experience the use of heparinized blood (e.g. drawn for cytogenetics or flow cytometry analysis) usually does not preclude a successful analysis.

Materials: Both complementary DNA (cDNA) and genomic DNA (gDNA) are suitable for IGHV analysis (Table 1a and b). cDNA has the advantage over gDNA of preferentially identifying the functional rearrangement. This avoids the extra sequencing reactions that are needed when gDNA analysis detects two rearrangements because a non-productive and/or non-transcribed immunoglobulin heavy chain (IGH) gene rearrangement is present on the second allele. However, double rearrangements, both in-frame, can also be detected, though rarely, when analysing cDNA, probably owing to the lack of allelic exclusion,13 at least at the transcript level. The use of cDNA also provides an opportunity to define the isotype of the expressed rearrangement(s).

Table 2 Pros and cons for material and primers

That notwithstanding, analysis of gDNA, which is more frequently used, avoids the reverse transcription step that is necessary to prepare cDNA, and can be more easily performed when utilizing stored materials (e.g. frozen cell pellets, frozen sections of infiltrated tissues or paraffin-embedded material). Also, when long transport times are anticipated gDNA should be the preferred choice.

Primers: Leader primers allow the whole sequence of the IGHV region to be obtained and thereby the precise definition of the percentage of identity to the closest germline gene (Table 1a).2, 14 In contrast (Table 1b), the use of framework-region 1 (FR1) primers15, 16, 17 prevents analysis of the complete IGHV nucleotide sequence, so the percentage of identity can only be approximated (see 'Determination of homology'). On the other hand, FR1 primers are widely used in many diagnostic laboratories and may have a higher efficiency than leader primers. An attempt with FR1 primers should always be performed when leader primers do not yield a reliable result. Conversely, when using FR1 primers as the first option, an attempt with leader primers should be performed when the percentage of homology is ‘borderline’ (see below). In occasional cases, other primer sets (including FR2 primers) may be employed, even though this makes analysis of the mutational level rather difficult.
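The primer fallback rules described above can be summarized as a small decision helper. This is only an illustrative sketch: the function name `next_primer_attempt` and its arguments are hypothetical, and the case of an unreliable result with FR1 primers as first choice is left open because the text does not prescribe a specific next step for it.

```python
from typing import Optional


def next_primer_attempt(first_choice: str,
                        reliable_result: bool,
                        borderline: bool = False) -> Optional[str]:
    """Suggest the next primer set to try, or None if no further attempt is indicated.

    first_choice    -- "leader" or "FR1" (the primer set used first)
    reliable_result -- whether the first attempt gave a reliable sequence
    borderline      -- whether the resulting percentage of homology is 'borderline'
    """
    if first_choice == "leader":
        # Leader primers failed: an FR1 attempt should always be performed.
        return None if reliable_result else "FR1"
    if first_choice == "FR1":
        # FR1 result is borderline: repeat with leader primers to obtain
        # the full IGHV sequence and a precise percentage of identity.
        return "leader" if (reliable_result and borderline) else None
    raise ValueError("first_choice must be 'leader' or 'FR1'")
```

For instance, `next_primer_attempt("FR1", reliable_result=True, borderline=True)` suggests repeating the analysis with leader primers.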

Detection of IGH rearrangement: After performing the polymerase chain reaction (PCR), it is necessary to check for the presence of rearranged bands. Agarose gel electrophoresis may be sufficient in most cases, although the use of gDNA can reveal multiple IGH rearrangements, which might lead to misinterpretation of the clonality. Polyacrylamide gel electrophoresis (PAGE) analysis or GeneScan,16 when available, is more appropriate in a diagnostic setting to obtain a conclusive result.

Sequencing: Direct sequencing of the PCR product with forward and reverse primers is always advisable. For clinical purposes, gel excision of multiple bands and/or subcloning should be considered when the sequencing attempt is not informative (e.g. cases with double rearrangements). In addition, subcloning analysis may identify intraclonal diversification of IGHV–D–J sequences, indicating ongoing mutational activity, a phenomenon which is still a matter of ongoing research in CLL.18 However, intraclonal diversification has not so far been shown to carry prognostic or clinical value.

Alignment: Among the immunoglobulin databases available online, namely the international ImMunoGeneTics information system (IMGT), V-Base and GenBank, IMGT/GENE-DB19, 20 is the most comprehensive and most regularly updated database in terms of immunoglobulin gene polymorphisms/alleles. This is essential for a correct calculation of the percentage of IGHV gene identity compared to the closest IGHV germline gene.21, 22 For this reason, both V-Base and GenBank cross-reference their databases with IMGT. Nevertheless, the algorithms and software used to analyse and align IGHV sequences differ (V-QUEST for IMGT; DNAPLOT for V-Base; IgBLAST for GenBank). This may lead to different interpretations of nucleotide changes, thereby potentially producing significantly different percentages of identity between web sites.21, 22 More recently, the IMGT/V-QUEST tool has been improved so that it also automatically provides the percentage of IGHV identity to germline, the number and description of mutations per FR-IMGT and CDR-IMGT, and the identification and localization of the hot spots in the germline.20 One also has to keep in mind that nucleotide insertions/duplications or deletions in IGH genes may occur in up to 3% of CLL sequences.23 Such changes do not allow a proper alignment with either V-QUEST or DNAPLOT, hampering a correct analysis. In contrast, the recognition of extra or missing nucleotides is possible with IgBLAST.

Determination of homology: The percentage of identity is calculated based on the ratio between the number of nucleotide differences, that is, mutations within the IGHV region of the IGHV-D-J rearranged sequence, and the length in nucleotides of the most homologous germline IGHV gene (Table 1a). The percentage of identity is calculated from the first (FR1-IMGT or Kabat codon 1) to the last codon (CDR3-IMGT codon 105/106/107 or Kabat codon 95/96/97, depending on exonuclease trimming) of the IGHV gene. A change in the last codon can be counted as a mutation only if it occurs at the first nucleotide position. In case of insertions/duplications or deletions, each inserted/duplicated or deleted sequence should be counted as only one mutation, regardless of the actual number of extra or missing nucleotides. For correct evaluation of the number of somatic mutations, when using FR1 primers a number of nucleotides equal to the primer length should be excluded from the 5′ end of the IGHV region.
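The counting rules above can be illustrated with a minimal Python sketch. It is not the algorithm used by V-QUEST, DNAPLOT or IgBLAST. The function name `ighv_percent_identity` and its parameters are hypothetical, and it assumes that the two sequences have already been aligned (gaps written as `-`), that the alignment starts at the first nucleotide of codon 1 and ends at the last IGHV nucleotide, and that the denominator is the germline length minus any excluded primer length.

```python
def ighv_percent_identity(germline: str, rearranged: str, primer_len: int = 0) -> float:
    """Percent identity of a rearranged IGHV region to its aligned germline gene."""
    if len(germline) != len(rearranged):
        raise ValueError("sequences must be pre-aligned to the same length")

    germline_len = sum(base != "-" for base in germline)
    # 0-based germline index of the first nucleotide of the last (possibly partial) codon.
    last_codon_start = 3 * ((germline_len - 1) // 3)

    mutations = 0
    germline_pos = 0   # number of germline nucleotides consumed so far
    open_gap = None    # type of the indel run currently open, if any

    for g_base, r_base in zip(germline.upper(), rearranged.upper()):
        pos = germline_pos
        if g_base != "-":
            germline_pos += 1
        if pos < primer_len:
            # FR1 primer region: excluded from the evaluation.
            open_gap = None
            continue
        if g_base == "-" or r_base == "-":
            # A run of inserted or deleted nucleotides counts as one mutation.
            kind = "insertion" if g_base == "-" else "deletion"
            if kind != open_gap:
                mutations += 1
            open_gap = kind
            continue
        open_gap = None
        # In the last codon, only a change at the first position is counted.
        if g_base != r_base and pos <= last_codon_start:
            mutations += 1

    evaluated_len = germline_len - primer_len
    return 100.0 * (1 - mutations / evaluated_len)
```

Under these assumptions, 6 counted mutations over a hypothetical 296-nucleotide evaluated region would give about 97.97% identity, and a 3-nucleotide deletion lowers the identity by exactly as much as a single substitution.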

Patient classification according to IGHV mutational status: The vast majority of the current literature uses 98% as the homology cutoff value to make a clinically relevant distinction between ‘mutated’ and ‘unmutated’ CLL cases.3, 5, 24 The 98% cutoff was chosen as a shortcut to exclude potential polymorphic variant sequences, avoiding the need to analyse the corresponding germline gene in each patient.

Although other homology cutoff values have also been suggested (e.g. 97% and even 95%),6, 25 it is not the purpose of these consensus guidelines to discuss modification of the percentage threshold to be used in order to define mutated versus unmutated cases.

Nevertheless, as for any mathematical cutoff value applied to a biological phenomenon, one has to be cautious when dealing with ‘borderline cases’. Interestingly, it has been shown that most sequence differences in the 98–99.5% homology group correspond to a low level of somatic hypermutation.26 That notwithstanding, subsequent studies have confirmed the 98% value as a reasonable cutoff point to discriminate cases with a clinically different prognosis,27 at least at the cohort level. Further studies in larger series of patients are needed to resolve this issue for each individual patient. Finally, one has to consider that in rare cases, double in-frame IGHV–D–J rearrangements with discordant mutational status may be amplified, hampering assignment of the patient to a prognostic subset.

For all these reasons, it is probably advisable to report to clinicians the actual percentage of homology, in addition to a simplistic ‘mutated’ or ‘unmutated’ definition.
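A report that carries both the label and the actual percentage could be sketched as follows. The names `IghvReport` and `classify_ighv` are hypothetical. The sketch follows the conventional reading of the 98% cutoff (identity of 98% or more is ‘unmutated’, below 98% is ‘mutated’), and the `borderline_margin` of one percentage point is purely illustrative: it is not defined by these guidelines.

```python
from dataclasses import dataclass


@dataclass
class IghvReport:
    percent_identity: float  # actual percentage, always reported
    status: str              # "mutated" or "unmutated"
    borderline: bool         # close to the cutoff: interpret with caution


def classify_ighv(percent_identity: float,
                  cutoff: float = 98.0,
                  borderline_margin: float = 1.0) -> IghvReport:
    """Classify a case by IGHV identity while keeping the raw percentage."""
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent_identity must be between 0 and 100")
    # Conventional reading of the cutoff (assumption): >= cutoff is unmutated.
    status = "unmutated" if percent_identity >= cutoff else "mutated"
    # Illustrative margin only; the guidelines do not define a borderline window.
    borderline = abs(percent_identity - cutoff) < borderline_margin
    return IghvReport(round(percent_identity, 2), status, borderline)
```

Keeping the percentage in the report lets a clinician see, for example, that a case labelled ‘mutated’ at 97.9% sits very close to the threshold.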

In conclusion, the use of consensus procedures will allow direct comparison of IGHV sequence data between laboratories, which will be especially important for new multicentre treatment studies in which IGHV mutational status analysis will be used to stratify patients.

Nevertheless, as mentioned earlier, cases exist that are difficult to categorize (e.g. double in-frame rearrangements, a mutated out-of-frame rearrangement coexisting with an unmutated in-frame rearrangement, transcribed out-of-frame rearrangements) and need in-depth analysis. For this purpose, a discussion forum will be available where problematic cases can be submitted and discussed with a review board of experts.