IMPRES does not reproducibly predict response to immune checkpoint blockade therapy in metastatic melanoma

Matters Arising to this article was published on 05 December 2019

The Original Article was published on 20 August 2018

Fig. 1: The algorithm from Auslander et al. cannot reproducibly predict immunotherapy response.

Data availability

All raw data used in our study are freely available in our GitHub repository: https://github.com/JasonACarter/IMPRES_Correspondence/Datasets.

Code availability

All data analysis code used in this manuscript is freely available in our GitHub repository: https://github.com/JasonACarter/IMPRES_Correspondence/Code.

References

  1. Auslander, N. et al. Robust prediction of response to immune checkpoint blockade therapy in metastatic melanoma. Nat. Med. 24, 1545–1549 (2018).

  2. Hugo, W. et al. Genomic and transcriptomic features of response to anti-PD-1 therapy in metastatic melanoma. Cell 165, 35–44 (2016).

  3. Riaz, N. et al. Tumor and microenvironment evolution during immunotherapy with nivolumab. Cell 171, 934–949 (2017).

  4. Van Allen, E. M. et al. Genomic correlates of response to CTLA-4 blockade in metastatic melanoma. Science 350, 207–211 (2015).

  5. Prat, A. et al. Immune-related gene expression profiling after PD-1 blockade in non-small cell lung carcinoma, head and neck squamous cell carcinoma and melanoma. Cancer Res. 77, 3540–3550 (2017).

  6. Chen, P.-L. et al. Analysis of immune signatures in longitudinal tumor samples yields insight into biomarkers of response and mechanisms of resistance to immune checkpoint blockade. Cancer Discov. 6, 827–837 (2016).

  7. The Cancer Genome Atlas Research Network et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nat. Genet. 45, 1113–1120 (2013).

  8. Zhang, W. et al. Comparison of RNA-seq and microarray-based models for clinical endpoint prediction. Genome Biol. 16, 133 (2015).

  9. Bray, N. L., Pimentel, H., Melsted, P. & Pachter, L. Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 34, 525–527 (2016).

Author information

Contributions

J.A.C. and P.G. analyzed data. J.A.C. prepared the manuscript with input from P.G. and G.S.A. G.S.A. supervised the research.

Corresponding author

Correspondence to Gurinder S. Atwal.

Ethics declarations

Competing interests

G.S.A. is now an employee at Regeneron Pharmaceuticals. The other authors declare no competing interests.

Additional information

Peer review information Saheli Sadanand was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Information theoretic analysis of the IMPRES feature set.

(A) The mutual information, a common measure of dependence between two random variables, was calculated for each feature considered by the IMPRES algorithm. Higher mutual information values correspond to a feature providing more information about spontaneous NBL regression in the full NBL dataset. Although variations in training sets and potential interactions between features mean that a greedy algorithm may not always select the most individually informative features, IMPRES (red) contained none of the 17 features carrying the most information about NBL regression in the training datasets. Features selected using Auslander and colleagues’ machine learning algorithm tended to be those that were highly informative of NBL regression (blue), as expected for a greedy algorithm. (B) Running Auslander and colleagues’ algorithm 500 times with random selection of training sets, as outlined by Auslander et al., demonstrates that an unbiased greedy algorithm tends to select features that are more informative of NBL regression than those found in IMPRES. (C) Features selected using Auslander and colleagues’ algorithm were significantly more informative of NBL regression than those found in the IMPRES feature set (p = 0.0015 by Mann-Whitney U test, n = 15 features), suggesting that the bias introduced by the original IMPRES feature sets may have resulted in under-fitting of the NBL dataset. This is further supported by IMPRES achieving an area under the receiver operating characteristic curve (AUC) of 0.91 when applied to the NBL training data itself, as compared to an AUC of 0.98 for feature sets selected using Auslander and colleagues’ algorithm (see Fig. 2D). Boxplots show median (center line) with interquartile range (box).
(D) The mutual information of each feature, calculated individually, is proportional to the fraction of feature sets that feature is found in when the IMPRES algorithm is run using random training sets (p = 0.0001 by Wald test, n=48 features selected through 500 runs), further confirming that the IMPRES greedy algorithm is heavily biased by using the non-random training sets of Auslander et al.
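
As a rough illustration of the quantity described in this legend, the sketch below computes the mutual information between one discrete feature and a binary outcome label. It is not the authors' analysis code: the function name `mutual_information`, the binary encoding of the feature, and the toy values are all assumptions made for the example.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information, in bits, between two equal-length discrete sequences."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p(x, y) / (p(x) * p(y)) simplifies to count * n / (px[x] * py[y])
        mi += p_xy * math.log2(count * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data: 1 = regression, 0 = no regression.
outcome = [1, 1, 0, 0]
informative_feature = [1, 1, 0, 0]    # tracks the outcome exactly
uninformative_feature = [1, 0, 1, 0]  # independent of the outcome

print(mutual_information(informative_feature, outcome))
print(mutual_information(uninformative_feature, outcome))
```

Ranking candidate features by this value, each taken individually, gives the kind of ordering against which a selected feature set can be compared.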

Extended Data Fig. 2 Auslander and colleagues’ algorithm is unable to reproducibly predict melanoma response to ICB therapy.

(A) Receiver operating characteristic (ROC) curves are shown for each of the melanoma datasets separately when using IMPRES features selected using random training sets. The shaded region represents the standard deviation across 500 feature sets selected using the algorithm described by Auslander et al. For each dataset, the total number of samples is listed first, with the number of non-responding samples given in parentheses. PDF, probability distribution function. TPR, true positive rate. FPR, false positive rate. (B) True positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) are shown at varying thresholds for the unbiased IMPRES feature sets. No threshold replicates the predictive performance of the biased IMPRES feature set. (C) Recall and precision curves are shown for new feature sets generated using Auslander and colleagues’ algorithm (n = 500 feature sets). (D) The algorithm described by Auslander et al. was unable to reproducibly predict progression-free survival (PFS) in either the Van Allen et al.4 (n = 42) or (E) Prat et al.5 (n = 25) melanoma ICB datasets. p-values were calculated using the Wilcoxon rank-sum test. Boxplots show median (center line) with interquartile range (box).
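
The threshold sweep behind panels A and B can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that a higher score predicts response, that responders are the positive class, and that the scores and labels shown are made-up values. The helper names are hypothetical.

```python
def confusion_counts(scores, labels, threshold):
    """Return (TP, FP, TN, FN) when predicting positive for score >= threshold."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label:
            tp += 1
        elif predicted_positive and not label:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def roc_points(scores, labels):
    """Return (FPR, TPR) points, sweeping the threshold from high to low."""
    positives = sum(1 for label in labels if label)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp, fp, _, _ = confusion_counts(scores, labels, threshold)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under a piecewise-linear ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical scores and response labels (1 = responder).
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
print(confusion_counts(scores, labels, 0.75))
print(auc(roc_points(scores, labels)))
```

Repeating this calculation for each of many independently selected feature sets, then summarising the spread of the resulting curves, is one way to produce the shaded bands the legend describes.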

Extended Data Fig. 3 IMPRES fails to predict reparameterized melanoma responses to immunotherapy.

(A) Auslander et al.1 did not standardize the RNA abundance quantification methods in their original study. For example, IMPRES was successful when applied to datasets analyzed using TopHat2 (Hugo et al.), DESeq2 (Riaz et al.), and Cufflinks (Van Allen et al.). Furthermore, Auslander et al. successfully applied IMPRES to datasets parameterized using both raw counts and RPKM. However, IMPRES failed to predict melanoma response to ICB therapy on these datasets when the raw sequencing data were processed using Kallisto9. p-values were calculated using the Mann-Whitney U test. Boxplots show median (center line) with interquartile range (box). (B) Receiver operating characteristic (ROC) curves are shown for IMPRES applied to the Hugo et al.2, Van Allen et al.4, and Riaz et al.3 pre-therapy datasets processed using Kallisto. TPR, true positive rate. FPR, false positive rate. Given that Auslander et al. did not standardize their datasets, instead deferring to the substantially varied RNA quantification methods used by the original authors of each study, it is unclear why IMPRES would not be applicable to datasets obtained using different quantification methods. These results therefore suggest that IMPRES was overfit to the provided melanoma datasets via the biased NBL training sets and raise serious concerns regarding the application of IMPRES to new datasets.

Extended Data Fig. 4 Table highlighting conserved training sets in biased IMPRES training sets.

A portion of the training sets used by Auslander et al. are shown below, with each column representing a distinct training set. Each set consists of 13 regressing and 13 non-regressing samples. Identical training sets found within this range are indicated by color.
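
The check for repeated training sets that this table illustrates can be sketched as below. The sample identifiers, pool sizes, and function names are hypothetical; only the 13 + 13 composition of each set comes from the legend above.

```python
import random
from collections import defaultdict


def draw_training_set(regressing, non_regressing, k=13, rng=random):
    """Draw k regressing and k non-regressing sample IDs without replacement."""
    return frozenset(rng.sample(regressing, k)) | frozenset(rng.sample(non_regressing, k))


def find_identical_sets(training_sets):
    """Group the indices of training sets that contain exactly the same samples."""
    groups = defaultdict(list)
    for index, samples in enumerate(training_sets):
        groups[frozenset(samples)].append(index)
    return [indices for indices in groups.values() if len(indices) > 1]


# Hypothetical sample pools; the real cohort sizes are not given in this text.
regressing = [f"R{i}" for i in range(30)]
non_regressing = [f"N{i}" for i in range(30)]

rng = random.Random(0)  # fixed seed so the draw is repeatable
random_sets = [draw_training_set(regressing, non_regressing, rng=rng) for _ in range(500)]
print(find_identical_sets(random_sets))
```

With pools of this size, independently drawn sets would be expected to repeat only very rarely, so groups of identical sets are a sign that the sets were not drawn at random.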

About this article

Cite this article

Carter, J.A., Gilbo, P. & Atwal, G.S. IMPRES does not reproducibly predict response to immune checkpoint blockade therapy in metastatic melanoma. Nat Med 25, 1833–1835 (2019). https://doi.org/10.1038/s41591-019-0671-4

  • DOI: https://doi.org/10.1038/s41591-019-0671-4
