Although a significant reduction in risk of death due to allogeneic hematopoietic blood or marrow transplantation has been achieved over the last two decades, acute GVHD remains a major problem.1, 2 Strategies to eliminate GVHD while maintaining the beneficial graft-versus-tumor (GVT) effect have not yet been developed for clinical use. Unfortunately, only about 60% of patients with acute GVHD respond to upfront treatment and far fewer respond to salvage therapies. Consequently, better treatment continues to be a serious unmet need.

For a variety of reasons, there are few therapeutic trials for acute GVHD, and currently no agents are approved by the United States Food and Drug Administration (FDA) for either prevention or treatment of acute GVHD.3 One of the most serious challenges in moving new drugs for treatment of acute GVHD through the approval process is that the essential parameters for the clinical development program have not been fully established. In contrast to chronic GVHD, for which a series of consensus documents provided unified recommendations for diagnosis, staging and response criteria,4 for acute GVHD only the grading criteria system has been examined critically.5, 6

In response to this limitation, an NIH-FDA public workshop was convened in 2009 to ‘inform and assist’ in facilitating clinical development programs for products to prevent or treat acute GVHD.7 Defining the optimal end point and timing for evaluation of new therapeutics for acute GVHD was a major topic of the workshop. Survival, although universally accepted as an end point in oncology clinical trials suitable for regulatory approval, is not always practical for development of GVHD therapeutics, owing to the multiple contributing causes of death in the allogeneic transplantation setting.8 Response (complete or partial) to therapeutic intervention at a defined time point within weeks of starting therapy has long been used as a measure of success and appeared to correlate with transplant mortality, thereby making response a prime surrogate for clinical benefit.9, 10 Further, assessments of response at a single pre-specified time point would be favored for practical reasons such as determination of trial sample size estimates and more effective description of clinical benefit in the individual patient.3

In the past, acute GVHD therapy trials have used various time points for assessing response to therapy, including day 28, day 42 or day 50 post-intervention, but no formal validation of these timings was attempted.11, 12, 13, 14 In a recent issue of Bone Marrow Transplantation, Saliba and colleagues15 from the MD Anderson Cancer Center present a study validating therapy response as an outcome measure in 83 patients, predominantly with hematological malignancies, who had been enrolled in two clinical trials testing upfront therapy for acute GVHD.16, 17 They assessed acute GVHD responses on days 7, 14 and 28 after starting primary steroid-based therapy. About 60% of the patients responded to the therapy; the proportion of complete responses increased from 53% on day 7 to 92% on day 28. Most importantly, in both the univariate and multivariate analyses, day 14 and day 28 responses were the most significant predictors of 6-month and 2-year non-relapse mortality (NRM). Upon additional statistical analysis, day 28 response was found to be more predictive of NRM than day 14 response, and the study team recommended the use of this end point in future trials of upfront therapy for acute GVHD. Of interest, unlike in some other studies, GVHD severity grade did not influence rates of therapy response or the impact of day 28 response on NRM; these observations suggest that response is an independent predictive characteristic in this patient population.

Two other recently published studies also addressed the question of the best timing of an early end point for use in trials of upfront acute GVHD therapy (Table 1). MacMillan et al.18 at the University of Minnesota determined in a retrospective study of 864 patients that day 28 and day 56 responses are similarly effective in predicting 2-year transplant-related mortality (TRM). Compared with patients who responded, patients with no response had a 2.78-fold higher risk of dying (P<0.0001). These findings held true for patients receiving myeloablative and non-myeloablative peripheral blood or marrow transplants, and also for recipients of umbilical cord transplants. Levine et al.19 evaluated the best time for measuring response in a four-arm phase II randomized study conducted through the Blood and Marrow Transplant Clinical Trials Network (BMT CTN). Survival and NRM varied significantly according to complete response (CR) or partial response (PR) status on day 14 post-intervention; however, on days 28 and 56 post-intervention, survival and NRM were similar whether a patient had achieved a CR or a PR. Multivariate analyses and specificity/sensitivity analyses identified that the day 28 response (either CR or PR) best categorized patients by NRM at 9 months from the start of acute GVHD treatment.

Table 1 Studies validating therapeutic response after upfront therapy for acute GVHD

The study by Saliba et al. also identified a striking difference in causes of death between GVHD responders and non-responders. Among non-responders, NRM accounted for 95–96% of deaths, while among responders, relapse of the underlying malignancy accounted for 56–61% of deaths. Notably, however, acute GVHD treatment response did not significantly affect overall survival. This stands in contrast to the study by Levine et al.,19 wherein day 28 response did in fact correlate with 9-month survival.

In short, all three studies provided data that strongly suggest day 28 as a suitable time point to measure response to first-line therapy in clinical trials for acute GVHD. These three studies also collectively support the recent expert panel endorsement of ‘day 28’ response as a primary end point in acute GVHD treatment trials aiming for regulatory approval. Earlier time points (days 7 or 14) might not allow sufficient time for optimal responses, while later time points (day 56) carry the risk of confounding interpretation because patients may develop chronic GVHD or other transplant-related complications.3 What is not clear from any of these studies, however, is whether a PR itself actually confers a clinical benefit, because, as pointed out by Saliba et al., patients with only a PR represent a minority of the responders. Recently proposed alternatives for assessing acute GVHD therapy response, such as functional very good partial response or the GVHD activity index, also need to be tested in retrospective and prospective studies to determine whether they improve the degree of correlation of response with clinical benefit and whether they have any inherent practical limitations.3, 20

These current publications provide greatly needed, data-driven guidance for end points in trials of upfront therapy of acute GVHD. However, it is essential to understand that the conclusions drawn may not be applicable in the salvage therapy setting, in different patient populations, or in trials using agents from other therapeutic classes or different trial designs. Therefore, additional data must be collected and analyzed to extend the validation to these settings. Success in this endeavor will require that we minimize heterogeneity in other factors, such as patient population, type of transplant or steroid dose, which might alter the end point being measured.3 Owing to challenges in the reproducibility of clinicians' assessments of response, it is also critical to pursue evaluation of novel end points, such as biomarkers and patient-reported symptoms.21, 22