Estimating and reporting treatment effects in clinical trials for weight management: using estimands to interpret effects of intercurrent events and missing data

In the approval process for new weight management therapies, regulators typically require estimates of effect size. Usually, as with other drug evaluations, the placebo-adjusted treatment effect (i.e., the difference between weight losses with pharmacotherapy and placebo, when given as an adjunct to lifestyle intervention) is provided from data in randomized clinical trials (RCTs). At first glance, this may seem appropriate and straightforward. However, weight loss is not a simple direct drug effect, but is also mediated by other factors such as changes in diet and physical activity. Interpreting observed differences between treatment arms in weight management RCTs can be challenging; intercurrent events that occur after treatment initiation may affect the interpretation of results at the end of treatment. Utilizing estimands helps to address these uncertainties and improve transparency in clinical trial reporting by better matching the treatment-effect estimates to the scientific and/or clinical questions of interest. Estimands aim to provide an indication of trial outcomes that might be expected in the same patients under different conditions. This article reviews how intercurrent events during weight management trials can influence placebo-adjusted treatment effects, depending on how they are accounted for and how missing data are handled. The most appropriate method for statistical analysis is also discussed, including assessment of the last observation carried forward approach, and more recent methods, such as multiple imputation and mixed models for repeated measures. The use of each of these approaches, and that of estimands, is discussed in the context of the SCALE phase 3a and 3b RCTs evaluating the effect of liraglutide 3.0 mg for the treatment of obesity.


INTRODUCTION AND BACKGROUND
Novo Nordisk (the sponsor) has submitted a new drug application (NDA) for liraglutide 3.0 mg/day as an adjunct to a reduced caloric diet and physical exercise for chronic weight management in adult patients that are overweight with co-morbidities or obese. This document summarizes the primary efficacy findings from five randomized Phase 2 and Phase 3 Trials included in the NDA. This review mainly focuses on the three Phase 3 trials (1839, 1922 and 1923) due to their weight management objective, trial duration (at least 56 weeks), and their ability to support our preferred analysis. For these trials an emphasis is placed on 1) the extent and impact of missing data, and 2) the statistical methods used to explore the potential impact of missing data.
This document is organized as follows.
Section 2 discusses statistical considerations of two elements of the 2007 Draft FDA Guidance for weight management: efficacy benchmarks and analysis methods. Section 3 summarizes the individual trial designs, statistical methods, patient disposition and trial results. In Section 3.3 limitations of the sponsor's missing data sensitivity analyses are explored and discussed. In Section 3.4 the primary prespecified analysis is shown to over-estimate the intention-to-treat (ITT) effect, as estimated by our preferred analysis, by a relative change of up to 15%. Our preferred approach is an ITT analysis that represents missing data on the primary endpoint using information from subjects that prematurely discontinued but returned for a primary endpoint measurement. Based on this approach (detailed in Section 3.3), subjects treated with liraglutide 3.0 mg, compared to placebo, had an average excess reduction from baseline to week 56 in fasting weight of 4.8% (95% CI = 4.3, 5.3) in Trial 1839 and 3.4% (95% CI = 2.3, 4.5) in Trial 1922. When liraglutide was used after an initial 5% weight reduction from a low calorie diet in Trial 1923, liraglutide treated subjects lost on average an additional 5.3% (95% CI = 3.8, 6.8) compared to placebo. Section 4 provides a brief summary of findings.
The statistical evaluation of cardiovascular events is addressed in a separate statistical review conducted by the Division of Biometrics VII.

Draft FDA Guidance for products for weight management: Statistical considerations
In 2007 FDA released the Draft Guidance for Industry: Developing Products for Weight Management that provides recommendations for the development of drugs for the indication of weight management. The content relevant to evaluating the effectiveness of liraglutide is described in the sections on efficacy benchmarks and statistical methods. Below excerpts from these sections are provided along with a discussion of statistical considerations.
An analysis that uses the last available observation on-treatment (LAO-OT) presents unique challenges in interpreting the results, both overall and relative to the estimate of the intention-to-treat (ITT) effect. Some of the challenges associated with the recommended analysis are:
1. Part of a therapy's effect is mediated through the ability to tolerate the therapy. Therefore, an analysis that excludes observations after discontinuing therapy likely inflates the treatment effect, since subjects that go off-treatment tend to regain weight.
2. The average endpoint may have limited utility for a patient making a treatment decision because it is not known (nor is it possible to know) how long they will tolerate treatment; this can only be known after starting a treatment.
3. The endpoint may not be clinically relevant for subjects with limited treatment adherence (e.g., one or two months) given the long-term goals of weight management.
4. The distribution of the timing of the last available on-treatment measurement can differ across treatment arms. When this occurs, the comparison of on-treatment experiences across treatment arms can be time-confounded.
Based on these considerations our preferred analysis is one that estimates the intent-to-treat effect using data from all subjects at the landmark visit. Because none of the sponsor's sensitivity analyses were found to adequately estimate this quantity, for reasons described in Section 3.3, we fit two different statistical models to estimate this quantity; details of these models are provided in Section 3.3.

Study Design and Endpoints
A summary of the study design and endpoints for the trials reviewed in this document is shown in Table 1. Additional details of the trial designs are provided in Sections 3.1.1 to 3.1.5 with primary efficacy endpoints described in Section 3.1.6. The Phase 3 trials differed in important ways. In particular, Trial 1922 was the only study in subjects with type 2 diabetes mellitus (T2DM); Trial 1923 studied subjects after they had lost 5% of their bodyweight during a 12 week low calorie diet (LCD); and the primary objective of Trial 3970 was not related to inducing or maintaining weight loss. In all trials subjects received diet and activity counseling.

Trial 1807
Trial 1807 was a Phase 2, randomized, partially blinded, parallel group, placebo and active controlled dose-finding trial in non-diabetic, obese subjects. A total of 564 subjects in 19 sites in 8 European countries were randomized 1:1:1:1:1:1 to one of four liraglutide doses (1.2, 1.8, 2.4, or 3.0 mg once daily), matching liraglutide placebo, or open-label orlistat (120 mg three times daily). Randomization was stratified by gender. The treatment duration was planned for 20 weeks with an optional 84 week extension period. A total of 398 randomized subjects consented to and continued study treatment in the extension phase. After the 52 week visit subjects treated with liraglutide or placebo were initially treated with the open-label 2.4 mg dose. Subjects were subsequently switched to the 3.0 mg dose following discussion from the planned week 52 analysis.

Trial 1839
Trial 1839 was a randomized, double-blind, placebo controlled, parallel group trial in nondiabetic obese or overweight subjects with co-morbidities. A total of 3731 subjects in 191 sites including 69 in the US were randomized 2:1 to liraglutide 3.0 mg or placebo. Randomization was stratified by pre-diabetes status (with, or without) and BMI (≥ 30 kg/m², or < 30 kg/m²). Subjects in the pre-diabetes stratum were randomized to 160 weeks of treatment; data post 56 weeks was not included in the submission. Subjects in the stratum without pre-diabetes were randomized to 56 weeks of treatment followed by a 12 week re-randomization treatment period. Subjects randomized to liraglutide were then re-randomized 1:1 to liraglutide or placebo. Subjects that prematurely discontinued were asked to attend a follow-up visit that took place 56 weeks after their randomization date.

Trial 1922
Trial 1922 was a 56 week randomized, double-blind, placebo controlled, three-arm parallel group trial in obese or overweight subjects with T2DM. A total of 846 subjects in 126 sites including 67 in the US were randomized 2:1:1 to liraglutide 3.0 mg, liraglutide 1.8 mg or placebo as an add-on to their background diabetes treatment. Randomization was stratified by HbA1c (≥ 8.5%, or < 8.5%) and background treatment (diet and exercise or single compound oral antidiabetic treatment, or combination oral antidiabetic treatment). Subjects that prematurely discontinued were asked to attend a follow-up visit that took place 56 weeks after their randomization date.

Trial 1923
Trial 1923 was a 56 week randomized, double-blind, placebo controlled parallel group trial in non-diabetic obese or overweight subjects with dyslipidemia and/or hypertension. Subjects were randomized if they lost at least 5% of their bodyweight during a 12 week low calorie diet (1200-1400 kcal/day) run-in period. A total of 422 subjects in 36 sites in the US (26) and Canada (10) were randomized 1:1 to liraglutide 3.0 mg or placebo. Randomization was stratified by comorbidity status (presence or absence of treated or untreated hypertension or dyslipidemia). Subjects that prematurely discontinued were asked to attend a follow-up visit that took place 56 weeks after their randomization date.

Trial 3970
Trial 3970 was a 32 week randomized, double-blind, placebo controlled parallel group trial in non-diabetic obese subjects with moderate or severe obstructive sleep apnea (OSA). The primary study objective was to evaluate whether liraglutide reduces the severity of OSA assessed by apnea-hypopnoea index (AHI). A total of 359 subjects in 40 sites in the US (35) and Canada (5) were randomized 1:1 to liraglutide 3.0 mg or placebo.

Efficacy Endpoints
The pre-specified primary efficacy endpoints for the individual trials are displayed in the table below. Note that for Trial 1839 the fourth primary endpoint is still being collected at the time of the NDA submission; interim results are not presented in this review. Furthermore, it is noted that the primary endpoint definition from trial protocols (fixed time-point) is not consistent with the endpoint in the primary analysis that relies on LAO-OT. This lack of harmonization not only can lead to results being misinterpreted, it is also problematic for this submission because the treatment effect estimated from the primary analysis is found to over-state the estimated ITT treatment effect using our preferred approach.
The primary efficacy endpoints of percent change in fasting body weight from baseline and 5% responders are consistent with what is described in the Draft FDA Guidance. The 10% responder commented on the ineffectiveness of the therapy (Table 13 in the Appendix). The extent to which this occurred in Trial 1839 and the other trials is not known.
In Trial 1807, 472 or 84% of the 564 randomized subjects completed the 20 week main treatment period, with 74 of them not enrolling into the 84 week extension period. The decision not to continue follow-up appears to be associated with degree of weight loss at week 20, with the subjects that enrolled in the extension having more favorable average weight reductions than those that did not (Table 3). This trend was consistent across study arms except for the 1.2 mg liraglutide dose. A relationship was also observed between the timing of the last on-treatment assessment and the change in the primary endpoint for Trial 1839 (Figure 1) and Trial 1922 (Figure 5 in the Appendix). In particular:
1. Subjects that had a 56 week on-treatment assessment (thick lines) consistently had a more favorable mean response profile over the study duration than the subjects that did not have a week 56 assessment. This observation was consistent across treatment groups.
2. There was a positive relationship between the timing of the last on-treatment assessment and weight loss, with the average reduction being more favorable for subjects that had their assessment later in the trial compared to earlier.
The distribution of the timing of the last available on-treatment assessment was not the same across treatment arms. The plots do not describe what the average response at week 56 would have been for those that did not have an on-treatment assessment at week 56. For subjects that prematurely discontinued and returned for a week 56 assessment, the LAO-OT was found not to adequately characterize the week 56 response.

Table 5. Comparison of fasting weight change (%) at LAO-OT and week 56 for subjects that withdrew and returned for a week 56 follow-up assessment
Source: FDA statistical reviewer

Differences were observed in the frequency of responders based on LAO-OT and week 56. In Trial 1839 the proportion of 5% responders for placebo using LAO-OT under-estimated the response rate at week 56 (9% vs. 22%); for liraglutide the proportions of responders were fairly similar (LAO-OT: 34%; week 56: 32%). In Trial 1923, the proportion of subjects that were able to maintain their baseline weight (i.e., the weight after a 5% reduction during the LCD run-in) was over-estimated using LAO-OT for liraglutide (LAO-OT: 11/12; week 56: 7/12) and under-estimated using LAO-OT for placebo (LAO-OT: 7/18; week 56: 11/18).

Statistical Methods
Analysis Populations:
Two of the sponsor's analysis populations were the full analysis set (FAS) and the completers. The FAS was the primary analysis population, and included all randomized subjects exposed to at least one dose of the trial product and with at least one postbaseline assessment of body weight in Trials 1807 and 1923, or of any efficacy endpoint in Trials 1839 and 1922. The FAS in Trial 3970 was defined as all randomized subjects. This population is consistent with the modified ITT population defined in the Draft FDA Guidance (Box 2). The completer population included subjects in the FAS with a valid end of trial efficacy assessment.
The FDA analyses are performed on the ITT population, defined as randomized subjects with a baseline assessment.
All analyses use the randomized treatment.
Statistical methods for the primary analysis of the primary efficacy endpoints: Consistent with the Draft FDA Guidance the primary analysis was performed on the FAS using LAO-OT. In Trial 1922 the analysis was performed using last available pre-rescue observation on treatment. Continuous primary endpoints were analyzed using an analysis of covariance (ANCOVA) model that included treatment, country, sex, baseline response, and randomization stratum as independent variables. Categorical endpoints were analyzed using a logistic regression model using the same independent variables.
Note that in Trial 1922 the decision to limit the analysis to pre-rescue observation has the potential to inflate the treatment effect since subjects randomized to placebo were more likely to require rescue medication overall and earlier on average in the trial.
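The LAO-OT endpoint described above can be illustrated with a minimal sketch. It assumes each subject has a list of post-baseline visits flagged as on- or off-treatment and, for Trial 1922 style analyses, as pre- or post-rescue. The `Visit` record, the function name and the flag names are hypothetical and are not taken from the sponsor's programs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Visit:
    week: int                   # weeks since randomization
    weight_kg: float            # fasting body weight at the visit
    on_treatment: bool          # True if the subject was still on randomized therapy
    post_rescue: bool = False   # True if measured after rescue medication started


def lao_ot_percent_change(
    baseline_kg: float,
    visits: List[Visit],
    exclude_post_rescue: bool = False,
) -> Optional[float]:
    """Percent change from baseline at the last available on-treatment visit.

    Off-treatment visits (including a retrieved week 56 follow-up) are ignored,
    which is what distinguishes LAO-OT from a landmark week 56 analysis.
    Returns None when the subject has no eligible post-baseline visit.
    """
    eligible = [
        v for v in visits
        if v.on_treatment and not (exclude_post_rescue and v.post_rescue)
    ]
    if not eligible:
        return None
    last = max(eligible, key=lambda v: v.week)
    return 100.0 * (last.weight_kg - baseline_kg) / baseline_kg
```

In this sketch a subject who stopped treatment at week 16 and returned at week 56 contributes the week 16 weight, not the week 56 weight, which is the source of the interpretive problems discussed in Section 2.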
Sample size: The Phase 3 trials were individually powered to test the individual study endpoints with at least 85% power. The trials, in particular Trial 1839, were over-sized for the efficacy endpoints to comply with safety considerations outlined in the Draft FDA Guidance. The Guidance recommends approximately 3,000 subjects are randomized to active doses and no fewer than 1,500 subjects are randomized to placebo.
Approach to multiplicity: The Phase 3 trials (1839, 1922, 1923, 3970) individually preserved the study-wise type-I error at 5% by hierarchically testing the study endpoints according to their order in Table 2. Under this approach the statistical testing for an endpoint is performed only if the statistical test for the preceding endpoint in the hierarchy is statistically significant at the two-sided 5% level. For Trial 1922, which investigated two liraglutide doses, the hierarchy ordered the hypotheses for the 3.0 mg dose first, followed by the hypotheses for the 1.8 mg dose.
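The hierarchical procedure described above can be sketched as a fixed-sequence test. The function name is hypothetical, the p-values passed to it are assumed to be two-sided and listed in the pre-specified order, and the comparison `p < alpha` is an assumption about how the boundary case is handled.

```python
from typing import List


def fixed_sequence_test(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Apply a fixed-sequence (hierarchical) test to ordered two-sided p-values.

    Each endpoint is tested at the full alpha level, but only if every
    endpoint before it in the hierarchy was significant. Once one endpoint
    fails, all later endpoints are declared not significant without testing.
    """
    decisions = []
    still_testing = True
    for p in p_values:
        if still_testing and p < alpha:
            decisions.append(True)
        else:
            still_testing = False
            decisions.append(False)
    return decisions
```

With hypothetical p-values `[0.001, 0.02, 0.20, 0.01]`, the fourth endpoint is not declared significant even though its own p-value is below 0.05, because the third endpoint stopped the sequence.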
Approximately 15 to 20 secondary endpoints were prespecified for investigation in each of the Trials. None of the secondary endpoints, including those related to body composition in Trial 3970, were incorporated into the hierarchical testing sequence to preserve the study-wise type-I error.
For Trial 1807 the pairwise comparisons at week 20 between the separate liraglutide doses to placebo and orlistat were done using Dunnett's method for simultaneous confidence intervals. The nominal study-wise error was not preserved at the 5% level as a separate 5% alpha was used for the placebo comparison and the orlistat comparison.
Sensitivity analyses for the primary efficacy endpoints: In my opinion, the sponsor's sensitivity analyses used to assess the potential impact of missing data are inadequate. None of their analyses attempted to estimate the ITT effect at week 56 under a reasonable set of assumptions. Our recommended/preferred approach represents the missing week 56 response for subjects that prematurely discontinued using information from the subjects that also prematurely discontinued but returned for their week 56 assessment. This approach can be implemented only for Trials 1839, 1922 and 1923 because they retrieved dropouts. Additionally, I do not concur with the sponsor's definition/notion of missing data. Our notion is that all study subjects (if alive) have a weight at week 56, with their missing status being defined by whether or not the endpoint was assessed. Thus, the retrieve dropouts have a valid endpoint even though they were no longer receiving study therapy. In the sponsor's investigation of missing data the majority of their analyses did not use a subject's actual off-treatment week 56 measurements. This approach has significant implications for the interpretation of the treatment effect at week 56, as detailed for the sponsor's MMRM and imputation analyses below.

Continuous endpoints (Sponsor's):
Below is a description of the sponsor's sensitivity analyses that are presented in this document. With the exception of the MMRM analyses the endpoint was analyzed using an ANCOVA model using the covariates in the primary analysis.
1. Completers - Subset analysis that includes subjects that did not have their endpoint imputed in the primary analysis.
2. Last available observation (LAO) - Used fasting or non-fasting weight measurements, off-drug measurements, post-rescue measurements, and the follow-up weight measurements at 56 weeks after randomization for early withdrawals (retrieve dropouts). The analysis for Trial 1923 excluded post-rescue measurements.
3. Baseline observation carried forward (BOCF) - Baseline observations were carried forward for subjects without a valid post-baseline assessment. This analysis was applied to all randomized subjects. This analysis was not performed in Trial 1923.
4. MMRM - A longitudinal analysis of on-treatment fasting weights that set off-treatment measurements to missing. A contrast and 95% CI was constructed for the difference in percent weight change for liraglutide compared to placebo at week 56.
5. Multiple imputation (MI) - Off-treatment responses in both treatment groups were imputed assuming the distribution of their pre- and post-withdrawal values is the same as the distribution of placebo completers. Off-treatment follow-up measurements were not included in either the imputation or the analysis.

Comments on the limitation of the sponsor's MMRM and MI analysis:
MMRM - The MMRM model assumes missing data are missing at random. Under this assumption the statistical behavior of the missing data (given the observed responses and model covariates) is assumed to be the same as that of the observed data. Because the model uses only on-treatment observations, the model estimates the treatment effect at week 56 assuming all subjects in the FAS could adhere to randomized therapy, contrary to the fact that a sizable number could not. This analysis therefore attempts to estimate a treatment effect under conditions that were not observed in the clinical trials, nor could occur in clinical practice. Therefore, it is my opinion that the findings from this sensitivity analysis lack clinical relevance due to the underlying implausibility of achieving perfect treatment adherence.
Multiple imputation - The analysis anchors the imputed week 56 responses based on the placebo completers. Whether this is appropriate is debatable and was not justified by the sponsor. An assumption of their imputation model is, for a liraglutide treated subject, the on-treatment experiences are attributable to placebo and not the treatment received. Due to the sponsor's approach to missing data the implication of this assumption can be empirically evaluated. This was done for Trial 1839 by comparing the average imputed value with their actual value for the retrieve dropouts (Figure 3). It is evident that for liraglutide treated subjects the imputation model had them having greater average loss at week 56 than they actually did. The average decrease at week 56 from baseline was 6.1% based on the imputation, which was double the 3.0% average decrease that was actually observed and surprisingly greater than the 4.9% average decrease at the LAO-OT. For placebo the differences between imputed and observed values were not dramatic. As a consequence of these findings, it is likely that this analysis will over-state the ITT effect at week 56.
mg arm in Trial 1922 due to the small number of retrieve dropouts; our preferred approach for Trial 1922 and comparison involving liraglutide 1.8 mg is described below.
For the continuous endpoints a total of 100 imputed datasets were created, and results were combined using Rubin's rules (Rubin, D., Multiple Imputation for Nonresponse in Surveys, New York: Wiley & Sons (1987)). For the categorical endpoints response status was determined from the imputed continuous response. A total of 1000 imputed data sets were created. The imputed data were analyzed using a Beta-Binomial model with a uniform prior. For each imputed dataset a sample for each group was drawn from their respective posterior distribution, which thus incorporated imputation variability. Difference in probabilities was summarized using the 50th, 2.5th and 97.5th percentiles of the distribution.
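The pooling step for the continuous endpoints can be sketched as follows. This is a minimal illustration of the standard Rubin's rules formulas for a scalar estimate, assuming each imputed dataset has already been analyzed to give a treatment-difference estimate and its squared standard error; the function name is hypothetical, and the degrees-of-freedom adjustment used for confidence intervals is omitted.

```python
import math
from statistics import mean, variance
from typing import List, Tuple


def rubin_combine(estimates: List[float], variances: List[float]) -> Tuple[float, float]:
    """Pool per-imputation estimates with Rubin's rules.

    estimates: one point estimate per imputed dataset
    variances: the squared standard error of each estimate
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, math.sqrt(total)
```

The between-imputation term is what carries the uncertainty due to missing data into the pooled standard error; analyzing a single imputed dataset would omit it.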
Retrieve dropout weighted analysis (RD-Weighted) -In this analysis subjects were assigned differential weights, which up-weighted the contribution of subjects that prematurely discontinued and returned for a week 56 measurement while those missing a week 56 measurement were assigned zero weight (and did not contribute to the analysis). A subject with an on-treatment or other week 56 measurement was assigned a weight of one. The degree to which a subject was up-weighted depended on their treatment group and the timing of their LAO-OT.
For the continuous endpoints the data were analyzed using a weighted ANCOVA model. For the categorical endpoints the weighted sample was analyzed using a Beta-Binomial model with a uniform prior. A total of 100,000 samples were taken for each treatment group, and the difference in probabilities was summarized using the 50th, 2.5th and 97.5th percentiles of the distribution.
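The Beta-Binomial summary of responder proportions can be sketched as below. With a uniform Beta(1, 1) prior, the posterior for a group with r responders out of n is Beta(1 + r, 1 + n - r). The function name, the seed handling and the simple index-based percentile are assumptions made for illustration; the counts may be non-integer, as they would be for a weighted sample.

```python
import random
from typing import Tuple


def posterior_difference(
    resp_active: float, n_active: float,
    resp_placebo: float, n_placebo: float,
    draws: int = 100_000, seed: int = 0,
) -> Tuple[float, float, float]:
    """Posterior summary of the difference in responder probabilities.

    Draws from the Beta posterior of each group under a uniform prior and
    returns the 2.5th, 50th and 97.5th percentiles of (active - placebo).
    """
    rng = random.Random(seed)
    diffs = sorted(
        rng.betavariate(1 + resp_active, 1 + n_active - resp_active)
        - rng.betavariate(1 + resp_placebo, 1 + n_placebo - resp_placebo)
        for _ in range(draws)
    )

    def percentile(p: float) -> float:
        # Simple order-statistic percentile; adequate for large numbers of draws.
        return diffs[min(draws - 1, int(p * draws))]

    return percentile(0.025), percentile(0.5), percentile(0.975)
```

For the multiply imputed analysis described in the text, one draw per group would be taken for each imputed dataset and the draws pooled before computing the percentiles, so that imputation variability is reflected in the interval.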

Trial 1807
Results from the analysis of primary endpoints at week 20 are shown below (

Trials 1839, 1922, and 1923
In each of the Phase 3 weight management trials all of the efficacy endpoints evaluated under the hierarchical testing sequence were statistically significant. To allow for a more fluid discussion of study findings the results will not be presented according to the pre-specified testing sequence. Furthermore, we caution contrasting results across trials since the trials differed in important ways with respect to study design and study population.

Change in body weight:
Results from the pre-specified primary analysis of the primary efficacy endpoint are shown in Table 7. In each of the trials liraglutide 3.0 mg treated subjects had a statistically significantly greater reduction in body weight from baseline compared to placebo. For Trials 1839 and 1922 the confidence interval did not rule out a difference in average reduction for liraglutide compared to placebo of 5%.
In Trial 1922 the liraglutide 1.8 mg treated subjects had a statistically significant greater weight reduction compared to placebo, although the difference was not as large as the reduction observed for the 3.0 mg dose.
In our preferred analysis (MI-RD for Trials 1839 and 1922, and RD-Weighted for Trial 1923) the estimate of the ITT effect remained statistically significantly better than placebo (Table 8) but the magnitude of the estimated treatment effect was attenuated relative to the primary prespecified analysis. The estimated effect was 11% smaller for Trial 1839 and 15% smaller for Trials 1922 and 1923. These findings were reasonably aligned with the second FDA sensitivity analysis that attempted to estimate the ITT effect albeit with smaller. Results from the sponsor's sensitivity analyses were found to be aligned with the findings from the primary pre-specified analysis.
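One way to read the attenuation percentages above is as the relative change of the preferred ITT estimate against the primary LAO-OT estimate. This reading is an assumption, not a formula stated in the review, and the symbols below are introduced only for illustration. The back-calculated value is implied by the rounded figures quoted in the text and should be checked against Table 7.

```latex
% Assumed definition of relative attenuation
\[
  \text{relative attenuation}
  = \frac{\hat{\Delta}_{\text{primary}} - \hat{\Delta}_{\text{ITT}}}
         {\hat{\Delta}_{\text{primary}}}
\]
% Trial 1839 under this reading: an ITT estimate of 4.8 and an attenuation
% of 0.11 imply a primary estimate of roughly 5.4 percentage points
\[
  \hat{\Delta}_{\text{primary}}
  \approx \frac{4.8}{1 - 0.11} \approx 5.4
\]
```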

Table 7. Primary analysis results for change in fasting body weight (%) in Trials 1839, 1922, and 1923
Source: FDA statistical reviewer

Responder endpoints: Results from the pre-specified primary analysis of the responder endpoints are shown in Table 7. In each trial, for each of the two responder endpoints, the liraglutide 3.0 mg treated subjects had a statistically significant excess number of responders compared to placebo. For Trials 1839 and 1922 the estimated proportion of liraglutide 3.0 mg treated subjects having a 5% response was notably greater than 35% and more than double the proportion in placebo.
In Trial 1922 the liraglutide 1.8 mg treated subjects also had a statistically significant excess number of responders compared to placebo. The estimated proportion of liraglutide 1.8 mg treated subjects having a 5% response was close to 35% (36%) and more than double the proportion in placebo.
In our preferred analysis the estimate of the ITT effect remained statistically significantly better than placebo (Table 10) but, similar to the findings from the continuous endpoint, the magnitude of the estimated treatment effect was attenuated relative to the primary prespecified analysis. Cumulative distribution plots were constructed to allow investigation of different thresholds beyond those considered above. (Plots for Trials 1922 and 1923 are displayed in the Appendix.) Importantly, randomized subjects that were no longer on-treatment by week 56 and/or did not have an endpoint assessment were assigned the worst possible weight change. This resulted in the initial step in the curves, but removed the potential of having time-confounded curves. The expectation in such a plot is that if liraglutide was not efficacious the liraglutide curve would be similar to or worse than (due to potential adverse effects) the placebo curve over the changes from baseline that are considered meaningful (e.g., > 5%). This was not what was observed, with the proportion of responders being greater in the liraglutide group.
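The construction of these cumulative distribution curves can be sketched as follows. The sketch assumes each randomized subject has either an on-treatment week 56 percent change (negative values meaning weight loss) or `None` when the subject was off-treatment or had no assessment. Treating `None` as a non-responder at every threshold is equivalent to assigning the worst possible weight change. The function name and data layout are hypothetical.

```python
from typing import Dict, List, Optional


def cumulative_responder_curve(
    pct_changes: List[Optional[float]],
    thresholds: List[float],
) -> Dict[float, float]:
    """Proportion of randomized subjects losing at least each threshold percent.

    The denominator is all randomized subjects in the arm, so subjects with
    None (off-treatment or unassessed at week 56) lower the curve at every
    threshold instead of being dropped from the comparison.
    """
    n = len(pct_changes)
    curve = {}
    for t in thresholds:
        responders = sum(1 for c in pct_changes if c is not None and c <= -t)
        curve[t] = responders / n
    return curve
```

Because the denominator never changes, the two arms are compared over the same set of randomized subjects at every threshold, which is what removes the time-confounding noted above.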

Summary results
Based on our preferred analysis subjects treated with liraglutide were found to have statistically significant changes in body weight. Compared to placebo, the excess reduction in fasting weight from baseline to week 56 for liraglutide 3.0 mg was 4.8% (95% CI = 4.3, 5.3) in Trial 1839 and 3.4% (95% CI = 2.3, 4.5) in Trial 1922. When liraglutide was used after an initial 5% weight reduction from a LCD, liraglutide 3.0 mg treated subjects lost an additional 5.3% (95% CI = 3.8, 6.8) compared to placebo. Although the magnitude of the estimated treatment effects from our preferred approach was attenuated relative to the pre-specified primary analysis, the changes that were observed for liraglutide 3.0 mg relative to placebo were in line with the efficacy benchmarks outlined in the 2007 Draft FDA Guidance.

APPENDIX 5.1 Supportive Material
Definition of obstructive apnea and hypopnea events per study protocol (Section 3.2)

Apnea Rules
Score an apnea when all of the following criteria are met:
1. There is a drop in the peak thermal sensor excursion by ≥90% of baseline
2. The duration of the event lasts at least 10 seconds
3. At least 90% of the event's duration meets the amplitude reduction criteria of apnoea

Hypopnea Rules
Score a hypopnea if all of the following criteria are met:
1. The nasal pressure signal excursions (or those of the alternative hypopnea sensor) drop by ≥30% of baseline
2. The duration of this drop occurs for a period lasting at least 10 seconds
3. There is a ≥4% desaturation from pre-event baseline
4. At least 90% of the event's duration must meet the amplitude reduction of criteria for hypopnea
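The apnea and hypopnea rules above can be expressed as simple predicates. This sketch assumes the signal processing has already produced, for each candidate event, the fractional drop in signal excursion from baseline, the event duration, the oxygen desaturation and the fraction of the event meeting the amplitude criterion. All function and argument names are hypothetical. The AHI helper uses the usual definition of events per hour of sleep.

```python
def is_apnea(drop_fraction: float, duration_s: float, fraction_meeting: float) -> bool:
    """Apnea: >=90% drop in thermal sensor excursion, lasting >=10 s,
    with >=90% of the event duration meeting the amplitude criterion."""
    return drop_fraction >= 0.90 and duration_s >= 10 and fraction_meeting >= 0.90


def is_hypopnea(
    drop_fraction: float,
    duration_s: float,
    desaturation_pct: float,
    fraction_meeting: float,
) -> bool:
    """Hypopnea: >=30% drop in nasal pressure excursion, lasting >=10 s,
    >=4% desaturation, and >=90% of the duration meeting the criterion."""
    return (
        drop_fraction >= 0.30
        and duration_s >= 10
        and desaturation_pct >= 4
        and fraction_meeting >= 0.90
    )


def apnea_hypopnea_index(n_apneas: int, n_hypopneas: int, sleep_hours: float) -> float:
    """AHI: scored apneas plus hypopneas per hour of sleep."""
    return (n_apneas + n_hypopneas) / sleep_hours
```

An event with a 50% drop lasting 12 seconds and a 4% desaturation would be scored as a hypopnea but not as an apnea under these predicates.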

Details of the FDA sensitivity analyses
MI-RD -The imputation was done within groups defined by randomized treatment and the timing (month) of their last on-treatment measurement. In Trial 1839 the visits were grouped by month as follows: 0 to 1, 2 to 3, 4 to 6, 7 to 9, after 10. In Trial 1922 the visits were grouped based on whether the last on-treatment measurement was on or before month 5. For subjects in the FAS the imputation model, fit within each group, included baseline and last on-treatment measurement. Imputation for randomized subjects excluded from the FAS was done as follows. These subjects were first grouped with the subjects that had their last on-treatment measurement during the first time period (Trial 1839: 0 to month 1; Trial 1922: 0 to month 5). In the first step the missing week 56 response was imputed using only their baseline measurement. Next, the distribution of imputed values was centered per subject around their baseline measurement (i.e., MI version of BOCF).
RD-Weighted - Subjects with a week 56 assessment that were not a retrieve dropout were assigned an analysis weight of one. Subjects without a week 56 assessment were assigned an analysis weight of 0. The retrieve dropouts were assigned weights that depended on the time of their last on-treatment observation and randomized treatment. Specifically, the analysis weight assigned to a subject that was a retrieve dropout in group i was (A_i + B_i)/A_i, where A_i is the number of retrieve dropouts in the group and B_i is the number of subjects in the group with the missing endpoint. For Trials 1839 and 1922 the timing used to define the groups was based on the MI-RD analysis (see above). In Trial 1923 the visits were grouped based on whether the last on-treatment measurement was on or before month 4.
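The weight assignment above can be sketched as follows. Each subject is assumed to carry a group label (randomized treatment combined with the timing bin of the last on-treatment observation) and one of three hypothetical status values: "assessed" for a week 56 assessment that is not a retrieve dropout, "retrieved" for a retrieve dropout, and "missing" for no week 56 assessment. The dictionary layout and function name are illustrative only.

```python
from collections import Counter
from typing import Dict, Hashable, List


def rd_weights(subjects: List[Dict[str, Hashable]]) -> List[float]:
    """Analysis weights for the RD-Weighted approach.

    Retrieve dropouts in group i get (A_i + B_i) / A_i, where A_i counts the
    retrieve dropouts and B_i counts subjects with a missing endpoint in that
    group. Assessed subjects get 1 and missing subjects get 0.
    """
    a = Counter(s["group"] for s in subjects if s["status"] == "retrieved")
    b = Counter(s["group"] for s in subjects if s["status"] == "missing")
    weights = []
    for s in subjects:
        if s["status"] == "assessed":
            weights.append(1.0)
        elif s["status"] == "missing":
            weights.append(0.0)
        else:
            g = s["group"]
            weights.append((a[g] + b[g]) / a[g])
    return weights
```

Within a group the retrieve dropout weights sum to A_i + B_i, so the retrieve dropouts stand in for the subjects with a missing endpoint. A group that has missing subjects but no retrieve dropouts is not represented at all under this scheme, which is one reason the grouping by timing has to be coarse.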