Introduction

Charged particle therapy with protons or heavier ions (e.g., 12C) is nowadays considered to be the most advanced form of radiotherapy, offering new opportunities for non-invasive tumor treatments1,2,3,4. In contrast to the conventional radiation therapy with photons, it offers a more favorable dose distribution (the so-called Bragg peak) and is, therefore, better suited for the treatment of deep-seated tumors. Most of the deposited dose would be efficiently concentrated in the target, while the healthy tissue located in the plateau region would receive less damage. In addition, ions heavier than protons offer increased efficacy in terms of the high relative biological effectiveness (RBE) at the end of their range, meaning that for at a given absorbed dose, the biological effects would be 2–3 times higher5. However, in reality, the sharp dose fall-off is a double-edged sword as it makes the success of treatment extremely sensitive to the precise localization of the Bragg peak in the patient: under- or overestimations of the particle range would lead to the underdosage in the tumor and high-dose hotspot in the nearby healthy tissue. The uncertainties affecting the Bragg peak localization typically include the setup (patient positioning) and range (see below) errors, as well as uncertainties in beam delivery and dose calculation.

The range uncertainties in treatment planning come from unexpected variations of tissue density in the beam path (i.e., anatomical changes) or from the inaccuracies in the conversion of the CT Hounsfield units (HU) into the water-equivalent path length (WEPL) of the ions. Overall, they might alter the range of particles on the order of several millimeters. Accounting for the range uncertainty by either expanding the tumor planned volume (PTV) during the non-robust treatment planning procedure or by considering it in robust planning on a clinical target volume (CTV) would in both cases lead to the irradiation of healthy tissue and possibly critical structures in the target proximity, potentially inducing toxicities. For serial organs at risk (OAR), a dose higher than the tolerance threshold to any of its subunits is sufficient to increase the probability of a severe side effect significantly. Thus, one would expect that reduced tumor margins, associated with range uncertainty, significantly improve the treatment related toxicity. In fact, the decrease of head and neck toxicities was already predicted following the in-silico range margin reduction study in proton patients6. With regard to parallel OAR, where the damage would be correlated with a certain volume or the fraction of the organ receiving a threshold dose, the impact of the margin reduction is also expected to play a positive role however the size of the effect would probably strongly depend on the volumes of the OAR and the tumor.

To fully exploit the benefits of a sharp Bragg peak, a large effort is put nowadays into investigating different range verification methods. Among them, positron emission tomography (PET) remains one of the most widely used approaches to monitor dose delivery in clinical practice by exploiting β+-emitting isotopes7. In 12C ion therapy, the contributors to the PET signal are the radioactive fragments of the primary ions, mainly 11C and 10C. The activity peak is then observed upstream of the Bragg peak because these fragments have shorter ranges than the primary projectiles at the same velocity8. Unfortunately, the usefulness of PET in 12C therapy remains marginal and insufficient for a complete range uncertainty elimination for several reasons. First, the half-life of the most abundantly produced radionuclides is too long for instantaneous feedback. Instead, the short-lived radionuclides are produced at a low rate and exhibit a long positron range before annihilation9. Second, the spatial shift in the β+-activity and dose peaks as well as the biological washout requires Monte Carlo (MC) simulations10 to analyze the data. Uncertainties in those calculations together with the low counting rates of β + -emitting fragments currently limit the accuracy of the PET-based range verification method to about 2–5 mm11,12.

One could overcome the above-mentioned difficulties by using radioactive ion beams (RIBs), such as 11C or 10C for simultaneous treatment and online beam verification13. This way, compared to the irradiation with stable ions, all the primaries will potentially contribute to the PET signal, significantly increasing the signal-to-noise ratio14. Shorter half-lives of the RIBs (20.33 min and 19.3 s for 11C and 10C, respectively), would reduce the time of PET signal acquisition and make the impact of the biological washout less significant15. Finally, the activity peaks would match the mean ranges of the primaries and will be correlated with the R80 of the Bragg peaks (SOBP), i.e., the position in the distal fall-off where the dose drops to 80% of the SOBP value16.

Until now, the pre-clinical and clinical studies of RIBs were held back by the difficulties in producing RIBs at sufficiently high intensities. However, the interest in these studies has been recently revived following the new developments at modern accelerator facilities, such as the intensity upgrade of the SIS-18 accelerator at GSI (Darmstadt, Germany) in frames of the FAIR facility construction17. The work on RIBs characterization and clinical application has been resumed in frames of the BARB (Biomedical Applications of Radioactive ion Beams) project18 funded by the European Research Council (ERC) in 2020. There have been already two successful experimental campaigns in 2021 aimed at the RIB characterization and first imaging tests with PET detectors, however, these are not part of this work and will be presented in separate contributions19.

In this study, instead, we attempt to demonstrate the rationale for using RIBs in clinical practice. First, we estimate the potential accuracy of dose visualization and its shift detection with RIB activity data. We conclude that if the measured annihilation map matches the predicted one, the same should be expected for the dose distribution and thus there will be no longer the need for the margins associated with range uncertainties to be applied. Following that, we perform an in-silico treatment planning study to quantitively estimate the potential benefit from 11C treatment planning with reduced range margins. By retrospectively robustly re-optimizing patient plans for the head-and-neck and liver tumors with 12C and 11C beams, we estimate the impact of range margin reduction on serial (optic nerves) and parallel (liver) OAR on the level of treatment planning (i.e., meeting the dose constraints) and, furthermore, in terms of the predicted normal tissue complication probabilities (NTCP)20.

Results

RIB annihilation maps to detect range shifts

Figure 1 shows the results of a single-field dose optimization on a patient CT. The RBE-weighted dose distribution, calculated with TRiP98, is shown in Fig. 1a, while Figs. 1b and 1c show the resulting absorbed dose and time-integrated annihilation maps calculated with FLUKA. In Fig. 1d we include the profiles of the absorbed dose and annihilation distributions along the beam direction, yielded by the three raster files, produced using the default and scaled HLUT. Full distributions, normalized to the respective peak values, as well as the zoom-in of the target region are shown. The positions of the peaks of the annihilation distributions correlate with the positions of the fall-offs of the respective dose distributions, and the distances between them reflect the shifts in the dose distributions caused by either 3.5% or 1% differences in interpreting CTs in terms of the ion path lengths. Notably, the proximal rise of the annihilation distribution, even though not as steep as its distal fall-off, correlates with the location of the frontal edge of the SOBP. This all would ideally allow to spot the deviations in the dose distributions, associated with the range uncertainties, and correct them online.

Figure 1
figure 1

Dose and activity distributions calculated in a patent CT. (a) RBE-weighted dose map calculated with TRiP98, (b) absorbed dose and (c) annihilation maps calculated with FLUKA for a single-field 11C treatment plan optimized in a patient CT. Color bars reflect the respective distributions with red corresponding to the maximum observed values. (d) Normalized beam-eye-view profiles extracted from the dose and annihilation maps for the original treatment plan and those optimized using the scaled HLUT tables (for 3.5% or 1%) to mimic the range uncertainties. Depth is given in the CT coordinates. Blue solid lines: dose profiles, orange dashed lines – annihilation profiles. Left plot is a zoom-in of the distributions in the target region.

Margin reduction with 11C beams—impact on treatment plans

Following the outcomes of the preliminary study described in the previous section, we assumed here the possibility of precise range detection with PET for 11C beams. If the measured annihilation distribution would match the estimated one, this would automatically mean that actual position of the spread-out Bragg peak also matches its prediction, eliminating the further need for using the range-associated margins during the robust plan optimization. With this assumption we performed the robust biological (RBE-weighted) optimization of adenoid cystic carcinoma (ACC) and liver plans setting the expected range uncertainty in the 11C plans to 0%, while for the 12C plans the optimization and dose calculation was done assuming the value of 3.5%. The setup uncertainty of ± 3 mm has kept included for both ions.

An example of the field configuration and organ contours for the ACC plan optimization is shown in Fig. 2a. This information was preserved from the original plans used at GSI for patient treatment, where the nerves sometimes were bordering or overlapping with the CTVs. In these cases, since the priority was given to the target coverage (D95 > 95%), sometimes the nerve(s) sparing could have been compromised.

Figure 2
figure 2

Treatment planning outcome for the adenoid cystic carcinoma patients. (a) Example of the double-field 11C dose distribution and planning contours. Black – tumor CTV; magenta – eyes, white – optic nerves, turquise – brainstem. Color bar represents the dose distribution (red color corresponds to the prescribed CTV dose of 3 Gy(RBE)). (b) Patient-by-patient treatment planning outcome in terms of exceeding the dosimetric constraint for the optic nerves (2.46 Gy(RBE) to 0.03 cm3 of the organ). Each point corresponds to the percentage of treatment uncertainty scenarios when the constraint is met. Orange circles represent 12C plans (3 mm setup margin and 3.5% range margin), blue triangles – 11C plans (3 mm setup only).

Figure 2b summarizes the treatment planning outcomes for both nerves patient by patient, i.e. the percentage of treatment scenarios (out of a total of 21) when the treatment planning constraint (D0.03cm3 > 2.46 Gy(RBE)) was violated to ensure the sufficient target coverage. The worst outcome would be 0% for both nerves, meaning that it is impossible to meet the planning constraint without compromising CTV coverage. Instead, the best outcome (100% for both nerves) would mean that sufficient target coverage will be achieved in all the treatment scenarios without exceeding the recommended doses for any nerve. The use of 11C plans with reduced range margins led to a significant improvement in the treatment plan quality. For only 2 patients out of 15 (in contrast to 8 out of 15 in the case of 12C), the maximal allowed dose was exceeded for both nerves in more than half of the analyzed scenarios.

Figure 3a shows the example of the liver patient plan optimized for a uniform CTV (black contour) dose of 10 Gy(RBE). The liver is depicted with a brown contour. As in the ACC plans, the primary goal was achieving the prescribed target dose. The dose constraints set for the remaining healthy liver (as suggested by21), D700cm3 < 4.2 Gy(RBE) or 3.5 Gy(RBE) per fraction for 4- and 6-fraction schemes, respectively, were met for all the analyzed treatment plans.

Figure 3
figure 3

Treatment outcomes planning for the liver tumor patients. (a) Example of the single-field dose distribution and planning contours. Black – tumor CTV, brown – liver. Color bar represents the dose distribution (red color corresponds to the prescribed CTV dose of 10 Gy(RBE)). (b) Total mean dose to the healthy liver (liver – GTV). (c) Relative volumes of healthy liver, receiving the total dose of at least 5 Gy (V5). (d) Relative volumes of healthy liver, receiving the total dose of at least 30 Gy (V30). Points represent the single patient median values among 21 uncertainty scenario (10 patients in total). Orange color: 12C plans, blue color: 11C plans. Stars represent the result of the Wilcoxon Signed-Rank test (P = 0.002) for all plots.

Box plots in Fig. 3b-d show the distribution of these parameters’ median values from all the patients (i.e., 10 values per group). The outlier points in these plots correspond to the parameters derived for patient 6 with the largest CTV (223 cm3), and the smallest healthy liver volumes (927 cm3). Results of the Wilcoxon Signed-Rank reveal a significant difference (P = 0.002) for the Dmean, V5, and V30 values between 11 and 12C groups suggesting that also for the liver the use of 11C would be beneficial in terms of dose reduction.

Influence of reduced margins on the estimated normal tissue toxicity

Serial organs: optic nerves

Box plots in Fig. 4a depict the results of the NTCP calculations with the LKB model for the ACC patients with the endpoint of optic neuropathy for the right and left optic nerves. Each point represents the median NCTP value for a single patient over the 21 plan uncertainty scenarios. The median NTCP values for the left nerve for all the analyzed patients and scenarios are 2.5% versus 1.1% for 11C and 12C plans, respectively; for the right nerve, these are 3.5% versus 7.6%. The highest worst-case scenario NTCP values observed for 12C plans were 45.6% and 53.2% for the right and left nerves, respectively; in 11C, these values for the same patients dropped to 2.9% and 20.4%, respectively (not depicted here). The worst outcome in terms of NTCP was predicted for the patients where either the nerve was overlapping with the target or ‘wrapped’ by the tumor. Results of the Wilcoxon Signed-Rank reveal a significant difference (P = 0.00006 for the right nerve and P = 0.0003 for the left nerve) between the predicted NTCP values for 11C and 12C plans, suggesting that 11C plans with reduced range margins might be notably safer options compared to conventional 12C plans.

Figure 4
figure 4

Normal tissue complication probability estimations for ACC patients. (a) NTCP (endpoint optic neuropathy) estimated for the right and left optic nerves for each patient using the Lyman-Kutcher-Burman model with parameters from51 for 12C and 11C treatment plans. Single points correspond to the patient-specific median NTCP values. Stars represent the result of the Wilcoxon Signed-Rank test (P = 0.00006 for the right nerve and P = 0.0003 for the left nerve) for both plots. (b) Relative risks (RR) representing the ratio of 11C versus 12C NTCP values for individual patients for the right (dashed bar) and left (solid color bars) optic nerves. RR < 1 means that the 11C plan is beneficial compared to 12C plan in terms of optic system toxicity. Top: nominal scenarios, bottom: worst-case scenarios.

We have additionally calculated the relative risks (RR) as RR = NTCP11C/NTCP12C to quantify the relative NTCP difference between different ion plans for nominal and worst-case scenarios patient-by-patient. Figure 4b shows the RR values for both right (dashed bar) and left (solid color bar) nerves. RR < 1 would mean a safer outcome for 11C plans as compared to 12C plans. Despite the RR being > 1 for three patients in nominal scenarios, in the worst-case scenarios the use of 11C with decreased margins might reduce the relative risk down to less than 0.2 in most cases.

Parallel organs: liver

To estimate liver toxicity using the LKB model, we have used six models with different endpoints. The comparison of outcomes from 12C versus 11C treatment plans is given in Fig. 5. Apart from variations in the severity of the endpoints (from the liver failure (Fig. 5a) and the radiation-induced liver injury (Fig. 5b) to the potentially slightly less severe toxicities such as changes in albumin-bilirubin (ALBI) score, Child–Pugh (CP) score, or enzymatic changes (Fig. 5c-f), the models also use different parameters as reported in Table 2. Of note, the NTCP predictions of the Burman model are quite varying from patient to patient and might reach quite high values; we attribute it to the low value of the volume effect parameter which increases the weight of the high-dose regions.

Figure 5
figure 5

Normal tissue complication probability estimations for liver tumor patients estimated for each patient using the LKB parameters from six different models. (a) Burman model for the endpoint of liver failure, (b) Dawson model for the endpoint of radiation-induced liver injury (RILI) grade > 3, (c) El Naqa model for changes in ALBI and CP scores, (d) El Naqa model for enzymatic changes, (e) Pursley model for changes in CP score, (f) Pursley model for changes in ALBI score. Single points correspond to the patient-specific median NTCP values. Orange and blue bars correspond to the predictions for 12C and 11C plans, respectively. Stars represent the result of the Wilcoxon Signed-Rank test (P = 0.002 for ad and F, P = 0.0039 for e).

Wilcoxon Signed-Rank test reveals a significant difference (P < 0.01) between the NTCP values for 11C and 12C plans predicted by all six models. This suggests that 11C plans with reduced range margins might be a better option also in terms of sparing the parallel critical structures.

Similar to the ACC plan analysis, relative risks were calculated for an individual patient for every model and are presented in Fig. 6 (top – nominal scenarios, bottom – worst-case scenarios). Depending on the model selected for comparison, 11C plans would be either safer (Burman or Dawson models) than 12C ones, or similar (El Naqua and Pursley models) in terms of relative risks. 11C plans are most beneficial according to the Dawson model, where the RR decreases to 0.14 in the worst-case scenario for patient 7. However, one needs to keep in mind that for the Dawson model the absolute NTCP values are the lowest (less than 2%) compared to all the other models presented here.

Figure 6
figure 6

Relative risks representing the ratio of 11C versus 12C NTCP values for individual liver patients for each of the six models considered in the study. Value of RR < 1 means that the 11C plan is beneficial compared to 12C plan in terms of liver sparing. Top: nominal scenarios, bottom: worst-case scenarios.

Discussion

The results presented here show the benefits of the elimination of implicit margins, associated with range uncertainties in robustly optimised treatment plans for two different tumor types, that could be potentially achieved on a clinical level by using 11C beams in combination with the improved PET range verification. In the previous overview of the BARB project18, we have already presented a FLUKA comparison of the activity maps of 12C and 11C ion treatment plans in the simple geometry (sphere in a water tank). In the first part of the current work, we demonstrated that the annihilation map in the patient, irradiated with 11C beams, can theoretically be used to precisely determine the positions of the SOBP fall-offs and thus to recognize the shifts in actual dose distributions of about 1 mm. This would mean that radioactive beams, in particular, 11C studied here, would be an ideal tool for detecting the range changes during patient irradiation, allowing one to apply the necessary plan corrections online or adjust the beam delivery accordingly (or indicating the need of a repeated CT scan in case of large or inconsistent shifts). In turn, the margins to account for the range uncertainty, that are added to the target volume (either explicitly in PTV-based optimization, or implicitly estimated from the value of range uncertainty passed to the system during the robust optimization) can be reduced, or, in theory, eliminated.

While the simulations presented here reflect the best possible scenario, where the whole annihilation signal could be used to generate PET images reflecting the shifts in the dose distributions caused by the range uncertainties, and a perfect scanner resolution, the practical feasibility of range margin complete elimination remains an open question. For example, the signal coming from 11C beams with a half-life of about 20.33 min still might be affected by the washout that might slightly smear the distribution; from this point of view, 10C beams might be more practical. Moreover, even if the look-up tables for the Hounsfield units would correctly reflect the ion’s pathlengths, the precision of the Bragg peak position prediction in the patient would be limited by the CT resolution. Additionally, efficiency of the imaging detectors, the capability of the online image reconstruction software tools and the limited time used to accumulate the signal further limit the best possible achievable image quality. The results presented here should be thus interpreted as a ‘best asymptotic’ approximation of what could be achieved with RIB treatments.

It is important to point out that collecting the annihilation signal from the full plan delivered to the patient will be impractical if one aims at the online range verification since in this case the corrections can be applied only in the following fraction. The best approach to acquire enough signal in the shortest timespan possible to make any necessary plan modifications is yet to be identified. One option could be irradiating the whole target area but only at a small fraction of the target dose. Alternatively, a ‘pilot shot’ irradiating only a selected tumor slice with single energy would be faster since there is no time needed to switch between the different energies; however, there will be an increased risk of missing anatomical changes at higher depths. There is a need of a future in-depth Monte Carlo study to estimate how the range monitoring efficiency will be affected by the uncertainties in positron detection, coupled with the biological washout and the fact that only the limited fraction of the irradiation plan should be delivered for the initial online range detection. This would help propose a new more realistic value of range uncertainty to be used in robust treatment planning with 11C.

Reduced range margins would make treatment planning procedures easier by increasing the chances of meeting the constraints set for the critical structures without compromising the desired target dose coverage. As the ACC plan analysis shows, with the standard margins, more than half of the plans were not acceptable in terms of exceeding the recommended maximal dose to the optic nerves in most of the treatment scenarios, while in the case of reduced margins the number of such ‘problematic’ plans is significantly reduced.

NTCP calculations for the optic nerves suggest the evident benefit of 11C plans with reduced margins for serial organs in target proximity. Apart from the reduction of the median value of NTCP estimated for each patient plan, the spread between the best and the worst-case scenarios was significantly reduced compared to 12C plans. Since the toxicity induction for the serial organ is rather the question of exceeding the certain dose in a point region, the range margin reduction, shrinking the high-dose region already by a millimeter, would play a crucial role in the organ sparing. Thus, the closer the serial OAR to the target structure, the more pronounced toxicity reduction should be expected. This agrees with the study6 aimed at estimating the NTCP changes following the reduction of the range margins in proton therapy. To note, margin reduction in carbon therapy might have a stronger impact on the toxicity outcomes compared to protons, due to the 2–threefold increase of RBE at the distal edge of the SOBP.

Concerning liver tumors, it is difficult to correlate the decrease of NTCP with some specifics of patient geometry and/or field configuration. While all the NTCP models applied here indeed predict the highest toxicities for the patients with bigger tumors and higher ratios of tumor volume to a healthy liver, to our surprise, the most decreased relative risks were estimated for the patients with smaller tumors.

The less pronounced effect of margin reduction for the liver plans compared to the ACC plans might be at least partially explained by the healthy liver volume definition. Treatment plans were optimized for a uniform CTV dose, whereas the healthy liver was defined as the total liver minus GTV, following standard recommendations21. That would mean that for both 11C and 12C plans a small fraction of the healthy liver (CTV – GTV) would always receive the target dose and thus the margin reduction would not have any pronounced impact on the maximal dose value received by the organ.

It is also important to note here, that when comparing liver plans in terms of the NTCP, attention needs to be paid to the differences in the models used. For example, in certain uncertainty scenarios for some patients, the Burman model for liver failure might predict the NTCP values higher than 60%, even though the treatment planning constraints were met. This might be attributed to the low value of the n parameter used in the respective LKB model which de-emphasizes the volumetric effects in the parallel organ and treats it more like a serial one. On the other hand, the Dawson model for the RILI predicts the NTCP values which are practically zero for all the patients, which might make the benefit of the five-time NTCP decrease with 11C beams questionable. Moreover, the models used here consider different endpoints, among which some of them might be considered more severe than others (e.g., liver failure vs. enzymatic changes). That is why we think that including all these models in the analysis would give a more complete picture. All in all, for the liver plans and NTCP models studied here, 11C plans reduce liver toxicity following the statistical analysis with the Wilcoxson test.

To further investigate which tumors will benefit the most from the margin reduction with RIBs, especially those close to the parallel organs, broader patient sets with different tumor sizes and their relative locations need to be analyzed. From the same point of view, considering other parallel organs, e.g., the kidney, would be interesting.

Reduction of target margins that are associated with range uncertainties, enabled by the use of 11C beams, apart from reducing the doses to the adjacent structures by default, might further improve the flexibility and quality of treatment by enabling irradiation from different angles. A recent study22 has shown further brainstem NTCP decrease from using novel proton beam arrangements, enabled by the reduced range margins. As the authors have shown, new beam arrangements might also enable the possibility of redistributing the LET distributions in the target and increasing its mean value. Concerning the 11C beams, the redistribution of the stronger LET gradients might be of further benefit for both sparing the OARs and tackling more radioresistant, e.g., hypoxic target regions23,24,25.

In the present study, we did not account for target motion that would exacerbate the range uncertainty issue. For ACC tumors, this is a fair assumption. Instead, liver patients studied here were originally receiving SBRT with abdominal compression – for the ion fields with sharp dose fall-offs, this mitigation technique would not be sufficient. Future studies should be thus extended to the 4D robust optimization to include additional motion-induced range changes.

Conclusion

Radioactive ion beams, such as 11C, can be ideal candidates to tackle the problem of range uncertainty in particle radiotherapy thanks to their sharp activity distribution that can be monitored using PET. In contrast to the PET signal produced by 12C ions, used nowadays in clinical practice, the signal from 11C would need a shorter acquisition time, and have a higher peak-to-noise ratio and a well-defined peak that can be better correlated with the dose distribution. All of this would allow in vivo beam monitoring in the patient. Improved precision of range monitoring would lead to smaller or eliminated range margins set for the target, which, in turn, would decrease the volume of normal tissue receiving unnecessary high doses of radiation. We have shown here that it is expected to have a significant positive impact on both the treatment planning outcomes (i.e., meeting treatment planning constraints without compromising the target dose coverage) and the predicted post-treatment toxicities (reduction of NTCP for various endpoints) both serial and parallel organs at risk in target proximity.

Materials and methods

Patient data

For this study, we have retrospectively considered consecutive patients with ACC and liver tumors to evaluate the impact of margin reduction on the damage to the serial (optic nerves) and parallel (liver) organs, respectively.

ACC patients (N = 15) analysed here had tumors located mainly in the frontal areas of the skull and were originally treated in 2002–2008 at GSI in frames of the pilot project, implying a 12C ion boost following a course of intensity-modulated radiotherapy in Heidelberg26,27. The study was approved by the ethical committee of the University of Heidelberg back in 1997. Anonymized treatment plans of all patients are stored for research purposes at the GSI Helmholtzzentrum für Schwerionenforschung in Darmstadt, where treatment was actually delivered, according to the German law. Informed consent is waived by the ethical committee of the University of Heidelberg for anonymized plans used for research plans.

The liver patients (N = 10) studied here have originally undergone photon stereotactic body radiation therapy (SBRT) with abdominal compression at the Department of Radiotherapy of the University Federico II of Naples. Patients treatment at the University Federico II is performed under approval of the ethical committee of the University, specifically IRB 222–10. The approval includes use of anonymized treatment plans for research purposes only.

Treatment planning considerations

GSI in-house treatment planning system (TPS) TRiP98 for active beam scanning was used to perform biologically weighted robust treatment plan optimization28,29 on a CTV using 12C and 11C beams for the plans comparison study. While in a PTV-based optimization the margins are explicitly added to the tumor contours, in the robust optimization in TRiP98 the planning input are the expected values of setup and range uncertainties, and the target margins are generated implicitly during the plan optimization. The initial 3D raster scanner grid, estimated from the CTV contours, gets extended to contain all the pencil beam spots resulting from the interplays of possible range over- and underestimations and setup shifts in all anatomical directions (in total there are 21 uncertainty scenarios).

In our study, the treatment planning uncertainties included ± 3.5% (12C) or 0% (11C) range uncertainty and ± 3 mm shifts in all cardinal directions to consider positioning uncertainties.

The magnitudes of uncertainties for 12C were preserved from a previous work on robust optimization29 (where they were adapted from the proton study30) for both sets of patient plans, even though for the liver with more homogeneous structure the range uncertainties might be slightly too conservative31. Target motion was not considered. The planning objective was to achieve at least 95% of the prescribed dose in at least 95% of the CTV volume in at least 95% of treatment scenarios, obeying the OAR constraints listed below. If a compromise could not be achieved, target coverage was favoured over OAR sparing.

ACC. We have preserved the original two-field configuration as well as the OAR contours from the original treatment plans from the pilot project, adjusting them to a 20 × 3 Gy(RBE) fractionation scheme. The planning constraint for the optic nerves was estimated following the European Particle Therapy Network (EPTN) recommendations32 to D0.03cm3 < 2.46 Gy(RBE) per fraction. The α/β ratio of both tumor and normal tissues was assumed to be 2 Gy32,33.

Liver tumors. Depending on the tumor location relative to the OARs, a single- or double-field configuration was chosen for each tumor. Where possible, a fractionation scheme of 4 × 10 Gy(RBE) was used, otherwise, it was changed to 6 × 7 Gy to reduce the dose to the serial OAR in target proximity. Planning constraints were adapted to the fractionation schemes following the recommendations of the American Association of Physicists in Medicine group (AAPM)34. The α/β ratio of the tumor was assumed to be 10 Gy35,36, and 2 Gy for the healthy liver37,38.

The details for each treatment plan are given in Table 1.

Table 1 Adenoid cystic carcinoma (ACC) and liver patient plans characteristics summarizing the numbers and sizes of clinical tumor volumes, fractionation schemes (number and dose per fraction), numbers of fields, and organs at risk considered during the plan optimization.

Base data

Apart from the patient-specific input (CT scans and RT structure contours, prescribed doses, and field configuration), TRiP98 requires a valid physical beam model describing the interaction of an ion with water, the reference medium in radiotherapy. The so-called base data includes laterally integrated depth-dose distributions in water and energy-dependent fragment spectra, necessary for the calculation of biological effects, at representative depths in water39.

For this study, a new set of base data for 11C, which were not previously used in TRiP98, was generated with the FLUKA Monte Carlo code (version 2020.1.10) 40, which was also used for calculating annihilation maps (see next section). To keep consistency between the two programs (FLUKA and TRiP98), energy loss tables for the primaries and all the fragments were extracted from FLUKA and used in TRiP98 together with the new set of base data. Additionally, still for consistency, a new set of base data for the 12C was also specifically generated for this work.

For both ion beams the FLUKA simulations were performed using the recommended default setting for ion beam therapy applications (cf. defaults ‘HADROTHErapy’). For both isotopes, the beam spot was assumed to be Gaussian with a full width at half maximum (FWHM) of 8 mm in the isocenter. The Bragg peak broadening due to the use of a ripple filter was considered by the introduction of a momentum spread (dp/p) of 1% for all energies. Energy dependent fragment spectra and laterally integrated depth dose distributions were calculated in the energy range between 80 and 440 MeV/u, for beams impinging directly from vacuum to water, as in TRiP98 the energy degradation in the nozzle is considered through an adjustable offset value. A more in-depth description of the Monte Carlo calculation of base data for TPS can be found in41.

Monte Carlo simulation of annihilation maps for range shifts detection

To prove whether the information from the 11C β+-decays can be at least theoretically used to precisely detect shifts between the planned and actual dose distributions, we have performed an additional TRiP98-FLUKA study in a patient geometry.

As a first step, we produced a treatment plan for a patient using the look-up tables for converting the HU into the ion’s WEPL (Hounsfield look-up table, or HLUT) extracted from FLUKA. To simplify the output and its interpretation, here the robust optimization was omitted; the plan was optimized for a uniform RBE-weighted dose of 3 Gy delivered with a single field. Then, the optimization procedure was repeated using the same HLUT scaled either by 1% or by 3.5% to emulate the range uncertainties associated with CT interpretation. As a result, three different raster scanner files containing information on energies and numbers of particles were produced. If loaded back into the system for the dose calculation with the default HLUT, only the first file would yield the dose distribution matching the target contours, while the other two would cause partial underdose and hotspots in the nearby healthy tissue.

In the second step, we loaded the patient CT and the raster scanner files in FLUKA to calculate the physical (absorbed) dose and annihilation distributions (as a best possible scenario the integral annihilation signal was scored). In this case the FLUKA built-in material description and HU-WEPL conversion were used for all three scenarios.

A specifically developed ‘source.f' user routine, similar to the one described in41 was used to read the TRiP98 raster scanner files, specifying the sequence of the beam delivery, in FLUKA. This routine allows the possibility of full Monte Carlo simulation of TPS-optimized plans while using the same beam settings adopted for the base data calculations. In addition to the calculation of the annihilation maps in a realistic treatment planning scenario, this routine was used for consistency check between the MC calculations and the expected dose deposition distributions obtained with TRiP98.

As a result, we have produced three sets of dose and annihilation maps that were visualized and analyzed.

NTCP evaluation

All dose maps produced following ACC and liver patient plans optimization were first converted from Voxelplan (native TRiP98 format) into DICOM RT format using 3D Slicer42 and then into Matlab-readable format (MathWorks, Natick, MA, USA) for further analysis. Next, each dose map has been voxelwise converted into 2-Gy equivalent dose (EQD2) by an in-house developed function for Matlab 43,44. For the ocular structures in the ACC plans, the linear-quadratic model45 was used with α/β = 2 Gy33. Instead, for the liver plans where high doses per fraction were planned (> 6 Gy per fraction), the EQD2 conversion was performed under linear-quadratic-linear condition46, using the parameters α/β = 2.5 Gy, γ/α = 5 Gy, dt = 5 Gy as reported in47,48.

NTCP values were estimated using the Lyman-Kutcher-Burman (LKB) model49,50 with organ- and toxicity-specific parameters from the models reported in literature. For the optic nerves, only one model for the optic neuropathy endpoint was applied. Instead, for the healthy liver (liver minus GTV), several models related to endpoints of different severities exist and were included in the analysis. The models and their key parameters are summarized in Table 2.

Table 2 Summary of the LKB NTCP model parameters used in the study. TD50 is the value of the uniform dose given to the entire organ surface corresponding to the 50% probability to induce toxicity; m is inversely proportional to the slope of the dose–response curve; and n (or a) accounts for the volume effect.

As pointed out by48,52,53, the mean total (multiplied by the number of fractions) dose Dmean, as well as the V5 and V30 representing the fractions of the healthy liver receiving the total dose of more than 5 or 30 Gy, respectively, can be used as predictive parameters for estimating the hepatic toxicity.

Statistical analysis

Wilcoxon signed rank test with a threshold significance level of 0.05 was used to study the differences between the 12C and 11C treatment planning and NTCP modelling outcomes and was performed with GraphPad Prism Software V9.4.0. The test results are represented in the respective plots using asterisks as follows: ****P < 0.0001—extremely significant; 0.0001 < ***P < 0.001—extremely significant; 0.001 < **P < 0.01—very significant; 0.01 < *P < 0.05 -significant, P ≥ 0.05—not significant (ns).