Introduction

Although new induction therapies for AML and/or the implementation of targeted therapies have increased response and survival rates of patients with AML or MDS, allogeneic hematopoietic cell transplantation (HCT) remains a valuable curative-intent treatment but carries a significant risk of morbidity and mortality. Owing to improvements in supportive care, optimization of transplant procedures and the development of effective treatments for the prevention and treatment of transplant complications, an increasing number of older adults are receiving allogeneic HCT [1]. In many cases, death in remission after transplantation is primarily attributed to infections and graft-versus-host disease (GvHD) but also includes other complications of allo-HCT [2]. To reduce the treatment risk for patients who are not candidates for traditionally used myeloablative conditioning (MAC) regimens and in whom the underlying disease has been well controlled, reduced-intensity conditioning (RIC) or non-myeloablative regimens have been developed [3]. MAC regimens are typically linked to a reduced risk of disease relapse but are accompanied by a higher risk of non-relapse mortality (NRM). Therefore, RIC regimens are of particular importance for patients aged 60 years or older or those with underlying medical conditions, as they may not be suitable candidates for MAC [4,5,6].

Fractionated total body irradiation (TBI; 8 Gy) combined with fludarabine (8GyTBI/Flu) represents an established preparative regimen for AML patients in complete remission (CR). In a randomized study, this regimen showed a significantly lower risk of NRM without a corresponding increase in the risk of relapse, compared to a classical full-toxicity conditioning [7, 8]. More recently, a preparative regimen consisting of treosulfan in combination with fludarabine (Flu/Treo) has emerged as an alternative standard treatment, especially for patients aged ≥ 55 years or those with pre-existing medical conditions. In a large, randomized trial, this regimen has shown significantly higher post-transplant survival rates compared to a long-established busulfan-based RIC [9]. To date, there is no comprehensive clinical study that prospectively evaluates the efficacy and toxicity profiles of 8GyTBI/Flu and Flu/Treo. TBI is associated with complications, some of them long-term, due to radiation-related side effects, including endocrine and gonadal disorders, cardiac complications, and lung fibrosis [10, 11]. This may in part explain the lower NRM rates in patients with AML in CR1 aged > 55 years who underwent conditioning with Flu/Treo compared to 8GyTBI/Flu in an analysis of the European Society for Blood and Marrow Transplantation (EBMT) registry [12]. However, this registry-based analysis is limited by data availability and a relatively short follow-up time, leaving unanswered questions regarding the prognostic impact of measurable residual disease, cytogenetic parameters, and comorbidities. Additionally, comparative data on outcomes of both conditioning regimens in patients with myelodysplastic syndromes (MDS) are not available.
We aimed to compare outcomes in a large cohort of patients with AML (in CR) and MDS who underwent first allo-HCT to determine the true impact of these regimens on transplant outcomes and to evaluate potentially underlying prognostic factors that might influence clinical decision making with respect to selection of the optimal conditioning strategy.

Methods

Data collection

Clinical data were retrospectively extracted from the medical records and electronic patient files. We retrospectively analyzed 311 patients with AML or MDS who received their first allo-HCT after preparatory treatment with either Flu/Treo or 8GyTBI/Flu. Inclusion criteria, treatment modalities and definitions are described in detail in the supplemental material.

Statistical analysis

All outcomes were calculated from the day of transplantation. Surviving patients were censored at the time of the last contact. We conducted propensity score matching (PSM) in a 1:1 ratio between the Flu/Treo and 8GyTBI/Flu treatment groups. PSM and statistical tests are described in detail in the supplemental material.
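The censoring rule described above can be illustrated with a minimal sketch of the Kaplan-Meier product-limit estimator. This is only an illustration under stated assumptions, not the analysis code used for this study: the function name `kaplan_meier`, the toy follow-up times and the event coding (1 = event, 0 = censored at last contact) are all hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up in years, measured from the day of transplantation
    events : 1 if the event (e.g. death) occurred, 0 if the patient was
             censored at the time of last contact
    """
    event_counts = Counter(t for t, e in zip(times, events) if e)
    exit_counts = Counter(times)  # events and censorings both leave the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exit_counts[t]
    return curve


# Hypothetical toy data: five patients, two of them censored.
times = [0.5, 1.0, 1.0, 2.0, 3.0]
events = [1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
```

Censored patients contribute to the number at risk until their last contact and then simply leave the risk set, which is why they lower later denominators without producing a step in the curve.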

Results

Patient characteristics and transplant modalities

In total, we identified 311 patients (215 with AML and 96 with MDS) meeting the inclusion criteria, of whom 207 were treated with Flu/Treo and 104 with 8GyTBI/Flu as conditioning therapy. Baseline characteristics grouped by conditioning therapy for the entire cohort are shown in Table 1. We noted significant differences between the Flu/Treo and 8GyTBI/Flu groups in terms of median age at allo-HCT [64 years (range: 19–76) vs. 47 years (range: 18–69)], proportion of AML patients [59.4% vs. 88.5%, p < 0.001], ECOG PS [ECOG 2-3: 16.0% vs. 6.8%, p = 0.002], proportion of de novo AML [59.3% vs. 87.0%, p < 0.001], complex karyotype in cytogenetics [20.5% vs. 9.9%, p = 0.023] and adverse/intermediate risk AML (ELN 2017) [77.3% vs. 67.4%, p = 0.011]. Further differences in the Flu/Treo group were a higher proportion of AML patients with measurable residual disease (available for 196/215 AML patients) [68.3% vs. 47.8%, p = 0.018], more transplantations from HLA-matched-related donors [24.2% vs. 14.4%, p = 0.046], a higher proportion of HCT-CI Score ≥3 [49.8% vs. 25.0%, p < 0.001] and less frequent application of in vivo T-cell depletion with ATLG (Neovii) [77.3% vs. 87.5%, p = 0.033]. Follow-up time was significantly shorter for the Flu/Treo group (2.7 vs. 3.5 years, p = 0.003). For GvHD prophylaxis, nearly all cases relied on a combination of cyclosporin A and either MTX or MMF. Supplementary Table 1 additionally displays baseline characteristics grouped by the conditioning group and disease.

Table 1 Clinical and transplantation characteristics of all patients by conditioning group before matching.

Key outcomes for unmatched patients

We performed Kaplan-Meier analyses for RFS and OS and calculated cumulative incidences of relapse, NRM and GvHD for the unmatched cohort, as summarized in Supplementary Table 2. For the Flu/Treo-cohort, 1-year and 3-year RFS after allo-HCT were 79% and 63%, respectively, compared with 81% and 72% for the 8GyTBI/Flu-patients (both p = 0.240). Overall survival was not significantly different, with a 3-year OS of 71% for the Flu/Treo cohort and 79% for the 8GyTBI/Flu cohort (p = 0.061) (Supplementary Fig. 1). We observed a higher cumulative incidence of NRM at 1 year for the Flu/Treo group than for the 8GyTBI/Flu group, with 8.4% vs. 4.8% (p = 0.037) [3-year NRM: 16% vs. 7%], while the cumulative relapse incidences at 1 year and 3 years were similar across both groups [1-year: 13% vs. 14%; 3-year: 21% vs. 21%; p = 0.750] (Supplementary Fig. 2). In terms of GvHD incidences, no significant differences were noted for acute GvHD grade II-IV (p = 0.294), grade III-IV (p = 0.454) or chronic GvHD (p = 0.054) (Supplementary Fig. 3).
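For readers less familiar with cumulative incidence estimates, the following is a minimal sketch of a nonparametric cumulative incidence function in the presence of a competing event. It assumes that relapse and NRM are treated as mutually competing events, which is customary but is our assumption here; the function name, the cause coding (0 = censored, 1 = relapse, 2 = NRM) and the toy data are hypothetical and do not reproduce the study's analysis.

```python
RELAPSE, NRM = 1, 2  # hypothetical cause codes; 0 means censored


def cumulative_incidence(times, causes, cause):
    """Return [(t, CIF(t))] for one cause, accounting for competing events."""
    data = sorted(zip(times, causes))
    n = len(data)
    at_risk = n
    event_free = 1.0  # probability of being event-free just before t
    cif = 0.0
    curve = []
    i = 0
    while i < n:
        t = data[i][0]
        j = i
        d_cause = d_any = 0
        # Collect everything that happens at the same time point t.
        while j < n and data[j][0] == t:
            if data[j][1] != 0:
                d_any += 1
            if data[j][1] == cause:
                d_cause += 1
            j += 1
        if d_cause:
            # Increment = P(event-free until t) * cause-specific hazard at t.
            cif += event_free * d_cause / at_risk
            curve.append((t, cif))
        event_free *= 1 - d_any / at_risk
        at_risk -= j - i
        i = j
    return curve


# Hypothetical toy data: five patients, one censored at 3 years.
times = [1, 2, 3, 4, 5]
causes = [NRM, RELAPSE, 0, NRM, RELAPSE]
nrm_curve = cumulative_incidence(times, causes, NRM)
relapse_curve = cumulative_incidence(times, causes, RELAPSE)
```

Unlike one minus a Kaplan-Meier estimate that censors the competing event, this estimator keeps the incidences of relapse and NRM from summing to more than the overall probability of any event.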

Propensity score matching

To address the substantial differences observed in key outcomes and clinical parameters with known prognostic significance between patients treated with Flu/Treo and 8GyTBI/Flu, we conducted propensity score matching (PSM) using age at allo-HCT, sex and underlying disease (AML or MDS) as matching parameters. A total of 106 matched patients were grouped based on the conditioning therapy (PSM-Flu/Treo-cohort, PSM-8GyTBI/Flu-cohort). As presented in Table 2, baseline characteristics were balanced for PSM-Flu/Treo and PSM-8GyTBI/Flu patients with respect to age (57 vs. 55 years, p = 0.203), female patients (49.1% vs. 45.3%, p = 0.846), underlying disease (AML: 83.0% vs. 83.0%, p = 1.000), and HCT-CI score groups (HCT-CI ≥ 3: 41.5% vs. 39.6%, p = 0.179). Of note, the Flu/Treo group showed a significantly higher rate of ECOG scores of 2-3 than the 8GyTBI/Flu group (17.0% vs. 7.6%, p = 0.014). Further disease characteristics were equally distributed. In particular, the proportion of patients with de novo AML was comparable in the PSM-Flu/Treo and PSM-8GyTBI/Flu groups [72.7% vs. 79.5%, p = 0.618], complex karyotype was found in 11.3% vs. 9.8% (p = 1.000), and no significant differences emerged in the distribution of ELN 2017 risk categories for AML (adverse risk: 40.9% vs. 31.8%, p = 0.637) or IPSS-R risk groups for MDS (high/very high risk: 88.9% vs. 66.7%, p = 0.576). Transplant characteristics were also evenly distributed, including MRD status prior to allo-HCT for AML patients (p = 1.000), donor types (p = 0.144) and the use of in vivo T-cell depletion (p = 0.092), as shown in Table 2. Nevertheless, it is noteworthy that the PSM-Flu/Treo cohort had a significantly shorter median time from diagnosis to allo-HCT [3.6 months (range: 1.9–58.9) vs. 4.3 months (range: 1.9–39.7), p = 0.046].
Furthermore, Supplementary Table 3 provides an overview of the clinical characteristics within the PSM cohort, by conditioning regimen and underlying disease. Among all 106 PSM-matched patients, AML patients were older than MDS patients at allo-HCT [median: 56 (27–71) vs. 47 years (19–65)], while the distribution of other clinical characteristics was comparable for both disease groups.
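The pairing step of a 1:1 propensity score match can be sketched as follows. This is a minimal illustration under explicit assumptions: the propensity scores are taken as already estimated (for example by a logistic regression on age, sex and underlying disease), and greedy nearest-neighbour matching without replacement with a caliper of 0.1 is assumed for illustration only. The actual matching algorithm is described in the supplemental material and may differ; all names and numbers below are hypothetical.

```python
def match_one_to_one(group_a, group_b, caliper=0.1):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    group_a, group_b : dicts mapping a patient id to a propensity score
    caliper          : maximum allowed score difference within a pair
                       (0.1 is an illustrative choice, not the study's)
    Returns a list of (id_a, id_b) pairs.
    """
    available = dict(group_b)
    pairs = []
    # Process group A in order of increasing propensity score.
    for a_id, a_score in sorted(group_a.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest still-unmatched partner in group B.
        b_id = min(available, key=lambda b: abs(available[b] - a_score))
        if abs(available[b_id] - a_score) <= caliper:
            pairs.append((a_id, b_id))
            del available[b_id]  # without replacement
    return pairs


# Hypothetical propensity scores for a handful of patients.
tbi = {"T1": 0.30, "T2": 0.55, "T3": 0.90}
treo = {"C1": 0.32, "C2": 0.50, "C3": 0.70, "C4": 0.10}
pairs = match_one_to_one(tbi, treo)
```

In this toy example the third patient remains unmatched because no partner lies within the caliper, which illustrates why a matched cohort is typically smaller than the original one.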

Table 2 Clinical and transplantation characteristics by conditioning groups for matched patients.

Key outcomes for matched patients

With a median follow-up of surviving patients of 3.3 years in the Flu/Treo and 4.1 years in the 8GyTBI/Flu group (p = 0.084), Kaplan-Meier estimates for RFS were similar (Fig. 1, Table 3). In the PSM-Flu/Treo-cohort, RFS was 83%, 78% and 59% at 1, 3 and 5 years after allo-HCT, respectively, as opposed to 77%, 66% and 63% (p = 0.283) in the PSM-8GyTBI/Flu-cohort (Table 3). One-year and 3-year OS were comparable in the PSM-Flu/Treo and 8GyTBI/Flu groups with 90% vs. 87% and 81% vs. 74% (both p = 0.704), respectively (Table 3). There were no statistically significant differences between Flu/Treo and 8GyTBI/Flu in terms of cumulative relapse incidence (3-year: 20% vs. 20%, p = 0.811), while 1-year NRM (1.9% vs. 9.5%, p = 0.029) was lower for the Flu/Treo group (Table 3, Fig. 2). Among matched patients aged ≥ 55 years, NRM was higher for those who underwent 8GyTBI/Flu conditioning, with a 1-year NRM of 14%, while no NRM event occurred in the Flu/Treo-treated patients (p = 0.015). No significant differences were noted between the two conditioning regimens for patients ≥ 55 years of age with respect to OS (p = 0.253), RFS (p = 0.179) and relapse incidence (p = 0.897). Furthermore, propensity score matching was also performed separately for AML patients to account for potential differences between the two diseases. For 40 pair-matched AML patients, 3-year RFS and OS rates in the Flu/Treo group were 71% and 76%, respectively, compared to 66% (p = 0.929) and 76% (p = 0.436) in the 8GyTBI/Flu group. There were no significant differences for 1-year NRM (5.1% vs. 7.5%, p = 0.722) and relapse (3-year: 20% vs. 27%, p = 0.940).

Fig. 1: Kaplan-Meier estimates for matched patients by conditioning groups.

Relapse-free survival (a) and overall survival (b) for propensity score matched (PSM) patients with Flu/Treo and 8GyTBI/Flu conditioning.

Table 3 Univariate outcomes by conditioning groups.

Fig. 2: Cumulative incidences of relapse and NRM for matched patients by conditioning groups.

Cumulative incidences of relapse (a) and non-relapse mortality (b) for propensity score matched (PSM) patients with Flu/Treo and 8GyTBI/Flu conditioning.

Infection was the leading cause of NRM in the PSM-8GyTBI/Flu group (4/7 patients) and the PSM-Flu/Treo group (1/1 patients). Two PSM-8GyTBI/Flu patients died from GvHD-related causes, and one patient died of mesenteric ischemia. Importantly, no secondary malignancies were observed in the PSM cohort. Additionally, no significant differences between PSM-Flu/Treo and PSM-8GyTBI/Flu were observed in the cumulative incidence of acute GvHD II-IV at day 100 (11% vs. 23%, p = 0.112), acute GvHD III-IV at day 100 (5.7% vs. 5.7%, p = 0.749) or chronic GvHD at 3 years (34% vs. 36%, p = 0.979) (Fig. 3). Next, we performed univariable and multivariable Cox regression analysis for RFS and OS as shown in Table 4. After adjusting for conditioning groups, age group (> 60 years), HCT-CI score groups and MRD status, we did not observe any underlying factors associated with the key outcomes. Similarly, we also employed the Cox regression model in the unmatched cohort; however, we were unable to identify factors associated with the key outcomes in the multivariate model (Supplementary Table 4).

Fig. 3: Cumulative incidences of acute and chronic GvHD for matched patients by conditioning groups.

Cumulative incidences of acute GvHD Grade II-IV (a), acute GvHD III-IV (b), and chronic GvHD (c) for propensity score matched (PSM) patients with Flu/Treo and 8GyTBI/Flu conditioning. GvHD Graft-versus-host disease.

Table 4 Univariate and multivariate Cox proportional hazards models for RFS and OS.

Discussion

In our retrospective analysis, patients treated with treosulfan-based or 8 Gy TBI-based conditioning prior to allogeneic HCT showed no significant differences in survival outcomes or cumulative incidences of relapse or acute / chronic GvHD. Recognizing the need to address potential biases resulting from variations in patient characteristics, we used a propensity score matching approach, which enabled us to balance baseline factors such as age and HCT-CI scores. Moreover, given the pivotal role of measurable residual disease (MRD) in AML patient outcomes [13,14,15], it is relevant that MRD status was also balanced between the matched groups. With a sufficient follow-up of 4 years and comprehensive data for disease and patient characteristics, we noted comparable efficacy of both conditioning regimens in all subgroups. In summary, our study results suggest that both regimens, Flu/Treo and 8GyTBI/Flu, are safe and highly effective conditioning therapies prior to allogeneic HCT for patients with AML and MDS.

So far, prospective randomized studies comparing TBI- and non-TBI-based conditioning regimens for adult AML / MDS patients are lacking. Traditionally, TBI-based conditioning regimens for patients with AML have been associated with a survival benefit compared to chemotherapy-only conditioning [4, 8]. With the introduction of intravenous busulfan, patient outcomes improved, and recent large retrospective studies have been inconclusive [16, 17]. However, most of these studies compared TBI (mainly 12 Gy TBI, given in various fractions) and busulfan-based myeloablative conditioning. For AML patients aged 18–60 years, a randomized study revealed that, compared to 12GyTBI/cyclophosphamide, 8GyTBI/Flu is associated with a reduced incidence of NRM, which was particularly evident for individuals between 41 and 60 years of age, without resulting in a higher incidence of relapse [8]. Formally, 8GyTBI-based conditioning is not classified as a reduced-intensity conditioning (RIC) regimen. However, this reduced-toxicity conditioning (RTC) regimen has been shown to be feasible in elderly patients and / or patients not eligible for MAC [7]. Of note, the transplant conditioning intensity (TCI) weighted risk score categorizes the applied Flu/Treo regimen (virtually all patients had a treosulfan dose of 30 g/m²) as a low-intensity regimen with a score of 1.5, while the applied 8 Gy TBI/Flu regimen is positioned within the intermediate risk TCI category with a score of 2.5 [18]. A registry study from the EBMT compared 8GyTBI-based RTC with busulfan-based MAC and showed better overall survival and reduced relapse incidences, particularly for patients aged up to 50 years, whereas for patients aged 50 years or older the use of 8GyTBI/Flu was associated with an increased incidence of NRM [19]. However, the reported NRM rate of 26% at 2 years observed in patients ≥50 years was relatively high and might also reflect different strategies in supportive care, donor selection and levels of center experience [19].

Within the multicenter MC-FludT.14/L trial, 570 AML and MDS patients aged 50–70 years or <50 years with a comorbidity index (HCT-CI) of >2 were randomized to 30 g/m² treosulfan or 6.4 mg/kg of busulfan [9]. Both agents were combined with standard dosages of fludarabine. The estimated EFS and OS at 3 years were significantly better for patients receiving treosulfan-based compared to busulfan-based conditioning (60% vs. 50% and 67% vs. 56%, respectively). While 3-year relapse incidences (26%) were identical between both groups, the cumulative incidence of non-relapse mortality was significantly lower for the treosulfan group (14% vs. 21%) [9]. Our comparative cohort study revealed that both 8 Gy TBI-based and treosulfan-based conditioning might allow a relevant reduction of disease relapse. While the efficacy of 8GyTBI/Flu was comparable to the results reported by Bug et al., the relapse incidence for the Flu/Treo cohort in that analysis was considerably higher, at 35% at 2 years [12]. Of note, the NRM rate of 28% in elderly patients was significantly higher after TBI-based conditioning in this EBMT study and might in part be explained by differences in the fractions of TBI applied and in radiation technique (not reported for the EBMT-cohort) [12]. We believe this finding might be in part attributed to a standardized TBI delivery approach employed at our center utilizing a PRIMUS linear accelerator or a TrueBeam linear accelerator. This methodology is generally associated with relatively low toxicity rates as previously shown [20]. The observed overall toxicity, represented by NRM rates for both conditioning regimens, was generally comparable to major randomized trials [4, 7,8,9]. For Flu/Treo, our findings suggested a potentially more favorable NRM risk compared to the pivotal trial, which could be attributed to differences in age distributions (median age at allo-HCT 56 years vs. 60 years) [9]. Additionally, the large EBMT analysis by Nagler et al. showed comparable NRM results for CR1 AML patients with 8.5% at 5 years (median age 57 years) [21]. On the other hand, the 1-year NRM rates of <10% for 8GyTBI/Flu were in line with those reported in the randomized trial [8, 22].

Interestingly, the cumulative incidence of chronic GvHD in our matched cohort appeared more favorable when compared to a randomized trial, likely due to a relatively high proportion of patients receiving in vivo T-cell depletion [9]. Regarding measurable residual disease at transplantation for AML patients, our study comprehensively assessed the molecular or cytogenetic aberrations detectable at allo-HCT. This single-center approach allowed for a more consistent evaluation in contrast to registry-based multicenter analyses, which may suffer from variations and limited data availability in MRD assessment [12, 21, 23]. While many reports highlight the prognostic relevance of MRD status prior to allo-HCT, we could not identify MRD positivity as an independent prognostic factor in our matched cohort [14, 23]. In line with this finding, the EBMT analysis by Bug et al. failed to retrospectively show a prognostic relevance of MRD status for these conditioning regimens in CR1 AML patients, despite the registry-related limitations of MRD evaluation [12]. However, it has been shown in a large cohort that RIC for AML (in CR) and MDS in patients with detectable molecular alterations, evaluated by a comprehensive NGS panel, was associated with an increased risk of relapse and inferior outcomes as opposed to MAC [6]. Due to the real-world nature of our data and potential inconsistencies in data availability or the depth of molecular analysis, these findings necessitate further research in a larger, more comprehensive setting that also encompasses a wider range of RIC protocols. Notably, we did not identify any prognostic parameters, including AML or MDS risk groups and adverse cytogenetics, that significantly influenced survival outcomes. These results were almost identical to those reported from studies that used busulfan-, TBI- and treosulfan-based reduced-intensity conditioning regimens [8, 9, 12, 22]. The primary cause of treatment failure remained relapse, consistent with the results of randomized trials for both conditioning regimens [8, 9]. Our findings, along with data from Bug et al. and the phase 3 trial, suggest higher NRM in patients ≥ 55 years receiving TBI conditioning, indicating that treosulfan-based conditioning is effective and should be the preferred option [9, 12]. However, further investigation through well-designed clinical trials is warranted. These trials could directly compare TBI-based with treosulfan-based conditioning and explore different treosulfan doses within treosulfan-based regimens as an effective alternative for younger patients.

The present study has limitations. While we employed robust propensity score matching techniques, unknown factors may still have influenced the outcomes. Additionally, the single-center design potentially limits the generalizability of our findings to other populations or clinical settings. Although propensity score matching (PSM) effectively reduced confounding between our initially different patient groups, the resulting relatively small number of patients in each group may have limited our ability to detect statistically significant differences between the TBI-based and non-TBI-based conditioning arms. Furthermore, we cannot completely rule out that outcome data from a subset of our AML patients, amounting to less than 20% of our total cohort, were also included in an EBMT registry analysis; however, our analysis exceeds such registry-based analyses in terms of data depth. Nevertheless, our study, with its long follow-up of more than 4 years and comprehensive data availability, contributes valuable insights into the outcomes of AML and MDS patients undergoing allogeneic HCT with Flu/Treo or 8GyTBI/Flu conditioning.

Our data underline the effectiveness of both Flu/Treo and 8GyTBI/Flu conditioning regimens for patients with AML in CR or MDS, each demonstrating an acceptable toxicity profile. In particular, treosulfan-based conditioning showed a low NRM rate and comparable efficacy when compared to TBI-based conditioning. Detectable MRD for AML patients did not constitute a prognostic factor in our study. The results of our study form the basis for future prospective studies comparing TBI-based and non-TBI-based reduced-toxicity conditioning regimens with an additional focus on disease status prior to transplantation, radiation technique and supportive care, particularly GvHD prophylaxis.