Abstract
Our purpose was to evaluate an imaging parameter–response relationship between the extent of tumor hypoxia quantified by dynamic 18F-fluoromisonidazole (18F-FMISO) PET/CT and the risk of relapse after radiotherapy in patients with head and neck cancer. Methods: Before a prospective cohort of 25 head and neck cancer patients started radiotherapy, they were examined with dynamic 18F-FMISO PET/CT 0–240 min after tracer injection. 18F-FMISO image parameters, including a hypoxia metric, MFMISO, derived from pharmacokinetic modeling of dynamic 18F-FMISO and maximum tumor-to-muscle ratio (TMRmax) at 4 h after injection, gross tumor volume (GTV), relative hypoxic volume based on MFMISO, and a logistic regression model combining GTV and TMRmax, were assessed and compared with a previous training cohort (n = 15). Dynamic 18F-FMISO was used to validate a tumor control probability model based on MFMISO. The prognostic potential with respect to local control of all potential parameters was validated using the concordance index for univariate Cox regression models determined from the training cohort, in addition to Kaplan–Meier analysis including the log-rank test. Results: The tumor control probability model was confirmed, indicating that dynamic 18F-FMISO allows stratification of patients into different risk groups according to radiotherapy outcome. In this study, MFMISO was the only parameter that was confirmed as prognostic in the independent validation cohort (concordance index, 0.71; P = 0.004). All other investigated parameters, such as TMRmax, GTV, relative hypoxic volume, and the combination of GTV and TMRmax, were not able to stratify patient groups according to outcome in this validation cohort (P = not statistically significant). Conclusion: In this study, the relationship between MFMISO and the risk of relapse was prospectively validated. 
The data support further evaluation and external validation of dynamic 18F-FMISO PET/CT as a promising method for patient stratification and hypoxia-based radiotherapy personalization, including dose painting.
Tumor hypoxia is a major cause of resistance to radiotherapy and to other treatments, such as chemotherapy, and consequently leads to a poor outcome (1–4). Several studies have shown that in locally advanced primary head and neck cancer (HNC), tumor hypoxia is associated with a poor response to radiotherapy (5–9). Consequently, strategies to overcome hypoxia-induced treatment resistance, such as increasing the radiation dose to the whole tumor or to tumor subvolumes—that is, dose painting—have been proposed (10–12).
However, robust and accurate selection of patients for hypoxia-based radiotherapy interventions is crucial. Different methods of hypoxia detection have been proposed, such as hypoxia gene classifiers (9,13), biopsy- or blood-based biomarkers (14,15), and noninvasive PET using dedicated hypoxia tracers such as 18F-fluoromisonidazole (18F-FMISO) (5,7,8,16–20), 18F-fluoroazomycin-arabinoside (2,6), or 18F-flortanidazole (21,22). Because the noninvasive, 3-dimensional measurement of tumor hypoxia is complex and depends sensitively on the image analysis approach, dynamic hypoxia PET imaging has been proposed by several groups to obtain a measure of tumor hypoxia quantitatively from the kinetics of tracer uptake (23–29).
On the basis of earlier findings about the prognostic value of 18F-FMISO PET (7,18,29,30), a randomized study to investigate the effectiveness of hypoxia dose painting in HNC was initiated by our institution (NCT 02352792). The results of a planned interim analysis were published, indicating the clinical feasibility of hypoxia dose painting (8). Importantly, dynamic 18F-FMISO PET was shown to stratify patients into 2 groups with different risks of locoregional failure. However, predictive biomarkers, including imaging parameters to modify radiotherapy, require independent prospective validation before implementation in the clinic (18). The aim of the present study was to compare and validate hypoxia imaging parameters derived from a previous training cohort (29,30) with data from an independent prospective cohort (8).
MATERIALS AND METHODS
Study Design and Patients
This study compared 2 groups of patients: a training cohort consisting of 15 HNC patients who underwent dynamic 18F-FMISO PET imaging and radiotherapy between 2003 and 2006, and a prospective validation cohort with 25 patients recruited from 2009 to 2012 by our center. The study was designed as shown in Figure 1 and was performed according to the TRIPOD statement (31).
The patient and tumor characteristics of the training cohort were reported previously (28–30). Patients were recruited into the validation cohort in a randomized phase II trial testing the efficacy of hypoxia imaging–based dose painting. The interim analysis has been published (8). The trial was approved by the local ethical committee and by the expert panel of the German Society of Radiation Oncology and was registered at www.clinicaltrials.gov (NCT 02352792). Written informed consent was obtained from all participants. Three patients received significant dose escalation, that is, 77 Gy in a volume of more than 5 cm3, and were excluded from the analysis to guarantee comparable radiation dose levels.
Imaging and Image Analysis
In the training and validation cohorts, all patients were examined before the start of radiotherapy with 18F-FDG PET for staging purposes only and with dynamic 18F-FMISO PET followed by radiotherapy-planning CT. In the validation cohort, dynamic 18F-FMISO PET/CT was performed on a Biograph mCT (Siemens Healthineers) in radiotherapy patient position using thermoplastic head and shoulder masks, neck support, and a flat table top. The protocol consisted of a list-mode acquisition during the first 40 min after tracer injection (framing: 12 × 10 s, 8 × 15 s, 11 × 60 s, and 5 × 300 s) followed by 2 static acquisitions 2 and 4 h after injection. PET data were reconstructed using 3-dimensional ordered-subset expectation maximization with 4 iterations and 8 subsets, with 200 × 200 voxels per slice and a voxel size of 4.07 × 4.07 × 5 mm. Corresponding CT images were acquired with 120 kVp, a 5-mm slice thickness, and an in-plane voxel size of 1.52 × 1.52 mm.
18F-FMISO PET/CT data were rigidly registered to the planning CT scan. Tumor volumes for radiotherapy were manually defined by an experienced radiation oncologist on the basis of the planning CT scan. These volumes served for PET image analysis. First, a maximum tumor-to-muscle ratio (TMRmax) of tracer uptake in the tumor volume was derived from static 18F-FMISO PET data acquired 4 h after injection. Second, the full dynamic PET series was included in a voxel-based kinetic analysis in the tumor area using a 2-compartment model optimized for hypoxia PET data analysis, as previously described (28). Using this approach, for each patient a hypoxia metric, MFMISO, was calculated from voxel-based parameters on tissue perfusion and tracer retention in the gross tumor volume (GTV). MFMISO defines the overall hypoxia level of the tumor and was derived from pharmacokinetic analysis of dynamic 18F-FMISO PET data (28) according to Equation 1.
Here, the two voxel-wise terms in Equation 1 define the local hypoxia and perfusion status of voxel i, respectively, and are assessed from a voxel-based fit of a modified 2-compartment model to the dynamic 18F-FMISO PET data. The parameters A, b, and c are fit parameters of the tumor control probability (TCP) model, with values derived in the original study. Further details can be found in previous publications (28–30).
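The framing scheme stated for the list-mode acquisition can be checked with a few lines of Python. This is a minimal sketch for illustration only: the function and variable names are hypothetical and not part of the study's analysis software, and only the frame counts and durations are taken from the text.

```python
# Frame scheme of the dynamic list-mode acquisition as stated in the text:
# (number of frames, frame duration in seconds)
FRAMING = [(12, 10), (8, 15), (11, 60), (5, 300)]


def frame_edges(framing):
    """Return a list of (start_s, end_s) tuples, one per frame."""
    edges = []
    t = 0
    for count, duration in framing:
        for _ in range(count):
            edges.append((t, t + duration))
            t += duration
    return edges


edges = frame_edges(FRAMING)
n_frames = len(edges)                         # number of dynamic frames
total_minutes = edges[-1][1] / 60             # total dynamic scan length
mid_times = [(a + b) / 2 for a, b in edges]   # frame mid-times in seconds
```

The frame durations sum to the 40 min given for the dynamic phase. Frame mid-times such as `mid_times` are the kind of time axis a voxel-wise kinetic fit would typically use, although the text does not specify this detail.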
TCP Model
Classic TCP models in radiotherapy are dose–effect relationships that link the radiation dose D with the expected outcome of a patient in a cohort. Because several previously published studies report hypoxia PET information to have a prognostic value for a given dose D, we hypothesize that increased levels of hypoxia in a tumor subregion may cause a higher level of radiation resistance and therefore counteract the dose effect. In a previous study, we developed an imaging–response relationship in the form of a TCP model that relates tumor hypoxia measured with dynamic 18F-FMISO PET to a continuous outcome variable (30).
On the basis of dynamic 18F-FMISO PET data acquired earlier for a training group of 15 HNC patients (28,29), the TCP model was defined by Equations 2 and 3.
Here, α refers to the mean radiation sensitivity of the tumor tissue in Gy−1 and D to the mean radiation dose given to the GTV in Gy. ρ and n denote the mean number of cells per voxel and the number of tumor voxels, respectively.
We assume that for a constant radiation dose D = D0, TCP depends mainly on the level of tumor hypoxia. This leads to the formulation given in Equation 4.
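For orientation, the classic dose–effect form that such models build on can be sketched in Python. This is the textbook Poisson TCP expression, exp(−ρ·n·exp(−α·D)), and not the study's Equations 2 to 4, which include the hypoxia metric. The numeric values used are purely illustrative assumptions, and lowering α is used here only as a stand-in for hypoxia-induced radiation resistance.

```python
import math


def poisson_tcp(alpha, dose, rho, n_voxels):
    """Textbook Poisson TCP: exp(-N0 * exp(-alpha * D)) with N0 = rho * n.

    alpha    : mean radiation sensitivity in 1/Gy
    dose     : mean radiation dose D in Gy
    rho      : mean number of cells per voxel
    n_voxels : number of tumor voxels
    """
    expected_survivors = rho * n_voxels * math.exp(-alpha * dose)
    return math.exp(-expected_survivors)


# Illustrative (hypothetical) parameter values, not taken from the study.
tcp_reference = poisson_tcp(alpha=0.30, dose=70.0, rho=1e6, n_voxels=1000)
tcp_resistant = poisson_tcp(alpha=0.25, dose=70.0, rho=1e6, n_voxels=1000)
tcp_escalated = poisson_tcp(alpha=0.30, dose=77.0, rho=1e6, n_voxels=1000)
```

In this simplified picture, a lower effective radiosensitivity reduces TCP at fixed dose, whereas a higher dose raises it, which is the qualitative behavior the hypothesis in this section relies on.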
Radiotherapy
In the training cohort, all patients were treated with combined radiochemotherapy, prescribing 70, 60, and 54 Gy to the planning tumor volumes of the first, second, and third order, respectively. In the validation cohort, the same dose prescription was used for patients in the standard treatment arm, whereas for patients in the experimental arm a radiation dose escalation of 10% to the hypoxic part of the primary tumor was the goal (8). However, only 3 patients in the validation cohort received the prescribed dose escalation because in the other patients the hypoxic volumes were too small (<5 mL) for delivery of the extra radiation dose. Patients with achieved dose escalations (n = 3) were excluded from this analysis. Thus, all remaining patients (n = 22) in this analysis received 70 Gy.
Statistical Analysis
The imaging characteristics and general patient data of the 2 cohorts were compared using the Mann–Whitney U test.
We tested different parameters extracted from static 18F-FMISO PET acquired 4 h after injection as potential alternatives to dynamic 18F-FMISO PET that would be easier to assess in clinical routine. To stratify patients according to outcome after radiochemotherapy, we created univariable Cox regression models based on the training data for 18F-FMISO TMRmax, size of the GTV, MFMISO determined from dynamic 18F-FMISO PET, and the relative hypoxic volume associated with MFMISO. Outcome data were available as time-to-event data for local control. To evaluate the prognostic performance of the Cox models, the concordance index was calculated. Bootstrap resampling was used to estimate the confidence intervals for the concordance index. Thresholds for stratification of patient subgroups were defined as median values in the training cohort. Those thresholds were then applied to the validation cohort. To compare the potential of the investigated parameters for risk group stratification, Kaplan–Meier analysis, including log-rank tests, was used for the training and validation cohorts and for a merged patient group. Additionally, a logistic regression model was trained for TMRmax and GTV to check for the ability of a combined parameter to predict local failure in HNC.
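The concordance index and its bootstrap confidence interval can be illustrated with a short Python sketch. It is not the R code used in the study. It assumes Harrell's definition of the concordance index, a percentile bootstrap, and a raw score that is monotonically related to the Cox model risk. The toy data are hypothetical and exist only to make the example runnable.

```python
import random


def concordance_index(times, events, scores):
    """Harrell's C: share of comparable pairs that the score orders correctly.

    A pair is comparable when the subject with the shorter follow-up time
    had an event. A higher score is taken to mean a higher risk. Pairs with
    identical times are not counted in this simplified version.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def bootstrap_ci(times, events, scores, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the concordance index."""
    if not any(events):
        raise ValueError("at least one event is required")
    rng = random.Random(seed)
    n = len(times)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            stats.append(concordance_index(
                [times[k] for k in idx],
                [events[k] for k in idx],
                [scores[k] for k in idx],
            ))
        except ValueError:
            continue  # resample without comparable pairs; draw again
    stats.sort()
    lower = stats[int((1 - level) / 2 * n_boot)]
    upper = stats[min(n_boot - 1, int((1 + level) / 2 * n_boot))]
    return lower, upper


# Hypothetical toy data: follow-up in months, event flag (1 = local failure),
# and a hypoxia-like score per patient.
toy_times = [5, 8, 12, 20, 24, 30]
toy_events = [1, 1, 0, 1, 0, 0]
toy_scores = [9.5, 9.1, 8.0, 7.7, 7.9, 7.5]

c_index = concordance_index(toy_times, toy_events, toy_scores)
ci_lower, ci_upper = bootstrap_ci(toy_times, toy_events, toy_scores)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk. With cohorts as small as those described here, the bootstrap interval is expected to be wide.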
To validate the TCP model, the original TCP model function (Eq. 4) was fitted to all available data from the training and validation cohorts (n = 37). The original model function and the function fitted to the merged dataset were compared using ANOVA for model comparison, testing the null hypothesis that neither model function fits the data better than the other.
All statistical analyses were performed in R (version 3.1.1). To account for multiple testing (5 parameters), Bonferroni adjustment was used. Consequently, P values of less than 0.01 were considered statistically significant. When there were statistically significant differences in local control rates, hazard ratios (HRs) were calculated to estimate the risk ratio of the 2 groups.
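The significance threshold follows from simple arithmetic, sketched here in Python. The family-wise level of 0.05 is an assumption inferred from the stated threshold of 0.01 for 5 parameters; the two P values are those reported for the validation cohort in this article.

```python
def bonferroni_threshold(family_alpha, n_tests):
    """Per-test significance threshold under Bonferroni adjustment."""
    return family_alpha / n_tests


# Assumed family-wise level of 0.05 across the 5 tested parameters.
threshold = bonferroni_threshold(0.05, 5)

# P values reported for the validation cohort.
reported_p = {"MFMISO": 0.004, "logistic model (TMRmax + GTV)": 0.1}
significant = {name: p < threshold for name, p in reported_p.items()}
```

Under this threshold, MFMISO remains significant in the validation cohort, whereas the combined logistic regression model does not.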
RESULTS
Comparison of the 2 patient cohorts showed a comparable median patient age, whereas median 18F-FMISO TMRmax was 2.47 in the training cohort and 1.80 in the validation cohort (P = 0.0002). Similarly, differences between the 2 cohorts were observed in terms of tumor volume, with the median GTV being 114.7 cm3 in the training group and 74.0 cm3 in the validation group (P = 0.027). MFMISO as assessed from dynamic 18F-FMISO PET data resulted in comparable median values of 8.36 and 8.01 for the training and validation cohorts, respectively (P = 0.134). In contrast, the relative hypoxic volumes derived from MFMISO differed significantly between the 2 groups, with median values of 15.3% and 0.9%, respectively (P = 0.0008). Further details on the 2 patient cohorts are presented in Table 1.
To establish a dedicated imaging–response relationship for dynamic 18F-FMISO PET, we had earlier defined a TCP model based on the training cohort to link expected radiotherapy outcome to MFMISO. The TCP function defining this relationship was validated by a fit of the model to the merged patient groups. Here, the parameter A was fitted to the merged data (P = 0.0003), whereas the value based on the training cohort only was confirmed (P = 0.0183). Furthermore, the ANOVA for model comparison resulted in a nonsignificant P value of 0.11, so the null hypothesis that neither model fits the data better than the other could not be rejected. Consequently, these results confirm the initially defined TCP model based on dynamic 18F-FMISO PET. Figure 2 presents the TCP curve as a function of MFMISO based on the initial training dataset only and on the merged data groups, in addition to the observed clinical outcomes.
Among all univariable models trained for local control prediction, only MFMISO showed prognostic potential in the training group (concordance index, 0.77; P = 0.001; HR, 17.0). None of the other investigated single parameters was able to stratify patients into risk groups associated with local control. However, a logistic regression model trained for TMRmax and GTV yielded a concordance index of 0.73 (P = 0.010; HR, 2.4) in the training cohort. Of those 2 models that were associated with local control in the training cohort, only MFMISO was confirmed as a prognostic parameter in the validation cohort: the threshold defined for MFMISO in the training cohort also stratified patients according to outcome in the independent validation cohort (P = 0.004; HR, 6.7). In contrast, the 2-parameter logistic regression model could not be confirmed in the validation cohort (P = 0.1). Interestingly, relative hypoxic volume was able to stratify the validation cohort according to outcome (P = 0.001) even though, in the training cohort, no significant prognostic potential had been observed. When the thresholds were applied to the overall merged cohort, significant patient stratification was likewise obtained for the logistic regression model trained for TMRmax and GTV (P = 0.001) and for MFMISO (P < 0.001; HR, 9.4). However, only MFMISO derived from dynamic 18F-FMISO PET was identified as a prognostic parameter in the training cohort and validated in the independent validation cohort. Table 2 presents the detailed analysis of the prognostic potential of the investigated variables in terms of the concordance index, the HR, and the P value of the Kaplan–Meier analysis for the training, validation, and merged cohorts, respectively. Figure 3 displays the Kaplan–Meier curves for local control of patients stratified according to MFMISO in the training, validation, and merged patient cohorts.
DISCUSSION
In this study, a previously proposed TCP model using hypoxia imaging was confirmed using a prospective independent cohort of patients. MFMISO derived from dynamic 18F-FMISO PET/CT was validated as a prognostic parameter for locoregional relapse after radiotherapy in HNC patients. In addition, hypoxia quantification based on dynamic 18F-FMISO was shown to be more robust than simple, static measures, as only MFMISO derived from dynamic 18F-FMISO PET was identified as a prognostic parameter in the training cohort and validated in the independent validation cohort. Hence, dynamic 18F-FMISO PET data may in future be used as a valuable tool for functional imaging–based radiotherapy interventions in HNC. Hypoxia dose painting might not be suitable for all tumor types, as shown in a recently published trial on lung cancer (29).
The present study compared an independent validation cohort with data from an earlier training cohort. The detailed analysis showed significant differences between the 2 patient cohorts. Several parameters, such as TMRmax, GTV, and relative hypoxic volume, were lower in the validation cohort. This finding hints at a less hypoxic population in the validation group, as is corroborated by a lower number of observed local failures. Inherently, a difference in the overall hypoxia status of the 2 groups is a challenge for image biomarker validation. This issue may be a consequence of the low patient numbers in the 2 groups. Another reason for differences in PET data may be that the training cohort was examined using a different PET scanner (Advance; GE Healthcare), which differed in overall performance in terms of hardware efficiency and image reconstruction, as well as in injected tracer activities. In line with our observation, biomarker studies recruiting patients over long periods are susceptible to potential cohort effects, as was shown in a recently published study on 18F-FMISO PET in HNC patients (18). This susceptibility emphasizes the need for robust and validated predictive parameters. Our TCP model appears to fulfil this requirement, as it correctly predicted a lower number of local recurrences due to the lower hypoxic status in the validation cohort. On the other hand, additional knowledge may be acquired in the field during the validation phase of a model. This knowledge might lead to revision of the model itself, as might some recent findings on hypoxia imaging using functional MRI instead of PET (32).
The presented study is limited by the low number of patients in both the training and the validation phases. Furthermore, both steps were performed at a single center. A next phase of model validation would require independent, external validation.
A potentially considerable confounder of our analysis is that the human papillomavirus status of the training cohort was unknown. Human papillomavirus status is an established substantial prognostic factor for response to radiochemotherapy in HNC patients (33).
In addition, several aspects related to the methodology of hypoxia imaging require further discussion. TMRmax was investigated as one of the analysis parameters to describe static 18F-FMISO PET data. This parameter may be subject to variation due to the noisy nature of PET data, which might be partially compensated by using TMRpeak, which has been shown to be more robust for static data analysis (34). Several test–retest studies have investigated the reproducibility of hypoxia PET imaging and found this functional imaging modality to have discrepant but mainly good repeatability and spatial stability (35–37).
In contrast to other studies (7,38), our results did not identify TMRmax as a prognostic biomarker. The small sample size during training may be a reason. However, the fact that dynamic 18F-FMISO was identified as a powerful biomarker for local control in HNC indicates the robustness of kinetic hypoxia PET information.
In this study, we could show that MFMISO derived from dynamic 18F-FMISO PET is prognostic for local control in HNC radiotherapy and thus a robust parameter to stratify patients for individualized radiotherapy approaches in the future. This finding confirms the results presented recently by other groups, which hypothesized that dynamic hypoxia PET data were more robust and reproducible in terms of hypoxia quantification in HNC (24,25). Dynamic data acquisition in hypoxia PET allows for the time-dependent monitoring of tracer uptake and diffusion and is therefore a robust tool for hypoxia quantification in tumors. Nevertheless, dynamic PET scanning is demanding of the patients and challenging in terms of scanner time and data analysis. To reduce examination times for patients and required scanner time, Grkovski et al. proposed dedicated methods for scan time reduction while maintaining the most important features of dynamic 18F-FMISO PET for hypoxia quantification (26). However, dynamic PET has been shown to be reproducible with regard to the noise level of the imaging data (27). In contrast, static hypoxia parameters such as TMRmax or TMRpeak have been shown to be associated with only limited reproducibility due to subjective muscle activity definition for normalization (39). Grkovski et al. (25) reported a disagreement between visual hypoxia assessment on static scans and pharmacokinetic modeling of dynamic data in approximately 20% of cases. Such a disagreement may directly affect patient management in interventional trials.
A recent study identified the residual hypoxic volume after the first 2 wk of radiotherapy as a prognostic parameter to be used for future radiotherapy interventions to overcome hypoxia-induced radiation resistance (18). However, repeated 18F-FMISO PET/CT examinations before and at a second time during radiotherapy also seem to be highly challenging from a logistic point of view and from the patient’s perspective. We hypothesize that the information from dynamic 18F-FMISO PET acquired before the start of radiotherapy might be similar. Dynamic hypoxia PET allows one to assess the perfusion status of a tissue region, as well as active retention of hypoxia tracer in viable tumor areas (28,40). As such, it may be the ideal tool to predict local reoxygenation during the first weeks of radiotherapy and thus substitute for an examination early during treatment (30).
Our study confirmed that hypoxia assessment with dynamic 18F-FMISO PET can be used as an input variable for a dedicated TCP model linking quantitative hypoxia information with expected outcome after HNC radiotherapy. Such a TCP model is a unique option not only to stratify patients into 2 risk groups for therapy adaptation followed by arbitrary testing of escalated radiation doses but also to assess the patient’s risk for relapse on a continuous scale. Assuming that radiation resistance can be counteracted by higher radiation doses, the implication is that a TCP model can ultimately link the local hypoxia status to a required dose escalation level (12,30). Consequently, the imaging–response relationship directly translates into a dose–response relation, and thus, the TCP model validated here inherently defines a dose prescription function for personalized, hypoxia-modifying radiotherapy interventions (5,8). A clinical implementation of this model therefore not only might allow division of HNC patients into 2 risk groups but also presents a methodology to derive individual dose escalation metrics for each patient from dynamic 18F-FMISO scans. This approach thus allows for personalization of radiotherapy in terms of higher doses where deemed necessary and dose deescalation in nonhypoxic patients to improve their quality of life.
CONCLUSION
In this study, an imaging–response relationship linking MFMISO to risk of local failure after radiotherapy in HNC patients was confirmed. MFMISO, derived from dynamic 18F-FMISO PET, was independently validated as a strong prognostic parameter. This validation supports the further investigation of dynamic 18F-FMISO PET before the start of radiotherapy for personalized hypoxia-based radiotherapy interventions.
DISCLOSURE
This project received funding from the European Research Council (ERC) under the European Union’s Seventh Framework Programme (FP7/2007-2013), ERC starting grant agreement 335367. Daniela Thorwarth and Daniel Zips declare departmental research collaborations with Elekta, Philips, and Siemens. Christian La Fougère has research collaborations with Siemens Healthineers. No other potential conflict of interest relevant to this article was reported.
KEY POINTS
QUESTION: Is there an imaging–response relationship between a hypoxia metric derived from dynamic 18F-FMISO PET and outcome to radiotherapy in HNC patients?
PERTINENT FINDINGS: This independent validation study showed dynamic 18F-FMISO PET to have strong prognostic value for outcome to radiotherapy in HNC patients. The study confirmed a previously proposed imaging–response relationship that links dynamic 18F-FMISO PET to a continuous outcome prediction. A series of potential imaging biomarkers was tested for prognostic value, but only dynamic 18F-FMISO PET was independently validated as a prognostic parameter in HNC radiotherapy.
IMPLICATIONS FOR PATIENT CARE: The results of this study are a key factor for future hypoxia imaging–based radiotherapy interventions such as dose painting.
Footnotes
Published online May 10, 2019.
Received for publication February 21, 2019.
Accepted for publication May 2, 2019.
© 2019 by the Society of Nuclear Medicine and Molecular Imaging.
REFERENCES