Quantitative Assessments of Tumor Activity in a General Oncologic PET/CT Population: Which Metric Minimizes Tracer Uptake Time Dependence?


In oncologic imaging, SUV changes between scans are critical for treatment response assessment (1). However, SUV depends on uptake time, as many tumors accumulate tracer continuously (2,3). The logistic demands of busy clinical PET services often preclude precise scan-to-scan reproduction of uptake times, reducing the reliability of SUV as an oncologic biomarker (4). The Patlak model, which attempts to ameliorate this shortcoming, assumes that circulating tracer is trapped irreversibly, allowing tracer uptake to be quantified via the net influx rate (Ki) (5,6). Several clinically used PET tracers, including [18F]FDG, approximate this behavior, permitting Patlak modeling of dynamic PET data. Once steady-state conditions are achieved between the blood and tissue compartments, Ki should remain relatively constant, whereas SUV is expected to increase with time. Furthermore, Ki-based metrics are promising prognostic biomarkers for several cancer types, occasionally outperforming SUV-based metrics (7,8).
However, Ki derivation requires direct measurement or estimation of arterial input functions (AIFs) and dynamic acquisitions to generate tissue time-activity curves (5). The required modifications to PET protocols may increase imaging time or introduce motion-related quantitative errors (9). The uptake-time-corrected SUV (cSUV), which involves retrospectively modifying an observed SUV on the basis of actual versus targeted uptake times, is an alternative means of addressing the uptake time dependence of SUVs (10). Furthermore, the uptake-time-corrected tumor-to-blood standardized uptake ratio (cSUR) may allow for Ki estimation without the need for AIFs or dynamic imaging (11).
To our knowledge, no prior studies have assessed the relative temporal stabilities of SUV, SUR, Ki, cSUV, and cSUR in a broad oncologic PET population. Thus, our primary aim was to quantify the intrascan repeatability of these metrics and thereby determine which approach provides the most time-independent assessment of tracer avidity on [18F]FDG PET and DOTATATE PET. An exploratory aim was to determine the ability of cSUR to estimate Ki.

MATERIALS AND METHODS

We enrolled 78 subjects from a pool of consecutive patients scheduled to undergo standard-of-care (SOC) oncologic PET/CT for various indications, using [18F]FDG, [68Ga]Ga-DOTATATE, [64Cu]Cu-DOTATATE, or [18F]piflufolastat (note that [68Ga]Ga-DOTATATE and [64Cu]Cu-DOTATATE are hereafter collectively called DOTATATE, as these scans were analyzed together). These tracers have been reported to satisfy the Patlak model's assumptions (12-14). The 2 [18F]piflufolastat studies were excluded because of insufficient cases for tracer-specific analysis. All imaging occurred at a tertiary-care center between June 2020 and October 2022. Inclusion criteria were being at least 18 y of age, having the ability to provide written informed consent, and self-reporting the ability to tolerate approximately 90 min of near-motionless supine positioning. Study imaging was performed before and after SOC PET/CT using the same tracer dose.

Imaging Protocol
The study imaging protocol (details are available in the supplemental materials at http://jnm.snmjournals.org) is summarized in Figure 1. All patients were imaged on a single Biograph Vision 600 PET/CT scanner (Siemens Healthineers) equipped with commercially available software for direct reconstruction of multiparametric PET images (FlowMotion Multiparametric PET Suite; Siemens Healthineers).

PET Image Reconstruction
Using automated scanner tools, volumes of interest were placed in the descending thoracic aorta on a 6-min dynamic chest acquisition and the subsequent 10 whole-body (WB) passes (15). Per default scanner software settings, AIFs were generated from measured blood activity concentrations via exponential ([18F]FDG) or linear piecewise (DOTATATE) curve fitting. After all WB PET passes were reviewed dynamically for large bulk motion events, early and late SUV and Ki images were reconstructed per manufacturer-recommended parameters (Supplemental Table 1). Each reconstruction used data from three 5-min WB passes with targeted acquisition times of 35-50 min (early) and 75-90 min (late) after injection. Note that the scanner software requires at least 3 WB passes for Patlak analysis. The 3 latest pre-SOC WB passes were selected for the early images, ensuring adequate time to achieve steady-state conditions. Importantly, subjects left the scanner to void immediately before SOC imaging per our standard clinical protocol, precluding automated scanner measurement of post-SOC blood tracer concentrations because of different patient positioning. Consequently, the AIF for the post-SOC Ki reconstructions was automatically derived by the scanner software from extrapolation of the pre-SOC AIF (i.e., no incorporation of measured post-SOC blood tracer concentrations). SUV was based on actual body weight with units of grams per milliliter. Ki had units of milliliters per minute per 100 mL. In contrast to [18F]FDG, intravascular DOTATATE does not enter red blood cells, requiring correction of measured Ki values for the subjects' hematocrit levels (16).

Quantitative Analysis
Tracer-avid lesions deemed to represent sites of viable malignancy on the SOC PET/CT interpretation were selected by one author. In cases of numerous lesions, the largest or most tracer-avid lesions were selected (5 per subject maximum). Each lesion was manually segmented in MIM version 7.1.5 (MIM Software) on 4 PET image sets (Ki-early, SUV-early, Ki-late, SUV-late) to generate volumes of interest, using coregistered CT images for guidance. Maximum and peak values were extracted. Additionally, a cylindric volume of interest (1-cm diameter, 6-cm length) was placed in the descending thoracic aorta (avoiding vessel walls) to extract a mean value for SUR calculation:

$$\text{tumor SUR} = \frac{\text{tumor SUV}}{\text{blood SUV}}$$

SUVmax and SUVpeak were used to calculate maximum SUR (SURmax) and peak SUR (SURpeak), respectively; the SUVmean of blood was used in both cases.
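The two quantities defined above can be sketched in a few lines of Python. This is only an illustration: the function names and the unit choices (kBq/mL, MBq, kg) are assumptions, and the injected dose is assumed to be already decay-corrected to scan time.

```python
def suv_body_weight(activity_kbq_per_ml, injected_dose_mbq, body_weight_kg):
    """Body-weight SUV in g/mL: tissue concentration / (dose / weight).

    Assumes the injected dose is already decay-corrected to scan time.
    """
    dose_kbq = injected_dose_mbq * 1000.0
    weight_g = body_weight_kg * 1000.0
    return activity_kbq_per_ml / (dose_kbq / weight_g)


def tumor_sur(tumor_suv, blood_suv_mean):
    """Tumor-to-blood standardized uptake ratio (dimensionless).

    tumor_suv may be SUVmax or SUVpeak; the blood value is the mean SUV
    of the aortic volume of interest in both cases.
    """
    if blood_suv_mean <= 0:
        raise ValueError("blood SUV must be positive")
    return tumor_suv / blood_suv_mean
```

For instance, a lesion SUVmax of 8.0 g/mL with an aortic SUVmean of 2.0 g/mL gives an SURmax of 4.0.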

Uptake Time Correction
Actual uptake time ranges were extracted for each image set, with the midpoint defining the effective uptake time (e.g., 44.5 min for 37-52 min after injection). cSUV and cSUR were calculated with previously published correction equations (10,11), in which SUV and SUR are the measured values, T0 is the actual uptake time, and Tc is the correction time reference. Tc was set to 60 min, reflecting a commonly targeted uptake time in [18F]FDG and DOTATATE protocols (1). Figure 2 shows the SUV and SUR correction procedure for a representative case. For [18F]FDG, the value of 0.313 for the parameter b was based on a prior study (10). For DOTATATE, we empirically derived a b of 0.63 by determining the value (averaged across all subjects) that best reproduced the observed late SUVmax from the observed early SUVmax. The early and late values were then corrected to 60 min with the cSUV equation.
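As a rough illustration of the correction step, the sketch below assumes a power-law time dependence, cSUV = SUV × (Tc/T0)^(1−b) and cSUR = SUR × (Tc/T0). This functional form is an assumption chosen to be consistent with the symbols defined above (T0, Tc, b) and with the cited correction approach (10,11). The function names, including the closed-form per-lesion estimate of b, are hypothetical and are not the authors' code.

```python
import math


def effective_uptake_time(start_min, end_min):
    """Midpoint of the acquisition window, in minutes after injection."""
    return 0.5 * (start_min + end_min)


def corrected_suv(suv, t_actual_min, b, t_ref_min=60.0):
    """Assumed form: cSUV = SUV * (Tc / T0) ** (1 - b)."""
    return suv * (t_ref_min / t_actual_min) ** (1.0 - b)


def corrected_sur(sur, t_actual_min, t_ref_min=60.0):
    """Assumed form: cSUR = SUR * (Tc / T0)."""
    return sur * (t_ref_min / t_actual_min)


def estimate_b(suv_early, suv_late, t_early_min, t_late_min):
    """Hypothetical per-lesion b that maps the early SUV onto the late SUV
    under the assumed power law; averaging over lesions or subjects is
    left to the caller."""
    ratio = math.log(suv_late / suv_early) / math.log(t_late_min / t_early_min)
    return 1.0 - ratio


# Example: correct an [18F]FDG SUVmax measured at 37-52 min to 60 min.
t0 = effective_uptake_time(37.0, 52.0)          # 44.5 min
csuv = corrected_suv(6.0, t0, b=0.313)          # scaled up toward 60 min
```

Under this assumed form, a measurement taken exactly at Tc is left unchanged, and b = 1 would mean no time dependence of SUV at all.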

Manual Patlak Analysis
To explore apparent temporal variations in Ki, we selected 6 [18F]FDG, 1 [68Ga]Ga-DOTATATE, and 2 [64Cu]Cu-DOTATATE cases with at least 1 lesion exhibiting a large (>20% or <−20%) test-retest percent change (%Δ) in maximum Ki (Ki,max) for further analysis. For all 9 cases, extrapolated AIF curve fits were compared with manually measured blood activity concentrations on the WB passes. Areas under the time-activity curve were compared for extrapolated AIF curve fits versus manual measurements via trapezoidal integration. For 4 cases, full manual Patlak analysis was performed for selected lesions and reference organs (supplemental materials) (17-20).
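A minimal sketch of a manual Patlak analysis is given below, assuming the standard graphical form in which tissue/blood activity is regressed on (integrated blood activity)/blood activity for frames after a chosen equilibration time. The function names, the `t_star_min` parameter, and the use of an ordinary least-squares line are illustrative assumptions; the actual procedure is described in the supplemental materials.

```python
def cumulative_trapezoid(t, y):
    """Running trapezoidal integral of y(t); the first entry is 0."""
    total, out = 0.0, [0.0]
    for i in range(1, len(t)):
        total += 0.5 * (y[i] + y[i - 1]) * (t[i] - t[i - 1])
        out.append(total)
    return out


def fit_line(x, y):
    """Ordinary least-squares slope and intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def patlak_ki(t_min, blood, tissue, t_star_min):
    """Return (Ki, intercept) from the linear part of the Patlak plot.

    t_min must start at injection (t = 0) so the integral covers the
    full input function. Ki is in 1/min (mL/min/mL); multiply by 100
    for mL/min/100 mL.
    """
    auc = cumulative_trapezoid(t_min, blood)
    xs, ys = [], []
    for t, a, cb, ct in zip(t_min, auc, blood, tissue):
        if t >= t_star_min:
            xs.append(a / cb)   # "stretched time"
            ys.append(ct / cb)  # tissue-to-blood ratio
    return fit_line(xs, ys)
```

Because the slope depends on the integrated input function, an extrapolated AIF that underestimates late blood activity shrinks both the denominator and the running integral, which is one way a late Ki can drift away from the early value.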

Statistical Analysis
Statistical analysis was conducted in Prism 9 (GraphPad) and Excel 2016 (Microsoft) by one author with statistician guidance. Participant and scan characteristics were summarized descriptively. Because of the anticipated pharmacokinetic differences, the [18F]FDG cases were analyzed separately from the DOTATATE cases. The 2-tailed Wilcoxon signed-rank test was used for pairwise comparisons of quantitative metrics. Intrascan test-retest changes were computed:

$$\text{test-retest } \Delta = \text{late} - \text{early}$$

To facilitate comparisons across metrics of different magnitudes, intrascan test-retest %Δ was also computed:

$$\text{test-retest } \%\Delta = \frac{\text{late} - \text{early}}{(\text{late} + \text{early})/2}$$

Results were displayed via Bland-Altman plots and box-and-whisker plots (21). The mean (μ) and SD (σ) of the test-retest Δ and test-retest %Δ distributions were determined for each metric. The 95% limits of repeatability were defined as follows:

$$95\%\ \text{limits of repeatability} = \mu \pm 2\sigma$$

Given that near-zero %Δ values could be due to averaging of large negative and positive changes, absolute test-retest %Δ (test-retest |%Δ|) values were also computed. Intraclass correlation coefficients (ICCs) and coefficients of determination (R²) were also used to assess test-retest repeatability and to quantify the accuracy of Ki prediction by other metrics. A P value of less than 0.05 defined statistical significance. More detailed statistical methods are available in the supplemental materials (22).

RESULTS

As expected, SUV and SUR metrics showed large, statistically significant early-to-late increases. In contrast, cSUV and cSUR metrics were similar at early and late time points, though with some small but significant early-to-late changes. For the maximum cSUV (cSUVmax), the early and late values did not differ significantly (median, 8.0 vs. 7.3; P = 0.17). Surprisingly, the Ki metrics exhibited significant early-to-late increases (median, 1.8 vs. 2.3; P < 0.001). The early and late values of each metric were strongly correlated (R², 0.90-0.96). However, the ICCs showed substantially better agreement between early and late values for Ki, cSUV, and cSUR metrics (range, 0.91-0.97) than for SUV and SUR metrics (range, 0.27-0.75). For DOTATATE, the results were similar, except that the Ki metrics exhibited significant (or nearly significant) early-to-late decreases.
In the Bland-Altman analysis, cSUVmax and maximum cSUR (cSURmax) showed the least bias between early and late values, with mean test-retest %Δ values of −6% and 7%, respectively, compared with 11% for Ki,max. In contrast, the mean test-retest %Δ values for SUVmax and SURmax were 47% and 81%, respectively, indicating large early-to-late increases. Regarding the magnitude of deviation from perfect repeatability (i.e., test-retest |%Δ| = 0), the test-retest |%Δ| of Ki,max (median, 13%) was similar to those of cSUVmax (median, 12%; P = 0.90) and cSURmax (median, 13%; P = 0.67) but significantly less than those of SUVmax (median, 48%; P < 0.001) and SURmax (median, 81%; P < 0.001). The test-retest |%Δ| of the peak Ki (Ki,peak) (median, 15%) was significantly lower than that of all other relevant metrics except for the peak cSUR (cSURpeak) (median, 13%; P = 0.36). For DOTATATE, the results were similar to those of [18F]FDG for the Ki,max analysis, though the median test-retest |%Δ| of Ki,peak was similar to that of SUVpeak (rather than cSURpeak).
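The test-retest quantities defined above can be sketched as follows. The function names are illustrative, the percent change is scaled by 100 so that it reads as a percentage, and the sample (n − 1) SD is an assumption, since the text does not say which SD estimator was used.

```python
from statistics import mean, stdev


def retest_delta(early, late):
    """Test-retest change: late minus early."""
    return late - early


def retest_pct_delta(early, late):
    """Change relative to the early/late average, expressed in percent."""
    return 100.0 * (late - early) / ((late + early) / 2.0)


def limits_of_repeatability(values):
    """Mean +/- 2 SD of a list of test-retest changes (sample SD assumed)."""
    m, s = mean(values), stdev(values)
    return m - 2.0 * s, m + 2.0 * s


# Why |%delta| is reported too: opposite changes cancel in the mean.
pct = [retest_pct_delta(10.0, 12.0), retest_pct_delta(12.0, 10.0)]
mean_signed = mean(pct)                      # 0: looks perfectly repeatable
mean_absolute = mean(abs(p) for p in pct)    # about 18: it is not
```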
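The gap between a high R² and a low ICC for SUV and SUR follows from what each statistic measures: R² rewards any linear relationship, whereas an agreement-type ICC penalizes a systematic early-to-late shift. The sketch below illustrates this with a two-way, absolute-agreement, single-measurement ICC. The text does not state which ICC form was used, so this choice and the function names are assumptions.

```python
def pearson_r2(x, y):
    """Squared Pearson correlation between paired lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def icc_absolute_agreement(early, late):
    """Two-way, absolute-agreement, single-measure ICC for k = 2 time points."""
    n, k = len(early), 2
    grand = (sum(early) + sum(late)) / (n * k)
    row_means = [(e + l) / 2.0 for e, l in zip(early, late)]
    col_means = [sum(early) / n, sum(late) / n]
    ss_rows = k * sum((r - grand) ** 2 for r in row_means)
    ss_cols = n * sum((c - grand) ** 2 for c in col_means)
    ss_total = sum((v - grand) ** 2 for v in list(early) + list(late))
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + (k / n) * (ms_cols - ms_err)
    )


# Toy data: late values are exactly double the early values.
early = [1.0, 2.0, 3.0, 4.0, 5.0]
late = [2.0, 4.0, 6.0, 8.0, 10.0]
r2 = pearson_r2(early, late)               # perfect linear relationship
icc = icc_absolute_agreement(early, late)  # well below 1: values disagree
```

In this toy case the correlation is perfect while the agreement is poor, mirroring the pattern reported for the uncorrected SUV and SUR metrics.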

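Prediction of Ki by cSUR
Supplemental Figures 7 and 8 ([18F]FDG) and Supplemental Figures 9 and 10 (DOTATATE) show correlation results for Ki,max versus SUVmax, SURmax, cSUVmax, and cSURmax.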

Manual Patlak Analysis
Supplemental Figures 11 and 12 show Ki,max test-retest %Δ values for each subject's lesions for [18F]FDG and DOTATATE, respectively. Supplemental Tables 5 and 6 present manual Patlak analyses for several [18F]FDG and DOTATATE subjects, respectively. Supplemental Figures 13 and 14 capture AIF and tissue-response curves and manual Patlak plots for representative [18F]FDG and DOTATATE cases, respectively. Supplemental Figure 15 illustrates the effects of motion and image noise on Ki and SUV. For [18F]FDG, the AIF curve-fit extrapolation mildly underestimated late blood activity concentrations, contributing to higher late Ki values. Furthermore, motion of small lesions across WB passes contributed to Ki errors that were ameliorated by manual frame-by-frame segmentations. For DOTATATE, the AIF curve-fit extrapolation moderately overestimated late blood activity concentrations, resulting in lower late Ki values; additionally, DOTATATE binding appeared to be reversible at late time points for some cases. More details are provided in the supplemental materials.

DISCUSSION

Early-to-late increases are a well-known limitation of SUVs and SURs for tumor response assessments (2,3,23). As such, the Quantitative Imaging Biomarkers Alliance recommends that uptake times for baseline and follow-up scans be approximately 60 min, with no more than a 10-min difference between scans (1). However, differences greater than 10 min are not uncommon. Methods to correct SUV and SUR for uptake time (i.e., cSUV, cSUR) have been published (10,23). For example, a study reported that correcting SUVs and SURs from 20 min to 55 min after injection reduced differences with actual values at 55 min from −30% to 2% for SUV and from −52% to −3% for SUR (10). This study, which used data from 9 male patients with colorectal liver metastases, proposed the simple SUV and SUR correction equations used in our work.
We verified that cSUV, using the published time parameter b of 0.313, is a relatively time-independent marker of tumoral [18F]FDG avidity. For [18F]FDG, our mean test-retest %Δ values of −6% and 7% for cSUVmax and cSURmax, respectively, are slightly greater in magnitude than the values cited above, possibly because of our longer early-to-late intervals or heterogeneous patient cohort. We empirically derived a b value of 0.63 for DOTATATE and found that cSUV is also a relatively time-independent marker of tumoral DOTATATE avidity, with mean test-retest %Δ values of 2% and 27% for cSUVmax and cSURmax, respectively. Compared with cSUVmax and cSURmax, cSUVpeak and cSURpeak showed worse intrascan repeatability, with sizeable negative test-retest %Δ values for both tracers. The reason for this somewhat surprising finding is unclear, as peak measurements (because of their larger sampling volumes and lower potential for noise-related errors) are generally considered more repeatable than maximum measurements (24).
In terms of test-retest |%Δ| and ICC, the intrascan repeatability was similar across Ki,max, cSUVmax, and cSURmax for both tracers. However, we observed small but statistically significant early-to-late increases and decreases in Ki,max for [18F]FDG and DOTATATE cases, respectively. In contrast, cSUVmax and cSURmax showed no significant early-to-late changes for either tracer, with the exception of a small significant increase in cSURmax for DOTATATE. For both tracers, the observed early-to-late Ki changes were partially attributable to inaccurate AIF curve-fit extrapolations, the need for which arose from incorporating SOC imaging into our study design. A protocol using nonextrapolated image-derived AIFs or population-based AIFs might reduce these apparent temporal changes in Ki. Several cases suggested late reversibility of DOTATATE binding, also contributing to the observed early-to-late Ki decreases. Overall, cSUVmax and cSURmax provided intrascan repeatability similar to that of Ki,max, without dynamic imaging or AIF estimation.
Ki images may still be worth their inherent complexities, as Ki metrics appear useful for guiding treatment decisions and predicting oncologic outcomes (7,8,25,26). One study showed that Ki correlated with SUR (R², 0.96) much more strongly than with SUV (R², 0.37), with all metrics measured at 50-60 min after injection (11). In contrast, we found that SUVmax, cSUVmax, SURmax, and cSURmax all strongly correlated with Ki,max for both [18F]FDG (R², 0.81-0.92) and DOTATATE (R², 0.88-0.96), though cSURmax had the best agreement with Ki,max across early and late time points for [18F]FDG (ICC, 0.69-0.75) and DOTATATE (ICC, 0.90-0.91). Our findings indicate that Ki can be predicted from cSUR and that cSURmax exhibits a nearly 1:1 proportionality to Ki,max. To this point, cSUR and Ki appear to predict postchemoradiation lung cancer outcomes better than does SUV (27). That said, Patlak images may provide higher lesion conspicuity and fewer false positives than SUV images (28,29).
Our study has limitations, including its single-center, single-scanner design. The results should be corroborated at other centers on other scanners. Our patient cohort was heterogeneous; the relatively small sample size precluded subgroup analysis by cancer type or imaging indication. The b parameter of 0.63 for DOTATATE was derived empirically (rather than from AIF curve fitting) and needs to be validated in other cohorts. Again, the AIF curve-fit extrapolations created late Ki errors. A more thorough investigation of potential causes of the observed temporal variability in Ki is still warranted. Finally, our study excluded subjects who anticipated difficulty with a 90-min imaging period, potentially enriching our cohort for patients capable of remaining relatively motionless; as such, Ki images may be more degraded by motion in an unselected oncologic population.

CONCLUSION

Ki,max, cSUVmax, and cSURmax exhibit comparably high intrascan repeatability in a general oncologic population undergoing PET with [18F]FDG or DOTATATE, with significantly less uptake time dependence than SUVmax and SURmax. cSURmax can predict Ki,max without dynamic acquisitions.

DISCLOSURE
This work was supported by a research grant from Siemens Healthineers to Washington University, including salary support for Tyler Fraum. Richard Wahl has received consulting income from Siemens Healthineers. All participants were imaged on a Siemens PET/CT scanner. Saeed Ashrafinia and Anne Smith are Siemens employees. These authors participated in the initial study design, provided occasional technical support, and critically reviewed the manuscript. However, all data collection, analysis, and manuscript preparation were performed by Washington University authors. No other potential conflict of interest relevant to this article was reported.

FIGURE 2. Uptake time correction procedure for SUV and SUR. Axial [18F]FDG PET and fused [18F]FDG PET/CT images are shown for Ki (top) and SUV (bottom) reconstructions at early and late time points. Ki,max, SUVmax, and SURmax of [18F]FDG-avid mediastinal lymph node (arrows) are shown. Test-retest %Δ values were 1.5%, 49.8%, and 78.6% for Ki,max, SUVmax, and SURmax, respectively, indicating much better intrascan repeatability for Ki,max. Procedure for correcting SUVmax and SURmax to 60 min after injection is also shown. Test-retest %Δ values were −1.4% for cSUVmax and 7.0% for cSURmax, similar to Ki,max results.

FIGURE 5. Test-retest %Δ and |%Δ| distributions for [18F]FDG-avid lesions. Box-and-whisker plots show test-retest %Δ (A and B) and |%Δ| (C and D) distributions for maximum (A and C) and peak (B and D) values of Ki, SUV, cSUV, SUR, and cSUR. All P values are based on comparison to Ki,max or Ki,peak. Table 2 provides descriptive statistics.

24/41) with a mean age of 63.8 y. Additional patient and scan characteristics are captured in Supplemental Table 2.

Intrascan Repeatability of Tumor Uptake Metrics
Test-retest repeatability results are summarized in Table 1 ([18F]FDG) and Supplemental Table 3 (DOTATATE). Scatterplots of late versus early metric values are shown in Supplemental Figures

TABLE 1. Intrascan Repeatability of SUV, cSUV, SUR, cSUR, and Ki Metrics Among [18F]FDG-Avid Lesions
*Values are median with first quartile and third quartile in parentheses.
†Early versus late values via Wilcoxon signed-rank test.
‡Values are mean with 95% limits of repeatability in parentheses.
T-RT = test-retest. Bold P values are statistically significant.
