Abstract
PET/CT-ascertained bone marrow involvement (BMI) constitutes the single most important reason for upstaging by PET/CT in Hodgkin lymphoma (HL). However, BMI assessment on PET/CT can be challenging. This study analyzed the clinicopathologic correlations and prognostic meaning of different patterns of bone marrow (BM) 18F-FDG uptake in HL. Methods: One hundred eighty newly diagnosed early unfavorable and advanced-stage HL patients, all scanned at baseline and after 2 adriamycin-bleomycin-vinblastine-dacarbazine (ABVD) courses with 18F-FDG PET, enrolled in 2 international studies aimed at assessing the role of interim PET scanning in HL, were retrospectively included. Patients were treated with ABVD × 4–6 cycles and involved-field radiation when needed, and no treatment adaptation on interim PET scanning was allowed. Two masked reviewers independently reported the scans. Results: Thirty-eight patients (21.1%) had focal lesions (fPET+), 10 of them with a single (unifocal) and 28 with multiple (multifocal) BM lesions. Fifty-three patients (29.4%) had pure strong (>liver) diffuse uptake (dPET+) and 89 (49.4%) showed no or faint (≤liver) BM uptake (nPET+). BM biopsy was positive in 6 of 38 patients (15.7%) for fPET+, in 1 of 53 (1.9%) for dPET+, and in 5 of 89 (5.6%) for nPET+. dPET+ was correlated with younger age, higher frequency of bulky disease, lower hemoglobin levels, higher leukocyte counts, and similar diffuse uptake in the spleen. Patients with pure dPET+ had a 3-y progression-free survival identical to patients without any 18F-FDG uptake (82.9% and 82.2%, respectively, P = 0.918). However, patients with fPET+ (either unifocal or multifocal) had a 3-y progression-free survival significantly inferior to patients with dPET+ and nPET+ (66.7% and 82.5%, respectively, P = 0.03). The κ values for interobserver agreement were 0.84 for focal uptake and 0.78 for diffuse uptake.
Conclusion: We confirmed that 18F-FDG PET scanning is a reliable tool for BMI assessment in HL, and BM biopsy is no longer needed for routine staging. Moreover, the interobserver agreement for BMI in this study proved excellent and only focal 18F-FDG BM uptake should be considered as a harbinger of HL.
Hodgkin lymphoma (HL) has long been considered a disease of the lymphatic system, with less frequent extranodal spread than aggressive B-cell non-Hodgkin lymphoma. However, this concept has been challenged in recent years by the introduction of 18F-FDG PET/CT into HL staging (1–4). PET/CT has also proved valuable for lymphoma restaging, discriminating active tumor lesions from structural abnormalities and nonviable residual masses with high accuracy (5–7). Compared with CT, PET/CT proved more sensitive at baseline for detecting extranodal sites, which can be recognized as increased 18F-FDG uptake in otherwise normally structured organs (1,4). As a result, PET/CT upstages 15%–29% of HL patients and modifies treatment plans in a clinically relevant fraction of them (8). The most frequent reason for stage IV migration in HL is the detection of single or multiple sites of focally increased 18F-FDG uptake in bone marrow (BM) without histologic evidence of HL in the iliac crest BM biopsy (BMB) (2,4). Many studies have shown a superior diagnostic sensitivity of PET/CT over BMB for bone marrow involvement (BMI) assessment, because the latter often fails to detect patchy BMI (4,9–11). It is generally accepted that focally increased 18F-FDG uptake in the BM, with or without CT abnormalities, is a sign of BMI, but the prognostic relevance of this finding is controversial (8,12). However, some patients display a diffuse baseline BM 18F-FDG uptake with an intensity superior to that in the liver. In expert opinion, diffuse BM uptake represents inflammatory changes, although sporadic positive BMB in the setting of a diffuse BM 18F-FDG uptake has been reported (13,14).
In the present study, we analyzed the prognostic meaning of different patterns of 18F-FDG uptake in the BM of newly diagnosed adult HL patients and their association with clinicopathologic features.
MATERIALS AND METHODS
Patients
Patients included in the present report were previously enrolled in 2 international studies aimed at assessing the prognostic role of interim PET in adriamycin-bleomycin-vinblastine-dacarbazine (ABVD)–treated HL: the International Validation Study (IVS) and the Polish observational study. The IVS cohort has been described in detail elsewhere (7,15). In short, patients diagnosed with classic HL in the period 2002–2009 were enrolled if they had stage IIB–IVB disease or stage IIA disease with adverse prognostic factors (bulky disease, ≥3 nodal lesions, erythrocyte sedimentation rate > 40 mm/h, or subdiaphragmatic presentation). Other IVS inclusion criteria were first-line therapy with ABVD and PET/CT at baseline and after 2 cycles of chemotherapy. No treatment change was allowed based solely on a positive PET scan. Additional patients fulfilling the same inclusion criteria as the IVS were included from the Polish observational study on the predictive role of early and very-early interim PET on ABVD treatment outcome in HL (16). The results of masked independent central review of interim PET scans in both studies have been published elsewhere (7,15,16).
PET/CT Equipment and Image Acquisition
Baseline and interim PET/CT studies were performed according to the standard protocol in use at each PET site. Scans were obtained from the skull base to the midthigh level, and attenuation correction was done using iterative reconstructions. All baseline and interim PET/CT studies were anonymized and uploaded to a central server located in the study core lab (Medical Physics Department, Cuneo Hospital, Italy). The image quality of each individual PET/CT study was critically assessed before inclusion in the study.
PET/CT Review
Two reviewers masked to treatment outcome and other clinical information independently reported baseline and interim PET/CT studies. Review results were presented in a joint session, and consensus decisions were made in the case of disagreement. Disease stage was determined according to the Ann Arbor Classification for staging of lymphoma (17) with Cotswolds modifications (18) and to the Lugano classification (19). Focal BM lesions (fPET+) were visually defined as focally increased 18F-FDG uptake with an intensity > liver 18F-FDG uptake, with or without corresponding CT abnormalities, in at least 2 slices of fused images. The number of focal BM lesions (0, 1, or ≥2) and their anatomic localization were recorded. Diffusely increased 18F-FDG uptake in the BM (dPET+) was visually defined as diffuse uptake with an intensity > liver. No uptake (nPET+) was defined as complete absence of 18F-FDG uptake or a faint diffuse uptake ≤ that of the liver. Finally, CT images were reviewed for structural abnormalities corresponding to areas of focally increased 18F-FDG uptake (osteolytic, osteosclerotic, or mixed lesions, or no CT abnormalities). 18F-FDG uptake in the spleen was categorized as focal 18F-FDG uptake or diffuse 18F-FDG uptake > liver, and CT-ascertained structural abnormalities were recorded.
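The visual categories defined above can be expressed as a simple decision rule. The following Python sketch is illustrative only: the names `MarrowReading`, `classify_pattern`, and `focality` are hypothetical, and the inputs are assumed to be the reviewers' visual judgments (lesion count and whether diffuse uptake exceeds liver), not quantitative uptake values.

```python
from dataclasses import dataclass


@dataclass
class MarrowReading:
    """Hypothetical summary of one reviewer's visual bone marrow read."""
    focal_lesions: int         # focal lesions > liver, each visible on >= 2 fused slices
    diffuse_above_liver: bool  # diffuse marrow uptake visually > liver


def classify_pattern(reading: MarrowReading) -> str:
    """Map a visual read to the fPET+ / dPET+ / nPET+ categories."""
    if reading.focal_lesions < 0:
        raise ValueError("focal_lesions must be non-negative")
    if reading.focal_lesions >= 1:
        return "fPET+"   # focal uptake, with or without concurrent diffuse uptake
    if reading.diffuse_above_liver:
        return "dPET+"   # pure diffuse uptake > liver
    return "nPET+"       # absent or faint (<= liver) uptake


def focality(reading: MarrowReading) -> str:
    """Sub-classify by number of focal lesions (0, 1, or >= 2)."""
    if reading.focal_lesions == 0:
        return "none"
    return "unifocal" if reading.focal_lesions == 1 else "multifocal"


print(classify_pattern(MarrowReading(focal_lesions=2, diffuse_above_liver=True)))
```

Focal uptake takes precedence in this sketch, so a patient with both focal and diffuse uptake is assigned to fPET+, and dPET+ is reserved for pure diffuse uptake.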
Statistical Analysis and Ethics
Differences between categoric variables were tested with the Fisher exact and χ2 tests, whereas differences between continuous variables were tested with the Wilcoxon test. Overall survival was defined as the time from diagnosis until death from any cause or censoring in patients still alive at the time of last follow-up. Progression-free survival (PFS) was defined as the time from diagnosis until progression, death, or censoring at the time of last follow-up. The prognostic significance of the 18F-FDG uptake patterns in the BM was examined using univariate and multivariate Cox regression models and log-rank tests. Statistical analyses were performed using R software (version 3.2.2) for Windows. Two-sided P values of less than 0.05 were considered statistically significant. The Ethical Committee of the coordinating center in Cuneo approved the IVS study, and data collection was compliant with national regulations. The Ethical Committee of the coordinating center in Gdańsk approved the Polish observational study.
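To illustrate how a PFS estimate follows from the definition above, the Python sketch below implements the Kaplan–Meier product-limit estimator. It is not the R code used in the study; the function name `kaplan_meier` and the follow-up data are hypothetical and serve only to show the calculation.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times:  follow-up from diagnosis (e.g., months)
    events: 1 = progression or death, 0 = censored at last follow-up
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # progressions/deaths observed at time t
        n_leaving = 0  # all subjects leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical toy data, NOT taken from the study cohort.
toy_times = [6, 12, 12, 20, 30, 36]
toy_events = [1, 0, 1, 0, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(f"t = {t:>2} mo  PFS = {s:.3f}")
```

Censored patients leave the risk set without lowering the survival estimate, which is why patients still progression-free at last follow-up contribute only up to their censoring time.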
RESULTS
Baseline Characteristics and Treatment
Overall, 180 patients with stage IIA disease and adverse risk factors (bulky disease, 3 or more nodal localizations, erythrocyte sedimentation rate > 40 mm/h, or subdiaphragmatic disease) or stage IIB–IV disease were included in the present study. The patient breakdown according to Ann Arbor stage was stage II, n = 62 (34.4%); stage III, n = 58 (32.2%); and stage IV, n = 60 (33.3%). Detailed baseline characteristics and treatment information are provided in Table 1. The median age was 38.6 y (range, 19–82 y), and the male-to-female ratio was 0.8. The first-line chemotherapy regimen was ABVD × 4 courses (early-stage unfavorable) or ABVD × 6 courses (advanced stage). Involved-field radiotherapy was given as standard treatment of early-stage disease in 62 of 180 patients (34.4%); 6 patients with advanced-stage disease had consolidation radiotherapy for residual mass at the end of treatment.
TABLE 1. Clinicopathologic Characteristics and Treatment of the 180 Patients
PET/CT BM Findings
At baseline, 89 patients (49.4%) had normal 18F-FDG uptake (nPET+). Thirty-eight patients (21.1%) had focal BM lesions (fPET+): 10 had unifocal and 28 had multifocal lesions; 9 had lytic, 8 sclerotic, and 2 mixed CT lesions, and 19 had no corresponding CT abnormality. In 21 of 38 fPET+ patients (55%), a diffuse 18F-FDG uptake was simultaneously present (f/dPET+). Pure dPET+ without evidence of focal uptake was recorded in 53 patients (29.4%) (Fig. 1; Table 2). Of 60 patients with stage IV disease, 38 (63.3%) had focal BM lesions, 27 (40%) had focal BM lesions only (this was the only criterion that upstaged them to stage IV), and 11 had focal BM and other extranodal lesions. Twenty-two patients had stage IV disease based on extraosseous extranodal lesions. In contrast, only 17 of 142 patients (11.6%) with dPET+ or nPET+ had extranodal disease outside the BM. The relationship between the pattern of 18F-FDG uptake and BMB-detected BMI is shown in Table 2. BMB was performed as part of the routine staging workup in all but 2 patients (98.9%). BMB was positive for BMI in 6 of 38 patients with fPET+ and in 1 of 53 patients with pure dPET+ (15.7%, P = 0.022 and 1.9%, P = 0.185, respectively). However, BMB-ascertained BMI was found in 5 patients without focal BM uptake on staging PET/CT, which led to upstaging of these 5 patients from stage III to IV, thereby increasing their International Prognostic Score value by 1 point (Supplemental Table 1; supplemental materials are available at http://jnm.snmjournals.org). Nevertheless, none of these patients would have had their treatment upgraded by BMB because stage III and IV patients are treated the same according to standard treatment guidelines.
When only positive BMB was considered as the reference standard, PET/CT had a sensitivity, specificity, positive predictive value, and negative predictive value of 50% (95% confidence interval [CI], 21–79), 81% (95% CI, 74–86), 16% (95% CI, 6–31), and 96% (95% CI, 91–98), respectively. When both fPET+ and positive BMB were considered as reference standards, the sensitivity and negative predictive value for BMB and PET/CT were 27% (95% CI, 15–43) and 81% (95% CI, 74–86) versus 84% (95% CI, 70–93) and 95% (95% CI, 90–98), respectively. PET/CT had a higher overall accuracy (95%; 95% CI, 91–98) than BMB (82%; 95% CI, 76–87).
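The first set of figures above (BMB as the sole reference standard) can be retraced from the counts reported in the text. The Python sketch below uses a hypothetical helper, `diagnostic_metrics`, and rests on one labeled assumption: the 2 patients without BMB are counted as BMB-negative so that the 2 × 2 table sums to 180.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2 x 2 diagnostic accuracy measures."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Index test: focal uptake on PET/CT (fPET+). Reference: positive BMB.
tp = 6           # fPET+ with positive BMB (6 of 38)
fp = 38 - 6      # fPET+ with negative BMB
fn = 1 + 5       # positive BMB in dPET+ (1) and nPET+ (5) patients
tn = 142 - fn    # non-focal patients without positive BMB
                 # (ASSUMPTION: the 2 patients without BMB count as negative)

metrics = diagnostic_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under this assumption the sketch yields values that round to the reported 50%, 81%, 16%, and 96% for sensitivity, specificity, positive predictive value, and negative predictive value, respectively.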
FIGURE 1. (A) Example of pure diffuse BM uptake > liver uptake in baseline 18F-FDG PET (coronal CT slices [a], PET [b] and fused PET/CT [c], and maximum-intensity projection [d]). BMB was negative, and patient was in complete remission (follow-up, +81 mo). (B) Example of multifocal BMI, with 3 focal BM lesions (L2, left ischium, and right scapula) in baseline 18F-FDG PET, without corresponding CT abnormalities (sagittal slices of CT [a], PET [b] and fused PET/CT [c], maximum-intensity projection [d], axial fused PET/CT [e], and CT slices [f]). BMB was negative, and patient relapsed 6 mo after end of ABVD treatment.
TABLE 2. BM 18F-FDG Uptake Patterns of the 180 Patients
Clinical Imaging Correlations and Prognosis
Compared with non-fPET+ patients (i.e., nPET+ or pure dPET+ patients), fPET+ patients (either pure fPET+ or f/dPET+) had lower levels of albumin (P = 0.002) and hemoglobin (P = 0.013) and a higher frequency of B symptoms (P = 0.002). Compared with the entire population of 180 patients, dPET+ was associated with younger age (P = 0.002), bulky disease (P = 0.004), lower hemoglobin levels (P < 0.001), and higher leukocyte counts (P < 0.001). dPET+ patients also more often displayed diffuse uptake > liver in the spleen (P = 0.049) (Supplemental Table 2). After a median follow-up of 33.8 mo (range, 16.6–108.6 mo), 38 patients (21.1%) progressed or relapsed, and 9 (5%) died (all deaths preceded by disease progression). The resulting 3-y overall survival and PFS estimates were 96.1% (95% CI, 93%–99%) and 79.2% (95% CI, 73%–86%), respectively. In univariate analyses, bulky disease (hazard ratio [HR], 2.5; P = 0.007), International Prognostic Score (HR, 3.2; P = 0.0003), multifocal BM fPET+ lesions (HR, 1.9; P = 0.011), and positive interim PET/CT (HR, 11.0; P < 0.0001) were the only factors significantly associated with poor 3-y PFS. In multivariate analysis including the covariates significant at the individual level, only positive interim PET/CT retained independent statistical significance (P < 0.0001) (Supplemental Table 3). PET-ascertained BMI was not prognostic for outcome (P = 0.072) in the present patient cohort. Figure 2 shows the PFS Kaplan–Meier curves according to the pattern of 18F-FDG uptake in the baseline PET/CT of patients without any 18F-FDG uptake (no uptake), patients with diffuse uptake only (pure diffuse), patients with single focal uptake (with or without dPET+) (unifocal), and patients with more than 1 focal lesion (with or without dPET+) (multifocal).
With no uptake as the reference group, the 53 patients with a pure dPET+ had a 3-y PFS identical to the 89 patients with nPET+: 82.9% and 82.2%, respectively; P = 0.918; HR, 0.95; 95% CI, 0.42–2.2. Patients with a single focal lesion (n = 10) or multiple focal lesions (n = 28) had similar 3-y PFS: 68.6% (HR, 2.4; 95% CI, 0.8–7.1) and 66.1% (HR, 1.9; 95% CI, 0.86–4.4), respectively. Importantly, patients with fPET+, either uni- or multifocal, had significantly worse long-term disease control than patients with dPET+ and nPET+ (66.7% and 82.5%, respectively, P = 0.03) (Fig. 3). The presence of CT morphologic changes in areas of abnormal focal uptake in the BM was not prognostic for PFS (HR, 1.8; 95% CI, 0.43–7.7). In 33 of 38 fPET+ patients (86.8%), all the focal 18F-FDG lesions disappeared in the interim PET, and in 3 of these patients a photopenic aspect of the scan was recorded in the areas of a previous hot focal lesion, consistent with a classic mirror effect (uptake lower than in other skeletal areas). Five of 38 (13.2%) fPET+ patients had persisting 18F-FDG uptake in the interim PET in the same BM focal areas recorded at baseline (Deauville score 4 or 5), and 4 of them relapsed, with a significantly worse PFS compared with patients with focal uptake who became PET-negative on interim assessment (P = 0.007) (Supplemental Fig. 1). The κ values for interobserver agreement for BM uptake were 0.83 for focal uptake (focal vs. nonfocal and number of focal lesions) and 0.78 for diffuse uptake, both consistent with a high degree of reviewer agreement. In 3 of 38 fPET+ cases, a disagreement between reviewers was recorded concerning the number of focal BM lesions (uni- vs. multifocal). In 15 of 53 dPET+ cases (28%), the disagreement between reviewers concerned the intensity of visually assessed BM uptake.
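For readers less familiar with the κ statistic used for interobserver agreement, the Python sketch below shows how Cohen κ is computed from a reviewer-by-reviewer agreement table. The function name `cohen_kappa` and the example table are hypothetical; the per-reviewer counts of this study are not reported here, so the numbers are illustrative only.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] = number of scans assigned category i by reviewer A
                  and category j by reviewer B.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(table[i]) for i in range(k)]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical table (rows: reviewer A, columns: reviewer B), NOT study data.
toy_table = [
    [20, 5],   # A: focal     -> B: focal / nonfocal
    [10, 15],  # A: nonfocal  -> B: focal / nonfocal
]
print(f"kappa = {cohen_kappa(toy_table):.2f}")
```

Because κ corrects the raw agreement rate for agreement expected by chance, it is lower than the simple proportion of concordant reads (0.70 in this toy table).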
FIGURE 2. Kaplan–Meier PFS curves of different BM 18F-FDG uptake patterns: no 18F-FDG uptake, pure diffuse 18F-FDG uptake > liver, unifocal 18F-FDG uptake, and multifocal 18F-FDG uptake. Log-rank P = 0.19.
FIGURE 3. Kaplan–Meier PFS curves of focal PET lesions and nonfocal BM uptake groups (pure dPET+ and nPET+). Log-rank P = 0.03.
DISCUSSION
Accurate HL staging, including detection of BMI, is clinically relevant, because disease stage remains a major determinant of outcome and treatment strategy (9,20,21). In the present analysis of 180 patients with treatment-naïve HL, we first gave a detailed description of BM 18F-FDG uptake patterns to provide helpful key points for BMI assessment in PET/CT-staged patients. The significance of pure strong diffuse BM uptake is still unknown, but it is probably not related to BMI by neoplastic tissue, as recently stressed (8,12). Moreover, we confirmed that BMB is not a clinically relevant diagnostic tool, because of its low sensitivity and low likelihood of upstaging PET/CT-staged patients (no patient was upstaged from limited to advanced stage by BMB), thus confirming that in the PET era BMB can be safely omitted for HL staging (8,19,20). Interestingly, a strong diffuse BM 18F-FDG uptake, a common finding at baseline in HL, was reported in 53 of 180 patients (29.4%) in the present study, a higher rate than previously described (9.3% in a recent cohort of 75 patients (22), 5.2% in the El-Galaly et al. cohort of 454 patients (9)). This discrepancy is probably accounted for by the thorough imaging review in our study, whereas in other studies data were based on nuclear medicine reports only. Consistent with previous literature (9,12,14,23), the results of this study confirm a clear correlation between strong diffuse BM uptake and some clinical parameters, such as lower hemoglobin level and higher leukocyte count. In this patient group, we also frequently noticed a strong diffuse uptake in the spleen, younger age, bulky disease, and lower levels of albumin compared with patients with no 18F-FDG uptake in BM.
The homogeneous and diffuse 18F-FDG uptake in BM probably reflects a nonspecific metabolic activation, or simply a hyperplasia of the hematopoietic cell compartment at the time of HL diagnosis (24), with a morphologic aspect similar to that recorded in patients treated with hematopoietic growth factors to prevent chemotherapy-induced neutropenia (8). More importantly, only 1 positive BMB was observed in our study in patients showing dPET+, in full agreement with previous studies (0/24 patients according to El-Galaly et al. (9), 0/7 patients according to Adams et al. (22), 2/11 according to Muzahir et al. (11)). However, the latter study gave no indication of whether focal areas of 18F-FDG uptake coexisted with dPET+. Finally, to our knowledge, this is the first study reporting an identical PFS in patients with pure strong diffuse and no significant 18F-FDG uptake in BM. Taken together, these observations suggest that a pure strong diffuse 18F-FDG uptake recorded in the BM of HL patients at baseline is a nonspecific finding and should not be considered as a harbinger of BMI by lymphoma. In the present cohort, a focal BM 18F-FDG uptake (fPET+: visually defined as focally increased 18F-FDG uptake > liver 18F-FDG uptake visible on at least 2 PET slices with or without corresponding CT abnormalities) was reported in 38 patients (21.1%), a frequency in keeping with the existing literature, which shows a prevalence of BMI ranging between 12.9% and 26.8% (1,9,12). Notably, 5 of 89 patients (5.6%) with a totally absent 18F-FDG uptake in BM (nPET+) were upstaged from stage III to IV after BMB (false-negative results); nonetheless, none of these patients had a treatment change because of BMB, as also reported by El-Galaly et al. (5/27 patients with positive BMB upstaged from stage III to IV) (9). Moreover, in the Danish study and in our study, no patient staged II by PET/CT had a positive BMB.
In summary, when either fPET+ or positive BMB was considered diagnostic for BMI, the sensitivity and negative predictive value were 27% and 81% for BMB versus 84% and 95% for PET/CT. A similarly high diagnostic performance of 18F-FDG PET/CT was observed in previous studies, as described in a meta-analysis of 955 patients, with sensitivity ranging from 87.5% to 100% (10). The sensitivity of PET for BMI was indeed suboptimal in our study (84%), and this finding, though it did not influence treatment decisions, should be taken into account during patient restaging in the case of resistant or relapsing lymphoma. A reference standard for BMI by HL that combines a positive BMB, focal BM lesions on PET/CT that disappear on subsequent scans during treatment, or both, as previously suggested (10,12), is more accurate than BMB alone, because BMB often fails to detect the patchy BM infiltration typical of HL. As a matter of fact, in our study only 12 of 180 patients (6.7%) had a positive BMB, whereas, predictably, most focal lesions recorded at baseline (86.8%) disappeared in the interim PET, in keeping with the negativization rate of interim PET in ABVD-treated HL (80%–85%) (25). Overall, these data support the suggestion that focal 18F-FDG–avid lesions indeed represent true BM invasion by lymphoma. Interestingly, El-Galaly et al. found a similar proportion of negativization of BM lesions in the interim PET in 72 of 82 patients (87.8%) (9). Importantly, fPET+ patients had a worse prognosis and a higher proportion of extranodal site disease (62.5%) than non-fPET+ patients (11.6%). Moreover, 24 of 38 fPET+ patients (63%) in stage IV by PET had extranodal sites in BM only, and patients with unifocal lesions were classified as stage IV according to the Lugano classification (19).
The significantly worse treatment outcome of patients with fPET+ (P = 0.03) compared with those with dPET+ and nPET+ could depend on a possible protective effect in patients with dPET+, as suggested by their younger age (36.5 vs. 41.8 y, P = 0.002), and on the adverse prognostic meaning of stage IV disease (26). However, when all the known clinical, biologic, and imaging variables were considered, the only factor strongly and significantly associated with poor PFS in multivariate analysis was positive interim PET/CT (P < 0.0001), in agreement with the extensive literature (26,27). The good interobserver agreement in PET reporting supports the feasibility of these simple rules for BMI detection in clinical practice.
CONCLUSION
The present study suggests that 18F-FDG PET scanning is a reliable tool for the assessment of BM invasion by HL, BMB is no longer needed for HL staging and could be safely omitted, only focal 18F-FDG uptake at the BM level should be considered as a harbinger of HL, and focal BMI could be detected with high accuracy and interobserver agreement in routine HL staging with PET/CT.
DISCLOSURE
No potential conflict of interest relevant to this article was reported.
Acknowledgments
We thank Dr. Fabrizio Bergesio for checking image quality and for collecting and distributing the scans to reviewers.
Footnotes
Published online Jan. 26, 2017.
- © 2017 by the Society of Nuclear Medicine and Molecular Imaging.
REFERENCES
- Received for publication September 14, 2016.
- Accepted for publication December 21, 2016.