Abstract
Treatment regimens for pediatric Hodgkin lymphoma (HL) depend on accurate staging and treatment response assessment, based on accurate disease distribution and metabolic activity depiction. With the aim of radiation dose reduction, we compared the diagnostic performance of 18F-FDG PET/MRI with an 18F-FDG PET/CT reference standard for staging and response assessment. Methods: Twenty-four patients (mean age, 15.4 y; range, 8–19.5 y) with histologically proven HL were prospectively and consecutively recruited in 2015 and 2016, undergoing both 18F-FDG PET/CT and 18F-FDG PET/MRI at initial staging (n = 24) and at response assessment (n = 21). The diagnostic accuracy of 18F-FDG PET/MRI for both nodal and extranodal disease was compared with that of 18F-FDG PET/CT, which was considered the reference standard. Discrepancies were retrospectively classified as perceptual or technical errors, and 18F-FDG PET/MRI and 18F-FDG PET/CT were corrected by removing perceptual error. Agreement with Ann Arbor staging and Deauville grading was also assessed. Results: For nodal and extranodal sites combined, corrected staging 18F-FDG PET/MRI sensitivity was 100% (95% CI, 96.7%–100%) and specificity was 99.5% (95% CI, 98.3%–99.9%). Corrected response-assessment 18F-FDG PET/MRI sensitivity was 83.3% (95% CI, 36.5%–99.1%) and specificity was 100% (95% CI, 99.2%–100%). Modified Ann Arbor staging agreement between 18F-FDG PET/CT and 18F-FDG PET/MRI was perfect (κ = 1.0, P < 0.001). Deauville grading agreement between 18F-FDG PET/MRI and 18F-FDG PET/CT was excellent (κ = 0.835, P < 0.001). Conclusion: 18F-FDG PET/MRI is a promising alternative to 18F-FDG PET/CT for staging and response assessment in children with HL.
Hodgkin lymphoma (HL) is the most common childhood lymphoma and one of the most common pediatric and adolescent cancers (1). Treatment outcomes are critically dependent on accurate imaging-defined staging and, thereafter, early treatment response assessments.
Staging consists of detection of anatomic locations of disease, using either CT or MRI, combined with metabolic assessment (2). This metabolic assessment consists of using 18F-FDG PET to detect increased glucose metabolism of lymphatic and extralymphatic tissue involved by HL (3). Imaging is repeated after 2 (for classic HL) or 3 (for lymphocyte-predominant HL) cycles of chemotherapy to determine early response assessment, which determines whether posttreatment radiotherapy is necessary (3). A decision to consolidate with radiotherapy is made if there is either insufficient reduction in size of disease or persistent abnormal metabolic activity. It is well established that the addition of 18F-FDG PET to standard cross-sectional imaging improves accuracy for both initial staging and treatment response assessment; most patients therefore undergo repeated multimodality imaging.
Both CT and 18F-FDG PET impart a significant radiation dose, and repeated studies result in a cumulative dose associated with secondary malignancies (4,5). Children and young adults are particularly vulnerable to the long-term effects of radiation, and according to the linear no-threshold model of radiation, any reduction in radiation exposure for this group would be beneficial (6,7). With hybrid 18F-FDG PET/MRI systems now available and able to potentially replace 18F-FDG PET/CT with a single 18F-FDG PET/MRI examination (8,9), diagnostic radiation exposure could potentially be reduced by up to 80% (5,8,10) while maintaining both anatomic and metabolic information.
We aimed to test the hypothesis that 18F-FDG PET/MRI is an alternative to 18F-FDG PET/CT for both staging and early response assessment. This hypothesis was tested by prospectively assessing and comparing the diagnostic performance of 18F-FDG PET/MRI against standard 18F-FDG PET/CT for initial staging and early response assessment in pediatric and adolescent patients with HL.
MATERIALS AND METHODS
Patient Population
This single-center prospective study was approved by the institutional review board, and written informed consent was obtained from either the participants or their guardians.
Patients were consecutively included between February 2015 and June 2016. The inclusion criteria were histologically confirmed HL; an age of 0–20 y; and inclusion into the EuroNet PHL-C1 trial, EuroNet LP1 trial, or successor EuroNet trials, including EuroNet PHL-C2. The exclusion criteria were previous HL, prior chemotherapy or radiotherapy, pregnancy, breastfeeding, and contraindications to MRI.
Study Summary
As part of standard-of-care staging imaging at our institution, all patients underwent 18F-FDG PET/CT. The trial 18F-FDG PET/MRI was performed on the same day as the 18F-FDG PET/CT, using the same 18F-FDG injection, with the patient transferring to the 18F-FDG PET/MRI suite immediately after completion of the 18F-FDG PET/CT examination.
After 2 (EuroNet PHL-C1) or 3 (EuroNet LP1) cycles of chemotherapy, disease was reassessed with 18F-FDG PET/CT, and the trial 18F-FDG PET/MRI was again performed immediately after the 18F-FDG PET/CT. 18F-FDG PET/CT was the standard of reference against which 18F-FDG PET/MRI was compared.
18F-FDG PET/CT Protocol
18F-FDG PET/CT data were acquired using a Discovery PET/CT in-line system (GE Healthcare) without time-of-flight technology. Patients fasted for 6 h, and hyperglycemia (>10 mmol/L) was ruled out before 18F-FDG administration. A weight-adjusted dose of 18F-FDG was injected 60 min before imaging. Patients were scanned from skull vertex to mid-thigh level, at 3 min per bed position. CT was performed at a low dose for attenuation correction, and acquisition parameters were adjusted according to patient weight. No intravenous iodinated contrast medium was administered. Combined axial emission images of 18F-FDG PET and CT were reconstructed to a 128 × 128 resolution image with a 2.5-mm slice thickness.
18F-FDG PET/MRI Protocol
18F-FDG PET/MRI sequences were acquired on a hybrid 3-T MRI scanner without time-of-flight technology (Biograph mMR; Siemens Healthineers). All sequences were acquired from skull vertex to mid-thigh level at 5 min per bed position.
The 18F-FDG PET/MRI comprised axial T2-weighted half-Fourier acquisition single-shot turbo spin-echo (T2-HASTE) and axial and coronal turbo inversion recovery magnitude sequences, along with axial postgadolinium T1-weighted Dixon fat suppression sequences (Table 1). Four-tissue-class (soft tissue, fat, lung, and air) attenuation correction maps were calculated from the 2-point Dixon sequences. Combined axial emission images of 18F-FDG PET and T2-HASTE sequences were reconstructed into a 128 × 128 resolution image with a 5-mm slice thickness.
TABLE 1. MRI Sequences
Image Analysis
All studies were read sequentially under trial conditions using OsiriX workstation software (OsiriX MD) after all scans were performed. The readers were masked to the patient data, clinical data, and results of other modalities, including the reference standard.
18F-FDG PET/CT scans were read by a nuclear medicine specialist (with 10 y of dedicated nuclear medicine experience). 18F-FDG PET/MRI sequences were evaluated by a nuclear medicine specialist and a pediatric radiologist in consensus (both with 15 y of experience).
Derivation of Enhanced Reference Standard and Correction for Perceptual Errors
Discrepancies between 18F-FDG PET/MRI and the 18F-FDG PET/CT reference standard were reviewed by an independent nonmasked pediatric radiologist (with 5 y of dedicated pediatric radiology experience), using methodology similar to that of Latifoltojar et al. (11). Minor differences in disease localization at site boundaries were not labeled as discrepancies. All other discrepancies were reviewed in consensus between two of the reviewers and assigned as a perceptual or a technical error type. Perceptual errors entailed retrospectively visible disease. Technical errors entailed discrepancies unrelated to reader detection, such as difference in measured metabolic activity between 18F-FDG PET/MRI and 18F-FDG PET/CT.
Technical discrepancies were always resolved in favor of the 18F-FDG PET/CT reference standard. By comparing datasets corrected for perceptual error, we aimed to remove the human perception bias and better compare technical performance.
Staging Definitions
Staging was performed according to the Cotswolds modified Ann Arbor classification (12,13). Nineteen nodal sites were assessed: cervical (left and right), anterior mediastinum, paratracheal, lung hilum (left and right), diaphragm, axilla (left and right), hepatic hilum, splenic hilum, spleen, celiac trunk, paraaortic, mesenteric, iliac (left and right), and inguinal (left and right). Ten extranodal sites were also registered: lung, chest wall, kidneys, bone marrow, pleura, pericardium, bowel, stomach, liver, and pancreas.
Nodal and extranodal disease involvement was defined as a focal area of 18F-FDG uptake with an SUVmax (measured using a 1-cm² circular region of interest) above that of the mediastinal blood pool (14) or 18F-FDG uptake greater than in the surrounding background in a location incompatible with normal physiologic activity (15,16). Nodal volumes were derived from measurements in 3 orthogonal dimensions using the ellipsoid formula (volume = π/6 × a × b × c).
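The volume calculation can be illustrated with a minimal Python sketch. It assumes that a, b, and c are the three orthogonal diameters in centimeters and that the formula is the standard ellipsoid volume; the function name and the example measurements are hypothetical and are not taken from the study data.

```python
import math


def ellipsoid_volume_cm3(a_cm: float, b_cm: float, c_cm: float) -> float:
    """Ellipsoid volume (cm^3) from three orthogonal diameters (cm).

    Assumption: a, b, and c are full diameters, so V = pi/6 * a * b * c.
    """
    return math.pi / 6.0 * a_cm * b_cm * c_cm


# Hypothetical node measuring 2.0 x 1.5 x 1.0 cm (illustrative values only)
volume = ellipsoid_volume_cm3(2.0, 1.5, 1.0)
print(f"{volume:.2f} cm^3")  # prints "1.57 cm^3"
```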
Any focal lung parenchymal 18F-FDG–avid focus above the SUVmax of the mediastinal blood pool was considered a site of involvement. As per the EuroNet criteria, focal lung parenchymal involvement was also considered a site of involvement if there was a focal consolidation, a solitary nodule larger than 1 cm, or more than 3 subcentimeter nodules.
Extranodal extension of disease was considered present if there was a contiguous extension of tissue beyond a nodal mass into adjacent structures.
At staging, bone marrow involvement was defined as focal or multifocal 18F-FDG uptake above the level in the liver (17). At response assessment, bone marrow involvement was defined as focal or multifocal uptake higher than that in normal marrow but less than that at baseline (with diffuse changes from chemotherapy allowed) (18). Bone marrow lesions on MRI were considered to represent bone marrow involvement only when combined with increased 18F-FDG uptake.
For response assessment, each site was reevaluated and residual metabolic activity classified using the 5-point Deauville scale (19).
Statistical Analysis
Detection rates (sensitivity and specificity, with a 95% CI) and area under the curve were calculated for nodal and extranodal sites combined. 18F-FDG PET/MRI was subsequently compared with 18F-FDG PET/CT as the reference standard, with and without correction for perceptual error.
The κ-statistic was determined to test agreement between modalities. This was classified as poor for a κ of less than 0.00, slight for 0.00–0.20, fair for 0.21–0.40, moderate for 0.41–0.60, good for 0.61–0.80, and excellent for 0.81–1.00 (20).
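As an illustration of the agreement measure, the sketch below computes Cohen's κ for a 2 × 2 site-level table and applies the classification bands listed above. The function names are hypothetical, the handling of values that fall between two bands (e.g., 0.205) is an assumption, and the example counts are the staging site counts reported in the Results; the resulting value is a back-calculation and is not necessarily the figure reported in the study tables.

```python
def cohen_kappa_2x2(both_pos: int, a_only: int, b_only: int, both_neg: int) -> float:
    """Cohen's kappa for two raters (modalities) and a binary outcome."""
    n = both_pos + a_only + b_only + both_neg
    observed = (both_pos + both_neg) / n
    a_pos = both_pos + a_only  # sites called positive by modality A
    b_pos = both_pos + b_only  # sites called positive by modality B
    expected = (a_pos * b_pos + (n - a_pos) * (n - b_pos)) / n**2
    if expected == 1.0:
        # Degenerate table (all sites in one cell): treat as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def classify_kappa(kappa: float) -> str:
    """Map kappa to the bands used in this article.

    Assumption: each band is closed at its upper bound, so values that fall
    between two listed bands are assigned to the higher band.
    """
    if kappa < 0.0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "good"), (1.00, "excellent")]:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1.0")


# Staging site counts from the Results (PET/MRI vs. PET/CT):
# both positive 135, PET/MRI only 4, PET/CT only 6, both negative 551.
kappa = cohen_kappa_2x2(135, 4, 6, 551)
print(f"kappa = {kappa:.3f} ({classify_kappa(kappa)})")
```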
Volume and SUVmax for individual nodal and extranodal sites were compared between 18F-FDG PET/MRI and 18F-FDG PET/CT at staging and response assessment. Correlation was calculated using the Spearman correlation test.
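For readers unfamiliar with the Spearman test, the following sketch shows one way to compute the rank correlation without external libraries: values are converted to ranks (ties receive the average rank), and the Pearson correlation of the ranks is returned. The analysis in this study was performed in SPSS; the function names and the paired values below are hypothetical and purely illustrative.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all values are tied")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired nodal volumes (cm^3) on PET/CT and PET/MRI
pet_ct = [1.0, 2.0, 3.0, 4.0, 5.0]
pet_mri = [2.0, 1.0, 4.0, 3.0, 5.0]
print(f"rho = {spearman_rho(pet_ct, pet_mri):.2f}")  # prints "rho = 0.80"
```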
Statistical analysis was performed using SPSS (release 26; IBM) for Windows (Microsoft). A P value of less than 0.05 was considered statistically significant.
RESULTS
Twenty-six patients were recruited (male-to-female ratio, 14:12; median age, 16 y; range, 8–19 y). After exclusion, 24 patients were enrolled for staging, and 21 of these remained for response assessment. Twenty-two patients had histologically confirmed classic HL; 2 had nodular lymphocyte-predominant HL. The study flowchart is presented in Figure 1.
FIGURE 1. Study flowchart.
The median time between 18F-FDG injection and 18F-FDG PET/CT was 68 min (interquartile range, 60–76 min). The median interval between 18F-FDG PET/CT and 18F-FDG PET/MRI was 95 min (interquartile range, 77–112 min).
Staging
18F-FDG PET/CT detected 141 disease-positive sites (136/456 nodal and 5/240 extranodal: 696 sites total) (Table 2). The modified Ann Arbor stage distribution was stage 1 in 1 patient (1/24, or 4.2%), stage 2 in 13 patients (13/24, or 54.2%), stage 3 in 6 patients (6/24, or 25%), and stage 4 in 4 patients (4/24, or 16.7%).
TABLE 2. Technical and Perceptual Errors, Nodal and Extranodal Sites Combined
18F-FDG PET/MRI detected 135 of 141 true-positive sites, 551 true-negative sites, 4 false-positive sites, and 6 false-negative sites (Table 2). All 6 false-negative sites were due to perceptual error. Three of the 4 false-positive sites were due to technical error; these sites were small in volume (0.02–1.3 cm³) and 18F-FDG–avid on 18F-FDG PET/MRI but not 18F-FDG–avid on the 18F-FDG PET/CT reference standard. The remaining false-positive site was due to perceptual error (Table 2).
Uncorrected and corrected sensitivity and specificity are given in Table 3. 18F-FDG PET/MRI intermodality agreement for disease sites compared with 18F-FDG PET/CT detection was excellent for both uncorrected and corrected data (Table 3).
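The point estimates can be reproduced from the staging site counts given above. The sketch below is illustrative only: the class and function names are hypothetical, and it assumes that correction for perceptual error reclassifies perceptual false-negative sites as true-positive and perceptual false-positive sites as true-negative. CIs are omitted because the interval method is not stated here.

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    tp: int  # positive on PET/MRI and on the PET/CT reference standard
    fp: int  # positive on PET/MRI only
    fn: int  # positive on the PET/CT reference standard only
    tn: int  # negative on both

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)


def correct_for_perceptual_error(counts: SiteCounts,
                                 perceptual_fn: int,
                                 perceptual_fp: int) -> SiteCounts:
    """Assumed correction: perceptual misses become true-positives and
    perceptual overcalls become true-negatives; technical errors remain."""
    return SiteCounts(
        tp=counts.tp + perceptual_fn,
        fp=counts.fp - perceptual_fp,
        fn=counts.fn - perceptual_fn,
        tn=counts.tn + perceptual_fp,
    )


# Staging counts: 135 TP, 4 FP (1 perceptual), 6 FN (all perceptual), 551 TN
uncorrected = SiteCounts(tp=135, fp=4, fn=6, tn=551)
corrected = correct_for_perceptual_error(uncorrected, perceptual_fn=6, perceptual_fp=1)

for label, c in [("uncorrected", uncorrected), ("corrected", corrected)]:
    print(f"{label}: sensitivity {c.sensitivity():.1%}, specificity {c.specificity():.1%}")
```

Under these assumptions, the corrected estimates are 100% sensitivity and 99.5% specificity, in line with the values reported in the abstract.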
TABLE 3. 18F-FDG PET/MRI Detection of Involved Sites
There were no discrepancies between 18F-FDG PET/MRI and 18F-FDG PET/CT for modified Ann Arbor staging (Table 4). Figure 2 demonstrates an involved nodal site at staging that was concordant between 18F-FDG PET/CT and 18F-FDG PET/MRI.
FIGURE 2. Axial CT (A), 18F-FDG PET (B), 18F-FDG PET/CT (C), T2-HASTE MRI (D), 18F-FDG PET (E), and 18F-FDG PET/MRI (F) images showing concordant upper right deep cervical lymphadenopathy (arrows), that is, the same lesion on 18F-FDG PET/CT and 18F-FDG PET/MRI, in a 12-y-old boy with lymphocyte-predominant HL at staging.
TABLE 4. Staging and Early Response Assessment, Agreement Between 18F-FDG PET/MRI and 18F-FDG PET/CT
Response Assessment
At response assessment, 18F-FDG PET/CT demonstrated 6 incomplete metabolic response sites (4/399 nodal and 2/210 extranodal: 609 sites total) in 3 of 21 patients (Table 2). Deauville grade distribution was grade 2 in 11 patients (11/21, or 52.4%), grade 3 in 7 patients (7/21, or 33.3%), and grade 4 in 3 patients (3/21, or 14.3%). Figure 3 demonstrates 18F-FDG PET/MRI nodal site involvement at staging and response assessment.
FIGURE 3. (A–C) Coronal STIR (A), 18F-FDG PET (B), and 18F-FDG PET/MRI (C) images demonstrating lymphadenopathy in right supraclavicular fossa and right paratracheal region and nodule in right lung (arrows) in a 14-y-old boy with HL. (D–F) Early response assessment after 2 cycles of chemotherapy showing residual smaller lymphadenopathy in the right supraclavicular fossa on coronal STIR image (D) and no uptake on coronal 18F-FDG PET (E) or 18F-FDG PET/MRI (F) (arrows). Other sites of disease have resolved.
18F-FDG PET/MRI correctly detected 5 of 6 incomplete metabolic response sites; 1 incomplete metabolic response site was not detected because of a technical error (Deauville 3 on 18F-FDG PET/MRI compared with Deauville 4 on 18F-FDG PET/CT). There were no perceptual errors. Uncorrected and corrected sensitivity and specificity are given in Table 3.
Intermodality agreement for disease sites between 18F-FDG PET/MRI and 18F-FDG PET/CT detection was excellent (Table 3). Agreement between 18F-FDG PET/MRI and 18F-FDG PET/CT for Deauville response assessment was also excellent (Table 4).
Extranodal Disease
At staging, 5 of 141 involved sites were extranodal (3 bone marrow sites and 2 lung sites). Regarding the lung sites, 1 patient had a single lung nodule measuring 12 mm on T2-HASTE and 14 mm on CT. The other patient had multiple lung nodules: 12 nodules, measuring up to 14 mm, were detected on T2-HASTE, and 41 nodules, measuring up to 18 mm, were detected on CT.
At response assessment, 2 incomplete metabolic response sites were extranodal (1 bone marrow site and 1 lung site). Further statistical analyses were omitted because of the low number of extranodal sites.
Volume
At staging, correlation between individual nodal disease site volumes at 18F-FDG PET/MRI and 18F-FDG PET/CT was excellent (r = 0.817, P < 0.001). Volumes are given in Table 5.
TABLE 5. Volume and SUVmax per Individual Nodal Sites
At response assessment, correlation between nodal site volumes was good (r = 0.728, P < 0.001). Volumes and volume reduction percentages at response assessment are given in Table 5.
DISCUSSION
This study prospectively compared 18F-FDG PET/MRI with 18F-FDG PET/CT for both staging and early postchemotherapy response reassessment in a cohort of children and adolescents with HL.
By comparing the diagnostic performance of these modalities, we wanted to test the hypothesis that 18F-FDG PET/MRI is an alternative to 18F-FDG PET/CT in pediatric HL, with the aim of reducing the cumulative radiation dose that these children receive throughout their illness. To correct for human perception bias, we present data corrected and uncorrected for perceptual error. Technical error due to differences in technique was not corrected for.
When corrected for perceptual error, our results showed perfect agreement for modified Ann Arbor staging between 18F-FDG PET/MRI and 18F-FDG PET/CT and excellent response assessment agreement according to the Deauville scale. The notion that 18F-FDG PET/MRI may replace 18F-FDG PET/CT is further supported by a good to excellent intermodality correlation of SUVmax and nodal size at staging and by a moderate to good correlation at response assessment.
In adults, excellent to perfect staging agreement (κ = 0.979–1.0) between 18F-FDG PET/CT and 18F-FDG PET/MRI for staging of mixed groups of HL and non-HL has been reported (13,21,22). However, given the differences in body habitus, the differences in physiology (e.g., brown fat), the challenges of prolonged MRI protocols in children, and the possible differences in behavior of lymphomas, adult studies comparing 18F-FDG PET/MRI with 18F-FDG PET/CT cannot be extrapolated to children and adolescents (4,11,23). Those available pediatric studies comparing 18F-FDG PET/MRI and 18F-FDG PET/CT have so far consisted of a mix of both HL and non-HL and studied either staging or response assessment (4,24). Our study aimed to improve on the pediatric literature by prospectively including only pediatric HL and studying both staging and early response assessment.
Because of a longer study duration, a smaller bore diameter, and loud noises, undergoing 18F-FDG PET/MRI instead of 18F-FDG PET/CT may be challenging for some children (and adults). Preparation using virtual-reality glasses, a mock MRI scanner, and play specialists may help children to become comfortable in an MRI scanner (25). However, diagnostic performance must have priority over reduction of radiation dose. For those children unable to lie still in the MRI scanner, either anesthesia or 18F-FDG PET/CT instead of 18F-FDG PET/MRI may be necessary.
In addition to the reduced radiation dose, there may be another benefit to replacing 18F-FDG PET/CT with 18F-FDG PET/MRI. MRI sequences have intrinsically superior soft-tissue contrast compared with CT, potentially allowing for better delineation of lymph nodes (e.g., hilar lymph nodes) and focal lesions in solid organs (e.g., liver, spleen, kidney). However, although lung nodules can be seen on MRI, CT has superior air-to-tissue contrast, and diagnostic chest CT is advised at staging of lung nodules (26). Consequently, our current practice consists of 18F-FDG PET/MRI at staging and response assessment, in combination with diagnostic non–contrast-enhanced chest CT at staging in all patients and at response assessment in those children with lung involvement at staging.
A limitation of this study was the necessity to scan and then move the patients between the 18F-FDG PET/CT and 18F-FDG PET/MRI devices. This limitation was unavoidable in a study in which ethics allow only a single 18F-FDG injection.
The effects of a delay between injection and PET scanning are 2-fold. The first is a reduction in 18F-FDG activity due to decay of the isotope, which may influence SUV measurements. The second is prolonged uptake time, which at some sites may increase disease detection due to washout from normal structures and at other sites may cause higher activity due to slow tracer uptake (27). There are published examples of this phenomenon, such as adrenal lesions that are false-negative on 18F-FDG PET/CT but true-positive on 18F-FDG PET/MRI (13), a finding that is thought to be the result of prolonged uptake time.
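The first of these effects, physical decay, can be estimated with a short calculation. The sketch below assumes a physical half-life of 109.77 min for 18F and uses the median 95-min interval between 18F-FDG PET/CT and 18F-FDG PET/MRI reported in the Results; the function name is hypothetical. It describes physical decay only and says nothing about biologic uptake or washout, which are the second effect.

```python
import math

F18_HALF_LIFE_MIN = 109.77  # assumed physical half-life of fluorine-18, in minutes


def remaining_fraction(delay_min: float, half_life_min: float = F18_HALF_LIFE_MIN) -> float:
    """Fraction of activity remaining after delay_min minutes of physical decay."""
    return math.exp(-math.log(2) * delay_min / half_life_min)


# Median interval between PET/CT and PET/MRI in this study: 95 min
fraction = remaining_fraction(95)
print(f"Activity remaining after 95 min: {fraction:.0%}")  # prints "... 55%"
```

In other words, roughly half of the activity present at the time of 18F-FDG PET/CT would remain at the time of 18F-FDG PET/MRI under these assumptions.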
Although we found no difference in staging between 18F-FDG PET/MRI and 18F-FDG PET/CT at diagnosis, there were site discrepancies that could have a potential clinical impact on children requiring radiotherapy. These discrepancies, regarded in our study as technical errors, may be due to the delay in 18F-FDG PET/MRI relative to 18F-FDG PET/CT. In our study, 3 small-volume sites (range, 0.02–1.3 cm³) were considered false-positive on 18F-FDG PET/MRI at staging because of a higher SUVmax on 18F-FDG PET/MRI than on 18F-FDG PET/CT. Although the reason for this discrepancy cannot be verified, we postulate that it may be related to the prolonged uptake time for the 18F-FDG PET/MRI. Conversely, differences in perfusion and washout may also account for the lower Deauville scale measurement at 2 sites on 18F-FDG PET/MRI during response assessment.
Lastly, there were only 6 sites at response assessment that did not show a complete metabolic response. Although there was excellent agreement between 18F-FDG PET/MRI and 18F-FDG PET/CT at response assessment, these numbers are too small to draw definitive conclusions. Further studies to confirm these findings should include a larger population, which may be feasible through multicenter collaboration.
CONCLUSION
In our cohort of patients, 18F-FDG PET/MRI showed no difference from 18F-FDG PET/CT in overall staging of HL in children and adolescents, and there was an excellent response assessment agreement. With the aim of reducing cumulative radiation dose, we suggest that pediatric or adolescent HL staging and response assessment be performed using 18F-FDG PET/MRI instead of 18F-FDG PET/CT wherever possible.
DISCLOSURE
Funding was secured from Great Ormond Street Hospital Children’s Charity. No other potential conflict of interest relevant to this article was reported.
KEY POINTS
QUESTION: Can 18F-FDG PET/MRI replace 18F-FDG PET/CT for staging and response assessment of pediatric HL?
PERTINENT FINDINGS: This prospective observational study demonstrated that 18F-FDG PET/MRI is a promising alternative to 18F-FDG PET/CT for staging and chemotherapy response assessment in pediatric HL.
IMPLICATIONS FOR PATIENT CARE: Replacing 18F-FDG PET/CT with 18F-FDG PET/MRI allows children with HL to receive a lower cumulative radiation dose at staging and response assessment while maintaining diagnostic accuracy.
Footnotes
Received for publication November 12, 2020.
Accepted for publication January 29, 2021.
Published online February 19, 2021.
© 2021 by the Society of Nuclear Medicine and Molecular Imaging.