Abstract
The aim of this initiative was to provide consensus recommendations from a consortium of academic and industry experts in the field of lymphoma and imaging for the consistent application of imaging assessment with the Lugano classification. Methods: Consensus was obtained through a series of meetings from July 2019 to October 2021 sponsored by the PINTaD (Pharma Imaging Network for Therapeutics and Diagnostics) as part of the PRoLoG (PINTaD RespOnse criteria in Lymphoma wOrking Group) consensus initiative. Results: Consensus recommendations encompass all technical imaging aspects of the Lugano classification. Some technical considerations for PET/CT and diagnostic CT are clarified with regard to required imaging series and scan visits, as well as acquisition and reconstruction of PET images and the influence of lesion size and background activity. Recommendations are given on the role of imaging and clinical reviewers as well as on training and monitoring. Finally, an example template of an imaging case report form is provided to support efficient collection of data with the Lugano classification. Conclusion: Consensus recommendations are made to comprehensively address technical and imaging areas of inconsistency and ambiguity in the classification encountered by end users. Such guidance should be used to support standardized acquisition and evaluation with the Lugano 2014.
In 2014, the Lugano classification (1) together with an imaging-focused companion report (2) (referred to together as Lugano 2014) provided a standardized approach to classifying response based on 18F-FDG PET/CT in 18F-FDG–avid lymphomas. The Lugano 2014 was an update to the Revised Response Criteria for Malignant Lymphoma published in 2007 (referred to as Cheson 2007) (3).
The Lugano 2014 has since been used by regulatory agencies for recent drug approvals and widely adopted both by the pharmaceutical industry and by clinicians for evaluation of Hodgkin lymphoma (HL) and non-Hodgkin lymphoma (NHL). Currently, hundreds of actively recruiting and ongoing investigational trials are using the Lugano classification (https://clinicaltrials.gov).
The PRoLoG committee (PINTaD RespOnse criteria in Lymphoma wOrking Group), sponsored by the PINTaD (Pharma Imaging Network for Therapeutics and Diagnostics) (https://www.pintad.net), is a cross-functional group of volunteers from industry and academia who engaged in discussions to provide expert end-user consensus recommendations for the consistent application of the Lugano classification.
This article, focusing on the technical imaging recommendations, is not intended to replace the classification. It may also be applied to some extent to the newer lymphoma response assessment criteria (e.g., Lymphoma Response to Immunomodulatory Therapy Criteria 2016 (4) and Response Evaluation Criteria in Lymphoma 2017 (5)). Although these recommendations are primarily given for clinical trial end-users, they may be valuable for health-care providers as well.
MATERIALS AND METHODS
Task forces (TFs) were created to evaluate technical imaging and clinical considerations of the Lugano classification that could affect its uniformity in evaluating lymphoma response.
The TF members included representatives from academic or scientific organizations (n = 3), pharmaceutical industry (n = 9), clinical research organizations (n = 13), and other clinical trial specialists (n = 4), as well as independent research leaders. A steering committee oversaw the activities of each TF. All meetings were held virtually, from July 2019 to October 2021, and were recorded and transcribed into minutes that were approved by the TF members. Where evidence-based data or consensus was lacking, a call for future research on that topic was suggested. Additional recommendations from the TF, primarily for clinical imaging considerations, have been previously published (6).
Any individual involved in the implementation of the Lugano classification is considered an end-user. Any physician responsible for assessing response in lymphoma is considered a reviewer.
TECHNICAL CONSIDERATIONS FOR IMAGE ACQUISITION, RECONSTRUCTION, AND EVALUATION
Required Images and Viewing Stations
The assessment of the Lugano classification is informed by both anatomic imaging (diagnostic CT preferred; however, it can be interchangeable with MRI; ultrasound should not be used because of the operator-dependency of the method) and metabolic (18F-FDG PET/CT) imaging, for 18F-FDG–avid lymphomas.
The following images should be provided to the reviewers, when available: 18F-FDG PET/CT images and diagnostic CT images. 18F-FDG PET/CT images include PET attenuation-corrected (AC) images; PET non–attenuation-corrected images; low-dose CT for attenuation correction (CTAC) and for localization purposes; and reconstructed images, namely AC maximum-intensity-projection (MIP) and PET/CT fusion images, unless the viewing software can create them from the AC images. Care should be taken that no patient identifiers are embedded in reconstructed images.
Diagnostic CT images should include CT images with anatomic coverage to encompass all areas of known or suspected disease with appropriate acquisition settings for kVp, mAs, slice thickness of ≤ 5 mm, intravenous contrast, and patient positioning and breathing instructions (e.g., deep inspiration breath-hold); and standard soft-tissue and lung reconstruction images.
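The image requirements above lend themselves to a simple intake check at a core laboratory. The following is a minimal sketch only: the series labels (PET_AC, PET_NAC, CTAC) and the function names are hypothetical, and the text asks for these series "when available" rather than making them strictly mandatory. Only the 5-mm slice-thickness limit is taken from the recommendation itself.

```python
from typing import Iterable, List

# Hypothetical labels for the PET/CT series listed in the text.
EXPECTED_PET_CT_SERIES = {"PET_AC", "PET_NAC", "CTAC"}

# Slice thickness limit for diagnostic CT quoted in the text (<= 5 mm).
MAX_CT_SLICE_THICKNESS_MM = 5.0


def missing_series(received: Iterable[str]) -> List[str]:
    """Return the expected PET/CT series that were not received, sorted by name."""
    return sorted(EXPECTED_PET_CT_SERIES - set(received))


def slice_thickness_ok(thickness_mm: float) -> bool:
    """True when the diagnostic CT slice thickness is positive and at most 5 mm."""
    return 0 < thickness_mm <= MAX_CT_SLICE_THICKNESS_MM
```

MIP and fusion images are deliberately left out of the expected set, because the text allows them to be created by the viewing software.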
Viewing stations for image review and interpretation should provide adequate functionality to allow multiplanar display (i.e., axial, coronal and sagittal views) of PET, diagnostic CT and fused PET/CT images for image interpretation and lesion cross-referencing purposes. PET images should be scaled to a set SUV range and color table.
PET software should allow creation of maximum-intensity-projection images (of special importance when visually scoring distant lesions against the mediastinum and liver reference tissues). Reading software should allow for vendor-neutral evaluation of PET images, including semiquantitative uptake measurements, and of CT images, including size measurements, and should ideally allow for volumetric assessments (which are interesting exploratory measurements but not included in the Lugano classification). The Quantitative Imaging Biomarkers Alliance (QIBA [https://rsna.org/QIBA]) has provided guidance on technical performance standards for systems (7,8) when the aim is to use 18F-FDG PET as a quantitative imaging biomarker.
18F-FDG PET/CT and Diagnostic CT Scan Visits
PET/CT should provide sufficient anatomic coverage to accurately assess whole-body tumor burden. As a minimum for all patients, PET/CT should include common areas of disease involvement including the neck, chest, abdomen, and pelvis (including groin). Coverage should be adjusted to include additional areas of known or suspected disease (e.g., extremities). Inclusion of the brain is dependent on the lymphoma disease status and imaging center standard protocol. It is highly recommended that 18F-FDG PET emission scanning commence in the pelvis or thigh region and extend to the upper body, to avoid reconstruction artifacts due to high bladder uptake. The same PET/CT scanner and scanning direction should be used on follow-up time points, and consistent patient positioning and breathing instructions should be ensured across all imaging visits. Time from injection of 18F-FDG to acquisition of PET images should be kept rigorously constant across successive scans in a patient to allow for comparability of metabolic images (ideally ±5 min, up to ±10 min, compared with time used at baseline), and acquisition should always be timed as close as possible to 60 min after injection (55–75 min is acceptable) (7–10). Factors affecting SUV calculation (e.g., injection time, but also administered activity, weight) that are entered manually onto the scanner should be carefully checked and documented for quality control purposes.
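The uptake-time tolerances above can be turned into an automated quality control check. The sketch below is illustrative only: the function names and the wording of the findings are assumptions, while the numeric limits (55–75 min window, ±5 min ideal and ±10 min maximum deviation from baseline) are those stated in the text.

```python
from datetime import datetime
from typing import List, Optional

ACCEPTABLE_WINDOW_MIN = (55, 75)  # acceptable uptake time after injection
IDEAL_DRIFT_MIN = 5               # ideal deviation from the baseline uptake time
MAX_DRIFT_MIN = 10                # largest tolerated deviation from baseline


def uptake_minutes(injection: datetime, acquisition_start: datetime) -> float:
    """Minutes elapsed between tracer injection and the start of PET acquisition."""
    return (acquisition_start - injection).total_seconds() / 60.0


def check_uptake_time(uptake_min: float,
                      baseline_uptake_min: Optional[float] = None) -> List[str]:
    """Return quality control findings; an empty list means no issue was found."""
    findings = []
    low, high = ACCEPTABLE_WINDOW_MIN
    if not low <= uptake_min <= high:
        findings.append(
            f"uptake time {uptake_min:.0f} min is outside the {low}-{high} min window")
    if baseline_uptake_min is not None:
        drift = abs(uptake_min - baseline_uptake_min)
        if drift > MAX_DRIFT_MIN:
            findings.append(
                f"drift of {drift:.0f} min from baseline exceeds {MAX_DRIFT_MIN} min")
        elif drift > IDEAL_DRIFT_MIN:
            findings.append(
                f"drift of {drift:.0f} min from baseline exceeds the ideal {IDEAL_DRIFT_MIN} min")
    return findings
```

For a follow-up scan, passing the baseline uptake time as the second argument checks both the absolute window and the consistency with baseline in one call.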
Whenever possible, 18F-FDG PET/CT and diagnostic CT scans, if both are required at the same time point, should be acquired on the same scanner during the scheduled imaging visit for patient convenience. CTAC scans should be obtained without intravenous or positive oral bowel contrast. Diagnostic CT with intravenous contrast should be performed after the PET CTAC acquisition in order to avoid overattenuation of the PET images from the CT contrast medium.
A CT should be considered of diagnostic quality (so-called diagnostic CT) if it has adequate resolution to detect and accurately measure lesions and spleen size and contains intravenous contrast, unless contraindicated, ideally in the portal venous phase for clinical trials. Oral contrast is recommended per site standard of care, especially in patients with known or suspected hollow viscus involvement or mesenteric lymphadenopathy. Technical acquisition parameters, use of intravenous contrast unless medically contraindicated, breath-hold techniques, and arm positioning should be specified beforehand in study documents and kept as consistent as possible for a given subject across time points, and as much as possible for the trial. The CT portion of a PET/CT can be used for lesion and spleen measurements if it is considered of acceptable diagnostic quality.
For situations in which a patient is diagnosed at a center different from the treating institution, it is of utmost importance that the baseline scan (images and image acquisition fields) be made available in DICOM format to enable comparison to subsequent imaging. Ideally, all scans for the same patient should be conducted with the same scanner and at the same institution throughout the trial.
Further recommendations are provided in Supplemental Table 1 (available at http://jnm.snmjournals.org).
PET Acquisition and Image Reconstruction
Phantom-based quantitative calibration validation is strongly recommended before a clinical trial is started, and is critical in trials in which main endpoints require SUV/activity concentration–based quantitative measurements. However, for trials with no quantitative measurements, the regular quality control that is used for clinical care and recommended by the imaging facilities, manufacturer, and institution may be sufficient.
Semiquantitative SUV read-outs can be of interest in trials using the Lugano classification (2), and it is highly recommended that the comprehensive QIBA 18F-FDG profile (7,8) be implemented at each site as a guideline for standardization of the 18F-FDG PET workflow. Other guidance exists, such as the European Association of Nuclear Medicine procedure guidelines for tumor imaging with 18F-FDG PET/CT (10). The scanning sites and study sponsor should agree on key PET reconstruction parameters in order to harmonize image quality and quantification.
Change in SUVmax (ΔSUV) and metabolic tumor volume may be promising tools for response evaluation and prognosis in lymphoma (11,12), including for clinical trials, further emphasizing the need for standardization of PET acquisition (13). A change in SUV measurement (e.g., ΔSUVmax of less than or equal to 66% in 18F-FDG PET/CT after 2 cycles of chemotherapy for diffuse large B-cell lymphoma as a correlate to an unfavorable outcome (14–17)) has been suggested for response and prognosis evaluation at interim PET as well as for assessment in PET-guided therapy (18). This promising measurement is undergoing further validation (19–21).
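As an illustration of the ΔSUVmax measure discussed above, the sketch below computes the percentage reduction in SUVmax between a baseline and an interim scan and compares it with the 66% cutoff quoted in the text. The function names are hypothetical, and which lesion supplies each SUVmax value is left to the study protocol; this is not a validated implementation.

```python
# Cutoff quoted in the text for diffuse large B-cell lymphoma after 2 cycles.
DELTA_SUVMAX_CUTOFF_PERCENT = 66.0


def delta_suvmax_percent(baseline_suvmax: float, interim_suvmax: float) -> float:
    """Percentage reduction in SUVmax from the baseline scan to the interim scan."""
    if baseline_suvmax <= 0:
        raise ValueError("baseline SUVmax must be positive")
    return 100.0 * (baseline_suvmax - interim_suvmax) / baseline_suvmax


def at_or_below_cutoff(baseline_suvmax: float, interim_suvmax: float,
                       cutoff_percent: float = DELTA_SUVMAX_CUTOFF_PERCENT) -> bool:
    """True when the reduction is less than or equal to the cutoff."""
    return delta_suvmax_percent(baseline_suvmax, interim_suvmax) <= cutoff_percent
```

For example, a fall in SUVmax from 20 to 10 is a 50% reduction, which lies at or below the 66% cutoff, whereas a fall from 20 to 5 is a 75% reduction and does not.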
Acquisition and reconstruction methods should be kept consistent throughout the trial and between patient visits. PET 3-dimensional mode acquisition with time-of-flight is preferred when available. In the interest of harmonizing image acquisition across sites, newer reconstruction methods that may not be widely available (e.g., point spread function corrections, regularized reconstructions, artificial intelligence–based acquisition and reconstruction algorithms) should be used cautiously to assess study outcomes for PET-guided therapy decisions until their impact on the 5-point scale (5-PS) is better understood.
However, the TF acknowledges that phantom harmonization programs that align scanner performances across institutions may help to mitigate such differences between newer reconstruction methods, especially for semiquantitative assessments (e.g., SUV and metabolic tumor volume). Although prospective harmonization of PET scanners in a multiinstitutional clinical trial setting is desirable, it may not always be entirely practical or feasible for a variety of reasons (including the use of different reconstruction algorithms, such as Bayesian penalized likelihood and point spread function, compared with older methods, such as traditional ordered-subset expectation maximization).
Technical Influence of Lesion Size and Background Activity
The influence of lesion size and activity concentration on partial volume is difficult to correct for in smaller lesions. This is particularly relevant when using the 5-PS to assess small residual lesions in lymphoma response assessment. In phantom studies using different sized spheres filled with identical concentrations of 18F to mimic tumor sizes, smaller lesions (<2 cm) appeared to have less 18F-FDG activity than larger lesions (≥2 cm) (22–24). This is due to the inability of PET scanners to fully recover all the counts (i.e., partial-volume effects) from smaller compared with larger spheres (or lesions) (22).
Although newer scanners may have advanced reconstruction algorithms to account for the loss of signal (point-response function or regularized reconstructions), there have been no well-controlled studies addressing this issue or its influence on the application of the 5-PS.
Therefore, a uniform recommendation by the TF on how to integrate lesion size information into Lugano evaluation is not possible at this time, and further investigation is encouraged.
Signal-to-noise ratio plays an important role in lesion detection. Image reconstruction and postprocessing with available reconstruction algorithms and filtering help to control noise; these settings should be optimized for individual scanners based on either phantom testing or the manufacturer’s specifications. However, the conspicuity of lesions depends not only on lesion signal but also on the uptake or signal in surrounding tissue and organs. Therefore, the reader should be aware of this phenomenon when interpreting scans.
SUV Measurements
Some semiquantitative measurements are routinely recorded (e.g., most hypermetabolic lesion, reference regions), and such measurements may be used to confirm visual assessment, for example, to assign a score of 5 on the 5-PS (2,11).
SUVs that are captured (e.g., most hypermetabolic lesion, reference regions) usually represent the SUVmax, in alignment with the Lugano classification. However, other types of measurements (e.g., lesion SUVpeak, reference region SUVmean) are frequently recorded in clinical trials (11).
SUVmax represents the uptake in the single voxel exhibiting the highest tracer uptake in the region of interest. It is easily available on read stations, has good interreader reproducibility, and is relatively unaffected by partial-volume effects. However, SUVmax is influenced by noise.
SUVpeak is the average of the SUV in the 1 cm³ of voxels with the highest activity in a volume of interest. SUVpeak (corrected for lean body mass) is used in PERCIST (25). PERCIST was proposed in 2009 to better standardize PET response criteria in solid tumors while combining good interreader reproducibility, reducing the influence of partial volume with SUVmax, and improving count rate stability.
SUVmean represents the mean tracer uptake in the region of interest. Usually, the region of interest in which SUVmean is calculated should be placed over the most metabolically active portion of the area of interest. Measurement of the mean is dependent on the size of the region or volume of interest, which should be standardized.
Further work is warranted in this field to identify the optimal measure for lymphomas. In addition, metabolic assessments (e.g., metabolic tumor volumes) and other radiomic features may become more important in the future.
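To make the SUV definitions above concrete, the following sketch computes a body-weight-normalized SUV and then SUVmax and SUVmean over a set of voxel values. It rests on several assumptions that are not part of the recommendations: body-weight normalization, a tissue density of 1 g/mL, an 18F half-life of 109.77 min, and activity concentrations that have not already been decay-corrected by the scanner software. SUVpeak is omitted because it requires a 3-dimensional neighborhood search; all function names are hypothetical.

```python
from typing import Sequence

F18_HALF_LIFE_MIN = 109.77  # physical half-life of fluorine-18, in minutes


def decayed_activity_bq(injected_bq: float, minutes_elapsed: float,
                        half_life_min: float = F18_HALF_LIFE_MIN) -> float:
    """Activity remaining after physical decay over the elapsed time."""
    return injected_bq * 0.5 ** (minutes_elapsed / half_life_min)


def suv_bw(concentration_bq_per_ml: float, injected_bq: float,
           weight_kg: float, minutes_elapsed: float) -> float:
    """Body-weight SUV: tissue concentration divided by activity per gram of body weight."""
    activity = decayed_activity_bq(injected_bq, minutes_elapsed)
    return concentration_bq_per_ml * (weight_kg * 1000.0) / activity


def suv_max(voxel_suvs: Sequence[float]) -> float:
    """Highest single-voxel SUV in the region of interest (sensitive to noise)."""
    return max(voxel_suvs)


def suv_mean(voxel_suvs: Sequence[float]) -> float:
    """Average SUV in the region of interest (depends on how the region is drawn)."""
    return sum(voxel_suvs) / len(voxel_suvs)
```

Because every input to suv_bw (injected activity, weight, elapsed time) scales the result directly, an error in any manually entered value propagates to all SUV-based read-outs.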
Terminology for Image Evaluation and Reporting
Lugano 2014 considers both metabolic and anatomic assessments when evaluating 18F-FDG–avid lymphomas. With regard to response assessed on diagnostic CT, both radiographic and anatomic terminology have been used. The TF recommends using the term “anatomic” to describe response.
When response to therapy is evaluated, it is recommended that the metabolic, anatomic, imaging (metabolic response, anatomic response, or combination of both when both available), and overall (used to determine endpoints, integrating clinical data when available) responses be assessed and recorded. In order to differentiate anatomic and overall responses, which currently use the same terminology, it was suggested that “anatomic” be incorporated when recording the anatomic response. Thus, the anatomic response is now referred to as complete anatomic response (CAR), partial anatomic response (PAR), stable anatomic disease (SAD), and progressive anatomic disease (PAD). Metabolic response remains defined as complete metabolic response or partial metabolic response (CMR or PMR, respectively), no metabolic response (NMR, preferred term, because “stable disease” usually refers to radiographic stability) or stable metabolic disease (SMD), and progressive metabolic disease (PMD). The overall response remains defined as complete response, partial response, stable disease, and progressive disease. This makes clear what each component of the response is and how the components together yield the overall response.
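The terminology above could be captured in a case report form or database as controlled vocabularies. The sketch below is one possible encoding, not a prescribed data model: the class and field names are hypothetical, and the overall-response abbreviations (CR, PR, SD, PD) are the conventional ones rather than terms defined in this section. No rule for deriving the overall response from its components is implied.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AnatomicResponse(Enum):
    CAR = "complete anatomic response"
    PAR = "partial anatomic response"
    SAD = "stable anatomic disease"
    PAD = "progressive anatomic disease"


class MetabolicResponse(Enum):
    CMR = "complete metabolic response"
    PMR = "partial metabolic response"
    NMR = "no metabolic response"   # preferred term
    SMD = "no metabolic response"   # alias of NMR (stable metabolic disease)
    PMD = "progressive metabolic disease"


class OverallResponse(Enum):
    CR = "complete response"
    PR = "partial response"
    SD = "stable disease"
    PD = "progressive disease"


@dataclass
class TimePointAssessment:
    """One time point, with each response component recorded separately."""
    metabolic: Optional[MetabolicResponse]  # None when no PET was performed
    anatomic: Optional[AnatomicResponse]    # None when no diagnostic CT was performed
    overall: OverallResponse
```

Declaring SMD with the same value as NMR makes it an alias in Python's Enum, so both spellings resolve to the same member and only the preferred term appears when the vocabulary is listed.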
ROLE OF THE REVIEWERS: EXPERIENCE AND QUALIFICATIONS, TRAINING, AND MONITORING
Imaging Reviewers Qualifications and Experience
Dependent on the read requirements of a clinical protocol, the imaging reviewers should meet certain qualifications, including documentation of competency in diagnostic CT or PET/CT.
Reviewers should be board-eligible (BE) or board-certified (BC) nuclear medicine physicians (or the regional/national equivalent) with experience or certification in CT/MRI, or BE/BC radiology physicians with experience or training in PET/CT imaging.
Clinical Reviewers Qualifications and Experience
Although the Lugano classification does not specifically recommend separate imaging and clinical reviews, if a hematology–oncology review is requested, then the clinical reviewers selected should meet prespecified qualifications including the credentials as a BE or BC physician in hematology or oncology (or the regional/national equivalent).
Additional experience in clinical care of hematologic malignancies—either through clinical practice or in clinical trials—is required.
In addition, all reviewers, both imaging and clinical, should provide documented evidence of prior clinical experience with lymphomas and clinical trial participation in lymphoma studies on their CV or through attestations of participation. In cases where a reviewer may have no prior experience in clinical trial reads, a program of appropriate training about the application of the Lugano classification in the context of clinical trials and including test cases is required.
Close monitoring of on-trial performance is recommended for all reviewers, both imaging and clinical, regardless of training or experience.
Role of the Imaging Reviewer
The role of a blinded independent central reviewer (BICR) is to provide independent review of cases without bias or unblinding to treatment. It is recommended, when possible, that the reviewer remain the same throughout the reads of all time points for a patient. Where feasible, it is ideal to have the same reviewer assess both the 18F-FDG PET/CT and the diagnostic CT throughout the entire study for an individual patient. If separate reads of diagnostic CT and 18F-FDG PET/CT occur, it is recommended that both readers meet for an integration read of the anatomic and metabolic assessments to provide 1 patient-level imaging time-point assessment.
Whenever there are 2 BICRs evaluating scans from the same patient and modality, a third independent reviewer (adjudicator) should be assigned to review the scans in cases of time-point assessment discrepancies to resolve any disagreements that would impact the overall time-point responses.
During an adjudication event, the adjudicator should select which reader he or she most closely agrees with, rather than providing a third independent assessment, and a rationale for the selection should be provided. Alternative adjudication workflows exist, which are beyond the scope of this article.
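The adjudication workflow described above, in which the adjudicator selects one of the two reads and documents a rationale, can be sketched as follows. This is a simplified illustration with hypothetical function and parameter names; responses are represented as plain strings, and alternative adjudication workflows are not covered.

```python
from typing import Optional


def needs_adjudication(reader1_response: str, reader2_response: str) -> bool:
    """True when the two readers' time-point assessments disagree."""
    return reader1_response != reader2_response


def final_assessment(reader1_response: str, reader2_response: str,
                     adjudicator_choice: Optional[int] = None,
                     rationale: str = "") -> str:
    """Return the agreed response, or the read selected by the adjudicator.

    The adjudicator does not give a third assessment: the choice must be
    1 or 2 (the reader agreed with) and must come with a rationale.
    """
    if not needs_adjudication(reader1_response, reader2_response):
        return reader1_response
    if adjudicator_choice not in (1, 2):
        raise ValueError("adjudicator must select reader 1 or reader 2")
    if not rationale.strip():
        raise ValueError("a rationale for the selection is required")
    return reader1_response if adjudicator_choice == 1 else reader2_response
```

Requiring the rationale at the point of selection keeps the audit trail complete without adding a separate documentation step.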
Reviewers Training and Monitoring
Recommended activities that both imaging and clinical reviewers should complete before the start of on-study reads include training on Lugano classification (and any protocol-specified modifications or clarifications) and on completion of imaging case report forms as well as familiarization with workstation usage and group review of clinical cases for formulating consensus on scan interpretation and time-point responses.
Borderline and challenging cases should be included in the training; the number of cases should depend on the study design and on the reviewers’ experience with the response criteria (best practice is to consider 3 cases as a minimum, with more for less-experienced readers or more complex studies), bearing in mind that statistics from such a small training sample may not be significant.
Monitoring (e.g., intra- and interreader variability, adjudication rates) is recommended per guidance documents of the Food and Drug Administration (26) and should be performed for all reviewers regardless of training or experience. Members from the PINTaD recently published additional information on reader variability and monitoring of performance (27,28). Reader monitoring should start early in the course of the trial to allow for timely retraining when necessary. Group retraining is recommended on the basis of monitoring results, or as periodic follow-up group retraining or reviews, so that all readers receive identical information and systematic discordance is not introduced.
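Two of the monitoring metrics mentioned above can be computed from paired reads as sketched below. The adjudication rate is taken here as the fraction of time points on which the two readers disagree. Cohen's kappa is added as one common chance-corrected agreement statistic; it is an illustrative choice and is not specified by these recommendations. Function names are hypothetical.

```python
from collections import Counter
from typing import Sequence, Tuple

ReadPair = Tuple[str, str]  # (reader 1 response, reader 2 response) for one time point


def adjudication_rate(pairs: Sequence[ReadPair]) -> float:
    """Fraction of time points on which the two readers disagree."""
    if not pairs:
        raise ValueError("at least one paired read is required")
    discordant = sum(1 for first, second in pairs if first != second)
    return discordant / len(pairs)


def cohen_kappa(pairs: Sequence[ReadPair]) -> float:
    """Chance-corrected agreement between two readers over categorical responses."""
    if not pairs:
        raise ValueError("at least one paired read is required")
    n = len(pairs)
    observed = sum(1 for first, second in pairs if first == second) / n
    counts1 = Counter(first for first, _ in pairs)
    counts2 = Counter(second for _, second in pairs)
    # Agreement expected by chance from each reader's category frequencies.
    expected = sum(counts1[cat] * counts2[cat] for cat in counts1) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used a single identical category throughout
    return (observed - expected) / (1.0 - expected)
```

Tracking these values per reader pair from the first reads onward supports the early monitoring and timely retraining recommended above.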
An example of an imaging case report form and a summary of recommendations can be found in the supplemental materials and Supplemental Table 1, respectively.
CONCLUSION
The PRoLoG initiative has created a platform to gather recommendations from an international group of recognized imaging and clinical expert end-users from academia and industry in the field of lymphoma response assessment to standardize application of the Lugano classification in clinical trials and beyond.
These recommendations are intended for clinical users, at local sites and central facilities, in academic and pharmaceutical clinical trials to enhance standardized acquisition and evaluation with the Lugano classification, facilitating the conduct of clinical trials and regulatory review, ultimately leading to improved outcomes for patients with lymphoma.
DISCLOSURE
Sally Barrington acknowledges support from the National Institute for Health and Care Research (NIHR) (RP-2-16-07-001) and core funding from the Wellcome/EPSRC Centre for Medical Engineering at King’s College London (WT203148/Z/16/Z) and the NIHR Biomedical Research Centre based at Guy’s and St. Thomas’ NHS Foundation Trust and King’s College London and the NIHR Clinical Research Facility. The views expressed by Professor Barrington are not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. This project was sponsored by PINTaD, and the views expressed are those of the authors, not necessarily of their institution. Fabien Ricard, employed by Bayer at the time of first submitting the manuscript for this article, is now an employee of, and has shares in, Relay Tx. Paul Galette is employed by and a shareholder of GSK. Greg Goldmacher is employed by Merck and has stocks in Merck, ImmunoGen, and Aveo. Julie Gillis, formerly employed by Imaging Endpoints, is at Merigold LLC. Pierre Terve is with Keosys Medical Imaging. Min Liu has shares in Autolus. Larry Schwartz reports to BMS, Regeneron, and Merck as a member of an independent review panel and data safety monitoring board for clinical trials. Jayant Narang and Rudresh Jarecha, formerly employed by Calyx, are with Takeda Pharmaceuticals and Deciphera Pharmaceuticals, respectively. Ron Korn is owner and CMO at Imaging Endpoints Core Lab, a consultant for the Virginia G. Piper Cancer Center and ImaginAB Technologies, and a shareholder of Verve Medical, Telelite Health, Globavir, and Renibus. No other potential conflict of interest relevant to this article was reported.
KEY POINTS
QUESTION: How can the Lugano classification be consistently applied among imaging end-users?
PERTINENT FINDINGS: These consensus recommendations should be used as a companion to the Lugano classification with regard to required imaging series and scan visits and acquisition and reconstruction of PET images. The roles of imaging and clinical reviewers as well as of training and monitoring are clarified.
IMPLICATIONS FOR PATIENT CARE: This guidance will enhance usage of the Lugano classification, facilitating the conduct of clinical trials and regulatory review, ultimately leading to improved outcomes for patients with lymphoma.
ACKNOWLEDGMENTS
We thank all the members from the PINTaD who participated in the PRoLoG initiative, especially Klaus Noever, Melissa Burkett, Anand Devasthanam, Nick Enus, Andres Forero Torres, Nick Galante, Sayali Karve, Katarina Ludajic, Michael ONeal, Ravikanth Mankala, Daniel Mollura, and Jason Vilardi. In addition, we thank Tina Nielsen and John Sunderland for their review of the manuscript, as well as the original authors of the classification and companion paper. Finally, we thank Deming Litner for her administrative support and the CDISC group for discussion on terminology for anatomic and metabolic responses.
Footnotes
Published online Jul. 14, 2022.
- © 2023 by the Society of Nuclear Medicine and Molecular Imaging.
- Received for publication March 11, 2022.
- Revision received July 7, 2022.