Abstract
Introduction: Automated and reliable lesion segmentation from whole-body PET images is a major prerequisite to the widespread use of imaging biomarkers in clinical practice. The time-consuming task of segmentation is among the important barriers to the dissemination of quantitative PET measures (e.g., total metabolic tumor volume, TMTV). The challenges of data-hungry supervised (SUP) AI techniques for segmentation include: 1) lack of consensus on ground truth (GT) generation; 2) the time-consuming task of manual labeling, which suffers from intra-observer, inter-observer, and inter-center variability; 3) lack of precision at the edges of predicted masks (PMs), which can be caused by imprecise GTs and improper loss functions. Deep networks are typically trained (supervised) on domain-specific data with limited diversity compared to their target application, hampering generalizability to “unseen” data. In the absence of substantial annotated data, unsupervised (UNSUP) and semi-supervised (SEMI) approaches can be helpful.
Methods: We used a fuzzy clustering loss function (Chen et al. 2021), i.e., fuzzy C-means (FCM), for training our 3D U-Net, incorporating the fuzziness of the overlap between foreground and background classes. We used the robust FCM (RFCM) loss, which integrates spatial constraints via a Markov-random-field (MRF) regularization term that penalizes changes across neighboring voxels. Table 1 lists the loss functions and approaches used in this study. FCM loss functions can be used for training convolutional neural networks (CNNs) with different levels of supervision, i.e., SUP (FCMLabel), UNSUP (RFCM), and SEMI (FCMLabel+RFCM). The SUP FCM loss is calculated from the softmax output of the PM and the GT labels, while the UNSUP loss characterizes the intensity statistics of the input image and the 2-channel softmax activation of the final layer of the 3D U-Net within the RFCM objective function, independent of GT labels. Each softmax channel represents the probability of a voxel belonging to the lymphoma-lesion class or the background class. The fuzziness parameter q in RFCM enables a voxel to belong to multiple classes, which helps quantify the partial volume effect in PET images, where a voxel usually contains uptake information from more than one class.
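For reference, the UNSUP objective can be written in standard RFCM notation (a sketch following Pham's RFCM as adapted in Chen et al. 2021; the symbols below are our notation, and the exact hyper-parameters used are those in Table 1). With y_j the intensity of voxel j, u_{jk} the k-th softmax channel at voxel j, v_k the centroid of class k, N_j a spatial neighborhood of voxel j, and \beta the MRF regularization weight:

    L_RFCM = \sum_j \sum_{k=1}^{2} u_{jk}^{q} (y_j - v_k)^2 + \frac{\beta}{2} \sum_j \sum_{k=1}^{2} u_{jk}^{q} \sum_{l \in N_j} \sum_{m \neq k} u_{lm}^{q}

The first (data) term ties the memberships to the intensity statistics of the input image; the second (MRF) term penalizes neighboring voxels being assigned to different classes. In the SEMI setting, this UNSUP term is combined with the SUP FCMLabel loss.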
Our diffuse large B-cell lymphoma (DLBCL) cases are from two different centers (center 1, n=42; center 2, n=80) with different scanners and voxel sizes. PET images were interpolated to a common voxel size of 1 mm³ by linear interpolation and normalized independently using Z-score normalization, computed per image from the mean and standard deviation of the non-zero voxels of the body region. The performance of the 3D U-Net was evaluated for the SUP, SEMI, and UNSUP approaches using Unified Focal (UF), FCM, Mumford-Shah (MS) (Kim et al.), and Dice similarity coefficient (DSC) based losses (corresponding hyper-parameters are listed in Table 1).
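A minimal sketch of this preprocessing step, assuming NumPy/SciPy arrays (function and variable names are illustrative, not the authors' code):

    import numpy as np
    from scipy.ndimage import zoom

    def preprocess_pet(volume, voxel_size_mm):
        # Resample to isotropic 1 mm^3 voxels with linear interpolation (order=1).
        factors = [s / 1.0 for s in voxel_size_mm]
        iso = zoom(volume, factors, order=1)
        # Z-score normalization per image, using only non-zero (body) voxels.
        body = iso[iso > 0]
        return (iso - body.mean()) / body.std()

    # Example: a PET volume acquired with 4 x 4 x 2 mm voxels.
    # normalized = preprocess_pet(pet_array, voxel_size_mm=(4.0, 4.0, 2.0))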
Results: SUP approaches with UF, DSC, and FCM losses were trained on labeled data (n=92), and the 3D U-Net with UF loss outperformed all other approaches on test data from center 2 (n=30) (Figure 1 & Table 1). The SEMI approach combining the UNSUP RFCM loss with a SUP loss (RFCM+FCMLabel) outperformed the UNSUP model in terms of Dice score. The Dice score of the SEMI model (RFCM+FCMLabel) (Dice=0.64±0.26) is close to that of SUP with DSC loss (Dice=0.67±0.13), with the former using less annotated data. The Hausdorff distance of the SEMI model (RFCM+FCMLabel) is lower than that of the SUP approach and higher than that of the other SEMI approach (MS+DSC).
Conclusions: Given the wider availability of unlabeled PET data, it is possible to leverage such data via UNSUP and SEMI approaches towards automated segmentation of tumors. In particular, we showed that a SEMI approach with an UNSUP loss function (RFCM) coupled to a SUP loss (FCM) can achieve promising performance, which we expect to improve further when utilizing the significantly larger pool of unlabeled data that we are considering in our future work.