Abstract
P594
Introduction: In current clinical practice, the Gleason score (GS) from needle biopsy is the main factor in deciding on radical prostatectomy (RP) as the first step of a multitherapeutic approach. A standardized workflow to support precise clinical decisions is urgently needed. Multi-omics, the integration of diverse omics data, gives urologists a more detailed view of prostate cancer, encompassing genetic signatures from genomics, protein abundance from proteomics and molecular behavior from radiomics. In this study, we aimed to deploy a high-throughput machine learning model that merges multiple omics data (radiomics, genomics and proteomics) to predict the Gleason grading of primary prostate cancer for precise patient stratification.
Methods: Patients and Materials: From May 2014 to April 2020, 146 patients with newly diagnosed prostate cancer (PCa) at the Vienna General Hospital were retrospectively enrolled. All underwent 68Ga-PSMA-11 PET/MR scans in the hospital's nuclear medicine department before radical prostatectomy. After surgery, patients were followed up until 1 December 2021.
Radiomics Data Acquisition: To ensure robustness and standardization and to avoid intra- and inter-operator variability, an automated U-Net-based semantic segmentation algorithm delineated the prostate on T2-weighted images (T2WI). A PET-based lesion segmentation was then performed within the resulting whole-prostate mask.
Genomics Data Acquisition: DNA was isolated from formalin-fixed paraffin-embedded (FFPE) tissue samples obtained at radical prostatectomy, and whole-exome sequencing (WES) was performed.
Proteomics Data Acquisition: We constructed tissue microarrays from RP specimens in a layout that allows homogeneous fixation for subsequent immunohistochemistry. Immunohistochemical analysis was performed with PCa-specific biomarkers.
ML-based Data Integration: The EBM classification algorithm was applied to integrate clinical parameters, radiomics data, genomics data and proteomics data. Classification results were validated with 10-fold Monte Carlo cross-validation to ensure robust performance metrics. To improve the interpretability of the model, feature importance was derived by permutation feature importance.
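The validation scheme described above (repeated random train/test splits plus permutation feature importance) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, a nearest-centroid classifier stands in for the EBM model, "10-fold" is read as 10 random splits, and the 20% hold-out fraction is arbitrary. It is not the study's pipeline.

```python
import random
from statistics import mean

def make_data(n=200, seed=0):
    """Synthetic two-feature data (hypothetical, not study data)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = rng.gauss(2.0 * label, 1.0)  # shifts with the label
        noise = rng.gauss(0.0, 1.0)                # unrelated to the label
        X.append([informative, noise])
        y.append(label)
    return X, y

def fit_centroids(X, y):
    """Stand-in classifier: one mean vector per class (not EBM)."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, X):
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(centroids, key=lambda c: dist2(x, centroids[c])) for x in X]

def accuracy(y_true, y_pred):
    return mean(1.0 if t == p else 0.0 for t, p in zip(y_true, y_pred))

def monte_carlo_cv(X, y, n_splits=10, test_fraction=0.2, seed=1):
    """Repeated random splits; returns mean accuracy and, per feature,
    the mean accuracy drop after permuting that feature in the test set."""
    rng = random.Random(seed)
    n_features = len(X[0])
    accs = []
    drops = [[] for _ in range(n_features)]
    for _ in range(n_splits):
        idx = list(range(len(X)))
        rng.shuffle(idx)
        n_test = int(len(idx) * test_fraction)
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        model = fit_centroids([X[i] for i in train_idx],
                              [y[i] for i in train_idx])
        X_test = [X[i] for i in test_idx]
        y_test = [y[i] for i in test_idx]
        base = accuracy(y_test, predict(model, X_test))
        accs.append(base)
        for j in range(n_features):
            col = [row[j] for row in X_test]
            rng.shuffle(col)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:]
                      for row, v in zip(X_test, col)]
            drops[j].append(base - accuracy(y_test, predict(model, X_perm)))
    return mean(accs), [mean(d) for d in drops]

X, y = make_data()
mean_acc, importances = monte_carlo_cv(X, y)
print(f"mean accuracy over splits: {mean_acc:.2f}")
print("mean accuracy drop per permuted feature:", importances)
```

In this toy setting the informative feature should show a clearly larger accuracy drop than the noise feature, which is the signal permutation importance is designed to expose.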
Results: A heatmap (Fig. 1) of genes altered in ≥10% of cases, annotated with the potential prognostic genomic markers (tumor mutational burden and copy number variant burden) and clinical parameters (biochemical recurrence (BCR) status and ISUP grade), showed that gene-level alterations were sparse and had no significant or meaningful correlation with the prognostic biomarkers. Therefore, only pathway-level data were used in the subsequent ML-based analysis, to conform to the balanced grouping standards of the machine learning system.
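The ≥10% frequency cut-off mentioned above can be sketched as a simple per-gene count across patients. The patient identifiers, gene names and the `frequent_genes` helper are hypothetical placeholders, and the sketch assumes one set of altered genes per patient. It does not reproduce the study's variant-calling or filtering steps.

```python
from collections import Counter

def frequent_genes(mutations_by_patient, min_fraction=0.10):
    """Return {gene: fraction of patients altered} for genes that are
    altered in at least `min_fraction` of patients."""
    n = len(mutations_by_patient)
    counts = Counter(
        gene
        for genes in mutations_by_patient.values()
        for gene in set(genes)  # count each gene once per patient
    )
    return {g: c / n for g, c in counts.items() if c / n >= min_fraction}

# Hypothetical cohort of 20 patients, most without any listed alteration.
cohort = {f"P{i:02d}": set() for i in range(20)}
cohort["P00"] |= {"GENE_A", "GENE_B"}
cohort["P01"] |= {"GENE_A"}
cohort["P02"] |= {"GENE_A", "GENE_C"}

kept = frequent_genes(cohort)
print(kept)
```

Here only `GENE_A` (3 of 20 patients) passes the threshold, which illustrates how quickly a gene-level matrix becomes sparse in a small cohort.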
In the trial phase, sensitivity (SNS), specificity (SPC), positive predictive value (PPV), negative predictive value (NPV), accuracy (ACC) and area under the curve (AUC) were calculated for three ML-based approaches. The EBM classification algorithm performed best, with values of 0.75, 0.88, 0.75, 0.88, 0.83 and 0.81, respectively (Fig. 2). Ten-fold Monte Carlo cross-validation was used to ensure the robustness of these performance metrics.
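For readers less familiar with these metrics, the following sketch shows how SNS, SPC, PPV, NPV and ACC follow from a binary confusion matrix. The counts are arbitrary illustrative values, not the study's confusion matrix, and `confusion_metrics` is a hypothetical helper name. AUC is omitted because it requires continuous prediction scores rather than counts.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "SNS": tp / (tp + fn),                   # sensitivity (recall)
        "SPC": tn / (tn + fp),                   # specificity
        "PPV": tp / (tp + fp),                   # positive predictive value
        "NPV": tn / (tn + fn),                   # negative predictive value
        "ACC": (tp + tn) / (tp + fp + tn + fn),  # accuracy
    }

# Arbitrary example counts (assumption, for illustration only).
metrics = confusion_metrics(tp=30, fp=10, tn=50, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```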
Needle biopsy predicted Gleason grading with values of 0.62, 0.92, 0.81, 0.81 and 0.77, respectively. Relative to needle biopsy, the EBM model increased SNS, NPV, ACC and AUC by 13, 7, 2 and 4 percentage points, while SPC and PPV decreased by 4 and 6 percentage points.
Conclusions: Our findings suggest that the multi-omics-based machine learning model predicts Gleason grading better than the current clinical baseline, which may facilitate clinical decision-making and personalized management of prostate cancer.