Abstract
1051
Objectives: We aimed to identify distinct disease progression pathways in Parkinson’s disease (PD), making use of clinical and imaging features, towards improved understanding of the disease and better powering of clinical trials. In addition, we studied machine learning approaches to predict progression pathways from early (year 0 and 1) data.
Methods: We studied 885 PD subjects derived from longitudinal datasets (years 0, 1, 2 and 4; Parkinson’s Progression Markers Initiative, PPMI). We generated and analyzed 980 features, including Movement Disorder Society Unified Parkinson's Disease Rating Scale (MDS-UPDRS) measures, a range of task/exam performances, socioeconomic/family histories, and radiomics features (RFs) extracted for each region of interest (ROI; left and right caudate and putamen) using our standardized SERA radiomics software. Segmentation of ROIs on DAT SPECT images was performed via MRI images. After performing cross-sectional clustering to identify disease subtypes (3 sub-clusters robustly identified in our prior work, namely i) mild, ii) intermediate, and iii) severe) for any given patient in any given year, we identified optimal longitudinal pathways by applying a hybrid system (HS), consisting of Principal Component Analysis (PCA) as a dimension reduction algorithm (DRA) and Hierarchical Agglomerative Clustering (HAC) as a clustering method, to the longitudinal dataset. To optimize the number of longitudinal trajectories (clusters), we applied the Elbow clustering evaluation method to the results (for a range of 2-9 longitudinal clusters/pathways) generated by two HSs: PCA+K-Means Algorithm (KMA) and PCA+HAC. The optimized number of pathways was further confirmed by two other methods, the Bayesian Information Criterion (BIC) and the Calinski-Harabasz Criterion (CHC), applied to the clustering results provided by KMA. Subsequently, prediction of the identified trajectories from early data (years 0 and 1) was performed using multiple HSs combining 16 DRAs with 10 classifiers.
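The cluster-number selection step described above can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic data in place of the PPMI features; the number of retained principal components, the Ward linkage, and all variable names are illustrative assumptions and not the settings used in this study. BIC is omitted for brevity.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import calinski_harabasz_score

# Synthetic stand-in for the longitudinal feature matrix (assumption).
X, _ = make_blobs(n_samples=300, n_features=20, centers=3, random_state=0)

# Dimension reduction with PCA; 5 components is an arbitrary illustrative choice.
Z = PCA(n_components=5, random_state=0).fit_transform(X)

# Scan 2-9 clusters with K-Means, recording the within-cluster sum of squares
# (the quantity inspected in an elbow plot) and the Calinski-Harabasz score.
ks = list(range(2, 10))
inertia, ch_scores = [], []
for k in ks:
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(Z)
    inertia.append(km.inertia_)
    ch_scores.append(calinski_harabasz_score(Z, km.labels_))

# A higher Calinski-Harabasz score indicates better-separated clusters.
best_k = ks[int(np.argmax(ch_scores))]

# Final assignment with hierarchical agglomerative clustering on the PCA scores.
hac_labels = AgglomerativeClustering(n_clusters=best_k, linkage="ward").fit_predict(Z)
```

In practice the elbow is read from a plot of `inertia` against `ks`, and agreement between the elbow and the `ch_scores` maximum is taken as support for the chosen number of pathways.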
Results: Our analysis revealed significant heterogeneity in disease progression. We identified 3 distinct progression trajectories: (i, ii) two pathways with disease escalation (27% and 38% of patients) and (iii) one pathway with stable disease (35% of patients). For prediction from early data (years 0 and 1), HSs using the stochastic neighbor embedding algorithm (SNEA) or the locally linear embedding algorithm (LLEA) as the DRA, linked with the new probabilistic neural network classifier (NPNNC), resulted in accuracies of 78.4% and 79.2%, respectively, while other HSs such as SNEA+Lib_SVM (Library for Support Vector Machines) and t_SNE (t-distributed Stochastic Neighbor Embedding)+NPNNC resulted in accuracies of 76.5% and 76.1%, respectively.
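A DRA+classifier hybrid system of the kind evaluated above can be sketched as a scikit-learn pipeline. This is an illustration under stated assumptions only: the data are synthetic, locally linear embedding is used as the DRA because scikit-learn's implementation can transform unseen samples, and a libsvm-based `SVC` stands in for the classifier since NPNNC is not available in scikit-learn. The hyperparameters are arbitrary and the resulting scores have no relation to the accuracies reported here.

```python
from sklearn.datasets import make_classification
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for year 0 and 1 features with 3 trajectory labels (assumption).
X, y = make_classification(
    n_samples=300, n_features=30, n_informative=10, n_classes=3, random_state=0
)

# Hybrid system: scaling -> dimension reduction (LLE) -> classifier (SVC).
hybrid_system = Pipeline(
    steps=[
        ("scale", StandardScaler()),
        ("dra", LocallyLinearEmbedding(n_neighbors=10, n_components=5,
                                       eigen_solver="dense")),
        ("clf", SVC(kernel="rbf", C=1.0)),
    ]
)

# 5-fold cross-validated accuracy; the DRA is refit inside each training fold,
# so no information from the held-out fold leaks into the embedding.
scores = cross_val_score(hybrid_system, X, y, cv=5, scoring="accuracy")
mean_accuracy = scores.mean()
```

Wrapping the DRA and classifier in one pipeline keeps the comparison across many DRA and classifier pairings uniform, since each pairing only swaps the `dra` or `clf` step.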
Conclusions: We demonstrated that appropriate HS frameworks enabled identification of disease progression pathways (3 distinct longitudinal trajectories) as well as robust prediction of these pathways in PD subjects from early data.