Clustering of Largely Right-Censored Oropharyngeal Head and Neck Cancer Patients for Discriminative Groupings to Improve Outcome Prediction

Supervised Scaling Approach. The clustering implementation shown here is consensus clustering over 1,000 runs of the k-medians (k = 2) clustering method, each run with a different initial seed, using Manhattan distance as the dissimilarity measure.
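The consensus step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `k_medians` and `consensus_k_medians` are hypothetical, and the aggregation rule (align each run's two labels to the first run, then take a majority vote) is an assumption, since the summary does not say how the runs are combined.

```python
import numpy as np

def k_medians(X, k, rng, n_iter=100):
    """One k-medians run with Manhattan (L1) distance."""
    # Initialise centers from k distinct randomly chosen samples.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # L1 distance from every sample to every center.
        dist = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2)
        labels = dist.argmin(axis=1)
        # The coordinate-wise median minimises L1 distance within a cluster;
        # an empty cluster keeps its previous center.
        new_centers = np.array([
            np.median(X[labels == j], axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def consensus_k_medians(X, n_runs=1000, seed=0):
    """Majority vote over n_runs k-medians runs (k = 2), one seed per run."""
    X = np.asarray(X, dtype=float)
    reference = None
    votes = np.zeros(len(X))
    for run in range(n_runs):
        labels = k_medians(X, 2, np.random.default_rng(seed + run))
        if reference is None:
            reference = labels
        elif np.mean(labels == reference) < 0.5:
            # Cluster ids are arbitrary: flip them to match the first run.
            labels = 1 - labels
        votes += labels
    # Assumed aggregation rule: a sample gets label 1 if most runs say so.
    return (votes / n_runs >= 0.5).astype(int)
```

Calling `consensus_k_medians(X, n_runs=1000)` on a samples-by-features array returns one 0/1 label per sample. Which cluster is called 0 or 1 is arbitrary.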

Authors: Tosado, J., Zdilar, L., Elhalawani, H., Elgohari, B., Vock, D., Marai, G.E., Fuller, C.D., Mohamed, A.S., Canahuate, G.

Publication: Scientific Reports


Clustering is the task of identifying groups of similar subjects according to certain criteria. The AJCC staging system can be thought of as a clustering mechanism that groups patients based on their disease stage. This grouping drives prognosis and influences treatment. The goal of this work is to evaluate the efficacy of machine learning algorithms in clustering patients into discriminative groups to improve prognosis for overall survival (OS) and relapse-free survival (RFS) outcomes. We apply clustering to retrospectively collected data from 644 head and neck cancer patients, including both clinical and radiomic features. To incorporate outcome information into the clustering process and to deal with the large proportion of censored samples, the feature space was scaled using regression coefficients fitted to a proxy dependent variable, martingale residuals, instead of follow-up time. Two clusters were identified and evaluated using cross-validation. The Kaplan-Meier (KM) curves of the two clusters differ significantly for OS and RFS (p-value < 0.0001). Moreover, using the cluster label in addition to the clinical features gave a relative predictive improvement over using the clinical features alone: AUC increased by 5.7% and 13.0% for OS and RFS, respectively.
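The supervised scaling idea can be illustrated with a short sketch. The abstract does not give the details, so several choices here are assumptions: the martingale residuals come from a covariate-free (null) model with a Nelson-Aalen cumulative hazard, the regression is ordinary least squares on standardised features, and each feature is multiplied by the absolute value of its coefficient. The function names `martingale_residuals` and `supervised_scale` are hypothetical.

```python
import numpy as np

def martingale_residuals(time, event):
    """Null-model martingale residuals: event indicator minus cumulative hazard."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    event_times = np.unique(time[event == 1])
    # Nelson-Aalen increments: events at t divided by subjects still at risk at t.
    increments = np.array([
        np.sum((time == t) & (event == 1)) / np.sum(time >= t)
        for t in event_times
    ])
    # Cumulative hazard evaluated at each subject's own follow-up time.
    cum_hazard = np.array([increments[event_times <= t].sum() for t in time])
    return event - cum_hazard

def supervised_scale(X, time, event):
    """Scale standardised features by |OLS coefficient| against the residuals."""
    X = np.asarray(X, dtype=float)
    # Standardise so coefficients are comparable (assumes no constant column).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    r = martingale_residuals(time, event)
    A = np.column_stack([np.ones(len(Z)), Z])  # intercept plus features
    coef, *_ = np.linalg.lstsq(A, r, rcond=None)
    weights = np.abs(coef[1:])  # assumed rule: drop the intercept, keep magnitudes
    return Z * weights, weights
```

Because censored subjects still receive a residual, this proxy target lets every patient contribute to the regression, which follow-up time alone would not. Features with larger weights then dominate the Manhattan distances used in clustering.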

Funding: NIH NCI-R01CA214825, NIH NCI-R01CA2251

Date: March 2, 2020

