Multicenter validation of a machine-learning algorithm for 48-h all-cause mortality prediction
In order to evaluate mortality predictions based on boosted trees, this retrospective study uses electronic medical record data from three academic health centers for inpatients 18 years or older with at least one observation of each vital sign. Predictions were made 12, 24, and 48 hours before death. Models fit to training data from each institution were evaluated using hold-out test data from the same institution, and from the other institutions. Gradient-boosted trees (GBT) were compared to regularized logistic regression (LR) predictions, support vector machine (SVM) predictions, quick Sepsis-Related Organ Failure Assessment (qSOFA), and Modified Early Warning Score (MEWS) using area under the receiver operating characteristic curve (AUROC). For training and testing GBT on data from the same institution, the average AUROCs were 0.96, 0.95, and 0.94 across institutional test sets for 12-, 24-, and 48-hour predictions, respectively. When trained and tested on data from different hospitals, GBT AUROCs achieved up to 0.98, 0.96, and 0.96, for 12-, 24-, and 48-hour predictions, respectively. Average AUROC for 48-hour predictions for LR, SVM, MEWS, and qSOFA were 0.85, 0.79, 0.86 and 0.82, respectively. GBT predictions may help identify patients who would benefit from increased clinical care.