Ventilator-associated pneumonia (VAP) is the most common and most fatal nosocomial infection in intensive care units (ICUs). Existing methods for identifying VAP have low accuracy, and their use may delay antimicrobial therapy. VAP diagnostics based on machine learning (ML) methods that use electronic health record (EHR) data have not yet been explored. The objective of this study is to compare the performance of several ML models trained to predict whether VAP will be diagnosed during the patient's stay.
A retrospective study examined data from 6126 adult ICU encounters lasting at least 48 hours following the initiation of mechanical ventilation. The gold standard was the presence of a diagnostic code for VAP. Five different ML models were trained to predict VAP 48 hours after initiation of mechanical ventilation. Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC) on a 20% hold-out test set. Feature importance was measured using Shapley values.
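As an illustration of the evaluation step only, the following minimal sketch shows how a 20% hold-out split and an AUROC could be computed. It is not the study's code: the helper names `holdout_split` and `auroc`, the fixed random seed, and the toy labels and scores are all assumptions introduced here, and the AUROC is computed by the pairwise (Mann-Whitney) definition rather than by any particular library.

```python
import random


def holdout_split(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and return (train_idx, test_idx).

    Hypothetical helper: the fraction matches the 20% hold-out described
    in the text, but the seed and shuffling scheme are assumptions.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n * test_fraction))
    return idx[n_test:], idx[:n_test]


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Tied scores count as half a win. Quadratic time; fine for a sketch.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example with made-up labels and scores (not study data).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))  # 0.75

# Splitting 6126 encounters 80/20 under the assumptions above.
train_idx, test_idx = holdout_split(6126)
print(len(train_idx), len(test_idx))  # 4901 1225
```

In this sketch, the AUROC would be computed only on the encounters indexed by `test_idx`, using scores from a model fitted on `train_idx`.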
The highest-performing model achieved an AUROC value of 0.854. The most important features for this model were the length of time on mechanical ventilation, the presence of antibiotics, sputum test frequency, and the most recent Glasgow Coma Scale assessment.
Supervised ML using patient EHR data is promising for VAP diagnosis and warrants further validation. Such a tool has the potential to aid the timely diagnosis of VAP.