Analytics in Action News

Machine Learning Predicts C. Diff Infection

By studying electronic health records, researchers found that machine learning can effectively predict which patients will develop C. diff.



By Erin McNemar, MPA

According to recently published data from the Association for Professionals in Infection Control and Epidemiology (APIC), several commonly used machine learning algorithms (MLAs) can accurately predict which hospitalized patients will become infected with Clostridioides difficile (C. diff).

The new information could support infection prevention and early diagnosis, as well as the timelier implementation of infection control measures to minimize C. diff spread.

“Our study findings suggest that MLAs could play a significant role in reducing the clinical and economic impact of healthcare-associated infections such as C. diff by providing early predictions of at-risk patients prior to them developing serious complications,” Jana Hoffman, vice president of science at Dascena, said in a press release.

“These data are consistent with a growing body of evidence that validates artificial intelligence and MLAs as integral components of healthcare management that can improve patient outcomes and assist time-constrained clinicians in providing the best patient care.”

C. diff infection (CDI) is the leading cause of hospital-acquired diarrhea and is associated with significant morbidity, mortality, and healthcare costs. Currently, there is no gold-standard tool for assessing a patient’s risk of CDI.

For this study, the research team used a database of electronic health record (EHR) patient data from over 700 hospitals to train and evaluate three different machine learning and deep learning methods.

The researchers initially examined various models of each of the methods to determine whether they could effectively predict CDI among hospitalized patients using early inpatient data. The team then used an external dataset to evaluate the generalizability of the best-performing MLA models.

The results indicated that MLAs could predict CDI with excellent discrimination using only the first six hours of inpatient data.
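For readers unfamiliar with the term, "discrimination" describes how well a model ranks patients who develop an infection above those who do not. The article does not name the metric the researchers used; a common choice is the area under the receiver operating characteristic curve (AUROC), and the sketch below assumes that metric. The function name, labels, and risk scores are hypothetical illustrations, not data from the study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) formulation.

    labels: iterable of 0/1 outcomes (1 = patient developed the infection).
    scores: iterable of predicted risk scores, higher meaning higher risk.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)

    # Assign 1-based ranks, giving tied scores their average rank.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # U statistic for the positive class, normalized to the range [0, 1].
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


if __name__ == "__main__":
    # Hypothetical outcomes and model scores for four patients.
    example_labels = [0, 0, 1, 1]
    example_scores = [0.1, 0.4, 0.35, 0.8]
    print(auroc(example_labels, example_scores))  # prints 0.75
```

An AUROC of 0.5 corresponds to ranking no better than chance, and 1.0 to perfect separation of infected from non-infected patients.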

“Among the three methods studied, a machine-learning method called XGBoost provided the highest overall accuracy in predicting CDI, despite being the least complex model. XGBoost also demonstrated generalizability by maintaining its predictive performance in an external dataset,” the press release stated.

“The other two methods researchers evaluated, neural networks known as Deep Long Short Term Memory (D-LSTM) and one-dimensional convolutional neural network (1D-CNN), also demonstrated high levels of predictive accuracy, though were less generalizable.”

The best-performing XGBoost, D-LSTM, and 1D-CNN models used similar features to predict CDI among patients, all of which were previously identified as risk factors. In the study, age was the leading CDI risk factor, followed by clinical measurements including sodium, body mass index, white blood cell count, and heart rate; active treatment with antibiotics or proton pump inhibitors; glycated hemoglobin; and race.

“This study supports earlier research suggesting that MLAs provide reliable infection-risk prediction that can empower clinical teams to implement appropriate infection control measures at earlier time points and thereby improve healthcare outcomes,” said Linda Dickey, RN, MPH, CIC, FAPIC, 2022 APIC president.