Analytics in Action News

Natural Language Processing Helps Assess COVID-19 Complications

Researchers used natural language processing to examine EHR data and identify COVID-19-related complications.

By Erin McNemar, MPA

To understand the relationship between pre-existing conditions and complications of COVID-19 infection, researchers are using natural language processing (NLP) to sift through unstructured EHR data.

“While the structured EHR fields such as ICD codes are modestly informative, the true context of comorbidities and complications is buried in the millions of unstructured patient notes,” the authors explained in the study.

“In this study, we have leveraged ‘augmented curation’ of EHR notes in COVID-19 patients to map the relationships between complications, comorbidities, and outcomes in the hospitalized COVID-19 patients and non-COVID-19 hospitalized matched controls.”

The researchers examined data from 1,803 patients who were hospitalized with COVID-19 from March 12, 2020, to September 15, 2020. Using each patient's positive COVID-19 case as the starting point, they reviewed the patient's clinical notes and compared the pre-COVID-19 and post-COVID-19 phases.

Then, using a deep language model, the researchers analyzed how the 20 risk factors for severe COVID-19 illness previously reported by the Centers for Disease Control and Prevention (CDC) were associated with 18 COVID-19-associated complications in the cohort.
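
As a rough illustration of what tallying such associations can look like once conditions have been extracted from notes, the Python sketch below counts how often each risk factor appears alongside each complication. It is not the authors' method: the `patients` records are hypothetical, hand-made examples rather than study data, and `pair_counts` is a name introduced here for illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical, hand-made records for illustration only; NOT study data.
# Each record lists the risk factors and complications found for one patient.
patients = [
    {"risk_factors": {"hypertension", "obesity"}, "complications": {"ARDS"}},
    {"risk_factors": {"hypertension"}, "complications": {"ARDS", "anemia"}},
    {"risk_factors": {"cancer"}, "complications": set()},
]


def pair_counts(records):
    """Count patients in which a (risk factor, complication) pair co-occurs."""
    counts = Counter()
    for record in records:
        pairs = product(sorted(record["risk_factors"]),
                        sorted(record["complications"]))
        for pair in pairs:
            counts[pair] += 1
    return counts


counts = pair_counts(patients)
for (risk_factor, complication), n in sorted(counts.items()):
    print(f"{risk_factor} + {complication}: {n} patient(s)")
```

Raw co-occurrence counts like these are only a starting point; judging whether an association is meaningful would also require comparison against matched controls, as the study describes.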

The authors also reported the general characteristics of the study population.

“All age groups are included and, as expected from the severity of the disease in different age groups, more than 35.6% of the patients were over 65-year-old with only 3.7% under 19. Female, male and different ethnic origins of the US population are adequately represented,” the authors wrote.

“The most frequent comorbidities were hypertension (500 patients, 27.7%), type 2 diabetes mellitus (278 patients, 15.4%), obesity (227 patients, 12.6%), and cancer (254 patients, 14.1%), reflecting the most common causes of chronic diseases in the US.”
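
The quoted percentages can be checked with simple arithmetic. The Python sketch below recomputes each prevalence from the patient counts in the quote, assuming the denominator is the full cohort of 1,803 hospitalized patients; the article does not state the denominator explicitly, so that is an assumption, and `prevalence_pct` is a helper name introduced here.

```python
# Cohort size and comorbidity counts as quoted in the article.
# Assumption: every percentage uses the full cohort as its denominator.
COHORT_SIZE = 1803

comorbidity_counts = {
    "hypertension": 500,
    "type 2 diabetes mellitus": 278,
    "obesity": 227,
    "cancer": 254,
}


def prevalence_pct(count, total):
    """Return prevalence as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


prevalences = {
    name: prevalence_pct(count, COHORT_SIZE)
    for name, count in comorbidity_counts.items()
}

# Print from most to least prevalent.
for name, pct in sorted(prevalences.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {pct}%")
```

Under that assumption, the recomputed values match the figures in the quote.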

According to the analysis, the most common COVID-19 complications recorded in the data were respiratory, followed by cardiovascular complications, acute kidney injury, anemia, sepsis, and diabetic decompensation/hyperglycemia.

Comorbidities linked to complications within the first month post-infection were hypertension, cardiovascular chronic disease, anemia, and chronic kidney disease. Additionally, the researchers investigated indicators of long-term complications of COVID-19 infection.

“In the case of pleural effusion, which remains the most frequent complication, the prevalence decreases from 4.9% (89 patients) during the early onset time period (days 1–30) to <1% (20 patients) during the later onset time periods (days 31–90),” the authors said.

“In particular, patients with cardiomyopathy (2/56), chronic kidney disease (4/235), coronary artery disease (3/112), heart failure (3/138), and hypertension (5/499) appear to be more susceptible. Patients with liver disease, stroke, and type 1 diabetes also appear to be more susceptible to complications during days 31–90 post-infection.”

The researchers also examined EHR data from patients who were not infected with COVID-19, analyzing the same risk factors identified by the CDC. According to the study, hypertension is the single most significant risk factor for all of the complications except deep vein thrombosis, in line with what previous studies have reported.

“Specifically, our data suggest that a recent history of hypertension is the strongest predictor of ARDS, the most significant and life-threatening complication of COVID-19, among hospitalized COVID-19 patients, similar to previous observations,” the authors said.

The researchers said that, because of the richness and complexity of the information in clinical notes, there are multiple ways to advance the research and development of natural language processing. Additionally, natural language processing methods could be used to determine other health indicators from clinical notes, including disease severity and quality of life.