- Using natural language processing (NLP) to translate confusing and complex medical jargon into everyday speech may help patients feel more in control of the information presented to them.
More than a third of American adults have limited health literacy, according to the 2006 National Assessment of Adult Literacy, which puts them at risk of failing to understand provider instructions about medications, post-discharge care, or recommended preventive services.
While access to patient health records through online portals attempts to solve this problem by giving patients the opportunity to review their personal information outside of the clinic setting or after a hospitalization, the data presented is often taken directly from sources intended primarily for physician use with little consumer-friendly interpretation.
Physicians have repeatedly expressed concern over patient misinterpretation of electronic health record (EHR) data when presented with initiatives such as the Open Notes project, which aims to give consumers complete access to their information.
Wary of having to spend extra time defining terms for patients or soothing the fears of individuals who misread a diagnosis, many providers have approached the patient data transparency question with caution.
Patients currently have few options other than asking their physicians for help, pointed out a team of researchers from Yale University, the University of Massachusetts, and Bedford VA Medical Center in a recent study published in JMIR Medical Informatics.
“The readability levels of health educational materials on the Internet often exceed the level that is easily understood by the average patient,” said the study’s authors.
“For example, the term ‘nephrotic syndrome’ is defined in the US National Cancer Institute vocabulary as ‘a collection of symptoms that include severe edema, proteinuria, and hypoalbuminemia; it is indicative of renal dysfunction.’”
The majority of terms included in the definition of the condition are unlikely to be accessible to patients without a certain degree of medical education, let alone an individual with low literacy skills or one whose first language is not English.
However, by applying NLP algorithms to electronic health record data and adding the results to a patient portal or visit summary, providers could help bridge the gap between the requirements of clinical documentation and the way consumers need to interact with their personal records.
The researchers tested this theory by applying an NLP algorithm to clinical documents with the goal of matching specific medical terms to their lay-language equivalents.
While there is an existing Consumer Health Vocabulary (CHV) resource developed by correlating patient search results with medical terms, the team noted that the database is incomplete and is missing many terms often included in EHRs. Expanding the depth and breadth of available terminology dictionaries is one of the team’s major goals.
“We are building a lay language resource for EHR comprehension by including medical terms from EHRs and creating lay definitions for those terms,” the team explained. “This is a time-consuming process that involves collecting candidate definitions from authorized health educational resources, and curating and simplifying these definitions by domain experts.”
“Since the number of candidate terms mined from EHRs is large (hundreds of thousands of terms), we ranked candidate terms based on how important they are for patients’ comprehension of EHRs, and therefore prioritized the annotation effort of lexical entries based on those important terms.”
Adding to the challenge is the fact that familiar terms, such as “pneumonia,” may not need additional translation for patients, the study said, but compound terms including familiar words, like “community-acquired pneumonia,” may require explanation even if an individual understands each component within the phrase.
“The distinctions between important and non-important EHR terms in our task were more subtle than that between medical terms and nonmedical terms,” the team observed.
The complexity of compound terms, along with the familiarity of individual words commonly used by physicians, formed two of the criteria for the NLP algorithms to identify and address.
Using a set of training data curated by human domain experts, the team employed two different extraction and comparison methodologies to examine 7839 discharge summary notes containing 5.4 million words from a databank operated by the University of Pittsburgh.
Both NLP variations, including one algorithm using a technique called adaptive distance supervision (ADS) outperformed baseline systems on precision when presented with unlabeled evaluation data.
Many of the errors were caused by compound terms, the team noted.
The system “ranked some terms (such as ‘malignant cell,’ ‘chronic rhinitis,’ and ‘viral bronchitis’) as high, even though their meanings could be easily inferred from their component words,” the authors said. “It also ranked certain good compound terms (e.g., ‘community-acquired pneumonia,’ ‘end-stage kidney failure,’ and ‘left ventricular ejection fraction’) as low when these terms contained familiar words.”
“This suggests that advanced features generated by a compound term detector may improve the system’s performance, which we may explore in the future.”
Overall, however, performance was satisfactory. “The evaluation results on this dataset suggest that our ADS system is effective in ranking EHR terms and can be used to facilitate the expansion of lexical resources that support EHR comprehension,” the study said.
“In particular, it can be used to alleviate the data sparseness problem when there are very few target-domain training data and can be used to boost the performance of supervised learning when the size of the training data increases.”
The study did not attempt to develop a fully-functioning EHR translation tool ready to apply to patient data in a live clinical setting, but such a product or service may not be too far away from hitting the shelves.
Natural language processing is one of the fastest growing technology segments in healthcare, underpinning a wide variety of machine learning and artificial intelligence activities that aim to improve consumer engagement and simplify clinical documentation.
The global NLP marketplace is expected to be worth $4.3 billion by 2024, comprising just under 20 percent of the $22.7 billion artificial intelligence sector, says a series of recent market reports.
Potential applications for NLP in the EHR space range from computer-assisted coding and supporting physician dictation to flagging phrases within patient notes that indicate high-risk social determinants of health.
In order to ensure accuracy and reliability in real-world situations, however, many natural language processing tools will need to be refined further.
For the medical jargon translation tool in particular, the researchers suggest integrating more domain-specific knowledge to distinguish between truly confusing terms and those that are more likely to be understood by patients with lower health literacy levels.
As natural language processing becomes more advanced and better equipped to extract meaning from complex unstructured healthcare data, it may become a valuable partner for providers looking to help their patients understand their care without feeling overwhelmed or unable to participate in discussions about their personal health.