Breaking Down 3 Types of Healthcare Natural Language Processing

Natural language processing, understanding, and generation may help healthcare stakeholders make better use of their wealth of unstructured, text-based data.

Healthcare generates massive amounts of data as patients move along their care journeys, often in the form of notes written by clinicians and stored in EHRs. These data are valuable for improving health outcomes but are often difficult to access and analyze.

Natural language processing (NLP) technologies provide a potential solution, as these tools can help care teams and researchers sift through mountains of data and generate meaningful insights for applications in population health management and clinical decision support.

The use of NLP in healthcare has been the subject of much hype as clinician burnout and frustrations with existing EHR systems plague the industry, but the two major components of NLP, natural language understanding (NLU) and natural language generation (NLG), have garnered less attention.

Below, HealthITAnalytics will take a deep dive into NLP, NLU, and NLG, differentiating between them and exploring their healthcare applications.

DIFFERENTIATING NLP, NLU, AND NLG

IBM characterizes NLP, NLU, and NLG as related but distinct concepts. Broadly, NLU and NLG are subsets of NLP.

NLP leverages methods taken from linguistics, artificial intelligence (AI), and computer and data science to help computers understand verbal and written forms of human language. Using machine learning and deep-learning techniques, NLP converts unstructured language data into a structured format via named entity recognition.

Named entity recognition is a type of information extraction that allows named entities within text to be classified into pre-defined categories, such as people, organizations, locations, quantities, percentages, times, and monetary values.
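As a rough illustration of that classification step, the sketch below tags three of the categories listed above (percentages, monetary values, and times) in a short piece of free text. It is a toy, not how production systems work: the regular expressions, the `PATTERNS` table, the `extract_entities` helper, and the sample note are all hypothetical, and real named entity recognition relies on trained statistical or deep-learning models rather than hand-written rules.

```python
import re

# Hypothetical hand-written patterns for three entity categories.
# A real NER system would learn these distinctions from annotated text.
PATTERNS = {
    "PERCENT": re.compile(r"\b\d+(?:\.\d+)?\s?(?:%|percent)"),
    "MONEY": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?"),
    "TIME": re.compile(r"\b\d{1,2}:\d{2}\s?(?:am|pm)?", re.IGNORECASE),
}


def extract_entities(text):
    """Return (category, matched text, start offset) tuples in reading order."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((label, match.group().strip(), match.start()))
    return sorted(found, key=lambda item: item[2])


# Invented sample text, used only to exercise the patterns.
note = "Seen at 9:30 am; copay of $25.00 collected. Oxygen saturation 94%."
entities = extract_entities(note)

for label, value, offset in entities:
    print(f"{label:8} {value!r} at character {offset}")
```

The output is the "structured format" in miniature: each span of unstructured text is paired with a pre-defined category that downstream software can filter or count.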

Through named entity recognition and the identification of word patterns, NLP can be used for tasks like answering questions or language translation.

As a component of NLP, NLU focuses on determining the meaning of a sentence or piece of text. NLU tools analyze syntax, or the grammatical structure of a sentence, and semantics, the intended meaning of the sentence. NLU approaches also establish an ontology, or structure specifying the relationships between words and phrases, for the text data they are trained on.

Syntax, semantics, and ontologies are all naturally occurring in human speech, but analyses of each must be performed using NLU for a computer or algorithm to accurately capture the nuances of human language.

NLU is often used in sentiment analysis by brands looking to understand consumer attitudes, as the approach allows companies to more easily monitor customer feedback and address problems by clustering positive and negative reviews.
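The grouping of positive and negative reviews can be sketched in a few lines. The example below is a deliberately simple, lexicon-based stand-in: the `POSITIVE` and `NEGATIVE` word lists, the helper names, and the sample reviews are all invented for illustration. It counts sentiment-bearing words and ignores syntax and negation, which is exactly the gap that trained NLU models are meant to close.

```python
import re

# Hypothetical mini-lexicons; production tools learn sentiment from data.
POSITIVE = {"great", "helpful", "friendly", "quick"}
NEGATIVE = {"rude", "slow", "confusing", "painful"}


def sentiment_score(review):
    """Count positive words minus negative words in a review."""
    words = re.findall(r"[a-z']+", review.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def cluster_reviews(reviews):
    """Group reviews into positive, negative, and neutral buckets by score."""
    clusters = {"positive": [], "negative": [], "neutral": []}
    for review in reviews:
        score = sentiment_score(review)
        if score > 0:
            clusters["positive"].append(review)
        elif score < 0:
            clusters["negative"].append(review)
        else:
            clusters["neutral"].append(review)
    return clusters


# Invented sample feedback.
reviews = [
    "The staff were friendly and check-in was quick.",
    "Billing was confusing and the wait was slow.",
    "I visited on Tuesday.",
]
clusters = cluster_reviews(reviews)
```

A word-counting approach like this would misread a phrase such as "not helpful" as positive, which is why sentence-level understanding matters for this task.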

While NLU is concerned with computer reading comprehension, NLG focuses on enabling computers to write human-like text responses based on data inputs.

NLG tools typically analyze text using NLP and then apply the rules of the output language, such as its syntax, semantics, lexicon, and morphology. These considerations enable NLG technology to choose how to appropriately phrase each response.

NLG is used in text-to-speech applications and underpins generative AI tools like ChatGPT, which create human-like responses to a host of user queries.

HEALTHCARE USE CASES

The potential benefits of NLP technologies in healthcare are wide-ranging, including their use in applications to improve care, support disease diagnosis, and bolster clinical research.

One of the most promising uses for these tools is sorting through and making sense of unstructured EHR data, a capability relevant across a wide range of applications.

Using data extracted from EHRs, NLP approaches can help automate quality measures for heart failure, identify bias in EHR-based opioid misuse classifiers, assess COVID-19 complications, predict severe maternal morbidity, gain insights into bipolar disorder, support public health research, and identify contributing factors to patient safety events.

NLP is also being leveraged to advance precision medicine research, including in applications to speed up genetic sequencing and detect HPV-related cancers.

Currently, a handful of health systems and academic institutions are using NLP tools. The University of California, Irvine, is using the technology to bolster medical research, and Mount Sinai has incorporated NLP into its web-based symptom checker.

NLU has been less widely used, but researchers are investigating its potential healthcare use cases, particularly those related to healthcare data mining and query understanding.

In particular, research published in Multimedia Tools and Applications in 2022 outlines a framework that leverages machine learning (ML), NLU, and statistical analysis to facilitate the development of a chatbot for patients to find useful medical information.

Another case study demonstrates that NLU may be helpful in interpreting drug therapy information from discharge summaries.

Like NLU, NLG has seen more limited use in healthcare than NLP technologies, but researchers indicate that the technology has significant promise to help tackle the problem of healthcare’s diverse information needs.

NLG could also be used to generate synthetic chief complaints based on EHR variables, improve information flow in ICUs, provide personalized e-health information, and support postpartum patients.

BARRIERS TO ADOPTION

Despite the promise of NLP, NLU, and NLG in healthcare, these technologies have limitations that hinder deployment.

Many of these are shared across NLP types and applications, stemming from concerns about data, bias, and tool performance.

Researchers writing in the Canada Communicable Disease Report noted that NLP shares one major limitation with AI, ML, and other advanced analytics technologies: data access and quality. The availability of appropriate and high-quality data is key to training NLP tools, and while accessible biomedical datasets exist, they can be limited by data type or research area.

The authors further indicated that failing to account for biases in the development and deployment of an NLP model can negatively impact model outputs and perpetuate health disparities. Privacy is also a concern, as regulations dictating data use and privacy protections for these technologies have yet to be established.

The researchers note that, as with any advanced technology, frameworks and guidelines must be in place to make sure that NLP tools are working as intended. However, these frameworks and guidelines have also yet to be developed.

In addition to these challenges, one study from the Journal of Biomedical Informatics stated that discrepancies between the objectives of NLP and clinical research studies present another hurdle.

NLP tools are developed and evaluated on word-, sentence-, or document-level annotations that model specific attributes, whereas clinical research studies operate on a patient or population level, the authors noted. While not insurmountable, these differences make defining appropriate evaluation methods for NLP-driven medical research a major challenge.
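The mismatch between annotation level and study level can be made concrete with a small sketch. Below, hypothetical note-level outputs from an NLP tool are rolled up to the patient level using an "any note flagged" rule. The data, the `patient_level_labels` helper, and the roll-up rule itself are assumptions chosen for illustration; a real study would have to justify its own aggregation rule, which is part of the evaluation challenge described above.

```python
from collections import defaultdict

# Hypothetical document-level NLP outputs:
# (patient_id, note_id, whether the tool flagged the condition in that note)
note_predictions = [
    ("p1", "n1", False),
    ("p1", "n2", True),
    ("p2", "n3", False),
    ("p2", "n4", False),
    ("p3", "n5", True),
]


def patient_level_labels(predictions):
    """Flag a patient if any of their notes is flagged (one possible rule)."""
    by_patient = defaultdict(bool)
    for patient_id, _note_id, flagged in predictions:
        by_patient[patient_id] = by_patient[patient_id] or flagged
    return dict(by_patient)


labels = patient_level_labels(note_predictions)

# The same predictions yield different rates depending on the unit of analysis.
note_rate = sum(flagged for _, _, flagged in note_predictions) / len(note_predictions)
patient_rate = sum(labels.values()) / len(labels)

print(f"Flagged notes: {note_rate:.0%}; flagged patients: {patient_rate:.0%}")
```

Here two of five notes are flagged, but two of three patients are, so an accuracy figure measured on notes does not translate directly into one measured on patients.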

NLP technologies of all types are further limited in healthcare applications when they fail to perform at an acceptable level.

Technologies and devices leveraged in healthcare are expected to meet or exceed stringent standards to ensure they are both effective and safe. In some cases, NLP tools have shown that they cannot meet these standards or compete with a human performing the same task.

One study published in JAMA Network Open demonstrated that speech recognition software that leveraged NLP to create clinical documentation had error rates of up to 7 percent. The researchers noted that these errors could lead to patient safety events, cautioning that manual editing and review from human medical transcriptionists are critical.
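To show what an error rate of this kind can mean in practice, the sketch below computes word error rate, a common way to score transcription output: the number of word insertions, deletions, and substitutions needed to turn the transcript into the reference, divided by the length of the reference. This is an assumption about the general style of metric, not a reproduction of the study's method, and the `word_error_rate` function and sample sentences are invented for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (before the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # a reference word is missing (deletion)
                curr[j - 1] + 1,     # an extra word was transcribed (insertion)
                prev[j - 1] + cost,  # words match or one was substituted
            ))
        prev = curr
    return prev[-1] / len(ref)


# Invented example: one substituted word in an eight-word sentence.
reference = "patient denies chest pain and shortness of breath"
transcript = "patient denies chest pain in shortness of breath"
wer = word_error_rate(reference, transcript)
print(f"Word error rate: {wer:.1%}")
```

Even a single wrong word can change clinical meaning, which is consistent with the researchers' caution that human review remains critical.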

Likewise, NLP was found to be significantly less effective than humans in identifying opioid use disorder (OUD) in 2020 research investigating medication monitoring programs. Overall, human reviewers identified approximately 70 percent more OUD patients using EHRs than an NLP tool.

NLP models also struggle to extract meaningful, healthcare-related insights from social media data, a potentially valuable data source for medical researchers, according to a 2018 study published in the Journal of the American Medical Informatics Association.

Despite these limitations, the potential of NLP applications in healthcare will likely drive significant research into addressing their shortcomings and deploying them effectively in clinical settings.