Healthcare Analytics, Population Health Management, Healthcare Big Data

Analytics in Action News

Challenges Persist in Training Deep Learning Models on EHR Data

Despite early success in using deep learning in areas of healthcare analytics, researchers still face barriers when using EHR data to develop these models.

Challenges persist in training deep learning models on EHR data

Source: Thinkstock

By Jessica Kent

- Deep learning models have demonstrated early potential in improving healthcare analytics, but researchers still have to overcome significant challenges when using electronic health record (EHR) data to develop these models, a study published in JAMIA found.

Deep learning models have shown success in several areas of care delivery, including image classification and disease prediction. The large and complex datasets that are available in healthcare can help facilitate the training of deep learning models. However, training a deep learning model using EHR data could present unique challenges for researchers, due to data quality issues and heterogeneity of data elements.

To evaluate the recent development of deep learning models for EHR data, the researchers conducted a literature review of studies published between January 2010 and January 2018.

The team assessed the tasks that investigators are carrying out with deep learning, including disease detection and classification, sequential prediction of clinical events, concept embedding, data augmentation, and EHR data privacy.

The group saw that in each case they evaluated, special challenges arose from EHR data, including issues with temporality, irregularity, and multiple modalities.

Temporality and irregularity issues occurred when investigators used longitudinal EHR data, the researchers said.

“Some studies found that patient records vary significantly in terms of data density, since events are irregularly sampled. Such irregularity, if not properly handled, would affect the model performance,” the team wrote.

To overcome this issue, the group said that some researchers used an algorithm that would measure the similarity between two temporal sequences of varying speeds and align the sequences within the EHR.

The researchers on the 2018 study also found that investigators had issues when EHR data came from multiple modalities, including unstructured clinical notes, lab tests, and medical images.

“Researchers have confirmed that finding patterns among multimodal data can increase the accuracy of diagnosis, prediction, and overall performance of the learning system. However, multimodal learning is challenging due to the heterogeneity of the data,” the team said.

Investigators often employed a multitask learning approach to improve this problem. Multitask learning allows models to jointly learn data across multiple modalities. Certain neurons in the neural network model are shared among all tasks, while other neurons are specialized for specific tasks, which could be specific types of data modalities.  

A lack of labels on EHR records also presented a major barrier to deep learning models, the researchers found. Labels refer to the target of interest, such as true disease phenotypes or true states of clinical outcomes. These labels are often not consistently captured in EHR data and aren’t typically available in large numbers for training models.

The team stated that manually crafting labels could provide a solution to this problem, as well as transfer learning. Transfer learning takes a model’s learned knowledge and transfers this information to new datasets to accomplish similar tasks.

Moreover, researchers identified a lack of transparency and interpretability of deep learning models as a significant challenge.

“Although deep learning models can produce accurate predictions, they are often treated as black-box models that lack interpretability and transparency of their inner working,” the group wrote. “This is an important problem because clinicians often are unwilling to accept machine recommendations without clarity as to the underlying reasoning.”

To combat this issue, the team said that some researchers used a knowledge distillation approach, which compresses the knowledge learned from a complex model into a simpler model that is easier to deploy and interpret.

The findings demonstrate that while deep learning models have improved machine learning performance and health analytics in many domains, the technology still has several barriers to overcome.

“Results from the reviewed articles have shown that as compared to other machine learning approaches, deep learning models excel in modeling raw data, minimizing the need for preprocessing and feature engineering, and significantly improving performance in many analytical tasks,” the researchers concluded.

“However, although deep learning techniques have shown promising results in performing many analytics tasks, several open challenges remain.” 

X

Join 25,000 of your peers

Register for free to get access to all our articles, webcasts, white papers and exclusive interviews.

Our privacy policy


no, thanks

Continue to site...