Is Smart Data Better than Bigger Data for Predictive Analytics?

Working smarter, not bigger, with electronic health record data could produce more effective and targeted predictive analytics for 30-day hospital readmissions.

By Jennifer Bresnick

Bigger isn’t always better when it comes to electronic health record data and predictive analytics, according to a new study from the University of Texas Southwestern.

Researchers found that performing analytics with EHR data collected throughout a patient’s entire hospital stay was not significantly more accurate at predicting 30-day readmissions than using data from just the first twenty-four hours, suggesting that value over volume is an important mantra for more than just the financial aspects of healthcare.

A team of UT researchers examined nearly 33,000 patient admissions from 75 regional hospitals between 2009 and 2010, and found that 12.7 percent experienced a 30-day hospital readmission.

The study intended to examine how a patient’s in-hospital complications, such as hospital-acquired infections (HAIs), and their stability during discharge affected their risk of 30-day readmissions.  The researchers collected and analyzed EHR data from the entire course of the patient’s stay to test several different predictive analytics models.

They found that certain common risk factors, such as C. difficile infection, vital sign instability upon discharge, and longer length of stay in the hospital were positively correlated with an unplanned 30-day readmission.  Using full length-of-stay data resulted in a +2.4 likelihood ratio of readmission.

However, when they examined data from just the patient’s first twenty-four hours in the hospital, which often did not include adverse events that may have taken place later in the stay, they found that the 24-hour data was nearly as good at predicting readmissions on its own, producing a likelihood ratio of between +1.8 and +2.1.
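To put the two likelihood ratios on a common scale, the short sketch below converts them into post-test probabilities. It is illustrative arithmetic only, and it rests on two assumptions that the article does not confirm: that the reported figures are positive likelihood ratios in the usual diagnostic sense, and that the study’s overall 12.7 percent readmission rate can stand in as the pretest probability. The function name is hypothetical.

```python
def post_test_probability(pretest_prob: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability into a post-test probability via odds."""
    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    post_odds = pretest_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Assumption: the study's overall 30-day readmission rate is used as the baseline.
BASELINE_READMISSION_RATE = 0.127

# Likelihood ratios as reported in the article.
scenarios = [
    ("full length-of-stay data", 2.4),
    ("first 24 hours (low end)", 1.8),
    ("first 24 hours (high end)", 2.1),
]

for label, lr in scenarios:
    prob = post_test_probability(BASELINE_READMISSION_RATE, lr)
    print(f"{label}: LR +{lr} -> {prob:.1%} estimated readmission risk")
```

Under these assumptions, a patient flagged by the full-stay model would carry a readmission risk of roughly 26 percent, compared with roughly 21 to 23 percent for a patient flagged by the first-day model. The gap is only a few percentage points, which is consistent with the researchers’ description of the first-day data as nearly as good.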

“Our group’s previous research found that using clinical data from the first day of admission was more effective in predicting hospital readmissions than using administrative billing data,” said lead author Dr. Oanh Nguyen, Assistant Professor of Internal Medicine and Clinical Sciences at UT Southwestern, in a press release.

“We expected that adding even more detailed clinical data from the entire hospitalization would allow us to better identify which patients are at highest risk for readmission. However, we were surprised to find that this was not the case.”

The results suggest that non-clinical factors, such as patient health literacy, behavioral health issues, the degree of discharge planning, and socioeconomic challenges after leaving the hospital, may have a greater-than-expected impact on which patients are most likely to return within the costly 30-day window.

Data on these issues is rarely included in the electronic health record, but may be vital for crafting truly effective predictive analytics and risk scores, added Dr. Ethan A. Halm, Chief of the William T. and Gay F. Solomon Division of General Internal Medicine and Chief of the Division of Outcomes and Health Services Research in the Department of Clinical Sciences at UT Southwestern.

“More ‘big data’ alone did not make much of a difference,” he said. “Better models for predicting readmissions will require ‘better data’ on things like psychosocial and behavioral factors that are not currently captured in electronic health records.”

The UT Southwestern study adds to a growing body of evidence indicating that sophisticated big data analytics aren’t always enough to truly impact a patient’s health status. 

While predictive analytics can effectively flag high-risk patients with common conditions such as heart failure, providers must engage in care coordination strategies and frequent follow-up to ensure that patients have the ongoing support they need to maintain medication adherence and follow other discharge instructions.

An unrelated study from the American Journal of Managed Care (AJMC) found that a patient’s cognitive state during a hospitalization was directly correlated with their risk of readmission.  The use of tailored interventions, such as depression medication trials, referrals to substance abuse or psychiatric services, and specialist consultations for undiagnosed conditions, helped to drop readmission rates by approximately five percent.

Research from 2013 also shows that patient education, along with provider care transition planning, can significantly reduce 30-day readmission rates.  When providers used health IT tools to communicate more effectively, conduct medication reconciliation, and ensure that patients understood their at-home care plans, readmission rates plummeted by more than twenty percent.

These results speak to one of the most difficult questions of big data analytics: how researchers can extract true value and actionable insights from large, unwieldy, and potentially inconsequential datasets. 

While many of the industry’s efforts have focused on breaking down data siloes that prevent data scientists from creating a big enough data pool for meaningful analytics, perhaps a more focused and selective approach to big data will be more effective.

“Raw data alone cannot lead to systematic improvement,” stated the National Quality Forum (NQF) in a recent white paper examining the barriers to better patient care.  “It has to be turned into meaningful information.”

Taking a more thoughtful approach to separating the signal from the noise can also help analysts avoid the trap of drawing false conclusions from large, multi-faceted datasets that might not actually relate to one another, noted Austin B. Frakt, PhD, and Steven D. Pizer, PhD, in an AJMC editorial.

“For instance, for every 5 million packages of x-ray contrast media distributed to healthcare facilities, about 6 individuals die from adverse effects,” they explained.  “With big data, we learn that such deaths are highly correlated with electrical engineering doctorates awarded, precipitation in Nebraska, and per capita mozzarella cheese consumption (correlations 0.75, 0.85, and 0.74, respectively).  

“However, because we cannot conceive of a causal mechanism, it is obvious that these variables play no causal role in x-ray contrast media deaths. That such high correlations can be easily mined from big data is concerning nonetheless, because it is not always trivial to assess whether they are telling us something useful.”
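The effect the authors describe is easy to reproduce. The sketch below is a toy simulation, not the analysis behind the editorial: it draws one short random “outcome” series, then screens a large pool of equally random, unrelated series against it and reports the strongest correlation found. The series length, pool size, and 0.7 threshold are arbitrary assumptions chosen for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)   # fixed seed so the run is repeatable
N_POINTS = 10            # assumption: ten annual observations per series
N_CANDIDATES = 2000      # assumption: number of unrelated variables screened

# A purely random "outcome" series with no real drivers.
target = [rng.gauss(0, 1) for _ in range(N_POINTS)]

# Correlate the outcome against many independent random series.
correlations = [
    pearson(target, [rng.gauss(0, 1) for _ in range(N_POINTS)])
    for _ in range(N_CANDIDATES)
]

strongest = max(correlations, key=abs)
n_high = sum(1 for r in correlations if abs(r) > 0.7)

print(f"strongest correlation found: {strongest:+.2f}")
print(f"series with |r| > 0.7: {n_high} of {N_CANDIDATES}")
```

Because every series is generated independently, any strong correlation the script reports is an artifact of searching many candidates with few data points each, which is the situation the authors warn about.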

Providers should be selective when choosing which segments of their datasets to use for quality improvement initiatives like readmissions reduction or population health management. As the healthcare industry continues to mature its big data analytics capabilities, researchers will be better able to avoid the mistake of confusing correlation with causation.

When it comes to using EHR data to stratify risk for readmissions, including patient-centered risk factors such as socioeconomic and behavioral health challenges in future predictive analytics tools may be a first step towards helping providers make smarter, more targeted decisions about how to manage patients before and after they leave the hospital setting.

