
Using Big Data Analytics, Systems Engineering to Tackle Sepsis

EHR big data analytics and statistical modeling of the healthcare system are helping researchers understand the development and impact of sepsis.

By Jennifer Bresnick

Sepsis is a silent and shockingly effective killer, causing death in close to a third of patients, but researchers at North Carolina State University plan to make big data analytics an even more powerful force than this insidious, fast-moving disease.


At NC State’s Edward P. Fitts Department of Industrial and Systems Engineering, Dr. Julie Ivy and her colleagues at Mayo Clinic and Christiana Care Health System in Delaware are working to outsmart sepsis using a combination of electronic health record data, machine learning, and a high-level look at how to reengineer the process of delivering care to potential sepsis cases.

The three-year Sepsis Early Prediction Support Implementation System project, supported by the National Science Foundation and the National Institutes of Health, enlists the aid of clinicians, computer scientists, big data experts, and industrial systems engineers to tackle a number of fundamental problems with how the healthcare system identifies and treats sepsis patients.

Accounting for close to $24 billion in annual spending, 8.4 percent of Medicare hospitalizations, and 258,000 deaths per year, sepsis ranked as the United States’ most expensive condition in 2013, according to data from the Agency for Healthcare Research and Quality (AHRQ).

The disease accounts for more than 6 percent of all hospital costs, and its impact on the healthcare system may be growing as the number of sepsis-related hospitalizations continues to tick upward.


The industry is starting to refine its understanding of how sepsis develops and which patients are most at risk for this complication of common infections, but providers still struggle to react in time to prevent the disease from taking hold of hundreds of thousands of patients each year.

“There has been a lot of discussion about what defines sepsis, how to treat it, and how to prevent it, but we can’t answer these questions without a lot more insight into the healthcare system than we have right now,” said Ivy, a Professor at North Carolina State University in the Edward P. Fitts Department of Industrial and Systems Engineering and a Fitts Faculty Fellow in Health Systems Engineering. 

“With this project, we want to develop a system-level overview of the problem,” she continued.  “It’s not just about how to determine when patients should get antibiotics or what labs we should run in order to diagnose the disease.”

“It’s also about the costs and the impacts any potential changes will have on the healthcare [system] as a whole.  How will new processes increase the workload on those laboratories?  What resources do we need to actually operationalize new policies without overwhelming clinical workers?”

Using machine learning, statistical modeling, and computer simulations, Ivy and her team are hoping to mine the untapped treasure trove of EHR data at Mayo Clinic and Christiana Care to understand the patterns of sepsis development and provide clinicians with the predictive analytics and clinical decision support resources they need to get the jump on treating patients headed for a crisis.


“Our goal is to build a tool that will help inform decision-making for clinicians.  It’s very important that we develop something that clinicians will accept and use,” Ivy said.  “Without a high level of confidence in its accuracy and usefulness, they’re not going to bring it into their workflows.  If we want clinicians to use decision support applications, we need to get them to trust that they will actually work to improve care.”


Developing such a tool is a complicated proposition, but Ivy believes her team’s approach to creating applications in the context of broader systemic changes will equip them with several advantages.

“Industrial systems engineers really like to understand how all the pieces of a whole work together and interact, which is key for something with as many moving parts as healthcare,” she explained.  “It’s important not to look at patients in isolation.  We want to understand what brings the patient into the environment – comorbidities, socioeconomic factors, acute care needs – and then how the patient interacts with the healthcare system.”

“We try to capture this through building optimization and mathematical models to describe some of those relationships and think about how well they align with the goals of the provider, the patient, and the health system.  Then we come up with policies that move the experience closer to the ideal.”


Instead of testing new ideas directly on patients, however, the project is leveraging computer simulations that allow researchers to brainstorm and test the impact of new strategies in a safe and cost-effective manner.

“Virtual environments let us replicate the patient population and the health system so we can see what impacts our ideas may have on the system without jeopardizing patient safety in any way,” Ivy said.  “It’s a good way to think about how these strategies are going to scale up, especially when it comes to resource use, without creating real logjams in an actual clinical setting.”

To feed these models, Ivy and her colleagues have to extract and analyze clinical data from both their health system partners.   

“One of the novel features of this project is the fact that we’re using data from two health systems, and they’re very different from each other,” she said.  “Mayo Clinic is a destination health center that handles some of the most complex cases in the country.  There are certain things that happen there that don’t happen at a community-based delivery network like Christiana Care – and there are situations that Christiana Care deals with that aren’t as prevalent at Mayo, so we’re getting a great look at two unique populations.”

“The two sets of data are going to look extremely different, which will help us to cross-validate our models and gain a better understanding of [their] accuracy.  It’s very important to test things in more than one environment, because we want to make our tools as precise as they can be in every possible setting.”
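The idea of checking a model in more than one environment can be sketched in a few lines of Python. This is only an illustration, not the project's method: the single "marker" feature, the threshold rule, and every number below are made up for the example. A cut-off is fit on one site's data and scored on the other's, then the roles are swapped.

```python
def accuracy(values, labels, threshold):
    """Share of patients classified correctly by the rule 'value >= threshold'."""
    predictions = [v >= threshold for v in values]
    return sum(p == bool(y) for p, y in zip(predictions, labels)) / len(labels)

def fit_threshold(values, labels):
    """Pick the cut-off with the best accuracy on the training site."""
    best_threshold, best_score = None, -1.0
    for candidate in sorted(set(values)):
        score = accuracy(values, labels, candidate)
        if score > best_score:
            best_threshold, best_score = candidate, score
    return best_threshold

# Synthetic toy data (hypothetical marker values, 1 = septic, 0 = not septic).
site_a = {"values": [1.0, 1.5, 2.5, 3.0], "labels": [0, 0, 1, 1]}
site_b = {"values": [2.0, 2.4, 2.6, 4.0], "labels": [0, 1, 1, 1]}

# Fit on one site, evaluate on the other, then swap.
threshold_a = fit_threshold(site_a["values"], site_a["labels"])
a_to_b = accuracy(site_b["values"], site_b["labels"], threshold_a)

threshold_b = fit_threshold(site_b["values"], site_b["labels"])
b_to_a = accuracy(site_a["values"], site_a["labels"], threshold_b)

print(f"fit on A, tested on B: {a_to_b:.2f}")
print(f"fit on B, tested on A: {b_to_a:.2f}")
```

A gap between the two scores is the kind of signal that a rule tuned to one population may not transfer cleanly to another.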

While examining workflow and treatment processes at two very different health systems has its advantages, trying to synthesize information from two separate EHR systems is just as challenging as creating statistical models that use the right data to accurately capture the experience of a sepsis patient.

“We’ve gone through about six generations of data pools trying to bring together the right attributes from different places to accurately chart the path of a patient through the health system,” Ivy said.  “We need pieces of data that are held in different places – vitals aren’t necessarily held in the same place as lab results, which may be kept separately from your allergies or your medications.”

“The task is complicated by the fact that these two health systems have different rules for documenting and recording data, so it’s been a learning experience to figure out how to bring them together.  We talk a lot about making data available, but a lot of health systems haven’t really focused on the processes required to translate that data into usable information.”
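As a rough illustration of what charting a patient's path from scattered sources involves, the Python sketch below stitches separately stored vitals, labs, and medication rows into one time-ordered list per patient. The field names (`patient_id`, `time`, `item`, `value`), the ISO-style timestamps, and the sample rows are all assumptions made for the example; real EHR extracts would first need mapping into a shared layout like this one.

```python
from collections import defaultdict

def build_timelines(**sources):
    """Merge rows from several named sources into one timeline per patient."""
    timelines = defaultdict(list)
    for source_name, rows in sources.items():
        for row in rows:
            event = (row["time"], source_name, row["item"], row["value"])
            timelines[row["patient_id"]].append(event)
    for events in timelines.values():
        # ISO-style timestamps sort correctly as plain strings.
        events.sort(key=lambda event: event[0])
    return dict(timelines)

# Hypothetical rows; each source is stored separately, as in the quote above.
vitals = [{"patient_id": "p1", "time": "2024-01-01T08:00", "item": "heart_rate", "value": 112}]
labs = [{"patient_id": "p1", "time": "2024-01-01T07:30", "item": "lactate", "value": 2.4}]
meds = [{"patient_id": "p1", "time": "2024-01-01T09:15", "item": "antibiotic", "value": "given"}]

timelines = build_timelines(vitals=vitals, labs=labs, meds=meds)
for event in timelines["p1"]:
    print(event)
```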

When healthcare systems make data available to researchers, they often report on quality and outcomes in aggregate, Ivy added, which isn’t as useful as it could be for projects that require a very granular level of information. 

“They tend to focus more on averages and means, but we need the raw data so we can make sure we’re getting the whole picture,” she said.  “Raw data needs to be cleaned up, though, and it’s never exactly the way we want it.  So there’s a lot of work to do with the data before we can even start asking questions.”

Unstructured data poses another major challenge.  Physician notes are chock-full of pertinent information, but the free text in these documents is much more difficult to extract and analyze than structured information from the EHR.

“We’re looking at using natural language processing to dissect those notes, at least for a subset of patients, to squeeze out as much information as we can from another very valuable resource,” Ivy said.  “Sometimes the note will indicate that a patient really should have been diagnosed as septic, but the diagnosis is never coded anywhere structured, so that also raises questions about how reliable our data can be.”


And if integrating structured and unstructured data elements from two disparate systems isn’t difficult enough, Ivy is also hoping to develop strategies to make use of data that isn’t even there at all.

“We also have to think about the concept of ‘missingness,’” she explained.  “Sometimes an element is missing unintentionally, because there’s an error in the data set, or it’s not applicable to a patient – asking about pregnancy for a male, for example.  But other times it can convey information.” 

“If a patient chooses not to identify their gender or their ethnic group, that could indicate something about that patient.  If a portion of their historical records from another health system isn’t available, that says something about their experience within the care continuum. It’s important to look at what isn’t there, too, and to learn how those gaps contribute to the patient’s story.”
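One common way to let a model "see" missingness, rather than silently dropping or filling the gap, is to carry an explicit flag alongside each field. The Python sketch below shows that general technique under stated assumptions: the field names are hypothetical, and nothing here is confirmed as the approach Ivy's team uses.

```python
from typing import Any

# Hypothetical field names, chosen only for illustration.
FIELDS = ("gender", "ethnicity", "lactate", "prior_records")

def add_missing_flags(record: dict, fields: tuple = FIELDS) -> dict:
    """Return a copy of the record with a '<field>_missing' flag per field."""
    flagged: dict[str, Any] = dict(record)
    for name in fields:
        value = record.get(name)
        # Treat absent keys, None, and empty strings as missing.
        flagged[f"{name}_missing"] = value is None or value == ""
    return flagged

patient = {"gender": None, "ethnicity": "", "lactate": 2.1}
flagged = add_missing_flags(patient)
print(flagged)
```

A fuller version would also record why a value is absent (not applicable, not provided, or lost in transfer), since the quote above treats those cases differently.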

Sepsis is a disease of subtleties, and the timing of interventions is another critical component of understanding how providers are delivering care, she added.

“We also have to look at data over time.  A patient doesn’t always receive a medication as soon as it’s ordered, so we need to understand why there was a lag.  Or if there were changes to the order, why did they occur?”

“Why did the provider choose to start taking observations every four hours after previously asking for reports every eight hours?  This tells us something about the clinical decision-making process, and it could tell us something about the features of the patient and how they are starting to change.”
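Both timing questions reduce to simple arithmetic on timestamps. The Python sketch below is a minimal example with invented times and an assumed timestamp format: it measures the lag between a medication order and its administration, then finds the point where observations start arriving more frequently.

```python
from datetime import datetime
from typing import List, Optional

FMT = "%Y-%m-%d %H:%M"  # assumed timestamp format

def lag_minutes(ordered_at: str, given_at: str) -> float:
    """Minutes between a medication order and its administration."""
    delta = datetime.strptime(given_at, FMT) - datetime.strptime(ordered_at, FMT)
    return delta.total_seconds() / 60

def interval_hours(observed_at: List[str]) -> List[float]:
    """Hours between consecutive observations, in chronological order."""
    times = sorted(datetime.strptime(t, FMT) for t in observed_at)
    return [(b - a).total_seconds() / 3600 for a, b in zip(times, times[1:])]

def first_tightening(gaps: List[float]) -> Optional[int]:
    """Index of the first gap shorter than the one before it, if any."""
    for i in range(1, len(gaps)):
        if gaps[i] < gaps[i - 1]:
            return i
    return None

# Invented example times.
lag = lag_minutes("2024-01-01 08:00", "2024-01-01 09:30")
gaps = interval_hours([
    "2024-01-01 00:00", "2024-01-01 08:00", "2024-01-01 16:00",
    "2024-01-01 20:00", "2024-01-02 00:00",
])
change_at = first_tightening(gaps)
print(lag, gaps, change_at)
```

Here the observation schedule moves from every eight hours to every four, the kind of shift the quote above reads as a clue about the clinician's thinking.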

Creating big data models that accurately describe the nuances of sepsis has not been an easy process, Ivy said, and there is still a great deal of work to do before the project can achieve its ultimate goals.

“When we wrote our proposal, we thought it wouldn’t be that much of a challenge to bring all the data together.  In our timeline, we estimated it would take a month.  Six months, nine months later…well, it’s a learning process,” she laughed.  “There are always new questions and new issues to think about, so we’re thinking of it as an evolution.”

“I’m confident that these challenges aren’t unique to our situation.  Working with this type of big data is difficult for everyone.  It’s definitely been an adventure so far, but we are excited about the work we’re doing and the potential to make an impact for patients.  We are hoping to share what we’ve learned to help other organizations improve their processes and reduce the burdens of sepsis on their providers and their patients.”

