Using SDOH Data in Machine Learning to Improve Predictive Models

By incorporating social determinants of health into machine learning, predictive models can better support cardiovascular disease treatment in diverse populations.

By Erin McNemar, MPA

According to researchers from New York University’s School of Global Public Health and Tandon School of Engineering, machine learning can accurately predict cardiovascular disease and help physicians select treatment options. By incorporating social determinants of health into these models, however, providers can better serve diverse groups.

Cardiovascular disease is responsible for almost a third of all deaths worldwide and disproportionately affects people in lower socioeconomic groups. The rise in cardiovascular disease is driven in part by social determinants of health.

"Cardiovascular disease is increasing, particularly in low- and middle-income countries and among communities of color in places like the United States," Rumi Chunara, the study’s senior author and associate professor of biostatistics at NYU School of Global Public Health and of computer science and engineering at NYU Tandon School of Engineering, said in a press release.

"Because these changes are happening over such a short period of time, it is well known that our changing social and environmental factors, such as increased processed foods, are driving this change, as opposed to genetic factors which would change over much longer time scales," Chunara continued.

Machine learning, with its use of predictive analytics, is increasingly applied in cardiovascular research to assess disease risk, incidence, and outcomes. While statistical methods are central to determining cardiovascular disease risk, predictive models can give providers actionable information by quantifying a patient’s risk and guiding treatment recommendations.

Typically, cardiovascular disease is examined using clinical information such as blood pressure and cholesterol levels. However, social determinants are rarely considered. The team of researchers studied how social and environmental factors are beginning to be integrated into machine learning algorithms for cardiovascular disease.

"Social and environmental factors have complex, non-linear interactions with cardiovascular disease," said Chunara. "Machine learning can be particularly useful in capturing these intricate relationships."

By examining existing research on machine learning and cardiovascular disease risk, the researchers found that including social determinants of health improved the performance of the predictive models. However, the models did not usually include the full list of community-level or environmental factors that are critical in determining cardiovascular disease risk.

Additionally, some of the studies did not include factors such as income, marital status, social isolation, pollution, and health insurance. Only five studies examined environmental factors including the walkability of a community and the availability of resources like grocery stores.

The researchers also acknowledged a lack of geographic diversity in the studies. Most of the data represented the United States, countries in Europe, and China, while parts of the world experiencing an increase in cardiovascular disease were neglected.

"If you only do research in places like the United States or Europe, you'll miss how social determinants and other environmental factors related to cardiovascular risk interact in different settings and the knowledge generated will be limited," said Chunara.

According to Stephanie Cook, assistant professor of biostatistics at NYU School of Global Public Health and a study author, the study demonstrated that there is room for growth when it comes to incorporating social determinants of health into cardiovascular disease statistical risk prediction models.

"In recent years, there has been a growing emphasis on capturing data on social determinants of health--such as employment, education, food, and social support--in electronic health records, which creates an opportunity to use these variables in machine learning studies and further improve the performance of risk prediction, particularly for vulnerable groups," Cook said.

To address disparities, Chunara said, it is important to include social determinants of health in machine learning models so that the underlying issues can be identified and targeted with interventions.

"For example, it can improve clinical practice by helping health professionals identify patients in need of referral to community resources like housing services and broadly reinforces the intricate synergy between the health of individuals and our environmental resources," Chunara concluded.