Big Data Analytics May Lead to More Precise Cancer Treatments

By Jessica Kent

November 30, 2020 - As big data analytics technologies continue to move from research labs to clinical settings, organizations are increasingly leveraging these tools to design more comprehensive cancer treatments.

Across the US, cancer is one of the most prevalent chronic diseases. The National Cancer Institute reports that the rate of new cancer cases is 442.4 per 100,000 men and women per year, and approximately 39.5 percent of men and women will be diagnosed with cancer at some point in their lifetimes.

In addition to being pervasive, cancer is also incredibly complex. Treating patients with cancer requires providers to consider an enormous amount of data – often too much for human clinicians to analyze on their own.

To accelerate the development of better cancer treatments, researchers are building big data analytics tools that can quickly draw meaningful insights from cancer data.

At Massachusetts General Hospital (MGH), a team designed a machine learning model that can identify imaging biomarkers on screening mammograms to predict a patient’s risk of developing breast cancer.

The results of the study showed that the machine learning model can forecast patient risk better than traditional models, which currently incorporate only a small fraction of patient data, such as family history, prior breast biopsies, and hormonal and reproductive history. Breast density is the only feature from the mammogram itself that is included in traditional models.

"Traditional risk assessment models do not leverage the level of detail that is contained within a mammogram," said Leslie Lamb, MD, MSc, breast radiologist at MGH. "Even the best existing traditional risk models may separate sub-groups of patients but are not as precise on the individual level."

Using data from five MGH breast cancer screening sites, researchers developed the model on a population that included women with a personal history of breast cancer, implants, or prior biopsies. The study data included 245,753 consecutive 2D digital bilateral screening mammograms performed in 80,818 patients between 2009 and 2016.

Researchers compared the accuracy of the machine learning image-only model to a commercially available risk assessment model in predicting future breast cancer within five years of the index mammogram. The results showed that the machine learning model achieved a predictive rate of 0.71, while the traditional model achieved a rate of just 0.61.
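The article does not say how the "predictive rate" was calculated. One common way to compare two risk models is the area under the ROC curve (AUC): the probability that a patient who later develops cancer receives a higher risk score than one who does not. The sketch below assumes that kind of metric purely for illustration; the function name, scores, and labels are all made up and are not taken from the study.

```python
def auc(scores, labels):
    """Fraction of (positive, negative) pairs where the positive case
    gets the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical data: 1 = developed cancer within the follow-up window.
labels = [1, 0, 1, 0, 0, 1]
model_a = [0.9, 0.2, 0.7, 0.4, 0.1, 0.3]  # invented scores from one model
model_b = [0.6, 0.5, 0.4, 0.7, 0.2, 0.5]  # invented scores from another

print(round(auc(model_a, labels), 2), round(auc(model_b, labels), 2))
```

On a scale like this, 0.5 means a model ranks patients no better than chance and 1.0 means it ranks every future case above every non-case.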

The findings reveal the ability of data analytics tools to help providers design more targeted treatment plans for breast cancer.

"Our deep learning model is able to translate the full diversity of subtle imaging biomarkers in the mammogram that can predict a woman's future risk for breast cancer," Lamb said.

"Traditional risk models can be time-consuming to acquire and rely on inconsistent or missing data. A deep learning image-only risk model can provide increased access to more accurate, less costly risk assessment and help deliver on the promise of precision medicine."

The study also demonstrates the potential for machine learning and other analytics models to examine many pieces of critical patient data, researchers noted.

"Why should we limit ourselves to only breast density when there is such rich digital data embedded in every woman's mammogram?" said senior author Constance D. Lehman, MD, PhD, division chief of breast imaging at MGH.

"Every woman's mammogram is unique to her just like her thumbprint. It contains imaging biomarkers that are highly predictive of future cancer risk, but until we had the tools of deep learning, we were not able to extract this information to improve patient care."

Researchers at UC San Francisco and Princeton University are also using machine learning tools to improve cancer treatments. The team developed complementary strategies to design therapies that can kill cancer cells while leaving normal tissue unscathed.

The cell therapies remain inert unless triggered by combinations of proteins that only ever appear together in cancer cells.

“Currently, most cancer treatments, including CAR T cells, are told ‘block this,’ or ‘kill this,’” said Wendell Lim, PhD, head researcher in the UCSF Cell Design Initiative and National Cancer Institute-sponsored Center for Synthetic Immunology. “We want to increase the nuance and sophistication of the decisions that a therapeutic cell makes.”

Using machine learning, researchers analyzed massive databases of thousands of proteins found in both cancer and normal cells. The team then combed through millions of possible protein combinations to assemble a catalog of combinations that could be used to precisely target only cancer cells while leaving normal ones alone.
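The combinatorial search described above can be illustrated with a toy example. This is a minimal sketch, not the researchers' method: the protein names, tissue names, and the `selective_pairs` function are hypothetical, and the real analysis covered thousands of proteins with machine learning rather than a brute-force loop.

```python
from itertools import combinations

# Hypothetical toy data: surface proteins expressed by each cell type.
CANCER = {
    "tumor_a": {"P1", "P2", "P3"},
    "tumor_b": {"P1", "P2", "P4"},
}
NORMAL = {
    "liver": {"P1", "P3"},
    "lung": {"P2", "P4"},
    "skin": {"P3", "P4"},
}

def selective_pairs(cancer, normal):
    """Return protein pairs that appear together in every cancer sample
    but never appear together in any normal tissue."""
    proteins = sorted(set().union(*cancer.values(), *normal.values()))
    catalog = []
    for a, b in combinations(proteins, 2):
        pair = {a, b}
        in_all_cancer = all(pair <= expressed for expressed in cancer.values())
        in_any_normal = any(pair <= expressed for expressed in normal.values())
        if in_all_cancer and not in_any_normal:
            catalog.append((a, b))
    return catalog

print(selective_pairs(CANCER, NORMAL))
```

In this made-up data, P1 and P2 each show up in a normal tissue on their own, so neither is a safe target alone. Only the pair together is unique to the tumor samples, which is the kind of signature the catalog is meant to capture.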

In a separate paper, the group showed how this protein data could drive the design of effective and highly selective cell therapies for cancer.

“The field of big data analysis of cancer and the field of cell engineering have both exploded in the last few years, but these advances have not been brought together,” said Olga G. Troyanskaya, PhD, a computer scientist at Princeton’s Lewis-Sigler Institute for Integrative Genomics and the Simons Foundation’s Flatiron Institute.

“The computing capabilities of therapeutic cells combined with machine learning approaches enable actionable use of the increasingly available rich genomic and proteomic data on cancers.”

The research illustrates the importance of considering all data elements when treating patients with cancer.

“You’re not just looking for one magic-bullet target. You’re trying to use all the data,” Lim said.

“We need to comb through all of the available cancer data to find unambiguous combinatorial signatures of cancer. If we can do this, then it could launch the use of these smarter cells that really harness the computational sophistication of biology and have real impact on fighting cancer.”
