Georgetown Lombardi Comprehensive Cancer Center will now make its collection of brain cancer data freely available to precision medicine researchers worldwide.
The dataset, called the REpository for Molecular BRAin Neoplasia DaTa (REMBRANDT), is one of only two such repositories in the country, and contains information on 671 adults from 14 contributing institutions.
Thousands of researchers from the US and worldwide already access the data site on a regular basis, and Georgetown investigators expect that the number of users will increase as word about the resource spreads.
“We want this data to be widely used by the broadest audience — the entire biomedical research community — so that imagination and discovery is maximized,” said Yuriy Gusev, PhD, Associate Professor and a faculty member of the ICBI.
“Our common goal is to tease apart the clues hidden within this biomedical and clinical information in order to find ways that advance diagnostic and clinical outcomes for these patients.”
REMBRANDT is unique in that it contains both genomic information and diagnostic, treatment, and outcomes data, while most data collections contain only one or the other. The data collection consists of genomic data from 261 samples of glioblastoma, 170 of astrocytoma, 86 of oligodendroglioma, and a number that are mixed or of an unknown subclass.
The repository also includes more than 13,000 points of outcomes data.
Beyond holding a large amount of genomic and diagnostic information, the Georgetown data collection has an interface that is very easy to use, said Subha Madhavan, PhD, Chief Data Scientist at Georgetown University Medical Center and Director of the Innovation Center for Biomedical Informatics (ICBI) at Georgetown Lombardi.
“It sits on Amazon Web Services, and has a simple web interface access to data and analysis tools,” she said.
“All a researcher needs is a computer and an internet connection to log onto this interface to select, filter, analyze and visualize the brain tumor datasets.”
The REMBRANDT dataset was originally created at the National Cancer Institute (NCI), where researchers collected data from 2004 to 2006.
In 2015, NCI transferred the data to Georgetown, where it is now housed in the Georgetown Database of Cancer (G-DOC), a cancer data integration and sharing platform. G-DOC researchers developed new analytical tools to reprocess the information.
The genomic data in REMBRANDT identifies the specific genes within individual tumors that are either over-expressed or under-expressed, as well as the number of times each gene is repeated in a chromosome.
“We inherit two copies of a gene — one from Mom and one from Dad — but in cancer cells, DNA segments containing important tumor suppressor or onco-genes can be entirely deleted or amplified,” said Madhavan.
“It isn’t unusual to see a chromosome within a tumor that has 11 copies of a gene, each of which may be producing a toxic protein that helps the cancer grow uncontrollably.”
REMBRANDT also includes data on RNA, which is produced by genes and can be measured to identify genes that are dysregulated.
The collection interface allows researchers to search for a gene of interest, check its expression and amplification status, and link that to clinical outcomes. They can then save their discoveries to their workspace on the G-DOC site and share them with their collaborators.
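As a rough illustration of that kind of lookup, the Python sketch below groups samples for one gene by copy-number status and reports median survival per group. It is not the G-DOC interface or its API: the record fields, gene names, and values are all hypothetical, and the only fact taken from the description above is that a normal cell carries two copies of a gene.

```python
from dataclasses import dataclass
from statistics import median

NORMAL_COPIES = 2  # one copy inherited from each parent


@dataclass
class SampleRecord:
    """Hypothetical record layout; not the actual REMBRANDT schema."""
    sample_id: str
    gene: str
    copy_number: int        # copies of the gene observed in the tumor
    survival_months: float  # illustrative clinical outcome field


def copy_number_status(copies: int) -> str:
    """Classify a gene's copy number relative to the normal two copies."""
    if copies > NORMAL_COPIES:
        return "amplified"
    if copies < NORMAL_COPIES:
        return "loss"  # at least one copy has been deleted
    return "normal"


def summarize_gene(records, gene):
    """Median survival for one gene, grouped by copy-number status."""
    groups = {}
    for record in records:
        if record.gene == gene:
            status = copy_number_status(record.copy_number)
            groups.setdefault(status, []).append(record.survival_months)
    return {status: median(values) for status, values in groups.items()}


# Made-up toy data, for illustration only.
records = [
    SampleRecord("S1", "GENE_A", 11, 9.0),
    SampleRecord("S2", "GENE_A", 2, 30.0),
    SampleRecord("S3", "GENE_A", 4, 15.0),
    SampleRecord("S4", "GENE_B", 0, 12.0),
]

summary = summarize_gene(records, "GENE_A")
print(summary)
```

A real analysis would draw on the repository's own expression and outcomes fields and use proper survival statistics rather than a simple median.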
Georgetown researchers expect that the REMBRANDT data collection will facilitate research partnerships and discoveries, allowing investigators to draw actionable conclusions across a wide variety of brain cancer types, as well as the nearly 20,000 protein-coding genes in the human genome.
“We are just beginning to understand the science of how these cancers evolve and how best to treat them, and datasets like this will likely be very helpful,” Madhavan said.