Using a national supercomputer, Swedish researchers have analysed how protein expression of the 315 known cancer genes relates to overall survival in cancer patients.
Mathias Uhlén, Director of the Human Protein Atlas consortium and leader of the Pathology Atlas, and his team at the University of Stockholm analysed the differences in expression patterns of individual cancers in 8,000 patients with 17 main cancer types. The resulting Pathology Atlas allows identification of molecular subtypes of cancers, which is improving patient diagnostics and personalised treatment. In addition, the systems biology approach used to construct the Pathology Atlas demonstrates the power of Big Data to change how medical research is performed. The open-access resource allows researchers to explore how the expression of specific genes influences patient survival in 17 different types of cancer. More than 900,000 patient survival profiles are available, including for tumours of colon, prostate, lung, and breast origin. This interactive data set can also be used to generate personalised patient models to predict how metabolic changes can influence tumour growth.
“We show a new concept to present patient survival data,” said Uhlén, “called Interactive Survival Scatter plots, and in the atlas, we present more than 400,000 such plots.” A national supercomputer center was used to analyse more than 2.5 petabytes of underlying publicly available data from the Cancer Genome Atlas (TCGA) to generate more than 900,000 survival plots describing the consequence of RNA and protein levels on clinical survival. The Pathology Atlas also contains 5 million pathology-based images generated by the Human Protein Atlas consortium.
“This study differs from earlier cancer investigations,” Uhlén said, “since it is not focused on the mutations in cancers, but the downstream effects of such mutations across all protein-coding genes. We show, for the first time, the influence of the gene expression levels, demonstrating the power of Big Data to change how medical research is performed. It also shows the advantage of open access policies in science in which researchers share data with each other to allow integration of huge amounts of data from different sources.” Dr Adil Mardinoglu, the leader of the systems biology effort in the project, said: “We are now in possession of incredibly powerful systems biology tools for medical research, allowing, for the first time, genome-wide analysis of individual patients with regard to the consequence of their expression profiles for clinical survival.”
Uhlén and co-workers report several important findings from their analyses. Firstly, a large fraction of genes is differentially expressed in cancers and, in many cases, has an impact on overall patient survival. The research also showed that gene expression patterns of individual tumours varied considerably, and could exceed the variation observed between different cancer types. Shorter patient survival was generally associated with up-regulation of genes involved in mitosis and cell growth, and down-regulation of genes involved in cellular differentiation. The data allowed the researchers to generate personalised genome-scale metabolic models for cancer patients to identify key genes involved in tumour growth.
The Pathology Atlas team also set out to demonstrate the utility of the new tool in two particular cancers. “For lung and colorectal cancer, a selection of prognostic genes identified in the Atlas were also analysed in independent, prospective cancer cohorts using immunohistochemistry to validate the gene expression patterns at the protein level,” said Fredrik Ponten, Professor of Pathology at Uppsala University. “We are pleased to provide a stand-alone open-access resource for cancer researchers worldwide, which we hope will help accelerate their efforts to find the biomarkers needed to develop personalised cancer treatments.”
The pharmaceutical industry has also realised that such Big Data analyses could offer new business opportunities. For instance, Roche’s partner Foundation Medicine (FMI), which has 125,000 mutation profiles in its proprietary database, plans to link sequencing and pathology data with outcome data in a medical decision support system that helps oncologists select the best treatment for an individual patient. Discussions on how to share the data derived from FMI’s cancer gene panel analysis of tumour biopsies with oncologists and pathologists are ongoing.