Cancer Genomics, TCGA, and the GenePool software platform
Presented by Sandeep Sanga, PhD and Tod M Klingler, PhD
Cancer research and treatment is the area of medicine that is currently being transformed most rapidly by genomic technologies. While genome sequencing can elucidate the germline and somatic mutations that contribute to the development of cancer, other genomic modalities, including gene expression, copy number and methylation analyses, are used to identify the specific functional changes that lead to the initiation of cancers and then to their progression. These data have also been used to narrow in on optimal, patient-centric treatment strategies.
The Cancer Genome Atlas (TCGA) is a large, multi-center project that is characterizing the molecular profiles of samples from patients with difficult-to-treat cancers. Data generated in this project include exome, whole genome, RNA-seq, miRNA-seq, copy number, methylation and protein expression datasets from more than 16,000 cancer patients and more than 25 cancer types. The volume and quality of these genomic data make them extremely valuable for studying these cancers; however, the informatics challenges in accessing and working with the data are a significant impediment to realizing that value.
GenePool is a software platform that enables clinical researchers to manage, analyze, visualize and share the data and results of genome sequencing projects. Station X has imported the open-access TCGA datasets into GenePool and matched them with the available patient and sample data. Using GenePool, we demonstrate flexible and efficient analyses of complex queries within and across cancer types in TCGA. In addition, while germline variants and read-level data are controlled access (i.e., available by application to and approval from dbGaP), those data can be made available to researchers with the appropriate access privileges.
In this seminar, we will give an overview of cancer genome projects, focusing on TCGA, and demonstrate the use of GenePool for the large-scale analysis of TCGA data.