The Cancer Research Data Commons (CRDC) is an initiative by the National Cancer Institute (NCI) that provides access to multiple cancer data sources from the federal government. Sources include the Genomic Data Commons (GDC), Proteomic Data Commons (PDC), and others.
This directory contains NCI-CRDC studies generated using the ISB-CGC portal. Data is pulled from the ISB-CGC BigQuery tables once every 3 months and reflects the latest data available for each study. More details about methods and data transformations can be found in the README files for each individual study.
- Cancer type mapping: Each study corresponds to one TCGA project. The suffix of the TCGA project is taken and converted to an OncoTree code, which is used for the name of the study.
  - Example: For the TCGA project `TCGA-LAML`, the `LAML` suffix is taken and converted to the OncoTree code `AML`. The resulting cBioPortal study is `aml_tcga_gdc`.
  - Mapping file
- `acc_tcga_gdc`: TCGA-ACC, Adrenocortical Carcinoma
- `aml_tcga_gdc`: TCGA-LAML, Acute Myeloid Leukemia
- `blca_tcga_gdc`: TCGA-BLCA, Bladder Urothelial Carcinoma
- `brca_tcga_gdc`: TCGA-BRCA, Breast Invasive Carcinoma
- `ccrcc_tcga_gdc`: TCGA-KIRC, Kidney Renal Clear Cell Carcinoma
- `cesc_tcga_gdc`: TCGA-CESC, Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma
- `chol_tcga_gdc`: TCGA-CHOL, Cholangiocarcinoma
- `chrcc_tcga_gdc`: TCGA-KICH, Kidney Chromophobe
- `coad_tcga_gdc`: TCGA-COAD, Colon Adenocarcinoma
- `difg_tcga_gdc`: TCGA-LGG, Brain Lower Grade Glioma
- `dlbclnos_tcga_gdc`: TCGA-DLBC, Lymphoid Neoplasm Diffuse Large B-cell Lymphoma
- `esca_tcga_gdc`: TCGA-ESCA, Esophageal Carcinoma
- `gbm_tcga_gdc`: TCGA-GBM, Glioblastoma Multiforme
- `hcc_tcga_gdc`: TCGA-LIHC, Liver Hepatocellular Carcinoma
- `hgsoc_tcga_gdc`: TCGA-OV, Ovarian Serous Cystadenocarcinoma
- `hnsc_tcga_gdc`: TCGA-HNSC, Head and Neck Squamous Cell Carcinoma
- `luad_tcga_gdc`: TCGA-LUAD, Lung Adenocarcinoma
- `lusc_tcga_gdc`: TCGA-LUSC, Lung Squamous Cell Carcinoma
- `mnet_tcga_gdc`: TCGA-PCPG, Pheochromocytoma and Paraganglioma
- `nsgct_tcga_gdc`: TCGA-TGCT, Testicular Germ Cell Tumors
- `paad_tcga_gdc`: TCGA-PAAD, Pancreatic Adenocarcinoma
- `plmeso_tcga_gdc`: TCGA-MESO, Mesothelioma
- `prad_tcga_gdc`: TCGA-PRAD, Prostate Adenocarcinoma
- `prcc_tcga_gdc`: TCGA-KIRP, Kidney Renal Papillary Cell Carcinoma
- `read_tcga_gdc`: TCGA-READ, Rectum Adenocarcinoma
- `skcm_tcga_gdc`: TCGA-SKCM, Skin Cutaneous Melanoma
- `soft_tissue_tcga_gdc`: TCGA-SARC, Sarcoma
- `stad_tcga_gdc`: TCGA-STAD, Stomach Adenocarcinoma
- `thpa_tcga_gdc`: TCGA-THCA, Thyroid Carcinoma
- `thym_tcga_gdc`: TCGA-THYM, Thymoma
- `ucec_tcga_gdc`: TCGA-UCEC, Uterine Corpus Endometrial Carcinoma
- `ucs_tcga_gdc`: TCGA-UCS, Uterine Carcinosarcoma
- `um_tcga_gdc`: TCGA-UVM, Uveal Melanoma
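The TCGA naming rule can be sketched in a few lines of Python. This is only an illustration, not the code used to generate the studies: the `SUFFIX_TO_ONCOTREE` dictionary and the `tcga_study_id` function are hypothetical names, and the dictionary holds just a handful of entries read off the study list above rather than the full mapping file.

```python
# Hypothetical excerpt of the suffix -> OncoTree code mapping.
# Entries are inferred from the study list in this README; the real
# mapping file is the authoritative source.
SUFFIX_TO_ONCOTREE = {
    "ACC": "ACC",
    "LAML": "AML",
    "KIRC": "CCRCC",
    "LGG": "DIFG",
    "OV": "HGSOC",
}


def tcga_study_id(project: str) -> str:
    """Build a cBioPortal study ID from a TCGA project name (illustrative only)."""
    prefix = "TCGA-"
    if not project.startswith(prefix):
        raise ValueError(f"not a TCGA project: {project!r}")
    suffix = project[len(prefix):]        # e.g. "LAML"
    code = SUFFIX_TO_ONCOTREE[suffix]     # e.g. "AML"; KeyError if unmapped
    return f"{code.lower()}_tcga_gdc"     # e.g. "aml_tcga_gdc"


print(tcga_study_id("TCGA-LAML"))
```

Note that some study names, such as `soft_tissue_tcga_gdc`, do not look like a plain lowercased OncoTree code, so a real implementation would need to follow the mapping file rather than this simplified rule.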
- Cancer type mapping: CPTAC comprises the CPTAC-2 and CPTAC-3 projects, both of which encompass multiple cancer types. Each study corresponds to a subset of these projects with a particular OncoTree code. The code is determined by looking at `disease_type` and `primary_site` in the BigQuery tables.
  - Example: The `luad_cptac_gdc` study is generated from all samples with primary site `Bronchus and lung` and disease type `Adenomas and Adenocarcinomas`.
  - Mapping file
- `brain_cptac_gdc`: Brain Cancer
- `breast_cptac_gdc`: Breast Cancer
- `coad_cptac_gdc`: Colon Cancer
- `luad_cptac_gdc`: Lung Adenocarcinoma
- `lusc_cptac_gdc`: Lung Squamous Cell Carcinoma
- `ohnca_cptac_gdc`: Head and Neck Cancer
- `ovary_cptac_gdc`: Ovarian Cancer
- `pancreas_cptac_gdc`: Pancreatic Cancer
- `rcc_cptac_gdc`: Renal Cancer
- `uec_cptac_gdc`: Endometrial Cancer
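The CPTAC grouping by `primary_site` and `disease_type` might look roughly like the sketch below. It assumes sample records have already been fetched as plain dictionaries; the `sample_id` key, the `SITE_AND_TYPE_TO_ONCOTREE` dictionary, and the `group_samples_by_study` function are hypothetical, and the only mapping entry shown is the lung adenocarcinoma example given above.

```python
# Hypothetical (primary_site, disease_type) -> OncoTree code mapping.
# Only the pair documented in this README is included; the real mapping
# file covers the remaining studies.
SITE_AND_TYPE_TO_ONCOTREE = {
    ("Bronchus and lung", "Adenomas and Adenocarcinomas"): "LUAD",
}


def group_samples_by_study(samples):
    """Group sample IDs by the CPTAC study they would fall into (illustrative only)."""
    studies = {}
    for sample in samples:
        key = (sample["primary_site"], sample["disease_type"])
        code = SITE_AND_TYPE_TO_ONCOTREE.get(key)
        if code is None:
            continue  # no OncoTree code known for this combination
        study_id = f"{code.lower()}_cptac_gdc"
        studies.setdefault(study_id, []).append(sample["sample_id"])
    return studies


# Made-up sample records, for demonstration only.
example_samples = [
    {"sample_id": "S1", "primary_site": "Bronchus and lung",
     "disease_type": "Adenomas and Adenocarcinomas"},
    {"sample_id": "S2", "primary_site": "Unknown", "disease_type": "Other"},
]
print(group_samples_by_study(example_samples))
```

Samples whose site and disease type are not in the mapping are simply skipped here; how the actual pipeline treats them is not stated in this README.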
- Cancer type mapping: Each study corresponds to one or more TARGET projects. The name of the study is derived from the OncoTree code defined in the mapping file.
  - Example: The `bll_target_gdc` study has OncoTree code `BLL` and is sourced from the GDC projects `TARGET-ALL-P1` and `TARGET-ALL-P2`.
  - Mapping file
- `alal_target_gdc`: TARGET-ALL-P3, Acute Lymphoblastic Leukemia - Phase III
- `aml_target_gdc`: TARGET-AML, Acute Myeloid Leukemia
- `bll_target_gdc`: TARGET-ALL-P1 and TARGET-ALL-P2, Acute Lymphoblastic Leukemia - Phases I and II
- `ccsk_target_gdc`: TARGET-CCSK, Clear Cell Sarcoma of the Kidney
- `mrt_target_gdc`: TARGET-RT, Rhabdoid Tumor
- `nbl_target_gdc`: TARGET-NBL, Neuroblastoma
- `os_target_gdc`: TARGET-OS, Osteosarcoma
- `wt_target_gdc`: TARGET-WT, High-Risk Wilms Tumor
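Because a TARGET study can draw on more than one GDC project, it can be handy to invert the list above into a project-to-study lookup. The sketch below transcribes the list into a dictionary; the variable names `STUDY_TO_PROJECTS` and `PROJECT_TO_STUDY` are hypothetical and are not part of the mapping file.

```python
# Study -> source GDC projects, transcribed from the list in this README.
STUDY_TO_PROJECTS = {
    "alal_target_gdc": ["TARGET-ALL-P3"],
    "aml_target_gdc": ["TARGET-AML"],
    "bll_target_gdc": ["TARGET-ALL-P1", "TARGET-ALL-P2"],
    "ccsk_target_gdc": ["TARGET-CCSK"],
    "mrt_target_gdc": ["TARGET-RT"],
    "nbl_target_gdc": ["TARGET-NBL"],
    "os_target_gdc": ["TARGET-OS"],
    "wt_target_gdc": ["TARGET-WT"],
}

# Inverted lookup: each GDC project maps to exactly one study.
PROJECT_TO_STUDY = {
    project: study
    for study, projects in STUDY_TO_PROJECTS.items()
    for project in projects
}

print(PROJECT_TO_STUDY["TARGET-ALL-P2"])
```

With this lookup, both `TARGET-ALL-P1` and `TARGET-ALL-P2` resolve to `bll_target_gdc`, matching the example above.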