Helper functions for managing MultiAssayExperiment from curatedTCGAData
Source:R/curatedTCGAData-helpers.R
curatedTCGAData-helpers.Rd
Additional helper functions for cleaning and uncovering metadata
within a downloaded MultiAssayExperiment
from curatedTCGAData
.
Usage
getSubtypeMap(multiassayexperiment)
getClinicalNames(diseaseCode)
TCGAsplitAssays(multiassayexperiment, sampleCodes = NULL, exclusive = FALSE)
sampleTables(multiassayexperiment, vial = FALSE)
Arguments
- multiassayexperiment
A
MultiAssayExperiment
object- diseaseCode
A TCGA cancer code (e.g., "BRCA")
- sampleCodes
character (default NULL) A string of sample type codes (refer to
data(sampleTypes)
;TCGAsplitAssays
section)- exclusive
logical (default FALSE) Whether to return only assays that contain all codes in
sampleCodes
- vial
(logical default FALSE) whether to display vials in the table output
Value
getSubtypeMap: A
data.frame
with explanatory names and their in-data variable names. They may not be present for all cancer types.getClinicalNames: A
vector
of common variable names that may be found across several cancer disease codes.
Details
Note that for getSubtypeMap
, the column of in-data variable names
may need to go through make.names
to be found in the colData
of the
MultiAssayExperiment
.
getSubtypeMap
provides a two column data.frame
with
interpreted names and in-data variable names. 'Name' usually refers to the
colData
row names a.k.a. the patientID
.
getClinicalNames
provides a vector of common variable names that
exist in the colData
DataFrame
of a curatedTCGAData
MultiAssayExperiment
object. These variables are directly obtained
from the BroadFirehose clinical data (downloaded with
getFirehoseData) and tend to be present across cancer
disease codes.
TCGAsplitAssays
Separates samples by indicated sample codes into different assays
in a MultiAssayExperiment
. Refer to the sampleTypes
data object for a list of available codes. This operation generates
n times the number of assays based on the number of sample codes
entered. By default, all assays will be split by samples present in
the data.
Examples
library(curatedTCGAData)
gbm <- curatedTCGAData("GBM", c("RPPA*", "CNA*"), version = "2.0.1", FALSE)
#> Querying and downloading: GBM_CNACGH_CGH_hg_244a-20160128
#> see ?curatedTCGAData and browseVignettes('curatedTCGAData') for documentation
#> loading from cache
#> Querying and downloading: GBM_CNACGH_CGH_hg_415k_g4124a-20160128
#> see ?curatedTCGAData and browseVignettes('curatedTCGAData') for documentation
#> loading from cache
#> Querying and downloading: GBM_CNASNP-20160128
#> see ?curatedTCGAData and browseVignettes('curatedTCGAData') for documentation
#> loading from cache
#> Querying and downloading: GBM_RPPAArray-20160128
#> see ?curatedTCGAData and browseVignettes('curatedTCGAData') for documentation
#> loading from cache
#> Querying and downloading: GBM_colData-20160128
#> see ?curatedTCGAData and browseVignettes('curatedTCGAData') for documentation
#> loading from cache
#> Querying and downloading: GBM_metadata-20160128
#> see ?curatedTCGAData and browseVignettes('curatedTCGAData') for documentation
#> loading from cache
#> Querying and downloading: GBM_sampleMap-20160128
#> see ?curatedTCGAData and browseVignettes('curatedTCGAData') for documentation
#> loading from cache
#> harmonizing input:
#> removing 5922 sampleMap rows not in names(experiments)
getSubtypeMap(gbm)
#> GBM_annotations GBM_subtype
#> 1 Patient_ID Case
#> 2 methylation_subtypes MGMT promoter status
#> 3 mutation_subtypes IDH/codel subtype
#> 4 histological_subtypes Histology
#> 5 mrna_subtypes Original Subtype
#> 6 mrna_subtypes Transcriptome Subtype
#> 7 mrna_subtypes Pan-Glioma RNA Expression Cluster
#> 8 mrna_subtypes IDH-specific RNA Expression Cluster
#> 9 methylation_subtypes Pan-Glioma DNA Methylation Cluster
#> 10 methylation_subtypes IDH-specific DNA Methylation Cluster
#> 11 methylation_subtypes Supervised DNA Methylation Cluster
#> 12 methylation_subtypes Random Forest Sturm Cluster
#> 13 protein_subtypes RPPA cluster
sampleTables(gbm)
#> $`GBM_CNACGH_CGH_hg_244a-20160128`
#>
#> 01 10 11
#> 267 145 26
#>
#> $`GBM_CNACGH_CGH_hg_415k_g4124a-20160128`
#>
#> 01 10
#> 169 169
#>
#> $`GBM_CNASNP-20160128`
#>
#> 01 02 10 11
#> 577 13 488 26
#>
#> $`GBM_RPPAArray-20160128`
#>
#> 01 02
#> 233 11
#>
TCGAsplitAssays(gbm, c("01", "10"))
#> Warning: Some 'sampleCodes' not found in assays
#> Warning: Inconsistent barcode lengths: 28, 27
#> A MultiAssayExperiment object of 7 listed
#> experiments with user-defined names and respective classes.
#> Containing an ExperimentList class object of length 7:
#> [1] 01_GBM_CNACGH_CGH_hg_244a-20160128: RaggedExperiment with 81512 rows and 267 columns
#> [2] 10_GBM_CNACGH_CGH_hg_244a-20160128: RaggedExperiment with 81512 rows and 145 columns
#> [3] 01_GBM_CNACGH_CGH_hg_415k_g4124a-20160128: RaggedExperiment with 57975 rows and 169 columns
#> [4] 10_GBM_CNACGH_CGH_hg_415k_g4124a-20160128: RaggedExperiment with 57975 rows and 169 columns
#> [5] 01_GBM_CNASNP-20160128: RaggedExperiment with 602338 rows and 577 columns
#> [6] 10_GBM_CNASNP-20160128: RaggedExperiment with 602338 rows and 488 columns
#> [7] 01_GBM_RPPAArray-20160128: SummarizedExperiment with 208 rows and 233 columns
#> Functionality:
#> experiments() - obtain the ExperimentList instance
#> colData() - the primary/phenotype DataFrame
#> sampleMap() - the sample coordination DataFrame
#> `$`, `[`, `[[` - extract colData columns, subset, or experiment
#> *Format() - convert into a long or wide DataFrame
#> assays() - convert ExperimentList to a SimpleList of matrices
#> exportClass() - save data to flat files
getClinicalNames("COAD")
#> [1] "years_to_birth"
#> [2] "vital_status"
#> [3] "days_to_death"
#> [4] "days_to_last_followup"
#> [5] "tumor_tissue_site"
#> [6] "pathologic_stage"
#> [7] "pathology_T_stage"
#> [8] "pathology_N_stage"
#> [9] "pathology_M_stage"
#> [10] "gender"
#> [11] "date_of_initial_pathologic_diagnosis"
#> [12] "days_to_last_known_alive"
#> [13] "radiation_therapy"
#> [14] "histological_type"
#> [15] "residual_tumor"
#> [16] "number_of_lymph_nodes"
#> [17] "race"
#> [18] "ethnicity"