Select primary tumors from TCGA datasets (TCGAprimaryTumors)

Tumor selection is decided using the sampleTypes data. For 'LAML' datasets, the primary tumor code is "03"; for all other datasets, "01" is used.
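The selection rule can be illustrated in base R. This is only a sketch of the idea, not the package's implementation: `select_primary` is a hypothetical helper, the barcodes are made up for illustration, and it assumes full TCGA barcodes in which characters 14 and 15 hold the two-digit sample type code.

```r
# Hypothetical helper (not part of TCGAutils): keep only the barcodes whose
# sample type code marks a primary tumor.
select_primary <- function(barcodes, disease_code) {
  # Assumption: characters 14-15 of a full TCGA barcode are the sample type code
  sample_code <- substr(barcodes, 14, 15)
  # 'LAML' uses "03" as its primary tumor code; every other dataset uses "01"
  primary_code <- if (identical(toupper(disease_code), "LAML")) "03" else "01"
  barcodes[sample_code == primary_code]
}

# Made-up barcodes for illustration only
gbm_barcodes <- c(
  "TCGA-02-0001-01C-01D-0182-01",  # 01: primary solid tumor
  "TCGA-02-0001-10A-01D-0182-01",  # 10: blood derived normal
  "TCGA-02-0003-02A-01D-0182-01"   # 02: recurrent solid tumor
)
laml_barcodes <- c(
  "TCGA-AB-2802-03A-01D-0739-09",  # 03: primary blood derived cancer
  "TCGA-AB-2802-11A-01D-0739-09"   # 11: solid tissue normal
)

gbm_primary <- select_primary(gbm_barcodes, "GBM")
laml_primary <- select_primary(laml_barcodes, "LAML")
```

On a real MultiAssayExperiment, TCGAprimaryTumors() applies this kind of filter across all experiments, so the helper above is useful only for seeing which code is chosen for which dataset.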

Usage

TCGAprimaryTumors(multiassayexperiment)

Arguments

multiassayexperiment: A MultiAssayExperiment with TCGA data as obtained from curatedTCGAData::curatedTCGAData()

Value

A MultiAssayExperiment containing only primary tumor samples

Examples


example(getSubtypeMap)
#> 
#> gtSbtM> library(curatedTCGAData)
#> 
#> gtSbtM> gbm <- curatedTCGAData("GBM", c("RPPA*", "CNA*"), version = "2.0.1", FALSE)
#> Querying and downloading: GBM_CNACGH_CGH_hg_244a-20160128
#> see ?curatedTCGAData and browseVignettes('curatedTCGAData') for documentation
#> loading from cache
#> require(“RaggedExperiment”)
#> Querying and downloading: GBM_CNACGH_CGH_hg_415k_g4124a-20160128
#> see ?curatedTCGAData and browseVignettes('curatedTCGAData') for documentation
#> loading from cache
#> Querying and downloading: GBM_CNASNP-20160128
#> see ?curatedTCGAData and browseVignettes('curatedTCGAData') for documentation
#> loading from cache
#> Querying and downloading: GBM_RPPAArray-20160128
#> see ?curatedTCGAData and browseVignettes('curatedTCGAData') for documentation
#> loading from cache
#> Querying and downloading: GBM_colData-20160128
#> see ?curatedTCGAData and browseVignettes('curatedTCGAData') for documentation
#> loading from cache
#> Querying and downloading: GBM_metadata-20160128
#> see ?curatedTCGAData and browseVignettes('curatedTCGAData') for documentation
#> loading from cache
#> Querying and downloading: GBM_sampleMap-20160128
#> see ?curatedTCGAData and browseVignettes('curatedTCGAData') for documentation
#> loading from cache
#> harmonizing input:
#>   removing 5922 sampleMap rows not in names(experiments)
#> 
#> gtSbtM> getSubtypeMap(gbm)
#>          GBM_annotations                          GBM_subtype
#> 1             Patient_ID                                 Case
#> 2   methylation_subtypes                 MGMT promoter status
#> 3      mutation_subtypes                    IDH/codel subtype
#> 4  histological_subtypes                            Histology
#> 5          mrna_subtypes                     Original Subtype
#> 6          mrna_subtypes                Transcriptome Subtype
#> 7          mrna_subtypes    Pan-Glioma RNA Expression Cluster
#> 8          mrna_subtypes  IDH-specific RNA Expression Cluster
#> 9   methylation_subtypes   Pan-Glioma DNA Methylation Cluster
#> 10  methylation_subtypes IDH-specific DNA Methylation Cluster
#> 11  methylation_subtypes   Supervised DNA Methylation Cluster
#> 12  methylation_subtypes          Random Forest Sturm Cluster
#> 13      protein_subtypes                         RPPA cluster
#> 
#> gtSbtM> sampleTables(gbm)
#> $`GBM_CNACGH_CGH_hg_244a-20160128`
#> 
#>  01  10  11 
#> 267 145  26 
#> 
#> $`GBM_CNACGH_CGH_hg_415k_g4124a-20160128`
#> 
#>  01  10 
#> 169 169 
#> 
#> $`GBM_CNASNP-20160128`
#> 
#>  01  02  10  11 
#> 577  13 488  26 
#> 
#> $`GBM_RPPAArray-20160128`
#> 
#>  01  02 
#> 233  11 
#> 
#> 
#> gtSbtM> TCGAsplitAssays(gbm, c("01", "10"))
#> Warning: Some 'sampleCodes' not found in assays
#> Warning: Inconsistent barcode lengths: 28, 27
#> A MultiAssayExperiment object of 7 listed
#>  experiments with user-defined names and respective classes.
#>  Containing an ExperimentList class object of length 7:
#>  [1] 01_GBM_CNACGH_CGH_hg_244a-20160128: RaggedExperiment with 81512 rows and 267 columns
#>  [2] 10_GBM_CNACGH_CGH_hg_244a-20160128: RaggedExperiment with 81512 rows and 145 columns
#>  [3] 01_GBM_CNACGH_CGH_hg_415k_g4124a-20160128: RaggedExperiment with 57975 rows and 169 columns
#>  [4] 10_GBM_CNACGH_CGH_hg_415k_g4124a-20160128: RaggedExperiment with 57975 rows and 169 columns
#>  [5] 01_GBM_CNASNP-20160128: RaggedExperiment with 602338 rows and 577 columns
#>  [6] 10_GBM_CNASNP-20160128: RaggedExperiment with 602338 rows and 488 columns
#>  [7] 01_GBM_RPPAArray-20160128: SummarizedExperiment with 208 rows and 233 columns
#> Functionality:
#>  experiments() - obtain the ExperimentList instance
#>  colData() - the primary/phenotype DataFrame
#>  sampleMap() - the sample coordination DataFrame
#>  `$`, `[`, `[[` - extract colData columns, subset, or experiment
#>  *Format() - convert into a long or wide DataFrame
#>  assays() - convert ExperimentList to a SimpleList of matrices
#>  exportClass() - save data to flat files
#> 
#> gtSbtM> getClinicalNames("COAD")
#>  [1] "years_to_birth"                      
#>  [2] "vital_status"                        
#>  [3] "days_to_death"                       
#>  [4] "days_to_last_followup"               
#>  [5] "tumor_tissue_site"                   
#>  [6] "pathologic_stage"                    
#>  [7] "pathology_T_stage"                   
#>  [8] "pathology_N_stage"                   
#>  [9] "pathology_M_stage"                   
#> [10] "gender"                              
#> [11] "date_of_initial_pathologic_diagnosis"
#> [12] "days_to_last_known_alive"            
#> [13] "radiation_therapy"                   
#> [14] "histological_type"                   
#> [15] "residual_tumor"                      
#> [16] "number_of_lymph_nodes"               
#> [17] "race"                                
#> [18] "ethnicity"                           

TCGAprimaryTumors(gbm)
#> harmonizing input:
#>   removing 878 sampleMap rows with 'colname' not in colnames of experiments
#>   removing 2 colData rownames not in sampleMap 'primary'
#> A MultiAssayExperiment object of 4 listed
#>  experiments with user-defined names and respective classes.
#>  Containing an ExperimentList class object of length 4:
#>  [1] GBM_CNACGH_CGH_hg_244a-20160128: RaggedExperiment with 81512 rows and 267 columns
#>  [2] GBM_CNACGH_CGH_hg_415k_g4124a-20160128: RaggedExperiment with 57975 rows and 169 columns
#>  [3] GBM_CNASNP-20160128: RaggedExperiment with 602338 rows and 577 columns
#>  [4] GBM_RPPAArray-20160128: SummarizedExperiment with 208 rows and 233 columns
#> Functionality:
#>  experiments() - obtain the ExperimentList instance
#>  colData() - the primary/phenotype DataFrame
#>  sampleMap() - the sample coordination DataFrame
#>  `$`, `[`, `[[` - extract colData columns, subset, or experiment
#>  *Format() - convert into a long or wide DataFrame
#>  assays() - convert ExperimentList to a SimpleList of matrices
#>  exportClass() - save data to flat files