Download data from the cBioPortal API

Obtain a MultiAssayExperiment object for a particular combination of gene panel, studyId, molecularProfileIds, and sampleListId. Both molecularProfileIds and sampleListId default to NULL, which includes all data. This option is best for users who want the portion of the study data that pertains to a specific molecular profile and gene panel combination. To download the entire study data as provided at https://www.cbioportal.org/datasets, refer to cBioDataPack.

Usage

cBioPortalData(
  api,
  studyId = NA_character_,
  genePanelId = NA_character_,
  genes = NA_character_,
  molecularProfileIds = NULL,
  sampleListId = NULL,
  sampleIds = NULL,
  by = c("entrezGeneId", "hugoGeneSymbol"),
  check_build = TRUE,
  ask = interactive()
)

Arguments

api: An API object of class cBioPortal from the cBioPortal function
studyId: character(1) Indicates the "studyId" as taken from getStudies
genePanelId: character(1) Identifies the gene panel, as obtained from the genePanels function
genes: character() Either Entrez gene identifiers or Hugo gene symbols. When included, the 'by' argument indicates the type of identifier provided and 'genePanelId' is ignored. Preference is given to Entrez IDs due to faster query responses.
molecularProfileIds: character() A vector of molecular profile IDs
sampleListId: character(1) A sample list identifier as obtained from sampleLists()
sampleIds: character() Sample identifiers
by: character(1) Either 'entrezGeneId' or 'hugoGeneSymbol' for row metadata (default: 'entrezGeneId')
check_build: logical(1) Whether to check the build status of the studyId using an internal dataset. This argument should be set to FALSE if using alternative hostnames, e.g., 'pedcbioportal.kidsfirstdrc.org'
ask: logical(1) Whether to prompt the user before downloading and loading a study MultiAssayExperiment that, based on previous testing, is not currently building. Defaults to interactive(). In a non-interactive session, the download is attempted without prompting, which is equivalent to ask = FALSE. The argument is also used when a cache directory needs to be created by downloadStudy.
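
The interaction between 'genes' and 'by' can be illustrated with a minimal sketch. The helper guess_gene_id_type below is hypothetical and not part of cBioPortalData; it only assumes that Entrez gene identifiers consist of digits while Hugo symbols do not. The commented-out call shows how the result might be passed on, assuming the "acc_tcga" study and the molecular profile IDs used in the Examples.

```r
# Hypothetical helper (not part of cBioPortalData): choose the 'by' value
# that matches the kind of identifiers supplied in 'genes'.
guess_gene_id_type <- function(genes) {
    genes <- as.character(genes)
    # Entrez gene identifiers consist of digits only; anything else is
    # treated as a Hugo gene symbol.
    if (length(genes) > 0 && all(grepl("^[0-9]+$", genes))) {
        "entrezGeneId"
    } else {
        "hugoGeneSymbol"
    }
}

genes <- c("TP53", "CTNNB1")
id_type <- guess_gene_id_type(genes)

# Sketch of the corresponding call; it requires the cBioPortalData package
# and network access, so it is left commented out here.
# cbio <- cBioPortalData::cBioPortal()
# acc_genes <- cBioPortalData::cBioPortalData(
#     cbio, by = id_type,
#     studyId = "acc_tcga",
#     genes = genes,
#     molecularProfileIds = c("acc_tcga_linear_CNA", "acc_tcga_mutations")
# )
```

Because 'genePanelId' is ignored whenever 'genes' is supplied, the sketch leaves it out.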

Value

A MultiAssayExperiment object

Details

We can successfully represent 98 percent of the study identifiers as MultiAssayExperiment objects obtained via cBioPortalData, using IMPACT341 as the example genePanelId. Datasets that currently fail to import are listed in the getStudies(..., buildReport = TRUE) dataset under the "api_build" column. Note that changes to the cBioPortal API may affect this rate at any time. If you encounter any issues, please open an issue at https://github.com/waldronlab/cBioPortalData/issues/ with a fully reproducible example.
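
As a rough sketch of how the build report might be inspected, the hypothetical helper failed_api_builds below filters a studies table down to the rows flagged as failing. It assumes that "api_build" is a logical column (TRUE when the study builds, FALSE when it does not); that encoding is an assumption and should be checked against the actual getStudies output.

```r
# Hypothetical helper (not part of cBioPortalData): keep the studies whose
# "api_build" flag indicates a failed import.
# Assumption: "api_build" is a logical column; NA means the status is unknown.
failed_api_builds <- function(studies) {
    stopifnot("api_build" %in% names(studies))
    failing <- !is.na(studies$api_build) & !studies$api_build
    studies[failing, , drop = FALSE]
}

# Usage against the live API (requires cBioPortalData and network access):
# cbio <- cBioPortalData::cBioPortal()
# studies <- cBioPortalData::getStudies(cbio, buildReport = TRUE)
# failed_api_builds(studies)[, c("studyId", "api_build")]
```

Checking a studyId against this table before calling cBioPortalData gives an early indication of whether the download is expected to build.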

Examples


cbio <- cBioPortal()

samps <- samplesInSampleLists(cbio, "acc_tcga_rppa")[[1]]

getGenePanelMolecular(
    cbio, molecularProfileIds = c("acc_tcga_rppa", "acc_tcga_linear_CNA"),
    samps
)
#> # A tibble: 92 × 7
#>    uniqueSampleKey        uniquePatientKey molecularProfileId sampleId patientId
#>    <chr>                  <chr>            <chr>              <chr>    <chr>    
#>  1 VENHQS1PUi1BNUoyLTAxO… VENHQS1PUi1BNUo… acc_tcga_linear_C… TCGA-OR… TCGA-OR-…
#>  2 VENHQS1PUi1BNUozLTAxO… VENHQS1PUi1BNUo… acc_tcga_linear_C… TCGA-OR… TCGA-OR-…
#>  3 VENHQS1PUi1BNUo2LTAxO… VENHQS1PUi1BNUo… acc_tcga_linear_C… TCGA-OR… TCGA-OR-…
#>  4 VENHQS1PUi1BNUo3LTAxO… VENHQS1PUi1BNUo… acc_tcga_linear_C… TCGA-OR… TCGA-OR-…
#>  5 VENHQS1PUi1BNUo4LTAxO… VENHQS1PUi1BNUo… acc_tcga_linear_C… TCGA-OR… TCGA-OR-…
#>  6 VENHQS1PUi1BNUo5LTAxO… VENHQS1PUi1BNUo… acc_tcga_linear_C… TCGA-OR… TCGA-OR-…
#>  7 VENHQS1PUi1BNUpBLTAxO… VENHQS1PUi1BNUp… acc_tcga_linear_C… TCGA-OR… TCGA-OR-…
#>  8 VENHQS1PUi1BNUpQLTAxO… VENHQS1PUi1BNUp… acc_tcga_linear_C… TCGA-OR… TCGA-OR-…
#>  9 VENHQS1PUi1BNUpSLTAxO… VENHQS1PUi1BNUp… acc_tcga_linear_C… TCGA-OR… TCGA-OR-…
#> 10 VENHQS1PUi1BNUpTLTAxO… VENHQS1PUi1BNUp… acc_tcga_linear_C… TCGA-OR… TCGA-OR-…
#> # ℹ 82 more rows
#> # ℹ 2 more variables: studyId <chr>, profiled <lgl>

acc_tcga <- cBioPortalData(
    cbio, by = "hugoGeneSymbol",
    studyId = "acc_tcga",
    genePanelId = "AmpliSeq",
    molecularProfileIds =
        c("acc_tcga_rppa", "acc_tcga_linear_CNA", "acc_tcga_mutations")
)
#> harmonizing input:
#>   removing 1 colData rownames not in sampleMap 'primary'
