Obtain a MultiAssayExperiment object for a particular combination of gene panel, studyId, molecularProfileIds, and sampleListId. By default, molecularProfileIds and sampleListId are set to NULL to include all data. This option is best for users who wish to obtain the part of the study data that pertains to a specific molecular profile and gene panel combination. Users looking to download the entire study data as provided at https://cbioportal.org/datasets should refer to cBioDataPack.
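For comparison, here is a minimal sketch of the full-study route mentioned above. It assumes that the acc_tcga study used in the examples on this page is also available as a packaged dataset. Setting ask = FALSE skips the interactive prompt.

```r
library(cBioPortalData)

# Download and load the complete packaged study data.
# "acc_tcga" is assumed here for illustration; it is the study ID
# used in the other examples on this page.
acc_pack <- cBioDataPack("acc_tcga", ask = FALSE)
acc_pack
```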
cBioPortalData(
    api,
    studyId = NA_character_,
    genePanelId = NA_character_,
    genes = NA_character_,
    molecularProfileIds = NULL,
    sampleListId = NULL,
    sampleIds = NULL,
    by = c("entrezGeneId", "hugoGeneSymbol"),
    check_build = TRUE,
    ask = interactive()
)
api: An API object of class `cBioPortal` from the `cBioPortal` function
studyId: character(1) Indicates the "studyId" as taken from `getStudies`
genePanelId: character(1) Identifies the gene panel, as obtained from the `genePanels` function
genes: character() Either Entrez gene identifiers or Hugo gene symbols. When included, the 'by' argument indicates the type of identifier provided and 'genePanelId' is ignored. Preference is given to Entrez IDs due to faster query responses.
molecularProfileIds: character() A vector of molecular profile IDs
sampleListId: character(1) A sample list identifier as obtained from `sampleLists()`
sampleIds: character() Sample identifiers
by: character(1) Either 'entrezGeneId' or 'hugoGeneSymbol' for row metadata (default: 'entrezGeneId')
check_build: logical(1) Whether to check the build status of the studyId using an internal dataset. Set this argument to FALSE when using alternative hostnames, e.g., 'pedcbioportal.kidsfirstdrc.org'.
ask: logical(1) Whether to prompt the user before downloading and loading a study that, based on previous testing, does not currently build as a MultiAssayExperiment. Set to interactive() by default. In a non-interactive session, the data download will be attempted, which is equivalent to ask = FALSE. The argument is also used when a cache directory needs to be created when using downloadStudy.
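As an illustration of the genes and by arguments described above, the following sketch requests two genes by Hugo symbol instead of a gene panel. The gene symbols and the choice of molecular profile are assumptions made for this example only, and the call has not been checked against the live API.

```r
library(cBioPortalData)

cbio <- cBioPortal()

# 'genes' takes the place of 'genePanelId'; 'by' states that the
# identifiers supplied are Hugo gene symbols rather than Entrez IDs.
# The genes and the molecular profile are illustrative choices.
acc_genes <- cBioPortalData(
    api = cbio,
    studyId = "acc_tcga",
    genes = c("TP53", "CTNNB1"),
    by = "hugoGeneSymbol",
    molecularProfileIds = "acc_tcga_linear_CNA"
)
acc_genes
```

Supplying Entrez identifiers with by = "entrezGeneId" follows the same pattern and, per the argument description, may respond faster.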
A MultiAssayExperiment object
With IMPACT341 as the example genePanelId, cBioPortalData successfully represents 98 percent of the study identifiers as MultiAssayExperiment objects. Datasets that currently fail to import are flagged in the "api_build" column of the dataset returned by getStudies(..., buildReport = TRUE).
Note that changes to the cBioPortal API may affect this rate at any time. If you encounter any issues, please open an issue at https://github.com/waldronlab/cBioPortalData/issues/ with a fully reproducible example.
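The build report described above can be inspected along these lines. This is a sketch that assumes the "api_build" column is a logical vector in which FALSE marks a study that failed to import and NA marks a study that has not been tested.

```r
library(cBioPortalData)

cbio <- cBioPortal()

# Study table with the additional build status columns
studies <- getStudies(cbio, buildReport = TRUE)

# Keep studies whose "api_build" flag is FALSE (assumed logical);
# which() drops NA entries, i.e. studies without a recorded status.
failing <- studies[which(!studies$api_build), c("studyId", "api_build")]
head(failing)
```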
cbio <- cBioPortal()
samps <- samplesInSampleLists(cbio, "acc_tcga_rppa")[[1]]
getGenePanelMolecular(
    cbio, molecularProfileIds = c("acc_tcga_rppa", "acc_tcga_linear_CNA"),
    samps
)
#> # A tibble: 92 × 7
#> uniqueSampleKey uniquePatientKey molecularProfileId sampleId patientId
#> <chr> <chr> <chr> <chr> <chr>
#> 1 VENHQS1PUi1BNUoyLTAxO… VENHQS1PUi1BNUo… acc_tcga_linear_C… TCGA-OR… TCGA-OR-…
#> 2 VENHQS1PUi1BNUozLTAxO… VENHQS1PUi1BNUo… acc_tcga_linear_C… TCGA-OR… TCGA-OR-…
#> 3 VENHQS1PUi1BNUo2LTAxO… VENHQS1PUi1BNUo… acc_tcga_linear_C… TCGA-OR… TCGA-OR-…
#> 4 VENHQS1PUi1BNUo3LTAxO… VENHQS1PUi1BNUo… acc_tcga_linear_C… TCGA-OR… TCGA-OR-…
#> 5 VENHQS1PUi1BNUo4LTAxO… VENHQS1PUi1BNUo… acc_tcga_linear_C… TCGA-OR… TCGA-OR-…
#> 6 VENHQS1PUi1BNUo5LTAxO… VENHQS1PUi1BNUo… acc_tcga_linear_C… TCGA-OR… TCGA-OR-…
#> 7 VENHQS1PUi1BNUpBLTAxO… VENHQS1PUi1BNUp… acc_tcga_linear_C… TCGA-OR… TCGA-OR-…
#> 8 VENHQS1PUi1BNUpQLTAxO… VENHQS1PUi1BNUp… acc_tcga_linear_C… TCGA-OR… TCGA-OR-…
#> 9 VENHQS1PUi1BNUpSLTAxO… VENHQS1PUi1BNUp… acc_tcga_linear_C… TCGA-OR… TCGA-OR-…
#> 10 VENHQS1PUi1BNUpTLTAxO… VENHQS1PUi1BNUp… acc_tcga_linear_C… TCGA-OR… TCGA-OR-…
#> # ℹ 82 more rows
#> # ℹ 2 more variables: studyId <chr>, profiled <lgl>
acc_tcga <- cBioPortalData(
    cbio, by = "hugoGeneSymbol",
    studyId = "acc_tcga",
    genePanelId = "AmpliSeq",
    molecularProfileIds =
        c("acc_tcga_rppa", "acc_tcga_linear_CNA", "acc_tcga_mutations")
)
#> harmonizing input:
#> removing 1 colData rownames not in sampleMap 'primary'