Managing data downloads is important for saving disk space and avoiding repeated downloads of the same data files. This can be done via the integrated BiocFileCache system.
cBioCache(..., ask = interactive())
setCache(
  directory = tools::R_user_dir("cBioPortalData", "cache"),
  verbose = TRUE,
  ask = interactive()
)
removePackCache(cancer_study_id, dry.run = TRUE)
removeDataCache(
  api,
  studyId = NA_character_,
  genePanelId = NA_character_,
  genes = NA_character_,
  molecularProfileIds = NULL,
  sampleListId = NULL,
  sampleIds = NULL,
  by = c("entrezGeneId", "hugoGeneSymbol"),
  dry.run = TRUE,
  ...
)
...: For cBioCache, arguments passed to setCache
ask: logical (default TRUE when interactive session) Confirm the file location of the cache directory
directory: The file location where the cache is located. Once set, future downloads will go to this folder.
verbose: Whether to print descriptive messages
cancer_study_id: character(1) The studyId from getStudies
dry.run: logical Whether to only report the cache files that would be removed, without deleting them (default TRUE).
api: An API object of class `cBioPortal` from the `cBioPortal` function
studyId: character(1) Indicates the "studyId" as taken from `getStudies`
genePanelId: character(1) Identifies the gene panel, as obtained from the `genePanels` function
genes: character() Either Entrez gene identifiers or Hugo gene symbols. When included, the 'by' argument indicates the type of identifier provided and 'genePanelId' is ignored. Preference is given to Entrez IDs due to faster query responses.
molecularProfileIds: character() A vector of molecular profile IDs
sampleListId: character(1) A sample list identifier as obtained from `sampleLists()`
sampleIds: character() Sample identifiers
by: character(1) Either 'entrezGeneId' or 'hugoGeneSymbol' for row metadata (default: 'entrezGeneId')
cBioCache: The path to the cache location
Get the directory location of the cache. It will prompt the user to create a cache if not already created. A specific directory can be used via setCache.
Specify the directory location of the data cache. By default, it will go to the user directory as given by:
tools::R_user_dir("cBioPortalData", "cache")
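To see where that default resolves on a given machine, the path can be inspected with base R alone. This is a minimal sketch, assuming R 4.0.0 or later (where `tools::R_user_dir` is available); the variable name `default_cache` is only illustrative, and the resulting path depends on the operating system and environment variables.

```r
# Resolve the default cache location without creating or modifying anything.
default_cache <- tools::R_user_dir("cBioPortalData", which = "cache")

# Print the platform-specific path.
print(default_cache)

# Check whether a cache has already been created at that location.
dir.exists(default_cache)
```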
Some files may become corrupt when downloading. This function allows the user to delete the tarball associated with a cancer_study_id in the cache. This only works for the cBioDataPack function. To remove the entire cBioPortalData cache, run unlink("~/.cache/cBioPortalData", recursive = TRUE), replacing the path with the directory reported by cBioCache() if the cache is located elsewhere.
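Deleting a whole cache directory from R relies on `unlink()`, which removes directories only when `recursive = TRUE` is supplied. The sketch below illustrates this on a throwaway folder under `tempdir()` rather than on a real cache; the folder and file names are hypothetical stand-ins.

```r
# Stand-in for a cache directory (assumption: a temporary demo folder,
# not the actual cBioPortalData cache).
cache_dir <- file.path(tempdir(), "cBioPortalData-demo-cache")
dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
file.create(file.path(cache_dir, "example.rda"))

# recursive = TRUE is needed for unlink() to delete a directory
# together with its contents.
status <- unlink(cache_dir, recursive = TRUE)

status                 # 0 indicates success
dir.exists(cache_dir)  # FALSE once the directory is gone
```

To apply the same call to an actual cache, substitute the directory reported by `cBioCache()` for `cache_dir`.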
cBioCache()
removePackCache("acc_tcga", dry.run = TRUE)
#> # A tibble: 1 × 10
#> rid rname create_time access_time rpath rtype fpath last_modified_time etag
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 BFC16 acc_… 2024-02-02… 2024-02-02… /git… web http… 2024-01-13 16:53:… a940…
#> # ℹ 1 more variable: expires <dbl>
cbio <- cBioPortal()
cBioPortalData(
  cbio, by = "hugoGeneSymbol",
  studyId = "acc_tcga",
  genePanelId = "AmpliSeq",
  molecularProfileIds =
    c("acc_tcga_rppa", "acc_tcga_linear_CNA", "acc_tcga_mutations")
)
#> harmonizing input:
#> removing 1 colData rownames not in sampleMap 'primary'
#> A MultiAssayExperiment object of 3 listed
#> experiments with user-defined names and respective classes.
#> Containing an ExperimentList class object of length 3:
#> [1] acc_tcga_mutations: RangedSummarizedExperiment with 25 rows and 20 columns
#> [2] acc_tcga_linear_CNA: SummarizedExperiment with 8 rows and 90 columns
#> [3] acc_tcga_rppa: SummarizedExperiment with 1 rows and 46 columns
#> Functionality:
#> experiments() - obtain the ExperimentList instance
#> colData() - the primary/phenotype DataFrame
#> sampleMap() - the sample coordination DataFrame
#> `$`, `[`, `[[` - extract colData columns, subset, or experiment
#> *Format() - convert into a long or wide DataFrame
#> assays() - convert ExperimentList to a SimpleList of matrices
#> exportClass() - save data to flat files
removeDataCache(
  cbio, by = "hugoGeneSymbol",
  studyId = "acc_tcga",
  genePanelId = "AmpliSeq",
  molecularProfileIds =
    c("acc_tcga_rppa", "acc_tcga_linear_CNA", "acc_tcga_mutations"),
  dry.run = TRUE
)
#> experiment_cache.BFC15
#> "/github/home/.cache/R/cBioPortalData/13502808251d_13502808251d.rda"
#> clinical_cache.BFC3
#> "/github/home/.cache/R/cBioPortalData/114820abaa7b_114820abaa7b.rda"