Managing data downloads is important to save disk space and to avoid re-downloading data files. The integrated BiocFileCache system handles this.

cBioCache(..., ask = interactive())

setCache(
  directory = tools::R_user_dir("cBioPortalData", "cache"),
  verbose = TRUE,
  ask = interactive()
)

removePackCache(cancer_study_id, dry.run = TRUE)

removeDataCache(
  api,
  studyId = NA_character_,
  genePanelId = NA_character_,
  genes = NA_character_,
  molecularProfileIds = NULL,
  sampleListId = NULL,
  sampleIds = NULL,
  by = c("entrezGeneId", "hugoGeneSymbol"),
  dry.run = TRUE,
  ...
)

Arguments

...

For cBioCache, arguments passed to setCache

ask

logical Whether to confirm the file location of the cache directory (default: TRUE in an interactive session)

directory

The directory where the cache is located. Once set, future downloads will go to this folder.

verbose

Whether to print descriptive messages

cancer_study_id

character(1) The studyId from getStudies

dry.run

logical If TRUE (the default), only report the cache files that would be removed without deleting them; set to FALSE to remove them.

api

An API object of class `cBioPortal` from the `cBioPortal` function

studyId

character(1) Indicates the "studyId" as taken from `getStudies`

genePanelId

character(1) Identifies the gene panel, as obtained from the `genePanels` function

genes

character() Either Entrez gene identifiers or Hugo gene symbols. When included, the 'by' argument indicates the type of identifier provided and 'genePanelId' is ignored. Preference is given to Entrez IDs due to faster query responses.

molecularProfileIds

character() A vector of molecular profile IDs

sampleListId

character(1) A sample list identifier as obtained from `sampleLists()`

sampleIds

character() Sample identifiers

by

character(1) Either 'entrezGeneId' or 'hugoGeneSymbol' for row metadata (default: 'entrezGeneId')

Value

cBioCache: The path to the cache location

cBioCache

Get the directory location of the cache. If the cache does not exist yet, the user is prompted to create it. A specific directory can be used via setCache.

setCache

Specify the directory location of the data cache. By default, it will go to the user directory as given by:


    tools::R_user_dir("cBioPortalData", "cache")
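As a minimal sketch, a script could point the cache at its own folder as shown here. The folder name `cbio-cache` under `tempdir()` is purely illustrative, and `ask = FALSE` is used on the assumption that it skips the confirmation prompt so the call can run unattended.

```r
library(cBioPortalData)

# Hypothetical cache folder for this session (the path is an assumption)
cache_dir <- file.path(tempdir(), "cbio-cache")
dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)

# Point future downloads at the new folder without prompting
setCache(directory = cache_dir, verbose = TRUE, ask = FALSE)

# cBioCache() should now report the folder set above
cBioCache()
```

A temporary directory is discarded at the end of the R session, so a persistent path is the better choice when downloads should be reused across sessions.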

removePackCache

Some files may become corrupt during download. This function allows the user to delete the tarball associated with a cancer_study_id in the cache. It only works for files downloaded with the cBioDataPack function. To remove the entire cBioPortalData cache in its default location, run unlink(tools::R_user_dir("cBioPortalData", "cache"), recursive = TRUE).
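Before wiping the whole cache, it can help to look at what it contains. The sketch below uses base R only and assumes the cache sits in the default location; if setCache was used to move it, substitute that directory. The deletion itself is left commented out.

```r
# Resolve the default cache directory (assumes setCache() was not used to
# point the cache somewhere else)
cache_dir <- tools::R_user_dir("cBioPortalData", which = "cache")

if (dir.exists(cache_dir)) {
    cached <- list.files(cache_dir, full.names = TRUE)
    message(length(cached), " file(s) found in ", cache_dir)
    # Uncomment to delete the directory and everything in it;
    # recursive = TRUE is required for unlink() to remove a directory
    # unlink(cache_dir, recursive = TRUE)
} else {
    message("No cache directory at ", cache_dir)
}
```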

Examples


cBioCache()

removePackCache("acc_tcga", dry.run = TRUE)
#> # A tibble: 1 × 10
#>   rid   rname create_time access_time rpath rtype fpath last_modified_time etag 
#>   <chr> <chr> <chr>       <chr>       <chr> <chr> <chr> <chr>              <chr>
#> 1 BFC16 acc_… 2024-02-02… 2024-02-02… /git… web   http… 2024-01-13 16:53:… a940…
#> # ℹ 1 more variable: expires <dbl>


cbio <- cBioPortal()

cBioPortalData(
    cbio, by = "hugoGeneSymbol",
    studyId = "acc_tcga",
    genePanelId = "AmpliSeq",
    molecularProfileIds =
        c("acc_tcga_rppa", "acc_tcga_linear_CNA", "acc_tcga_mutations")
)
#> harmonizing input:
#>   removing 1 colData rownames not in sampleMap 'primary'
#> A MultiAssayExperiment object of 3 listed
#>  experiments with user-defined names and respective classes.
#>  Containing an ExperimentList class object of length 3:
#>  [1] acc_tcga_mutations: RangedSummarizedExperiment with 25 rows and 20 columns
#>  [2] acc_tcga_linear_CNA: SummarizedExperiment with 8 rows and 90 columns
#>  [3] acc_tcga_rppa: SummarizedExperiment with 1 rows and 46 columns
#> Functionality:
#>  experiments() - obtain the ExperimentList instance
#>  colData() - the primary/phenotype DataFrame
#>  sampleMap() - the sample coordination DataFrame
#>  `$`, `[`, `[[` - extract colData columns, subset, or experiment
#>  *Format() - convert into a long or wide DataFrame
#>  assays() - convert ExperimentList to a SimpleList of matrices
#>  exportClass() - save data to flat files

removeDataCache(
    cbio, by = "hugoGeneSymbol",
    studyId = "acc_tcga",
    genePanelId = "AmpliSeq",
    molecularProfileIds =
        c("acc_tcga_rppa", "acc_tcga_linear_CNA", "acc_tcga_mutations"),
    dry.run = TRUE
)
#>                                               experiment_cache.BFC15 
#> "/github/home/.cache/R/cBioPortalData/13502808251d_13502808251d.rda" 
#>                                                  clinical_cache.BFC3 
#> "/github/home/.cache/R/cBioPortalData/114820abaa7b_114820abaa7b.rda"
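The call above only previews the cache files that match the query. Assuming `dry.run = FALSE` performs the actual deletion, the same arguments can be repeated to remove them. This is a sketch rather than a prescribed workflow; it needs network access to the cBioPortal API.

```r
library(cBioPortalData)

cbio <- cBioPortal()

# Same arguments as the cBioPortalData() call whose cached data should go;
# dry.run = FALSE is assumed to delete the files listed by the dry run
removeDataCache(
    cbio, by = "hugoGeneSymbol",
    studyId = "acc_tcga",
    genePanelId = "AmpliSeq",
    molecularProfileIds =
        c("acc_tcga_rppa", "acc_tcga_linear_CNA", "acc_tcga_mutations"),
    dry.run = FALSE
)
```

Running the dry run first and comparing the listed paths with the intended query is a cheap safeguard against deleting data that would be slow to download again.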