This section lists the functions that give users access to the cBioPortal API. The `cBioPortal` function returns the main representation of the API. The supporting functions listed here expose specific parts of the API and let the user explore it with individual calls. Many of them are documented for completeness and are recommended for advanced usage only; most users should only need the main `cBioPortalData` function to obtain data.
cBioPortal(
hostname = "www.cbioportal.org",
protocol = "https",
api. = "/api/v2/api-docs",
token = character()
)
getStudies(api, buildReport = FALSE)
clinicalData(api, studyId = NA_character_)
molecularProfiles(
api,
studyId = NA_character_,
projection = c("SUMMARY", "ID", "DETAILED", "META")
)
fetchData(
api,
molecularProfileIds = NA_character_,
entrezGeneIds = NULL,
sampleIds = NULL
)
mutationData(
api,
molecularProfileIds = NA_character_,
entrezGeneIds = NULL,
sampleIds = NULL
)
molecularData(
api,
molecularProfileIds = NA_character_,
entrezGeneIds = NULL,
sampleIds = NULL
)
searchOps(api, keyword)
samplesInSampleLists(api, sampleListIds = NA_character_)
sampleLists(api, studyId = NA_character_)
allSamples(api, studyId = NA_character_)
getSampleInfo(
api,
studyId = NA_character_,
sampleListIds = NULL,
projection = c("SUMMARY", "ID", "DETAILED", "META")
)
genePanels(api)
getGenePanel(api, genePanelId = NA_character_)
genePanelMolecular(
api,
molecularProfileId = NA_character_,
sampleListId = NULL,
sampleIds = NULL
)
getGenePanelMolecular(api, molecularProfileIds = NA_character_, sampleIds)
geneTable(api, pageSize = 1000, pageNumber = 0, ...)
queryGeneTable(
api,
by = c("entrezGeneId", "hugoGeneSymbol"),
genes = NA_character_,
genePanelId = NA_character_
)
getDataByGenes(
api,
studyId = NA_character_,
genes = NA_character_,
genePanelId = NA_character_,
by = c("entrezGeneId", "hugoGeneSymbol"),
molecularProfileIds = NULL,
sampleListId = NULL,
sampleIds = NULL,
...
)
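Several signatures above give a character vector as the default for `projection` and `by`. The sketch below shows how such a choice vector usually resolves to a single value in R. It assumes the functions rely on base R's `match.arg()`, which this page does not confirm; `show_projection` is a hypothetical stand-in, not part of the package.

```r
# Hypothetical stand-in that mimics the `projection` argument shown in the usage.
# Assumption: the choice vector is resolved with base R's match.arg().
show_projection <- function(projection = c("SUMMARY", "ID", "DETAILED", "META")) {
    # match.arg() returns the first choice when the argument is left at its
    # default, and raises an error for a value outside the listed choices
    match.arg(projection)
}

show_projection()            # argument omitted: "SUMMARY"
show_projection("DETAILED")  # explicit choice:  "DETAILED"
```

Under this assumption, leaving `projection` or `by` unset selects the first listed value, which agrees with the defaults stated for those arguments.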
hostname: character(1) The internet location of the service (default: 'www.cbioportal.org')
protocol: character(1) The internet protocol used to access the hostname (default: 'https')
api.: character(1) The directory location of the API protocol within the hostname (default: '/api/v2/api-docs')
token: character(1) The Authorization Bearer token, e.g., "63eba81c-2591-4e15-9d1c-fb6e8e51e35d", or a path to a text file
api: An API object of class `cBioPortal` from the `cBioPortal` function
buildReport: logical(1) Indicates whether to append the build information to the `getStudies()` table (default: FALSE)
studyId: character(1) Indicates the "studyId" as taken from `getStudies`
projection: character(1) (default: "SUMMARY") Specifies the projection type for data retrieval; for details, see the API documentation
molecularProfileIds: character() A vector of molecular profile IDs
entrezGeneIds: numeric() A vector of Entrez gene IDs
sampleIds: character() Sample identifiers
keyword: character(1) Keyword or pattern for searching through available operations
sampleListIds: character() A vector of 'sampleListId' as obtained from `sampleLists`
genePanelId: character(1) Identifies the gene panel, as obtained from the `genePanels` function
molecularProfileId: character(1) Indicates a molecular profile ID
sampleListId: character(1) A sample list identifier as obtained from `sampleLists()`
pageSize: numeric(1) The number of rows in the table to return
pageNumber: numeric(1) The pagination page number
...: Additional arguments to lower level API functions
by: character(1) Either 'entrezGeneId' or 'hugoGeneSymbol' for row metadata (default: 'entrezGeneId')
genes: character() Either Entrez gene identifiers or Hugo gene symbols. When included, the 'by' argument indicates the type of identifier provided and 'genePanelId' is ignored. Preference is given to Entrez IDs due to faster query responses.
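The connection arguments can be read as parts of one URL plus an optional credential. The following is a minimal sketch of that reading, not the package's implementation: `build_api_url` and `resolve_token` are hypothetical helpers, and both the way the three parts are joined and the rule for telling a token from a file path are assumptions.

```r
# Hypothetical helper: combine the three location arguments into one URL.
# Assumption: the parts are simply concatenated as protocol://hostname/path.
build_api_url <- function(
    hostname = "www.cbioportal.org",
    protocol = "https",
    api. = "/api/v2/api-docs"
) {
    paste0(protocol, "://", hostname, api.)
}

# Hypothetical helper: accept either a literal token or a path to a text file.
resolve_token <- function(token = character()) {
    # An empty token is returned unchanged (no credential supplied)
    if (!length(token)) return(token)
    # Assumption: an existing file path means "read the token from this file"
    if (file.exists(token)) token <- readLines(token, n = 1L, warn = FALSE)
    trimws(token)
}

build_api_url()
#> [1] "https://www.cbioportal.org/api/v2/api-docs"
```

Keeping the token in a file rather than in a script avoids committing credentials to version control, which is presumably why a path is accepted.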
cBioPortal: An API object of class 'cBioPortal'
cBioPortalData: A data object of class 'MultiAssayExperiment'
* getStudies - Obtain a table of studies and associated metadata and optionally include a `buildReport` status (default FALSE) for each study. When enabled, the 'api_build' and 'pack_build' columns will be added to the table and will show if `MultiAssayExperiment` objects can be generated for that particular study identifier (`studyId`). The 'api_build' column corresponds to datasets obtained with `cBioPortalData` and the 'pack_build' column corresponds to datasets loaded via `cBioDataPack`.
* searchOps - Search through API operations with a keyword
* sampleLists - Obtain all `sampleListIds` for a particular `studyId`
* allSamples - Obtain all samples within a particular `studyId`
* genePanels - Show all available gene panels
* geneTable - Get a table of all genes by 'entrezGeneId' and 'hugoGeneSymbol'
* queryGeneTable - Get a table for only the `genes` or `genePanelId` of interest. Gene inputs are identified with the `by` argument
* clinicalData - Obtain clinical data for a particular study identifier ('studyId')
* molecularProfiles - Produce a molecular profiles dataset for a given study identifier ('studyId')
* fetchData - A convenience function to download both mutation and molecular data with `molecularProfileIds`, `entrezGeneIds`, and `sampleIds`
* mutationData - Produce a dataset of mutation data using `molecularProfileIds`, `entrezGeneIds`, and `sampleIds`
* molecularData - Produce a dataset of molecular profile data based on `molecularProfileIds`, `entrezGeneIds`, and `sampleIds`
* samplesInSampleLists - Get all samples associated with each 'sampleListId' in `sampleListIds`
* getSampleInfo - Obtain sample metadata for a particular `studyId` or `sampleListId`
* getGenePanel - Obtain the gene panel for a particular 'genePanelId'
* genePanelMolecular - Get gene panel data for a particular `molecularProfileId` and either a `sampleListId` or a vector of `sampleIds`
* getGenePanelMolecular - Get gene panel data for multiple `molecularProfileIds` and a vector of `sampleIds`
* getDataByGenes - Download data for a number of genes within the given `molecularProfileIds`. Optionally, a `sampleListId` can be provided.
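Because `geneTable` returns one page at a time (`pageSize` rows starting at `pageNumber`), collecting a full table means requesting successive pages. The sketch below illustrates that loop offline. `collect_pages`, `fake_genes`, and `fake_fetch` are hypothetical names with made-up rows, and two assumptions are built in: pages are numbered from 0 (suggested by the default `pageNumber = 0`), and a page shorter than `pageSize` marks the end.

```r
# Hypothetical helper: request pages until a short (or empty) page arrives.
# `fetch_page` is any function(pageSize, pageNumber) returning a data frame.
collect_pages <- function(fetch_page, pageSize = 1000, maxPages = 100) {
    pages <- list()
    for (pageNumber in seq_len(maxPages) - 1L) {  # assumed 0-based numbering
        page <- fetch_page(pageSize = pageSize, pageNumber = pageNumber)
        pages[[length(pages) + 1L]] <- page
        # A page with fewer rows than requested is taken to be the last one
        if (nrow(page) < pageSize) break
    }
    do.call(rbind, pages)
}

# Offline stand-in for the remote gene table (made-up rows, illustration only)
fake_genes <- data.frame(
    entrezGeneId = 1:25,
    hugoGeneSymbol = paste0("GENE", 1:25)
)
fake_fetch <- function(pageSize, pageNumber) {
    first <- pageNumber * pageSize + 1
    idx <- seq_len(nrow(fake_genes))
    fake_genes[idx >= first & idx < first + pageSize, , drop = FALSE]
}

all_genes <- collect_pages(fake_fetch, pageSize = 10)
nrow(all_genes)
#> [1] 25
```

With a live connection, the same helper could be given `function(pageSize, pageNumber) geneTable(cbio, pageSize = pageSize, pageNumber = pageNumber)` as `fetch_page`; the `maxPages` cap is only a guard against an endless loop.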
cbio <- cBioPortal()
getStudies(api = cbio)
#> # A tibble: 399 × 13
#> name description publicStudy pmid citation groups status importDate
#> <chr> <chr> <lgl> <chr> <chr> <chr> <int> <chr>
#> 1 Adenoid Cyst… Whole exom… TRUE 2609… Martelo… ACYC;… 0 2023-12-0…
#> 2 Adenoid Cyst… Whole-exom… TRUE 2368… Ho et a… ACYC;… 0 2023-12-0…
#> 3 Adenoid Cyst… Targeted S… TRUE 2441… Ross et… ACYC;… 0 2023-12-0…
#> 4 Adenoid Cyst… Whole-geno… TRUE 2686… Rettig … ACYC;… 0 2023-12-0…
#> 5 Adenoid Cyst… WGS of 21 … TRUE 2663… Mitani … ACYC;… 0 2023-12-0…
#> 6 Adenoid Cyst… Whole-geno… TRUE 2682… Drier e… ACYC 0 2023-12-0…
#> 7 Adenoid Cyst… Whole exom… TRUE 2377… Stephen… ACYC;… 0 2023-12-0…
#> 8 Adenoid Cyst… Multi-Inst… TRUE 3148… Allen e… ACYC;… 0 2023-12-0…
#> 9 Basal Cell C… Whole-exom… TRUE 2695… Bonilla… PUBLIC 0 2023-12-0…
#> 10 Acute Lympho… Comprehens… TRUE 2573… Anderss… PUBLIC 0 2023-12-0…
#> # ℹ 389 more rows
#> # ℹ 5 more variables: allSampleCount <int>, readPermission <lgl>,
#> # studyId <chr>, cancerTypeId <chr>, referenceGenome <chr>
searchOps(api = cbio, keyword = "molecular")
#> [1] "fetchGenePanelDataInMultipleMolecularProfilesUsingPOST"
#> [2] "getGenericAssayDataInMolecularProfileUsingGET"
#> [3] "fetchGenericAssayDataInMultipleMolecularProfilesUsingPOST"
#> [4] "fetchGenericAssayDataInMolecularProfileUsingPOST"
#> [5] "fetchMolecularDataInMultipleMolecularProfilesUsingPOST"
#> [6] "getAllMolecularProfilesUsingGET"
#> [7] "fetchMolecularProfilesUsingPOST"
#> [8] "getMolecularProfileUsingGET"
#> [9] "getDiscreteCopyNumbersInMolecularProfileUsingGET"
#> [10] "fetchDiscreteCopyNumbersInMolecularProfileUsingPOST"
#> [11] "getAllMolecularDataInMolecularProfileUsingGET"
#> [12] "fetchAllMolecularDataInMolecularProfileUsingPOST"
#> [13] "getMutationsInMolecularProfileBySampleListIdUsingGET"
#> [14] "fetchMutationsInMolecularProfileUsingPOST"
#> [15] "fetchMutationsInMultipleMolecularProfilesUsingPOST"
#> [16] "getAllMolecularProfilesInStudyUsingGET"
## obtain clinical data
acc_clin <- clinicalData(api = cbio, studyId = "acc_tcga")
acc_clin
#> # A tibble: 92 × 85
#> patientId AGE AJCC_PATHOLOGIC_TUMOR_STAGE ATYPICAL_MITOTIC_FIGURES
#> <chr> <chr> <chr> <chr>
#> 1 TCGA-OR-A5J1 58 Stage II Atypical Mitotic Figures Abse…
#> 2 TCGA-OR-A5J2 44 Stage IV Atypical Mitotic Figures Pres…
#> 3 TCGA-OR-A5J3 23 Stage III Atypical Mitotic Figures Abse…
#> 4 TCGA-OR-A5J4 23 Stage IV Atypical Mitotic Figures Abse…
#> 5 TCGA-OR-A5J5 30 Stage III Atypical Mitotic Figures Pres…
#> 6 TCGA-OR-A5J6 29 Stage II Atypical Mitotic Figures Abse…
#> 7 TCGA-OR-A5J7 30 Stage III Atypical Mitotic Figures Pres…
#> 8 TCGA-OR-A5J8 66 Stage III Atypical Mitotic Figures Pres…
#> 9 TCGA-OR-A5J9 22 Stage II Atypical Mitotic Figures Abse…
#> 10 TCGA-OR-A5JA 53 Stage IV Atypical Mitotic Figures Pres…
#> # ℹ 82 more rows
#> # ℹ 81 more variables: CAPSULAR_INVASION <chr>, CLIN_M_STAGE <chr>,
#> # CT_SCAN_PREOP_RESULTS <chr>,
#> # CYTOPLASM_PRESENCE_LESS_THAN_EQUAL_25_PERCENT <chr>,
#> # DAYS_TO_INITIAL_PATHOLOGIC_DIAGNOSIS <chr>, DFS_MONTHS <chr>,
#> # DFS_STATUS <chr>, DIFFUSE_ARCHITECTURE <chr>, ETHNICITY <chr>,
#> # FORM_COMPLETION_DATE <chr>, HISTOLOGICAL_DIAGNOSIS <chr>, …
molecularProfiles(api = cbio, studyId = "acc_tcga")
#> # A tibble: 9 × 8
#> molecularAlterationType datatype name description showProfileInAnalysi…¹
#> <chr> <chr> <chr> <chr> <lgl>
#> 1 PROTEIN_LEVEL LOG2-VALUE Protein… Protein ex… FALSE
#> 2 PROTEIN_LEVEL Z-SCORE Protein… Protein ex… TRUE
#> 3 COPY_NUMBER_ALTERATION DISCRETE Putativ… Putative c… TRUE
#> 4 COPY_NUMBER_ALTERATION CONTINUOUS Capped … Capped rel… FALSE
#> 5 MUTATION_EXTENDED MAF Mutatio… Mutation d… TRUE
#> 6 METHYLATION CONTINUOUS Methyla… Methylatio… FALSE
#> 7 MRNA_EXPRESSION CONTINUOUS mRNA ex… mRNA gene … FALSE
#> 8 MRNA_EXPRESSION Z-SCORE mRNA ex… mRNA expre… TRUE
#> 9 MRNA_EXPRESSION Z-SCORE mRNA ex… Log-transf… TRUE
#> # ℹ abbreviated name: ¹showProfileInAnalysisTab
#> # ℹ 3 more variables: patientLevel <lgl>, molecularProfileId <chr>,
#> # studyId <chr>
genePanels(cbio)
#> # A tibble: 57 × 2
#> description genePanelId
#> <chr> <chr>
#> 1 Targeted (27 cancer genes) sequencing of adenoid cystic carcinom… ACYC_FMI_27
#> 2 Targeted panel of 232 genes. Agilent
#> 3 Targeted panel of 8 genes. AmpliSeq
#> 4 ARCHER-HEME gene panel (199 genes) ARCHER-HEM…
#> 5 ARCHER-SOLID Gene Panel (62 genes) ARCHER-SOL…
#> 6 Targeted sequencing of various tumor types via bait v3. bait_v3
#> 7 Targeted sequencing of various tumor types via bait v4. bait_v4
#> 8 Targeted sequencing of various tumor types via bait v5. bait_v5
#> 9 Targeted panel of 387 cancer-related genes. bcc_unige_…
#> 10 Research (CMO) IMPACT-Heme gene panel version 3. HemePACT_v3
#> # ℹ 47 more rows
(gp <- getGenePanel(cbio, "AmpliSeq"))
#> # A tibble: 8 × 2
#> entrezGeneId hugoGeneSymbol
#> <int> <chr>
#> 1 171023 ASXL1
#> 2 1788 DNMT3A
#> 3 3417 IDH1
#> 4 3418 IDH2
#> 5 23451 SF3B1
#> 6 6427 SRSF2
#> 7 7157 TP53
#> 8 7307 U2AF1
muts <- mutationData(
api = cbio,
molecularProfileIds = "acc_tcga_mutations",
entrezGeneIds = 1:1000,
sampleIds = c("TCGA-OR-A5J1-01", "TCGA-OR-A5J2-01")
)
exps <- molecularData(
api = cbio,
molecularProfileIds = c("acc_tcga_rna_seq_v2_mrna", "acc_tcga_rppa"),
entrezGeneIds = 1:1000,
sampleIds = c("TCGA-OR-A5J1-01", "TCGA-OR-A5J2-01")
)
sampleLists(api = cbio, studyId = "acc_tcga")
#> # A tibble: 9 × 5
#> category name description sampleListId studyId
#> <chr> <chr> <chr> <chr> <chr>
#> 1 all_cases_with_mrna_rnaseq_data Samp… Samples wi… acc_tcga_rn… acc_tc…
#> 2 all_cases_in_study All … All sample… acc_tcga_all acc_tc…
#> 3 all_cases_with_cna_data Samp… Samples wi… acc_tcga_cna acc_tc…
#> 4 all_cases_with_mutation_and_cna_data Samp… Samples wi… acc_tcga_cn… acc_tc…
#> 5 all_cases_with_mutation_and_cna_and_mr… Comp… Samples wi… acc_tcga_3w… acc_tc…
#> 6 all_cases_with_methylation_data Samp… Samples wi… acc_tcga_me… acc_tc…
#> 7 all_cases_with_methylation_data Samp… Samples wi… acc_tcga_me… acc_tc…
#> 8 all_cases_with_rppa_data Samp… Samples pr… acc_tcga_rp… acc_tc…
#> 9 all_cases_with_mutation_data Samp… Samples wi… acc_tcga_se… acc_tc…
samplesInSampleLists(
api = cbio,
sampleListIds = c("acc_tcga_rppa", "acc_tcga_cnaseq")
)
#> CharacterList of length 2
#> [["acc_tcga_cnaseq"]] TCGA-OR-A5J1-01 TCGA-OR-A5J2-01 ... TCGA-PK-A5HC-01
#> [["acc_tcga_rppa"]] TCGA-OR-A5J2-01 TCGA-OR-A5J3-01 ... TCGA-PK-A5HA-01
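A common next step is to find the samples shared by several sample lists before requesting data for them. The sketch below uses a plain named list as a stand-in for the result above, filled with only the few sample IDs visible in the printed output, so it is illustrative and not the full content of either list. For a real result, `as.list()` on the `CharacterList` should yield the same kind of input.

```r
# Illustrative stand-in for the result above: a truncated subset of sample IDs
# copied from the printed output, not the complete sample lists.
sample_lists <- list(
    acc_tcga_cnaseq = c("TCGA-OR-A5J1-01", "TCGA-OR-A5J2-01", "TCGA-PK-A5HC-01"),
    acc_tcga_rppa   = c("TCGA-OR-A5J2-01", "TCGA-OR-A5J3-01", "TCGA-PK-A5HA-01")
)

# Keep only the samples that appear in every list
shared <- Reduce(intersect, sample_lists)
shared
#> [1] "TCGA-OR-A5J2-01"
```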
genePanels(api = cbio)
#> # A tibble: 57 × 2
#> description genePanelId
#> <chr> <chr>
#> 1 Targeted (27 cancer genes) sequencing of adenoid cystic carcinom… ACYC_FMI_27
#> 2 Targeted panel of 232 genes. Agilent
#> 3 Targeted panel of 8 genes. AmpliSeq
#> 4 ARCHER-HEME gene panel (199 genes) ARCHER-HEM…
#> 5 ARCHER-SOLID Gene Panel (62 genes) ARCHER-SOL…
#> 6 Targeted sequencing of various tumor types via bait v3. bait_v3
#> 7 Targeted sequencing of various tumor types via bait v4. bait_v4
#> 8 Targeted sequencing of various tumor types via bait v5. bait_v5
#> 9 Targeted panel of 387 cancer-related genes. bcc_unige_…
#> 10 Research (CMO) IMPACT-Heme gene panel version 3. HemePACT_v3
#> # ℹ 47 more rows
getGenePanel(api = cbio, genePanelId = "IMPACT341")
#> # A tibble: 341 × 2
#> entrezGeneId hugoGeneSymbol
#> <int> <chr>
#> 1 25 ABL1
#> 2 84142 ABRAXAS1
#> 3 207 AKT1
#> 4 208 AKT2
#> 5 10000 AKT3
#> 6 238 ALK
#> 7 242 ALOX12B
#> 8 139285 AMER1
#> 9 324 APC
#> 10 367 AR
#> # ℹ 331 more rows
queryGeneTable(api = cbio, by = "entrezGeneId", genes = 7157)
#> # A tibble: 1 × 3
#> entrezGeneId hugoGeneSymbol type
#> <int> <chr> <chr>
#> 1 7157 TP53 protein-coding
getDataByGenes(
cbio, studyId = "acc_tcga", genes = 1:3,
by = c("entrezGeneId", "hugoGeneSymbol"),
molecularProfileIds = "acc_tcga_rppa",
sampleListId = "acc_tcga_rppa"
)
#> named list()