This section documents the functions that give access to the cBioPortal API. The `cBioPortal` function returns the main representation of the API. The supporting functions listed here query specific parts of the API and let the user explore it with individual calls. Many of them are documented for completeness and are recommended for advanced usage only; most users should only need the main `cBioPortalData` function to obtain data.

cBioPortal(
  hostname = "www.cbioportal.org",
  protocol = "https",
  api. = "/api/v2/api-docs",
  token = character()
)

getStudies(api, buildReport = FALSE)

clinicalData(api, studyId = NA_character_)

molecularProfiles(
  api,
  studyId = NA_character_,
  projection = c("SUMMARY", "ID", "DETAILED", "META")
)

fetchData(
  api,
  molecularProfileIds = NA_character_,
  entrezGeneIds = NULL,
  sampleIds = NULL
)

mutationData(
  api,
  molecularProfileIds = NA_character_,
  entrezGeneIds = NULL,
  sampleIds = NULL
)

molecularData(
  api,
  molecularProfileIds = NA_character_,
  entrezGeneIds = NULL,
  sampleIds = NULL
)

searchOps(api, keyword)

samplesInSampleLists(api, sampleListIds = NA_character_)

sampleLists(api, studyId = NA_character_)

allSamples(api, studyId = NA_character_)

getSampleInfo(
  api,
  studyId = NA_character_,
  sampleListIds = NULL,
  projection = c("SUMMARY", "ID", "DETAILED", "META")
)

genePanels(api)

getGenePanel(api, genePanelId = NA_character_)

genePanelMolecular(
  api,
  molecularProfileId = NA_character_,
  sampleListId = NULL,
  sampleIds = NULL
)

getGenePanelMolecular(api, molecularProfileIds = NA_character_, sampleIds)

geneTable(api, pageSize = 1000, pageNumber = 0, ...)

queryGeneTable(
  api,
  by = c("entrezGeneId", "hugoGeneSymbol"),
  genes = NA_character_,
  genePanelId = NA_character_
)

getDataByGenes(
  api,
  studyId = NA_character_,
  genes = NA_character_,
  genePanelId = NA_character_,
  by = c("entrezGeneId", "hugoGeneSymbol"),
  molecularProfileIds = NULL,
  sampleListId = NULL,
  sampleIds = NULL,
  ...
)

Arguments

hostname

character(1) The internet location of the service (default: 'www.cbioportal.org')

protocol

character(1) The internet protocol used to access the hostname (default: 'https')

api.

character(1) The directory location of the API protocol within the hostname (default: '/api/v2/api-docs')

token

character(1) The Authorization Bearer token, e.g., "63eba81c-2591-4e15-9d1c-fb6e8e51e35d", or the path to a text file containing it.

api

An API object of class `cBioPortal` from the `cBioPortal` function

buildReport

logical(1) Indicates whether to append the build information to the `getStudies()` table (default FALSE)

studyId

character(1) Indicates the "studyId" as taken from `getStudies`

projection

character(1) The projection type for data retrieval (default: 'SUMMARY'); see the API documentation for details

molecularProfileIds

character() A vector of molecular profile IDs

entrezGeneIds

numeric() A vector of Entrez gene IDs

sampleIds

character() Sample identifiers

keyword

character(1) Keyword or pattern for searching through available operations

sampleListIds

character() A vector of 'sampleListId' as obtained from `sampleLists`

genePanelId

character(1) Identifies the gene panel, as obtained from the `genePanels` function

molecularProfileId

character(1) Indicates a molecular profile ID

sampleListId

character(1) A sample list identifier as obtained from `sampleLists()`

pageSize

numeric(1) The number of rows in the table to return

pageNumber

numeric(1) The pagination page number

...

Additional arguments to lower level API functions

by

character(1) Either 'entrezGeneId' or 'hugoGeneSymbol' for row metadata (default: 'entrezGeneId')

genes

character() Either Entrez gene identifiers or Hugo gene symbols. When included, the 'by' argument indicates the type of identifier provided and 'genePanelId' is ignored. Preference is given to Entrez IDs due to faster query responses.
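
Arguments such as `projection` and `by` list every accepted value as their default. The following is a minimal sketch of how such a default typically resolves in R. It assumes, without confirmation from this page, that the functions validate these arguments with base R's `match.arg()`; `resolve_projection` is a hypothetical stand-in and not a package function.

```r
# Hypothetical stand-in (not package code): it only mirrors the
# `projection` default shown in the usage section.
resolve_projection <- function(
    projection = c("SUMMARY", "ID", "DETAILED", "META")
) {
    # match.arg() returns the first choice when the argument is left at its
    # default and raises an error for values outside the listed choices
    match.arg(projection)
}

resolve_projection()
resolve_projection("DETAILED")
```

Under that assumption, omitting `projection` is equivalent to passing "SUMMARY", which agrees with the documented default.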

Value

cBioPortal: An API object of class 'cBioPortal'

cBioPortalData: A data object of class 'MultiAssayExperiment'

API Metadata

* getStudies - Obtain a table of studies and associated metadata, optionally including a `buildReport` status (default FALSE) for each study. When enabled, the 'api_build' and 'pack_build' columns are added to the table and show whether `MultiAssayExperiment` objects can be generated for that particular study identifier (`studyId`). The 'api_build' column corresponds to datasets obtained with `cBioPortalData` and the 'pack_build' column corresponds to datasets loaded via `cBioDataPack`.

* searchOps - Search through API operations with a keyword

* sampleLists - Obtain all `sampleListIds` for a particular `studyId`

* allSamples - Obtain all samples within a particular `studyId`

* genePanels - Show all available gene panels

* geneTable - Get a table of all genes by 'entrezGeneId' and 'hugoGeneSymbol'

* queryGeneTable - Get a table for only the `genes` or `genePanelId` of interest. Gene inputs are identified with the `by` argument
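
`geneTable` returns one page of the gene table per call, controlled by `pageSize` and `pageNumber`. The sketch below shows one way to collect every page. It assumes that pages are numbered from zero (suggested by the `pageNumber = 0` default) and that a short page marks the end of the table. `fetch_all_pages`, `fetch_page`, `toy_genes`, and `toy_page` are hypothetical names introduced for illustration; the toy rows reuse gene identifiers from this page's example output so that the code runs offline.

```r
# `fetch_page` is a hypothetical callback: function(pageSize, pageNumber)
# returning a data.frame with at most `pageSize` rows.
fetch_all_pages <- function(fetch_page, pageSize = 1000, maxPages = 100) {
    pages <- list()
    for (pageNumber in seq_len(maxPages) - 1L) {  # assumed zero-based pages
        page <- fetch_page(pageSize = pageSize, pageNumber = pageNumber)
        pages[[length(pages) + 1L]] <- page
        # a short (or empty) page is taken to mean the table is exhausted
        if (nrow(page) < pageSize) break
    }
    do.call(rbind, pages)
}

# Offline stand-in for the gene table, paged the same way
toy_genes <- data.frame(
    entrezGeneId = c(25L, 207L, 208L, 238L, 324L),
    hugoGeneSymbol = c("ABL1", "AKT1", "AKT2", "ALK", "APC"),
    stringsAsFactors = FALSE
)
toy_page <- function(pageSize, pageNumber) {
    start <- pageNumber * pageSize + 1L
    if (start > nrow(toy_genes)) return(toy_genes[0, ])
    toy_genes[start:min(start + pageSize - 1L, nrow(toy_genes)), ]
}

all_genes <- fetch_all_pages(toy_page, pageSize = 2)
nrow(all_genes)
```

Against a live API object, the callback could wrap the real call, for example `function(pageSize, pageNumber) geneTable(cbio, pageSize = pageSize, pageNumber = pageNumber)`; the `maxPages` cap is a safeguard against looping indefinitely if the stopping assumption does not hold.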

Patient Data

* clinicalData - Obtain clinical data for a particular study identifier ('studyId')

Molecular Profiles

* molecularProfiles - Produce a molecular profiles dataset for a given study identifier ('studyId')

Molecular Data

* fetchData - A convenience function to download both mutation and molecular data with `molecularProfileIds`, `entrezGeneIds`, and `sampleIds`

* mutationData - Produce a dataset of mutation data using `molecularProfileIds`, `entrezGeneIds`, and `sampleIds`

* molecularData - Produce a dataset of molecular profile data based on `molecularProfileIds`, `entrezGeneIds`, and `sampleIds`

Sample Data

* samplesInSampleLists - Get all samples associated with each of the `sampleListIds`

* getSampleInfo - Obtain sample metadata for a particular `studyId` or `sampleListIds`
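
`samplesInSampleLists` returns samples grouped by sample list, which makes it straightforward to find the samples shared by several lists before requesting data for them. The following is a minimal offline sketch. It uses a plain named list as a stand-in for the returned `CharacterList`, filled only with the sample IDs visible in this page's truncated example output; `toy_lists` and `shared` are illustrative names.

```r
# Plain-list stand-in for the CharacterList result; the vectors hold only
# the sample IDs visible in the truncated example output on this page.
toy_lists <- list(
    acc_tcga_cnaseq = c("TCGA-OR-A5J1-01", "TCGA-OR-A5J2-01", "TCGA-PK-A5HC-01"),
    acc_tcga_rppa = c("TCGA-OR-A5J2-01", "TCGA-OR-A5J3-01", "TCGA-PK-A5HA-01")
)

# Keep only the samples present in every list
shared <- Reduce(intersect, toy_lists)
shared
```

With a live result, converting with `as.list()` first should give the same structure (an assumption, not verified here); the shared IDs can then be supplied as `sampleIds` to `molecularData` or `mutationData`.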

Gene Panels

* getGenePanel - Obtain the gene panel for a particular `genePanelId`

* genePanelMolecular - Get gene panel data for a particular `molecularProfileId` and either a `sampleListId` or a vector of `sampleIds`

* getGenePanelMolecular - Get gene panel data for multiple `molecularProfileIds` and a vector of `sampleIds`
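
A gene panel table pairs each 'entrezGeneId' with its 'hugoGeneSymbol', so it can be used to translate symbols into the Entrez IDs that `mutationData` and `molecularData` accept. The sketch below rebuilds the 'AmpliSeq' panel from this page's example output as a plain data frame so that it runs offline; `ampliseq`, `wanted`, and `ids` are illustrative names, and the selection of symbols is arbitrary.

```r
# Offline copy of the 'AmpliSeq' panel as printed in the Examples section
ampliseq <- data.frame(
    entrezGeneId = c(171023L, 1788L, 3417L, 3418L, 23451L, 6427L, 7157L, 7307L),
    hugoGeneSymbol = c(
        "ASXL1", "DNMT3A", "IDH1", "IDH2", "SF3B1", "SRSF2", "TP53", "U2AF1"
    ),
    stringsAsFactors = FALSE
)

# Look up the Entrez IDs for a few symbols of interest, in the order given
wanted <- c("TP53", "IDH1", "IDH2")
ids <- ampliseq$entrezGeneId[match(wanted, ampliseq$hugoGeneSymbol)]
ids
```

A vector built this way could replace a broad range such as `1:1000` in the `entrezGeneIds` argument when only a panel's genes are of interest.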

Genes

* getDataByGenes - Download data for a number of genes within the given `molecularProfileIds`; a `sampleListId` can optionally be provided.

Examples

cbio <- cBioPortal()

getStudies(api = cbio)
#> # A tibble: 399 × 13
#>    name          description publicStudy pmid  citation groups status importDate
#>    <chr>         <chr>       <lgl>       <chr> <chr>    <chr>   <int> <chr>     
#>  1 Adenoid Cyst… Whole exom… TRUE        2609… Martelo… ACYC;…      0 2023-12-0…
#>  2 Adenoid Cyst… Whole-exom… TRUE        2368… Ho et a… ACYC;…      0 2023-12-0…
#>  3 Adenoid Cyst… Targeted S… TRUE        2441… Ross et… ACYC;…      0 2023-12-0…
#>  4 Adenoid Cyst… Whole-geno… TRUE        2686… Rettig … ACYC;…      0 2023-12-0…
#>  5 Adenoid Cyst… WGS of 21 … TRUE        2663… Mitani … ACYC;…      0 2023-12-0…
#>  6 Adenoid Cyst… Whole-geno… TRUE        2682… Drier e… ACYC        0 2023-12-0…
#>  7 Adenoid Cyst… Whole exom… TRUE        2377… Stephen… ACYC;…      0 2023-12-0…
#>  8 Adenoid Cyst… Multi-Inst… TRUE        3148… Allen e… ACYC;…      0 2023-12-0…
#>  9 Basal Cell C… Whole-exom… TRUE        2695… Bonilla… PUBLIC      0 2023-12-0…
#> 10 Acute Lympho… Comprehens… TRUE        2573… Anderss… PUBLIC      0 2023-12-0…
#> # ℹ 389 more rows
#> # ℹ 5 more variables: allSampleCount <int>, readPermission <lgl>,
#> #   studyId <chr>, cancerTypeId <chr>, referenceGenome <chr>
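
With `buildReport = TRUE`, the studies table gains the 'api_build' and 'pack_build' columns described in the API Metadata section. The following is a minimal offline sketch of filtering on them. It assumes both columns are logical; `toy_studies`, its study IDs, its values, and the helper `buildable` are hypothetical and invented purely for illustration.

```r
# Toy stand-in for getStudies(cbio, buildReport = TRUE); only the columns
# used below are included, and all values are invented for illustration.
toy_studies <- data.frame(
    studyId = c("study_a", "study_b", "study_c"),
    api_build = c(TRUE, FALSE, NA),
    pack_build = c(TRUE, TRUE, FALSE),
    stringsAsFactors = FALSE
)

# Return the study IDs whose chosen build column is TRUE
buildable <- function(studies, via = c("api_build", "pack_build")) {
    via <- match.arg(via)
    # %in% TRUE treats NA as "not known to build"
    studies$studyId[studies[[via]] %in% TRUE]
}

buildable(toy_studies)
buildable(toy_studies, via = "pack_build")
```

With a live API object, `buildable(getStudies(cbio, buildReport = TRUE))` would apply the same filter, provided the columns have the types assumed here.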

searchOps(api = cbio, keyword = "molecular")
#>  [1] "fetchGenePanelDataInMultipleMolecularProfilesUsingPOST"   
#>  [2] "getGenericAssayDataInMolecularProfileUsingGET"            
#>  [3] "fetchGenericAssayDataInMultipleMolecularProfilesUsingPOST"
#>  [4] "fetchGenericAssayDataInMolecularProfileUsingPOST"         
#>  [5] "fetchMolecularDataInMultipleMolecularProfilesUsingPOST"   
#>  [6] "getAllMolecularProfilesUsingGET"                          
#>  [7] "fetchMolecularProfilesUsingPOST"                          
#>  [8] "getMolecularProfileUsingGET"                              
#>  [9] "getDiscreteCopyNumbersInMolecularProfileUsingGET"         
#> [10] "fetchDiscreteCopyNumbersInMolecularProfileUsingPOST"      
#> [11] "getAllMolecularDataInMolecularProfileUsingGET"            
#> [12] "fetchAllMolecularDataInMolecularProfileUsingPOST"         
#> [13] "getMutationsInMolecularProfileBySampleListIdUsingGET"     
#> [14] "fetchMutationsInMolecularProfileUsingPOST"                
#> [15] "fetchMutationsInMultipleMolecularProfilesUsingPOST"       
#> [16] "getAllMolecularProfilesInStudyUsingGET"                   

## obtain clinical data
acc_clin <- clinicalData(api = cbio, studyId = "acc_tcga")
acc_clin
#> # A tibble: 92 × 85
#>    patientId    AGE   AJCC_PATHOLOGIC_TUMOR_STAGE ATYPICAL_MITOTIC_FIGURES      
#>    <chr>        <chr> <chr>                       <chr>                         
#>  1 TCGA-OR-A5J1 58    Stage II                    Atypical Mitotic Figures Abse…
#>  2 TCGA-OR-A5J2 44    Stage IV                    Atypical Mitotic Figures Pres…
#>  3 TCGA-OR-A5J3 23    Stage III                   Atypical Mitotic Figures Abse…
#>  4 TCGA-OR-A5J4 23    Stage IV                    Atypical Mitotic Figures Abse…
#>  5 TCGA-OR-A5J5 30    Stage III                   Atypical Mitotic Figures Pres…
#>  6 TCGA-OR-A5J6 29    Stage II                    Atypical Mitotic Figures Abse…
#>  7 TCGA-OR-A5J7 30    Stage III                   Atypical Mitotic Figures Pres…
#>  8 TCGA-OR-A5J8 66    Stage III                   Atypical Mitotic Figures Pres…
#>  9 TCGA-OR-A5J9 22    Stage II                    Atypical Mitotic Figures Abse…
#> 10 TCGA-OR-A5JA 53    Stage IV                    Atypical Mitotic Figures Pres…
#> # ℹ 82 more rows
#> # ℹ 81 more variables: CAPSULAR_INVASION <chr>, CLIN_M_STAGE <chr>,
#> #   CT_SCAN_PREOP_RESULTS <chr>,
#> #   CYTOPLASM_PRESENCE_LESS_THAN_EQUAL_25_PERCENT <chr>,
#> #   DAYS_TO_INITIAL_PATHOLOGIC_DIAGNOSIS <chr>, DFS_MONTHS <chr>,
#> #   DFS_STATUS <chr>, DIFFUSE_ARCHITECTURE <chr>, ETHNICITY <chr>,
#> #   FORM_COMPLETION_DATE <chr>, HISTOLOGICAL_DIAGNOSIS <chr>, …

molecularProfiles(api = cbio, studyId = "acc_tcga")
#> # A tibble: 9 × 8
#>   molecularAlterationType datatype   name     description showProfileInAnalysi…¹
#>   <chr>                   <chr>      <chr>    <chr>       <lgl>                 
#> 1 PROTEIN_LEVEL           LOG2-VALUE Protein… Protein ex… FALSE                 
#> 2 PROTEIN_LEVEL           Z-SCORE    Protein… Protein ex… TRUE                  
#> 3 COPY_NUMBER_ALTERATION  DISCRETE   Putativ… Putative c… TRUE                  
#> 4 COPY_NUMBER_ALTERATION  CONTINUOUS Capped … Capped rel… FALSE                 
#> 5 MUTATION_EXTENDED       MAF        Mutatio… Mutation d… TRUE                  
#> 6 METHYLATION             CONTINUOUS Methyla… Methylatio… FALSE                 
#> 7 MRNA_EXPRESSION         CONTINUOUS mRNA ex… mRNA gene … FALSE                 
#> 8 MRNA_EXPRESSION         Z-SCORE    mRNA ex… mRNA expre… TRUE                  
#> 9 MRNA_EXPRESSION         Z-SCORE    mRNA ex… Log-transf… TRUE                  
#> # ℹ abbreviated name: ¹​showProfileInAnalysisTab
#> # ℹ 3 more variables: patientLevel <lgl>, molecularProfileId <chr>,
#> #   studyId <chr>

genePanels(cbio)
#> # A tibble: 57 × 2
#>    description                                                       genePanelId
#>    <chr>                                                             <chr>      
#>  1 Targeted (27 cancer genes) sequencing of adenoid cystic carcinom… ACYC_FMI_27
#>  2 Targeted panel of 232 genes.                                      Agilent    
#>  3 Targeted panel of 8 genes.                                        AmpliSeq   
#>  4 ARCHER-HEME gene panel (199 genes)                                ARCHER-HEM…
#>  5 ARCHER-SOLID Gene Panel (62 genes)                                ARCHER-SOL…
#>  6 Targeted sequencing of various tumor types via bait v3.           bait_v3    
#>  7 Targeted sequencing of various tumor types via bait v4.           bait_v4    
#>  8 Targeted sequencing of various tumor types via bait v5.           bait_v5    
#>  9 Targeted panel of 387 cancer-related genes.                       bcc_unige_…
#> 10 Research (CMO) IMPACT-Heme gene panel version 3.                  HemePACT_v3
#> # ℹ 47 more rows

(gp <- getGenePanel(cbio, "AmpliSeq"))
#> # A tibble: 8 × 2
#>   entrezGeneId hugoGeneSymbol
#>          <int> <chr>         
#> 1       171023 ASXL1         
#> 2         1788 DNMT3A        
#> 3         3417 IDH1          
#> 4         3418 IDH2          
#> 5        23451 SF3B1         
#> 6         6427 SRSF2         
#> 7         7157 TP53          
#> 8         7307 U2AF1         

muts <- mutationData(
    api = cbio,
    molecularProfileIds = "acc_tcga_mutations",
    entrezGeneIds = 1:1000,
    sampleIds = c("TCGA-OR-A5J1-01", "TCGA-OR-A5J2-01")
)
exps <- molecularData(
    api = cbio,
    molecularProfileIds = c("acc_tcga_rna_seq_v2_mrna", "acc_tcga_rppa"),
    entrezGeneIds = 1:1000,
    sampleIds = c("TCGA-OR-A5J1-01", "TCGA-OR-A5J2-01")
)

sampleLists(api = cbio, studyId = "acc_tcga")
#> # A tibble: 9 × 5
#>   category                                name  description sampleListId studyId
#>   <chr>                                   <chr> <chr>       <chr>        <chr>  
#> 1 all_cases_with_mrna_rnaseq_data         Samp… Samples wi… acc_tcga_rn… acc_tc…
#> 2 all_cases_in_study                      All … All sample… acc_tcga_all acc_tc…
#> 3 all_cases_with_cna_data                 Samp… Samples wi… acc_tcga_cna acc_tc…
#> 4 all_cases_with_mutation_and_cna_data    Samp… Samples wi… acc_tcga_cn… acc_tc…
#> 5 all_cases_with_mutation_and_cna_and_mr… Comp… Samples wi… acc_tcga_3w… acc_tc…
#> 6 all_cases_with_methylation_data         Samp… Samples wi… acc_tcga_me… acc_tc…
#> 7 all_cases_with_methylation_data         Samp… Samples wi… acc_tcga_me… acc_tc…
#> 8 all_cases_with_rppa_data                Samp… Samples pr… acc_tcga_rp… acc_tc…
#> 9 all_cases_with_mutation_data            Samp… Samples wi… acc_tcga_se… acc_tc…

samplesInSampleLists(
    api = cbio,
    sampleListIds = c("acc_tcga_rppa", "acc_tcga_cnaseq")
)
#> CharacterList of length 2
#> [["acc_tcga_cnaseq"]] TCGA-OR-A5J1-01 TCGA-OR-A5J2-01 ... TCGA-PK-A5HC-01
#> [["acc_tcga_rppa"]] TCGA-OR-A5J2-01 TCGA-OR-A5J3-01 ... TCGA-PK-A5HA-01

genePanels(api = cbio)
#> # A tibble: 57 × 2
#>    description                                                       genePanelId
#>    <chr>                                                             <chr>      
#>  1 Targeted (27 cancer genes) sequencing of adenoid cystic carcinom… ACYC_FMI_27
#>  2 Targeted panel of 232 genes.                                      Agilent    
#>  3 Targeted panel of 8 genes.                                        AmpliSeq   
#>  4 ARCHER-HEME gene panel (199 genes)                                ARCHER-HEM…
#>  5 ARCHER-SOLID Gene Panel (62 genes)                                ARCHER-SOL…
#>  6 Targeted sequencing of various tumor types via bait v3.           bait_v3    
#>  7 Targeted sequencing of various tumor types via bait v4.           bait_v4    
#>  8 Targeted sequencing of various tumor types via bait v5.           bait_v5    
#>  9 Targeted panel of 387 cancer-related genes.                       bcc_unige_…
#> 10 Research (CMO) IMPACT-Heme gene panel version 3.                  HemePACT_v3
#> # ℹ 47 more rows

getGenePanel(api = cbio, genePanelId = "IMPACT341")
#> # A tibble: 341 × 2
#>    entrezGeneId hugoGeneSymbol
#>           <int> <chr>         
#>  1           25 ABL1          
#>  2        84142 ABRAXAS1      
#>  3          207 AKT1          
#>  4          208 AKT2          
#>  5        10000 AKT3          
#>  6          238 ALK           
#>  7          242 ALOX12B       
#>  8       139285 AMER1         
#>  9          324 APC           
#> 10          367 AR            
#> # ℹ 331 more rows

queryGeneTable(api = cbio, by = "entrezGeneId", genes = 7157)
#> # A tibble: 1 × 3
#>   entrezGeneId hugoGeneSymbol type          
#>          <int> <chr>          <chr>         
#> 1         7157 TP53           protein-coding


getDataByGenes(
    cbio, studyId = "acc_tcga", genes = 1:3,
    by = c("entrezGeneId", "hugoGeneSymbol"),
    molecularProfileIds = "acc_tcga_rppa",
    sampleListId = "acc_tcga_rppa"
)
#> named list()