This section of the documentation lists the functions that allow users to access the cBioPortal API. The main representation of the API can be obtained from the `cBioPortal` function. The supporting functions listed here give access to specific parts of the API and allow the user to explore the API with individual calls. Many of the functions here are listed for documentation purposes and are recommended for advanced usage only. Users should only need to use the `cBioPortalData` main function to obtain data.

cBioPortal(
  hostname = "www.cbioportal.org",
  protocol = "https",
  api. = "/api/api-docs"
)

getStudies(api, buildReport = FALSE)

clinicalData(api, studyId = NA_character_)

molecularProfiles(
  api,
  studyId = NA_character_,
  projection = c("SUMMARY", "ID", "DETAILED", "META")
)

mutationData(
  api,
  molecularProfileIds = NA_character_,
  entrezGeneIds = NULL,
  sampleIds = NULL
)

molecularData(
  api,
  molecularProfileIds = NA_character_,
  entrezGeneIds = NULL,
  sampleIds = NULL
)

searchOps(api, keyword)

geneTable(api, pageSize = 1000, pageNumber = 0, ...)

samplesInSampleLists(api, sampleListIds = NA_character_)

sampleLists(api, studyId = NA_character_)

allSamples(api, studyId = NA_character_)

genePanels(api)

getGenePanel(api, genePanelId = NA_character_)

genePanelMolecular(
  api,
  molecularProfileId = NA_character_,
  sampleListId = NULL,
  sampleIds = NULL
)

getGenePanelMolecular(api, molecularProfileIds = NA_character_, sampleIds)

getSampleInfo(
  api,
  studyId = NA_character_,
  sampleListIds = NULL,
  projection = c("SUMMARY", "ID", "DETAILED", "META")
)

getDataByGenePanel(
  api,
  studyId = NA_character_,
  genePanelId = NA_character_,
  molecularProfileIds = NULL,
  sampleListId = NULL,
  sampleIds = NULL
)

getDataByGenes(
  api,
  studyId = NA_character_,
  genes = NA_character_,
  genePanelId = NA_character_,
  by = c("entrezGeneId", "hugoGeneSymbol"),
  molecularProfileIds = NULL,
  sampleListId = NULL,
  sampleIds = NULL,
  ...
)

Arguments

hostname

character(1) The internet location of the service (default: 'www.cbioportal.org')

protocol

character(1) The internet protocol used to access the hostname (default: 'https')

api.

character(1) The directory location of the API protocol within the hostname (default: '/api/api-docs')

api

An API object of class `cBioPortal` from the `cBioPortal` function

buildReport

logical(1) Indicates whether to append the build information to the `getStudies()` table (default FALSE)

studyId

character(1) Indicates the "studyId" as taken from `getStudies`

projection

character(1) The projection type for data retrieval (default: "SUMMARY"); see the API documentation for details

molecularProfileIds

character() A vector of molecular profile IDs

entrezGeneIds

numeric() A vector indicating entrez gene IDs

sampleIds

character() Sample identifiers

keyword

character(1) Keyword or pattern for searching through available operations

pageSize

numeric(1) The number of rows in the table to return

pageNumber

numeric(1) The pagination page number

...

Additional arguments to lower level API functions

sampleListIds

character() A vector of 'sampleListId' as obtained from `sampleLists`

genePanelId

character(1) Identifies the gene panel, as obtained from the `genePanels` function

molecularProfileId

character(1) Indicates a molecular profile ID

sampleListId

character(1) A sample list identifier as obtained from `sampleLists()`

genes

character() Either Entrez gene identifiers or Hugo gene symbols. When included, the 'by' argument indicates the type of identifier provided and 'genePanelId' is ignored. Preference is given to Entrez IDs due to faster query responses.

by

character(1) Either 'entrezGeneId' or 'hugoGeneSymbol' for row metadata (default: 'entrezGeneId')

Value

cBioPortal: An API object of class 'cBioPortal'

cBioPortalData: A data object of class 'MultiAssayExperiment'

API Metadata

* getStudies - Obtain a table of studies and associated metadata, optionally including a `buildReport` status (default FALSE) for each study. When enabled, the 'api_build' and 'pack_build' columns are added to the table and show whether `MultiAssayExperiment` objects can be generated for that particular study identifier (`studyId`). The 'api_build' column corresponds to datasets obtained with `cBioPortalData` and the 'pack_build' column corresponds to datasets loaded via `cBioDataPack`.

* searchOps - Search through API operations with a keyword

* geneTable - Get a table of all genes by 'entrezGeneId' or 'hugoGeneSymbol'

* sampleLists - Obtain all `sampleListIds` for a particular `studyId`

* allSamples - Obtain all samples within a particular `studyId`

* genePanels - Show all available gene panels
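As a minimal sketch of how the build report might be used, the snippet below keeps only the studies whose 'api_build' value is `TRUE`. The helper name `buildable_studies` is hypothetical, the toy table contains invented values, and the assumption that 'api_build' is a logical column is not confirmed by this page.

```r
# Hypothetical helper (not part of the package): keep studies whose
# 'api_build' column is TRUE. Assumes 'api_build' is logical, as added by
# getStudies(api, buildReport = TRUE).
buildable_studies <- function(studies) {
    ok <- !is.na(studies$api_build) & studies$api_build
    studies[ok, , drop = FALSE]
}

# Toy stand-in for the getStudies() table (invented values, illustration only)
toy <- data.frame(
    studyId = c("study_a", "study_b", "study_c"),
    api_build = c(TRUE, FALSE, NA),
    pack_build = c(TRUE, TRUE, FALSE),
    stringsAsFactors = FALSE
)
buildable_studies(toy)$studyId

# With a live connection (requires network access):
# cbio <- cBioPortal()
# buildable_studies(getStudies(cbio, buildReport = TRUE))
```

Rows with a missing build status are dropped rather than treated as buildable, which is the more conservative choice.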

Patient Data

* clinicalData - Obtain clinical data for a particular study identifier ('studyId')

Molecular Profiles

* molecularProfiles - Produce a molecular profiles dataset for a given study identifier ('studyId')

* molecularData - Produce a dataset of molecular profile data based on `molecularProfileIds`, `entrezGeneIds`, and `sampleIds`

Mutation Data

* mutationData - Produce a dataset of mutation data using `molecularProfileIds`, `entrezGeneIds`, and `sampleIds`

Sample Data

* samplesInSampleLists - Get all samples associated with each of the given `sampleListIds`

* getSampleInfo - Obtain sample metadata for a particular `studyId` or set of `sampleListIds`
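A common follow-up to `samplesInSampleLists` is finding the samples shared by several sample lists. The sketch below uses a hypothetical helper, `common_samples`, and a plain named list with a handful of sample IDs as a stand-in for the real return value; it assumes the `CharacterList` returned by `samplesInSampleLists` can be converted with `as.list()`.

```r
# Hypothetical helper (not part of the package): intersect the sample IDs
# across all sample lists. as.list() is a no-op for a plain list and is
# assumed to convert a CharacterList to a plain list.
common_samples <- function(sample_lists) {
    Reduce(intersect, as.list(sample_lists))
}

# Toy stand-in with a few sample IDs (illustration only, not the full lists)
toy_lists <- list(
    acc_tcga_cnaseq = c("TCGA-OR-A5J1-01", "TCGA-OR-A5J2-01"),
    acc_tcga_rppa = c("TCGA-OR-A5J2-01", "TCGA-OR-A5J3-01")
)
common_samples(toy_lists)

# With a live connection (requires network access):
# cbio <- cBioPortal()
# common_samples(samplesInSampleLists(
#     cbio, sampleListIds = c("acc_tcga_rppa", "acc_tcga_cnaseq")
# ))
```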

Gene Panels

* getGenePanel - Obtain the gene panel for a particular 'genePanelId'

* genePanelMolecular - Get gene panel data for a particular `molecularProfileId` and `sampleListId` combination

* getGenePanelMolecular - Get gene panel data for a combination of `molecularProfileIds` and `sampleIds` vectors

* getDataByGenePanel - Download data for a gene panel and `molecularProfileIds` combination; a `sampleListId` can optionally be provided. Deprecated in favor of `getDataByGenes`.
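The table returned by `getGenePanel` has 'entrezGeneId' and 'hugoGeneSymbol' columns, so it can serve as a small lookup from gene symbols to the Entrez IDs that `molecularData` and `mutationData` expect. The following is a sketch only: `symbols_to_entrez` is a hypothetical helper, and the toy table simply re-types the AmpliSeq panel shown in the Examples.

```r
# Hypothetical helper (not part of the package): look up Entrez IDs for
# Hugo symbols in a gene panel table. Symbols that are absent yield NA.
symbols_to_entrez <- function(panel, symbols) {
    panel$entrezGeneId[match(symbols, panel$hugoGeneSymbol)]
}

# Stand-in for getGenePanel(cbio, "AmpliSeq"), copied from the Examples output
gp_toy <- data.frame(
    entrezGeneId = c(171023L, 1788L, 3417L, 3418L, 23451L, 6427L, 7157L, 7307L),
    hugoGeneSymbol = c("ASXL1", "DNMT3A", "IDH1", "IDH2",
                       "SF3B1", "SRSF2", "TP53", "U2AF1"),
    stringsAsFactors = FALSE
)
ids <- symbols_to_entrez(gp_toy, c("TP53", "IDH1"))
ids

# With a live connection (requires network access):
# cbio <- cBioPortal()
# gp <- getGenePanel(cbio, "AmpliSeq")
# molecularData(
#     api = cbio,
#     molecularProfileIds = "acc_tcga_rppa",
#     entrezGeneIds = symbols_to_entrez(gp, c("TP53", "IDH1")),
#     sampleIds = c("TCGA-OR-A5J1-01", "TCGA-OR-A5J2-01")
# )
```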

Genes

* getDataByGenes - Download data for a set of genes within the given `molecularProfileIds`; a `sampleListId` can optionally be provided.
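Because `genes` accepts either Entrez identifiers or Hugo symbols, the `by` argument has to match the identifier type supplied. One way to keep the two in sync is sketched below; `guess_by` is a hypothetical helper, and treating all-digit strings as Entrez IDs is an assumption made for illustration.

```r
# Hypothetical helper (not part of the package): pick the 'by' value from
# the contents of 'genes'. Numeric input or all-digit strings are treated
# as Entrez IDs; anything else is treated as Hugo symbols.
guess_by <- function(genes) {
    if (is.numeric(genes) || all(grepl("^[0-9]+$", genes))) {
        "entrezGeneId"
    } else {
        "hugoGeneSymbol"
    }
}

guess_by(c(7157, 3417))
guess_by(c("TP53", "IDH1"))

# With a live connection (requires network access):
# cbio <- cBioPortal()
# genes <- c("TP53", "IDH1")
# getDataByGenes(
#     cbio, studyId = "acc_tcga", genes = genes, by = guess_by(genes),
#     molecularProfileIds = "acc_tcga_rppa", sampleListId = "acc_tcga_rppa"
# )
```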

Examples

cbio <- cBioPortal()

getStudies(api = cbio)
#> # A tibble: 327 × 13
#>    name      description     publicStudy pmid  citation groups status importDate
#>    <chr>     <chr>           <lgl>       <chr> <chr>    <chr>   <int> <chr>     
#>  1 Pan-Lung… "Whole-exome s… TRUE        2715… TCGA, N… ""          0 2021-04-0…
#>  2 Head and… "TCGA Head and… TRUE        NA    NA       "PUBL…      0 2021-04-2…
#>  3 Ovarian … "Whole exome s… TRUE        2172… TCGA, N… "PUBL…      0 2021-04-2…
#>  4 Uterine … "Whole exome s… TRUE        2363… TCGA, N… "PUBL…      0 2021-04-2…
#>  5 Bladder … "Whole-exome s… TRUE        2898… Roberts… "PUBL…      0 2021-04-2…
#>  6 Cervical… "TCGA Cervical… TRUE        NA    NA       "PUBL…      0 2021-04-2…
#>  7 Cholangi… "TCGA Cholangi… TRUE        NA    NA       "PUBL…      0 2021-04-2…
#>  8 Kidney C… "TCGA Kidney C… TRUE        NA    NA       "PUBL…      0 2021-04-2…
#>  9 Colorect… "TCGA Colorect… TRUE        NA    NA       "PUBL…      0 2021-04-2…
#> 10 Lymphoid… "TCGA Lymphoid… TRUE        NA    NA       "PUBL…      0 2021-04-2…
#> # … with 317 more rows, and 5 more variables: allSampleCount <int>,
#> #   readPermission <lgl>, studyId <chr>, cancerTypeId <chr>,
#> #   referenceGenome <chr>

searchOps(api = cbio, keyword = "molecular")
#>  [1] "fetchGenePanelDataInMultipleMolecularProfilesUsingPOST"   
#>  [2] "fetchGenericAssayDataInMultipleMolecularProfilesUsingPOST"
#>  [3] "fetchGenericAssayDataInMolecularProfileUsingPOST"         
#>  [4] "fetchMolecularDataInMultipleMolecularProfilesUsingPOST"   
#>  [5] "getAllMolecularProfilesUsingGET"                          
#>  [6] "fetchMolecularProfilesUsingPOST"                          
#>  [7] "getMolecularProfileUsingGET"                              
#>  [8] "getDiscreteCopyNumbersInMolecularProfileUsingGET"         
#>  [9] "fetchDiscreteCopyNumbersInMolecularProfileUsingPOST"      
#> [10] "getAllMolecularDataInMolecularProfileUsingGET"            
#> [11] "fetchAllMolecularDataInMolecularProfileUsingPOST"         
#> [12] "getMutationsInMolecularProfileBySampleListIdUsingGET"     
#> [13] "fetchMutationsInMolecularProfileUsingPOST"                
#> [14] "fetchMutationsInMultipleMolecularProfilesUsingPOST"       
#> [15] "getAllMolecularProfilesInStudyUsingGET"                   

## obtain clinical data
acc_clin <- clinicalData(api = cbio, studyId = "acc_tcga")
acc_clin
#> # A tibble: 92 × 85
#>    patientId    AGE   AJCC_PATHOLOGIC_T… ATYPICAL_MITOTIC_F… CAPSULAR_INVASION  
#>    <chr>        <chr> <chr>              <chr>               <chr>              
#>  1 TCGA-OR-A5J1 58    Stage II           Atypical Mitotic F… Invasion of Tumor …
#>  2 TCGA-OR-A5J2 44    Stage IV           Atypical Mitotic F… Invasion of Tumor …
#>  3 TCGA-OR-A5J3 23    Stage III          Atypical Mitotic F… Invasion of Tumor …
#>  4 TCGA-OR-A5J4 23    Stage IV           Atypical Mitotic F… Invasion of Tumor …
#>  5 TCGA-OR-A5J5 30    Stage III          Atypical Mitotic F… Invasion of Tumor …
#>  6 TCGA-OR-A5J6 29    Stage II           Atypical Mitotic F… Invasion of Tumor …
#>  7 TCGA-OR-A5J7 30    Stage III          Atypical Mitotic F… Invasion of Tumor …
#>  8 TCGA-OR-A5J8 66    Stage III          Atypical Mitotic F… Invasion of Tumor …
#>  9 TCGA-OR-A5J9 22    Stage II           Atypical Mitotic F… Invasion of Tumor …
#> 10 TCGA-OR-A5JA 53    Stage IV           Atypical Mitotic F… Invasion of Tumor …
#> # … with 82 more rows, and 80 more variables: CLIN_M_STAGE <chr>,
#> #   CT_SCAN_PREOP_RESULTS <chr>,
#> #   CYTOPLASM_PRESENCE_LESS_THAN_EQUAL_25_PERCENT <chr>,
#> #   DAYS_TO_INITIAL_PATHOLOGIC_DIAGNOSIS <chr>, DFS_MONTHS <chr>,
#> #   DFS_STATUS <chr>, DIFFUSE_ARCHITECTURE <chr>, ETHNICITY <chr>,
#> #   FORM_COMPLETION_DATE <chr>, HISTOLOGICAL_DIAGNOSIS <chr>,
#> #   HISTORY_ADRENAL_HORMONE_EXCESS <chr>, …

molecularProfiles(api = cbio, studyId = "acc_tcga")
#> # A tibble: 9 × 8
#>   molecularAlterat… datatype name    description   showProfileInAn… patientLevel
#>   <chr>             <chr>    <chr>   <chr>         <lgl>            <lgl>       
#> 1 PROTEIN_LEVEL     LOG2-VA… Protei… Protein expr… FALSE            FALSE       
#> 2 PROTEIN_LEVEL     Z-SCORE  Protei… Protein expr… TRUE             FALSE       
#> 3 COPY_NUMBER_ALTE… DISCRETE Putati… Putative cop… TRUE             FALSE       
#> 4 MRNA_EXPRESSION   CONTINU… mRNA e… mRNA gene ex… FALSE            FALSE       
#> 5 MRNA_EXPRESSION   Z-SCORE  mRNA e… mRNA express… TRUE             FALSE       
#> 6 COPY_NUMBER_ALTE… CONTINU… Capped… Capped relat… FALSE            FALSE       
#> 7 METHYLATION       CONTINU… Methyl… Methylation … FALSE            FALSE       
#> 8 MRNA_EXPRESSION   Z-SCORE  mRNA e… Log-transfor… TRUE             FALSE       
#> 9 MUTATION_EXTENDED MAF      Mutati… Mutation dat… TRUE             FALSE       
#> # … with 2 more variables: molecularProfileId <chr>, studyId <chr>

genePanels(cbio)
#> # A tibble: 53 × 2
#>    description                                               genePanelId        
#>    <chr>                                                     <chr>              
#>  1 Targeted (27 cancer genes) sequencing of adenoid cystic … ACYC_FMI_27        
#>  2 Targeted panel of 8 genes.                                AmpliSeq           
#>  3 ARCHER-SOLID Gene Panel (62 genes)                        ARCHER-SOLID-CV1   
#>  4 Targeted sequencing of various tumor types via bait v3.   bait_v3            
#>  5 Targeted sequencing of various tumor types via bait v4.   bait_v4            
#>  6 Targeted sequencing of various tumor types via bait v5.   bait_v5            
#>  7 Foundation Medicine T5a gene panel (323 genes)            FMI-T5a            
#>  8 Foundation Medicine T7 gene panel (429 genes)             FMI-T7             
#>  9 Foundation Medicine T5 gene panel (326 genes)             glioma_mskcc_2019_…
#> 10 Foundation Medicine T7 gene panel (434 genes)             glioma_mskcc_2019_…
#> # … with 43 more rows

(gp <- getGenePanel(cbio, "AmpliSeq"))
#> # A tibble: 8 × 2
#>   entrezGeneId hugoGeneSymbol
#>          <int> <chr>         
#> 1       171023 ASXL1         
#> 2         1788 DNMT3A        
#> 3         3417 IDH1          
#> 4         3418 IDH2          
#> 5        23451 SF3B1         
#> 6         6427 SRSF2         
#> 7         7157 TP53          
#> 8         7307 U2AF1         

muts <- mutationData(
    api = cbio,
    molecularProfileIds = "acc_tcga_mutations",
    entrezGeneIds = 1:1000,
    sampleIds = c("TCGA-OR-A5J1-01", "TCGA-OR-A5J2-01")
)
exps <- molecularData(
    api = cbio,
    molecularProfileIds = c("acc_tcga_rna_seq_v2_mrna", "acc_tcga_rppa"),
    entrezGeneIds = 1:1000,
    sampleIds = c("TCGA-OR-A5J1-01", "TCGA-OR-A5J2-01")
)

sampleLists(api = cbio, studyId = "acc_tcga")
#> # A tibble: 9 × 5
#>   category          name          description            sampleListId    studyId
#>   <chr>             <chr>         <chr>                  <chr>           <chr>  
#> 1 all_cases_with_r… Samples with… Samples protein data … acc_tcga_rppa   acc_tc…
#> 2 all_cases_with_m… Samples with… Samples with mutation… acc_tcga_cnaseq acc_tc…
#> 3 all_cases_in_stu… All samples   All samples (92 sampl… acc_tcga_all    acc_tc…
#> 4 all_cases_with_c… Samples with… Samples with CNA data… acc_tcga_cna    acc_tc…
#> 5 all_cases_with_m… Samples with… Samples with mutation… acc_tcga_seque… acc_tc…
#> 6 all_cases_with_m… Samples with… Samples with methylat… acc_tcga_methy… acc_tc…
#> 7 all_cases_with_m… Samples with… Samples with mRNA exp… acc_tcga_rna_s… acc_tc…
#> 8 all_cases_with_m… Complete sam… Samples with mutation… acc_tcga_3way_… acc_tc…
#> 9 all_cases_with_m… Samples with… Samples with methylat… acc_tcga_methy… acc_tc…

samplesInSampleLists(
    api = cbio,
    sampleListIds = c("acc_tcga_rppa", "acc_tcga_cnaseq")
)
#> CharacterList of length 2
#> [["acc_tcga_cnaseq"]] TCGA-OR-A5J1-01 TCGA-OR-A5J2-01 ... TCGA-PK-A5HC-01
#> [["acc_tcga_rppa"]] TCGA-OR-A5J2-01 TCGA-OR-A5J3-01 ... TCGA-PK-A5HA-01

genePanels(api = cbio)
#> # A tibble: 53 × 2
#>    description                                               genePanelId        
#>    <chr>                                                     <chr>              
#>  1 Targeted (27 cancer genes) sequencing of adenoid cystic … ACYC_FMI_27        
#>  2 Targeted panel of 8 genes.                                AmpliSeq           
#>  3 ARCHER-SOLID Gene Panel (62 genes)                        ARCHER-SOLID-CV1   
#>  4 Targeted sequencing of various tumor types via bait v3.   bait_v3            
#>  5 Targeted sequencing of various tumor types via bait v4.   bait_v4            
#>  6 Targeted sequencing of various tumor types via bait v5.   bait_v5            
#>  7 Foundation Medicine T5a gene panel (323 genes)            FMI-T5a            
#>  8 Foundation Medicine T7 gene panel (429 genes)             FMI-T7             
#>  9 Foundation Medicine T5 gene panel (326 genes)             glioma_mskcc_2019_…
#> 10 Foundation Medicine T7 gene panel (434 genes)             glioma_mskcc_2019_…
#> # … with 43 more rows

getGenePanel(api = cbio, genePanelId = "IMPACT341")
#> # A tibble: 341 × 2
#>    entrezGeneId hugoGeneSymbol
#>           <int> <chr>         
#>  1           25 ABL1          
#>  2        84142 ABRAXAS1      
#>  3          207 AKT1          
#>  4          208 AKT2          
#>  5        10000 AKT3          
#>  6          238 ALK           
#>  7          242 ALOX12B       
#>  8       139285 AMER1         
#>  9          324 APC           
#> 10          367 AR            
#> # … with 331 more rows


getDataByGenePanel(cbio, studyId = "acc_tcga", genePanelId = "IMPACT341",
    molecularProfileIds = "acc_tcga_rppa", sampleListId = "acc_tcga_rppa")
#> Warning: 'getDataByGenePanel' is deprecated.
#> Use 'getDataByGenes' instead.
#> See help("Deprecated")
#> $acc_tcga_rppa
#> # A tibble: 2,622 × 9
#>    uniqueSampleKey    uniquePatientKey   entrezGeneId molecularProfil… sampleId 
#>    <chr>              <chr>                     <int> <chr>            <chr>    
#>  1 VENHQS1PUi1BNUoyL… VENHQS1PUi1BNUoyO…         5728 acc_tcga_rppa    TCGA-OR-…
#>  2 VENHQS1PUi1BNUoyL… VENHQS1PUi1BNUoyO…          595 acc_tcga_rppa    TCGA-OR-…
#>  3 VENHQS1PUi1BNUoyL… VENHQS1PUi1BNUoyO…          596 acc_tcga_rppa    TCGA-OR-…
#>  4 VENHQS1PUi1BNUoyL… VENHQS1PUi1BNUoyO…        10413 acc_tcga_rppa    TCGA-OR-…
#>  5 VENHQS1PUi1BNUoyL… VENHQS1PUi1BNUoyO…         3791 acc_tcga_rppa    TCGA-OR-…
#>  6 VENHQS1PUi1BNUoyL… VENHQS1PUi1BNUoyO…         7157 acc_tcga_rppa    TCGA-OR-…
#>  7 VENHQS1PUi1BNUoyL… VENHQS1PUi1BNUoyO…          207 acc_tcga_rppa    TCGA-OR-…
#>  8 VENHQS1PUi1BNUoyL… VENHQS1PUi1BNUoyO…          208 acc_tcga_rppa    TCGA-OR-…
#>  9 VENHQS1PUi1BNUoyL… VENHQS1PUi1BNUoyO…        57521 acc_tcga_rppa    TCGA-OR-…
#> 10 VENHQS1PUi1BNUoyL… VENHQS1PUi1BNUoyO…         2064 acc_tcga_rppa    TCGA-OR-…
#> # … with 2,612 more rows, and 4 more variables: patientId <chr>, studyId <chr>,
#> #   value <dbl>, hugoGeneSymbol <chr>
#> 


getDataByGenes(
    cbio, studyId = "acc_tcga", genes = 1:3,
    by = c("entrezGeneId", "hugoGeneSymbol"),
    molecularProfileIds = "acc_tcga_rppa",
    sampleListId = "acc_tcga_rppa"
)
#> named list()
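
`geneTable` is not exercised in the examples above. Its `pageSize` and `pageNumber` arguments suggest that larger tables are retrieved one page at a time; the sketch below shows one way such a loop could look. `fetch_all_pages` and `toy_fetch` are hypothetical names, the toy data is invented, and the stopping rule (a page shorter than `pageSize`) is an assumption about how the service paginates.

```r
# Hypothetical helper (not part of the package): request pages until one
# comes back with fewer rows than 'pageSize', then bind them together.
# 'fetch_page' is any function taking 'pageSize' and 'pageNumber'.
fetch_all_pages <- function(fetch_page, pageSize = 1000) {
    pages <- list()
    pageNumber <- 0
    repeat {
        page <- fetch_page(pageSize = pageSize, pageNumber = pageNumber)
        # Keep the first page even when empty so the result keeps its columns
        if (nrow(page) > 0 || pageNumber == 0) {
            pages[[length(pages) + 1]] <- page
        }
        if (nrow(page) < pageSize) break
        pageNumber <- pageNumber + 1
    }
    do.call(rbind, pages)
}

# Toy page source standing in for geneTable() (invented data)
toy_genes <- data.frame(entrezGeneId = 1:5, stringsAsFactors = FALSE)
toy_fetch <- function(pageSize, pageNumber) {
    start <- pageNumber * pageSize + 1
    if (start > nrow(toy_genes)) return(toy_genes[0, , drop = FALSE])
    end <- min(start + pageSize - 1, nrow(toy_genes))
    toy_genes[start:end, , drop = FALSE]
}
all_genes <- fetch_all_pages(toy_fetch, pageSize = 2)
nrow(all_genes)

# With a live connection (requires network access):
# cbio <- cBioPortal()
# fetch_all_pages(function(pageSize, pageNumber) {
#     geneTable(cbio, pageSize = pageSize, pageNumber = pageNumber)
# })
```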