vignettes/cgdsrMigration.Rmd
This vignette aims to help developers migrate from the now defunct cgdsr
CRAN package. Note that the cgdsr package code is shown for comparison
but it is not guaranteed to work. If you have questions regarding the
contents, please create an issue at the GitHub repository:
https://github.com/waldronlab/cBioPortalData/issues
library(cBioPortalData)
cBioPortalData setup
Here we show the default inputs to the cBioPortal function for clarity.
cbio <- cBioPortal(
hostname = "www.cbioportal.org",
protocol = "https",
api. = "/api/api-docs"
)
getStudies(cbio)
## # A tibble: 365 × 13
## name descr…¹ publi…² groups status impor…³ allSa…⁴ readP…⁵ studyId cance…⁶
## <chr> <chr> <lgl> <chr> <int> <chr> <int> <lgl> <chr> <chr>
## 1 Adreno… "TCGA … TRUE "PUBL… 0 2022-1… 92 TRUE acc_tc… acc
## 2 Acute … "Compr… TRUE "PUBL… 0 2022-1… 93 TRUE all_st… bll
## 3 Hypodi… "Whole… TRUE "" 0 2022-1… 44 TRUE all_st… myeloid
## 4 Adenoi… "Whole… TRUE "ACYC… 0 2022-1… 12 TRUE acbc_m… acbc
## 5 Adenoi… "Targe… TRUE "ACYC… 0 2022-1… 28 TRUE acyc_f… acyc
## 6 Adenoi… "Whole… TRUE "ACYC… 0 2022-1… 25 TRUE acyc_j… acyc
## 7 Adenoi… "WGS o… TRUE "ACYC… 0 2022-1… 102 TRUE acyc_m… acyc
## 8 Adenoi… "Whole… TRUE "ACYC" 0 2022-1… 10 TRUE acyc_m… acyc
## 9 Adenoi… "Whole… TRUE "ACYC… 0 2022-1… 24 TRUE acyc_s… acyc
## 10 Acute … "Whole… TRUE "PUBL… 0 2022-1… 73 TRUE all_st… bll
## # … with 355 more rows, 3 more variables: referenceGenome <chr>, pmid <chr>,
## # citation <chr>, and abbreviated variable names ¹description, ²publicStudy,
## # ³importDate, ⁴allSampleCount, ⁵readPermission, ⁶cancerTypeId
Note that the studyId column is important for further queries.
head(getStudies(cbio)[["studyId"]])
## [1] "acc_tcga" "all_stjude_2015" "all_stjude_2013" "acbc_mskcc_2015"
## [5] "acyc_fmi_2014" "acyc_jhu_2016"
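Because later queries depend on an exact studyId, it can help to search the identifiers before going further. The following is a minimal sketch that works on a plain character vector standing in for getStudies(cbio)[["studyId"]]. The helper name find_study_ids is hypothetical and is not part of cBioPortalData.

```r
# Study identifiers copied from the output above; in practice this vector
# would come from getStudies(cbio)[["studyId"]].
study_ids <- c(
    "acc_tcga", "all_stjude_2015", "all_stjude_2013",
    "acbc_mskcc_2015", "acyc_fmi_2014", "acyc_jhu_2016"
)

# Hypothetical helper: return the identifiers that match a regular
# expression, ignoring case.
find_study_ids <- function(ids, pattern) {
    grep(pattern, ids, ignore.case = TRUE, value = TRUE)
}

find_study_ids(study_ids, "^acyc")
find_study_ids(study_ids, "stjude")
```

A pattern with no match returns an empty character vector, which is an easy condition to check before sending a query.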
cgdsr setup
library(cgdsr)
cgds <- CGDS("http://www.cbioportal.org/")
getCancerStudies.CGDS(cgds)
cBioPortalData (Cases)
patientId.
sampleListId identifies groups of patientId based on profile type
sampleLists function uses studyId input to return sampleListId
For the sample list identifiers, you can use sampleLists and inspect the
sampleListId column.
samps <- sampleLists(cbio, "gbm_tcga_pub")
samps[, c("category", "name", "sampleListId")]
## # A tibble: 15 × 3
## category name sampl…¹
## <chr> <chr> <chr>
## 1 all_cases_in_study All samples gbm_tc…
## 2 other Expression Cluster Classical gbm_tc…
## 3 all_cases_with_cna_data Samples with CNA data gbm_tc…
## 4 all_cases_with_mutation_and_cna_data Samples with mutation and CNA d… gbm_tc…
## 5 all_cases_with_mrna_array_data Samples with mRNA data (Agilent… gbm_tc…
## 6 other Expression Cluster Mesenchymal gbm_tc…
## 7 all_cases_with_methylation_data Samples with methylation data gbm_tc…
## 8 all_cases_with_methylation_data Samples with methylation data (… gbm_tc…
## 9 all_cases_with_microrna_data Samples with microRNA data (mic… gbm_tc…
## 10 other Expression Cluster Neural gbm_tc…
## 11 other Expression Cluster Proneural gbm_tc…
## 12 other Sequenced, No Hypermutators gbm_tc…
## 13 other Sequenced, Not Treated gbm_tc…
## 14 other Sequenced, Treated gbm_tc…
## 15 all_cases_with_mutation_data Samples with mutation data gbm_tc…
## # … with abbreviated variable name ¹sampleListId
It is possible to get case_ids directly when using the
samplesInSampleLists function. The function handles multiple sampleList
identifiers.
samplesInSampleLists(
api = cbio,
sampleListIds = c(
"gbm_tcga_pub_expr_classical", "gbm_tcga_pub_expr_mesenchymal"
)
)
## CharacterList of length 2
## [["gbm_tcga_pub_expr_classical"]] TCGA-02-0001-01 ... TCGA-12-0615-01
## [["gbm_tcga_pub_expr_mesenchymal"]] TCGA-02-0004-01 ... TCGA-12-0620-01
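The result above is a list-like object keyed by sampleListId. As a rough sketch, such a structure can be flattened into a long two-column table with base R. The code below uses a plain named list as a stand-in for the CharacterList and includes only the first and last identifiers printed above, so it is an illustration rather than the full result.

```r
# Plain named list standing in for the CharacterList returned by
# samplesInSampleLists(); only the first and last identifiers printed
# above are included here.
samp_list <- list(
    gbm_tcga_pub_expr_classical = c("TCGA-02-0001-01", "TCGA-12-0615-01"),
    gbm_tcga_pub_expr_mesenchymal = c("TCGA-02-0004-01", "TCGA-12-0620-01")
)

# stack() turns a named list of vectors into a two-column data.frame
# with columns "values" and "ind"; rename them to match the API terms.
long <- stack(samp_list)
names(long) <- c("sampleId", "sampleListId")
long$sampleListId <- as.character(long$sampleListId)

# Number of samples per sample list
lengths(samp_list)
long
```

With a real CharacterList, converting with as.list() first should allow the same approach, although that step is an assumption and is not shown in this vignette.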
To get more information about patients, we can query with the
getSampleInfo function.
getSampleInfo(api = cbio, studyId = "gbm_tcga_pub", projection = "SUMMARY")
## # A tibble: 206 × 6
## uniqueSampleKey uniqu…¹ sampl…² sampl…³ patie…⁴ studyId
## <chr> <chr> <chr> <chr> <chr> <chr>
## 1 VENHQS0wMi0wMDAxLTAxOmdibV90Y2dhX3B1… VENHQS… Primar… TCGA-0… TCGA-0… gbm_tc…
## 2 VENHQS0wMi0wMDAzLTAxOmdibV90Y2dhX3B1… VENHQS… Primar… TCGA-0… TCGA-0… gbm_tc…
## 3 VENHQS0wMi0wMDA0LTAxOmdibV90Y2dhX3B1… VENHQS… Primar… TCGA-0… TCGA-0… gbm_tc…
## 4 VENHQS0wMi0wMDA2LTAxOmdibV90Y2dhX3B1… VENHQS… Primar… TCGA-0… TCGA-0… gbm_tc…
## 5 VENHQS0wMi0wMDA3LTAxOmdibV90Y2dhX3B1… VENHQS… Primar… TCGA-0… TCGA-0… gbm_tc…
## 6 VENHQS0wMi0wMDA5LTAxOmdibV90Y2dhX3B1… VENHQS… Primar… TCGA-0… TCGA-0… gbm_tc…
## 7 VENHQS0wMi0wMDEwLTAxOmdibV90Y2dhX3B1… VENHQS… Primar… TCGA-0… TCGA-0… gbm_tc…
## 8 VENHQS0wMi0wMDExLTAxOmdibV90Y2dhX3B1… VENHQS… Primar… TCGA-0… TCGA-0… gbm_tc…
## 9 VENHQS0wMi0wMDE0LTAxOmdibV90Y2dhX3B1… VENHQS… Primar… TCGA-0… TCGA-0… gbm_tc…
## 10 VENHQS0wMi0wMDE1LTAxOmdibV90Y2dhX3B1… VENHQS… Primar… TCGA-0… TCGA-0… gbm_tc…
## # … with 196 more rows, and abbreviated variable names ¹uniquePatientKey,
## # ²sampleType, ³sampleId, ⁴patientId
cgdsr (Cases)
case_id.
cancerStudy identifier
case_list_description describes the assays
getCaseLists and getClinicalData
We obtain the first case_list_id in the cgds object from above and the
corresponding clinical data for that case list (gbm_tcga_pub_all as the
case list in this example).
clist1 <-
getCaseLists.CGDS(cgds, cancerStudy = "gbm_tcga_pub")[1, "case_list_id"]
getClinicalData.CGDS(cgds, clist1)
cBioPortalData (Clinical)
Note that a sampleListId is not required when using the
fetchAllClinicalDataInStudyUsingPOST internal endpoint.
Data for all patients can be obtained using the clinicalData function.
clinicalData(cbio, "gbm_tcga_pub")
## # A tibble: 206 × 24
## patie…¹ DFS_M…² DFS_S…³ KARNO…⁴ OS_MO…⁵ OS_ST…⁶ PRETR…⁷ PRIOR…⁸ SAMPL…⁹ SEX
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 TCGA-0… 4.5041… 1:Recu… 80.0 11.605… 1:DECE… YES NO 1 Fema…
## 2 TCGA-0… 1.3150… 1:Recu… 100.0 4.7342… 1:DECE… NO NO 1 Male
## 3 TCGA-0… 10.323… 1:Recu… 80.0 11.342… 1:DECE… NO NO 1 Male
## 4 TCGA-0… 9.9287… 1:Recu… 80.0 18.345… 1:DECE… NO NO 1 Fema…
## 5 TCGA-0… 17.030… 1:Recu… 80.0 23.178… 1:DECE… YES NO 1 Fema…
## 6 TCGA-0… 8.6794… 1:Recu… 80.0 10.586… 1:DECE… NO NO 1 Fema…
## 7 TCGA-0… 11.539… 1:Recu… 80.0 35.408… 1:DECE… YES NO 1 Fema…
## 8 TCGA-0… 4.7342… 1:Recu… 80.0 20.712… 1:DECE… NO NO 1 Fema…
## 9 TCGA-0… NA NA 100.0 82.553… 1:DECE… NO NO 1 Male
## 10 TCGA-0… 14.991… 1:Recu… 80.0 20.613… 1:DECE… NO NO 1 Male
## # … with 196 more rows, 14 more variables: sampleId <chr>, ACGH_DATA <chr>,
## # CANCER_TYPE <chr>, CANCER_TYPE_DETAILED <chr>, COMPLETE_DATA <chr>,
## # FRACTION_GENOME_ALTERED <chr>, MRNA_DATA <chr>, MUTATION_COUNT <chr>,
## # ONCOTREE_CODE <chr>, SAMPLE_TYPE <chr>, SEQUENCED <chr>,
## # SOMATIC_STATUS <chr>, TMB_NONSYNONYMOUS <chr>, TREATMENT_STATUS <chr>, and
## # abbreviated variable names ¹patientId, ²DFS_MONTHS, ³DFS_STATUS,
## # ⁴KARNOFSKY_PERFORMANCE_SCORE, ⁵OS_MONTHS, ⁶OS_STATUS, …
You can use a different endpoint to obtain data for a single sample.
First, obtain a single sampleId with the samplesInSampleLists function.
clist1 <- "gbm_tcga_pub_all"
samplist <- samplesInSampleLists(cbio, clist1)
onesample <- samplist[["gbm_tcga_pub_all"]][1]
onesample
## [1] "TCGA-02-0001-01"
Then we use the API endpoint to retrieve the data. Note that you would
run httr::content on the output to extract the data.
cbio$getAllClinicalDataOfSampleInStudyUsingGET(
sampleId = onesample, studyId = "gbm_tcga_pub"
)
## Response [https://www.cbioportal.org/api/studies/gbm_tcga_pub/samples/TCGA-02-0001-01/clinical-data]
## Date: 2023-01-03 23:48
## Status: 200
## Content-Type: application/json
## Size: 3.31 kB
cgdsr allows you to obtain clinical data for a case list subset (54 cases with gbm_tcga_pub_expr_classical) and cBioPortalData provides clinical data for all 206 samples in gbm_tcga_pub using the clinicalData function.
cgdsr returns a data.frame with sampleId (TCGA.02.0009.01) but not patientId (TCGA.02.0009).
cBioPortalData returns sampleId (TCGA-02-0009-01) and patientId (TCGA-02-0009).
cgdsr provides case_ids delimited with "." and cBioPortalData returns patientIds delimited with "-".
You may be interested in other clinical data endpoints. For a list, use the searchOps function.
searchOps(cbio, "clinical")
## [1] "getAllClinicalAttributesUsingGET"
## [2] "fetchClinicalAttributesUsingPOST"
## [3] "fetchClinicalDataUsingPOST"
## [4] "getAllClinicalAttributesInStudyUsingGET"
## [5] "getClinicalAttributeInStudyUsingGET"
## [6] "getAllClinicalDataInStudyUsingGET"
## [7] "fetchAllClinicalDataInStudyUsingPOST"
## [8] "getAllClinicalDataOfPatientInStudyUsingGET"
## [9] "getAllClinicalDataOfSampleInStudyUsingGET"
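Returning to the identifier differences noted in this section, existing cgdsr results may need their identifiers converted before they can be matched against cBioPortalData output. The following is a minimal sketch under the assumption that identifiers follow the TCGA-style pattern shown above. Both helper names are hypothetical and belong to neither package.

```r
# Hypothetical helpers, not part of cgdsr or cBioPortalData.

# Convert cgdsr-style identifiers (dot-delimited) to the hyphenated form.
dots_to_hyphens <- function(ids) {
    gsub(".", "-", ids, fixed = TRUE)
}

# Derive a patient identifier by keeping the first three hyphen-separated
# fields; this assumes TCGA-style identifiers such as "TCGA-02-0009-01".
# Identifiers with fewer than four fields are returned unchanged.
sample_to_patient <- function(sample_ids) {
    sub("^([^-]+-[^-]+-[^-]+)-.*$", "\\1", sample_ids)
}

cgdsr_id <- "TCGA.02.0009.01"
sample_id <- dots_to_hyphens(cgdsr_id)
patient_id <- sample_to_patient(sample_id)

sample_id
patient_id
```

Studies that do not use TCGA-style identifiers would need a different rule for deriving the patient identifier, so the patientId column returned by the API remains the more reliable source.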
cBioPortalData (molecularProfiles)
molecularProfiles(api = cbio, studyId = "gbm_tcga_pub")
## # A tibble: 10 × 8
## molecularAlterationType datat…¹ name descr…² showP…³ patie…⁴ molec…⁵ studyId
## <chr> <chr> <chr> <chr> <lgl> <lgl> <chr> <chr>
## 1 COPY_NUMBER_ALTERATION DISCRE… Puta… Putati… TRUE FALSE gbm_tc… gbm_tc…
## 2 COPY_NUMBER_ALTERATION DISCRE… Puta… Putati… TRUE FALSE gbm_tc… gbm_tc…
## 3 MUTATION_EXTENDED MAF Muta… Mutati… TRUE FALSE gbm_tc… gbm_tc…
## 4 METHYLATION CONTIN… Meth… Methyl… FALSE FALSE gbm_tc… gbm_tc…
## 5 MRNA_EXPRESSION CONTIN… mRNA… mRNA e… FALSE FALSE gbm_tc… gbm_tc…
## 6 MRNA_EXPRESSION Z-SCORE mRNA… 18,698… TRUE FALSE gbm_tc… gbm_tc…
## 7 MRNA_EXPRESSION Z-SCORE mRNA… Log-tr… TRUE FALSE gbm_tc… gbm_tc…
## 8 MRNA_EXPRESSION CONTIN… micr… expres… FALSE FALSE gbm_tc… gbm_tc…
## 9 MRNA_EXPRESSION Z-SCORE micr… microR… FALSE FALSE gbm_tc… gbm_tc…
## 10 MRNA_EXPRESSION Z-SCORE mRNA… mRNA a… TRUE FALSE gbm_tc… gbm_tc…
## # … with abbreviated variable names ¹datatype, ²description,
## # ³showProfileInAnalysisTab, ⁴patientLevel, ⁵molecularProfileId
Note that we want to pull the molecularProfileId column to use in other
queries.
cBioPortalData (Identify samples and genes)
Currently, some conversion is needed to directly use the molecularData
function if you only have Hugo symbols. First, convert to Entrez gene
IDs and then obtain all the samples in the sample list of interest.
hugoGeneSymbol to entrezGeneId
genetab <- queryGeneTable(cbio,
by = "hugoGeneSymbol",
genes = c("NF1", "TP53", "ABL1")
)
genetab
## # A tibble: 3 × 3
## entrezGeneId hugoGeneSymbol type
## <int> <chr> <chr>
## 1 4763 NF1 protein-coding
## 2 25 ABL1 protein-coding
## 3 7157 TP53 protein-coding
entrez <- genetab[["entrezGeneId"]]
allsamps <- samplesInSampleLists(cbio, "gbm_tcga_pub_all")
In the next section, we will show how to use the genes and sample identifiers to obtain the molecular profile data.
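Note that the gene table above does not return rows in the order the genes were requested (NF1, ABL1, TP53 rather than NF1, TP53, ABL1). A named lookup vector keeps symbols and identifiers paired regardless of row order. This is a small sketch on a data.frame that copies the printed values and does not call the API.

```r
# Values copied from the queryGeneTable() output above
genetab <- data.frame(
    entrezGeneId = c(4763L, 25L, 7157L),
    hugoGeneSymbol = c("NF1", "ABL1", "TP53"),
    stringsAsFactors = FALSE
)

# Named vector: names are Hugo symbols, values are Entrez IDs
hugo_to_entrez <- setNames(genetab$entrezGeneId, genetab$hugoGeneSymbol)

# Look up in the order the genes were originally requested
query <- c("NF1", "TP53", "ABL1")
hugo_to_entrez[query]

# Reverse lookup, from Entrez ID back to Hugo symbol
entrez_to_hugo <- setNames(genetab$hugoGeneSymbol, genetab$entrezGeneId)
entrez_to_hugo[as.character(c(25L, 7157L))]
```

The reverse lookup is indexed by character names, which is why the integer identifiers are converted with as.character() first.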
cgdsr (Profile Data)
The getProfileData function allows for straightforward retrieval of the
molecular profile data with only a case list and genetic profile
identifiers.
getProfileData.CGDS(x = cgds,
genes = c("NF1", "TP53", "ABL1"),
geneticProfiles = "gbm_tcga_pub_mrna",
caseList = "gbm_tcga_pub_all"
)
cBioPortalData
cBioPortalData provides a number of options for retrieving molecular
profile data depending on the use case. Note that molecularData is
mostly used internally and that the cBioPortalData function is the
user-friendly method for downloading such data.
molecularData
We use the translated entrez identifiers from above.
molecularData(cbio, "gbm_tcga_pub_mrna",
entrezGeneIds = entrez, sampleIds = unlist(allsamps))
## $gbm_tcga_pub_mrna
## # A tibble: 618 × 8
## uniqueSampleKey uniqu…¹ entre…² molec…³ sampl…⁴ patie…⁵ studyId value
## <chr> <chr> <int> <chr> <chr> <chr> <chr> <dbl>
## 1 VENHQS0wMi0wMDAxLTA… VENHQS… 25 gbm_tc… TCGA-0… TCGA-0… gbm_tc… -0.174
## 2 VENHQS0wMi0wMDAxLTA… VENHQS… 4763 gbm_tc… TCGA-0… TCGA-0… gbm_tc… -0.297
## 3 VENHQS0wMi0wMDAxLTA… VENHQS… 7157 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0.621
## 4 VENHQS0wMi0wMDAzLTA… VENHQS… 25 gbm_tc… TCGA-0… TCGA-0… gbm_tc… -0.177
## 5 VENHQS0wMi0wMDAzLTA… VENHQS… 4763 gbm_tc… TCGA-0… TCGA-0… gbm_tc… -0.00107
## 6 VENHQS0wMi0wMDAzLTA… VENHQS… 7157 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0.00644
## 7 VENHQS0wMi0wMDA0LTA… VENHQS… 25 gbm_tc… TCGA-0… TCGA-0… gbm_tc… -0.0878
## 8 VENHQS0wMi0wMDA0LTA… VENHQS… 4763 gbm_tc… TCGA-0… TCGA-0… gbm_tc… -0.236
## 9 VENHQS0wMi0wMDA0LTA… VENHQS… 7157 gbm_tc… TCGA-0… TCGA-0… gbm_tc… -0.305
## 10 VENHQS0wMi0wMDA2LTA… VENHQS… 25 gbm_tc… TCGA-0… TCGA-0… gbm_tc… -0.173
## # … with 608 more rows, and abbreviated variable names ¹uniquePatientKey,
## # ²entrezGeneId, ³molecularProfileId, ⁴sampleId, ⁵patientId
getDataByGenes
The getDataByGenes function automatically determines all the sample
identifiers in the study. It accepts Hugo and Entrez identifiers, as
well as genePanelId inputs.
getDataByGenes(
api = cbio,
studyId = "gbm_tcga_pub",
genes = c("NF1", "TP53", "ABL1"),
by = "hugoGeneSymbol",
molecularProfileIds = "gbm_tcga_pub_mrna"
)
## $gbm_tcga_pub_mrna
## # A tibble: 618 × 10
## uniqueSamp…¹ uniqu…² entre…³ molec…⁴ sampl…⁵ patie…⁶ studyId value hugoG…⁷
## <chr> <chr> <int> <chr> <chr> <chr> <chr> <dbl> <chr>
## 1 VENHQS0wMi0… VENHQS… 25 gbm_tc… TCGA-0… TCGA-0… gbm_tc… -0.174 ABL1
## 2 VENHQS0wMi0… VENHQS… 4763 gbm_tc… TCGA-0… TCGA-0… gbm_tc… -0.297 NF1
## 3 VENHQS0wMi0… VENHQS… 7157 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0.621 TP53
## 4 VENHQS0wMi0… VENHQS… 25 gbm_tc… TCGA-0… TCGA-0… gbm_tc… -0.177 ABL1
## 5 VENHQS0wMi0… VENHQS… 4763 gbm_tc… TCGA-0… TCGA-0… gbm_tc… -0.00107 NF1
## 6 VENHQS0wMi0… VENHQS… 7157 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0.00644 TP53
## 7 VENHQS0wMi0… VENHQS… 25 gbm_tc… TCGA-0… TCGA-0… gbm_tc… -0.0878 ABL1
## 8 VENHQS0wMi0… VENHQS… 4763 gbm_tc… TCGA-0… TCGA-0… gbm_tc… -0.236 NF1
## 9 VENHQS0wMi0… VENHQS… 7157 gbm_tc… TCGA-0… TCGA-0… gbm_tc… -0.305 TP53
## 10 VENHQS0wMi0… VENHQS… 25 gbm_tc… TCGA-0… TCGA-0… gbm_tc… -0.173 ABL1
## # … with 608 more rows, 1 more variable: type <chr>, and abbreviated variable
## # names ¹uniqueSampleKey, ²uniquePatientKey, ³entrezGeneId,
## # ⁴molecularProfileId, ⁵sampleId, ⁶patientId, ⁷hugoGeneSymbol
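The tables returned by molecularData and getDataByGenes are in long format, with one row per gene and sample, whereas cgdsr users may be used to a rectangular layout. As a rough illustration of how the two shapes relate, the sketch below reshapes a small long table into a gene-by-sample matrix using base R only. The values are copied from the assay() output printed in this vignette for two samples, and the object names are hypothetical.

```r
# Toy long-format table: one row per gene and sample
long <- data.frame(
    hugoGeneSymbol = rep(c("ABL1", "NF1", "TP53"), times = 2),
    sampleId = rep(c("TCGA-02-0001-01", "TCGA-02-0003-01"), each = 3),
    value = c(-0.1744878, -0.2966920, 0.6213171,
              -0.177096729, -0.001066810, 0.006435625),
    stringsAsFactors = FALSE
)

genes <- unique(long$hugoGeneSymbol)
samples <- unique(long$sampleId)

# Empty gene-by-sample matrix, filled by (row name, column name) pairs
wide <- matrix(
    NA_real_, nrow = length(genes), ncol = length(samples),
    dimnames = list(genes, samples)
)
wide[cbind(long$hugoGeneSymbol, long$sampleId)] <- long$value
wide
```

Indexing with a two-column character matrix matches each row against the dimnames, so the fill does not depend on the row order of the long table. Gene and sample pairs that are absent from the long table stay NA.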
cBioPortalData: the main end-user function
End users who wish to obtain the data as easily as possible should use
the main cBioPortalData function:
gbm_pub <- cBioPortalData(
api = cbio,
studyId = "gbm_tcga_pub",
genes = c("NF1", "TP53", "ABL1"), by = "hugoGeneSymbol",
molecularProfileIds = "gbm_tcga_pub_mrna"
)
assay(gbm_pub[["gbm_tcga_pub_mrna"]])[, 1:4]
## TCGA-02-0001-01 TCGA-02-0003-01 TCGA-02-0004-01 TCGA-02-0006-01
## ABL1 -0.1744878 -0.177096729 -0.08782114 -0.1733767
## NF1 -0.2966920 -0.001066810 -0.23626512 -0.1691507
## TP53 0.6213171 0.006435625 -0.30507285 0.3967758
cBioPortalData (mutationData)
Similar to molecularData, mutation data can be obtained with the
mutationData function or the getDataByGenes function.
mutationData(
api = cbio,
molecularProfileIds = "gbm_tcga_pub_mutations",
entrezGeneIds = entrez,
sampleIds = unlist(allsamps)
)
## $gbm_tcga_pub_mutations
## # A tibble: 57 × 28
## uniqueSample…¹ uniqu…² molec…³ sampl…⁴ patie…⁵ entre…⁶ studyId center mutat…⁷
## <chr> <chr> <chr> <chr> <chr> <int> <chr> <chr> <chr>
## 1 VENHQS0wMi0wM… VENHQS… gbm_tc… TCGA-0… TCGA-0… 7157 gbm_tc… genom… Somatic
## 2 VENHQS0wMi0wM… VENHQS… gbm_tc… TCGA-0… TCGA-0… 4763 gbm_tc… genom… Somatic
## 3 VENHQS0wMi0wM… VENHQS… gbm_tc… TCGA-0… TCGA-0… 7157 gbm_tc… genom… Somatic
## 4 VENHQS0wMi0wM… VENHQS… gbm_tc… TCGA-0… TCGA-0… 7157 gbm_tc… genom… Somatic
## 5 VENHQS0wMi0wM… VENHQS… gbm_tc… TCGA-0… TCGA-0… 7157 gbm_tc… genom… Somatic
## 6 VENHQS0wMi0wM… VENHQS… gbm_tc… TCGA-0… TCGA-0… 7157 gbm_tc… genom… Somatic
## 7 VENHQS0wMi0wM… VENHQS… gbm_tc… TCGA-0… TCGA-0… 4763 gbm_tc… genom… Somatic
## 8 VENHQS0wMi0wM… VENHQS… gbm_tc… TCGA-0… TCGA-0… 7157 gbm_tc… genom… Somatic
## 9 VENHQS0wMi0wM… VENHQS… gbm_tc… TCGA-0… TCGA-0… 7157 gbm_tc… genom… Somatic
## 10 VENHQS0wMi0wM… VENHQS… gbm_tc… TCGA-0… TCGA-0… 7157 gbm_tc… genom… Somatic
## # … with 47 more rows, 19 more variables: validationStatus <chr>,
## # startPosition <int>, endPosition <int>, referenceAllele <chr>,
## # proteinChange <chr>, mutationType <chr>, functionalImpactScore <chr>,
## # fisValue <dbl>, linkXvar <chr>, linkPdb <chr>, linkMsa <chr>,
## # ncbiBuild <chr>, variantType <chr>, keyword <chr>, chr <chr>,
## # variantAllele <chr>, refseqMrnaId <chr>, proteinPosStart <int>,
## # proteinPosEnd <int>, and abbreviated variable names ¹uniqueSampleKey, …
getDataByGenes(
api = cbio,
studyId = "gbm_tcga_pub",
genes = c("NF1", "TP53", "ABL1"),
by = "hugoGeneSymbol",
molecularProfileIds = "gbm_tcga_pub_mutations"
)
## $gbm_tcga_pub_mutations
## # A tibble: 57 × 30
## uniqueSample…¹ uniqu…² molec…³ sampl…⁴ patie…⁵ entre…⁶ studyId center mutat…⁷
## <chr> <chr> <chr> <chr> <chr> <int> <chr> <chr> <chr>
## 1 VENHQS0wMi0wM… VENHQS… gbm_tc… TCGA-0… TCGA-0… 7157 gbm_tc… genom… Somatic
## 2 VENHQS0wMi0wM… VENHQS… gbm_tc… TCGA-0… TCGA-0… 4763 gbm_tc… genom… Somatic
## 3 VENHQS0wMi0wM… VENHQS… gbm_tc… TCGA-0… TCGA-0… 7157 gbm_tc… genom… Somatic
## 4 VENHQS0wMi0wM… VENHQS… gbm_tc… TCGA-0… TCGA-0… 7157 gbm_tc… genom… Somatic
## 5 VENHQS0wMi0wM… VENHQS… gbm_tc… TCGA-0… TCGA-0… 7157 gbm_tc… genom… Somatic
## 6 VENHQS0wMi0wM… VENHQS… gbm_tc… TCGA-0… TCGA-0… 7157 gbm_tc… genom… Somatic
## 7 VENHQS0wMi0wM… VENHQS… gbm_tc… TCGA-0… TCGA-0… 4763 gbm_tc… genom… Somatic
## 8 VENHQS0wMi0wM… VENHQS… gbm_tc… TCGA-0… TCGA-0… 7157 gbm_tc… genom… Somatic
## 9 VENHQS0wMi0wM… VENHQS… gbm_tc… TCGA-0… TCGA-0… 7157 gbm_tc… genom… Somatic
## 10 VENHQS0wMi0wM… VENHQS… gbm_tc… TCGA-0… TCGA-0… 7157 gbm_tc… genom… Somatic
## # … with 47 more rows, 21 more variables: validationStatus <chr>,
## # startPosition <int>, endPosition <int>, referenceAllele <chr>,
## # proteinChange <chr>, mutationType <chr>, functionalImpactScore <chr>,
## # fisValue <dbl>, linkXvar <chr>, linkPdb <chr>, linkMsa <chr>,
## # ncbiBuild <chr>, variantType <chr>, keyword <chr>, chr <chr>,
## # variantAllele <chr>, refseqMrnaId <chr>, proteinPosStart <int>,
## # proteinPosEnd <int>, hugoGeneSymbol <chr>, type <chr>, and abbreviated …
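A common next step with a mutation table is to count rows per gene. The sketch below does this in base R using only the entrezGeneId values from the first ten rows printed above, so the counts are illustrative and are not totals for the study. The Entrez-to-Hugo lookup copies the gene table shown earlier in this vignette.

```r
# entrezGeneId values from the first ten rows printed above
entrez_ids <- c(
    7157L, 4763L, 7157L, 7157L, 7157L,
    7157L, 4763L, 7157L, 7157L, 7157L
)

# Entrez-to-Hugo lookup, copied from the gene table shown earlier
entrez_to_hugo <- c("4763" = "NF1", "25" = "ABL1", "7157" = "TP53")

# Translate to symbols, keeping all three queried genes as factor levels
# so that genes without any rows still appear with a count of zero
symbols <- factor(
    entrez_to_hugo[as.character(entrez_ids)],
    levels = c("NF1", "TP53", "ABL1")
)
counts <- table(symbols)
counts
```

Declaring the factor levels up front is a design choice: it makes the absence of rows for a queried gene visible in the summary instead of silently dropping that gene.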
cgdsr (getMutationData)
getMutationData.CGDS(
x = cgds,
caseList = "gbm_tcga_pub_all",
geneticProfile = "gbm_tcga_pub_mutations",
genes = c("NF1", "TP53", "ABL1")
)
cBioPortalData (CNA)
Copy Number Alteration data can be obtained with the getDataByGenes
function or with the main cBioPortalData function.
getDataByGenes(
api = cbio,
studyId = "gbm_tcga_pub",
genes = c("NF1", "TP53", "ABL1"),
by = "hugoGeneSymbol",
molecularProfileIds = "gbm_tcga_pub_cna_rae"
)
## $gbm_tcga_pub_cna_rae
## # A tibble: 609 × 10
## uniqueS…¹ uniqu…² entre…³ molec…⁴ sampl…⁵ patie…⁶ studyId value hugoG…⁷ type
## <chr> <chr> <int> <chr> <chr> <chr> <chr> <int> <chr> <chr>
## 1 VENHQS0w… VENHQS… 25 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 1 ABL1 prot…
## 2 VENHQS0w… VENHQS… 4763 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0 NF1 prot…
## 3 VENHQS0w… VENHQS… 7157 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0 TP53 prot…
## 4 VENHQS0w… VENHQS… 25 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0 ABL1 prot…
## 5 VENHQS0w… VENHQS… 4763 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0 NF1 prot…
## 6 VENHQS0w… VENHQS… 7157 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0 TP53 prot…
## 7 VENHQS0w… VENHQS… 25 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0 ABL1 prot…
## 8 VENHQS0w… VENHQS… 4763 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0 NF1 prot…
## 9 VENHQS0w… VENHQS… 7157 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0 TP53 prot…
## 10 VENHQS0w… VENHQS… 25 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0 ABL1 prot…
## # … with 599 more rows, and abbreviated variable names ¹uniqueSampleKey,
## # ²uniquePatientKey, ³entrezGeneId, ⁴molecularProfileId, ⁵sampleId,
## # ⁶patientId, ⁷hugoGeneSymbol
cBioPortalData(
api = cbio,
studyId = "gbm_tcga_pub",
genes = c("NF1", "TP53", "ABL1"),
by = "hugoGeneSymbol",
molecularProfileIds = "gbm_tcga_pub_cna_rae"
)
## harmonizing input:
## removing 3 colData rownames not in sampleMap 'primary'
## A MultiAssayExperiment object of 1 listed
## experiment with a user-defined name and respective class.
## Containing an ExperimentList class object of length 1:
## [1] gbm_tcga_pub_cna_rae: SummarizedExperiment with 3 rows and 203 columns
## Functionality:
## experiments() - obtain the ExperimentList instance
## colData() - the primary/phenotype DataFrame
## sampleMap() - the sample coordination DataFrame
## `$`, `[`, `[[` - extract colData columns, subset, or experiment
## *Format() - convert into a long or wide DataFrame
## assays() - convert ExperimentList to a SimpleList of matrices
## exportClass() - save data to flat files
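The message "removing 3 colData rownames not in sampleMap 'primary'" indicates that clinical rows without matching assay data were dropped when the object was built. The sketch below illustrates that kind of filtering with base R set operations. The identifiers are hypothetical, and the code is only an illustration of the idea, not the MultiAssayExperiment implementation.

```r
# Hypothetical identifiers for illustration only
clinical_ids <- c("P1", "P2", "P3", "P4", "P5")  # rows of the clinical table
assay_ids <- c("P1", "P2", "P4")                 # units with assay data

# Clinical rows without matching assay data would be dropped
dropped <- setdiff(clinical_ids, assay_ids)
kept <- intersect(clinical_ids, assay_ids)

length(dropped)
kept

# The counts reported above appear consistent with this reading:
# 206 clinical rows minus 3 removed leaves 203 columns
206 - 3
```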
cgdsr (CNA)
getProfileData.CGDS(
x = cgds,
genes = c("NF1", "TP53", "ABL1"),
geneticProfiles = "gbm_tcga_pub_cna_rae",
caseList = "gbm_tcga_pub_cna"
)
cBioPortalData (Methylation)
Similar to Copy Number Alteration data, methylation data can be obtained
with the getDataByGenes function or with the cBioPortalData function.
getDataByGenes(
api = cbio,
studyId = "gbm_tcga_pub",
genes = c("NF1", "TP53", "ABL1"),
by = "hugoGeneSymbol",
molecularProfileIds = "gbm_tcga_pub_methylation_hm27"
)
## $gbm_tcga_pub_methylation_hm27
## # A tibble: 174 × 10
## unique…¹ uniqu…² entre…³ molec…⁴ sampl…⁵ patie…⁶ studyId value hugoG…⁷ type
## <chr> <chr> <int> <chr> <chr> <chr> <chr> <dbl> <chr> <chr>
## 1 VENHQS0… VENHQS… 25 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0.103 ABL1 prot…
## 2 VENHQS0… VENHQS… 4763 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0.112 NF1 prot…
## 3 VENHQS0… VENHQS… 7157 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0.0735 TP53 prot…
## 4 VENHQS0… VENHQS… 25 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0.202 ABL1 prot…
## 5 VENHQS0… VENHQS… 4763 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0.161 NF1 prot…
## 6 VENHQS0… VENHQS… 7157 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0.152 TP53 prot…
## 7 VENHQS0… VENHQS… 25 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0.179 ABL1 prot…
## 8 VENHQS0… VENHQS… 4763 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0.161 NF1 prot…
## 9 VENHQS0… VENHQS… 7157 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0.170 TP53 prot…
## 10 VENHQS0… VENHQS… 25 gbm_tc… TCGA-0… TCGA-0… gbm_tc… 0.176 ABL1 prot…
## # … with 164 more rows, and abbreviated variable names ¹uniqueSampleKey,
## # ²uniquePatientKey, ³entrezGeneId, ⁴molecularProfileId, ⁵sampleId,
## # ⁶patientId, ⁷hugoGeneSymbol
cBioPortalData(
api = cbio,
studyId = "gbm_tcga_pub",
genes = c("NF1", "TP53", "ABL1"),
by = "hugoGeneSymbol",
molecularProfileIds = "gbm_tcga_pub_methylation_hm27"
)
## harmonizing input:
## removing 148 colData rownames not in sampleMap 'primary'
## A MultiAssayExperiment object of 1 listed
## experiment with a user-defined name and respective class.
## Containing an ExperimentList class object of length 1:
## [1] gbm_tcga_pub_methylation_hm27: SummarizedExperiment with 3 rows and 58 columns
## Functionality:
## experiments() - obtain the ExperimentList instance
## colData() - the primary/phenotype DataFrame
## sampleMap() - the sample coordination DataFrame
## `$`, `[`, `[[` - extract colData columns, subset, or experiment
## *Format() - convert into a long or wide DataFrame
## assays() - convert ExperimentList to a SimpleList of matrices
## exportClass() - save data to flat files
cgdsr (Methylation)
getProfileData.CGDS(
x = cgds,
genes = c("NF1", "TP53", "ABL1"),
geneticProfiles = "gbm_tcga_pub_methylation_hm27",
caseList = "gbm_tcga_pub_methylation_hm27"
)
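As an informal summary of the pairings used in this vignette, the correspondence between cgdsr functions and their cBioPortalData counterparts can be kept in a small lookup. The list and the helper cgdsr_replacement below are hypothetical conveniences and are not provided by either package.

```r
# Correspondence between cgdsr functions and the cBioPortalData functions
# shown in their place in this vignette
migration_map <- list(
    getCancerStudies = "getStudies",
    getCaseLists = "sampleLists",
    getClinicalData = "clinicalData",
    getProfileData = c("getDataByGenes", "cBioPortalData", "molecularData"),
    getMutationData = c("mutationData", "getDataByGenes")
)

# Hypothetical helper: look up the replacement(s) for a cgdsr function
# name, accepting names with or without the ".CGDS" suffix
cgdsr_replacement <- function(fun) {
    key <- sub("\\.CGDS$", "", fun)
    if (!key %in% names(migration_map)) {
        stop("no mapping recorded for '", fun, "'")
    }
    migration_map[[key]]
}

cgdsr_replacement("getCaseLists.CGDS")
cgdsr_replacement("getProfileData")
```

The replacements are not drop-in equivalents: as the sections above show, argument names and identifier formats differ between the two packages.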
## R version 4.2.2 (2022-10-31)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] cBioPortalData_2.10.3 MultiAssayExperiment_1.24.0
## [3] SummarizedExperiment_1.28.0 Biobase_2.58.0
## [5] GenomicRanges_1.50.2 GenomeInfoDb_1.34.4
## [7] IRanges_2.32.0 S4Vectors_0.36.1
## [9] BiocGenerics_0.44.0 MatrixGenerics_1.10.0
## [11] matrixStats_0.63.0 AnVIL_1.10.1
## [13] dplyr_1.0.10 BiocStyle_2.26.0
##
## loaded via a namespace (and not attached):
## [1] rjson_0.2.21 ellipsis_0.3.2
## [3] rprojroot_2.0.3 futile.logger_1.4.3
## [5] XVector_0.38.0 fs_1.5.2
## [7] DT_0.26 bit64_4.0.5
## [9] AnnotationDbi_1.60.0 fansi_1.0.3
## [11] xml2_1.3.3 codetools_0.2-18
## [13] splines_4.2.2 cachem_1.0.6
## [15] knitr_1.41 jsonlite_1.8.4
## [17] Rsamtools_2.14.0 dbplyr_2.2.1
## [19] png_0.1-8 shiny_1.7.4
## [21] BiocManager_1.30.19.5 readr_2.1.3
## [23] compiler_4.2.2 httr_1.4.4
## [25] assertthat_0.2.1 Matrix_1.5-3
## [27] fastmap_1.1.0 limma_3.54.0
## [29] cli_3.5.0 later_1.3.0
## [31] formatR_1.13 htmltools_0.5.4
## [33] prettyunits_1.1.1 tools_4.2.2
## [35] glue_1.6.2 GenomeInfoDbData_1.2.9
## [37] rappdirs_0.3.3 Rcpp_1.0.9
## [39] rapiclient_0.1.3 jquerylib_0.1.4
## [41] pkgdown_2.0.7 vctrs_0.5.1
## [43] Biostrings_2.66.0 RJSONIO_1.3-1.6
## [45] RaggedExperiment_1.22.0 rtracklayer_1.58.0
## [47] xfun_0.36 stringr_1.5.0
## [49] rvest_1.0.3 RTCGAToolbox_2.28.0
## [51] mime_0.12 miniUI_0.1.1.1
## [53] lifecycle_1.0.3 restfulr_0.0.15
## [55] XML_3.99-0.13 zlibbioc_1.44.0
## [57] RCircos_1.2.2 ragg_1.2.4
## [59] hms_1.1.2 promises_1.2.0.1
## [61] parallel_4.2.2 lambda.r_1.2.4
## [63] yaml_2.3.6 curl_4.3.3
## [65] memoise_2.0.1 sass_0.4.4
## [67] biomaRt_2.54.0 stringi_1.7.8
## [69] RSQLite_2.2.20 BiocIO_1.8.0
## [71] desc_1.4.2 GenomicDataCommons_1.22.0
## [73] GenomicFeatures_1.50.3 filelock_1.0.2
## [75] BiocParallel_1.32.5 rlang_1.0.6
## [77] pkgconfig_2.0.3 systemfonts_1.0.4
## [79] bitops_1.0-7 evaluate_0.19
## [81] lattice_0.20-45 purrr_1.0.0
## [83] GenomicAlignments_1.34.0 htmlwidgets_1.6.0
## [85] bit_4.0.5 tidyselect_1.2.0
## [87] magrittr_2.0.3 bookdown_0.31
## [89] R6_2.5.1 generics_0.1.3
## [91] DelayedArray_0.24.0 DBI_1.1.3
## [93] withr_2.5.0 pillar_1.8.1
## [95] survival_3.4-0 KEGGREST_1.38.0
## [97] RCurl_1.98-1.9 tibble_3.1.8
## [99] crayon_1.5.2 futile.options_1.0.1
## [101] utf8_1.2.2 BiocFileCache_2.6.0
## [103] tzdb_0.3.0 rmarkdown_2.19
## [105] progress_1.2.2 grid_4.2.2
## [107] data.table_1.14.6 blob_1.2.3
## [109] digest_0.6.31 xtable_1.8-4
## [111] tidyr_1.2.1 httpuv_1.6.7
## [113] textshaping_0.3.6 TCGAutils_1.18.0
## [115] bslib_0.4.2