Make a GRangesList from TCGA Copy Number data — makeGRangesListFromCopyNumber • TCGAutils

makeGRangesListFromCopyNumber allows the user to convert objects of class data.frame or S4Vectors::DataFrame to a GRangesList. It includes additional features specific to TCGA data such as, hugo symbols, probe numbers, segment means, and ucsc build (if available).

Usage

makeGRangesListFromCopyNumber(
  df,
  split.field,
  names.field = "Hugo_Symbol",
  ...
)

Arguments

df: A data.frame or DataFrame class object. list class objects are coerced to data.frame or DataFrame.
split.field: A character vector of length one indicating the column to be used as sample identifiers
names.field: A character vector of length one indicating the column to be used as names for each of the ranges in the data
...: Additional arguments to pass on to GenomicRanges::makeGRangesListFromDataFrame

Value

A GRangesList class object

Examples

library(GenomicDataCommons)

manif <- files() |>
    filter(~ cases.project.project_id == "TCGA-COAD" &
        data_type == "Copy Number Segment") |>
    manifest(size = 1)

fname <- gdcdata(manif$id)

barcode <- UUIDtoBarcode(names(fname), from_type = "file_id")
barcode <- barcode[["associated_entities.entity_submitter_id"]]

cndata <- read.delim(fname[[1L]], nrows = 10L)

cngrl <- makeGRangesListFromCopyNumber(cndata, split.field = "GDC_Aliquot",
    keep.extra.columns = TRUE)

names(cngrl) <- barcode
GenomeInfoDb::genome(cngrl) <- extractBuild(fname[[1L]])
cngrl
#> GRangesList object of length 1:
#> $`TCGA-AA-3854-01A-01D-0903-01`
#> GRanges object with 10 ranges and 2 metadata columns:
#>        seqnames            ranges strand | Num_Probes Segment_Mean
#>           <Rle>         <IRanges>  <Rle> |  <integer>    <numeric>
#>    [1]        1       62920-98588      * |         12      -0.8484
#>    [2]        1     98602-7492401      * |       3644      -0.0320
#>    [3]        1   7492530-7511749      * |         15       0.4261
#>    [4]        1  7511936-15823420      * |       4672      -0.0411
#>    [5]        1 15827002-15828515      * |         10       0.7638
#>    [6]        1 15839166-16785682      * |        350      -0.0481
#>    [7]        1 16787932-16934716      * |         51      -0.4808
#>    [8]        1 16934753-25164511      * |       5234      -0.0270
#>    [9]        1 25165842-34626073      * |       5014      -0.0443
#>   [10]        1 34637053-34638890      * |         20      -0.4754
#>   -------
#>   seqinfo: 1 sequence from grch38 genome; no seqlengths
#>