Make a GRangesList from TCGA Copy Number data
Source: R/makeGRangesListFromCopyNumber.R
makeGRangesListFromCopyNumber allows the user to convert objects of
class data.frame or S4Vectors::DataFrame to a
GRangesList. It includes additional
features specific to TCGA data, such as HUGO symbols, probe numbers, segment
means, and UCSC build (if available).
Arguments
- df
A data.frame or DataFrame class object. list class objects are coerced to data.frame or DataFrame.
- split.field
A character vector of length one indicating the column to be used as sample identifiers
- names.field
A character vector of length one indicating the column to be used as names for each of the ranges in the data
- ...
Additional arguments to pass on to GenomicRanges::makeGRangesListFromDataFrame
Value
A GRangesList class object
Examples
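library(GenomicDataCommons)
library(TCGAutils)
# Query the GDC for one TCGA-COAD copy number segment file
manif <- files() |>
    filter(~ cases.project.project_id == "TCGA-COAD" &
        data_type == "Copy Number Segment") |>
    manifest(size = 1)
# Download the file; the result is a file path named by file UUID
fname <- gdcdata(manif$id)
# Translate the file UUID to its TCGA barcode
UUIDtoBarcode(names(fname), from_type = "file_id")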
#> file_id associated_entities.entity_submitter_id
#> 1 27942c4d-57d1-43c2-a27f-ce4ed46f75a2 TCGA-NH-A5IV-01A-42D-A36W-01
cndata <- read.delim(fname[[1L]])
makeGRangesListFromCopyNumber(
    df = cndata,
    split.field = "GDC_Aliquot",
    keep.extra.columns = TRUE
)
#> GRangesList object of length 1:
#> $`9ffe4929-b7aa-4e04-9876-56b1c4b33139`
#> GRanges object with 222 ranges and 2 metadata columns:
#> seqnames ranges strand | Num_Probes Segment_Mean
#> <Rle> <IRanges> <Rle> | <integer> <numeric>
#> [1] 1 62920-4874007 * | 1887 0.0317
#> [2] 1 4883463-5054493 * | 174 -0.6952
#> [3] 1 5060514-20995776 * | 9503 0.0053
#> [4] 1 20997998-21094512 * | 43 -0.6860
#> [5] 1 21096482-72297687 * | 31164 0.0110
#> ... ... ... ... . ... ...
#> [218] X 81419945-155734427 * | 41590 0.0074
#> [219] X 155736972-155952689 * | 16 -0.8036
#> [220] Y 2782397-6195105 * | 762 -1.9317
#> [221] Y 6195308-21578726 * | 5481 -2.1922
#> [222] Y 21582050-56872112 * | 387 -1.9747
#> -------
#> seqinfo: 24 sequences from an unspecified genome; no seqlengths
#>
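The example above needs network access to the GDC. As a rough offline sketch of what split.field and keep.extra.columns do, the code below calls GenomicRanges::makeGRangesListFromDataFrame directly, the function that receives the additional arguments. It does not call makeGRangesListFromCopyNumber itself, so any TCGA-specific handling is not shown. The cn_toy table and all of its values are invented for illustration; only the column names imitate a copy number segment file.

```r
library(GenomicRanges)

# Hypothetical toy table: column names imitate a GDC copy number segment
# file, but every value is made up for this sketch.
cn_toy <- data.frame(
    GDC_Aliquot = c("aliquot-A", "aliquot-A", "aliquot-B", "aliquot-B"),
    Chromosome = c("1", "2", "1", "X"),
    Start = c(100L, 500L, 150L, 1000L),
    End = c(400L, 900L, 450L, 2000L),
    Num_Probes = c(10L, 25L, 12L, 40L),
    Segment_Mean = c(0.03, -0.70, 0.01, -0.80)
)

# split.field names the column whose values define the list elements;
# keep.extra.columns = TRUE carries the remaining columns along as
# metadata columns on each range.
grl <- makeGRangesListFromDataFrame(
    cn_toy,
    split.field = "GDC_Aliquot",
    keep.extra.columns = TRUE
)
grl
```

Each distinct value of the split.field column becomes one element of the resulting GRangesList, which is why the real example above yields a list of length 1 for a single-aliquot file.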