This function evaluates the proportion of rejected null hypotheses (i.e. the fraction of significant gene sets) of an enrichment method when applied to random gene sets of defined size.
Usage
evalRandomGS(
method,
se,
nr.gs = 100,
set.size = 5,
alpha = 0.05,
padj = "none",
perc = TRUE,
reps = 100,
rep.block.size = -1,
summarize = TRUE,
save2file = FALSE,
out.dir = NULL,
...
)
Arguments
- method
Enrichment analysis method. A character scalar chosen from sbeaMethods and nbeaMethods, or a user-defined function implementing a method for enrichment analysis.
- se
An expression dataset of class SummarizedExperiment.
- nr.gs
Integer. Number of random gene sets. Defaults to 100.
- set.size
Integer. Gene set size, i.e. number of genes in each random gene set. Defaults to 5.
- alpha
Numeric. Statistical significance level. Defaults to 0.05.
- padj
Character. Method for adjusting p-values for multiple testing. For available methods see the man page of the stats function p.adjust. Defaults to "none".
- perc
Logical. Should the percentage (between 0 and 100, default) or the proportion (between 0 and 1) of significant gene sets be returned?
- reps
Integer. Number of replications. Defaults to 100.
- rep.block.size
Integer. When running in parallel, splits reps into blocks of the indicated size. Defaults to -1, which indicates to not partition reps.
- summarize
Logical. If TRUE (default) returns the mean (mean) and the standard deviation (sd) of the proportion of significant gene sets across reps replications. Use FALSE to return the full vector storing the proportion of significant gene sets for each replication.
- save2file
Logical. Should results be saved to file for subsequent benchmarking? Defaults to FALSE.
- out.dir
Character. Determines the output directory where results are saved to. Defaults to NULL, which then writes to tools::R_user_dir("GSEABenchmarkeR") in case save2file is set to TRUE.
- ...
Additional arguments passed to the selected enrichment method.
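The arguments above can be read as the knobs of a small simulation. The following is a minimal base R sketch of that idea, not the package's implementation: `toy_method` and `one_rep` are hypothetical helpers introduced only for illustration, the gene-level scores are simulated, and the strict comparison `p < alpha` is an assumption.

```r
# Hypothetical toy enrichment method (NOT part of GSEABenchmarkeR):
# for each gene set, compare gene-level scores inside vs. outside the set
toy_method <- function(scores, gene.sets) {
    vapply(gene.sets, function(gs) {
        in.set <- names(scores) %in% gs
        stats::wilcox.test(scores[in.set], scores[!in.set])$p.value
    }, numeric(1))
}

# One replication: draw random gene sets, test them, and report the
# share of sets called significant at level alpha
one_rep <- function(scores, nr.gs = 100, set.size = 5, alpha = 0.05,
                    padj = "none", perc = TRUE) {
    gene.sets <- replicate(nr.gs, sample(names(scores), set.size),
                           simplify = FALSE)
    p <- toy_method(scores, gene.sets)
    p <- stats::p.adjust(p, method = padj)   # "none" leaves p unchanged
    prop <- mean(p < alpha)                  # assumption: strict inequality
    if (perc) prop * 100 else prop
}

set.seed(42)
# Simulated gene-level scores standing in for a real DE statistic
scores <- stats::setNames(stats::rnorm(1000), paste0("gene", seq_len(1000)))

res <- replicate(3, one_rep(scores))         # plays the role of reps = 3
c(mean = mean(res), sd = stats::sd(res))
```

Because the gene sets are drawn at random, a well-calibrated method would be expected to flag roughly `alpha` of them as significant when no p-value adjustment is applied.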
Value
A named numeric vector of length 2 storing the mean and standard deviation of the proportion of significant gene sets across reps replications (summarize=TRUE); or a numeric vector of length reps storing the proportion of significant gene sets for each replication (summarize=FALSE).
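The relation between the two return shapes can be illustrated in base R. The per-replication values below are hypothetical and chosen only for illustration; that the summary uses the sample standard deviation as computed by `stats::sd` is an assumption.

```r
# Hypothetical per-replication percentages (illustrative, not package output)
per.rep <- c(4, 5, 7)

# summarize = FALSE corresponds to the full vector, one entry per replication
per.rep

# summarize = TRUE corresponds to the mean and standard deviation of that vector
summary.stats <- c(mean = mean(per.rep), sd = stats::sd(per.rep))
round(summary.stats, 6)
#>     mean       sd
#> 5.333333 1.527525
```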
See also
sbea and nbea for carrying out set- and network-based enrichment analysis.
BiocParallelParam and register for configuration of parallel computation.
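To make the effect of rep.block.size more concrete, here is a rough base R sketch of how replication indices could be partitioned into blocks before being handed to parallel workers. `split_reps` is a hypothetical helper, not a function of the package, and the package's internal partitioning may differ.

```r
# Hypothetical helper (not from GSEABenchmarkeR): partition replication
# indices into consecutive blocks of at most `block.size` elements
split_reps <- function(reps, block.size = -1) {
    idx <- seq_len(reps)
    # -1 (or any value below 1) keeps all replications in a single block
    if (block.size < 1) return(list(idx))
    unname(split(idx, ceiling(idx / block.size)))
}

blocks <- split_reps(reps = 10, block.size = 4)
lengths(blocks)
#> [1] 4 4 2
```

Each block could then be processed by one worker, for instance via `BiocParallel::bplapply`, using whichever backend has been registered.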
Examples
# loading two datasets from the GEO2KEGG compendium
geo2kegg <- loadEData("geo2kegg", nr.datasets = 2)
#> Loading GEO2KEGG data compendium ...
# only considering the first 1000 probes for demonstration
geo2kegg <- lapply(geo2kegg, function(d) d[1:1000,])
# preprocessing and DE analysis for two of the datasets
geo2kegg <- maPreproc(geo2kegg)
#> Summarizing probe level expression ...
#> Corresponding annotation package not found: hgu133a.db
#> Make sure that you have it installed.
#> Corresponding annotation package not found: hgu133plus2.db
#> Make sure that you have it installed.
geo2kegg <- runDE(geo2kegg)
evalRandomGS("camera", geo2kegg[[1]], reps = 3)
#>     mean       sd
#> 5.333333 1.527525