This function evaluates the proportion of rejected null hypotheses (= the fraction of significant gene sets) of an enrichment method when applied to random gene sets of defined size.
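The idea can be illustrated with a minimal base R sketch. It is not the procedure used inside evalRandomGS: the simulated gene statistics and the helper toy.method are hypothetical stand-ins for a real dataset and a real enrichment method. They only show how a fraction of significant random gene sets could be computed.

```r
# Toy illustration only: toy.method is a hypothetical stand-in for an
# enrichment method, not the code used inside evalRandomGS().
set.seed(42)

n.genes  <- 1000   # number of genes in the simulated dataset (assumption)
nr.gs    <- 100    # number of random gene sets
set.size <- 5      # number of genes in each random gene set
alpha    <- 0.05   # significance level

# Simulated gene-level statistics under the null (no differential expression)
gene.stat <- rnorm(n.genes)
names(gene.stat) <- paste0("g", seq_len(n.genes))

# Draw random gene sets of the defined size
gene.sets <- replicate(nr.gs, sample(names(gene.stat), set.size),
                       simplify = FALSE)

# Hypothetical enrichment test: compare statistics inside vs. outside the set
toy.method <- function(gs) {
    in.set <- names(gene.stat) %in% gs
    t.test(gene.stat[in.set], gene.stat[!in.set])$p.value
}
pvals <- vapply(gene.sets, toy.method, numeric(1))

# Fraction of rejected null hypotheses, as a proportion and as a percentage
prop.sig <- mean(pvals < alpha)
perc.sig <- 100 * prop.sig
```

For a well-calibrated method, the fraction of significant random gene sets would be expected to stay close to alpha.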
Usage
evalRandomGS(
method,
se,
nr.gs = 100,
set.size = 5,
alpha = 0.05,
padj = "none",
perc = TRUE,
reps = 100,
rep.block.size = -1,
summarize = TRUE,
save2file = FALSE,
out.dir = NULL,
...
)
Arguments
- method
Enrichment analysis method. A character scalar chosen from
sbeaMethods and nbeaMethods, or a user-defined function implementing a method for enrichment analysis.
- se
An expression dataset of class
SummarizedExperiment.
- nr.gs
Integer. Number of random gene sets. Defaults to 100.
- set.size
Integer. Gene set size, i.e. number of genes in each random gene set. Defaults to 5.
- alpha
Numeric. Statistical significance level. Defaults to 0.05.
- padj
Character. Method for adjusting p-values for multiple testing. For available methods see the man page of the stats function
p.adjust. Defaults to "none".
- perc
Logical. Should the percentage (between 0 and 100, default) or the proportion (between 0 and 1) of significant gene sets be returned?
- reps
Integer. Number of replications. Defaults to 100.
- rep.block.size
Integer. When running in parallel, splits
reps into blocks of the indicated size. Defaults to -1, which means that reps is not partitioned.
- summarize
Logical. If
TRUE (default), returns the mean (mean) and the standard deviation (sd) of the proportion of significant gene sets across reps replications. Use FALSE to return the full vector storing the proportion of significant gene sets for each replication.
- save2file
Logical. Should results be saved to file for subsequent benchmarking? Defaults to
FALSE.
- out.dir
Character. Output directory in which results are saved. Defaults to
NULL, in which case results are written to tools::R_user_dir("GSEABenchmarkeR") if save2file is set to TRUE.
- ...
Additional arguments passed to the selected enrichment method.
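The effect of alpha, padj, and rep.block.size can be sketched in base R. The p-values below are made up for illustration. The block partitioning shows one plausible way to split reps into blocks and is an assumption, not necessarily how the package does it internally.

```r
# Illustration with made-up p-values; not the package's internal code.
pvals <- c(0.001, 0.012, 0.030, 0.049, 0.200, 0.750)
alpha <- 0.05

# padj = "none" leaves the p-values unchanged
n.sig.none <- sum(p.adjust(pvals, method = "none") < alpha)   # 4

# A multiple testing correction such as "BH" is more conservative here
n.sig.bh <- sum(p.adjust(pvals, method = "BH") < alpha)       # 2

# One plausible way to split replications into blocks (assumption)
reps <- 10
rep.block.size <- 4
blocks <- split(seq_len(reps), ceiling(seq_len(reps) / rep.block.size))
block.sizes <- unname(lengths(blocks))                         # 4 4 2
```

Any method name accepted by p.adjust (see p.adjust.methods) can be used in place of "BH".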
Value
A named numeric vector of length 2 storing the mean and standard deviation
of the proportion (or percentage, if perc=TRUE) of significant gene sets
across reps replications (summarize=TRUE); or a numeric vector of length
reps storing the proportion of significant gene sets for each replication
(summarize=FALSE).
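The two return shapes can be sketched as follows. The per-replication percentages are hypothetical values chosen for illustration. They reproduce the summary printed in the Examples section, but the values actually obtained in that run are not shown there.

```r
# Hypothetical per-replication percentages for reps = 3 (illustration only)
perc.sig <- c(4, 5, 7)

# summarize = TRUE corresponds to a named numeric vector of length 2
res <- c(mean = mean(perc.sig), sd = sd(perc.sig))
res
#>     mean       sd
#> 5.333333 1.527525

# summarize = FALSE corresponds to the full vector of length reps
perc.sig
#> [1] 4 5 7
```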
See also
sbea and nbea
for carrying out set- and network-based enrichment analysis.
BiocParallelParam and register for
configuration of parallel computation.
Examples
# loading two datasets from the GEO2KEGG compendium
geo2kegg <- loadEData("geo2kegg", nr.datasets = 2)
#> Loading GEO2KEGG data compendium ...
# only considering the first 1000 probes for demonstration
geo2kegg <- lapply(geo2kegg, function(d) d[1:1000,])
# preprocessing and DE analysis for two of the datasets
geo2kegg <- maPreproc(geo2kegg)
#> Summarizing probe level expression ...
#> Corresponding annotation package not found: hgu133a.db
#> Make sure that you have it installed.
#> 'getOption("repos")' replaces Bioconductor standard repositories, see
#> 'help("repositories", package = "BiocManager")' for details.
#> Replacement repositories:
#> CRAN: https://p3m.dev/cran/__linux__/noble/latest
#> Bioconductor version 3.21 (BiocManager 1.30.25), R 4.5.0 (2025-04-11)
#> Installing package(s) 'hgu133a.db'
#> Corresponding annotation package not found: hgu133plus2.db
#> Make sure that you have it installed.
#> 'getOption("repos")' replaces Bioconductor standard repositories, see
#> 'help("repositories", package = "BiocManager")' for details.
#> Replacement repositories:
#> CRAN: https://p3m.dev/cran/__linux__/noble/latest
#> Bioconductor version 3.21 (BiocManager 1.30.25), R 4.5.0 (2025-04-11)
#> Installing package(s) 'hgu133plus2.db'
geo2kegg <- runDE(geo2kegg)
evalRandomGS("camera", geo2kegg[[1]], reps = 3)
#> mean sd
#> 5.333333 1.527525