This function evaluates the type I error rate of selected methods for enrichment analysis when applied to one or more expression datasets.
Usage
evalTypeIError(
methods,
exp.list,
gs,
alpha = 0.05,
ea.perm = 1000,
tI.perm = 1000,
perm.block.size = -1,
summarize = TRUE,
save2file = FALSE,
out.dir = NULL,
verbose = TRUE,
...
)
Arguments
- methods
Methods for enrichment analysis. This can be either
a character vector with method names chosen from
sbeaMethods
andnbeaMethods
,a user-defined function implementing a method for enrichment analysis, or
a named list, containing pre-defined and/or user-defined enrichment methods. See examples.
- exp.list
Experiment list. A
list
of datasets, each being of classSummarizedExperiment
.- gs
Gene sets, i.e. a list of character vectors of gene IDs.
- alpha
Numeric. Statistical significance level. Defaults to 0.05.
- ea.perm
Integer. Number of permutations of the sample group assignments during enrichment analysis. Defaults to 1000. Can also be an integer vector matching the length of 'methods' to assign different numbers of permutations for different methods.
- tI.perm
Integer. Number of permutations of the sample group assignments during type I error rate evaluation. Defaults to 1000. Can also be an integer vector matching the length of
methods
to assign different numbers of permutations for different methods.- perm.block.size
Integer. When running in parallel, splits
tI.perm
into blocks of the indicated size. Defaults to -1, which indicates to not partitiontI.perm
.- summarize
Logical. If
TRUE
(default) appliessummary
to the vector of type I error rates acrosstI.perm
permutations of the sample labels. UseFALSE
to return the full vector of type I error rates.- save2file
Logical. Should results be saved to file for subsequent benchmarking? Defaults to
FALSE
.- out.dir
Character. Determines the output directory where results are saved to. Defaults to
NULL
, which then writes totools::R_user_dir("GSEABenchmarkeR")
in casesave2file
is set toTRUE
.- verbose
Logical. Should progress be reported? Defaults to
TRUE
.- ...
Additional arguments passed to the selected enrichment methods.
Value
A list with an entry for each method applied. Each method entry is
a list with an entry for each dataset analyzed. Each dataset entry is either
a summary (summarize=TRUE
) or the full vector of type I error rates
(summarize=FALSE
) across tI.perm
permutations of the sample labels.
See also
sbea
and nbea
for carrying out set- and network-based enrichment analysis.
BiocParallelParam
and register
for
configuration of parallel computation.
Examples
# loading three datasets from the GEO2KEGG compendium
geo2kegg <- loadEData("geo2kegg", nr.datasets=3)
#> Loading GEO2KEGG data compendium ...
# only considering the first 1000 probes for demonstration
geo2kegg <- lapply(geo2kegg, function(d) d[1:1000,])
# preprocessing and DE analysis for two of the datasets
geo2kegg <- maPreproc(geo2kegg[2:3])
#> Summarizing probe level expression ...
geo2kegg <- runDE(geo2kegg)
# getting a subset of human KEGG gene sets
gs.file <- system.file("extdata", package="EnrichmentBrowser")
gs.file <- file.path(gs.file, "hsa_kegg_gs.gmt")
kegg.gs <- EnrichmentBrowser::getGenesets(gs.file)
# evaluating type I error rate of two methods on two datasets
# NOTE: using a small number of permutations for demonstration;
# for a meaningful evaluation tI.perm should be >= 1000
res <- evalTypeIError(geo2kegg, methods=c("ora",
"camera"), gs=kegg.gs, ea.perm=0, tI.perm=3)
#> ora
#> camera
# applying a user-defined enrichment method ...
# ... or a mix of pre-defined and user-defined methods
dummySBEA <- function(se, gs)
{
sig.ps <- sample(seq(0, 0.05, length=1000), 5)
nsig.ps <- sample(seq(0.1, 1, length=1000), length(gs)-5)
ps <- sample(c(sig.ps, nsig.ps), length(gs))
names(ps) <- names(gs)
return(ps)
}
methods <- list(camera = "camera", dummySBEA = dummySBEA)
res <- evalTypeIError(methods, geo2kegg, gs=kegg.gs, tI.perm=3)
#> camera
#> dummySBEA