This function evaluates gene set rankings obtained by applying enrichment methods to multiple datasets, where each dataset investigates a certain phenotype such as a disease. Given pre-defined phenotype relevance scores for the gene sets, indicating how important a gene set is for the investigated phenotype (e.g. as judged by evidence from the literature), it assesses whether enrichment methods produce gene set rankings in which phenotype-relevant gene sets accumulate at the top.
Usage
evalRelevance(
ea.ranks,
rel.ranks,
data2pheno,
method = "wsum",
top = 0,
rel.thresh = 0,
...
)
compOpt(rel.ranks, gs.ids, data2pheno = NULL, top = 0)
compRand(rel.ranks, gs.ids, data2pheno = NULL, perm = 1000)
Arguments
- ea.ranks
Enrichment analysis rankings. A list with an entry for each enrichment method applied. Each entry is a list that stores, for each dataset analyzed, the gene set ranking obtained from applying the respective method to the respective dataset. Gene set rankings are assumed to be of class DataFrame, in which gene sets (required column named GENE.SET) are ranked according to a ranking measure such as a gene set p-value (required column named PVAL). See gsRanking for an example.
- rel.ranks
Relevance score rankings. A list with an entry for each phenotype investigated. Each entry should be a DataFrame in which gene sets (rownames are assumed to be gene set IDs) are ranked according to a phenotype relevance score (required column REL.SCORE).
- data2pheno
A named character vector where the names correspond to dataset IDs and the elements of the vector to the corresponding phenotypes investigated.
- method
Character. Determines how the relevance score is summarized across the enrichment analysis ranking. Choose "wsum" (default) to compute a weighted sum of the relevance scores, "auc" to perform a ROC/AUC analysis, or "cor" to compute a correlation. This can also be a user-defined function for customized behavior. See Details.
- top
Integer. If top is non-zero, the evaluation is restricted to the first top gene sets of each enrichment analysis ranking. Defaults to 0, which evaluates the full ranking. If used with method="auc", it defines the number of gene sets at the top of the relevance ranking that are considered relevant (true positives).
- rel.thresh
Numeric. Relevance score threshold. Restricts relevance score rankings (argument rel.ranks) to gene sets exceeding the threshold in the REL.SCORE column.
- ...
Additional arguments for computation of the relevance measure as defined by the method argument. For method="wsum", this includes:
  - perc: Logical. Should observed scores be returned as-is or as a *perc*entage of the respective optimal score? Percentages of the optimal score are typically easier to interpret and are comparable between datasets / phenotypes. Defaults to TRUE.
  - rand: Logical. Should gene set rankings be randomized to assess how likely it is to observe a score equal to or greater than the respective obtained score? Defaults to FALSE.
- gs.ids
Character vector of gene set IDs on which enrichment analysis has been carried out.
- perm
Integer. Number of permutations if rand is set to TRUE.
Value
A numeric matrix (rows = datasets, columns = methods) storing in each cell the chosen relevance measure (score, AUC, cor) obtained from applying the respective enrichment method to the respective expression dataset.
Details
The function evalRelevance evaluates the similarity of a gene set ranking obtained from enrichment analysis and a gene set ranking based on phenotype relevance scores. To do so, the function first transforms the ranks 'r' from the enrichment analysis to weights 'w' in [0,1] via w = 1 - r / N, where 'N' denotes the total number of gene sets on which the enrichment analysis has been carried out. These weights are then multiplied by the corresponding relevance scores and summed up.
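The weighting scheme can be illustrated with a minimal base R sketch. The gene set IDs, the scores, and the helper name wsum_sketch are hypothetical and chosen for illustration only; the package's internal implementation may differ in details such as tie handling or the treatment of gene sets without a relevance score.

```r
# Hypothetical helper (not part of the package): weighted sum of relevance scores
# ea_ids:     gene set IDs ordered by the enrichment analysis ranking (best first)
# rel_scores: named numeric vector of relevance scores (names = gene set IDs)
wsum_sketch <- function(ea_ids, rel_scores) {
    N <- length(ea_ids)
    r <- seq_len(N)                 # ranks 1..N
    w <- 1 - r / N                  # weights w = 1 - r / N
    names(w) <- ea_ids
    # only gene sets with an annotated relevance score contribute
    common <- intersect(names(rel_scores), ea_ids)
    sum(w[common] * rel_scores[common])
}

ea_ids <- c("gs1", "gs2", "gs3", "gs4")   # made-up enrichment ranking
rel_scores <- c(gs2 = 10, gs4 = 5)        # made-up relevance scores
score <- wsum_sketch(ea_ids, rel_scores)  # 0.5 * 10 + 0 * 5 = 5
score
```

Note that under this formula the gene set ranked last receives a weight of zero, so it does not contribute to the score regardless of its relevance.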
The function compOpt applies evalRelevance to the theoretically optimal case in which the enrichment analysis ranking is identical to the relevance score ranking. The ratio between observed and optimal score is useful for comparing observed scores between datasets / phenotypes.
The function compRand repeatedly applies evalRelevance to random rankings obtained from placing the gene sets randomly along the ranking, thereby assessing how likely it is to observe a score equal to or greater than the one obtained.
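The optimal and random reference points can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: wsum_sketch is a hypothetical helper implementing w = 1 - r / N, and all gene set IDs and relevance scores are made up.

```r
# Hypothetical helper (not part of the package): weighted sum with w = 1 - r / N
wsum_sketch <- function(ea_ids, rel_scores) {
    w <- 1 - seq_along(ea_ids) / length(ea_ids)
    names(w) <- ea_ids
    common <- intersect(names(rel_scores), ea_ids)
    sum(w[common] * rel_scores[common])
}

gs_ids <- paste0("gs", 1:20)                  # made-up gene set IDs
rel_scores <- c(gs3 = 9, gs7 = 6, gs12 = 2)   # made-up relevance scores

# observed ranking (here simply gs1, ..., gs20)
obs <- wsum_sketch(gs_ids, rel_scores)

# optimal case: relevant gene sets first, ordered by decreasing relevance score
opt_order <- c(names(sort(rel_scores, decreasing = TRUE)),
               setdiff(gs_ids, names(rel_scores)))
opt <- wsum_sketch(opt_order, rel_scores)
obs / opt * 100                               # observed score as percentage of the optimum

# random case: place the gene sets randomly along the ranking
set.seed(42)
rand <- replicate(1000, wsum_sketch(sample(gs_ids), rel_scores))
# share of random rankings scoring at least as high as the observed one
mean(rand >= obs)
```

Expressing the observed score relative to the optimum is what makes scores comparable across phenotypes with different numbers of annotated gene sets, while the share of random rankings reaching the observed score gives an empirical indication of how surprising that score is.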
It is also possible to use other measures for summarizing the phenotype relevance, instead of calculating weighted sums of relevance scores (argument method="wsum", default).
One possibility is to treat the comparison of the EA ranking and the relevance ranking as a classification problem, and to compute standard classification performance measures such as the area under the ROC curve (method="auc"). However, this requires dividing the relevance rankings (argument rel.ranks) into relevant (true positives) and irrelevant (true negatives) gene sets using the top argument.
Instead of method="auc", this can also be any other performance measure that the ROCR package (https://rocr.bioinf.mpi-sb.mpg.de) implements, for example method="tnr" to calculate the true negative rate. Although such classification performance measures are easy to interpret, the weighted sum has certain preferable properties, such as avoiding thresholding and accounting for varying degrees of relevance in the relevance rankings.
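The thresholding step behind the classification view can be sketched in base R. This sketch deliberately does not call ROCR; it uses the rank-based (Mann-Whitney) formulation of the AUC instead, with made-up gene set IDs, so it illustrates the idea rather than the package's actual code path.

```r
# Made-up example: 10 gene sets ordered by an enrichment analysis ranking (best first)
ea_ids <- paste0("gs", 1:10)
# made-up relevance ranking (decreasing relevance)
rel_ids <- c("gs2", "gs5", "gs1", "gs9")
top <- 2    # the top 2 gene sets of the relevance ranking count as relevant

is_relevant <- ea_ids %in% head(rel_ids, top)
# score each gene set so that higher = ranked better by the enrichment method
score <- rev(seq_along(ea_ids))

# rank-based AUC: probability that a relevant gene set is ranked
# above an irrelevant one in the enrichment analysis ranking
n_pos <- sum(is_relevant)
n_neg <- sum(!is_relevant)
auc <- (sum(rank(score)[is_relevant]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
auc   # 12 of the 16 relevant/irrelevant pairs are ordered correctly
```

The sketch also makes the drawback visible: gs1 and gs9 carry a relevance score but are treated exactly like unannotated gene sets once the cut at top is applied.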
It is also possible to compute a standard rank-based correlation measure such as Spearman's correlation (method="cor") to assess the similarity of the enrichment analysis rankings and the relevance rankings. However, this might not be optimal when comparing an EA ranking that covers the full gene set vector against the typically much smaller vector of gene sets for which a relevance score is annotated. In this scenario, rank correlation reduces the question to "does a subset of the EA ranking preserve the order of the relevance ranking", whereas our question of interest is rather "is a subset of the relevant gene sets ranked highly in the EA ranking".
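The caveat about rank correlation can be made concrete with a small base R sketch. All IDs and scores are made up, and restricting the comparison to the annotated gene sets is an assumption about how such a correlation would typically be set up, not a description of the package internals.

```r
ea_ids <- paste0("gs", 1:10)                  # made-up EA ranking, best first
rel_scores <- c(gs8 = 9, gs9 = 6, gs10 = 2)   # made-up relevance scores

# restrict the EA ranking to gene sets with an annotated relevance score
common <- intersect(ea_ids, names(rel_scores))
ea_rank <- match(common, ea_ids)              # positions 8, 9, 10 in the EA ranking

# negate the scores so that both vectors increase from best to worst
rho <- cor(ea_rank, -rel_scores[common], method = "spearman")
rho
```

Here the correlation is perfect because the relative order of the three annotated gene sets is preserved, even though all of them sit at the very bottom of the enrichment analysis ranking.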
See also
runEA to apply enrichment methods to multiple datasets;
readResults to read saved rankings as an input for the eval-functions.
Examples
#
# (1) simulated setup: 1 enrichment method applied to 1 dataset
#
# simulate gene set ranking
ea.ranks <- EnrichmentBrowser::makeExampleData("ea.res")
ea.ranks <- EnrichmentBrowser::gsRanking(ea.ranks, signif.only=FALSE)
# simulated relevance score ranking
rel.ranks <- ea.ranks
rel.ranks[,2] <- runif(nrow(ea.ranks), min=1, max=100)
colnames(rel.ranks)[2] <- "REL.SCORE"
rownames(rel.ranks) <- rel.ranks[,"GENE.SET"]
ind <- order(rel.ranks[,"REL.SCORE"], decreasing=TRUE)
rel.ranks <- rel.ranks[ind,]
# evaluate
evalRelevance(ea.ranks, rel.ranks)
#> [1] 259.8351
compOpt(rel.ranks, ea.ranks[,"GENE.SET"])
#> [1] 327.054
compRand(rel.ranks, ea.ranks[,"GENE.SET"], perm=3)
#> [1] 248.7087 274.2963 184.4197
#
# (2) simulated setup: 2 methods & 2 datasets
#
methods <- paste0("m", 1:2)
data.ids <- paste0("d", 1:2)
# simulate gene set rankings
ea.ranks <- sapply(methods, function(m)
    sapply(data.ids,
        function(d)
        {
            r <- EnrichmentBrowser::makeExampleData("ea.res")
            r <- EnrichmentBrowser::gsRanking(r, signif.only=FALSE)
            return(r)
        }, simplify=FALSE),
    simplify=FALSE)
# simulate a mapping from datasets to disease codes
d2d <- c("ALZ", "BRCA")
names(d2d) <- data.ids
# simulate relevance score rankings
rel.ranks <- lapply(ea.ranks[[1]],
    function(rr)
    {
        rr[,2] <- runif(nrow(rr), min=1, max=100)
        colnames(rr)[2] <- "REL.SCORE"
        rownames(rr) <- rr[,"GENE.SET"]
        ind <- order(rr[,"REL.SCORE"], decreasing=TRUE)
        rr <- rr[ind,]
        return(rr)
    })
names(rel.ranks) <- unname(d2d)
# evaluate
evalRelevance(ea.ranks, rel.ranks, d2d)
#>          m1       m2
#> d1 69.66977 69.71445
#> d2 83.88508 90.93625