Perform a LEfSe analysis: the function carries out differential analysis between two sample classes for multiple features and uses linear discriminant analysis to establish their effect sizes. Subclass information for each class can be incorporated into the analysis (see examples). Features with large differences between two sample classes are identified as biomarkers.
lefser(
  relab,
  kruskal.threshold = 0.05,
  wilcox.threshold = 0.05,
  lda.threshold = 2,
  classCol = "CLASS",
  subclassCol = NULL,
  assay = 1L,
  trim.names = FALSE,
  checkAbundances = TRUE,
  method = "none",
  ...,
  expr,
  groupCol = "GROUP",
  blockCol = NULL
)
relab: A SummarizedExperiment with relative abundances in the assay.
kruskal.threshold: numeric(1) The p-value threshold for the Kruskal-Wallis Rank Sum Test (default 0.05). If multiple hypothesis testing is performed, this threshold is applied to corrected p-values.
wilcox.threshold: numeric(1) The p-value threshold for the Wilcoxon Rank-Sum Test when 'subclassCol' is present (default 0.05). If multiple hypothesis testing is performed, this threshold is applied to corrected p-values.
lda.threshold: numeric(1) The effect size threshold (default 2.0).
classCol: character(1) Column name in colData(relab) indicating class, usually a factor with two levels (e.g., c("cases", "controls"); default "CLASS").
subclassCol: character(1) Optional column name in colData(relab) indicating the subclasses, usually a factor with two levels (e.g., c("adult", "senior")), but more than two levels are allowed (default NULL).
assay: The i-th assay matrix in the SummarizedExperiment ('relab'; default 1).
trim.names: If TRUE, this function extracts the most specific taxonomic rank of organism (default FALSE).
checkAbundances: logical(1) Whether to check if the assay data in the relab input are relative abundances or counts. If counts are found, a warning will be emitted (default TRUE).
method: character(1) Passed on to p.adjust to select the multiple testing correction. For multiple pairwise comparisons, each comparison is adjusted separately. Options are "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr" (synonym for "BH"), and "none". Default is "none", as in the original LEfSe implementation.
expr: (DEFUNCT) Use relab instead. A SummarizedExperiment with relative abundances in the assay.
groupCol: (DEPRECATED) Column name in colData(relab) indicating groups, usually a factor with two levels (e.g., c("cases", "controls"); default "GROUP").
blockCol: (DEPRECATED) Optional column name in colData(relab) indicating the blocks, usually a factor with two levels (e.g., c("adult", "senior"); default NULL).
...: Additional inputs to lower level functions (not used).
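To make the interaction between the p-value threshold and the multiple testing option more concrete, the following base R sketch screens a toy matrix with one Kruskal-Wallis test per feature, adjusts the p-values with p.adjust, and keeps the features that pass. It is an illustration under stated assumptions, not lefser's internal code: the matrix `abund`, the `classes` factor, and the use of `<=` for the comparison are all hypothetical choices made for the example.

```r
# Illustrative sketch only: toy data, not lefser's internal implementation.
set.seed(1)

# Hypothetical abundance matrix: 4 features (rows) x 10 samples (columns)
abund <- matrix(runif(40), nrow = 4,
                dimnames = list(paste0("feature", 1:4), paste0("s", 1:10)))

# Shift feature1 upward in the second class so the classes separate clearly
abund["feature1", 6:10] <- abund["feature1", 6:10] + 5

# Hypothetical class labels: first 5 samples are cases, last 5 are controls
classes <- factor(rep(c("cases", "controls"), each = 5))

# One Kruskal-Wallis Rank Sum Test per feature
raw_p <- apply(abund, 1, function(x) kruskal.test(x, classes)$p.value)

# Adjust for multiple testing; method = "none" would leave raw_p unchanged
adj_p <- p.adjust(raw_p, method = "BH")

# Keep features whose adjusted p-value passes the threshold
# (the `<=` comparison is an assumption made for this sketch)
kruskal.threshold <- 0.05
kept <- names(adj_p)[adj_p <= kruskal.threshold]
print(kept)
```

With method = "none", the threshold would be compared against the raw p-values directly, which is the behaviour described as matching the original LEfSe implementation.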
The function returns a data.frame with two columns: the names of features and their LDA scores.
The LEfSe method expects relative abundances in the relab input. A warning will be emitted if the column sums do not result in 1. Use the relativeAb helper function to convert the data in the SummarizedExperiment to relative abundances. The checkAbundances argument enables checking the data for presence of relative abundances and can be turned off by setting the argument to FALSE.
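As a rough illustration of what "relative abundances" and the column-sum check mean, the base R sketch below converts a small count matrix so that each sample (column) sums to 1 and then verifies that property. This is not the relativeAb implementation, which operates on a SummarizedExperiment and may scale values differently; the matrix `counts` and the variable names are hypothetical.

```r
# Illustrative sketch in base R; this is not the relativeAb() implementation.

# Hypothetical count matrix: 3 taxa (rows) x 2 samples (columns)
counts <- matrix(c(10, 30, 60,
                    5,  5, 10),
                 nrow = 3,
                 dimnames = list(c("taxonA", "taxonB", "taxonC"),
                                 c("sample1", "sample2")))

# Divide each column by its total so that every sample sums to 1
rel <- sweep(counts, 2, colSums(counts), "/")

# Check of the kind described above: do all column sums equal 1?
sums_to_one <- isTRUE(all.equal(unname(colSums(rel)), rep(1, ncol(rel))))
print(sums_to_one)
```

For real data, use relativeAb on the SummarizedExperiment as described above rather than a manual conversion like this one.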
data(zeller14)
zeller14 <- zeller14[, zeller14$study_condition != "adenoma"]
tn <- get_terminal_nodes(rownames(zeller14))
zeller14tn <- zeller14[tn,]
zeller14tn_ra <- relativeAb(zeller14tn)
# (1) Using classes only
res_class <- lefser(zeller14tn_ra,
                    classCol = "study_condition")
#> The outcome variable is specified as 'study_condition' and the reference category is 'CRC'.
#> See `?factor` or `?relevel` to change the reference category.
# (2) Using classes and sub-classes
res_subclass <- lefser(zeller14tn_ra,
                       classCol = "study_condition",
                       subclassCol = "age_category")
#> The outcome variable is specified as 'study_condition' and the reference category is 'CRC'.
#> See `?factor` or `?relevel` to change the reference category.