Overview

This analysis:

  1. selects a subset of samples (all CRC-related in this example),
  2. uses the table1 package to display a table of the characteristics of the included cohort,
  3. sorts species in order of descending prevalence,
  4. uses the datatable function from the DT package to display a searchable, paged table of prevalences, and
  5. writes the prevalences to a CSV file.

Required packages:
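
The package-loading chunk is not shown on this page. As a minimal sketch, assuming the four packages that the code on this page calls (all are marked as attached in the session info at the end), the setup could look like this. The installation check is an addition for illustration, not part of the original analysis.

```r
# Packages used by the code on this page (assumed from the calls made
# and from the attached packages listed in the session info).
pkgs <- c("curatedMetagenomicData", "dplyr", "DT", "table1")

# Check which of them are installed before attaching anything.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
not_installed <- pkgs[!installed]

if (length(not_installed) > 0) {
  message("Install first: ", paste(not_installed, collapse = ", "))
} else {
  # Attach quietly; dplyr supplies filter(), arrange(), tibble() and %>%.
  suppressPackageStartupMessages({
    for (p in pkgs) library(p, character.only = TRUE)
  })
}
```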

Select samples

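Filter sampleMetadata to the CRC samples and retrieve their relative abundance profiles. Note that a few species lacking phylogenetic information are dropped from the returned object.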

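# Keep samples whose study condition is CRC, then fetch their
# relative abundance profiles with short row names
crc_subset <- filter(sampleMetadata, study_condition == "CRC") %>%
  returnSamples(dataType = "relative_abundance",
                rownames = "short")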
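
The cohort table in the next section lists compound disease labels such as CRC;T2D even though the selection is made on study_condition. A toy sketch in base R, using a hypothetical four-row metadata frame (these are invented rows, not real sampleMetadata entries), illustrates why: the filter tests only the study_condition column, so whatever is recorded in disease is carried along.

```r
# Hypothetical metadata; column names mirror those used on this page,
# but the rows are invented for illustration.
toy_meta <- data.frame(
  sample_id       = c("s1", "s2", "s3", "s4"),
  study_condition = c("CRC", "control", "CRC", "adenoma"),
  disease         = c("CRC", "healthy", "CRC;T2D", "adenoma"),
  stringsAsFactors = FALSE
)

# Base-R counterpart of filter(toy_meta, study_condition == "CRC");
# the two agree here because the toy data has no missing values.
toy_crc <- toy_meta[toy_meta$study_condition == "CRC", ]

print(toy_crc$sample_id)
print(toy_crc$disease)
```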

Cohort characteristics

Create a summary table of the participants in this cohort:

table1::table1( ~ disease + disease_subtype + age + gender + country + study_name,
                data = colData(crc_subset))
                                         Overall (N=701)
disease
  CRC                                    625 (89.2%)
  CRC;cholesterolemia                    1 (0.1%)
  CRC;fatty_liver                        3 (0.4%)
  CRC;fatty_liver;hypertension           12 (1.7%)
  CRC;hypercholesterolemia               3 (0.4%)
  CRC;hypercholesterolemia;hypertension  1 (0.1%)
  CRC;hypertension                       20 (2.9%)
  CRC;metastases                         1 (0.1%)
  CRC;T2D                                29 (4.1%)
  CRC;T2D;fatty_liver;hypertension       4 (0.6%)
  CRC;T2D;hypertension                   2 (0.3%)
disease_subtype
  adenocarcinoma                         59 (8.4%)
  carcinoma                              173 (24.7%)
  Missing                                469 (66.9%)
age
  Mean (SD)                              63.3 (11.0)
  Median [Min, Max]                      64.0 [28.0, 90.0]
gender
  female                                 257 (36.7%)
  male                                   444 (63.3%)
country
  AUT                                    46 (6.6%)
  CAN                                    2 (0.3%)
  CHN                                    74 (10.6%)
  DEU                                    60 (8.6%)
  FRA                                    53 (7.6%)
  IND                                    30 (4.3%)
  ITA                                    61 (8.7%)
  JPN                                    298 (42.5%)
  USA                                    77 (11.0%)
study_name
  FengQ_2015                             46 (6.6%)
  GuptaA_2019                            30 (4.3%)
  HanniganGD_2017                        27 (3.9%)
  ThomasAM_2018a                         29 (4.1%)
  ThomasAM_2018b                         32 (4.6%)
  ThomasAM_2019_c                        40 (5.7%)
  VogtmannE_2016                         52 (7.4%)
  WirbelJ_2018                           60 (8.6%)
  YachidaS_2019                          258 (36.8%)
  YuJ_2015                               74 (10.6%)
  ZellerG_2014                           53 (7.6%)
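
To show how the count and percentage cells for a categorical variable relate to the cohort size, here is a small base-R sketch that rebuilds the two gender rows from their counts. It is only an arithmetic cross-check under the assumption that each percentage is the count divided by N = 701; it does not call table1 itself.

```r
# Counts copied from the gender rows of the table above.
gender <- rep(c("female", "male"), times = c(257, 444))

counts  <- table(gender)
percent <- round(100 * prop.table(counts), 1)

# One "n (x%)" string per level, in the style of the table cells.
cells <- sprintf("%d (%.1f%%)", as.integer(counts), as.numeric(percent))
names(cells) <- names(counts)
print(cells)
```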

Show species in order of decreasing prevalence

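The prevalence of a species is the fraction of CRC samples in which its relative abundance is non-zero. Values are rounded to two significant digits, and species absent from every sample are omitted.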

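# Fraction of samples in which each species has non-zero relative abundance
prevalences <- rowSums(assay(crc_subset) > 0) / ncol(crc_subset)
# Round to 2 significant digits, drop species absent from every sample,
# and sort from most to least prevalent
prevalences <- tibble(species = names(prevalences), prevalence = signif(prevalences, 2)) %>%
  filter(prevalence > 0) %>%
  arrange(desc(prevalence))
# Display as a searchable, paged table
DT::datatable(prevalences)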
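
To make the prevalence calculation concrete, the following sketch applies the same steps to a hypothetical 3 x 4 abundance matrix in base R. The species and sample names are invented, and sort() stands in for the tibble and arrange() steps used above.

```r
# Hypothetical relative abundances: rows are species, columns are samples.
toy <- matrix(
  c(0.5, 0, 0.2, 0.1,   # sp_A: present in 3 of 4 samples
    0,   0, 0,   0,     # sp_B: never detected
    0.5, 1, 0.8, 0.9),  # sp_C: present in all 4 samples
  nrow = 3, byrow = TRUE,
  dimnames = list(c("sp_A", "sp_B", "sp_C"), paste0("s", 1:4))
)

# Prevalence = number of samples with non-zero abundance / number of samples.
toy_prev <- rowSums(toy > 0) / ncol(toy)

# Drop species that never occur, then order by decreasing prevalence.
toy_prev <- sort(toy_prev[toy_prev > 0], decreasing = TRUE)
print(toy_prev)
```

Here sp_C has prevalence 1, sp_A has 0.75, and sp_B is dropped because it is never detected.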

Write to disk:

write.csv(prevalences, row.names = FALSE, file = "prevalences.csv")
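
As a quick sanity check on the export step, the sketch below writes a hypothetical two-row table of the same shape to a temporary file and reads it back. The toy values and the temporary path are assumptions for illustration; the real analysis writes to prevalences.csv as shown above.

```r
# Hypothetical two-row result in the same shape as `prevalences`.
toy_out <- data.frame(
  species    = c("sp_C", "sp_A"),
  prevalence = c(1, 0.75),
  stringsAsFactors = FALSE
)

# Write to a temporary file so nothing in the working directory is touched.
out_file <- tempfile(fileext = ".csv")
write.csv(toy_out, file = out_file, row.names = FALSE)

# Read it back and confirm that the columns and values survive.
round_trip <- read.csv(out_file, stringsAsFactors = FALSE)
print(all.equal(toy_out, round_trip))
```

Setting row.names = FALSE keeps the row index from being written as an extra, unnamed column.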

Download the zipped prevalences file produced by curatedMetagenomicData version 3.2.3: prevalences.csv.zip

Session Info

sessioninfo::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.5.1 (2025-06-13)
##  os       Ubuntu 24.04.2 LTS
##  system   x86_64, linux-gnu
##  ui       X11
##  language en
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       UTC
##  date     2025-08-27
##  pandoc   3.7.0.2 @ /usr/bin/ (via rmarkdown)
##  quarto   1.6.42 @ /usr/local/bin/quarto
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package                  * version date (UTC) lib source
##  abind                      1.4-8   2024-09-12 [1] RSPM (R 4.5.0)
##  AnnotationDbi              1.70.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  AnnotationHub              3.16.1  2025-07-23 [1] Bioconductor 3.21 (R 4.5.1)
##  ape                        5.8-1   2024-12-16 [1] RSPM (R 4.5.0)
##  beachmat                   2.24.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  beeswarm                   0.4.0   2021-06-01 [1] RSPM (R 4.5.0)
##  Biobase                  * 2.68.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  BiocBaseUtils              1.10.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  BiocFileCache              2.16.1  2025-07-23 [1] Bioconductor 3.21 (R 4.5.1)
##  BiocGenerics             * 0.54.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  BiocManager                1.30.26 2025-06-05 [2] CRAN (R 4.5.1)
##  BiocNeighbors              2.2.0   2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  BiocParallel               1.42.1  2025-06-01 [1] Bioconductor 3.21 (R 4.5.1)
##  BiocSingular               1.24.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  BiocStyle                * 2.36.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  BiocVersion                3.21.1  2024-10-29 [2] Bioconductor 3.21 (R 4.5.1)
##  Biostrings               * 2.76.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  bit                        4.6.0   2025-03-06 [1] RSPM (R 4.5.0)
##  bit64                      4.6.0-1 2025-01-16 [1] RSPM (R 4.5.0)
##  blob                       1.2.4   2023-03-17 [1] RSPM (R 4.5.0)
##  bluster                    1.18.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  bookdown                   0.44    2025-08-21 [1] RSPM (R 4.5.0)
##  bslib                      0.9.0   2025-01-30 [2] RSPM (R 4.5.0)
##  cachem                     1.1.0   2024-05-16 [2] RSPM (R 4.5.0)
##  cellranger                 1.1.0   2016-07-27 [1] RSPM (R 4.5.0)
##  cli                        3.6.5   2025-04-23 [2] RSPM (R 4.5.0)
##  cluster                    2.1.8.1 2025-03-12 [3] CRAN (R 4.5.1)
##  codetools                  0.2-20  2024-03-31 [3] CRAN (R 4.5.1)
##  crayon                     1.5.3   2024-06-20 [2] RSPM (R 4.5.0)
##  crosstalk                  1.2.2   2025-08-26 [1] RSPM (R 4.5.0)
##  curatedMetagenomicData   * 3.16.1  2025-04-22 [1] Bioconductor 3.21 (R 4.5.0)
##  curl                       7.0.0   2025-08-19 [2] RSPM (R 4.5.0)
##  DBI                        1.2.3   2024-06-02 [1] RSPM (R 4.5.0)
##  dbplyr                     2.5.0   2024-03-19 [1] RSPM (R 4.5.0)
##  DECIPHER                   3.4.0   2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  decontam                   1.28.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  DelayedArray               0.34.1  2025-04-17 [1] Bioconductor 3.21 (R 4.5.0)
##  DelayedMatrixStats         1.30.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  desc                       1.4.3   2023-12-10 [2] RSPM (R 4.5.0)
##  digest                     0.6.37  2024-08-19 [2] RSPM (R 4.5.0)
##  DirichletMultinomial       1.50.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  dplyr                    * 1.1.4   2023-11-17 [1] RSPM (R 4.5.0)
##  DT                       * 0.33    2024-04-04 [1] RSPM (R 4.5.0)
##  emmeans                    1.11.2  2025-07-11 [1] RSPM (R 4.5.0)
##  estimability               1.5.1   2024-05-12 [1] RSPM (R 4.5.0)
##  evaluate                   1.0.4   2025-06-18 [2] RSPM (R 4.5.0)
##  ExperimentHub              2.16.1  2025-07-23 [1] Bioconductor 3.21 (R 4.5.1)
##  farver                     2.1.2   2024-05-13 [1] RSPM (R 4.5.0)
##  fastmap                    1.2.0   2024-05-15 [2] RSPM (R 4.5.0)
##  filelock                   1.0.3   2023-12-11 [1] RSPM (R 4.5.0)
##  fillpattern                1.0.2   2024-06-24 [1] RSPM (R 4.5.0)
##  Formula                    1.2-5   2023-02-24 [1] RSPM (R 4.5.0)
##  fs                         1.6.6   2025-04-12 [2] RSPM (R 4.5.0)
##  generics                 * 0.1.4   2025-05-09 [1] RSPM (R 4.5.0)
##  GenomeInfoDb             * 1.44.2  2025-08-20 [1] Bioconductor 3.21 (R 4.5.1)
##  GenomeInfoDbData           1.2.14  2025-05-24 [1] Bioconductor
##  GenomicRanges            * 1.60.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  ggbeeswarm                 0.7.2   2023-04-29 [1] RSPM (R 4.5.0)
##  ggnewscale                 0.5.2   2025-06-20 [1] RSPM (R 4.5.0)
##  ggplot2                    3.5.2   2025-04-09 [1] RSPM (R 4.5.0)
##  ggrepel                    0.9.6   2024-09-07 [1] RSPM (R 4.5.0)
##  ggtext                     0.1.2   2022-09-16 [1] RSPM (R 4.5.0)
##  glue                       1.8.0   2024-09-30 [2] RSPM (R 4.5.0)
##  gridExtra                  2.3     2017-09-09 [1] RSPM (R 4.5.0)
##  gridtext                   0.1.5   2022-09-16 [1] RSPM (R 4.5.0)
##  gtable                     0.3.6   2024-10-25 [1] RSPM (R 4.5.0)
##  hms                        1.1.3   2023-03-21 [1] RSPM (R 4.5.0)
##  htmltools                  0.5.8.1 2024-04-04 [2] RSPM (R 4.5.0)
##  htmlwidgets                1.6.4   2023-12-06 [2] RSPM (R 4.5.0)
##  httr                       1.4.7   2023-08-15 [1] RSPM (R 4.5.0)
##  igraph                     2.1.4   2025-01-23 [1] RSPM (R 4.5.0)
##  IRanges                  * 2.42.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  irlba                      2.3.5.1 2022-10-03 [1] RSPM (R 4.5.0)
##  jquerylib                  0.1.4   2021-04-26 [2] RSPM (R 4.5.0)
##  jsonlite                   2.0.0   2025-03-27 [2] RSPM (R 4.5.0)
##  KEGGREST                   1.48.1  2025-06-22 [1] Bioconductor 3.21 (R 4.5.1)
##  knitr                      1.50    2025-03-16 [2] RSPM (R 4.5.0)
##  lattice                    0.22-7  2025-04-02 [3] CRAN (R 4.5.1)
##  lazyeval                   0.2.2   2019-03-15 [1] RSPM (R 4.5.0)
##  lifecycle                  1.0.4   2023-11-07 [2] RSPM (R 4.5.0)
##  magrittr                   2.0.3   2022-03-30 [2] RSPM (R 4.5.0)
##  MASS                       7.3-65  2025-02-28 [3] CRAN (R 4.5.1)
##  Matrix                     1.7-3   2025-03-11 [3] CRAN (R 4.5.1)
##  MatrixGenerics           * 1.20.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  matrixStats              * 1.5.0   2025-01-07 [1] RSPM (R 4.5.0)
##  memoise                    2.0.1   2021-11-26 [2] RSPM (R 4.5.0)
##  mgcv                       1.9-3   2025-04-04 [3] CRAN (R 4.5.1)
##  mia                        1.16.1  2025-07-13 [1] Bioconductor 3.21 (R 4.5.1)
##  mime                       0.13    2025-03-17 [2] RSPM (R 4.5.0)
##  multcomp                   1.4-28  2025-01-29 [1] RSPM (R 4.5.0)
##  MultiAssayExperiment       1.34.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  mvtnorm                    1.3-3   2025-01-10 [1] RSPM (R 4.5.0)
##  nlme                       3.1-168 2025-03-31 [3] CRAN (R 4.5.1)
##  parallelly                 1.45.1  2025-07-24 [1] RSPM (R 4.5.0)
##  patchwork                  1.3.2   2025-08-25 [1] RSPM (R 4.5.0)
##  permute                    0.9-8   2025-06-25 [1] RSPM (R 4.5.0)
##  pillar                     1.11.0  2025-07-04 [2] RSPM (R 4.5.0)
##  pkgconfig                  2.0.3   2019-09-22 [2] RSPM (R 4.5.0)
##  pkgdown                    2.1.3   2025-05-25 [2] RSPM (R 4.5.0)
##  plyr                       1.8.9   2023-10-02 [1] RSPM (R 4.5.0)
##  png                        0.1-8   2022-11-29 [1] RSPM (R 4.5.0)
##  purrr                      1.1.0   2025-07-10 [2] RSPM (R 4.5.0)
##  R6                         2.6.1   2025-02-15 [2] RSPM (R 4.5.0)
##  ragg                       1.4.0   2025-04-10 [2] RSPM (R 4.5.0)
##  rappdirs                   0.3.3   2021-01-31 [2] RSPM (R 4.5.0)
##  rbiom                      2.2.1   2025-06-27 [1] RSPM (R 4.5.0)
##  RColorBrewer               1.1-3   2022-04-03 [1] RSPM (R 4.5.0)
##  Rcpp                       1.1.0   2025-07-02 [2] RSPM (R 4.5.0)
##  readr                      2.1.5   2024-01-10 [1] RSPM (R 4.5.0)
##  readxl                     1.4.5   2025-03-07 [1] RSPM (R 4.5.0)
##  reshape2                   1.4.4   2020-04-09 [1] RSPM (R 4.5.0)
##  rlang                      1.1.6   2025-04-11 [2] RSPM (R 4.5.0)
##  rmarkdown                  2.29    2024-11-04 [2] RSPM (R 4.5.0)
##  RSQLite                    2.4.3   2025-08-20 [1] RSPM (R 4.5.0)
##  rsvd                       1.0.5   2021-04-16 [1] RSPM (R 4.5.0)
##  S4Arrays                   1.8.1   2025-06-01 [1] Bioconductor 3.21 (R 4.5.1)
##  S4Vectors                * 0.46.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  sandwich                   3.1-1   2024-09-15 [1] RSPM (R 4.5.0)
##  sass                       0.4.10  2025-04-11 [2] RSPM (R 4.5.0)
##  ScaledMatrix               1.16.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  scales                     1.4.0   2025-04-24 [1] RSPM (R 4.5.0)
##  scater                     1.36.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  scuttle                    1.18.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  sessioninfo                1.2.3   2025-02-05 [2] RSPM (R 4.5.0)
##  SingleCellExperiment     * 1.30.1  2025-05-07 [1] Bioconductor 3.21 (R 4.5.0)
##  slam                       0.1-55  2024-11-13 [1] RSPM (R 4.5.0)
##  SparseArray                1.8.1   2025-07-23 [1] Bioconductor 3.21 (R 4.5.1)
##  sparseMatrixStats          1.20.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  stringi                    1.8.7   2025-03-27 [2] RSPM (R 4.5.0)
##  stringr                    1.5.1   2023-11-14 [2] RSPM (R 4.5.0)
##  SummarizedExperiment     * 1.38.1  2025-04-30 [1] Bioconductor 3.21 (R 4.5.0)
##  survival                   3.8-3   2024-12-17 [3] CRAN (R 4.5.1)
##  systemfonts                1.2.3   2025-04-30 [2] RSPM (R 4.5.0)
##  table1                   * 1.4.3   2023-01-06 [1] RSPM (R 4.5.0)
##  textshaping                1.0.1   2025-05-01 [2] RSPM (R 4.5.0)
##  TH.data                    1.1-3   2025-01-17 [1] RSPM (R 4.5.0)
##  tibble                     3.3.0   2025-06-08 [2] RSPM (R 4.5.0)
##  tidyr                      1.3.1   2024-01-24 [1] RSPM (R 4.5.0)
##  tidyselect                 1.2.1   2024-03-11 [1] RSPM (R 4.5.0)
##  tidytree                   0.4.6   2023-12-12 [1] RSPM (R 4.5.0)
##  treeio                     1.32.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  TreeSummarizedExperiment * 2.16.1  2025-05-11 [1] Bioconductor 3.21 (R 4.5.0)
##  tzdb                       0.5.0   2025-03-15 [1] RSPM (R 4.5.0)
##  UCSC.utils                 1.4.0   2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  vctrs                      0.6.5   2023-12-01 [2] RSPM (R 4.5.0)
##  vegan                      2.7-1   2025-06-05 [1] RSPM (R 4.5.0)
##  vipor                      0.4.7   2023-12-18 [1] RSPM (R 4.5.0)
##  viridis                    0.6.5   2024-01-29 [1] RSPM (R 4.5.0)
##  viridisLite                0.4.2   2023-05-02 [1] RSPM (R 4.5.0)
##  withr                      3.0.2   2024-10-28 [2] RSPM (R 4.5.0)
##  xfun                       0.53    2025-08-19 [2] RSPM (R 4.5.0)
##  xml2                       1.4.0   2025-08-20 [2] RSPM (R 4.5.0)
##  xtable                     1.8-4   2019-04-21 [2] RSPM (R 4.5.0)
##  XVector                  * 0.48.0  2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
##  yaml                       2.3.10  2024-07-26 [2] RSPM (R 4.5.0)
##  yulab.utils                0.2.1   2025-08-19 [1] RSPM (R 4.5.0)
##  zoo                        1.8-14  2025-04-10 [1] RSPM (R 4.5.0)
## 
##  [1] /__w/_temp/Library
##  [2] /usr/local/lib/R/site-library
##  [3] /usr/local/lib/R/library
##  * ── Packages attached to the search path.
## 
## ──────────────────────────────────────────────────────────────────────────────