Overview

This analysis:

  1. selects a subset of samples (all CRC-related in this example),
  2. uses the table1 package to display a table of the characteristics of the included cohort,
  3. sorts species in order of descending prevalence,
  4. uses the datatable function from the DT package to display a searchable, paged table of prevalences, and
  5. writes the prevalences to a CSV file.

Required packages:
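
The package-loading code appears to be missing from this rendering. As a sketch, assuming the analysis loads the packages that the Session Info at the end of this page marks as attached, it may have looked something like this:

```r
# Assumed reconstruction, not the original chunk: these four packages are
# marked as attached (*) in the Session Info and are the ones the code uses.
library(curatedMetagenomicData)  # sampleMetadata, returnSamples()
library(dplyr)                   # filter(), arrange(), tibble(), %>%
library(table1)                  # cohort summary table
library(DT)                      # searchable, paged HTML table
```

The code on this page calls table1 and DT with the `::` prefix, so strictly they only need to be installed; attaching them here is an assumption.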

Select samples

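Filter sampleMetadata to the samples whose study_condition is "CRC", then retrieve their relative abundance profiles with short species names as row names. Note that a few species without phylogenetic information are lost in this step.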

crc_subset <- filter(sampleMetadata, study_condition == "CRC") %>%
  returnSamples(dataType = "relative_abundance",
                rownames = "short")

Cohort characteristics

Create a summary table of the participants in this cohort:

table1::table1( ~ disease + disease_subtype + age + gender + country + study_name,
                data = colData(crc_subset))
                                          Overall (N=702)
disease
  CRC                                     626 (89.2%)
  CRC;cholesterolemia                     1 (0.1%)
  CRC;fatty_liver                         3 (0.4%)
  CRC;fatty_liver;hypertension            12 (1.7%)
  CRC;hypercholesterolemia                3 (0.4%)
  CRC;hypercholesterolemia;hypertension   1 (0.1%)
  CRC;hypertension                        20 (2.8%)
  CRC;metastases                          1 (0.1%)
  CRC;T2D                                 29 (4.1%)
  CRC;T2D;fatty_liver;hypertension        4 (0.6%)
  CRC;T2D;hypertension                    2 (0.3%)
disease_subtype
  adenocarcinoma                          59 (8.4%)
  carcinoma                               174 (24.8%)
  Missing                                 469 (66.8%)
age
  Mean (SD)                               63.3 (11.0)
  Median [Min, Max]                       64.0 [28.0, 90.0]
gender
  female                                  258 (36.8%)
  male                                    444 (63.2%)
country
  AUT                                     46 (6.6%)
  CAN                                     2 (0.3%)
  CHN                                     75 (10.7%)
  DEU                                     60 (8.5%)
  FRA                                     53 (7.5%)
  IND                                     30 (4.3%)
  ITA                                     61 (8.7%)
  JPN                                     298 (42.5%)
  USA                                     77 (11.0%)
study_name
  FengQ_2015                              46 (6.6%)
  GuptaA_2019                             30 (4.3%)
  HanniganGD_2017                         27 (3.8%)
  ThomasAM_2018a                          29 (4.1%)
  ThomasAM_2018b                          32 (4.6%)
  ThomasAM_2019_c                         40 (5.7%)
  VogtmannE_2016                          52 (7.4%)
  WirbelJ_2018                            60 (8.5%)
  YachidaS_2019                           258 (36.8%)
  YuJ_2015                                75 (10.7%)
  ZellerG_2014                            53 (7.5%)

Show species in order of decreasing prevalence

The prevalence of a species is the fraction of specimens from CRC patients in which it has non-zero relative abundance.

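# Fraction of samples in which each species has non-zero relative abundance
prevalences <- rowSums(assay(crc_subset) > 0) / ncol(crc_subset)
# Round to 2 significant digits, drop species never observed, sort descending
prevalences <- tibble(species = names(prevalences), prevalence = signif(prevalences, 2)) %>%
  filter(prevalence > 0) %>%
  arrange(desc(prevalence))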
DT::datatable(prevalences)
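
To make the prevalence calculation concrete, here is a minimal base-R sketch on a made-up matrix. The object names (toy_abundance, toy_prevalence) and all values are hypothetical and not part of curatedMetagenomicData; only the arithmetic mirrors the code above.

```r
# Toy relative-abundance matrix: rows are species, columns are samples.
# All names and values are invented for illustration.
toy_abundance <- matrix(
  c(0.5, 0.2, 0.0, 0.1,   # species_a: present in 3 of 4 samples
    0.0, 0.0, 0.0, 0.0,   # species_b: never observed
    0.5, 0.8, 1.0, 0.9),  # species_c: present in all 4 samples
  nrow = 3, byrow = TRUE,
  dimnames = list(c("species_a", "species_b", "species_c"),
                  paste0("sample_", 1:4))
)

# Fraction of samples in which each species has non-zero abundance.
toy_prevalence <- rowSums(toy_abundance > 0) / ncol(toy_abundance)

# Drop species that were never observed, then sort by descending prevalence.
toy_prevalence <- sort(toy_prevalence[toy_prevalence > 0], decreasing = TRUE)
toy_prevalence
```

Here species_c comes first with prevalence 1, species_a follows with 0.75, and species_b is dropped because it never appears.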

Write to disk:

write.csv(prevalences, row.names = FALSE, file = "prevalences.csv")
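
As a quick sanity check on the export step, the sketch below writes a small stand-in table to a temporary file and reads it back. The toy data frame and variable names are hypothetical; with row.names = FALSE the file holds only the two data columns, so it should round-trip unchanged.

```r
# Hypothetical two-row table standing in for `prevalences`.
toy <- data.frame(species = c("species_c", "species_a"),
                  prevalence = c(1, 0.75),
                  stringsAsFactors = FALSE)

# Write to a temporary file so nothing in the working directory is touched.
toy_path <- tempfile(fileext = ".csv")
write.csv(toy, file = toy_path, row.names = FALSE)

# Read the file back and compare with what was written.
toy_back <- read.csv(toy_path, stringsAsFactors = FALSE)
names(toy_back)
all.equal(toy, toy_back)
```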

Download the zipped prevalences file produced by curatedMetagenomicData version 3.2.3: prevalences.csv.zip

Session Info

sessioninfo::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.1.3 (2022-03-10)
##  os       Ubuntu 20.04.4 LTS
##  system   x86_64, linux-gnu
##  ui       X11
##  language en
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       UTC
##  date     2022-10-19
##  pandoc   2.17.1.1 @ /usr/local/bin/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package                  * version  date (UTC) lib source
##  AnnotationDbi              1.56.2   2021-11-09 [1] Bioconductor
##  AnnotationHub              3.2.2    2022-03-01 [1] Bioconductor
##  ape                        5.6-2    2022-03-02 [1] CRAN (R 4.1.3)
##  assertthat                 0.2.1    2019-03-21 [1] CRAN (R 4.1.3)
##  beachmat                   2.10.0   2021-10-26 [1] Bioconductor
##  beeswarm                   0.4.0    2021-06-01 [1] CRAN (R 4.1.3)
##  Biobase                  * 2.54.0   2021-10-26 [1] Bioconductor
##  BiocFileCache              2.2.1    2022-01-23 [1] Bioconductor
##  BiocGenerics             * 0.40.0   2021-10-26 [1] Bioconductor
##  BiocManager                1.30.18  2022-05-18 [1] RSPM (R 4.1.0)
##  BiocNeighbors              1.12.0   2021-10-26 [1] Bioconductor
##  BiocParallel               1.28.3   2021-12-09 [1] Bioconductor
##  BiocSingular               1.10.0   2021-10-26 [1] Bioconductor
##  BiocStyle                * 2.22.0   2021-10-26 [1] Bioconductor
##  BiocVersion                3.14.0   2021-05-19 [2] Bioconductor
##  Biostrings               * 2.62.0   2021-10-26 [1] Bioconductor
##  bit                        4.0.4    2020-08-04 [1] CRAN (R 4.1.3)
##  bit64                      4.0.5    2020-08-30 [1] CRAN (R 4.1.3)
##  bitops                     1.0-7    2021-04-24 [1] CRAN (R 4.1.3)
##  blob                       1.2.3    2022-04-10 [1] CRAN (R 4.1.3)
##  bookdown                   0.29     2022-09-12 [1] RSPM (R 4.1.0)
##  bslib                      0.4.0    2022-07-16 [2] RSPM (R 4.1.0)
##  cachem                     1.0.6    2021-08-19 [2] RSPM (R 4.1.0)
##  cli                        3.4.1    2022-09-23 [2] RSPM (R 4.1.0)
##  cluster                    2.1.4    2022-08-22 [3] RSPM (R 4.1.0)
##  colorspace                 2.0-3    2022-02-21 [1] CRAN (R 4.1.3)
##  crayon                     1.5.2    2022-09-29 [2] RSPM (R 4.1.0)
##  crosstalk                  1.2.0    2021-11-04 [1] CRAN (R 4.1.3)
##  curatedMetagenomicData   * 3.2.3    2021-12-22 [1] Bioconductor
##  curl                       4.3.3    2022-10-06 [2] RSPM (R 4.1.0)
##  DBI                        1.1.3    2022-06-18 [1] RSPM (R 4.1.0)
##  dbplyr                     2.2.1    2022-06-27 [1] RSPM (R 4.1.0)
##  DECIPHER                   2.22.0   2021-10-26 [1] Bioconductor
##  decontam                   1.14.0   2021-10-26 [1] Bioconductor
##  DelayedArray               0.20.0   2021-10-26 [1] Bioconductor
##  DelayedMatrixStats         1.16.0   2021-10-26 [1] Bioconductor
##  desc                       1.4.2    2022-09-08 [2] RSPM (R 4.1.0)
##  digest                     0.6.29   2021-12-01 [2] RSPM (R 4.1.0)
##  DirichletMultinomial       1.36.0   2021-10-26 [1] Bioconductor
##  dplyr                    * 1.0.10   2022-09-01 [2] RSPM (R 4.1.0)
##  DT                       * 0.25     2022-09-12 [1] RSPM (R 4.1.0)
##  ellipsis                   0.3.2    2021-04-29 [2] RSPM (R 4.1.0)
##  evaluate                   0.17     2022-10-07 [2] RSPM (R 4.1.0)
##  ExperimentHub              2.2.1    2022-01-23 [1] Bioconductor
##  fansi                      1.0.3    2022-03-24 [2] CRAN (R 4.1.3)
##  fastmap                    1.1.0    2021-01-25 [2] RSPM (R 4.1.0)
##  filelock                   1.0.2    2018-10-05 [1] CRAN (R 4.1.3)
##  Formula                    1.2-4    2020-10-16 [1] CRAN (R 4.1.3)
##  fs                         1.5.2    2021-12-08 [2] RSPM (R 4.1.0)
##  generics                   0.1.3    2022-07-05 [2] RSPM (R 4.1.0)
##  GenomeInfoDb             * 1.30.1   2022-01-30 [1] Bioconductor
##  GenomeInfoDbData           1.2.7    2022-04-11 [1] Bioconductor
##  GenomicRanges            * 1.46.1   2021-11-18 [1] Bioconductor
##  ggbeeswarm                 0.6.0    2017-08-07 [1] CRAN (R 4.1.3)
##  ggplot2                    3.3.6    2022-05-03 [1] RSPM (R 4.1.0)
##  ggrepel                    0.9.1    2021-01-15 [1] CRAN (R 4.1.3)
##  glue                       1.6.2    2022-02-24 [2] RSPM (R 4.1.0)
##  gridExtra                  2.3      2017-09-09 [1] CRAN (R 4.1.3)
##  gtable                     0.3.1    2022-09-01 [1] RSPM (R 4.1.0)
##  htmltools                  0.5.3    2022-07-18 [2] RSPM (R 4.1.0)
##  htmlwidgets                1.5.4    2021-09-08 [2] CRAN (R 4.1.3)
##  httpuv                     1.6.6    2022-09-08 [2] RSPM (R 4.1.0)
##  httr                       1.4.4    2022-08-17 [2] RSPM (R 4.1.0)
##  interactiveDisplayBase     1.32.0   2021-10-26 [1] Bioconductor
##  IRanges                  * 2.28.0   2021-10-26 [1] Bioconductor
##  irlba                      2.3.5.1  2022-10-03 [1] RSPM (R 4.1.0)
##  jquerylib                  0.1.4    2021-04-26 [2] CRAN (R 4.1.3)
##  jsonlite                   1.8.2    2022-10-02 [2] RSPM (R 4.1.0)
##  KEGGREST                   1.34.0   2021-10-26 [1] Bioconductor
##  knitr                      1.40     2022-08-24 [2] RSPM (R 4.1.0)
##  later                      1.3.0    2021-08-18 [2] CRAN (R 4.1.3)
##  lattice                    0.20-45  2021-09-22 [3] CRAN (R 4.1.3)
##  lazyeval                   0.2.2    2019-03-15 [1] CRAN (R 4.1.3)
##  lifecycle                  1.0.3    2022-10-07 [2] RSPM (R 4.1.0)
##  magrittr                   2.0.3    2022-03-30 [2] CRAN (R 4.1.3)
##  MASS                       7.3-58.1 2022-08-03 [3] RSPM (R 4.1.0)
##  Matrix                     1.5-1    2022-09-13 [3] RSPM (R 4.1.0)
##  MatrixGenerics           * 1.6.0    2021-10-26 [1] Bioconductor
##  matrixStats              * 0.62.0   2022-04-19 [1] RSPM (R 4.1.0)
##  memoise                    2.0.1    2021-11-26 [2] RSPM (R 4.1.0)
##  mgcv                       1.8-40   2022-03-29 [3] CRAN (R 4.1.3)
##  mia                        1.2.7    2022-02-08 [1] Bioconductor
##  mime                       0.12     2021-09-28 [2] RSPM (R 4.1.0)
##  MultiAssayExperiment       1.20.0   2021-10-26 [1] Bioconductor
##  munsell                    0.5.0    2018-06-12 [1] CRAN (R 4.1.3)
##  nlme                       3.1-160  2022-10-10 [3] RSPM (R 4.1.0)
##  permute                    0.9-7    2022-01-27 [1] CRAN (R 4.1.3)
##  pillar                     1.8.1    2022-08-19 [2] RSPM (R 4.1.0)
##  pkgconfig                  2.0.3    2019-09-22 [2] RSPM (R 4.1.0)
##  pkgdown                    2.0.6    2022-07-16 [2] RSPM (R 4.1.0)
##  plyr                       1.8.7    2022-03-24 [1] CRAN (R 4.1.3)
##  png                        0.1-7    2013-12-03 [1] CRAN (R 4.1.3)
##  promises                   1.2.0.1  2021-02-11 [2] CRAN (R 4.1.3)
##  purrr                      0.3.5    2022-10-06 [2] RSPM (R 4.1.0)
##  R6                         2.5.1    2021-08-19 [2] RSPM (R 4.1.0)
##  ragg                       1.2.3    2022-09-29 [2] RSPM (R 4.1.0)
##  rappdirs                   0.3.3    2021-01-31 [2] RSPM (R 4.1.0)
##  Rcpp                       1.0.9    2022-07-08 [2] RSPM (R 4.1.0)
##  RCurl                      1.98-1.9 2022-10-03 [1] RSPM (R 4.1.0)
##  reshape2                   1.4.4    2020-04-09 [1] CRAN (R 4.1.3)
##  rlang                      1.0.6    2022-09-24 [2] RSPM (R 4.1.0)
##  rmarkdown                  2.17     2022-10-07 [2] RSPM (R 4.1.0)
##  rprojroot                  2.0.3    2022-04-02 [2] CRAN (R 4.1.3)
##  RSQLite                    2.2.18   2022-10-04 [1] RSPM (R 4.1.0)
##  rsvd                       1.0.5    2021-04-16 [1] CRAN (R 4.1.3)
##  S4Vectors                * 0.32.4   2022-03-24 [1] Bioconductor
##  sass                       0.4.2    2022-07-16 [2] RSPM (R 4.1.0)
##  ScaledMatrix               1.2.0    2021-10-26 [1] Bioconductor
##  scales                     1.2.1    2022-08-20 [1] RSPM (R 4.1.0)
##  scater                     1.22.0   2021-10-26 [1] Bioconductor
##  scuttle                    1.4.0    2021-10-26 [1] Bioconductor
##  sessioninfo                1.2.2    2021-12-06 [2] RSPM (R 4.1.0)
##  shiny                      1.7.2    2022-07-19 [2] RSPM (R 4.1.0)
##  SingleCellExperiment     * 1.16.0   2021-10-26 [1] Bioconductor
##  sparseMatrixStats          1.6.0    2021-10-26 [1] Bioconductor
##  stringi                    1.7.8    2022-07-11 [2] RSPM (R 4.1.0)
##  stringr                    1.4.1    2022-08-20 [2] RSPM (R 4.1.0)
##  SummarizedExperiment     * 1.24.0   2021-10-26 [1] Bioconductor
##  systemfonts                1.0.4    2022-02-11 [2] CRAN (R 4.1.3)
##  table1                   * 1.4.2    2021-06-06 [1] RSPM (R 4.1.0)
##  textshaping                0.3.6    2021-10-13 [2] CRAN (R 4.1.3)
##  tibble                     3.1.8    2022-07-22 [2] RSPM (R 4.1.0)
##  tidyr                      1.2.1    2022-09-08 [2] RSPM (R 4.1.0)
##  tidyselect                 1.2.0    2022-10-10 [2] RSPM (R 4.1.0)
##  tidytree                   0.4.1    2022-09-26 [1] RSPM (R 4.1.0)
##  treeio                     1.18.1   2021-11-14 [1] Bioconductor
##  TreeSummarizedExperiment * 2.2.0    2021-10-26 [1] Bioconductor
##  utf8                       1.2.2    2021-07-24 [2] RSPM (R 4.1.0)
##  vctrs                      0.4.2    2022-09-29 [2] RSPM (R 4.1.0)
##  vegan                      2.6-4    2022-10-11 [1] RSPM (R 4.1.0)
##  vipor                      0.4.5    2017-03-22 [1] CRAN (R 4.1.3)
##  viridis                    0.6.2    2021-10-13 [1] CRAN (R 4.1.3)
##  viridisLite                0.4.1    2022-08-22 [1] RSPM (R 4.1.0)
##  withr                      2.5.0    2022-03-03 [2] RSPM (R 4.1.0)
##  xfun                       0.33     2022-09-12 [2] RSPM (R 4.1.0)
##  xtable                     1.8-4    2019-04-21 [2] CRAN (R 4.1.3)
##  XVector                  * 0.34.0   2021-10-26 [1] Bioconductor
##  yaml                       2.3.5    2022-02-21 [2] RSPM (R 4.1.0)
##  yulab.utils                0.0.5    2022-06-30 [1] RSPM (R 4.1.0)
##  zlibbioc                   1.40.0   2021-10-26 [1] Bioconductor
## 
##  [1] /__w/_temp/Library
##  [2] /usr/local/lib/R/site-library
##  [3] /usr/local/lib/R/library
## 
## ──────────────────────────────────────────────────────────────────────────────