
This analysis:

  1. selects a subset of samples (all CRC-related in this example),
  2. uses the table1 package to display a table of the characteristics of the included cohort,
  3. sorts species in order of descending prevalence,
  4. uses the datatable function from the DT package to display a searchable, paged table of prevalences, and
  5. writes the prevalences to a CSV file.

Required packages:
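
The package list itself is not shown here. Judging only from the functions called in the rest of this analysis, a session would need at least the packages below. This is an inferred sketch, not the author's verbatim setup. It assumes that loading curatedMetagenomicData also attaches SummarizedExperiment (which provides assay and colData) and that tibble() is available through dplyr.

```r
# Inferred from the function calls used in this analysis (an assumption,
# not the original list).
suppressPackageStartupMessages({
  library(curatedMetagenomicData)  # sampleMetadata, returnSamples()
  library(dplyr)                   # filter(), arrange(), %>%, tibble()
  library(table1)                  # table1()
  library(DT)                      # datatable()
})
```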

Select samples

Note that a few species without phylogenetic information are dropped from the returned data.

crc_subset <- filter(sampleMetadata, study_condition == "CRC") %>%
  returnSamples(dataType = "relative_abundance",
                rownames = "short")

Cohort characteristics

Create a summary table of the participants in this cohort:

table1::table1( ~ disease + disease_subtype + age + gender + country + study_name,
                data = colData(crc_subset))
                                            Overall (N=702)
disease
  CRC                                       626 (89.2%)
  CRC;cholesterolemia                       1 (0.1%)
  CRC;fatty_liver                           3 (0.4%)
  CRC;fatty_liver;hypertension              12 (1.7%)
  CRC;hypercholesterolemia                  3 (0.4%)
  CRC;hypercholesterolemia;hypertension     1 (0.1%)
  CRC;hypertension                          20 (2.8%)
  CRC;metastases                            1 (0.1%)
  CRC;T2D                                   29 (4.1%)
  CRC;T2D;fatty_liver;hypertension          4 (0.6%)
  CRC;T2D;hypertension                      2 (0.3%)
disease_subtype
  adenocarcinoma                            59 (8.4%)
  carcinoma                                 174 (24.8%)
  Missing                                   469 (66.8%)
age
  Mean (SD)                                 63.3 (11.0)
  Median [Min, Max]                         64.0 [28.0, 90.0]
gender
  female                                    258 (36.8%)
  male                                      444 (63.2%)
country
  AUT                                       46 (6.6%)
  CAN                                       2 (0.3%)
  CHN                                       75 (10.7%)
  DEU                                       60 (8.5%)
  FRA                                       53 (7.5%)
  IND                                       30 (4.3%)
  ITA                                       61 (8.7%)
  JPN                                       298 (42.5%)
  USA                                       77 (11.0%)
study_name
  FengQ_2015                                46 (6.6%)
  GuptaA_2019                               30 (4.3%)
  HanniganGD_2017                           27 (3.8%)
  ThomasAM_2018a                            29 (4.1%)
  ThomasAM_2018b                            32 (4.6%)
  ThomasAM_2019_c                           40 (5.7%)
  VogtmannE_2016                            52 (7.4%)
  WirbelJ_2018                              60 (8.5%)
  YachidaS_2019                             258 (36.8%)
  YuJ_2015                                  75 (10.7%)
  ZellerG_2014                              53 (7.5%)

Show species in order of decreasing prevalence

The prevalence of a species is the fraction of specimens from CRC patients in which its relative abundance is non-zero.

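# Fraction of CRC samples in which each species has non-zero relative abundance
prevalences <- rowSums(assay(crc_subset) > 0) / ncol(crc_subset)
prevalences <- tibble(species = names(prevalences), prevalence = signif(prevalences, 2)) %>%
  filter(prevalence > 0) %>%
  arrange(desc(prevalence))
DT::datatable(prevalences)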
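
To make the prevalence calculation concrete, here is a minimal base R sketch on a made-up abundance matrix. The species names, sample names, and values are hypothetical and stand in for assay(crc_subset); only the arithmetic mirrors the step above.

```r
# Toy relative-abundance matrix: rows are species, columns are samples.
# All names and values are invented for illustration.
abund <- matrix(
  c(0.5, 0.2, 0.0, 0.3,
    0.0, 0.0, 0.0, 0.1,
    0.5, 0.8, 1.0, 0.6,
    0.0, 0.0, 0.0, 0.0),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("sp_A", "sp_B", "sp_C", "sp_D"),
                  c("s1", "s2", "s3", "s4"))
)

# Fraction of samples in which each species has non-zero abundance.
prev <- rowSums(abund > 0) / ncol(abund)

# Drop species that were never observed, then sort by descending prevalence.
prev <- sort(prev[prev > 0], decreasing = TRUE)
print(prev)
```

In this toy case sp_C is present in all four samples, sp_A in three, and sp_B in one, while sp_D is never observed and is removed by the filter.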

Write to disk:

write.csv(prevalences, row.names = FALSE, file = "prevalences.csv")
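
As a quick check on the export step, the sketch below writes a small stand-in table to a temporary file and reads it back. The data frame toy and the file name are hypothetical. It illustrates that row.names = FALSE keeps the file to exactly the two data columns, with no extra index column.

```r
# Hypothetical two-row table standing in for `prevalences`.
toy <- data.frame(species = c("sp_C", "sp_A"), prevalence = c(1, 0.75))

# Write to a temporary location so nothing in the working directory is touched.
csv_path <- file.path(tempdir(), "toy_prevalences.csv")
write.csv(toy, file = csv_path, row.names = FALSE)

# Reading the file back should give the same two columns and no row-name column.
roundtrip <- read.csv(csv_path)
print(names(roundtrip))
```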

Download the zipped prevalences file produced by curatedMetagenomicData version 3.2.3: prevalences.csv.zip

