importBugphyzz imports bugphyzz annotations as a list of tidy data.frames. To learn more about the structure of the data.frames, see the bugphyzz vignette with browseVignettes("bugphyzz") or vignette("bugphyzz", "bugphyzz").

importBugphyzz(
  version = "10.5281/zenodo.12574596",
  forceDownload = FALSE,
  v = 0.8,
  excludeRarely = TRUE
)

Arguments

version

Character string indicating the version. Default is the latest release on Zenodo. Options: Zenodo DOI, GitHub commit hash, or devel.

forceDownload

Logical value. If TRUE, force a fresh download of the data; if FALSE, use the copy stored in the cache (if available). Default is FALSE.

v

Numeric value between 0 and 1. Minimum validation score required for imputed annotations to be imported. Default is 0.8 (see details).

excludeRarely

Logical value. If TRUE, exclude annotations with Frequency == "rarely" (see details). Default is TRUE.

Value

A list of tidy data frames.

Details

Data structure

The structure of the data.frames imported with importBugphyzz is described in the main vignette. Please run browseVignettes("bugphyzz").

Validation (v argument)

Data imported with importBugphyzz include annotations imputed through ancestral state reconstruction (ASR) methods. A 10-fold cross-validation approach was used to assess the reliability of the imputed data. Matthews correlation coefficient (MCC) and R-squared (R2) were used to validate discrete and numeric attributes, respectively. Details can be found at: https://github.com/waldronlab/taxPProValidation. By default, imputed annotations with an MCC or R2 value greater than 0.8 are imported. The minimum value can be adjusted with the v argument (only values between 0 and 1).
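The effect of the v threshold can be pictured with a small, self-contained sketch on toy data. This is not the package's implementation: the helper filterByValidation is hypothetical, and the column names Evidence and Validation, as well as the "asr" label for imputed rows, are assumptions made for illustration.

```r
# Toy annotations; column names and values are assumptions for illustration.
annotations <- data.frame(
  Taxon_name = c("taxon A", "taxon B", "taxon C", "taxon D"),
  Evidence   = c("exp", "asr", "asr", "asr"),
  Validation = c(NA, 0.92, 0.65, 0.81),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep non-imputed rows, and imputed rows whose
# validation score (MCC or R2) is greater than the threshold v.
filterByValidation <- function(df, v = 0.8) {
  stopifnot(is.numeric(v), length(v) == 1, v >= 0, v <= 1)
  imputed <- df$Evidence == "asr"
  keep <- !imputed | (!is.na(df$Validation) & df$Validation > v)
  df[keep, , drop = FALSE]
}

kept <- filterByValidation(annotations, v = 0.8)
kept$Taxon_name
```

In this sketch, lowering v (for example, v = 0.5) keeps more of the imputed rows, while rows that were not imputed are never affected by the threshold.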

Frequency (excludeRarely argument)

One of the variables in the bugphyzz data.frames is "Frequency", which can take the values "always", "usually", "sometimes", "rarely", or "never". By default, "never" and "rarely" are excluded. "rarely" can be included with excludeRarely = FALSE. To learn more about these frequency keywords, see the bugphyzz vignette with browseVignettes("bugphyzz").
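The behaviour described above can be sketched on toy data as follows. The helper filterByFrequency is hypothetical and only mirrors the rule stated in this section ("never" is always dropped; "rarely" is dropped unless excludeRarely = FALSE). It is not the package's own code.

```r
# Toy annotations with one row per frequency keyword.
annotations <- data.frame(
  Taxon_name = c("taxon A", "taxon B", "taxon C", "taxon D", "taxon E"),
  Frequency  = c("always", "usually", "sometimes", "rarely", "never"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: always drop "never"; drop "rarely" only when
# excludeRarely is TRUE.
filterByFrequency <- function(df, excludeRarely = TRUE) {
  excluded <- if (excludeRarely) c("never", "rarely") else "never"
  df[!df$Frequency %in% excluded, , drop = FALSE]
}

defaultKept <- filterByFrequency(annotations)
withRarely  <- filterByFrequency(annotations, excludeRarely = FALSE)
defaultKept$Frequency
withRarely$Frequency
```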

Sources

The datasets imported with the importBugphyzz function always include a shortened version of the source. Please use vignette("sources", "bugphyzz") to see the full sources.

Examples


bp <- importBugphyzz()
#> Using data downloaded on 2024-08-26 15:06:32.
names(bp)
#>  [1] "animal pathogen"                      
#>  [2] "antimicrobial sensitivity"            
#>  [3] "biofilm formation"                    
#>  [4] "butyrate-producing bacteria"          
#>  [5] "extreme environment"                  
#>  [6] "health associated"                    
#>  [7] "host-associated"                      
#>  [8] "hydrogen gas producing"               
#>  [9] "lactate producing"                    
#> [10] "motility"                             
#> [11] "plant pathogenicity"                  
#> [12] "sphingolipid producing"               
#> [13] "spore formation"                      
#> [14] "aerophilicity"                        
#> [15] "antimicrobial resistance"             
#> [16] "arrangement"                          
#> [17] "biosafety level"                      
#> [18] "cogem pathogenicity rating"           
#> [19] "disease association"                  
#> [20] "gram stain"                           
#> [21] "habitat"                              
#> [22] "hemolysis"                            
#> [23] "shape"                                
#> [24] "spore shape"                          
#> [25] "coding genes"                         
#> [26] "genome size"                          
#> [27] "growth temperature"                   
#> [28] "length"                               
#> [29] "mutation rate per site per generation"
#> [30] "mutation rate per site per year"      
#> [31] "optimal ph"                           
#> [32] "width"