Welcome to the Waldron lab for public health data science at the CUNY School of Public Health in New York City. I teach biostatistics, run an active research program in cancer genomics and in metagenomic profiling of the human microbiome, develop methods at the intersection of statistical analysis and computation, and work to build an inclusive community of researchers and students around open-source bioinformatics software and methods. My lab aims to generate new insights into human health, disease, and treatment through improved tools and novel analysis of publicly available data.


I believe that the health disparities of race, ethnicity, class, and geography can and must be eliminated through social, political, and scientific change. Professionally, I am deeply committed to the Bioconductor project for open-source bioinformatics software, through contributions of individual software packages and support for the project as a whole.

Cancer Genomics

I have a long-standing interest in developing methods for cancer genomics data and in using those data to test hypotheses. These efforts have led to a greater understanding of the role of gene expression in defining disease subtypes and patient outcomes in high-grade serous ovarian carcinoma, colorectal cancer, and other cancers. They have also produced software and databases for the analysis of multi-omics data, notably MultiAssayExperiment and curatedTCGAData.

Human Microbiome Studies

Metagenomic sequencing has made it possible to probe the microbial communities that colonize the human body with previously unimaginable depth and resolution. I am fascinated by the roles the microbiome may play as an interface between the individual and their environment, and by the corresponding implications for health. This area of study is made even more enticing by the vast amounts of data becoming publicly available that can be combined and analyzed in new ways. My contributions in this area include the databases curatedMetagenomicData and HMP16SData.


I work to develop an active and inclusive open-source bioinformatics and data science community through contributions to the Bioconductor project and activities such as the NYC R/Bioconductor Meetup, the Data Science book club, and the Single-cell multimodal data journal club.