Andrew Doxey

University of Waterloo
Project Title: Mining microbiome community structure and biomarker identification through data intensive biology, machine learning, and high throughput technologies
Industry Partner: Metagenom Bio Inc.
Platform: Cloud Analytics, Large-Memory System

Mining Water

Microbial ecosystems, such as those associated with the mining industry, are complex networks of interacting species and biochemical dependencies. Modeling how these communities respond to different or changing environmental conditions is a considerable computational and statistical challenge. Current approaches largely identify important features through differential abundance. This ignores the disproportionate influence that some species or functions exert on the community and oversimplifies biological complexity. These approaches can also ignore poorly annotated features (e.g., hypothetical genes, microbial dark matter, ORFans). Excluding these known unknowns and unknown unknowns reduces the resolution and sensitivity of the analyses. Metagenomic feature selection using machine learning has been most widely applied to the human microbiome, for which more extensive data are currently available than for other systems. We will apply a superficially similar but much higher-resolution approach to less-studied, more dynamic industrial microbiomes, such as those found in mining.