Communications in Information and Systems

Volume 18 (2018)

Number 3

A review on probabilistic models used in microbiome studies

Pages: 173 – 191

DOI: https://dx.doi.org/10.4310/CIS.2018.v18.n3.a3

Authors

Ahmed A. Metwally (Dept. of Bioengineering and Dept. of Computer Science, University of Illinois, Chicago, IL, U.S.A.)

Hani Aldirawi (Department of Mathematics, Statistics & Computer Science, University of Illinois, Chicago, IL, U.S.A.)

Jie Yang (Department of Mathematics, Statistics & Computer Science, University of Illinois, Chicago, IL, U.S.A.)

Abstract

In this paper, we first briefly review the background and significance of the microbiome, the technologies used for collecting microbiome data, and some public resources for downloading microbiome data. We then review the probabilistic models used in the literature in two categories: (1) for read counts from a specific feature, including Poisson, negative binomial, zero-inflated and hurdle models; (2) for read counts from multiple features, including Dirichlet-multinomial, generalized Dirichlet-multinomial, and zero-inflated models, as well as a nonparametric Bayesian model for a flexible number of features. We also review comprehensive comparisons among different probabilistic models.

This work was partially supported by a UIC Chancellor’s Graduate Research Fellowship and a UIC CCTS Pre-doctoral Education for Clinical and Translational Scientists fellowship (UL1TR002003), both awarded to AAM.

Published 22 October 2018