Communications in Information and Systems
Volume 20 (2020)
Number 1
Identifying zero-inflated distributions with a new R package iZID
Pages: 23 – 44
DOI: https://dx.doi.org/10.4310/CIS.2020.v20.n1.a2
Authors
Abstract
Count data with a large proportion of zeros arise naturally in many scientific disciplines. When conducting a one-sample Kolmogorov–Smirnov (KS) test for count data, the estimated p-value is biased because sample estimates are plugged in for the unknown parameters. As a consequence, the result of a KS test could be too conservative. In the newly developed R package “iZID” for zero-inflated count data, we use bootstrapped Monte Carlo estimates to overcome the bias in estimated p-values, and bootstrapped likelihood ratio tests for zero-inflated model selection. The package also provides functions to simulate zero-inflated count data and to calculate maximum likelihood estimates of unknown parameters. Compared with other R packages available so far, our package covers more types of zero-inflated distributions and provides adjusted p-value estimates that account for the influence of unknown model parameters. To assist potential users, in this paper we provide detailed descriptions of the functions in “iZID” and illustrate their use with executable R code.
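The bootstrapped Monte Carlo idea described in the abstract can be illustrated with a minimal sketch. It assumes a zero-inflated Poisson model fitted by a simple EM iteration, and it is written in Python with hypothetical function names (`zip_fit`, `ks_stat`, `bootstrap_ks_pvalue`); it does not reproduce the iZID interface or the authors' exact algorithm. The key step is that each simulated sample is re-fitted before its KS statistic is computed, so the reference distribution reflects the parameter estimation. The `+1` correction in the p-value is a choice made for this sketch.

```python
import math
import random


def zip_pmf(k, lam, pi0):
    """P(X = k) for a zero-inflated Poisson with extra zero mass pi0 and rate lam."""
    pois = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    return pi0 * (k == 0) + (1.0 - pi0) * pois


def zip_sample(n, lam, pi0, rng):
    """Draw n zero-inflated Poisson counts (Knuth's method for the Poisson part)."""
    out = []
    limit = math.exp(-lam)
    for _ in range(n):
        if rng.random() < pi0:
            out.append(0)  # structural zero
            continue
        k, prod = 0, rng.random()
        while prod > limit:
            k += 1
            prod *= rng.random()
        out.append(k)
    return out


def zip_fit(x, iters=200, tol=1e-10):
    """Maximum likelihood estimates (lam, pi0) via EM; assumed model, not iZID code."""
    n, n0, total = len(x), sum(1 for v in x if v == 0), sum(x)
    if total == 0:
        return 1e-8, 0.0  # degenerate all-zero sample
    lam, pi0 = total / n, 0.5 * n0 / n
    for _ in range(iters):
        # E-step: probability that an observed zero is a structural zero
        z = pi0 / (pi0 + (1.0 - pi0) * math.exp(-lam))
        # M-step: update the mixing weight and the Poisson rate
        new_pi0, new_lam = n0 * z / n, total / (n - n0 * z)
        done = abs(new_pi0 - pi0) + abs(new_lam - lam) < tol
        lam, pi0 = new_lam, new_pi0
        if done:
            break
    return lam, pi0


def ks_stat(x, lam, pi0):
    """KS distance between the empirical CDF and the fitted CDF on 0..max(x)."""
    n, counts = len(x), {}
    for v in x:
        counts[v] = counts.get(v, 0) + 1
    emp = mod = d = 0.0
    for k in range(max(x) + 1):
        emp += counts.get(k, 0) / n
        mod += zip_pmf(k, lam, pi0)
        d = max(d, abs(emp - mod))
    return d


def bootstrap_ks_pvalue(x, n_boot=200, rng=None):
    """Parametric-bootstrap p-value: simulate from the fit, re-fit, compare statistics."""
    rng = rng or random.Random()
    lam, pi0 = zip_fit(x)
    d_obs = ks_stat(x, lam, pi0)
    exceed = 0
    for _ in range(n_boot):
        xb = zip_sample(len(x), lam, pi0, rng)
        lam_b, pi0_b = zip_fit(xb)  # re-estimate on every simulated sample
        if ks_stat(xb, lam_b, pi0_b) >= d_obs:
            exceed += 1
    return (exceed + 1) / (n_boot + 1)


if __name__ == "__main__":
    rng = random.Random(2020)
    data = zip_sample(300, lam=2.5, pi0=0.3, rng=rng)
    print(zip_fit(data), bootstrap_ks_pvalue(data, n_boot=100, rng=rng))
```

Re-fitting inside the loop is what distinguishes this from comparing the observed statistic with the standard KS null distribution, which treats the plugged-in parameters as known.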
Keywords
count data, hurdle model, Kolmogorov–Smirnov test, model selection, zero-inflated distribution
2010 Mathematics Subject Classification
Primary 62F10, 62G10. Secondary 62F40.
The third-listed author was supported in part by NSF grant DMS-1924859.
Received 9 September 2019
Published 17 April 2020