2017/01/23 – KyungMann Kim (University of Wisconsin-Madison)

January 23rd, 2017

EEI (Torrecedeira, 86 – Vigo) | Aula de Medios Audiovisuais

Statistical Models for Zero-Inflated Count Data: A Review


2017/01/23 – 11:30 h | KyungMann Kim, Ph.D., University of Wisconsin-Madison

Abstract

Count data are routinely analyzed using Poisson (P) distributions. Due to population heterogeneity, however, they often exhibit over-dispersion, known as extra-Poisson variation. This extra-Poisson variation can be handled in one of two ways: the maximum quasi-likelihood method, or a latent variable model leading to the negative binomial (NB) distribution, with a gamma mixing distribution for the Poisson mean. Still, there are situations where these models perform poorly because of excess zeroes in the counts. There are at least two similar but conceptually different approaches to handling excess zeroes. In what are commonly known as zero-inflated (ZI) models, we may view the data as being generated from a mixture of a point mass at zero, representing “excess” zeroes, and a standard non-degenerate count distribution, which includes “true” zeroes. This mixture model allows for a mixture of two different populations, one non-susceptible to events (resulting in excess zeroes) and the other susceptible (including true zeroes). In contrast, the so-called hurdle (H) models may be conceptualized as having zeroes only from a non-susceptible population, and can be modeled using two processes, one generating zeroes (“choice”) and the other generating only the positive counts (“intensity”) from a truncated count distribution. In this presentation, I will review count data regression models, with emphasis on zero-inflated count data, and illustrate these models with examples from the literature.
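The distinction between the two approaches can be sketched with their probability mass functions. The following is a minimal illustration, not taken from the talk: it assumes a Poisson base distribution for both models, and the function names and the parameters `lam` (Poisson mean), `pi` (weight of the point mass at zero) and `p0` (probability of a zero in the hurdle model) are hypothetical choices made for this example.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Standard Poisson probability P(Y = k) with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k: int, lam: float, pi: float) -> float:
    """Zero-inflated Poisson (assumed Poisson base).

    Mixture of a point mass at zero (weight pi, the "excess" zeroes)
    and a Poisson(lam) component (weight 1 - pi), which can itself
    produce "true" zeroes.
    """
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def hurdle_pmf(k: int, lam: float, p0: float) -> float:
    """Hurdle Poisson (assumed Poisson base).

    All zeroes come from the "choice" process with probability p0;
    positive counts come from a zero-truncated Poisson(lam), the
    "intensity" process.
    """
    if k == 0:
        return p0
    truncated = poisson_pmf(k, lam) / (1.0 - math.exp(-lam))
    return (1.0 - p0) * truncated


if __name__ == "__main__":
    lam, pi, p0 = 2.0, 0.3, 0.3
    # In the ZI model the zero probability exceeds pi, because the
    # Poisson component contributes additional "true" zeroes.
    print("ZIP    P(Y=0):", zip_pmf(0, lam, pi))
    # In the hurdle model the zero probability is exactly p0.
    print("Hurdle P(Y=0):", hurdle_pmf(0, lam, p0))
```

With the same value for `pi` and `p0`, the zero-inflated model places more mass at zero than the hurdle model, since only the former allows zeroes from the susceptible population.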

