On the estimation of mixtures of Poisson regression models with large number of components
2016
Papastamoulis, Panagiotis | Magniette, Marie-Laure | Maugis-Rabusseau, Cathy | Institut des Sciences des Plantes de Paris-Saclay (IPS2 (UMR_9213 / UMR_1403)) ; Institut National de la Recherche Agronomique (INRA)-Université Paris-Sud - Paris 11 (UP11)-Université Paris Diderot - Paris 7 (UPD7)-Université d'Évry-Val-d'Essonne (UEVE)-Centre National de la Recherche Scientifique (CNRS) | Institut de Mathématiques de Toulouse UMR5219 (IMT) ; Université Toulouse Capitole (UT Capitole) ; Université de Toulouse (UT)-Université de Toulouse (UT)-Institut National des Sciences Appliquées - Toulouse (INSA Toulouse) ; Institut National des Sciences Appliquées (INSA)-Université de Toulouse (UT)-Institut National des Sciences Appliquées (INSA)-Université de Toulouse (UT)-Université Toulouse - Jean Jaurès (UT2J) ; Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3) ; Université de Toulouse (UT)-Centre National de la Recherche Scientifique (CNRS)
Modelling heterogeneity in large datasets of counts in the presence of covariates demands advanced clustering methods. To this end, a mixture of Poisson regressions is proposed. Conditionally on the covariates and a cluster, the multivariate distribution is a product of independent Poisson distributions. A variety of parameterizations is considered for the slope of the conditional log-means. Also considered is the case of partitioning the response variables into sets of replicates sharing the same conditional log-mean up to an additive constant. Model parameters are estimated via an Expectation-Maximization algorithm with Newton-Raphson steps. In particular, an efficient initialization is introduced to improve the inference: a splitting scheme is combined with a Small-EM strategy. Simulations and applications to two real high-throughput sequencing datasets highlight improvements in parameter estimation. The proposed methodology is implemented in the R package poisson.glm.mix, available on CRAN.
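The conditional independence structure described in the abstract can be illustrated with a minimal sketch of one E-step, which computes the posterior probability that each observation belongs to each cluster. This is not the code of the R package and does not reproduce the paper's parameterizations: the function names, the single scalar covariate, and the intercept/slope layout `alpha[k][j]`, `beta[k][j]` are assumptions made for illustration only.

```python
import math


def poisson_log_pmf(y, log_mean):
    """Log of the Poisson pmf at count y, with the mean given on the log scale."""
    return y * log_mean - math.exp(log_mean) - math.lgamma(y + 1)


def e_step(y, x, weights, alpha, beta):
    """Posterior cluster probabilities for each observation (hypothetical helper).

    y       : n observations, each a list of d counts
    x       : n scalar covariate values (assumption: one covariate)
    weights : K mixing proportions
    alpha   : K x d intercepts of the conditional log-means (assumed layout)
    beta    : K x d slopes of the conditional log-means (assumed layout)
    """
    posteriors = []
    for y_i, x_i in zip(y, x):
        log_terms = []
        for k, w_k in enumerate(weights):
            # Given the cluster and the covariate, the d responses are
            # independent Poisson variables, so their log-pmfs add up.
            log_lik = sum(
                poisson_log_pmf(y_ij, alpha[k][j] + beta[k][j] * x_i)
                for j, y_ij in enumerate(y_i)
            )
            log_terms.append(math.log(w_k) + log_lik)
        # Normalize with log-sum-exp to avoid underflow for large counts.
        m = max(log_terms)
        norm = m + math.log(sum(math.exp(t - m) for t in log_terms))
        posteriors.append([math.exp(t - norm) for t in log_terms])
    return posteriors
```

In a full EM iteration these posterior probabilities would weight the Poisson regression fits in the M-step, which the abstract states are carried out with Newton-Raphson steps; that part, as well as the splitting and Small-EM initialization, is omitted here.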
This record was provided by Institut national de la recherche agronomique