Gavino Puggioni, Ph.D.

Current Research

In the last two decades we have experienced a substantial increase in the amount and quality of data from different fields. With the dramatic increase in computational power, now available at reasonable prices, it is possible to fit models that are both more sophisticated and less restrictive. In particular, the gap between complex mathematical models, based on scientific theory, and statistical estimation has narrowed. The advent of MCMC methods has, in fact, facilitated the estimation of hierarchical models that can include multilevel structures for model, data, and observations. Most of my research is devoted to the development and estimation of such models, in a collaborative framework with scientists from different disciplines. My research interests lie on both the methodological and the applied side. I try to achieve a balance between the development of novel models and their application to concrete problems and datasets. My methodological work involves the development of space-time models based on stochastic differential equations. On the applications side, I have focused on environmental, ecological, biological, and financial problems. The unifying framework is Bayesian hierarchical modeling.

Stochastic Differential Equations modeling
The main body of my Ph.D. dissertation, supervised by Prof. Alan Gelfand, develops estimation strategies for the parameters of stochastic differential equations (SDEs) in a space-time framework. Estimating the parameters of an SDE is a difficult task because a closed-form solution is available only in a restricted number of very special cases. Likelihood strategies based on Euler schemes are affected by discretization bias. To correct this bias, model augmentation has been proposed in a substantial part of the literature. My work extends the existing methods to the multivariate case of a system of equations that presents a spatial structure either in the parameters (modeled as realizations of Gaussian processes) or in the multivariate Brownian motion itself. The methods I have developed are general enough to be used in several applications, such as population dynamics and spatial epidemiology. Recently, jointly with my collaborators at Emory University, I have begun applying these methods to the direct estimation of SEIR models for the spread of raccoon rabies in the United States. On the more theoretical side, I am investigating the closed form of multivariate transition densities arising from second-order discretization schemes.
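
As a minimal illustration of the discretization bias mentioned above, the following sketch uses a univariate Ornstein-Uhlenbeck process, dX = -theta X dt + sigma dW, one of the special cases where the exact transition density is known. It is not the space-time method of the dissertation; the process, the function names (simulate_ou, estimate_theta), and all parameter values are assumptions chosen for illustration. With sigma treated as a free parameter, the Euler pseudo-likelihood estimate of theta is (1 - rho_hat)/dt, while the exact conditional likelihood gives -log(rho_hat)/dt, where rho_hat is the least-squares lag-one coefficient.

```python
import numpy as np

def simulate_ou(theta, sigma, dt, n, x0=0.0, seed=0):
    """Simulate an OU path on a grid of step dt using the exact transition density."""
    rng = np.random.default_rng(seed)
    rho = np.exp(-theta * dt)                              # exact lag-one coefficient
    sd = sigma * np.sqrt((1.0 - rho**2) / (2.0 * theta))   # exact transition std. dev.
    noise = rng.standard_normal(n)
    x = np.empty(n + 1)
    x[0] = x0
    for i in range(n):
        x[i + 1] = rho * x[i] + sd * noise[i]
    return x

def estimate_theta(x, dt):
    """Return (Euler-based, exact-transition-based) estimates of theta."""
    # Least-squares lag-one coefficient (no intercept, since the OU mean is zero).
    rho_hat = np.sum(x[:-1] * x[1:]) / np.sum(x[:-1] ** 2)
    theta_euler = (1.0 - rho_hat) / dt      # Euler scheme: mean is (1 - theta*dt) * x
    theta_exact = -np.log(rho_hat) / dt     # exact transition: mean is exp(-theta*dt) * x
    return theta_euler, theta_exact

# Hypothetical settings: a coarse observation step makes the bias visible.
x = simulate_ou(theta=2.0, sigma=1.0, dt=0.5, n=20000)
theta_euler, theta_exact = estimate_theta(x, dt=0.5)
print(f"Euler-based estimate: {theta_euler:.3f}")
print(f"Exact-transition estimate: {theta_exact:.3f}")
```

Because 1 - rho is smaller than -log(rho) for any rho between 0 and 1, the Euler-based estimate understates theta whenever the observation step is not small, which is the bias that augmentation schemes aim to reduce by working on a finer latent grid between observations.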

Applications in Ecology and Biology
During my graduate studies at Duke University I was exposed to several applications in biostatistics and ecology. In particular, I would like to mention my published work on the evaluation of radiologist variability (joint modeling of sensitivity and specificity in the evaluation of screening mammograms), the papers on modeling wireless sensor data in the presence of suppression and failure issues, and the applications to soil moisture and hydrological models. In the last year I have been working at Emory University with Prof. Lance Waller, Prof. Leslie Real, and others on a number of different projects that have provided attractive modeling opportunities and interesting scientific applications. The first completed work covers hierarchical models for assessing coadapted competition in the flowering seasons of hummingbird-pollinated plants. The second proposes models for the spatio-temporal analysis of the spread of sugar cane yellow leaf virus. The latter is particularly interesting because it proposes approaches for point process modeling when observations in time are very infrequent. I have several ongoing projects, which include likelihood estimation of SEIR models for raccoon rabies spread, point processes for space-time patterns in sea turtle nesting sites, and point process models of epidermal nerve fibres.

Model Averaging in Time Series Analysis, Financial Applications
In collaboration with Prof. Abel Rodriguez, I have developed a Bayesian approach to mixed frequency models, where the dependent variable is regressed on (several lags of) explanatory variables observed at different, usually higher, frequencies. Typically, the lagged regressors exhibit collinearity and not all of them contribute in equal measure to explaining the dependent variable. To achieve parsimony and efficiency in prediction, several authors impose deterministic weighting structures. Our approach works within a dynamic linear model (DLM) framework, thus enabling the regression structure to be supplemented by random dynamic coefficients. The problem of collinearity among intraperiod observations is addressed using model selection and model averaging methods. In our work we explore predictions based on model averaging, which allows us to account for both model and parameter uncertainty. We illustrate our approach by predicting the gross national product of the United States using the term structure of interest rates. Numerous other financial applications are possible and we are currently exploring some of them, in particular stochastic volatility models for high frequency data (minute by minute and tick by tick).
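
To make the idea of averaging over subsets of collinear intraperiod regressors concrete, here is a deliberately simplified sketch. It is not the DLM approach described above: it uses a static linear regression, simulated data, and BIC-based approximate posterior model weights in place of a full Bayesian treatment. The function name bic_model_average, the simulated series, and all numerical settings are assumptions made for illustration only.

```python
import itertools
import numpy as np

def bic_model_average(X, y):
    """Fit every non-empty subset of columns of X by least squares and
    return the subsets, their BIC-based weights, and the averaged fit."""
    n, p = X.shape
    models, bics, fits = [], [], []
    for k in range(1, p + 1):
        for subset in itertools.combinations(range(p), k):
            Z = np.column_stack([np.ones(n), X[:, list(subset)]])  # intercept + chosen lags
            beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
            fitted = Z @ beta
            rss = np.sum((y - fitted) ** 2)
            # Gaussian BIC up to an additive constant shared by all models.
            bics.append(n * np.log(rss / n) + Z.shape[1] * np.log(n))
            models.append(subset)
            fits.append(fitted)
    bics = np.array(bics)
    weights = np.exp(-0.5 * (bics - bics.min()))   # approximate posterior model probabilities
    weights /= weights.sum()
    averaged_fit = weights @ np.array(fits)        # model-averaged fitted values
    return models, weights, averaged_fit

# Simulated example: a persistent "monthly" series, observed three times per "quarter".
rng = np.random.default_rng(1)
n_quarters = 200
monthly = np.empty(3 * n_quarters)
monthly[0] = rng.standard_normal()
for t in range(1, monthly.size):
    monthly[t] = 0.9 * monthly[t - 1] + rng.standard_normal()   # AR(1), hence collinear lags
X = monthly.reshape(n_quarters, 3)                # columns: months 1, 2, 3 of each quarter
y = 1.0 + 2.0 * X[:, 2] + 0.5 * rng.standard_normal(n_quarters)  # only month 3 matters

models, weights, averaged_fit = bic_model_average(X, y)
# Posterior inclusion probability of each intraperiod observation.
inclusion = [sum(w for m, w in zip(models, weights) if j in m) for j in range(3)]
print("inclusion probabilities:", np.round(inclusion, 3))
```

The inclusion probabilities summarize how much weight the averaged prediction places on each intraperiod observation, without fixing a deterministic weighting structure in advance. In a dynamic setting the same weights would need to be updated over time, which this static sketch does not attempt.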

Environmental applications
During my research experience at the University of North Carolina at Chapel Hill, working in the team of Prof. Marc Serre and Jackie Mc Donald, I helped develop space-time estimates of several air pollutants in the UAE area, such as PM10, ozone, and PM2.5. We implemented both Bayesian Maximum Entropy (BME) and hierarchical models to model data coming from different sources: monitoring stations, CMAQ computer estimates, and satellite measurements. Some of this research is still in progress; we aim to compare the predictive performance of BME and hierarchical models and to develop a general framework for jointly modeling data from different sources.

Future Research

As noted in the previous section, several of my projects are still in progress and my results have the potential to be expanded or applied to different problems. In particular, the SDE modeling can be extended both on the methodological side (in efficiency with higher-order schemes, in generality with partial differential equations and multi-index problems) and on the applications side. Attractive opportunities lie in SEIR stochastic models, population dynamics, and other ecological processes. A recurring theme of my research will be bridging statistical and mathematical modeling. In fact, I envision a closer collaboration with mathematicians and other scientists who can benefit from the link between scientific modeling and statistical data analysis. The advantages in terms of interpretability of results, rigour and flexibility in estimation, and accuracy in prediction are undeniable. In the next few years I am also planning to work more on spatial point processes and their applications in epidemiology, efficient MCMC methods for complex models, and multivariate generalizations of my financial models for mixed frequency data.