Collection resources
Project Euclid (Hosted at Cornell University Library) (192,979 resources)
Bayesian Analysis
Taylor-Rodríguez, Daniel; Womack, Andrew J.; Fuentes, Claudio; Bliznyuk, Nikolay
Occupancy models are typically used to determine the probability of a species being present at a given site while accounting for imperfect detection. The survey data underlying these models often include information on several predictors that could potentially characterize habitat suitability and species detectability. Because these variables might not all be relevant, model selection techniques are necessary in this context. In practice, model selection is performed using the Akaike Information Criterion (AIC), as few other alternatives are available. This paper builds an objective Bayesian variable selection framework for occupancy models through the intrinsic prior methodology. The procedure incorporates priors on...
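As background for the selection problem described in this abstract, a minimal sketch of the standard single-season occupancy likelihood (not the paper's intrinsic-prior procedure); the function name and the assumption of a constant detection probability across visits are illustrative simplifications:

```python
import numpy as np

def occupancy_loglik(psi, p, histories):
    """Log-likelihood of a single-season occupancy model.

    psi: occupancy probability; p: per-visit detection probability;
    histories: (n_sites, n_visits) 0/1 detection array.
    """
    histories = np.asarray(histories)
    det = histories.sum(axis=1)            # detections per site
    T = histories.shape[1]
    # Sites with at least one detection are certainly occupied.
    ll_detected = np.log(psi) + det * np.log(p) + (T - det) * np.log(1 - p)
    # All-zero histories mix "absent" with "present but never detected".
    ll_zero = np.log((1 - psi) + psi * (1 - p) ** T)
    return np.where(det > 0, ll_detected, ll_zero).sum()
```

The mixture term for all-zero histories is what makes detection "imperfect": absence and non-detection are confounded, which is also why irrelevant covariates are hard to screen without formal model selection.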
Shahn, Zach; Madigan, David
We provide a general Bayesian framework for modeling treatment effect heterogeneity in experiments with non-categorical outcomes. Our modeling approach incorporates latent class mixture components to capture discrete heterogeneity and regression interaction terms to capture continuous heterogeneity. Flexible error distributions allow robust posterior inference on parameters of interest. Hierarchical shrinkage priors on relevant parameters address multiple comparisons concerns. Leave-one-out cross validation estimates of expected posterior predictive density obtained through importance sampling, together with posterior predictive checks, provide a convenient method for model selection and evaluation. We apply our approach to a clinical trial comparing two HIV treatments and to an instrumental...
Le, Tri; Clarke, Bertrand
In ${\mathcal{M}}$-open problems where no true model can be conceptualized, it is common to back off from modeling and merely seek good prediction. Even in ${\mathcal{M}}$-complete problems, taking a predictive approach can be very useful. Stacking is a model averaging procedure that gives a composite predictor by combining individual predictors from a list of models using weights that optimize a cross-validation criterion. We show that the stacking weights also asymptotically minimize a posterior expected loss. Hence we formally provide a Bayesian justification for cross-validation. Often the weights are constrained to be positive and sum to one. For greater...
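The stacking construction can be sketched numerically: given leave-one-out predictions from each model, the weights solve a simplex-constrained least-squares problem. A minimal illustration, with the function name and the SLSQP solver as assumed choices rather than the authors' implementation:

```python
import numpy as np
from scipy.optimize import minimize

def stacking_weights(cv_preds, y):
    """Simplex-constrained weights minimizing cross-validated squared error.

    cv_preds: (n, K) array; column k holds leave-one-out predictions of model k.
    """
    n, K = cv_preds.shape
    obj = lambda w: np.sum((y - cv_preds @ w) ** 2)
    cons = {"type": "eq", "fun": lambda w: w.sum() - 1.0}
    res = minimize(obj, np.full(K, 1.0 / K), bounds=[(0.0, 1.0)] * K,
                   constraints=cons, method="SLSQP")
    return res.x
```

If one model's cross-validated predictions track the data closely, its weight approaches one, which is the behavior the positivity and sum-to-one constraints are meant to preserve.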
Ma, Li
We introduce a hierarchical generalization to the Pólya tree that incorporates locally adaptive shrinkage to data features of different scales, while maintaining analytical simplicity and computational efficiency. Inference under the new model proceeds efficiently using general recipes for conjugate hierarchical models, and can be completed extremely efficiently for data sets with large numbers of observations. We illustrate in density estimation that the achieved adaptive shrinkage results in proper smoothing and substantially improves inference. We evaluate the performance of the model through simulation under several schematic scenarios carefully designed to be representative of a variety of applications. We compare its performance...
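For orientation, a sketch of posterior predictive density evaluation under a standard finite Pólya tree on [0, 1) — the non-adaptive baseline that the hierarchical model above generalizes. The Beta(c·m², c·m²) prior at level m is the classical choice; the function name and depth are illustrative:

```python
import numpy as np

def polya_tree_density(data, grid, depth=6, c=1.0):
    """Posterior predictive density under a finite Polya tree on [0, 1).

    Splitting probabilities at level m carry Beta(c*m^2, c*m^2) priors
    (the classical choice); conjugacy gives closed-form updates.
    """
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    dens = np.ones_like(grid)
    lo = np.zeros_like(grid)
    hi = np.ones_like(grid)
    for m in range(1, depth + 1):
        a = c * m * m
        mid = (lo + hi) / 2.0
        # data counts in the left/right child of each grid point's interval
        n_lo = np.array([np.sum((data >= l) & (data < mi)) for l, mi in zip(lo, mid)])
        n_hi = np.array([np.sum((data >= mi) & (data < h)) for mi, h in zip(mid, hi)])
        go_left = grid < mid
        branch = np.where(go_left,
                          (a + n_lo) / (2 * a + n_lo + n_hi),
                          (a + n_hi) / (2 * a + n_lo + n_hi))
        dens *= 2.0 * branch          # each level halves the interval width
        lo, hi = np.where(go_left, lo, mid), np.where(go_left, mid, hi)
    return dens
```

Because each node's posterior branch probabilities sum to one, the resulting piecewise-constant density integrates exactly to one; the hierarchical extension in the paper replaces the fixed level-wise prior parameters with locally adaptive shrinkage.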
Roy, Vivekananda; Chakraborty, Sounak
Penalized regression methods such as the lasso and elastic net (EN) have become popular for simultaneous variable selection and coefficient estimation. Implementation of these methods requires selection of the penalty parameters. We propose an empirical Bayes (EB) methodology for selecting these tuning parameters as well as for computing the regularization path plots. The EB method does not suffer from the “double shrinkage problem” of the frequentist EN, and it avoids the difficulty of constructing an appropriate prior on the penalty parameters. The EB methodology is implemented by an efficient importance sampling method based on multiple Gibbs sampler chains. Since the Markov chains...
Polettini, Silvia
In survey sampling, interest often lies in unplanned domains (or small areas), whose sample sizes may be too small to allow for accurate design-based inference. To improve the direct estimates by borrowing strength from similar domains, most small area methods rely on mixed effects regression models.
This contribution extends the well-known Fay–Herriot model (Fay and Herriot, 1979) within a Bayesian approach in two directions. First, the default normality assumption for the random effects is replaced by a nonparametric specification using a Dirichlet process. Second, uncertainty about the variances is explicitly introduced, recognizing the fact that they are actually estimated from survey...
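For reference, the classical normal-theory Fay–Herriot estimator that this contribution generalizes can be sketched as follows; the bisection solver for the moment equation and all names are illustrative choices, not the paper's method:

```python
import numpy as np

def fay_herriot_eb(y, X, D):
    """Classical Fay-Herriot EB small-area estimates (normal random effects).

    y: direct estimates; X: (m, p) covariates; D: known sampling variances.
    Returns (shrinkage estimates gamma*y + (1-gamma)*X beta, estimated A).
    """
    m, p = X.shape

    def score(A):
        V = A + D
        beta = np.linalg.solve(X.T @ (X / V[:, None]), X.T @ (y / V))
        r = y - X @ beta
        return np.sum(r ** 2 / V) - (m - p), beta

    lo_A, hi_A = 0.0, 10.0 * np.var(y)
    for _ in range(60):            # bisection for the FH moment equation
        mid = (lo_A + hi_A) / 2.0
        s, _ = score(mid)
        if s > 0:
            lo_A = mid
        else:
            hi_A = mid
    A = (lo_A + hi_A) / 2.0
    _, beta = score(A)
    gamma = A / (A + D)            # shrinkage factor per area
    return gamma * y + (1 - gamma) * (X @ beta), A
```

Areas with large sampling variance D are shrunk toward the regression synthetic estimate, which is exactly the "borrowing strength" behavior described above; the paper replaces the normal random-effects law with a Dirichlet process and models the D as uncertain.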
Al Labadi, Luai; Evans, Michael
The robustness to the prior of Bayesian inference procedures based on a measure of statistical evidence is considered. These inferences are shown to have optimal properties with respect to robustness. Furthermore, a connection between robustness and prior-data conflict is established. In particular, the inferences are shown to be effectively robust when the choice of prior does not lead to prior-data conflict. When there is prior-data conflict, however, robustness may fail to hold.
DeYoreo, Maria; Reiter, Jerome P.; Hillygus, D. Sunshine
In some contexts, mixture models can fit certain variables well at the expense of others in ways beyond the analyst’s control. For example, when the data include some variables with non-trivial amounts of missing values, the mixture model may fit the marginal distributions of the nearly and fully complete variables at the expense of the variables with high fractions of missing data. Motivated by this setting, we present a mixture model for mixed ordinal and nominal data that splits variables into two groups, focus variables and remainder variables. The model allows the analyst to specify a rich sub-model for the...
Hart, Jeffrey D.; Choi, Taeryon
A nonparametric Bayes procedure is proposed for testing the fit of a parametric model for a distribution. Alternatives to the parametric model are kernel density estimates. Data splitting makes it possible to use kernel estimates for this purpose in a Bayesian setting. A kernel estimate indexed by bandwidth is computed from one part of the data, a training set, and then used as a model for the rest of the data, a validation set. A Bayes factor is calculated from the validation set by comparing the marginal for the kernel model with the marginal for the parametric model of interest....
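The data-splitting idea can be sketched concretely. In this simplified version — which is an assumption-laden stand-in, not the paper's procedure — the kernel model's marginal averages the validation likelihood over a uniform prior on a bandwidth grid, and a fixed N(0, 1) plays the parametric model (the paper would put a prior on its parameters):

```python
import numpy as np
from scipy import stats

def log_bf_kernel_vs_normal(train, valid, bandwidths):
    """Log Bayes factor: kernel alternative vs a fixed N(0,1) model.

    The kernel estimate is built from the training split and scored on the
    validation split; the bandwidth gets a uniform prior over the grid.
    """
    log_mls = []
    for h in bandwidths:
        # KDE from the training split, evaluated at validation points
        dens = stats.norm.pdf(valid[:, None], loc=train[None, :], scale=h).mean(axis=1)
        log_mls.append(np.sum(np.log(dens)))
    log_m_kernel = np.logaddexp.reduce(np.array(log_mls)) - np.log(len(bandwidths))
    log_m_param = np.sum(stats.norm.logpdf(valid))
    return log_m_kernel - log_m_param
```

Data that are clearly non-normal push the log Bayes factor well above zero, since the training-split kernel estimate tracks the true density far better than the misspecified parametric model.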
Xu, Yanxun; Thall, Peter F.; Müller, Peter; Reza, Mehran J.
We propose a Bayesian nonparametric utility-based group sequential design for a randomized clinical trial to compare a gel sealant to standard care for resolving air leaks after pulmonary resection. Clinically, resolving air leaks in the days soon after surgery is highly important, since longer resolution time produces undesirable complications that require extended hospitalization. The problem of comparing treatments is complicated by the fact that the resolution time distributions are skewed and multi-modal, so using means is misleading. We address these challenges by assuming Bayesian nonparametric probability models for the resolution time distributions and basing the comparative test on weighted means....
Pérez, María-Eglée; Pericchi, Luis Raúl; Ramírez, Isabel Cristina
We put forward the Scaled Beta2 (SBeta2) as a flexible and tractable family for modeling scales in both hierarchical and non-hierarchical settings. Various sensible alternatives to the overuse of vague Inverted Gamma priors have been proposed, mainly for hierarchical models. Several of these alternatives are particular cases of the SBeta2 or can be well approximated by it. This family of distributions can be obtained in closed form as a Gamma scale mixture of Gamma distributions, as the Student distribution can be obtained as a Gamma scale mixture of Normal variables. Members of the SBeta2 family arise as intrinsic priors and...
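The Gamma scale-mixture representation mentioned above gives an immediate sampler: draw a rate from a Gamma, then a Gamma given that rate. A short sketch (function name and parameterization choices are ours; the mean b·p/(q−1) for q > 1 follows from the scaled beta-prime form):

```python
import numpy as np

def sample_sbeta2(p, q, b, size, rng=None):
    """Draw from SBeta2(p, q, b) via its Gamma scale-mixture representation:
    lam ~ Gamma(q, rate=b), then x | lam ~ Gamma(p, rate=lam).
    """
    rng = np.random.default_rng(rng)
    lam = rng.gamma(shape=q, scale=1.0 / b, size=size)   # Gamma(q, rate=b)
    return rng.gamma(shape=p, size=size) / lam           # Gamma(p, rate=lam)
```

Integrating out lam reproduces the SBeta2 density proportional to x^(p-1) / (b + x)^(p+q), mirroring how the Student distribution arises as a Gamma scale mixture of normals.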
Banerjee, Sudipto
With the growing capabilities of Geographic Information Systems (GIS) and user-friendly software, statisticians today routinely encounter geographically referenced data containing observations from a large number of spatial locations and time points. Over the last decade, hierarchical spatiotemporal process models have become widely deployed statistical tools for researchers to better understand the complex nature of spatial and temporal variability. However, fitting hierarchical spatiotemporal models often involves expensive matrix computations whose complexity grows cubically in the number of spatial locations and time points, rendering such models infeasible for large data sets. This article offers a focused review of two...
Earls, Cecilia; Hooker, Giles
We propose a model for functional data registration that extends current inferential capabilities for unregistered data by providing a flexible probabilistic framework that 1) allows for functional prediction in the context of registration and 2) can be adapted to include smoothing and registration in one model. The proposed inferential framework is a Bayesian hierarchical model where the registered functions are modeled as Gaussian processes. To address the computational demands of inference in high-dimensional Bayesian models, we propose an adapted form of the variational Bayes algorithm for approximate inference that performs similarly to Markov Chain Monte Carlo (MCMC) sampling methods for...
Tak, Hyungsuk; Morris, Carl N.
A Beta-Binomial-Logit model is a Beta-Binomial model with covariate information incorporated via a logistic regression. Posterior propriety of a Bayesian Beta-Binomial-Logit model can be data-dependent for improper hyper-prior distributions. Various researchers in the literature have unknowingly used improper posterior distributions or have given incorrect statements about posterior propriety because checking posterior propriety can be challenging due to the complicated functional form of a Beta-Binomial-Logit model. We derive data-dependent necessary and sufficient conditions for posterior propriety within a class of hyper-prior distributions that encompass those used in previous studies. When a posterior is improper due to improper hyper-prior distributions, we suggest...
Wang, Min
We consider Bayesian approaches to the hypothesis testing problem in analysis-of-variance (ANOVA) models. With the aid of the singular value decomposition of the centered design matrix, we reparameterize the ANOVA models, which carry linear constraints for uniqueness, into a standard linear regression model without any constraint. We derive the Bayes factors based on mixtures of $g$-priors and study their consistency properties with a growing number of parameters. It is shown that two commonly used hyper-priors on $g$ (the Zellner–Siow prior and the beta-prime prior) yield inconsistent Bayes factors due to the presence of an inconsistency region around the null...
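The SVD reparameterization step can be sketched directly: center the group-indicator design matrix (which drops its rank by one, absorbing the sum-to-zero constraint) and take the left singular vectors as an unconstrained orthonormal design. A minimal illustration with hypothetical names:

```python
import numpy as np

def anova_reparam(groups):
    """Unconstrained design from a one-way ANOVA indicator matrix.

    groups: integer labels (n,). Centers the dummy matrix and returns the
    (n, k-1) matrix of left singular vectors spanning the effect space.
    """
    groups = np.asarray(groups)
    k = groups.max() + 1
    Z = np.eye(k)[groups]                 # indicator (dummy) matrix
    Zc = Z - Z.mean(axis=0)               # column-centered design
    U, s, Vt = np.linalg.svd(Zc, full_matrices=False)
    r = int(np.sum(s > 1e-10))            # rank is k - 1 when all groups occur
    return U[:, :r]
```

Regressing on these k−1 orthonormal columns is equivalent to the constrained ANOVA fit, which is what lets the Bayes factors be derived in the standard linear-regression form.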
Anacleto, Osvaldo; Queen, Catriona
This paper introduces a new class of Bayesian dynamic models for inference and forecasting in high-dimensional time series observed on networks. The new model, called the dynamic chain graph model, is suitable for multivariate time series which exhibit symmetries within subsets of series and a causal drive mechanism between these subsets. The model can accommodate high-dimensional, non-linear and non-normal time series and enables local and parallel computation by decomposing the multivariate problem into separate, simpler sub-problems of lower dimensions. The advantages of the new model are illustrated by forecasting traffic network flows and also modelling gene expression data from transcriptional...
Turek, Daniel; de Valpine, Perry; Paciorek, Christopher J.; Anderson-Bergman, Clifford
Markov chain Monte Carlo (MCMC) sampling is an important and commonly used tool for the analysis of hierarchical models. Nevertheless, practitioners generally have two options for MCMC: use existing software that generates a black-box “one size fits all” algorithm, or undertake the challenging (and time-consuming) task of implementing a problem-specific MCMC algorithm. Either choice may result in inefficient sampling, and hence researchers have become accustomed to MCMC runtimes on the order of days (or longer) for large models. We propose an automated procedure to determine an efficient MCMC block-sampling algorithm for a given model and computing platform. Our procedure dynamically...
Whitaker, Gavin A.; Golightly, Andrew; Boys, Richard J.; Sherlock, Chris
Stochastic differential equations (SDEs) provide a natural framework for modelling the stochasticity inherent in many continuous-time physical processes. When such processes are observed in multiple individuals or experimental units, SDE-driven mixed-effects models allow the quantification of both between- and within-individual variation. Performing Bayesian inference for such models using discrete-time data that may be incomplete and subject to measurement error is a challenging problem and is the focus of this paper. We extend a recently proposed MCMC scheme to include the SDE-driven mixed-effects framework. Fundamental to our approach is the development of a novel construct that allows for...
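The mixed-effects structure can be illustrated by forward simulation: each unit gets its own SDE parameter drawn from a population distribution, and paths are generated by Euler–Maruyama discretization. The Ornstein–Uhlenbeck drift and the lognormal random-effects law below are illustrative choices, not the paper's model:

```python
import numpy as np

def simulate_ou_mixed(n_units, n_steps, dt, rng=None):
    """Euler-Maruyama paths of dX = -theta_i * X dt + sigma dW,
    with unit-specific rates theta_i drawn from a lognormal population
    distribution (between-individual variation) and diffusion noise
    (within-individual variation).
    """
    rng = np.random.default_rng(rng)
    theta = rng.lognormal(mean=0.0, sigma=0.3, size=n_units)  # random effects
    sigma = 0.5
    X = np.zeros((n_units, n_steps + 1))
    X[:, 0] = 1.0
    for t in range(n_steps):
        dW = rng.normal(scale=np.sqrt(dt), size=n_units)
        X[:, t + 1] = X[:, t] - theta * X[:, t] * dt + sigma * dW
    return X, theta
```

Inference reverses this generative process: given noisy, possibly incomplete discrete-time observations of the paths, one targets the posterior over both the unit-level theta_i and the population parameters, which is where the data-augmentation MCMC machinery is needed.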
Ruggeri, Fabrizio; Sawlan, Zaid; Scavino, Marco; Tempone, Raul
In this work we develop a Bayesian setting to infer unknown parameters in initial-boundary value problems related to linear parabolic partial differential equations. We realistically assume that the boundary data are noisy, for a given prescribed initial condition. We show how to derive the joint likelihood function for the forward problem, given some measurements of the solution field subject to Gaussian noise. Given Gaussian priors for the time-dependent Dirichlet boundary values, we analytically marginalize the joint likelihood using the linearity of the equation. Our hierarchical Bayesian approach is fully implemented in an example that involves the heat equation. In this...
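The analytic marginalization described above rests on the conjugacy of Gaussian priors with a linear forward map. A generic sketch of that conjugate update, with the forward matrix G standing in for the discretized parabolic solver (all names are ours):

```python
import numpy as np

def gaussian_posterior(G, y, noise_var, m0, C0):
    """Posterior N(m, C) for u in y = G u + eps, eps ~ N(0, noise_var * I),
    with prior u ~ N(m0, C0): the standard conjugate Gaussian update.
    """
    C0_inv = np.linalg.inv(C0)
    prec = C0_inv + (G.T @ G) / noise_var          # posterior precision
    C = np.linalg.inv(prec)
    m = C @ (C0_inv @ m0 + G.T @ y / noise_var)    # posterior mean
    return m, C
```

Because the heat equation is linear in its Dirichlet boundary data, the map from boundary values to interior measurements plays the role of G, and the time-dependent boundary values can be integrated out in closed form exactly as in this finite-dimensional analogue.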
Jo, Seongil; Lee, Jaeyong; Müller, Peter; Quintana, Fernando A.; Trippa, Lorenzo
We consider a novel Bayesian nonparametric model for density estimation with an underlying spatial structure. The model is built on a class of species sampling models, which are discrete random probability measures that can be represented as a mixture of random support points and random weights. Specifically, we construct a collection of spatially dependent species sampling models and propose a mixture model based on this collection. The key idea is the introduction of spatial dependence by modeling the weights through a conditional autoregressive model. We present an extensive simulation study to compare the performance of the proposed model with competitors....