Showing resources 1 - 20 of 74

  1. Prior Distributions for Objective Bayesian Analysis

    Consonni, Guido; Fouskakis, Dimitris; Liseo, Brunero; Ntzoufras, Ioannis
    We provide a review of prior distributions for objective Bayesian analysis. We start by examining some foundational issues and then organize our exposition into priors for: i) estimation or prediction; ii) model selection; iii) high-dimensional models. With regard to i), we present some basic notions, and then move to more recent contributions on discrete parameter space, hierarchical models, nonparametric models, and penalizing complexity priors. Point ii) is the focus of this paper: it discusses principles for objective Bayesian model comparison, and singles out some major concepts for building priors, which are subsequently illustrated in some detail for the classic problem...

  2. Bayesian Cluster Analysis: Point Estimation and Credible Balls (with Discussion)

    Wade, Sara; Ghahramani, Zoubin
    Clustering is widely studied in statistics and machine learning, with applications in a variety of fields. As opposed to popular algorithms such as agglomerative hierarchical clustering or k-means which return a single clustering solution, Bayesian nonparametric models provide a posterior over the entire space of partitions, allowing one to assess statistical properties, such as uncertainty on the number of clusters. However, an important problem is how to summarize the posterior; the huge dimension of partition space and difficulties in visualizing it add to this problem. In a Bayesian analysis, the posterior of a real-valued parameter of interest is often summarized...
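
    The decision-theoretic summary the authors develop can be sketched at toy scale: estimate a posterior similarity matrix from sampled partitions, then pick the sampled partition minimising a posterior expected loss. The sketch below uses Binder loss for brevity (the paper also develops the variation of information); all function names are illustrative, not the authors' code.

```python
from itertools import combinations

def psm(samples):
    """Posterior similarity matrix: estimated P(i, j in same cluster)."""
    n = len(samples[0])
    m = [[0.0] * n for _ in range(n)]
    for z in samples:
        for i in range(n):
            for j in range(n):
                if z[i] == z[j]:
                    m[i][j] += 1.0 / len(samples)
    return m

def binder_loss(z, m):
    """Posterior expected Binder loss of candidate partition z, given PSM m."""
    return sum(abs((z[i] == z[j]) - m[i][j])
               for i, j in combinations(range(len(z)), 2))

def point_estimate(samples):
    """Sampled partition minimising the posterior expected Binder loss."""
    m = psm(samples)
    return min(samples, key=lambda z: binder_loss(z, m))
```

    For example, with eight posterior draws of the partition [0, 0, 1] and two of [0, 1, 1], the minimiser is [0, 0, 1].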

  3. Modelling and Computation Using NCoRM Mixtures for Density Regression

    Griffin, Jim; Leisen, Fabrizio
    Normalized compound random measures are flexible nonparametric priors for related distributions. We consider building general nonparametric regression models using normalized compound random measure mixture models. Posterior inference is made using a novel pseudo-marginal Metropolis-Hastings sampler for normalized compound random measure mixture models. The algorithm makes use of a new general approach to the unbiased estimation of Laplace functionals of compound random measures (which includes completely random measures as a special case). The approach is illustrated on problems of density regression.

  4. Sampling Errors in Nested Sampling Parameter Estimation

    Higson, Edward; Handley, Will; Hobson, Mike; Lasenby, Anthony
    Sampling errors in nested sampling parameter estimation differ from those in Bayesian evidence calculation, but have been little studied in the literature. This paper provides the first explanation of the two main sources of sampling errors in nested sampling parameter estimation, and presents a new diagrammatic representation for the process. We find no current method can accurately measure the parameter estimation errors of a single nested sampling run, and propose a method for doing so using a new algorithm for dividing nested sampling runs. We empirically verify our conclusions and the accuracy of our new method.

  5. A New Regression Model for Bounded Responses

    Migliorati, Sonia; Di Brisco, Agnese Maria; Ongaro, Andrea
    The aim of this contribution is to propose a new regression model for continuous variables bounded to the unit interval (e.g. proportions), based on the flexible beta (FB) distribution. The latter is a special mixture of two betas, which greatly extends the shapes of the beta distribution, mainly in terms of asymmetry, bimodality and heavy-tail behaviour. Its special mixture structure ensures good theoretical properties, such as strong identifiability and likelihood boundedness, that are quite uncommon for mixture models. Moreover, it makes the model computationally very tractable, including within the Bayesian framework adopted here. At the same time, the FB regression model displays...

  6. Variable Selection via Penalized Credible Regions with Dirichlet–Laplace Global-Local Shrinkage Priors

    Zhang, Yan; Bondell, Howard D.
    The method of Bayesian variable selection via penalized credible regions separates model fitting and variable selection. The idea is to search for the sparsest solution within the joint posterior credible regions. Although the approach was successful, it depended on the use of conjugate normal priors. More recently, improvements in the use of global-local shrinkage priors have been made for high-dimensional Bayesian variable selection. In this paper, we incorporate global-local priors into the credible region selection framework. The Dirichlet–Laplace (DL) prior is adapted to linear regression. Posterior consistency for the normal and DL priors is shown, along with variable selection consistency....

  7. Sampling Latent States for High-Dimensional Non-Linear State Space Models with the Embedded HMM Method

    Shestopaloff, Alexander Y.; Neal, Radford M.
    We propose a new scheme for selecting pool states for the embedded Hidden Markov Model (HMM) Markov Chain Monte Carlo (MCMC) method. This new scheme allows the embedded HMM method to be used for efficient sampling in state-space models where the state can be high-dimensional. Previously, embedded HMM methods were only applicable to low-dimensional state-space models. We demonstrate that using our proposed pool state selection scheme, an embedded HMM sampler can have similar performance to a well-tuned sampler that uses a combination of Particle Gibbs with Backward Sampling (PGBS) and Metropolis updates. The scaling to higher dimensions is made...

  8. Bayesian Community Detection

    van der Pas, S. L.; van der Vaart, A. W.
    We introduce a Bayesian estimator of the underlying class structure in the stochastic block model, when the number of classes is known. The estimator is the posterior mode corresponding to a Dirichlet prior on the class proportions, a generalized Bernoulli prior on the class labels, and a beta prior on the edge probabilities. We show that this estimator is strongly consistent when the expected degree is at least of order $\log^{2}{n}$, where $n$ is the number of nodes in the network.

  9. Specification of Informative Prior Distributions for Multinomial Models Using Vine Copulas

    Wilson, Kevin James
    We consider the specification of an informative prior distribution for the probabilities in a multinomial model. We utilise vine copulas: flexible multivariate distributions built using bivariate copulas stacked in a tree structure. We take advantage of a specific vine structure, called a D-vine, to separate the specification of the multivariate prior distribution into that of marginal distributions for the probabilities and parameter values for the bivariate copulas in the vine. We provide guidance on each of the choices to be made in the prior specification and each of the questions to ask the expert to specify the model parameters within...

  10. Power-Expected-Posterior Priors for Generalized Linear Models

    Fouskakis, Dimitris; Ntzoufras, Ioannis; Perrakis, Konstantinos
    The power-expected-posterior (PEP) prior provides an objective, automatic, consistent and parsimonious model selection procedure. At the same time it resolves the conceptual and computational problems due to the use of imaginary data. Namely, (i) it dispenses with the need to select and average across all possible minimal imaginary samples, and (ii) it diminishes the effect that the imaginary data have upon the posterior distribution. These attributes allow for large sample approximations, when needed, in order to reduce the computational burden under more complex models. In this work we generalize the applicability of the PEP methodology, focusing on the framework of...

  11. Some Aspects of Symmetric Gamma Process Mixtures

    Naulet, Zacharie; Barat, Éric
    In this article, we present some specific aspects of symmetric Gamma process mixtures for use in regression models. First, we propose a new Gibbs sampler for simulating the posterior. The algorithm is tested on two examples: the mean regression problem with normal errors, and the reconstruction of two-dimensional CT images. Second, we establish posterior rates of convergence for the mean regression problem with normal errors. For location-scale and location-modulation mixtures the rates are adaptive over Hölder classes, and in the case of location-modulation mixtures they are nearly optimal.

  12. Posterior Predictive p-Values with Fisher Randomization Tests in Noncompliance Settings: Test Statistics vs Discrepancy Measures

    Forastiere, Laura; Mealli, Fabrizia; Miratrix, Luke
    In randomized experiments with noncompliance, one might wish to focus on compliers rather than on the overall sample. In this vein, Rubin (1998) argued that testing for the complier average causal effect and averaging permutation-based $p$-values over the posterior distribution of the compliance types could increase power as compared to general intent-to-treat tests. The general scheme is a repeated two-step process: impute missing compliance types and conduct a permutation test with the completed data. In this paper, we explore this idea further, comparing the use of discrepancy measures—which depend on unknown but imputed parameters—to classical test statistics and contrasting...
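
    The inner step of the two-step scheme, a Fisher randomization test on completed data, can be sketched as follows. The imputation step and compliance strata are omitted here, and a simple difference in means stands in for the test statistic; function names are illustrative.

```python
import random

def fisher_p_value(treat, outcome, stat, draws=2000, seed=0):
    """Monte Carlo Fisher randomization test: re-randomize treatment labels
    under the sharp null of no effect and count how often the statistic
    meets or exceeds its observed value."""
    rng = random.Random(seed)
    observed = stat(treat, outcome)
    hits = 0
    for _ in range(draws):
        perm = treat[:]
        rng.shuffle(perm)
        if stat(perm, outcome) >= observed:
            hits += 1
    return (hits + 1) / (draws + 1)

def diff_in_means(treat, outcome):
    """Difference in mean outcomes between treated and control units."""
    t = [y for z, y in zip(treat, outcome) if z == 1]
    c = [y for z, y in zip(treat, outcome) if z == 0]
    return sum(t) / len(t) - sum(c) / len(c)
```

    A large treatment effect yields a small p-value; identical outcomes in both arms yield a p-value of 1.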

  13. Modeling Skewed Spatial Data Using a Convolution of Gaussian and Log-Gaussian Processes

    Zareifard, Hamid; Khaledi, Majid Jafari; Rivaz, Firoozeh; Vahidi-Asl, Mohammad Q.
    In spatial statistics, it is usual to consider a Gaussian process for spatial latent variables. As the data often exhibit non-normality, we introduce a novel skew process, hereafter named the Gaussian-log Gaussian convolution (GLGC), to construct latent spatial models which provide great flexibility in capturing skewness. Some properties, including closed-form expressions for the moments and the skewness of the GLGC process, are derived. In particular, we show that the mean square continuity and differentiability of the GLGC process are established by those of the Gaussian and log-Gaussian processes considered in its structure. Moreover, the usefulness of the proposed approach is demonstrated through...

  14. Merging MCMC Subposteriors through Gaussian-Process Approximations

    Nemeth, Christopher; Sherlock, Chris
    Markov chain Monte Carlo (MCMC) algorithms have become powerful tools for Bayesian inference. However, they do not scale well to large-data problems. Divide-and-conquer strategies, which split the data into batches and, for each batch, run independent MCMC algorithms targeting the corresponding subposterior, can spread the computational burden across a number of separate computer cores. The challenge with such strategies is in recombining the subposteriors to approximate the full posterior. By creating a Gaussian-process approximation for each log-subposterior density we create a tractable approximation for the full posterior. This approximation is exploited through three methodologies: firstly a Hamiltonian Monte Carlo algorithm...
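
    The recombination step can be illustrated on a conjugate toy model in which every subposterior is exactly Gaussian, so the product recombination has a closed form; the paper's Gaussian-process approximation of each log-subposterior is what replaces this closed form when subposteriors are known only through MCMC samples. Everything below is an illustrative sketch, not the authors' code.

```python
import random

# Toy divide-and-conquer: y_i ~ N(theta, 1) with prior theta ~ N(0, 10^2).
# Each batch targets the subposterior p(theta)^(1/B) * L_b(theta), which is
# Gaussian for this conjugate model.
rng = random.Random(1)
y = [rng.gauss(2.0, 1.0) for _ in range(200)]
B = 4
batches = [y[i::B] for i in range(B)]

prior_prec = 1.0 / 10**2
sub = []
for b in batches:
    prec = prior_prec / B + len(b)      # subposterior precision (unit obs var)
    sub.append((sum(b) / prec, prec))   # (subposterior mean, precision)

# Product of Gaussians: precisions add; means combine precision-weighted.
full_prec = sum(p for _, p in sub)
full_mean = sum(m * p for m, p in sub) / full_prec

# Full-data posterior computed directly, for comparison.
direct_prec = prior_prec + len(y)
direct_mean = sum(y) / direct_prec
```

    The recombined mean and precision match the full-data posterior up to floating-point error, which is exactly the property the divide-and-conquer strategy relies on.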

  15. Variational Hamiltonian Monte Carlo via Score Matching

    Zhang, Cheng; Shahbaba, Babak; Zhao, Hongkai
    Traditionally, the field of computational Bayesian statistics has been divided into two main subfields: variational methods and Markov chain Monte Carlo (MCMC). In recent years, however, several methods have been proposed based on combining variational Bayesian inference and MCMC simulation in order to improve their overall accuracy and computational efficiency. This marriage of fast evaluation and flexible approximation provides a promising means of designing scalable Bayesian inference methods. In this paper, we explore the possibility of incorporating variational approximation into a state-of-the-art MCMC method, Hamiltonian Monte Carlo (HMC), to reduce the required expensive computation involved in the sampling procedure, which...

  16. Testing Un-Separated Hypotheses by Estimating a Distance

    Salomond, Jean-Bernard
    In this paper we propose a Bayesian answer to testing problems when the hypotheses are not well separated. The idea of the method is to study the posterior distribution of a discrepancy measure between the parameter and the model we want to test for. This is shown to be equivalent to a modification of the testing loss. An advantage of this approach is that it can easily be adapted to complex hypotheses that are in general difficult to test for. Asymptotic properties of the test can be derived from the asymptotic behaviour of the posterior distribution of the discrepancy...
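
    The idea can be sketched on a toy conjugate model: rather than computing a posterior probability of a point null, examine the posterior of a discrepancy such as $d(\theta) = |\theta - \theta_0|$ and accept the null when that discrepancy is small. The tolerance, level, and model below are illustrative assumptions, not the paper's calibration.

```python
import random

# Toy: test H0: theta = 0 under y_i ~ N(theta, 1), theta ~ N(0, tau^2),
# by drawing from the (conjugate, hence exact) posterior and looking at
# the posterior distribution of the discrepancy |theta|.
def near_null_accepted(y, tau=10.0, eps=0.25, level=0.95, draws=5000, seed=0):
    n = len(y)
    post_prec = 1.0 / tau**2 + n
    post_mean = sum(y) / post_prec
    post_sd = (1.0 / post_prec) ** 0.5
    rng = random.Random(seed)
    d = sorted(abs(rng.gauss(post_mean, post_sd)) for _ in range(draws))
    # Accept H0 when the `level` posterior quantile of the discrepancy
    # falls below the practical-equivalence tolerance eps.
    return d[int(level * draws)] < eps
```

    Data concentrated at 0 leads to acceptance; data concentrated at 1 leads to rejection, even though the alternative is not separated from the null.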

  17. Efficient Model Comparison Techniques for Models Requiring Large Scale Data Augmentation

    Touloupou, Panayiota; Alzahrani, Naif; Neal, Peter; Spencer, Simon E. F.; McKinley, Trevelyan J.
    Selecting between competing statistical models is a challenging problem especially when the competing models are non-nested. In this paper we offer a simple solution by devising an algorithm which combines MCMC and importance sampling to obtain computationally efficient estimates of the marginal likelihood which can then be used to compare the models. The algorithm is successfully applied to a longitudinal epidemic data set, where calculating the marginal likelihood is made more challenging by the presence of large amounts of missing data. In this context, our importance sampling approach is shown to outperform existing methods for computing the marginal likelihood.
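
    The marginal-likelihood estimation idea can be checked on a conjugate toy where the answer is analytic. This is a generic importance-sampling sketch, not the paper's MCMC-within-importance-sampling algorithm; the proposal choice and model are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist

# Toy: y_i ~ N(theta, 1) with theta ~ N(0, 1). The posterior is
# N(sum(y)/(n+1), 1/(n+1)), and the exact log marginal likelihood follows
# from the identity m(y) = p(theta) p(y|theta) / p(theta|y) at any theta.
rng = random.Random(0)
y = [rng.gauss(1.0, 1.0) for _ in range(20)]
n = len(y)

prior = NormalDist(0.0, 1.0)
post_mean = sum(y) / (n + 1)
post_sd = (1.0 / (n + 1)) ** 0.5
post = NormalDist(post_mean, post_sd)

def log_joint(theta):
    """Log prior plus log likelihood (unit observation variance)."""
    ll = -0.5 * n * math.log(2 * math.pi) - 0.5 * sum((v - theta) ** 2 for v in y)
    return math.log(prior.pdf(theta)) + ll

log_m_exact = log_joint(0.0) - math.log(post.pdf(0.0))

# Importance sampling with a deliberately over-dispersed Gaussian proposal,
# combined via log-sum-exp for numerical stability.
proposal = NormalDist(post_mean, 1.5 * post_sd)
draws = 20000
log_w = [log_joint(t) - math.log(proposal.pdf(t))
         for t in proposal.samples(draws, seed=1)]
top = max(log_w)
log_m_is = top + math.log(sum(math.exp(w - top) for w in log_w) / draws)
```

    With a proposal close to the posterior, the estimator has low variance and agrees closely with the analytic value; in the paper's setting the proposal is instead built from MCMC output over the augmented (missing) data.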

  18. Bayesian Analysis of RNA-Seq Data Using a Family of Negative Binomial Models

    Zhao, Lili; Wu, Weisheng; Feng, Dai; Jiang, Hui; Nguyen, XuanLong
    The analysis of RNA-Seq data has focused on three main categories: gene expression, relative exon usage and transcript expression. Methods have been proposed independently for each category using a negative binomial (NB) model. However, counts following a NB distribution on one feature (e.g., exon) do not guarantee a NB distribution for the other two features (e.g., gene/transcript). In this paper we propose a family of negative binomial models, which integrates the gene, exon and transcript analysis under a coherent NB model. The proposed model easily incorporates the uncertainty of assigning reads to transcripts and substantially simplifies the estimation...

  19. Sequential Bayesian Analysis of Multivariate Count Data

    Aktekin, Tevfik; Polson, Nick; Soyer, Refik
    We develop a new class of dynamic multivariate Poisson count models that allow for fast online updating. We refer to this class as multivariate Poisson-scaled beta (MPSB) models. The MPSB model allows for serial dependence in count data as well as dependence with a random common environment across time series. Notable features of our model are analytic forms for state propagation, predictive likelihood densities, and sequential updating via sufficient statistics for the static model parameters. Our approach leads to a fully adapted particle learning algorithm and a new class of predictive likelihoods and marginal distributions which we refer to as...

  20. On the Use of Cauchy Prior Distributions for Bayesian Logistic Regression

    Ghosh, Joyee; Li, Yingbo; Mitra, Robin
    In logistic regression, separation occurs when a linear combination of the predictors can perfectly classify part or all of the observations in the sample, and as a result, finite maximum likelihood estimates of the regression coefficients do not exist. Gelman et al. (2008) recommended independent Cauchy distributions as default priors for the regression coefficients in logistic regression, even in the case of separation, and reported posterior modes in their analyses. As the mean does not exist for the Cauchy prior, a natural question is whether the posterior means of the regression coefficients exist under separation. We prove theorems that provide...
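
    The phenomenon is easy to reproduce numerically: under complete separation the log-likelihood increases monotonically in the coefficient, so the MLE escapes to infinity, while adding a Cauchy log-prior produces a finite posterior mode. A minimal one-predictor sketch, using a crude grid search rather than an optimiser; the scale 2.5 follows the Gelman et al. (2008) default.

```python
import math

# Perfectly separated data: y is determined by the sign of x.
x = [-1.0, -1.0, 1.0, 1.0]
y = [0, 0, 1, 1]

def log_lik(beta):
    """Logistic log-likelihood for a single predictor, no intercept."""
    s = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-beta * xi))
        s += math.log(p if yi == 1 else 1.0 - p)
    return s

def log_post(beta, scale=2.5):
    """Log-likelihood plus independent Cauchy(0, scale) log-prior."""
    return log_lik(beta) - math.log(1.0 + (beta / scale) ** 2)

grid = [i * 0.01 for i in range(2001)]   # beta in [0, 20]
mle = max(grid, key=log_lik)     # climbs to the edge of any finite grid
mode = max(grid, key=log_post)   # finite interior maximiser
```

    On this grid the unpenalized maximiser sits at the boundary (and would keep growing on any larger grid), whereas the Cauchy-penalized mode is a finite interior point near 2.2.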
