Xing, Yang
We study the rate of Bayesian consistency for hierarchical priors consisting of prior weights on a model index set and a prior on a density model for each choice of model index. Ghosal, Lember and Van der Vaart [2] have obtained general in-probability theorems on the rate of convergence of the resulting posterior distributions. We extend their results to almost sure assertions. As an application we study log spline densities with a finite number of models and show that the Bayes procedure achieves the optimal minimax rate n^{−γ/(2γ+1)} of convergence if the true density of the observations belongs to the...
Li, Jiaona; Peng, Zuoxiang; Nadarajah, Saralees
Based on the methods provided in Caeiro and Gomes (2002) and Fraga Alves (2001), a new class of location-invariant Hill-type estimators is derived in this paper. Its asymptotic distributional representation and asymptotic normality are presented, and the optimal choice of sample fraction by Mean Squared Error is also discussed for some special cases. Finally, comparison studies are provided for some familiar models by Monte Carlo simulations.
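As background for this abstract, the classical Hill estimator (not the paper's location-invariant variant) can be sketched in a few lines; the function name `hill_estimator` and the Pareto test case are illustrative choices, not taken from the paper.

```python
import numpy as np

def hill_estimator(sample, k):
    """Classical Hill estimator of the tail index gamma, based on the
    k largest order statistics: the mean log-excess over X_(k+1)."""
    x = np.sort(sample)[::-1]           # descending order statistics
    logs = np.log(x[:k + 1])
    return np.mean(logs[:k] - logs[k])

rng = np.random.default_rng(0)
pareto = rng.pareto(2.0, size=5000) + 1.0   # Pareto(alpha=2) tail, gamma = 1/2
gamma_hat = hill_estimator(pareto, k=200)
```

For an exact Pareto tail with index alpha = 2 the estimate should sit near gamma = 1/2; the location-invariant estimators studied in the paper modify this construction to remove sensitivity to shifts of the data.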
Johnson, Oliver
We analyse the properties of the Principal Fitted Components (PFC) algorithm proposed by Cook. We derive theoretical properties of the resulting estimators, including sufficient conditions under which they are $\sqrt{n}$ -consistent, and explain some of the simulation results given in Cook’s paper. We use techniques from random matrix theory and perturbation theory. We argue that, under Cook’s model at least, the PFC algorithm should outperform the Principal Components algorithm.
Epifani, Ilenia; MacEachern, Steven N.; Peruggia, Mario
Case-deleted analysis is a popular method for evaluating the influence of a subset of cases on inference. The use of Monte Carlo estimation strategies in complicated Bayesian settings leads naturally to the use of importance sampling techniques to assess the divergence between full-data and case-deleted posteriors and to provide estimates under the case-deleted posteriors. However, the dependability of the importance sampling estimators depends critically on the variability of the case-deleted weights. We provide theoretical results concerning the assessment of the dependability of case-deleted importance sampling estimators in several Bayesian models. In particular, these results allow us to establish whether or...
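The importance-sampling mechanism described here can be illustrated on a toy normal model (this example, including the flat prior and the seed, is an assumption of ours, not the authors' setup): deleting case i divides the likelihood by a single factor, so weights proportional to 1/f(y_i | theta) convert full-data posterior draws into case-deleted estimates.

```python
import numpy as np

# Toy model: y_i ~ N(theta, 1) with a flat prior on theta.
rng = np.random.default_rng(1)
y = rng.normal(0.0, 1.0, size=50)
n = y.size

# Draws from the full-data posterior theta | y ~ N(ybar, 1/n).
theta = rng.normal(y.mean(), 1.0 / np.sqrt(n), size=10_000)

# Case-deleted importance weights: w_s proportional to 1/f(y_i | theta_s).
i = 0
log_w = 0.5 * (y[i] - theta) ** 2       # minus log N(y_i | theta, 1), up to a constant
w = np.exp(log_w - log_w.max())
w /= w.sum()

# Importance-sampling estimate of the case-deleted posterior mean.
theta_del = np.sum(w * theta)
exact_del = np.delete(y, i).mean()      # exact case-deleted posterior mean here
```

In this conjugate toy example the exact case-deleted posterior mean is available, so the weighted estimate can be checked directly; the paper's contribution concerns when such weights are dependable (e.g., have finite variance) in genuinely complicated models.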
Lecué, Guillaume
We consider the classification problem on the cube [0,1]^{d} when the Bayes rule is known to belong to some new function classes. These classes are made of prediction rules satisfying some conditions regarding their coefficients when developed over the (overcomplete) basis of indicator functions of dyadic cubes of [0,1]^{d}. The main concern of the paper is the thorough analysis of the approximation term, which is in general bypassed in the classification literature. An adaptive classifier is designed to achieve the minimax rate of convergence (up to a logarithmic factor) over these function classes. Lower bounds on the convergence rate...
Kulik, Rafał
We consider the nonparametric estimation of the density function of weakly and strongly dependent processes with noisy observations. We show that in the ordinary smooth case the optimal bandwidth choice can be influenced by long range dependence, as opposed to the standard case when no noise is present. In particular, if the dependence is moderate, the bandwidth, the rates of mean-square convergence and, additionally, the central limit theorem are the same as in the i.i.d. case. If the dependence is strong enough, then the bandwidth choice is influenced by the strength of dependence, which differs from the non-noisy...
Böhm, Hilmar; von Sachs, Rainer
In this paper we investigate the performance of periodogram based estimators of the spectral density matrix of possibly high-dimensional time series. We suggest and study shrinkage as a remedy against numerical instabilities due to deteriorating condition numbers of (kernel) smoothed periodogram matrices. Moreover, shrinking the empirical eigenvalues in the frequency domain towards one another simultaneously improves the Mean Squared Error (MSE) of these widely used nonparametric spectral estimators. Compared to some existing time domain approaches, which are restricted to i.i.d. data, in the frequency domain it is necessary to take the size of the smoothing span as “effective...
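The basic shrinkage idea can be sketched as follows; note this is a generic caricature in which a real ill-conditioned covariance-type matrix stands in for the (complex Hermitian) smoothed periodogram matrix, and the fixed shrinkage weight `rho` is an assumption, not the paper's data-driven choice.

```python
import numpy as np

def shrink_eigenvalues(S, rho):
    """Shrink the eigenvalues of a symmetric matrix S towards their
    mean by a factor rho in [0, 1], keeping the eigenvectors fixed."""
    vals, vecs = np.linalg.eigh(S)
    shrunk = (1.0 - rho) * vals + rho * vals.mean()
    return (vecs * shrunk) @ vecs.T

rng = np.random.default_rng(2)
X = rng.normal(size=(20, 50))            # p = 20 components, few observations
S = X @ X.T / 50                         # ill-conditioned empirical matrix
S_shrunk = shrink_eigenvalues(S, rho=0.5)

cond_before = np.linalg.cond(S)
cond_after = np.linalg.cond(S_shrunk)
```

Pulling the eigenvalue spread toward its mean necessarily reduces the condition number, which is the numerical-stability benefit the abstract refers to; the MSE improvement is the statistical side of the same operation.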
Huang, Jianhua Z.; Shen, Haipeng; Buja, Andreas
Two existing approaches to functional principal components analysis (FPCA) are due to Rice and Silverman (1991) and Silverman (1996), both based on maximizing variance but introducing penalization in different ways. In this article we propose an alternative approach to FPCA using penalized rank one approximation to the data matrix. Our contributions are four-fold: (1) by considering invariance under scale transformation of the measurements, the new formulation sheds light on how regularization should be performed for FPCA and suggests an efficient power algorithm for computation; (2) it naturally incorporates spline smoothing of discretized functional data; (3) the connection with smoothing splines...
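The power algorithm mentioned in contribution (1) alternates two normalized matrix-vector products; the unpenalized version below is a minimal sketch (the paper's algorithm inserts penalty/smoothing terms into each update, which we omit here).

```python
import numpy as np

def rank_one_power(X, n_iter=100):
    """Alternating power algorithm for the best rank-one approximation
    X ~ d * u v^T. Penalized variants would replace each normalization
    by a penalized linear solve."""
    v = np.ones(X.shape[1]) / np.sqrt(X.shape[1])
    for _ in range(n_iter):
        u = X @ v
        u /= np.linalg.norm(u)
        v = X.T @ u
        v /= np.linalg.norm(v)
    d = u @ X @ v
    return d, u, v

rng = np.random.default_rng(3)
X = rng.normal(size=(30, 8))
d, u, v = rank_one_power(X)
```

At convergence d, u, v recover the leading singular triple of X, which is the sense in which rank-one approximation underlies FPCA.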
Loubes, Jean-Michel; Ludeña, Carenne
We tackle the problem of building adaptive estimation procedures for ill-posed inverse problems. For general regularization methods depending on tuning parameters, we construct a penalized method that selects the optimal smoothing sequence without prior knowledge of the regularity of the function to be estimated. We provide for such estimators oracle inequalities and optimal rates of convergence. This penalized approach is applied to Tikhonov regularization and to regularization by projection.
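Tikhonov regularization, one of the two applications named here, can be shown in a self-contained toy example (the nearly singular design, the noise level and the fixed `lam` are our illustrative assumptions; the paper's point is precisely how to select such a tuning parameter adaptively).

```python
import numpy as np

def tikhonov(A, y, lam):
    """Tikhonov-regularized solution of the ill-posed system A x = y:
    minimizes ||A x - y||^2 + lam * ||x||^2."""
    p = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(p), A.T @ y)

rng = np.random.default_rng(4)
A = rng.normal(size=(40, 40))
A[:, -1] = A[:, 0] + 1e-8 * rng.normal(size=40)   # nearly singular design
x_true = rng.normal(size=40)
y = A @ x_true + 0.01 * rng.normal(size=40)

x_reg = tikhonov(A, y, lam=1e-3)
x_naive = np.linalg.lstsq(A, y, rcond=None)[0]    # unregularized solve blows up
```

The unregularized solution amplifies the noise along the near-null direction of A by many orders of magnitude, while the Tikhonov solution stays bounded; choosing `lam` without knowing the regularity of x_true is the adaptivity problem the paper addresses.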
Tribble, Seth D.; Owen, Art B.
In Markov chain Monte Carlo (MCMC) sampling considerable thought goes into constructing random transitions. But those transitions are almost always driven by a simulated IID sequence. Recently it has been shown that replacing an IID sequence by a weakly completely uniformly distributed (WCUD) sequence leads to consistent estimation in finite state spaces. Unfortunately, few WCUD sequences are known. This paper gives general methods for proving that a sequence is WCUD, shows that some specific sequences are WCUD, and shows that certain operations on WCUD sequences yield new WCUD sequences. A numerical example on a 42-dimensional continuous Gibbs sampler found...
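To make the idea of a deterministic driving sequence concrete, here is the base-2 van der Corput sequence, a classical low-discrepancy sequence on (0, 1). This is only an illustration of a non-IID uniform driving sequence; it is not one of the WCUD constructions analysed in the paper.

```python
import numpy as np

def van_der_corput(n, base=2):
    """First n points of the base-b van der Corput (radical inverse)
    sequence, a deterministic equidistributed sequence on (0, 1)."""
    seq = np.empty(n)
    for i in range(n):
        x, denom, k = 0.0, 1.0, i + 1
        while k > 0:
            k, digit = divmod(k, base)
            denom *= base
            x += digit / denom
        seq[i] = x
    return seq

u = van_der_corput(1024)
# The empirical CDF of the points is very close to the uniform CDF.
discrepancy = np.max(np.abs(np.sort(u) - (np.arange(1, 1025) - 0.5) / 1024))
```

Its discrepancy decays like (log n)/n rather than the n^{-1/2} rate of IID uniforms, which is the kind of regularity that motivates driving Gibbs samplers with such sequences.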
Nardi, Yuval; Rinaldo, Alessandro
We establish estimation and model selection consistency, prediction and estimation bounds and persistence for the group-lasso estimator and model selector proposed by Yuan and Lin (2006) for least squares problems when the covariates have a natural grouping structure. We consider the case of a fixed-dimensional parameter space with increasing sample size and the double asymptotic scenario where the model complexity changes with the sample size.
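The mechanism by which the group lasso selects whole groups can be shown via the proximal operator of its penalty (this is a generic illustration of the penalty's effect, not a step from the paper's proofs; the coefficient vector and grouping below are made up).

```python
import numpy as np

def group_soft_threshold(beta, groups, lam):
    """Proximal operator of the group-lasso penalty: each group is
    shrunk jointly and set to zero entirely when its Euclidean norm
    falls below lam."""
    out = np.zeros_like(beta)
    for g in groups:
        norm = np.linalg.norm(beta[g])
        if norm > lam:
            out[g] = (1.0 - lam / norm) * beta[g]
    return out

beta = np.array([3.0, 4.0, 0.1, -0.1, 2.0])
groups = [[0, 1], [2, 3], [4]]
shrunk = group_soft_threshold(beta, groups, lam=1.0)
# Group [2, 3] has small joint norm and is dropped as a whole.
```

Because the shrinkage acts on the group norm, covariates enter or leave the model together, which is what makes model selection consistency at the group level possible.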
Van Keilegom, Ingrid; Sánchez Sellero, César; González Manteiga, Wenceslao
Consider a random vector (X,Y) and let m(x)=E(Y|X=x). We are interested in testing $H_{0}:m\in {\cal M}_{\Theta,{\cal G}}=\{\gamma(\cdot,\theta,g):\theta \in \Theta,g\in {\cal G}\}$ for some known function γ, some compact set $\Theta \subset I\!\!R^{p}$ and some function set ${\cal G}$ of real valued functions. Specific examples of this general hypothesis include testing for a parametric regression model, a generalized linear model, a partial linear model or a single index model; the selection of explanatory variables can also be considered as a special case of this hypothesis.
To test this null hypothesis, we make use of the so-called marked empirical process introduced by [4]...
Fokianos, Konstantinos
Inference based on the penalized density ratio model is proposed and studied. The model under consideration is specified by assuming that the log-likelihood ratio of two unknown densities is of some parametric form. The model has been extended to cover multiple samples problems while its theoretical properties have been investigated using large sample theory. A main application of the density ratio model is testing whether two, or more, distributions are equal. We extend these results by arguing that the penalized maximum empirical likelihood estimator has less mean square error than that of the ordinary maximum likelihood estimator, especially for small...
Giraud, Christophe
We investigate in this paper the estimation of Gaussian graphs by model selection from a non-asymptotic point of view. We start from an n-sample of a Gaussian law ℙ_{C} in ℝ^{p} and focus on the disadvantageous case where n is smaller than p. To estimate the graph of conditional dependences of ℙ_{C}, we introduce a collection of candidate graphs and then select one of them by minimizing a penalized empirical risk. Our main result assesses the performance of the procedure in a non-asymptotic setting. We pay special attention to the maximal degree D of the graphs that we can handle,...
Debbarh, Mohammed; Viallon, Vivian
It has been recently shown that nonparametric estimators of the additive regression function could be obtained in the presence of right censoring by coupling the marginal integration method with initial kernel-type Inverse Probability of Censoring Weighted estimators of the multivariate regression function [10]. In this paper, we get the exact rate of strong uniform consistency for such estimators. Our uniform limit laws especially lead to the construction of asymptotic simultaneous 100% confidence bands for the true regression function.
Rothman, Adam J.; Bickel, Peter J.; Levina, Elizaveta; Zhu, Ji
The paper proposes a method for constructing a sparse estimator for the inverse covariance (concentration) matrix in high-dimensional settings. The estimator uses a penalized normal likelihood approach and forces sparsity by using a lasso-type penalty. We establish a rate of convergence in the Frobenius norm as both data dimension p and sample size n are allowed to grow, and show that the rate depends explicitly on how sparse the true concentration matrix is. We also show that a correlation-based version of the method exhibits better rates in the operator norm. We also derive a fast iterative algorithm for computing the...
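As a caricature of the lasso-type sparsity mechanism for the concentration matrix (emphatically not the paper's penalized-likelihood estimator), one can invert a regularized sample covariance and soft-threshold its off-diagonal entries; the ridge constant, threshold level and tridiagonal truth below are all illustrative assumptions.

```python
import numpy as np

def naive_sparse_precision(X, lam):
    """Naive sparse concentration estimate: invert the (ridge-regularized)
    sample covariance, then soft-threshold off-diagonal entries. The
    penalized normal likelihood estimator of the paper is different;
    this only illustrates how an l1-type threshold creates zeros."""
    p = X.shape[1]
    S = np.cov(X, rowvar=False) + 1e-3 * np.eye(p)
    Theta = np.linalg.inv(S)
    out = np.sign(Theta) * np.maximum(np.abs(Theta) - lam, 0.0)
    np.fill_diagonal(out, np.diag(Theta))   # leave the diagonal untouched
    return out

# Tridiagonal true concentration matrix (a chain graph).
rng = np.random.default_rng(5)
p = 10
Theta_true = 2.0 * np.eye(p)
for j in range(p - 1):
    Theta_true[j, j + 1] = Theta_true[j + 1, j] = 0.8
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(Theta_true), size=400)

Theta_hat = naive_sparse_precision(X, lam=0.3)
sparsity = np.mean(Theta_hat == 0.0)
```

Entries corresponding to non-edges of the chain graph are mostly set exactly to zero, while the chain edges survive; the paper's analysis quantifies how such sparsity translates into Frobenius-norm rates as p and n grow together.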
Dümbgen, Lutz; Igl, Bernd-Wolfgang; Munk, Axel
Let (X,Y) be a random variable consisting of an observed feature vector $X\in \mathcal{X}$ and an unobserved class label Y∈{1,2,…,L} with unknown joint distribution. In addition, let $\mathcal{D}$ be a training data set consisting of n completely observed independent copies of (X,Y). Usual classification procedures provide point predictors (classifiers) $\widehat{Y}(X,\mathcal{D})$ of Y or estimate the conditional distribution of Y given X. In order to quantify the certainty of classifying X we propose to construct for each θ=1,2,…,L a p-value $\pi_{\theta}(X,\mathcal{D})$ for the null hypothesis that Y=θ, treating Y temporarily as a fixed parameter. In other words, the point predictor $\widehat{Y}(X,\mathcal{D})$...
Prendergast, Luke A.
In this paper we introduce an influence measure based on second order expansion of the RV and GCD measures for the comparison between unperturbed and perturbed eigenvectors of a symmetric matrix estimator. Example estimators are considered to highlight how this measure complements recent influence analysis. Importantly, we also show how a sample based version of this measure can be used to accurately and efficiently detect influential observations in practice.
Bilancia, Massimo; Stea, Girolamo
A wealth of epidemiological data suggests an association between mortality/morbidity from pulmonary and cardiovascular adverse events and air pollution, but despite the abundance of data, uncertainty remains as to the extent of those associations. In this paper we describe an SSA (Singular Spectrum Analysis) based approach to decompose the time series of particulate matter concentration into a set of exposure variables, each one representing a different timescale. We apply our methodology to investigate both acute and long-term effects of PM_{10} exposure on morbidity from respiratory causes within the urban area of Bari, Italy.
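Basic SSA, the building block of this approach, decomposes a series via the SVD of its trajectory matrix and diagonal averaging; the window length, the synthetic trend-plus-cycle series and the function name `ssa_components` are our illustrative choices, not details of the paper.

```python
import numpy as np

def ssa_components(series, L):
    """Basic SSA: embed the series into an L x K trajectory matrix,
    take its SVD, and map each singular triple back to a series of
    length N by anti-diagonal averaging."""
    N = len(series)
    K = N - L + 1
    traj = np.column_stack([series[i:i + L] for i in range(K)])
    U, s, Vt = np.linalg.svd(traj, full_matrices=False)
    comps = []
    for r in range(len(s)):
        Xr = s[r] * np.outer(U[:, r], Vt[r])
        # Entry (i, j) of Xr belongs to time index i + j; average each
        # anti-diagonal to get one elementary reconstructed series.
        rec = np.array([np.mean(Xr[::-1].diagonal(k - L + 1)) for k in range(N)])
        comps.append(rec)
    return np.array(comps)

t = np.arange(200)
series = 0.05 * t + np.sin(2 * np.pi * t / 20)   # slow trend + 20-step cycle
comps = ssa_components(series, L=40)
```

The elementary components sum back exactly to the original series, so grouping them by timescale yields the exposure variables described in the abstract.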
Autin, Florent
This paper deals with the problem of function estimation. Using the white noise model setting, we provide a method to construct a new wavelet procedure based on thresholding rules which takes advantage of the dyadic structure of the wavelet decomposition. We prove that this new procedure performs very well since, on the one hand, it is adaptive and near-minimax over a large class of Besov spaces and, on the other hand, the maximal functional space (maxiset) where this procedure attains a given rate of convergence is very large. Moreover, by studying the shape of its maxiset, we prove...
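For context, the term-by-term hard-thresholding baseline that such tree-structured procedures improve on can be sketched with an orthonormal Haar transform; the noise level, the universal threshold and the piecewise-constant test signal are our assumptions, and the paper's procedure exploits the dyadic tree structure rather than thresholding coefficients independently.

```python
import numpy as np

def haar_decompose(x):
    """Full orthonormal Haar decomposition of a signal of length 2^J.
    Returns the coarsest approximation and details, coarsest first."""
    details = []
    while len(x) > 1:
        avg = (x[0::2] + x[1::2]) / np.sqrt(2)
        det = (x[0::2] - x[1::2]) / np.sqrt(2)
        details.append(det)
        x = avg
    return x, details[::-1]

def haar_reconstruct(approx, details):
    x = approx
    for det in details:
        up = np.empty(2 * len(x))
        up[0::2] = (x + det) / np.sqrt(2)
        up[1::2] = (x - det) / np.sqrt(2)
        x = up
    return x

rng = np.random.default_rng(6)
t = np.linspace(0, 1, 256)
signal = np.where(t < 0.5, 1.0, -1.0)            # piecewise-constant truth
noisy = signal + 0.1 * rng.normal(size=256)

approx, details = haar_decompose(noisy)
thr = 0.1 * np.sqrt(2 * np.log(256))             # universal threshold sigma*sqrt(2 log n)
details = [np.where(np.abs(d) > thr, d, 0.0) for d in details]
denoised = haar_reconstruct(approx, details)
```

Keeping only the coefficients exceeding the universal threshold sharply reduces the error on signals that are sparse in the wavelet basis; maxiset theory characterizes exactly which function spaces a given thresholding rule handles at a given rate.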