Collection resources
Project Euclid (Hosted at Cornell University Library) (191,996 resources)
The Annals of Statistics
Wang, Qinwen; Yao, Jianfeng
Consider two $p$-variate populations, not necessarily Gaussian, with covariance matrices $\Sigma_{1}$ and $\Sigma_{2}$, respectively. Let $S_{1}$ and $S_{2}$ be the corresponding sample covariance matrices with degrees of freedom $m$ and $n$. When the difference $\Delta$ between $\Sigma_{1}$ and $\Sigma_{2}$ is of small rank compared to $p,m$ and $n$, the Fisher matrix $S:=S_{2}^{-1}S_{1}$ is called a spiked Fisher matrix. When $p,m$ and $n$ grow to infinity proportionally, we establish a phase transition for the extreme eigenvalues of the Fisher matrix: a displacement formula showing that when the eigenvalues of $\Delta$ (spikes) are above (or under) a critical value, the associated extreme...
Dicker, Lee H.; Erdogdu, Murat A.
We derive convenient uniform concentration bounds and finite sample multivariate normal approximation results for quadratic forms, then describe some applications involving variance components estimation in linear random-effects models. Random-effects models and variance components estimation are classical topics in statistics, with a corresponding well-established asymptotic theory. However, our finite sample results for quadratic forms provide additional flexibility for easily analyzing random-effects models in nonstandard settings, which are becoming more important in modern applications (e.g., genomics). For instance, in addition to deriving novel non-asymptotic bounds for variance components estimators in classical linear random-effects models, we provide a concentration bound for variance components...
Belloni, Alexandre; Oliveira, Roberto I.
We study a variable length Markov chain model associated with a group of stationary processes that share the same context tree but each process has potentially different conditional probabilities. We propose a new model selection and estimation method which is computationally efficient. We develop oracle and adaptivity inequalities, as well as model selection properties, that hold under continuity of the transition probabilities and polynomial $\beta$-mixing. In particular, model misspecification is allowed.
These results are applied to interesting families of processes. For Markov processes, we obtain uniform rate of convergence for the estimation error of transition probabilities as well as perfect model...
Klopp, Olga; Tsybakov, Alexandre B.; Verzelen, Nicolas
Inhomogeneous random graph models encompass many network models such as stochastic block models and latent position models. We consider the problem of statistical estimation of the matrix of connection probabilities based on the observations of the adjacency matrix of the network. Taking the stochastic block model as an approximation, we construct estimators of network connection probabilities—the ordinary block constant least squares estimator, and its restricted version. We show that they satisfy oracle inequalities with respect to the block constant oracle. As a consequence, we derive optimal rates of estimation of the probability matrix. Our results cover the important setting of...
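As a rough illustration of what a block constant least squares fit looks like, the sketch below assumes the block assignment `z` is already fixed and known. Under that assumption the least squares solution is simply the average of the off-diagonal adjacency entries within each pair of blocks. The function name `block_average_estimator` and the fixed-partition simplification are illustrative assumptions; the excerpt does not say how the estimator chooses the partition.

```python
def block_average_estimator(A, z):
    """Block constant least squares fit for a FIXED block assignment z.

    A is a symmetric 0/1 adjacency matrix (list of lists) and z[i] is the
    block label of node i. Each off-diagonal entry of the result is the
    average of A over all node pairs with the same pair of block labels.
    """
    n = len(A)
    sums, counts = {}, {}
    # Accumulate edge counts over unordered node pairs, grouped by block pair.
    for i in range(n):
        for j in range(i + 1, n):
            key = (min(z[i], z[j]), max(z[i], z[j]))
            sums[key] = sums.get(key, 0.0) + A[i][j]
            counts[key] = counts.get(key, 0) + 1
    # Fill the estimated probability matrix; the diagonal stays at zero.
    theta = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                key = (min(z[i], z[j]), max(z[i], z[j]))
                theta[i][j] = sums[key] / counts[key]
    return theta


# Four nodes in two blocks; edges 0-1, 2-3 and 0-2.
A = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
]
theta_hat = block_average_estimator(A, [0, 0, 1, 1])
```

Here the single cross-block edge is spread over the four cross-block pairs, so every cross-block entry of `theta_hat` is 0.25.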
Ghoshdastidar, Debarghya; Dukkipati, Ambedkar
Hypergraph partitioning lies at the heart of a number of problems in machine learning and network sciences. Many algorithms for hypergraph partitioning have been proposed that extend standard approaches for graph partitioning to the case of hypergraphs. However, theoretical aspects of such methods have seldom received attention in the literature as compared to the extensive studies on the guarantees of graph partitioning. For instance, consistency results of spectral graph partitioning under the stochastic block model are well known. In this paper, we present a planted partition model for sparse random nonuniform hypergraphs that generalizes the stochastic block model. We derive...
Li, Zeng; Wang, Qinwen; Yao, Jianfeng
Identifying the number of factors in a high-dimensional factor model has attracted much attention in recent years and a general solution to the problem is still lacking. A promising ratio estimator based on singular values of lagged sample auto-covariance matrices has been recently proposed in the literature with a reasonably good performance under some specific assumption on the strength of the factors. Inspired by this ratio estimator and as a first main contribution, this paper proposes a complete theory of such sample singular values for both the factor part and the noise part under the large-dimensional scheme where the dimension...
Chernozhukov, Victor; Galichon, Alfred; Hallin, Marc; Henry, Marc
We propose new concepts of statistical depth, multivariate quantiles, vector quantiles and ranks, ranks and signs, based on canonical transportation maps between a distribution of interest on $\mathbb{R}^{d}$ and a reference distribution on the $d$-dimensional unit ball. The new depth concept, called Monge–Kantorovich depth, specializes to halfspace depth for $d=1$ and in the case of spherical distributions, but for more general distributions, differs from the latter in the ability for its contours to account for non-convex features of the distribution of interest. We propose empirical counterparts to the population versions of those Monge–Kantorovich depth contours, quantiles, ranks, signs and vector...
Khare, Kshitij; Pal, Subhadip; Su, Zhihua
The envelope model is a new paradigm to address estimation and prediction in multivariate analysis. Using sufficient dimension reduction techniques, it has the potential to achieve substantial efficiency gains compared to standard models. This model was first introduced by [Statist. Sinica 20 (2010) 927–960] for multivariate linear regression, and has since been adapted to many other contexts. However, a Bayesian approach for analyzing envelope models has not yet been investigated in the literature. In this paper, we develop a comprehensive Bayesian framework for estimation and model selection in envelope models in the context of multivariate linear regression. Our framework has...
Ning, Yang; Liu, Han
We consider the problem of uncertainty assessment for low dimensional components in high dimensional models. Specifically, we propose a novel decorrelated score function to handle the impact of high dimensional nuisance parameters. We consider both hypothesis tests and confidence regions for generic penalized M-estimators. Unlike most existing inferential methods which are tailored for individual models, our method provides a general framework for high dimensional inference and is applicable to a wide variety of applications. In particular, we apply this general framework to study five illustrative examples: linear regression, logistic regression, Poisson regression, Gaussian graphical model and additive hazards model. For...
Koltchinskii, Vladimir; Lounici, Karim
Let $X,X_{1},\dots,X_{n}$ be i.i.d. Gaussian random variables in a separable Hilbert space $\mathbb{H}$ with zero mean and covariance operator $\Sigma=\mathbb{E}(X\otimes X)$, and let $\hat{\Sigma}:=n^{-1}\sum_{j=1}^{n}(X_{j}\otimes X_{j})$ be the sample (empirical) covariance operator based on $(X_{1},\dots,X_{n})$. Denote by $P_{r}$ the spectral projector of $\Sigma$ corresponding to its $r$th eigenvalue $\mu_{r}$ and by $\hat{P}_{r}$ the empirical counterpart of $P_{r}$. The main goal of the paper is to obtain tight bounds on
\[\sup_{x\in\mathbb{R}}\vert\mathbb{P} \{\frac{\Vert \hat{P}_{r}-P_{r}\Vert_{2}^{2}-\mathbb{E}\Vert \hat{P}_{r}-P_{r}\Vert_{2}^{2}}{\operatorname{Var}^{1/2}(\Vert \hat{P}_{r}-P_{r}\Vert_{2}^{2})}\leq x\}-\Phi (x)\vert ,\] where $\Vert \cdot \Vert_{2}$ denotes the Hilbert–Schmidt norm and $\Phi$ is the standard normal distribution function. Such accuracy of normal approximation of the distribution of...
Balakrishnan, Sivaraman; Wainwright, Martin J.; Yu, Bin
The EM algorithm is a widely used tool in maximum-likelihood estimation in incomplete data problems. Existing theoretical work has focused on conditions under which the iterates or likelihood values converge, and the associated rates of convergence. Such guarantees do not distinguish whether the ultimate fixed point is a near global optimum or a bad local optimum of the sample likelihood, nor do they relate the obtained fixed point to the global optima of the idealized population likelihood (obtained in the limit of infinite data). This paper develops a theoretical framework for quantifying when and how quickly EM-type iterates converge to...
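To make the distinction between good and bad fixed points concrete, here is a minimal sketch of EM on a toy model: a balanced mixture of N(θ, 1) and N(−θ, 1) with unknown θ. The model, the function names and the parameter values are assumptions chosen for illustration, not details taken from the excerpt. For this model the E-step and M-step collapse into the single update θ ← mean(tanh(θ·x) · x).

```python
import math
import random


def em_symmetric_mixture(xs, theta0, iters=50):
    """Run EM for the mixture 0.5*N(theta, 1) + 0.5*N(-theta, 1).

    The posterior probability of the +theta component is
    1 / (1 + exp(-2*theta*x)), so the M-step mean update simplifies to
    the average of tanh(theta * x) * x.
    """
    theta = theta0
    for _ in range(iters):
        theta = sum(math.tanh(theta * x) * x for x in xs) / len(xs)
    return theta


def sample_mixture(n, theta_star, rng):
    """Draw n points: a random sign times theta_star plus standard normal noise."""
    return [rng.choice((-1.0, 1.0)) * theta_star + rng.gauss(0.0, 1.0)
            for _ in range(n)]


rng = random.Random(0)
xs = sample_mixture(2000, 2.0, rng)
good = em_symmetric_mixture(xs, theta0=0.5)   # starts near the signal
stuck = em_symmetric_mixture(xs, theta0=0.0)  # starts at a fixed point
```

Starting from `theta0=0.0` the update never moves, because tanh(0) = 0. This gives a simple instance of a fixed point that is unrelated to the true parameter, whereas a start away from zero moves toward the truth.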
Chernozhukov, Victor; Hansen, Christian; Liao, Yuan
Common high-dimensional methods for prediction rely on having either a sparse signal model, a model in which most parameters are zero and there are a small number of nonzero parameters that are large in magnitude, or a dense signal model, a model with no large parameters and very many small nonzero parameters. We consider a generalization of these two basic models, termed here a “sparse $+$ dense” model, in which the signal is given by the sum of a sparse signal and a dense signal. Such a structure poses problems for traditional sparse estimators, such as the lasso, and for...
Johndrow, James E.; Bhattacharya, Anirban; Dunson, David B.
Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. We derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing...
Kim, Arlene K. H.; Samworth, Richard J.
The estimation of a log-concave density on $\mathbb{R}^{d}$ represents a central problem in the area of nonparametric inference under shape constraints. In this paper, we study the performance of log-concave density estimators with respect to global loss functions, and adopt a minimax approach. We first show that no statistical procedure based on a sample of size $n$ can estimate a log-concave density with respect to the squared Hellinger loss function with supremum risk smaller than order $n^{-4/5}$, when $d=1$, and order $n^{-2/(d+1)}$ when $d\geq2$. In particular, this reveals a sense in which, when $d\geq3$, log-concave density estimation is fundamentally more...
Ho, Nhat; Nguyen, XuanLong
We establish minimax lower bounds and maximum likelihood convergence rates of parameter estimation for mean-covariance multivariate Gaussian mixtures, shape-rate Gamma mixtures and some variants of finite mixture models, including the setting where the number of mixing components is bounded but unknown. These models belong to what we call “weakly identifiable” classes, which exhibit specific interactions among mixing parameters driven by the algebraic structures of the class of kernel densities and their partial derivatives. Accordingly, both the minimax bounds and the maximum likelihood parameter estimation rates in these models, obtained under some compactness conditions on the parameter space, are shown to...
Devroye, Luc; Lerasle, Matthieu; Lugosi, Gabor; Oliveira, Roberto I.
We discuss the possibilities and limitations of estimating the mean of a real-valued random variable from independent and identically distributed observations from a nonasymptotic point of view. In particular, we define estimators with a sub-Gaussian behavior even for certain heavy-tailed distributions. We also prove various impossibility results for mean estimators.
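One standard construction in this line of work is the median-of-means estimator, sketched below under the assumption that the data are split into `k` consecutive blocks of equal size. The excerpt does not name the estimators studied, so this is an illustrative example of a mean estimator that is robust to heavy tails, not necessarily the authors' method.

```python
import statistics


def median_of_means(xs, k):
    """Median-of-means estimate of the mean of xs using k blocks.

    The sample is cut into k consecutive blocks of equal size (leftover
    observations at the end are dropped), each block is averaged, and
    the median of the block averages is returned.
    """
    if not 1 <= k <= len(xs):
        raise ValueError("k must be between 1 and len(xs)")
    m = len(xs) // k  # block size
    block_means = [sum(xs[i * m:(i + 1) * m]) / m for i in range(k)]
    return statistics.median(block_means)


# One extreme observation ruins the plain average but only one block mean.
data = [1.0, 1.0, 1.0, 1.0, 1.0, 1000.0]
plain_mean = sum(data) / len(data)
robust_mean = median_of_means(data, k=3)
```

The number of blocks trades off robustness against efficiency: more blocks tolerate more outliers, but each block mean is noisier.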
Gu, Yuwen; Zou, Hui
Asymmetric least squares regression is an important method that has wide applications in statistics, econometrics and finance. The existing work on asymmetric least squares only considers the traditional low dimension and large sample setting. In this paper, we systematically study the Sparse Asymmetric LEast Squares (SALES) regression under high dimensions where the penalty functions include the Lasso and nonconvex penalties. We develop a unified efficient algorithm for fitting SALES and establish its theoretical properties. As an important application, SALES is used to detect heteroscedasticity in high-dimensional data. Another method for detecting heteroscedasticity is the sparse quantile regression. However, both SALES...
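For readers unfamiliar with asymmetric least squares, the sketch below computes a sample expectile, the simplest (intercept-only, unpenalized) case. It minimizes the loss that weights squared residuals by `tau` when the observation lies above the fit and by `1 - tau` otherwise. The iteratively reweighted averaging scheme and the function name `expectile` are illustrative assumptions and are not the SALES algorithm from the abstract.

```python
def expectile(ys, tau, iters=100, tol=1e-12):
    """Sample tau-expectile of ys by iteratively reweighted averaging.

    The minimiser m of sum_i |tau - 1{y_i <= m}| * (y_i - m)**2 is a
    weighted mean with weight tau on points above m and 1 - tau on the
    rest, so we alternate between computing weights and the weighted mean.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    m = sum(ys) / len(ys)  # start from the ordinary mean
    for _ in range(iters):
        ws = [tau if y > m else 1.0 - tau for y in ys]
        new_m = sum(w * y for w, y in zip(ws, ys)) / sum(ws)
        if abs(new_m - m) < tol:
            return new_m
        m = new_m
    return m


ys = [1.0, 2.0, 3.0, 10.0]
centre = expectile(ys, 0.5)  # symmetric weights recover the mean
upper = expectile(ys, 0.9)   # more weight on observations above the fit
```

With `tau = 0.5` the loss is ordinary least squares, so the expectile is the sample mean. Comparing fits at different `tau` values is the basic idea behind using asymmetric least squares to look for heteroscedasticity.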
Xu, Min; Chen, Minhua; Lafferty, John
We study the problem of variable selection in convex nonparametric regression. Under the assumption that the true regression function is convex and sparse, we develop a screening procedure to select a subset of variables that contains the relevant variables. Our approach is a two-stage quadratic programming method that estimates a sum of one-dimensional convex functions, followed by one-dimensional concave regression fits on the residuals. In contrast to previous methods for sparse additive models, the optimization is finite dimensional and requires no tuning parameters for smoothness. Under appropriate assumptions, we prove that the procedure is faithful in the population setting, yielding...
Yu, Zhou; Dong, Yuexiao; Shao, Jun
Model-free variable selection has been implemented under the sufficient dimension reduction framework since the seminal paper of Cook [Ann. Statist. 32 (2004) 1062–1092]. In this paper, we extend the marginal coordinate test for sliced inverse regression (SIR) in Cook (2004) and propose a novel marginal SIR utility for the purpose of ultrahigh dimensional feature selection. Two distinct procedures, Dantzig selector and sparse precision matrix estimation, are incorporated to get two versions of sample level marginal SIR utilities. Both procedures lead to model-free variable selection consistency with predictor dimensionality $p$ diverging at an exponential rate of the sample size $n$. As...
Yuan, Ming; Zhou, Ding-Xuan
We establish minimax optimal rates of convergence for estimation in a high dimensional additive model assuming that it is approximately sparse. Our results reveal a behavior universal to this class of high dimensional problems. In the sparse regime when the components are sufficiently smooth or the dimensionality is sufficiently large, the optimal rates are identical to those for high dimensional linear regression and, therefore, there is no additional cost to entertain a nonparametric model. Otherwise, in the so-called smooth regime, the rates coincide with the optimal rates for estimating a univariate function and, therefore, they are immune to the “curse...