Collection resources
Project Euclid (Hosted at Cornell University Library) (192,979 resources)
The Annals of Statistics
Kong, Yinfei; Li, Daoji; Fan, Yingying; Lv, Jinchi
Feature interactions can contribute to a large proportion of variation in many prediction models. In the era of big data, the coexistence of high dimensionality in both responses and covariates poses unprecedented challenges in identifying important interactions. In this paper, we suggest a two-stage interaction identification method, called the interaction pursuit via distance correlation (IPDC), in the setting of high-dimensional multi-response interaction models that exploits feature screening applied to transformed variables with distance correlation followed by feature selection. Such a procedure is computationally efficient, generally applicable beyond the heredity assumption, and effective even when the number of responses diverges with...
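As an illustration of the screening statistic named in this abstract, the sample distance correlation between two samples can be sketched as follows. This is only the standard sample distance correlation computed from double-centered pairwise distance matrices, not the IPDC procedure itself; the function names are hypothetical and the variable transformations and selection stage of the paper are not reproduced.

```python
import numpy as np


def _centered_distances(x):
    """Pairwise Euclidean distance matrix of a sample, double-centered."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D sample as n observations of one variable
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation between x and y (same number of rows)."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = (a * b).mean()                   # squared sample distance covariance
    dvar2 = (a * a).mean() * (b * b).mean()  # product of squared distance variances
    if dvar2 == 0.0:
        return 0.0                           # a constant sample carries no dependence
    return float(np.sqrt(max(dcov2, 0.0) / np.sqrt(dvar2)))
```

Unlike the Pearson correlation, this statistic can pick up nonlinear dependence, for instance between a symmetric variable and its square, which is one reason it is attractive for screening interaction terms.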
Loh, Po-Ling
We study theoretical properties of regularized robust $M$-estimators, applicable when data are drawn from a sparse high-dimensional linear model and contaminated by heavy-tailed distributions and/or outliers in the additive errors and covariates. We first establish a form of local statistical consistency for the penalized regression estimators under fairly mild conditions on the error distribution: When the derivative of the loss function is bounded and satisfies a local restricted curvature condition, all stationary points within a constant radius of the true regression vector converge at the minimax rate enjoyed by the Lasso with sub-Gaussian errors. When an appropriate nonconvex regularizer is...
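To make the setting concrete, here is a minimal sketch of one regularized robust $M$-estimator: the Huber loss, whose derivative is bounded, combined with an $\ell_1$ penalty and fitted by proximal gradient descent. The function names, the threshold `delta`, the penalty level `lam` and the iteration count are illustrative assumptions; the nonconvex regularizers and contaminated covariates discussed in the abstract are not covered.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def huber_lasso(X, y, lam, delta=1.345, n_iter=500):
    """L1-penalized Huber regression fitted by proximal gradient descent.

    delta, lam and n_iter are illustrative choices, not values from the paper.
    """
    n, p = X.shape
    # The Huber loss has curvature at most 1, so ||X||_2^2 / n bounds the
    # Lipschitz constant of the gradient of the averaged loss.
    step = n / np.linalg.norm(X, 2) ** 2
    beta = np.zeros(p)
    for _ in range(n_iter):
        # Derivative of the Huber loss: residuals clipped to [-delta, delta].
        psi = np.clip(y - X @ beta, -delta, delta)
        grad = -X.T @ psi / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta
```

Because the residuals are clipped, a single gross outlier in the errors can move the gradient by at most a bounded amount, which is the mechanism behind the robustness described above.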
Rousseau, Judith; Szabo, Botond
We consider the asymptotic behaviour of the marginal maximum likelihood empirical Bayes posterior distribution in a general setting. First, we characterize the set where the maximum marginal likelihood estimator is located with high probability. Then we provide oracle-type upper and lower bounds for the contraction rates of the empirical Bayes posterior. We also show that the hierarchical Bayes posterior achieves the same contraction rate as the maximum marginal likelihood empirical Bayes posterior. We demonstrate the applicability of our general results for various models and prior distributions by deriving upper and lower bounds for the contraction rates of the corresponding...
Paindaveine, Davy; Verdebout, Thomas
We revisit, in an original and challenging perspective, the problem of testing the null hypothesis that the mode of a directional signal is equal to a given value. Motivated by a real data example where the signal is weak, we consider this problem under asymptotic scenarios for which the signal strength goes to zero at an arbitrary rate $\eta_{n}$. Both under the null and the alternative, we focus on rotationally symmetric distributions. We show that, while they are asymptotically equivalent under fixed signal strength, the classical Wald and Watson tests exhibit very different (null and nonnull) behaviours when the signal...
Chakraborty, Anirvan; Chaudhuri, Probal
Tests based on mean vectors and spatial signs and ranks for a zero mean in one-sample problems and for the equality of means in two-sample problems have been studied in the recent literature for high-dimensional data with the dimension larger than the sample size. For the above testing problems, we show that under suitable sequences of alternatives, the powers of the mean-based tests and the tests based on spatial signs and ranks tend to be the same as the data dimension tends to infinity for any sample size when the coordinate variables satisfy appropriate mixing conditions. Further, their limiting powers do...
Dereudre, David; Lavancier, Frédéric
Strong consistency of the maximum likelihood estimator (MLE) for parametric Gibbs point process models is established. The setting is very general. It includes pairwise pair potentials, finite and infinite multibody interactions and geometrical interactions, where the range can be finite or infinite. The Gibbs interaction may depend linearly or nonlinearly on the parameters, a particular case being hardcore parameters and interaction range parameters. As important examples, we deduce the consistency of the MLE for all parameters of the Strauss model, the hardcore Strauss model, the Lennard–Jones model and the area-interaction model.
Hang, Hanyuan; Steinwart, Ingo
We establish a Bernstein-type inequality for a class of stochastic processes that includes the classical geometrically $\phi$-mixing processes, Rio’s generalization of these processes and many time-discrete dynamical systems. Modulo a logarithmic factor and some constants, our Bernstein-type inequality coincides with the classical Bernstein inequality for i.i.d. data. We further use this new Bernstein-type inequality to derive an oracle inequality for generic regularized empirical risk minimization algorithms and data generated by such processes. Applying this oracle inequality to support vector machines using the Gaussian kernels for binary classification, we obtain essentially the same rate as for i.i.d. processes, and for least...
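For reference, the classical Bernstein inequality mentioned in this abstract can be stated in one common form as follows (the notation $B$, $\sigma^{2}$ and $\varepsilon$ is chosen here for illustration and is not taken from the paper). For independent random variables $X_{1},\ldots,X_{n}$ with $\mathbb{E}X_{i}=0$, $|X_{i}|\le B$ almost surely and $\mathbb{E}X_{i}^{2}\le\sigma^{2}$, and every $\varepsilon>0$,

$$
\mathbb{P}\Biggl(\frac{1}{n}\sum_{i=1}^{n}X_{i}\ge\varepsilon\Biggr)\le\exp\Biggl(-\frac{n\varepsilon^{2}}{2\bigl(\sigma^{2}+B\varepsilon/3\bigr)}\Biggr).
$$

The variance term $\sigma^{2}$ in the denominator is what gives the faster rates over Hoeffding-type bounds when the variance is small.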
Xu, Gongjun
Statistical latent class models are widely used in social and psychological research, yet it is often difficult to establish the identifiability of the model parameters. In this paper, we consider the identifiability issue of a family of restricted latent class models, where the restriction structures are needed to reflect pre-specified assumptions on the related assessment. We establish the identifiability results in the strict sense and specify which types of restriction structure would give the identifiability of the model parameters. The results not only guarantee the validity of many of the popularly used models, but also provide a guideline for the...
Nandy, Preetam; Maathuis, Marloes H.; Richardson, Thomas S.
We consider the estimation of joint causal effects from observational data. In particular, we propose new methods to estimate the effect of multiple simultaneous interventions (e.g., multiple gene knockouts), under the assumption that the observational data come from an unknown linear structural equation model with independent errors. We derive asymptotic variances of our estimators when the underlying causal structure is partly known, as well as high-dimensional consistency when the causal structure is fully unknown and the joint distribution is multivariate Gaussian. We also propose a generalization of our methodology to the class of nonparanormal distributions. We evaluate the estimators in...
Cai, T. Tony; Guo, Zijian
Confidence sets play a fundamental role in statistical inference. In this paper, we consider confidence intervals for high-dimensional linear regression with random design. We first establish the convergence rates of the minimax expected length for confidence intervals in the oracle setting where the sparsity parameter is given. The focus is then on the problem of adaptation to sparsity for the construction of confidence intervals. Ideally, an adaptive confidence interval should have its length automatically adjusted to the sparsity of the unknown regression vector, while maintaining a pre-specified coverage probability. It is shown that such a goal is in general not...
Cardot, Hervé; Cénac, Peggy; Godichon-Baggioni, Antoine
Estimation procedures based on recursive algorithms are interesting and powerful techniques that are able to deal rapidly with very large samples of high dimensional data. The collected data may be contaminated by noise so that robust location indicators, such as the geometric median, may be preferred to the mean. In this context, an estimator of the geometric median based on a fast and efficient averaged nonlinear stochastic gradient algorithm has been developed by [Bernoulli 19 (2013) 18–43]. This work aims at studying more precisely the nonasymptotic behavior of this nonlinear algorithm by giving nonasymptotic confidence balls in general separable Hilbert...
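A recursion of the kind this abstract refers to can be sketched as below: each new observation moves the current iterate by a step of bounded length in the direction of that observation, and the iterates are then averaged. The step-size constants `c_gamma` and `alpha`, the `init` argument and the function name are assumptions made for this sketch; the confidence balls studied in the paper are not computed.

```python
import numpy as np


def averaged_sgd_geometric_median(X, c_gamma=1.0, alpha=0.75, init=None):
    """One pass of averaged stochastic gradient descent for the geometric median.

    X has shape (n, d). The step sizes are gamma_k = c_gamma * k**(-alpha);
    both constants are illustrative choices.
    """
    X = np.asarray(X, dtype=float)
    z = X[0].copy() if init is None else np.asarray(init, dtype=float).copy()
    z_bar = z.copy()
    for k in range(1, X.shape[0]):
        diff = X[k] - z
        norm = np.linalg.norm(diff)
        if norm > 0.0:
            # Unit-length direction: far-away outliers cannot produce large steps.
            z = z + c_gamma * k ** (-alpha) * diff / norm
        # Running average of the iterates z_0, ..., z_k.
        z_bar = z_bar + (z - z_bar) / (k + 1)
    return z_bar
```

Each observation is visited once and only two vectors are kept in memory, which is what makes this type of estimator practical for very large samples. Since the total distance the iterate can travel is limited by the sum of the step sizes, a starting point far from the bulk of the data slows convergence considerably.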
Li, Jun; Zhong, Ping-Shou
The paper considers the problem of recovering the sparse different components between two high-dimensional means of column-wise dependent random vectors. We show that dependence can be utilized to lower the identification boundary for signal recovery. Moreover, an optimal convergence rate for the marginal false nondiscovery rate (mFNR) is established under dependence. The convergence rate is faster than the optimal rate without dependence. To recover the sparse signal bearing dimensions, we propose a Dependence-Assisted Thresholding and Excising (DATE) procedure, which is shown to be rate optimal for the mFNR with the marginal false discovery rate (mFDR) controlled at a pre-specified level....
Cheng, Dan; Schwartzman, Armin
A topological multiple testing scheme is presented for detecting peaks in images under stationary ergodic Gaussian noise, where tests are performed at local maxima of the smoothed observed signals. The procedure generalizes the one-dimensional scheme of Schwartzman, Gavrilov and Adler [Ann. Statist. 39 (2011) 3290–3319] to Euclidean domains of arbitrary dimension. Two methods are developed according to two different ways of computing p-values: (i) using the exact distribution of the height of local maxima, available explicitly when the noise field is isotropic [Extremes 18 (2015) 213–240; Expected number and height distribution of critical points of smooth isotropic Gaussian random fields...
Wang, Y. X. Rachel; Bickel, Peter J.
The stochastic block model (SBM) provides a popular framework for modeling community structures in networks. However, more attention has been devoted to problems concerning estimating the latent node labels and the model parameters than the issue of choosing the number of blocks. We consider an approach based on the log likelihood ratio statistic and analyze its asymptotic properties under model misspecification. We show the limiting distribution of the statistic in the case of underfitting is normal and obtain its convergence rate in the case of overfitting. These conclusions remain valid when the average degree grows at a polylog rate. The...
Lok, Judith J.
In observational studies, treatment may be adapted to covariates at several times without a fixed protocol, in continuous time. Treatment influences covariates, which influence treatment, which influences covariates and so on. Then even time-dependent Cox-models cannot be used to estimate the net treatment effect. Structural nested models have been applied in this setting. Structural nested models are based on counterfactuals: the outcome a person would have had had treatment been withheld after a certain time. Previous work on continuous-time structural nested models assumes that counterfactuals depend deterministically on observed data, while conjecturing that this assumption can be relaxed. This article...
Wang, Qinwen; Yao, Jianfeng
Consider two $p$-variate populations, not necessarily Gaussian, with covariance matrices $\Sigma_{1}$ and $\Sigma_{2}$, respectively. Let $S_{1}$ and $S_{2}$ be the corresponding sample covariance matrices with degrees of freedom $m$ and $n$. When the difference $\Delta$ between $\Sigma_{1}$ and $\Sigma_{2}$ is of small rank compared to $p,m$ and $n$, the Fisher matrix $S:=S_{2}^{-1}S_{1}$ is called a spiked Fisher matrix. When $p,m$ and $n$ grow to infinity proportionally, we establish a phase transition for the extreme eigenvalues of the Fisher matrix: a displacement formula showing that when the eigenvalues of $\Delta$ (spikes) are above (or under) a critical value, the associated extreme...
Dicker, Lee H.; Erdogdu, Murat A.
We derive convenient uniform concentration bounds and finite sample multivariate normal approximation results for quadratic forms, then describe some applications involving variance components estimation in linear random-effects models. Random-effects models and variance components estimation are classical topics in statistics, with a corresponding well-established asymptotic theory. However, our finite sample results for quadratic forms provide additional flexibility for easily analyzing random-effects models in nonstandard settings, which are becoming more important in modern applications (e.g., genomics). For instance, in addition to deriving novel non-asymptotic bounds for variance components estimators in classical linear random-effects models, we provide a concentration bound for variance components...
Belloni, Alexandre; Oliveira, Roberto I.
We study a variable length Markov chain model associated with a group of stationary processes that share the same context tree but each process has potentially different conditional probabilities. We propose a new model selection and estimation method which is computationally efficient. We develop oracle and adaptivity inequalities, as well as model selection properties, that hold under continuity of the transition probabilities and polynomial $\beta$-mixing. In particular, model misspecification is allowed.
These results are applied to interesting families of processes. For Markov processes, we obtain a uniform rate of convergence for the estimation error of transition probabilities as well as perfect model...
Klopp, Olga; Tsybakov, Alexandre B.; Verzelen, Nicolas
Inhomogeneous random graph models encompass many network models such as stochastic block models and latent position models. We consider the problem of statistical estimation of the matrix of connection probabilities based on the observations of the adjacency matrix of the network. Taking the stochastic block model as an approximation, we construct estimators of network connection probabilities—the ordinary block constant least squares estimator, and its restricted version. We show that they satisfy oracle inequalities with respect to the block constant oracle. As a consequence, we derive optimal rates of estimation of the probability matrix. Our results cover the important setting of...
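The following sketch illustrates the objects in this abstract under a simplifying assumption: the node labels are treated as known. For a fixed labelling, the block-constant least-squares fit to the adjacency matrix reduces to within-block averages. The estimator described in the abstract also has to handle the unknown labelling, which is not attempted here. The function names are hypothetical.

```python
import numpy as np


def sample_sbm(labels, B, rng):
    """Sample a symmetric, loop-free adjacency matrix from a stochastic block model.

    labels[i] is the block of node i and B[a, b] the connection probability
    between blocks a and b.
    """
    labels = np.asarray(labels)
    B = np.asarray(B, dtype=float)
    n = labels.shape[0]
    P = B[np.ix_(labels, labels)]               # n x n matrix of edge probabilities
    upper = np.triu(rng.random((n, n)) < P, k=1)
    return (upper | upper.T).astype(int)


def block_average(A, labels, k):
    """Block-constant least-squares fit for fixed labels: average of A per block pair."""
    labels = np.asarray(labels)
    n = labels.shape[0]
    Z = np.eye(k)[labels]                       # n x k one-hot membership matrix
    sums = Z.T @ A @ Z                          # edge totals over ordered pairs
    counts = Z.T @ (1.0 - np.eye(n)) @ Z        # number of ordered pairs, diagonal excluded
    return np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)
```

With two blocks of 100 nodes each, the within-block averages are taken over several thousand node pairs, so they are typically close to the entries of `B`.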
Ghoshdastidar, Debarghya; Dukkipati, Ambedkar
Hypergraph partitioning lies at the heart of a number of problems in machine learning and network sciences. Many algorithms for hypergraph partitioning have been proposed that extend standard approaches for graph partitioning to the case of hypergraphs. However, theoretical aspects of such methods have seldom received attention in the literature as compared to the extensive studies on the guarantees of graph partitioning. For instance, consistency results of spectral graph partitioning under the stochastic block model are well known. In this paper, we present a planted partition model for sparse random nonuniform hypergraphs that generalizes the stochastic block model. We derive...