Collection resources
Project Euclid (Hosted at Cornell University Library) (192,674 resources)
Statistics Surveys
Josse, Julie; Holmes, Susan
Simple correlation coefficients between two variables have been generalized in many ways to measure association between two matrices. Coefficients such as the RV coefficient, the distance covariance (dCov) coefficient and kernel-based coefficients are used by different research communities. Scientists use these coefficients to test whether two random vectors are linked. Once testing has established that such an association exists, a next step, often ignored, is to explore and uncover the association’s underlying patterns.
This article provides a survey of various measures of dependence between random vectors and tests of independence and emphasizes the connections and...
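As an illustration of one of the coefficients named above, here is a minimal sketch of the RV coefficient. It assumes the commonly used definition $\mathrm{RV}(X,Y)=\|X^{\top}Y\|_{F}^{2}/(\|X^{\top}X\|_{F}\,\|Y^{\top}Y\|_{F})$ for column-centered data matrices with the same number of rows. The truncated abstract does not state a formula, so the definition and all function names below are assumptions made for illustration.

```python
import math


def center_columns(a):
    """Subtract each column's mean from a matrix given as a list of rows."""
    n = len(a)
    means = [sum(col) / n for col in zip(*a)]
    return [[v - m for v, m in zip(row, means)] for row in a]


def cross_product(a, b):
    """Return a^T b for two matrices with the same number of rows."""
    return [
        [sum(x * y for x, y in zip(col_a, col_b)) for col_b in zip(*b)]
        for col_a in zip(*a)
    ]


def frobenius_sq(m):
    """Squared Frobenius norm of a matrix."""
    return sum(v * v for row in m for v in row)


def rv_coefficient(x, y):
    """RV coefficient between two data matrices (assumed definition, see lead-in)."""
    xc, yc = center_columns(x), center_columns(y)
    numerator = frobenius_sq(cross_product(xc, yc))
    denominator = math.sqrt(
        frobenius_sq(cross_product(xc, xc)) * frobenius_sq(cross_product(yc, yc))
    )
    return numerator / denominator


# Hypothetical toy data: four observations of a two-dimensional vector.
x = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]]
y = [[2.0 * v for v in row] for row in x]  # y is a rescaled copy of x
print(rv_coefficient(x, y))
```

Under this definition the coefficient lies between 0 and 1 and is unchanged when either matrix is rescaled. It is only a descriptive measure; testing independence requires a reference distribution, for example one obtained by permutation.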
Bradley, Jonathan R.; Cressie, Noel; Shi, Tao
In this article, we review and compare a number of methods of spatial prediction, where each method is viewed as an algorithm that processes spatial data. To demonstrate the breadth of available choices, we consider both traditional and more recently introduced spatial predictors. Specifically, in our exposition we review: traditional stationary kriging, smoothing splines, negative-exponential distance-weighting, fixed rank kriging, modified predictive processes, a stochastic partial differential equation approach, and lattice kriging. This comparison is meant to provide a service to practitioners wishing to decide between spatial predictors. Hence, we provide technical material for the unfamiliar, which includes the definition and motivation for...
Dimiccoli, Mariella
Cone regression is a particular case of quadratic programming that minimizes a weighted sum of squared residuals under a set of linear inequality constraints. Several important statistical problems, such as isotonic regression, concave regression and ANOVA under partial orderings, to name a few, can be considered as particular instances of the cone regression problem. Given its relevance in statistics, this paper aims to address the fundamentals of cone regression from a theoretical and practical point of view. Several formulations of the cone regression problem are considered and, focusing on the particular case of concave regression as an example, several algorithms...
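To make the definition concrete, here is a minimal sketch of the simplest instance mentioned above, isotonic regression: minimize $\sum_{i}w_{i}(y_{i}-\theta_{i})^{2}$ subject to $\theta_{1}\le\theta_{2}\le\dots\le\theta_{n}$. It uses the pool-adjacent-violators algorithm (PAVA), a standard method for this special case. Whether the paper covers this algorithm is not stated in the abstract, and the function name is a hypothetical choice.

```python
def isotonic_regression(y, w=None):
    """Weighted least-squares fit of a nondecreasing sequence to y via PAVA."""
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points pooled].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool adjacent blocks while the order constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    # Expand the pooled blocks back to one fitted value per observation.
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


print(isotonic_regression([1, 3, 2, 4]))  # [1.0, 2.5, 2.5, 4.0]
```

The linear inequality constraints here are $\theta_{i+1}-\theta_{i}\ge 0$. Concave regression replaces them with constraints on second differences and generally needs a more general quadratic programming solver.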
Mashreghi, Zeinab; Haziza, David; Léger, Christian
We review bootstrap methods in the context of survey data where the effect of the sampling design on the variability of estimators has to be taken into account. We present the methods in a unified way by grouping them into three classes: pseudo-population, direct, and survey weights methods. We cover variance estimation and the construction of confidence intervals for stratified simple random sampling as well as some unequal probability sampling designs. We also address the problem of variance estimation in the presence of imputation to compensate for item non-response.
Marteau, Clément; Sapatinas, Theofanis
We are concerned with minimax signal detection. In this setting, we discuss non-asymptotic and asymptotic approaches through a unified treatment. In particular, we consider a Gaussian sequence model that contains classical models as special cases, such as direct, well-posed inverse and ill-posed inverse problems. Working with certain ellipsoids in the space of square-summable sequences of real numbers, with a ball of positive radius removed, we compare the construction of lower and upper bounds for the minimax separation radius (non-asymptotic approach) and the minimax separation rate (asymptotic approach) that have been proposed in the literature. Some additional contributions, bringing to light...
McGoff, Kevin; Mukherjee, Sayan; Pillai, Natesh
The topic of statistical inference for dynamical systems has been studied widely across several fields. In this survey we focus on methods related to parameter estimation for nonlinear dynamical systems. Our objective is to place results across distinct disciplines in a common setting and highlight opportunities for further research.
Ferreira, José A.
This article provides a concise and essentially self-contained exposition of some of the most important models and non-parametric methods for the analysis of observational data, and a substantial number of illustrations of their application. Although for the most part our presentation follows P. Rosenbaum’s book, “Observational Studies”, and naturally draws on related literature, it contains original elements and simplifies and generalizes some basic results. The illustrations, based on simulated data, show the methods at work in some detail, highlighting pitfalls and emphasizing certain subjective aspects of the statistical analyses.
Dümbgen, Lutz; Pauly, Markus; Schweizer, Thomas
This survey provides a self-contained account of $M$-estimation of multivariate scatter. In particular, we present new proofs for existence of the underlying $M$-functionals and discuss their weak continuity and differentiability. This is done in a rather general framework with matrix-valued random variables. By doing so we reveal a connection between Tyler’s (1987a) $M$-functional of scatter and the estimation of proportional covariance matrices. Moreover, this general framework allows us to treat a new class of scatter estimators, based on symmetrizations of arbitrary order. Finally, these results are applied to $M$-estimation of multivariate location and scatter via multivariate $t$-distributions.
Chauveau, Didier; Hunter, David R.; Levine, Michael
The conditional independence assumption for nonparametric multivariate finite mixture models, a weaker form of the well-known conditional independence assumption for random effects models for longitudinal data, is the subject of an increasing number of theoretical and algorithmic developments in the statistical literature. After presenting a survey of this literature, including an in-depth discussion of the all-important identifiability results, this article describes and extends an algorithm for estimation of the parameters in these models. The algorithm works for any number of components in three or more dimensions. It possesses a descent property and can be easily adapted to situations where the...
Saumard, Adrien; Wellner, Jon A.
We review and formulate results concerning log-concavity and strong-log-concavity in both discrete and continuous settings. We show how preservation of log-concavity and strong log-concavity on $\mathbb{R}$ under convolution follows from a fundamental monotonicity result of Efron (1965). We provide a new proof of Efron’s theorem using the recent asymmetric Brascamp-Lieb inequality due to Otto and Menz (2013). Along the way we review connections between log-concavity and other areas of mathematics and statistics, including concentration of measure, log-Sobolev inequalities, convex geometry, MCMC algorithms, Laplace approximations, and machine learning.
Sverdlov, Oleksandr; Wong, Weng Kee; Ryeznik, Yevgen
Adaptive clinical trials are becoming increasingly popular research designs for clinical investigation. Adaptive designs are particularly useful in phase I cancer studies where clinical data are scant and the goals are to assess the drug dose-toxicity profile and to determine the maximum tolerated dose while minimizing the number of study patients treated at suboptimal dose levels.
In the current work we give an overview of adaptive design methods for phase I cancer trials. We find that modern statistical literature is replete with novel adaptive designs that have clearly defined objectives and established statistical properties, and are shown to outperform conventional dose...
Vehtari, Aki; Ojanen, Janne
Errata for “A survey of Bayesian predictive methods for model assessment, selection and comparison” by A. Vehtari and J. Ojanen, Statistics Surveys, 6 (2012), 142–228. doi:10.1214/12-SS102.
Schonlau, Matthias; Kroh, Martin; Watson, Nicole
While household panel surveys are longitudinal in nature, cross-sectional sampling weights are also of interest. The computation of cross-sectional weights is challenging because household compositions change over time. Sampling probabilities of household entrants after wave 1 are generally not known and assigning them zero weight is not satisfactory. Two common approaches to cross-sectional weighting address this issue: (1) “shared weights” and (2) modeling or estimating unobserved sampling probabilities based on person-level characteristics. We survey how several well-known national household panels address cross-sectional weights for different groups of respondents (including immigrants and births) and in different situations (including household mergers and...
Simpson, Sean L.; Bowman, F. DuBois; Laurienti, Paul J.
Complex functional brain network analyses have exploded over the last decade, gaining traction due to their profound clinical implications. The application of network science (an interdisciplinary offshoot of graph theory) has facilitated these analyses and enabled examining the brain as an integrated system that produces complex behaviors. While the field of statistics has been integral in advancing activation analyses and some connectivity analyses in functional neuroimaging research, it has yet to play a commensurate role in complex network analyses. Fusing novel statistical methods with network-based functional neuroimage analysis will engender powerful analytical tools that will aid in our understanding of...
Vehtari, Aki; Ojanen, Janne
Several methods for model assessment in the statistical literature are presented specifically as Bayesian predictive methods. However, the decision-theoretic assumptions on which these methods are based are not always clearly stated in the original articles. The aim of this survey is to provide a unified review of Bayesian predictive model assessment and selection methods, and of methods closely related to them. We review the various assumptions that are made in this context and discuss the connections between different approaches, with an emphasis on how each method approximates the expected utility of using a Bayesian model...
Heckman, Nancy
The popular cubic smoothing spline estimate of a regression function arises as the minimizer of the penalized sum of squares $\sum_{j}(Y_{j}-\mu(t_{j}))^{2}+\lambda \int_{a}^{b}[\mu''(t)]^{2}\,dt$, where the data are $t_{j},Y_{j}$, $j=1,\ldots,n$. The minimization is taken over an infinite-dimensional function space, the space of all functions with square integrable second derivatives. But the calculations can be carried out in a finite-dimensional space. The reduction from minimizing over an infinite dimensional space to minimizing over a finite dimensional space occurs for more general objective functions: the data may be related to the function $\mu$ in another way, the sum of squares may be replaced by...
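As a sketch of why the reduction holds in the cubic smoothing spline case, consider the following standard argument. It assumes distinct design points $t_{1}<\dots<t_{n}$ in $[a,b]$. The basis functions $N_{k}$ and the matrices $N$ and $\Omega$ are notation introduced here for illustration, not taken from the article. The minimizer is a natural cubic spline with knots at the $t_{j}$, so it lies in an $n$-dimensional space and the criterion becomes a penalized least-squares problem in the coefficient vector $\beta$:

```latex
\mu(t)=\sum_{k=1}^{n}\beta_{k}N_{k}(t),
\qquad
\sum_{j}(Y_{j}-\mu(t_{j}))^{2}+\lambda\int_{a}^{b}[\mu''(t)]^{2}\,dt
=\|Y-N\beta\|^{2}+\lambda\,\beta^{\top}\Omega\,\beta,
\\[4pt]
N_{jk}=N_{k}(t_{j}),
\qquad
\Omega_{kl}=\int_{a}^{b}N_{k}''(t)\,N_{l}''(t)\,dt,
\qquad
\hat{\beta}=(N^{\top}N+\lambda\,\Omega)^{-1}N^{\top}Y .
```

The infinite-dimensional minimization is thus replaced by solving an $n\times n$ linear system.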
Picka, Jeffrey
This paper gives an overview of statistical inference for disordered sphere packing processes. These processes are used extensively in physics and engineering in order to represent the internal structure of composite materials, packed bed reactors, and powders at rest, and are used as initial arrangements of grains in the study of avalanches and other problems involving powders in motion. Packing processes are spatial processes which are neither stationary nor ergodic. Classical spatial statistical models and procedures cannot be applied to these processes, but alternative models and procedures can be developed based on ideas from statistical physics.
Most of the development of...
Clarke, Bertrand; Clarke, Jennifer
We review predictive techniques from several traditional branches of statistics. Starting with prediction based on the normal model and on the empirical distribution function, we proceed to techniques for various forms of regression and classification. Then, we turn to time series, longitudinal data, and survival analysis. Our focus throughout is on the mechanics of prediction more than on the properties of predictors.
Huang, Mingyan; Zhang, Daowen
An important feature of linear mixed models and generalized linear mixed models is that the conditional mean of the response given the random effects, after being transformed by a link function, is linearly related to the fixed covariate effects and random effects. Therefore, it is of practical importance to test the adequacy of this assumption, particularly the assumption of linear covariate effects. In this paper, we review procedures that can be used for testing polynomial covariate effects in these popular models. Specifically, four types of hypothesis testing approaches are reviewed, i.e. R tests, likelihood ratio tests, score tests and residual-based tests....
Bou-Hamad, Imad; Larocque, Denis; Ben-Ameur, Hatem
This paper presents a non-technical account of the developments in tree-based methods for the analysis of survival data with censoring. This review describes the initial developments, which mainly extended the existing basic tree methodologies to censored data, as well as more recent work. We also cover more complex models, more specialized methods, and more specific problems such as multivariate data, the use of time-varying covariates, discrete-scale survival data, and ensemble methods applied to survival trees. A data example is used to illustrate some methods that are implemented in R.