Collection resources
Project Euclid (Hosted at Cornell University Library) (192,674 resources)
Electronic Journal of Statistics
Dehling, Herold; Rooch, Aeneas; Taqqu, Murad S.
We investigate the power of the CUSUM test and the Wilcoxon change-point test for a shift in the mean of a process with long-range dependent noise. We derive analytic formulas for the power of these tests under local alternatives. These results enable us to calculate the asymptotic relative efficiency (ARE) of the CUSUM test and the Wilcoxon change-point test. We obtain the surprising result that for Gaussian data, the ARE of these two tests equals $1$, in contrast to the case of i.i.d. noise when the ARE is known to be $3/\pi$.
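As a rough illustration of the two statistics compared in this abstract, the sketch below computes unnormalized textbook forms: the CUSUM statistic $\max_{1\le k<n}|S_k-(k/n)S_n|$ with partial sums $S_k$, and the Wilcoxon change-point statistic $\max_{1\le k<n}|\sum_{i\le k}\sum_{j>k}(1\{X_i\le X_j\}-1/2)|$. The function names are hypothetical, and the normalization needed under long-range dependence is deliberately omitted, so this is not the authors' implementation.

```python
def cusum_statistic(x):
    """Unnormalized CUSUM statistic: max over k of |S_k - (k/n) S_n|."""
    n = len(x)
    total = sum(x)
    partial = 0.0
    best = 0.0
    for k in range(1, n):
        partial += x[k - 1]  # partial sum S_k
        best = max(best, abs(partial - k * total / n))
    return best


def wilcoxon_statistic(x):
    """Unnormalized Wilcoxon change-point statistic (O(n^3), for clarity only)."""
    n = len(x)
    best = 0.0
    for k in range(1, n):
        # Compare every observation before the split with every one after it.
        s = sum(
            (1.0 if x[i] <= x[j] else 0.0) - 0.5
            for i in range(k)
            for j in range(k, n)
        )
        best = max(best, abs(s))
    return best
```

Both statistics are large when the first part of the sample differs systematically from the rest; the Wilcoxon version depends only on the ordering of the observations.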
Preinerstorfer, David
We analytically investigate size and power properties of a popular family of procedures for testing linear restrictions on the coefficient vector in a linear regression model with temporally dependent errors. The tests considered are autocorrelation-corrected F-type tests based on prewhitened nonparametric covariance estimators that possibly incorporate a data-dependent bandwidth parameter, e.g., estimators as considered in Andrews and Monahan (1992), Newey and West (1994), or Rho and Shao (2013). For design matrices that are generic in a measure theoretic sense we prove that these tests either suffer from extreme size distortions or from strong power deficiencies. Despite this negative result we...
Bahraoui, Tarik; Quessy, Jean-François
A new class of rank statistics is proposed to assess that the copula of a multivariate population is radially symmetric. The proposed test statistics are weighted $L_{2}$ functional distances between a nonparametric estimator of the characteristic function that one can associate to a copula and its complex conjugate. It will be shown that these statistics behave asymptotically as degenerate V-statistics of order four and that the limit distributions have expressions in terms of weighted sums of independent chi-square random variables. A suitably adapted and asymptotically valid multiplier bootstrap procedure is proposed for the computation of $p$-values. One advantage of the...
Beirlant, Jan; Fraga Alves, Isabel; Reynkens, Tom
In several applications, ultimately at the largest data, truncation effects can be observed when analysing tail characteristics of statistical distributions. In some cases truncation effects are forecasted through physical models such as the Gutenberg-Richter relation in geophysics, while in other instances the nature of the measurement process itself may cause under-recovery of large values, for instance due to flooding in river discharge readings. Recently, Beirlant, Fraga Alves and Gomes (2016) discussed tail fitting for truncated Pareto-type distributions. Using examples from earthquake analysis, hydrology and diamond valuation we demonstrate the need for a unified treatment of extreme value analysis for...
Marchand, Éric; Perron, François; Yadegari, Iraj
For a normally distributed $X\sim N(\mu,\sigma^{2})$ and for estimating $\mu$ when restricted to an interval $[-m,m]$ under general loss $F(|d-\mu|)$ with strictly increasing and absolutely continuous $F$, we establish the inadmissibility of the restricted maximum likelihood estimator $\delta_{\hbox{mle}}$ for a large class of $F$’s and provide explicit improvements. In particular, we give conditions on $F$ and $m$ for which the Bayes estimator $\delta_{BU}$ with respect to the boundary uniform prior $\pi(-m)=\pi(m)=1/2$ dominates $\delta_{\hbox{mle}}$. Specific examples include $L^{s}$ loss with $s>1$, as well as reflected normal loss. Connections and implications for predictive density estimation are outlined, and numerical evaluations illustrate the...
Arias-Castro, Ery; Chen, Shiyun
We study a stylized multiple testing problem where the test statistics are independent and assumed to have the same distribution under their respective null hypotheses. We first show that, in the normal means model where the test statistics are normal Z-scores, the well-known method of Benjamini and Hochberg [4] is optimal in some asymptotic sense. We then show that this is also the case of a recent distribution-free method proposed by Barber and Candès [14]. The method is distribution-free in the sense that it is agnostic to the null distribution — it only requires that the null distribution be symmetric....
Karwa, Vishesh; Pelsmajer, Michael J.; Petrović, Sonja; Stasi, Despina; Wilburne, Dane
The $k$-core decomposition is a widely studied summary statistic that describes a graph’s global connectivity structure. In this paper, we move beyond using $k$-core decomposition as a tool to summarize a graph and propose using $k$-core decomposition as a tool to model random graphs. We propose using the shell distribution vector, a way of summarizing the decomposition, as a sufficient statistic for a family of exponential random graph models. We study the properties and behavior of the model family, implement a Markov chain Monte Carlo algorithm for simulating graphs from the model, implement a direct sampler from the set of...
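For readers unfamiliar with the decomposition, here is a minimal sketch of the standard peeling algorithm for shell indices (core numbers). It assumes that the shell distribution vector simply counts the nodes with each shell index; that reading, the adjacency-dictionary input format and the function names are assumptions for illustration, not taken from the paper.

```python
def shell_indices(adj):
    """Shell index of every node; adj maps each node to the set of its neighbours."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # Peel off a node of minimum remaining degree.
        v = min(remaining, key=lambda u: degree[u])
        k = max(k, degree[v])  # shell index never decreases along the peeling
        shell[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return shell


def shell_distribution(adj):
    """Entry i is the number of nodes whose shell index equals i."""
    shell = shell_indices(adj)
    if not shell:
        return []
    counts = [0] * (max(shell.values()) + 1)
    for s in shell.values():
        counts[s] += 1
    return counts
```

For a triangle with one pendant node and one isolated node, the isolated node has shell index 0, the pendant node 1, and the triangle nodes 2.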
Xu, Kun; Ma, Yanyuan; Wang, Yuanjia
We propose methods to estimate the distribution functions for multiple populations from mixture data that are only known to belong to a specific population with certain probabilities. The problem is motivated by kin-cohort studies collecting phenotype data in families for various diseases such as Huntington’s disease (HD) or breast cancer. Relatives in these studies are not genotyped, hence only their probabilities of carrying a known causal mutation (e.g., BRCA1 gene mutation or HD gene mutation) can be derived. In addition, phenotype observations from the same family may be correlated due to shared lifestyle or other genes associated with...
Godichon-Baggioni, Antoine; Portier, Bruno
The objective of this work is to propose a new algorithm to fit a sphere on a noisy 3D point cloud distributed around a complete or a truncated sphere. More precisely, we introduce a projected Robbins-Monro algorithm and its averaged version for estimating the center and the radius of the sphere. We give asymptotic results such as the almost sure convergence of these algorithms as well as the asymptotic normality of the averaged algorithm. Furthermore, some non-asymptotic results will be given, such as the rates of convergence in quadratic mean. Some numerical experiments show the efficiency of the proposed algorithm...
Cohen, Samuel N.
In stochastic decision problems, one often wants to estimate the underlying probability measure statistically, and then to use this estimate as a basis for decisions. We shall consider how the uncertainty in this estimation can be explicitly and consistently incorporated in the valuation of decisions, using the theory of nonlinear expectations.
Benditkis, Julia; Janssen, Arnold
Much effort has been made to improve the famous step-up procedure of Benjamini and Hochberg given by linear critical values $\frac{i\alpha}{n}$. It is pointed out by Gavrilov, Benjamini and Sarkar that step-down multiple testing procedures based on the critical values $\beta_{i}=\frac{i\alpha}{n+1-i(1-\alpha)}$ still control the false discovery rate (FDR) at the upper bound $\alpha$ under basic independence assumptions. Since that result no longer holds for step-up procedures, nor for step-down procedures when the p-values are dependent, an extensive discussion of the corresponding FDR has arisen in the literature. The present paper establishes finite sample formulas and bounds...
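The two families of critical values quoted in this abstract, and the generic step-up and step-down rules that use them, can be sketched as follows. The function names are hypothetical; the code only evaluates the stated formulas and applies the usual definitions of step-up (largest index whose ordered p-value lies below its critical value) and step-down (stop at the first ordered p-value above its critical value). It does not reproduce any result of the paper.

```python
def bh_critical_values(n, alpha):
    """Linear Benjamini-Hochberg critical values i * alpha / n, i = 1..n."""
    return [i * alpha / n for i in range(1, n + 1)]


def gbs_critical_values(n, alpha):
    """Critical values beta_i = i * alpha / (n + 1 - i * (1 - alpha)), i = 1..n."""
    return [i * alpha / (n + 1 - i * (1 - alpha)) for i in range(1, n + 1)]


def step_up_rejections(pvalues, critical):
    """Number of rejections: the largest i with p_(i) <= c_i (0 if none)."""
    k = 0
    for i, (p, c) in enumerate(zip(sorted(pvalues), critical), start=1):
        if p <= c:
            k = i
    return k


def step_down_rejections(pvalues, critical):
    """Number of rejections: the largest i with p_(j) <= c_j for all j <= i."""
    k = 0
    for p, c in zip(sorted(pvalues), critical):
        if p > c:
            break
        k += 1
    return k
```

In both rules the hypotheses with the smallest p-values are the ones rejected. A step-up rule can still reject when the smallest p-value misses its critical value, whereas a step-down rule stops there.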
Abeysekera, Waruni; Kabaila, Paul
Casella and Hwang, 1983, JASA, introduced a broad class of recentered confidence spheres for the mean $\boldsymbol{\theta}$ of a multivariate normal distribution with covariance matrix $\sigma^{2}\boldsymbol{I}$, for $\sigma^{2}$ known. Both the center and radius functions of these confidence spheres are flexible functions of the data. For the particular case of confidence spheres centered on the positive-part James-Stein estimator and with radius determined by empirical Bayes considerations, they show numerically that these confidence spheres have the desired minimum coverage probability $1-\alpha$ and dominate the usual confidence sphere in terms of scaled volume. We shift the focus from the scaled volume to...
Rudelson, Mark; Zhou, Shuheng
Suppose that we observe $y\in\mathbb{R}^{n}$ and $X\in\mathbb{R}^{n\times m}$ in the following errors-in-variables model: \begin{eqnarray*}y&=&X_{0}\beta^{*}+\epsilon\\X&=&X_{0}+W\end{eqnarray*} where $X_{0}$ is an $n\times m$ design matrix with independent subgaussian row vectors, $\epsilon\in\mathbb{R}^{n}$ is a noise vector and $W$ is a mean zero $n\times m$ random noise matrix with independent subgaussian column vectors, independent of $X_{0}$ and $\epsilon$. This model is significantly different from those analyzed in the literature in the sense that we allow the measurement error for each covariate to be a dependent vector across its $n$ observations. Such error structures appear in the science literature when modeling the trial-to-trial fluctuations in response...
De Backer, Mickaël; El Ghouch, Anouar; Van Keilegom, Ingrid
When facing multivariate covariates, general semiparametric regression techniques come in handy to propose flexible models that are unexposed to the curse of dimensionality. In this work a semiparametric copula-based estimator for conditional quantiles is investigated for both complete and right-censored data. In spirit, the methodology extends the recent work of Noh, El Ghouch and Bouezmarni [34] and Noh, El Ghouch and Van Keilegom [35], as the main idea consists in appropriately defining the quantile regression in terms of a multivariate copula and marginal distributions. Prior estimation of the latter and simple plug-in lead to an easily implementable estimator expressed,...
Dette, Holger; Preuss, Philip; Sen, Kemal
An important problem in time series analysis is the discrimination between non-stationarity and long-range dependence. Most of the literature considers the problem of testing specific parametric hypotheses of non-stationarity (such as a change in the mean) against long-range dependent stationary alternatives. In this paper we suggest a simple approach, which can be used to test the null hypothesis of a general non-stationary short-memory process against the alternative of a non-stationary long-memory process. The test procedure works in the spectral domain and uses a sequence of approximating tvFARIMA models to estimate the time varying long-range dependence parameter. We prove uniform consistency of this...
Brault, Vincent; Chiquet, Julien; Lévy-Leduc, Céline
In this paper, we propose a novel model and a new methodology for estimating the location of block boundaries in a random matrix consisting of a block-wise constant matrix corrupted with white noise. Our method recasts this problem as a variable selection problem. A penalized least-squares criterion with an $\ell_{1}$-type penalty is used for dealing with this problem. Firstly, some theoretical results ensuring the consistency of our block boundary estimators are provided. Secondly, we explain how to implement our approach in a very efficient way. This implementation is available in the R package blockseg which can be found...
Chen, Li; Cao, Hongyuan
We study partially linear models for asynchronous longitudinal data to incorporate nonlinear time trend effects. Local and global estimating equations are developed for estimating the parametric and nonparametric effects. We show that with a proper choice of the kernel bandwidth parameter, one can obtain consistent and asymptotically normal parameter estimates for the linear effects. Asymptotic properties of the estimated nonlinear effects are established. Extensive simulation studies provide numerical support for the theoretical findings. Data from an HIV study are used to illustrate our methodology.
Bao, Zhigang; Hu, Jiang; Pan, Guangming; Zhou, Wang
In this paper, we are concerned with the independence test for $k$ high-dimensional sub-vectors of a normal vector, with fixed positive integer $k$. A natural high-dimensional extension of the classical sample correlation matrix, namely the block correlation matrix, is proposed for this purpose. We then construct the so-called Schott type statistic as our test statistic, which turns out to be a particular linear spectral statistic of the block correlation matrix. Interestingly, the limiting behavior of the Schott type statistic can be derived with the aid of free probability theory and random matrix theory. Specifically, we will bring the...
Niu, Cuizhen; Zhu, Lixing
This paper is devoted to implementing model checking for parametric single-index models with responses missing at random. Two dimension reduction adaptive-to-model tests applying to the missing responses situation are proposed. Unlike the existing smoothing tests, our methods can greatly alleviate the curse of dimensionality in the sense that the tests behave like a test with only one covariate. This results in better significance level maintenance and higher power than the classical tests. The finite sample performance is evaluated through several simulation studies and a comparison with other commonly used tests. A real data analysis is conducted for illustration.
Long, James P.
Misspecified models often provide useful information about the true data generating distribution. For example, if $y$ is a non-linear function of $x$ the least squares estimator $\widehat{\beta}$ is an estimate of $\beta$, the slope of the best linear approximation to the non-linear function. Motivated by problems in astronomy, we study how to incorporate observation measurement error variances into fitting parameters of misspecified models. Our asymptotic theory focuses on the particular case of linear regression where often weighted least squares procedures are used to account for heteroskedasticity. We find that when the response is a non-linear function of the independent variable,...
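To make the setting concrete, the following sketch fits a straight line by weighted least squares with weights equal to inverse error variances, the usual textbook choice under heteroskedasticity. The function name and the toy data are hypothetical, and this is not the procedure studied in the paper. It only illustrates that when the true relationship is non-linear, the fitted slope depends on the weights.

```python
def wls_slope(x, y, variances):
    """Slope of the weighted least squares line with weights 1 / variance."""
    weights = [1.0 / v for v in variances]
    sw = sum(weights)
    # Weighted means of x and y.
    xbar = sum(w * xi for w, xi in zip(weights, x)) / sw
    ybar = sum(w * yi for w, yi in zip(weights, y)) / sw
    sxy = sum(w * (xi - xbar) * (yi - ybar) for w, xi, yi in zip(weights, x, y))
    sxx = sum(w * (xi - xbar) ** 2 for w, xi in zip(weights, x))
    return sxy / sxx
```

With $y=x^{2}$ observed at $x=-1,0,1,2$, equal variances give slope 1, while assigning a very large variance to the last point drives the slope towards 0. For exactly linear data the slope is the same for any choice of variances.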