Showing resources 1 - 20 of 884

  1. Optimal prediction for sparse linear models? Lower bounds for coordinate-separable M-estimators

    Zhang, Yuchen; Wainwright, Martin J.; Jordan, Michael I.
    For the problem of high-dimensional sparse linear regression, it is known that an $\ell_{0}$-based estimator can achieve a $1/n$ “fast” rate for prediction error without any conditions on the design matrix, whereas in the absence of restrictive conditions on the design matrix, popular polynomial-time methods only guarantee the $1/\sqrt{n}$ “slow” rate. In this paper, we show that the slow rate is intrinsic to a broad class of M-estimators. In particular, for estimators based on minimizing a least-squares cost function together with a (possibly nonconvex) coordinate-wise separable regularizer, there is always a “bad” local optimum such that the associated prediction error...

  2. Some properties of the autoregressive-aided block bootstrap

    Niebuhr, Tobias; Kreiss, Jens-Peter; Paparoditis, Efstathios
We investigate properties of a hybrid bootstrap procedure for general, strictly stationary sequences, called the autoregressive-aided block bootstrap, which combines a parametric autoregressive bootstrap with a nonparametric moving block bootstrap. The autoregressive-aided block bootstrap consists of two main steps, namely an autoregressive model fit and an ensuing (moving) block resampling of residuals. The linear parametric model fit prewhitens the time series so that the dependence structure of the remaining residuals gets closer to that of a white noise sequence, while the moving block bootstrap applied to these residuals captures nonlinear features that are not taken into account by the linear autoregressive...
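The two-step scheme described above can be sketched in a few lines. This is a minimal illustration, simplified to an AR(1) fit; the function name and the least-squares fitting step are assumptions for illustration, not the authors' implementation.

```python
import random

def ar1_aided_block_bootstrap(x, block_len=5, seed=0):
    """Sketch: (1) fit an AR(1) by least squares, (2) moving-block
    resample the residuals, (3) rebuild a pseudo-series from the
    fitted recursion. Hypothetical simplification to AR(1)."""
    rng = random.Random(seed)
    n = len(x)
    # Step 1: least-squares AR(1) fit, x_t = phi * x_{t-1} + e_t
    num = sum(x[t] * x[t - 1] for t in range(1, n))
    den = sum(x[t - 1] ** 2 for t in range(1, n))
    phi = num / den
    resid = [x[t] - phi * x[t - 1] for t in range(1, n)]
    # Step 2: moving-block bootstrap of the prewhitened residuals
    m = len(resid)
    starts = list(range(m - block_len + 1))
    boot_resid = []
    while len(boot_resid) < m:
        s = rng.choice(starts)
        boot_resid.extend(resid[s:s + block_len])
    boot_resid = boot_resid[:m]
    # Step 3: regenerate a pseudo-series from the fitted recursion
    xb = [x[0]]
    for e in boot_resid:
        xb.append(phi * xb[-1] + e)
    return phi, xb
```

The block resampling in step 2 is what preserves residual dependence that the linear fit cannot capture.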

  3. Adaptive wavelet multivariate regression with errors in variables

    Chichignoud, Michaël; Hoang, Van Ha; Pham Ngoc, Thanh Mai; Rivoirard, Vincent
In the multidimensional setting, we consider the errors-in-variables model. We aim at estimating the unknown nonparametric multivariate regression function with errors in the covariates. We devise an adaptive estimator based on projection kernels on wavelets and a deconvolution operator. We propose an automatic and fully data-driven procedure to select the wavelet resolution level. We obtain an oracle inequality and optimal rates of convergence over anisotropic Hölder classes. Our theoretical results are illustrated by some simulations.

  4. Prediction weighted maximum frequency selection

    Liu, Hongmei; Rao, J. Sunil
Shrinkage estimators that possess the ability to produce sparse solutions have become increasingly important to the analysis of today’s complex datasets. Examples include the LASSO, the Elastic-Net and their adaptive counterparts. Estimation of the penalty parameters, however, still presents difficulties. While variable selection consistent procedures have been developed, their finite sample performance can often be less than satisfactory. We develop a new strategy for variable selection using the adaptive LASSO and adaptive Elastic-Net estimators with $p_{n}$ diverging. The basic idea first involves using the trace paths of their LARS solutions to bootstrap estimates of maximum frequency (MF) models conditioned on dimension....

  5. Cross-calibration of probabilistic forecasts

    Strähl, Christof; Ziegel, Johanna
    When providing probabilistic forecasts for uncertain future events, it is common to strive for calibrated forecasts, that is, the predictive distribution should be compatible with the observed outcomes. Often, there are several competing forecasters of different skill. We extend common notions of calibration where each forecaster is analyzed individually, to stronger notions of cross-calibration where each forecaster is analyzed with respect to the other forecasters. In particular, cross-calibration distinguishes forecasters with respect to increasing information sets. We provide diagnostic tools and statistical tests to assess cross-calibration. The methods are illustrated in simulation examples and applied to probabilistic forecasts for inflation...

  6. Sequential quantiles via Hermite series density estimation

    Stephanou, Michael; Varughese, Melvin; Macdonald, Iain
Sequential quantile estimation refers to incorporating observations into quantile estimates in an incremental fashion, thus furnishing an online estimate of one or more quantiles at any given point in time. Sequential quantile estimation is also known as online quantile estimation. This area is relevant to the analysis of data streams and to the one-pass analysis of massive data sets. Applications include network traffic and latency analysis, real-time fraud detection and high frequency trading. We introduce new techniques for online quantile estimation based on Hermite series estimators in the settings of static quantile estimation and dynamic quantile estimation. In the...

  7. Support vector regression for right censored data

    Goldberg, Yair; Kosorok, Michael R.
    We develop a unified approach for classification and regression support vector machines for when the responses are subject to right censoring. We provide finite sample bounds on the generalization error of the algorithm, prove risk consistency for a wide class of probability measures, and study the associated learning rates. We apply the general methodology to estimation of the (truncated) mean, median, quantiles, and for classification problems. We present a simulation study that demonstrates the performance of the proposed approach.

  8. A geometric approach to pairwise Bayesian alignment of functional data using importance sampling

    Kurtek, Sebastian
    We present a Bayesian model for pairwise nonlinear registration of functional data. We use the Riemannian geometry of the space of warping functions to define appropriate prior distributions and sample from the posterior using importance sampling. A simple square-root transformation is used to simplify the geometry of the space of warping functions, which allows for computation of sample statistics, such as the mean and median, and a fast implementation of a $k$-means clustering algorithm. These tools allow for efficient posterior inference, where multiple modes of the posterior distribution corresponding to multiple plausible alignments of the given functions are found. We...

  9. Estimation and inference of error-prone covariate effect in the presence of confounding variables

    Liu, Jianxuan; Ma, Yanyuan; Zhu, Liping; Carroll, Raymond J.
    We introduce a general single index semiparametric measurement error model for the case that the main covariate of interest is measured with error and modeled parametrically, and where there are many other variables also important to the modeling. We propose a semiparametric bias-correction approach to estimate the effect of the covariate of interest. The resultant estimators are shown to be root-$n$ consistent, asymptotically normal and locally efficient. Comprehensive simulations and an analysis of an empirical data set are performed to demonstrate the finite sample performance and the bias reduction of the locally efficient estimators.

  10. Asymptotic behavior of the Laplacian quasi-maximum likelihood estimator of affine causal processes

    Bardet, Jean-Marc; Boularouk, Yakoub; Djaballah, Khedidja
We prove the consistency and asymptotic normality of the Laplacian Quasi-Maximum Likelihood Estimator (QMLE) for a general class of causal time series including ARMA, AR($\infty$), GARCH, ARCH($\infty$), ARMA-GARCH, APARCH, ARMA-APARCH,..., processes. We notably exhibit the advantages (moment order and robustness) of this estimator compared to the classical Gaussian QMLE. Numerical simulations confirm the accuracy of this estimator.

  11. On the estimation of the mean of a random vector

    Joly, Emilien; Lugosi, Gábor; Imbuzeiro Oliveira, Roberto
    We study the problem of estimating the mean of a multivariate distribution based on independent samples. The main result is the proof of existence of an estimator with a non-asymptotic sub-Gaussian performance for all distributions satisfying some mild moment assumptions.

  12. Parameter estimation of Gaussian stationary processes using the generalized method of moments

    Barboza, Luis A.; Viens, Frederi G.
We consider the class of all stationary Gaussian processes with explicit parametric spectral density. Under some conditions on the autocovariance function, we define a GMM estimator that satisfies consistency and asymptotic normality, using the Breuer-Major theorem and previous results on ergodicity. This result is applied to the joint estimation of the three parameters of a stationary fractional Ornstein-Uhlenbeck (fOU) process driven by a fractional Brownian motion. The asymptotic normality of its GMM estimator applies for any $H$ in $(0,1)$ and under some restrictions on the remaining parameters. A numerical study is performed in the fOU case, to illustrate the estimator’s practical...

  13. Hypothesis testing of the drift parameter sign for fractional Ornstein–Uhlenbeck process

    Kukush, Alexander; Mishura, Yuliya; Ralchenko, Kostiantyn
    We consider the fractional Ornstein–Uhlenbeck process with an unknown drift parameter and known Hurst parameter $H$. We propose a new method to test the hypothesis of the sign of the parameter and prove the consistency of the test. Contrary to the previous works, our approach is applicable for all $H\in(0,1)$.

  14. Semiparametric single-index model for estimating optimal individualized treatment strategy

    Song, Rui; Luo, Shikai; Zeng, Donglin; Zhang, Hao Helen; Lu, Wenbin; Li, Zhiguo
Different from the standard treatment discovery framework, which is used for finding single treatments for a homogeneous group of patients, personalized medicine involves finding therapies that are tailored to each individual in a heterogeneous group. In this paper, we propose a new semiparametric additive single-index model for estimating an individualized treatment strategy. The model assumes a flexible and nonparametric link function for the interaction between treatment and predictive covariates. We estimate the rule via monotone B-splines and establish the asymptotic properties of the estimators. Both simulations and a real data application demonstrate that the proposed method has competitive performance.

  15. Asymptotically optimal, sequential, multiple testing procedures with prior information on the number of signals

    Song, Yanglei; Fellouris, Georgios
    Assuming that data are collected sequentially from independent streams, we consider the simultaneous testing of multiple binary hypotheses under two general setups; when the number of signals (correct alternatives) is known in advance, and when we only have a lower and an upper bound for it. In each of these setups, we propose feasible procedures that control, without any distributional assumptions, the familywise error probabilities of both type I and type II below given, user-specified levels. Then, in the case of i.i.d. observations in each stream, we show that the proposed procedures achieve the optimal expected sample size, under every...

  16. Analysis of Polya-Gamma Gibbs sampler for Bayesian logistic analysis of variance

    Choi, Hee Min; Román, Jorge Carlos
    We consider the intractable posterior density that results when the one-way logistic analysis of variance model is combined with a flat prior. We analyze Polson, Scott and Windle’s (2013) data augmentation (DA) algorithm for exploring the posterior. The Markov operator associated with the DA algorithm is shown to be trace-class.

  17. Minimum disparity estimation in controlled branching processes

    González, Miguel; Minuesa, Carmen; del Puerto, Inés
Minimum disparity estimation in controlled branching processes is dealt with by assuming that the offspring law belongs to a general parametric family. Under some regularity conditions it is proved that the proposed minimum disparity estimators (based on the nonparametric maximum likelihood estimator of the offspring law when the entire family tree is observed) are consistent and asymptotically normally distributed. Moreover, the robustness of the proposed estimators is discussed. Through a simulated example, focusing on the minimum Hellinger and negative exponential disparity estimators, it is shown that both are robust against outliers, and the minimum negative exponential estimator is also robust...

  18. TIGER: A tuning-insensitive approach for optimally estimating Gaussian graphical models

    Liu, Han; Wang, Lie
    We propose a new procedure for optimally estimating high dimensional Gaussian graphical models. Our approach is asymptotically tuning-free and non-asymptotically tuning-insensitive: It requires very little effort to choose the tuning parameter in finite sample settings. Computationally, our procedure is significantly faster than existing methods due to its tuning-insensitive property. Theoretically, the obtained estimator simultaneously achieves minimax lower bounds for precision matrix estimation under different norms. Empirically, we illustrate the advantages of the proposed method using simulated and real examples. The R package camel implementing the proposed methods is also available on the Comprehensive R Archive Network.

  19. Large-scale mode identification and data-driven sciences

    Mukhopadhyay, Subhadeep
Bump-hunting or mode identification is a fundamental problem that arises in almost every scientific field of data-driven discovery. Surprisingly, very few data modeling tools are available for automatic (not requiring manual case-by-case investigation), objective (not subjective), and nonparametric (not based on restrictive parametric model assumptions) mode discovery that can scale to large data sets. This article introduces LPMode, an algorithm based on a new theory for detecting multimodality of a probability density. We apply LPMode to answer important research questions arising in fields ranging from environmental science, ecology, econometrics, and analytical chemistry to astronomy and cancer genomics.

  20. Simple confidence intervals for MCMC without CLTs

    Rosenthal, Jeffrey S.
    This short note argues that 95% confidence intervals for MCMC estimates can be obtained even without establishing a CLT, by multiplying their widths by 2.3.
