arXiv
(422,153 resources)
This is one of the most extensive subject-based repositories in the world, covering physics, mathematics, astronomy, computer science and quantitative biology. This is the principal site, with almost 20 mirrors around the globe. The site is supported by an extensive collection of information and background documentation. An RSS feed is available for anyone interested in keeping up to date with newly added materials.
Showing resources 41 - 60 of 110
41.
Sparse Estimators and the Oracle Property, or the Return of Hodges' Estimator - Leeb, Hannes; Poetscher, Benedikt M.
We point out some pitfalls related to the concept of an oracle property as
used in Fan and Li (2001, 2002, 2004) which are reminiscent of the well-known
pitfalls related to Hodges' estimator. The oracle property is often a
consequence of sparsity of an estimator. We show that any estimator satisfying
a sparsity property has maximal risk that converges to the supremum of the loss
function; in particular, the maximal risk diverges to infinity whenever the
loss function is unbounded. For ease of presentation the result is set in the
framework of a linear regression model, but generalizes far beyond that
setting. In a Monte Carlo study we...
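As a rough illustration of the pitfall this abstract refers to, the sketch below simulates the classical Hodges' estimator in a Gaussian location model. This is not the authors' Monte Carlo study: the threshold `n ** (-0.25)`, the sample size, and the function names are assumptions chosen for illustration. It compares the scaled squared-error risk at the sparse point `theta = 0` with the risk at a parameter value close to the threshold.

```python
import numpy as np

def hodges_estimate(sample_mean, n):
    """Hodges' estimator: set the sample mean to zero when it is below n^(-1/4)."""
    threshold = n ** (-0.25)  # assumed threshold, the textbook choice
    return np.where(np.abs(sample_mean) < threshold, 0.0, sample_mean)

def scaled_risk(theta, n, draws=200_000, seed=0):
    """Monte Carlo estimate of n * E[(estimate - theta)^2]."""
    rng = np.random.default_rng(seed)
    # The mean of n i.i.d. N(theta, 1) observations is exactly N(theta, 1/n).
    sample_mean = theta + rng.standard_normal(draws) / np.sqrt(n)
    estimate = hodges_estimate(sample_mean, n)
    return float(n * np.mean((estimate - theta) ** 2))

n = 10_000
risk_at_zero = scaled_risk(0.0, n)                  # risk at the sparse point
risk_near_threshold = scaled_risk(n ** (-0.25), n)  # risk at an unfavourable point
print(risk_at_zero, risk_near_threshold)
```

For comparison, the plain sample mean has scaled risk 1 at every `theta`, so the gain at zero is paid for by a much larger risk nearby.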
42.
Can One Estimate The Unconditional Distribution of Post-Model-Selection Estimators? - Leeb, Hannes; Poetscher, Benedikt M.
We consider the problem of estimating the unconditional distribution of a
post-model-selection estimator. The notion of a post-model-selection estimator
here refers to the combined procedure resulting from first selecting a model
(e.g., by a model selection criterion like AIC or by a hypothesis testing
procedure) and then estimating the parameters in the selected model (e.g., by
least-squares or maximum likelihood), all based on the same data set. We show
that it is impossible to estimate the unconditional distribution with
reasonable accuracy even asymptotically. In particular, we show that no
estimator for this distribution can be uniformly consistent (not even locally).
This follows as a corollary to (local) minimax lower...
43.
Theoretical Aspects of the SOM Algorithm - Cottrell, Marie; Fort, Jean-Claude; Pagès, Gilles
The SOM algorithm is quite surprising. On the one hand, it is very simple to
write down and to simulate, and its practical properties are clear and easy to
observe. On the other hand, its theoretical properties remain unproven in the
general case, despite the great efforts of several authors. In this paper, we
review the most recent results and offer some conjectures for future work.
44.
Traitement Des Donnees Manquantes Au Moyen De L'Algorithme De Kohonen - Cottrell, Marie; Ibbou, Smail; Letrémy, Patrick
We show how the Kohonen self-organizing algorithm can be used to handle data
with missing values and to estimate those values. After a methodological
review, we illustrate the approach with three applications to real data.
45.
Information-Based Asset Pricing - Brody, Dorje C.; Hughston, Lane P.; Macrina, Andrea
A new framework for asset price dynamics is introduced in which the concept
of noisy information about future cash flows is used to derive the price
processes. In this framework an asset is defined by its cash-flow structure.
Each cash flow is modelled by a random variable that can be expressed as a
function of a collection of independent random variables called market factors.
With each such "X-factor" we associate a market information process, the values
of which are accessible to market agents. Each information process is a sum of
two terms; one contains true information about the value of the market factor;
the other represents "noise". The...
46.
On the Computational Complexity of MCMC-based Estimators in Large Samples - Belloni, Alexandre; Chernozhukov, Victor
In this paper we examine the implications of the statistical large sample
theory for the computational complexity of Bayesian and quasi-Bayesian
estimation carried out using Metropolis random walks. Our analysis is motivated
by the Laplace-Bernstein-Von Mises central limit theorem, which states that in
large samples the posterior or quasi-posterior approaches a normal density.
Using this observation, we establish polynomial bounds on the computational
complexity of general Metropolis random walk methods in large samples. Our
analysis covers cases where the underlying log-likelihood or extremum
criterion function is possibly non-concave, discontinuous, and of increasing
dimension. However, the central limit theorem restricts the deviations from
continuity and log-concavity of the log-likelihood or extremum...
47.
Inference on Eigenvalues of Wishart Distribution Using Asymptotics with respect to the Dispersion of Population Eigenvalues - Sheena, Yo; Takemura, Akimichi
In this paper we derive some new and practical results on testing and
interval estimation problems for the population eigenvalues of a Wishart matrix
based on the asymptotic theory for block-wise infinite dispersion of the
population eigenvalues. This new type of asymptotic theory has been developed
by the present authors in Takemura and Sheena (2005) and Sheena and Takemura
(2007a,b), where it was applied to the point estimation problem for the
population covariance matrix in a decision-theoretic framework. In this paper
we apply it to some testing and interval estimation problems. We show that the
approximation based on this type of asymptotics is generally much better...
48.
Structural adaptation via $L_p$-norm oracle inequalities - Goldenshluger, A.; Lepski, O.
In this paper we study the problem of adaptive estimation of a multivariate
function satisfying some structural assumption. We propose a novel estimation
procedure that adapts simultaneously to unknown structure and smoothness of the
underlying function. The problem of structural adaptation is stated as the
problem of selection from a given collection of estimators. We develop a
general selection rule and establish for it global oracle inequalities under
arbitrary $L_p$-losses. These results are applied for adaptive estimation in
the additive multi-index model.
49.
A universal procedure for aggregating estimators - Goldenshluger, A.
In this paper we study the aggregation problem that can be formulated as
follows. Assume that we have a family of estimators ${\cal F}$ built on the
basis of available observations. The goal is to construct a new estimator whose
risk is as close as possible to that of the best estimator in the family. We
propose a general aggregation scheme that is universal in the following sense:
it applies for families of arbitrary estimators and a wide variety of models
and global risk measures. The procedure is based on comparison of empirical
estimates of certain linear functionals with estimates induced by the family
${\cal F}$. We derive...
50.
Analytic crossing probabilities for certain barriers by Brownian motion - Kahale, Nabil
We calculate crossing probabilities and one-sided last exit time densities
for a class of moving barriers on an interval [0,T] via Schwartz distributions.
We derive crossing probabilities and first hitting time densities for another
class of barriers on [0,T] by proving a Schwartz distribution version of the
method of images. Analytic expressions for crossing probabilities and related
densities are given for new explicit and semi-explicit barriers.
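For orientation, the classical constant-barrier case of the method of images can be written out as follows. This is the textbook reflection-principle result, not the moving-barrier formulas of the paper; the symbols $W$ (standard Brownian motion started at 0), $b > 0$ (barrier level), $\tau_b$ (first hitting time of $b$), $\varphi_t$ (the $N(0,t)$ density) and $\Phi$ (the standard normal distribution function) are notation assumed here.

```latex
% Density of W_t on the event that the barrier b has not yet been hit:
% the free density minus its mirror image placed at 2b.
\[
  p_t(x) = \varphi_t(x) - \varphi_t(x - 2b), \qquad x < b .
\]
% Crossing probability on [0,T], obtained by integrating p_T over x < b:
\[
  \mathbb{P}\Bigl(\max_{0 \le t \le T} W_t \ge b\Bigr)
  = 2\,\mathbb{P}(W_T \ge b)
  = 2\Bigl(1 - \Phi\bigl(b/\sqrt{T}\bigr)\Bigr).
\]
% First hitting time density, obtained by differentiating in T:
\[
  f_{\tau_b}(t) = \frac{b}{\sqrt{2\pi t^{3}}}
  \exp\Bigl(-\frac{b^{2}}{2t}\Bigr), \qquad 0 < t \le T .
\]
```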
51.
On bounds and algorithms for frequency synchronization for collaborative communication systems - Parker, Peter A.; Mitran, Patrick; Bliss, Daniel W.; Tarokh, Vahid
Cooperative diversity systems are wireless communication systems designed to
exploit cooperation among users to mitigate the effects of multipath fading. In
fairly general conditions, it has been shown that these systems can achieve the
diversity order of an equivalent MISO channel and, if the node geometry
permits, virtually the same outage probability can be achieved as that of the
equivalent MISO channel for a wide range of applicable SNR. However, much of
the prior analysis has been performed under the assumption of perfect timing
and frequency offset synchronization. In this paper, we derive the estimation
bounds and associated maximum likelihood estimators for frequency offset
estimation in a cooperative communication...
52.
Multiple pattern matching: A Markov chain approach - Lladser, Manuel; Betterton, M. D.; Knight, Rob
RNA motifs typically consist of short, modular patterns that include base
pairs formed within and between modules. Estimating the abundance of these
patterns is of fundamental importance for assessing the statistical
significance of matches in genomewide searches, and for predicting whether a
given function has evolved many times in different species or arose from a
single common ancestor. In this manuscript, we review in an integrated and
self-contained manner some basic concepts of automata theory, generating
functions and transfer matrix methods that are relevant to pattern analysis in
biological sequences. We formalize, in a general framework, the concept of
Markov chain embedding to analyze patterns in random strings produced...
53.
Density Estimation of Censored Data with Infinite-Order Kernels - Berg, Arthur; Politis, Dimitris N.
Higher-order accurate density estimation under random right censorship is
achieved using kernel estimators from a family of infinite-order kernels. A
compatible bandwidth selection procedure is also proposed that automatically
adapts to the level of smoothness of the underlying lifetime density. The
combination of infinite-order kernels with the new bandwidth selection
procedure produces a considerably improved estimate of the lifetime density and
hazard function surpassing the performance of competing estimators.
Infinite-order estimators are also utilized in a secondary manner as pilot
estimators in the plug-in approach for bandwidth choice in second-order
kernels. Simulations illustrate the improved accuracy of the proposed estimator
against other nonparametric estimators of the density and hazard function.
54.
On the Marginal Distributions of Stationary AR(1) Sequences - Satheesh, S; Sandhya, E
In this note we correct an omission in our paper (Satheesh and Sandhya, 2005)
in defining semi-selfdecomposable laws and also show with examples that the
marginal distributions of a stationary AR(1) process need not even be
infinitely divisible.
55.
Quantile and Probability Curves Without Crossing - Chernozhukov, Victor; Fernandez-Val, Ivan; Galichon, Alfred
The most common approach to estimating conditional quantile curves is to fit
a curve, typically linear, pointwise for each quantile. Linear functional
forms, coupled with pointwise fitting, are used for a number of reasons
including parsimony of the resulting approximations and good computational
properties. The resulting fits, however, may not respect a logical monotonicity
requirement -- that the quantile curve be increasing as a function of
probability. This paper studies the natural monotonization of these empirical
curves induced by sampling from the estimated non-monotone model, and then
taking the resulting conditional quantile curves that by construction are
monotone in the probability. This construction of monotone quantile curves may
be seen...
56.
Improving Estimates of Monotone Functions by Rearrangement - Chernozhukov, Victor; Fernandez-Val, Ivan; Galichon, Alfred
Suppose that a target function is monotonic, namely, weakly increasing, and
an original estimate of the target function is available, which is not weakly
increasing. Many common estimation methods used in statistics produce such
estimates. We show that these estimates can always be improved with no harm
using rearrangement techniques: The rearrangement methods, univariate and
multivariate, transform the original estimate to a monotonic estimate, and the
resulting estimate is closer to the true curve in common metrics than the
original estimate. We illustrate the results with a computational example and
an empirical example dealing with age-height growth charts.
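The univariate rearrangement described in this abstract can be sketched in a few lines. On an equally spaced grid, the increasing rearrangement of an estimate amounts to sorting its values. The target function, noise level, and helper names below are hypothetical choices for illustration, not the paper's computational example.

```python
import numpy as np

def rearrange_increasing(estimate):
    """Increasing rearrangement of values sampled on an equally spaced grid."""
    return np.sort(np.asarray(estimate, dtype=float))

def lp_error(f, g, p=2):
    """Discrete L_p distance between two functions sampled on the same grid."""
    diff = np.abs(np.asarray(f, dtype=float) - np.asarray(g, dtype=float))
    return float(np.mean(diff ** p) ** (1.0 / p))

grid = np.linspace(0.0, 1.0, 201)
target = grid ** 2                       # hypothetical weakly increasing target
rng = np.random.default_rng(0)
original = target + 0.1 * rng.standard_normal(grid.size)  # noisy, non-monotone estimate
rearranged = rearrange_increasing(original)

print(lp_error(original, target), lp_error(rearranged, target))
```

Because the target is weakly increasing, sorting cannot increase the discrete $L_p$ error for any $p \ge 1$, which matches the "no harm" claim in the abstract.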
57.
Recovery of edges from spectral data with noise -- a new perspective - Engelberg, Shlomo; Tadmor, Eitan
We consider the problem of detecting edges in piecewise smooth functions from
their N-degree spectral content, which is assumed to be corrupted by noise.
There are three scales involved: the "smoothness" scale of order 1/N, the noise
scale of order $\eta$ and the O(1) scale of the jump discontinuities. We use
concentration factors which are adjusted to the noise variance, $\eta \gg 1/N$,
in order to detect the underlying O(1)-edges, which are separated from the
noise scale, $\eta \ll 1$.
58.
Semiparametric efficiency in GMM models with auxiliary data - Chen, Xiaohong; Hong, Han; Tarozzi, Alessandro
We study semiparametric efficiency bounds and efficient estimation of
parameters defined through general moment restrictions with missing data.
Identification relies on auxiliary data containing information about the
distribution of the missing variables conditional on proxy variables that are
observed in both the primary and the auxiliary database, when such distribution
is common to the two data sets. The auxiliary sample can be independent of the
primary sample, or can be a subset of it. For both cases, we derive bounds when
the probability of missing data given the proxy variables is unknown, or known,
or belongs to a correctly specified parametric family. We find that the
conditional probability is...
59.
Support vector machine for functional data classification - Rossi, Fabrice; Villa, Nathalie
In many applications, input data are sampled functions taking their values in
infinite dimensional spaces rather than standard vectors. This fact has complex
consequences for data analysis algorithms, which motivates modifying them.
In fact, most of the traditional data analysis tools for regression,
classification and clustering have been adapted to functional inputs under the
general name of Functional Data Analysis (FDA). In this paper, we investigate
the use of Support Vector Machines (SVMs) for functional data analysis and we
focus on the problem of curve discrimination. SVMs are large margin classifier
tools based on implicit non linear mappings of the considered data into high
dimensional spaces thanks to...
60.
Un résultat de consistance pour des SVM fonctionnels par interpolation spline - Villa, Nathalie; Rossi, Fabrice
This Note proposes a new methodology for function classification with Support
Vector Machine (SVM). Rather than relying on projection on a truncated Hilbert
basis as in our previous work, we use an implicit spline interpolation that
allows us to compute SVM on the derivatives of the studied functions. To that
end, we propose a kernel defined directly on the discretizations of the
observed functions. We show that this method is universally consistent.